Transcript for Episode #174:
Happy developers use Python 3

Recorded on Wednesday, Mar 18, 2020.

00:00 OKKEN: Hello and welcome to Python Bytes, where we deliver news and headlines directly to your earbuds. This is Episode 174, recorded on March 18th, 2020. I'm Brian Okken.

00:11 KENNEDY: And I'm Michael Kennedy.

00:12 OKKEN: And this week, this episode is brought to you by Talk Python courses and the pytest book.

00:18 KENNEDY: Yeah, Yay! It's brought to you by us. More about that later, huh?

00:22 OKKEN: Yeah. So we're doing something a little different. We're recording in two different locations because of... actually, we always record in two different locations.

00:30 KENNEDY: But the locations are sometimes different, especially the location you're in.

00:35 OKKEN: Yeah, I often record somewhere else. But today I'm at home, because a lot of people are at home working remotely in home offices now because of... ah, I don't even know how to pronounce it. I read it as COVID, COVID-19.

00:50 KENNEDY: Yeah, it is an insane time on so many levels, but I would say certainly there are a lot of tech people out there who may be working from home for the first time. You know, I know there are a lot of large companies that feel like you need to be in the office to do the work.

01:08 KENNEDY: And yet a lot of the tools that we use as developers are very suited to the situation that many of us around the world find ourselves in: working from home, working asynchronously, and whatnot, right?

01:21 KENNEDY: GitHub, Slack, Email, Zoom. Whatever it is, it's interesting to see the rest of the world scale up to kind of, you know, what we've been doing for a long time.

01:31 OKKEN: We were lucky that our office was moved recently, in July, and during the move we tried to set everybody up to be able to work remotely, because some people had longer commutes than before. It's just fortunate that we set that up before this happened, and I'm also very fortunate that I'm a software worker. Our work can continue, for the most part, with little interruption, though it is a harder environment. But a lot of people who are not technical workers can't do that.

02:05 KENNEDY: Yeah, it's such a bummer. You know, my daughter just got a new job, and she was supposed to start today, actually, and they sent her a message: you know what, we're closed indefinitely, and there's no reason for you to come in and get trained to work here, because who knows what it's gonna look like in a month or two? I mean, that's the reality for a lot of people. It's rough.

02:27 OKKEN: One of the reasons why we started talking about this this morning is just to reach out to everybody and say, yeah, I hope everybody's doing okay. And let us know some stories, if you want to share.

02:39 KENNEDY: Yeah, maybe some interesting tech angles, right? Like problems you run into or things you found that really worked or whatever. But yeah, everyone out there be safe. It's not always fun, but just find a place to hole up and just wait this thing out and be safe.

02:54 OKKEN: Yeah, that's a good idea. For some extra things related to that, I'll add one of these

02:59 OKKEN: in the add-ons at the end. I'll add on to that.

03:03 KENNEDY: All right, Super.

03:04 OKKEN: Well, I want to start out by talking about community. I was partly thinking about this because of the coronavirus stuff. A lot of people possibly have maybe two extra hours in the day because they're not commuting. Maybe. I'm sorry if you have a two-hour commute, or an hour commute on each end, but you might have some extra time. So one of the things you might want to spend some time doing is beefing up documentation on open source projects.

03:30 OKKEN: There actually was a great article called Documentation as a Way to Build Community, by Melissa Mendonça, I think. Sorry, Melissa. It talks about how educational materials can have a huge impact and effectively bring people into a community. Beefing up the documentation story on open source projects can actually help bring more people to use them. I mean, it seems obvious, but it isn't really, and people aren't doing it. There are a lot of projects that are lacking really good documentation, and there are a lot of reasons for that. Talking about the reasons, I think, is interesting. Development is decentralized, and a lot of projects start with just somebody scratching their own itch, and they don't need documentation for that. But it grows into other people getting involved, and for a lot of people it's more glamorous to add new features or fix a nasty bug than to add more documentation. Nobody really knows how to do that. I think it's important to spend some more focus on it. The article was targeting a specific project, but I think it really could apply to more than just this one. One of its directions is splitting up the documentation, organizing it into four different areas: tutorials, how-tos, reference guides, and explanations. These four areas, and subsections of those, can be targeted towards different people: beginners, or advanced people, or somebody just looking something up.

05:02 OKKEN: One of the great things about that is it makes it easier for somebody to jump in and say, oh, there's one little piece here, how to do something, and I can contribute to that. I might not know how or why it works, but I can contribute a how-to and some tutorials, whereas maybe some of the more expert people in the project can do some of the explanations of how things are working. Also, on a lot of teams or projects, when new people come in and say they want to help out, they're pointed at writing documentation, and I think that's a great thing. But then you've got documentation that's just filled with content from beginners, and not from some of the experienced people. So I think there's some good information here, and I think focusing on documentation would be a good thing.

05:46 KENNEDY: I like the article. I like the idea of it, right, that

05:50 KENNEDY: you can build a community. Certainly you can contribute to these projects quite easily in this way. Breaking it up into these categories is really clever, because otherwise you just sit down and think, oh, I'm gonna write some docs for this thing. Well, that's pretty wide open, right? But: I'm gonna write a short tutorial on something I had to learn because I had to use this thing. Now I know how to do that, so why don't I generalize that and make a tutorial? That seems like a really easy way to get yourself on the contributor list,

06:16 KENNEDY: beef up your resume, say "I contributed to this project," et cetera. I think it's good.

06:21 OKKEN: One of these I'd like to reach out to people about: some of the beginner stuff. A great thing to do while you're learning a project isn't writing new content. But while you're reading documentation on a project,

06:32 OKKEN: if there are typos, or just grammar errors (it may have been written by somebody who isn't a native English speaker), you can help out by just fixing some of those things. And also, while you're going through things, if you stumble on something and it's difficult to follow the instructions, it might be that the instructions need to be modified. So why not just do a pull request modifying those instructions to match the way it really works?

06:58 KENNEDY: Yeah, that'd be great. You know, another area that might be interesting is to write tests.

07:03 KENNEDY: A lot of projects lack tests, or they're just marginally tested, and you're like, okay, I'm gonna create this tutorial, and I want to make sure the things I'm saying work, so let me add some tests to verify that what I believe to be true is true, and commit that back to the project.

07:19 OKKEN: And modifying tests. If the tests are not readable, they should be. And maybe you can make them more readable.

07:24 KENNEDY: Yeah, I guess I kind of started thinking about that because tests feel a little bit like a form of documentation.

07:31 KENNEDY: Yeah, definitely.

07:32 KENNEDY: Yeah.

07:33 KENNEDY: Cool. Well, I'm pretty passionate about fast websites. As you probably know, I talk about trying to make websites fast all the time.

07:41 KENNEDY: Our Web site's pretty fast

07:42 OKKEN: Speed is important too. Slow websites push people away.

07:46 KENNEDY: They do. I think it was Amazon or somebody who did a study saying that every 100 milliseconds of latency, of perceived latency to the user, has a very tangible, like whole-number percentage, drop in actual sales. Sales are not the most important thing, necessarily. Maybe if you're Amazon they are. But it just gives you a sense: 100 milliseconds, you can barely perceive that as a person, and yet, as those things add up, it starts to really make a difference in behavior. So I want to talk about this article, sort of riff on some topics covered in the article, called The Django Speed Handbook: making a Django app faster, by Shibel Mansour. Now, the title has Django, and some of the examples are really about Django. But this actually applies to most websites, and Python websites and whatnot.

08:38 KENNEDY: So if you do Flask, I think it'd still be super, super relevant.

08:42 KENNEDY: The first thing, though, that I want to point out is actually a Django thing, and it does appear in at least Pyramid as well. In Django, there's a thing called the Django Debug Toolbar, and it lets you explore the different requests and how long they're taking. You can even get in there and look at the ORM calls and what's happening. So that's pretty awesome. Pyramid has this as well: you can actually see the SQLAlchemy calls going to the database, and the timing, and how many database queries there even are on a given page. It's pretty ridiculous to be able to use that to analyze what your app is doing. It's almost like

09:18 KENNEDY: you've attached a little debugger profiler all the time and it's just right there.

09:22 OKKEN: That's cool. Do you have to turn it off then?

09:24 KENNEDY: Well, when you go into production, you don't include it in the settings, like the run settings for production. Obviously, right? That would be bad. But some of those settings, even in debug mode, you have to turn on. I'm not sure about the Django one, but in the Pyramid one the profiler is definitely not on by default, because that'll slow it down a little bit. But you can click a box and go do the request again. All right, so that's a real quick and easy way just to see what your app is up to.

09:49 KENNEDY: Then one of the things you really want to pay attention to, and this is gonna be a bit of a theme on today's show, is talking to databases. So when you're working with an ORM, or just talking to the database (specifically here it's the Django ORM, but this is super relevant for SQLAlchemy as well), you wanna be really careful of the so-called N + 1 problem, which happens when you navigate relationships.

10:15 KENNEDY: So, for example, let's say I have a category. I'm gonna show a category of books, and the category has a books relationship. Or maybe there are some other things: I get all the categories back and want to tell you how many books are in each one, or something like that. As you go through the things that come back, you end up doing one query for each property that you access on each instance of that object. So if you do a query that returns 20 things, you might end up talking to the database 21 times. It's a common problem in ORMs.

10:47 KENNEDY: But it also has an easy fix, which is why that debug toolbar is cool, because you can turn it on and say: look, why are there 24 queries on this page? I feel like I did one. Well, sort of. So you can use `select_related` and `prefetch_related`, and it'll basically join or pre-query those related objects together in one massive query, so you don't actually go back to the database N + 1 times.
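A minimal sketch of the N + 1 pattern described above, using the standard-library `sqlite3` module as a stand-in rather than the Django ORM. The `category` and `book` tables, the `run` helper, and the row counts are all made up for illustration; in Django, `select_related` and `prefetch_related` are what ask the ORM to do the joined or pre-fetched version for you.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book (
        id INTEGER PRIMARY KEY,
        title TEXT,
        category_id INTEGER REFERENCES category(id)
    );
""")
conn.executemany(
    "INSERT INTO category (id, name) VALUES (?, ?)",
    [(i, f"category-{i}") for i in range(1, 21)],  # 20 hypothetical categories
)
conn.executemany(
    "INSERT INTO book (title, category_id) VALUES (?, ?)",
    [(f"book-{i}", i % 20 + 1) for i in range(100)],  # 5 books per category
)

query_log = []


def run(sql, params=()):
    """Execute one statement and record it, so round trips can be counted."""
    query_log.append(sql)
    return conn.execute(sql, params).fetchall()


# N + 1: one query for the categories, then one more query per category.
lazy_counts = {}
for cat_id, name in run("SELECT id, name FROM category"):
    rows = run("SELECT COUNT(*) FROM book WHERE category_id = ?", (cat_id,))
    lazy_counts[name] = rows[0][0]
n_plus_one_queries = len(query_log)

# Joined: the database relates the two tables in a single query.
query_log.clear()
joined_counts = dict(run("""
    SELECT c.name, COUNT(b.id)
    FROM category AS c LEFT JOIN book AS b ON b.category_id = c.id
    GROUP BY c.id
"""))
joined_queries = len(query_log)

print(n_plus_one_queries, joined_queries)
```

Both approaches produce the same per-category counts; the only difference is how many times the code goes back to the database.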

11:14 KENNEDY: Yeah, and that's a big deal. And, you know, SQLAlchemy has `joinedload` and subquery loading, with which you can basically accomplish the same thing. He's got a cool example, on not a huge database, of using these two methods in the Django ORM and going 24x faster, right? I mean, it's basically not changing the code at all except saying,

11:33 KENNEDY: you know, I'm gonna use this related property, so just query that as part of the query, instead of doing 20, or however many, queries going back for it. Really, really nice. Related to that is indexes. If you're not thinking about using indexes, you should be. I mean, it's easily 1000 times faster to do a query against a lot of data with an index versus without, and if you've got these joins, it's even better. So, super important. But do be aware that indexes make writes slower. Most websites don't write data like crazy, although some APIs do, so it's usually not as big of a problem. Just be aware that writes are slower with indexes, but queries are much, much faster. Another thing they talk about, which is really helpful, is using pagination, where instead of saying here's 1000 items, you say here's 50, and you can ask for the next 50, and the next 50, and so on.
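On the index point above, a small sketch (again with `sqlite3`, not Django) that asks the database how it plans to run the same query before and after an index exists. The `book` table, the `idx_book_category` index name, and the `plan` helper are assumptions for illustration, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT, category_id INTEGER)"
)
conn.executemany(
    "INSERT INTO book (title, category_id) VALUES (?, ?)",
    [(f"book-{i}", i % 20) for i in range(1000)],
)


def plan(sql):
    """Return SQLite's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " ".join(row[-1] for row in rows)  # the detail text is the last column


query = "SELECT title FROM book WHERE category_id = 7"

plan_without_index = plan(query)  # a full scan of the table
conn.execute("CREATE INDEX idx_book_category ON book (category_id)")
plan_with_index = plan(query)     # a search that names the new index

print(plan_without_index)
print(plan_with_index)
```

The trade-off mentioned in the episode still applies: every insert or update on `book` now has to maintain the index as well.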

12:29 KENNEDY: That's super easy to do with Django ORM or SQLAlchemy or anything like that. So that's a really good one.

12:34 OKKEN: So does that often line up with, like, if your page only shows 50 things, only fetch 50 things, then?

12:41 KENNEDY: Yeah, yeah, exactly. And it's super easy to put in the query string, like page equals five, right? And then you just do `skip` and `limit`, or whatever the ORM has for the skip-and-take type of thing. So, super easy. You can compute it yourself, and it makes a big difference, right?

13:02 KENNEDY: Also, if you have long-running tasks, long-running things to do, either make them background tasks in other processes, or Celery or something, or, if the person making the call has to wait on it, be sure to use async, so you're not blocking up everything. Another super easy way to make things fast, and many of these things we're doing on Python Bytes and the other websites, is to turn on Gzip,
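The skip-and-limit arithmetic is small enough to sketch directly. This is a plain-Python illustration, assuming 1-based page numbers and a page size of 50; `page_bounds` and `get_page` are hypothetical helper names, and a real ORM would apply the same two numbers to the query instead of slicing a list.

```python
def page_bounds(page, page_size=50):
    """Turn a 1-based page number into (skip, limit) values for a query."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    return (page - 1) * page_size, page_size


def get_page(items, page, page_size=50):
    """Stand-in for an ORM query: skip the earlier pages, take one page."""
    skip, limit = page_bounds(page, page_size)
    return items[skip:skip + limit]


items = list(range(1000))      # pretend these are 1000 rows
fifth_page = get_page(items, 5)

print(page_bounds(5))          # skip and limit for ?page=5
print(fifth_page[0], fifth_page[-1])
```

Asking for a page past the end simply returns an empty list, which is usually what a paginated endpoint wants.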

13:29 KENNEDY: so you can just go to, like, NGINX or whatever your web server is and say

13:33 KENNEDY: Gzip the response. He's got a really simple example here where the response size of the page and the CSS and whatnot is 9x smaller just by adding the Gzip middleware to Django.

13:46 KENNEDY: I wouldn't actually add it to Django if this was me. I would add it to NGINX, because that's the outer shell web server. Just let it do it, and you don't have to... you're probably not talking directly to the server running Django anyway. Somewhere along the way, Gzip your content, 'cause that'll be big.
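To get a feel for why Gzip helps so much with HTML and CSS, here is a small sketch using the standard-library `gzip` module. It is not the Django middleware or an NGINX setting, just the same compression applied by hand to some made-up, repetitive markup; real pages will compress by different amounts.

```python
import gzip

# Hypothetical page body: markup is repetitive, which is why it compresses well.
html = ("<li class='episode'><a href='/episodes/show'>Episode title</a></li>\n" * 500)
body = html.encode("utf-8")

compressed = gzip.compress(body)
ratio = len(body) / len(compressed)

print(len(body), len(compressed), round(ratio, 1))

# The client gets the original bytes back after decompressing.
restored = gzip.decompress(compressed)
```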

14:04 KENNEDY: Similarly, minify your static files and bundle them and cache them and all of those good things, right? There are some cool libraries that he talked about in there. I think it was called `White Space`, that they're using in Django to minify and bundle the files. We don't use White Space and we don't use Django. We use `webassets` and `cssmin` and `jsmin`, which are three awesome Python libraries to bundle that stuff. So if you go and look at Python Bytes or Talk Python or all the other sites, you can see that there's a packed CSS and a packed JavaScript. There are probably 20 CSS files smushed into one with those things, and minified and whatnot, so that's pretty cool. There are two ways to measure page performance. One is, how fast is the server responding? But that's not the most important thing to the user. The most important thing is how it feels to them. So Google has this thing called PageSpeed, which they're even using for measuring your SEO ranking.

15:04 KENNEDY: So put your website into theirs. I have a link for Talk Python Training's ranking. I spent three days straight getting it from like 40 out of 100 to 99 or 100 out of 100, but it was quite the journey, so that took a while.

15:21 KENNEDY: You can measure it for both mobile and desktop, and it has slightly different rankings. Also, shrink your images with ImageOptim,

15:30 KENNEDY: which works on macOS and Linux; it doesn't work on Windows. But there are some really great options there to basically do completely lossless compression of your images. They might be 40 or 50% smaller, and visually you literally couldn't distinguish them.

15:45 OKKEN: Interesting. Yeah,

15:46 KENNEDY: Yeah. And then the last recommendation is to lazy-load your images. This is not something I've really explored, but apparently images in Google Chrome now support a lazy attribute.

15:58 OKKEN: Oh nice.

15:59 KENNEDY: Yeah, and then for things that don't support it, there's a lazyload JavaScript library. Basically, for your images, you say, here's

16:05 KENNEDY: as it scrolls into view, it'll download them. But if it's off the page and you never scroll, then it'll never load them.

16:10 OKKEN: That's great.

16:11 KENNEDY: Yeah, pretty clever. So this is just some of the things covered in that article. So if you're out there and you're like, I need to get my site to go faster, it cannot be three seconds per page load, that's ridiculous, start looking through some of these things. It'll really help, especially if you're using Django. But even if you're using some other Python framework, I think it'll still be quite relevant.

16:30 OKKEN: Yeah, most of it is relevant to any web stuff.

16:32 KENNEDY: Yeah. Yeah, they're super, super general. Some of the libraries they talk about are plug-ins to Django, so it's kind of a little extra boost if you're doing Django, but yeah, this is relevant to everyone. What you got next?

16:41 OKKEN: Well, this actually came in to us as a listener suggestion, from the author of the library.

16:46 KENNEDY: So this is like JIT podcasting, right?

16:48 OKKEN: Yeah. It just came in this morning, and I love it. It's from Konrad Hałas. I think it's called dacite, D-A-C-I-T-E. Maybe de-cite, da-site, day-site...

16:59 OKKEN: But it's cool. It simplifies the creation of dataclasses from dictionaries. When I first heard of it, I was thinking, okay, well, I love dataclasses, and I'm using them all the time now because I really like them. There are a lot of cool aspects to them. You can have default values. I really like that you can easily exclude some of the fields: you can take them out of the comparison, so some objects can be equal even if they're not completely equal, sort of thing. I love that aspect. And there's a whole bunch of other cool stuff about them, so I'm using them more and more. But the data all over that we get from databases and whatever often gets converted to dictionaries, not to dataclasses. So this is a little library that basically has one function, called `from_dict`, that converts dictionaries to dataclasses. And my first reaction was, I can already do that, if you do the `**` or these

17:54 KENNEDY: Like the dictionary to keyword argument type of thing.
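A short standard-library sketch of the dataclass features mentioned above: a default value, a field excluded from comparison, and the `**` dictionary-to-keyword-argument trick. The `Book` class and its fields are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Book:
    title: str
    pages: int = 0                                      # default value
    fetched_at: str = field(default="", compare=False)  # ignored by ==


# A flat dictionary whose keys match the field names exactly
row = {"title": "Example Book", "pages": 200, "fetched_at": "2020-03-18"}
book = Book(**row)  # dictionary expanded into keyword arguments

# Same title and pages, different fetched_at: still considered equal
same_book = Book(title="Example Book", pages=200, fetched_at="2020-03-19")
books_equal = book == same_book

# Omitted fields fall back to their defaults
minimal = Book(title="Example Book")

print(books_equal, minimal.pages)
```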

17:59 OKKEN: Yeah, I mean, you can do that for simple dataclasses and simple dictionaries. That works just fine. But I looked into this more, and this `from_dict` from dacite allows you to do nested structures, so you can have a dataclass with another dataclass field, and lists or tuples of dataclasses. For the types, you can do unions, collections, nested structures. It even has this thing called `type_hooks`, which allows you to have a custom converter for certain types of data that come in. His example is, for all the strings, lowercase them, or something like that. But you can definitely have that for certain types. It's pretty neat.

18:42 KENNEDY: Oh, that's cool. Or if you've got some kind of string that's a datetime, you parse it out of an ISO string.

18:48 OKKEN: Yeah, that's a good example. Actually, that's cool. So one of the things that messes you up in my example, of just taking a dictionary and expanding it as arguments to a dataclass constructor, is that it doesn't really work if all the names don't match up. But this one allows for that: if your dataclass only has a few fields but your dictionary has tons of stuff in there, by default it just ignores the stuff that doesn't match up. So if you've got a name and an ID, and there are names and IDs coming from the dictionary, but there's also a whole bunch of other things like a URL and stuff like that, it just ignores that. That's the default. But you can also turn on strict mode, which says no, I expect it to match up directly, and I want a warning. And then there's a whole bunch of exceptions that get raised if something goes wrong in the conversion. I'm just excited to use this, because it's a really cool tool to convert data to dataclasses. It's nice.

19:45 KENNEDY: Yeah, that's super nice. It's one of those things that seems to automate the crummy part of programming, right? Like, I'm getting this data submitted to me from an API, or from somebody calling my API, and who knows what they're sending me. But as long as this thing lines up, right? I say that these fields are not optional, or this type has to be such and such. If that works, all good. Otherwise, you go tell them `400: that didn't work`, or the file couldn't be loaded, or whatever it is.

20:11 OKKEN: And definitely, Konrad made a point in the documentation of saying that it is not a schema validation library. That's not the intent of it. It's really just intended for the conversion. So especially with external APIs, I think combining this with schema validation is a good idea. You could definitely go from schema validation to this, and having dataclasses in the end would be great.
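To make the "names don't match up" problem concrete, here is a standard-library sketch. It deliberately does not call dacite; the `User` class, the sample dictionary, and the hand-rolled key filter are assumptions for illustration, showing roughly the ignore-extra-keys default described above for the flat case only.

```python
from dataclasses import dataclass, fields


@dataclass
class User:
    name: str
    id: int


# Dictionary with more keys than the dataclass has fields
payload = {"name": "Ada", "id": 1, "url": "https://example.com/ada"}

# Plain ** expansion fails because 'url' is not a constructor argument
try:
    User(**payload)
    unpack_failed = False
except TypeError:
    unpack_failed = True

# Hand-rolled workaround: keep only the keys that are declared fields
known = {f.name for f in fields(User)}
user = User(**{key: value for key, value in payload.items() if key in known})

print(unpack_failed, user)
```

This filter only covers flat dictionaries; the nested dataclasses, unions, and per-type converters mentioned in the episode are where a dedicated library earns its keep.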

20:37 KENNEDY: Yeah, it's a cool project. And I love how it leverages the brand new Python stuff, the dataclasses.

20:42 OKKEN: Anyway, we should plug ourselves.

20:44 KENNEDY: Yeah, Well, we should definitely let people know about what we're doing. Right. So, uh, you've got this book on testing or something?

20:53 OKKEN: I actually kind of love that. I had some feedback early on when the book came out. Python Testing with pytest was the book that I'm talking about, and it came out in 2017, the end of 2017. I got some really great feedback from people saying they really loved following the book on this podcast. And I apologize for the lawnmower in the background, if it comes through.

21:14 OKKEN: I wanted to point out that I had a couple of people ask me: it came out in 2017, is it still valid? And I want to take the time to say yes, it is. The intent of the book was never to be a thorough, complete inventory of everything you could do with pytest. It was a quick look at the 80% of pytest that you're going to use all the time, and that is the core of pytest and how to think about it. There are new goodies that have been added since 2017, and it's good to check those out. But you could run with what's in this book and still be very productive.

21:48 KENNEDY: Nice. It's definitely made me more productive and better with pytest.

21:54 KENNEDY: I wanted to tell people about the courses that we have over at Talk Python Training. We've got a bunch of new ones we've been releasing. I do try to let you know when the new ones are out, but we've got, like, 120 hours of Python content over there. A bunch of projects that you can do: the 100 days of code courses all have projects for every single day for 100 days.

22:13 KENNEDY: And yeah, so just check them out. We're going to release a couple of courses coming soon, and I'll be sure to let you know, but yeah, support us by checking out our work, right?

22:23 OKKEN: Yeah. I want to tell people, one of the things I love about the Talk Python courses is there's a lot of content there, and I'm a busy person. Sometimes it's overwhelming to me to look at a course and see it's like 12 hours of content or something like that, or even six hours. However, with the way that you've got it set up, with the whole thing bookmarked into separate videos and different topics, the outlines of the courses are so incredible that if you really need to just jump to the right place to learn something, you can do that. You can just watch them in series and watch the whole thing, of course. But being able to jump around and go back and use it as a reference is a great thing. So thanks.

23:06 KENNEDY: Yeah, thanks. Yeah, we definitely worked hard on making that a possibility.

23:10 KENNEDY: Appreciate that. Now, do you know what the Python clock reads right now?

23:14 OKKEN: Oh, I haven't checked. What does it read?

23:18 KENNEDY: It reads `0000`, zero. The Python clock bell has tolled for the folks who have to convert. This next thing I want to share with everyone comes from LinkedIn and Barry Warsaw. Barry's been part of Python for a very long time, doing a lot of cool stuff there.

23:37 KENNEDY: And he was on the team that helped LinkedIn move from legacy Python to modern Python. Okay, yeah. So it's called How we retired Python 2 and improved developer happiness.

23:49 KENNEDY: So a couple years ago, in 2018, LinkedIn

23:53 KENNEDY: started working on this multi-quarter effort to transition to Python 3. Maybe some of the lessons from here will help people out there who haven't actually migrated all the way to Python 3. That'd be good, right? So basically, they said they did an inventory, and they found they had 550 code repositories

23:53 KENNEDY: they had to migrate.

23:53 KENNEDY: That's a lot of different projects, and some of them depend on the others.

23:53 KENNEDY: So they said, look, Python is not the thing powering our main web app. I think it's Java, I'm not 100% sure,

23:53 KENNEDY: but anyway, it's not their main thing. And so there's a bunch of independent microservices and tools and data science projects that are all using this.

23:53 KENNEDY: So their first pass at getting all those different things migrated was to say, we're gonna have a bilingual philosophy for Python, meaning it'll run on 2 and 3 at the same time. The main problem that you could run into is: I depend on a library (this is standard legacy Python), I depend on a library that requires Python 2, therefore everything I build that depends on that library must also be Python 2, right? So this bilingual thing that they did was to prevent that blockade. Anyone who wants to build new stuff on Python 3 could still use the libraries and do so. That was the plan. They actually had a whole team that oversaw this effort across projects, across like thousands of engineers, called the _Horizontal Initiatives Program_.

23:53 KENNEDY: So that was kind of across all these different projects, to address that. And then in phase one, first quarter 2019, they went and found the most important repositories, the ones that were at the bottom if you put them into a dependency graph. And they said, we're gonna port those to Python 3 first, because they're blocking everything else. And then they kind of finished it off in the second half of 2019. So they basically said, all right, now we've got the foundation done, we can start upgrading the libraries that depend on all these lower-level bits.

23:53 KENNEDY: And then, you know, they said, looking back (you'll like this part, Brian), our primary indicator for knowing that the migration was done, that we were all right, was that our builds passed and our tests ran and everything was okay. And then eventually they went through and said, all right, we're gonna turn off the ability to run Python 2 type of tests in continuous integration.

23:53 KENNEDY: Now, let's see what keeps working.

23:53 OKKEN: Oh, yeah, okay.

23:53 KENNEDY: Yeah. So one of the things you could imagine is important is having tests, right? Because if you don't have tests, CI/CD doesn't tell you a lot.

23:53 KENNEDY: It just does the CD part.

23:53 KENNEDY: For better or worse.

23:53 KENNEDY: Yeah. So they said, look, here are some guidelines for other organizations who are on a similar path but earlier. They said: plan early, and engage your organization's Python experts. Find and leverage champions in the affected teams, and help them promote the benefits of Python 3 to everyone.

23:53 KENNEDY: Adopt this bilingual approach, so people can at least begin if they want to go to Python 3.

23:53 KENNEDY: Invest in tests and test coverage, because these will be your best metrics of success. And then finally, ensure your data models explicitly deal with what used to be one thing, bytes and strings in Python 2, and is now, of course, two totally separate things. They said that was really the biggest challenge that they ran into, making that distinction correctly.
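A tiny Python 3 sketch of the bytes-versus-strings distinction mentioned above. The sample text is arbitrary; the point is that text and its encoded bytes are different types, have different lengths, and cannot be mixed without an explicit encode or decode step.

```python
text = "Zo\u00eb"             # a str: a sequence of Unicode code points
data = text.encode("utf-8")   # bytes: what actually goes over the wire or to disk

lengths = (len(text), len(data))  # the non-ASCII character takes two bytes in UTF-8

# Mixing the two types is an error in Python 3 instead of a silent coercion
try:
    text + data
    mixing_failed = False
except TypeError:
    mixing_failed = True

round_trip = data.decode("utf-8")  # explicit decode gets the text back

print(lengths, mixing_failed, round_trip == text)
```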

23:53 OKKEN: Yeah, those are a hurdle.

23:53 KENNEDY: Are you guys all upgraded?

23:53 OKKEN: Yeah. There was a library that we were using that didn't support Python 3 yet. The reasoning was, the library talks to a DLL that has,

23:53 OKKEN: you know, C++ strings or C strings, and old Python strings converted just fine. But they don't now.

23:53 KENNEDY: Unicode. Fancy ones. Yeah, not so easy.

23:53 KENNEDY: Cool. So, to wrap this up, they said the benefits they have from this whole process are that they no longer have to worry about supporting Python 2, and they have seen their support loads decrease, and decrease in a good way. You don't want to support the old crummy stuff.

23:53 KENNEDY: You can depend on the latest open source libraries. A lot of libraries these days only work with Python 3. And they opportunistically and enthusiastically adopted type hinting and mypy to improve overall quality, which is pretty cool.

23:53 OKKEN: Yeah, this is really cool.

23:53 KENNEDY: Yeah. I'm looking forward to this next one you've got.

23:53 OKKEN: This actually ties in nicely, because you brought up the Django speed-ups, and I probably should have talked about this right afterwards. But anyway, here we go. There was an article, and I'm not saying I agree or disagree, because I don't know enough about it, but the article was called The Troublesome Active Record Pattern.

23:53 OKKEN: And I guess in, you know, Ruby and stuff, they talk about Active Record more, I think. But in the Python world, it's the object-relational mappers, ORMs, like the Django ORM, or SQLAlchemy is also an ORM, and those are essentially the same as Active Record. I think it's the same pattern, right?

23:53 KENNEDY: Well, certainly the Django ORM follows that pattern. SQLAlchemy has a lot of similarities, but its design pattern is technically called Unit of Work. (Okay.) The main variation is that in Django or things like that, you go to the object and you call `save()`, so that happens on the individual objects, whereas in SQLAlchemy, you make a bunch of changes, and then there's this unit of work thing, and you call `save()` and it submits all the changes in one giant batch.

23:53 KENNEDY: But here's the interesting thing: this whole article is The Troublesome Active Record Pattern, but my reading of it really was The Troublesome ORM Pattern. So for the most part, it's kind of an immaterial distinction, although technically, design-pattern-wise, they're not exactly the same.

23:53 OKKEN: Okay, well, yeah. So the idea, like you just brought up, is that when you're referencing a bunch of objects and you have `objects.save()` and things like that, there's a whole bunch of issues with that. One of the issues is if you want to query things about the data, not necessarily all the data. If you've got a bunch of books, for example, and you just want to count the number of books, well, you might have to just retrieve them all. Or, if you want to count all of the software testing books written by Oregon authors, you'd have to just ask me, or you'd have to grab all of them, grab all the data, and then search in Python, looking for stuff in a for-loop or something. The other problem was around transactions, because if I have a book item, change something about it, and then save it back in, there's nothing stopping some other process from changing it in between. You know, read, modify, write doesn't work that well if you've got multiple readers and writers. And I was looking this up: SQLAlchemy has sessions, or, you said, there's a unit of work thing. I don't know if those are atomic.

23:53 KENNEDY: Yeah, they're the same. Yeah.
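
The counting problem described above can be sketched with plain `sqlite3`. The `book` table, its columns, and the sample rows are made up for illustration; the point is only the difference between filtering in a Python for-loop and letting the database do the counting.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE book ("
    "id INTEGER PRIMARY KEY, title TEXT, topic TEXT, author_state TEXT)"
)
conn.executemany(
    "INSERT INTO book (title, topic, author_state) VALUES (?, ?, ?)",
    [
        ("Book A", "testing", "OR"),
        ("Book B", "testing", "WA"),
        ("Book C", "web", "OR"),
    ],
)

# Naive approach: pull every row into Python, then filter in a for-loop.
all_rows = conn.execute("SELECT * FROM book").fetchall()
count_in_python = 0
for _id, _title, topic, author_state in all_rows:
    if topic == "testing" and author_state == "OR":
        count_in_python += 1

# Letting the database count: only a single number comes back.
count_in_sql = conn.execute(
    "SELECT COUNT(*) FROM book WHERE topic = ? AND author_state = ?",
    ("testing", "OR"),
).fetchone()[0]
```

Both approaches give the same answer, but the first one transfers and deserializes every row first, which is where the cost shows up on a large table.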

23:53 OKKEN: Okay. Django has an atomic setting, but I don't know if that's on by default or if you have to specifically say to work with transactions. I did notice that some of the Django documentation does say that transactions slow things down, so you don't want to do transactions if you're just reading, for instance. And then the author of the article, Cal Paterson, mentions that REST APIs often have the same problems, and some microservice architectures have a similar sort of issue. It's just around REST APIs instead of the object model: reading tons of data when you don't need to. He brought up some solutions, at least four. You can just directly use SQL, or use some properties that do queries that are more like SQL, and doing transactions helps too. But basically he was recommending avoiding the active record style access patterns. Around the REST APIs, he brought up that GraphQL and RPC-style APIs are some solutions to the same problem in REST APIs. As somebody that's moving towards learning more about web development and working with ORMs, I really did want to bring this up and find out what you thought of all of this.

23:53 KENNEDY: Sure. It does seem that there are a lot of good, valid points Cal is making here. I feel like the focus should almost be, instead of the troublesome active record pattern, "you're using your ORM wrong; learn how to use it right." So let me give you some examples. One of the challenges here that we see is,

23:53 KENNEDY: if you're going to create a record and you want to get it back, you have to get it back by the primary key, maybe, if you're doing exactly just the straight active record pattern. But you can just do a query, like, give me the first item or something like that. There's a part where he's looping over stuff, looping back to just get the ISBN off these things, but you're pulling all these properties. You're basically doing a `SELECT * FROM table`, and the serialization of that result, just to ultimately get the ISBN. Well, I don't know the Django ORM well enough, but in SQLAlchemy you could say: only return these columns. I want just the ID and the title, or just the ID and the ISBN. Don't return the other results, right? So that's an option. The N + 1 thing we already discussed, right? You just use the subquery load, or `prefetch_related`, `select_related`, or whatever it is for Django, and you can avoid those. So as you go through these, it's like, okay, most of the time these problems are actually solved by some aspect of a proper ORM. Now the transaction one is really, I think, super interesting, because it often gets to the heart of this debate about ORMs.
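
As a rough illustration of the two fixes mentioned above (asking only for the columns you need, and avoiding N + 1 queries), here is a stdlib-only sketch. The schema, the sample rows, and the `run` helper that counts round trips are all assumptions made for the example; an ORM would generate similar SQL through its own column-selection and eager-loading options.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book (
        id INTEGER PRIMARY KEY,
        isbn TEXT,
        title TEXT,
        description TEXT,
        author_id INTEGER REFERENCES author(id)
    );
    INSERT INTO author (id, name) VALUES (1, 'Author One'), (2, 'Author Two');
    INSERT INTO book (isbn, title, description, author_id) VALUES
        ('isbn-1', 'Book A', 'long text...', 1),
        ('isbn-2', 'Book B', 'long text...', 2),
        ('isbn-3', 'Book C', 'long text...', 1);
    """
)

stats = {"queries": 0}


def run(sql, params=()):
    """Execute a query and count how many round trips we make."""
    stats["queries"] += 1
    return conn.execute(sql, params).fetchall()


# N + 1 style: one query for all book columns, then one more query per book.
n_plus_one = []
for _id, isbn, _title, _description, author_id in run("SELECT * FROM book ORDER BY id"):
    (author_name,) = run("SELECT name FROM author WHERE id = ?", (author_id,))[0]
    n_plus_one.append((isbn, author_name))
queries_n_plus_one = stats["queries"]

# Single query: join once and select only the two columns we actually use.
stats["queries"] = 0
joined = run(
    "SELECT book.isbn, author.name FROM book "
    "JOIN author ON author.id = book.author_id ORDER BY book.id"
)
queries_joined = stats["queries"]
```

With three books, the first version issues four queries and drags the unused `title` and `description` columns along; the second gets the same pairs in one query.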

23:53 KENNEDY: If you're saying, okay, here's this active record thing where it's not really leveraging transactions; we know transactions are good, and so this is bad because it doesn't do that. But in practice, it's not so clean as that. So, for example, suppose I'm working on a web app and I have a grid that maybe was loaded off of a REST endpoint. I've got this grid and I can type in it, and there's a button that says save. There's no way that it makes sense to do a transaction around that, right? I'm not going to transactionally begin loading the grid and wait for me to press save. That's going to lock up the database for every user. Any scenario like that, like REST endpoints, right? If I've got a phone and my mobile app hits the REST endpoint, pulls down the data, and I type on it and hit save, you can't do that transactionally. You would lock up the site right away. It doesn't make any sense.

23:53 KENNEDY: So there are just other patterns. Optimistic concurrency is a super common pattern in ORMs that would work with active record or SQLAlchemy's unit of work beautifully. The idea is, I'm going to put some kind of version in that record. I'm going to pull it back, and it's going to come with the version that I got. When you hit save, you say: update this record where the version is the version I have. So if someone else has updated it, it has incremented that version, and it says, no, no, there's no record. You can't update this.

23:53 KENNEDY: Right, so you basically say, ah, looks like someone changed this behind you; your grid and their grid, they hit save before you. So you've got to deal with syncing this back up, right? So there are a lot of times where it would feel great to have a transaction, but the transaction actually can't be used anyway. And ORMs have nice, built-in ways that you can easily slot in things like optimistic concurrency and stuff.
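
The optimistic concurrency idea described above can be sketched in a few lines of `sqlite3`. The `book` table, its `version` column, and the `load` and `save` helpers are hypothetical names chosen for the example, not part of any ORM.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT, version INTEGER NOT NULL)"
)
conn.execute("INSERT INTO book (id, title, version) VALUES (1, 'Draft title', 1)")
conn.commit()


def load(book_id):
    """Read the row together with the version we saw."""
    return conn.execute(
        "SELECT title, version FROM book WHERE id = ?", (book_id,)
    ).fetchone()


def save(book_id, new_title, seen_version):
    """Update only if nobody has bumped the version since we loaded."""
    with conn:
        cur = conn.execute(
            "UPDATE book SET title = ?, version = version + 1 "
            "WHERE id = ? AND version = ?",
            (new_title, book_id, seen_version),
        )
    # rowcount of 0 means the WHERE clause matched nothing: a stale write.
    return cur.rowcount == 1


# Two users load the same row into their grids; both see version 1.
_, first_user_version = load(1)
_, second_user_version = load(1)

first_saved = save(1, "First user's title", first_user_version)     # bumps to version 2
second_saved = save(1, "Second user's title", second_user_version)  # stale, rejected

final_title, final_version = load(1)
```

No lock is held while either user is editing. The second save simply matches no row, and the application can then tell that user to reload and merge.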

23:53 KENNEDY: So that's my thought. I think this is an interesting article. It's definitely interesting to think about all the points brought up, but I often think that the tools have, like, clever, not obvious ways to solve most problems.

23:53 OKKEN: Yeah, and I've got to be a little bit on Cal's side here, in that the tools have clever, non-obvious ways to deal with them.

23:53 OKKEN: Maybe that's an issue, that all of our beginning tutorials on how to use Django or how to use SQLAlchemy or how to use other ORMs are just ignoring that stuff because it's more advanced. But people often just read the beginning tutorial, and then they'll do a startup or something.

23:53 KENNEDY: Yeah, sure. And then you end up with your page loading like in six seconds, and you don't know why? Which is not great.

23:53 OKKEN: Maybe we could teach people the right way to do it from the beginning.

23:53 KENNEDY: I do wish that some of these patterns were more built in. Like, I wish optimistic concurrency was there by default in the ORMs, and you kind of have to roll it yourself, and whatnot. So anyway, it's a really interesting article to think about. And I think it dovetails nicely with my sort of performance one as well, because they're kind of two sides of the same coin a bit there.

23:53 KENNEDY: Yeah.

23:53 KENNEDY: All right, well, I have the second side to your coin. That is the dacite one, or whatever that one was called.

23:53 KENNEDY: Yeah. So this is a cool thing by Steve Dare called "Types at the edges of Python". And so Steve apparently creates a bunch of APIs. I think he was using FastAPI at the time when he was talking about all these ideas, but it's just kind of generally valid for all of them. He says, look, when I create a new API these days, I start with three things: Pydantic, MyPy, and some kind of error tracking like Rollbar or Sentry or something like that. That's pretty nice, right?

23:53 KENNEDY: So Pydantic is a data translation and validation library, much like dacite, right?

23:53 KENNEDY: They're not the same, but they kind of play in the same realm. They transform JSON with validation and type checking over there. And then there's MyPy.

23:53 KENNEDY: You can use Pydantic to help specify some of the types on your classes and then use MyPy to verify that you're not missing some kind of check. Dare says, look, the most common error you're going to run into as a Python developer in general is `AttributeError: 'NoneType' object has no attribute 'X'`, where X is whatever you're trying to do, right?

23:53 KENNEDY: Yeah, and I mean, that just means you got `None` instead of a value, and you're trying to continue to work with that class in some way.

23:53 OKKEN: It's a null pointer dereference in C.

23:53 KENNEDY: Yes, exactly. So wouldn't it be nice if it said `None is not an allowed value for this` or `you have None, and you can no longer operate on it` or something like that.

23:53 KENNEDY: So Pydantic will actually give you those types of errors. It will convert things like attribute errors and mismatched type errors to explain what was wrong, right? So that's pretty awesome. And so you can use Pydantic to actually specify what your understanding of the interface is. Like, if you're calling an API, the stuff that you expect to get back: you say, I think this is going to be a date, I think this is an optional string, and whatnot. He says, then when you launch the code into production, your assumptions are tested against reality.
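
The idea of checking assumptions at the edge can be sketched without any third-party package. To be clear, this is not Pydantic's API: `Release`, `parse_release`, and `EdgeValidationError` are hypothetical names, and Pydantic does this kind of checking for you from the type annotations alone. The sketch only shows what "validate at the boundary and explain what was wrong" looks like.

```python
from dataclasses import dataclass
from datetime import date
from typing import Any, Mapping, Optional


class EdgeValidationError(ValueError):
    """Raised when incoming data does not match what we assumed."""


@dataclass
class Release:
    name: str
    released: date
    notes: Optional[str] = None


def parse_release(payload: Mapping[str, Any]) -> Release:
    """Check the payload once, at the edge, and report problems in plain words."""
    name = payload.get("name")
    if name is None:
        raise EdgeValidationError("name: None is not an allowed value")
    if not isinstance(name, str):
        raise EdgeValidationError(f"name: expected str, got {type(name).__name__}")

    raw_date = payload.get("released")
    if not isinstance(raw_date, str):
        raise EdgeValidationError("released: expected an ISO date string")
    try:
        released = date.fromisoformat(raw_date)
    except ValueError:
        raise EdgeValidationError(f"released: {raw_date!r} is not a valid date") from None

    notes = payload.get("notes")
    if notes is not None and not isinstance(notes, str):
        raise EdgeValidationError("notes: expected a string or None")

    return Release(name=name, released=released, notes=notes)


good = parse_release({"name": "example", "released": "2020-03-18"})

try:
    parse_release({"name": None, "released": "2020-03-18"})
except EdgeValidationError as exc:
    error_message = str(exc)
```

The bad payload fails immediately with a message naming the field, instead of surfacing later as an `AttributeError` somewhere deep in the code.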

23:53 KENNEDY: That's pretty cool. If you're lucky, they turn out to be correct. But if not, you're going to run into some of these NoneType errors, and Pydantic can help with that. But then also, once you put the typing into your code, MyPy will go on helping. So, for example, if you're taking an argument and at first you think it's a string, you say `: str` for its type, then you go work with it, and that means it cannot be `None`, right? Nullability is explicitly set in the type hints in Python, in the type space. So if you find out that it could be `None`, you're going to go and say this is a `typing.Optional[str]`, right? That's what it's got to be if it could be `None` or a string. You would find that out and then specify that in Pydantic. And then, if you run MyPy against it and you start working with the optional string without checking for it to be `None` first,

23:53 KENNEDY: MyPy will actually give you an error saying that you're not checking for `None`. Basically, it'll even tell you about the missing if statements or other conditional code that verify that, no, it's not the optional `None`, it's actually the value.
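
A small sketch of the `str` versus `Optional[str]` distinction just described. The function names are made up for the example; the comments describe what a static type checker such as MyPy is expected to report, which is separate from what happens at runtime.

```python
from typing import Optional


def shout(first: str) -> str:
    # The annotation says callers never pass None, so no check is needed.
    return first.upper()


def shout_maybe(first: Optional[str]) -> str:
    # Without the check below, a type checker would flag `first.upper()`,
    # because `first` might be None at this point.
    if first is None:
        return ""
    # After the check, the type is narrowed to plain `str`.
    return first.upper()


loud = shout_maybe("hello")
empty = shout_maybe(None)
```

Deleting the `if first is None` branch would still run fine for string inputs, which is exactly why the static check is useful: it catches the missing branch before a `None` shows up in production.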

23:53 KENNEDY: Okay, that's pretty cool right there.

23:53 OKKEN: It's tripped me up for sure.

23:53 KENNEDY: Yeah, for sure. And I mean, normally that check is just not present. And it's not because Python is a dynamic language; C++ would have the same problem, right? If you take a pointer and you just start to work with it in C or C++, the compiler's not going to say you didn't check that for being equal to null first. It just doesn't do that. (Right.)

23:53 KENNEDY: So this is a really awesome addition for safety in your code. He was talking about how FastAPI automatically integrates with Pydantic out of the box, which is pretty cool. And then also at the end, he has a kata, a mini-kata, that works you through these ideas. So a kata is like a practice exercise to play with these typing ideas.

23:53 OKKEN: Yeah, and a nice picture of how these all fit in.

23:53 KENNEDY: Yeah, there's a cool diagram. Anyway, if you're building APIs and you're taking data, especially from sources where they might give you junk when you expected something valuable, or you're not really sure, like, the docs say this, but I remember getting something different some other time, this is a really cool way to formalize that and then have your code automatically check it.

23:53 OKKEN: Yeah, this is cool.

23:53 KENNEDY: Yeah, awesome.

23:53 OKKEN: That's all of our six items. Do you have any extra little things to share?

23:53 KENNEDY: Well, I kind of went overboard on the extras this week, but I'll keep them all quick. There's a bunch of cool stuff out there that people sent in. First, Jack McKew, you did a really cool thing. So Jack McKew created a blog post or page on a site called the Python Bytes Awesome Package List. Have you seen this?

23:53 OKKEN: Yeah. And he, like, listened to 171 episodes in 174 days or something like that of Python Bytes.

23:53 KENNEDY: I mean, this is awesome, because as I flip through this, there's a couple of things I forgot. I'm like, Oh, that's cool. Oh, we must have talked about that. But I don't remember. It's got beautiful pictures. It's I mean, it's kind of an awesome list, but it's for a podcast. So that is super cool,

23:53 KENNEDY: Jack. Thank you. Thank you. We shared a link to it at the end, and I hope you keep adding to it. That would be great, but no pressure.

23:53 KENNEDY: I want to talk about VB.NET for a second.

23:53 KENNEDY: Because I kind of appreciated VB back in the early days, when it was like drag-and-drop VB6 and whatnot. And then Microsoft came up with this thing called Visual Basic .NET, and it was complete crap. Didn't like it. But here's what's interesting: they have just announced that they're no longer evolving it. They'll keep that thing running, but they will no longer work on it. I just thought it was interesting: here's a fairly major language, not super top five or something, but kind of a major language, that's declared dead. And I just thought it was kind of interesting to point out that languages, they can go dead.

23:53 KENNEDY: It's weird.

23:53 OKKEN: Yeah, I think this one should have been shot a long time ago, but

23:53 KENNEDY: It's also worth thinking about this. I agree, by the way; it should have never existed. But anyway, that's a different story. It's also an interesting take on, like, here's a language controlled by a single company, and they could just decide they don't like it any more. Right? Like, this wouldn't really happen to Python, because there's not a single person or organization that could say, we're done.

23:53 OKKEN: Well, that's actually one of the fears I have for, I mean, even Java. Java is not controlled by one company, but it kind of is.

23:53 KENNEDY: Yeah. Well, And there's also that, um, Supreme Court case or the legal case of like, are you allowed to copy the Java API.

23:53 KENNEDY: I don't think that's resolved yet. I can't remember. It's still working its way through the courts.

23:53 OKKEN: I want to reiterate: people that actually have a job in Visual Basic or love it, I'm not dissing you. I just had a personally bad experience with Visual Basic and didn't enjoy it. So

23:53 KENNEDY: I had a good experience with Visual Basic 5. But that was in, like, 1993 or something.

23:53 KENNEDY: Okay, so also, we talked about COVID-19 all the crazy stuff going on.

23:53 KENNEDY: As tragic as much of it is, there's some really interesting data science that can be done and some dashboards that can be built and whatnot. So someone on Twitter, let me pull up their name, just pointed to a whole bunch of COVID-19 data sets. BeeKeep? I'm going to call that BeeKeep. They put that on Twitter. So check that out: the Johns Hopkins CSSE data set, some other dashboards, and some things on Kaggle. So if you're in data science and you want to explore, there are some data sets there that are probably interesting.

23:53 KENNEDY: Then finally, I'm working on a new course, "Adding a CMS to Your Data Driven Web App". That will be a lot of fun. Yeah, I'll talk more about that later, but I'm super excited to be creating more courses, like we kind of talked about earlier.

23:53 OKKEN: Yeah, one of the things we talked about is people working from home and getting around technical problems with that. That happened to me just this morning. So this morning, I tried to hook up. I realized that I had an external keyboard that's working fine-ish. I wanted to use, like, a real mouse. So I plugged in an external mouse with a little click wheel thing on it and realized that on Apple, the click wheel behavior just goes the wrong direction for scrolling. It confused me, and you can reverse it, but I didn't want my trackpad to be reversed. The trackpad's fine, so they're tied together for some reason. Weird. So, Dave Forjack, sorry, Dave, he suggested I use something called _Scroll Reverser_, which is a little tiny app that allows you to untie those and have trackpad scrolling and mouse scrolling be different. And thank you, Dave.

23:53 KENNEDY: That's awesome.

23:53 KENNEDY: That's super cool. I guess my work from home thing that I've been playing with is, with Zoom, you can have virtual backgrounds.

23:53 KENNEDY: You don't have to have a green screen. You can have alternate backgrounds by uploading an image, and it'll put you in, you know, an office space instead of a messy bedroom or whatever it is.

23:53 OKKEN: So nice. Yeah. So you can block out the kids behind us, stuff like that?

23:53 KENNEDY: Yeah, exactly. You don't have to see the kids being crazy, home from school and whatnot. Anyway, yeah, a lot of stuff we're learning around those little sorts of things. And I think the joke that I chose for us this week is going to be perfect for the documentation as building community item that you brought up.

23:53 KENNEDY: This is before that person gets inspired from listening to you and actually makes things better. All right, so let me let me set the stage here.

23:53 KENNEDY: There's three people, two of them clearly more senior, and a very excited new person sitting at a laptop, beaming with enthusiasm, ready to get going on the whole project. And one of the senior persons says to the other, "And this is Jim, our new developer." The other one says, "Great. Does he already know something about our system?"

23:53 KENNEDY: The new person turns around: "I read the whole documentation."

23:53 KENNEDY: Blank looks between the senior people.

23:53 KENNEDY: People: No.

23:53 KENNEDY: That's good right?

23:53 OKKEN: Yeah, definitely. I started a job once in my career where I had read the documentation, because it was an internal job transfer. I read the documentation before getting there, and the people there didn't know they had documentation.

23:53 OKKEN: So it was so out of date. Nobody knew it existed.

23:53 KENNEDY: Maybe a little out of date. If they don't even know it exists.

23:53 KENNEDY: Yeah. All right. Well, thanks a lot. You bet.

23:53 KENNEDY: Great to be here with you. As always.

23:53 OKKEN: Thank you for listening to Python Bytes. Follow the show on Twitter at @PythonBytes. That's Python Bytes as in B-Y-T-E-S. And get the full show notes at pythonbytes.fm. If you have a news item you want featured, just visit pythonbytes.fm and send it our way. We're always on the lookout for sharing something cool. This is Brian Okken, and on behalf of myself and Michael Kennedy, thank you for listening and sharing this podcast with your friends and colleagues.
