Brought to you by Michael and Brian - take a Talk Python course or get Brian's pytest book


Transcript #193: Break out the Django testing toolbox

Recorded on Wednesday, Jul 29, 2020.

00:00 Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds.

00:05 This is episode 193, recorded July 29th, 2020. I'm Michael Kennedy.

00:10 And I am Brian Okken.

00:12 And we've got a bunch of great stuff to tell you about. This episode is brought to you by us.

00:16 We will share that information with you later. But for now, I want to talk about something that

00:23 I actually saw today. And I think you're going to bring this up as well, Brian. I'm gonna let

00:27 you talk about it. But something I ran into updating my servers with a big warning in red when I pip

00:32 installed some stuff, saying my dependencies are inconsistent and pip is not going to work with that

00:37 soon. So be ready. And of course, that just results in frustration for me because, you know,

00:43 Dependabot tells me I need these versions, but some things require different ones. Anyway, long story,

00:47 you tell us about it. Yeah, okay. So I was curious. I haven't actually seen this yet. So I'm glad that

00:52 you've seen it so that you have some experience. This was brought up to us by Matthew Feickert. And he

00:59 says he was running pip and he got this warning and it's all in red. So I'm gonna have to squint to read

01:04 this. Says, after October 2020, you may experience errors when installing or updating packages. This is

01:12 because pip will change the way it resolves dependency conflicts. We recommend that you use

01:18 --use-feature=2020-resolver to test your packages. It shows up as an error, and I think it's just so that people

01:24 actually read it. But I don't know if it's a real error or not. It still works fine, but it's going to be an error

01:30 eventually. Okay, so this is not a "do not adjust your sets" situation. Actually, do adjust your sets.

01:35 What you need to be aware of is the changes. I think we've covered it before,

01:40 but we've got a link in the show notes to the pip dependency resolver changes. And these are good

01:47 things. But one of the things that Matthew pointed out, which is great, and we're also going to link to

01:52 an article where he discusses how his problem showed up with this. And it's around projects

01:59 that use... some people use Poetry and other tools, and I can't remember the other one,

02:03 that do things like lock files and stuff. But a lot of people just do that kind of manually. And what

02:10 you often do is you have your original set of requirements that are just the handful

02:16 of things that you immediately depend on, with no versions or with minimal version rules around

02:23 them. And you say, go install this stuff. Well, that actually ends up installing a whole bunch:

02:29 all of your immediate dependencies, all of their dependencies and everything. So if you want to

02:34 lock that down, so that you're only installing the same things again and again, you say pip freeze

02:39 and then pipe that to a lock file. And then you can use that. I guess it's a common pattern.

02:46 It's not the same as Pipenv's lock file and stuff, but it can be similar anyway. And then if you use

02:52 that and pip install from that, everything should be fine. You're going to install those dependencies.

02:56 The problem is if you don't use the 2020 resolver feature to generate your lock file,

03:05 then if you do use it to install from your lock file, there may be incompatibilities with those.

03:12 So the resolver... actually, there are good things going on here, having pip do

03:17 resolution better. But the quick note we want to say is don't panic when you see that red thing;

03:22 you should just try --use-feature=2020-resolver. But if you're using a lock file, use it for the

03:28 whole process, use the new resolver to generate your original lock file from your original stuff,

03:34 and then use it when you're installing the requirements lock file. There's also information

03:39 on the PyPA website. They want to know if there are issues. It's available, but there are

03:46 still maybe some kinks, though I think it's pretty solid. Not enforced yet, but in the warning days. Yeah. And I

03:52 kind of actually really like this way of rolling out a new feature and a behavior change:

03:57 have it be available as a flag so that you can test it in, not a pre-release,

04:04 but an actual release, and then change the default behavior later. So the reason why we're

04:10 bringing this up is October is not that far away. And October is the date when that's going to change

04:17 to not just the flag behavior, but the default behavior. So yes, go out and make sure these things

04:22 are happening. And if you completely ignore us when things break in October, the reason is probably

04:28 that you need to regenerate your lock file. Yep. So in principle, I'm all for this. This is a great idea.
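
For reference, the workflow Brian is describing boils down to a few commands. This is just a sketch; the file names requirements.in and requirements.lock are a common convention, not anything pip mandates:

```
# Install your top-level dependencies with the new resolver enabled
pip install --use-feature=2020-resolver -r requirements.in

# Freeze the fully resolved set of packages into a lock file
pip freeze > requirements.lock

# Later (on a server, in CI, etc.) install from the lock file,
# again with the new resolver, so both steps agree
pip install --use-feature=2020-resolver -r requirements.lock
```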

04:34 It's going to make sure that things are consistent by looking at the dependencies of your libraries.

04:40 However, two things that are driving me bonkers right now are systems like Dependabot or PyUp,

04:50 which are critically important for making sure that your web apps get updated with say like security

04:57 patches and stuff. Right. So you would do this like a, you know, pip freeze your dependencies,

05:02 and then it has the version. What if say you're using Django and there's a security release around

05:09 something in there, right? Unless you know to update that, it's always just going to install the one that

05:14 you started with. So you want to use a system like Dependabot or PyUp where it's going to look at

05:19 your requirements. It's going to say, these are out of date. Let's update them. Here's the new one.

05:24 However, those systems don't look at the entirety of what it potentially could set them to. It says,

05:31 okay, you're using doc opt. There's 0.16 a doc opt. Oh, except for the thing that is before it

05:37 actually requires doc opt 14 or it's incompatible. And as in incompatible as pip will not install that

05:44 requirements.txt any longer. But those systems still say, great, let's upgrade it. And you're like in

05:52 this battle of those things or like upgrading it. And then like the older libraries are not

05:59 upgrading or you get two libraries. One requires doc opt 16 or above. One requires doc opt 14 or lower.

06:05 You just can no longer use those libraries together. Now it probably doesn't actually matter. Like the

06:10 feature you're using probably is compatible with both, but you won't be able to install it anymore. And

06:14 my hope is what this means is the people that have these weird old dependencies will either loosen the

06:22 requirements on their dependency structure. Like we're talking about, right? Like this thing uses this

06:27 older version or it's got to be a new version, or update it or something, because

06:33 there are going to be packages that are declared incompatible that are not actually incompatible because of this.

06:38 Yeah. Interesting. Yes. Painful. I don't know what to do about it, but it's like literally this

06:43 morning I ran into this and I had to go back and undo what Dependabot was trying to do for me

06:48 because certain things were no longer working, right? Or something like that. Interesting. Yeah. So

06:53 does Dependabot... Dependabot? Dependabot, yeah. That's the thing that GitHub acquired that

06:58 basically looks at your various package listings and says, there's a new version of this. Let's pin it to

07:03 a higher version. And it comes as a PR. Okay. That was my question. It comes as a PR. So if you

07:08 had testing around in a CI like environment or something, it could catch it before it went

07:14 through. Yes. You'll still get the PR. It'll still be in your GitHub repo, but the CI

07:19 presumably would fail because the pip install step would fail. And then it would just know that it

07:25 couldn't auto-merge it. But still, it's like you're constantly trying to

07:32 push the tide back, because you're like, stop doing this. It's driving me crazy. And

07:38 there are certain ways to limit it, but then you're just forcing it to certain boundaries.

07:42 But anyway, it's going to make some of these things a little bit more complicated.

07:46 Hopefully it considers this. Well, maybe Dependabot can update to do this.

07:50 Wouldn't that be great. Yep. That would be great. Well, speaking of packages, the way you use packages

07:57 is you import them once you've installed them, right? Yes. So Brandon Branner was talking on

08:03 Twitter with me saying, I have some imports that are slow. How can I figure out what's

08:08 going on here? And this led me over to something we may have covered a long time ago. I don't think so,

08:15 but possibly. It's called import-profiler. You know this? No, this is cool. Yeah. So one of the things

08:21 that can actually be legitimately slow about Python startup code is actually the imports. So for example,

08:30 like if you import requests, it might be importing like a ton of different things,

08:35 standard library modules, as well as external packages, which are then themselves importing

08:40 standard library modules, et cetera, et cetera. Right. So you might want to know what was slow and what's

08:45 not. And it's not just like a C include; imports actually run code.

08:52 Yes, exactly. It's not something happening at compile time. It's happening at runtime.

08:57 So every time you start your app, it goes through and says, okay, we're going

09:01 to execute the code that defines the functions and the methods and potentially other code as

09:06 well. Who knows what else is going on? So there's a non-trivial amount of time to be spent doing that

09:13 kind of stuff. For example, I believe it takes like half a second to import requests, just requests.

09:20 Interesting.

09:21 I mean, obviously that depends on the system, right? You do it on MicroPython versus on like a

09:25 supercomputer, the time's going to vary, but nonetheless, like there's a non-trivial amount

09:30 of time because of what's happening. So there's this cool thing called import-profiler, which all you

09:34 got to do is say from import_profiler import profile_import. Woo. Say that a bunch of times fast.

09:43 Written, it's fine. Spoken, it's funky. But then you just create a context manager around your import

09:49 statements. You say with profile_import() as context, then all your imports, and then you can print out, you

09:55 say context.print_info() and you get a profile status report.
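
In code, the pattern Michael is describing looks roughly like this (a sketch matching the import-profiler README; requests is just an example of something to time):

```python
from import_profiler import profile_import

# Wrap the imports you care about in the profiling context manager
with profile_import() as context:
    import requests  # noqa: F401 -- any import(s) you want to time

# Print a report of which modules were imported and how long each took
context.print_info()
```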

09:59 That's cool.

10:00 Now I included a little tiny example of this for requests and what was coming out of it. If you look

10:05 at the documentation, it's actually much longer. So I'm looking here, I would say just, you know,

10:11 eyeballing it. There's probably 30 different modules being imported when you say import requests.

10:17 That's non-trivial, right? That's a lot of stuff. So this will give you that output. It'll say

10:23 here, this module imported this module, and then it has like a hierarchy or a tree type of thing. So

10:29 this module imported this module, which imported those other two. And so you can sort of see the

10:34 chain or a tree of like, if I'm importing this, here's the whole bunch of other stuff it takes with

10:39 it. Okay. Yeah. And it gives you the overall time. I think maybe the time dedicated to just

10:46 that operation and then the inclusive time or something. Actually, maybe it looks more like

10:51 83 milliseconds. Sorry, I had my units wrong instead of half a second, but nonetheless, it's like,

10:55 you know, you have a bunch of imports in your running code. Where is that slow?

10:59 You can run this and it basically takes three lines of code to figure out how much time

11:05 each part of that entire import stack takes. I want to say call stack of that execution, but it's

11:11 the series of imports that happen. Like you time that whole thing and look at it. So yeah,

11:16 that's, it's pretty cool.

11:16 That's neat. And also, I mean, there are times where you really want to get startup

11:22 time for something as fast as possible. And part of it is that the things you're

11:28 importing at your startup are sometimes non-trivial when you have something that you really want to

11:34 run fast.

11:35 Right. Like let's say you're spending half a second on startup time because of the imports.

11:39 You might be able to take the slowest part of those and import that in a function that gets called.

11:45 Right. So.

11:46 Yeah.

11:47 Import it later.

11:48 Yes.

11:48 You only pay for it if you're going to go down that branch, because maybe you're not going to

11:52 call that part of the operation or like that part of the CLI or whatever.
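
That deferred-import idea might look something like this. A sketch: the function is made up, and matplotlib is just a commonly cited example of a package that is slow to import:

```python
def plot_results(values):
    # Deferred import: only pay the import cost if the user actually
    # asks for a plot, instead of at application startup.
    import matplotlib.pyplot as plt

    plt.plot(values)
    plt.show()
```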

11:56 Yeah. And it's definitely one of those fine-tuning things that you want to make sure you don't do

12:00 too early, but for people packaging and supporting large projects, I think it's a good idea to

12:07 pay attention to your import time. Like, it'd be something that would be kind of

12:11 fun to throw in a test case for CI to make sure that your import time doesn't suddenly

12:16 go slower because something you depend on suddenly got slower or something like that.

12:21 Yeah. Yeah. Absolutely. And you don't necessarily know, because the thing it depends upon

12:26 changed, right? It's not even the thing you actually depend upon, right?

12:29 It could be very far down the line. Yeah. And maybe you're like, we're going to use this other

12:33 library. We barely use it, but you know, we already have dependencies. Why not just throw this one in?

12:38 Oh wait, that's adding a quarter of a second. We could just vendor that one file, you know,

12:44 and make it much, much faster. So there's a lot of interesting

12:46 use cases here. A lot of the time you don't care. Like for my web apps, I don't care; for my CLI apps,

12:51 I might care. Yeah, definitely. Yeah. So I've been on this bit of an exploration lately, Brian,

12:57 and that's because I'm working on a new course. Yeah. We're actually working on a

13:02 bunch of courses over at Talk Python, some data science ones, which are really awesome. But the

13:06 one that I'm working on is Python memory management and profiling, and tips and tricks and data

13:13 structures to make all those things go better. Nice. So I'm kind of on this profiling bent.

13:18 And anyway, if people are interested in that or any of the other courses that we're working on,

13:23 they can check them out over at training.talkpython.fm. It helps bring you this podcast and others, and

13:30 books. Thanks for that transition. But I'm excited about that because the profiling and stuff is one of

13:35 those things that often is considered kind of like a black art, something that you just learn on the job

13:40 and how do you learn it? I don't know. You just have to know somebody that knows how to do it or

13:44 something. So having some courses around, that's a really great idea. Thanks. Yeah. Also like,

13:48 when does the GC run? What is the cost of reference counting? Can you turn off the GC? What data

13:54 structures are more efficient or less efficient according to that? And all that kind of stuff.
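
For reference, the knobs Michael is alluding to live in the built-in gc and sys modules; a quick sketch:

```python
import gc
import sys

gc.disable()                 # turn off the cyclic garbage collector
gc.enable()                  # ...and turn it back on
gc.collect()                 # force a collection; returns the number of unreachable objects found
print(gc.get_threshold())    # allocation thresholds that trigger collection, e.g. (700, 10, 10)

x = []
print(sys.getrefcount(x))    # reference count (includes the temporary reference made by the call)
```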

13:58 It'll be a lot of fun. Cool. Yeah. So I've got a book. I actually want to highlight something. I've

14:03 got a link: it's pytestbook.com. So if you just go to pytestbook.com, it actually goes to a

14:10 landing page that's on my blog, which is kind of not really that active, but there is a landing page.

14:16 The reason why I'm pointing this out is because some people are transitioning. Some people are

14:21 finally starting to use 3.8 more. There are people starting to test 3.9 a lot, which is

14:26 great. pytest 6 just got released, not one of our items. And I've gotten a lot of questions

14:32 of, is the book still relevant? And yes, the pytest book is still relevant, but there are a couple of

14:38 gotchas. I will list all of these on that landing page. So they're not there yet, but they will be

14:44 by the time this airs. Time travel. Yeah. There's an errata page on Pragmatic that I'll link to, but also,

14:49 but the main thing is there are a few things like, the database that I use in the examples is TinyDB

14:55 and the API changed since I wrote the book. There's a little note to update the setup

15:02 to pin the database version. And there's markers: you

15:08 used to be able to get away with just throwing markers in anywhere. Now you get a warning if you

15:13 don't declare them. There are a few minor things that have changed that,

15:18 for new pytest users, might make it frustrating to walk through the book. So I'm going to lay those out

15:23 just directly on that page to have people get started really quickly. So pytestbook.com

15:29 is what that is. Awesome. Yeah. It's a great book. And you might be on a testing

15:33 bent as well, if I'm on my profiling one. Yeah, actually. So Django Testing

15:39 Toolbox is an article by Matt Layman. And I was actually thinking about having him on the

15:44 show, and I still might, on Test & Code, to talk about some of this stuff. But

15:48 I just wanted to cover it here because it's a really great collection of information.

15:52 That's a quick walkthrough of how Matt tests Django projects. And he goes through

16:03 some of the packages that he uses all the time and some interesting techniques.

16:08 Of the packages, there are a couple that I was familiar with: pytest-django,

16:13 which is, like, of course you should use that. factory_boy is one;

16:18 there are a lot of different projects to generate

16:23 fake data, and factory_boy is the one Matt uses. So there's a highlight there. And then one that

16:29 I hadn't heard of before, django-test-plus, which is a beefed-up test case. It maybe has other stuff

16:34 too, but it has a whole bunch of helper utilities to make it easier to check commonly tested things

16:34 in Django. So that's, that's pretty cool. And then some of the techniques, like one of the things that

16:47 people trying to use pytest for Django get tripped up on is that a lot of people think of pytest

16:53 as test functions only and not test classes. But

16:59 Matt says he really likes to use test classes. I mean, pytest allows you to use

17:05 test classes, and you can use these derived test cases like the django-test-plus test case.
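
As a rough illustration of the style being described, here's what a pytest-django test using a factory_boy factory might look like. The Author model, its fields, and its __str__ behavior are made up for the example:

```python
import factory
import pytest

from myapp.models import Author  # hypothetical app and model


class AuthorFactory(factory.django.DjangoModelFactory):
    class Meta:
        model = Author

    name = factory.Faker("name")


@pytest.mark.django_db  # pytest-django marker that enables database access
def test_author_str():
    # Arrange
    author = AuthorFactory(name="Brian Okken")
    # Act
    result = str(author)
    # Assert
    assert result == "Brian Okken"
```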

17:05 A couple of other things: using arrange-act-assert as a structure, and in-memory SQLite databases when you

17:11 can get away with it, to speed things up, because in-memory databases are way faster than file system

17:16 databases.

17:17 Yeah. And you don't have to worry about dependencies or servers you've got to run. It's

17:20 just :memory:. Boom. You connect to it and off it goes. Nice.

17:25 Yeah. One of the things I didn't get... I mean, I kind of get the next one: disabling

17:29 migrations while testing. I don't know a lot about database migrations or Django migrations,

17:35 or whatever those are, but apparently disabling them is a good idea. It makes sense. Faster

17:40 password hasher: I have no idea what this is talking about, but apparently you can speed

17:46 up your testing by having a faster password hasher. Yeah. A lot of times they'll

17:51 generate them so they're explicitly slow, right? So, like, over at Talk Python, I use

17:59 passlib, not Django, but passlib is awesome. But if you just do, say, an MD5, it is, like,

18:05 super fast, right? So if you say, I want to generate this and take this and generate it,

18:11 it'll come up with the hashed output, but because it's fast, people could look at that and say,

18:16 well, let me try like a hundred thousand words I know and see if any of them match; then

18:20 that's the password, right? You can use more complicated ones. MD5 is not what you want;

18:25 you want something like bcrypt or something, which is a little slower and better, harder

18:31 to guess. But what you should really do is you should like insert little bits of like salt,

18:36 like extra text around it. So even if it matches, it's not exactly the same, like you can't do those

18:41 guesses, but then you should fold it, which means take the output of the first time, feed it back

18:46 through, take the output of the second time, feed it back through a hundred, 200, 300,000 times. So

18:51 that if they try to guess, it's super computationally slow. I'm sure that's what it's talking about. So

18:56 you don't want to do that when you want your test to run fast because you don't care about hash

18:59 security during test. Oh yeah. That makes total sense. That's my guess. I don't know for sure,

19:04 but that's what I think that probably means. The last tip, which is always a good tip, is to figure

19:09 out your editor so that you can run your tests from your editor because your cycle time of flipping

19:15 between code and test is going to be a lot faster if you can run them from your editor. Yep.
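
For the curious, the speed-up tips Brian listed usually come down to a few lines in a test-only Django settings module. A sketch: the setting names are real Django settings, but the module layout is made up:

```python
# settings/test.py -- hypothetical test-only settings module
from .base import *  # noqa: F401,F403 -- start from the normal settings

# In-memory SQLite: much faster than a database on disk
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.sqlite3",
        "NAME": ":memory:",
    }
}

# Fast (insecure!) password hasher -- fine for tests, never for production
PASSWORD_HASHERS = ["django.contrib.auth.hashers.MD5PasswordHasher"]
```

And for the disabling-migrations tip, I believe pytest-django's --nomigrations option covers that.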

19:20 These are good tips. And if you're super intense, you have the auto run,

19:23 which I don't do. I don't have auto run on. I do it once in a while. Yeah. Yeah. Cool. Well,

19:29 back to my rant, let's talk about profiling. Okay. Actually, this is, this is not exactly the same

19:34 type of profiling. It's more of a look inside of data than of performance. So this was recommended to

19:41 us by one of our listeners named Oz. First name only is what we got. So thank you, Oz. And he is a data

19:48 scientist who goes around and spends a lot of time working on Python and doing exploratory data analysis.

19:54 And the idea is like going to grab some data, open it up and explore it. Right. And just start

20:00 looking around, but it might be incomplete. It might be malformed. You don't necessarily know exactly what

20:04 its structure is. And he used to do this by hand, but he found this project called pandas-profiling,

20:10 which automates all of this. So that sounds handy. I mentioned before missingno, missing-N-O as in

20:18 missing number, the missing data explorer, which is super cool. And I still think that's awesome,

20:23 but this is kind of in the same vein. And the idea is given a pandas data frame, you know, pandas has a

20:30 describe function that says a little bit of detail about it. But with this thing, it kind of takes that

20:35 and supercharges it. And you can say df.profile_report(), and it gives you all sorts of stuff. It does

20:42 type inference to say things in this column are integers or numbers, strings, date times, whatever.

20:48 It talks about the unique values, the missing values, quartile statistics, descriptive statistics,

20:54 that's like mean, mode, sum, standard deviation, a bunch of stuff, histograms, correlations,

21:01 missing values, there's the missingno thing I spoke about, text analysis of categories and

21:07 whatnot, file and image analysis, like file sizes and creation dates and sizes of images, and

21:13 all sorts of stuff. So the best way to see this is to look at an example. So in our notes, Brian,

21:18 do you see where it has nice examples, like the NASA meteorites one? So there's

21:22 an example for the US census data, a NASA meteorite one, some Dutch healthcare data and so on.
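
Roughly, the one-liner Michael keeps referring to looks like this. A sketch assuming pandas-profiling 2.x, where importing the package adds profile_report() to DataFrames; the CSV file name is made up:

```python
import pandas as pd
import pandas_profiling  # noqa: F401 -- registers DataFrame.profile_report()

# Load whatever data you're exploring
df = pd.read_csv("meteorite_landings.csv")

# One call builds the whole interactive report
report = df.profile_report(title="Meteorite Landings")
report.to_file("meteorite_report.html")
```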

21:29 If you open that up, you get it. You see what you get out of it. Like it's pages of reports of what was in

21:34 that data frame. Oh, this is great. Isn't that cool? It's tabbed and stuff. So it's tabbed,

21:40 it's got warnings, it's got pictures, it's got all kinds of analysis. It's got histogram graphs and you can

21:47 hide and show details, and the details are tabbed too, you know. I mean, this is a massive dive into what the

21:55 heck is going on with this data: correlation heat maps. I mean, this is the business right here. So

22:02 this is like one line of code to get this output. This is great. This, like, replaces a couple of

22:09 interns at least. Sorry, interns, but yeah, this is really cool. So I totally recommend, if this sounds

22:18 interesting and you do this kind of work, just pull up the NASA meteorite data and realize that that

22:24 all came from, you know, importing the thing and saying df.profile_report(), basically. And

22:30 you can also click and run that in Binder and Google Colab. You can go and interact with it live if you

22:36 want. Yeah. I love the warnings on these, some of the things it can flag, like saying some

22:41 of the variables that show up are skewed, like too many values at

22:46 one value, or some of them have missing values or zeros showing. It does quite a bit of analysis for

22:52 you about the data right away. That's pretty great. Yeah. Yeah. The type inference is great because,

22:57 I mean, you can have hundreds or thousands of data points. It's not trivial to just say,

23:04 oh yeah, all of them are true or false, I know they're Booleans. You'd have to look at

23:09 everything first. So yeah. Yeah. It's one of those things that's, like, easy to adopt,

23:14 but looks really useful and it's also beautiful. So yeah, check it out. It looks great. I want to

23:18 talk about object oriented programming a little bit. Oh, okay. Actually, I mean,

23:23 all of Python really is object oriented because everything is an object, really. Deep,

23:28 deep down, everything's a PyObject pointer. Yeah. There's an article by Redowan Delwar called

23:35 Interfaces, Mixins, and Building Powerful Custom Data Structures in Python.

23:39 And I really liked it because it's Python focused. I've actually been

23:45 disappointed with a lot of the object oriented discussions around Python. A lot of them

23:50 basically, I think, are lamenting that the system isn't the same as other languages,

23:56 but it's just not. Get over it. This is a Python-centric discussion talking about interfaces and

24:03 abstract base classes, both informal and formal abstract base classes, and using mixins. And it starts

24:11 out with the concept that there's like a base amount of knowledge that people have to have to

24:17 discuss this sort of thing, and of understanding why they're useful and what are some of the downfalls

24:23 and upfalls, or benefits, and whatever. And so he actually starts there. It's not too deep of a

24:30 discussion, but it's an interesting discussion. And I think it's a good background to discuss it.

24:36 And then he talks about like one of the things you kind of get into a little bit and you go, well,

24:40 what's really different about an abstract base class and an interface, for instance?

24:45 And he writes interfaces can be thought of as a special case of an abstract base class. It's

24:51 imperative that all methods of an interface are abstract methods and that classes don't store

24:56 any data or any state or instance variables. However, in the case of abstract base classes,

25:02 the methods are generally abstract, but there can also be methods that provide implementation,

25:06 concrete methods, and also these classes can have instance variables. So that's a nice distinction.

25:12 Yeah.

25:13 Then mixins are where you have a parent class that provides some functionality to a subclass,

25:18 but it's not intended to be instantiated itself. That's why it's sort of similar to abstract base

25:24 classes and other things. So having all this discussion from one person in one good write-up,

25:29 I think, is a really great thing. And I definitely don't pull in class hierarchies

25:36 and base classes that much, but there are times when you need them and they're very handy. So it's cool.
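
To make those three ideas concrete, here's a tiny sketch; the class names are made up:

```python
from abc import ABC, abstractmethod


class Serializer(ABC):
    """Interface-style ABC: all methods abstract, no instance state."""

    @abstractmethod
    def serialize(self, obj) -> str: ...


class ReprMixin:
    """Mixin: adds behavior to subclasses, not meant to be instantiated on its own."""

    def __repr__(self):
        return f"{type(self).__name__}({self.__dict__!r})"


class JsonSerializer(ReprMixin, Serializer):
    """Concrete class: implements the interface and picks up the mixin behavior."""

    def serialize(self, obj) -> str:
        import json
        return json.dumps(obj)
```

Instantiating Serializer() directly raises a TypeError because of the abstract method, while JsonSerializer().serialize({"a": 1}) works and also gets the mixin's __repr__.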

25:42 Yeah, this is super cool. Actually, I really like this analysis. I love that it's really Python focused

25:47 because a lot of times the mechanics of the language just don't support some of the object oriented

25:54 programming ideas in the same way, right? Like, the interface keyword doesn't exist, right? So this

26:01 distinction, you have to make in a conventional sense. Like, we come up with a

26:06 convention that we don't have concrete methods or state with interfaces, right? But

26:12 there's not like an interface keyword in Python. So I like it. I'm a big fan of object oriented

26:17 programming. And I'm very aware that in Python, a lot of times what people use for classes is simply

26:25 unneeded. And I know where that comes from, and I want to make sure that people don't

26:29 overuse it, right? If you come from Java or C#, or one of these OOP only languages,

26:34 everything's a class. And so you're just going to start creating classes. But if what you really want

26:38 is to group functions and a couple of pieces of data that's like shared, that's a module, right?

26:43 You don't need a class, right? You could still say module name dot and get the list of them.

26:47 And it's, it's like a static class or something like that. But sometimes you want to model stuff

26:53 with object oriented programming. And understanding the right way to do it in Python is really cool.

26:57 This looks like a good one.

26:58 Yeah. And also, there is a built-in library called abc, for abstract base class, within Python. And

27:06 for a lot of people, it seems like a mystery thing that only advanced people use,

27:11 but it's really not that complicated. And this article uses that as well and talks about it. So

27:16 it's good.

27:17 You know, one of my favorite things about abstract base classes and abstract methods is in PyCharm,

27:23 if I have a class that derives from an abstract class, all I have to write is class, the thing

27:28 I'm trying to create, parentheses, abstract class name, close parentheses, colon, and then

27:34 you just hit Alt+Enter, and it'll pull up all the abstract methods. You can highlight

27:38 them, say implement, it goes, boom, and it'll just write the whole class for you.

27:40 Wow.

27:41 But if it's not abstract, obviously it won't do that, right? So the abstractness will tell

27:45 the editor to like write the stubs of all the functions for you.

27:47 Oh, that's a cool reason to use them.

27:50 That's almost reason to have them in the first place.

27:52 Yeah.

27:52 Almost.

27:53 Nice.

27:54 We've pickled before, haven't we?

27:55 Yeah.

27:56 Yeah, we have talked about pickle a few times.

27:58 Yes. Have we talked about this article?

27:59 I don't remember.

28:00 I don't think so. We have. Apologies. But it's short and interesting. So Ned Batchelder wrote

28:07 this article called Pickle's Nine Flaws. And so I want to talk about that. This comes to us

28:12 via pycoders.com, which is very cool. And we've talked about the drawbacks. We talked about

28:18 the benefits. But what I liked about this article is it's concise, but it shows you all the tradeoffs

28:22 you're making. Right? So quickly, I'll just go through the nine. One, it's insecure. And

28:29 the reason that it's insecure is not because pickles contain code, but because they create

28:33 these objects by calling the constructors named in the pickle. So any callable can be used

28:39 in place of your class name to construct objects. So basically it runs potentially arbitrary code

28:46 depending on where it got it from. Number two, old pickles look like old code. So if your code

28:51 changes between the time you pickled it and now, it's like you get the old one recreated back to

28:57 life. So if you added fields or other capabilities, those are not going to be

29:01 there. Or if you took away fields, they're still going to be there. Yeah, it's implicit. So they

29:06 will serialize whatever your object structure is. And they often over-serialize, so they'll

29:11 serialize everything. So if you have cached data or pre-computed data that you wouldn't ever

29:16 normally save, well, that's getting saved. Yeah. One of the weird ones, and this has caught me out

29:22 before, and it's just, I don't know, weird, so there you go, is that dunder init, the constructor,

29:27 is not called. So your objects are recreated, but the dunder init is not called. They just

29:33 have the values. So that might set it up in some kind of weird state.

29:38 Like maybe it would fail some validation or something. It's Python only. Like, you can't share

29:43 with other programs because it's a Python-only structure. They're not readable. They're binary.

29:48 It will seem like it pickles code. So if you have a function you're hanging on to,

29:55 something passed along like some kind of lambda function or whatever, or a class that's been passed over,

30:00 and you have a list of them or you're holding onto them, and you think that it's going to save that,

30:05 all it really saves is basically the name of the function. So those are gone. And I think one of

30:12 the real big challenges is it's actually slower than things like JSON and whatnot. So, you know,

30:17 if you're willing to give up those trade-offs because it was super fast, that's one thing,

30:21 but it's not. And are you telling me that we covered it before?

30:24 We did cover it in 189, but I had forgotten. So it was like a couple months ago, right?

30:28 Yeah, it's a while ago. Anyway, it's good to go over it again.

30:32 Definitely.
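
To make the dunder-init flaw concrete, here's a tiny sketch with a made-up class:

```python
import pickle


class Account:
    def __init__(self, owner):
        print("__init__ running")  # any validation or setup only happens here
        self.owner = owner


original = Account("brian")                      # prints "__init__ running"
restored = pickle.loads(pickle.dumps(original))  # prints nothing

# pickle rebuilt the object and copied its attributes without calling __init__
print(restored.owner)                            # 'brian'
```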

30:32 Be careful with your pickling. All right. How about anything extra? That was our top six

30:37 items. What else we got?

30:38 I don't have anything extra. Do you have anything extra?

30:40 pathlib. Speaking of stuff we covered before, we talked about pathlib a couple of times. You

30:45 talked about Chris May's article or whatever it was around pathlib, which is cool. And I said,

30:50 basically, I just got to get my mind around not using os.path and just get into this.

30:56 Right. And people sent me feedback like, Michael, you should get your mind into this. Of course

31:00 you should do this. Right. And I'm like, yeah, yeah, I know. However, Brett Abel sent over a

31:05 one-line tweet that may just seal the deal for me. Like, this is sweet. So he said, how about

31:11 this? text = Path(filename).read_text(). Done. No context managers, no open, none of that. And I'm

31:18 like, oh, okay. That's pretty awesome. Anyway, I just wanted to give a little shout out to

31:26 that like one liner because that's pretty nice. And then also I was just a guest on a podcast out

31:31 of the UK called A Question of Code, where the hosts Ed and Tom and I discussed why Python is fun. Why is

31:38 it good for beginners and for experts? Why might it give you results, like tangible

31:45 programs, faster than, say, JavaScript? Career stuff, all kinds of stuff. So anyway, I linked

31:50 to that if people want to check that out. That's cool. Yeah. It was a lot of fun. Those guys are

31:54 running a good show over there. Yeah. I think I'm talking with them tomorrow.

31:57 Right on. How cool. One of the things I like about it is the accents just, you know,

32:02 cause accents are fun. So I was going to ask you, would you consider learning how to do a British

32:07 accent? Cause that would be great for the show. I would love to, I fear I would just end up insulting

32:13 all the British people and, not coming across really well, but I love British accents.

32:19 If we had enough Patreon supporters, I would be more than happy to volunteer to move to England to develop

32:26 one, maybe just live in London for a few years. Like, if they're going to fund that for you, that would be

32:32 awesome. Yeah. London's a great town. Okay. All right. How about another joke? I'd love another joke. So this

32:39 one is by Caitlin Hudon, but was pointed out to us by Aaron Brown. So she tweeted this on Twitter and he's like, Hey,

32:48 you guys should think about this. So you ready? Yeah. Caitlin says: I have a Python joke, but

32:53 I don't think this is the right environment. Yeah. So there's a ton of these types of jokes, like,

32:59 "I have a joke, but..." So this is a new thing, right? I don't know. It's probably going to be over by the

33:04 time this airs, but I'm really amused by these types of jokes. Yeah. I love it. This kind of touches

33:10 on the whole virtual environment, package management, isolation chaos. I mean, there was that XKCD

33:15 as well about that. Yeah. Okay. So while we're here, I'm going to read some from

33:19 Luciano Ramalho. He's a Python author and he's an awesome guy. Here's a couple of other related

33:26 ones. I have a Haskell joke, but it's not popular. I have a Scala joke, but nobody understands it.

33:32 I have a Ruby joke, but it's funnier in Elixir. And I have a Rust joke, but I can't compile it.

33:38 Yeah. Those are all good. Nice. Cool. Nice. Nice. All right. Well, Brian, thanks for being here as

33:42 always. Thank you. Talk to you later. Bye. Thank you for listening to Python Bytes. Follow the show on

33:48 Twitter via @pythonbytes. That's Python Bytes as in B-Y-T-E-S. And get the full show notes at

33:54 pythonbytes.fm. If you have a news item you want featured, just visit pythonbytes.fm and send it our

33:59 way. We're always on the lookout for sharing something cool. On behalf of myself and Brian Okken,

34:04 this is Michael Kennedy. Thank you for listening and sharing this podcast with your friends and

34:08 colleagues.
