Transcript #193: Break out the Django testing toolbox
Return to episode page view on github00:00 Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds.
00:05 This is episode 193, recorded July 29th, 2020. I'm Michael Kennedy.
00:10 And I am Brian Okken.
00:12 And we've got a bunch of great stuff to tell you about. This episode is brought to you by us.
00:16 We will share that information with you later. But for now, I want to talk about something that
00:23 I actually saw today. And I think you're going to bring this up as well, Brian. I'm gonna let
00:27 you talk about it. But something I ran into updating my servers with a big warning in red when I pip
00:32 installed some stuff saying, your things are inconsistent. There's no way pip is going to
00:37 work soon. So be ready. And of course, that just results in frustration for me because, you know,
00:43 depend a bot tells me I need these versions, but some things don't require anyway, long story,
00:47 you tell us about it. Yeah, okay. So I was curious. I haven't actually seen this yet. So I'm glad that
00:52 you've seen it so that you have some experience. This was brought up to us by Matthew Fikart. And he
00:59 says he was running pip and he got this warning and it's all in red. So I'm gonna have to squint to read
01:04 this. Says, after October 2020, you may experience errors when installing or updating packages. This is
01:12 because pip will change the way it resolves dependency conflicts. We recommend that you use use features
01:18 equals 2020 resolver to test your packages. It shows up as an error. And I think it's just so that people
01:24 actually read it. But I don't know if it's a real error or not. It's still it works fine. But it's it's going to be an error
01:30 eventually. Okay, so this is not a problem to not adjust your sets actually do adjust your sets.
01:35 What you need to be aware of is the changes. So we've got a we I think we've covered it before,
01:40 but we've got a link in the show notes to the pip dependency resolver changes. And these are good
01:47 things. But one of the things that Matthew pointed out, which is great, and we're also going to link to
01:52 an article where he discusses where, like how his problem showed up with this. And it's around projects
01:59 that use some people use poetry and other things. And I can't remember the other one,
02:03 that does things like lock files and stuff. But a lot of people just do that kind of manually. And what
02:10 you often do is you have like a, your original set of requirements that are just your, like the handful
02:16 of things that you immediately depend on with no versions or or with minimal version rules around
02:23 it. And you say have installed this stuff. Well, that actually ends up installing a whole bunch of
02:29 all of your immediate dependencies, all of their dependencies and everything. So if you want to
02:34 lock that down, so that you're only installing the same things again and again, you say pip freeze,
02:39 and then pipe that to a, like a lock file. And then you can use that, I guess, a common pattern.
02:46 It's not the same as pip env's lock file and stuff, but it can be similar anyway. And then if you use
02:52 that and pip install from that, everything should be fine. You're going to install those dependencies.
02:56 The problem is if you don't use the use 2020 resolver feature to generate your lock file,
03:05 then if you do use it to install from your lock file, there may be incompatibilities with those.
03:12 So the resolvers actually try is actually, there's good things going on here, having pip do the
03:17 resolver better. But the quick note we want to say is don't panic when you see that red thing,
03:22 you should just try the use features 2020 resolver. But if you're using a lock file, use it for the
03:28 whole process, use the new resolver to generate your original lock file from your original stuff,
03:34 and then use it when you're installing the requirements lock file. There's also information
03:39 on the IPA website. They want to know if there's issues. This is still in a, it's available, but we're
03:46 still, there's still maybe kinks, but I think it's pretty solid. Not enforced, but warning days. Yeah. And I
03:52 kind of actually really like this way of rolling out a new feature and in a behavior change is to,
03:57 to have, to have, have it be available as a flag so that you can test in a, in a, not a pre-release,
04:04 but an actual release, and then, change the default behavior later. But so the reason why we're
04:10 bringing this up is October is not that far away. And October is the date when that's going to change
04:17 to not just the flag behavior, but the default behavior. So yes, go out and make sure these things
04:22 are happening. And if you completely ignore us when things break in October, the reason is probably
04:28 that you need to regenerate your lock file. Yep. So in principle, I'm all for this. This is a great idea.
04:34 It's going to make sure that things are consistent by looking at the dependencies of your libraries.
04:40 However, two things that are driving me bonkers right now are systems like depend a bot or pie up,
04:50 which are critically important for making sure that your web apps get updated with say like security
04:57 patches and stuff. Right. So you would do this like a, you know, pip freeze your dependencies,
05:02 and then it has the version. What if say you're using Django and there's a security release around
05:09 something in there, right? Unless you know to update that, it's always just going to install the one that
05:14 you started with. So you want to use a system like depend a bot or pie up where it's going to look at
05:19 your requirements. It's going to say, these are out of date. Let's update them. Here's the new one.
05:24 However, those systems don't look at the entirety of what it potentially could set them to. It says,
05:31 okay, you're using doc opt. There's 0.16 a doc opt. Oh, except for the thing that is before it
05:37 actually requires doc opt 14 or it's incompatible. And as in incompatible as pip will not install that
05:44 requirements.txt any longer. But those systems still say, great, let's upgrade it. And you're like in
05:52 this battle of those things or like upgrading it. And then like the older libraries are not
05:59 upgrading or you get two libraries. One requires doc opt 16 or above. One requires doc opt 14 or lower.
06:05 You just can no longer use those libraries together. Now it probably doesn't actually matter. Like the
06:10 feature you're using probably is compatible with both, but you won't be able to install it anymore. And
06:14 my hope is what this means is the people that have these weird old dependencies will either loosen the
06:22 requirements on their dependency structure. Like we're talking about, right? Like this thing uses this
06:27 older version or it's got to be a new version or update it or something, because it's going to be,
06:33 there's going to be packages that are just incompatible that are not actually incompatible because of this.
06:38 Yeah. Interesting. Yes. Painful. I don't know what to do about it, but it's like literally this
06:43 morning I ran into this and I had to go back and undo what depend a bot was trying to do for me
06:48 because certain things were no longer working, right? Or something like that. Interesting. Yeah. So
06:53 does depend a bot depend a bot? Depend a bot. Yeah. That's the thing that GitHub acquired that
06:58 basically looks at your various package listings and says, there's a new version of this. Let's pin it to
07:03 a higher version. And it comes as a PR. Okay. That was my question. It comes as a PR. So if you
07:08 had testing around in a CI like environment or something, it could catch it before it went
07:14 through. Yes. You'll still get the PR. It'll still be like be in your GitHub repo, but the CI
07:19 presumably would fail because the pip install step would fail. And then it would just know that it
07:25 couldn't auto merge it. But still it's like, you know, it's, you're like constantly like trying to
07:32 push the water, the tide back because you're like, stop doing this. It's driving me crazy. And they're,
07:38 there are certain ways to like link it, but then there's just a force it to certain boundaries.
07:42 But anyway, it's, it's like, it's going to make it a little bit more complicated. Some of these
07:46 things. Hopefully it considers this. Well, maybe depend upon it can update to do this.
07:50 Wouldn't that be great. Yep. That would be great. Well, speaking of packages, the way you use packages
07:57 is you import them once you've installed them, right? Yes. So Brandon Branner, Branner was talking on
08:03 Twitter with me saying like, I have some imports that are slow. Like how can I figure out what's
08:08 going on here? And this led me over to something may have covered a long time ago. I don't think so,
08:15 but possibly called import dash profiler. You know this? No, this is cool. Yeah. So one of the things
08:21 that can actually be legitimately slow about Python startup code is actually the imports. So for example,
08:30 like if you import requests, it might be importing like a ton of different things,
08:35 standard library modules, as well as external packages, which are then themselves importing
08:40 standard library modules, et cetera, et cetera. Right. So you might want to know what was slow and what's
08:45 not. And it's also not just, it's not like, just like a C include it's a imports actually run code.
08:52 Yes, exactly. It's not something happening at compile time. It's happening at runtime.
08:57 So every time you start your app, it goes through and it says, okay, what we're going to do is want
09:01 to execute the code that defines the functions and defines the methods and potentially other code as
09:06 well. Who knows what else is going on? So there's a non-trivial amount of time to be spent doing that
09:13 kind of stuff. For example, I believe it takes like half a second to import requests, just requests.
09:20 Interesting.
09:21 I mean, obviously that depends on the system, right? You do it on MicroPython versus on like a
09:25 supercomputer, the time's going to vary, but nonetheless, like there's a non-trivial amount
09:30 of time because of what's happening. So there's this cool thing called import profiler, which all you
09:34 got to do is say from import profiler, import, profile, import. Woo. Say that a bunch of times fast.
09:43 Written, it's fine. Spoken, it's funky. But then you just create a context manager around your import
09:49 statements. You say with profile import as context, all your imports, and then you can print out, you
09:55 say context.printinfo and you get a, like a profile status report.
09:59 That's cool.
10:00 Now I included a little tiny example of this for requests and what was coming out of it. If you look
10:05 at the documentation, it's actually much longer. So I'm looking here, I would say just, you know,
10:11 eyeballing it. There's probably 30 different modules being imported when you say import requests.
10:17 That's non-trivial, right? That's a lot of stuff. So this will give you that output. It'll say
10:23 here, this module imported this module, and then it has like a hierarchy or a tree type of thing. So
10:29 this module imported this module, which imported those other two. And so you can sort of see the
10:34 chain or a tree of like, if I'm importing this, here's the whole bunch of other stuff it takes with
10:39 it. Okay. Yeah. And it gives you the overall time. I think maybe the time dedicated to just
10:46 that operation and then the inclusive time or something. Actually, maybe it looks more like
10:51 83 milliseconds. Sorry, I had my units wrong instead of half a second, but nonetheless, it's like,
10:55 you know, you have a bunch of imports and your running code. Where is that slow?
10:59 You can run this and it basically takes three lines of code to figure out how much time
11:05 each part of that entire import stack. I want to say call stack of that execution, but it's
11:11 the series of imports that happen. Like you time that whole thing and look at it. So yeah,
11:16 that's, it's pretty cool.
11:16 That's neat. And also, I mean, there's, there's times where you want, you really want to get startup
11:22 time for something really as fast as possible. And this is part of it is, is your, the things you're
11:28 importing at your startup is non, sometimes non-trivial when you have something that you really want to
11:34 run fast.
11:35 Right. Like let's say you're spending half a second on startup time because of the imports.
11:39 You might be able to take the slowest part of those and import that in a function that gets called.
11:45 Right. So.
11:46 Yeah.
11:47 Import it later.
11:48 Yes.
11:48 You only pay for it at, if you're going to go down that branch, because maybe you're not going to
11:52 call that part of the operation or like that part of the CLI or whatever.
11:56 Yeah. And it's definitely one of those fine tuning things that you want to make sure you don't do this
12:00 too early, but, but for people packaging and supporting large projects, I think it's, it's a good idea to
12:07 pay attention to this and make sure to your import time. Like it'd be something that would be kind of
12:11 fun to throw in a test case for CI to make sure that your, your import time doesn't suddenly
12:16 go slower because something you depend on suddenly got slower or something like that.
12:21 Yeah. Yeah. Absolutely. And you don't necessarily know because the thing it depends upon that thing
12:26 changed, right? It's not even the thing you actually depend upon, right? It's very,
12:29 it could be very down, down the line. Yeah. And maybe you're like, we're going to use this other
12:33 library. We barely use it, but you know, we already have dependencies. Why not just throw this one in?
12:38 Oh wait, that's adding a quarter of a second. We could just vendor that one file that we don't really,
12:44 you know, and make it much, much faster. So there's a lot of interesting,
12:46 use cases here. A lot of time you don't care. Like for my web apps, I don't care for my CLI apps.
12:51 I might care. Yeah, definitely. Yeah. So I've been on this bit of a exploration lately, Brian,
12:57 and that's because I'm working on a new course. Yeah. Yeah. Yeah. We're actually working on a
13:02 bunch of courses over at talk Python, some data science ones, which are really awesome. But the
13:06 one that I'm working on is Python memory management and profiling are, and, tips and tricks and data
13:13 structures to make all those things go better. Nice. So I'm kind of on this profiling bent.
13:18 And anyway, so if people are interested in that or any of the other courses that we're working on,
13:23 they can check them out over at training.talkpython.fm helps bring you this podcast and others and,
13:30 books. Thanks for that transition. But I, I'm excited about that because the profiling and stuff is one of
13:35 those things that often is considered kind of like a black art, something that you just learn on the job
13:40 and how do you learn it? I don't know. You just have to know somebody that knows how to do it or
13:44 something. So having some courses around, that's a really great idea. Thanks. Yeah. Also like,
13:48 when does the GC run? What is the cost of reference counting? Can you turn off the GC? What data
13:54 structures are more efficient or less efficient according to that? And all that kind of stuff.
13:58 It'll be a lot of fun. Cool. Yeah. So I've got a book. I actually want to highlight something. I've
14:03 got a link called, it's pytestbook.com. So if you just go to pytestbook.com, it actually goes to a
14:10 landing page that's on blog. That's kind of not really that active, but there is a landing page.
14:16 The reason why I'm pointing this out is because some people are transitioning. Some people are
14:21 finally starting to use three, eight more. There's people starting to test three, nine a lot, which is
14:26 great. There's pytest six just got released, not one of our items. And I've gotten a lot of questions
14:32 of, is the book still relevant? And yes, the pytest book is still relevant, but there's a couple of
14:38 gotchas. I will list all of these on that landing page. So they're not there yet, but they will be
14:44 by the time this airs. Time travel. Yeah. There's a Rata page on Pragmatic that I'll link to, but also,
14:49 but the main, there's a few things like there's, a database that I use in the examples as a tiny
14:55 DB and the API changed since I wrote the book. There's a little note to update your, update the setup
15:02 to pin the database version. And there's a, something, you know, markers used to be,
15:08 used to be able to get away with just throwing markers in anywhere. Now you get a warning if you
15:13 don't declare them. There's a few minor things that get changed, are changed to that make it
15:18 for new pytest users might be frustrating to walk through the book. So I'm going to lay those out
15:23 just directly on that page to have, have people get started really quickly. So pytestbook.com
15:29 is what that is. Awesome. Yeah. It's a great book. And you might be on a testing
15:33 bent as well. If I'm on my profiling one. Yeah, actually. So this is a Django testing
15:39 toolbox is an article by Matt Lehman. And I was actually going to think about having him on the
15:44 show and I still might on testing code to talk about some of the stuff, but he threw together,
15:48 I just wanted to cover it here. Cause it's a really great throw together of information.
15:52 That's a quick walkthrough of how Matt tests Django projects. And he goes through
15:58 some of the packages that he uses all the time and some interesting techniques,
16:03 the packages that there's a couple of them that I was familiar with. pytest Django,
16:08 which is like, of course, you're, of course you should use that. factory boy is the one
16:13 there's a lot of factory boys, one project. There's a lot of different projects to generate
16:18 fake data. Factory boys, the one Matt uses. So there's a highlight there. And then one that
16:23 I hadn't heard of before Django test plus, which is a beefed up test case. It maybe has other stuff
16:29 too, but it has a whole bunch of helper utilities to make it easier to check commonly tested things
16:34 in Django. So that's, that's pretty cool. And then some of the techniques, like one of the things that
16:40 people, some people trying to use pytest for Django get tripped up at is a lot of people think of pytest
16:47 is just functions only test functions only and not test classes. But there's a, there are some uses,
16:53 Matt says he really likes to use test classes. I mean, there's no, I mean, pytest allows you to use
16:59 test classes, but, you can use these derived test cases like, the Django test plus test case.
17:05 A couple of other things using a range act assert as a structure in memory SQLite databases when you
17:11 can get away with it to speed up because in memory databases are way faster than on file system
17:16 databases.
17:17 Yeah. And you don't have to worry about, dependencies or servers you got to run. It's
17:20 just colon memory. Boom. You connect to it and off it goes. Nice.
17:25 Yeah. one of the things I didn't get, I mean, I, I kind of get the next one disabling
17:29 migrations while testing. I don't know a lot about my database migrations or Django migrations,
17:35 or whatever those are, but apparently disabling them is a good idea. It makes sense. Faster
17:40 password hasher. I have no idea what this is talking about, but apparently you can speed
17:46 up your testing by having a pass faster password hasher. Yeah. A lot of times they'll, they'll
17:51 generate them. So they're explicitly slow, right? So like over at talk Python, I have, I use
17:59 pass lib, not Django, but pass lib is awesome. But if you just do say an MD5, it is like,
18:05 super fast, right? So if you say, I want to generate this and take this and generate it,
18:11 it'll come up with the hashed output, but because it's fast, people could look at that and say,
18:16 well, let me try like a hundred thousand words. I know and see if any of them match that, then
18:20 that's the password, right? You can use more complicated ones. And MD5 is a, not when you
18:25 want something like be encrypt or something, which is slower a little bit and better, harder
18:31 to guess. But what you should really do is you should like insert little bits of like salt,
18:36 like extra text around it. So even if it matches, it's not exactly the same, like you can't do those
18:41 guesses, but then you should fold it, which means take the output of the first time, feed it back
18:46 through, take the output of the second time, feed it back through a hundred, 200, 300,000 times. So
18:51 that if they try to guess, it's super computationally slow. I'm sure that's what it's talking about. So
18:56 you don't want to do that when you want your test to run fast because you don't care about hash
18:59 security during test. Oh yeah. That makes total sense. That's my guess. I don't know for sure,
19:04 but that's what I think what that probably means. The last tip, which is always a good tip is figure
19:09 out your editor so that you can run your tests from your editor because your cycle time of flipping
19:15 between code and test is going to be a lot faster if you can run them from your editor. Yep.
19:20 These are good tips. And if you're super intense, you have the auto run,
19:23 which I don't do. I don't have auto run on. I do it once in a while. Yeah. Yeah. Cool. Well,
19:29 back to my rant, let's talk about profiling. Okay. Actually, this is, this is not exactly the same
19:34 type of profiling. It's more of a look inside of data than of performance. So this was recommended to
19:41 us by one of our listeners named Oz. First name only is what we got. So thank you, Oz. And he is a data
19:48 scientist who goes around and spends a lot of time working on Python and doing exploratory data analysis.
19:54 And the idea is like going to grab some data, open it up and explore it. Right. And just start
20:00 looking around, but it might be incomplete. It might be malformed. You don't necessarily know exactly what
20:04 its structure is. And he used to do this by hand, but he found this project called pandas dash profiling,
20:10 which automates all of this. So that sounds handy. I mentioned before missing, no missing N O as in
20:18 the missing number, missing data explorer, which is super cool. And I still think that's awesome,
20:23 but this is kind of in the same vein. And the idea is given a pandas data frame, you know, pandas has a
20:30 describe function that says a little bit of detail about it. But with this thing, it kind of takes that
20:35 and supercharges it. And you can say DF dot profile report, and it gives you all sorts of stuff. It does
20:42 type inference to say things in this column are integers or numbers, strings, date times, whatever.
20:48 It talks about the unique values, the missing values, quartile statistics, stuff, descriptives,
20:54 that's like mean mode, some standard deviation, a bunch of stuff, histograms, correlations,
21:01 missing values. There's the missing note thing. I spoke about text analysis of like categories and
21:07 whatnot, file and image analysis, like file sizes and creation dates and sizes of images and like
21:13 all sorts of stuff. So the best way to see this is to look at an example. So in our notes, Brian,
21:18 do you see where there's like a has nice examples and there's like the NASA meteorites one. So there's
21:22 an example for like the US census data, a NASA meteorite one, some Dutch healthcare data and so on.
21:29 If you open that up, you get it. You see what you get out of it. Like it's pages of reports of what was in
21:34 that data frame. Oh, this is great. Isn't that cool? It's got like, it's, it's tabbed and stuff. So it's tabbed.
21:40 It's got warnings. It's got pictures. It's got all kinds of analysis. It's got histogram graphs and you can
21:47 like hide and show details and the details include tabbed on, you know, I mean, this is a massive dive into what the
21:55 heck is going on with this data correlations heat maps. I mean, this is the business right here. So
22:02 this, this is like one line of code to get this output. This is great. This is like replaces a couple
22:09 interns at least. Sorry interns, but yeah, this is a really cool. So I totally recommend if this sounds
22:18 interesting, you do this kind of work, just pull up the NASA meteorite data and realize that like that
22:24 all came from, you know, importing the thing and saying DF profile report basically. And you get this,
22:30 you can also click and run that in binder and Google collabs. You can go and interact with it live if you
22:36 want. Yeah. I love the, I love the, the warnings on these, some of the things that can like saying some
22:41 of the variables that show up of like, there's some of them are skewed, like too many values at
22:46 one value that there's some of them have missing zeros showing. It does quite a bit of analysis for
22:52 you about the data right away. That's pretty great. Yeah. Yeah. The types is great because you can just,
22:57 I mean, you can have like hundreds or thousands of data points. It's not trivial to just, to just say,
23:04 oh yeah, all of them are true or false. All of them are, I know they're Booleans. You'd have to look at
23:09 everything first. So yeah. Yeah. It's one of those things that's like easy to adopt,
23:14 but looks really useful and it's also beautiful. So yeah, check it out. It looks great. I want to
23:18 talk about object oriented programming a little bit. Oh, okay. Actually, it's not something, I mean,
23:23 all of Python really is object oriented because we use, everything is an object really. Deep,
23:28 deep down, everything's a Py object pointer. Yeah. There's an article by Redewon Delaware called
23:35 Interfaces, Mix-Ins, and Building Powerful Custom Data Structures in Python.
23:39 And I really liked it because it's a Python focused, I mean, there's not a lot, I've actually been
23:45 disappointed with a lot of the object oriented discussions around Python. And a lot of them
23:50 are talk about basically, I think they're lamenting that the system isn't the same as other languages,
23:56 but it's just not. Get over it. This is a Python centric discussion talking about interfaces and
24:03 abstract base classes, both informal and formally abstract base classes using Mix-Ins. And it starts
24:11 out with the concept that people, there's like a base amount of knowledge that people have to have to
24:17 discuss this sort of thing and of understanding why they're useful and what are some of the downfalls
24:23 and upfalls or benefits and whatever. And so he actually starts by, it's not too deep of a
24:30 discussion, but it's an interesting discussion. And I think it's a good background to discuss it.
24:36 And then he talks about like one of the things you kind of get into a little bit and you go, well,
24:40 what's really different about an abstract base class and an interface, for instance?
24:45 And he writes interfaces can be thought of as a special case of an abstract base class. It's
24:51 imperative that all methods of an interface are abstract methods and that classes don't store
24:56 any data or any state or instance variables. However, in case of abstract base classes,
25:02 the methods are generally abstract, but there can also be methods that provide implementation,
25:06 concrete methods, and also these classes can have instance variables. So that's a nice distinction.
25:12 Yeah.
25:13 Then mix-ins are where you have a parent class that provides some functionality of a subclass,
25:18 but it's not intended to be instantiated itself. That's why it's sort of similar to abstract base
25:24 classes and other things. So having all this discussion from one person in a good discussion,
25:29 I think is a really great thing. And there are definitely times I don't pull into class hierarchies
25:36 and base classes that much, but there's times when you need them and they're very handy. So it's cool.
25:42 Yeah, this is super cool. Actually, I really like this analysis. I love that it's really Python focused
25:47 because a lot of times the mechanics of the language just don't support some of the object oriented
25:54 programming ideas in the same way, right? Like the interface keyword doesn't exist, right? So this
26:01 distinction, you have to, you have to make it in a conventional sense. Like we come up with a
26:06 convention that we don't have concrete methods or state with interfaces, right? But there's nothing,
26:12 there's not like an interface keyword in Python. So I like it. I'm a big fan of object oriented
26:17 programming. And I'm very aware that in Python, a lot of times what people use for classes is simply
26:25 unneeded. And I, I know where that comes from. And I want, I want to make sure that people don't
26:29 overuse it, right? If you come from Java or C#, or one of these OOP only languages,
26:34 everything's a class. And so you're just going to start creating classes. But if what you really want
26:38 is to group functions and a couple of pieces of data that's like shared, that's a module, right?
26:43 You don't need a class, right? You could still say module name dot and get the list of them.
26:47 And it's, it's like a static class or something like that. But sometimes you want to model stuff
26:53 with object oriented programming. And understanding the right way to do it in Python is really cool.
26:57 This looks like a good one.
26:58 Yeah. And also, there is a built in library called ABC, or abstract base class within Python. And it
27:06 seems like a, for a lot of people, it seems like a mystery thing that only advanced people use,
27:11 but it's really not that complicated. And this article uses that as well and talks about it. So
27:16 it's good.
27:17 You know, one of my favorite things about abstract base classes and abstract methods is in PyCharm,
27:23 if I have a class that derives from an abstract class, all I have to write is class, the thing
27:28 I'm trying to create, parentheses, abstract class name, close parentheses, colon, and then
27:34 you just hit alt, alt enter, and it'll pull up all the abstract methods. You can highlight
27:38 them, say implement, it goes, boom, and it'll just write the whole class for you.
27:40 Wow.
27:41 But if it's not abstract, obviously it won't do that, right? So the abstractness will tell
27:45 the editor to like write the stubs of all the functions for you.
27:47 Oh, that'd be, that's a cool use, reason to use them.
27:50 That's almost reason to have them in the first place.
27:52 Yeah.
27:52 Almost.
27:53 Nice.
27:54 We've pickled before, haven't we?
27:55 Yeah.
27:56 Yeah, we have talked about pickle a few times.
27:58 Yes. Have we talked about this article?
27:59 I don't remember.
28:00 I don't think so. We have. Apologies. But it's short and interesting. So Ned Batchelder wrote
28:07 this article called Pickle's Nine Flaws. And so I want to talk about that. This comes to us
28:12 via piecoders.com, which is very cool. And we've talked about the drawbacks. We talked about
28:18 the benefits. But what I liked about this article is concise, but it shows you all the tradeoffs
28:22 you're making. Right? So quickly, I'll just go through the nine. One, it's insecure. And
28:29 the reason that it's insecure is not because pickles contain code, but because they create
28:33 these objects by calling the constructors named in the pickle. So any callable can be used
28:39 in place of your class name to construct objects. So basically it runs potentially arbitrary code
28:46 depending on where it got it from. Old pickles look like old code, number two. So if your code
28:51 changes between the time you pickled it and whatever, it's like you get the old one recreated back to
28:57 life. Like so if you added fields or other capabilities, like those are not going to be
29:01 there. Or you took away fields. They're still going to be there. Yeah, it's implicit. So they
29:06 will serialize whatever your object structure is. And they often over serialize. So they'll
29:11 serialize everything. So like if you have cache data or pre-computed data that you wouldn't ever
29:16 normally save, well, that's getting saved. Yeah. One of the weird ones that this has caught me out
29:22 before and it's just, I don't know, it's weird. So there you go. Is the dunder in it, the constructor
29:27 is not called. So your objects are recreated, but the dunder in it is not called. They're just the values
29:33 have the value. So that might set it up in some kind of weird state.
29:38 Like maybe pass it, fail some validation or something. It's Python only. Like you can't share
29:43 with other programs because it's like a Python only structure. They're not readable. They're binary.
29:48 It will seem like it will pickle code. So if you have like a function you're hanging on to,
29:55 you pass it along like some kind of lambda function or whatever, or a class that's been passed over
30:00 and you have a list of them or you're holding onto them and that you think that it's going to save that,
30:05 all it really saves is basically like the name of the function. So those are gone. And I think one of
30:12 the big, real big challenges is actually slower than things like JSON and whatnot. So, you know,
30:17 if you're willing to give up those trade-offs because it was super fast, that's one thing,
30:21 but it's not. And are you telling me that we covered it before?
30:24 We did cover it in 189, but I had forgotten. So it was like a couple months ago, right?
30:28 Yeah, it's a while ago. Anyway, it's good to go over it again.
30:32 Definitely.
30:32 Be careful with your pickling. All right. How about anything extra? That was our top six
30:37 items. What else we got?
30:38 I don't have anything extra. Do you have anything extra?
30:40 Pathlib. Speaking of stuff we covered before, we talked about Pathlib a couple of times. You
30:45 talked about Chris May's article or whatever it was around Pathlib, which is cool. And I said,
30:50 basically, I'm still, I just got to get my mind around like not using OS.path and just get into this.
30:56 Right. And people sent me feedback like, Michael, you should get your mind into this. Of course,
31:00 you should do this. Right. And I'm like, yeah, yeah, I know. However, Brett Abel sent over a
31:05 one-line tweet that may just like seal the deal for me. Like, this is sweet. So he said, how about
31:11 this? Text equals path of file dot read text. Done. No context managers, no open, none of that. And I'm
31:18 like, oh, that's okay. That's pretty awesome. Anyway, I just wanted to give a little shout out to
31:26 that like one liner because that's pretty nice. And then also I was just a guest on a podcast out
31:31 of the UK called a question of code or the host Ed and Tom and I discussed why Python is fun. Why is
31:38 it good for beginners and for experts? Why am I give you results faster than like tangible code or
31:45 tangible programs faster than say like JavaScript, career stuff, all kinds of stuff. So anyway, I linked
31:50 to that if people want to check that out. That's cool. Yeah. It was a lot of fun. Those guys are
31:54 running a good show over there. Yeah. I think I'm, I think I'm talking with them tomorrow.
31:57 Right on. How cool. One of the things I like about it is the accents just, you know,
32:02 cause accents are fun. So I was going to ask you, would you consider learning how to do a British
32:07 accent? Cause that would be great for the show. I would love to, I fear I would just end up insulting
32:13 all the British people and, not coming across really well, but I love British accents.
32:19 If we had enough Patreon supporters, I would be more than happy to volunteer to move to England to develop
32:26 a, maybe just live in London for a few years. Like if they're going to fund that for you, that would be
32:32 awesome. Yeah. London's a great town. Okay. All right. How about another joke? I'd love another joke. So this
32:39 one is by Caitlin Hudon, but was pointed out to us by Aaron Brown. So she tweeted this on Twitter and he's like, Hey,
32:48 you guys should think about this. So you ready? Yeah. Caitlin says, I have a Python joke, but I think,
32:53 I don't think this is the right environment. Yeah. So there's a ton of these like type of jokes. Like
32:59 I have a joke, but so this is a new thing, right? I don't know. It's probably going to be over by the
33:04 time this airs, but I'm really amused by these types of jokes. Yeah. I love it. This kind of touches
33:10 on the whole virtual environment, package of management, isolation, chaos. I mean, there was that XKCD
33:15 as well about that. Yeah. Okay. So while we're, while we're here, I'm going to read some from
33:19 Luciano. Luciano Romalo, he's a Python author and he's an awesome guy. Here's a couple other related
33:26 ones. I have a Haskell joke, but it's not popular. I have a Scala joke, but nobody understands it.
33:32 I have a Ruby joke, but it's funnier in Elixir. And I have a Rust joke, but I can't compile it.
33:38 Yeah. Those are all good. Nice. Cool. Nice. Nice. All right. Well, Brian, thanks for being here as
33:42 always. Thank you. Talk to you later. Bye. Thank you for listening to Python Bytes. Follow the show on
33:48 Twitter via at Python Bytes. That's Python Bytes as in B-Y-T-E-S. And get the full show notes at
33:54 Python Bytes.fm. If you have a news item you want featured, just visit Python Bytes.fm and send it our
33:59 way. We're always on the lookout for sharing something cool. On behalf of myself and Brian Okken,
34:04 this is Michael Kennedy. Thank you for listening and sharing this podcast with your friends and
34:08 colleagues.