Brought to you by Michael and Brian - take a Talk Python course or get Brian's pytest book


Transcript #193: Break out the Django testing toolbox

Recorded on Wednesday, Jul 29, 2020.

00:00 Hello and welcome to Python Bytes where we deliver Python news and headlines directly to your earbuds.

00:05 This is episode 193, recorded July 29th, 2020. I'm Michael Kennedy.

00:10 And I am Brian Okken.

00:12 And we've got a bunch of great stuff to tell you about.

00:14 This episode is brought to you by us. We will share that information with you later.

00:19 But for now, I want to talk about something that I actually saw today.

00:24 And I think you're going to bring this up as well, Brian.

00:27 I'm going to let you talk about it, but it's something I ran into updating my service: a big warning in red when I ran pip, and I saw some stuff saying your things are inconsistent.

00:36 There's no way pip is going to work soon, so be ready. And of course that just results in frustration for me because, you know, Dependabot tells me I need these versions, but some things don't require them. Anyway, long story, you tell us about it. - Okay, so I was curious. I haven't actually seen this yet, so I'm glad that you've seen it and have some experience.

00:54 This was brought up to us by Matthew Feickert.

00:58 And he says he was running pip and he got this warning and it's all in red. So I'm gonna have to squint to read this.

01:04 It says, "After October 2020, you may experience errors when installing or updating packages.

01:11 This is because pip will change the way it resolves dependency conflicts.

01:15 We recommend that you use --use-feature=2020-resolver to test your packages." It shows up as an error, and I think that's just so that people actually read it, but I don't know if it's a real error or not. It still works fine, but it's going to be an error eventually.

01:31 Okay, so this is not a problem. Do not adjust your sets. Actually, do adjust your sets.

01:35 What you need to be aware of is the changes. I think we've covered it before, but we've got a link in the show notes to the pip dependency resolver changes, and these are good things. One of the things that Matthew pointed out, which is great, and I'm also going to link to an article where he discusses how his problem showed up with this, is that it's around projects that use lock files. Some people use Poetry and other things, and I can't remember the other one, pipenv, that do things like lock files and stuff.

02:07 But a lot of people just do that kind of manually, and what you often do is you have your original set of requirements that are just the handful of things that you immediately depend on, with no versions or with minimal version rules around them.

02:23 And you say, pip install this stuff.

02:25 Well, that actually ends up installing a whole bunch of things: all of your immediate dependencies, all of their dependencies, and everything.

02:33 So if you want to lock that down so that you're installing the same things again and again, you say pip freeze and then pipe that to, like, a lock file, and then you can use that. I guess it's a common pattern. It's not the same as pipenv's lock file and stuff, but it can be similar.

02:51 And then if you use that and pip install from that, everything should be fine. You're going to install those dependencies.

02:57 The problem is if you don't use the --use-feature=2020-resolver flag to generate your lock file, then if you do use it to install from your lock file, there may be incompatibilities with those.

03:12 So there are actually good things going on here, having pip do the resolving better.

03:18 But the quick note we want to say is: don't panic when you see that red thing. You should just try the --use-feature=2020-resolver flag, but if you're using a lock file, use it for the whole process. Use the new resolver to generate your original lock file from your original stuff, and then use it when you're installing from the requirements lock file.
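The workflow Brian describes can be sketched as a small helper that only builds the three pip command lines, so the resolver flag is applied consistently when generating and when installing from the lock file. This is a minimal sketch: the file names `requirements.in` and `requirements.lock` are hypothetical, nothing is executed, and the flag is the one quoted from pip's warning (later pip releases use the new resolver by default).

```python
import sys

RESOLVER_FLAG = "--use-feature=2020-resolver"  # flag named in pip's warning


def lock_workflow_commands(requirements_in="requirements.in",
                           lock_file="requirements.lock"):
    """Return the three pip invocations as argument lists (nothing is run)."""
    pip = [sys.executable, "-m", "pip"]
    return {
        # 1. Resolve the loose, top-level requirements with the new resolver.
        "install_loose": pip + ["install", RESOLVER_FLAG, "-r", requirements_in],
        # 2. Record exactly what got installed; redirect stdout into lock_file.
        "freeze": pip + ["freeze"],
        # 3. Reinstall from the lock file with the same resolver.
        "install_locked": pip + ["install", RESOLVER_FLAG, "-r", lock_file],
    }


for step, command in lock_workflow_commands().items():
    print(step, "->", " ".join(command))
```

Building the commands in one place makes it harder to generate the lock file with one resolver and install from it with the other, which is the mismatch described above.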

03:38 There's also information on the PyPA website. They want to know if there are issues. It's available, but there still may be kinks, though I think it's pretty solid. It's not enforced, but the warning is there. Yeah, I actually really like this way of rolling out a new feature and a behavior change: have it be available as a flag so that you can test it in not a pre-release but an actual release.

04:05 And then change the default behavior later. So the reason why we're bringing this up is:

04:11 October is not that far away, and October is the date when that's going to change from just a flag behavior to the default behavior. So yes, go out and make sure these things are happening, and if you completely ignore us, then when things break in October,

04:27 the reason is probably that you need to regenerate your lock file. - So in principle, I'm all for this. This is a great idea. It's going to make sure that things are consistent

04:37 by looking at the dependencies of your libraries.

04:41 However, two things that are driving me bonkers right now are systems like Dependabot or PyUp, which are critically important for making sure that your web apps get updated with, say, like security patches and stuff, right?

04:59 So you would do this like, you know, pip freeze your dependencies, and then it has the version.

05:05 What if say you're using Django and there's a security release around something in there, right?

05:10 Unless you know to update that, it's always just gonna install the one that you started with.

05:15 So you wanna use a system like Dependabot or PyUp where it's gonna look at your requirements.

05:20 It's gonna say, these are out of date, let's update them.

05:23 Here's the new one.

05:25 However, those systems don't look at the entirety of what it potentially could set them to.

05:30 It says, okay, you're using docopt.

05:33 There's 0.16 of docopt.

05:35 Oh, except for the thing that is before it actually requires docopt 0.14 or is incompatible.

05:41 And incompatible as in pip will not install that requirements.txt any longer.

05:48 But those systems still say, great, let's upgrade it.

05:51 And you're in this battle where those things are upgrading it,

05:57 and the older libraries are not upgrading. Or you get two libraries, one requires docopt 0.16 or above, one requires docopt 0.14 or lower, and you just can no longer use those libraries together.

06:07 Now it probably doesn't actually matter like the feature you're using probably is compatible with both, but you won't be able to install it anymore.
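To make the conflict concrete, here is a toy check of whether any release satisfies two requirements at once. It is a simplified sketch, not how pip resolves: versions are compared as plain integer tuples rather than with full PEP 440 rules, and the release list and the two libraries' pins are hypothetical.

```python
def parse_version(text):
    """Turn '0.16' into (0, 16) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def satisfying(candidates, minimum=None, maximum=None):
    """Return the candidates inside the inclusive [minimum, maximum] range."""
    selected = []
    for candidate in candidates:
        version = parse_version(candidate)
        if minimum is not None and version < parse_version(minimum):
            continue
        if maximum is not None and version > parse_version(maximum):
            continue
        selected.append(candidate)
    return selected


# Hypothetical release list and hypothetical pins from two libraries.
releases = ["0.13", "0.14", "0.15", "0.16"]
wanted_by_lib_a = set(satisfying(releases, minimum="0.16"))  # needs >= 0.16
wanted_by_lib_b = set(satisfying(releases, maximum="0.14"))  # needs <= 0.14

compatible = wanted_by_lib_a & wanted_by_lib_b
print(sorted(compatible))  # empty list: no single version satisfies both
```

An empty intersection is exactly the situation where a strict resolver has to refuse the install, even if the code paths you actually use would work with either version.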

06:14 And my hope is that the people that have these weird old dependencies will either loosen the requirements on their dependency structure, like we're talking about, where this thing uses this older version or it's got to be a new version, or update it or something. Because there are going to be packages that are flagged as incompatible that are not actually incompatible because of this. - Yeah, interesting. - Yes, painful. I don't know what to do about it. But literally this morning I ran into this, and I had to go back and undo what Dependabot was trying to do for me because certain things were no longer working right or something. - Interesting. - Yeah.

06:52 - So does Dependabot, Dependabot, Dependabot?

06:55 - Yeah, that's the thing that GitHub acquired that basically looks at your various package listings and says, "There's a new version of this.

07:02 Let's pin it to a higher version," and it comes as a PR.

07:04 - Okay, that was my question.

07:06 It comes as a PR, so if you had testing around in a CI-like environment or something, it could catch it before it went through?

07:14 - Yes, you'll still get the PR.

07:16 It'll still be in your GitHub repo, but the CI presumably would fail because the pip install step would fail, and then it would just know that it couldn't auto merge it.

07:30 But still, it's like, you're constantly trying to push the water, the tide back because you're like, "Stop doing this.

07:39 It's driving me crazy." And there are certain ways to limit it, but then you've just forced it to certain boundaries.

07:44 But anyway, it's going to make some of these things a little bit more complicated.

07:46 Hopefully it considers this.

07:48 - Well, maybe Dependabot can update to do this.

07:51 - Wouldn't that be great?

07:52 Yeah, that would be great.

07:54 Well, speaking of packages, the way you use packages is you import them once you've installed them, right?

07:59 - Yes.

08:00 - So Brandon Branner was talking on Twitter with me saying like, "I have some imports that are slow.

08:07 "Like how can I figure out what's going on here?" And this led me over to something we may have covered a long time ago.

08:14 I don't think so, but possibly, called import-profiler.

08:18 You know this?

08:19 - No, this is cool.

08:20 - Yeah, so one of the things that can actually be legitimately slow about Python startup code is actually the imports.

08:29 So for example, like if you import requests, it might be importing like a ton of different things, standard library modules, as well as external packages, which are then themselves importing standard library modules, et cetera, et cetera, right?

08:43 - So you might wanna know what's slow and what's not.

08:46 - And it's also not just, it's not like, just like a C include, it's a, imports actually run code.

08:52 - Yes, exactly.

08:53 It's not something happening at compile time.

08:55 It's happening at run time.

08:57 So every time you start your app, it goes through and it says, okay, what we're gonna do is we're gonna execute the code that defines the functions and defines the methods and potentially other code as well.

09:07 Who knows what else is going on?

09:08 So there's a non-trivial amount of time to be spent doing that kind of stuff.

09:14 For example, I believe it takes like half a second to import requests, just requests.

09:20 - Interesting.

09:21 - Obviously that depends on the system, right?

09:23 You do it on MicroPython versus on like a super computer, the time's gonna vary.

09:27 But nonetheless, there's a non-trivial amount of time because of what's happening.

09:31 So there's this cool thing called import-profiler, where all you've got to do is say: from import_profiler import profile_import.

09:41 - Woo, say that a bunch of times fast.

09:43 Written, it's fine.

09:45 Spoken, it's funky.

09:46 But then you just create a context manager around your import statements.

09:49 You say with profile_import() as context, then all your imports, and then you can print it out: you say context.print_info(), and you get a profile status report.

09:59 - That's cool.

10:00 - Now I included a little tiny example of this for requests and what was coming out of it.

10:04 If you look at the documentation, it's actually much longer.

10:08 So I'm looking here.

10:10 I would say just eyeballing it, there's probably 30 different modules being imported when you say import requests.

10:17 That's non-trivial, all right?

10:19 That's a lot of stuff.

10:20 So this will give you that output.

10:23 It'll say, here, this module imported this module, and then it has a hierarchy or a tree type of thing.

10:29 So this module imported this module, which imported those other two, and so you can sort of see the chain or a tree of if I'm importing this, here's the whole bunch of other stuff it takes with it.

10:40 - Okay. - Yeah, and it gives you the overall time, I think maybe the time dedicated to just that operation, and then the inclusive time or something. Actually, maybe it looks more like 83 milliseconds. Sorry, I had my units wrong; I said half a second. But nonetheless, you have a bunch of imports and you're running code, so where is that slow?

11:00 You can run this and it basically takes three lines of code to figure out how much time each part of that entire import stack, I don't know, I want to say call stack of that execution, but it's the series of imports that happen.

11:13 Like you time that whole thing and look at it.

11:15 So yeah, it's pretty cool.
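A comparable per-module breakdown is available without installing anything, through CPython's built-in `-X importtime` option (Python 3.7+). The sketch below is a standard-library stand-in for the idea, not import-profiler itself: it runs the import in a fresh interpreter and parses the report that CPython writes to stderr. The helper name `import_time_report` is made up for this example, and `json` stands in for whatever module you care about.

```python
import subprocess
import sys


def import_time_report(module_name):
    """Import module_name in a fresh interpreter; return (name, cumulative_us) rows."""
    result = subprocess.run(
        [sys.executable, "-X", "importtime", "-c", f"import {module_name}"],
        capture_output=True,
        text=True,
        check=True,
    )
    rows = []
    for line in result.stderr.splitlines():
        if not line.startswith("import time:"):
            continue
        fields = line[len("import time:"):].split("|")
        if len(fields) != 3:
            continue
        try:
            cumulative_us = int(fields[1])
        except ValueError:
            continue  # header line: "self [us] | cumulative | imported package"
        rows.append((fields[2].strip(), cumulative_us))
    # Slowest imports first.
    return sorted(rows, key=lambda row: row[1], reverse=True)


for name, microseconds in import_time_report("json")[:5]:
    print(f"{microseconds:>8} us  {name}")
```

Running the import in a subprocess matters: modules already loaded in the current process would otherwise show up as nearly free.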

11:17 - That's neat.

11:18 And also, I mean, there's times where you really want to get startup time for something really as fast as possible.

11:25 And this is part of it, is the things you're importing at your startup is sometimes non-trivial when you have something that you really want to run fast.

11:35 - Right, like let's say you're spending half a second on startup time because of the imports.

11:40 You might be able to take the slowest part of those and import that in a function that gets called, right?

11:46 So-- - Yeah, import it later.

11:48 - Yes, you only pay for it if you're gonna go down that branch 'cause maybe you're not gonna call that part of the operation or like that part of the CLI or whatever.
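Deferring an import into the function that needs it looks roughly like this. It is a minimal sketch in which `csv` and `io` stand in for a genuinely slow dependency, and `render_report` is a made-up function name.

```python
def render_report(rows, as_csv=False):
    """Return rows as text; the csv machinery is imported only when requested."""
    if not as_csv:
        return "\n".join(" ".join(str(cell) for cell in row) for row in rows)

    # Deferred imports: the cost is paid only on this branch. Later calls are
    # cheap because Python caches loaded modules in sys.modules.
    import csv
    import io

    buffer = io.StringIO()
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue()


print(render_report([("a", 1), ("b", 2)]))
print(render_report([("a", 1), ("b", 2)], as_csv=True))
```

The trade-off is that the first call down that branch is slower and import errors surface later, which is one reason to apply this only where startup time really matters.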

11:56 - Yeah, and it's definitely one of those fine-tuning things that you wanna make sure you don't do this too early.

12:01 But for people packaging and supporting large projects, I think it's a good idea to pay attention to this and keep an eye on your import time.

12:10 Like it'd be something that would be kind of fun to throw in a test case for CI to make sure that your import time doesn't suddenly go slower because something you depend on suddenly got slower or something like that.
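A CI guard along those lines could be a test that imports the package in a fresh interpreter and fails if it takes longer than a budget. This is only a sketch: the budget value is a hypothetical, deliberately generous number, `json` stands in for your own package, and the measurement includes interpreter startup.

```python
import subprocess
import sys
import time

IMPORT_BUDGET_SECONDS = 5.0  # hypothetical, deliberately generous budget


def measure_import_seconds(module_name):
    """Time a fresh interpreter that does nothing but import module_name."""
    start = time.perf_counter()
    subprocess.run([sys.executable, "-c", f"import {module_name}"], check=True)
    return time.perf_counter() - start


def test_import_time_stays_within_budget():
    # "json" stands in for your own package here.
    elapsed = measure_import_seconds("json")
    assert elapsed < IMPORT_BUDGET_SECONDS, f"import took {elapsed:.2f}s"
```

Wall-clock tests are noisy on shared CI machines, so a budget with plenty of headroom catches large regressions without failing at random.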

12:21 - Yeah, yeah, absolutely.

12:22 And you don't necessarily know 'cause the thing it depends upon, that thing changed, right?

12:26 It's not even the thing you actually depend upon, right?

12:29 It's very, it could be very down the line.

12:31 Yeah, and maybe you're like, we're gonna use this other library.

12:34 we barely use it, but we already have dependencies, why not just throw this one in?

12:38 Oh wait, that's adding a quarter of a second.

12:40 We could just vendor that one file, since we don't really use the rest, and make it much, much faster.

12:45 So there's a lot of interesting use cases here.

12:48 A lot of time you don't care.

12:48 Like for my web apps, I don't care.

12:50 For my CLI apps, I might care.

12:52 - Yeah, definitely.

12:53 - Yeah, so I've been on this bit of exploration lately, Brian, and that's because I'm working on a new course.

13:01 - Yeah?

13:02 - Yeah, yeah, we're actually working on a bunch of courses over at Talk Python, some data science ones, which are really awesome.

13:06 But the one that I'm working on is Python memory management and profiling, and tips and tricks and data structures to make all those things go better.

13:16 - Nice.

13:16 - So I'm kind of on this profiling bent.

13:18 And anyway, so if people are interested in that or any of the other courses that we're working on, they can check them out over at training.talkpython.fm.

13:27 Helps bring you this podcast and others and books.

13:31 - Thanks for that transition, but I'm excited about that because the profiling and stuff is one of those things that often is considered kind of like a black art, something that you just learn on the job.

13:40 And how do you learn it?

13:41 I don't know, you just have to know somebody that knows how to do it or something.

13:45 So having some courses around that's a really great idea.

13:47 - Thanks, yeah, and also like when does a GC run?

13:50 What is the cost of reference counting?

13:52 Can you turn off the GC?

13:53 What data structures are more efficient or less efficient according to that and all that kind of stuff.

13:58 It'll be a lot of fun.

13:59 - Cool, yeah, so I've got a book.

14:01 I actually want to highlight something.

14:03 I've got a link called, it's pytestbook.com.

14:06 So if you just go to pytestbook.com, it actually goes to a landing page.

14:11 That's on blog.

14:12 That's kind of not really that active, but there is a landing page.

14:16 The reason why I'm pointing this out is because some people are transitioning.

14:21 Some people are finally starting to use 3.8 more.

14:23 There's people starting to test 3.9 a lot, which is great.

14:27 There's pytest 6, which just got released (not one of our items).

14:30 And I've gotten a lot of questions of, is the book still relevant?

14:34 And yes, the pytest book is still relevant, but there's a couple gotchas.

14:39 I will list all of these on that landing page.

14:42 So they're not there yet, but they will be by the time this airs.

14:45 - Time travel.

14:46 - Yeah, there's an errata page on Pragmatic that I'll link to, but there's a few main things.

14:51 Like there's a database that I use in the examples, TinyDB, and the API changed since I wrote the book.

14:59 There's a little note to update this setup to pin the database version.

15:04 And there's something with markers: you used to be able to get away with just throwing markers in anywhere.

15:12 Now you get a warning if you don't declare them.
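One way to declare markers is from a `conftest.py`, using pytest's `pytest_configure` hook and `config.addinivalue_line`; listing them under `markers` in the pytest configuration file is the other common route. The marker names below are hypothetical and not taken from the book.

```python
# conftest.py
def pytest_configure(config):
    """Register custom markers so pytest does not warn about unknown marks."""
    # "smoke" and "slow" are hypothetical marker names used for illustration.
    config.addinivalue_line("markers", "smoke: quick checks run on every commit")
    config.addinivalue_line("markers", "slow: tests that take a long time")
```

With the markers registered, `@pytest.mark.smoke` no longer triggers the unknown-marker warning, and a typo in a marker name becomes visible instead of silently creating a new marker.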

15:14 There's a few minor things that have changed that, for new pytest users, might make it frustrating to walk through the book.

15:22 So I'm gonna lay those out just directly on that page to have people get started really quickly.

15:27 So, pytestbook.com is what that is.

15:30 - Awesome, yeah, it's a great book.

15:31 And you might be on a testing bent as well, if I'm on my profiling one.

15:36 - Yeah, actually, so this is "Django testing toolbox," an article by Matt Layman, and I was actually thinking about having him on the show, and I still might, on Test & Code to talk about some of this stuff.

15:47 But he threw together, I just wanted to cover it here, 'cause it's a really great throw together of information.

15:53 That's a quick walkthrough of how Matt tests Django projects and he goes through some of the packages that he uses all the time and some interesting techniques.

16:04 There's a couple of the packages that I was familiar with: pytest-django, which is like, of course you should use that.

16:13 Factory Boy is one project; there's a lot of different projects to generate fake data.

16:20 Factory Boy is the one Matt uses.

16:21 So there's a highlight there.

16:23 And then one that I hadn't heard of before, django-test-plus, which is a beefed-up test case and maybe has other stuff too, but it has a whole bunch of helper utilities to make it easier to check commonly tested things in Django.

16:36 So that's pretty cool.

16:38 And then some of the techniques, like one of the things that people, some people trying to use pytest for Django get tripped up at is a lot of people think of pytest as just functions only, test functions only and not test classes.

16:51 There are some uses. Matt says he really likes to use test classes.

16:56 I mean, pytest allows you to use test classes, and you can use these derived test cases like the django-test-plus TestCase.

17:05 A couple of other things: using Arrange-Act-Assert as a structure, and in-memory SQLite databases when you can get away with it, to speed things up, because in-memory databases are way faster than on-file-system databases.
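Both ideas fit in one small test. This sketch uses the standard-library `sqlite3` module directly rather than Django's ORM, and the `users` table and `count_users` function are made up for illustration.

```python
import sqlite3


def count_users(connection):
    """Return the number of rows in the (hypothetical) users table."""
    return connection.execute("SELECT COUNT(*) FROM users").fetchone()[0]


def test_count_users():
    # Arrange: a throwaway database that lives only in memory.
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE users (name TEXT)")
    connection.executemany("INSERT INTO users VALUES (?)", [("ada",), ("grace",)])

    # Act: call the code under test.
    total = count_users(connection)

    # Assert: check the observable result.
    assert total == 2
    connection.close()


test_count_users()
```

Each connection to `:memory:` gets its own fresh database, so tests start from a clean state with no file cleanup.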

17:17 - Yeah, and you don't have to worry about dependencies or servers you've got to run. It's just :memory:. Boom, you connect to it and off it goes. - Yeah. One of the things I didn't get, I mean, I kind of get the next one, disabling migrations while testing. I don't know a lot about database migrations or Django migrations or whatever those are, but apparently disabling them is a good idea. It makes sense. Then a faster password hasher. I have no idea what this is talking about, but apparently you can speed up your testing by having a faster password hasher. - Yeah, a lot of times they'll generate them.

17:52 So they're explicitly slow.

17:56 Right.

17:56 So like over at Talk Python, I use passlib, not Django, but passlib is awesome. But if you just do, say, an MD5, it is like super fast, right?

18:06 So if you say, I want to take this and generate it, it'll come up with the hashed output, but because it's fast, people could look at that and say, well, let me try like a hundred thousand words I know and see if any of them match that, then that's the password, right?

18:22 You can use more complicated ones, and MD5 is not one you want.

18:25 Something like bcrypt or something, which is a little bit slower, and better, harder to guess.

18:32 But what you should really do is you should like, insert little bits of like salt, like extra text around it, so even if it matches, it's not exactly the same.

18:40 You can't do those guesses.

18:42 But then you should fold it, which means take the output of the first time, feed it back through, take the output of the second time, feed it back through 100, 200, 300,000 times so that if they try to guess, it's super computationally slow.

18:55 I'm sure that's what it's talking about.

18:56 So you don't want to do that when you want your test to run fast 'cause you don't care about hash security during tests.
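The salt-plus-repetition idea can be seen with the standard library's PBKDF2 function. PBKDF2 is swapped in here as a stand-in for the "folding" Michael describes, not as a claim about what passlib or any particular project is configured to use, and the iteration counts are arbitrary examples.

```python
import hashlib
import os
import time


def hash_password(password, salt, iterations):
    """Derive a hash with PBKDF2-HMAC-SHA256, repeating the work `iterations` times."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)


salt = os.urandom(16)  # random salt so equal passwords hash differently

for iterations in (1, 200_000):
    start = time.perf_counter()
    digest = hash_password("correct horse", salt, iterations)
    elapsed = time.perf_counter() - start
    print(f"{iterations:>7} iterations: {elapsed:.4f}s  {digest.hex()[:16]}...")
```

The gap between the two timings is the cost an attacker pays per guess, and it is also the cost a test suite pays every time it creates a user, which is why a cheap hasher is attractive in tests only.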

19:01 - Oh yeah, that makes total sense.

19:03 - That's my guess.

19:04 I don't know for sure, but that's what I think what that probably means.
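In Django terms, the three speed-ups mentioned above (in-memory SQLite, disabled migrations, a fast password hasher) usually end up in a test-only settings module. The sketch below assumes such a module; its name is hypothetical, the values are common choices rather than quotes from Matt's article, and a real project would normally start the file by importing its base settings.

```python
# settings_test.py (hypothetical module name)


class DisableMigrations:
    """Make every app look like it has no migrations to run."""

    def __contains__(self, item):
        return True

    def __getitem__(self, item):
        return None


# Throwaway SQLite database held in memory.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.sqlite3",
        "NAME": ":memory:",
    }
}

# Skip migrations and let Django create tables straight from the models.
MIGRATION_MODULES = DisableMigrations()

# A deliberately weak but fast hasher; never use this outside of tests.
PASSWORD_HASHERS = ["django.contrib.auth.hashers.MD5PasswordHasher"]
```

Keeping these overrides in a separate module means production settings never see the weak hasher.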

19:06 - The last tip, which is always a good tip, is figure out your editor so that you can run your tests from your editor 'cause your cycle time of flipping between code and test is going to be a lot faster if you can run them from your editor.

19:20 Yep.

19:21 These are good tips.

19:22 And if you're super intense, you have the auto run, which I don't do.

19:25 I don't have auto run on.

19:26 I do it once in a while.

19:28 Yeah, yeah.

19:29 Cool.

19:30 Well, back to my rant.

19:31 Let's talk about profiling.

19:32 Okay.

19:33 Actually, this is not exactly the same type of profiling.

19:35 It's more of a look inside of data than of performance.

19:39 So this was recommended to us by one of our listeners named Oz.

19:43 First name only is what we got.

19:45 So thank you, Oz.

19:46 And he is a data scientist who goes around and spends a lot of time working on Python and doing exploratory data analysis.

19:54 And the idea is like going to grab some data, open it up and explore it, right?

19:59 And just start looking around.

20:00 But it might be incomplete, it might be malformed.

20:03 You don't necessarily know exactly what its structure is.

20:06 And he used to do this by hand, but he found this project called Pandas-Profiling, which automates all of this.

20:13 So that sounds handy.

20:15 I mentioned missingno before, missing N-O, as in the missing number, the missing data explorer, which is super cool and I still think that's awesome.

20:23 But this is kind of in the same vein.

20:25 And the idea is given a pandas data frame, you know, pandas has a describe function that says a little bit of detail about it.

20:34 But with this thing, it kind of takes that and supercharges it.

20:37 And you can say df.profile_report(), and it gives you all sorts of stuff.

20:41 It does type inference to say things in this column are integers or numbers, strings, datetimes, whatever. It talks about the unique values, the missing values, quartile statistics, descriptive statistics like mean, mode, standard deviation, a bunch of stuff. Histograms, correlations, missing values (there's the missingno thing I spoke about), text analysis of categories and whatnot, file and image analysis like file sizes and creation dates and sizes of images, and all sorts of stuff.

21:15 So the best way to see this is to look at an example.

21:17 So in our notes, Brian, do you see where there's like, has nice examples and there's like the NASA meteorites one.

21:22 So there's an example for like the US census data, a NASA meteorite one, some Dutch healthcare data and so on.

21:29 If you open that up, you get it, you see what you get out of it.

21:31 Like it's pages of reports of what was in that data frame.

21:35 - Oh, this is great.

21:36 - Isn't that cool? - This has got like, it's tabbed and stuff, so you can--

21:39 - It's tabbed, it's got warnings, it's got pictures, it's got all kinds of analysis.

21:45 - Tons of graphs. - It's got histogram graphs and you can like hide and show details and the details include tabbed, you know, I mean this is a massive dive into what the heck is going on with this data, correlations, heat maps.

22:00 I mean, this is the business right here.

22:02 So this is like one line of code to get this output.

22:06 - This is great.

22:08 This like replaces a couple interns at least.

22:10 (laughing)

22:13 - Sorry interns, but yeah, this is really cool.

22:16 So I totally recommend, if this sounds interesting and you do this kind of work, just pull up the NASA meteorite data and realize that that all came from importing the thing and saying df.profile_report(), basically.

22:29 And you get this.

22:31 You can also click and run that in Binder and Google Colab.

22:34 You can go and interact with it live if you want.

22:36 - Yeah, I love the warnings on some of the things.

22:40 Like, some of the variables will show up with warnings: some of them are skewed, like too many values at one value, and some of them have missing values or zeros showing.

22:51 It does quite a bit of analysis for you about the data right away.

22:54 That's pretty great.

22:55 - Yeah, yeah.

22:56 - Yeah, the types is great because you can just, I mean, you'd have like hundreds or thousands of data points.

23:01 It's not trivial to just say, oh yeah, all of them are true or false.

23:06 All of them are, I know they're Booleans.

23:08 You'd have to look at everything first.

23:10 - Yeah.

23:11 It's one of those things that's like easy to adopt, but looks really useful and it's also beautiful.

23:16 So yeah, check it out.

23:18 It looks great.

23:18 - I want to talk about object-oriented programming a little bit.

23:21 - Oh, okay.

23:21 - Actually, it's not something, I mean, all of Python really is object-oriented because we use everything as an object really.

23:28 - Deep, deep down, everything's a Py object pointer.

23:31 - Yeah, there's an article by Redowan Delowar called Interfaces, Mixins, and Building Powerful Custom Data Structures in Python.

23:40 And I really liked it because it's Python focused. I mean, I've actually been disappointed with a lot of the object-oriented discussions around Python.

23:49 A lot of them basically, I think, lament that the system isn't the same as in other languages, but it's just not. Get over it. This is a Python-centric discussion talking about interfaces and abstract base classes, both informal and formal, and using mixins. It starts out with the concept that there's a base amount of knowledge that people have to have to discuss this sort of thing: understanding why they're useful and what some of the downsides and benefits are.

24:26 And so he actually starts by talking, it's not too deep of a discussion, but it's an interesting discussion and I think it's a good background to discuss it.

24:36 Then he talks about, like one of the things you kind of get into a little bit and you go, well, what's really different about an abstract base class and an interface, for instance.

24:45 And he writes, "Interfaces can be thought of as a special case of an abstract base class.

24:51 It's imperative that all methods of an interface are abstract methods and that classes don't store any data or any state or instance variables.

24:59 However, in case of abstract base classes, the methods are generally abstract, but there can also be methods that provide implementation, concrete methods, and also these classes can have instance variables." So that's a nice distinction.
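The distinction in that quote can be sketched with the standard-library `abc` module. The class names are invented for illustration and are not taken from the article.

```python
from abc import ABC, abstractmethod


class Shape(ABC):
    """Interface-style ABC: only abstract methods, no state."""

    @abstractmethod
    def area(self):
        ...


class NamedShape(ABC):
    """General ABC: an abstract method plus a concrete method and instance state."""

    def __init__(self, name):
        self.name = name  # instance variable, allowed in an ABC

    @abstractmethod
    def area(self):
        ...

    def describe(self):  # concrete method shared by subclasses
        return f"{self.name} with area {self.area()}"


class Square(NamedShape):
    def __init__(self, side):
        super().__init__("square")
        self.side = side

    def area(self):
        return self.side ** 2


print(Square(3).describe())  # square with area 9
try:
    Shape()
except TypeError as error:
    print("cannot instantiate:", error)
```

Python enforces only the abstract-method part; keeping an interface free of state and concrete methods is a convention the author of the class has to uphold.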

25:13 And mixins are where you have a parent class that provides some functionality to a subclass, but it's not intended to be instantiated itself.

25:21 That's why it's sort of similar to abstract base classes and other things.
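A mixin in that sense might look like the following. It is a minimal sketch with made-up class names: the mixin contributes one method and relies on the host class to supply the data.

```python
import json


class JsonMixin:
    """Mixin: adds to_json() to any class; not meant to be instantiated alone."""

    def to_json(self):
        return json.dumps(vars(self), sort_keys=True)


class Point(JsonMixin):
    def __init__(self, x, y):
        self.x = x
        self.y = y


print(Point(1, 2).to_json())  # {"x": 1, "y": 2}
```

Unlike an abstract base class, nothing stops you from instantiating `JsonMixin` directly; it just would not be useful on its own.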

25:25 So having all this discussion from one person in a good discussion, I think is a really great thing.

25:32 And there are definitely times I don't pull into class hierarchies and base classes that much, but there's times when you need them and they're very handy.

25:41 So this is cool.

25:42 This is super cool, actually.

25:43 I really like this analysis.

25:45 I love that it's really Python focused because a lot of times the mechanics of the language just don't support some of the object-oriented programming ideas in the same way, right?

25:56 Like the interface keyword doesn't exist, right?

26:00 So this distinction, you have to make it in a conventional sense.

26:05 Like we come up with a convention that we don't have concrete methods or state with interfaces, right?

26:11 but there's nothing, there's not like an interface keyword in Python.

26:15 So I like it.

26:16 I, I'm a big fan of object oriented programming, and I'm very aware that in Python, a lot of times what people use for classes is simply unneeded.

26:25 And I, I know where that comes from and I want, I want to make sure that people don't overuse it, right?

26:30 If you come from Java or C# or one of these OOP only languages, everything's a class, and so you're just going to start creating classes.

26:37 But if what you really want is to group functions and a couple of pieces of data that's like shared, that's a module, right?

26:43 You don't need a class, right?

26:44 You could still say module name dot and get the list of them.

26:47 And it's like a static class or something like that.

26:50 But sometimes you want to model stuff with object-oriented programming and understanding the right way to do it in Python is really cool.

26:57 This looks like a good one.

26:58 - Yeah, and also there is a built-in library called ABC for abstract base class within Python.

27:05 And it seems like a, for a lot of people, it seems like a mystery thing that only advanced people use, but it's really not that complicated.

27:12 And this article uses that as well and talks about it.

27:16 So it's good.

27:17 One of my favorite things about abstract base classes and abstract methods is in PyCharm.

27:23 If I have a class that derives from an abstract class, all I have to write is class,

27:28 the thing I'm trying to create, parentheses, abstract class name, close parentheses, colon, and then you just hit Alt+Enter, and it'll pull up all the abstract methods.

27:37 You can highlight them, say implement.

27:39 and it goes boom and it'll just write the whole class for you.

27:41 But if it's not abstract, obviously it won't do that, right?

27:43 So the abstractness will tell the editor to write the stubs of all the functions for you.

27:48 That's a cool reason to use them.

27:50 That's almost reason to have them in the first place.

27:53 Almost.

27:54 We've pickled before, haven't we?

27:56 Yeah, we have talked about pickle a few times.

27:58 Yes, have we talked about this article?

28:00 I don't remember.

28:01 I don't think so.

28:02 If we have, apologies.

28:04 But it's short and interesting.

28:06 Ned Batchelder wrote this article called Pickle's nine flaws.

28:10 And so I want to talk about that.

28:12 This comes to us via PyCoders.com, which is very cool.

28:16 And we've talked about the drawbacks, we've talked about the benefits, but what I liked about this article is concise, but it shows you all the trade-offs you're making.

28:23 Right?

28:24 So quickly, I'll just go through the nine.

28:27 One, it's insecure.

28:29 And the reason that it's insecure is not because pickles contain code, but because they create these objects by calling the constructors named in the pickle. So any callable can be used in place of your class name to construct objects. So basically it runs potentially arbitrary code, depending on where it got it from. Number two: old pickles look like old code. So if your code changes between the time you pickled it and the time you load it, you get the old one brought back to life. So if you added fields or other capabilities, those are not going to be there. Or if you took away fields, they're still going to be there.
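The first flaw is easy to see with a harmless callable. In this sketch the made-up class `Sneaky` uses `__reduce__` to tell pickle which callable to invoke at load time; `len` is used because it is safe, but the mechanism is the same for any importable callable.

```python
import pickle


class Sneaky:
    """Tells pickle to 'rebuild' it by calling an arbitrary callable."""

    def __reduce__(self):
        # len is harmless; a hostile pickle could name os.system instead.
        return (len, ("not a Sneaky",))


payload = pickle.dumps(Sneaky())
restored = pickle.loads(payload)  # loading runs len("not a Sneaky")

print(type(restored).__name__, restored)  # int 12
```

Nothing in the payload is checked against what you expected to receive, which is why unpickling data from an untrusted source is unsafe.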

29:04 Yeah, it's implicit. So they will serialize whatever your object structure is. And they often over-serialize, because they'll serialize everything. So if you have cached data, or precomputed data that you wouldn't ever normally save, well, that's getting saved. Yeah. One of the weird ones, and this has caught me out before, is that the dunder init, the constructor, is not called. Your objects are recreated, but dunder init is not called; the attributes just get their saved values. So that might leave the object in some kind of weird state, like maybe it would fail some validation or something. It's Python only: you can't share it with other programs, because it's a Python-only format. They're not readable, they're binary. And it will seem like it will pickle code. So if you have a function you're hanging on to and you pass it along, like some kind of lambda function or whatever, or a class that's been passed over, and you have a list of them or you're holding on to them, and you think that it's going to save that, all it really saves is basically the name of the function.
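
Two of those points can be sketched with the standard library alone. The Account class is hypothetical and assumes the default pickling behavior, with no custom __reduce__ or __getstate__. The first part shows that __init__ and its validation are skipped on load. The second shows that a pickled function is stored as a reference to its module and name.

```python
import json
import pickle


class Account:
    """Hypothetical class, invented for this illustration."""

    init_calls = 0

    def __init__(self, balance):
        if balance < 0:
            raise ValueError("balance must not be negative")
        Account.init_calls += 1
        self.balance = balance


acct = Account(100)
acct.balance = -50  # a state that __init__ would have rejected

# Unpickling restores the attributes directly; __init__ never runs again,
# so the validation above is bypassed and init_calls does not change.
restored = pickle.loads(pickle.dumps(acct))

# Pickling a function stores where to find it, not its code.
func_bytes = pickle.dumps(json.dumps)
same_function = pickle.loads(func_bytes)
```

The restored account keeps its invalid balance, and loading the function pickle just looks the function up again by name on the loading side.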

30:09 So, those are gone. And I think one of the real big challenges is it's actually slower than things like JSON and whatnot.

30:17 If you're willing to accept those tradeoffs because it's super fast, that's one thing, but it's not. And are you telling me that we covered it before?

30:25 - I was like 89, but I had forgotten.

30:27 So it was like a couple months ago, right?

30:29 - Yeah, that's a while ago.

30:30 Anyway, it's good to go over it again.

30:32 - Definitely.

30:33 - Be careful with your pickling.

30:34 All right, how about anything extra?

30:35 That was our top six items.

30:38 What else we got?

30:39 - I don't have anything extra.

30:40 Do you have anything extra?

30:41 - Pathlib.

30:42 Speaking of stuff we covered before, we talked about Pathlib a couple times.

30:45 You talked about Chris May's article or whatever it was around Pathlib, which is cool.

30:49 And I said basically, I've just got to get my mind around not using os.path and just get into this, right?

30:57 And people sent me feedback like, Michael, you should get your mind into this.

31:00 Of course you should do this, right?

31:01 And I'm like, yeah, yeah, I know.

31:02 However, Brett Abel sent over a one-line tweet that may just seal the deal for me.

31:08 Like, this is sweet.

31:09 So he said, how about this?

31:12 text = Path(file).read_text(), done.

31:15 No context managers, no open, none of that.

31:18 And I'm like, oh, that's, okay, that's pretty awesome.

31:21 (laughing)

31:23 Anyway, I just wanted to give a little shout out to that one-liner, 'cause that's pretty nice.
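
Written out, the one-liner looks roughly like this. It is a sketch that creates its own temporary file so it can run anywhere; the file name notes.txt and the variable names are assumptions for illustration. The version with open and a context manager is included for comparison.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file, created here only so the example is runnable.
    file = Path(tmp) / "notes.txt"
    file.write_text("hello pathlib\n", encoding="utf-8")

    # The one-liner: no open(), no context manager.
    text = Path(file).read_text(encoding="utf-8")

    # The os.path-era equivalent, for comparison.
    with open(file, encoding="utf-8") as f:
        text_via_open = f.read()
```

Both approaches read the same content. read_text opens the file, reads it, and closes it in one call.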

31:28 And then also, I was just a guest on a podcast out of the UK called A Question of Code, where the hosts, Ed and Tom, and I discussed why Python is fun, why it's good for beginners and for experts, why it might give you tangible results, tangible programs, faster than, say, JavaScript, career stuff, all kinds of stuff.

31:50 I linked to that if people want to check that out.

31:52 - That's cool.

31:53 - Yeah, it was a lot of fun.

31:53 Those guys are running a good show over there.

31:55 - Yeah, I think I'm talking with them tomorrow.

31:57 - Right on, how cool.

31:59 - One of the things I like about it is the accents.

32:02 Just, you know, 'cause accents are fun.

32:04 So I was gonna ask you, would you consider learning how to do a British accent?

32:07 'Cause that would be great for the show.

32:09 - I would love to.

32:10 I fear I would just end up insulting all the British people and not coming across really well.

32:17 But I love British accents.

32:19 If we had enough Patreon supporters, I would be more than happy to volunteer to move to England to develop an accent.

32:28 - Maybe just live in London for a few years.

32:31 If they're gonna fund that for you, that would be awesome.

32:33 - Yeah, that'd be great.

32:34 - London's a great town.

32:35 (laughing)

32:36 - Okay.

32:37 - All right, how about another joke?

32:38 - I'd love another joke.

32:39 - So this one is by Caitlin Hudon, but was pointed out to us by Erin Brown.

32:45 So she tweeted this on Twitter, and he's like, "Hey, you guys should think about this." So you ready?

32:50 - Yeah. - Caitlin says, "I have a Python joke, "but I don't think this is the right environment." (laughing)

32:56 - Yeah, so there's a ton of these type of jokes.

32:59 Like I have a joke, but, so this is a new thing, right?

33:02 I don't know, it's probably gonna be over by the time this airs. - Yeah, probably.

33:05 - But I'm really amused by these types of jokes.

33:08 - Yeah, I love it.

33:09 This kind of touches on the whole virtual environment, package management, isolation, chaos.

33:14 I mean, there was that XKCD as well about that.

33:16 - Yeah, okay, so while we're here, I'm gonna read some from Luciano.

33:21 Luciano Ramalho, he's a Python author, and he's an awesome guy.

33:25 Here's a couple other related ones.

33:27 I have a Haskell joke, but it's not popular.

33:29 I have a Scala joke, but nobody understands it.

33:33 I have a Ruby joke, but it's funnier in Elixir.

33:35 And I have a Rust joke, but I can't compile it.

33:38 - Yeah, those are all good, nice.

33:39 - Cool. - Nice, nice.

33:41 All right, well, Brian, thanks for being here, as always.

33:43 - Thank you, talk to you later.

33:44 - Bye.

33:45 Thank you for listening to Python Bytes.

33:47 Follow the show on Twitter via @PythonBytes.

33:50 That's Python Bytes as in B-Y-T-E-S.

33:53 And get the full show notes at PythonBytes.fm.

33:56 If you have a news item you want featured, just visit PythonBytes.fm and send it our way.

34:00 We're always on the lookout for sharing something cool.

34:03 On behalf of myself and Brian Okken, this is Michael Kennedy.

34:06 Thank you for listening and sharing this podcast with your friends and colleagues.
