Transcript #197: Structured concurrency in Python

Recorded on Thursday, Aug 27, 2020.

00:00 Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds. This is episode 197, recorded August 26th, 2020. Brian, can you believe it's the end of August? Even if I can't say it, it's still true. No, I can't. I don't know where August went. I thought this whole pandemic thing would make the summer seem long and slow. It seems like it just went faster. Yeah, I've got a Lego kit that I was planning on doing the first week of summer vacation, and it's still sitting here.

00:30 - Yeah, for sure.

00:32 Yeah, there's a lot of things I want to get done before the sun goes away and rain starts for six months straight.

00:37 That's a Pacific Northwest problem, but it's our problem.

00:39 All right, now this episode is brought to you by us as well.

00:42 We'll tell you more about the things that we're doing that we think you will appreciate later.

00:46 Right now, I want to talk about something that I think we might've covered before, but I don't know if we've ever satisfactorily covered it.

00:52 Maybe this time we'll get a little closer and that's async IO.

00:55 - Oh yeah, I think that's a new topic.

00:57 - It's a totally new topic.

00:58 Covered only less than GUIs now.

01:01 So there's a new, how should I put it?

01:05 A new compatibility-like layer library that allows you to work a little bit better with AsyncIO and some of the other Async libraries that are not directly, immediately the same as or built right on top of AsyncIO.

01:23 Curio from David Beazley and Trio from Nathaniel Smith.

01:28 So there's an article that talks about this.

01:30 I'm gonna mention as part of this conversation.

01:33 And they say, "Hey, Python has three well-known concurrency libraries built around async and await syntax: asyncio, Curio, and Trio." True, but where's unsync, people?

01:43 Unsync is the best of all four of those.

01:44 I don't know where unsync is.

01:46 Anyway, unsync is not part of this conversation, but unsync plays a role a little bit like this thing I'm gonna mention today, which is AnyIO.

01:55 And it's a pretty clever name because the idea is that it provides structured concurrency primitives built on top of AsyncIO.

02:04 - Okay.

02:04 - Right, so one of the challenges with AsyncIO is you can kick off a bunch of tasks and then not wait for them, and your program can exit, or you can do other things, and maybe you've seen runtime warnings like task such and such was never awaited.

02:17 You're like, hmm, I wonder what that means.

02:20 Well, that probably means your program exited while it was halfway done, or something like that, right?

02:24 Or your thing returned a value before it waited for it to finish, right?

02:28 And at the low level, something that's a little bit frustrating or annoying that you gotta deal with is that you've got to make sure that all the stuff you started on the async event loop, that you wait for that event loop to finish before your program completely shuts down or completely carries on.

02:43 And so that's basically the idea of this library.

02:47 It's a compatibility layer across those three types, those three different well-known concurrency libraries that provides this structured concurrency.

02:56 So you look at Wikipedia, they say structured concurrency is a programming paradigm aimed at improving the clarity, quality, and development time of a computer program by using a structured approach to concurrent programming.

03:09 The core concept is the encapsulation of threads of execution by way of control flow constructs that have clear entry and exit points.

03:18 In Python, this mostly manifests itself through this library as async with blocks or async context managers.

03:27 You're like, "I'm going to do some async work, so let's create a with block, do all the work in there." And then by the way, when you leave the with block, it's going to have made sure all the tasks that were started and the tasks started by those tasks and so on, all finished.

03:41 Oh, that's nice.

03:42 Yeah, that's pretty cool.

03:44 So the way it works is you basically go anyio.create_task_group(), and then from the task group you can spawn other subtasks, and it will keep track of those.

03:55 If there's an exception, I believe it will cancel the other undone ones, the unfinished ones, and so on.

04:00 So it's about, say, we're just gonna go through this thing and it's all gonna run here, and it enters at the top and it exits at the bottom of the with block.
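The "started but never waited for" problem and its fix can be sketched with the standard library alone. This is not AnyIO's API: `asyncio.gather` stands in here for what a task group does when its `async with` block ends, and the `worker` function and its arguments are made up for illustration. On Python 3.11 and later, `asyncio.TaskGroup` offers a similar `async with` block in the standard library.

```python
import asyncio

async def worker(name, delay, log):
    await asyncio.sleep(delay)  # stand-in for real I/O
    log.append(name)

async def fire_and_forget(log):
    # Tasks are started, but nothing waits for them before returning.
    asyncio.create_task(worker("a", 0.01, log))
    asyncio.create_task(worker("b", 0.02, log))

async def wait_for_all(log):
    tasks = [
        asyncio.create_task(worker("a", 0.01, log)),
        asyncio.create_task(worker("b", 0.02, log)),
    ]
    # Do not leave this function until every child task has finished.
    await asyncio.gather(*tasks)

forgotten, awaited = [], []
asyncio.run(fire_and_forget(forgotten))  # asyncio.run cancels the leftover tasks
asyncio.run(wait_for_all(awaited))
print(forgotten, awaited)
```

In the first call the workers are cancelled before they record anything; in the second, both have finished by the time the function returns.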

04:09 That's pretty cool, right?

04:10 - Yeah.

04:11 - Yeah, so I think that that's pretty neat.

04:12 Also, it has other primitives, that's like a real simple example.

04:15 Other things it includes are synchronization primitives, like locks.

04:21 So if you create a re-entrant lock in Python, often called a critical section in things like C++ and whatnot, it's never ever gonna help you.

04:31 Well, maybe that's a little bit strong.

04:32 It's likely not gonna help you because those mechanisms come from the operating system process level.

04:39 And what they do is they make sure two threads don't run at the same time.

04:42 Well, with async I/O, it's all a bunch of stuff that's being broken apart on a single thread, right?

04:49 It's all on the one thread: wherever the event loop is run, run_until_complete or whatever, wherever that's happening, that's the thread.

04:56 So like the thread locks don't matter.

04:57 It's all the same thread.

04:59 Like you're not gonna block anything.

05:00 So having primitives that will kind of function like thread locks to protect data while stuff is happening, while it's in temporarily invalid states, that's pretty cool for async I/O.

05:10 - Okay, so you need it or you don't need it?

05:12 - You probably need it.

05:13 I think people often don't really think too much about these invalid states our programs get into, and you think, well, async IO, it's gonna be fine.

05:21 And a lot of times what you're doing with async IO is kind of standalone, like I'm gonna kick off this thing and when it comes back, I'm gonna take the data and do something.

05:30 But if you're modifying shared data structures, you can still end up in some kind of event loop race condition.

05:36 It's not as bad as true threading because, I don't believe, with something like a plus equals, right?

05:43 Something that actually might be multiple steps at the lower-level runtime.

05:47 I don't think that it would get broken up that fine-grained. But if you say, like, await debit this account this amount of money, then await put that amount into the other one, and some other task is reading in some kind of loop, that level of higher-order, temporarily invalid state could be a problem for async IO, and you want some kind of lock.

06:09 So this comes with that.
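The debit-then-credit scenario can be sketched with the standard library's `asyncio.Lock`, used here as a stand-in for the lock AnyIO provides. The account names, amounts, and the `audit` task are assumptions made up for this example.

```python
import asyncio

async def transfer(accounts, lock, amount):
    async with lock:
        accounts["checking"] -= amount
        await asyncio.sleep(0.01)  # the money is "in flight" during this await
        accounts["savings"] += amount

async def audit(accounts, lock, totals):
    # Repeatedly read the combined balance; it should never appear to change.
    for _ in range(5):
        async with lock:
            totals.append(accounts["checking"] + accounts["savings"])
        await asyncio.sleep(0.005)

async def main():
    accounts = {"checking": 100, "savings": 0}
    lock = asyncio.Lock()  # created inside the running event loop
    totals = []
    await asyncio.gather(
        transfer(accounts, lock, 30),
        audit(accounts, lock, totals),
    )
    return accounts, totals

accounts, totals = asyncio.run(main())
print(accounts, totals)
```

Without the lock, the auditing task could run during the `await` in the middle of the transfer and see a total that is 30 short.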

06:10 It comes with streams, which are similar to queues, and timeouts through things like move_on_after or fail_after a certain amount of time, and so on.

06:18 So it's a pretty cool little library.
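The timeout idea can be illustrated with plain asyncio. This sketch uses `asyncio.wait_for`, which is roughly comparable to the "fail after" behavior mentioned above in that it raises when the time limit passes; it is not AnyIO's own API, and `slow_operation` is a made-up placeholder.

```python
import asyncio

async def slow_operation():
    await asyncio.sleep(1)  # pretend this is a slow network call
    return "done"

async def main():
    try:
        # Give up if the operation takes longer than 50 milliseconds.
        return await asyncio.wait_for(slow_operation(), timeout=0.05)
    except asyncio.TimeoutError:
        return "timed out"

outcome = asyncio.run(main())
print(outcome)
```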

06:19 - Yeah, that's nice.

06:20 - My vote still for Unsync is the best of the four, even though it was unmentioned.

06:24 (laughing)

06:26 - Isn't Unsync built on those also?

06:28 - It's a compatibility layer that takes async IO, threading and multiprocessing and turns them all into things that you can await.

06:35 - Oh yeah.

06:37 So don't you think there should be like a standard, like a, they should get together like some consortium and have a standard about this.

06:43 - Yeah, well they probably should, but we're still in the early stages of figuring out what the right API is.

06:49 - That's right, that's why they haven't done it.

06:51 - There's something else that has, that could use some standards, and that's in a lot of data science libraries.

06:58 There's an announcement that there's a new consortium for Python data API standards.

07:03 So there is one happening, and it's happening actually quite fast.

07:07 They're getting started right away, and there are activities right from the announcement.

07:13 Then in September, I believe, they're going to kick off some work on data frames or on, no, starting with arrays and then move on to data frames.

07:23 And so, okay, I'm getting ahead of myself.

07:25 Their little blurb says one of the unintended consequences of the advances in multiple frameworks for data science, machine learning, deep learning, and numerical computing is that there is fragmentation in using the tools, and there are differences in common function signatures. They have one example that shows the mean function, to get the average or mean. People are going to flame me for calling average mean, but as a commoner I kind of think of those as the same thing.

07:58 They show the different frameworks, and some of them are common with other ones, so there are five different interfaces over the eight frameworks for just the mean function for an array. Yeah, and what's crazy is they all are basically the same. They're so, so similar, but they're not the same. Not code-wise the same, but they might as well be.
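To make the "same function, different signature" point concrete: NumPy's `mean` names the reduction argument `axis`, while PyTorch's names it `dim`. The sketch below uses two made-up pure-Python functions that mimic that naming difference, plus a small hypothetical adapter so calling code can use one spelling. None of these names come from a real library.

```python
def mean_axis_style(rows, axis=0):
    """Hypothetical library A: the reduction direction is called `axis`."""
    if axis == 0:
        return [sum(col) / len(col) for col in zip(*rows)]
    return [sum(row) / len(row) for row in rows]

def mean_dim_style(rows, dim=0):
    """Hypothetical library B: same computation, but the keyword is `dim`."""
    return mean_axis_style(rows, axis=dim)

# A tiny adapter that translates one common spelling into each backend's keyword.
BACKENDS = {
    "a": lambda rows, axis: mean_axis_style(rows, axis=axis),
    "b": lambda rows, axis: mean_dim_style(rows, dim=axis),
}

def mean(rows, axis=0, backend="a"):
    return BACKENDS[backend](rows, axis)

data = [[1.0, 2.0], [3.0, 4.0]]
col_means_a = mean(data, axis=0, backend="a")
col_means_b = mean(data, axis=0, backend="b")
row_means_b = mean(data, axis=1, backend="b")
print(col_means_a, col_means_b, row_means_b)
```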

08:18 Yeah, and so one of the issues is that people are using more than one framework for different parts of their data flow, and sometimes you can kind of forget which one you're using. Having a lot of these things in common

08:34 would just make life easier, I think. I don't know how far they'll get with this, but I think it's a really good idea. They're not trying to make all of these frameworks look exactly the same, but to find commonalities in arrays and data frames.

08:49 And they note that arrays are also called tensors.

08:55 Trying to make some of those common is, I think, a really good idea. For some of the easy, simple stuff, why not? It seems like a great idea. It seems like a huge challenge, though. Like, whose function is gonna be the one where they say, "Yeah, we're dropping this part of our API to make it look like everyone else's"? Right, and that's why I think they've put a lot of thought into how to go about this process and try to convince people. They're trying to be in between the framework authors and maintainers and the community, and to do some review process for different APIs: put a proposal out, and get feedback both from the different projects and from the community to have more input.

09:43 It isn't just like one set of people saying, "Hey, I think this should be this way." Yeah, it's a good idea.

09:49 It would be great if a lot of these applications or these frameworks may be renamed.

09:53 If it's the same function, if it's like for instance, mean in this example, if it's spelled exactly the same, maybe it should be the same API.

10:01 And if you want a special version of it, maybe have an underscore with an extra, you know, some reason why it's different.

10:08 You can have extra different functions.

10:10 Yeah, it seems like you could find some pretty good common ground here.

10:13 It's a good idea.

Make it happen, you know? It'd just be easier to mix and match frameworks and use the best for different situations. Because I can certainly see you're like, "I'm working with pandas here. It would be great if I could do this on CUDA cores with CuPy, but I don't really know that. It's close, but it's not the same. So I'm just gonna keep stroking along here." As opposed to: change the import statement, now it runs there. Yep. I don't know if it's ever really gonna be like you can just swap out a different framework, but for some of the common stuff it'd really be great. And that's one of the reasons we're bringing it up, so that people can get on board and start being part of this review process if they care about it.

Yeah, it also seems like there might be some room for adaptive layers, like "from CuPy import pandas layer" or something like that, where basically you talk to it in terms of, say, a pandas API and it converts it to its internal one. It's like, oh, these arguments are switched in order, or this keyword is named differently, or whatever. And there are even differences where, even if the API looks the same or is very similar, the default might be, in some cases, None versus False versus no value or things. I don't know what no value means, but anyway. Yeah, cool, that's a good one.

Now, also good is the things that we're working on. Brian, you want to tell folks about our Patreon? Actually, we kind of silently announced it a while ago, but we have 47 patrons now, and it's set up for a monthly contribution, and we really appreciate people helping out because there are some expenses with the show.

11:48 So that's really cool.

11:50 We'd love to see that grow.

11:52 We'd also like to hear from people about how we'd like to come up with some special thank you benefits for patrons.

11:57 And so I'd like to have ideas come from the community.

12:00 If you can come up with some ideas, we'll think about it.

12:04 And I'm trying to figure out how to get to it.

12:06 So on our Python Bytes--

12:08 - If you're on any episode page, it's there on the right.

12:11 - Okay, if you go to an episode page, got it.

12:13 - Yep, and it says on the right, I believe somewhere, it says Sponsor Sun.

12:18 I'll have to double check, but I believe it does.

12:20 - Okay, we'll double check.

12:21 - It can for sure, if it doesn't already.

12:24 (laughing)

12:26 And also I wanna just tell folks about a couple things going on over at Talk Python Training.

12:30 We're doing a webcast on helping people move from using Excel for all their data analysis to Pandas, basically moving from Excel to the Python data science stack, which has all sorts of cool benefits and really neat things you can do there.

12:43 So Chris Moffitt is gonna come on and write a course with us, and he's gonna do a webcast, which I announced like 12, 15 hours ago, and it already has like 600 people signed up for it.

12:54 So it's free, people can just come sign up.

12:56 It happens late September 29th.

12:59 I'll put the link at the extras section of the show notes so people can find it there.

13:03 And also the Python Memory Management course is out for early access.

13:07 A bunch of people are signing up and enjoying it.

13:08 So if you wanna get to it soon, get to it early, people can check that out as well.

13:13 - Very exciting.

13:14 - So this next one I wanna talk about has to do with manners.

13:17 What kind of developer are you?

13:19 Are you a polite developer?

13:21 Are you talking to the framework?

13:22 Do you always check in with it to see how it feels, what you're allowed to do?

13:27 Are you kind of a rebel?

13:28 You're just gonna do what you like, but every now and then you're gonna get smacked down by the framework with an exception.

13:33 - I don't want to describe what kind of developer I am because I don't want the explicit tag on this episode.

13:39 (laughing)

13:40 - So there's an article that talks about something I think is pretty fun and interesting to consider and it talks about the two types of error handling patterns or mechanisms that you might use when you're writing code.

13:55 And Python naturally leans towards one, but there might be times you don't want to use it.

14:01 And that is, the two patterns are, it's easier to ask for forgiveness than permission.

14:06 That's one.

14:07 And the other one is look before you leap, or please may I, right?

14:12 And with the look before you leap, it's a lot of checks, like something you might do in C code.

14:19 So you would say, I'm gonna create a file.

14:21 Oh, does the folder exist?

14:24 If the folder doesn't exist, I'm gonna need to create the folder, and then I can put the file there.

14:29 Do I have permission to write the file?

14:30 Yes, okay, then I'll go ahead and write the file.

14:32 Right, you're always checking if I can do this, if this is in the right state and so on.

14:37 That's the look before you leap style.

14:40 The ask for forgiveness style is just try, with open this thing, oh, that didn't work, catch the exception, right, except some IOError or something like that.
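The two styles side by side, as a minimal sketch. The function names and the file names are made up for illustration; in Python 3, `IOError` is an alias of `OSError`, which is what the second function catches.

```python
import os
import tempfile

def read_lbyl(path):
    """Look before you leap: check first, then act."""
    if os.path.isfile(path) and os.access(path, os.R_OK):
        with open(path, encoding="utf-8") as f:
            return f.read()
    return None

def read_eafp(path):
    """Easier to ask forgiveness than permission: act, then handle failure."""
    try:
        with open(path, encoding="utf-8") as f:
            return f.read()
    except OSError:
        return None

with tempfile.TemporaryDirectory() as folder:
    existing = os.path.join(folder, "notes.txt")
    with open(existing, "w", encoding="utf-8") as f:
        f.write("hello")
    missing = os.path.join(folder, "missing.txt")
    results = [
        read_lbyl(existing),
        read_eafp(existing),
        read_lbyl(missing),
        read_eafp(missing),
    ]
print(results)
```

Note that the first version can still fail if the file disappears between the check and the `open` call, which is part of why the checks alone are rarely enough.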

14:51 So there's reasons you might wanna use both.

14:54 Python leans or nudges you towards the ask for forgiveness, try except version.

15:00 The reason is, let's say you're opening a file and it's a JSON file.

15:04 You might check first, does the file exist?

15:07 Yes, do I have permission to read it?

15:08 Yes, okay, open the file.

15:10 Well, guess what?

15:11 What if the file's malformed and you try to feed it over to, like, json.load, and you give it the file pointer? It's not gonna say, sorry, it's malformed. It's gonna raise an exception.

15:21 It's not gonna return, like, a value, some malformed-constant weird thing.

15:25 It's just gonna throw an exception and say, you know, invalid thing on line seven or whatever, right?

15:30 And so what that means is, even if you wanted to do the look before you leap, you probably can't test everything and you're gonna end up in a situation where you're still gonna have to have the try except block anyway.

15:41 So maybe you should just always do that, right?

15:45 Maybe you should just go, well, if we're gonna have to have exception handling anyway, that's just, we're gonna do exception handling as much as possible and not do these tests.
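A short sketch of the malformed JSON case. `json.load` signals bad input by raising `json.JSONDecodeError`, which carries the line and column of the problem, so there is no return value to check ahead of time. The `load_config` helper and the sample documents are made up for this example.

```python
import io
import json

def load_config(stream):
    try:
        return json.load(stream)
    except json.JSONDecodeError as err:
        # The only way to learn the content is malformed is the exception itself.
        return {"error": f"line {err.lineno}, column {err.colno}: {err.msg}"}

good = load_config(io.StringIO('{"name": "demo"}'))
bad = load_config(io.StringIO('{"name": "demo",\n "port": }'))
print(good)
print(bad)
```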

15:53 So that's this article over here. It's on switowski.com.

15:59 - Mike Wazowski.

16:00 - Oh yeah, it's by Sebastian Witowski.

16:03 So yeah, it's his, I didn't realize that it was his article.

16:06 So it's his article.

16:09 Anyway, he talks about like, what is the relative performance of these things and tries to talk about it from a well, sure, it's cool to think of how it looks in code, but is one faster or one slower than the other?

16:22 And this actually came up on Talk Python as well.

16:25 And so I said, look, if we're going to come up with an example, let's have a class and a base class, and let's have the base class define an attribute.

16:34 And sometimes let's try to access the attribute.

16:37 And when you don't have the base class, or when you only have the base class, it'll crash, right?

16:41 'Cause it's in the derived class.

16:42 So let's say we have two ways to test.

16:45 We could either ask, does it have the attribute, and then try to access it, or we could just try to access it.

16:52 And it says, well look, if it works all the time, and you're not actually getting errors, and you're doing this, it's 30% slower to do the look before you leap.

17:01 'Cause you're doing an extra test, and basically the try-except block is more or less free, like it doesn't cost anything, if there's not actually an error.

17:10 But if you turn it around, and you say, no it's not there, all of a sudden, it turns out the ask, the try-except block is four times slower.

17:21 That's a lot slower.

17:21 - Oh, really?

17:22 - Because the raising of the exception, figuring out this call stack, all that kind of stuff is expensive.

17:28 So instead of just going, does it have the attribute?

17:30 You're going, well, let's do the whole call stack thing, every error, right?

17:34 And create an object and throw it and all that kind of stuff.

17:37 So it's a lot slower when there are errors.

17:39 And anyway, it's an interesting thing to consider if you care about performance and things like parsing integers or parsing data that might sometimes fail, might not, you know, sometimes it doesn't fail.
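A rough way to try this comparison yourself, loosely following the setup described above. The class and function names are made up for illustration and this is not the article's exact code; absolute numbers will vary by machine and Python version, so no expected timings are shown.

```python
import timeit

class BaseClass:
    hello = "world"

class WithAttr(BaseClass):
    pass

class WithoutAttr:
    pass

def lbyl(obj):
    # Look before you leap: test for the attribute, then read it.
    if hasattr(obj, "hello"):
        return obj.hello
    return None

def eafp(obj):
    # Ask forgiveness: just read it and handle the failure.
    try:
        return obj.hello
    except AttributeError:
        return None

def bench(func, obj, number=200_000):
    return timeit.timeit(lambda: func(obj), number=number)

present, absent = WithAttr(), WithoutAttr()
timings = {
    "lbyl, attribute present": bench(lbyl, present),
    "eafp, attribute present": bench(eafp, present),
    "lbyl, attribute missing": bench(lbyl, absent),
    "eafp, attribute missing": bench(eafp, absent),
}
for label, seconds in timings.items():
    print(f"{label}: {seconds:.3f}s")
```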

17:52 Yeah, okay.

17:53 Devil's advocate here.

17:54 His example doesn't have any activity in the ask for forgiveness if it isn't there.

18:01 That's the way I saw it when I first read it as well.

18:03 There's two sections.

18:04 There's like one part where he says, let's do it with the attribute on the derived class and let's do it again a second time by taking away the attribute and seeing what it's like.

18:13 Right, but I mean, the code, if it doesn't exist, it just doesn't do anything.

18:17 Right, right, right.

18:18 In reality, you're still going to have to do something.

18:20 Notify the user it's wrong, whatever.

18:22 Yeah, for sure. That's a good point.

18:24 Like it's just basically a try, except pass.

18:26 Yeah. So what do you think about this?

18:28 So what I think is you're going to have to write the try, except anyway, almost all the time.

18:36 And you don't want both.

18:38 Like that doesn't seem good.

18:40 That seems like just extra complexity.

18:43 So when it makes sense, just go with ask for forgiveness.

18:46 Just embrace exceptions, right?

18:48 Remember you have a finally block that often can like get rid of a test as well.

18:53 You can have multiple except clauses based on error type.

18:58 I think people should do a lot with that.

19:00 That said, if your goal is to parse specific data, right, like I'm gonna read this number I got off of the internet by web scraping, and there's a million records here, I'm gonna parse it.

19:12 If you wanna do that a lot, lot faster, that might make a lot of sense.

19:15 I actually have a gist example that I put up trying to compare the speed of these things in a mixed case.

19:22 So like the cases we're looking at here are kind of strange because it's like, well, there's, it's all errors or it's zero errors, right?

19:29 And then it doesn't really do anything, which are both weird.

19:32 So I have this one where it comes up with like a million records, strings, and most of the time they're legitimate numbers, like 4.2 as a string, and then you can parse it.

19:42 And what I found was if you have more than 4% errors, I think it was four, like 4.5% or something errors, erroneous data, it's slower to use exceptions.

19:54 The cutoff is 4% errors.

19:55 And I think if you have more than 4% errors, then the exceptions become more expensive.

19:58 That's right.

19:59 So anyway, it's something that people can run and get real numbers out of and play with it in a slightly more concrete way.
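A sketch of the mixed-case experiment, so the error rate can be varied. This is not the gist mentioned above: the record generator, the regular expression used as the up-front check, and the error rates are all assumptions for illustration, and the crossover point you measure depends on how expensive the check is.

```python
import random
import re
import timeit

NUMBER = re.compile(r"-?\d+(\.\d+)?")

def make_records(count, error_rate, seed=0):
    """Build strings that are mostly valid numbers, with some bad entries."""
    rng = random.Random(seed)
    return [
        "oops" if rng.random() < error_rate else f"{rng.uniform(0, 100):.2f}"
        for _ in range(count)
    ]

def parse_eafp(records):
    total = 0.0
    for text in records:
        try:
            total += float(text)
        except ValueError:
            pass  # skip bad records
    return total

def parse_lbyl(records):
    total = 0.0
    for text in records:
        if NUMBER.fullmatch(text):  # check the shape before converting
            total += float(text)
    return total

for rate in (0.0, 0.05, 0.5):
    records = make_records(20_000, rate)
    t_eafp = timeit.timeit(lambda: parse_eafp(records), number=5)
    t_lbyl = timeit.timeit(lambda: parse_lbyl(records), number=5)
    print(f"error rate {rate:.0%}: try/except {t_eafp:.3f}s, check first {t_lbyl:.3f}s")
```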

20:04 But I don't know, what do you think?

20:06 I think you start out by focusing on the code, making it easy and clear to understand, and then worry about this stuff.

20:13 - Yeah, so I don't actually put either.

20:15 I don't usually do the checking stuff.

20:18 And that is one of the things that's good about bringing this up: it's more common in Python code to not check stuff, to just go ahead and do it.

20:29 And then I write a lot of tests.

20:30 So I write a lot of tests around things.

20:32 - Yeah, that's cool.

20:33 - And so in either case, checking for things, or, for instance, if it is input, if I've got user input, I'm checking for things.

20:41 I'm going to do checks ahead of time because the behavior of what happens when it isn't there, or when there's a problem, isn't really a problem. It needs to be designed into the system as to what behavior to do when something unexpected happens. But in normal code, like, well, what happens if there's not an attribute? Well, you shouldn't be in that situation, right?

21:03 You shouldn't be in that situation. And I usually push it up higher.

21:05 I don't have try/except blocks all over the place. I have them around APIs that might not be trustworthy, or around external systems or something. I don't put try/except blocks around code where I'm calling my own code.

21:20 Things like that. Yeah, I'm with you on that. That makes a lot of sense.

21:23 The one time that I'll do the test, the look before you leap style, is if I think I can fix it.

21:28 Right? Does this directory not exist?

21:30 I'm going to write a file to it. Well, I'm just going to make the directory.

21:32 Then I'm going to write to it, you know.
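The "check because I can fix it" case might look like this minimal sketch. The `save_report` helper and the paths are made up for illustration; the missing directory is repaired up front, and any other write failure is left to propagate as an exception.

```python
import tempfile
from pathlib import Path

def save_report(path, text):
    path = Path(path)
    # Fixable precondition: create missing parent folders (no error if they exist).
    path.parent.mkdir(parents=True, exist_ok=True)
    # Anything else that goes wrong here still raises OSError for the caller.
    path.write_text(text, encoding="utf-8")
    return path

with tempfile.TemporaryDirectory() as root:
    target = save_report(Path(root) / "reports" / "2020" / "summary.txt", "all good")
    saved_text = target.read_text(encoding="utf-8")
print(saved_text)
```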

21:35 Those kinds of tests can get you out of trouble, but if you're just going to say this didn't work, chances are you still need the error handling and exception format anyway. Yeah, and you're probably gonna throw an exception. So yeah, anyway.

So you probably should get your code right, test it, and then just stick it in GitHub. Get it in your repository, make sure it's all up to date, right? Oh, I was wondering how you're gonna do that transition. So yeah, that's good. I was following a discussion on Twitter, and I think Anthony Shaw may have started it, but I can't remember. But if you've got a lot of repositories, sometimes you have a lot of maintenance to do, or some common things you're doing for a whole bunch of repos.

22:17 And there's lots of different reasons why that might be the case: related tools, or maybe just your work. You've got a lot of repos. But there's a project that came up in this discussion that I hadn't really played with before, and it's a project called myrepos.

22:33 And on the site it says: you've got a lot of version control repositories. Sometimes you want to update them all at once, or push out all your local changes.

22:43 You may use special command lines in some repos to implement specific workflows. Well, the myrepos project provides an mr command, which is a tool to manage all your version control repositories.

22:55 And the way it works is it's based on directory structures.

22:59 I usually have all of the repos I'm working with under a common, like, projects directory or something, so that I know where to look.

23:08 And so I'm already set up for something like this to work.

23:12 And you go into, into one of your repos and you type, if you have this installed, you type mr register, and it registers this under, registers that repo for common commands.

23:24 And then whether you're in a parent tree, parent directory, or one of the specific directories, and type a command, like for instance, if you say mr status, it'll do status on all of the repos that you care about, or update, or diff, or something like that.

23:40 And then you can build up even more complex commands yourself to do more complicated things.

23:47 But I'm probably going to use it right away just for checking the status or doing pulls or updates or something like that on lots of repos. So this looks neat.

Yeah, it looks neat. I like the idea a lot. So basically I'm the same as you. I've got a directory, maybe a couple levels, but all of my GitHub repos go in there. I group them by, like, personal stuff or work stuff, but other than that, they're just all next to each other, and this would just let you say, go do a git pull on all of them. That's great.

Yeah, or, for instance, at work I've often got three or four different related repos, and if I switch to another project that I'm working on, I need to go through and make sure, because I'm not sure what branch I'm using or if everything's up to date. So being able to go through even two or three and update them all at once, or just check the status of all of them, will save time. And then a friend of the show, at least somebody that I interviewed for Test & Code, Adam Johnson, wrote an article called "Maintaining Multiple Python Projects with myrepos," and we'll link to his article in the show notes.

24:55 Yeah, perfect.

24:56 I like this idea enough that I wrote something like that already.

24:59 You did.

24:59 Well, what I wrote is something that will...

25:01 it'll go and actually synchronize my GitHub account with a folder structure on my computer.

25:09 So I'll go and just say, like, "repo-sync" or whatever I called it, And it'll use the GitHub API to go and figure out all the repos that I've cloned or created in the different organizations like Talk Python organization versus my personal one.

25:25 And then it'll create folders based on the organization or where I forked it from, and then clone it. And if it's already there, it'll update it; it'll basically pull all this down. That's cool. I need that.

25:35 It was a lot of work. This seems like it's pre-built and pretty close, so it looks pretty nice. The one thing it doesn't look like it does is go to GitHub and say, "Oh, what other repos have you created that you maybe don't have here?" You know, maybe you want that, maybe you don't.

25:49 If you've like forked Windows source code and it's like 50 gigs, you don't want this tool that I'm talking about.

25:55 But if you have reasonable sized things, like I forked a Linux, okay, great, that's gonna take a while.

26:00 But normally it would be, I think it would be pretty neat.

26:03 Another thing that's neat around managing these types of things is Docker.

26:06 And did you know that Python has an official Docker image?

26:10 - I did not. - I didn't either.

26:11 Well, I recently heard that, but it's fairly new news to me that there's an official Docker Python image.

26:18 So theoretically, if you wanna work with some kind of Linux Docker machine that uses Python, you can go and docker run or create the Python one, right?

26:30 So it's not super surprising, it's just called Python, right?

26:35 But it's, yeah, it's just called Python.

26:37 That's it, I believe.

26:38 So pretty straightforward working with it.

26:40 But I'm gonna talk about basically looking through that official Docker image.

26:48 So Itamar Turner-Trauring, he was on Talk Python not long ago, talking about Fil.

26:53 We also talked about Fil on Python Bytes, the data science focused memory tool.

26:58 He wrote an article called "A Deep Dive into the Official Docker Image for Python." So basically it's like, well, if there's an official Docker image for Python, what is it?

27:09 How do you set it up?

27:10 understanding how it's set up is basically how do you take a machine that has no Python whatsoever and configure it in a Python way.

27:18 So this is using Debian. That's just what it's based on. And it's using the Buster version because apparently Debian names all their releases after characters from Toy Story.

27:29 I didn't know that, but yep, Buster. Buster is the current one. So it's going to create a Docker image.

You create the Dockerfile, you say this Docker image is based on some other foundational one, so Debian Buster, and then it puts /usr/local/bin first in the PATH environment variable, 'cause that's where it's going to put Python.

27:55 It sets the locale explicitly: ENV LANG is set to UTF-8.

28:01 There's some debate about whether this is actually necessary, 'cause current Python also defaults to UTF-8, but here it is.

28:08 And then it also sets an environment variable, PYTHON_VERSION, to whatever the Python version is.

28:14 Right now it's 3.8.5, but whatever it is.

28:16 That's kinda cool, so you can ask, "Hey, what version is in this system "without actually touching Python?" That's cool.

28:23 And then it has to do a few things like register the CA certificates.

28:28 I've had people sending me messages who are taking courses, and they're trying to run code that uses requests where there's an SSL certificate endpoint, an HTTPS endpoint, and they'll say, this thing says the certificate is invalid.

28:46 The certificate's not valid, what's going on here?

28:48 Right, and almost always, something about the way that Python got set up on their machine didn't run the create certificate command.

28:56 So there's like this step where Python will go download all the major certificate authorities and like trust them in the system.

29:02 So that happens next.
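If you hit that kind of certificate error, one way to start looking is to ask the standard library where it expects the trusted CA certificates to live. This is a minimal diagnostic sketch, not what the Docker image itself runs; `ca_bundle_report` is a hypothetical helper name, and it only inspects the `ssl` module's defaults (libraries such as requests may ship their own certificate bundle).

```python
import os
import ssl


def ca_bundle_report():
    """Summarize where this Python's ssl module looks for trusted CA certificates."""
    paths = ssl.get_default_verify_paths()
    # cafile/capath are None when nothing was found, so fall back to the
    # locations OpenSSL was compiled to look in.
    cafile = paths.cafile or paths.openssl_cafile
    capath = paths.capath or paths.openssl_capath
    return {
        "cafile": cafile,
        "cafile_exists": bool(cafile) and os.path.isfile(cafile),
        "capath": capath,
        "capath_exists": bool(capath) and os.path.isdir(capath),
    }


if __name__ == "__main__":
    for key, value in ca_bundle_report().items():
        print(f"{key}: {value}")
```

If neither location exists, that is a hint the certificate setup step never ran on that machine.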

29:04 And then it actually will set up things like GCC and whatnot so it can compile it, which is interesting.

29:11 Downloads the source code, compiles it, but then what's interesting is it uninstalls the compiler tools.

29:18 It's like, okay, we're gonna download Python and we're gonna compile it, but you didn't explicitly ask for GCC.

29:24 We just needed it, so those are gone, right?

29:26 Cleans up the PYC files and all those kinds of things.

29:29 And then it gives an alias to say that Python 3 is the same as Python, like the command, you could do it without the 3.

29:37 Another thing that we've gone on about that's annoying is like, I created a virtual environment.

29:41 Oh, it has the wrong version of pip.

29:43 Is my pip out of date?

29:44 Your pip's probably out of date.

29:45 Everyone's pip is out of date.

29:47 Unless you're in like a rare two-week window where Python has been released at the same time the modern pip has been released.

29:54 So guess what?

29:56 They upgrade pip to the new version, which is cool.

29:59 And then finally it sets the entry point of the Docker container, which is the default command to run if you just say docker run with this image, like docker run python:3.8-slim-buster.

30:12 If you just say that by itself, what program is it gonna run?

30:16 'Cause the way it works is it basically starts Linux and then runs one program.

30:20 When that program exits, the Docker container goes away.

30:23 And so it sets that to be the python3 command.

30:25 So basically, if you Docker run the Python Docker image, you're going to get just the REPL.

30:33 - Interesting.

30:34 - Yeah, you can always run it with different entry points like bash and then go in and like do stuff to it or run it with uWSGI or Nginx or whatever.

30:40 But if you don't, you're just going to get Python 3 REPL.

30:45 Anyway, that's the way the official Python Docker image configures itself from a bare Debian buster over to Python 3.

30:54 - Neat.

30:54 I thought it might be worth just thinking about like, what are all the steps?

30:58 And you know, how does that happen on your computer?

31:00 If you can.

31:01 No, that's good.

31:02 Because yeah, I have been curious about that.

31:04 I was gonna throw Python on a Docker image.

31:07 What does that get me?

31:08 And yeah, exactly.

31:09 That's what it is.

31:10 Oh, you could also apt install python3-dev.

31:16 Yeah, that might be cheating.

31:19 All right.

31:20 What's this final one?

31:21 We covered some craziness that Anthony did an episode or two ago, and somebody commented that maybe we need an "only in a pandemic" section.

31:32 - Oh yeah, that sounds fun.

31:34 - So I selected nannernest.

31:39 It's optimal peanut butter and banana sandwich placement.

31:42 So this is kind of an awesome article by Ethan Rosenthal. He talks about how, during the pandemic, he's been sort of having trouble doing anything.

31:52 And so he really liked peanut butter and banana sandwiches; he picked this habit up from his grandfather, I think.

32:01 Anyway, this is using Python and computer vision and deep learning and machine learning and a whole bunch of cool libraries to come up with the best packing algorithm for a particular banana and the particular bread that you have.

32:16 So you take a picture that includes both the bread and the bananas, or the banana you have, and it will come up with the optimal slicing and placement of the banana for your banana sandwich.

32:29 - Wow, this is like a banana maximization optimization problem.

32:33 So if you want, you gotta see the pictures to get this.

32:36 So like, if you're gonna cut your banana into slices, obviously the radius of the banana slice varies with where you cut it on the banana, right?

32:45 Is it near the top?

32:46 Is it in the middle?

32:47 It's going to result in different size slices.

32:49 And where do you place the banana circles on your bread to have maximum surface area of bananas relative to what's left of the bread, right?

32:59 Something like that?

33:00 Yes.

33:01 He's trying to make it so that you have almost all of the bites of the sandwich have an equal ratio of banana, peanut butter, and bread.

33:08 Oh yeah, okay.

33:09 It's all about the flavor.

33:10 I didn't understand the real motivation, but yeah, you wanna have an equal layer, right?

33:16 So you don't want that spot where you just get bread.

33:19 - You actually learn quite a bit about all these different processes, and there's quite a bit of math here.

33:25 It talks about coming up with arcs: you have to estimate the banana shape as part of an ellipse and use the radius of that to determine the banana slices. And because you're looking at the banana sideways, you have to estimate what the shape of the banana circle will be, and it's not really a circle, it's more of an ellipse too.
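To make the geometry a bit more concrete, here is a small sketch of the arithmetic behind "how much of the bread is covered". It is not the article's algorithm. It assumes each slice is an ellipse described by its two semi-axes, that slices do not overlap, and that they all lie fully on a rectangular piece of bread. The function names are hypothetical.

```python
import math


def slice_area(semi_major, semi_minor):
    """Area of one elliptical banana slice: pi * a * b."""
    return math.pi * semi_major * semi_minor


def coverage_fraction(slices, bread_width, bread_height):
    """Fraction of a rectangular bread slice covered by banana slices.

    `slices` is an iterable of (semi_major, semi_minor) pairs. The slices are
    assumed not to overlap and to lie entirely on the bread.
    """
    bread_area = bread_width * bread_height
    if bread_area <= 0:
        raise ValueError("bread dimensions must be positive")
    covered = sum(slice_area(a, b) for a, b in slices)
    return covered / bread_area


if __name__ == "__main__":
    # Made-up measurements in centimeters, purely for illustration.
    slices = [(1.5, 1.4), (1.6, 1.5), (1.4, 1.3)]
    print(f"Coverage: {coverage_fraction(slices, 10.0, 10.0):.1%}")
```

A real packing approach also has to choose where each slice goes; this only scores a placement that has already been made.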

33:49 Yeah, there's a lot going on here.

33:52 Some advanced stuff to deliver your bananas perfectly.

33:57 I love it.

33:58 Actually, this is really interesting.

33:59 This is cool.

34:00 And it's, I mean, it's a silly application, but it's also a neat example.

34:03 Yeah, actually, and this would be, I think, a cool thing for talking about difficult problems and packing in a teaching setting, like in a school. I think this would be a great example to talk about some of these different complex problems. Yeah, totally.

34:20 Well, that's it for our main items.

34:22 For the extras, I just want to say I'll put the links for the Excel and Python webcast and the memory management course down there, and we'll put the Patreon link as well.

34:30 Let's see, do you have anything else you want to share?

34:32 No, that's good.

34:33 Yeah, cool.

34:34 How about sharing a joke?

34:35 A joke would be great.

34:36 So I'm going to describe the situation, and you can be the interviewer/boss who has the caption, okay?

34:42 Okay.

34:43 So the first, there's two scenarios.

34:45 The title is "Job Requirements." This comes to us from Eduardo Orochenas.

34:49 Thanks for that.

34:51 And the first scenario is the job interview, where you're getting hired.

34:57 And then there's the reality, which is later, which is the actual on-the-job day-to-day.

35:02 So on the job interview, I come in, I'm an applicant here, and Brian, the boss, says, "Invert a binary tree on this whiteboard." Or some other random data structure, like quicksort this, but using some other weird thing, right?

35:15 Kind of really computer science-y, way out there.

35:19 Probably not going to do, but maybe it makes sense, right?

35:22 Alright, now I'm at the job and I've got like my computer, I have a huge purple buy button on my website that I'm working on.

35:28 And the boss says, "Make the button bigger!" Yep, that's the job.

35:33 [laughter]

35:35 Yeah. Very nice.

35:38 Good, good. Alright, well, I love the jokes and all the tech we're covering.

35:41 Thanks, Brian.

35:42 Yeah, thank you.

35:43 Bye. Thank you for listening to Python Bytes. Follow the show on Twitter via @PythonBytes.

35:48 That's Python Bytes as in B-Y-T-E-S. And get the full show notes at PythonBytes.fm.

35:54 If you have a news item you want featured, just visit PythonBytes.fm and send it our way.

35:58 We're always on the lookout for sharing something cool. On behalf of myself and Brian Okken, this is Michael Kennedy. Thank you for listening and sharing this podcast with your friends and
