Transcript #23: Can you grok the GIL?

Recorded on Tuesday, Apr 25, 2017.

00:00 Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds.

00:07 This is episode 23, recorded April 25th, 2017.

00:12 And I'm one of your hosts, Michael Kennedy.

00:14 And I'm Brian Okken.

00:15 And we're here to share a bunch of cool Python stuff with you.

00:18 We've got six cool items queued up and ready to go.

00:22 But before we get to that, I want to say thanks to Advance Digital.

00:24 They have an awesome Python job.

00:26 You can check it out at python.advance.net,

00:29 and we'll talk more about that later.

00:31 Right now I want to talk about the GIL. Brian, what do you think?

00:33 I think it's great to talk about the GIL, and I'm really glad that...

00:37 So this is an article called "Grok the GIL - How to Write Fast and Thread-Safe Python." We talk about the GIL as the reason why we can't do parallel computing with just what's built into Python, but I haven't really dug into it a lot.

00:55 This article is from A. Jesse Jiryu Davis, who, by the way, is an excellent writer. If you want some examples of great writing, read his stuff. It's great. It's a very lightweight introduction to what the GIL is, and I like the approach: not just the details of it, because most of us aren't going to go in and start hacking on the CPython core, but a little peek into the CPython core to see that the GIL, the global interpreter lock, is a mutex.

I love how he pulls out little snippets from CPython. He's got a section, "Behold, the global interpreter lock," and it just shows the C code.

Yeah, it's just one line.

Yeah, exactly, and you don't really need to know a lot of C to appreciate it. But there's enough to make it super concrete. You're like, this is actually the code that runs when you call into a socket, and that's how the GIL gets released, for example, right?

Yeah, he's talking about sockets, and he really talks about how releasing the lock is focused around I/O. Whenever you're waiting for I/O, and I think there are other places too, but that's the main place where your code will pause and let some other thread run. He has a thing that says the effect on threads is so simple you can write it on the back of your hand: "One thread runs Python, while N others sleep or await I/O." And he actually has a picture of his hand. I think it's his hand.

Yeah, I was wondering if that's actually his hand.

Yeah, and if he wrote it, that means he must be left-handed, because it's written on the back of his right hand. Or he had somebody else write it.

So I was always curious about this: what are the limitations, and how do you utilize it to have faster code? The gist of it is, if you've got some code that's waiting on I/O, like taking a bunch of connections or downloading a bunch of URLs, that's a great place to use multithreading, because the GIL doesn't really get in your way. In places where you really want your Python code to run at the same time, you have to jump into multiprocessing, and he actually gives an example of that, and it's not that bad either. So anyway, I liked the quick jump into it, and I think I'm going to be a better Python programmer for reading this.

Yeah, this is really nice work. Good job, Jesse. He's a great writer. I actually had him on Talk Python, on episode 69 I think, about design patterns for programmer blogs, and we did a whole session on blogging. It was great. One of the things I like about this is he talks about cooperative multitasking, concurrency versus parallelism, preemptive multitasking, and how sometimes you still need to actually lock your Python code, even though you might think, "Well, this is all straight Python.

03:53 It's not going to get interrupted." But there are certain mechanisms, which vary slightly between Python 2 and 3, where if you hang onto the GIL too long, it will potentially be taken from you and given to another thread,

04:06 which might still cause what would appear to be parallel race conditions.

04:10 So that's also worth reading about.
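
A minimal sketch of the point made earlier in this segment, that threads help when code is waiting on I/O. It assumes `time.sleep` as a stand-in for a real network call (both release the GIL while blocked); `fake_download`, `fetch_all`, and the URL strings are hypothetical names for illustration and are not taken from the article.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_download(url, delay=0.2):
    """Hypothetical stand-in for a network request.

    time.sleep releases the GIL while it waits, as blocking I/O does,
    so other threads can run in the meantime.
    """
    time.sleep(delay)
    return "done: " + url


def fetch_all(urls, workers=5):
    """Run fake_download for every URL on a thread pool; return results and elapsed seconds."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves the order of the input URLs.
        results = list(pool.map(fake_download, urls))
    return results, time.perf_counter() - start
```

With five workers, five 0.2-second waits overlap, so the batch should finish in roughly the time of one wait rather than five. CPU-bound work would not see that benefit from threads, which is where the multiprocessing example mentioned above comes in.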

04:12 Yeah, and one of the things that surprised me, and I do realize I don't really worry about that in Python: I deal with multithreading in C++, and with C++ you have to do fine-grained locking of data structures, any data structure shared by multiple threads.

04:28 But I was surprised how much you can share between threads in Python, because a thread won't be interrupted in the middle of a bytecode. It will only be interrupted between bytecodes, not in the C code, so things like sorting a list will happen atomically and you won't be interrupted in that. That surprised me; I didn't know that. Whereas, ironically, incrementing a variable could be interrupted, right? Because it ends up being like a two-step, or read-modify-write, operation. Yeah, exactly, exactly. And Jesse uses the dis module to look inside, which is all very good.

05:06 So that's a great article.
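
A small sketch of the increment point above, assuming a module-level counter. It uses the standard `dis` module to list the bytecodes behind `counter += 1` and a `threading.Lock` to make the update safe; the function names are hypothetical, and the exact opcode names between the load and the store differ across Python versions.

```python
import dis
import threading

counter = 0
lock = threading.Lock()


def unsafe_increment():
    """A single source line that compiles to several bytecodes."""
    global counter
    counter += 1


def opnames(func):
    """Return the bytecode operation names for func, in order."""
    return [ins.opname for ins in dis.get_instructions(func)]


def safe_increment(times):
    """Hold the lock across the whole read-modify-write."""
    global counter
    for _ in range(times):
        with lock:
            counter += 1


def run_threads(n_threads=4, times=10000):
    """Reset the counter, increment it from several threads, return the total."""
    global counter
    counter = 0
    threads = [
        threading.Thread(target=safe_increment, args=(times,))
        for _ in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter
```

Calling `opnames(unsafe_increment)` shows a separate load and store of the global, and a thread switch can land between them. With the lock held around the update, `run_threads()` returns exactly `n_threads * times`.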

05:06 I think that's probably the most substantive thing we're covering.

05:09 Do you want to talk about something not so substantive, but pretty cool?

05:13 Yes.

05:14 I've got one for you.

05:14 Let's talk about the NBA, as in the National Basketball

05:18 Association, the American basketball league.

05:20 So there was a pretty big deal on Twitter the other day.

05:25 So Mark Cuban, he owns the Dallas Mavericks and he's, I don't know if he comes from tech or not.

05:31 I don't really think so, but he definitely was an entrepreneur.

05:34 He's worth like, he's a billionaire basically.

05:37 But as a billionaire owner of an NBA team, he posted a pretty interesting thing on Twitter saying, "Here's the new NBA." And it was a picture of him learning Python machine learning with, I think, IPython and an IPython Notebook open.

05:54 And he's like, "I need to understand the Mavericks and the NBA.

05:57 I'm on it." - Wow.

05:58 - It's pretty cool, right?

05:59 - It is pretty cool.

06:00 I don't know much about basketball or Mark Cuban or any of that, but it's neat that somebody that high up is wanting to learn Python and notebooks.

06:09 - That was basically the main takeaway.

06:12 A bunch of people, like our friends over at Partially Derivative, invited him to be on the show.

06:15 They're like, "Oh, we have to hear your story." He's like, "No, no, I'm just getting started.

06:19 They have a team at the Mavericks.

06:20 I just want to understand what they're doing when they use machine learning to help make predictions and planning." And that's kind of cool, to think of how machine learning is actually driving these professional sports teams as well.

Yeah, very interesting.

Indeed, indeed. All right, so next up we have somebody winning an award. How cool is that?

Yeah, Ian Cordasco. It was announced that he will get the 2017 Community Service Award from the Python Software Foundation, and I think that's pretty cool. I didn't know about a lot of the stuff that he did. I mean, I was familiar with Ian; he was on Test and Code episode 13, and we talked about the Betamax library that he has for recording and playing back requests interactions. He's apparently been the election administrator for the PSF since 2015, volunteering all that time, of course, and he is active in mentoring new coders and supporting other Python developers, with apparently a real focus on mentoring women in Python. And I think that's just pretty awesome.

07:31 Yeah, that's really awesome. So congratulations, Ian. And his project that you talked about, like replaying requests, that's called Betamax?

07:39 Yes.

07:40 That's an awesome name.

07:41 Yeah, yeah. I guess the idea, of course, is that there's a VCR-type library in some other languages, but he chose Betamax because, well, everybody knows Betamax was better.

07:52 That's right.

07:53 Yeah, you should listen to it.

07:55 It's a pretty interesting tool.

07:57 So that was one that the community asked me to do.

08:00 There were community members, listeners of Test and Code, that said, hey, could you go find Ian and talk to him about Betamax?

08:07 - That's awesome.

08:08 We'd love to get those recommendations for all the shows, including some stuff that we're covering here today, right?

08:13 - Yeah, definitely.

08:14 - Definitely.

08:15 So if you want to work with these kinds of fun things, maybe you work at a company where you're doing Java and you dabble in Python, or you don't really get to do all the cool things you'd like, Advance Digital has a cool job offer for everyone out there who might want to make a change.

08:30 - I wish I was near Jersey City, 'cause this sounds fun.

08:33 - It does sound fun.

08:35 So, right, they're in Jersey City, just across the Hudson from Manhattan there.

08:40 Small, agile environment.

08:42 They're mostly a Python shop, but they play with other cool technologies.

08:45 They fund you guys to go to conferences, professional development, and most importantly, coolest I think is they run one of the 10 largest news sites by traffic in the US.

08:55 And they do it with Python.

08:56 So if you want to be part of that team, if you want to play with cool stuff like that, just visit python.advance.net and check it out.

09:04 So there's a couple of things coming up, Brian, that have to do with Python versus legacy Python.

09:10 Remember Matthias from the IPython project, Matthias Bussonnier? I'm sorry if I mess up your name, but I think that's pretty close.

09:19 He was the original guy who got us talking Python versus legacy Python instead of Python 3 versus Python 2.

09:25 Oh yeah, right.

09:26 Yeah.

09:27 So he works on IPython and Jupyter and all that stuff and he's back with a new blog post which is my next item.

09:33 And it's a pretty big deal.

09:35 We just talked about Mark Cuban, the new NBA, machine learning, IPython.

09:40 And so they just released IPython 6.

09:43 Okay.

09:44 So that's pretty big news.

09:45 That is big news.

09:46 Yeah.

09:47 For those who use IPython, you know, there's a brand new version, that's awesome. The bigger thing is that this is the first release where IPython goes Python 3 only. They've dropped Python 2.

09:59 That's great.

10:00 So, or as Matthias would say, they now support Python and not mixing in legacy Python with it.

10:05 Yeah.

10:05 And what I thought was nice is, you know, it's a pretty major project. They did a little write-up of what their experience was converting a mixed code base to Python 3 only. What were the benefits?

10:16 What were the drawbacks?

10:17 So, let's see, a couple of stats that Matthias put out.

10:22 The size of the IPython code base has decreased by 1500 lines.

10:27 That's pretty solid, right?

10:28 That's significant.

10:29 Less code means less maintenance.

10:30 Right.

10:31 They said it's not just because of dropping Python 2, but a significant amount is.

10:35 And even more impressive is they added some entirely new features that required hundreds of new lines of code.

10:42 So really, the amount of Python 2 support code they were able to get rid of when they went to Python 3 only is actually probably more than that.

10:53 So that's pretty cool.

10:54 And they said one of the benefits they think is that contributors can spend less time worrying about, well, how does this work if we do it in Python 2?

11:01 Or, you know, this has happened to me: you make a pull request, you submit it, it runs on the continuous integration, and it works fine in Python 3, but then it fails in Python 2 because, you know, you forgot the b in the string or whatever, right? So they don't have to worry about that.

11:16 CI runs faster.

11:18 They said, basically, in summary: we're totally happy, we're entirely pleased with having switched to having the ability to write Python 3 only code. And they're looking forward to using a lot of the improvements in Python 3, specifically async and await, which will be cool.

11:34 So an async and await REPL inside of IPython, how cool is that?

11:38 Neat. Is async and await in all of the 3 versions, or did that get introduced... It came in 3.5.

11:44 Okay, the asyncio stuff was introduced in Python 3.4, I think, and then in 3.5 they're like, let's put some proper syntax on this and make it really easy.
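
For readers who have not seen the 3.5 syntax, here is a minimal sketch of `async` and `await` on top of `asyncio`. The coroutine names and delays are hypothetical. It assumes Python 3.7 or later for `asyncio.run`; on 3.5 or 3.6 the equivalent would be `asyncio.get_event_loop().run_until_complete(main())`.

```python
import asyncio


async def fetch(name, delay):
    """Hypothetical coroutine: wait without blocking, then return a value."""
    await asyncio.sleep(delay)  # yields control to the event loop while waiting
    return name.upper()


async def main():
    # gather runs both coroutines concurrently and returns results
    # in the order the coroutines were passed in, not completion order.
    return await asyncio.gather(fetch("a", 0.05), fetch("b", 0.01))


def run_demo():
    """Start an event loop, run main() to completion, return its result."""
    return asyncio.run(main())
```

Both waits overlap on a single thread, which is the cooperative style of multitasking discussed in the GIL segment.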

11:52 Yeah, I'm experiencing that. I'm writing a little thing that I want to have available on Python 2 also, at least 2.7.

12:03 And even if I were to just do Python 3, all of the 3 versions, I still can't use f-strings, and I wish I could use f-strings.

12:10 I know they're so new.

12:11 It's 3.6 only.

12:12 Like, even on my production server, it's 3.5.

12:14 So it is what it is.
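
A short illustration of the trade-off just described, with made-up values: the f-string needs Python 3.6 or later, while `str.format` produces the same text and also runs on 2.7 and earlier 3.x releases.

```python
name, version = "IPython", 6

# Python 3.6+ only: expressions are interpolated directly in the literal.
new_style = f"{name} {version} is Python 3 only"

# Works on Python 2.7 and every Python 3 release.
portable = "{} {} is Python 3 only".format(name, version)
```

Code that has to support older interpreters is limited to the second form, since an f-string is a syntax error before 3.6.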

12:16 That's a move in the right direction.

12:17 And I think it's great that Matthias and others talked about their experience with that change.

12:22 Yeah, that's awesome.

12:23 Yeah, thanks, Matthias.

12:24 Excellent.

12:25 I think I'm, to use an American expression, beating a dead horse.

12:28 But we have another--

12:29 Is that dead horse called source, src?

12:32 Yes.

12:33 The other package I was talking about building up, it's for the book, but I wanted to make sure I was representing the community correctly in how to put together a distributable package and do it correctly, or at least with best practice. I know there's not really a "correct." Somebody pointed me in the direction of an article by, I'm probably going to get this wrong, I think it's Hynek, and it's called "Testing & Packaging." He's the guy that did the attrs project.

13:05 Mm-hmm.

13:06 Great project.

13:07 That we've talked about a couple times.

13:08 And he talks about how there were issues, at least with one package that wasn't using the src layout: there was a bug that showed up with the installed package that didn't show up when testing the non-installed code.

13:27 So one of the benefits of using src is you can more easily make sure that you're only testing the installed package and not the non-installed one.

13:37 And he also just shows that it's really just two lines of code change.

13:41 So to do the right thing is not that much work.

13:44 Right.

13:45 So basically, in your setup.py, in the call to setup(), you set packages to look in the src directory, and you set package_dir to be the src directory, right?

13:54 Yeah.

13:55 You normally say find_packages(). He recommends specifically saying find_packages() and then giving it where="src", but you can also just put "src" in as the first argument; that works also. And then you list it in package_dir. And then one of the things I noticed, which I don't think people have really talked about, is that the entire repository looks better. You've got all of the packaging junk, like your setup.py and your manifest and all that stuff, at the top level. And the stuff you really care about on a day-to-day basis is separated into subdirectories.

14:29 You've got the docs in one and the tests in another and then your source in another.

14:33 And that separation just pleases my sense of organization. It's just nice.
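
A rough sketch of why the src layout keeps tests honest. The setuptools arguments described above are noted in a comment; the rest is a standard-library-only demonstration that builds a throwaway repository and checks whether a package is importable from the repository root. `demo_pkg` and `importable_from_repo_root` are hypothetical names, and this is an illustration under those assumptions, not code from the article.

```python
import importlib.machinery
import tempfile
from pathlib import Path

# The two setup() arguments discussed above, as they would appear in setup.py:
#     packages=find_packages(where="src"),
#     package_dir={"": "src"},


def importable_from_repo_root(layout):
    """Build a temporary repo and report whether demo_pkg can be found
    when only the repo root is on the import path.

    layout is "flat" (package next to setup.py) or "src" (package under src/).
    """
    with tempfile.TemporaryDirectory() as repo_root:
        parent = Path(repo_root, "src") if layout == "src" else Path(repo_root)
        package = parent / "demo_pkg"
        package.mkdir(parents=True)
        (package / "__init__.py").write_text("")
        # Search only the repo root, as happens when tests run from there.
        spec = importlib.machinery.PathFinder.find_spec("demo_pkg", [repo_root])
        return spec is not None
```

With the flat layout the working copy is importable straight from the repository root, so tests can pass without the package ever being installed. With the src layout it is not, so the tests have to run against an installed copy.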

14:38 Yeah, I'm coming around to this as well. It sounds pretty solid.

14:40 Anyway, but that's, I'll probably try to drop talking about that every episode.

14:45 But there you go, one more article.

14:47 Well, I'm not quite done beating the Python versus legacy Python horse

14:51 yet, so I'm going to keep going on that one because there's some more big news.

14:55 We've heard that IPython went to Python 3 only, and now, the same week, last week, AWS Lambda adds Python 3.6 support. Not Python 3 only, but it adds it. It was just 2.7 before, so that's a big jump.

15:09 Wow, that's a big jump.

15:10 Yeah.

15:11 Yeah.

15:12 So that's pretty awesome and do you have much experience with Lambda?

15:14 Have you played with it?

15:15 No, I've heard a lot about it but I haven't played with it yet.

15:18 So Lambda is one of these things from AWS, from Amazon, that fits into this serverless architecture.

15:27 So basically, you say, here's a function.

15:31 And when something happens, run this, please.

15:33 So run it on a schedule, somebody changes the database, somebody uploads a file to S3, whatever.

15:40 And it just runs. There are no servers that you deal with. Obviously there are servers, but it just distributes your code to run, when it needs to, across a whole bunch of servers.

15:49 So it scales basically infinitely, you know, as long as you have infinite money, you can infinitely scale this, it's fine, right? And that's, that's pretty cool.
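
A minimal sketch of what such a function can look like on the Python 3.6 runtime, assuming it is triggered by an S3 upload. Lambda calls the configured handler with an `event` and a `context`; the handler name `lambda_handler` and the returned dictionary are illustrative choices, and the event below contains only the fields this sketch reads.

```python
def lambda_handler(event, context):
    """Hypothetical handler: collect the S3 objects named in the event."""
    uploaded = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        # f-strings are available because the runtime is Python 3.6.
        uploaded.append(f"s3://{bucket}/{key}")
    return {"processed": len(uploaded), "objects": uploaded}
```

The function can be exercised locally by passing a hand-built event and `None` for the context, for example `lambda_handler({"Records": [...]}, None)`.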

15:58 Yeah. So have you tried it?

16:00 No, I have not had a real reason to do it. I mean, I guess there's a couple of things that I could do. Like on the websites, there's a job that runs every couple of hours that completely re-indexes the database and reorganizes it for super fast queries. The queries on the various websites, and I'm going to be adding it to Python Bytes, no worries,

16:22 they run sub-millisecond, right? In order to get that, you've got to kind of precompute some things. Maybe that's a perfect Lambda operation, right? Well, I mean, especially now that they have 3.6 support, I'm intrigued enough that I might give it a shot anyway, just to make up some excuse to play with it.

16:41 - Exactly, we need to run this.

16:43 But I mean, if you're using other AWS stuff, like their database services, Dynamo or RDS, or S3, it's like, here's a way to run code really near your resources, on triggers, with no effort.

16:55 And one of the things I thought was pretty cool, like this announcement just came out.

16:58 And Zappa, so Zappa, if you look at their page, which I linked, it's called serverless Python web services.

17:08 That's interesting, right?

17:09 So basically you can set it up so that using the AWS architecture, you can route web requests to these Lambda functions, but you don't really have servers or anything like that.

17:21 And people have been asking for Python 3 support, and they've been saying, "No, no, no, no." As soon as this dropped, they're like, "Yes, it has Python 3 support." So that is pretty cool as well.

17:32 So you're seeing things that are layered on top of Lambda also starting to support Python 3. Yeah, definitely. Cool. All right. Yeah, maybe we should play with Lambda. I don't know. Yeah, very nice. All right, well, that's it for the news. Brian, you got anything personal you want to share with everyone?

17:46 No. I guess I'll be in the Munich area the second week in May, so if there's anybody around that wants to have a beer or something with me, hit me up. Yeah, that sounds awesome. I'm jealous. I'd love to go visit Germany. Well, I'll do that at the end of the summer maybe; we'll see. But no news for me. I just want to say thank you, everyone, for listening.

18:08 Oh, you know what, actually, one more thing. It's not personal news, but it falls right here. I also saw CheckiO, at checkio.org. These guys have a pretty cool gamification of learning Python. They also just went Python 3. Oh, cool. So, just to keep on this: hey, Python 3 is starting to really roll. I'd say it's really starting to roll this week.

18:28 I use CheckiO. Hopefully I won't get in trouble for this, but I've gone through a bunch of this stuff and I use them for interview questions.

18:35 - Yeah, I think it's actually pretty good.

18:37 And what I really like is you can solve a puzzle and then you can look at other people's solutions.

18:43 And I found after solving a bunch and looking at the solutions that I unknowingly have an implicit bias towards performance over ease of reading or simplicity or whatever.

18:53 And it was just interesting that it uncovered that for me.

18:56 - Oh, that's interesting.

18:57 And I totally have the opposite.

18:58 I like them to be readable more than anything else.

19:01 Yeah, funny, huh?

19:03 All right, well, thanks, Brian.

19:04 Thank you, everyone, for listening, and we'll catch you next week.

19:07 - And thank you.

19:08 - Yep, you're welcome.

19:08 Bye.

19:10 Thank you for listening to Python Bytes.

19:12 Follow the show on Twitter via @pythonbytes.

19:14 That's Python Bytes as in B-Y-T-E-S.

19:17 And get the full show notes at pythonbytes.fm.

19:21 If you have a news item you want featured, just visit pythonbytes.fm and send it our way.

19:25 We're always on the lookout for sharing something cool.

19:28 On behalf of myself and Brian Okken, this is Michael Kennedy.

19:31 Thank you for listening and sharing this podcast with your friends and colleagues.
