Brought to you by Michael and Brian - take a Talk Python course or get Brian's pytest book


Transcript #41: Python Concurrency From the Ground Up and Caring for our Community

Return to episode page view on github
Recorded on Wednesday, Aug 30, 2017.

00:00 Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds.

00:05 This is episode 41, recorded August 30th, 2017.

00:10 I'm Michael Kennedy.

00:11 And yes, I'm back from vacation.

00:14 Thank you, Brian Okken, for covering and all of our guest co-hosts.

00:17 And it's time to immediately start repaying Brian for his keeping things rolling while he was gone.

00:24 He's in Germany for some work business.

00:27 And we have a new guest co host.

00:30 Welcome, Miguel Greenberg.

00:31 Hello, and thank you for having me, Michael.

00:32 Happy to be here.

00:33 Yeah, I'm happy to have you here.

00:34 It's gonna be fun to talk about the items you've chosen for the week and what I've got as well.

00:41 So let's just kick it off.

00:43 First by saying thank you to Rollbar who's bringing you this episode.

00:47 So check them out at pythonbytes.fm/rollbar and get some good monitoring for your app.

00:52 We'll talk more about that later.

00:54 But let's jump to your first item, lovis.

00:56 I would like to call it LOL this, right?

00:59 But actually, this is not anything to laugh about.

01:02 So this is a Python package that generates graphical representations of very commonly used Python data structures.

01:11 So far, they support five different structures.

01:15 And one of them is a list of lists, which gives the LOL name to the package.

01:21 But they also support dictionaries, lists, linked lists, and binary trees.

01:26 And what they do is basically they use GraphViz, which you need to install on your machine to generate these graphical representations.

01:36 And one of the coolest things is that it integrates with Jupyter.

01:40 So if you're doing this from a notebook in Jupyter, then it'll render these graphical representations right in your prompt.

01:48 So it's super cool, especially I imagine for you and for me, you know, people who used to teach Python, I'm having tons of ideas to use this myself.

01:58 Yeah, so am I I'm like, this can definitely be used to help people who are new to these data structures and new to these ideas.

02:06 And honestly, new to the concept of pointers and reference types at all.

02:10 I think it's really a great way to learn it.

02:12 The linked list one is particularly interesting to help people who are not familiar with pointers.

02:18 to visualize how that works.

02:21 One thing I need to mention is that if you try to pip install this in Python 3 today, it's going to fail.

02:28 I submitted a bug and the author told me that a fix is coming pretty soon with additional features.

02:34 So I'm glad to hear that he's responsive and that he's working on making more improvements.

02:40 Yeah, this thing came out really recently.

02:42 I don't remember how old it is, but it's all quite new, like a couple of days.

02:46 and it's already got 200 stars on GitHub and it looks like it's going well.

02:50 - Yeah, it's going pretty well.

02:52 Yeah, I see a bright future for it.

02:54 - I do as well.

02:54 So if you're out there teaching Python and you're trying to help people understand pointers, data structures, and even if you're not teaching it, like I think looking at some of the different representations of these data structures that it draws is probably helpful for people who just haven't really thought a lot about it as well.

03:12 So definitely worth checking out.

03:13 Alright, so moving from visualizing different data structures to transforming data structures, the thing I want to cover first this week is Odo.

03:22 So this is a user or listener recommendation, as was lolvis.

03:28 So Odo migrates data between all these different formats.

03:32 And it knows how to read, let's say a panda data frame, and then convert that to a MongoDB table or collection and save it just like that.

03:41 And it's literally one line like, "Odo, here's your data frame, go to that database." Or "Here's your Postgres, go to CSV or reverse." So if you find yourself pulling in data from one source and trying to convert it or save it more or less in the same shape over to another source, then check out this thing called Odo, O-D-O.

04:00 I wonder if it takes a list of lists, an LOL, and writes a CSV file.

04:07 That would be something that I could find useful tons of times.

04:12 Yeah, absolutely.

04:13 That would be really, really cool.

04:14 And it does take lists and transform.

04:16 Yes, right.

04:17 So I'm not entirely sure like how flexible it is in that regard.

04:21 But I think there's also extensions.

04:23 So you can write extensions.

04:24 So just to give you the rundown, let's see it'll work with Panda data frames with a list with JSON files, CSV files, Postgres.

04:32 Yeah, so you could take like a CSV file, load it into Postgres, or Postgres into JSON, or even works with like converting into MongoDB, like I said, so like pandas to Mongo or reverse.

04:43 So not a whole lot more to it than that.

04:45 But it seems really handy if like, that's the problem you're trying to solve.

04:49 Yeah, very nice.

04:50 Yeah.

04:51 So tell us about concurrency.

04:52 I you chose this item, and it's not exactly new, but it's certainly something we haven't covered and is really amazing.

04:58 I agree with you that this is one of the really interesting things to cover in concurrency.

05:03 Right. So this is a PyCon talk from a couple years ago from the one and only Dave Beasley.

05:09 I mean, who else, right? And the interesting thing, as a speaker, I find it interesting, not only the content, but also the way he presented this talk. This is a live coding session. The entire thing, it's Dave's terminal. There are no slides and he's speaking while coding and starts from an empty file, actually quotes everything in the talk.

05:33 I think he just fires up Emacs and says, "Let's do this," right?

05:36 Right, yeah. Just two or three terminal sessions, one with Emacs and the other two with Bash, and it's done all there. And he goes to cover concurrency, the different ways you can do concurrency with threads, with processes, shows all the problems with both approaches, how the global interpreter log messes all this and complicates things. And then in the second portion of the talk, he goes and builds an asynchronous framework, pretty much like AsyncIO, a small version of it, a minimal version, using Python generators without any other additional libraries, all in Python standard code. So it's pretty amazing. And it's only 45 minutes. The amount of knowledge that's in those 45 minutes, it's unbelievable.

06:24 Yeah, and I really love the style like well done, David. So certainly the style of we're going to build this up from scratch. I'm not going to just show you a bunch of slides and talk about it. But I'm going to just show you how it's built really makes it feel accessible. You're like, well, if he could literally do it from scratch in 45 minutes, Like I saw everything that went into it. It was pretty understandable. It really is really is well done So yeah, if people are you know, thinking about Python concurrency or generators or async IO and all these things that it's actually Even good for for networking because he builds a little server. He doesn't even use, you know Nothing like flask or Django or anything. He builds a little server Using the he calls it the socket framework just using network sockets. Yeah, that was like part of the demo It was just like part of it, right?

07:08 Right.

07:09 Yeah, it's super awesome.

07:10 Yeah.

07:11 Yeah.

07:12 So certainly this is if this sounds interesting to you, be sure to watch that video.

07:14 It's on YouTube, or like it to it in the show notes.

07:16 And it's, you'll learn a bunch, especially conceptually what's going on in all this async IO stuff.

07:22 Super cool.

07:23 So before we move on to the next segment, though, let me tell you about the sponsor for the show rollbar.

07:27 So rollbar lets you basically just type pip install rollbar type a few commands either in your config file or in code in your web app, and it will continuously monitor your site, your web application for any sorts of errors.

07:43 And not just tell you if something happened, but capture all the details.

07:47 What was the logged in user when they ran into an error?

07:50 So who is your customer who's having this problem?

07:53 What's the call stack?

07:54 What other errors have you experienced like this?

07:56 What are the local variables passed to the function when it failed?

07:59 Like you can probably fix the error without even debugging it or running your code, you look and go, "Oh, I see what's wrong." I use Rollbar a lot.

08:06 I love it.

08:07 If you want to check it out, check it out at pythonbytes.fm/rollbar.

08:14 Next up, I want to talk about some optimizations.

08:16 This concurrency stuff that you brought up is certainly a form of optimization, but this is the future, trying to push CPython, the main default Python implementation, further.

08:29 This is an article on Medium by a friend of the show, Anthony Shaw, covering work that Victor Stinner has done.

08:37 Victor Stinner, have you been following what he's doing?

08:39 He's like killing it on performance.

08:41 He is.

08:42 Yes, absolutely.

08:43 I've been to his talk this past PyCon as well.

08:47 Yeah, so there's just so many things that he's doing that are amazing.

08:50 He did give some good talks at PyCon as well.

08:54 This project is called Fat Python, the next chapter in Python optimization.

08:57 So like I said, article by Anthony Shaw, and he basically highlights this fat Python project that was started by Victor Stinner back in 2015.

09:07 And the idea was, let's try to make it possible to apply better static optimizers for Python.

09:15 So one of the big challenges that you have with optimizing Python is it's super dynamic, you can't necessarily just look at the code and say, well, it has this structure.

09:24 we're going to change it around because you could go and dynamically add methods, functions, variables, whatever, right?

09:32 Yeah, so that makes it that makes it a big challenge.

09:36 So there's actually three peps Python enhancement proposals that chain together to try to make things a little bit better that Victor's working on as part of this project.

09:45 So one is pep 511, which is a proposal to add a process to optimize ASTs.

09:51 So ASTs or abstract syntax trees are basically what Python pulls up before it starts to execute your code, right?

09:59 It can pull up and it's basically a tree representation of the code that you write in a form that that's easier to be interpreted, right?

10:08 Like an object oriented representation of what your code is going to do, not just the text.

10:13 The idea is, it's possible, it could be possible to have an optimizer look at that AST and say, okay, this looks like pandas code, and you're applying this, you're doing this particular anti pattern that is slow.

10:26 So maybe we could change things around behind the scenes, you don't even know it to optimize or to fix that, right?

10:33 Maybe people run linters that say you're doing this thing that's not amazing.

10:37 What if we could have an optimizer that would just make it fast, right?

10:40 It just make the change for you.

10:42 Exactly.

10:43 The proposal is to basically create some kind of hook for creating these optimizers.

10:48 And this might be built into CPython.

10:49 It might even be something you pip install, like pip install the NumPy optimizer or whatever for my runtime, which is really, really cool.

10:57 That also brings us to PEP 509.

10:59 Like I said, this makes it really hard to optimize because everything is mutable at runtime.

11:05 There's these things called guards that verify the last time it's processed this bit of the structure that it hasn't changed.

11:13 PEP 509 is a process to add a private version of dictionaries that implement a different type of guards that are much faster.

11:21 We have 5.10, which proposes a public API to CPython for specialized code with guards for a function.

11:29 So the idea is you put it all together, you take this optimizer, it generates a new high-performance version, it replaces the code that would normally run with this optimized version.

11:41 And as long as it doesn't change, it can run that optimized code or fall back.

11:45 So taken together, all three of these create this fat Python thing, which is really great.

11:50 So you can download this and run it, you have to compile CPython, like a special branch of it to make it work, but the article talks about it.

11:57 But the results are really good.

11:58 So for example, a basic function that just you call it returns a string 24% faster than Python three, six, I was skimming through the article and the peps and I could not find for this implementation, what are the gotchas?

12:12 the gotchas, if there are any. It doesn't look like there are, which is great. But I was wondering if you know any code can run or... I think it would, I guess the gotchas would probably be like, can the optimizer deal with it? And are you 100% sure the optimizer is not going to make a change that, that like has a behavioral change, right? Right, right. Yeah, that was what I was looking for. I mean, are there any cases where it can make a mistake at this point? Yeah, I think I think this is basically cracking the system open so that these things could be plugged in but then you'd probably have to look at the gotchas of the optimizers.

12:46 You know what I mean?

12:47 Right.

12:48 Yeah, so it's pretty good like a 24% improvement in function call speed and that's 46% faster than Python 2.7.

12:55 Like that is a big deal.

12:57 One of the big drawbacks of Python is function calls are really slow.

13:00 And so people don't necessarily break their code into as small functions as they should.

13:05 And so this might really alleviate some of that.

13:07 I think it's great.

13:08 Yeah.

13:09 project. Yeah, keep up the good work, Victor. All right, so suppose I'm sitting in a coffee shop.

13:14 I'm sure it's fine if I just, you know, go log into my bank. Right. Things like that.

13:19 Right. Yeah. So, I mean, we all hear that it is insecure to go online on a coffee shop or hotel, airport, you know, all of those. Yeah, hotels and airports scare me way more, especially hotels these days. I don't know why. Right. So, you know, very few people really understand what's the problem. You may think that if you only access sites that are HTTPS, so secure sites that you'll find, and you're really not completely, all the content that you transfer to this site is going to be encrypted. But there are other things that happen before you get to a connection that are not encrypted, like for example, DNS searches. So if you're in a coffee shop getting to a lot of sites, it's very easy for someone on the same network to find out what sites you visit, even though they cannot see what data you transfer to them.

14:13 Right.

14:14 And if somehow they happen to be able to alter that DNS, which probably has no verification because it's unencrypted, then they send you to their google.com.

14:20 They send you to some other place.

14:22 Right.

14:23 Exactly.

14:24 So, you know, it's very important that you take security when you go on a public Wi-Fi spot, you know, very seriously. And I wanted to mention this tool that so many times I talk to people and I mention it and they don't know it and it's super great. It's called S-Shuttle and typically the solution that you you're told is that you need to pay for a VPN service and S-Shuttle it proclaims itself as the poor man's VPN and a nice advantage it has over regular VPNs is that it doesn't require any software to be installed on the remote machine, the secure machine that you use for your connection. So the way it works is basically you need to have SSH access into a secure machine and in my case I have it right here in my home. I have a little, this is a chip machine that is a $9 computers or could be a Raspberry Pi, you know, any of those and the only thing you need there is SSH, right? So you get it by default if you install the Linux distribution. And then from anywhere you are, you use a shuttle to create a secure encrypted tunnel into this secure machine that you have in your home. And then everything that you do goes through the tunnel, and then it's forwarded into this secure machine. And then it happens on your secure machine, including DNS searches. So that's really cool. So if I run a shuttle, and I go to like gmail.com it will actually funnel the traffic through say my raspberry pi like on on on your on the coffee shop there's only going to be a connection to your raspberry pi and then the raspberry pi will make the connection to gmail and then forward the results back into your connection in the coffee shop through an encrypted channel oh that's really cool and everything you do is absolutely private yeah that's that's super cool and it's written in python yeah and it is written in python uh it recently had support for python 3 yay nice that's That's cool.

16:20 Because that used to be a problem in the past, but now it works on Python 3 as well.

16:23 All right, well, that's really cool.

16:24 I'm definitely going to check that out.

16:25 All right.

16:26 So last thing is I want to talk about something that initially might surprise people about the topic is I want to talk about Node.js.

16:35 So Node.js, while, you know, Python developers sometimes or web developers that also play with Node.js, I don't want to talk about it for that reason.

16:43 Mostly I consider Node.js like a similar new modern ecosystem and environment similar to to Pythons, right?

16:50 It's very open source.

16:52 There's a lot of people excited and contributing to it, things like that.

16:55 But there's...

16:56 Node.js has been in the news for the wrong reason this week.

17:00 And basically, a bunch of the people who are in charge of the steering committee for Node.js quit.

17:07 Like a third of the committee quit and just said, "We're done with this," because there's been a huge rift in the community, apparently.

17:17 I want to talk about this community aspect and give thanks for how well things are going at Python and the Python ecosystem relative to Node.js.

17:28 Basically there was, I don't want to get into name calling, whatever, you guys go read the articles, but there was some guy who was on the committee who was making decisions that a lot of people disagreed with.

17:39 He was very much against the code of conducts for reasons that may or may not be valid.

17:43 I don't know.

17:44 Basically, they felt like he was not representing the community well.

17:48 And the way the board worked, or the system to enforce the code of conduct worked, was they would look at individual things and they would say, "Is this sufficiently bad, say, to have this guy removed from being in charge?" And they said any individual thing was not a big deal.

18:06 But it was not big enough to take that action.

18:09 But if you add up 10 or 20 of these things, all of a sudden here's like a...

18:13 if there's a pattern, right?

18:14 - A pattern, a pattern of behavior that says, you know, maybe this guy is not representing us as well as we want or something like that.

18:21 And so they decided not to remove the guy.

18:24 A bunch of people said, that's it.

18:25 We've tried for a long time to sort of fix this and we're so fed up with it that we're leaving.

18:31 We're no longer gonna be on the committee.

18:33 Some of the people just said, I'm done with Node.js.

18:35 Like these are former like steering committee folks on Node.js, like I'm done with Node.js entirely.

18:41 And actually, one of the, maybe the biggest thing that came out of this, not the people on the board, but they said moments after the failed leadership vote, Kate Marchan pushed the button and created IO.js, a new open source fork of Node.js.

18:56 And this has happened before there was some problem with Node.js in the community.

19:00 And they created this thing called IO.js.

19:02 This is A-Y-O dot JS, but phonetically, they're the same, right?

19:07 So you had some pretty interesting insights on this, I thought.

19:12 One of the things that you said, I mean, first of all, just give thanks that things are working better here and we seem to have a better balance.

19:19 But you also pointed out that we have a single leader who ultimately decides, right?

19:24 We have Guido.

19:25 Right.

19:26 We have a single person that sets the tone for the community, right?

19:31 And I believe they don't.

19:33 vote on things and many people vote and clearly they are very divided on the roles.

19:41 Some people put technical over community and clearly some other people it's the reverse of that.

19:47 So yeah, I think we're lucky that at least we have a model that we know that we follow.

19:56 >>Steve Yeah, I think it's really important that you know, Guido's open to taking feedback from all the folks involved in the community but in the end, he makes the decisions. And I think, you know, credit to him. He's made a lot of good decisions that keep people engaged. Yeah. So yeah, here's a Node.js item for Python Bytes.

20:15 And really, the story is just, you know, look at what's going on over here. And let's make sure that this doesn't happen to our community as well. It's called no. Yeah. All right. Well, Miguel, that's the items for this week that we had picked out. Anything else that you got going on?

20:28 John, you just did an amazing Kickstarter, right?

20:31 - Yeah, that was actually a little experiment.

20:33 I wanted, for many years, I wanted to update my Flask mega tutorial, the first tutorial I wrote more than four years ago, and the amount of work is so big that I always had to give preference to work.

20:49 - Gotta pay the bills, kids gotta go to college, things like that, right?

20:52 - Yeah, so I decided to give a Kickstarter to try, and basically I converted that task into work.

20:58 - That is really cool.

20:59 So you have this mega tutorial, this flash tutorial that you've done, and it's really elaborate.

21:04 And you basically ran the Kickstarter and said, look, I wanna update this tutorial.

21:09 Who's in on helping me do this?

21:10 And the community really responded, right?

21:12 - It responded in a huge way.

21:15 Basically, the modest funding that I set up for Go was met the first day.

21:22 And then from then on, I decided to start coming up with ideas to expand the tutorial, add more content to it.

21:28 - It got funded in one day?

21:29 - Yes, the 100% funding was in the first day, but then it got something like, I don't know, maybe 600% funding in the end.

21:39 It ended up pretty good, and I have some serious work to do because I have three new, complete new topics to add that I will be working on as part of this rewrite.

21:51 - That's really cool.

21:52 So super excited about that.

21:53 - Yeah, that's great, congratulations.

21:55 And how do people find more out about this?

21:57 - They can go to my blog, and the blog is blog.miguelgreenberg.com, and they're gonna find a little Kickstarter badge there, and from there they can go to the Kickstarter page and find out what I'm planning to do, if they're interested.

22:10 - Awesome, yeah, we'll throw the link in the show notes as well.

22:12 - Thank you.

22:13 - Cool, so I just have one piece of news as well myself, other than, hey, I'm back from vacation, that's awesome.

22:18 While I was on vacation, I ended up finding little bit of time to release my latest course, which is building RESTful APIs with Pyramid.

22:27 And so this is like an eight hour course digging into like the whole what is REST, like how do you work with HTTP status codes and verbs and all that and then making this all happen in Pyramid.

22:37 And it was really fun to write.

22:38 So if that sounds cool to you, check it out at training.python.fm.

22:40 I'll probably put that in the show notes as well.

22:44 All right, Miguel, thank you so much for filling in while Brian's having his beer over in Germany.

22:50 Actually, I don't know what he's doing, but I hope he had a beer.

22:52 Yeah, thank you for having me.

22:54 Yeah, it was great. Talk to you later.

22:56 Thank you for listening to Python Bytes.

22:58 Follow the show on Twitter via @PythonBytes.

23:00 That's Python Bytes as in B-Y-T-E-S.

23:03 And get the full show notes at pythonbytes.fm.

23:06 If you have a news item you want featured, just visit pythonbytes.fm and send it our way.

23:11 We're always on the lookout for sharing something cool.

23:14 On behalf of myself and Brian Aukin, this is Michael Kennedy.

23:17 Thank you for listening and sharing this podcast with your friends and colleagues.

Back to show page