

Transcript #90: A Django Async Roadmap

Recorded on Thursday, Aug 2, 2018.

00:00 Hello and welcome to Python Bytes, where we deliver Python news directly to your earbuds. This is episode 90, recorded August 2nd, 2018. I'm Michael Kennedy. And I'm Brian Okken. Hey Brian, good to be with you again. It's good to talk to you again. Yeah, and it's also good to have DigitalOcean sponsoring this episode, so thank you to DigitalOcean. We're both customers and they're sponsors of ours, so it's kind of this weird mix of everybody loving it. So pythonbytes.fm/digitalocean will get you $100 credit for your servers if you're a new customer.

00:30 So check that out.

00:31 In the meantime, we should probably talk about some data analysis.

00:34 We should.

00:35 And who better to talk about data analysis than Jake VanderPlas?

00:39 I honestly don't know.

00:40 He does awesome work over there at the eScience Institute.

00:43 And I love listening to his talks.

00:45 So there's a set of videos that he has on.

00:49 This is actually from last year, but it didn't count.

00:53 There's 11 videos and it's called Reproducible Data Analysis in Jupyter. Each of the videos is like five or six minutes, so they go pretty fast. And this is a really cool thing I think everybody should check out, because he goes through a dataset of the bike crossings at a particular bridge in Seattle, I think, but it really doesn't matter how the data gets in there. He's doing all this stuff live. He's using a Jupyter notebook and pulling data in, and sometimes the table that he ends up with doesn't look quite right, so he uses a different column as the index. Or, for instance, in the first video he puts a graph up and all the data is sort of packed together, so he changes the sample rate to just weekly data and all that stuff. I didn't even know you could do those things. So it's not necessarily complete, but it's kind of a full pass through all the tools you can use to do exploratory data analysis with Jupyter, doing it live. And watching a pro do it, it's a thing of beauty. Like I said, for each particular tool that he uses, it's not an in-depth study on exactly how to use it to its completeness, but you get a glimpse of all the power you have with all these things. Yeah, I really like it.

02:19 And I think this sort of view into exploring data is super interesting.

02:22 It really shows the power of Jupyter notebooks.

02:25 Like when I first saw them, I thought, "Oh, well, there's like a simplified programming environment for people that aren't like real programmers and don't want to work with different files and stuff like that." But the more I saw people using it and interacted with it, I realized it's just for people solving problems entirely differently than the type of problems I solve.

02:41 And it's really great for that.

02:43 Yeah, and working with data, when you're throwing up a graph or a plot of the data, sometimes you might be plotting it wrong.

02:53 You're like, "Well, maybe I could see something if I plot it this way." And it's not interesting. It's just a bunch of points everywhere.

03:00 But if you plot it a little different with a different axis, maybe, or a different type of plot, it might show you interesting information.

03:08 This is actually fascinating to me, this notion of people using large data sets and then trying to figure out in real time how to use it well.

03:18 And then once you've already figured that out, if you want to put some program together using some of those tools to monitor those things, that's a great idea.

03:26 But the ability to just use a notebook to just explore stuff is pretty fascinating.

03:32 - Yeah, and there's some really interesting use cases.

03:34 Like, suppose you want to go hit a bunch of APIs and download some data and then graph it.

03:39 If you did that in a script, every time you want to view it slightly differently, you're rerunning that download or recomputing the data, however you do that, right?

03:48 Whereas in notebooks, you can just rerun one cell at a time, change that cell, run it again, and you don't have to recompute or reacquire the data.

03:54 So it's really, there's a lot of interesting aspects to it.

03:57 - Yeah, and that's another thing he starts out with, showing how to, within the notebook, grab data from a URL and put it in a CSV file, some local file, so you don't have to hit the network all the time.

04:10 Nice.

04:11 But anyway.

04:12 Yeah, it's a good one.

04:13 I really like the series and I'm glad you brought it up.
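The download-once idea mentioned above can be sketched in a few lines. This is only a minimal sketch, not the code from the videos: the function name `get_data` and its `fetch` parameter are assumptions made up for illustration, and `fetch` exists only so the sketch can be tried without a network connection.

```python
import os
from urllib.request import urlretrieve


def get_data(filename, url, force_download=False, fetch=urlretrieve):
    """Download `url` to `filename` only when needed; return the local path.

    `fetch` defaults to urllib's urlretrieve(url, filename). It is a
    hypothetical hook so the caching logic can be exercised offline.
    """
    if force_download or not os.path.exists(filename):
        # First run (or forced refresh): hit the network once.
        fetch(url, filename)
    # Later runs reuse the cached local file.
    return filename
```

In a notebook cell you would call something like `get_data("bikes.csv", some_url)` once and then read the local file, so rerunning the cell does not go back to the network.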

04:17 Another thing I want to bring up, something we haven't covered very much.

04:19 Have we talked about GUIs in Python yet?

04:21 You know, I think they're important.

04:22 We probably should start talking about it.

04:25 Yes.

04:26 So, because we were on our kick, of course people said, "Oh, have you heard about this?

04:31 Have you tried that?" And here's another one.

04:34 This one comes from Mike Barnett.

04:35 He sent me a couple emails, sort of charting the progress of this project that he built.

04:41 The name pretty much says it all, PySimpleGUI.

04:44 So it's for simple Python GUIs.

04:47 And it's just another really simple way to take what would have been like a command line program and make it a lot better.

04:54 So you know, bonus points to Mike, because this not only has a screenshot, this has many screenshots and many examples.

05:02 So as all the Python GUI libraries out there should have, if you want people to use them, screenshots.

05:10 So check it out.

05:11 It's pretty cool.

05:12 What you do is you work more or less in 100% Python language API.

05:17 And you don't work down at the GUI toolkit layer, which is cool, I think.

05:21 So you define the UI and it has this sort of auto layout mechanism.

05:25 And it has things like slide bars and text boxes and buttons.

05:29 It works in all the things.

05:31 So that's pretty nice.

05:32 One of the examples it gives is: do you have a Raspberry Pi with a touchscreen? Well, why don't you just write this one screenful of GUI code and you've got your touchscreen GUI working. Yeah. So for better or worse, it's based on Tkinter, which is good because it comes with Python and there are no dependencies. So literally you just pip install PySimpleGUI, or if you want, there's a single Python file you can include with your code. So there's not even the pip install package stuff. You could literally just go, here's the file, PySimpleGUI.py, put next to my program, and that's all it needs. So that's cool. But honestly, I think Tkinter looks a little dated, right?

06:15 Like it looks like it belongs a little more in WarGames than it does in 2018. But anyway, it's still pretty nice as it is.

06:25 One of the things that he lists in the upcoming goals, or if anybody wants to help out, is to port this to other graphics engines.

06:34 Because you don't work in the toolkit API directly, and this is like a translation layer, you could hook this up to, say, wxPython or Python for Qt, Qt for Python, things like that.

06:46 You could get one of these modern-looking UIs if somebody's willing to do that translation.

06:51 That's a neat idea. I'd like to see people working on that.

06:53 Yeah, wouldn't that be sweet? Then it could actually detect if you even have them, and it could even go, oh, do you have wx? Okay, great. We'll go with that. Oh, you have Qt.

07:00 Let's just run that version. Don't have to get a dependency installed. That'd be best.

07:04 Yeah. And yeah, you're right that the TK stuff looks dated. But you know, there's a lot of use cases where you can use a GUI that doesn't have to be pretty.

07:14 Yeah, well, you know what else can look dated is the command line to a lot of people.

07:17 So, you know, maybe this is a lot better interaction for non-technical people, even if it does have a little bit funky shading on the buttons or something.

07:25 Yeah, I mean, I don't get it, but I do have, I fight that battle every once in a while.

07:30 Tell people this, just run this command line thing.

07:33 What do you mean run the command line thing?

07:34 Oh dear.

07:35 They're like, I studied accounting.

07:38 I don't know where the terminal is on my Mac.

07:41 Okay.

07:42 Okay, then.

07:43 Yeah, one other thing to call out that I think is interesting.

07:47 This is Python 3 only.

07:49 So no Python 2 for PySimpleGUI.

07:53 And more and more again we're seeing things where it used to be, well, I can't switch to Python 3 because this isn't supported.

07:58 Now it's like if you don't switch to Python 3, you don't get these cool libraries.

08:02 And here's just one more example.

08:05 Yeah, great.

08:08 Well, we've got a team that is migrating to Git, and so this isn't directly Python related, but I ran across this article called "Useful Tricks You Might Not Know About Git Stash." And git stash, the stash command, is something that took me a while to run across because it's not something you have to use. So let's say you've got a repository that you've cloned, you've made some changes in it, and you're not ready to do anything with your changes, but you need to maybe pull down a new version, or you branched at the wrong point, or whatever.

08:48 Stash is a way to save off all of your changes, all the dirty stuff in your directory, save it away and just like hide it somewhere.

08:57 And then you can reapply those changes later after you update or pull or something.

09:02 And I'm still working through how to integrate Stash into a good workflow, but I wanted to highlight this article, Useful Tricks You Might Not Know About Git Stash, because I'm learning it and I wanted other people to know about it also.

09:18 - Yeah, that's cool.

09:19 One of the ideas that seems like it might be relevant here: suppose you've got some branch checked out and you're doing some work, you're like halfway through it, and somebody comes along and says, "Hey, I'm on the same branch as you and we have this bug. Could you just fix this really quick, or make this quick change so that I can carry on?" And you're like, "Oh, but this work I have here is half done, it won't be done till tomorrow." So you could stash that away, get the latest, do some work, push that in, and then reapply that stash to get your work back without actually committing it and messing up that whole branch.

09:52 Yeah, definitely. And the use case we often have is that the test team is working on different tests, but we're sharing utility libraries and fixtures and stuff. And somebody updates a crucial fixture or a communication utility, and you want to use that, but you're in the middle of writing your test or changing something. These aren't merge conflicts at all, but Git doesn't let you pull in the new stuff on top of your old stuff. I mean, you can do a merge, right, but you still have to commit it to your local repo before you can do a pull, which you might not want to do yet.

10:33 And if you're just starting out or whatever, you might not want to do that if you just want to look at stuff.

10:37 So that's the case where we're playing with this workflow: just stash away your changes to do a pull.

10:45 And again, people can correct me if I'm using the term pull wrong, because I'm still learning the right times to do pulls and fetches and merges and all that stuff.

10:55 So nice.

10:56 And this is really cool.

10:57 Two things that it calls out here that I thought were pretty cool.

11:00 Well, I guess three. One is you can label your stashes, so you know what they mean. They're not just hashes. That's good. Also, I didn't know you can do a -u to include untracked files. That's pretty cool. That's pretty cool. I didn't know that either before I read the article. And the last one that I totally didn't know you could do is, once you have your stash saved, you could say, well, I probably shouldn't have done it as a stash, I probably should have just put that on a branch. There's a way to say git stash branch and then a name, and then you can specify which stash, and it just takes all those changes and creates a branch. That's cool. Yeah, I really like that one. Like, oh, I stashed it, and actually what I want to do is more work, sort of in parallel, split off my work without committing it, to sort of convert the stash to a branch. That's cool. Yeah. Yeah, very nice. I like it.

So let me tell you about this new thing that DigitalOcean has. They have virtual machines and floating IPs, and they have Spaces and load balancers and all these sorts of things, even domains and DNS. And if you have a lot of stuff going on at DigitalOcean, well, you might have like 20 virtual machines, and some of them are for one project and others are for another project. How do you know which one is for which? And is that one safe to delete? I think we're done with it, but I'm not sure. I actually don't really know what it belongs to. So they came up with this new feature called Projects, where you can group droplets, load balancers, domains, IP addresses, all that kind of stuff into different projects. You can say this one is, say, for the training site, and these three parts all fit together there. This one is for the Python Bytes podcast, and these two servers and Spaces all fit together over there. So pretty cool. Check that out. It's just one more way to make your hosting life easier.
Yeah, and be sure to visit pythonbytes.fm/digitalocean. And if you're a new user, you get $100 credit.

12:53 So that makes it even nicer.

12:55 So one of the things I'd like to see, Brian, is more async stuff.

12:58 And I think the place where it's most beneficial is around the web, actually.

13:02 Well, yes, last week, I said, because we can have like a bunch of different worker processes, it's not really necessary, right?

13:09 If you've got an eight-core server, you could have, say, 16 little uWSGI worker processes.

13:15 And each one can sort of computationally chew up its stuff on one core and get shared by the OS. But really, there's some limit where you don't want to create more because you run out of memory, right? I think the 16 on the training site probably take like two gigs of RAM. So you can't have many of them or you'll run out of RAM, unless you have a lot of room there. At some point, you maybe are waiting on a database call, right? You get a request, and the request says, well, in order to process this request, I need to hit this database. And actually, this is a query that takes 500 milliseconds to return. That thread is really doing nothing but waiting on a socket to return something from, say, Postgres or MongoDB. It just as well could be doing other work if it could let go, but, you know, maybe it doesn't, right? So if we could build this with async and await, anytime our code is waiting, it immediately gives up its thread, and that thread will go on to do more processing. This is basically the fundamental concept of how Node.js can do hundreds of thousands of concurrent requests on one server, right? Because most of those are waiting on a database or something else.
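The "give up the thread while waiting" idea can be seen with nothing but the standard library. This is a minimal sketch, not Django or Node.js code: `fake_db_query`, `handle_requests`, and `run_demo` are hypothetical names, and `asyncio.sleep` stands in for the 500 millisecond database call.

```python
import asyncio
import time


async def fake_db_query(name, seconds):
    # asyncio.sleep stands in for waiting on a database socket.
    # While this coroutine waits, the event loop runs the others.
    await asyncio.sleep(seconds)
    return f"{name} done"


async def handle_requests():
    # Three "requests", each waiting 0.5 s, are awaited concurrently.
    return await asyncio.gather(
        fake_db_query("req-1", 0.5),
        fake_db_query("req-2", 0.5),
        fake_db_query("req-3", 0.5),
    )


def run_demo():
    # Returns the results plus the wall-clock time for all three.
    start = time.perf_counter()
    results = asyncio.run(handle_requests())
    elapsed = time.perf_counter() - start
    return results, elapsed
```

Run one after another, the three waits would add up to about 1.5 seconds. Awaited together on a single thread, the total should be close to the longest single wait, because the waits overlap.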

14:28 However, the problem is many of the popular Python frameworks don't support this concept of async. We have new frameworks, Sanic, Japronto, and others, that do support it. But those are not the old frameworks, right? So there are these new ones that are exciting and fast, and there are the old ones that everybody knows how to work with and has deployed. But bridging that gap is a challenge. So Andrew Godwin, the guy who works on Django Channels.

14:59 He came up with a Django async roadmap. And it's pretty interesting and pretty thorough.

15:04 And it talks about the timeframe and how they might make Django support this sort of world where you can have async methods.

15:13 Yeah, that's actually really cool because, I mean, Django and Flask have to get there, especially Django, or something else will take over.

15:24 Right, exactly.

15:25 I mean, this is one of the times you hear people say I'm switching to Go because it does better concurrency than Python.

15:29 Well, if Django and the other frameworks just had it baked in like that, that whole argument would largely go away.

15:36 So anyway, this is really cool. And I really like how he's put it together. He said he thinks it's time, the time has come to start seriously talking about bringing async functionality to Django.

15:46 And he's shared it previously with some people internally. But this is him kind of coming out and saying, I'm opening it up for public feedback. So he has some interesting goals. He says the goal is to make Django a world-class example of what async can enable for HTTP requests. And that means various things at different parts of the stack. So doing ORM requests in parallel, right, instead of just waiting on the database. Instead of waiting the 500 milliseconds, you just continue doing processing and then get back to it when the ORM responds. Allowing views to query external APIs without blocking. So, you know, we talked about the retry stuff. If you're calling a credit card or other sort of external API, right, then, you know, that would go away faster. You could do slow-response long polling super easily. It's like a sort of WebSocket stand-in. All sorts of performance improvements. He says it's imperative that they keep Django backwards compatible, and make sure that when people come to the project, this is an option they can turn on, not something they have to learn. Part of the beauty of these frameworks is they've been really easy to get started with. Let's not throw this at people as the very first thing they ever... Yeah, yeah, there's a place for it and sometimes not a place for it.

17:01 Yeah, exactly. All right. So he said, why now? Well, Django 2.1 will be the first release to support only Python 3.5 and above, and not the previous ones. And Python 3.5 and above is where async and await, the language syntax, truly became properly supported. So that's why he thinks now's the time to start working on this.

17:23 Yeah, and then the timeline is broken out into different Django releases and what sort of goals there might be for each one.

17:33 That's nice.

17:34 Yeah, it's pretty cool.

17:35 Like, it doesn't make any sense to parallelize the web methods necessarily without parallelizing the data access.

17:40 So maybe start with the ORM actually, things like that.

17:42 Yeah, I love this.

17:43 I think it's also good for, it's not just individual developers, but companies that have applications.

17:52 They might want to think about concurrency. They don't need to do it right now, but they know that they're going to eventually do it, so it's good to see this roadmap.

18:02 And maybe that'll help people.

18:05 The article also talks about funding.

18:07 So if a lot of companies are relying on this or looking forward to it in the future, maybe kicking in some dollars to help it go faster is a good thing.

18:16 Yeah, it blows my mind how many huge companies are basically built on Python infrastructure and contribute zero to it. Yeah, and that's something that we as a society, not just the Python community but the web-using community, have got to tackle. But part of this is, I guess, also things like this that say, hey, this is where we're going, and some of these problems need people focused on them, not just volunteer time. So some direct money to hire somebody for six months or a year would be a good idea. Yeah, a lot can get done with just a few months of focused time. Yeah, definitely. Yeah, I mean, that's how the new PyPI got launched.

18:53 So this is all the way up to, looks like a mostly async Django by Django 3.2.

19:00 Yeah, that's awesome. That's a ways, it's pretty conservative. It's not too wild and I really think it's a well thought out plan so I'm happy to see Andrew put it out there. Nice. Yeah, nice. So you got some music you're gonna play for us?

19:13 We both actually end up processing a ton of audio files, don't we? We do, and I usually don't do it in Python, but this PyDub, plus what a great name, PyDub, makes me think that maybe a little bit of audio processing within Python might make sense.

19:30 PyDub, the tagline is manipulate audio with a simple and easy high-level interface.

19:36 But it's really actually pretty cool. With just a single line, for instance from_mp3, you can pull an MP3 file into a variable. And then once you've got it there, you can do things like use the bracket operators to get the first 10 seconds.

19:52 - It's crazy you can use the slice on it and you can use indexing operators.

19:57 It's like, that's crazy.

19:58 - And then adding or subtracting integers changes the volume by that number of decibels.

20:04 The use of operators is pretty cool.
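To make the operator idea concrete without needing PyDub or any audio files installed, here is a toy stand-in. It is an assumption-laden sketch, not PyDub's implementation: `ToyClip` is a hypothetical class that holds a plain list of samples, slices by milliseconds, and treats adding or subtracting an integer as a gain change in decibels.

```python
class ToyClip:
    """A list of samples plus a sample rate. NOT PyDub's implementation."""

    def __init__(self, samples, rate_hz):
        self.samples = list(samples)
        self.rate_hz = rate_hz

    def __len__(self):
        # Length in milliseconds, so len() and slicing share one unit.
        return len(self.samples) * 1000 // self.rate_hz

    def _index(self, ms):
        # Convert a position in milliseconds to a sample index.
        return ms * self.rate_hz // 1000

    def __getitem__(self, ms):
        # Only slices are handled in this sketch: clip[start_ms:stop_ms].
        if not isinstance(ms, slice):
            raise TypeError("ToyClip only supports slicing")
        start, stop, _ = ms.indices(len(self))
        return ToyClip(self.samples[self._index(start):self._index(stop)],
                       self.rate_hz)

    def __add__(self, db):
        # A gain of `db` decibels scales amplitude by 10 ** (db / 20).
        factor = 10 ** (db / 20)
        return ToyClip([s * factor for s in self.samples], self.rate_hz)

    def __sub__(self, db):
        # Subtracting decibels is just a negative gain.
        return self + (-db)
```

With this toy, `clip[:10000]` is the first 10 seconds and `clip[:10000] + 6` is the same audio 6 dB louder. In PyDub itself the matching pieces are, as far as its documentation describes, `AudioSegment.from_mp3(...)`, millisecond slicing, and `+` or `-` with a number of decibels.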

20:07 Plus, you know, so slicing and chopping, but you can do crossfade and repeat and fade.

20:12 And I'm not quite sure what the difference between crossfade and fade is. But anyway, changing formats from, say, WAV to MP3 or something, adding meta tags, that's pretty cool.

20:23 - That's the one that got my attention.

20:25 I'm like, oh, oh, this might help on some production stuff I'm doing.

20:30 - Making sure of specific bit rates, or MP3 has a quality level.

20:35 You can pass all that stuff in for saving.

20:38 Anyway, it's just like a really, we'll include a code snippet of a few things you can do, but it's pretty easy to maintain code once you've got it in place, I think.

20:47 - Yeah, this looks really interesting.

20:48 If you're doing anything with audio, people should check this out.

20:50 I did talk about this trade-off and how Django solving its async problem within itself would be great, but I still think there's room for exploration on the web in the Python world.

21:02 And so this next one is pretty much that.

21:04 It actually describes itself as an experimental framework, but it's called Molten, a modern API framework.

21:10 Have you heard of this, Brian?

21:11 - No.

21:12 It's a minimal, fast web framework specifically for building APIs with Python.

21:19 So I don't even know if it has a template language for HTML.

21:23 It's all about just building APIs.

21:25 But it looks pretty awesome, actually.

21:28 - Yeah, and pretty terse and small.

21:30 - Yeah, one of the things that I like that it does is it uses type annotations for a whole bunch of cool things.

21:38 So the other framework I saw do this was API Star, but I don't think it used it quite as much.

21:44 So for example, you can have an API function that has a name, which is a string, and an age, which is an integer, and it will automatically pass that data over to you, as you call it, which is pretty awesome.

21:58 It also does request validation.

22:00 So you can create a class, which looks very much like a data annotation, or looks like a data annotation class.

22:08 And you give it a decorator and say, this is a schema.

22:11 And what happens is if you say my API function takes this class as an argument, so their example has a to-do class.

22:19 So if you say the input is colon to-do, right, you annotate it as a to-do, then it will actually parse all the things like the ID and the description and all the various pieces out of the input and verify that, you know, the ID is a string, or the ID is an integer, the description is a string, all that kind of stuff just by using type annotations.
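The general technique of using type annotations as a schema can be sketched with standard-library dataclasses. To be clear, this is not Molten's API: `Todo`, `load`, and `ValidationError` are hypothetical names for illustration, and the sketch only handles simple types like `int` and `str`.

```python
from dataclasses import MISSING, dataclass, fields


class ValidationError(Exception):
    def __init__(self, errors):
        super().__init__(errors)
        self.errors = errors


@dataclass
class Todo:
    id: int
    description: str
    status: str = "todo"


def load(schema_cls, payload):
    """Build schema_cls from a dict, checking each value against its annotation."""
    errors, values = {}, {}
    for f in fields(schema_cls):
        if f.name not in payload:
            # Fields that declare a default may be left out of the input.
            if f.default is MISSING:
                errors[f.name] = "this field is required"
            continue
        value = payload[f.name]
        if isinstance(value, f.type):
            values[f.name] = value
        else:
            errors[f.name] = f"expected {f.type.__name__}"
    if errors:
        raise ValidationError(errors)
    return schema_cls(**values)
```

Calling `load(Todo, {"id": 1, "description": "buy milk"})` gives back a `Todo` instance, while a payload with a string ID or a missing description raises `ValidationError` with one message per bad field. A framework can run a step like this before calling your handler, which is why annotating the argument is all you have to write.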

22:42 (laughing)

22:44 - That's pretty cool.

22:45 - Yeah, yeah, it's pretty sweet.

22:47 Here, I'll throw out the next one, see what you think about this.

22:50 They also support dependency injection for allowing you to pass different data access layers and stuff like that.

22:58 So if you wanna test it, you could pass in a mocked out data layer, whereas by default, you just register it at app startup and it'll create all the different pieces of infrastructure and pass them to the methods automatically.
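Annotation-based dependency injection, as described above, can be sketched like this. Again, this is a minimal illustration of the pattern and not Molten's actual API: `Injector`, `Database`, `FakeDatabase`, and `get_user_name` are all hypothetical names.

```python
import inspect


class Database:
    def get_user(self, user_id):
        return {"id": user_id, "name": "real user"}


class FakeDatabase(Database):
    # A mocked-out data layer for tests.
    def get_user(self, user_id):
        return {"id": user_id, "name": "test user"}


class Injector:
    def __init__(self):
        self._factories = {}

    def register(self, cls, factory):
        # Done once at app startup: "when a handler asks for cls, call factory".
        self._factories[cls] = factory

    def call(self, handler, **kwargs):
        # Fill in any parameter whose annotation has a registered factory.
        for name, param in inspect.signature(handler).parameters.items():
            if name not in kwargs and param.annotation in self._factories:
                kwargs[name] = self._factories[param.annotation]()
        return handler(**kwargs)


def get_user_name(user_id: int, db: Database) -> str:
    # The handler just declares what it needs; it never builds a Database.
    return db.get_user(user_id)["name"]
```

In production you would register `Database` as its own factory. In a test you register `FakeDatabase` for the same annotation, and the handler runs unchanged against the mock.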

23:11 - Okay, some people like that.

23:12 - Yeah, you know, I don't see that very often in the Python space.

23:15 I have mixed emotions.

23:16 Sometimes it's nice, sometimes it's not.

23:17 But anyway, it supports that, you don't have to use it.

23:19 But I do think the validation and the schema and the auto-mapping of your sort of JSON documents to and from just strong classes with Python-based declarative requirements and stuff is really cool.

23:31 - Yeah, I think the extra thing that they're adding, this idea of using annotations as a schema, It's pretty cool.

23:38 That's neat.

23:39 Yeah.

23:39 I really like it too.

23:40 And the other one that I looked at, sorry if I get this a little bit wrong, but there's some other framework that also used annotations that I thought was really cool, but it used them in a way that Python itself didn't make a lot of sense of.

23:53 So like you could say, like I'm getting an API key passed to me and you would say colon header to say this API key is coming out of the header, but when you actually work with it, it's not actually a header, it's a string.

24:06 It just came from the header.

24:07 And so things like PyCharm and stuff would freak out and go, that doesn't have this method.

24:11 You're like, no, I know it's a string, even though I just actually said it's a header.

24:14 Like this is cool 'cause the thing you say it is actually is what it is.

24:19 The framework is consistent sort of with the programming model.

24:21 I like that a lot.

24:22 Yep, anyway, pretty cool.

24:25 And people can check that out if they're building APIs.

24:28 Remember, it's in the experimental stage, but you can play with it, see if it fits your needs or make it better.

24:34 - Definitely, nice.

24:35 - Yeah, pretty cool.

24:36 Alright, anything else you want to share with us, Brian?

24:37 No, I can't believe we're already done.

24:38 I know.

24:39 Same for me, I covered it all last week.

24:41 So just always fun to share this stuff with you.

24:44 Thanks for being here.

24:45 Definitely fun and everybody keep on sending us things that we should check out.

24:48 I love getting tips from people.

24:50 Absolutely.

24:51 Same here.

24:52 See you later.

24:53 Bye.

24:54 Thank you for listening to Python Bytes.

24:55 Follow the show on Twitter via @pythonbytes.

24:57 That's Python Bytes as in B-Y-T-E-S.

25:00 And get the full show notes at pythonbytes.fm.

25:04 If you have a news item you want featured, just visit pythonbytes.fm and send it our way.

25:08 We're always on the lookout for sharing something cool.

25:11 On behalf of myself and Brian Okken, this is Michael Kennedy.

25:14 Thank you for listening and sharing this podcast with your friends and colleagues.
