Brought to you by Michael and Brian - take a Talk Python course or get Brian's pytest book


Transcript #162: Retrofitting async and await into Django

Return to episode page view on github
Recorded on Thursday, Dec 19, 2019.

00:00 Hello and welcome to Python Bytes where we deliver Python news and headlines directly to your earbuds.

00:05 This is episode 162 recorded December 19, 2019. I'm Brian Okken and today I've got guest host Ali Sevgi. He's one of the organizers of the Chicago Python meetup and we met last week when I was there for briefly and we already started recording this but I forgot to press record so we're starting over. So thanks Ali for hanging in there with me. Thanks Brian, - I'm glad to be here.

00:29 - And this episode is brought to you by Datadog.

00:32 We'll talk more about them later, but first, Ali's gonna tell us some exciting things about Django.

00:38 - Yeah, so Django 3.0 just came out in early December, and so I really wanted to find out what was going on.

00:45 I found a talk from DjangoCon 2019 by Andrew Godwin titled "Just Add a Wait, Retrofitting Async into Django," and Andrew's been the one leading the implementation of asynchronous support into Django, And he discussed what the 3.0 release signifies, also went over the implementation roadmap for upcoming releases.

01:04 - Okay, wow, that sounds fascinating.

01:06 - Yeah, he started by giving an overview of the async landscape, talked about how synchronous and asynchronous code interact.

01:13 A key point that he made was that when we have asynchronous functions, they're very different than synchronous functions.

01:19 This makes it really difficult to design proper APIs.

01:23 And one of the difficulties in adding asynchronous support to Django is that this project, it's been around for a really long time.

01:29 There are a lot of people familiar with it.

01:31 And for these new async implementations, they wanted to have that same familiar feeling.

01:37 And so they made a plan to implement async capabilities in three phases.

01:41 And so phase one, that was completed with the Django 3.0 release.

01:45 And this phase, it's really meant to lay a lot of the groundwork for future changes.

01:50 And so Django now supports asynchronous server gateway interface or ASCII.

01:54 And the ORM, it's also async aware.

01:57 So if you actually call ORM code from async code, it's going to raise an exception to make sure that you're not calling the synchronous function from your async code.

02:05 You don't want to have unexpected behavior there.

02:08 Phase two is going to be tracking with Django 3.1.

02:11 And they're going to be adding async capabilities to the core parts of the request path.

02:17 This means we're going to have async views, async handlers, and async middleware.

02:22 And there's already a branch in the Django repository where this is mostly working.

02:26 They just need to add a couple of tests to make sure things are passing with all those cases.

02:30 It sounds really well thought out.

02:32 They put a lot of work into it and there's a really detailed Django pep.

02:35 I think it's called a depth of all about it.

02:38 And then finally, with phase three, that's going to make the Django ORM all async.

02:42 And Andrew called this the largest, most difficult and most unbounded part of the project.

02:47 So if we just start thinking about like ORM queries, sometimes they can result in a lot of implicit data-based lookups if we're not really careful.

02:55 And this is where a lot of those complexities are going to come from.

02:57 I really enjoyed this talk. I think it went into the right amount of depth while being really accessible. And I just want to call out the fantastic diagrams. It really helped improve my understanding of Django internals. And if you check the show notes, I'm going to include a link to the async project wiki. Andrew needs some help putting these things into Django.

03:15 He's got a lot of areas there. If you have some time, you have the expertise, please go ahead and help him out.

03:20 Yeah, awesome. I like doing calls to action on the show to try to get people involved.

03:24 And I've been just starting Django for like several years now, just dipping my toes in and out every once in a while.

03:31 And I was kind of waiting for this new version to come out. So I'll probably check this out first.

03:36 I like that idea of having, since they're gradually rolling it out, doing error messages for if you try to do things that aren't implemented yet.

03:45 So that's cool.

03:46 Yeah, it's always best to be explicit.

03:48 So would you say you've been playing around with Django?

03:51 Yeah, okay.

03:52 Nice, nice segue there.

03:54 So the next topic, speaking of playing, I don't know how you got into programming, but a lot of people get into it with games.

04:02 And I actually feel bad that I'm in that camp also, because I don't know, it's just kind of cliche.

04:11 But yeah, sure enough, Games by Example is a GitHub repo by Al Swigert and other contributors as well.

04:18 Or I think there's others, but he wants more.

04:21 So the idea is these are standard I/O games.

04:25 And he's got a collection of games with the source code to use for example programming lessons.

04:32 They're all written in Python 3 and you can just clone the repo or just view it and copy and paste if you don't really, even for people that aren't used to git stuff.

04:42 I think it's neat 'cause I, before I even learned any of the concepts of programming, I was programming games by copying them out of magazines and typing them into my old TRS-80 way back in the stone ages.

04:53 - That's awesome.

04:54 - I didn't know what it was doing, but I could modify things and sort of interpret it you can kind of read BASIC. So there's some cool features of these games that actually I would have enjoyed at the time. Some of the neat things is that they're short. They're limited to 256 lines of code, just an arbitrary number, but that's a pretty nice small code size. They're all single file, single source code files with no installers, so you just run them with Python. They'll use just the standard library if they're only using print and input for interacting with the user. You know there's some downsides to that but there's some upsides too because they're fairly simple. He's tried to make them very well commented and very readable so he mentions that they're elegant and efficient is nice sometimes but understandable and readable is better for for education purposes so that's what they're planning on doing. That's what his focus is. Also input validation on everything so that you can't crash Python from typing in something weird. This is kind of cool. He made sure that every function has a docstring to describe what it is because it really is meant to be teaching people and I think this is a pretty cool. Yeah Al does a lot of good things about structuring things in a way that everybody can understand really love all the stuff he does. This would be great for people to use in I think in helping your kid out. I'm gonna try to get my my kids running with some of the stuff to just say hey watch this is how you run it now play with it and yeah break it. He also included to-do list of things he wants to do with the project but hasn't done yet and I love that because it gives people direction if they want to help out with the project. One of the areas is testing, he wants more tests so I'd love for people to get involved with that. Always got to make the call up for testing. Yeah definitely. So before we move on I want to give a shout out to our sponsor. This episode is sponsored by Datadog, a cloud-scale monitoring platform that unifies metrics, logs, distributed traces, and more.

06:54 Trace requests across service boundaries with flame graphs, correlate traces with logs and metrics, and plot the flow of traffic across multi-cloud environments with network performance monitoring. Plus, Datadog integrates with over 350 technologies like Postgres, Redis, and Hadoop, and their tracing client auto-instruments common frameworks and libraries like Django, tornado flask and asyncio. That's cool. Get started today with a 14 day free trial at pythonbytes.fm/datadog.

07:27 Now speaking of testing, Bulwark has some, ah I stole your thunder, but then this topic has some test related stuff right? Yeah so Bulwark is an open source library that allows users to easily property tests their pandas data frames.

07:44 It makes it easy for data analysts and data scientists to write tests.

07:49 So let's just take a step back.

07:50 We all know that tests are important, but tests for data, they're just a little bit different.

07:55 These tests, they're not as deterministic and they require us to think about testing just a little bit differently.

08:02 So with property tests, what we can do is we're able to check that an object has a certain property.

08:08 And so property tests for data frames include things like validating the shape of a data frame, checking to see if a column is within a given range, verifying that a data frame has no NANDs, and things like that.

08:22 So with Bulwark, you're able to implement these property tests as checks, and each check takes a data frame and some optional arguments.

08:30 This check will make an assertion about some data frame property.

08:34 If things are good, the check's gonna return the original data frame.

08:37 If the check fails, the assertion error is raised, and you have a little bit more insight into what went wrong.

08:42 And this is really helpful when you have a large data pipeline.

08:45 - That's cool. - Yeah, it's really cool.

08:47 And so one of the ways that Bulwark lets you implement property tests is also through decorators.

08:53 And so when you're creating data pipelines, it's really useful to do them as functions.

08:57 You have some input data, you perform some sort of action, it returns an output.

09:02 And with Bulwark, what you can do is you can add decorators to your pipeline functions and validate that the properties of your input data frames meet the conditions that you really wanted to have.

09:12 So Bulwark has a lot of pre-built checks already in there.

09:15 There's one for has certain data types.

09:18 Is this column monotonic?

09:20 Is this within a certain number of standard deviations?

09:22 And it seemed pretty straightforward to add new checks.

09:25 And instead of stacking decorators for multiple checks, they have a special decorator, a multi-check decorator, which won't fail only on the first.

09:33 It's actually going to run through them all and tell you all the ways your data either passed or did not pass.

09:39 Oh, that's great. Wow. Neat.

09:40 Yeah, you can use Bulwark for implementing unit tests.

09:43 You can use them in the ETL pipeline, especially on the extract and the load steps.

09:48 And Bulwark can be used during development.

09:51 And they also have some options to turn assertion errors into warnings for production.

09:56 I'm not really too sure how I feel about that, but that functionality is there if you want to use it.

10:00 This is cool.

10:01 This is actually coming out of Chicago Python community member Zach Rosenberg.

10:06 He created this and gave a talk about it at Chippy.

10:08 So just wanted to share with the rest of our community.

10:10 I think it's great.

10:11 And I think that there's seeing more and more.

10:13 I did a recent episode of Testing Code where we talked about, I talked to one of the people on the Great Expectations project that's around testing with data also.

10:22 And, you know, some frameworks are attacking the problem differently.

10:27 And also just some people like one style over another.

10:31 This is a little bit different style than Great Expectations.

10:33 So I think it's definitely worth checking out and having more, Gosh, more and more of our life is controlled by data pipelines, not necessarily controlled, but influenced by results based on these data pipelines.

10:47 So having these checks in place to make sure that our data is just as solid as our source code.

10:53 This is awesome.

10:54 Yeah, it's really good to see all these data scientists and data engineers thinking about testing in a different way.

10:59 Oh, gosh, there's some interesting contributors to this.

11:04 This is great. Cool.

11:05 I got off on the test side I could go down that rabbit hole for all day long but let's pop out and talk about packaging a little bit.

11:16 Yeah actually so packaging such as we do cover almost every story packaging related because it is a stumbling block for a lot of people to go forward at at some point you get into the intermediate python developer and how to share your code is and dealing with virtual environments and stuff is one of those things you have to run into.

11:36 So I think it's kind of cool. I am not a Poetry user, but I might try this new one out. So Poetry just announced the Poetry 1.0.0.

11:45 They're no longer 0 versionings.

11:47 They've got to 1.0. Awesome.

11:49 The announcement is by Sebastian Eustace and it highlights some of the changes and a good caution right off the bat is they are breaking backwards compatibility with this version because the zero-ver is often, we're trying things out and I think it's completely fair to break, it's always fair to break with major versions but but I think this is reasonable.

12:14 So one of the things I, I think it's this highlights, I want to highlight a few things I think it's very for this version we can really take poetry seriously I think that they're planning on sticking around and people that like it there's a lot of changes that are good.

12:29 They've added different ways to manage environments.

12:32 So within a project, you can have Poetry help you coordinate multiple versions of Python within the same project.

12:39 That's pretty cool. I don't know if it had that before. That's neat.

12:43 One of the things that I want to bring up next is private indices.

12:47 If you're working with projects within a company, this is very important to be able to have your own private, something like PyPI.

12:57 And one of the ways you can do that is you can have your a private version of something like PyPI, but you can also just throw a whole bunch of wheels in a directory and use that as an index as well.

13:09 And so having there's some improved support, you can have even specify a different index for each dependency if you want.

13:17 But you can also set up a primary and secondary or default and secondary index and have them be non something other than PyPI if you want.

13:26 So this is cool.

13:27 using Poetry and other tools like this within continuous integration, sometimes it's hard to pass things around and so environmental variables are sometimes they're ugly but they're useful within CI and so there's new environmental variable support. And then since there's all this new support for different sort of dependencies, the add command has been expanded to allow you to just add dependencies and put them in the right place with the right source and everything. The other thing I want to highlight is they've improved some things around publishing to allow you to put API tokens in place instead of having to use text, passwords, and users. So very cool changes from Poetry. I applaud their progress. Yeah, I haven't checked it out, but I'm going to go check it out. Looks like there's been a lot of changes made. Pretty excited about the 1.0 release. This is neat. You know, people are respect all over the place. There's some people that absolutely love poetry. There's some people using pipenv. I'm a straight just using the built-in venv and all the tools around that. But yeah, whatever floats your boat, I guess. Yeah, for sure.

14:34 What do you got for us next? So Kubernetes has been a huge part of the DevOps ecosystem in the last two, three years. With the rise of containers, Kubernetes is sort of the de facto platform for running and coordinating applications across multiple machines. It's not really something I I know a lot about, but it's something I want to get to know more about.

14:55 I just found this awesome link from DigitalOcean.

14:59 They put together a Kubernetes for full stack developers curriculum.

15:01 What it does is it follows all the steps that a new user would have to learn in order to deploy applications to Kubernetes.

15:11 You start by learning Kubernetes core concepts like what's a pod, what's a deployment.

15:18 Next you'll start building modern 12-factor web applications.

15:18 And then you're going to start packaging these applications inside of containers.

15:22 Next, you're going to deploy your containerized applications on Kubernetes.

15:25 And then finally, you're going to learn how to manage all that cluster orchestration and the cluster operations.

15:30 I went through the first link, which was like an introduction to Kubernetes.

15:34 I found the material really easy to understand.

15:36 It's pretty much what you expect from DigitalOcean.

15:38 I learned how to do a lot of my operations work from DigitalOcean.

15:42 And I'm really excited that they have a lot more resources for the community.

15:46 I'm going to throw a link in the show notes to a talk I gave about introduction to Docker.

15:50 If you're ever learning how to do Docker, I think this is a really good place to get started.

15:56 Yeah.

15:57 And so just to be clear, Docker and Kubernetes are closely tied, right?

15:59 Docker is the container where you package your applications in.

16:03 And the Kubernetes is the platform that manages all your Docker eyes, your containers across multiple machines.

16:10 So this way with Docker, you can build your application with Kubernetes, you can deploy it and scale it pretty much infinitely.

16:16 Cool.

16:17 And it wasn't until I, gosh, I'm forgetting the guy's name, but I saw a talk a couple years ago about Kubernetes.

16:22 And I didn't realize that this isn't just, I mean, yes, it's intended for cloud stuff, but you can test the entire Kubernetes setup locally on a laptop even.

16:33 So that's pretty neat.

16:34 Yeah, for sure.

16:35 I'm not lying there.

16:36 Is that true?

16:37 It's like a kubectl.

16:38 Like you can have like a kubelet on your machine and pretty much do whatever you need to on the cloud locally.

16:43 Okay, cool. Definitely good for playing around. You don't have to pay for anything.

16:47 Exactly.

16:48 Well, now, so the last topic, I want to apologize to people. I'm perfectly okay with making mistakes on the show.

16:54 So on episode 159, we were going through a whole bunch of pytest plugins.

16:59 And one of the things we covered was pytest Picked.

17:02 And I incorrectly assumed that it would run.

17:06 There's a quote on the pytestPict site that says it runs the tests related to unstaged files or the current branch on the current branch according to Git.

17:15 To me that sounded like it runs the tests related to code that's changed.

17:21 But I was wrong. So Michael was right, I was wrong.

17:24 It just runs, it uses Git to find out which files have changed and if any of those are test files, it runs the tests related to those test files.

17:35 That makes sense, and that might be, if you're developing tests, that might be exactly what you want.

17:40 But if you're developing code, you might want something different, and the tool I was thinking about was pytest-Testmon, and that's a plugin for pytest also, and I'm just gonna read their blurb.

17:52 So pytest-Testmon is a pytest plugin which selects and executes only tests you need to run.

17:59 It does this by collecting dependencies between test runs and all executed code internally by using coverage, the coverage data, and comparing those dependencies against changes.

18:11 So it updates the database on each test execution, so it works independently of version control.

18:17 So that's the thing I was thinking about.

18:19 So if you're using this coverage to find out what tests are affecting different parts of your code base, very cool.

18:26 So if you're making changes, this sounds like black magic to me that I'm glad somebody else wrote it, But it does look pretty neat.

18:33 And I think I'm definitely ready to try this again.

18:36 I tried it a while ago.

18:38 And for some reason, I don't even remember what I, I don't think I had any beef with it, but I just didn't think I needed it.

18:43 But this looks exciting enough to, I'm gonna try that again.

18:46 - For large test weights, I think this would save you so much time instead of rerunning everything.

18:50 - Yeah, I mean, there's other ways to, if you know specifically what you wanna rerun.

18:54 Yeah, if you've got a whole bunch of different tests and they all run pretty fast, but you're not quite sure which ones you should run based on your code changes.

19:03 This is kind of neat.

19:04 - Awesome.

19:05 - So that's the end of our normal topics.

19:06 We didn't really get to know you a little bit, but here, so you have anything extra you want to tell us what you're up to?

19:12 - Sure, so one of my favorite things about Python is the community.

19:16 And as the organizer of Chicago Python, it's no surprise that building hyper-local communities is like one of my main passions.

19:24 And in Chicago, we have a lot of events for our members.

19:27 We have talk nights, project nights, open source sprints.

19:30 We recently started one to help people with whiteboarding coding interviews, but there's great user groups all over the world.

19:37 And I just want to include a link in the show notes so how your listeners can go out and find a local community they can be a part of.

19:45 And I also want to give a special shout out to all my fellow organizers who are listening to this podcast.

19:50 Thank you for all you do for the community.

19:51 - Yeah, I mean, I didn't know that the Chicago community was so huge.

19:56 You guys are, did you say 11 organizers?

19:59 - Yeah, we have 11 organizers.

20:00 We've been around since like 2003.

20:03 So it's like one of the OG Python user groups.

20:06 And we actually had like Matplotlib debuted at Chippy.

20:10 So a lot of historical things happening.

20:12 - Yeah, it was cool.

20:13 So I was in Chicago last week and I don't know, I posted something on Twitter like Chicago's cold.

20:18 Some people made comments, but Ali reached out and said, "Hey, if you're in town and available for a drink "or something, we should meet up." And so totally impromptu, we met and had dinner together.

20:29 It was a blast.

20:30 And yeah, talk about community.

20:32 That's one of the things I love about the community.

20:34 If I'm wherever I'm traveling, I can like try to hit up people in that area and say, "Hey, I'm in town.

20:40 "I only have like three hour window, "but is anybody available?" And usually I can get somebody to come by and we can BS or something.

20:48 And so it's great.

20:50 - We have Python friends all around the world.

20:52 - You have a note here on Pi Tennessee?

20:53 - Yeah, so Pi Tennessee is gonna be happening on March 7th and 8th.

20:56 I'm gonna be giving a talk on design patterns and tickets are available at pietennessee.org.

21:02 It's gonna be in Nashville, my third year in a row going, can't wait.

21:05 And Brian, I think I saw your name on the talk list as well.

21:09 - Oh yeah, I am going to that.

21:11 (laughing)

21:12 - Awesome.

21:13 - I'm like, I don't think, I thought I was going to Nashville.

21:16 Oh yeah, that is in Tennessee.

21:18 Yes, I failed geography.

21:20 So cool.

21:21 I've never been there, so it should be fun.

21:23 - Yeah, we'll grab some dinner or drink or something.

21:25 That'll be fun.

21:26 - So speaking of community and this podcast, I just wanna announce that the next Python PDX West meetup, the one in Hillsboro, just west of Portland, is happening January 7th.

21:38 And I'm bringing it up because Michael and I thought it'd be fun to just do a live recording of Python Bytes with everybody hanging out at the meetup.

21:46 There's also gonna be one to two other talks there and we'll have food.

21:49 So if anybody's in the Portland area, the second week in January swing by.

21:54 - Sounds like a lot of fun.

21:55 - A couple jokes for us.

21:57 So the first one is sent in from Tyler Madison.

22:02 It's just a short joke.

22:04 So two co-routines walk into a bar.

22:06 Of course it's a runtime error because bar was never awaited.

22:09 (laughing)

22:11 Async jokes.

22:12 - I got one for you.

22:15 So how many developers on a message board does it take to screw in a light bulb?

22:19 - I don't know.

22:20 answer is, why are you trying to do that?

22:22 (laughing)

22:24 - Yeah.

22:24 - For all those people that are just trying to make sure that they're making sure that you're doing things their way.

22:29 - Yeah, I hate that.

22:29 When people like, you know, somebody asks a question and before anybody, it might be a simple answer, but somebody will say, yeah, you shouldn't be doing that.

22:37 Don't do that.

22:38 - Yeah.

22:39 - This was fun.

22:40 Thank you so much for filling in for Michael and co-hosting today.

22:42 - Thanks so much for having me, Brian.

22:44 - Thank you for listening to Python Bytes.

22:46 Follow the show on Twitter @pythonbytes.

22:48 That's Python Bytes, as in B-Y-T-E-S.

22:51 And get the full show notes at pythonbytes.fm.

22:54 If you have a news item you want featured, just visit pythonbytes.fm and send it our way.

22:59 We're always on the lookout for sharing something cool.

23:01 This is Brian Okken, and on behalf of myself and Michael Kennedy, thank you for listening and sharing this podcast with your friends and colleagues.

Back to show page