Brought to you by Michael and Brian - take a Talk Python course or get Brian's pytest book


Transcript #162: Retrofitting async and await into Django

Recorded on Thursday, Dec 19, 2019.

00:00 Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to

00:04 your earbuds. This is episode 162, recorded December 19, 2019. I'm Brian Okken, and today

00:11 I've got guest host Ali Sivji. He's one of the organizers of the Chicago Python Meetup,

00:17 and we met last week when I was there briefly, and we already started recording this,

00:23 but I forgot to press record. So we're starting over. So thanks, Ali, for hanging in there with

00:28 me. Thanks, Brian. Glad to be here. This episode is brought to you by Datadog. We'll talk more about

00:33 them later. But first, Ali's going to tell us some exciting things about Django. Yeah, so Django 3.0

00:40 just came out in early December. And so I really wanted to find out what was going on. I found a

00:45 talk from DjangoCon 2019 by Andrew Godwin titled Just Add Await: Retrofitting Async into Django.

00:53 And Andrew's been the one leading the implementation of asynchronous support into Django.

00:57 And he discussed what the 3.0 release signifies, also went over the implementation roadmap for

01:03 upcoming releases. Okay. Wow, that sounds fascinating. Yeah, he started by giving an overview of the

01:08 async landscape, talked about how synchronous and asynchronous code interact. A key point that he made

01:14 was that when we have asynchronous functions, they're very different than synchronous functions.

01:19 This makes it really difficult to design proper APIs. And one of the difficulties in adding

01:25 asynchronous support to Django is that this project, it's been around for a really long time.

01:29 There are a lot of people familiar with it. And for these new async implementations,

01:34 they wanted to have that same familiar feeling. And so they made a plan to implement async capabilities

01:40 in three phases. And so phase one, that was completed with the Django 3.0 release. And this phase,

01:46 it's really meant to lay a lot of the groundwork for future changes. And so Django now supports

01:51 the Asynchronous Server Gateway Interface, or ASGI. And the ORM, it's also async aware. So if you actually

01:57 call ORM code from async code, it's going to raise an exception to make sure that you're not calling the

02:03 synchronous function from your async code. You don't want to have unexpected behavior there.
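
A minimal sketch of that kind of guard, using only the standard library. The names `SyncOnlyError`, `sync_only`, and `fetch_rows` are made up for illustration and are not Django's API; the sketch only shows the idea of refusing to run blocking code while an event loop is running in the current thread.

```python
import asyncio
import functools


class SyncOnlyError(RuntimeError):
    """Hypothetical stand-in for the exception a framework would raise."""


def sync_only(func):
    """Refuse to run func while an event loop is running in this thread."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            asyncio.get_running_loop()
        except RuntimeError:
            # No running loop: we are in ordinary synchronous code.
            loop_running = False
        else:
            loop_running = True
        if loop_running:
            raise SyncOnlyError(f"{func.__name__} cannot be called from async code")
        return func(*args, **kwargs)
    return wrapper


@sync_only
def fetch_rows():
    # Pretend this is a blocking database query.
    return ["row-1", "row-2"]


async def async_view():
    # Calling blocking code directly from a coroutine trips the guard.
    return fetch_rows()


def call_from_async():
    """Run the coroutine and return the guard's message, or None if it passed."""
    try:
        asyncio.run(async_view())
    except SyncOnlyError as exc:
        return str(exc)
    return None
```

Called from plain synchronous code, `fetch_rows()` returns its rows as usual; called inside `async_view()`, it fails loudly instead of silently blocking the event loop.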

02:08 Phase two is going to be tracking with Django 3.1. And they're going to be adding async capabilities

02:13 to the core parts of the request path. This means we're going to have async views,

02:18 async handlers, and async middleware. And there's already a branch in the Django repository where this

02:25 is mostly working, they just need to add a couple of tests to make sure things are passing with all

02:30 those cases.

02:30 It sounds really well thought out.

02:32 They've put a lot of work into it. And there's a really detailed Django PEP, I think it's called a

02:36 DEP, all about it. And then finally, with phase three, that's going to make the Django ORM all async.

02:42 And Andrew called this the largest, most difficult and most unbounded part of the project.

02:47 So if we just start thinking about ORM queries, sometimes they can result in a lot of implicit

02:52 database lookups if we're not really careful. And this is where a lot of those complexities are going to

02:57 come from. I really enjoyed this talk. I think it went into the right amount of depth while being

03:01 really accessible. And I just want to call out the fantastic diagrams. It really helped improve my

03:06 understanding of Django internals. And if you check the show notes, I'm going to include a link to the

03:11 async project wiki. Andrew needs some help putting these things into Django. He's got a lot of areas

03:16 there. If you have some time, you have the expertise, please go ahead and help him out.

03:20 Yeah, awesome. I like doing calls to action on the show to try to get people involved.

03:25 And I've been just starting Django for like several years now, just dipping my toes in and out every

03:31 once in a while. And I was kind of waiting for this new version to come out. So I'll probably check

03:35 this out first. I like that idea of having, since they're gradually rolling it out, doing error

03:42 messages for if you try to do things that aren't implemented yet. So that's cool.

03:46 Yeah, it's always best to be explicit. So would you say you've been playing around with Django?

03:50 Yeah, okay. Nice segue there. So the next topic, speaking of playing, I don't know how you got into

03:59 programming, but a lot of people get into it with games. And I actually feel bad that I'm in that camp

04:08 also, because I don't know, it's just kind of cliche. But yeah, sure enough, Games by Example is a GitHub repo by

04:16 Al Sweigart and other contributors as well. Or I think there's others, but he wants more.

04:21 So the idea is these are standard IO games. He's got a collection of games

04:28 with the source code to use for, for example, programming lessons. They're all written in Python 3. And you can

04:35 just clone the repo or just view it and copy and paste, even for people that aren't

04:40 used to Git stuff. I think it's neat because I, before I even learned any of the concepts of

04:46 programming, I was programming games by copying them out of magazines and typing them into my old

04:51 TRS-80 way back in the Stone Ages. That's awesome. I didn't know what it was doing, but I could modify

04:56 things and sort of interpret it because you can kind of read BASIC. So there's some cool features of these

05:01 games that actually I would have enjoyed at the time. Some of the neat things is that they're short,

05:07 they're limited to 256 lines of code, just an arbitrary number. But that's a pretty nice,

05:13 small code size. They're all single file, single source code files with no installers. So you just

05:20 run them with Python. They'll use just the standard library. They're only using print and

05:25 input for interacting with the user, you know, there's some downsides to that, but there's some

05:31 upsides too, because they're fairly simple. He's tried to make them very well commented and very

05:36 readable. So he, he mentions that elegant and efficient is nice sometimes, but understandable

05:42 and readable is better for, for education purposes. So that's what they're planning on doing. That's

05:48 what his focus is also input validation on everything so that you can't crash Python from typing in

05:55 something weird. This is kind of cool. He made sure that every function has a docstring to describe

06:00 what it is because it really is meant to be teaching people. And I think this is pretty cool.
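
A rough sketch of the print-and-input style with input validation described here. The function name `ask_int` and the `read`/`write` parameters are assumptions for illustration, not code from the repo; they default to `input` and `print` and exist only so the loop can be exercised without a keyboard.

```python
def ask_int(prompt, low, high, read=input, write=print):
    """Keep asking until the player types a whole number from low to high."""
    while True:
        text = read(prompt).strip()
        if not text.isdecimal():
            # Covers empty input, letters, signs, and decimals like "3.5".
            write("Please type a whole number.")
            continue
        value = int(text)
        if low <= value <= high:
            return value
        write(f"Please pick a number from {low} to {high}.")


if __name__ == "__main__":
    guess = ask_int("Guess a number from 1 to 10: ", 1, 10)
    print(f"You guessed {guess}.")
```

Because every answer is checked before `int()` is called, typing something unexpected re-prompts the player instead of ending the game with a traceback.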

06:05 Yeah. Al does a lot of good things about structuring things in a way that everybody

06:09 can understand. Really love all the stuff he does.

06:11 This would be great for people to use in, I think in helping your kid out. I'm going to try to get my,

06:16 my kids running with some of this stuff to just say, Hey, watch, this is how you run it. Now play with it and

06:23 break it. He also included a to-do list of things he wants to do with the project, but hasn't done yet. And I love

06:30 that because it gives people direction if they want to help out with the project. One of the areas is testing. He wants more tests. So I'd love for people to get involved with that.

06:38 Always got to make the call out for testing. Yeah, definitely. So before we move on, I want to give a shout out to our sponsor. This episode is sponsored by Datadog, a cloud-scale monitoring platform that unifies metrics, logs, distributed traces, and more. Trace requests across service boundaries with flame graphs, correlate traces with logs and metrics, and plot the flow of traffic across multi-cloud environments with network performance monitoring.

07:07 Plus, Datadog integrates with over 350 technologies like Postgres, Redis, and Hadoop. And their tracing client auto-instruments common frameworks and libraries like Django, Tornado, Flask, and asyncio. That's cool. Get started today with a 14-day free trial at pythonbytes.fm/Datadog. Now, speaking of testing, Bulwark has some, I stole your thunder. But the next topic has some test-related stuff, right?

07:36 Yeah, so Bulwark is an open source library that allows users to easily property test their pandas data frames and makes it easy for data analysts and data scientists to write tests.

07:48 So let's just take a step back. We all know that tests are important. But tests for data, they're just a little bit different. These tests, they're not as deterministic. And they require us to think about testing just a little bit differently.

08:02 So with property tests, we're able to check that an object has a certain property. And so property tests for data frames include things like validating the shape of a data frame, checking to see if a column is within a given range, verifying that a data frame has no NaNs, and things like that.

08:22 So with Bulwark, you're able to implement these property tests as checks. And each check takes a data frame and some optional arguments. This check will make an assertion about some data frame property.

08:34 If things are good, the check's going to return the original data frame. If the check fails, the assertion error is raised, and you have a little bit more insight into what went wrong. And this is really helpful when you have a large data pipeline.
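
A minimal sketch of the check pattern described here: take the data plus optional arguments, assert a property, and hand the original data back. To stay dependency-free it uses a list of dict rows as a stand-in for a pandas data frame, and the function names are illustrative assumptions, not Bulwark's API.

```python
import math


def has_no_nans(rows, columns=None):
    """Raise AssertionError if any (selected) column holds a NaN."""
    for index, row in enumerate(rows):
        for name, value in row.items():
            if columns is not None and name not in columns:
                continue
            if isinstance(value, float) and math.isnan(value):
                raise AssertionError(f"NaN in column {name!r}, row {index}")
    return rows  # unchanged, so checks can be chained


def is_within_range(rows, column, low, high):
    """Raise AssertionError if any value in column falls outside [low, high]."""
    bad = [index for index, row in enumerate(rows)
           if not low <= row[column] <= high]
    if bad:
        raise AssertionError(f"{column!r} out of range in rows {bad}")
    return rows


readings = [{"temp": 21.5}, {"temp": 19.0}]
checked = is_within_range(has_no_nans(readings), "temp", -50, 60)
```

Returning the input untouched is what lets checks sit in the middle of a pipeline: each step either passes the data along or stops with a message naming the column and rows that failed.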

08:46 That's cool.

08:46 Yeah, it's really, really cool. And so one of the ways that Bulwark lets you implement property tests is also through decorators. And so when you're creating data pipelines, it's really useful to do them as functions. You have some input data, you perform some sort of action, it returns an output. And with Bulwark, what you can do is you can add decorators to your pipeline functions, and validate that the properties of your input data frames meet the conditions that you really want it to have.

09:12 So Bulwark has a lot of pre-built checks already in there. There's one for has certain data types, is this column monotonic, is this within a certain number of standard deviations, and it seems pretty straightforward to add new checks. And instead of stacking decorators for multiple checks, they have a special decorator, a multi-check decorator, which won't fail only on the first. It's actually going to run through them all and tell you all the ways your data either passed or did not pass.
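
A hedged sketch of the multi-check idea: one decorator runs every check against a pipeline function's input and reports all failures together instead of stopping at the first. The decorator name `multi_check`, the three check functions, and the list-of-dicts data are assumptions for illustration only and do not reproduce Bulwark's decorators.

```python
import functools


def multi_check(*checks):
    """Run every check on the first argument and report all failures at once."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(rows, *args, **kwargs):
            failures = []
            for check in checks:
                try:
                    check(rows)
                except AssertionError as exc:
                    failures.append(f"{check.__name__}: {exc}")
            if failures:
                raise AssertionError("; ".join(failures))
            return func(rows, *args, **kwargs)
        return wrapper
    return decorator


def not_empty(rows):
    if not rows:
        raise AssertionError("no rows")


def has_price(rows):
    if not all("price" in row for row in rows):
        raise AssertionError("missing 'price'")


def all_positive(rows):
    if not all(row.get("price", 0) > 0 for row in rows):
        raise AssertionError("non-positive price")


@multi_check(not_empty, has_price, all_positive)
def total(rows):
    """Pipeline step that only runs once its input passed every check."""
    return sum(row["price"] for row in rows)
```

With bad input such as `[{"qty": 1}, {"price": -1}]`, the raised error names both `has_price` and `all_positive`, which is the advantage over stacking single-check decorators.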

09:39 Oh, that's great. Wow, neat.

09:41 Yeah, you can use Bulwark for implementing unit tests. You can use them in the ETL pipeline, especially on the extract and the load steps. And Bulwark can be used during development. And they also have some options to turn assertion errors into warnings for production. I'm not really too sure how I feel about that. But that functionality is there if you want to use it.

10:00 This is cool.

10:01 This is actually coming out of Chicago Python community member Zax Rosenberg. He created this and gave a talk about it at ChiPy. So just wanted to share it with the rest of our community.

10:10 I think it's great.

10:19 And also, you know, some people like one style over another. This is a little bit different style than Great Expectations. I think it's definitely worth checking out and having more, gosh, more and more of our life is controlled by data pipelines.

10:41 Not necessarily controlled, but influenced by results based on these data pipelines. So having these checks in place to make sure that our data is just as solid as our source code. This is awesome.

10:54 Yeah, it's really good to see all these data scientists and data engineers thinking about testing in a different way.

10:59 Oh, gosh, there's some interesting contributors to this. This is great. Cool.

11:05 I got off on the test side. I could go down that rabbit hole for all day long. But let's pop out and talk about packaging a little bit.

11:13 Packaging? Have you talked about it before?

11:16 Yeah, yeah, actually. We do cover almost every packaging-related story because it is a stumbling block for a lot of people to go forward.

11:25 At some point, you become an intermediate Python developer, and how to share your code and deal with virtual environments is one of those things that you have to run into.

11:36 So I think it's kind of cool. I am not a poetry user, but I might try this new one out.

11:41 So poetry just announced poetry 1.0.0. They're no longer on zero versions. They've gotten to 1.0. Awesome.

11:49 The announcement is by Sébastien Eustace and it highlights some of the changes.

11:56 And a good caution right off the bat is they are breaking backwards compatibility with this version because the zero is often we're trying things out.

12:07 And I think it's completely fair to break. It's always fair to break with major versions, but I think this is reasonable.

12:14 So I want to highlight a few things.

12:19 I think with this version, we can really take poetry seriously.

12:24 I think that they're planning on sticking around and people that like it.

12:27 There's a lot of changes that are good.

12:30 They've added different ways to manage environments.

12:32 So within a project, you can have poetry help you coordinate multiple versions of Python within the same project.

12:39 That's pretty cool. I don't know if it had that before. That's neat.

12:42 One of the things that I want to bring up next is private indices.

12:47 And if you're working with projects within a company, this is very important to be able to have your own private, something like PyPI.

12:56 You know, one of the ways you can do that is you can have a private version of something like PyPI.

13:02 But you can also just throw a whole bunch of wheels in a directory and use that as an index as well.

13:08 And so there's some improved support: you can even specify a different index for each dependency if you want.

13:17 But you can also set up a primary and secondary or default and secondary index and have them be something other than PyPI if you want.

13:26 So this is cool.

13:26 People using poetry and other tools like this within continuous integration, sometimes it's hard to pass things around.

13:34 And so environment variables are sometimes, they're ugly, but they're useful within CI.

13:39 And so there's new environment variable support.

13:42 And then since there's all this new support for different sort of dependencies, the add command has been expanded to allow you to just add dependencies and put them in the right place with the right source and everything.

13:56 The other thing I want to highlight is they've improved some things around publishing to allow you to put API tokens in place instead of having to use plain-text passwords and usernames.

14:07 So very cool changes from poetry.

14:09 I applaud their progress.

14:11 Yeah, I haven't checked it out, but I'm going to go check it out.

14:14 Looks like there's been a lot of changes made.

14:15 Pretty excited about the 1.0 release.

14:17 This is neat.

14:18 You know, people are spread all over the place.

14:20 There's some people that absolutely love poetry.

14:22 There's some people using Pipenv.

14:24 I'm straight-up just using the built-in venv and all the tools around that.

14:30 But yeah, whatever floats your boat, I guess.

14:33 Yeah, for sure.

14:34 What do you got for us next?

14:35 So Kubernetes has been a huge part of the DevOps ecosystem in the last two, three years.

14:41 With the rise of containers, Kubernetes is sort of the de facto platform for running and coordinating applications across multiple machines.

14:49 It's not really something I know a lot about, but it's something I want to get to know more about.

14:54 And I just found this awesome link from DigitalOcean.

14:56 They put together a Kubernetes for full stack developers curriculum.

15:00 And what it does is, yeah, it follows all the steps that a new user would have to learn in order to deploy applications to Kubernetes.

15:08 So you start by learning Kubernetes core concepts, like what's a pod, what's a deployment.

15:13 Next, you'll start building modern 12-factor web applications.

15:17 And then you're going to start packaging these applications inside of containers.

15:21 Next, you're going to deploy your containerized applications on Kubernetes.

15:25 And then finally, you're going to learn how to manage all that cluster orchestration and the cluster operations.

15:30 I went through the first link, which was like an introduction to Kubernetes.

15:33 I found the material really easy to understand.

15:36 It's pretty much what you expect from DigitalOcean.

15:38 Like, I learned how to do a lot of my operations work from DigitalOcean.

15:41 And I'm really excited that they have a lot more resources for the community.

15:45 And I'm going to throw a link in the show notes to a talk I gave about introduction to Docker.

15:50 If you're ever learning how to do Docker, I think this is a really good place to get started.

15:55 Yeah, and so just to be clear, Docker and Kubernetes are closely tied, right?

15:58 Docker is the container where you package your applications in.

16:02 And the Kubernetes is the platform that manages all your Dockerized containers across multiple machines.

16:10 So this way, with Docker, you can build your application.

16:12 With Kubernetes, you can deploy it and scale it pretty much infinitely.

16:16 Cool.

16:16 And it wasn't until I'm forgetting the guy's name.

16:19 But I saw a talk a couple years ago about Kubernetes.

16:22 And I didn't realize that this isn't just I mean, yes, it's intended for cloud stuff.

16:27 But you can test the entire Kubernetes setup locally on a laptop even.

16:32 So that's pretty neat.

16:34 Yeah, for sure.

16:35 I'm not lying there.

16:36 Is that true?

16:36 It's like kubectl.

16:38 Like you can have like a Kube Lite machine and pretty much do whatever you need to on the cloud locally.

16:43 Okay, cool.

16:44 Definitely good for playing around.

16:46 You don't have to pay for anything.

16:47 Exactly.

16:48 Well, now.

16:48 So the last topic, I want to apologize to people.

16:51 I'm perfectly okay with making mistakes on the show.

16:54 So on episode 159, we were going through a whole bunch of pytest plugins.

16:59 And one of the things we covered was pytest-picked.

17:02 And I incorrectly assumed that it would run.

17:06 There's a quote on the pytest-picked site that says it runs the tests related to unstaged files or the current branch, according to Git.

17:16 To me, that sounded like it runs the tests related to code that's changed.

17:21 But I was wrong.

17:23 So Michael was right.

17:24 I was wrong.

17:24 It just runs.

17:26 It uses Git to find out which files have changed.

17:30 And if any of those are test files, it runs the tests related to those test files.
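
A small sketch of that selection step: given the changed files Git reports, keep only the ones that look like test files. It assumes input in the format of `git status --porcelain` and pytest's default file naming (`test_*.py` and `*_test.py`); the function name `changed_test_files` is hypothetical, and how the plugin itself queries Git is not shown here.

```python
from pathlib import PurePosixPath


def changed_test_files(porcelain_output):
    """Return changed paths whose file name matches pytest's default patterns."""
    picked = []
    for line in porcelain_output.splitlines():
        if len(line) < 4:
            continue
        # Porcelain lines are two status characters, a space, then the path.
        path = line[3:]
        if " -> " in path:
            # Renames are listed as "old -> new"; keep the new name.
            path = path.split(" -> ", 1)[1]
        name = PurePosixPath(path).name
        if name.endswith(".py") and (
            name.startswith("test_") or name.endswith("_test.py")
        ):
            picked.append(path)
    return picked
```

Note that a change to `src/app.py` alone selects nothing here, which matches the correction being made: only changed test files are picked, not the tests that exercise changed code.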

17:35 That makes sense.

17:36 And that might be, if you're developing tests, that might be exactly what you want.

17:40 But if you're developing code, you might want something different.

17:43 And the tool I was thinking about was pytest-testmon.

17:47 And that's a plugin for pytest also.

17:50 And I'm just going to read their blurb.

17:52 So pytest-testmon is a pytest plugin which selects and executes only tests you need to run.

17:58 It does this by collecting dependencies between tests and all executed code, internally by using coverage, the coverage data,

18:08 and comparing those dependencies against changes.

18:11 So it updates the database on each test execution.

18:14 So it works independently of version control.

18:16 So that's the thing I was thinking about.

18:19 So if you use this coverage to find out what tests are affecting different parts of your code base, very cool.

18:26 So if you're making changes, this sounds like black magic to me that I'm glad somebody else wrote it.

18:31 But it does look pretty neat.

18:33 And I think I am definitely ready to try this again.

18:36 I tried it a while ago.

18:37 And for some reason, I don't even remember what I, what, I don't think I had any beef with it.

18:41 But I just didn't think I needed it.

18:43 But this looks exciting enough to, I'm going to try that again.

18:46 For large test suites, I think this would save you so much time instead of rerunning everything.

18:50 Yeah, I mean, there's other ways to, if you know specifically what you want to rerun.

18:53 Yeah.

18:54 If you've got a whole bunch of different tests and they all run pretty fast, but you, you're not quite sure which ones you should run based on your code changes.

19:03 This is kind of neat.

19:04 Awesome.

19:04 So that's the end of our normal topics.

19:06 We didn't really get to know you a little bit, but here, so you have anything extra you want to tell us what you're up to?

19:12 Sure.

19:12 So one of my favorite things about Python is the community.

19:15 And as the organizer of Chicago Python, it's no surprise that building hyper local communities is like one of my main passions.

19:23 And in Chicago, we have a lot of events for our members.

19:26 We have talk nights, project nights, open source sprints.

19:30 We recently started one to help people with whiteboarding coding interviews, but there's great user groups all over the world.

19:37 And I just want to include a link in the show notes.

19:40 So your listeners can go out and find a local community they can be a part of.

19:45 And I also want to give a special shout out to all my fellow organizers who are listening to this podcast.

19:49 Thank you for all you do for the community.

19:51 Yeah.

19:51 I mean, I didn't know that the Chicago community was so huge.

19:56 You guys are, did you say 11 organizers?

19:59 Yeah, we have 11 organizers.

20:00 We've been around since like 2003.

20:03 So it's like one of the OG Python user groups.

20:06 And we actually had, like, Matplotlib debuted at ChiPy.

20:09 So a lot of historical things happening.

20:12 Yeah, it was cool.

20:13 So I was in Chicago last week and I don't know, I posted something on Twitter like Chicago's cold.

20:18 Some people made comments, but Ali reached out and said, hey, if you're in town and available for a drink or something, we should meet up.

20:25 And so totally impromptu.

20:27 We met and had dinner together.

20:29 It was a blast.

20:30 And yeah, talk about community.

20:32 That's one of the things I love about the community.

20:34 If I'm wherever I'm traveling, I can like try to hit up people in that area and say, hey, I'm in town.

20:40 I only have like a three-hour window, but is anybody available?

20:43 And usually I can get somebody to come by and we can BS or something.

20:48 And so it's great.

20:49 We have Python friends all around the world.

20:51 You have a note here on PyTennessee?

20:53 Yeah.

20:53 So PyTennessee is going to be happening on March 7th and 8th.

20:56 I'm going to be giving a talk on design patterns and tickets are available at PyTennessee.org.

21:01 It's going to be in Nashville.

21:02 My third year in a row going.

21:04 Can't wait.

21:05 And Brian, I think I saw your name on the talk list as well.

21:08 Oh, yeah.

21:09 I am going to that.

21:11 Awesome.

21:13 I'm like, I don't think I thought I was going to Nashville.

21:16 Oh, yeah, that is in Tennessee.

21:17 Yes, I failed geography.

21:19 So cool.

21:20 I've never been there.

21:22 So it should be fun.

21:23 Yeah, we'll grab some dinner or drink or something.

21:25 It'll be fun.

21:25 So speaking of community and this podcast, I just want to announce that the next Python

21:31 PDX West meetup, the one in Hillsboro, just west of Portland, is happening January 7th.

21:38 And I'm bringing it up because Michael and I thought it'd be fun to just do a live recording

21:42 of Python Bytes with everybody hanging out at the meetup.

21:45 There's also going to be one to two other talks there and we'll have food.

21:49 So if anybody's in the Portland area, the second week in January, swing by.

21:54 Sounds like a lot of fun.

21:55 A couple jokes for us.

21:56 So the first one is sent in from Tyler Madison.

22:01 It's just a short joke.

22:03 So two coroutines walk into a bar.

22:06 Of course, it's a runtime error because bar was never awaited.

22:09 Async jokes.

22:12 I got one for you.

22:14 So how many developers on a message board does it take to screw in a light bulb?

22:19 I don't know.

22:19 So the answer is, why are you trying to do that?

22:22 Yeah.

22:24 For all those people that are just trying to make sure that they're making sure that you're

22:27 doing things their way.

22:28 Yeah, I hate that.

22:29 People like, you know, somebody asks a question and before anybody, it might be a simple answer,

22:34 but somebody will say, yeah, you shouldn't be doing that.

22:37 Don't do that.

22:38 Yeah.

22:38 This was fun.

22:39 Thank you so much for filling in for Michael and co-hosting today.

22:42 Thanks so much for having me, Brian.

22:44 Thank you for listening to Python Bytes.

22:46 Follow the show on Twitter at Python Bytes.

22:48 That's Python Bytes as in B-Y-T-E-S.

22:51 And get the full show notes at pythonbytes.fm.

22:54 If you have a news item you want featured, just visit pythonbytes.fm and send it our way.

22:59 We're always on the lookout for sharing something cool.

23:01 This is Brian Okken.

23:02 And on behalf of myself and Michael Kennedy, thank you for listening and sharing this podcast

23:06 with your friends and colleagues.
