Transcript #119: Assorted files as Django ORM backends with Alkali
00:00 Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to
00:04 your earbuds. This is episode 119, recorded live from PyCascades in Seattle.
00:09 All right, it's great to be here. And this episode is brought to you by Datadog. Tell
00:17 you more about them later. Right now, I have a bunch of special guests, none of whom are Brian
00:21 Okken. More about that in a second. But we have Trey Hunner. Hello. Dan Bader. Hey, how's it going?
00:26 Eric Chou. Yo. All right. And all of us are here at the conference and we thought, why not put
00:30 something live together for you? Now, Brian Okken decided to punish his teeth by having a painful
00:36 root canal and couldn't join us in some sort of last minute emergency. And that's really unfortunate
00:41 because he was looking forward to being here. So everybody, Brian, we miss you. We miss you, Brian.
00:45 Right on. Well, let's go ahead and kick it off. I'm going to do the first thing here. And have you guys
00:52 heard of this thing called Dropbox? Yeah, a little bit. They have something to do with Python. Anyway,
00:56 obviously, Guido works at Dropbox. It's a huge Python center of the universe there. And what's
01:02 really interesting is they're finally migrating to Python 3 and using some of the tools that Guido
01:08 has personally worked on with like mypy and static typing and all of that. So that's our first item.
01:14 And if you were to guess how many lines of code is the Dropbox code that you're working with,
01:20 you know, that little box in your menu bar, your task bar, that's also client-side Python,
01:25 which is interesting already. But it's over a million lines of code. So they started way back
01:32 in 2015, a little hack week side project to prove whether or not maybe they could do it. It turned
01:38 out it's going to be hard. That's what they basically they said. And officially they started the first
01:44 half of 2017. And the real thing that helped them do this, which I think is interesting is mypy.
01:50 Have you guys heard of mypy?
01:51 Yep.
01:52 Oh, yeah.
01:52 So mypy is, it takes the type annotations or type hints and verifies that, you know,
01:57 this function says it takes one of these and you're giving it one of the same things like that,
02:01 that sort of thing.
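A minimal sketch of the kind of check described here, assuming a made-up function `total_bytes` (not from the Dropbox codebase). The annotations state what goes in and what comes out; a checker such as mypy compares each call against them without running the program.

```python
from typing import List


def total_bytes(sizes: List[int]) -> int:
    """Sum file sizes; the annotations say: list of ints in, int out."""
    return sum(sizes)


# Consistent with the annotations, so a type checker accepts this call.
print(total_bytes([120, 340, 75]))

# A checker would flag the call below before the program ever runs,
# because a str is not a List[int]. It stays commented out because
# Python itself does not enforce annotations at runtime.
# total_bytes("not a list")
```

Saving this to a file and running `mypy` on it with the last line uncommented is one way to see the reported mismatch.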
02:02 Did Guido actually, like, I don't think he started mypy.
02:05 I don't think he started it, but he definitely works on it.
02:08 One of the original contributors, I think.
02:09 Okay. Did he start it or like, was it started for Dropbox specifically or for the Dropbox codebase?
02:14 Just curious.
02:15 Yeah, I don't know either, but I know that it was an important thing he's working on.
02:18 I'm not sure, but I just want to, it seems like Dropbox been migrating away from like the public
02:23 clouds for a while and they've been focusing on just getting things right. So this is probably one
02:28 of those things where they think for the long-term growth, it's going to be better than rely on
02:33 somebody else's infrastructure.
02:34 Right. Absolutely. It's very interesting. They're stepping away from some of the cloud hosting.
02:37 Everyone else is running to the cloud. They're like, ah, well, we can make cloud.
02:40 That's pretty interesting. So let me throw this out for you all, co-guests and audience members
02:46 and listeners. One of the very first things they say in this article is, well, once we were armed
02:51 with mypy, the first few steps we took was to port our custom fork of Python to 3.5.
02:58 What?
02:59 That's big.
03:01 I'm like, wait, what? They don't run normal Python? They drop Python? What do they call it? It's pretty
03:08 cool.
03:09 It's pretty cool. It cross compiles to Perl.
03:11 Yeah.
03:12 Everybody loves it.
03:13 Yeah. So I'll just kind of wrap this up here. But basically this article that we're covering
03:18 goes to all the steps of Dropbox moving over. And I feel like if people are going to take
03:23 the Python 3 as modern Python and other Python as legacy Python as a legitimate thing, the guy
03:30 who created Python had better work at a place that uses Python 3, not Python 2.
03:34 For sure.
03:35 So I'm super happy to see that's moving along. And also that Guido was a pretty big part of it.
03:39 All right. So let's see what's up next here. Eric?
03:41 Basically, I want to talk about what I feel was like underserved community in Python. I've come from a
03:48 network engineering background and been focusing on network automation using Python. And I think we've
03:54 gotten to a point where we're big enough to be noticeable. Like it's actually material for the
03:59 amount of community. I mean, we have new terms such as NetDevOps or NRE, you know, not to be subtle
04:05 differences from the site reliability engineering for network reliability engineering. We have some
04:11 popular libraries from Netmiko, NAPALM, who's been on your show before. And I can't even pronounce that
04:17 new library, Nornir, I think. We'll have the link in the show notes. Yeah, you know, there's a lot
04:23 of free resources out there for people to practice on for either network engineer wants to learn more
04:28 about Python or developers who wants to learn more about network engineering. I think coming of age,
04:32 I mean, hopefully one day, you know, we're going to have a subculture of Python, just like the data
04:39 analysis community that for network engineers. So that's, I want to bring it to everybody's
04:44 attention. You could do it for fun, do it for profit. And, you know, it's a welcoming community.
04:49 Yeah. And you link to a bunch of resources in the show notes that people who are into that can check
04:53 out. And yeah, Python's a mosaic and there's so many people doing different things. And here's just
04:57 another part of it, right? Yeah, absolutely. I mean, I'm super excited about this because I think,
05:02 as you mentioned multiple times on your show, it's like you get started early or started easily,
05:06 but you know, you don't hit that ceiling. I mean, I've been doing this for five years and I haven't
05:11 found that ceiling yet. It's a dot to me. So yeah. Is that a sign of growth that the Python community
05:16 has seen where now, you know, it makes sense to have a niche for network automation specifically?
05:21 I think people are still trying to figure out like how this thing's going to go, which is
05:24 with lots of changes presents more opportunities for people. And Python kind of sort of just emerged in this
05:31 de facto and speaks to the versatility and the power of the language. I think we're in that phase,
05:35 we're trying to figure it out. And we just kind of have this trending versus like nobody has the
05:41 right answer. But that means at the same time, that's where the opportunity lies. You know,
05:45 you could figure it out and could drive that direction. And I think the developer actually
05:49 has a huge advantage that everything is virtualized, everything is abstracted away from the physical.
05:56 So that's my thought at the moment. You know, you could see like, I'm not very clear either.
05:59 I think it's super interesting that you pointed out how everything's abstracted and sort of cloud
06:02 programmable. That means like Python has a better chance in the network space if it's not all hardware
06:07 and boxes and stuff, right?
06:08 Yeah, for sure. I think one of the challenges for network engineers such as myself going into the cloud
06:13 is that the fact that, you know, there's no longer broadcast domain, your net, your NIC is actually
06:18 physically attached to you. So things that we took for granted that were fixed is no longer true.
06:23 So you get to have like a network NAT gateway that's just arbitrarily attached to your virtual subnet,
06:30 which, you know, you used to, I think if you work in the traditional enterprise, like the first thing
06:35 you do when you get a new team is like, you subnet it out, you give it an IP address, you subnet.
06:39 But those are all virtualized nowadays. So you still need to understand the basics. But that basic used
06:46 to take years to master it. Now it's just a matter of reading a doc. So yeah, hopefully, you know,
06:51 you guys, you know, come say hi, if you see me at Ansible Fest, at Cisco DevNet Create at,
06:57 you know, some of the Juniper events, you know, come say hi, let's talk. And I think we could make
07:01 this potentially make a great community out of it.
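The traditional first step mentioned above, carving an address block into subnets and handing out an address, can be sketched with Python's standard `ipaddress` module. The `10.20.0.0/24` block is a hypothetical example, not something from the episode.

```python
import ipaddress

# Hypothetical address block for one team; any private range would do.
block = ipaddress.ip_network("10.20.0.0/24")

# Split the /24 into four /26 subnets.
subnets = list(block.subnets(new_prefix=26))
for subnet in subnets:
    # Subtract the network and broadcast addresses to get usable hosts.
    print(subnet, "usable hosts:", subnet.num_addresses - 2)

# Hand out the first usable address of the first subnet.
first_host = next(subnets[0].hosts())
print(first_host)
```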
07:03 Yeah, put Python on the wire.
07:05 Yeah, yeah, for sure. Buy you a Python beer.
07:08 Yeah, but it's funny, Python really is a mosaic. I mean, that's, I didn't understand. Well, I understood
07:14 a lot of the terms you were using, but what they actually mean, I don't know, because I don't need to know
07:18 what they mean. And in the space of Python that I kind of am part of, this next thing I've got is
07:23 kind of related to the fact that Python is a mosaic. It's kind of part of the web side of the mosaic of
07:27 Python, which gets maybe more reputation than it deserves in the sense that there's a lot of folks
07:34 using Python for the web, but it's not all you can use Python for at all. I mean, data science is huge.
07:39 But if you have to process data, and it's not in a database, and you are someone who's familiar with
07:45 Django, there's a thing called Alkali that Kurt made. I can't remember Kurt's last name. Remember,
07:50 Kurt's in the room, and we actually, it's Kurt Neufeld. So it's funny being at conferences,
07:56 you sometimes just meet the people who end up, you know, making the things that you're using. So
08:01 Alkali, I'm not using, but it looks kind of fun, because I'm familiar with the Django ORM.
08:05 And Alkali, it's meant to take structured data, maybe an RSS feed, maybe a CSV file,
08:10 maybe JSON data, maybe some random homegrown thing that you've got on your team or in your company,
08:15 and allow you to use a Django ORM like syntax to query it and also to save it,
08:20 maybe in some other format, even. So it's as if you're working with a database,
08:24 but you don't actually have a database behind the scenes, you've got some structured file. So it's
08:29 kind of does that all in memory, which is fun, right. So maybe you're working with XML, and you don't want to learn XPath,
08:35 or you don't want to write regular expressions against CSV files.
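To make the idea concrete, here is a small self-contained sketch of querying a CSV file in memory with a Django-style `filter()` call. This is not Alkali's API; the `Records` class, its methods, and the sample data are all hypothetical and only illustrate the general approach.

```python
import csv
import io

# Hypothetical sample data standing in for "some structured file".
CSV_TEXT = """title,year
Episode A,2018
Episode B,2019
Episode C,2019
"""


class Records:
    """Tiny in-memory collection with a Django-style filter() method."""

    def __init__(self, rows):
        self.rows = list(rows)

    @classmethod
    def from_csv(cls, text):
        # DictReader turns each CSV line into a dict keyed by the header.
        return cls(csv.DictReader(io.StringIO(text)))

    def filter(self, **conditions):
        # Keep rows where every named field equals the requested value.
        matched = [
            row for row in self.rows
            if all(row.get(field) == value for field, value in conditions.items())
        ]
        return Records(matched)

    def values_list(self, field):
        # Return one column as a plain list.
        return [row[field] for row in self.rows]


records = Records.from_csv(CSV_TEXT)
print(records.filter(year="2019").values_list("title"))
```

Because `filter()` returns another `Records` object, calls can be chained the way ORM queries usually are. Note that CSV values are read as strings, which is why the year is compared as `"2019"`.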
08:38 Who wants to learn XPath, man?
08:40 Nobody.
08:40 Hey, rhetorical question.
08:42 Hey, man, the 90s are calling, they want their API back.
08:44 Here's my style sheet says nobody ever.
08:48 Yes, exactly. So I think this is a cool project, Kurt. I definitely like that you can point it at even
08:54 like something, an endpoint on an HTTP service and like turn that into effectively a Django database.
09:00 And I've heard that there's a branch working on indexes, which will like sort of complete the
09:05 performance side of things.
09:06 Ooh, that would be really fun.
09:07 Yeah, no, no pressure. No pressure. It's going to be released tomorrow, I heard.
09:12 I'm just kidding. It's not going to be released tomorrow.
09:14 It's a long night for Eric.
09:15 You're shaking his head.
09:16 Long flight home. I don't know where you're from.
09:18 All right, before we move on to the next one, let me just tell you about our sponsor,
09:21 which makes all of this happen. So this episode is brought to you by Datadog and Datadog. They're
09:26 really awesome. They let you track the performance and errors and requests, not just within your
09:32 Python app, but across all of your infrastructure. Like, so if you're doing like a Kubernetes thing
09:36 and you've got a Flask app and it's talking to Nginx and it's talking to PostgreSQL, you can like
09:43 tie all the performance of that entire system together, not just profiling your Python code,
09:48 which is pretty awesome. So check them out at pythonbytes.fm/Datadog. Get a cool free
09:53 t-shirt. You get to try it out. It's awesome. Okay. So the next item, that's Dan.
09:57 Oh, sweet. Yeah. So a quick update here. CMU, Carnegie Mellon University, launched an undergrad
10:03 degree in artificial intelligence. And apparently that is the first AI degree offered by a US
10:10 university. And when Mike told me about it, I was really surprised because I thought, well,
10:14 AI has kind of been like a big buzzword for a while now. And why didn't anybody else come up
10:20 with a degree before that? But I guess it always takes a little while to do that. And I don't really
10:25 know what goes into that degree or kind of what, you know, how the curriculum really differs from,
10:30 let's say like your average computer science degree or like a data science curriculum. But I just felt it
10:36 was an interesting development. Yeah. I'm sure they use a lot of Python.
10:39 Computer science forever. Well, first it was like electrical engineering, but I work on computers
10:43 on the software side. And like eventually that got a real degree, like computer science.
10:47 And then we have like software engineering. But now I think this is a big landmark, like the first
10:52 artificial intelligence, like a bachelor of artificial intelligence. Like think of that. That's crazy.
10:56 And one of the things the dean said is, you know, of course we'll do CS stuff, but we're also
11:00 going to focus on things like computer vision, language processing, huge databases, and how to help
11:06 like humans make better decisions automatically. It's pretty cool.
11:10 So I'm waiting for the day where we have an AI, get a bachelor's degree in AI, just so we can call
11:17 it a day and we're done. Or an AI teaching the bachelor's degree in AI.
11:21 Yeah. Even better. That'd be so sweet.
11:23 My professor's a jerk.
11:25 It's written in Fortran.
11:27 Yeah. So do you use Python at all?
11:31 I'm guessing you're learning Python.
11:32 It must be, right?
11:34 It's all Java. No, I don't know. It's got to be Python, right?
11:36 All right. So you all might know that maybe I've been kind of on a rant about async and await
11:41 and asynchronous programming lately. And the next one, have you also heard that I've talked about
11:46 GUIs? Like I've mentioned this twice, I think, like that Python should have better GUIs.
11:50 Well, this next one is kind of like these things come together, which is awesome.
11:55 So Florian sent this over to me and it's PySide 2 and Qt for Python, the Qt framework. That has an event
12:02 loop that, you know, a button gets clicked or a timer runs or something like that. Well, somebody
12:07 built some layer that you can plug that into async and await. So you can have like async def button
12:14 click handler that integrates with your other async operations happening on your GUI there. It's pretty
12:21 awesome. There's some examples on how you do it. It's super simple. I linked to one about downloading
12:25 some stuff and whatnot. So yeah, if you're doing anything with Qt and you do anything with async,
12:31 then check this out. That's really, really a nice one.
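A rough sketch of what an `async def` click handler buys you, using only the standard `asyncio` module. No Qt is involved: the "click" is simulated by hand, and the names `download` and `on_button_clicked` are hypothetical. A real integration layer would schedule the handler when the GUI's click signal fires.

```python
import asyncio


async def download(name: str) -> str:
    # Stand-in for real network I/O; sleeping yields control to the loop.
    await asyncio.sleep(0.01)
    return f"{name} downloaded"


async def on_button_clicked(results: list) -> None:
    # An "async def" click handler: it can await other coroutines
    # without blocking whatever else the loop is running.
    results.append(await download("file.zip"))


async def main() -> list:
    results = []
    # Simulate the click by scheduling the handler as a task.
    click = asyncio.create_task(on_button_clicked(results))
    # This line runs before the download finishes, because the task
    # only starts once main() yields control to the event loop.
    results.append("loop still responsive")
    await click
    return results


outcome = asyncio.run(main())
print(outcome)
```

The point of the design is that one event loop serves both the GUI events and the awaited I/O, so the handler never freezes the interface while it waits.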
12:33 So that one, usually like I know, I haven't done Qt in a while, but GTK uses kind of an object oriented
12:38 event loop there, right? Where it's classes. So it's taking a class-based syntax and allowing you to use
12:43 the new asyncio syntax, right? I think it's mixing the GUI event loop and the asyncio event
12:49 loop together because otherwise I think they would run independently. I think you basically can't have
12:55 those run on the same thread or something to that effect, right? Like the async event loop would block
12:59 the GUI loop or something to that effect. Cool. All right. So the next item we've got on the list here,
13:04 you know, guys, we're at Python 3.7 now. 3.8 is coming out pretty soon. So we're kind of running out of
13:09 like minor number space. I guess we could always create more, but whatever. That's a good intro.
13:14 People have started thinking about, you know, what's going to happen with Python 4.0? Like what
13:19 would be some cool features that we would really want to see? And so our good buddy, Anthony Shaw,
13:24 wrote a really interesting blog post about four things he wants to see in Python 4.0. And it's pretty
13:33 short read, but there's some interesting ideas in here. So we're just going to go over those points
13:37 here. And so number one is he would love to see just in time compilation as a first class feature.
13:42 So right now, you know, you've got some alternative Python interpreters like the Pyston project,
13:47 or PyPy, I guess is like the most well known that actually feature just in time compilation. And it
13:53 could bring a huge speed up compared to like the plain like bytecode interpreter setup that CPython uses.
13:58 And so I guess the idea would be, is there some way to bring this into core Python? And apparently,
14:05 there is and we already have this in some way, or at least we have the infrastructure to be able to
14:10 plug in something like that.
14:11 That one would be really big because I know there are some companies that the reason they're able to
14:15 use Python for what they do is PyPy. The fact that it really speeds up with that just in time
14:19 compilation.
14:20 Yeah, yeah. I think it's a big one, right? Like performance. I think the more people use Python,
14:23 the more relevant the whole performance story becomes for people because then it's like,
14:27 yeah, you know, it has a huge impact if you have a small improvement.
14:30 Yeah, absolutely. There's tons of attempts to solve this problem. Like there's RustPython and
14:34 there's Grumpy and there's all these different attempts on solving. And PyPy, like Trey said,
14:39 is really awesome. But it has this limitation where like when it, it kind of, when it gets to the C
14:44 interop stuff, it can like slow down or it doesn't necessarily work with all of them. So it kind of
14:49 falls back then. And with Pyjion and the work that Brett Cannon and those guys did,
14:53 it's really awesome because that's a plugin to the normal CPython. So it wouldn't be like an
14:57 alternative thing. So yeah, I would love to see this as well. It'd be great.
15:00 Yeah. Great idea. All right. Item number two is on the wishlist is a stable 0.0, like a stable 4.0
15:08 release.
15:08 Is that a lot to ask?
15:09 I don't know, man. You tell me.
15:10 I feel like this one, this was because of 3.0 history, right? That, you know,
15:16 there were lots of breaking changes. The initial was a kind of a rewrite of the language from my
15:20 understanding, although I'm not a core developer. I don't know.
15:22 The central point of that in the blog post here is that, well, you only have one chance to make a
15:27 first impression really. And so maybe Python 3 kind of bumbled its way into life or whatever. I think
15:33 now we're super happy that we have it, but I don't actually really remember the 0 release or the 0.1
15:37 release.
15:38 I don't know if anyone does.
15:39 Yeah. It's like, let's not talk about that. Let's just move on.
15:41 No, I'm sure it was great. All right. Static type hinting. I think that's a really good idea too.
15:46 You know, we've got mypy, but it's optional right now. And it would be kind of
15:52 interesting to see that integrated into CPython or the core language if this is really the path
15:58 forward. And I'm not actually sure what the roadmap says there.
16:02 Yeah. I don't know either. It's pretty interesting. I think static typing is super valuable.
16:06 I think having it mean something in the language, that would change the Zen of Python, wouldn't it?
16:12 I mean, because it's so much about the duck typing and I don't have to worry about it. It's like,
16:16 whoa, compilation error. We expected an IEnumerable of whatever, right? Multiple templated thing. And
16:23 yeah, I don't know. I don't know about that.
16:25 Really changed the face of the language, I think.
16:28 Yeah. I like what he's recommending here. I'm not so sure about the required static type
16:32 hinting. Maybe like a mode to run it where you can check it. I mean, we have data classes,
16:36 which do some validation in a sense.
16:39 You're wrong, Anthony. No, like we're just some really interesting thoughts about this because,
16:46 you know, what should go into it? Because obviously it's a big release, right? If you're talking about
16:49 Python 4.0, it better be a really, really noticeable improvement. Otherwise, people are going to go,
16:55 like, oh, which would be nice too. I mean, if it's just a 4.0 release and everybody's kind of,
17:00 there's no upgrade hump like we had from 2 to 3, that's kind of nice too.
17:04 Right. Well, and he does mention the idea of static duck typing, putting an iterator in there as
17:08 opposed to a generator specific type of thing. But I don't know how you would really make that
17:13 a truly generic thing.
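One way to picture the static duck typing idea with today's optional hints: annotate a parameter with the behaviour it needs (anything you can loop over) rather than a concrete type. This is a sketch using the standard `typing` module, with hypothetical function names; it is not taken from the blog post.

```python
from typing import Iterable, Iterator


def squares(limit: int) -> Iterator[int]:
    # A generator function: yields values lazily.
    for n in range(limit):
        yield n * n


def total(numbers: Iterable[int]) -> int:
    # Annotated with the behaviour needed (something you can loop over),
    # not with a concrete type such as list or generator.
    return sum(numbers)


print(total([1, 2, 3]))    # a list works
print(total(squares(4)))   # so does a generator
print(total({10, 20}))     # and a set
```

A type checker accepts all three calls because each argument supports iteration, which is the duck-typing contract expressed statically.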
17:14 Yeah. Well, as long as we don't end up with a Python 3 death clock, we'll be in a pretty good place.
17:19 Nice. Okay. So the next item we have here is a GPU story for multiprocessing. So I guess the idea is
17:26 that a lot of workloads that people use Python for these days are actually running on GPUs. You know,
17:33 a lot of the, I guess, like the deep learning stuff's all running on GPUs these days. And so
17:36 wouldn't it be cool if Python 4.0 actually had some facilities to run stuff on the GPU
17:41 for like parallel computations and had it built into the language? Wouldn't that be sweet? It's an
17:46 interesting idea for sure.
17:47 Maybe like another decorator, like an at GPU.
17:50 And we're done. Add some type hints. Yeah. And the last item here on this really interesting list is
17:56 number five is more community contributions. And I think Anthony is saying that he's already
18:03 seen, you know, like a lot more involvement from the larger community. And now that CPython is
18:08 hosted on GitHub and there's less barriers for people to contribute, I guess, to the code. And
18:14 just seeing more growth in that and seeing more people involved in the actual development of
18:19 CPython would be pretty sweet. And I totally agree. What do you think, Eric?
18:22 Well, a lot of these features, I haven't been coding long enough to have a strong opinion about
18:27 one way or the other. But I think to me, obviously, you know, optimizing for hardware and who would say
18:33 no to that. But to me, the 4.0 story would be big in terms of this would be the first major release without
18:40 having a BDFL. And how we I guess it will be we'll figure it out by then how, you know, 3.8 came about and all the
18:47 PEPs. But this will be a major release where it's determined, I guess, by the committee. So it would be kind of
18:52 interesting and just see how that transition going and hopefully for the long term and 5.0, 6.0.
18:58 I feel like even outside of the core developing team, Python naturally has had more community involvement
19:04 over the years. And it'd be nice to see that with a 4.0 because I mean, even this podcast, like, you know, you
19:08 mentioned __pypackages__ recently. And that's something that that's not a PEP that's actually ready.
19:13 That's something it may or may not make it into Python. That's a discussion that normally happens
19:18 not behind closed doors, but in an open space that no one looks in, which is the core developer
19:23 mailing list, whereas it's on a podcast now.
19:24 Some random people in Portland dug it up and talked about it on the internet and all the dirt on your
19:30 Python.
19:30 Yeah. So that's it for all of our main items. Just a couple of quick extra ones for me. One,
19:35 I did an async webcast, which is available. So if you want like one hour review of what async and
19:40 await means and why I think now is the time for async and Python and you don't have to switch to
19:47 Go. It's already awesome. Like just use it. And so you can check that out. I'll link to that in
19:51 show notes. And then if you happen to be somewhere near Tel Aviv or Israel, at least the first week of
19:57 June, they're having PyCon Israel, which is pretty awesome. And call for proposals is open just a
20:03 couple of days ago. So yeah, that's, those are my extra items. And you guys got anything else?
20:06 Yeah. Quick announcement. We're working on a new book for Real Python and we're going to release
20:11 through Real Python. It's called the Python Basics book. So it's like a beginner's book for people who
20:15 want to get into Python in the first place. And Mike actually wrote the foreword for it. And it's great,
20:22 but it also kind of duplicates what we had said in the intro. So that means we've got to rip out a
20:26 bunch of stuff and then use his foreword as a new intro because it's so much better than what we had.
20:29 Thank you, Mike.
20:30 You're welcome.
20:31 Shameless plug for the book.
20:32 Thanks for making more work. So the only thing I have to share is that, you know, some things in
20:38 my world, I'm, I'm, I have a goal for myself to write more because writing blog posts takes me so
20:43 much time. And so that's, that's something that I'm, I'm just announcing publicly here only so that I
20:49 will commit to it over the next quarter or so. And there's some kind of big things that folks in my
20:54 mailing list know with Python Morsels. So it's going to be coming up soon.
20:57 Yeah. Sounds great. So I guess we got to close this out with a joke. So we got a whole list of
21:02 jokes here and I'll just grab two for you guys and, you know, let you all see what you think here.
21:06 So why did the angry function exceed its call stack size? It got into an argument with itself.
21:12 Oh no, oh no. There's more.
21:20 But wait, why did the developer ground their child? As in you can't go out, you're in trouble,
21:26 you stay home for the week. They weren't telling the truth. And with that, I think we're going to
21:31 close it out because that's what are we going to do with that? All right. So Trey, Dan, Eric,
21:37 thank you all for being here and everybody. Thank you so much for coming.
21:39 PyCascades was great. Brian, we miss you and see y'all later.
21:48 Thank you for listening to Python Bytes. Follow the show on Twitter via at Python Bytes. That's
21:53 Python Bytes as in B-Y-T-E-S. And get the full show notes at pythonbytes.fm. If you have a news
22:00 item you want featured, just visit pythonbytes.fm and send it our way. We're always on the lookout for
22:04 sharing something cool. On behalf of myself and Brian Okken, this is Michael Kennedy.
22:09 Thank you for listening and sharing this podcast with your friends and colleagues.