

Transcript #119: Assorted files as Django ORM backends with Alkali

Recorded on Monday, Feb 25, 2019.

00:00 Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to

00:04 your earbuds. This is episode 119, recorded live from PyCascades in Seattle.

00:09 All right, it's great to be here. And this episode is brought to you by Datadog. Tell

00:17 you more about them later. Right now, I have a bunch of special guests, none of whom are Brian

00:21 Okken. More about that in a second. But we have Trey Hunner. Hello. Dan Bader. Hey, how's it going?

00:26 Eric Chou. Yo. All right. And all of us are here at the conference and we thought, why not put

00:30 something live together for you? Now, Brian Okken decided to punish his teeth by having a painful

00:36 root canal and couldn't join us in some sort of last minute emergency. And that's really unfortunate

00:41 because he was looking forward to being here. So everybody, Brian, we miss you. We miss you, Brian.

00:45 Right on. Well, let's go ahead and kick it off. I'm going to do the first thing here. And have you guys

00:52 heard of this thing called Dropbox? Yeah, a little bit. They have something to do with Python. Anyway,

00:56 obviously, Guido works at Dropbox. It's a huge Python center of the universe there. And what's

01:02 really interesting is they're finally migrating to Python 3 and using some of the tools that Guido

01:08 has personally worked on with like mypy and static typing and all of that. So that's our first item.

01:14 And if you were to guess how many lines of code is the Dropbox code that you're working with,

01:20 you know, that little box in your menu bar, your task bar, that's also client-side Python,

01:25 which is interesting already. But it's over a million lines of code. So they started way back

01:32 in 2015, a little hack week side project to prove whether or not maybe they could do it. It turned

01:38 out it's going to be hard. That's basically what they said. And officially they started the first

01:44 half of 2017. And the real thing that helped them do this, which I think is interesting is mypy.

01:50 Have you guys heard of mypy?

01:51 Yep.

01:52 Oh, yeah.

01:52 So mypy is, it takes the type annotations or type hints and verifies that, you know,

01:57 this function says it takes one of these and you're giving it one of the same things like that,

02:01 that sort of thing.
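
Editor's sketch of the kind of check being described here. The function name and values are made up for illustration and are not taken from the Dropbox codebase; the point is only that the annotations state what the function takes and returns, so a checker like mypy can compare them against each call.

```python
from typing import List


def total_length(words: List[str]) -> int:
    """Return the combined length of all the strings in words."""
    return sum(len(word) for word in words)


# This call matches the annotations: a list of strings in, an int out.
print(total_length(["drop", "box"]))

# A type checker such as mypy would be expected to flag the call below,
# because an int is not a List[str]. Python itself ignores the annotations
# and would only fail once the line actually runs.
# total_length(42)
```

The annotations are optional and have no effect at run time, which is why the checking happens in a separate tool rather than in the interpreter.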

02:02 Did Guido actually, like, I don't think he started mypy.

02:05 I don't think he started it, but he definitely works on it.

02:08 One of the original contributors, I think.

02:09 Okay. Did he start it or like, was it started for Dropbox specifically or for the Dropbox codebase?

02:14 Just curious.

02:15 Yeah, I don't know either, but I know that it was an important thing he's working on.

02:18 I'm not sure, but I just want to, it seems like Dropbox has been migrating away from like the public

02:23 clouds for a while and they've been focusing on just getting things right. So this is probably one

02:28 of those things where they think for the long-term growth, it's going to be better than relying on

02:33 somebody else's infrastructure.

02:34 Right. Absolutely. It's very interesting. They're stepping away from some of the cloud hosting.

02:37 Everyone else is running to the cloud. They're like, ah, well, we can make cloud.

02:40 That's pretty interesting. So let me throw this out for you all, co-guests and audience members

02:46 and listeners. One of the very first things they say in this article is, well, once we were armed

02:51 with mypy, the first few steps we took was to port our custom fork of Python to 3.5.

02:58 What?

02:59 That's big.

03:01 I'm like, wait, what? They don't run normal Python? They drop Python? What do they call it? It's pretty

03:08 cool.

03:09 It's pretty cool. It cross compiles to Perl.

03:11 Yeah.

03:12 Everybody loves it.

03:13 Yeah. So I'll just kind of wrap this up here. But basically this article that we're covering

03:18 goes to all the steps of Dropbox moving over. And I feel like if people are going to take

03:23 the Python 3 as modern Python and other Python as legacy Python as a legitimate thing, the guy

03:30 who created Python had better work at a place that uses Python 3, not Python 2.

03:34 For sure.

03:35 So I'm super happy to see that's moving along. And also that Guido was a pretty big part of it.

03:39 All right. So let's see what's up next here. Eric?

03:41 Basically, I want to talk about what I feel was like an underserved community in Python. I've come from a

03:48 network engineering background and been focusing on network automation using Python. And I think we've

03:54 gotten to a point where we're big enough to be noticeable. Like it's actually material for the

03:59 amount of community. I mean, we have new terms such as NetDevOps or NRE, you know, not to be subtle

04:05 differences from the site reliability engineering for network reliability engineering. We have some

04:11 popular libraries from Netmiko, NAPALM, who's been on your show before. And I can't even pronounce that

04:17 new library, Nornir, I think. And we'll have the link in the show notes. Yeah, you know, there's a lot

04:23 of free resources out there for people to practice on, for either network engineers who want to learn more

04:28 about Python or developers who want to learn more about network engineering. I think coming of age,

04:32 I mean, hopefully one day, you know, we're going to have a subculture of Python, just like the data

04:39 analysis community that for network engineers. So that's, I want to bring it to everybody's

04:44 attention. You could do it for fun, do it for profit. And, you know, it's a welcoming community.

04:49 Yeah. And you link to a bunch of resources in the show notes that people who are into that can check

04:53 out. And yeah, Python's a mosaic and there's so many people doing different things. And here's just

04:57 another part of it, right? Yeah, absolutely. I mean, I'm super excited about this because I think,

05:02 as you mentioned multiple times on your show, it's like you get started early or started easily,

05:06 but you know, you don't hit that ceiling. I mean, I've been doing this for five years and I haven't

05:11 found that ceiling yet. It's a dot to me. So yeah. Is that a sign of growth that the Python community

05:16 has seen where now, you know, it makes sense to have a niche for network automation specifically?

05:21 I think people are still trying to figure out like how this thing's going to go, which is

05:24 with lots of changes presents more opportunities for people. And Python kind of sort of just emerged in this

05:31 de facto and speaks to the versatility and the power of the language. I think we're in that phase,

05:35 we're trying to figure it out. And we just kind of have this trending versus like nobody has the

05:41 right answer. But that means at the same time, that's where the opportunity lies. You know,

05:45 you could figure it out and could drive that direction. And I think the developer actually

05:49 has a huge advantage that everything is virtualized, everything is abstracted away from the physical.

05:56 So that's my thought at the moment. You know, you could see like, I'm not very clear either.

05:59 I think it's super interesting that you pointed out how everything's abstracted and sort of cloud

06:02 programmable. That means like Python has a better chance in the network space if it's not all hardware

06:07 and boxes and stuff, right?

06:08 Yeah, for sure. I think one of the challenges for network engineers such as myself going into the cloud

06:13 is the fact that, you know, there's no longer a broadcast domain, your net, your NIC is actually

06:18 physically attached to you. So things that we took for granted that were fixed is no longer true.

06:23 So you get to have like a network NAT gateway that's just arbitrarily attached to your virtual subnet,

06:30 which, you know, you used to, I think if you work in the traditional enterprise, like the first thing

06:35 you do when you get a new team is like, you subnet it out, you give it an IP address, you subnet.

06:39 But those are all virtualized nowadays. So you still need to understand the basics. But that basic used

06:46 to take years to master it. Now it's just a matter of reading a doc. So yeah, hopefully, you know,

06:51 you guys, you know, come say hi, if you see me at Ansible Fest, at Cisco DevNet Create at,

06:57 you know, some of the Juniper events, you know, come say hi, let's talk. And I think we could make

07:01 this potentially make a great community out of it.

07:03 Yeah, put Python on the wire.

07:05 Yeah, yeah, for sure. Buy you a Python beer.

07:08 Yeah, but it's funny, Python really is a mosaic. I mean, that's, I didn't understand. Well, I understood

07:14 a lot of the terms you were using, but what they actually mean, I don't know, because I don't need to know

07:18 what they mean. And in the space of Python that I kind of am part of, this next thing I've got is

07:23 kind of related to the fact that Python is a mosaic. It's kind of part of the web side of the mosaic of

07:27 Python, which gets maybe more reputation than it deserves in the sense that there's a lot of folks

07:34 using Python for the web, but it's not all you can use Python for at all. I mean, data science is huge.

07:39 But if you have to process data, and it's not in a database, and you are someone who's familiar with

07:45 Django, there's a thing called Alkali that Kurt made. I can't remember Kurt's last name. Remember,

07:50 Kurt's in the room, and we actually, it's Kurt Neufeld. So it's funny being at conferences,

07:56 you sometimes just meet the people who end up, you know, making the things that you're using. So

08:01 Alkali, I'm not using, but it looks kind of fun, because I'm familiar with the Django ORM.

08:05 And Alkali, it's meant to take structured data, maybe an RSS feed, maybe a CSV file,

08:10 maybe JSON data, maybe some random homegrown thing that you've got on your team or in your company,

08:15 and allow you to use a Django ORM like syntax to query it and also to save it,

08:20 maybe in some other format, even. So it's as if you're working with a database,

08:24 but you don't actually have a database behind the scenes, you've got some structured file. So it

08:29 kind of does that all in memory, which is fun, right. So maybe you're working with XML, and you don't want to learn XPath,

08:35 or you don't want to write regular expressions against CSV files.
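
Editor's sketch of the general idea, querying rows from a CSV file with Django-style keyword lookups entirely in memory. This is not Alkali's API: the `RowSet` class, the `LOOKUPS` table, and the sample data are all hypothetical and exist only to show what "ORM-like syntax over a structured file" can look like.

```python
import csv
import io
from typing import Any, Callable, Dict, List

# Hypothetical lookup suffixes, modelled loosely on Django's field lookups.
LOOKUPS: Dict[str, Callable[[Any, Any], bool]] = {
    "exact": lambda value, wanted: value == wanted,
    "startswith": lambda value, wanted: str(value).startswith(wanted),
    "contains": lambda value, wanted: wanted in str(value),
}


class RowSet:
    """An in-memory list of dict rows with a chainable filter()."""

    def __init__(self, rows: List[Dict[str, str]]) -> None:
        self.rows = rows

    @classmethod
    def from_csv(cls, text: str) -> "RowSet":
        # csv.DictReader turns each line into a mapping keyed by the header row.
        return cls(list(csv.DictReader(io.StringIO(text))))

    def filter(self, **conditions: Any) -> "RowSet":
        rows = self.rows
        for key, wanted in conditions.items():
            # "name__startswith" splits into field "name", lookup "startswith";
            # a bare "name" falls back to an exact match.
            field, _, lookup = key.partition("__")
            test = LOOKUPS[lookup or "exact"]
            rows = [row for row in rows if test(row[field], wanted)]
        return RowSet(rows)


CSV_TEXT = "name,language\nDjango,Python\nFlask,Python\nRails,Ruby\n"

frameworks = RowSet.from_csv(CSV_TEXT)
matches = frameworks.filter(language="Python", name__startswith="D")
print([row["name"] for row in matches.rows])
```

Because `filter()` returns another `RowSet`, calls can be chained the way querysets are, and nothing is written anywhere unless you choose to save the rows yourself.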

08:38 Who wants to learn XPath, man?

08:40 Nobody.

08:40 Hey, rhetorical question.

08:42 Hey, man, the 90s are calling, they want their API back.

08:44 Here's my style sheet says nobody ever.

08:48 Yes, exactly. So I think this is a cool project, Kurt. I definitely like that you can point it at even

08:54 like something, an endpoint on an HTTP service and like turn that into effectively a Django database.

09:00 And I've heard that there's a branch working on indexes, which will like sort of complete the

09:05 performance side of things.

09:06 Ooh, that would be really fun.

09:07 Yeah, no, no pressure. No pressure. It's going to be released tomorrow, I heard.

09:12 I'm just kidding. It's not going to be released tomorrow.

09:14 It's a long night for Kurt.

09:15 He's shaking his head.

09:16 Long flight home. I don't know where you're from.

09:18 All right, before we move on to the next one, let me just tell you about our sponsor,

09:21 which makes all of this happen. So this episode is brought to you by Datadog and Datadog. They're

09:26 really awesome. They let you track the performance and errors and requests, not just within your

09:32 Python app, but across all of your infrastructure. Like, so if you're doing like a Kubernetes thing

09:36 and you've got a Flask app and it's talking to Nginx and it's talking to PostgreSQL, you can like

09:43 tie all the performance of that entire system together, not just profiling your Python code,

09:48 which is pretty awesome. So check them out at pythonbytes.fm/Datadog. Get a cool free

09:53 t-shirt. You get to try it out. It's awesome. Okay. So the next item, that's Dan.

09:57 Oh, sweet. Yeah. So a quick update here. CMU, Carnegie Mellon University, launched an undergrad

10:03 degree in artificial intelligence. And apparently that is the first AI degree offered by a US

10:10 university. And when Mike told me about it, I was really surprised because I thought, well,

10:14 AI has kind of been like a big buzzword for a while now. And why didn't anybody else come up

10:20 with a degree before that? But I guess it always takes a little while to do that. And I don't really

10:25 know what goes into that degree or kind of what, you know, how the curriculum really differs from,

10:30 let's say like your average computer science degree or like a data science curriculum. But I just felt it

10:36 was an interesting development. Yeah. I'm sure they use a lot of Python.

10:39 Computer science forever. Well, first it was like electrical engineering, but I work on computers

10:43 on the software side. And like eventually that got a real degree, like computer science.

10:47 And then we have like software engineering. But now I think this is a big landmark, like the first

10:52 artificial intelligence, like a bachelor of artificial intelligence. Like think of that. That's crazy.

10:56 And one of the things the dean said is, you know, of course we'll do CS stuff, but we're also

11:00 going to focus on things like computer vision, language processing, huge databases, and how to help

11:06 like humans make better decisions automatically. It's pretty cool.

11:10 So I'm waiting for the day where we have an AI, get a bachelor's degree in AI, just so we can call

11:17 it a day and we're done. Or an AI teaching the bachelor's degree in AI.

11:21 Yeah. Even better. That'd be so sweet.

11:23 My professor's a jerk.

11:25 It's written in Fortran.

11:27 Yeah. So do you use Python at all?

11:31 I'm guessing you're learning Python.

11:32 It must be, right?

11:34 It's all Java. No, I don't know. It's got to be Python, right?

11:36 All right. So you all might know that maybe I've been kind of on a rant about async and await

11:41 and async programming lately. And the next one, have you also heard that I've talked about

11:46 GUIs? Like I've mentioned this twice, I think, like that Python should have better GUIs.

11:50 Well, this next one is kind of like these things come together, which is awesome.

11:55 So Florian sent this over to me and it's PySide2 and Qt for Python, the Qt framework. That has an event

12:02 loop that, you know, a button gets clicked or a timer runs or something like that. Well, somebody

12:07 built some layer that you can plug that into async and await. So you can have like async def button

12:14 click handler that integrates with your other async operations happening on your GUI there. It's pretty

12:21 awesome. There's some examples on how you do it. It's super simple. I linked to one about downloading

12:25 some stuff and whatnot. So yeah, if you're doing anything with Qt and you do anything with async,

12:31 then check this out. That's really, really a nice one.
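
Editor's sketch of the concept, an `async def` click handler that is scheduled on the asyncio event loop instead of blocking it. It uses only the standard library: `FakeButton` is a hypothetical stand-in for a GUI widget and is not a Qt or PySide2 class, and the code does not show the API of the library discussed in the episode.

```python
import asyncio
from typing import Awaitable, Callable, List


class FakeButton:
    """Hypothetical stand-in for a GUI button; not a Qt class."""

    def __init__(self) -> None:
        self._handlers: List[Callable[[], Awaitable[None]]] = []

    def on_click(self, handler: Callable[[], Awaitable[None]]) -> None:
        self._handlers.append(handler)

    def click(self) -> list:
        # Schedule each coroutine handler on the running loop and return
        # immediately, so the caller is not blocked while handlers run.
        return [asyncio.ensure_future(handler()) for handler in self._handlers]


async def main() -> List[str]:
    log: List[str] = []
    button = FakeButton()

    async def download_handler() -> None:
        log.append("download started")
        await asyncio.sleep(0.01)  # stands in for a slow network call
        log.append("download finished")

    button.on_click(download_handler)
    tasks = button.click()
    # The handler has been scheduled but has not started running yet.
    log.append("UI still responsive")
    await asyncio.gather(*tasks)
    return log


print(asyncio.run(main()))
```

In a real GUI the toolkit's own event loop would have to be bridged to asyncio so that both run together, which is the piece the project mentioned above provides.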

12:33 So that one, usually like I know, I haven't done Qt in a while, but GTK uses kind of an object oriented

12:38 event loop there, right? Where it's classes. So it's taking a class-based syntax and allowing you to use

12:43 the new asyncio syntax, right? I think it's mixing the GUI event loop and the asyncio event

12:49 loop together because otherwise I think they would run independently. I think you basically can't have

12:55 those run on the same thread or something to that effect, right? Like the async event loop would block

12:59 the GUI loop or something to that effect. Cool. All right. So the next item we've got on the list here,

13:04 you know, guys, we're at Python 3.7 now. 3.8 is coming out pretty soon. So we're kind of running out of

13:09 like minor number space. I guess we could always create more, but whatever. That's a good intro.

13:14 People have started thinking about, you know, what's going to happen with Python 4.0? Like what

13:19 would be some cool features that we would really want to see? And so our good buddy, Anthony Shaw,

13:24 wrote a really interesting blog post about four things he wants to see in Python 4.0. And it's a pretty

13:33 short read, but there's some interesting ideas in here. So we're just going to go over those points

13:37 here. And so number one is he would love to see just in time compilation as a first class feature.

13:42 So right now, you know, you've got some alternative Python interpreters like the Pyston project,

13:47 or PyPy, I guess is like the most well known that actually feature just in time compilation. And it

13:53 could bring a huge speed up compared to like the plain like bytecode interpreter setup that CPython uses.

13:58 And so I guess the idea would be, is there some way to bring this into core Python? And apparently,

14:05 there is and we already have this in some way, or at least we have the infrastructure to be able to

14:10 plug in something like that.

14:11 That one would be really big because I know there are some companies that the reason they're able to

14:15 use Python for what they do is PyPy. The fact that it really speeds up with that just in time

14:19 compilation.

14:20 Yeah, yeah. I think it's a big one, right? Like performance. I think the more people use Python,

14:23 the more relevant the whole performance story becomes for people because then it's like,

14:27 yeah, you know, it has a huge impact if you have a small improvement.

14:30 Yeah, absolutely. There's tons of attempts to solve this problem. Like there's RustPython and

14:34 there's Grumpy and there's all these different attempts on solving. And PyPy, like Trey said,

14:39 is really awesome. But it has this limitation where like when it, it kind of, when it gets to the C

14:44 interop stuff, it can like slow down or it doesn't necessarily work with all of them. So it kind of

14:49 falls back then. And with Pyjion and the work that Brett Cannon and those guys did,

14:53 it's really awesome because that's a plugin to the normal CPython. So it wouldn't be like an

14:57 alternative thing. So yeah, I would love to see this as well. It'd be great.

15:00 Yeah. Great idea. All right. Item number two on the wishlist is a stable .0, like a stable 4.0

15:08 release.

15:08 Is that a lot to ask?

15:09 I don't know, man. You tell me.

15:10 I feel like this one, this was because of 3.0 history, right? That, you know,

15:16 there were lots of breaking changes. The initial was a kind of a rewrite of the language from my

15:20 understanding, although I'm not a core developer. I don't know.

15:22 The central point of that in the blog post here is that, well, you only have one chance to make a

15:27 first impression really. And so maybe Python 3 kind of bumbled its way into life or whatever. I think

15:33 now we're super happy that we have it, but I don't actually really remember the .0 release or the .1

15:37 release.

15:38 I don't know if anyone does.

15:39 Yeah. It's like, let's not talk about that. Let's just move on.

15:41 No, I'm sure it was great. All right. Static type hinting. I think that's a really good idea too.

15:46 You know, we've got mypy, but it's optional right now. And it would be kind of

15:52 interesting to see that integrated into CPython or the core language if this is really the path

15:58 forward. And I'm not actually sure what the roadmap says there.

16:02 Yeah. I don't know either. It's pretty interesting. I think static typing is super valuable.

16:06 I think having it mean something in the language, that would change the Zen of Python, wouldn't it?

16:12 I mean, because it's so much about the duck typing and I don't have to worry about it. It's like,

16:16 whoa, compilation error. We expected an IEnumerable of whatever, right? Multiple templated thing. And

16:23 yeah, I don't know. I don't know about that.

16:25 Really changed the face of the language, I think.

16:28 Yeah. I like what he's recommending here. I'm not so sure about the required static type

16:32 hinting. Maybe like a mode to run it where you can check it. I mean, we have data classes,

16:36 which do some validation in a sense.

16:39 You're wrong, Anthony. No, like there are just some really interesting thoughts about this because,

16:46 you know, what should go into it? Because obviously it's a big release, right? If you're talking about

16:49 Python 4.0, it better be a really, really noticeable improvement. Otherwise, people are going to go,

16:55 like, oh, which would be nice too. I mean, if it's just a 4.0 release and everybody's kind of,

17:00 there's no upgrade hump like we had from 2 to 3, that's kind of nice too.

17:04 Right. Well, and he does mention the idea of static duck typing, putting an iterator in there as

17:08 opposed to a generator specific type of thing. But I don't know how you would really make that

17:13 a truly generic thing.
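
Editor's sketch of what annotating by behavior rather than by concrete type can look like with today's `typing` module. It is not taken from the blog post and is not a proposal for Python 4.0; the function names are made up for illustration.

```python
from typing import Iterable, Iterator


def squares(numbers: Iterable[int]) -> Iterator[int]:
    """Accept anything that can be looped over, not one concrete type."""
    for n in numbers:
        yield n * n


def countdown(start: int) -> Iterator[int]:
    """A generator, used here as just another iterable of ints."""
    while start > 0:
        yield start
        start -= 1


# A list, a generator, and a range all satisfy Iterable[int], so a type
# checker would be expected to accept all three calls.
print(list(squares([1, 2, 3])))
print(list(squares(countdown(3))))
print(list(squares(range(4))))
```

Annotating the parameter as `Iterable[int]` instead of `List[int]` keeps the duck typing while still giving a checker something to verify.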

17:14 Yeah. Well, as long as we don't end up with a Python 3 death clock, we'll be in a pretty good place.

17:19 Nice. Okay. So the next item we have here is a GPU story for multiprocessing. So I guess the idea is

17:26 that a lot of workloads that people use Python for these days are actually running on GPUs. You know,

17:33 a lot of the, I guess, like the deep learning stuff's all running on GPUs these days. And so

17:36 wouldn't it be cool if Python 4.0 actually had some facilities to run stuff on the GPU

17:41 for like parallel computations and had it built into the language? Wouldn't that be sweet? It's an

17:46 interesting idea for sure.

17:47 Maybe like another decorator, like an @gpu.

17:50 And we're done. Add some type hints. Yeah. And the last item here on this really interesting list is

17:56 number five is more community contributions. And I think Anthony is saying that he's already

18:03 seen, you know, like a lot more involvement from the larger community. And now that CPython is

18:08 hosted on GitHub and there's less barriers for people to contribute, I guess, to the code. And

18:14 just seeing more growth in that and seeing more people involved in the actual development of

18:19 CPython would be pretty sweet. And I totally agree. What do you think, Eric?

18:22 Well, a lot of these features, I haven't been coding long enough to have a strong opinion about

18:27 one way or the other. But I think to me, obviously, you know, optimizing for hardware and who would say

18:33 no to that. But to me, the 4.0 story would be big in terms of this would be the first major release without

18:40 having a BDFL. And how we I guess it will be we'll figure it out by then how, you know, 3.8 came about and all the

18:47 PEPs. But this will be a major release where it's determined, I guess, by the committee. So it would be kind of

18:52 interesting and just see how that transition going and hopefully for the long term and 5.0, 6.0.

18:58 I feel like even outside of the core developing team, Python naturally has had more community involvement

19:04 over the years. And it'd be nice to see that with a 4.0 because I mean, even this podcast, like, you know, you

19:08 mentioned __pypackages__ recently. And that's something that that's not a PEP that's actually ready.

19:13 That's something it may or may not make it into Python. That's a discussion that normally happens

19:18 not behind closed doors, but in an open space that no one looks in, which is the core developer

19:23 mailing list, whereas it's on a podcast now.

19:24 Some random people in Portland dug it up and talked about it on the internet and all the dirt on your

19:30 Python.

19:30 Yeah. So that's it for all of our main items. Just a couple of quick extra ones for me. One,

19:35 I did an async webcast, which is available. So if you want like one hour review of what async and

19:40 await means and why I think now is the time for async in Python, and you don't have to switch to

19:47 Go. It's already awesome. Like just use it. And so you can check that out. I'll link to that in

19:51 show notes. And then if you happen to be somewhere near Tel Aviv or Israel, at least the first week of

19:57 June, they're having PyCon Israel, which is pretty awesome. And call for proposals is open just a

20:03 couple of days ago. So yeah, that's, those are my extra items. And you guys got anything else?

20:06 Yeah. Quick announcement. We're working on a new book for Real Python and we're going to release

20:11 through Real Python. It's called the Python Basics book. So it's like a beginner's book for people who

20:15 want to get into Python in the first place. And Mike actually wrote the foreword for it. And it's great,

20:22 but it also kind of duplicates what we had said in the intro. So that means we've got to rip out a

20:26 bunch of stuff and then use his foreword as a new intro because it's so much better than what we had.

20:29 Thank you, Mike.

20:30 You're welcome.

20:31 Shameless plug for the book.

20:32 Thanks for making more work. So the only thing I have to share is that, you know, some things in

20:38 my world, I'm, I'm, I have a goal for myself to write more because writing blog posts takes me so

20:43 much time. And so that's, that's something that I'm, I'm just announcing publicly here only so that I

20:49 will commit to it over the next quarter or so. And there's some kind of big things that folks in my

20:54 mailing list know with Python Morsels. It's going to be coming up soon.

20:57 Yeah. Sounds great. So I guess we got to close this out with a joke. So we got a whole list of

21:02 jokes here and I'll just grab two for you guys and, you know, let you all see what you think here.

21:06 So why did the angry function exceed its call stack size? It got into an argument with itself.

21:12 Oh no, oh no. There's more.

21:20 But wait, why did the developer ground their child? As in you can't go out, you're in trouble,

21:26 you stay home for the week. They weren't telling the truth. And with that, I think we're going to

21:31 close it out because that's what are we going to do with that? All right. So Trey, Dan, Eric,

21:37 thank you all for being here and everybody. Thank you so much for coming.

21:39 PyCascades was great. Brian, we miss you and see y'all later.

21:48 Thank you for listening to Python Bytes. Follow the show on Twitter via @pythonbytes. That's

21:53 Python Bytes as in B-Y-T-E-S. And get the full show notes at pythonbytes.fm. If you have a news

22:00 item you want featured, just visit pythonbytes.fm and send it our way. We're always on the lookout for

22:04 sharing something cool. On behalf of myself and Brian Okken, this is Michael Kennedy.

22:09 Thank you for listening and sharing this podcast with your friends and colleagues.
