

Transcript #119: Assorted files as Django ORM backends with Alkali

Recorded on Monday, Feb 25, 2019.

00:00 Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds.

00:05 This is episode 119, recorded live from PyCascades in Seattle.

00:10 [cheers]

00:12 All right, it's great to be here.

00:14 And this episode is brought to you by Datadog.

00:17 Tell you more about them later.

00:18 Right now, I have a bunch of special guests, none of whom are Brian Okken.

00:22 More about that in just a second.

00:23 But we have Trey Hunner.

00:24 Hello.

00:25 Dan Bader.

00:26 Hey, how's it going?

00:27 Eric Chou.

00:28 Yo.

00:29 conference and we thought, why not put something live together for you? Now, Brian Okken decided to punish his teeth by having a painful root canal and couldn't join us in some sort of last-minute emergency, and that's really unfortunate because he was looking forward to being here. So everybody: Brian, we miss you. We miss you, Brian! Right on. Well, let's go ahead and kick it off. I'm gonna do the first thing here, and have you guys heard of this thing called Dropbox? Yeah, a little bit. They have something to do with Python. Anyway, obviously Guido works at Dropbox. It's a huge Python center of the universe there. And what's really interesting is they're finally migrating to Python 3 and using some of the tools that Guido has personally worked on, like mypy and static typing and all of that. So that's our first item. And if you had to guess how many lines of code is the Dropbox code that you're working with, you know, that little box in your menu bar, your taskbar, that's also client-side Python, which is interesting already, but it's over a million lines of code.

01:31 So they started way back in 2015, a little hack week side project to prove whether or not maybe they could do it.

01:38 It turned out it's going to be hard, is what basically they said.

01:43 And officially they started the first half of 2017.

01:46 And the real thing that helped them do this, which I think is interesting, is mypy.

01:50 Have you guys heard of mypy?

01:51 - Yep.

01:52 - Oh yeah.

01:53 is it takes the type annotations or type hints and verifies that this function says it takes one of these and you're giving it one of the same things, like that sort of thing.
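
To illustrate the kind of check being described, here is a minimal sketch. The function `total_size` and its arguments are hypothetical and are not taken from the Dropbox code base; the point is only that a checker such as mypy compares the declared hint with what a caller passes.

```python
from typing import List


def total_size(paths: List[str]) -> int:
    """Return the combined length of all path strings."""
    return sum(len(p) for p in paths)


# Matches the annotation: a list of strings.
print(total_size(["a.txt", "notes.md"]))

# A static checker such as mypy would report an incompatible argument type
# for the call below, because an int is passed where List[str] is declared.
# Python itself does not enforce the hint at run time, so the line is left
# commented out to keep this sketch runnable.
# total_size(42)
```

The hint costs nothing at run time; the mismatch is reported when the checker is run over the source, before the code is ever executed.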

02:02 - Did Guido actually, like, I don't think he started mypy, or like somebody correct me here.

02:06 - I don't think he started it, but he definitely works on it.

02:08 - One of the original contributors, I think.

02:09 - Okay, did he start it, or like was it started for Dropbox specifically or for the Dropbox code base?

02:14 Just curious.

02:15 - Yeah, I don't know either, but I know that it was an important thing he's working on.

02:18 - I'm not sure, but I just wanna, it seems like Dropbox has been migrating away from the public clouds for a while, and they've been focusing on just getting things right.

02:27 So this is probably one of those things where they think for the long-term growth, it's going to be better than relying on somebody else's infrastructure.

02:34 - Right, absolutely, it's very interesting.

02:35 They're stepping away from some of the cloud hosting.

02:37 Everyone else is running to the cloud.

02:38 They're like, "Ah, well, we can make cloud." That's pretty interesting.

02:42 So let me throw this out for you all, co-guests and audience members and listeners.

02:47 One of the very first things they say in this article is, Well, once we were armed with mypy, the first few steps we took was to port our custom fork of Python to 3.5.

02:58 - What?

02:59 (laughing)

03:00 - That's big.

03:01 (laughing)

03:02 I'm like, wait, what?

03:03 There's a, they don't run normal Python?

03:05 What kind, they like drop Python?

03:07 What do they call it?

03:08 It's pretty cool.

03:09 - It's pretty cool.

03:10 It cross compiles to Perl.

03:11 - Yeah.

03:12 - And then they--

03:13 - Everyone does it.

03:14 Yeah, so I'll just kind of wrap this up here.

03:16 But basically this article that we're covering goes through all the steps of Dropbox moving over.

03:21 And I feel like if people are going to take the Python 3 as modern Python and other Python as legacy Python as a legitimate thing, the guy who created Python had better work at a place that uses Python 3, not Python 2.

03:34 - For sure.

03:35 - So I'm super happy to see that's moving along.

03:37 And also that Guido was a pretty big part of it.

03:39 All right, so let's see.

03:40 What's up next here?

03:41 Eric?

03:42 - Basically, I want to talk about what I feel was underserved community in Python.

03:48 I come from a network engineering background and been focusing on network automation using Python.

03:53 And I think we've gotten to a point where we're big enough to be noticeable.

03:57 Like it's actually material for the amount of community.

04:00 I mean, we have new terms such as NetDevOps or NRE, network reliability engineering, with subtle differences from site reliability engineering.

04:11 We have some popular libraries, from Netmiko to NAPALM, who's been on your show before.

04:15 And I can't even pronounce that new library, Nornir, I think, and I will have the link in the show notes.

04:21 Yeah, you know, there's a lot of free resources out there for people to practice on for either network engineer wants to learn more about Python or developers who wants to learn more about network engineering.

04:31 I think coming of age, I mean, hopefully one day, you know, we're going to have a sub culture of Python, just like the data analysis community that for network engineers.

04:42 So that's I want to bring to everybody's attention, you could do it for fun, do it for profit.

04:47 and it's a welcoming community.

04:49 - Yeah, and you linked to a bunch of resources in the show notes that people who are into that can check out and yeah, Python's a mosaic and there's so many people doing different things and here's just another part of it, right?

04:58 - Yeah, absolutely.

04:59 I mean, I'm super excited about this 'cause I think as you mentioned multiple times on your show, it's like you get started early or started easily, but you don't hit that ceiling.

05:09 I mean, I've been doing this for five years and I haven't found that ceiling yet.

05:11 It's a dot to me, so.

05:12 (laughs)

05:14 Yeah.

05:15 - Is that a sign of growth that the Python community has seen where now it makes sense to have a niche for network automation specifically?

05:21 I think people are still trying to figure out how this thing is going to go, which is with lots of changes, presents more opportunities for people.

05:28 And Python just emerged in this de facto and speaks to the versatility and the power of the language.

05:34 I think we're in that phase where we're trying to figure it out, and we just have this trending versus nobody has the right answer.

05:41 But that means at the same time, that's where the opportunity lies.

05:45 you could figure it out and could drive that direction.

05:47 And I think the developer actually has a huge advantage that everything is virtualized, everything is abstracted away from the physical.

05:56 So that's my thought at the moment.

05:57 You know, you could see that I'm not very clear either.

05:59 - I think it's super interesting that you point out how everything's abstracted and sort of cloud programmable.

06:03 That means like Python has a better chance in the network space if it's not all hardware and boxes and stuff, right?

06:08 - Yeah, for sure.

06:09 I think one of the challenges for network engineers such as myself going into the cloud is the fact that there's no longer broadcast domain.

06:17 Your NIC is actually physically attached to you.

06:19 So things that we took for granted that were fixed is no longer true.

06:23 So you get to have a network NAT gateway that's just arbitrarily attached to your virtual subnet, which you used to, I think if you work in the traditional enterprise, the first thing you do when you get a new team is you subnet it out, you give it an IP address, you subnet, but those are all virtualized nowadays.

06:41 So you still need to understand the basics, but that basic used to take years to master.

06:47 Now it's just a matter of reading a doc.

06:49 So yeah, hopefully, you guys, come say hi if you see me at AnsibleFest, at Cisco DevNet Create, at some of the Juniper events, come say hi, let's talk.

07:00 And I think we could make this, potentially make a great community out of it.

07:03 - Yeah, put Python on the wire.

07:05 - Yeah, yeah, for sure.

07:06 Buy you a Python beer.

07:09 - Yeah.

07:10 It's funny, Python really is a mosaic.

07:12 I mean, that's, I didn't understand, well, I understood a lot of the terms you were using, but what they actually mean, I don't know.

07:17 'Cause I don't need to know what they mean.

07:18 And in the space of Python that I kind of am part of, this next thing I've got is kind of related to the fact that Python's a mosaic.

07:24 It's kind of part of the web side of the mosaic of Python, which gets maybe more reputation than it deserves in the sense that there's a lot of folks using Python for the web, but it's not all you can use Python for at all.

07:37 I mean, data science is huge.

07:39 But if you have to process data, and it's not a database, and you are someone who's familiar with Django, there's a thing called Alkali that Kurt made.

07:49 I can't remember Kurt's last name.

07:50 Remember, Kurt's in the room, and we actually--

07:53 - Kurt Neufeld.

07:54 - Kurt Neufeld.

07:55 So it's funny being at conferences.

07:56 You sometimes just meet the people who end up making the things that you're using.

08:00 So Alkali I'm not using, but it looks kind of fun because I'm familiar with the Django ORM, and Alkali, it's meant to take structured data, maybe an RSS feed, maybe a CSV file, maybe JSON data, maybe some random homegrown thing that you've got on your team or in your company, and allow you to use a Django ORM-like syntax to query it and also to save it, maybe in some other format even.

08:22 So it's as if you're working with a database, but you don't actually have a database behind the scenes.

08:27 You've got some structured file.

08:28 So it kind of does that all in memory, which is fun.
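
To make the idea concrete, here is a rough sketch of an ORM-style query layer over a CSV file held in memory. This is not Alkali's actual API: the `Record` and `QuerySet` classes, the `load_csv` helper, and the sample data are all hypothetical and written from scratch with the standard library, only to show the general shape of the approach.

```python
import csv
import io
from dataclasses import dataclass
from typing import Any, Iterator, List


@dataclass
class Record:
    """One row of the structured file (hypothetical schema)."""
    title: str
    year: int


class QuerySet:
    """Tiny in-memory stand-in for an ORM query API (hypothetical)."""

    def __init__(self, records: List[Record]) -> None:
        self._records = list(records)

    def filter(self, **conditions: Any) -> "QuerySet":
        # Keep records whose attributes equal every keyword condition.
        kept = [
            r for r in self._records
            if all(getattr(r, name) == value for name, value in conditions.items())
        ]
        return QuerySet(kept)

    def order_by(self, field: str) -> "QuerySet":
        # Return a new QuerySet sorted by the named attribute.
        return QuerySet(sorted(self._records, key=lambda r: getattr(r, field)))

    def __iter__(self) -> Iterator[Record]:
        return iter(self._records)

    def __len__(self) -> int:
        return len(self._records)


def load_csv(text: str) -> QuerySet:
    """Parse CSV text into records; everything stays in memory."""
    reader = csv.DictReader(io.StringIO(text))
    return QuerySet([Record(row["title"], int(row["year"])) for row in reader])


CSV_TEXT = "title,year\nEpisode A,2018\nEpisode B,2019\nEpisode C,2019\n"
episodes = load_csv(CSV_TEXT)
titles_2019 = [r.title for r in episodes.filter(year=2019).order_by("title")]
print(titles_2019)
```

The chained `filter(...).order_by(...)` call is what gives the database-like feel, even though the only thing behind it is a list of rows.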

08:31 - Right, so maybe you're working with XML and you don't want to learn XPath, or you don't want to write regular expressions against CSV files.

08:38 - Who wants to learn XPath, man?

08:40 - Nobody.

08:41 - Rhetorical question.

08:42 - Hey man, the 90s are calling.

08:43 They want their API back.

08:46 Here's my style sheet.

08:47 - Says nobody ever.

08:48 - Yes, exactly.

08:49 So, I think this is a cool project, Kurt.

08:52 I definitely like that you can point it at even like something, an endpoint on an HTTP service and like turn that into effectively a Django database.

09:00 And I've heard that there's a branch working on indexes which will like sort of complete the performance side of things.

09:06 - Ooh, that would be really fun.

09:08 Yeah, no pressure, no pressure.

09:10 It's gonna be released tomorrow, I heard.

09:13 I'm just kidding, it's not gonna be released tomorrow.

09:14 - It's a long night for Eric.

09:15 - He's shaking his head.

09:16 - Long flight home, I don't know where he's from.

09:18 - All right, before we move on to the next one, let me just tell you about our sponsor, which makes all of this happen.

09:23 So this episode's brought to you by Datadog, and Datadog, they're really awesome.

09:27 They let you track the performance and errors and requests, not just within your Python app, but across all of your infrastructure.

09:35 So if you're doing like a Kubernetes thing, and you've got a Flask app, and it's talking to Nginx, and it's talking to PostgreSQL, you can tie all the performance of that entire system together, not just profiling your Python code, which is pretty awesome.

09:49 So check them out at pythonbytes.fm/datadog.

09:52 Get a cool free t-shirt.

09:54 You get to try it out.

09:54 It's awesome.

09:55 OK, so the next item, that's Dan.

09:57 Oh, sweet.

09:58 Yeah, so quick update here.

10:00 The CMU, Carnegie Mellon University, launched an undergrad degree in artificial intelligence.

10:06 And apparently that is the first AI degree offered by a US university.

10:10 And when Mike told me about it, I was really surprised because I thought, well, AI has kind of been like a big buzzword for a while now.

10:18 And why didn't anybody else come up with a degree before that?

10:21 But I guess it always takes a little while to do that.

10:24 And I don't really know what goes into that degree or kind of how the curriculum really differs from let's say like your average computer science degree or like a data science curriculum, but I just felt it was an interesting development.

10:38 - Yeah, we've had computer science forever.

10:40 Well, first it was like electrical engineering, but I work on computers on the software side.

10:44 And eventually I got a real degree like computer science.

10:47 And then we have like software engineering, but now I think this is a big landmark, like the first artificial intelligence, like a bachelor of artificial intelligence.

10:55 Like think of that, that's crazy.

10:56 And one of the things the Dean said is, you know, of course we'll do CS stuff, but we're also going to focus on things like computer vision, language processing, huge databases, and how to help humans make better decisions automatically.

11:09 It's pretty cool.

11:10 So I'm waiting for the day where we have an AI, get a bachelor's degree in AI, and we can call it a day, and we're done.

11:19 Or an AI teaching the bachelor's degree in AI.

11:22 Yeah, even better.

11:22 That would be so sweet.

11:24 My professor's a jerk.

11:25 [LAUGHTER]

11:26 It's written in Fortran.

11:27 [LAUGHTER]

11:29 - Yeah, so do you use Python at all?

11:31 I'm guessing you're learning Python.

11:32 - Oh God, it's gotta be like--

11:33 - It must be, right?

11:34 - It's all Java.

11:35 No, I don't know, it's gotta be Python, right?

11:37 All right, so you all might know that maybe I've been kind of on a rant about async and await and asynchronous programming lately.

11:43 And the next one, have you also heard that I've talked about GUIs?

11:46 Like I've mentioned this twice, I think, like that Python should have better GUIs.

11:50 Well, this next one is kind of like these things come together, which is awesome.

11:55 So Florian sent this over to me and it's PySide2 and Qt for Python, the Qt framework.

12:02 That has an event loop that, you know, a button gets clicked or a timer runs or something like that.

12:07 Well, somebody built some layer that you can plug that into async and await.

12:12 So you can have like async def button click handler that integrates with your other async operations happening on your GUI there.

12:20 It's pretty awesome.

12:21 There's some examples on how you do it.

12:23 It's super simple.

12:24 I linked to one about downloading some stuff and whatnot.

12:27 So, yeah, if you're doing anything with Qt and you do anything with async, then check this out.

12:31 That's really, really a nice one.

12:33 - So that one, usually, like I know, I haven't done Qt in a while, but GTK uses kind of an object-oriented event loop there, right, where it's classes.

12:40 So it's taking a class-based syntax and allowing you to use the new asyncio syntax, right?

12:45 - I think it's mixing the GUI event loop and the asyncio event loop together, because otherwise I think they would run independently.

12:53 I think you basically can't have those run on the same thread or something to that effect, right?

12:57 Like the async event loop would block the GUI loop or something to that effect.
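
As a rough illustration of why a shared loop matters, the sketch below uses plain asyncio with no Qt involved. The names `on_button_click` and `ui_ticks` are hypothetical stand-ins for a click handler and for the GUI's own periodic events; the bridging library discussed in the episode is not used here.

```python
import asyncio
from typing import List

events: List[str] = []


async def on_button_click() -> None:
    """Hypothetical click handler that awaits slow work."""
    events.append("click: start download")
    await asyncio.sleep(0.1)  # stands in for a network request
    events.append("click: download finished")


async def ui_ticks(count: int) -> None:
    """Stand-in for the GUI loop's own repaint or timer events."""
    for i in range(count):
        events.append(f"tick {i}")
        await asyncio.sleep(0.01)


async def main() -> None:
    # Both coroutines share one event loop, so the "UI" keeps ticking
    # while the handler is waiting on its slow operation.
    await asyncio.gather(on_button_click(), ui_ticks(3))


asyncio.run(main())
print(events)
```

Because the handler yields control at each `await`, the ticks are recorded while the download is still pending. A blocking call in the handler would instead hold up every other event on the same loop.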

13:01 - Cool, all right.

13:02 So the next item we've got on the list here, you know guys, we're at Python 3.7 now, 3.8 is coming out pretty soon.

13:08 So we're kind of running out of like minor number space.

13:12 I guess we could always create more, but whatever.

13:13 That's a good intro.

13:14 (laughing)

13:16 People have started thinking about, you know, what's gonna happen with Python 4.0?

13:19 Like what would be some cool features that we would really wanna see?

13:22 And so our good buddy, Anthony Shaw, wrote a really interesting blog post about five things he wants to see in Python 4.0.

13:31 And it's pretty short read, but there's some interesting ideas in here.

13:35 So we're just gonna go over those points here.

13:38 And so number one is he would love to see just-in-time compilation as a first-class feature.

13:42 So right now, you've got some alternative Python interpreters like the Pyston project, or PyPy, I guess, is like the most well-known, that actually feature just-in-time compilation, and it could bring a huge speedup compared to the plain bytecode interpreter setup that CPython uses.

13:59 And so I guess the idea would be, is there some way to bring this into core Python?

14:04 And apparently there is, and we already have this in some way, or at least we have the infrastructure to be able to plug in something like that.

14:11 - That one would be really big.

14:12 'Cause I know there are some companies that the reason they're able to use Python for what they do is PyPy.

14:17 The fact that it really speeds up with that just-in-time compilation.

14:20 - Yeah, yeah, I think it's a big one, right?

14:21 Like performance.

14:22 the more people use Python, the more relevant the whole performance story becomes for people because then it's like, yeah, it has a huge impact if you have a small improvement.

14:30 - Yeah, absolutely.

14:31 There's tons of attempts to solve this problem.

14:33 There's RustPython and there's Grumpy and there's all these different attempts on solving it.

14:37 And PyPy, like Trey said, is really awesome, but it has this limitation where when it gets to the C interop stuff, it can slow down or it doesn't necessarily work with all of them, so it kind of falls back then.

14:50 And with Pyjion and the work that Brett Cannon and those guys did, it's really awesome 'cause that's a plug-in to the normal CPython, so it wouldn't be like an alternative thing.

14:58 So yeah, I would love to see this as well.

15:00 It'd be great.

15:01 - Yeah, great idea.

15:02 All right, item number two is on the wishlist is a stable .0, like a stable 4.0 release.

15:09 - Is that a lot to ask?

15:10 - I don't know, man, you tell me.

15:11 (all laughing)

15:13 - I feel like this one, this was because of 3.0 history, right?

15:15 That there were lots of breaking changes, that the initial was kind of a rewrite the language from my understanding, although I'm not a core developer, I don't know.

15:22 - The central point of that in the blog post here is that, well, you only have one chance to make a first impression really.

15:27 And so maybe Python 3 kind of bumbled its way into life or whatever.

15:33 I think now we're super happy that we have it, but I don't actually really remember the zero release or the 0.1 release.

15:38 - I don't know if anyone does.

15:39 - Yeah, it's like, let's not talk about that.

15:40 Let's just move on.

15:42 No, I'm sure it was great.

15:43 All right, static type hinting.

15:45 I think that's a really good idea too.

15:46 I mean, you know, we've got mypy, but it's optional right now.

15:49 And it would be kind of interesting to see that integrated into CPython or the core language if this is really the path forward.

15:59 And I'm not actually sure what the roadmap says there.

16:02 Yeah, I don't know either.

16:03 It's pretty interesting.

16:04 I think static typing is super valuable.

16:06 I think having it mean something in the language, that would change the zen of Python, wouldn't it?

16:12 I mean, because it's so much about the duck typing and I don't have to worry about it.

16:15 It's like, "Whoa, compilation error.

16:17 We expected a..." IRunnable of whatever, right?

16:21 Multiple templated thing and yeah, I don't know.

16:24 I don't know about that.

16:25 - We really changed the face of the language, I think.

16:28 - Yeah, I like what he's recommending here.

16:30 I'm not so sure about the required static type hinting.

16:33 Maybe like a mode to run it where you can check it.

16:35 I mean, we have data classes which do some validation in a sense.

16:38 - You're wrong, Anthony.

16:40 No, like we're like, we're just, this is some really interesting thoughts about this because you know, what should go into it?

16:47 Because obviously it's a big release, right?

16:52 If you're talking about Python 4.0, it better be a really, really noticeable improvement.

16:56 Otherwise people are going to go like, "Oh." Which would be nice too.

17:00 If it's just a 4.0 release and there's no upgrade hump like we had from two to three, that's kind of nice too.

17:08 Paul: Right, well, and he does mention the idea of static duck typing, putting an iterator in there as opposed to a generator-specific type of thing.

17:13 But I don't know how you would really make that a truly generic thing.
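
For a sense of what the optional hints can already express along these lines, here is a small sketch. The functions `total` and `countdown` are hypothetical examples; the annotation names the behaviour required (being iterable) without naming one concrete type.

```python
from typing import Iterable, Iterator


def total(numbers: Iterable[int]) -> int:
    """Accept anything that can be iterated, not one concrete type."""
    return sum(numbers)


def countdown(start: int) -> Iterator[int]:
    """A generator function; its result also satisfies Iterable[int]."""
    while start > 0:
        yield start
        start -= 1


print(total([1, 2, 3]))      # a list
print(total((4, 5)))         # a tuple
print(total(countdown(3)))   # a generator
```

A type checker accepts all three calls because lists, tuples, and generators are each iterable, which is close in spirit to duck typing expressed statically.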

17:14 - Yeah, well, as long as we don't end up with a Python 3 death clock.

17:17 (all laughing)

17:18 It'll be in a pretty good place.

17:20 - Nice, okay, so the next item we have here is a GPU story for multiprocessing.

17:25 So I guess the idea is that a lot of workloads that people use Python for these days are actually running on GPUs.

17:33 You know, a lot of, I guess, like the deep learning stuff is all running on GPUs these days.

17:36 And so wouldn't it be cool if Python 4.0 actually had some facilities to run stuff on the GPU for like parallel computations and have it built into the language.

17:45 Wouldn't that be sweet?

17:46 It's an interesting idea for sure.

17:47 - Maybe like another decorator, like an @GPU method and you just copy it.

17:51 - And we're done.

17:52 Add some type hints and boom.

17:54 Yeah, and the last item here on this really interesting list is, number five is more community contributions.

18:00 And I think Anthony is saying that he's already seen, you know, like a lot more involvement from the larger community.

18:06 And now that CPython is hosted on GitHub and there's less barriers for people to contribute.

18:12 I guess, to the code.

18:14 And just seeing more growth in that and seeing more people involved in the actual development of CPython would be pretty sweet.

18:21 I totally agree.

18:22 What do you think, Eric?

18:23 A lot of these features, I haven't been coding long enough to have a strong opinion about one or the other.

18:28 But I think to me, obviously, optimizing for hardware, and who would say no to that?

18:34 But to me, the 4.0 story would be big in terms of this would be the first major release without having a BDFL.

18:41 And I guess we'll figure it out by then how 3.8 came about and all the PEPs, but this will be a major release where it's determined, I guess, by the committee.

18:52 So it will be kind of interesting and just see how that transition going and hopefully for the long term in 5.0, 6.0.

18:58 - I feel like even outside of the core developing team, Python naturally has had more community involvement over the years and it'd be nice to see that with the 4.0 because I mean, even this podcast, like you mentioned __pypackages__ recently, and that's not a PEP that's actually ready.

19:14 That's something, it may or may not make it into Python.

19:16 That's a discussion that normally happens, not behind closed doors, but in an open space that no one looks in, which is the core developer mailing list, whereas it's on a podcast now.

19:25 - Some random people in Portland dug it up and talked about it on the internet and all helped.

19:29 - Getting all the dirt on your Python.

19:30 - Yeah, so that's it for all of our main items.

19:33 Just a couple of quick extra ones from me.

19:35 One, I did an async webcast, which is available.

19:37 So if you want like one hour review of what async and await means and why.

19:42 I think now is the time for async in Python and you don't have to switch to Go.

19:47 It's already awesome, just use it.

19:49 So you can check that out.

19:50 I'll link that in the show notes.

19:51 And then if you happen to be somewhere near Tel Aviv or Israel at least, the first week of June, they're having PyCon Israel, which is pretty awesome.

20:00 And call for proposals is open just a couple of days ago.

20:04 So yeah, those are my extra items.

20:05 And you guys got anything else?

20:06 - Yeah, quick announcement.

20:07 We're working on a new book for Real Python.

20:10 We're going to release three real Python.

20:12 It's called the Python Basics book.

20:13 It's like a beginner's book for people who want to get into Python in the first place.

20:17 And Mike actually wrote the foreword for it.

20:20 And it's great, but it also kind of duplicates what we had said in the intro.

20:25 So that means we've got to rip out a bunch of stuff and then use this foreword as a new intro because it's so much better than what we had.

20:30 Thank you, Mike.

20:31 - You're welcome.

20:31 - And shameless plug for the book.

20:33 - Thanks for making me work.

20:34 So the only thing I have to share is that some things in my world, I have a goal for myself to write more because writing blog posts takes me so much time.

20:45 And so that's something that I'm just announcing publicly here only so that I will commit to it over the next quarter or so.

20:52 And there's some kind of big things that folks on my mailing list know about with Python Morsels, so it's gonna be coming up soon.

20:57 - Yeah, sounds great.

20:57 So I guess we gotta close this out with a joke.

21:00 So we got a whole list of jokes here and I'll just grab two for you guys and let you all see what you think here.

21:06 So why did the angry function exceed its call stack size?

21:10 It got into an argument with itself.

21:12 (audience laughing)

21:14 No, no, so.

21:15 (audience laughing)

21:17 Oh no, oh no, there's more.

21:18 (audience laughing)

21:20 But wait, why did the developer ground their child?

21:24 As in, you can't go out, you're in trouble, you stay home for the week.

21:28 They weren't telling the truthy.

21:29 And with that, I think we're gonna close it out 'cause that's, what are we gonna do with that?

21:34 All right, so Trey, Dan, Eric, thank you all for being here. - Thank you.

21:38 - And everybody, thank you so much for coming.

21:40 (audience cheering)

21:43 PyCascades was great.

21:46 Brian, we miss you, and see y'all later.

21:49 Thank you for listening to Python Bytes.

21:50 Follow the show on Twitter via @pythonbytes, that's Python Bytes as in B-Y-T-E-S.

21:56 And get the full show notes at pythonbytes.fm.

21:59 If you have a news item you want featured, just visit pythonbytes.fm and send it our way.

22:03 We're always on the lookout for sharing something cool.

22:06 On behalf of myself and Brian Okken, this is Michael Kennedy.

22:09 Thank you for listening and sharing this podcast with your friends and colleagues.
