Transcript for Episode #179:
Guido van Rossum drops in on Python Bytes

Recorded on Tuesday, Apr 21, 2020.

00:00 Hello, and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds. This is Episode 179, recorded April 21, 2020. I'm Michael Kennedy. And I'm Brian. And Brian, I'm super honored to have Guido van Rossum on the show. Guido, welcome to Python Bytes. Hello. Glad to be here. Yeah, it's really great to have you here. It's gonna be wonderful to hear your opinion, your perspective on some of these things that we're sharing this week. So welcome to the show. All right. And this episode is brought to you by Datadog; check them out at pythonbytes.fm/datadog. More on that later. Brian, what do you got? What's up first? Well, I've been thinking a lot about community lately, actually. And one of the things that came out recently, this was a little bit ago, but it's still fairly new, is the Django project announced a new governance model. It's been going on, I mean, I think they've been working on it for a couple of years, since at least 2018. Some of the specifics are interesting. They had a core team, and they dissolved the core team. And they mainly have a new role called a merger, a person who has commit access but only merges pull requests. So most of the changes happen in the pull requests and the discussion that happens there. There is a technical board also that was kept to make some technical decisions if it's necessary, but apparently it hasn't been necessary for a while. I think it's interesting that they switched the governance model midstream. And the rationale around it, I think, is interesting too. The rationale is around trying to get more people contributing to it. They had their core team that hadn't really changed for a long time, and people that were set up as core people really weren't contributing much anymore.
Anyway, I just thought that was interesting, that the reasoning around changing the governance was around trying to get new people in. Yeah, I think that's a great idea. Because, you know, Django has been around for a long time, and it's a fairly stable project. So I think it's kind of hard to jump in. I mean, it's a little bit like Python itself. We know, right. I'm thinking that maybe five years in the future Python could consider a similar move. Or maybe we'll know by then from Django's experience that this was not the right move. And of course, the situation for the two projects is somewhat different. But we definitely also feel the pain of not getting enough new contributors. But we only fairly recently, like early last year, changed our governance structure completely. So it's a little early to start considering changing it again. Probably, right. Of course, we're just starting to see the outcome of the decisions and the releases that are actually going through that model. Right. Yeah. We've been working with the steering council model for, say, 16 months now. Yeah, I guess 3.8 definitely came out under that model. Yeah. The thing that Python did, I think, is kind of interesting. And I don't know if you've started it, but the notion of having more core mentors to try to mentor new core developers, I think that's an interesting thing. You can't really make people be mentors, but that's an interesting way to get more core developers on. We have a few people who are very active as mentors, in addition to being active as core devs. And it really does make a difference. Yeah, yeah. We don't have enough mentors to mentor everyone who wants to become a core dev. Yeah, yeah. So I think that's really great. I mean, it's one thing to write web apps in Django, or to write Python code; it's an entirely different thing to write Django or write Python, right? It's a very different skill set.
And so I think that mentor model is really a great bridge. So speaking of things I think are gonna be really helpful, but in a much simpler way: this is sort of a data science topic for everyone out there. One of the problems in data science is you can end up with very large data sets, complicated data. But every now and then there might be a None where you expected an integer, or there might be an empty string where you expected a date, or something like that. And it's hard to understand how complete that data is: where is it more incomplete, where is it less, and so on. So there's this cool project called missingno, which I think is "missing number", right, shortened. And the idea is it's a missing data visualization module for Python. You can see the picture in the show notes, and folks who listen to this can go back and see it in the show notes as well. It's a really cool and simple little library, but it's not just "show me a quick graph"; it actually does some pretty powerful analysis. So what you can do is, if you've got some pandas data, you can just go to it and say msno.matrix and give it a sample of your data. And it gives you these really cool graphs, like vertical bars that are either black or white or kind of zebra-striped, depending on whether or not there's missing data.
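To make the idea concrete, here is a minimal sketch of what a nullity matrix summarizes. It deliberately uses only the standard library rather than missingno itself (with the library, the call described above is usually written `msno.matrix(df)` on a pandas DataFrame). The sample rows and the helper names `is_missing`, `completeness`, and `nullity_matrix` are hypothetical, made up for illustration.

```python
# Hypothetical sample rows; None and "" stand in for missing values.
rows = [
    {"name": "Ada", "address": "1 Main St", "phone": "555-0100"},
    {"name": "Bob", "address": None, "phone": None},
    {"name": "Cy", "address": "", "phone": "555-0102"},
    {"name": "Di", "address": "4 Elm Rd", "phone": None},
]


def is_missing(value):
    """Treat None and empty strings as missing (an assumption of this sketch)."""
    return value is None or value == ""


def completeness(rows):
    """Return the fraction of non-missing values for each column."""
    columns = rows[0].keys()
    return {
        col: sum(not is_missing(row[col]) for row in rows) / len(rows)
        for col in columns
    }


def nullity_matrix(rows):
    """Return one text line per row: '#' for a present value, '.' for a missing one."""
    columns = list(rows[0].keys())
    return [
        "".join("." if is_missing(row[col]) else "#" for col in columns)
        for row in rows
    ]


print(completeness(rows))
for line in nullity_matrix(rows):
    print(line)
```

Each printed line is one row of data, so gaps that line up vertically point at a sparse column, which is roughly what the black-and-white bars in the real plot convey.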

05:00 It shows you which parts, which columns, are more complete or incomplete. And there's even a little graph on the side that tells you the likelihood or the correlation of a row being incomplete, right? Like, you might have a missing address on one line, but then another one has a missing phone number, or it could be more likely that those are both missing at the same time. There's a little graph to visualize that kind of stuff. What do you guys think? I think it's very cool. I'm not a data-anything person myself. So, yeah, to indicate how much I am not in the target audience for this module: the whole time I read your modules, I had the grouping wrong. I thought it was the "missing" data visualization module.

05:45 And I thought, well, that's kind of cool. That's sort of, they say there's something missing, and this clearly is the one that has now turned up, but

05:55 visualizing missing data, which actually I understand what that is. I've seen a spreadsheet or two, and I can actually even understand the little example chart that you pasted into the notes without understanding anything else around it. Yeah, it's wonderful, because that's why I actually think I like this and chose it: you could just look at that picture and go, oh, I basically get a sense for what this data is like. It's complete, it's not complete, it's mostly incomplete on this column, or whatever. And yeah, it's really nice. And I suspect if you had data, say, in a database or a file or something, you could probably just read that into a pandas DataFrame and then throw it at this and visualize database missing data, or file missing data, or whatever. It's really nice. Yeah, for large data sets, one of the things you've got to do when you're cleaning it up is decide what to do with the missing data. I mean, there are some Nones or whatever. There are some strategies to either fill it in with interpolated data or something, or just throw those incomplete rows completely away. But you don't really know how much data you're throwing away without visualizing it. So this is pretty cool. I think this is great. Yeah. And it has other visualizations as well. It has heat maps, which are like correlations, so the address and phone number correlated kind of thing I was talking about. It has bar charts. And the most interesting or unique visualization is the dendrogram, which I had never heard of. This is a hierarchical clustering algorithm from SciPy, actually. And it creates this kind of hierarchical tree of relationships of missing data. So if you are worried about cleaning up data, or stuff like that, or visualizing how good your data is, you could throw it to this real quick and get some great answers. Yeah, that's cool. Yeah. All right.
Well, Guido, you have been busy with the language summit recently, right? What's the news there? Yes. Well, normally, the language summit is an in-person meeting where about 50 people, who are mostly but not exclusively core devs, get together a day or two before the actual Python conference. This would have been in Pittsburgh, right? It would have been in Pittsburgh this year, right. Obviously, the conference was canceled, and the language summit was too. And then the two organizers thought, well, okay, this sounds like the kind of meeting that we can actually try to do on Zoom. You can't have a whole conference on Zoom, but you can probably have a meeting with 50 people on Zoom. And they tweaked the format a bit, because, I mean, you can't be on Zoom for an entire day. I find Zoom incredibly intense, and after an hour of zooming, I'm usually ready for a break. Yeah, all the virtual stuff takes a lot more attention. Yeah, yeah. The user interface sucks. Privacy probably sucks. But it clearly serves its purpose. So we had it spread over two different days. And then, in addition, because nobody was traveling to Pittsburgh, we spread it out in time: one day it was really early for me, so that we could also have participants from Europe, and one day it was really late for me, so that we could have some people from Australia join us. One of the organizers lives in Poland, and he was there till the end on both days. So

09:18 that's coming with slept. Yeah. So as usual, the format wasn't actually all that different. It's typically half-hour slots for various topics that are important, to either get information to core devs or, usually, also get feedback from core devs. And we pretty much stuck to that format. The one big thing that you miss, of course, is all the whispering to the guy who was sitting next to you or, during the break, quickly grabbing three other people and having a little huddle about a topic. Yeah, that's what's so powerful about in-person conferences. Yeah, we missed the entire hallway track, but it was just

10:00 good to have short presentations and Q&A sessions, and the Q&A sessions actually worked really well. There was a little tool that you can use to moderate questions, and Łukasz was running the moderation tool. Nobody was asking spam questions, so all he had to do was just click OK for every question, I think.

10:25 Yeah, that tool is much more structured than the chat channel on Zoom could be, and raising your hand on Zoom and waving doesn't really work if there are 50 people, because there's no way to see more than 16 people or so at a time. Yeah. So anyway, each day there were maybe five topics and a few miscellaneous things. Shall I just go over each day briefly, see if I can run them all off? Yeah, I would say just maybe touch really quickly on the things that you felt really make an impact going forward, potentially. Just a one-liner: the guy who originally implemented f-strings gave a talk about whether maybe all strings should become f-strings. And the general sentiment was that that would have been nice in Python 1.0 or so, but there's no way; it would just break too much code. It's gonna break too much. I totally hear that, though. Because so often I'm typing in a string, and I'm like, oh, I need to put a variable here. But I've typed 20 characters in, and I've got to go back to the beginning, but not the beginning of the line, I've got to get to the beginning of the string. Maybe we could even put the f at the end. Who knows. But yeah, I would love to see it. I totally understand you can't do that without breaking stuff. There are downsides to automatically doing it too, because curly braces are useful for all sorts of things besides formatting. So that was sort of the opening salvo. Then my two co-conspirators on the PEG parsing project gave a talk about how we're going to hopefully introduce a new parser in Python 3.9. We've been coding for almost a year now, probably. It started out as a little hobby project of mine and gradually became more serious, and more people started helping out. And the last few months, we've been doing heavy engineering work to actually prepare for the integration. But we didn't have steering council approval yet.
We made it a PEP, and we sort of said, well, this is a nice thing, but we're not going to do this unless there is clear consensus, or at least general agreement, that we are going to do this. And so very soon after the summit, the steering council actually had a meeting and approved a bunch of PEPs, and ours was one of them. And then the last two days, I've been stressing out because we wanted to get the new parser in the alpha 6 release, which is going out tomorrow. So we're now in the very last stretches of preparing for alpha 6, and we're just deleting or disabling tests that are still failing, where we know how to fix them but we just don't have the time. Right. That's exciting that this project is going to be in there. That's great. Yeah. So that's a new parser. And if all goes well, nobody will notice a thing. Ideally. What are the effects? Is it going to speed things up or make things more maintainable? It's going to open up the grammar for future changes to the language that we currently can't do because the old LL(1) parser holds us back. Okay. That's sort of the main motivation. Super. There was one interesting talk about something called HPy, which is a proposal for a new, more portable API, in particular focused on other Python implementations besides CPython. As you may know, PyPy has been struggling for over a decade with compatibility with extension modules. And the HPy proposal is basically that instead of pointers to objects, you have handles, where a handle is a pointer to a pointer to an object. And there's a whole API around handles that is equivalent to the existing API, but it allows different styles of garbage collection. For example, you could implement a garbage collector that moves objects behind your back occasionally, right? You might get a generational compacting garbage collector, because you could update the value of the pointed-to pointer without touching the handle itself, right.
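The handle idea can be sketched in a few lines of Python. This is a toy model only, not HPy's actual API: the `HandleTable` class, its method names, and the fake integer "addresses" are all assumptions made up for illustration. The point it shows is that a handle stays valid while the object it refers to is relocated.

```python
class HandleTable:
    """Toy model of handles: a handle is an index into a table of addresses."""

    def __init__(self):
        self._heap = {}            # simulated memory: address -> object
        self._slots = []           # handle (list index) -> current address
        self._next_address = 1000  # arbitrary starting address for the sketch

    def new(self, obj):
        """Store an object and return a handle to it."""
        address = self._next_address
        self._next_address += 8
        self._heap[address] = obj
        self._slots.append(address)
        return len(self._slots) - 1

    def address_of(self, handle):
        """Return where the object currently lives (for demonstration only)."""
        return self._slots[handle]

    def deref(self, handle):
        """Follow the handle to the object: two lookups instead of one."""
        return self._heap[self._slots[handle]]

    def compact(self):
        """Simulate a moving collector: relocate every object, then fix the slots."""
        new_heap = {}
        for handle, old_address in enumerate(self._slots):
            new_address = handle * 8  # pack objects tightly from address 0
            new_heap[new_address] = self._heap[old_address]
            self._slots[handle] = new_address
        self._heap = new_heap


table = HandleTable()
h = table.new("hello")
before = table.address_of(h)
table.compact()
after = table.address_of(h)
print(before, after, table.deref(h))
```

Code that only ever holds `h` never notices the move, whereas code holding the raw address `before` would now be pointing at nothing.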
Yeah, yeah, that's actually really exciting. Yeah. And it's still in early stages, I believe, but it looks pretty promising. Eric Snow gave a lightning talk, sort of a retrospective of all his work on multi-core support, which is now beginning

15:00 to conclude. Well, maybe it's too soon to call it the conclusion. But we're going to have subinterpreters with a much better API either in 3.9 or in 3.10. There's a PEP around that, 554, which will definitely be moving forward. But whether it's considered mature enough to land in 3.9 is not entirely clear. Yeah. Eric's work is very interesting. Yeah, yeah. And in 3.10, we will probably have separate GILs per subinterpreter. That is going to be a major new thing. Let's see, what else do we have? Well, so the next day, I gave a talk about the future of typing. Oh, yeah, there's one detail. You might remember that we introduced something called from __future__ import annotations, which made it so that annotations are no longer evaluated at runtime. You can still introspect them, but you'll just get the string containing the annotation expression back. Well, that's going to be the default in 3.9, most likely. There's still a little debate about that, but there was like a two-thirds preference for just making that the default in 3.9. And various people argued, effectively, that nobody should notice any difference. I'm really excited, or happy, to have typing in the language. It makes such a difference for the right use case, you know, defining the boundary of APIs or making the editor understand something better when it otherwise wouldn't. If you're maintaining tens of thousands of lines of Python code or more, type annotations really make a difference. Yeah, for sure. I still don't recommend teaching them to beginners, though. It really depends on what kind of beginners you have. If they're sort of recuperating Java programmers, maybe you should introduce them. But if they're actually a blank slate, and this is the first time they're programming ever, I wouldn't bother them with annotations. Yeah, I kind of agree with that.
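A small sketch of the behavior described for `from __future__ import annotations`. Because a future import has to be the first statement of a module, the example keeps the module source in a string and runs it with `exec`; the `greet` function and the `source` and `namespace` names are hypothetical and exist only for this illustration.

```python
import typing

# The future import must come first in a module, so the example module
# lives in a string and is executed into its own namespace.
source = '''
from __future__ import annotations

def greet(name: str, times: int = 1) -> str:
    return ", ".join([f"Hello {name}"] * times)
'''

namespace = {}
exec(source, namespace)
greet = namespace["greet"]

# With the future import, annotations are stored as plain strings...
raw_annotations = greet.__annotations__
# ...and typing.get_type_hints() evaluates them back into real objects.
resolved = typing.get_type_hints(greet)

print(raw_annotations)
print(resolved)
```

Tools that only read `__annotations__` see strings, which is the kind of difference the debate about changing the default was concerned with.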
Yeah, what's still missing for the data science world is extensions to the type system for NumPy and pandas and stuff like that. There's a design, but there are not enough people with available time to actually implement the design. And I'm sure that when you're halfway through implementation, all sorts of interesting issues with the design will crop up, so the design is not final until it's been implemented. The last two topics: Zac Hatfield-Dodds gave a very good talk about what he calls property-based testing, which really is about a tool named Hypothesis. That introduces a testing approach, which I think was first developed in academia for Haskell, that works in a completely different way than your typical unit-test-based testing. Right, the tool decides. Right, instead of you writing examples, the tool generates test cases. And I've never played with it myself, but the talk made me very excited to play around with it more. And even though it's a very different approach than unittest or pytest based testing, it will still integrate with that. I mean, you can write a unit test and then put some decorator on top of it that produces test data. And Hypothesis has all kinds of really advanced stuff for exploring enormous spaces of possible input data and quickly finding bugs. Do you think we'll get to a place where we are able to use Hypothesis for some of the testing for the standard library? That was one of the propositions that Zac made. I think it's still early for that. I think it's much easier to introduce Hypothesis in a new project, where you haven't yet written all the code and all the tests, than it is to retrofit it in a large, mature, or maybe even somewhat dementing project.
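The core idea of property-based testing can be sketched without any third-party tool. The example below is a hand-rolled stand-in built on the standard library's `random` module, not Hypothesis's API; Hypothesis does the same job far more thoroughly through its `@given` decorator, input strategies, and shrinking of failing cases. The run-length encoder and the `check_round_trip` helper are hypothetical, chosen only to have a property worth checking.

```python
import random


def run_length_encode(text):
    """Encode 'aab' as [('a', 2), ('b', 1)]."""
    encoded = []
    for ch in text:
        if encoded and encoded[-1][0] == ch:
            encoded[-1] = (ch, encoded[-1][1] + 1)
        else:
            encoded.append((ch, 1))
    return encoded


def run_length_decode(pairs):
    """Invert run_length_encode."""
    return "".join(ch * count for ch, count in pairs)


def check_round_trip(trials=500, seed=0):
    """Generate random inputs and check the property decode(encode(s)) == s.

    Returns the first counterexample found, or None if every trial passes.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    for _ in range(trials):
        length = rng.randint(0, 20)
        text = "".join(rng.choice("ab ") for _ in range(length))
        if run_length_decode(run_length_encode(text)) != text:
            return text
    return None


print(check_round_trip())
```

Instead of a handful of hand-picked examples, the test states one property and lets generated inputs hunt for a violation.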
I think it'll be a while before we'll have Hypothesis-based testing for the standard library, just like it'll be a while before we'll have annotations in the standard library rather than annotations sort of separate from the standard library. The last talk I want to highlight, and then I'm really done with this, is also a very good talk by Russell Keith-Magee about the state of BeeWare and Python for mobile. And one of his suggestions was that we adopt some of his mega patches that he's currently been maintaining for several Python releases. That would make Python lovely

20:00 compile out of the box, or nearly out of the box, for the important mobile platforms. That'd be cool. Yeah, it'd be so wonderful to have Python as an option for mobile. It really would bust open the doors and create even more growth. Many people believe that mobile platforms are obviously continuing to grow in importance and to grow in power, and we'd be crazy if we didn't support Python on those. And it may be very important for Python's very survival. Yep. Yeah, I saw the Black Swan talk that Russell Keith-Magee gave, and it was compelling. He is an amazing speaker, for sure. Yeah. Yeah, that's what I have. Great. Thank you so much for that insight. That was awesome. A lot of people don't get to see the behind the scenes; they just see what's announced when it comes out. Right. Before we move on, let me tell you about our sponsor, Datadog. This episode is brought to you by Datadog. So let me ask you a question. Do you have an app in production that's slower than you like? Is its performance all over the place, sometimes fast, sometimes slow? Now, here's the important question: do you know why? With Datadog, you will. You can troubleshoot your performance with Datadog's end-to-end tracing. Use detailed flame graphs to identify bottlenecks and latency in that finicky app of yours. So be the hero that got the app back on track at your company. Get started with a free trial over at pythonbytes.fm/datadog, and get a cool t-shirt as well. Brian, you've got another one that kind of ties into your first one, right? But it's sort of the other side of the coin. Maybe, I don't know, what's been happening in the Python world that you sort of orbit in that might make you think about these things? But tell us about it. No, I've just been thinking about community and codes of conduct and enforcement for codes of conduct. No reason really, just kind of an interesting topic.
One of the things I've been thinking about, especially when researching this, is the codes of conduct and enforcement of them and how we treat people. I first thought it was really important for open source projects. And it definitely is, because people have the option to just leave and get out of the project. So you really want to treat people well, so they stick around and are happy to be welcoming to other people. But I don't think industry is really that different. I think that people have the ability to just get another job or work on a different project. So I think these are important for industry as well. I took a look at two sets of codes of conduct and the enforcement of those. So the PSF has a code of conduct. I'm not going to read them all out, but there are things like being open, being friendly. And in there, there's a list of inappropriate behaviors as well that's covered. Now also the Django code of conduct: they also have all of these. There are differences, but when you read them, they kind of sound the same. One of the things they highlight in the Django one is be careful with your

22:43 choice of words, and they include examples of harassment, speech and exclusionary behavior that's not appropriate. One of the big differences I saw was the enforcement. So the PSF is a two-thirds majority vote enforcement sort of thing, to make sure if something happens, like they want to kick somebody out or put them on probation or something. I think that's really important. Because if you require 100% majority, and somebody who is on the team that decides is potentially part of the problem, then what do you do, right? It's really tricky. I mean, right, if people are just going to abandon a project, you would rather have just a strong majority make a decision. I also think that the PSF has probably got a larger working group on this, and it's, I guess, maybe harder to get a hold of people; maybe it's easier to get two thirds when maybe you can't even reach all 100% of the group. But anyway, the other interesting difference is the PSF Code of Conduct, I know it does cover online interaction as well as events like the conferences and meetups and stuff. But I think that maybe its focus might be more on events, whereas the Django Code of Conduct is specifically targeted towards online interactions. I would say for the PSF that, historically, events were the first place where codes of conduct were introduced. But we've been using them for online forums more and more in the past few years. Okay. One of the interesting things with the Django one is that a single person on the committee can act without collaborating with anybody else if it's an ongoing problem, or if there's a threat involved or something. They still have to go through the process of notifying everybody else. But there is an interesting thing that one person on the committee can intervene right away. I'm not saying one is better than the other; I just think it's interesting.
And I think it's important for new projects to think about not just their code of conduct, but how they're going to enforce it and what the timeline is. The Django one also includes some timelines, which is interesting. And I would really like to

22:43 make sure that projects kind of practice, maybe figure out what they're going to do if they need to enact one of these things, you know, before it becomes a problem, so they know what they're going to do. Yeah, there's a lot of stuff going on with some projects out there. So having a couple of examples and side-by-side comparisons, I think, is great. I was interested to find out, for our meetup, the Python meetup that we started, which is on hold right now, unfortunately, because of the virus and quarantine and stuff: because we're getting support from the Python Software Foundation to help pay for the meetup fees and stuff, we had to list a code of conduct on our meetup page and stuff like that. Yeah, that makes a lot of sense. But I didn't realize that. Yeah, yeah. The PSF has been doing that for a few years now. Yeah, that's really great. All right, this next one I want to cover, it goes back a ways. But I think it's really fun. And it's something that also, I think, ties together well with our special guest here. And this is an article about myths about indentation. And Guido, I picked this one because you were talking about this on Twitter just the other day. What was the motivation to throw that out there? That is a good question. I was just gonna volunteer the answer, because apparently I had a link to that article on my homepage, in some odd corner. And that's a very, very old homepage. I've moved it to GitHub Pages, but it looks like web 1.0, because it really is. I just added raw HTML. Blends in right with Netscape. Ah, so someone reported to me a broken link, which happens, like, I don't know, once every four years or so. Someone reported the broken link. Oh, wait, it wasn't even on my homepage, it was on an old blog that I can no longer... I'm very glad that that blog is still online.
But so because I got the report of the broken link, I decided, oh, I'm sure I can still find where that link used to point, and sure enough, it was there. And I thought, oh, that's actually still a neat little article. So I thought, okay, tweet of the day, or tweet of the month. Yeah, I agree. And I think it's interesting as well. And just to give you a sense of why it might have disappeared, it was one of those types of sites where the URL included a tilde-username path, like you used to get in university or whatever, way back when. Anyway, this one is myths about indentation for Python. And for people who come from a C-oriented language, I think Python can come across a little bit funky. I actually want to share a little story, just sort of my journey with it, and how I came to love this. But I think this is really interesting for people having the debate about whether significant whitespace is useful. Is it weird, is it good? I did a ton of C++ and then C# development, and then JavaScript development, so it was all about the curly brace languages, lots of symbols. And then I came to learn Python. And I loved Python right away. But it was weird to me; I felt kind of naked. Like if I'd write an if statement, I'm like, I need some little parentheses to kind of hold the code in place. Why don't they need to be there? And I need a curly brace to say when this block of code is done and whatnot. It just took a little bit of getting used to, but I knew that it was the right thing for me. Because when I went back to work on some older projects, I'm like, why are there symbols everywhere? What is all this stuff I have to keep typing? This is like a broken language. It just took a couple of weeks for me to make that switch, to feel like it was broken to go back to work in languages I'd been doing for like 10 years. So well done on the whitespace, Guido. Thanks. Yeah.
But so let's cover some of the things mentioned really quick in the article. One is that whitespace is significant in Python source code. And actually, no, not in general, is the answer. It's significant on the left side, right, so as much as you indent stuff, that really means things. But between variables, like whether you have a=7 or a = 7, doesn't matter; you can have tons of spaces in there, right? Like in other languages, spaces kind of don't matter, except for on the left. So that's cool. And also the amount of indentation doesn't really matter, right? You could have five spaces for any code suite that you want, or you could have 18, or you could go with the standard four. I recommend the four, but you know. And then also, if you have something like a list comprehension, or an array creation, or a dictionary, then all of a sudden the spacing doesn't matter anymore, right? As soon as you have an open square bracket, and then you have a bunch of stuff and then a close square bracket, spacing doesn't matter in there. So I think this is interesting to think about as folks debate that, maybe within their teams. Also, you could say it forces you to use a certain indentation style. Well, yes and no. If you wanted to write a single statement per line, then yeah. There's a cool example that they gave in the article: if 1 + 1 == 2, colon, new line, print foo, new line, print bar, new line, x = 42. You can also put them on one line with semicolons. If you're really missing your semicolons from your language, you could do that.
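The points just listed can be checked directly. The snippet below is a small sketch with made-up variable names (`a`, `b`, `same`, `squares`, `first`, `second`); it uses assignments rather than the article's print statements so the results are easy to inspect.

```python
a=7
b    =    7          # spaces around operators are not significant

if a == b:
       same = True   # seven spaces of indentation: legal, just unusual
else:
  same = False       # a different block may use a different width

squares = [
        n * n
    for n in range(4)
            ]        # inside brackets, line breaks and indentation are free

if 1 + 1 == 2: first = "foo"; second = "bar"; x = 42  # one line, semicolons

print(same, squares, first, second, x)
```

Only the leading whitespace of a statement is significant, and only relative to the other statements in the same block, which is why both odd-looking blocks above are accepted.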

22:43 The thing that's interesting here, and I think this is probably the most significant part of this article or this write-up, is that if you look at it and it looks right, then when it gets parsed, it is right. There's an example of some C code that looks visually wrong, because it's indented differently. It's going to parse, but the way you see it when you read it is not what's actually happening. And I think there was a problem like this. I think it was in either Objective-C or something with Apple in there. It was really bad. There was an infamous Apple vulnerability, I think. Yeah, it may even have been on the iPhone, where someone had added a second statement to a block. But that wasn't the block, because there were no curlies. Right. It started out with a single conditional line, like if something, indent, do the thing. And then they just indented, but they didn't put the curly braces in. And yeah, it took so long for people to find it. Because visually, it looked like what Python would actually mean, right? It looked like those two things were part of the if block. But because the whitespace didn't matter, they actually weren't. And so that's really interesting. I'm not gonna go through everything; I'll put it in the show notes. But another one is, "I just don't like it." And that's fine. People can not like it. But it has a lot of advantages. Like in that example before, if you had that wrongly indented Python code, it would not parse. It's an error to have it not look right, rather than just not be right. So it has a lot of advantages. And people can really quickly get used to not having to write all those symbols. And then you go back and you're like, this code is hard to read. It's just full of curly braces, semicolons, parentheses everywhere. I always thought that is what programming languages were built from; to have a programming language, you had to have that.
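Both halves of that claim can be sketched with the built-in `compile` function and the `ast` module. The two source strings and the `parses` helper are hypothetical; the names `ready`, `first`, and `second` never need to exist because the code is only parsed, never run.

```python
import ast

# Two statements indented under the if: both belong to the block.
well_formed = """
if ready:
    first()
    second()
"""

# The second call is dedented to a level that matches no enclosing block.
misleading = """
if ready:
    first()
  second()
"""


def parses(source):
    """Return True if the source compiles, False on an indentation error."""
    try:
        compile(source, "<example>", "exec")
        return True
    except IndentationError:
        return False


# What you see is what the parser sees: the if statement owns two statements.
if_statement = ast.parse(well_formed).body[0]
statements_in_block = len(if_statement.body)

print(parses(well_formed), parses(misleading), statements_in_block)
```

In a brace-delimited language the equivalent of the second layout would compile silently with a different meaning; here it is rejected before anything runs.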
And then once I experienced Python, and I went back, it kind of broke my mental model of the world. I'm like, you don't actually have to have those things. So why are they there? anyway? What do you think about this article? You must like it somewhat, because you hunted it down and tweeted it right? It's all news for me? Because I didn't even invent the whitespace thing for Python that was sort of handed to me on a silver platter by one of my mentors in the early 80s. Yeah, yeah, back in the ABC days. And in those days, it was an innovation. There was like one other language that had this and Knuth had once said that he thought it would be a good idea. But he, he had never actually implemented the language or even experienced the language that they implemented it, he just thought that it would be good idea. Right? Right. The only thing that was a stumbling block for me was when I first started looking at Python, the editor I was using, I think it was I think it was an Emacs or something at the time, I'm not sure what I was using. But with the c++ code I was using, I had it set up so that if I double clicked on the closing bracket, it would jump to the top of the block. And I really liked that feature. And for some reason, that's the reason why I didn't like the whitespace thing at first, like, how do I get back, but then I just went, Okay, like beginner's mind, just open mind, just embrace it and learn it as a new thing. And I didn't like a week later, I didn't even miss it. So yeah. And of course, the new editors, the newer editors, like pi charm and stuff at the bottom, they have little breadcrumbs of you know, here's the class, here's the function, here's the if here's a while, whatever. And you can, you can jump between them, just like you were talking about, but like, the entire hierarchy of like, I don't know, the tokens or whatever. Yeah, they just I tend to tend to write smaller functions now. So it's not as much of a deal.

22:43 It was probably a good thing that it was hard. I was thinking that if you needed the editor to help you find the top of the block, it must be pretty far away. Yeah.

22:43 It's 4000 lines. I hate scrolling so much. These functions are hard.

22:43 Oh, interesting. Guido, one more you want to share with us? Well, yeah, you gave me some homework, I didn't really do it. But there's, like, and of course, this has to do with parsing. And so this may be a fairly esoteric library. But if you're writing a program that sort of does some manipulation of your code, and maybe it converts four-space indents to two-space indents or three-space indents or whatever. Or maybe you're writing something like black, which is the sort of Python code reformatting tool, but you don't like the way black handles certain things. Or maybe you're writing some other thing that does analysis of source code, maybe you're writing a linter. There are a couple of tools that you can use. And it turns out that one of them is in the standard library. There's something called lib2to3, which is a little hard to pronounce. It has the digit two, and then the word "to", and then the digit three in the name. That is tricky. That is something I wrote probably over 15 years ago, or at least the core of it, which is yet another LL(1) parser, but this one's written in Python rather than in C like the original ones.

22:43 And actually black ended up using lib2to3, except I think Łukasz had one issue that he couldn't figure out how to do with black. And so he ended up vendoring a copy of lib2to3, and then butchering it a little bit, which is how these things happen. I mean, if you look at what pip vendors, that's pretty scary, but there are good reasons for that, too. But if you're writing your own, you should probably not use lib2to3. And not just because it's going to go out of style once the PEG parser arrives. There are much better tools, and the one that I discovered a few months ago, it's actually written by some folks at Facebook, mostly. It's called LibCST. And they have unique capitalization: it's a capital L, and then lowercase i-b, and then CST is all uppercase. And so it's a library for manipulating concrete syntax trees. And like lib2to3, it actually shares some code with lib2to3. I think underneath is a parsing library called parso, which itself is a butchered version of lib2to3, at least that's how it started. These tools are things that can parse Python code, but they produce a syntax tree that is the opposite of an abstract syntax tree. It's a very concrete syntax tree. And that means that every space, every comment, every bit of indentation is preserved, or at least can be recovered from the information in that syntax tree, and

22:43 contrast that with the typical abstract syntax tree, which in the end doesn't even remember where the parentheses are. Right? Right. It just says, well, here's some conditional statement. Here's the two things we're testing. Yeah, right. So this sounds much more useful if you want to do, like, a code analysis type of thing, to say this thing you're doing here, you should do it in this other way, or transform it over, but kind of preserve things like comments and style. Yeah. And so LibCST has a really sort of solid underlying model. And they thought a lot about the various transformations they want to apply, because the typical way these tools work, and lib2to3 itself started out that way as well, is you read your source code using this customized parser, it gives you a concrete syntax tree, then in that syntax tree you're actually going to make changes. You're going to systematically rename a parameter or move things around or insert. In the 2to3 world, of course, it's used to turn things like iteritems into items and iterkeys into keys. And you can make that kind of change. And so LibCST also supports that, but it sort of has a slightly better API, because 15 years ago, when I started lib2to3, I didn't realize what an important tool it was going to be, and in some places the way the whitespace is attached to nodes is exactly backwards from the way that is the most convenient to think about it and work with it. Alright, cool. Well, it sounds like it'd be really helpful for people building tools like black or looking at code analysis and stuff. Right. Łukasz had a, I think it was a 2019 PyCon talk, where he described how black uses both concrete syntax trees and the abstract syntax tree. It's a pretty fascinating talk, a very low-level dive into these concepts.
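The difference between the two kinds of tree can be sketched with the standard library alone. This is a rough illustration under stated assumptions, not LibCST's API and not black's actual implementation: it uses `ast` to show what an abstract tree forgets and `tokenize` as a stand-in for the concrete level where comments survive. The helper names `same_ast` and `comments` are hypothetical.

```python
import ast
import io
import tokenize

# Two spellings of the same statement: one with redundant parentheses and a
# comment, one reformatted without them.
original = "total = (1 + 2)  # keep parentheses for clarity\n"
reformatted = "total = 1 + 2\n"
changed = "total = 1 - 2\n"


def same_ast(first, second):
    """Compare two sources by their abstract syntax trees.

    ast.dump omits line and column attributes by default, so layout,
    comments, and redundant parentheses do not affect the result.
    """
    return ast.dump(ast.parse(first)) == ast.dump(ast.parse(second))


def comments(source):
    """Return the comment tokens in source; these never reach the AST."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [tok.string for tok in tokens if tok.type == tokenize.COMMENT]
```

Here `same_ast(original, reformatted)` is True even though the parentheses and the comment are gone, while `same_ast(original, changed)` is False. Comparing abstract trees before and after is one simple way to check that a formatting-only rewrite left the program's meaning alone; a real tool would need to handle more cases than this sketch does.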
It wasn't until I watched that talk that I realized that black compares the before and after abstract syntax trees to make sure that your code is guaranteed to run the same. So you don't really have to test for that, he's already testing for it. So that's pretty interesting. Yeah, that's very cool. That is a very neat feature. And it's actually important, I think, in general, for people who are doing transformations to have some abstract way of double-checking that your transformation left things in a decent state. That's cool. Well, thanks for LibCST, Guido. That's a great one. Now that's it for our main topics. So just really quick things at the end that I just want to throw out there for people. One, Adam, who goes by codependent coder on Twitter, sent a message over and said, hey, Django no longer supports Python 2 at all, which is pretty awesome. Because 1.11 has left long-term support, leaving only 2.2 onward, which has only Python 3 support. So yay for modern Python making its way through. That's good. And then last time we talked about "90% of coding is googling, and that's okay." Or it's not, and we didn't really feel like that was our experience, right, as people who have been around for a while. But I gotta tell you, this

22:43 last week I've been doing nothing but pandas, Altair visualization, Jupyter notebooks and graphics. Because I'm building, like, a whole set of dashboards for the Talk Python courses and whatnot. And, like, basically the dashboards that I showed a while ago. I googled a lot, a whole lot. Yeah, but that's the thing. It was like a two or three day blip of, wow, I'm googling like 25, 30% of my time, because I don't know anything about these things. And how do I get this thing to line up with that bar? But now I'm back to just kind of mostly not doing that anymore, even after a few days. So I think, generally, what we said is true. But I do think there's these blips of, like, wow, I'm diving into something new. It's like mad search scrambling. But then I'm back to sort of using more memory coding, I don't know what you call not-Google coding. You've got to understand what you're doing. And that means you can't just Google for examples and copy and paste them in. Because then you combine the examples, and you have no idea what you're doing. And of course, it doesn't work. At best, it's frustrating, right? You're like, this works, that works, but together they don't work. And you just don't even know why. Right? Yeah. So for sure. But yeah, so anyway, it's a follow-up on our conversation last week. Brian, what do you got to throw out there for everyone? I'm gonna say this on this show, just to make sure I do it. There's like three days left for me to record my talk. Yeah, this is like forcing yourself to commit to it. So you're gonna do it. Okay. Yes, definitely. So, PyCon talk, I really do want to get it online. It's important stuff. It's about parameterization. I talked a couple episodes ago about having trouble switching back and forth at home, with all this working-from-home stuff, between Mac and Windows. I finally figured out the whole using command and control thing. So thank you to everybody.
But apparently there's this really simple thing. Apple lets you just swap them on a keyboard. So that's what I'm doing. Now it works great. And then also, I had promised that I was going to have my cards project be able to work and publish to PyPI or the test PyPI. It doesn't work with setuptools_scm, because I'm using flit. So if somebody's got a way to figure out how to just somehow change the version string, or bump that every time you merge, or something like that, that'd be great. But otherwise, right now, I don't think there's a way to automatically push to PyPI if you're using flit. Because it says, yeah, that's already uploaded. Maybe there's a GitHub action that will, like, just randomize that or something, because the version is embedded in the source code. And the trick that people are using with setuptools is the version is based on the version in GitHub, and you can't do that. So at least I figured that out. But that's okay. Probably do something else. That's my extras. Guido, anything else? Even though I said it's hard to imagine PyCon going online, it actually is going online. At least some of it is. The first talk, by the conference chair, Emily Morehouse, has been posted, and many more will follow. Her welcome was really nice. The other thing, and as you mentioned, Django no longer supports Python 2 at all. Well, that's just fine, because the very last release of Python 2, 2.7.18, was released a few days ago. Yeah. That's great. That must be kind of a load off of your shoulders to finally have that in the rearview mirror. I am very happy. And I'm sad, of course, that we can't have an absolutely wild and crazy party in Pittsburgh, like we were planning. Yeah, a big celebration on Zoom, it's just not the same. Yeah. Just have to have a bigger one next year. That's one I don't know how to pull off. Boy. That's really good. All right, you guys ready for a really quick joke? All right.
So here's a quick joke sent to us by Derrick Chambers. And he may have even made this up for us. And this goes back to the subinterpreters and the multiple GILs and all that. So you guys know how you can borrow money concurrently? With async IOUs. It's a terrible joke. That's a bad joke. But oh, that is very groan-worthy.

22:43 Most of our jokes actually are around here. But that's how it goes. Yeah. And keep them coming. Keep sending us your bad jokes. Yeah, that's right. That's right. Python dad jokes. That should be a whole separate category. There absolutely should. There should. Well, Guido, it's really an honor to have you on the show. Thanks for coming and sharing your perspective on all this. Love to be back. Yeah. And, Brian, thanks. It's always good to be here with you. Cheers. Yep. Bye, everyone. Thanks, both of you. Thank you for listening to Python Bytes. Follow the show on Twitter via @pythonbytes. That's Python Bytes as in B-Y-T-E-S. And get the full show notes at pythonbytes.fm. If you have a news item you want featured, just visit pythonbytes.fm and send it our way. We're always on the lookout for sharing something cool. On behalf of myself and Brian Okken, this is Michael Kennedy. Thank you for listening and sharing this podcast with your friends and colleagues.
