Transcript #179: Guido van Rossum drops in on Python Bytes

Recorded on Tuesday, Apr 21, 2020.

00:00 Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds.

00:08 This is episode 179, recorded April 21st, 2020.

00:09 I'm Michael Kennedy.

00:15 And I'm Brian Okken.

00:16 And Brian, I'm super honored to have Guido van Rossum on the show.

00:17 Guido, welcome to Python Bytes.

00:20 Hello, glad to be here.

00:22 Yeah, it's really great to have you here.

00:24 It's going to be wonderful to hear your opinion, your perspective on some of these things that we're sharing this week.

00:25 So welcome to the show.

00:30 And this episode is brought to you by Datadog.

00:31 Check them out at pythonbytes.fm/datadog.

00:34 More on that later.

00:36 Brian, what do you got?

00:38 What's up first?

00:39 Well, I've been thinking a lot about community lately, actually.

00:39 And one of the things that came out recently, this was a little bit ago, but it's still fairly new, is the Django project announced a new governance model.

00:48 It's been going on, I mean, I think they've been working on it for a couple years, since at least 2018.

00:53 Some of the specifics are interesting.

00:54 They had a core team, and they dissolved the core team, and they mainly kind of have a new role called a merger, a person who has commit access but only merges pull requests.

01:06 So most of the changes could happen in the pull requests and the discussion that happens there.

01:11 There is a technical board also that was kept to kind of make some technical decisions if it's necessary, but apparently it hasn't been necessary for a while.

01:22 I think it's interesting that they switched the governance model midstream.

01:26 And then also the rationale around it, I think is interesting.

01:30 And the rationale is around trying to get more people contributing to it.

01:35 So they had like their core team and that hadn't really changed for a long time.

01:39 And people that were set up as core people really weren't contributing much anymore.

01:44 Anyway, I just thought it was interesting that the reason for changing the governance was around trying to get new people in.

01:52 Yeah, I think that's a great idea because, you know, Django has been around for a long time and it's a fairly stable project.

01:58 So I think it's kind of hard to jump in.

02:00 I mean, it's a little bit like Python itself, Guido.

02:02 Right. I'm thinking that sort of maybe five years in the future, Python could consider a similar move, or maybe we'll know that this was not the right move by then from Django's experience.

02:15 And of course, the situation for the two projects is somewhat different, but we definitely also feel the pain of sort of not getting enough new contributors.

02:26 But we only fairly recently, like early last year, we changed our governance structure completely.

02:33 So it's a little early to start considering changing it again, probably.

02:37 Right, of course, we're just starting to see the outcome of the decisions and the releases that are actually going through that model, right?

02:44 We've been working with the steering council model for, say, 16 months now.

02:49 Yeah, I guess so, 3.8 definitely came out under that model.

02:53 The thing that Python did that I think is kind of interesting, and I don't know if you started it, but the notion of having more core mentors to try to mentor new core developers, I think that's an interesting thing that, you can't really make people be mentors, but that's an interesting way to get more core developers on.

03:11 We have a few people who are very active as mentors in addition to being active as core devs.

03:17 And it really does make a difference.

03:19 Yeah, we don't have enough mentors to mentor everyone who wants to become a core dev.

03:27 Yeah, yeah.

03:27 So I think that's a really great, I mean, it's one thing to write web apps in Django, or to write Python code, it's an entirely different thing to write Django or write Python, right?

03:38 It's a very different skill set.

03:40 And so I think that mentor model is really a great bridge.

03:43 - Yeah, that'd be cool.

03:44 - Yeah, so speaking of things I think are going to be really helpful, but in a much simpler way, this is sort of a data science topic for everyone out there.

03:52 And one of the problems in data science is you can end up with very large data sets, complicated data, but every now and then there might be a None where you expected an integer, or there might be an empty string where you expected a date, or something like that.

04:08 So you want to understand how complete that data is, and where it is more or less complete, and so on.

04:19 So there's this cool project called missingno, which I think is "missing number" shortened, right?

04:24 And the idea is it's a missing data visualization module for Python.

04:28 And you two can see the picture in the show notes, and folks who listen to this can go back and see it in the show notes as well.

04:35 But it's a really cool and simple little library, but it's not just show me a quick graph.

04:40 It actually does some pretty powerful analysis.

04:43 So what you can do is if you've got some pandas data, you can just go to it and say msno.matrix and give it a sample of your data.

04:51 And it gives you these really cool graphs with vertical, either black or white bars, or bars that are kind of zebra striped, depending on whether or not there's missing data.

05:01 It shows you which parts, which columns are more complete or incomplete.

05:05 And it even has a little graph on the side that tells you the likelihood or the correlation of a row being incomplete, right?

05:13 Like you might have a missing address on one line but then another one has a missing phone number or it could be more likely that those are both missing at the same time.

05:21 There's like a little graph to visualize that kind of stuff.
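
To make the idea concrete, here is a minimal sketch in plain Python, without pandas or missingno, of the numbers such a chart summarizes: how complete each column is, and how often two columns are missing on the same row. The sample records, the column names, and the helper `missing_together` are made up for illustration and are not part of missingno.

```python
# Hypothetical sample records; None marks a missing value.
rows = [
    {"name": "Ann", "address": "1 Main St", "phone": "555-0100"},
    {"name": "Bob", "address": None, "phone": None},
    {"name": "Cy", "address": None, "phone": None},
    {"name": "Di", "address": "9 Elm Rd", "phone": None},
]
columns = ["name", "address", "phone"]

# Fraction of rows in which each column is present.
completeness = {
    col: sum(row[col] is not None for row in rows) / len(rows)
    for col in columns
}


def missing_together(col_a, col_b):
    """Of the rows missing col_a, the share that also miss col_b."""
    missing_a = [row for row in rows if row[col_a] is None]
    if not missing_a:
        return 0.0
    return sum(row[col_b] is None for row in missing_a) / len(missing_a)


print(completeness)
print(missing_together("address", "phone"))
```

With missingno itself, the call mentioned in the episode is `msno.matrix` on a pandas DataFrame, which draws this kind of information as a picture rather than printing it.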

05:23 What do you guys think?

05:24 - I think it's very cool.

05:25 I'm not a data anything person myself.

05:29 So to indicate how much I am not in the target audience for this module. The whole time I read your modules I had the grouping wrong.

05:39 I thought it was the missing data visualization module. And I thought well that's kind of cool that they say there's something missing and this clearly is the one that's... now it's turned up but it's actually visualizing missing data which actually I understand what that is. I've seen a spreadsheet or two and I can actually even understand the little example chart that you pasted into the notes without understanding anything else around it.

06:12 Yeah, it's so wonderful because that's why I actually think I like this and I chose it is you can just look at that picture and go, "Oh, I basically get a sense for what this data is like. It's complete, it's not complete, it's mostly incomplete on this column or whatever." And yeah, it's really nice. And I suspect you could, if you had data, say, in like a database or a file or something, you could probably just read that into a pandas data frame and then throw it out here and visualize like database missing data or file missing data or whatever. But it's really nice.

06:39 Yeah, for large data sets, one of the things you've got to do is decide, when you're cleaning it up, what to do with the missing data. I mean, there's some Nones or whatever. There are some strategies to either fill it in with interpolated data or something, or just throw those rows completely away.

06:57 But you don't really know how much data you're throwing away without visualizing it.
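
A small sketch of the two strategies just described, in plain Python and on invented records: dropping every row with a missing value (and measuring how much that throws away), or filling a missing number with a stand-in. The mean is used here only as one simple fill; it is an assumption for illustration, not something recommended in the episode.

```python
# Hypothetical records; None marks a missing value.
rows = [
    {"id": 1, "age": 34, "city": "Oslo"},
    {"id": 2, "age": None, "city": "Lima"},
    {"id": 3, "age": 51, "city": None},
    {"id": 4, "age": 29, "city": "Pune"},
]


def is_complete(row):
    """True when no field in the row is missing."""
    return all(value is not None for value in row.values())


# Strategy 1: drop incomplete rows, and measure what that costs.
kept = [row for row in rows if is_complete(row)]
dropped_share = 1 - len(kept) / len(rows)

# Strategy 2: fill a missing age with the mean of the known ages.
known_ages = [row["age"] for row in rows if row["age"] is not None]
mean_age = sum(known_ages) / len(known_ages)
filled = [
    {**row, "age": mean_age if row["age"] is None else row["age"]}
    for row in rows
]

print(f"dropping incomplete rows discards {dropped_share:.0%} of the data")
print(filled[1])
```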

07:02 So this is pretty cool.

07:03 I think this is great.

07:04 Yeah, and it has other visualizations as well.

07:06 It has heat maps, which are like correlations.

07:08 So like address and phone number correlated kind of things I was talking about.

07:13 It has bar charts.

07:14 And the most interesting or unique visualization is the dendrogram, which I had never heard of, but this is a hierarchical clustering algorithm from SciPy, actually.

07:23 And it creates this kind of hierarchical tree of relationships of missing data.

07:30 If you are worried about cleaning up data or stuff like that, or visualizing how good your data is, you could throw it at this real quick and get some great answers.

07:38 Yeah, that's cool.

07:39 Guido, you have been busy with the Language Summit recently, right?

07:43 What's the news there?

07:44 Yes, well, normally the Language Summit is an in-person meeting where about 50 people, who are mostly but not exclusively core devs, get together a day or two before the actual Python conference. Since the conference was canceled... This would have been in Pittsburgh, right? It would have been in Pittsburgh this year, right. Obviously the conference was canceled and the Language Summit was too, and then the two organizers thought, well, okay, this sounds like the kind of meeting that we can actually try to do on Zoom. You can't have a whole conference on Zoom, but you can probably have a meeting with 50 people on Zoom. And they tweaked the format a bit because, I mean, you can't be on Zoom for an entire day. I find Zoom incredibly intense and after an hour of Zooming I'm usually ready for a break. Yeah, all the virtual stuff takes a lot more attention. Yeah. Yeah. The user interface sucks, privacy probably sucks, but it clearly serves its purpose.

08:49 So we had it spread over two different days and then in addition, because nobody was traveling to Pittsburgh, we spread it out in time.

08:58 One day it was really early for me so that we could also have participants from Europe and one day it was really late for me so that we could have some people from Australia join us.

09:11 One of the organizers lives in Poland and he was there till the end on both days.

09:16 So I didn't know who slept.

09:19 Yeah, so as usual, the format wasn't actually all that different.

09:25 It's typically like half hour slots for various topics that are important to either get information to core devs and usually also get feedback from core devs.

09:39 And we pretty much stuck to that format.

09:41 The one big thing that you miss, of course, is all the whispering to the guy who was sitting next to you or during the break, quickly grabbing three other people and having a little huddle about a topic.

09:53 Yeah, that's what's so powerful about in-person conferences.

09:56 Yeah, we missed the entire hallway track, but it was still good to have sort of short presentations and Q&A sessions.

10:05 And the Q&A sessions actually worked really well.

10:09 There was a little tool that you can use to sort of moderate questions, and Łukasz was running the moderation tool, and nobody was asking spam questions, so all you had to do was just click OK for every question, I think.

10:25 That tool is much more structured than the chat channel on Zoom could be, and sort of raising your hand on Zoom and waving doesn't really work if there are 50 people, because there's no way to see more than 16 people or so at a time.

10:39 Yeah.

10:39 So anyway, the first day, each day there were like maybe five topics and a few miscellaneous things.

10:47 Shall I just go over each day briefly?

10:51 See if I can sort of run them all off.

10:53 Yeah.

10:53 I would say just maybe touch really quickly on just the things that you felt like really might make an impact going forward.

11:00 Potentially.

11:01 Just a one-liner: the guy who originally implemented f-strings gave a talk about whether maybe all strings should become f-strings, and the general sentiment was that that would have been nice in Python 1.0 or so, but there is no way; it would just break too much code.

11:21 It's gonna break too much.

11:22 I totally hear that though, because so often I'm typing in a string and I'm like, oh, I need to put a variable here, but I've typed 20 characters in and then I've got to go back to the beginning, not the beginning of the line but the beginning of the string. Maybe we could even put the F at the end.

11:36 Who knows?

11:37 But yeah, I would love to see it, but it's, I totally understand.

11:40 You can't do that without breaking stuff.

11:42 There are downsides to automatically doing it too, because curly braces are useful for all sorts of things besides formatting.
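
A short sketch of why that would break existing code: the same braces mean different things depending on the prefix. The variable names and sample strings are invented for illustration.

```python
name = "world"

# A plain string: braces are ordinary characters (here, JSON text).
plain = '{"greeting": "hello"}'

# A template kept for later: braces are placeholders for str.format.
template = "Hello, {name}!"
later = template.format(name="Guido")

# An f-string: braces are evaluated right away, in the current scope.
now = f"Hello, {name}!"

# Inside an f-string, literal braces have to be doubled.
escaped = f'{{"greeting": "{name}"}}'

print(plain)
print(later)
print(now)
print(escaped)
```

If every string literal were treated as an f-string, `plain` and `template` above would no longer mean what their authors intended.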

11:50 So that was sort of the opening salvo.

11:53 Then my two co-conspirators on the PEG parsing project gave a talk about how we're going to hopefully introduce a new parser in Python 3.9.

12:06 And we've been coding for like almost a year now, probably.

12:11 It started out as a little hobby project of mine and gradually became more serious and more people started helping out.

12:19 And the last few months we've been doing heavy engineering work to actually prepare for the integration, but we didn't have Steering Council approval yet. We made it a PEP, and we sort of said, well, this is a nice thing, but we're not going to do this unless there is sort of clear consensus, or at least general agreement, that we are going to do this. And so very soon after the summit the Steering Council actually had a meeting and approved a bunch of PEPs, and ours was one of them.

12:53 And then the last two days I've been stressing out because we wanted to get the new parser in the alpha 6 release, which is going out tomorrow.

13:01 And so we're now in the last, the very last stretches of preparing for alpha six, and we're just deleting or disabling tests that are still failing that we, we know how to fix them, but we just don't have the time.

13:15 Right.

13:16 That's exciting that this project is going to be in there. That's great.

13:18 Yeah, so that's the new parser and if all goes well, nobody will notice a thing.

13:23 Ideally.

13:24 What are the effects? Is it going to speed things up or make things more maintainable?

13:29 It's going to sort of open up the grammar for future changes to the language that we currently can't do because the old LL(1) parser holds us back.
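
A toy illustration of the difference, not the CPython parser or its grammar: an LL(1) parser has to pick a rule after looking at a single token, while a PEG-style parser tries alternatives in order and backtracks when one fails. The two-rule grammar and the names `match`, `ALTERNATIVES`, and `parse_statement` are hypothetical.

```python
def match(tokens, pos, pattern):
    """Return the position after `pattern` if it matches at `pos`, else None."""
    end = pos + len(pattern)
    window = tokens[pos:end]
    if len(window) != len(pattern):
        return None
    for token, expected in zip(window, pattern):
        if expected == "NAME":
            if not token.isidentifier():
                return None
        elif token != expected:
            return None
    return end


# Both alternatives begin with NAME, so one token of lookahead
# cannot tell them apart. Ordered choice simply tries each in turn.
ALTERNATIVES = [
    ("assignment", ["NAME", "=", "NAME"]),
    ("call", ["NAME", "(", ")"]),
]


def parse_statement(tokens):
    for kind, pattern in ALTERNATIVES:
        # On failure we fall through and retry from position 0 (backtracking).
        end = match(tokens, 0, pattern)
        if end == len(tokens):
            return kind
    raise SyntaxError(f"cannot parse {tokens!r}")


print(parse_statement(["x", "=", "y"]))
print(parse_statement(["f", "(", ")"]))
```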

13:42 Okay.

13:43 That's sort of the main motivation.

13:46 Super.

13:46 There was one interesting talk about something called HPy, which is a proposal for a new, more portable API, and in particular focused on other Python implementations besides CPython.

14:02 As you may know, PyPy has been struggling for over a decade with compatibility with extension modules.

14:10 And the HPy proposal is basically, instead of pointers to objects, you have handles, which is a pointer to a pointer to an object.

14:18 And there's a whole API around handles that is equivalent to the existing API, but it allows different styles of garbage collection.

14:28 For example, you could implement a garbage collector that moves objects behind your back occasionally.

14:34 Right, you might get a generational compacting garbage collector, because you could update the pointer that the handle points to without changing the handle itself, right?
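
The handle idea can be modeled in a few lines of Python. This is a toy model under stated assumptions, not HPy's actual API: the `Heap` class, its methods, and the integer "addresses" are all invented to show that code holding a handle keeps working after the collector moves the object.

```python
class Heap:
    """Toy heap whose collector may move objects to new addresses."""

    def __init__(self):
        self._slots = {}        # address -> object
        self._handles = {}      # handle -> current address
        self._next_address = 0
        self._next_handle = 0

    def new(self, obj):
        """Store obj and return a handle (not an address) to the caller."""
        address = self._next_address
        self._next_address += 1
        self._slots[address] = obj
        handle = self._next_handle
        self._next_handle += 1
        self._handles[handle] = address
        return handle

    def deref(self, handle):
        """Follow the handle to wherever the object currently lives."""
        return self._slots[self._handles[handle]]

    def address_of(self, handle):
        return self._handles[handle]

    def compact(self):
        """Move every object to a fresh address and update the handle table."""
        moved = {}
        for handle, old_address in self._handles.items():
            new_address = self._next_address
            self._next_address += 1
            moved[new_address] = self._slots[old_address]
            self._handles[handle] = new_address
        self._slots = moved


heap = Heap()
h = heap.new("spam")
before = heap.address_of(h)
heap.compact()
after = heap.address_of(h)
print(before != after, heap.deref(h))
```

Code that stored a raw address would be left pointing at nothing after `compact()`; code that stored the handle is unaffected.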

14:42 Yeah, yeah, that's actually really exciting.

14:44 Yeah, and it's still in early stages I believe, but it looks pretty promising.

14:50 Eric Snow gave a lightning talk, a sort of retrospective of all his work on multi-core support, which is now beginning to conclude. Well, maybe it's too soon to call it a conclusion, but we're going to have sub-interpreters with a much better API either in 3.9 or in 3.10. There's a PEP around that, PEP 554, which will definitely be moving forward, but whether it's considered mature enough to land in 3.9 is not entirely clear.

15:23 >> Yeah. Eric's work is very interesting there.

15:25 >> Yeah. In 3.10, we will probably have separate GILs per sub-interpreter.

15:31 That is going to be a major new thing.

15:35 Let's see, what else do we have?

15:37 Well, so the next day I gave a talk about the future of typing, which, oh yeah, there's one detail. You might remember that we introduced something called from __future__ import annotations, which made it so that annotations are no longer evaluated at runtime. You can still introspect them, but you'll just get the string containing the annotation expression back. Well, that's going to be the default in 3.9 most likely. There's still a little debate about that, but there was like a two-thirds preference for just making that the default in 3.9. And various people argued effectively that nobody should notice any difference.
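
A small sketch of what "you just get the string back" looks like. In a real module you would write `from __future__ import annotations` as the first statement; here the same compiler flag is passed to `compile()` so the snippet can be pasted anywhere. The function `scale` and the undefined name `Vector` are invented for the example.

```python
import __future__

source = '''
def scale(vector: Vector, factor: float) -> Vector:
    return [x * factor for x in vector]
'''

# Same effect as "from __future__ import annotations" at the top of a module.
flags = __future__.annotations.compiler_flag
namespace = {}
exec(compile(source, "<demo>", "exec", flags=flags, dont_inherit=True), namespace)

scale = namespace["scale"]

# Vector is never defined, yet the function was created without error,
# because the annotations were stored as strings instead of being evaluated.
print(scale.__annotations__)
print(scale([1, 2], 2.0))
```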

16:19 I'm really excited or happy to have typing in the language. It makes such a difference for the right use case, you know, on defining the boundary of APIs or making the editor understand something better when it otherwise wouldn't.

16:33 If you're maintaining tens of thousands of lines of Python code or more, type annotations really make a difference.

16:40 Yeah, for sure.

16:41 I still don't recommend teaching them to beginners though.

16:44 Oh really?

16:45 Okay.

16:46 It depends on what kind of beginners you have.

16:47 If they're sort of recuperating Java programmers, maybe you should introduce them.

16:52 But if they're like actually blank slate, this is the first time they're programming ever, I wouldn't bother with them with annotations.

17:00 Yeah, I kind of agree with that.

17:02 What's sort of still missing for the data science world is extensions to the type system for NumPy and Pandas and stuff like that.

17:14 There's a design, but there are not enough people with available time to actually implement the design.

17:23 And I'm sure that when you're halfway through implementation, all sorts of interesting issues with the design will crop up.

17:30 So the design is not final until it's been implemented.

17:35 Last two topics.

17:36 Zac Hatfield-Dodds gave a very good talk about what he calls property-based testing, which really is about a tool named Hypothesis that introduces a testing approach that I think was first developed in academia for Haskell and that works in a completely different way than your typical unit-test-based testing.

18:02 Right, the tool decides, right? Instead of examples.

18:05 The tool generates test cases and I've never played with it myself, but the talk sort of made me very excited to play around with it more.

18:16 And it actually, even though it's a very different approach than unit test or pytest based testing, it will still integrate with that.

18:25 I mean, you can write a unit test and then put some decorator on top of it that produces test data and Hypothesis has all kinds of really advanced stuff for exploring enormous spaces of possible input data and quickly finding bugs. Do you think we'll get to a place where we are able to use Hypothesis for some of the testing for the standard library? That was one of the propositions that Zach made. I think it's still early for that.
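
To show the shape of the idea, here is a hand-rolled sketch using only the standard library's `random` module. It is not Hypothesis and does not use its API; the run-length encoder and the helper `check_round_trip` are invented for the example. Instead of listing example inputs, the test states a property (decoding an encoded string gives the original back) and lets generated inputs try to break it.

```python
import random


def run_length_encode(text):
    """Collapse runs of equal characters into (char, count) pairs."""
    encoded = []
    for char in text:
        if encoded and encoded[-1][0] == char:
            encoded[-1] = (char, encoded[-1][1] + 1)
        else:
            encoded.append((char, 1))
    return encoded


def run_length_decode(pairs):
    """Expand (char, count) pairs back into a string."""
    return "".join(char * count for char, count in pairs)


def check_round_trip(trials=500, seed=0):
    """Property: decode(encode(text)) == text for generated inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        length = rng.randint(0, 20)
        # A two-letter alphabet makes long runs likely.
        text = "".join(rng.choice("ab") for _ in range(length))
        assert run_length_decode(run_length_encode(text)) == text, text
    return trials


print(check_round_trip(), "generated cases passed")
```

Hypothesis packages this up behind a decorator, as described above, and adds much smarter input generation than the uniform random strings used here.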

18:59 I think it's much easier to introduce Hypothesis in a new project, where you haven't yet written all the code and all the tests, than it is to retrofit it in a large, mature, or maybe even somewhat dementing project. I think it'll be a while before we'll have Hypothesis-based testing for the standard library, just like it'll be a while before we'll have annotations in the standard library rather than annotations sort of separate from the standard library. The last talk I want to highlight, and then I'm really done with this, is also a very good talk by Russell Keith-Magee about the state of BeeWare and Python for mobile. And one of his suggestions was that we adopt some of his mega patches that he's been maintaining for several Python releases, which would make Python at least compile out of the box, or nearly out of the box, for the important mobile platforms.

20:06 That'd be cool.

20:06 Yeah, it'd be so wonderful to have Python as an option for mobile. It really would bust open the doors and create even more growth.

20:13 Many people believe that sort of mobile platforms are obviously continuing to grow in importance and to grow in power.

20:22 And we'd be crazy if we didn't support Python on those.

20:26 And it may be very important for Python's very survival.

20:30 Yeah.

20:30 Yeah, I saw the black swan talk that Russell Keith-Magee gave and it was compelling.

20:34 He is an amazing speaker, for sure.

20:36 Yeah.

20:37 Yeah, that's what I have.

20:38 Great.

20:38 And thank you so much for that insight.

20:40 That was, that was awesome.

20:41 A lot of people don't get to see the behind the scenes.

20:44 They just see what's announced when it comes out, right?

20:46 Before we move on, let me tell you about our sponsor, Datadog.

20:49 This episode is brought to you by Datadog.

20:52 So let me ask you a question.

20:53 Do you have an app in production that's slower than you like?

20:56 Is its performance all over the place?

20:57 Sometimes fast, sometimes slow.

20:59 Now here's the important question.

21:00 Do you know why?

21:01 With Datadog, you will.

21:03 You can troubleshoot your app's performance with Datadog's end-to-end tracing, use detailed flame graphs, identify bottlenecks and latency in that finicky app of yours.

21:10 So be the hero that got the app back on track at your company.

21:14 Get started with a free trial over at pythonbytes.fm/datadog.

21:18 Get a cool t-shirt as well.

21:19 Brian, you've got another one that kind of ties into your first one, right?

21:22 But it's sort of the other side of the coin, maybe?

21:24 I don't know what's been happening in the Python world that you sort of orbit in that might make you think about these things, but tell us about it.

21:31 - No, I've just been thinking about community and codes of conduct and enforcement for codes of conduct.

21:37 No reason really, just kind of an interesting topic.

21:40 One of the things I've been thinking about is, especially when researching this, the codes of conduct and enforcement of it and how we treat people.

21:48 I first thought it was really important for open source projects, and it definitely is because people have the option to just leave and get out of the project.

21:56 So you really want to treat people well so they stick around and have it be welcoming to other people.

22:02 But I don't think industry is really that different.

22:04 I think that people have the ability to just get another job or work on a different project.

22:09 So I think these are important for industry as well.

22:12 I took a look at two sets of codes of conduct and the enforcement of those.

22:16 So the PSF has a code of conduct.

22:19 I'm not going to read them all out, but there's things like being open, being friendly.

22:23 And in there, there's a list of inappropriate behaviors as well that's covered.

22:28 Now, also the Django code of conduct.

22:30 They also have all of these when you read them.

22:33 There are differences, but when you read them, they kind of sound the same.

22:36 One of the things they highlight in the Django one is to be careful with your choice of words, and they include examples of harassment, speech, and exclusionary behavior that's not appropriate.

22:51 One of the big differences I saw was the enforcement.

22:55 So the PSF is a two-thirds majority vote enforcement sort of thing, to make sure, if something happens, like if they want to kick somebody out or put them on probation or something.

23:08 I think that's really important, because if you require 100% majority and somebody who is on the team that decides is potentially part of the problem, then what do you do?

23:19 It's really tricky.

23:21 If people are just going to abandon a project, you would rather have just a strong majority make a decision.

23:28 I also think that the PSF has probably got a larger working group on this, and it's, I guess, maybe harder to get a hold of people.

23:33 Maybe it's easier to get a two-thirds than, maybe you can't even reach all 100% of the group.

23:38 But anyway, the other interesting difference is the PSF code of conduct seems to, I know it does cover online interaction as well as events like the conferences and meetups and stuff.

23:52 But at least I think that maybe its focus might be more on events,

23:57 whereas the Django code of conduct is specifically targeted towards online interactions.

24:03 I would say for the PSF that sort of historically, events were the first place where codes of conduct were introduced, but we've been using them for online forums more and more in the past few years.

24:19 Okay. One of the interesting things with the Django one is that a single person on the committee can act without collaborating with anybody else if it's an ongoing problem or if there's a threat involved or something. They still have to go through the process of notifying everybody else, but there is an interesting thing that one person on the committee can intervene right away. I'm not saying one is better than the other; I just think it's interesting, and I think it's important for new projects to think about not just their code of conduct but how they're going to enforce it, and what the timeline is.

24:56 The Django one also includes some timelines, which is interesting.

25:01 I would really like to make sure that projects practice, maybe figure out what they're going to do if they need to enact one of these things.

25:09 Before it becomes a problem, they know what they're going to do.

25:13 There's a lot of stuff going on with some projects out there, so having a couple of examples and side-by-side comparisons, I think, is great.

25:21 I was interested to find out our meetup, like the Python meetup that we started, which is on hold right now, unfortunately, because of the virus and quarantine and stuff.

25:29 But because we were getting support from the Python Software Foundation to help pay for the meetup fees and stuff, we had to list a code of conduct on our meetup page and stuff like that.

25:41 Yeah, that makes a lot of sense, but I didn't realize that.

25:43 Yeah, the PSF has been doing that for a few years now.

25:43 That's really great. All right, this next one I want to cover. It goes back a ways, but I think it's really fun. And it's something that I think also ties together well with our special guest here. And this is an article about myths about indentation. And Guido, I picked this one because you were talking about this on Twitter just the other day. What was the motivation to throw that out there? That is a good question. I was just going to volunteer the answer, because apparently I had a link to that article on my homepage in some odd corner.

26:14 And I have a very, very sort of ratty old homepage.

26:18 I've moved it to GitHub Pages, but it looks like web 1.0.

26:22 And that's because it really is; I just edit raw HTML.

26:25 It blends in right with Netscape, huh?

26:26 So someone reported to me a broken link, which happens like, I don't know, once every four years or so.

26:36 Someone reported a broken link. Oh wait, it wasn't even on my homepage. It was on an old blog that I can no longer edit at artima.com. I'm very glad that that blog is still online. But so, because I got the report of the broken link, I decided, oh, I'm sure I can still find on archive.org where that link used to point. And sure enough, it was there. And I thought, oh, that's actually still a neat little article. So I thought, okay, tweet of the day or tweet of the week.

27:05 - Yeah, I agree, and I think it's interesting as well.

27:07 And just to give you a sense of why it might have disappeared, it was one of those types of sites where the domain or the URL included a tilde username path, like you used to get in university or whatever way back when.

27:21 So anyway, this one is myths about indentation for Python.

27:24 And for people who come from a C-oriented language, I think Python could come across a little bit funky.

27:32 I actually want to share a little story just sort of my journey with it, and how I came to love this.

27:38 But I think this is really interesting for people having the debate about, is significant white space useful?

27:43 Is it weird?

27:44 Is it good?

27:44 I did a ton of C++ and then C# development, and then JavaScript development.

27:50 It was all about the curly brace languages, lots of symbols.

27:53 And then I came to learn Python.

27:55 And I loved Python right away.

27:56 But it was weird to me.

27:57 I felt kind of naked.

27:58 Like if I'd write an if statement, I'm like, I need some little parentheses to kind of hold the code in place.

28:03 and why don't they need to be there?

28:04 And I need a curly brace to say when this block of code is done and whatnot.

28:08 It just took a little bit of getting used to, but I knew that it was the right thing for me, 'cause when I went back to work on some older projects, I'm like, why are there symbols everywhere?

28:17 What is all this stuff I have to keep typing?

28:19 This is like a broken language.

28:21 And it just took a couple of weeks for me to make that switch to feel like it was broken to go back to work in languages that I've been doing for like 10 years.

28:28 So, well done with the white space, Guido.

28:30 - Thanks. - Yeah.

28:32 But so let's cover some of the things mentioned really quick in the article.

28:35 One is that whitespace is significant in Python source code.

28:38 And actually, no, not in general, is the answer.

28:41 It's significant on the left.

28:44 Right?

28:46 So as much as you indent stuff, that really means things, but between variables, like whether you have like a equals seven or a space equals space seven, doesn't matter, you can have tons of spaces in there, right?

28:57 Like in any other language, spaces kind of don't matter, except on the left.

29:01 So that's cool.

29:03 And also the amount of indentation doesn't really matter.

29:06 You could have five spaces for any code suite that you want or you could have 18 or you could go with a standard four.

29:13 I recommend the four, but you know.

29:15 And then also if you have something that defines like a list comprehension or an array creation or a dictionary, then all of a sudden the spacing doesn't matter anymore.

29:26 As soon as you have like an open square bracket and then you have a bunch of stuff and then close square bracket, spacing doesn't matter in there.
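
The three points above, in runnable form. The variable names are invented, and the odd spacing is deliberate; it is there to show what Python accepts, not what a style guide would recommend.

```python
a=7
b   =   7          # spaces around "=" are not significant

if a == b:
      wide = "six spaces"      # any width works, as long as the block is consistent

if a == b:
  narrow = "two spaces"        # a different block may use a different width

values = [
 1,
            2,
    3,
]                              # inside brackets, the layout is free

print(wide, narrow, values)
```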

29:31 So I think this is interesting to think about as folks debate that maybe within their teams.

29:36 It also, you could say it forces you to use a certain indentation style.

29:39 Well, yes and no.

29:41 If you wanted to write it single statement per line, then yeah, there's a cool example that they gave in the article.

29:47 It's like if one plus one equals two, then new line print foo, new line print bar, new line X equals 42.

29:54 You can also put them on one line with semicolons.

29:57 If you're really missing your semicolons from your language, you could do that.
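
Written out, the two layouts look like this. The sketch uses the Python 3 `print()` function, which is an adaptation; the names `x` and `y` are only there so both forms can sit side by side.

```python
# One statement per line, indented under the if.
if 1 + 1 == 2:
    print("foo")
    print("bar")
    x = 42

# The same three statements on one line, separated by semicolons.
if 1 + 1 == 2: print("foo"); print("bar"); y = 42

print(x, y)
```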

30:00 The thing that's interesting here, I think this is probably the most significant part of this article or this write-up is, if you look at it, it looks right.

30:07 And when it gets parsed, it is right.

30:10 There's an example of some C code that looks visually wrong because it's indented differently, but it's going to parse.

30:18 But the way you see it when you read it is not what's actually happening.

30:23 And I think there was a problem like this, well, I think it was in Objective-C or something. There was something with Apple in there.

30:31 It was really bad.

30:32 There was an infamous Apple vulnerability.

30:35 I think it might even have been on the iPhone where someone had added a second statement to a block, but it wasn't a block because there were no curlies.

30:44 Right.

30:45 Then it started out with a single conditional line, like if something indent, do the thing.

30:51 And then they just indented, but they didn't put the curly braces in.

30:54 And it was, yeah, it took so long for people to find it because visually it looked like what Python would actually mean.

31:02 It looked like those two things were part of the if block, but because the white space didn't matter, it actually didn't.

31:10 That's really interesting.

31:11 I'm not going to go through everything.

31:13 I'll put it in the show notes, but another one that I thought was like, I just don't like it.

31:17 That's fine, people can not like it, but it has a lot of advantages.

31:20 Like in that example before, if you had that wrongly indented Python code, it would not parse.

31:25 It's an error to have it not look right rather than just not be right.
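
A small sketch of that behavior, assuming a made-up snippet where the second `print` is indented deeper than the first. The built-in `compile` refuses it before anything runs.

```python
# Source whose second statement does not line up with the first.
bad_source = (
    "if True:\n"
    "    print('a')\n"
    "      print('b')\n"
)

try:
    compile(bad_source, "<example>", "exec")
    parsed = True
except IndentationError:
    # Python rejects the mismatch instead of silently picking a meaning.
    parsed = False
```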

31:26 So it has a lot of advantages, and people can really quickly get used to not having to write all those symbols, and then you go back and you're like, this code is hard to read, it's just full of curly braces, semicolons, parentheses, everywhere.

31:38 I always thought we used to, those were just, that is what builds programming languages.

31:42 To have a programming language, you had to have that, and then once I experienced Python, and I went back, it kind of, it broke my mental model of the world.

31:49 I'm like, you don't actually have to have those things, so why are they there?

31:52 Anyway, what do you think about this article?

31:54 You must like it somewhat, 'cause you hunted it down and tweeted it, right?

31:57 - It's old news for me, because I didn't even invent the whitespace thing for Python.

32:02 That was sort of handed to me on a silver platter by one of my mentors in the early '80s.

32:08 - Yeah, yeah, back in the ABC days.

32:10 - And in those days, it was an innovation.

32:13 There was like one other language that had this, and Knuth had once said that he thought it would be a good idea, but he had never actually implemented the language or even experienced the language that implemented it.

32:26 He just thought that it would be a good idea.

32:28 Right.

32:29 Right.

32:29 The only thing that was a stumbling block for me was when I first started looking at Python, the editor I was using, I think it was an Emacs something at the time.

32:39 I'm not sure what I was using.

32:41 But with the C++ code I was using, I had it set up so that if I double-clicked on the closing bracket, it would jump to the top of the block.

32:52 And I really liked that feature.

32:55 For some reason, that's the reason why I didn't like the whitespace thing at first.

32:58 How do I get back?

33:00 But then I just went, "Okay, beginner's mind, just open mind, just embrace it and learn it as a new thing."

33:05 And a week later, I didn't even miss it.

33:09 And of course, the new editors, the newer editors like PyCharm and stuff, at the bottom they have little breadcrumbs of here's the class, here's the function, here's the if, here's the while, whatever.

33:20 You can jump between them, just like you were talking about, but the entire hierarchy of the tokens or whatever.

33:24 - Yeah, I tend to write smaller functions now so it's not as much of a deal.

33:30 - This is probably a good thing that it was hard.

33:33 - I was thinking that if you needed the editor to help you find the top of the block, it must be pretty far away.

33:39 It's 4,000 lines. I hate scrolling so much. These functions are hard.

33:42 Ah, yeah.

33:43 How interesting. All right. Guido, do you have one more you want to share with us?

33:46 Well, yeah, you gave me some homework. I didn't really do it, but there's like...

33:50 And of course, this has to do with parsing.

33:53 And so this may be a fairly esoteric library, but if you're writing a program that sort of does some manipulation of your code, and maybe it converts 4-space indents to 2-space indents or 3-space indents or whatever.

34:09 Or maybe you're writing something like Black, which is the sort of Python code reformatting tool, but you don't like the way Black handles certain things.

34:20 Or maybe you're writing some other thing that does analysis of source code.

34:26 Maybe you're writing a linter.

34:29 There are a couple of tools that you can use, and it turns out that one of them is in the standard library.

34:35 There's something called lib2to3, which is a little hard to pronounce.

34:40 It has the digit two and then the word T O and then the digit three in the name.

34:45 That is tricky.

34:46 That is something I wrote probably over 15 years ago, or at least the core of it.

34:52 Which is yet another LL(1) parser, but this one's written in Python rather than in C like the original one.

35:00 And actually Black ended up using lib2to3, except I think Lukasz had one issue that he couldn't figure out how to do with Black.

35:12 And so he ended up vendoring a copy of lib2to3 and then butchering it a little bit.

35:17 Which is how these things happen. I mean if you look at what pip vendors, that's pretty scary.

35:22 But there are good reasons for that too.

35:24 But if you're writing your own, you should probably not use lib2to3, and not just because it's going to go out of style once the PEG parser arrives.

35:34 There are much better tools, and the one that I discovered a few months ago is actually written by some folks at Facebook mostly.

35:44 It's called LibCST.

35:46 And they have a unique capitalization.

35:50 It's a capital L and then lowercase ib, and then CST is all uppercase.

35:56 And so it's a library for manipulating concrete syntax trees.

36:00 And like lib2to3, it actually shares some code with lib2to3.

36:06 I think underneath is a parsing library called Parso, which itself is a butchered version of lib2to3.

36:15 At least that's how it started.

36:17 These tools are things that can parse Python code, but they produce a syntax tree that is the opposite of an abstract syntax tree.

36:26 It's a very concrete syntax tree, and that means that every space, every comment, every bit of indentation is preserved, or at least can be recovered from the information in that syntax tree.

36:43 And contrast that with the typical abstract syntax tree, which in the end doesn't even remember where the parentheses are.
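
The point about parentheses can be checked with the standard library's `ast` module. This is a small sketch with made-up input strings: two sources that differ only in parentheses, spacing, and a comment produce the same abstract tree.

```python
import ast

# Same statement, once with redundant parentheses and a trailing comment.
with_parens = ast.dump(ast.parse("x = (1 + 2)  # a comment"))
without_parens = ast.dump(ast.parse("x = 1 + 2"))

# The abstract tree keeps neither the parentheses nor the comment.
trees_match = with_parens == without_parens
```

A concrete syntax tree would keep exactly the details that are lost here.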

36:52 - Right, right.

36:53 It just takes us, well, here's some conditional statement, and here's the two things we're testing.

36:57 - Yeah. - Right?

36:58 So this sounds much more useful if you want to do like a code analysis type of thing to say this thing you're doing here, you should do it in this other way or transform it over, but kind of preserve things like comments and style.

37:09 - Yeah.

37:10 And so LibCST has a really sort of solid underlying model, and they thought a lot about the various transformations they want to apply. Because the typical way these tools work, and lib2to3 itself started out that way as well, is you read your source code using this customized parser.

37:33 It gives you a concrete syntax tree.

37:36 Then in that syntax tree, you're actually going to make changes.

37:40 You're going to systematically rename a parameter or move things around or insert.

37:47 In the 2to3 world, of course, it's used to turn things like iteritems into items and iterkeys into keys.

37:54 And you can make those kinds of changes.

37:57 And so LibCST also supports that, but it sort of has a slightly better API because 15 years ago when I started lib2to3, I didn't realize what an important tool it was going to be.

38:10 And some of the way the whitespace is attached to nodes is exactly backwards from the way that is the most convenient to think about it and work with it.
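
The read, modify, write-back loop described here can be approximated with nothing but the standard library's `tokenize` module. This is only a token-level sketch, not how lib2to3 or LibCST work internally, and the helper name `rename_method` is hypothetical. It just shows that a rename can leave comments and spacing alone.

```python
import io
import tokenize


def rename_method(source: str, old: str, new: str) -> str:
    """Rename every NAME token equal to `old`, keeping comments and layout.

    Hypothetical helper: it works on tokens, not on a syntax tree, so it
    would also rename an unrelated variable that happens to be called `old`.
    """
    tokens = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME and tok.string == old:
            # Keep the original positions so the surrounding spacing stays.
            tok = tok._replace(string=new)
        tokens.append(tok)
    return tokenize.untokenize(tokens)


source = (
    "for k, v in d.iteritems():  # keep this comment\n"
    "    print(k, v)\n"
)
result = rename_method(source, "iteritems", "items")
```

A real tool would work on a tree so it can tell a method call from any other use of the same name, which is the gap that libraries like LibCST fill.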

38:20 All right, cool.

38:21 Well, this sounds like it'd be really helpful for people building tools like Black or looking at code analysis and stuff.

38:26 Right.

38:27 Lukasz had a, I think it was the 2019 talk, PyCon talk, where he described how Black uses both concrete syntax trees and the abstract syntax tree.

38:37 It's a pretty fascinating talk for a very low level depth into these concepts.

38:43 It wasn't until I watched that talk that I realized that Black compares the before and after abstract syntax tree to make sure that your code is guaranteed to run the same.

38:54 So you don't really have to test for that. He's already testing for it.

38:58 That's pretty interesting.

38:59 Yeah, that's pretty cool.

39:00 That is a very neat feature.

39:02 And it's actually an important trick in general for people who are doing transformations to have some abstract way of double checking that your transformation left things in a decent state.
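
One way to sketch that kind of double check with the standard library is to compare `ast.dump` output before and after a rewrite. This is not Black's actual implementation; `same_ast` and the sample strings are assumptions made for illustration.

```python
import ast


def same_ast(before: str, after: str) -> bool:
    """Return True if both sources parse to the same abstract syntax tree."""
    return ast.dump(ast.parse(before)) == ast.dump(ast.parse(after))


original = "x = {  'a':1,\n 'b':2 }\n"
reformatted = 'x = {"a": 1, "b": 2}\n'   # only layout and quoting changed
broken = 'x = {"a": 1, "b": 3}\n'        # a value changed by mistake

formatting_is_safe = same_ast(original, reformatted)
bug_is_caught = not same_ast(original, broken)
```

A formatter that runs a check like this after every rewrite can refuse to write output whose tree no longer matches the input.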

39:16 Yeah, it's cool.

39:17 Yeah, very cool.

39:18 All right, well, thanks for LibCST, Guido. That's a great one.

39:20 Now, that's it for our main topics. So just really quick things at the end that I just want to throw out there for people.

39:25 One, Adam, who goes by codependent coder on Twitter, sent a message over and said, "Hey, Django no longer supports Python 2 at all," which is pretty awesome because 1.11 has left long-term support, leaving only 2.2.12 onward, which has only Python 3 support.

39:43 So yay for modern Python making its way through.

39:46 That's good.

39:47 And then last time we talked about 90% of coding is Googling, and that's okay, or it's not.

39:53 And we didn't really feel like that was our experience, right, as people have been around for a while.

39:58 But I got to tell you, this last week, I've been doing nothing but Pandas, Altair visualization, Jupyter Notebook, and graphics, because I'm building a whole set of dashboards for the Talk Python courses and whatnot.

40:14 Basically, the dashboards that I should have built a while ago.

40:19 I Googled a lot, a whole lot.

40:20 But that's the thing, it was like a two or three day blip of wow, I'm Googling 25, 30% of my time because I don't know anything about these things and how do I get this thing to line up with that bar?

40:30 But now I'm back to just kind of mostly not doing that anymore, even after a few days.

40:33 So I think generally what we said is true, but I do think there's like these blips of like, wow, I'm diving into something new.

40:38 It's like mad search scrambling, but then I'm back to sort of using like more memory coding.

40:44 I don't know what you call not Google coding.

40:46 - You got to understand what you're doing.

40:48 And that means you can't just Google for examples and copy and paste them in because then you combine the examples and you have no idea what you're doing and of course it doesn't work.

41:00 - At best it's frustrating, right?

41:02 You're like, this worked, that worked, but together they don't work and you just don't even know why, right?

41:05 Yeah, so for sure.

41:07 Yeah, so anyway, I don't know.

41:09 A follow up on our conversation last week, Brian, what do you got to throw out there for everyone?

41:12 - I'm going to say this on this show just to make sure I do it.

41:15 There's like three days left for me to record my talk.

41:18 - Yeah, this is like forcing yourself to commit to it so you're going to do it, okay.

41:22 - Yes, definitely.

41:24 So PyCon talk, I really do want to get it online.

41:26 It's important stuff.

41:28 It's about parameterization.

41:29 I talked a couple episodes ago about having trouble switching back and forth at home with all this working from home stuff between Mac and Windows.

41:37 I finally figured out the whole using command and control.

41:41 So thank you to everybody, but apparently there's this really simple thing.

41:44 Apple lets you just swap them on a keyboard.

41:47 So that's what I'm doing and it works great.

41:50 And then also I had promised that I was going to have my cards project be able to work and publish to PyPI or the test PyPI. It doesn't work with setuptools_scm because I'm using Flit.

42:02 So if somebody's got a way to figure out how to just somehow change the version string or bump that every time you merge or something like that, that'd be great, but otherwise, right now I don't think there's a way to automatically push to PyPI if you're using Flit.

42:19 - Yeah, because it says that one's already uploaded.

42:21 Maybe there's a GitHub action that will just randomize that or something.

42:24 Because the version is embedded in the source code.

42:27 And the trick that people are using with setuptools is the version is based on the version in GitHub.

42:33 And you can't do that with Flit.

42:35 At least I haven't figured it out.

42:37 But that's okay. I'll probably do something else.

42:40 That's my extras.

42:41 Guido, anything else?

42:42 Even though I said it's hard to imagine PyCon going online, it actually is going online.

42:48 At least some of it is.

42:50 The first talk, by the conference chair, Emily Morehouse, has been posted and many more will follow.

42:58 Yeah, her welcome was really nice.

42:59 The other thing, and as you mentioned, Django no longer supports Python 2 at all.

43:05 Well, that's just fine because the very last release of Python 2, 2.7.18, was released a few days ago.

43:13 Yeah, that's great. That must be kind of a load off of your shoulders to finally have that in the rearview mirror.

43:18 I'm very happy and I'm sad of course that we can't have an absolutely wild and crazy party in Pittsburgh like we were planning.

43:26 Yeah, a big celebration on Zoom is just not the same.

43:30 Just have to have a bigger one next year.

43:32 That's one I don't know how to pull off.

43:34 That's really good.

43:35 All right, you guys ready for a really quick joke?

43:38 All right.

43:39 So here's a quick joke sent to us by Derek Chambers.

43:43 And he may have even made this up for us.

43:46 And this goes back to the sub-interpreters and the multiple GILs and all that.

43:51 So you guys know how you can borrow money concurrently?

43:55 With async IOUs.

43:56 That's a terrible joke.

43:57 That's a bad joke.

43:58 Oh, that is very groan-worthy.

44:04 Most of our jokes actually are around here, but that's how it goes.

44:07 And keep them coming.

44:08 Keep sending us your bad jokes.

44:11 That's right.

44:12 That's right.

44:13 Python dad jokes, that should be a whole separate category.

44:15 They absolutely should.

44:17 Well, Guido, it was really an honor to have you on the show.

44:19 Thanks for coming and sharing your perspective on all this.

44:21 Glad to be back.

44:23 Yeah, and Brian, thanks as always. Good to be here with you.

44:25 Cheers.

44:27 Bye, everyone.

44:29 Thanks, both of you.

44:31 Thank you for listening to Python Bytes.

44:33 Follow the show on Twitter via @PythonBytes.

44:35 That's Python Bytes as in B-Y-T-E-S.

44:37 And get the full show notes at PythonBytes.fm.

44:39 If you have a news item you want featured, just visit PythonBytes.fm and send it our way.

44:43 We're always on the lookout for sharing something cool. On behalf of myself and Brian Okken, this is Michael Kennedy. Thank you for listening and sharing this podcast with your friends and colleagues.
