Transcript #179: Guido van Rossum drops in on Python Bytes
00:00 Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to
00:04 your earbuds. This is episode 179, recorded April 21st, 2020. I'm Michael Kennedy.
00:10 And I'm Brian Okken.
00:11 And Brian, I'm super honored to have Guido van Rossum on the show. Guido, welcome to Python Bytes.
00:17 Hello, glad to be here.
00:18 Yeah, it's really great to have you here. It's going to be wonderful to hear your opinion,
00:23 your perspective on some of these things that we're sharing this week. So welcome to the show.
00:26 All right. And this episode is brought to you by Datadog. Check them out at
00:30 pythonbytes.fm/Datadog. More on that later. Brian, what do you got? What's up first?
00:34 Well, I've been thinking a lot about community lately, actually. And one of the things that
00:38 came out recently, this was a little bit ago, but it's still fairly new, is the Django project
00:44 announced a new governance model. It's been going on. I mean, I think they've been working on it for
00:49 a couple of years since at least 2018. Some of the specifics are interesting. They had like a core
00:56 team, and they dissolved the core team, and they mainly kind of have a new role called a merger,
01:01 who has commit access but only merges pull requests. So most of the changes
01:07 could happen in the pull requests and the discussion that happens there. There is a technical board also
01:14 that was kept to kind of make some technical decisions if it's necessary, but apparently it hasn't
01:20 been necessary for a while. I think it's interesting that they switched the governance model midstream.
01:26 And then also the rationale around it, I think is interesting. And the rationale is around trying
01:32 to get more people contributing to it. So they had like their core team and that hadn't really changed
01:38 for a long time. And people that were set up as core people really weren't contributing much anymore.
01:44 Anyway, I just thought that was interesting that they, the reason around changing the governance was
01:50 around trying to get new people in.
01:52 Yeah, I think that's a great idea because, you know, Django has been around for a long time and it's a
01:57 fairly stable project. So I think it's kind of hard to jump in. I mean, it's a little bit like Python
02:01 itself, Guido.
02:02 Right. I'm thinking that sort of maybe five years in the future, Python could consider a similar move
02:09 or maybe we'll know that this was not the right move by then from Django's experience. And of course,
02:15 the situation for the two projects is somewhat different, but we definitely also feel the pain
02:22 of sort of not getting enough new contributors. But we only fairly recently, like early last year,
02:30 we changed our governance structure completely. So it's a little early to start considering changing
02:36 it again, probably.
02:37 Right. Of course, we're just starting to see the outcome of the decisions and the releases that
02:42 are actually going through that model, right?
02:44 Yeah. I've been working with the steering council model for say 16 months now.
02:49 Yeah, I guess so. 3.8 definitely came out under that model.
02:52 Yeah.
02:53 The thing that Python did, I think is kind of interesting. And I don't know if you started it,
02:57 but the notion of having more core mentors to try to mentor new core developers, I think that's an
03:04 interesting thing that you can't really like make people be mentors, but that's an interesting way to
03:10 get more core developers on.
03:11 We have a few people who are very active as mentors, in addition to being active as core devs.
03:17 And it really does make a difference.
03:19 Yeah.
03:20 Yeah.
03:20 We don't have enough mentors to mentor everyone who wants to become a core dev.
03:27 Yeah. Yeah. So I think that's really great. I mean, it's one thing to write web apps in Django
03:31 or to write Python code. It's an entirely different thing to write Django or write Python, right?
03:38 It's a very different skill set. And so I think that mentor model is really a great bridge.
03:43 Yeah. So speaking of things, I think are going to be really helpful, but in a much simpler way,
03:48 this is sort of a data science topic for everyone out there. And one of the problems in data science
03:55 is you can end up with very large data sets, complicated data, but every now and then there
04:01 might be a none where you expected an integer, or there might be a empty string where you expected
04:07 a date or something like that. And you need to understand how complete that data is, and where
04:13 it is more or less complete, and so on. So there's this
04:19 cool project called missingno, which I think is "missing number" shortened. And the idea is
04:25 it's a missing data visualization module for Python. And you two can see the picture in the show notes and
04:31 folks who listen to this, they can go back and see it in the show notes as well. But it's a really cool
04:37 and simple little library, but it's not just show me a quick graph. It actually does some pretty
04:41 powerful analysis. So what you can do is if you've got like some pandas data, you can just go to it and
04:47 say msno.matrix and give it a sample of your data. And it gives you these really cool graphs of like
04:53 vertical, either black or white bars or bars that are like kind of zebra stripe, depending on whether or not
04:59 there's missing data. It shows you which parts, which columns are more complete or incomplete.
05:05 And it even has a little graph on the side that tells you the likelihood or the correlation of a row being
05:12 incomplete, right? Like you might have a missing address on one line, but in another one has a missing
05:17 phone number, or it could be more likely that those are both missing at the same time. There's like a
05:21 little graph to visualize that kind of stuff. What do you guys think?
05:24 I think it's very cool. I'm not a data anything person myself. So yeah, to indicate how much I am
05:32 not in the target audience for this module. The whole time I read your modules, I had the grouping
05:39 wrong. I thought it was the missing data visualization module. And I thought, well, that's kind of cool that
05:47 that they're sort of, they say there's something missing. And this clearly is the one that now it's
05:53 turned up, but it's actually visualizing missing data, which actually I understand what that is. I've seen
05:59 a spreadsheet or two, and I can actually even understand the little example chart that you pasted into
06:07 the notes without understanding anything else around it.
06:12 Yeah, it's so wonderful because that's why I actually think I like this and I chose it as you
06:16 could just look at that picture and go, oh, I basically get a sense for what this data is like.
06:20 It's complete. It's not complete. It's mostly incomplete on this column or whatever. And yeah,
06:25 it's really nice. And I suspect you could, if you had data, say, in like a database or a file or
06:31 something, you could probably just read that into a pandas data frame and then throw it out here and
06:35 visualize like database missing data or file missing data or whatever. But it's really nice.
06:39 Yeah. For large data sets, one of the things you got to do is to decide when you're cleaning it up,
06:43 what to do with the missing data. And there's, I mean, there's some nones or whatever. There's some
06:48 strategies to either fill it in with interpolated data or something or, or just throw those rows
06:56 completely away. But you, I mean, you don't really know how much data you're throwing away if you,
07:00 without visualizing it. So this is pretty cool. I think this is great.
07:04 Yeah. And it has other visualizations as well. It has heat maps, which are like correlations,
07:08 you know, so like address and phone number correlated kind of things I was talking about.
07:13 It has bar charts and the most interesting or unique visualization is the dendrogram,
07:19 which I had never heard of, but this is the hierarchical clustering algorithm from SciPy actually.
07:23 And it creates this kind of like hierarchical tree of relationships of missing data.
07:30 There's just, if you are worried about like cleaning up data or stuff like that,
07:34 or visualizing how good your data is, you could throw it at this real quick and get some great
07:38 answers.
07:38 Yeah, that's cool.
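To make the idea concrete, here is a minimal sketch of the kind of summary a missing-data visualization is built on. It assumes made-up records and uses only the standard library; the helper names (`nullity_matrix`, `completeness`, `missing_together`) are hypothetical and are not part of missingno. With missingno itself, the call mentioned above is `msno.matrix(...)` on a pandas DataFrame, which draws the chart instead of printing text.

```python
# Toy records; None marks a missing value (all data here is made up).
rows = [
    {"name": "Ada", "address": "12 Elm St", "phone": "555-0100"},
    {"name": "Bob", "address": None, "phone": None},
    {"name": "Cy", "address": "9 Oak Ave", "phone": None},
    {"name": None, "address": None, "phone": None},
]
columns = ["name", "address", "phone"]


def nullity_matrix(rows, columns):
    """One text line per row: '#' where a value is present, '.' where it is missing."""
    return [
        "".join("#" if row[col] is not None else "." for col in columns)
        for row in rows
    ]


def completeness(rows, columns):
    """Fraction of non-missing values in each column."""
    return {
        col: sum(row[col] is not None for row in rows) / len(rows)
        for col in columns
    }


def missing_together(rows, a, b):
    """Of the rows where column a is missing, the fraction where b is missing too."""
    a_missing = [row for row in rows if row[a] is None]
    if not a_missing:
        return 0.0
    return sum(row[b] is None for row in a_missing) / len(a_missing)


for line in nullity_matrix(rows, columns):
    print(line)
print(completeness(rows, columns))
print(missing_together(rows, "address", "phone"))
```

The last function is a crude stand-in for the "address and phone number are missing together" question; a real tool would compute a proper correlation across all column pairs.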
07:39 Yeah. All right. Well, Guido, you have been busy with the language summit recently, right?
07:44 What's the news there?
07:45 Yes. Well, normally the language summit basically is an in-person meeting where about 50 people who are
07:52 mostly, but not exclusively core devs get together a day or two before the actual Python conference.
07:59 Since the conference was canceled.
08:01 This would have been in Pittsburgh, right?
08:03 It would have been in Pittsburgh this year, right?
08:06 Obviously the conference was canceled and the language summit was too.
08:10 And then the two organizers thought, well, okay, this sounds like the kind of meeting that we can
08:16 actually try to do on Zoom. You can't have a whole conference on Zoom, but you can probably have a
08:21 meeting with 50 people on Zoom. And they tweaked the format a bit so that, I mean, you can't be on
08:28 Zoom for an entire day. I find Zoom incredibly intense. And after an hour of Zooming,
08:36 I'm usually ready for a break.
08:38 Yeah. All the virtual stuff takes a lot more attention. Yeah.
08:41 Yeah. User interface sucks. Privacy probably sucks, but it clearly serves its purpose. So we had it
08:50 spread over two different days. And then in addition, because nobody was traveling to Pittsburgh,
08:56 we spread it out in time. One day it was really early for me so that we could also have participants from
09:03 Europe. And one day it was really late for me so that we could have some people from Australia join us.
09:10 One of the organizers lives in Poland and he was there till the end on both days. So I didn't, I don't know.
09:18 That's commitment.
09:21 So as usual, the format wasn't actually all that different. It's typically like half hour slots for
09:27 various topics that are important to either get information to core devs and usually also get
09:36 feedback from core devs. And we pretty much stuck to that format. The one big thing that you miss,
09:43 of course, is all the whispering to the guy who was sitting next to you or during the break,
09:49 quickly grabbing three other people and having a little huddle about a topic.
09:53 Yeah. That's what's so powerful about in-person conferences.
09:56 Yeah. We missed the entire hallway track, but it was still good to have sort of short presentations and
10:04 Q&A sessions. And the Q&A sessions actually worked really well. There was a little tool that you can
10:11 use to sort of moderate questions. And Lukasz was like running the moderation tool and nobody was asking
10:19 spam questions. So all he had to do was just click OK for every question, I think.
10:23 Yeah. That tool is much more structured than the chat channel on Zoom could be. And sort of raising
10:31 your hand on Zoom and waving doesn't really work if there are 50 people, because there's no way to see
10:37 more than 16 people or so at a time. Yeah.
10:39 So anyway, the first day, each day, there were like maybe five topics and a few miscellaneous things.
10:47 Shall I just go over each day briefly, see if I can sort of run them all off?
10:53 Yeah, I would say just maybe touch really quickly on just the things that you felt like really might
10:58 make an impact going forward, potentially.
11:01 Just a one-liner: the guy who originally implemented f-strings gave a talk about whether maybe all
11:08 strings should become f-strings. And the general sentiment was that that would have been nice
11:16 in Python 1.0 or so, but there is no way; it would just break too much code.
11:21 It's going to break too much. I totally hear that though, because so often I'm typing in a
11:25 string and I'm like, oh, I need to put a variable here, but I've typed 20 characters in, so I've got
11:29 to go back to the beginning, and not the beginning of the line, because I've got to get
11:33 to the beginning of the string and then go. Maybe we could even put the F at the end. Who knows?
11:37 But yeah, I would love to see it. But I totally understand. You can't do that without
11:41 breaking stuff. There are downsides to automatically doing it, too, because
11:45 curly braces are useful for all sorts of things besides formatting.
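A short illustration of why turning every string into an f-string would break existing code: today, braces in an ordinary literal are plain characters, and lots of code depends on that. This is a sketch with made-up variable names, not an example from the talk.

```python
name = "world"

plain = "Hello, {name}!"        # ordinary literal: braces are just characters
formatted = f"Hello, {name}!"   # f-string: {name} is evaluated right away

# A reusable template relies on the braces staying literal until .format() runs.
template = "Dear {title} {surname},"
letter = template.format(title="Dr.", surname="Okken")

# Literal braces inside an f-string have to be doubled.
json_snippet = f'{{"user": "{name}"}}'

print(plain)
print(formatted)
print(letter)
print(json_snippet)
```

If every literal were implicitly an f-string, `template` would fail at definition time with a `NameError` (no `title` in scope), and any string containing JSON or a regex quantifier like `{2,3}` would change meaning.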
11:50 So that was sort of the opening salvo. Then my two co-conspirators on the PEG parsing project gave a talk about how we're going to hopefully introduce a new parser in Python 3.9.
12:06 And we've been coding for like almost a year now, probably.
12:11 It started out as a little hobby project of mine and gradually became more serious and more people started helping out.
12:18 And the last few months, we've been doing heavy engineering work to actually prepare for the integration.
12:26 But we didn't have steering council approval yet.
12:29 We made it a PEP and we sort of said, well, this is a nice thing, but we're not going to do this unless there is sort of clear consensus or at least general agreement that we are going to do this.
12:43 And so very soon after the summit, the steering council actually had a meeting and approved a bunch of PEPs and ours was one of them.
12:53 And then the last two days, I've been stressing out because we wanted to get the new parser in the alpha 6 release, which is going out tomorrow.
13:01 And so we're now in the last, the very last stretches of preparing for alpha 6.
13:08 And we're just deleting or disabling tests that are still failing that we know how to fix them, but we just don't have the time.
13:16 Right. That's exciting that this project is going to be in there. That's great.
13:18 Yeah. So that's the new parser. And if all goes well, nobody will notice a thing.
13:23 Ideally.
13:24 What are the effects? Is it going to speed things up or make things more maintainable?
13:29 It's going to sort of open up the grammar for future changes to the language that we currently can't do because the old LL(1) parser holds us back.
13:42 Okay.
13:42 That's sort of the main motivation.
13:45 Super.
13:46 There was one interesting talk about something called HPy, which is a proposal for a new, more portable API and in particular focused on other Python implementations besides CPython.
14:02 As you may know, PyPy has been struggling for over a decade with compatibility with extension modules.
14:10 And the HPy proposal is basically instead of pointers to objects, you have handles, which is a pointer to a pointer to an object.
14:19 And there's a whole API around handles that is equivalent to the existing API, but it allows different styles of garbage collection.
14:28 For example, you could implement a garbage collector that moves objects behind your back occasionally.
14:34 Right. You might get a generational compacting garbage collector because you could update the value of the pointer pointer without changing the actual pointer.
14:42 Right.
14:42 Yeah.
14:42 Yeah.
14:43 Yeah.
14:43 Yeah.
14:43 That's actually really exciting.
14:44 Yeah.
14:45 And it's still in early stages, I believe, but it looks pretty promising.
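The handle idea can be sketched with a toy model. This is not HPy's API; the class and method names below are hypothetical, and "addresses" are just integers. The point it illustrates is the one made above: client code holds a handle, the runtime keeps a handle-to-address table, and a compacting collector can move objects by updating that table without the handle ever changing.

```python
class ToyHeap:
    """Toy model: objects live in numbered slots; handles map to slots indirectly."""

    def __init__(self):
        self._slots = {}      # slot address -> object
        self._handles = {}    # handle -> slot address
        self._next_slot = 0
        self._next_handle = 0

    def new(self, obj):
        """Store obj in a fresh slot and return a handle to it."""
        slot = self._next_slot
        self._next_slot += 1
        self._slots[slot] = obj
        handle = self._next_handle
        self._next_handle += 1
        self._handles[handle] = slot
        return handle

    def deref(self, handle):
        """Follow the handle to the object, wherever it currently lives."""
        return self._slots[self._handles[handle]]

    def address_of(self, handle):
        return self._handles[handle]

    def free(self, handle):
        del self._slots[self._handles.pop(handle)]

    def compact(self):
        """Move surviving objects into slots 0..n-1 and repoint their handles."""
        survivors = sorted(self._handles.items(), key=lambda item: item[1])
        new_slots = {}
        for new_addr, (handle, old_addr) in enumerate(survivors):
            new_slots[new_addr] = self._slots[old_addr]
            self._handles[handle] = new_addr
        self._slots = new_slots
        self._next_slot = len(new_slots)


heap = ToyHeap()
a = heap.new("first")
b = heap.new("second")
c = heap.new("third")

heap.free(a)
before = heap.address_of(c)
heap.compact()
after = heap.address_of(c)

print(before, after, heap.deref(c))
```

With raw pointers, the move inside `compact` would leave every holder of the old address dangling; with handles, `c` is still valid after the object has moved.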
14:50 Eric Snow gave a lightning talk about sort of a retrospective of all his work on multi-core support, which is now beginning to conclude.
15:01 Well, maybe it's too soon to call it the conclusion, but we're going to have sub-interpreters with a much better API, either in 3.9 or in 3.10.
15:11 There's a PEP around that, PEP 554, which will definitely be moving forward, but whether it's considered mature enough to land in 3.9 is not entirely clear.
15:23 Yeah.
15:24 Eric's work is very interesting there.
15:25 Yeah.
15:26 Yeah.
15:26 And in 3.10, we will probably have separate GILs per sub-interpreter.
15:31 That is going to be a major new thing.
15:35 Let's see.
15:35 What else do we have?
15:37 Well, so the next day, I gave a talk about the future of typing, which, oh, yeah, there's one detail.
15:43 You might remember that we introduced something called from __future__ import annotations, which made it so that annotations are no longer evaluated at runtime.
15:54 You can still introspect them, but you'll just get the string containing the annotation expression back.
16:01 Well, that's going to be the default in 3.9, most likely.
16:06 There's still a little debate about that, but there was like a two-thirds preference for just making that the default in 3.9.
16:14 And various people argued effectively that nobody should notice any difference.
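Here is a small sketch of what that behavior looks like in practice. The function `scale` is made up for illustration, and the snippet is compiled from a string only so that the `from __future__` line can sit at the top of its own module; in a real file you would simply put the import first.

```python
import typing

# The future import must be the first statement of a module, so the
# demonstration module is kept in a string and executed separately.
SOURCE = '''
from __future__ import annotations

def scale(value: float, factor: int) -> float:
    return value * factor
'''

namespace = {}
exec(compile(SOURCE, "<demo>", "exec"), namespace)
scale = namespace["scale"]

# With the future import, annotations are stored as strings, not evaluated.
raw = dict(scale.__annotations__)

# typing.get_type_hints() evaluates those strings back into real objects.
resolved = typing.get_type_hints(scale)

print(raw)
print(resolved)
```

Code that only reads annotations through `typing.get_type_hints` sees no difference; code that inspects `__annotations__` directly and expects real classes is the kind that would notice the change.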
16:19 I'm really excited or happy to have typing in the language.
16:23 It makes such a difference for the right use case, you know, on defining the boundary of APIs or making the editor understand something better when it otherwise wouldn't.
16:33 If you're maintaining tens of thousands of lines of Python code or more, type annotations really make a difference.
16:40 Yeah, for sure.
16:41 I still don't recommend teaching them to beginners, though.
16:44 Oh, really?
16:45 Okay.
16:45 It depends on what kind of beginners you have.
16:47 If they're sort of recuperating Java programmers, maybe you should introduce them.
16:52 But if they're like actually blank slate, this is the first time they're programming ever.
16:57 I wouldn't bother them with annotations.
17:00 Yeah.
17:00 I kind of agree with that.
17:02 Yeah.
17:02 What's sort of still missing for the data science world is extensions to the type system for NumPy and Pandas and stuff like that.
17:13 There is a design, but there are not enough people with available time to actually implement the design.
17:22 And I'm sure that when you're halfway through implementation, all sorts of interesting issues with the design will crop up.
17:30 So the design is not final until it's been implemented.
17:33 Okay.
17:34 Last two topics.
17:36 Zac Hatfield-Dodds gave a very good talk about what he calls property-based testing,
17:44 which really is about a tool named Hypothesis that introduces a testing approach that I think was first developed in academia for Haskell and that works in a completely different way than your typical unit test-based testing.
18:02 Right.
18:02 The tool decides, right, instead of you writing examples.
18:05 The tool generates test cases, and I've never played with it myself, but the talk sort of made me very excited to play around with it more.
18:16 And it actually, even though it's a very different approach than unittest- or pytest-based testing, it will still integrate with that.
18:25 I mean, you can write a unit test and then put some decorator on top of it that produces test data.
18:32 And Hypothesis has all kinds of really advanced stuff for exploring enormous spaces of possible input data and quickly finding bugs.
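To show the shape of a property-based test without depending on any third-party package, here is a hand-rolled sketch using only the standard library. It is not Hypothesis: the real tool provides a `@given` decorator and input "strategies", and adds things this sketch lacks, such as shrinking a failing input to a minimal one. The run-length encoder and all helper names below are made up for illustration.

```python
import random


def rle_encode(text):
    """Run-length encode a string into (character, count) pairs."""
    runs = []
    for ch in text:
        if runs and runs[-1][0] == ch:
            runs[-1] = (ch, runs[-1][1] + 1)
        else:
            runs.append((ch, 1))
    return runs


def rle_decode(runs):
    """Inverse of rle_encode."""
    return "".join(ch * count for ch, count in runs)


def random_text(rng, max_len=20, alphabet="ab "):
    """Small alphabet on purpose, so repeated characters are common."""
    return "".join(rng.choice(alphabet) for _ in range(rng.randint(0, max_len)))


def check_roundtrip(trials=500, seed=0):
    """Property: decoding an encoded string returns the original, for any input."""
    rng = random.Random(seed)
    for _ in range(trials):
        text = random_text(rng)
        assert rle_decode(rle_encode(text)) == text, f"failed for {text!r}"
    return trials


print(check_roundtrip())
```

The difference from an example-based test is that you state a rule that must hold for every input and let the tool pick the inputs, rather than listing a handful of input and output pairs by hand.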
18:45 Do you think we'll get to a place where we are able to use Hypothesis for some of the testing for the standard library?
18:52 That was one of the propositions that Zac made.
18:56 I think it's still early for that.
18:59 Okay.
19:00 I think it's much easier to introduce Hypothesis in sort of a new project where you haven't yet written all the code and all the tests than it is to retrofit it in a large, mature, or maybe even somewhat dementing project.
19:18 Yeah.
19:18 I think it'll be a while before we'll have Hypothesis-based testing for the standard library, just like it'll be a while before we'll have annotations in the standard library,
19:29 rather than annotations sort of separate from the standard library.
19:34 The last talk I want to highlight, and then I'm really done with this, is also a very good talk by Russell Keith-Magee about the state of BeeWare and Python for mobile.
19:47 And one of his suggestions was that we adopt some of his mega patches that he's been maintaining for several Python releases that would make Python at least compile out of the box or nearly out of the box for the important mobile platforms.
20:06 That'd be cool.
20:06 Yeah, it'd be so wonderful to have Python as an option for mobile.
20:10 It really would bust open the doors and create even more growth.
20:13 Many people believe that sort of mobile platforms are obviously continuing to grow in importance and to grow in power.
20:22 And we'd be crazy if we didn't support Python on those.
20:26 And it may be very important for Python's very survival.
20:30 Yeah.
20:30 Yeah.
20:31 I saw the Black Swan talk that Russell Keith-Magee gave, and it was compelling.
20:34 He is an amazing speaker, for sure.
20:36 Yeah.
20:37 Yeah, yeah.
20:37 That's what I have.
20:38 Great.
20:38 Thank you so much for that insight.
20:40 That was awesome.
20:41 A lot of people don't get to see the behind the scenes.
20:44 They just see what's announced when it comes out, right?
20:45 Mm-hmm.
20:46 Before we move on, let me tell you about our sponsor, Datadog.
20:48 This episode is brought to you by Datadog.
20:51 So let me ask you a question.
20:52 Do you have an app in production that's slower than you like?
20:55 Is its performance all over the place, sometimes fast, sometimes slow?
20:58 Now, here's the important question.
21:00 Do you know why?
21:01 With Datadog, you will.
21:02 You can troubleshoot your app's performance with Datadog's end-to-end tracing, use detailed flame graphs,
21:08 identify bottlenecks and latency in that finicky app of yours.
21:10 So be the hero that got the app back on track at your company.
21:13 Get started with a free trial over at pythonbytes.fm/Datadog.
21:17 Get a cool t-shirt as well.
21:19 Brian, you've got another one that kind of ties into your first one, right?
21:22 But it's sort of the other side of the coin, maybe?
21:24 I don't know what's been happening in the Python world that you sort of orbit in that might make
21:29 you think about these things, but tell us about it.
21:30 No, I've just been thinking about community and codes of conduct and enforcement for codes of
21:36 conduct.
21:37 No reason, really.
21:38 Just kind of an interesting topic.
21:40 One of the things I've been thinking about is, especially when researching this, the codes
21:44 of conduct and enforcement of it and how we treat people.
21:48 I first thought it was really important for open source projects, and it definitely is because
21:52 people have the option to just leave and get out of the project.
21:56 So you really want to treat people well so they stick around and have it be welcoming to
22:01 other people.
22:02 But I don't think industry is really that different.
22:04 I think that people have the ability to just get another job or work on a different project.
22:09 So I think these are important for industry as well.
22:12 I took a look at two sets of codes of conduct and the enforcement of those.
22:16 So the PSF has a code of conduct.
22:19 I'm not going to read them all out, but there's things like being open and being friendly.
22:23 And in there, there's a list of inappropriate behaviors as well that's covered.
22:28 Now, also the Django code of conduct, they also have all of these when you read them, there
22:33 are differences, but when you read them, they kind of sound the same.
22:36 One of the things they highlight in the Django one is to be careful with your choice of
22:44 words, and they include examples of harassment, speech, and exclusionary behavior
22:50 that's not appropriate.
22:51 One of the big differences I saw was the enforcement.
22:54 So the PSF is a two-thirds majority vote enforcement sort of thing to make sure if something happens,
23:02 like if they want to kick somebody out or put them on probation or something.
23:05 I think that's really important because if you require 100% majority and somebody who is
23:10 on the team that decides is potentially part of the problem, then what do you do, right?
23:15 It's really tricky.
23:17 I mean, if people are just going to abandon a project, right, you would rather have just a
23:22 strong majority make a decision.
23:24 I also think that PSF has probably got a larger, possibly has a larger working group on this
23:29 and is more, I guess, maybe harder to get a hold of people.
23:33 Maybe it's easier to get a two-thirds than maybe you can't even reach all 100% of the group.
23:38 But anyway, the other interesting difference is the PSF code of conduct seems to, I know it
23:45 does cover online interaction as well as events like the conferences and meetups and stuff.
23:52 But I possibly, at least I think that maybe its focus might be more on events, whereas the Django code of conduct is specifically targeted towards online interactions.
24:03 I would say for the PSF that sort of historically, events were the first place where codes of conduct were introduced.
24:13 But we've been using them for online forums more and more in the past few years.
24:19 Okay.
24:20 One of the interesting things with the Django one is that a single person on the committee can act without collaborating with anybody else.
24:29 If it's an ongoing problem or if there's a threat involved or something, they still have to go through the process of notifying everybody else.
24:38 But there is an interesting thing that one person on the committee can intervene right away.
24:42 I'm not saying one is better than the other. I just think it's interesting, and I think it's important for new projects to think about not just their code of conduct, but how they're going to enforce it.
24:53 And what the timeline is.
24:54 So the Django one also includes some timelines, which is interesting.
24:57 And I would really like to make sure that projects kind of practice, maybe figure out what they're going to do if they need to enact one of these things without, you know, before it becomes a problem, they know what they're going to do.
25:11 Yeah, there's a lot of stuff going on with some projects out there.
25:14 So having a couple of examples and side-by-side comparisons, I think is great.
25:18 I was interested to find out our meetup, like the Python meetup that we started, which is on hold right now, unfortunately, because of the virus and quarantine and stuff.
25:26 But because we were getting support from the Python Software Foundation to help pay for the meetup fees and stuff, we had to list a code of conduct on our meetup page and stuff like that.
25:37 Yeah, that makes a lot of sense, but I didn't realize that.
25:39 Yeah, yeah.
25:40 The PSF has been doing that for a few years now.
25:43 Yeah, that's really great.
25:44 All right, this next one I want to cover.
25:45 It goes back a ways, but I think it's really fun.
25:48 And it's something that also, I think, ties together well with our special guest here.
25:53 And this is an article about myths about indentation.
25:57 And Guido, I picked this one because you were talking about this on Twitter just the other day.
26:01 What was the motivation to throw that out there?
26:03 That is a good question.
26:04 I was just going to volunteer the answer because apparently I had a link to that.
26:10 I had a link to that article on my homepage in some odd corner.
26:14 And I have a very, very sort of ready old homepage.
26:18 I've moved it to GitHub Pages, but it looks like web 1.0.
26:22 And because it really is, I just added raw HTML.
26:25 Blinden, right?
26:26 With Netscape, huh?
26:26 So someone reported to me a broken link, which happens like, I don't know, once every four years or so.
26:35 Someone reported a broken link.
26:38 Oh, wait, it wasn't even on my homepage.
26:40 It was on an old blog that I can no longer edit at artima.com.
26:44 I'm very glad that that blog is still online.
26:46 But so because I got the report of the broken link, I decided, oh, I'm sure I can still find on archive.org where that link used to point.
26:57 And sure enough, it was there.
26:59 And I thought, oh, that's actually still a neat little article.
27:02 So I thought, okay, tweet of the day or tweet of the week.
27:05 Yeah, I agree.
27:06 And I think it's interesting as well.
27:07 And just to give you a sense of why it might have disappeared, it was one of those types of sites where the domain or the URL included a tilde username path, like, you know, like used to get in university or whatever way back when.
27:20 So anyway, this one is myths about indentation for Python.
27:24 And for people who come from a C-oriented language, I think Python could come across a little bit funky.
27:32 I actually want to share a little story of just sort of my journey with it.
27:36 And how I came to love this.
27:38 But I think this is really interesting for people having the debate about is significant white space useful?
27:43 Is it weird?
27:44 Is it good?
27:44 I did a ton of C++ and then C# development.
27:48 So it was all, and then JavaScript development.
27:49 It was all about the curly brace languages, lots of symbols.
27:52 And then I came to learn Python.
27:54 And I loved Python right away.
27:56 But it was weird to me.
27:57 I felt kind of naked.
27:58 Like if I'd write an if statement, I'm like, I need some little parentheses to kind of hold the code in place.
28:03 And why don't they need to be there?
28:04 And I need a curly brace to like say when this block of code is done and whatnot.
28:08 It just took a little bit of getting used to.
28:10 But I knew that it was the right thing for me.
28:12 Because when I went back to work on some older projects, I'm like, why are there symbols everywhere?
28:17 What is all this stuff I have to keep typing?
28:19 This is like a broken language.
28:21 And it just took a couple of weeks for me to make that switch, to feel like it was broken to go back to work in languages
28:26 I'd been using for like 10 years.
28:28 So well done with the white space, Guido.
28:30 Thanks.
28:31 Yeah.
28:31 But so let's cover some of the things mentioned really quick in the article.
28:34 One is that white space is significant in Python source code.
28:38 And actually, no, not in general is the answer.
28:41 It's significant on the left.
28:44 Right.
28:46 So as much as you indent stuff, that really means things.
28:49 But between variables, like whether you have like a equals seven or a space equals space seven, doesn't matter.
28:56 You can have tons of spaces in there.
28:57 Right.
28:57 Like in any other language, spaces kind of don't matter except for on the left.
29:01 So that's cool.
29:03 And also the amount of indentation doesn't really matter.
29:06 Right.
29:06 You could have five spaces for any code suite that you want, or you could have 18 or you could go with a standard four.
29:13 I recommend the four, but you know.
29:14 And then also, if you have something that defines like a list comprehension or an array creation or a dictionary, then all of a sudden the spacing doesn't matter anymore.
29:25 Right.
29:25 As soon as you have like an open square bracket and then you have a bunch of stuff and then close square bracket spacing doesn't matter in there.
29:30 So I think this is interesting to think about as folks debate that maybe within their teams.
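The three claims above (spacing around operators is free, the depth of indentation is free as long as it is consistent within a block, and spacing inside brackets is free) can be checked directly. This is a small sketch; the helper `run` is hypothetical and just executes a snippet and reports the names it defined.

```python
def run(source):
    """Execute a snippet and return the names it defined."""
    namespace = {}
    exec(source, namespace)
    namespace.pop("__builtins__", None)
    return namespace


# Spaces between tokens do not matter.
tight = run("a=7")
loose = run("a    =    7")

# The amount of indentation does not matter, only that a block is consistent.
two_spaces = run("if True:\n  b = 1\n")
eight_spaces = run("if True:\n        b = 1\n")

# Inside brackets, line breaks and indentation are ignored entirely.
one_line = run("c = [1, 2, 3]")
spread_out = run("c = [\n        1,\n  2,\n              3,\n]")

print(tight == loose, two_spaces == eight_spaces, one_line == spread_out)
```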
29:35 It also, you could say it forces you to use a certain indentation style.
29:39 Well, yes and no.
29:40 If you want to write a single statement per line, then yeah, you can. There's a cool example that they gave in the article, which is like: if one plus one equals two, then new line print foo, new line print bar, new line X equals 42.
29:54 You can also put them all on one line with semicolons.
29:57 If you're really missing your semicolons from your language, you could do that.
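The example being read aloud looks roughly like this; the second variable is named `y` here only so both forms can sit in one file:

```python
# One statement per line, with an indented suite.
if 1 + 1 == 2:
    print("foo")
    print("bar")
    x = 42

# The same suite on one line, with statements separated by semicolons.
if 1 + 1 == 2: print("foo"); print("bar"); y = 42
```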
30:00 The thing that's interesting here, I think this is probably the most significant part of this article or this write-up is if you look at it, it looks right.
30:07 And when it gets parsed, it is right.
30:09 There's an example of some C code that looks visually wrong because it's indented differently, but it's going to parse.
30:18 But the way you see it when you read it is not what's actually happening.
30:23 And I think there was a problem like this.
30:25 Well, I think it was in some, you know, Objective-C or something with Apple in there.
30:31 It was really bad.
30:32 There was an infamous Apple vulnerability.
30:35 I think it might even have been on the iPhone where someone had added a second statement to a block, but it wasn't a block because there were no curlies.
30:44 Right.
30:45 Then it started out with a single conditional line, like if something indent, do the thing.
30:51 And then they just indented, but they didn't put the curly braces in.
30:54 And it was, yeah, it took so long for people to find it because visually it looked like what Python would actually mean.
31:01 Right.
31:01 It looked like those two things are part of the if block, but because the white space didn't matter, it actually didn't.
31:06 And so that's really interesting.
31:08 I'm not going to go through everything.
31:09 I'll put it in the show notes.
31:10 But another one that I thought is like, I just don't like it.
31:13 And that's fine.
31:13 People can not like it, but it has a lot of advantages.
31:16 Like in that example before, if you had that wrongly indented Python code, it would not parse.
31:21 It's an error to have it not look right, rather than just not be right.
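A small sketch of that point, using a made-up snippet: a statement indented deeper than its neighbor, with nothing opening a new block, is rejected at compile time instead of being silently regrouped.

```python
# The names ready, first and second are hypothetical; the code is only
# compiled, never run, so they do not need to exist.
source = (
    "if ready:\n"
    "    first()\n"
    "      second()\n"   # deeper than first(), but no new block was opened
)

error = None
try:
    compile(source, "<example>", "exec")
except IndentationError as exc:
    error = exc
    print(type(exc).__name__, exc.msg)
```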
31:25 So it has a lot of advantages and people can really quickly get used to not having to write all those symbols.
31:31 And then you go back and you're like, this code is hard to read.
31:34 It's just full of curly braces, semicolons, parentheses everywhere.
31:37 I always thought we used to, those were just, that is what builds programming languages.
31:41 To have a programming language, you had to have that.
31:43 And then once I experienced Python and I went back, it kind of, it broke my mental model of the world.
31:49 I'm like, you don't actually have to have those things.
31:51 So why are they there?
31:52 Anyway, what do you think about this article?
31:54 You must like it somewhat because you hunted it down and tweeted it, right?
31:57 It's old news for me because I didn't even invent the white space thing for Python.
32:02 That was sort of handed to me on a silver platter by one of my mentors in the early 80s.
32:08 Yeah.
32:08 Yeah.
32:09 Back in the ABC days.
32:10 And in those days, it was an innovation.
32:12 There was like one other language that had this.
32:16 And Knuth had once said that he thought it would be a good idea, but he had never actually implemented the language or even experienced the language that implemented it.
32:26 He just thought that it would be a good idea.
32:28 Right.
32:29 Right.
32:29 The only thing that was a stumbling block for me was when I first started looking at Python, the editor I was using, I think it was, I think it was an Emacs something at the time.
32:39 I'm not sure what I was using, but with the C++ code I was using, I had it set up so that if I double clicked on the closing bracket, it would jump to the top of the block.
32:50 And I really liked that feature.
32:52 And for some reason, that's the reason why I didn't like the white space thing at first.
32:56 Like, how do I get back?
32:57 But then I just went, okay, I'm going to like beginner's mind, just open mind, just embrace it and learn it as a new thing.
33:04 And I didn't, like a week later, I didn't even miss it.
33:07 Yeah.
33:07 And of course, the new editors, the newer editors like PyCharm and stuff, at the bottom, they have little breadcrumbs of, you know, here's the class, here's the function, here's the if, here's the while, whatever.
33:17 And you can jump between them, just like you were talking about, but like the entire hierarchy of, I don't know, the tokens or whatever.
33:23 Yeah, and I just, I tend to write smaller functions now, so it's not as much of a deal.
33:27 This is probably a good thing that it was hard.
33:30 I was thinking, truly, that if you needed the editor to help you find the top of the block, it must be pretty far away.
33:37 Yeah.
33:37 It's 4,000 lines.
33:40 I hate scrolling so much.
33:41 These functions are hard.
33:42 Ah, yeah.
33:43 How interesting.
33:44 All right.
33:44 Guido, do you have one more you want to share with us?
33:46 Well, yeah, you gave me some homework.
33:48 I didn't really do it, but there's like, and of course, this has to do with parsing.
33:52 And so this may be a fairly esoteric library.
33:55 But if you're writing a program that sort of does some manipulation of your code, and maybe it converts four space indents to two space indents or three space indents or whatever.
34:09 Or maybe you're writing something like Black, which is the sort of Python code reformatting tool, but you don't like the way Black handles certain things.
34:20 Or maybe you're writing some other thing that does analysis of source code.
34:26 Maybe you're writing a linter.
34:28 There are a couple of tools that you can use, and it turns out that one of them is in the standard library.
34:35 There's something called lib2to3, which is a little hard to pronounce.
34:40 It has the digit two and then the word T-O and then the digit three in the name.
34:45 That is tricky.
34:45 That is something I wrote probably over 15 years ago, or at least the core of it, which is yet another LL(1) parser.
34:56 But this one's written in Python rather than in C, like the original one.
35:00 And actually Black ended up using lib2to3, except I think Łukasz had one issue that he couldn't figure out how to do with Black.
35:11 And so he ended up vendoring a copy of lib2to3 and then butchering it a little bit.
35:17 Which is how these things happen.
35:19 I mean, if you look at what pip vendors, that's pretty scary, but there are good reasons for that too.
35:24 So, but if you're writing your own, you should probably not use lib2to3, and not just because it's going to go out of style once the PEG parser arrives.
35:33 There are much better tools.
35:36 And the one that I discovered a few months ago is actually written by some folks at Facebook mostly.
35:44 It's called LibCST.
35:46 And they have unique capitalization.
35:49 It's a capital L, then lowercase ib, and then CST is all uppercase.
35:56 And so it's a library for manipulating concrete syntax trees.
36:00 And like lib2to3, it actually shares some code with lib2to3.
36:06 I think underneath is a parsing library called parso, which itself is a butchered version of lib2to3.
36:15 At least that's how it started.
36:17 These tools are things that can parse Python code, but they produce a syntax tree that is the opposite of an abstract syntax tree.
36:26 It's a very concrete syntax tree.
36:28 And that means that every space, every comment, every bit of indentation is preserved or at least can be recovered from the information in that syntax tree.
36:43 And oppose that with the typical abstract syntax tree, which in the end doesn't even remember where the parentheses are.
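To see what the abstract tree forgets, here is a standard-library sketch. It assumes Python 3.9 or later for `ast.unparse`, and the variable names are made up.

```python
import ast

source = "total = (price + tax)  # rounded later\n"

# Parse to an abstract syntax tree, then regenerate source from it.
tree = ast.parse(source)
regenerated = ast.unparse(tree)

print(regenerated)
```

The regenerated text has the same meaning, but the redundant parentheses and the comment are gone because the abstract tree never recorded them. A concrete syntax tree keeps both.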
36:52 Right, right.
36:53 It just takes us up.
36:54 Well, here's some conditional statement.
36:55 Here's the two things we're testing.
36:57 Yeah.
36:57 Right.
36:57 So this sounds much more useful if you want to do like a code analysis type of thing to say this thing you're doing here, you should do it in this other way or transform it over.
37:07 But kind of preserve things like comments and style.
37:09 Yeah.
37:10 So lib2to3 itself started out that way as well: you read your source code using this customized parser.
37:33 It gives you a concrete syntax tree.
37:35 Then in that syntax tree, you're actually going to make changes.
37:40 You're going to systematically rename a parameter or move things around or insert.
37:47 In the 2to3 world, of course, it's used to turn things like iteritems into items and iterkeys into keys.
37:54 And you can make that kind of changes.
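The sketch below illustrates that kind of formatting-preserving rename with only the standard library. It is neither lib2to3 nor LibCST: it swaps in the `tokenize` module and patches the original text in place, and `rename_methods` and `RENAMES` are hypothetical names chosen for this example.

```python
import io
import tokenize

# Python 2 method names and their Python 3 replacements.
RENAMES = {"iteritems": "items", "iterkeys": "keys"}

def rename_methods(source: str) -> str:
    """Rename attribute accesses listed in RENAMES, leaving all else untouched."""
    lines = source.splitlines(keepends=True)
    tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))

    edits = []
    for prev, tok in zip(tokens, tokens[1:]):
        follows_dot = prev.type == tokenize.OP and prev.string == "."
        if follows_dot and tok.type == tokenize.NAME and tok.string in RENAMES:
            edits.append((tok.start, tok.end, RENAMES[tok.string]))

    # Apply edits from the end of the file backwards so that earlier
    # column offsets are still valid when we reach them.
    for (row, col), (_, end_col), new_name in sorted(edits, reverse=True):
        line = lines[row - 1]
        lines[row - 1] = line[:col] + new_name + line[end_col:]
    return "".join(lines)

old = 'for k, v in d.iteritems():  # keep me\n    print( k,v, "d.iterkeys()" )\n'
new = rename_methods(old)
print(new)
```

Because only the matched name tokens are patched, the comment, the odd spacing, and the string literal that merely mentions `iterkeys` all survive. A real tool built on a concrete syntax tree gives the same guarantee with much richer structure to match against.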
37:56 And so LibCST also supports that, but it sort of has a slightly better API because 15 years ago when I started lib2to3, I didn't realize what an important tool it was going to be.
38:10 And some of the way the white space is attached to nodes is exactly backwards from the way that is the most convenient to think about it and work with it.
38:20 All right.
38:20 Cool.
38:21 Well, this sounds like it'll be really helpful for people building tools like Black or looking at code analysis and stuff.
38:25 Right.
38:26 Łukasz had, I think it was the 2019 talk, PyCon talk, where he described how Black uses both concrete syntax trees and the abstract syntax tree.
38:36 It's a pretty fascinating talk for a very low level depth into these concepts.
38:42 It wasn't until I watched that talk that I realized that Black compares the before and after abstract syntax tree to make sure that your code is guaranteed to run the same.
38:53 So you don't really have to test for that.
38:56 He's already testing for it.
38:57 So that's pretty interesting.
38:59 Yeah, that's very cool.
39:00 That is a very neat feature.
39:01 And it's actually an important trick in general for people who are doing transformation
39:06 to have some abstract way of double checking that your transformation left things in a decent state.
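A rough sketch of that double-check using the standard `ast` module. This illustrates the idea only and is not Black's actual implementation; `same_ast` is a hypothetical helper.

```python
import ast

def same_ast(before: str, after: str) -> bool:
    """Return True if both sources parse to the same abstract syntax tree."""
    # ast.dump omits line and column numbers by default, so pure
    # formatting changes do not affect the comparison.
    return ast.dump(ast.parse(before)) == ast.dump(ast.parse(after))

original = "x = { 'a':1 }\n"
reformatted = 'x = {"a": 1}\n'   # only spacing and quote style changed
broken = 'x = {"a": 2}\n'        # a value changed, so the meaning changed

print(same_ast(original, reformatted), same_ast(original, broken))
```

A formatter can run a check like this on its own output and refuse to write the file if the trees differ.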
39:16 Yeah, it's cool.
39:16 Yeah, very cool.
39:17 All right.
39:17 Well, thanks for LibCST.
39:19 Guido, that's a great one.
39:20 Now, that's it for our main topic.
39:21 So just really quick things at the end that I just want to throw out there for people.
39:25 One, Adam, who goes by Codependent Coder on Twitter, sent a message over and said,
39:29 Hey, Django no longer supports Python 2 at all, which is pretty awesome because 1.11 has left long-term support, leaving only 2.2.12 onward, which has only Python 3 support.
39:43 So yay for modern Python making its way through.
39:46 That's good.
39:47 And then last time we talked about 90% of coding is Googling and that's okay, or it's not.
39:53 And we didn't really feel like that was our experience, right?
39:56 As people have been around for a while.
39:57 But I got to tell you, this last week I've been doing nothing but Pandas, Altair visualization, Jupyter notebook, and graphics.
40:06 Because I'm building a whole set of dashboards for the Talk Python courses and whatnot.
40:11 And basically the dashboards that I should have built a while ago.
40:14 I Googled a lot.
40:15 A whole lot.
40:17 But that's the thing.
40:18 It was like a two or three day blip of like, wow, I'm Googling like 25, 30% of my time because I don't know anything about these things.
40:27 And how do I get this thing to line up with that bar?
40:29 But now I'm back to just kind of mostly not doing that anymore, even after a few days.
40:33 So I think generally what we said is true, but I do think there's like these blips of like, wow, I'm diving into something new.
40:38 It's like mad search scrambling.
40:40 But then I'm back to sort of using like more memory coding.
40:44 I don't know what you call not Google coding.
40:45 You've got to understand what you're doing.
40:48 And that means you can't just Google for examples and copy and paste them in because then you can combine the examples and you have no idea what you're doing.
40:59 And of course it doesn't work.
41:00 At best it's frustrating, right?
41:02 You're like, this worked, that worked, but together they don't work.
41:04 And you just don't even know why, right?
41:05 Yeah.
41:05 So for sure.
41:06 But yeah.
41:08 So anyway, it's a follow up on our conversation last week, right?
41:11 What do you got to throw out there for everyone?
41:12 I'm going to say this on this show just to make sure I do it.
41:15 There's like three days left for me to record my talk.
41:18 Yeah.
41:18 This is like forcing yourself to commit to it.
41:21 So you're going to do it.
41:22 Okay.
41:22 Yes, definitely.
41:23 So PyCon talk.
41:25 I really do want to get it online.
41:26 It's important stuff.
41:27 It's about parameterization.
41:29 I talked a couple episodes ago about having trouble switching back and forth at home with all this working from home stuff between Mac and Windows.
41:37 I finally figured out the whole using command and control.
41:40 So thank you to everybody.
41:42 But apparently there's this really simple thing.
41:44 Apple lets you just swap them on a keyboard.
41:47 So that's what I'm doing.
41:48 And it works great.
41:50 And then also I had promised that I was going to have my cards project be able to work and publish to PyPI or the test PyPI.
41:58 It doesn't work with setuptools_scm because I'm using Flit.
42:01 So if somebody's got a way to figure out how to just somehow change the version string or bump that every time you merge or something like that, that'd be great.
42:12 But otherwise, right now, I don't think there's a way to automatically push to PyPI if you're using Flit.
42:19 Yeah, because it says that one's already uploaded.
42:21 Maybe there's a GitHub action that will just randomize that or something.
42:24 Because the version is embedded in the source code.
42:27 And the trick that people are using with setuptools is the version is based on the version in GitHub.
42:33 And you can't do that with Flit.
42:35 So at least I haven't figured it out.
42:37 But that's okay.
42:38 I'll probably do something else.
42:40 That's my extras.
42:41 Guido, anything else?
42:42 Even though I said it's hard to imagine PyCon going online, it actually is going online.
42:49 At least some of it is.
42:50 The first talk by the conference chair, Emily Morehouse, has been posted and many more will follow.
42:58 Yeah, her welcome was really nice.
42:59 The other thing, and as you mentioned, Django no longer supports Python 2 at all.
43:04 Well, that's just fine because the very last release of Python 2, 2.7.18, was released a few days ago.
43:13 Yeah, that's great.
43:14 That must be kind of a load off of your shoulders to finally have that in the rearview mirror.
43:18 I'm very happy.
43:19 And I'm sad, of course, that we can't have an absolutely wild and crazy party in Pittsburgh like we were planning.
43:26 Yeah, a big celebration on Zoom.
43:28 It's just not the same.
43:29 Just have to have a bigger one next year.
43:31 That's one I don't know how to pull off.
43:33 Yeah.
43:34 Well, that's really good.
43:36 All right.
43:36 You guys ready for a really quick joke?
43:38 All right.
43:39 So here's a quick joke sent to us by Derek Chambers.
43:43 And he may have even made this up for us.
43:45 And this goes back to the subinterpreters and the multiple GILs and all that.
43:51 So you guys know how you can borrow money concurrently? With async IOUs.
43:57 It's a terrible joke.
43:58 That's a bad joke.
43:59 Oh, that is very groan worthy.
44:02 Very groan worthy.
44:02 Excellent.
44:03 Most of our jokes actually are around here, but that's how it goes.
44:06 Yeah.
44:06 And keep them coming.
44:07 Keep sending us your bad jokes.
44:09 Yeah.
44:10 That's right.
44:11 That's right.
44:11 Python dad jokes.
44:13 That should be a whole separate category.
44:15 They absolutely should.
44:17 They should.
44:17 Well, Guido, it was really an honor to have you on the show.
44:20 Thanks for coming and sharing your perspective on all this.
44:22 Glad to be back.
44:23 Yeah.
44:23 And Brian, thanks as always.
44:25 Good to be here with you.
44:26 Cheers.
44:27 Yep.
44:27 Bye, everyone.
44:28 Bye.
44:28 Thanks, both of you.
44:29 Thank you for listening to Python Bytes.
44:31 Follow the show on Twitter via at Python Bytes.
44:33 That's Python Bytes as in B-Y-T-E-S.
44:36 And get the full show notes at pythonbytes.fm.
44:39 If you have a news item you want featured, just visit pythonbytes.fm and send it our way.
44:44 We're always on the lookout for sharing something cool.
44:46 On behalf of myself and Brian Okken, this is Michael Kennedy.
44:50 Thank you for listening and sharing this podcast with your friends and colleagues.