

Transcript #35: How developers change programming languages over time

Recorded on Tuesday, Jul 18, 2017.

00:00 Michael KENNEDY: Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds. This is Episode #35, recorded July 18th, 2017. I’m Michael Kennedy.

00:00 Brian OKKEN: And I’m Brian Okken.

00:00 KENNEDY: We’ve got some really cool stuff to start out with. Do you have any comments maybe, to kick us off with, Brian?

00:00 OKKEN: Well, I like to comment on just about everything. But to kick off, we’ve got an article from Philip Trauner, and it’s called, “Python Quirks: Comments”. I kind of like this article. When you’re looking at source code, there’s definitely comments that start with pound (#) which are obviously just comments to other coders. But then also sometimes we have docstrings in there. Some people have learned that you can just put strings or other objects in your code. As long as they’re not referenced by anything else, it just acts like a comment.

00:00 But this is an article taking a look at the Abstract Syntax Tree, and also taking a look at timing them, obviously.

00:00 KENNEDY: Yeah, the fundamental question was, ‘Is there a reason to prefer one over the other?’ Is there a performance difference between them?

00:00 OKKEN: Yeah, definitely. I have seen it. I haven’t seen it in a lot of big Open Source projects, but I’ve seen it in just random Python code that I look at from co-workers or whatever. People will comment out. Even commenting out a chunk of code with the three quotes, just to block it out, it’s not good. It actually leaves that object in your code and can slow things down a bit.

00:00 KENNEDY: Yes. He did a bunch of testing, and the hash or pound comments actually get stripped out; those literally do not appear in the Abstract Syntax Tree. Once the PYC file is generated, they’re gone; they literally do not appear in the resulting executed code. However, if you have like triple quote for docstrings, then it gets set to the __doc__ attribute, I think. But those appear. Those just execute, they get allocated, they immediately get de-referenced and garbage collected. But those steps happen, right?
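The difference at the syntax-tree level can be checked with the standard library's ast module. The following is a minimal sketch, not code from the article: the sample source string and the greet function are invented for illustration, and it only shows what the parser keeps, not how the compiler later treats the unused string.

```python
import ast

# Hypothetical sample source: one assignment, one hash comment,
# and one bare string used as if it were a comment.
source = '''
x = 1
# a hash comment: the parser discards it
"a bare string used as a comment"
'''

tree = ast.parse(source)

# List the statement node types that survive parsing.
statement_types = [type(node).__name__ for node in tree.body]
print(statement_types)  # ['Assign', 'Expr']


def greet():
    """Say hello."""
    # This hash comment is not stored on the function object.
    return "hello"


# The docstring is kept and exposed as the __doc__ attribute.
print(greet.__doc__)  # Say hello.
```

The hash comment leaves no node behind, while the bare string shows up as an expression statement in the tree.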

00:00 OKKEN: Yeah. One of the things that made me think about it is that I’ve been adding the code examples for the book and I’ve been adding docstrings. I’m curious, I’d like to do a similar test. I might take his code example and do a similar test on size of docstrings. If you do a 10-character one liner versus 100 to 200 characters of huge docstring, does it make a difference for performance at all?
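One way such an experiment might be set up is with the standard library's timeit module. This is only a sketch under assumptions: the two functions and the call count are invented here, it is not the article's benchmark, and it makes no claim about which way the numbers come out.

```python
import timeit


def short_doc():
    """Short one."""
    return 1


def long_doc():
    """A much longer docstring, written only to pad this function out.

    It runs to a couple of hundred characters so that the two functions
    differ in docstring size and in nothing else that could affect the
    time taken by a call.
    """
    return 1


# Time the same number of calls to each function.
calls = 100_000
short_time = timeit.timeit(short_doc, number=calls)
long_time = timeit.timeit(long_doc, number=calls)

print(f"short docstring: {short_time:.4f}s for {calls} calls")
print(f"long docstring:  {long_time:.4f}s for {calls} calls")
```

Timings vary from run to run and machine to machine, so any comparison would need several repetitions before drawing a conclusion.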

00:00 KENNEDY: That’s pretty interesting. I’m definitely a fan of the # (hash) comment; the triple quote is literally for docstrings.

00:00 OKKEN: One of the things – and this is outside the article – but I read a comment recently, a discussion about this. It was that docstrings are information for your user, or for the user of a function. The pound or # (hash) comments are information for a future developer. I like that.

00:00 KENNEDY: That’s the, ‘Imagine a psychopath who’s having a bad day 10 years from now, inheriting your code and they know where you live.’ You want to leave them some comments to help them feel happy. (Laughs)

00:00 Alright, so do you know what I get if I open up my terminal and I type python3 -V?

00:00 OKKEN: I don’t know, what do you get?

00:00 KENNEDY: I get 3.6.2 because Python 3.6.2 is out, I think today even.

00:00 OKKEN: Awesome.

00:00 KENNEDY: Very cool, and I was blown away at how much stuff is in here. I think these are mostly fixes. I don’t think there are any new features, that’s coming in 3.7, but – holy moly – there are a lot of changes. It’s pretty interesting.

00:00 I pulled out a few just to highlight and I’m highlighting for a variety of reasons. They broke them into 4 categories here. There’s a few others that I decided not to touch on, like changes to IDLE, don’t care. But security, that I very much care about.

00:00 So, the security ones we have these changes and they have a bunch of numbers, you guys can look them up. But prevent environment variables injection into subprocesses on Windows; so, prevent other things from freaking out or taking over what the system looks like for Python.

00:00 Or this one is kind of scary: Upgrade expat copy from 2.2.0 to 2.2.1 to get fixes of multiple security vulnerabilities and all these loops, integer overflow, regressions of other bugs, counter hash flooding. All these things that are probably somewhere in there, there’s a really bad vulnerability. Also, parsing the host, urllib, and things like that. There’s a bunch of security fixes, not just features or whatever.

00:00 If you can, you should probably install this.

00:00 OKKEN: Yeah, definitely. From my first glance at it this morning, it just seemed like it’s just better for security and other changes. I’ll definitely install it. But I didn’t see anything that jumped out like, ‘Oh, I was waiting for this. Now I can use it.’

00:00 KENNEDY: Yeah. I think this is only bug fixes and security fixes. No features until 3.7.

00:00 OKKEN: You run a Mac, right? So, maybe I’m just dense. To upgrade you just download the new one and install it. There isn’t a way to just upgrade is there?

00:00 KENNEDY: Well, you can use brew. I actually installed it off of Python.org. To upgrade that version, I think you’ve got to keep rolling with the next one and run it, that’s fine. If you brew install python3, then you can brew upgrade python3. Which is what I’ll maybe do on my next Mac. I’m happy with what I’ve got now. I’m going to try to use Homebrew more. I’m starting to love it.

00:00 OKKEN: So, if you have 3.6.1 and you install 3.6.2, you just have both of them there, right?

00:00 KENNEDY: I think so, but certainly if I type python3, now it’s 3.6.2.

00:00 OKKEN: Okay. I like having both around anyway for testing multiple versions.

00:00 KENNEDY: Yeah, sure. There’s some tools like Virtual Environment Wrapper, but slightly different that will let you get all the versions and flip between them. It’s pretty cool.

00:00 So, some other ones: Core and Builtin stuff like the parsing of f-strings with backslashes, apparently, was broken. Segmentation faults, when you’re working with dictionaries, those are never used in Python anywhere. When you’re changing them while searching, if Python just goes away, your web app keeps crashing, like all sorts of bad stuff. Ctrl C when you’re inside of a ‘yield from’ or ‘await’ call gets fixed, and all these different things.

00:00 So, tons of fixes there. The library gets race condition fixes for some signal delivery and wakeup file stuff. The lib2to3 now understands f-strings, race conditions, Windows. This one is awesome. If you work on Windows or you teach people Python who work on Windows, you can cheer for this, this is amazing. Windows now will locate msbuild.exe instead of vcvarsall.bat. That is so much more reliable to find msbuild.exe on Windows, than it is that stupid old vcvarsall.bat thing for the C compilations. So, that means pip install a thing on Windows should get more reliable.

00:00 There’s about 40 more of these types of fixes. I wanted to share the news – how awesome is this – I also wanted to hit on some of those things. Especially the security stuff because we’re coming up quickly on the end of Legacy Python, right?

00:00 OKKEN: Yes.

00:00 KENNEDY: Legacy Python has to have some of these in there. Like, people discovered these and now here are these problems that are uncovered. In 2020, these problems are going to stay in Python 2. The sooner you can get to Python 3, so that these changes keep coming to you, rather than becoming, ‘Well, that’s a security vulnerability. Sorry you have to live with that.’ Just one more reason to upgrade to Python 3 for those holdouts out there.

00:00 OKKEN: Yeah, definitely. And looking at this list, I just have to give a big ‘thank you’ to everybody who worked on all this so that I don’t have to work on things like this.

00:00 KENNEDY: Yes, thank you. It’s awesome. It’s all getting better. Cool.

00:00 Alright, speaking of contributing to an Open Source project, a lot of us feel like we’re not good enough or maybe we don’t know enough, or our experience isn’t rich enough, whatever. That’s a huge problem.

00:00 OKKEN: Yeah, I think everybody has gone through that. Definitely everybody that’s now contributing to Open Source has had an initial time where they’ve felt like whether they knew enough about something. So, Adrienne Lowe, who does codingwithknives.com and has spoken at a couple PyCons and other places, she wrote, “Contributing to Open Source Projects: Imposter Syndrome Disclaimer”. Essentially, it’s in places where you have how to contribute to your project. She’d like you to think about adding this little disclaimer to people that maybe don’t think that they’re ready to do it. It’s got great wording.

00:00 “Imposter Syndrome Disclaimer: I want your help. No, really, I do. There might be a little voice inside that tells you you’re not ready; that you need to do one more tutorial, or learn another framework, or write a few more blog posts before you can help me with this project. I assure you, that’s not the case.”

00:00 It goes on to tell you to point to your contributing guidelines and then also, to comment about other stuff.

00:00 “And you don’t just have to write code. You can help out by writing documentation, tests, or even by giving feedback about this work. (And yes, that includes giving feedback about the contribution guidelines.)”

00:00 We talked about this in one of our previous episodes, about the many ways you can contribute to Open Source projects. But I think this is a great idea, to put it right in your contributing guidelines for your project.

00:00 KENNEDY: Yeah, really nice work, Adrienne. If you guys were at PyCon, she was the host of the Art Museum dinner. This is really great. She does a bunch for the community and contributes to many projects so I know she’s been on both sides of this. I do think having this on your projects will help.

00:00 OKKEN: She’d like to collect examples, so we’ve got a link in the show notes. Or just get a hold of her and say you’ve taken this and are contributing.

00:00 KENNEDY: Yep, sounds good. Nicely done. You can just grab that and just drop that into your project, like a markdown file or something like that.

00:00 OKKEN: So, Michael, do you have any dark secrets that you want to share?

00:00 KENNEDY: I think we all have dark secrets and I don’t really want to talk about it, but it’s time to get it out in the open. So, the next thing I want to talk about is a pretty deep article from MIT Technology Review called, “The Dark Secret at the Heart of AI”. We’ve touched on this a few times. It’s kind of a nice follow-up from last week.

00:00 There’s a huge problem with AI. We’ve had statistical models and we can look at the models and see what it’s predicting. But as we move farther and farther into things like deep learning, the machine doesn’t know why it knows a thing, we don’t know why it knows a thing, but we can teach it a thing and then it does that thing. Even the creators of these deep learning models can’t explain why it makes a decision. You can’t set a breakpoint and go, ‘Oh, this is the if-case, yes of course here.’ There’s none of that. It’s like, ‘I’ve taught it a bunch of stuff and now it somehow knows and then I ask it a question.’ They gave a really interesting example to kick off this article. They said:

00:00 “Last year, an experimental vehicle developed by researchers at the chip maker Nvidia, didn’t look different from other autonomous cars, but it was unlike anything demonstrated by Google, Tesla or General Motors, and it showed the rising power of artificial intelligence. The car didn’t follow a single instruction provided by an engineer or programmer.”

00:00 They basically taught this car how to drive by having it watch humans drive. And then they put it on the road.

00:00 OKKEN: Oh, wow.

00:00 KENNEDY: Yeah, and so it was really weird. The results seemed to do what human drivers did, but it did something different. How do you understand it or debug it, or even change it to make those decisions differently? Like, if it crashed into a tree, if it sits at a light or there’s always the hilarious joke that people seem to play on these cars, draw what looks like painted white lines in a circle around it and it can’t get out. (Laughs) But if it does an unexpected thing, how do you debug it or change it? That’s really the secret. Even the developers of AI and AI itself, they don’t know how they work.

00:00 OKKEN: Yeah. When I think about this stuff, I am fairly optimistic about the self-driving cars and I’ll be one of the first to grab one if I can afford one. But there’s always the question of, ‘Okay, if a car comes up to say, decide whether or not to crash you and your family into a tree, or take out a whole glob of schoolchildren, what does it do?’ That’s the sort of moral question I don’t know how to deal with, how people are going to debug that.

00:00 KENNEDY: For sure. And if you get the AI to do that, how do you know it’s always going to make the right choice? You don’t. It’s probably statistically better than humans, but still it’s an interesting question. They basically say, ‘How do you understand what the system does and why does it make the decision it does?’ You can’t really ask it right now and it’s difficult to design a system so that it can explain what it does. People can’t explain always why they do what they do, precisely. So, it’s interesting.

00:00 One of the consequences that might be coming really soon, this is in the E.U., there’s an argument being made that you have to be able to get machines and AIs to tell you why it reached a conclusion as a fundamental legal right.

00:00 OKKEN: Wow. Okay.

00:00 KENNEDY: So, if I’m told I have cancer and I go crazy and I burn all my life savings. ‘Oh, sorry. Glitch in the whopper core, you’re fine.’ You want to know why. If I’m denied a loan, if I’m denied the ability to buy a house, if I’m denied a job, these are serious questions.

00:00 There’s a lot to cover in this article but the last for us, they said:

00:00 “We’ve never before built machines that operate in ways that their creators don’t understand. How well can we expect to communicate – and get along with – intelligent machines that could be unpredictable and inscrutable.”

00:00 Crazy, huh?

00:00 OKKEN: Yeah. Definitely.

00:00 KENNEDY: I’m optimistic with you, as well. But it’s interesting that philosophy and morality is starting to become part of programming.

00:00 OKKEN: We definitely have machines now that I think that more than one person doesn’t understand.

00:00 KENNEDY: Yeah, I think the biggest consequence for us is that we are going to have programs that we can’t debug or understand why they do things. That’s going to be a bizarre programming future.

00:00 OKKEN: Before we move on, did you say, the whopper core?

00:00 KENNEDY: I did say the whopper core.

00:00 OKKEN: Is there a computer based on a hamburger?

00:00 KENNEDY: No, that’s from WarGames. Remember when they had to hack into the whopper core? That machine that they had to teach to play tic-tac-toe.

00:00 OKKEN: Nice reference.

00:00 KENNEDY: Thank you.

00:00 OKKEN: I always think it’s great that in that movie you can get from Colorado Springs to Bainbridge Island in a helicopter. On like one tank of gas. Not possible.

00:00 KENNEDY: (Laughs) Awesome.

00:00 Let’s recede safely back to the three A’s of testing pattern and away from this philosophy stuff.

00:00 OKKEN: I loved seeing this. This is an article by James Cooke called, “Arrange Act Assert Pattern for Python Developers”. The Arrange Act Assert pattern is a structure for how to set up test cases. This is a fairly gentle, easy introduction basically just telling people to not have big, long spaghetti test code. Your test code should be something structured. This is a decent structure. The arrange part is the ‘Get yourself ready to do whatever you’re going to do’ set-up part. Act is whatever thing you’re testing, and the assert part is where you check. The important thing is to try to write as many test cases as you can so that all of the asserts are at the end, so you don’t do more actions and more asserts.

00:00 He wrote a list. There’s other names that people might know it by. ‘Given, When, Then.’ That’s often attributed to behavior-driven development but it’s essentially the same pattern. And I did cover it in a couple of places on pythontesting.net and also Test & Code.
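As a rough illustration of the three sections described above, here is a minimal sketch of one test laid out in Arrange, Act, Assert order. The ShoppingCart class and the test name are hypothetical, invented for this example; they do not come from the article or from pytest itself.

```python
class ShoppingCart:
    """Hypothetical class under test, invented for this illustration."""

    def __init__(self):
        self.items = []

    def add(self, name, price):
        self.items.append((name, price))

    def total(self):
        return sum(price for _, price in self.items)


def test_total_sums_item_prices():
    # Arrange: get everything ready for the one thing being tested.
    cart = ShoppingCart()
    cart.add("book", 30)
    cart.add("sticker", 2)

    # Act: perform the single action under test.
    result = cart.total()

    # Assert: all checks come at the end, with no further actions.
    assert result == 32
```

A test runner such as pytest would collect a function named like this automatically; calling test_total_sums_item_prices() directly also works for a quick check.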

00:00 KENNEDY: Yep. The links are both in the show notes for the episode and the article.

00:00 OKKEN: But I’m pleased with more people – more people being one so far – other Python developers writing articles targeting developers and teaching people how they should set up their tests.

00:00 KENNEDY: Yeah, and it’s such a simple pattern but I find when I follow it in my code, my tests are more focused and a lot less rambly, so I think it’s good.

00:00 OKKEN: Yeah, and also you have less chance of something going… when a test fails, you pretty much know what’s wrong, instead of it being one of 15 different things.

00:00 KENNEDY: For sure.

00:00 So, the last thing I want to cover is to shine a bit of a bright light on the future of Python. Everyone out there listening, you are in a good place let me tell you, in terms of being in Python and working right now. So, there’s another really deep article by the company called source{d}. They’re not super Python-focused, I think they do mostly Go stuff. But their mission is to build the first AI that understands code, speaking of AI. It’s pretty interesting.

00:00 They wrote this really long blog post. There’s a decent amount of data science and math in there. It’s called, “Analyzing GitHub: How Developers Change Programming Languages Over Time”. We’ve talked before, Brian, about how Python is the #2 most active language on GitHub, for active, non-trivial projects. JavaScript was #1 because everybody has JavaScript in their web apps.

00:00 So, this is a different question, but a similar one. Not just what is the most popular, but how is it changing over time? Where are those trends going to? If people are changing languages, where did they change from? They have these cool Gantt charts. They’ve studied 4.5 million GitHub users, over 393 different languages and 10 terabytes of code. It said, given one of those 4.5 million users, how do you visualize them? How do we think about them? They’ve got a Gantt chart of how they transition from one language to another over time. This is based on an original article by Erik Bernhardsson. He’s at Google and the name of his article is pretty interesting as well, “The Eigenvector of Why We Move from Language X to Language Y”.

00:00 OKKEN: I love me a good eigenvector.

00:00 KENNEDY: I love me a good eigenvector as well. It tells you where you’re going.

00:00 This is a slightly different approach, more of a data science, less of a statistical approach, I believe. They said, ‘Look, first of all, we’re going to not include JavaScript because JavaScript is spread amongst all these projects.’ My Pyramid project has JavaScript, that Ruby on Rails project has JavaScript. Everything has JavaScript. It’s super hard to make reasonable claims about JavaScript because it’s such a complementary language. They said, ‘We can’t reason about this so put it to the side. Take that for what it’s worth.’

00:00 They said, ‘We’re going to look at the most popular languages on GitHub’ and they do a whole bunch of work and they come up with this stationary distribution of a Markov Chain, how about that. What they find out is the #1 most stable language at GitHub is Python. And interestingly, its stability level is almost 50% higher than its share of the code on GitHub. So, it’s really stable.

00:00 OKKEN: What do you mean by stable?

00:00 KENNEDY: People, once they get to Python, are least likely to move away from Python. I believe that’s the right interpretation. Then there’s Java, which is also very stable. C and C++. Then PHP, then Ruby and C# and it goes on and on and on.
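To make the idea of a stationary distribution of a Markov chain a little more concrete, here is a minimal sketch. It assumes a tiny three-language chain with made-up switching probabilities (not the article's data) and uses plain repeated multiplication; the function name and the numbers are hypothetical and chosen only for illustration.

```python
def stationary_distribution(transition, iterations=1000):
    """Approximate the stationary distribution by repeated multiplication.

    transition[i][j] is the probability of moving from language i to
    language j in one step; each row must sum to 1.
    """
    n = len(transition)
    dist = [1.0 / n] * n  # start from a uniform guess
    for _ in range(iterations):
        dist = [
            sum(dist[i] * transition[i][j] for i in range(n))
            for j in range(n)
        ]
    return dist


# Made-up probabilities for illustration only (rows: from, columns: to).
languages = ["Python", "Java", "PHP"]
transition = [
    [0.90, 0.07, 0.03],  # from Python
    [0.15, 0.80, 0.05],  # from Java
    [0.20, 0.10, 0.70],  # from PHP
]

shares = stationary_distribution(transition)
for name, share in zip(languages, shares):
    print(f"{name}: {share:.1%}")
```

In this toy chain the language that developers rarely leave, and often move to, ends up with the largest long-run share, which is the sense of "stable" and "attractive" used in the discussion.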

00:00 They make some claims based on this. They say, ‘Python, at 16.1% appears to be the most attractive language, followed closely by Java. It’s especially interesting since 11.3% of all code on GitHub is written in Python.’ It is more attractive than its level of code would imply.

00:00 They said there are some languages that are repulsive – that’s my wording, not theirs. ‘Although there are 10 times more lines of code in GitHub in PHP rather than Ruby, they have the same level of stationary attractiveness.’ So, much less reason to be attracted to Ruby, but if you’re there, you’re more likely to stick.

00:00 They said, ‘What about sticking to a language? Developers coding in one of the 5 most popular languages (Java, C, C++, PHP, Ruby) are most likely to switch to Python, with a 22% chance on average.’ How about that? So, people who like Python are most likely to stay there and people in one of the 5 most popular languages are most likely to move there as well.

00:00 OKKEN: I haven’t read this article, but I think it also goes to the fact of how easy it is to think of something that you could solve that you could share with somebody else, a project with Python. For instance, I programmed C++ all my career but I never contributed any C++ code to GitHub.

00:00 KENNEDY: Yeah, that’s for sure. It’s a really more Open Source-friendly language as well.

00:00 A few more random stats. They say, ‘Visual Basic developers are very likely to move to C# with a 92% chance of that. People using numerical and statistical environments (Fortran, Matlab or R) are most likely to switch to Python using this measure of analysis, whereas Erik’s original blog was suggesting they might move to C.’ So, pretty interesting little article there about stability and attractiveness of projects.

00:00 OKKEN: That correlates with other anecdotal things that we’ve heard, of more people migrating – especially in data science – to Python.

00:00 KENNEDY: Yeah, it seems totally believable to me given all the other pieces of information and studies that we’ve seen and talked about.

00:00 I think I would say, if you’re thinking about, ‘Where do I bet my career?’ That’s another positive sign that Python’s probably a good sign to hang onto for a long while.

00:00 Well, that’s the news. Anything else going on Brian?

00:00 OKKEN: I wasn’t there but EuroPython got wrapped up last week. I did have some Rocket stickers hitch a ride.

00:00 KENNEDY: Nice. They blasted all the way over to, where was that, Spain?

00:00 OKKEN: Italy, I think. I had a bunch of stickers handed out to promote the book, so that’s fun. I’ve got one more week to finish it and then it will be done.

00:00 KENNEDY: You must be looking forward to that.

00:00 OKKEN: Yeah. How about you? What’s going on?

00:00 KENNEDY: Awesome. Not too much. I’m really enjoying summer. I’m actually working on some apps, some very interesting apps for my training courses. Not that I’m going to teach but to deliver stuff. More to come on that, right now I’m just writing to see how it comes out in the end. Very fun.

00:00 OKKEN: One of the things that you brushed by, I know you talked about it a lot somewhere else maybe, but your “Python for Entrepreneurs” Course? It’s freaking awesome.

00:00 KENNEDY: Thank you, man. Thank you very much.

00:00 OKKEN: I think more people should check it out. I don’t think it’s just for entrepreneurs, I think it’s just a good top to bottom Python web, plus front end and back end. It’s a nice thing for people to look at.

00:00 KENNEDY: Thank you so much. It’s officially done as of last week. It’s finally ready for the world. Thanks a bunch for the shout out.

00:00 OKKEN: Yeah, and Matt Makai helped you with that, I believe.

00:00 KENNEDY: Yep, Matt Makai from Full Stack Python. We are happy to be done and we are planning our next thing that we’re going to do.

00:00 OKKEN: You guys did a great job on that. Cool.

00:00 KENNEDY: Thanks a bunch. It’s super good to see your book coming out as well.

00:00 OKKEN: Catch up next week.

00:00 KENNEDY: Yep, catch you next week and thank you everyone for listening.

00:00 Thank you for listening to Python Bytes. Follow the show on Twitter via @pythonbytes and get the full show notes at pythonbytes.fm. If you have a news item you want featured, just visit pythonbytes.fm and send it our way. We’re always on the lookout for sharing something cool. On behalf of myself and Brian Okken, this is Michael Kennedy. Thank you for listening and sharing this podcast with your friends and colleagues.
