Transcript for Episode #31:
You should have a change log

Recorded on Tuesday, Jun 20, 2017.

00:00 Michael KENNEDY: Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds. This time it’s Python Bytes Episode #31, recorded on Tuesday, June 20th, 2017. I’m Michael Kennedy.

00:00 Brian OKKEN: And I’m Brian Okken.

00:00 KENNEDY: We have a bunch of cool things to talk about. Some of them are huge and some of them are kind of tiny. Let's start small, huh?

00:00 OKKEN: Yeah, let’s start small. One of the reasons why I like following Twitter for Python news is that's where I found TinyMongo. I saw somebody talking about it last week.

00:00 KENNEDY: That's awesome. I'm a fan of MongoDB and TinyDB, and if they could come together that would be even better.

00:00 OKKEN: Right. So, this is essentially an attempt to put – it’s not the exact same interface but it's fairly close to the same interaction you do with Mongo with a single file system – so it's a single file database. Steven, the person working on this – I talked with him – it wasn't his intent to always be right on top of TinyDB, but so far, he's been really happy with the TinyDB as the back end for TinyMongo. It's using TinyDB as the database part but exposes an interface that's very close to Mongo.

00:00 KENNEDY: Yeah, that's super cool. So, basically if you have code that talks to MongoDB through the PyMongo API, you could more or less adapt that really quickly to TinyMongo. TinyDB, the backing store for this thing more or less, is like, ‘Let's create a simple document database that's really just some JSON files living on your disk.’ It's not a full-on production database, but if you’re doing simple stuff, really simple things, this is actually pretty sweet. There's no server, right?

00:00 OKKEN: Right, there's no server. I would say the other direction probably works the best. If your end goal was to use Mongo, then TinyMongo might be a good way to start because it isn't the full set of functionality. I don't have a complete list of what's missing, I just have the personal experience. I tried to take a Mongo application and just swap this in and I ran across a few errors and I haven’t finished debugging those yet. I'm just really excited about it because there's more than one document database that I can use in small applications.
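
To make the "document database in a single JSON file, no server" idea concrete, here is a minimal standard-library sketch. It is not TinyMongo's or TinyDB's actual code: the `TinyCollection` class is hypothetical, and its method names (`insert_one`, `find`, `find_one`) are only modeled on the PyMongo style of interaction described above.

```python
import json
import os
import tempfile


class TinyCollection:
    """Hypothetical, minimal document collection stored in one JSON file."""

    def __init__(self, path):
        self.path = path
        if not os.path.exists(path):
            self._save([])  # start with an empty list of documents

    def _load(self):
        with open(self.path, encoding="utf-8") as f:
            return json.load(f)

    def _save(self, docs):
        with open(self.path, "w", encoding="utf-8") as f:
            json.dump(docs, f)

    def insert_one(self, doc):
        docs = self._load()
        docs.append(dict(doc))
        self._save(docs)

    def find(self, query=None):
        # A document matches when every key in the query has an equal value.
        query = query or {}
        return [d for d in self._load()
                if all(d.get(k) == v for k, v in query.items())]

    def find_one(self, query=None):
        matches = self.find(query)
        return matches[0] if matches else None


def demo():
    with tempfile.TemporaryDirectory() as folder:
        users = TinyCollection(os.path.join(folder, "users.json"))
        users.insert_one({"name": "brian", "language": "python"})
        users.insert_one({"name": "michael", "language": "python"})
        found = users.find_one({"name": "brian"})
        count = len(users.find({"language": "python"}))
        return found, count
```

Rereading and rewriting the whole file on every call keeps the sketch short; it is only reasonable for the small data sets discussed here.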

00:00 KENNEDY: Yeah, that's cool.

00:00 OKKEN: Also, one of the applications for this, when I was talking with the maintainer of it, is that he's using it on Raspberry Pis even.

00:00 KENNEDY: That is really cool because you don't want to start a whole separate server on a Raspberry Pi but certainly having a couple of JSON files laying around that you have a database interface over top of, that’s cool.

00:00 OKKEN: Yeah, definitely. I was excited about this and I'm going to start using it right away.

00:00 KENNEDY: Sweet, yeah. If people are interested in TinyDB, back on Episode #80 of Talk Python, many moons ago, I interviewed the guy who created TinyDB and talked about some of the use cases and I think there are some extensions you can get, like indexing add-ons and stuff like that. So, there's a lot of stuff to do, it’s pretty cool.

00:00 That sounds pretty dead simple, right? Just fire up TinyDB and off you go?

00:00 OKKEN: Yeah, dead simple.

00:00 KENNEDY: You know what else? I want some dead simple validation. So, the next project I chose is called Validus. Validus is on GitHub and it describes itself as a dead simple Python data validation library. Have you ever tried to write a regular expression to match an email or a URL or something like that?

00:00 OKKEN: Yes.

00:00 KENNEDY: That's super fun, right?

00:00 OKKEN: No.

00:00 KENNEDY: You think you get it working then someone emails you like, ‘I have a proper email address, but I can't sign up in your system. It says my email is invalid.’ So, this Validus thing kind of solves that for a class of types of data, basically simple input. You can just import this and say validus.isemail and give it a string and it will say, ‘Yes or no.’ You can ask it questions like, ‘Is it an RGB color? Is it a phone number? Is it an ISBN? Is it an IPv4 or IPv6 address? Is it a number? Is it a slug, like would it fit at the end of a URL without needing any encoding?’ All that kind of stuff is pretty awesome.
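
As a rough illustration of the yes-or-no checks described here, the sketch below uses only the Python standard library. It does not use Validus and does not claim to match its implementation: the function names `is_ipv4`, `is_ipv6` and `is_slug`, and the slug rule (lowercase letters and digits separated by single hyphens), are assumptions made for this example.

```python
import ipaddress
import re

# Assumed slug rule for this sketch: lowercase letters and digits,
# grouped by single hyphens, e.g. "you-should-have-a-change-log".
SLUG_PATTERN = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")


def is_ipv4(value):
    """Return True if value is a string holding a valid IPv4 address."""
    if not isinstance(value, str):
        return False
    try:
        ipaddress.IPv4Address(value)
        return True
    except ValueError:
        return False


def is_ipv6(value):
    """Return True if value is a string holding a valid IPv6 address."""
    if not isinstance(value, str):
        return False
    try:
        ipaddress.IPv6Address(value)
        return True
    except ValueError:
        return False


def is_slug(value):
    """Return True if value could sit in a URL path without any encoding."""
    return isinstance(value, str) and bool(SLUG_PATTERN.match(value))
```

Leaning on `ipaddress` rather than a hand-written regular expression avoids exactly the kind of "my valid input was rejected" bug the email story describes.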

00:00 OKKEN: That's cool.

00:00 KENNEDY: I'd say it's dead simple.

00:00 OKKEN: It's even got, ‘Is Mongo ID?’

00:00 KENNEDY: Nice. Yeah, that's awesome. You know what else I like about this? It's Python only, no Legacy Python.

00:00 OKKEN: 3.6, 3.3

00:00 KENNEDY: Yeah, yeah. 3.3 and above, so it's only a Python 3 thing. Yet another sweet example. I have a lot of interesting stuff to say about that at the end of the show, not Validus but Python versus Legacy Python.

00:00 While this works pretty well, we may still need to jump in the debugger, right?

00:00 OKKEN: Yeah, definitely. I am a command line debugger kind of person, actually. I don't really jump into the debugger too much.

00:00 KENNEDY: You are a debugger of last resort-type person?

00:00 OKKEN: Yes, definitely. Last resort. So, in Episode #29, we talked about the ability to launch pdb, the Python debugger, from a failed pytest. Somebody on Twitter, another Twitter person @kidpixo, I think…

00:00 KENNEDY: Yeah, @kidpixo, he runs the Geek Cookies Italian podcast, which I was a guest on like 2 ½ years ago. He’s a great guy. He sends us lots of good stuff, yeah.

00:00 OKKEN: Well, he passed this along because he said he really loves the PuDB debugger. My first reaction is, ‘Oh my God, this thing is ugly,’ because it does look like you're back in the ‘80s, running on a 386 or something.

00:00 I feel like I've dialed into a BBS, but it does have themes so after I played with it for a while, I switched it to a midnight theme and it looks just like I'm in my editor. It's actually pretty slick. It's a lot better than pdb and it's still small and fast, and there's some documentation in it for how you can do the same thing that we did with pytest. You can launch it whenever you hit a pytest failure, so that’s pretty cool.

00:00 KENNEDY: It's really nice. You can use it over SSH and stuff so if you’ve SSH'd into a server, you can debug with this. But it actually has little windows. It really does feel like I'm back on a BBS; it's awesome. You see your code, you can step through it, you've got a variables window and a stack and break points. It's really nice. It's like an ASCII curses-type thing.

00:00 OKKEN: But the local window of already having your listing up and also all your local variables and that changing when you go up and down the stack, it is usually enough. I like it.

00:00 KENNEDY: Yeah, definitely. It hits the sweet spot, like the 88% case for debuggers. Alright, I'm definitely going to start using that if I need to debug anything without a windowed environment, like on macOS or Linux or Windows.

00:00 Okay, so the next thing that I want to talk about is a really interesting sort of wide ranging study that the guys at pyup.io did. So, pyup.io is a cool service. I'm actually a paying customer of theirs because I really think what they're doing is awesome and I use it for my web apps. The idea is you give pyup.io access to your requirements file in your public or private GitHub repo. If there's a new version of any requirement or transitive requirement that you depend upon, it will tell you. Like, ‘Hey, there's a new release of the Pyramid web framework and here's the change log and actually this one's a security update so get in there and fix it quick.’ So, it will basically watch your requirements and tell you if there are any upgrades and things like that. It will issue them as a pull request, so really cool.

00:00 So, these guys have access to all these requirements files and many other things. They studied some Django requirements files on GitHub. Now, this isn’t through their business; they were able to use BigQuery to just get a hold of all the Django requirements files that are on GitHub, and they found some interesting things. I guess this is not the private repos, just the public ones. They said that Django is the most popular web framework. It’s pretty old; it’s been around for 12 years and used on all sorts of different projects.

00:00 So, let’s look at these requirements files that specify the dependencies that you have to install and see what we can get from them. The first thing they ask is, ‘Do developers pin or freeze their requirements?’ In requirements.txt, you can say, ‘I depend on Django, I depend on SQLAlchemy and I depend on Requests.’ Or you could say, ‘I depend on Django==this version, Requests==that version,’ right? That’s pinning or freezing. And they said that 64% of Django developers pin their requirements. That’s interesting. And another 20% or so do ranges. So, like, ‘I’m willing to take this range of versions but not leave it unpinned.’ Some of them are just like, ‘Give me whatever when I ask for it.’ So, that’s interesting.
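
The three styles mentioned here (pinned, ranged, unpinned) can be told apart mechanically. The following is a simplified sketch, not the method pyup.io used for its study: the helper names `classify_requirement` and `summarize` are hypothetical, and the parser deliberately ignores extras, environment markers, wildcard pins and URL requirements.

```python
import re
from collections import Counter

# Operators that express a range rather than one exact version.
RANGE_OPERATORS = re.compile(r">=|<=|~=|!=|>|<")


def classify_requirement(line):
    """Return 'pinned', 'range', 'unpinned', or None for non-requirement lines."""
    spec = line.split("#", 1)[0].strip()   # drop trailing comments
    if not spec or spec.startswith("-"):   # blank lines and options like "-r base.txt"
        return None
    if RANGE_OPERATORS.search(spec):
        return "range"
    if "==" in spec:
        return "pinned"
    return "unpinned"


def summarize(lines):
    """Count how many requirements fall into each style."""
    return Counter(kind for kind in map(classify_requirement, lines) if kind)
```

For example, `Django==1.8.18` counts as pinned, `requests>=2.0,<3.0` as a range, and a bare `SQLAlchemy` as unpinned.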

00:00 Another thing that they said that was pretty interesting was Django 1.8 – even though I think 1.10, 1.11 is their latest – is the most popular. That was pretty cool. But one of the things I really wanted to point out here is, they said that what is more worrisome is 1.9, 1.7 and 1.6 are 2nd, 3rd and 4th most popular on the list. Why is that a problem? None of them are receiving any security updates at all.

00:00 OKKEN: Oh, weird.

00:00 KENNEDY: So, 1.7 and 1.6 went End Of Life over two years ago. If you are on the web and your application listens on a socket, you want it to have all the security patches, let me tell you. That’s bad news. If I add those up really quick, that’s something like 40% of Django files they found are using these older versions. In fact, they said, only 2% out of all Django projects they could find are actually on a secure release. Among all the projects, more than 60% use Django releases with one or more known security vulnerabilities. That’s pretty intense. Only 2% are on a 100% secure release.

00:00 OKKEN: Clearly, it’s recommended to make sure that you are using a secure release. I was curious about the pinning or freezing. Is that considered best practice?

00:00 KENNEDY: I think it depends on what you’re doing. For large, complicated applications it’s definitely considered a best practice. The idea is, you want to make the upgrade in your dependencies at the time of your choosing. If you’re going to upgrade – especially major frameworks like Django – if you’re going to go from Django 1.8 to 1.9, you don’t want that to just happen one day when it’s released and you happen to refresh your server. That might have breaking changes. You want to explicitly say, ‘I depend on this one. There’s a new one out. Let me test the new one,’ then explicitly change that number and have it flip it for you. And basically, that’s what the pyup.io service does that I use. It will automatically upgrade my Pyramid web framework from 1.7 to 1.8 to 1.9 but it doesn’t flip it immediately. It will tell me and change my requirements files as a PR, but I have to accept it, basically.

00:00 But pretty interesting stats there. Especially if you’re into Django, check that out.

00:00 OKKEN: Yeah, definitely. It’s kind of concerning. Was this projects or applications, or is there a difference?

00:00 KENNEDY: So, as far as I can tell, I don’t really know. I think the guy who wrote it could maybe chime in in the comments if he’s listening. My understanding is, basically, they went and they studied the public repos that use Django. This also may not be quite representative because companies like Pinterest that depend on Django, they’re obviously not going to make their code public, so they may be doing slightly different things. But still, it’s an interesting view at least into the Open Source side of Django.

00:00 OKKEN: Definitely. That’s cool.

00:00 KENNEDY: Speaking of Open Source projects, do you think that they should have a changelog?

00:00 OKKEN: Well, that’s what I was curious about, yeah. I am warming to the idea of changelogs. I appreciate other projects with changelogs. I actually asked some people on Twitter what they thought of them. There’s a couple things I came across which was a website called, keepachangelog.com.

00:00 KENNEDY: I really like that site. It’s so clear and compelling, it’s great.

00:00 OKKEN: Yeah. It also talks about how if there is a standard format for them, this is as close as you can get. It talks about different standards in either reStructuredText or in Markdown; there’s different ways to do it. And then when I was talking on Twitter about changelogs, some of the people from the pytest project piped up and said that they’re using a tool called Towncrier (github.com/hawkowl/towncrier) to maintain their changelog.

00:00 KENNEDY: That looks really cool but I’ve never done anything with it. What does Towncrier do?

00:00 OKKEN: What it does is, you keep a separate directory within your project if you’re using different branches. Different changes go in and you keep the changes in little snippet files. Since they’re separate files, they merge easy because there’s going to be a new file for each change. Then you go through and say, ‘Okay, I’ve pulled all these things in, I’m going to go ahead and take everything in the directory and add it into the change log.’

00:00 KENNEDY: I see, you can keep a separate file system. ‘These are the breaking changes. These are the new features.’ Then it will build a changelog out of them?

00:00 OKKEN: Yeah. And it can add to your existing one. One of the things I liked, if you’re not doing something like Towncrier, one of the recommendations from keepachangelog.com was to keep at the top unreleased changes, things that you haven’t put a label on or haven’t done an official supported release yet. Those are things that you may end up kicking out.
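
The "one small file per change, assembled at release time" workflow can be sketched with the standard library alone. This is not Towncrier's code: `build_changelog` and `SECTION_TITLES` are hypothetical names, the Markdown output layout is an arbitrary choice, and the `<issue>.<type>` file naming (for example `12.feature`) is modeled on Towncrier's fragment convention as an assumption.

```python
import os
import tempfile
from collections import defaultdict

# Assumed fragment types and the section heading each one maps to.
SECTION_TITLES = {"feature": "Features", "bugfix": "Bugfixes", "removal": "Removals"}


def build_changelog(fragment_dir, version):
    """Collect every fragment file in fragment_dir into one changelog section."""
    entries = defaultdict(list)
    for filename in sorted(os.listdir(fragment_dir)):
        issue, _, kind = filename.partition(".")
        if kind not in SECTION_TITLES:
            continue  # ignore anything that is not a known fragment type
        with open(os.path.join(fragment_dir, filename), encoding="utf-8") as f:
            entries[kind].append(f"- {f.read().strip()} (#{issue})")

    lines = [f"## {version}", ""]
    for kind, title in SECTION_TITLES.items():
        if entries[kind]:
            lines += [f"### {title}", "", *entries[kind], ""]
    return "\n".join(lines).rstrip() + "\n"


def demo():
    fragments = {
        "12.feature": "Add export command",
        "15.bugfix": "Fix crash on empty input",
        "17.feature": "Support JSON output",
    }
    with tempfile.TemporaryDirectory() as folder:
        for filename, text in fragments.items():
            with open(os.path.join(folder, filename), "w", encoding="utf-8") as f:
                f.write(text + "\n")
        return build_changelog(folder, "1.1.0")
```

Because each change lives in its own file, two branches that both add an entry never touch the same lines, which is the merge-friendliness described above.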

00:00 KENNEDY: You may also have some things that you shouldn’t do like, don’t just take your git changelog and make that your proper changelog. Things like that.

00:00 OKKEN: Yeah. When I was doing research for this, one of the things that I saw was various automated ways to do it. But with that sort of thing, you’re going to pull things out of file changes and that’s not really what you want. You really want a human, moderated list of things that went in. That’s one of the reasons why I like Towncrier, because it’s sort of halfway in between.

00:00 KENNEDY: Yeah. It’s a nice way to manage that because you don’t want merge conflicts. ‘I took a PR, I changed the spelling, I accepted this.’ You don’t need all that noise. You just want the four things that changed. ‘Do I want to upgrade to this or not?’ Whatever.

00:00 OKKEN: I guess I would lump this in, the last time we talked about different decisions based on scaling and for projects that I'm the main maintainer of, I would definitely just keep a file. But if we start getting a lot of contributors, then something like Towncrier totally makes sense.

00:00 KENNEDY: Yeah, I think it's really nice. I’m going to definitely look into it.

00:00 Last thing I want to talk about is asynchronous programming, which is something that I talk about often because I’m a big fan. This is an article called, “Understanding Asynchronous Programming in Python” by Doug Farrell from Dan Bader’s site. We've had some of Doug’s stuff on before, he does good writing. He works at Shutterfly doing Python there, so he takes some of his experience and puts it in this article. It’s pretty cool. I would describe this as a very friendly introduction to asynchronous programming.

00:00 It starts out and says, ‘Let's imagine a web server. Could it be synchronous? Sure, it would be fine if we had a synchronous web server. We could optimize the heck out of it, but no matter how much we optimize it, at some point you are waiting on a thing and you want to go do other stuff.’ For example, it’s like shipping the HTML back to the browser on a slow network. You want to be processing other requests and do that in the background. So, he's got something to the effect of like 8 or 9 examples. To start them off he says, ‘Look, the real world is asynchronous. For example, if you're a parent, kids are a long running task with high priority superseding any other task you might be doing, like checkbook balancing or laundry,’ or something like this. He uses a lot of analogies back to real life that are pretty cool. Then he says, ‘Okay, we're going to go through some examples, like 8 examples and build them up. Start with a synchronous sort of job-doing program that has a queue. You put some work in the queue, it does the work.’ And then it says, ‘Let's see how we can use generator methods with the yield keyword to set up cooperative multithreading or cooperative concurrency, I guess, between those two methods.’ Which is actually a really cool way to do it, where there's no concurrent I/O, there's no threads, there's no multiprocessing; it's like let's interweave the work of these two methods, the multiple methods using generators, which I thought was really a cool way to look at it. He says, ‘Okay, what if some of that work is slow? That's a problem.’ And then he takes you on a tour of different APIs of libraries to make this work. So, gevent, Twisted, Twisted callbacks. So, you can compare all these different ways of doing things, and I should throw in there some aiohttp-type things as well. But yeah, very very cool article, if you want a super gentle introduction to asynchronous programming.
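
The generator-based interleaving described here can be shown in a few lines. This is a minimal sketch written for this transcript, not code from Doug Farrell's article: the names `worker` and `run_round_robin` are hypothetical, and the "work" is just appending to a list.

```python
from collections import deque


def worker(name, steps, log):
    """A task that does one unit of work, then yields control."""
    for step in range(1, steps + 1):
        log.append(f"{name} step {step}")
        yield  # hand control back to the scheduler


def run_round_robin(tasks):
    """Advance each generator in turn until all of them are finished."""
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)          # run the task up to its next yield
        except StopIteration:
            continue            # task is done; do not reschedule it
        queue.append(task)      # otherwise put it at the back of the line


def demo():
    log = []
    run_round_robin([worker("A", 2, log), worker("B", 3, log)])
    return log
```

There are no threads or processes involved: the two tasks take turns only because each one voluntarily yields, which is what makes the concurrency cooperative. It also shows the weakness raised next in the discussion, since one slow step would hold up every other task.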

00:00 OKKEN: So, this doesn't cover the AIO…

00:00 KENNEDY: Exactly, yeah. It doesn't cover the 3.5 stuff. So, this would work on any version.

00:00 OKKEN: I really like this article because we've been talking about asynchronous for a while and I have to admit, I have a hard time getting my head around how to think about it. I've been doing it for so long in C++, but I have a hard time getting my head around it in Python. This article is really a good starter.

00:00 KENNEDY: Yeah, I feel like it's definitely a good starter. I was happy to have it as one of our picks this week.

00:00 So, that's all the news we have that we've found. But you have extra credit, don't you?

00:00 OKKEN: Well, yeah. In Episode #29, I gave credit to the wrong person for clueing me into pip cache.

00:00 KENNEDY: I’m sure they appreciated it, though.

00:00 OKKEN: Yeah, but it really was @kidpixo and he reminded me that it was him. So, sorry about that and thanks a lot for keeping us informed.

00:00 KENNEDY: Yeah, definitely. We really appreciate these ideas and these notes, and these little topics people send us. They're very nice.

00:00 OKKEN: And then I couldn't resist. This is going to be hard to do over a podcast but we have a link to a funny comic about Python Private Methods. And if you haven't seen this, check it out. It's basically a key under the mat in front of a door. (Laughs)

00:00 KENNEDY: (Laughs) I love it. That's really awesome. Yeah, that's kind of the thing. It’s private, unless you want to look for it, then it’s right there. Nice.
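
For readers who have not seen the mechanism behind the joke: a double-underscore method is only renamed (name-mangled), not locked away. The `House` class and the helper `try_private_call` below are hypothetical names invented for this illustration.

```python
class House:
    def __unlock(self):        # "private": actually stored as _House__unlock
        return "door opened"

    def _spare_key(self):      # single underscore: private by convention only
        return "key under the mat"


def try_private_call(house):
    """Calling the double-underscore name from outside the class fails."""
    try:
        return house.__unlock()
    except AttributeError:
        return "AttributeError"


house = House()
blocked = try_private_call(house)      # the plain name is not found
opened = house._House__unlock()        # the mangled name works fine
```

The mangled name `_House__unlock` is the key under the mat: anyone who looks for it can still call the method.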

00:00 Alright, so update us on the book.

00:00 OKKEN: The book is coming along and taking almost all of my time. The multitasking is a hard thing. But the third beta is coming out, it should be out this week with the last chapter, chapter seven. This one is using pytest with other tools like pdb and coverage and mock and tox and Jenkins, and things that I get a lot of questions about. So, I'm really happy to get this chapter out.

00:00 KENNEDY: Yeah, that's awesome.

00:00 OKKEN: How about you?

00:00 KENNEDY: Last time we talked, I was recording and recording and recording Talk Python episodes. So, now I'm kind of finishing up recording courses. I've actually got two 8 and 9 hour courses that I’ve finished recording after the last couple of weeks. I have finished recording the “RESTful and HTTP Services in Pyramid” and I've also finished writing and recording the “MongoDB for Python Developers” courses. I'm working on editing the final videos for those and getting those up. I’m really excited to get that out. Really fun.

00:00 OKKEN: I'm really excited to take a look at that MongoDB course. That sounds very interesting.

00:00 KENNEDY: It's a cool hands-on one. We built this database that represents a dealership and it’s got like, millions of records in it. We get it to where it will do queries in like one millisecond with millions of records. It's fun.

00:00 OKKEN: Nice.

00:00 KENNEDY: That's our news for the week. Brian, thank you so much for as always sharing with everyone.

00:00 OKKEN: Thank you.

00:00 KENNEDY: See you all later.

00:00 Thank you for listening to Python Bytes. Follow the show on Twitter via @pythonbytes and get the full show notes at pythonbytes.fm. If you have a news item you want featured, just visit pythonbytes.fm and send it our way. We’re always on the lookout for sharing something cool. On behalf of myself and Brian Okken, this is Michael Kennedy. Thank you for listening and sharing this podcast with your friends and colleagues.
