Brought to you by Michael and Brian - take a Talk Python course or get Brian's pytest book


Transcript #31: You should have a change log

Return to episode page view on github
Recorded on Tuesday, Jun 20, 2017.

00:00 Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds.

00:06 This time it's Python Bytes episode 31 recorded on Tuesday, June 20th, 2017.

00:12 I'm Michael Kennedy.

00:13 And I'm Brian Okken.

00:14 And we have a bunch of cool things to talk about.

00:17 Some of them are huge and some of them are kind of tiny.

00:19 Let's start small, huh?

00:20 Yeah, let's start small.

00:21 I'm really appreciate, it's one of the reasons why I like following Twitter for Python news is that's where I found Tiny Mongo. So I saw somebody talking about it last week.

00:34 That's awesome. I'm a fan of MongoDB and TinyDB and if they could come together, that'd be even better. Right. So this is essentially an attempt to put, it's not an exact same interface, but it's fairly close to the same interaction you do with Mongo with a the single file system.

00:53 So it's a single file database.

00:56 And Steven, the person working on this, did not--

01:01 I talked with him, and it wasn't his intent to always be right on top of TinyDB.

01:07 But so far, he's been really happy with TinyDB as the back end for TinyMongo.

01:12 And so, yeah, it just sits-- it's using TinyDB as the database part, but exposes an interface that's very close to Mongo.

01:20 Yeah, that's super cool.

01:21 So basically, if you have code that talks to MongoDB through the PyMongo API, you could more or less adapt that really quickly to TinyMongo.

01:32 And TinyDB, the backing store for this thing, more or less is like, let's create a simple document database that's really just some JSON files living on your disk.

01:44 It's not a full-on production database, but if you're doing simple stuff, like really simple things, like this is actually pretty sweet.

01:50 There's no server, right?

01:51 Right, and yeah, there's no server.

01:53 I would say probably the other direction probably works the best.

01:56 So if you were gonna, your end goal was to use Mongo, that tiny Mongo might be a good way to start because it isn't the full set of functionality.

02:07 I don't have a complete list of what's missing.

02:11 I just have the personal experience of, I tried to take a Mongo application and just swap this in, and I ran across a few errors and I haven't finished debugging those yet.

02:22 I'm just really excited about it because there's more than one document database that I can use in small applications.

02:30 - Yeah, that's cool.

02:31 - And then also, one of the applications for this, when I was talking with the maintainer of it, is that he's using it on Raspberry Pis even.

02:40 So having a Mongo-like--

02:42 - That is really cool, 'cause you don't wanna start up a whole separate server on like a Raspberry Pi, but certainly having a little couple of JSON files laying around that you have like a database interface over top of, that's cool.

02:53 - Yeah, definitely.

02:54 So I was excited about this and I'm gonna start using it right away.

02:58 - That's sweet.

02:59 Yeah, if people are interested in TinyDB, I back on episode 80 of Talk Python many moons ago, I interviewed the guy who created TinyDB and talked about some of the use cases.

03:10 And I think there's some extensions you can get like indexing add-ons and stuff like that.

03:14 So there's a lot of stuff to do with this, pretty cool.

03:17 So that sounds pretty dead simple, right?

03:18 Just fire up TinyDB and off you go?

03:21 - Yeah, dead simple.

03:22 - You know what else?

03:23 I want some dead simple validation.

03:25 And so the next project I chose is called Validus.

03:28 And Validus is on GitHub and it's described itself as a dead simple Python data validation library.

03:34 And have you ever tried to write a regular expression to match an email or a URL or something like that?

03:41 - Oh yes, yeah.

03:42 - That's super fun, right?

03:43 - No.

03:44 (laughing)

03:45 You think you get it working to someone who mails you like I have a proper email address, but I can't sign up your system.

03:50 It says my email is invalid.

03:52 You're like, oh, gosh.

03:53 So this validist thing kind of like solves that for a class of types of data, basically simple input.

03:58 So you can just import this and say validist.isEmail and give it a string and it will say yes or no.

04:04 And you can ask it questions like, is it an RGB color?

04:06 Is it a phone number?

04:07 Is it an ISBN?

04:08 Is it an IPv4 or IPv6 address?

04:11 Is it a number?

04:12 Is it a slug?

04:13 would it fit at the end of a URL without needing encoding, all that kind of stuff.

04:18 It's pretty awesome.

04:19 That's cool.

04:20 I'd say it's dead simple.

04:21 It's even got is Mongo ID.

04:23 Nice.

04:24 Yeah, yeah, that's awesome.

04:25 So you know what else I like about this?

04:28 It's Python only, no legacy Python.

04:31 3.6, 3.3.

04:32 Yeah, yeah, yeah, 3.3 and above.

04:34 So it's only a Python 3 thing.

04:36 So yet another sweet example of that.

04:38 I have a lot of interesting stuff to say about that at the end of the show.

04:41 Validus but Python versus legacy Python.

04:44 While this works pretty well, we may still need to jump in the debugger, right?

04:48 Yeah, definitely.

04:49 And I'm a command line debugger kind of person.

04:52 Actually I don't really jump into the debugger too much.

04:55 You're a last resort, debugger of last resort type person?

04:58 Yes.

04:59 Yeah, definitely.

05:00 Last resort.

05:01 And so in episode 29, we talked about launching the ability to launch PDB, the Python debugger from a failed Py test.

05:10 on Twitter, another Twitter person, KidPixo, I think.

05:14 Yeah, KidPixo, he runs the Geek Cookies Italian podcast, which I was a guest on like two and a half years ago.

05:20 He's a great guy.

05:21 He sends us lots of good stuff, yeah.

05:22 Well, he passed this along because he said he really loves the PUDB debugger.

05:28 And my first reaction is, "Oh my God, this thing is ugly," because it does look like you're back in the '80s running on a 386 or something.

05:36 I feel like I've dialed into a VBS.

05:38 It does have themes, so after I played with it for a while, I switched it to a midnight theme, and it looks just like I'm in my editor. It's actually pretty slick, and one of the things that you can do with it, it's a lot better than PDB, and it's still small and fast, and there's some documentation in it for how you can do the same thing that we did with pytest. You can launch it just with whenever you hit a pytest failure.

06:07 That's sweet.

06:08 - Pretty cool.

06:09 - Yeah, it's really nice.

06:10 I mean, you can use it over like SSH and stuff.

06:12 So if you're SSH into a server, you can debug with this, but it actually has like little windows.

06:17 I mean, it really does feel like I'm back on a BBS.

06:19 It's awesome.

06:20 Like you see your code and you can step through it.

06:22 You've got like a variables window and a stack and break points.

06:25 And like, it's really nice.

06:26 It's like a ASCII curses type thing.

06:28 - But the local, yeah, the local window already having your listing up and also all your local variables And that changing when you go up and down the stack is, it's just, it's usually enough.

06:41 So I like it.

06:42 - Yeah, yeah, yeah, it definitely hits the sweet spot, like the 80% case for debuggers.

06:47 It's cool.

06:48 All right, so I'm definitely gonna start using that if I need to debug anything without a Windows environment, a windowing environment like macOS or Linux or Windows.

06:56 Okay, so the next thing that I wanna talk about is a really interesting and sort of wide ranging study that the guys at pyup.io did.

07:07 So pyup.io is a cool service.

07:09 I'm actually a paying customer of theirs because I really think what they're doing is awesome and I use it for my web apps.

07:14 So the idea is you basically point, you give pyup.io access to your requirements file in your public or private GitHub repo.

07:24 And if there's a new version of Indy requirement or transitive requirement that you depend upon, it will tell you like, hey, there's a new release of the pyramid web framework.

07:33 And here's the change log.

07:34 And actually, this one's a security update.

07:36 So get in there and fix it quick.

07:38 So it will like basically watch your requirements and tell you if there are any upgrades and things like that.

07:42 And it'll issue them as a pull request.

07:43 So really cool.

07:44 So these guys have access to all these requirements, files and many other things, right.

07:48 And they studied some Django requirements files on GitHub.

07:52 Now this isn't through their business, they were able to use BigQuery to just get ahold of all of the Django requirement files that are on GitHub.

08:00 And they found some interesting things.

08:01 And I guess this is not private, not the private repos, probably just the public ones.

08:05 But anyway, they said that Django is the most popular web framework.

08:10 And it's pretty old.

08:11 It's been around for 12 years using all sorts of different projects.

08:14 So let's look at these requirements files, which specify like all the dependencies you have to install, and see what we can get from them.

08:22 So the first thing they ask is do developers pin or freeze their requirements, right?

08:27 That's where in your requirements, txt, you could say I depend on Django, and I depend on SQLAlchemy, and I depend on requests, or you could say it depend on Django equal equal this version request equal equal that version, right?

08:38 That's pending and freezing.

08:40 And they said that 64% of Django developers pin their requirements.

08:45 That's interesting.

08:46 And another 20% or so do ranges.

08:49 So like I'm willing to take this range of versions but not leave it unpin.

08:56 And then some of them are just like, give me whatever I get when I ask for it.

08:59 So that's interesting.

09:00 Another thing that they said was pretty interesting is that Django 1.8, even though I think 1.10, 1.11 is the latest, Django 1.8 is the most popular of them.

09:11 And that was pretty cool.

09:13 But one of the things I really wanted to point out here is they said that what is more worrisome is 1.9, 1.7, and 1.6 are second, third, and fourth most popular on the list.

09:27 Why is that a problem?

09:28 None of them are receiving any security updates at all.

09:32 Oh, weird.

09:33 So what, okay, isn't that bad?

09:35 So 1.7 and 1.6 went end of life over two years ago.

09:38 So if you are on the web and your application listens on a socket, you want it to have all the security patches, let me tell you.

09:46 That's bad news.

09:47 And here's like, if I add those up really quick, that's something like 40% of Django files they found are using these older versions.

09:55 And in fact, he said only 2% of all Django projects they could find are actually on a secure release.

10:02 Among all the projects more than 60% use Django releases with one or more known security vulnerabilities.

10:08 And that's pretty intense, man, that only 2% of them are on a 100% known secure release.

10:14 Well, I mean, clearly, it's recommended to go make sure that you're using a secure release.

10:19 But I was curious about the pinning or freezing.

10:23 Is that considered best practice?

10:26 So I think it depends on what you're doing.

10:28 For large, complicated applications, it's definitely considered a best practice.

10:32 The idea is, you want to make the upgrade in your dependencies at the time of your choosing, right?

10:39 Like you want to have...

10:40 If you're going to upgrade from, especially major frameworks like Django, if you're going to go from Django 1.8 to 1.9, you don't want that to just happen one day when it gets released and you happen to refresh your server because that might have breaking changes.

10:53 So you want to explicitly say, "I depend on this one.

10:55 Oh, there's a new one out.

10:56 Let me test the new one," and then explicitly change that number and have it flip it for you.

11:03 And basically, that's what the PyUp service does that I use.

11:07 it will automatically upgrade my Pyramid web framework from 1.7 to 1.8 to 1.9, but it doesn't flip it immediately.

11:14 It's like I have to, it'll tell me and change my requirements files as a PR and I'll have to accept it basically.

11:19 - Okay.

11:20 - Yeah, but pretty interesting stats there, especially if you're into Django, check that out.

11:25 - Yeah, definitely.

11:25 It's kind of concerning that there's so many, and then those are, sorry to hang out on this so much, but was this projects or applications and is there a difference?

11:36 So as far as I can tell from the, I don't really know, Yanis, I think this guy who wrote it probably could maybe chime in in the comments if he's listening, but my understanding is basically they went and they studied the public repos that use Django.

11:51 So this also may not be quite representative because companies like Pinterest that depend on Django, they're obviously not going to make their code public, right?

12:01 So they may be doing slightly different things.

12:03 But still, it's interesting for you and to at least the open source side of Django.

12:08 - Definitely, it's cool.

12:09 - Speaking of open source projects, do you think they should have a changelog?

12:12 - Well, that's what I was curious about.

12:15 Yeah, so I kind of am warming to the idea of changelogs.

12:19 I appreciate other projects with changelogs.

12:21 I actually asked some people back on Twitter again what they thought of them.

12:26 And there's a couple things I came across, which was a website called Keepachangelog.

12:31 - I really like that site.

12:32 It's so clear and compelling.

12:34 It's great.

12:35 Yeah.

12:36 Well, it's also, it talks about that there really isn't a standard, if there is a standard format for them, this is probably as close as you can get.

12:45 And there's, it talks about different standards in either REST or in Markdown.

12:50 There's different ways to do it.

12:52 And then when I was talking to, talking on Twitter about changelogs, some of the people from the pytest project piped up and said that they're using a tool called TownCrier to maintain their change log.

13:06 That looks really cool, but I've never done anything with it.

13:08 What's TownCrier do?

13:09 So what it does is you keep a separate directory within your project so that you can have it on different, if you're using different branches, and then different changes go in and you keep the changes in little snippet files so that since they're separate files, they merge easy because they're going to be a new file for each change.

13:30 Then you go through and say, "Okay, I've pulled all these things in.

13:34 I want to go ahead and take everything in the directory and add it to the changelog." >> Oh, I see. You can keep a separate file that says, "These are the breaking changes.

13:42 These are the new features," or whatever, then it'll build a changelog out of them?

13:46 >> Yeah.

13:46 >> Oh, sweet. Okay.

13:47 >> Well, it can add to your existing one.

13:50 And one of the things I liked, if you're not doing something like Town Crier, one of the recommendations from Keep a Change Log was to keep at the top unreleased changes so that you, things that you haven't put a label on or done an official supported release yet.

14:06 Because those are things that may, I don't know, maybe you may end up kicking out.

14:10 Yeah, they also have some things that you shouldn't do, like don't just take your Get Change Log and make that your proper change log, things like that.

14:18 Yeah, and the one of the things there I saw when I was doing some research for this I did see some some various automated ways to do it but that's the sort of thing is you're gonna pull things out of file changes and that's not really what you want. You really want a human moderated list of things that went in and That's one of the reasons why I like Town Crier because it was sort of halfway in between. Yep. Yeah, it's it's definitely Really really it's like a nice way to sort of manage that human but because you don't want merged conflict took PR Accepted this I I changed the spelling like, you know, you don't need all that noise You just want the four things that change do I want to upgrade to this or not? Whatever. Let's just move on, right?

15:00 Yeah and then I guess I would lump this in last time we talked about different decisions based on scaling and for projects that I'm just I'm the main maintainer of I I would definitely just keep a file.

15:13 But if we start getting a lot of contributors, then something like Town Crier totally makes sense.

15:19 Yeah, I think it's I think it's really nice.

15:21 I'm gonna definitely look into it.

15:22 All right, last thing I want to talk about is asynchronous programming, which is something that I talk about often because I'm a big fan.

15:30 This is an article called understanding asynchronous programming in Python by Doug Farrell from Dan Bader site.

15:37 And we've had some of Doug stuff on before he does good writing.

15:40 He works at Shutterfly doing Python there.

15:42 He takes some of his experience and puts it in this article.

15:45 It's pretty cool.

15:46 What I would call or describe this as is a very friendly introduction to asynchronous programming.

15:53 It starts out and says, "Let's imagine a web server." Could it be synchronous?

16:00 Sure, it would be fine if we had a synchronous web server.

16:03 We could optimize the heck out of it, but no matter how much we optimize it, at some point you're waiting on a thing, and you want to go do other stuff.

16:11 For example, just like shipping the HTML back to the browser on a slow network, right?

16:16 Like you want to be processing other requests and do that in the background.

16:20 So he's got something to the effect of like eight or nine examples.

16:25 And to sort of start them off, he says, look, the real world is asynchronous.

16:29 For example, if you're a parent, kids are long running task with high priority, superseding any other tasks you might be doing, like a checkbook balancing or laundry or something like this.

16:40 So he's a lot of like analogies back to real life that are pretty cool.

16:43 Then he says, Okay, we're gonna go through some examples, like eight examples and build them up.

16:47 Start with like a synchronous sort of job doing program that has a queue, you put some work in the queue, it does the work.

16:53 And then it says, Alright, let's see how we can use generator methods with the yield keyword to instantiate like cooperative multi threading or cooperative concurrency, I guess, between those two methods, which is actually a really cool way to do it where there's, there's no concurrent IO, there's no, there's no threads, there's no multi processing, it's just like, let's interweave the work of these two methods or multiple multiple methods using generators, which I thought was really a cool way to look at it.

17:19 This is okay, well, but what if some of that work is slow, that's a problem.

17:23 And then he kind of takes you on a tour of different API's and libraries to make this So G event, twisted, twisted callbacks.

17:30 And so you can compare all these different ways of doing things.

17:33 And I should throw in there some AIO HTTP type things as well.

17:37 But yeah, very, very cool article if you want a super gentle introduction to asynchronous programming.

17:42 So this doesn't cover the AO--

17:44 AI.

17:46 AI.

17:47 Yes, exactly.

17:47 Yeah, it doesn't cover basically the 3.5 stuff.

17:50 OK.

17:50 Yeah.

17:51 So this would work on any version.

17:53 I really like this article because we've We've been talking about asynchronous for a while and I have to admit I have my hard time getting my head around how to think about it.

18:02 I've been doing it for so long in C++ but I have a hard time getting my hand around it in Python and this article is really a good starter.

18:10 Yeah, I feel like it's definitely a good starter.

18:13 I was happy to pick one of our picks this week.

18:15 All right, so that's all the news that we have that we've kind of found but you have extra credit, don't you?

18:21 - Well yeah, in episode 29, I gave the wrong, credit to the wrong person for cluing me into Pipcache.

18:29 - I'm sure they appreciated it though.

18:31 - Yeah, but it really was KidPixo, and he reminded me that it was him, and so sorry about that, and thanks a lot for keeping us informed.

18:40 - Yeah, definitely keep, we really appreciate these ideas and these notes and these little topics people send us.

18:46 They're very nice.

18:46 - And then I just had, I couldn't resist, this is gonna be hard to do over a podcast, but we have a link to a funny comic about Python private methods.

18:57 And if you haven't seen this, check it out.

18:59 It's just, it's basically a key under the mat in front of a door.

19:03 (both laughing)

19:07 - I love it, I love it, that's really awesome.

19:09 Yeah, that's kind of the thing, it's like, it's private unless you wanna look for it, then it's right there.

19:13 - Yeah.

19:14 - Nice, all right, so update us on the book.

19:18 - The book is coming along and taking almost all of my time.

19:22 The multitasking is a hard thing.

19:25 But yeah, the third beta is coming out, should be out this week with the last chapter, chapter seven.

19:31 And this one is using pytest with other tools like PDB and coverage and mock and talks and Jenkins and things that I get a lot of questions about.

19:41 So I'm really happy to get this chapter out.

19:44 - Yeah, that's awesome.

19:45 - How about you?

19:46 - Yeah, last time we talked, I was recording and recording and recording talk Python episodes.

19:50 So now I'm kind of finishing up recording courses.

19:52 So I've actually got two eight and nine hour courses that I've finished recording over the last couple of weeks.

19:59 So I've finished recording the RESTful and ACP services in Pyramid.

20:03 And I've also finished recording and writing recording the MongoDB for Python developers courses.

20:08 So I'm working on editing the final videos for those and getting those up.

20:11 So I'm really excited to get that out.

20:13 Really fun.

20:14 - I'm really excited to take a look at that MongoDB course.

20:16 That sounds--

20:17 - Very interesting.

20:18 - It's a cool hands-on one.

20:19 We build like this database that represents a dealership and it's got like millions of records in it.

20:24 We get it to where we'll like do queries in like one millisecond, even with millions of records.

20:28 It's fun.

20:29 - Nice.

20:30 - Yeah.

20:31 - Cool.

20:32 - All right, well, that's our news for the week.

20:33 Brian, thank you so much for, as always, sharing with everyone.

20:37 - All right, thank you.

20:38 - Yep, see you all later.

20:39 Thank you for listening to Python Bytes.

20:42 Follow the show on Twitter via @pythonbytes.

20:45 That's Python Bytes as in B-Y-T-E-S.

20:48 And get the full show notes at pythonbytes.fm.

20:51 If you have a news item you want featured, just visit pythonbytes.fm and send it our way.

20:55 We're always on the lookout for sharing something cool.

20:58 On behalf of myself and Brian Auchin, this is Michael Kennedy.

21:02 Thank you for listening and sharing this podcast with your friends and colleagues.

Back to show page