« Return to show page
Transcript for Episode #75:
pypi.org officially launches
00:00 KENNEDY: Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds, this is episode 75, recorded April 25th, 2018. I'm Michael Kennedy.
00:10 OKKEN: I'm Brian Okken.
00:11 KENNEDY: We've got a bunch of awesome stuff lined up for you like always, but before we get to it I'm going to say thank you to Datadog. They're sponsoring this episode like they have many and they're a big part of why the show keeps going so thank you, Datadog. Brian, people often refer to the thing that runs your Python code when you type Python as the interpreter, but only sometimes does it actually interpret. Sometimes it compiles, sometimes it JIT compiles depending on what underlying thing you're using to run it, right? PyPy versus CPython, say, Python.net.
00:46 OKKEN: Correct.
00:47 KENNEDY: You found an interesting- for one like that, for this week, right?
00:49 OKKEN: I had to go look it up, I was surprised we hadn't covered it yet. There's a package called Numba, and it's built on top of NumPy, but it's a fairly easy way to speed up a bunch of your code by adding a just-in-time compiler to it. I've also linked to the library itself, and also to an introduction article that I found, that it's pretty nice. It's pretty easy to use and this article that we're linking to has some speed-up tests where you just put the little JIT decorator on a function that you want to speed up and there's a bunch of flags to it, you can try to do it ahead of time or not but it'll compile down some of your code to C code as good as it can, and you can specify data types too, if you want to even allow it to be even faster. I'm pretty blown away how trivial it is to add some of this stuff to your code and speed it up. Some of the speedups are like 78 times faster and stuff, depending on your algorithm.
01:53 KENNEDY: That is really awesome, and this thing lives in such an interesting place, like when I set the stage, it's not CPython, but it's also not a full compiler. It's not a JIT compiler either, it doesn't attempt to do what would be a general solution, anything; like PyPy is a general purpose JIT compiler, trying to make all of your Python fast, so they say that this is a compiler for Python arrays and numerical functions period. That's pretty awesome, right?
02:25 OKKEN: There's a lot of people working with that sort of stuff that, a lot of the code doesn't have to be really fast, but they've got a big blob of data they've got to deal with and run some algorithm over it, and in those cases with a big array, that's a great place to apply this.
02:40 KENNEDY: Yeah, it's really cool, and so if you have tight loops that are slow because of some NumPy thing or arrays, or other numerical operation, slap a decorator on it, make it go fast, pretty sweet.
02:52 OKKEN: Yeah and we had a comment from somebody that didn't know whether or not we talked about libraries, they thought we just talked about articles and stuff, and I do want to include libraries, so I've been trying to make sure going back and some of the ones that people really ought to know about if they don't know already, we'll try to get those in here and there. It's good.
03:10 KENNEDY: Yeah, I'm sure, absolutely. If we haven't talked about it, it's definitely game for being covered on the show. Speaking of libraries that people care about, what's the most common way to get them?
03:19 OKKEN: Pip install.
03:19 KENNEDY: Pip install. Well, as of recently, a couple days ago, now you pip install with pip 10, so if you haven't yet, you should install pip install --upgrade pip. Maybe put a pip 3 on the front, depending on what OS you're on. There's your brand new version of pip. And a major version as well, that's pretty cool right?
03:39 OKKEN: Yeah so we do we get new, do you know?
03:41 KENNEDY: Yeah, so we no longer have Python 2.6 support, it is out, so if you're living way way in the past, like 2006, it was good enough for me, it was good enough for my code, then you should stick on pip 9, because it's no longer supported. It supports this new feature that's described by pep 5.18, which allows you to to specify what packages are required to build from source, so that's kind of cool. You can say these are the requirements to build from source and these are other requirements, it improves Unicode on windows, it has a pip config command. I've never used pip config, have you?
04:20 OKKEN: I don't think so.
04:21 KENNEDY: I think it's new. Apparently you can set up default behaviors and stuff like that and it can be based on virtual environments, or it can be based on users or it can be based on machines, so this is probably going to be a a nice handy thing to dig into. Maybe we should do a whole section on just leveraging pip config. Also if you pip install --upgrade a thing that used to try to upgrade all the dependencies if I remember correctly, now the default upgrade strategy will be only to upgrade dependencies if needed. If the thing says I require docopt 6 and you have docopt 4, it'll upgrade it. But if it doesn't say it requires that version, then it'll just leave it alone. So that could be for better or worse, not sure. And then a bunch of bug fixes.
05:00 OKKEN: Okay, neat.
05:01 KENNEDY: Yeah, so if you're out there and you're not living super far in the past, be sure to upgrade your pip. It's nice there's even been a point-release since then.
05:08 OKKEN: One of the things I really like in pip 10, this is a minor thing, is that when you type pip list it lists your packages and it used to, in pip 9 it would give you a warning that there's table configuration and you could do the old style or the new style and that's gone. In pip 10 it just gives you the table, and I like that.
05:29 KENNEDY: Yeah, that's really awesome. I didn't like that warning there, because if you're doing like screen casts or videos it looked like something was wrong, but it wasn't actually wrong, it was just like Hey, we're changing how this works, so I'm glad that that's over. When I Have functions that just have a bunch of positional arguments, right? It's like true, true, false, true. What the heck does that mean? Who the heck knows, right? You've got a nice way to fix that that's a modern Python only thing, right?
05:58 OKKEN: I'm chuckling because I'm just leaving Michael hanging on the transition between articles today but I'll try to work on that.
06:05 KENNEDY: It's alright, I got it.
06:06 OKKEN: This is actually pretty awesome, I don't know how I missed this. There's an article from Trey Hunner called Keyword Named Arguments in Python and How to Use Them, and basically, I think he's right. I was talking to him earlier and a lot of people looked at this article and went, Yeah, I know how to use keyword arguments, so keyword arguments are very useful. They're things, you always have to name your variables that are your argument names for a function, that and the caller of the function can use those, you can specify them and if you don't specify them you have to send them in order, but if you do specify them you can rearrange the order if you want you can also, one of the things that's often used, how they're often used is you've got a whole bunch of default values for arguments and a caller just needs to override like one or two, you can just name it that way. But one of the things in Python 3 is the different use of the asterisks in all of this, and there's a way to kind of separate your positional arguments from your named only arguments, and if you stick a star (*) in as one of the variables it forces everything to the right to be only named, so a caller has to provide those with names, and I didn't know, that's something I completely missed and that's pretty cool, I like it.
07:23 KENNEDY: I really like this feature and I implore people who are using **kwargs just, that's what you pass to the function and maybe a star thrown in their for additional pain and suffering, to look at this and think about how you might be able to restructure your code. A lot of things, you'll have a function and you want to take a variable amount of things and so the way they do it is they say **kwargs. You look at that signature and you're like, What do I do here, I have no idea. I sure hope the documentation tells me because the function says I take anything, you make up a name, I'll take it. But there's usually a handful of things it actually takes? Right, there's a bounded set of these are the five things you can pass, but not all required, put them in as keyword arguments. But if it's **kwargs, you're just saying to everybody good luck, go hunting, hopefully I documented this, right? Or you find something on , so you could say *, and then the argument name equals a default value. Argument name equals a default value. And then it has exactly the same behavior except you can see, both in your tooling editor as well as looking at the function what it takes. And you call it exactly the same way. I really love this feature.
08:40 OKKEN: I love it too. I like your explanation of other reasons to use it. I'm with you, the **kwargs, it just makes it so the user of that function has no idea what they're supposed to pass in.
08:53 KENNEDY: The worst offender of this is the AWS API, it's super frustrating, sometimes its dictionary is as dictionary, sometimes they're not. Don't get me started. And the documentation doesn't cover it, right? The documentation will say well, this sub-part is its name, but it's this type which is really a dictionary, but there's no mention or description of what goes in that sub-dictionary. It's just, it can be totally avoided by doing cool stuff like this, which has the same effect. Other stuff that's cool, by the way, is Datadog, so if you have an app that spans multiple processes, maybe is using different services to piece together a larger app, you should definitely check out Datadog, it's a monitoring solution for providing deep visibility and it tracks down issues across distributed Python apps. It's pretty awesome. Within just a few minutes you'll be able to investigate bottlenecks in your code by exploring graphs and rich stash boards, so if you want to check that out if you think having better insight into how your overall app is working, visualize your Python performance today, get started with Datadog, do a free trial, and they'll give you a free Datadog tee-shirt, it's very cute, and gray and white, and I like it. Check it out over at Pythonbytes.fm/datadog.
10:06 OKKEN: One of the things that we haven't talked about for a while is how to install packages.
10:12 KENNEDY: There's a really great new way with pip 10 to pip install them, but there's more to it than that, it's not just pip install the thing and the client side behavior is different, for the first time in a very long time, ever maybe? I don't know, it goes back farther back than I know the history of it. That means something different on the server side, so finally, finally, finally, PyPI.org officially launches,
10:39 OKKEN: Yay!
10:39 KENNEDY: More importantly legacy PyPI, the one at pypi.python.org/pypi. That one is now over at legacy.pypi.org and that's like an old stick-around version and they're actually shutting it down, so the pypi.python.org will redirect over to what's called Warehouse, that's the codename for the new implementation running at pypi.org.
11:05 OKKEN: That's nice, I like it.
11:07 KENNEDY: Yeah, it's super awesome, this is no longer based on sort of prototype code that grew before the web frameworks even existed. This is based on pyramid, it's based on a lot of the common programming APIs and styles that we know today, it even uses kubernetes and docker, and all sorts of amazing stuff. It uses Elasticsearch, maybe PostgreSQL I'm not entirely sure about that. If people are thinking about adding features to PyPI but they've thought oh, this is some pretty gnarly code I'm not going into there, well the new version is the official version, it's much easier to add features to.
11:42 OKKEN: You talked to some of the people involved with this, right?
11:45 KENNEDY: I did. I talked with three of the folks that worked on it, and we talked about a lot of the stories and when it launched and stuff, and that's coming out on Talk Python this week. So it's a race, does Python Bytes beat Talk Python episode 159, I'm not sure but one of them is going to be just about the same time. So if you're interested in this, go listen to it. You'll gear a lot of interesting stories. Like, Brian do you remember when it used to be there was PyPI.io? You remember that, in the early days? And later it became PyPI.org. I figured out the story behind that. I thought it was indecision, right? Oh we'll do .io; no we should make it more organizational, we'll do .org, right? It turned out what had happened was PyPI.org was owned by someone else. The PSF didn't own it, and they had to try to get ahold of it, so in the meantime they used PyPI.io, but their ultimate goal was always to have PyPI.org. It just took them a while to acquire the domain. If stories like that interest you we've got a whole hour of it on Talk Python. So people will check that out. So packages, in Python. Tell me about 'em.
12:52 OKKEN: We got a theme going on here.
12:53 KENNEDY: I think we do have a theme. We've got one more package after this by the way.
12:57 OKKEN: One of the things I've forgotten about was when you're kind of coming into, there's a lot of people that, as we know, come into Python from other languages. And they'll rush in and figure out how to do something quickly, and they'll think yeah, I'm a Python expert because I wrote a script. It isn't because people are trying to become experts or anything but one of the things that tripped me up right at first is getting my head around what is a module, what is a package, really how do users make those? And dealing with dunder inits and __init__ files, and where do they go, and that stuff. And so, there's a nice tutorial now, I think it's a real Python modules and packages introduction. And I just really liked this, it's a good rundown, it's pretty simple, and this is something that I'm bookmarking so a lot of the people that I work with that are new to Python and trying to figure this stuff out, I can point them to this. And it's a good introduction.
13:56 KENNEDY: Yeah, it's a great introduction, and it's definitely that kind of stuff, that when you're new it's a little bit weird to figure out the difference, right? Use packages and modules are kind of the same, but they're not the same, what's the story? And especially learning how to package up your project to make it accessible to other programs is tricky.
14:14 OKKEN: This isn't going into packaging itself, which is a bummer that we have those as two different aims. Really this is talking about a directory with a dunder init in it and sub-directories and stuff, we're talking packages and sub-packages. But it is the initial part. Once you have this down, you can jump into packaging and distributing code to share. But this is a good starter, so, it's a good learning place.
14:41 KENNEDY: Yeah, it's definitely a good learning place. Remember last week we spoke about the joint project that the Pyterm team and the Python Software Foundation did with their 2017 Python survey?
14:55 OKKEN: Yeah!
14:55 KENNEDY: Right. One of the interesting pieces of data there was the people who are data scientists do more Python 3, modern Python that is, as a percentage of their projects than say web developers or other types of Python developers. That's partly because they're starting from scratch, it's like a more new type of thing so why start with the old, right? Well the big news today is that Pandas, one of the major, most important foundational data science libraries, is going modern Python only at the end of the year.
15:30 OKKEN: That's a big thing, yeah. Definitely.
15:32 KENNEDY: Yeah I would say this is as big of news as Django 2 going Python 3 only on similar time scales. Well they're already out with their new stuff, so I guess just in terms of the old support ones. So that's pretty cool, right? One more major, major building block, that it used to be, Well, if I move to Python 3 I won't be able to get my libraries. Now the story is, If you don't move to Python 3, you don't get the best, newest libraries, so, pretty awesome.
15:58 OKKEN: It isn't just getting the newest it's also security updates and things like that.
16:02 KENNEDY: Yeah, part of their announcement said, December 31st 2017, Pandas will drop support for Python 2.7, presumably, stuff before that's not even supported. This includes no back ports, or security, or bug fixes period. They are open to letting somebody take that, to become their job. They're like, we're not doing that. So, we're moving on.
16:22 OKKEN: Is Pandas a volunteer thing? Is it open-source, just run by open-source people, do you know?
16:27 KENNEDY: I think so. I think it might be part of this...
16:32 OKKEN: Part of the support thing is, you can kind of expect some companies to have back security and bug fixes on old releases for a while, but open-source projects, these are just people volunteering their time. I think it's completely reasonable for people to say, hey, we're done trying to support 2.7, let's move on.
16:51 KENNEDY: Yeah, it makes sense. So, I looked while you were talking. Pandas is a numFOCUS sponsored project, so I think it's sort of donation supported, but through a more formal science organization. But, still, most people are contributing features that can either focus on bug fixes in making Python 2 stuff work or just moving forward and getting the latest, greatest stuff coming. That's pretty cool.
17:16 OKKEN: Yeah, it's a good focus.
17:17 KENNEDY: Brian, that's it for this week for our items, got anything in particular you want to talk about?
17:23 OKKEN: I've got a couple episodes coming out on Talk Python, and I'm getting ready for PyCon, I'm looking forward to that.
17:31 KENNEDY: Yeah, you have a talk going there, right? That's going to be pretty awesome.
17:34 OKKEN: Yeah, I'm getting a little nervous for that, but it'll go really--
17:37 KENNEDY: Don't worry, it's not like it's recorded, and then also there's a bunch of people there. Oh yeah it's going to be forever. No stress, just kidding.
17:43 OKKEN: Yeah, no stress. Thanks for that.
17:47 KENNEDY: No, I'm just teasing you because I know you'll do great. We also have a booth there, you and I and a few others, so people can come visit us at the booth like last year.
17:54 OKKEN: I'd like people to try to reach out if they want to just come on, and there's something they want to record, I think I'm going to take some recording equipment so we can do some short recordings there.
18:04 KENNEDY: Oh, that'd be awesome. Yeah, I'm definitely bringing my mic, I'm going to do some stuff like that as well.
18:09 OKKEN: What's up with you?
18:10 KENNEDY: Well, you remember the last time Matt Harrison was on the show and we were joking about trying to get some course or something or other done?
18:15 OKKEN: Yeah!
18:16 KENNEDY: Well, it's taken me a ton of work the past seven days, but nonetheless we now have a brand new course to announce, Python 3, an illustrated tour.
18:24 OKKEN: Nice!
18:25 KENNEDY: So the idea is, it basically only covers the features that were new to Python as of Python 3. So says here are a couple of peps that talk about type annotations and here's some graphics and other sort of walk-through of what that means for you. So if you feel like you are not using all of the features of Python 3, or you're coming especially from Python 2, and you're like, I need a quick show-me-the-new-stuff sort of, in a practical way then check out Python 3, an illustrated tour. That's done by Matt Harrison and it came out really well. So check it out at talkpython.fm/illustrated.
19:00 OKKEN: Okay, I'll check that out.
19:01 KENNEDY: Yeah, yeah. I learned a lot, it's really fun. We keep learning things. For example, you talked about the star at the beginning of function arguments, that kind of stuff is in there. I didn't know about that till too recently either.
19:12 OKKEN: Okay, cool.
19:12 KENNEDY: Alright.
19:13 OKKEN: It's been a good talk today, thanks.
19:14 KENNEDY: Yeah, you bet. Very nice one, we really dug into the packages this time, didn't we?
19:18 OKKEN: And we don't plan this, it just happens.
19:20 KENNEDY: It just happens, that's right. Alright, well, thanks Brian, and talk to you all later.
19:24 OKKEN: Bye.
19:26 KENNEDY: Thank you for listening to Python Bytes. Follow the show on twitter via @PythonBytes, Python Bytes as in B-Y-T-E-S. Get the full show notes at Pythonbytes.fm. If you have a news item you want featured, just visit Pythonbytes.fm and send it our way. We're always on the lookout for sharing something cool. On behalf of myself and Brian Okken, this is Michael Kennedy, thank you for listening and sharing this podcast with your friends and colleagues.