Transcript #15: Digging into Python packaging
Return to episode page view on github00:00 This is Python Bytes, Python headlines and news delivered directly to your earbuds.
00:05 It's episode 15, recorded Monday, February 27th, 2017.
00:10 This is Michael Kennedy. I'm here with my co-host, Brian Okken.
00:14 Hey, Brian. Happy Monday. Happy Python News Day.
00:18 Yes.
00:20 And today we're going to have a couple of really interesting items and we're going to start off with a theme.
00:25 And the theme is, people should probably understand Python packaging and distribution better, right?
00:30 Yes, definitely.
00:31 I come up with this.
00:33 Actually, it jumps up on you as a surprise, I think, as soon as you try to share some Python code with somebody, even if you're just sharing it within your own work group of dealing with packages.
00:44 So there's a blog post out on, I think it was a medium by Ji Feng, a simple guide to Python packaging.
00:53 And it's actually a really quick read and it's a really nice introduction to talk about where to put the init file and you know, just basically what is a package and how to deal with it even to the extent of pushing something up to PyPI.
01:07 Yeah, it's a good article.
01:08 Yeah, I thought it was really nice and it's like five minutes or less to read.
01:13 All right, so definitely jump in.
01:14 One of the things that I was reminded of while I was reading it, I knew it, but just, you wasn't on the top of my mind is the difference between modules and packages.
01:24 And you know many people know the differences but if you haven't worked with Python packaging maybe you don't. And so people I feel like people often use module and package interchangeably like these are synonyms and we just swap them right you import module import package whatever. There's a difference right like a module is a single Python file where technically what defines a packages, it's a directory with a dunder in it.py and one or more Python files in it.
01:52 Right.
01:53 Or I guess it could be empty.
01:54 It just does nothing.
01:55 Right.
01:55 But it's more Python treats those differently.
01:58 If it finds that dunder in it.py.
02:00 Yeah.
02:00 It's very important with sharing code, but the, one of the nice things at the bottom of the, the article, there was a link to another article, a actually an entire set of instructions called how to package your Python code by Scott Torborg.
02:15 He's a Portland person.
02:16 - Oh, right on.
02:18 - Yeah, go Portland.
02:19 It's a longer introduction, but it's still like, I read it in like a half an hour, the entire site.
02:25 It's a fairly detailed look at packaging.
02:28 I think actually that this documentation should be incorporated into the PyPA documentation because it's really good.
02:35 - You think it's that good, that's awesome.
02:36 - Yeah.
02:37 - All right, way to go, Scott.
02:38 So the next item that I wanted to talk about is also related to this.
02:42 So I feel like this is the next layer up.
02:45 All right, so once you get into packaging and start packaging your code, one of the reasons you might do that is if you're building a package for the world, that's cool, right?
02:53 Submit that to PyPI and everybody can play with it.
02:55 But another reason is you just want to reuse that code.
02:58 And maybe this code is proprietary code and you don't want to reuse it with the world.
03:01 You want to reuse it maybe only within your company, right?
03:04 So one of the cool things you can do is set up your own PyPI server.
03:09 And then that's private behind your firewall.
03:12 authentication or whatever and you can put your code there and then across your different apps, across your different teams, across continuous integration as long as it has access to this, you can basically put additional private stuff including your proprietary libraries in there which I think that's really cool.
03:28 So the thing I brought up was an interesting take on this idea called Elastic PyPI by Kyle Hornberg. So the idea is we're going to take this and and we're gonna use Elasticsearch and AWS Lambda for a serverless deployment of your own private PyPI.
03:46 - Okay.
03:47 - So the AWS Lambda stuff is like, here I can give you a Python function and I'll give it to AWS and it lives at the end of a URL and it just basically, you make a request, it runs it.
03:58 Right, you don't have to set up a server or it's even like a little bit less than Platform as a Service.
04:03 It's like really, really lightweight.
04:05 So it's a combination of this and Elasticsearch, which I thought was kind of cool.
04:08 - Yeah, actually that's pretty neat.
04:10 I'm still looking for the right solution for internal to a company inside of a firewall.
04:16 I haven't quite stumbled upon the right thing yet.
04:19 - Yeah, I know there's, so I actually listed three other ones that I know about that are in this realm.
04:23 Like this runs on Elasticsearch and AWS Lambda, which isn't behind your firewall, but is potentially private.
04:29 There's also PyPI server, there's DevPi, there's, I have this, I guess I have this one here twice, just from two locations, but there's PyPI server and DevPi.
04:38 And those are interesting, you can basically pip install those and set them up and run them behind the scenes.
04:44 But I still feel like you're onto something that there's something more to do here.
04:48 - Oh, there might be something less and I'm just missing it.
04:50 I mean, do I even need a, can I just throw a directory somewhere or something?
04:55 - Yeah, I know, I feel like, you know, there's like GitHub Enterprise and things like that.
04:59 Like here's a really cool box that is GitHub for your enterprise.
05:04 Maybe there's something like that for PyPI.
05:05 Maybe these are good enough.
05:07 I don't know, but it seems like that's an interesting area, right?
05:10 Definitely.
05:11 So I got an email, we got an email from a listener who said, look, I'm really enjoying your podcast.
05:17 All the news that you find is great, but I'm kind of living in an RSS world.
05:23 And I feel like RSS is great, but it's not really where all the news comes from.
05:28 Like, where do you guys get your news from?
05:29 So Brian, where do we get our news from?
05:31 Well, where we get it from, where you guys get it from, I think should be from pythonbytes.fm, this podcast.
05:40 Honestly, I did took a look.
05:43 I pay attention to Twitter.
05:44 I pay attention to newsletters and Planet Python.
05:48 And so let's kind of go through these.
05:50 We put up the-- in our show notes, we'll have the links to both mine and Michael's Twitter accounts.
05:56 We don't really put out a lot of news on the Python Bytes, but maybe we should.
06:00 I don't know.
06:00 Yeah, we probably should.
06:01 I feel like we should probably tweet out at least our six items every week, right?
06:05 - Yeah, definitely.
06:07 Yeah, that's a good idea. - Yeah, we should get on that.
06:08 - Yeah, let's do that.
06:09 But there's a feed aggregator called planetpython.org, if you don't know about it already, but it has a few different websites and it does sort of everything new that day or I think it's updated several times a day there.
06:23 And you can, so like my RSS reader just points to that.
06:27 I just pay attention to that.
06:29 - Yeah, there's a lot of great stuff there.
06:30 - We didn't really order them in any particular order, but there's a few Python related newsletters.
06:36 There's Awesome Python, there's Python Weekly, PyCorders Weekly, and Import Python.
06:42 There's four.
06:43 And I don't really, I don't know if I have a preference.
06:45 I just like scanning through all of them.
06:47 - Those guys all have great stuff.
06:49 And every one of them has at least one unique thing each week, so they're all worth subscribing to.
06:55 - Yeah, and I also like the second opinion because if one of the articles, if there's an article or a new project or something that shows up on all four of them or three out of four, I probably should go read it.
07:06 So pay attention to that. - Yep, absolutely.
07:08 - And then Reddit is, Reddit's used quite a lot by Python people and there's a Python subreddit that's pretty good.
07:15 - Yep, and I threw in trending on GitHub.
07:18 So if you go to github.com/trending?l, I think it's l equals Python, it'll, yeah, for language, it'll actually show you all the trending repositories that are primarily Python.
07:29 So that's pretty trick for things that people haven't discovered yet.
07:32 - Yeah, I haven't used that, that's neat.
07:34 (Luke laughs)
07:35 - Yeah?
07:36 - I'm not really hanging out on Reddit a lot.
07:37 So one of the things that I think is good to know about Reddit is that there's a couple of filters at the top.
07:43 Like when you're looking at the interface, you can, it seems like a constant stream of information, but there's ways to say, what's the interesting one, the top ones in the last week or the last month or something that's helpful to filter some of this stuff.
07:56 - Yep, absolutely.
07:58 Well, let's move off of news and on to news.
08:01 (laughing)
08:02 - That's very meta.
08:03 Yes, so we've talked a lot about asynchronous programming, the concurrency story in Python, and as we've said, this is for Python 3.4 and 3.5, these are some of the really big changes and advantages to the newer versions of Python.
08:17 So what I wanted to highlight was something done by David Beasley.
08:20 He's done a bunch of great public speaking and teaching and stuff, and he's created this project called Curio, the coroutine concurrency library for Python.
08:30 So this is a small little library built for performing concurrent I/O and common system things such as launching subprocesses and farming out work on threads and process pools.
08:42 So it's built from the ground up to basically be taking full advantage of the programming paradigms, things like async and await in Python 3.5, and it doesn't have a lot of the the baggage that comes from say like using tornado or twisted or something like that.
08:57 Pretty cool.
08:58 Okay.
08:59 I haven't looked at this but I did want to see some good example of a production library that uses async and await to try to understand that better.
09:08 Yeah, it's got a really cool example of a writing basically a socket based server that can communicate with people and handle thousands and thousands of concurrent requests using a single thread.
09:19 So the mental model for how it works, I think, should probably be a little bit like Node.js, right?
09:24 You sort of subscribe to some work and then you get a callback when that work is done, like so when you get a message from a client or something.
09:32 But because it uses async and await, it looks basically like standard procedural non-concurrent code, which is the way that code should look, right?
09:41 It's great.
09:42 Okay, cool.
09:43 Yeah.
09:44 So check that out if you are wanting to look at a take on async and await.
09:48 My last item is actually a I got this through Reddit just the other day I was looking through the hot things from the last week on Reddit in Python.
09:57 And this item was a pandas is switching to use pytest as its test framework.
10:04 And I thought that was really cool because by test is kind of something I like should write a book on it.
10:10 Yeah, I should.
10:11 Yeah.
10:12 Yeah, that's really cool.
10:13 I think it's an interesting look at the challenges of moving like a very large, highly visible, highly used project from one testing framework to the other.
10:22 And there was comments like in this GitHub thread, things like, well, pretty much everything moved over okay, except for these 69 tests.
10:30 They really are hard.
10:31 And we need somebody to like champion taking on these 69 tests that just won't move.
10:36 - That's just 69.
10:39 That seems like a lot of work, but.
10:41 - Yeah, they said many of the ones are running the challenges because of external resources, like databases or other funky things like that.
10:49 - Huh, okay.
10:50 Yeah, actually, external resources is the reason I chose pytest in the first place because I depend heavily on resources, and it handles fixtures the best.
10:59 Anyway. - Yeah, awesome.
11:01 Yeah, so very cool example.
11:03 All right, so my last one is, I kind of cheated and picked something that I did, but I think this one is noteworthy, giving the amount of attention the community gave it.
11:13 So last week I had the honor of having Guido van Rossum on Talk Python.
11:19 And we spent a little over an hour talking about how he got into programming, what were the influences on the Python language.
11:27 There's a language called ABC that he was actually working on before Python.
11:31 Why did ABC fail and go away?
11:33 Why did Python succeed?
11:35 How are we doing on the move to Python 3?
11:37 Whatever his favorite features.
11:39 and even what he's up to at Dropbox these days.
11:41 So really, really interesting conversation looking backwards and forwards with the Creator Python.
11:48 - I really enjoyed that conversation actually.
11:50 It's one of my favorite episodes now.
11:52 - Yeah, thanks.
11:52 - I really liked the discussion of the two versus three and both the reasons why they did it and also the difficulty and kind of the, a lot of people dragging their feet and not wanting to switch and there were real world reasons that just sort of hit them.
12:10 They didn't think about those.
12:11 And it was a pretty interesting take.
12:13 I liked it.
12:14 - Yeah, I think people will learn a lot listening to Guido.
12:17 So I wanted to highlight that because I think Guido doesn't do a lot of those kind of open ask me anything type things.
12:25 He does a lot of public speaking and stuff, but I thought this was a great look inside and he was really a great guest and it was a lot of fun to talk with him.
12:32 So check that out.
12:33 - Okay, great.
12:34 - Well, that's our six picks for today.
12:36 - Yeah, so that's our six.
12:38 Yeah, yeah, so that's the news for Monday, February 27th.
12:42 What else is going on?
12:44 You had a great episode with Mahmood, a friend of the show, Mahmood Hashemi.
12:47 - That was actually like a month ago, and I finally got it all edited and out.
12:52 I put this out last Sunday.
12:54 And so it's episode 27, and it's a really good, I really enjoyed the discussion with him.
13:00 We started off with the excuse of talking about the different levels of testing like unit versus system, stuff like that.
13:07 But we just talk about all sorts of stuff, testing related and Python related.
13:12 And the other excitement is the, I released it on a new website.
13:17 So I have testencode.com now running everything.
13:21 And it'll be on both websites for a while until I get everything switched over.
13:25 - Yeah, that's great.
13:26 I'm looking forward to listening to that episode with Mahmood 'cause he has great things to say.
13:31 I had him on to talk about Enterprise Python.
13:34 He works at PayPal and does a bunch of cool stuff with Python there.
13:37 So I'm sure that's really interesting, that's great.
13:40 - Well, he used to work at PayPal.
13:42 - Oh, interesting.
13:43 Okay, well, see--
13:44 - So you'll have to listen to episode 27 to, yeah.
13:47 - Now I must listen to it.
13:49 Oh my gosh, I'm out of date.
13:51 Awesome, okay, well that sounds very, very exciting.
13:53 And testing code, you finally unified the domain and the rename that you did a few months ago, six months ago, whatever it was.
14:00 - Yeah, well, there's, so the, yeah, test and code domain and the website now, or the podcast and site have the same name.
14:07 So that's good. - Yeah, excellent.
14:09 - And the book, the pytest book is out to reviewers.
14:13 I've got three chapters left to write, but we're chugging ahead like a large freight train, so.
14:19 - Yeah, it looks like you're, are you ahead?
14:21 I know you were thinking like, geez, can I get this done by PyCon, which was end of May, but it feels like you're making good progress.
14:27 - Yeah, we still, to actually get an in-print book would be a miracle at this point, but we are hoping to get a beta, ebook beta out by PyCon, so.
14:37 - Awesome, okay, well congratulations.
14:40 And I've seen some of the covers and they're both very cool, so I'm looking forward to seeing which one you go with.
14:44 - Yeah, so how about you, what's up with you?
14:46 - You know, not a whole lot is going on with me right now.
14:49 I released my consuming HTTP services course and I'm just working on like two or three more courses.
14:55 I'm actually writing a Python article for Code Magazine, So I'm supposed to submit that this week.
15:01 It's about celebrating the 3,000th day, birthday, - Nice. - instead of birthday, of Python 3, 'cause right, just a few days ago, Python 3 was 3,000 days old, so that's pretty cool.
15:12 - So Code Magazine, is that a print thing, or?
15:14 - Yeah, it's a print thing.
15:15 - Cool, I didn't know there were print, code magazines anymore.
15:18 - Some people still read them, I guess.
15:20 We'll see. - Yeah.
15:22 Okay, well, awesome.
15:23 It was fun talking with you today.
15:24 - Yeah, thanks for sharing all the news.
15:26 Now people can go out and package their Python code up and hear Guido talk about it.
15:30 Yeah.
15:31 Great.
15:32 All right.
15:33 Well, good to see you.
15:34 Thanks everyone for listening in and we will catch you next week.
15:36 All right.
15:37 Thanks a lot.
15:38 Bye.
15:39 Thank you for listening to Python Bytes.
15:41 Follow the show on Twitter via @pythonbytes.
15:44 That's Python Bytes as in B-Y-T-E-S.
15:47 And get the full show notes at pythonbytes.fm.
15:50 If you have a news item you want featured, just visit pythonbytes.fm and send it our way.
15:55 on the lookout for sharing something cool.
15:57 On behalf of myself and Brian Okken, this is Michael Kennedy.
16:01 Thank you for listening and sharing this podcast with your friends and colleagues.