#22: PYTHONPATH considered harmful

Published Tue, Apr 18, 2017, recorded Tue, Apr 18, 2017

Sponsored by ADVANCE DIGITAL. Find your rewarding Python job at http://python.advance.net/

#1 Brian: PYTHONPATH considered harmful

Don’t do it.
You might not regret it today. But later you will.
Mucks up distribution searches, etc.
- “For one, most directories are poorly suited to be on the Python search path. Consider, for example, the root directory of a typical Python project: it contains setup.py -- and so, if it were added to the current search path, import setup would become possible. (This is one reason to have src/ directories.) Often, directories added unwisely to the Python search path cause files to be imported from paths they do not expect to, and surprisingly conflict.”
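
The setup.py concern in the quote can be reproduced with the standard library alone. The sketch below is an illustration and is not taken from the article: it builds a throwaway directory holding a setup.py, puts that directory on the search path the same way a PYTHONPATH entry would be, and asks the import system where `import setup` would resolve. The helper name `resolves_to` is made up for this example.

```python
import importlib
import importlib.util
import sys
import tempfile
from pathlib import Path


def resolves_to(module_name, extra_dir):
    """Return the file that `import module_name` would load with extra_dir on the search path."""
    sys.path.insert(0, str(extra_dir))  # roughly what a PYTHONPATH entry does
    try:
        importlib.invalidate_caches()
        spec = importlib.util.find_spec(module_name)
        return None if spec is None else spec.origin
    finally:
        sys.path.remove(str(extra_dir))


with tempfile.TemporaryDirectory() as project_root:
    # Stand-in for a real project root: all it contains is a setup.py.
    (Path(project_root) / "setup.py").write_text("print('setup.py was imported')\n")
    origin = resolves_to("setup", project_root)

# The path printed here points at the project's setup.py.
print(origin)
```

Nothing is actually imported here; `find_spec` only reports which file would win, which is enough to show the shadowing.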

#2 Michael: keon/algorithms

Minimal examples of data structures and algorithms in Python
Topics include
- Array
  - circular_counter
  - flatten
  - garage
  - merge_intervals
- graphs
  - clone_graph
  - find_path
  - traversal
- trees
- etc.
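
To give a feel for the kind of minimal example the repo collects, here is a sketch of two of the listed topics, flatten and merge_intervals. These are independent illustrations written for these notes, not the code from keon/algorithms, so names and signatures there may differ.

```python
def merge_intervals(intervals):
    """Merge overlapping [start, end] intervals and return them sorted by start."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous interval: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def flatten(nested):
    """Yield the leaves of arbitrarily nested lists or tuples, left to right."""
    for item in nested:
        if isinstance(item, (list, tuple)):
            yield from flatten(item)
        else:
            yield item


print(merge_intervals([[1, 3], [2, 6], [8, 10], [15, 18]]))
print(list(flatten([1, [2, [3, 4]], 5])))
```

Sorting first is what keeps merge_intervals to a single pass: once intervals are ordered by start, each one can only overlap the most recently merged interval.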

#3 Brian: Glyph on attrs

We talked about attrs in episode 11, and pointed to the project and the docs.
I came across a good 2016 article by Glyph explaining why you should use attrs.
The one Python library everyone needs: https://glyph.twistedmatrix.com/2016/08/attrs.html
Discusses
- problems with using lists and tuples as data structures.
- creating your own classes properly.
- possible problems with namedtuple (-ish. I still love namedtuple).
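
A small standard-library sketch of the trade-offs in that list. It does not import attrs, so it runs anywhere; the class and field names are made up for illustration. The first half shows namedtuple behavior that can surprise people, the second half shows the boilerplate a "proper" hand-written class needs, which is the kind of code attrs generates for you.

```python
from collections import namedtuple

Point = namedtuple("Point", ["x", "y"])
Size = namedtuple("Size", ["width", "height"])

# A namedtuple is still a tuple, so it compares equal to any tuple
# with the same values, including an unrelated namedtuple type.
print(Point(1, 2) == (1, 2))       # True
print(Point(1, 2) == Size(1, 2))   # True


class PlainPoint:
    """Hand-written equivalent: every method below is boilerplate."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        return f"PlainPoint(x={self.x!r}, y={self.y!r})"

    def __eq__(self, other):
        if other.__class__ is not self.__class__:
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)


print(PlainPoint(1, 2) == PlainPoint(1, 2))  # True
print(PlainPoint(1, 2) == (1, 2))            # False
```

Whether the tuple behavior is a problem depends on the use case; for small throwaway records it is often exactly what you want.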

Sponsored by ADVANCE DIGITAL

Are a small team of developers working in an agile/devops environment, so your work makes an impact quickly
Are mostly a Python shop, but there is an opportunity to introduce and run other technologies at scale
Fund employee development and conference attendance
Are located in beautiful Jersey City, one stop from Manhattan on the PATH
Are one of the 10 largest news sites by traffic in the US
Apply at http://python.advance.net/

#4 Michael: Curio for Python 3.5+ concurrency

Curio is a library for performing concurrent I/O and common system programming tasks such as launching subprocesses and farming work out to thread and process pools.
Curio is solely concerned with the execution of coroutines. A coroutine is a function defined using async def.
It uses Python coroutines and the explicit async/await syntax introduced in Python 3.5.
Its programming model is based on cooperative multitasking and existing programming abstractions such as threads, sockets, files, subprocesses, locks, and queues.
All sorts of cool constructs: AsyncThreads, UniversalQueues, async file I/O, Tasks, and more.
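
To make "execution of coroutines" and "cooperative multitasking" concrete, here is a toy round-robin scheduler built only on the standard library. It is a sketch of the general idea, not Curio's implementation or API: the names `switch`, `countdown`, and `run_all` are invented for this example, and a real kernel also handles I/O, timeouts, and cancellation.

```python
import types
from collections import deque


@types.coroutine
def switch():
    """Suspend the current coroutine and hand control back to the scheduler."""
    yield


async def countdown(name, n, log):
    """A coroutine (defined with async def) that records each step it takes."""
    while n > 0:
        log.append((name, n))
        await switch()  # cooperative: the task decides when to give up control
        n -= 1


def run_all(coros):
    """Run coroutines round-robin until every one of them has finished."""
    ready = deque(coros)
    while ready:
        coro = ready.popleft()
        try:
            coro.send(None)  # resume the coroutine until its next await on switch()
        except StopIteration:
            continue  # this coroutine is done; do not reschedule it
        ready.append(coro)


log = []
run_all([countdown("a", 2, log), countdown("b", 2, log)])
print(log)  # [('a', 2), ('b', 2), ('a', 1), ('b', 1)]
```

The interleaved log shows the two tasks taking turns at each `await`, with no threads involved.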

#5 Brian: Python Package src-ery

"Use the src, Luke"
"To src, or not to src, that is the question"

Answering a listener question about Python packaging. In episode 15: Digging into Python Packaging, we mentioned two articles about getting started with packaging. In the comments, Kristof Claes noted that these references were in conflict with a couple of other references:

pytest “Good Integration Practices”, https://docs.pytest.org/en/latest/goodpractices.html
ionel’s “Packaging a Python library”, https://blog.ionelmc.ro/2014/05/25/python-packaging/

Both of these strongly encourage the use of a “src” directory when setting up a package for distribution. There seem to be good reasons to use “src”. Many of the reasons are around the idea that during testing, you should be testing an installed version of the code. I have no reason to disagree with Ionel’s arguments and the pytest documentation recommendation.

However:

The pypa doesn’t bring this up when discussing distribution:
- https://packaging.python.org/distributing/
The pypa sample project doesn’t use “src”:
- https://github.com/pypa/sampleproject
Many popular packages don’t:
- requests: https://github.com/kennethreitz/requests
- pytest itself: https://github.com/pytest-dev/pytest

Why not?

The pytest recommendation is subtle. It recommends using “src” if you need to include a dunder init file in the tests directory. Without “src”, the local code will be tested instead of the installed code. Testing the installed code is preferred, in part because it tests the installation and exercises the library from the perspective of a user.
pytest also recommends against having a top level dunder init in the tests directory. And this is a stronger recommendation.
But Ionel’s points are not just around the use of pytest.
So this is still really an open question to the Python community.
- If it’s great to use “src” instead of top level packages, why aren’t more projects doing this?
- Why doesn’t the PyPA mention it?
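
One way to see what the “src” layout changes is to check what is importable from the project root. The sketch below is only an illustration under stated assumptions: it builds a throwaway project with a hypothetical package name, `demo_src_pkg`, and uses an invented helper, `importable_from`, to ask the import system whether the package can be found from the root versus from src/.

```python
import importlib
import importlib.util
import sys
import tempfile
from pathlib import Path


def importable_from(directory, module_name):
    """Report whether module_name can be found once `directory` is added to sys.path."""
    sys.path.insert(0, str(directory))
    try:
        importlib.invalidate_caches()
        return importlib.util.find_spec(module_name) is not None
    finally:
        sys.path.remove(str(directory))


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Hypothetical layout: <root>/setup.py and <root>/src/demo_src_pkg/__init__.py
    package_dir = root / "src" / "demo_src_pkg"
    package_dir.mkdir(parents=True)
    (package_dir / "__init__.py").write_text("")
    (root / "setup.py").write_text("")

    from_root = importable_from(root, "demo_src_pkg")
    from_src = importable_from(root / "src", "demo_src_pkg")

print(from_root, from_src)  # False True
```

Because the package is not importable from the root, tests run from there cannot pick up the local copy by accident; they need an installed version (for example an editable install), which is the behavior the “src” advocates are after.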

#6 Michael: Intel Pulls Funding from OpenStack Effort It Founded With Rackspace

Intel and Rackspace were collaborating on a project called OpenStack Innovation Center
Launched in July 2015.
A source close to the effort said initial funding was supposed to last through 2018, but Intel pulled it early.
A Rackspace spokeswoman said “OSIC’s objective was to create the world’s largest OpenStack developer cloud and develop enterprise capabilities within OpenStack. It quickly accomplished the first goal, and has made great progress toward the second.”
Some 30 Rackspace employees who had been working at the innovation center have been given two weeks to find new jobs at the San Antonio-based company.
The story here is that we all need to think about how projects are funded and about diversifying that funding.

Our news:

Michael: Hurry up and register for EuroPython: https://ep2017.europython.eu/en/ Earlybird sold out.

Episode Transcript

00:00 Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds.

00:06 It's April 18th, 2017. I'm Michael Kennedy.

00:10 And I'm Brian Okken.

00:11 And we have a bunch of cool stuff to share with you. But before we do, Brian, we have to say thank you to Advance Digital.

00:16 Oh, yeah. Thank you, Advance Digital.

00:18 They're a brand new sponsor of the show, and they have some really cool Python job opportunities. We'll talk about that later.

00:24 Let's talk about PYTHONPATH now.

00:26 I liked this. Somebody came out with a... I should write names down. Let me look it up just a second.

00:32 Okay. Moshe Zadka came out with an article called PYTHONPATH Considered Harmful.

00:37 And this is a good article. It's very short. And I guess I had forgotten about PYTHONPATH because fairly early in my Python career, I realized it's not useful. It's only bad. Ignore that it's there.

00:52 So what PYTHONPATH is, it's an environment variable that you can put directories in, so that you add to the places where Python will look for modules or packages to import.

01:06 Every time I've ever thought maybe I had a good use for it, I've regretted it later.

01:12 So, and one of the classic examples is that if you had a package, you would, I guess, maybe want to add the top directory for a local package to the path.

01:25 But there's a whole bunch of stuff in the top directory that's not appropriate to be imported, like the setup.py and so forth.

01:32 But anyway, just, I guess, a public service announcement. Don't use PYTHONPATH. Is there a good use for it?

01:39 Yeah, I don't know. I mean, we have virtual environments. Those work pretty well.

01:43 You can run python setup.py develop if you want to kind of inject something into the path.

01:50 Like there's a lot of options that don't involve that necessarily.

01:52 Yeah. And also, if you're like, one of the reasons might be if you're developing a local package and you want to be able to import from it, like from your test directory.

02:03 But still, the best recommendation is to do a pip install -e, to install it as editable. That's the best way to do that.

02:13 Yep, exactly. Exactly. So the first one I want to talk about is actually algorithms.

02:19 There was a really interesting article about how I think we talked about this one.

02:25 And I'm not sure if I'm blending in my mind with another article I read, but basically that we can use languages like C and stuff to try to make our code faster by trying to get closer to the metal.

02:36 But a lot of times a way to make your code faster is actually to have better understanding and visibility of the algorithms.

02:43 Right. So you could try to make the bubble sort really, really fast in C, or you could just use quicksort in pretty much any language.

02:51 Right. And it's way, way faster. So I ran across this GitHub repo that is just a bunch of minimal examples of data structures and algorithms in Python.

03:00 And it's at github.com/keon/algorithms. That's keon, K-E-O-N.

03:05 And there's just tons of really cool examples. So there's there are different topics like arrays and graphs and trees.

03:12 So like in arrays, there might be like the ability to flatten an array or to merge intervals on arrays or in graphs, you can clone it or traverse it or find paths through it.

03:23 All those types of algorithms that are like, wow, that actually solves exactly what I need.

03:27 And here's a little example. I thought it was really cool.

03:30 And I think it helps professional programmers.

03:32 But if you're looking for a job, there's a lot of a lot of sort of interview type questions around algorithms and data structures.

03:39 So it might also be worth studying there.

03:40 Oh, that'd be a good place. Yeah.

03:42 Yeah. So I don't I'm not sure how many there are, maybe 50 data structures and algorithms, but pretty cool.

03:47 That's neat. Cool.

03:48 Yeah. So if you want to brush up on your algorithms, then there's a place to do it.

03:52 Well, speaking of data structures, in episode 11, we talked about a package called attrs, A-T-T-R-S.

03:59 attrs has a lot of fans. People love that library.

04:01 Yeah. It's a cool package to be able to make classes easily.

04:06 And just the attributes of classes are easy to define.

04:10 Right. Kind of make them complete as well. Right.

04:12 Give them the equivalency tests and all the various things, not just here are the fields. Right.

04:17 Right. And I ran across this. It's an older article from last year.

04:21 I guess it's not that old. It's from Glyph.

04:24 And it's just a really good article. It's titled The One Python Library Everyone Needs.

04:31 And he talks about the annoying bit of having to create your own classes and make sure that the copies and less-than all work, and the sorting, and just letting attrs do that work for you.

04:44 He also discusses briefly some problems with namedtuple.

04:48 But I'm a huge fan of namedtuple still.

04:51 And I don't know if I buy his slamming on it too much, but it's been all good for me, I think.

04:57 But now this is really cool.

04:59 And I need to be trying out attrs.

05:02 I haven't done it yet, but it seems like, you know, it really does add a lot.

05:05 I should just make it a habit.

05:07 Anyway, it's a good article.

05:08 You know where I would like to try out some new stuff?

05:10 Where?

05:10 At a new job.

05:11 Building cool stuff with Python.

05:12 I would love that.

05:13 Yeah.

05:14 So our sponsor, Advance Digital, they sponsored this episode of the podcast because they're looking to work with you.

05:20 Everyone out there who knows Python and wants to build cool stuff with Python.

05:24 So Advance Digital, I'd never heard of them, but they actually run one of the 10 largest news sites in the U.S. by traffic.

05:32 So they're really high scale.

05:33 So they're located in Jersey City, just across the river from Manhattan.

05:41 So, you know, beautiful view, see Manhattan at night and take the PATH over there.

05:46 They fund employee development and conferences.

05:48 So you go to PyCon, things like that.

05:50 And they do mostly Python, but they also run other things.

05:53 So if you want to work in an environment like that and do Python for your job, check out python.advance.net and those guys will hook you up.

06:02 That's great.

06:02 Yeah.

06:03 Sounds fun, right?

06:03 Cool.

06:04 So one of the things that makes high traffic websites run well is concurrency, right?

06:09 And we kind of beat this drum often, but that's because it's an awesome drum.

06:15 And last week I had David Beazley on Talk Python To Me to talk about a project he created called Curio.

06:22 Have you heard of Curio?

06:23 I have.

06:24 I haven't played with it though.

06:25 So Curio is, it's an interesting project in that it's kind of like halfway between a framework that you can just grab and use

06:31 and half like really low level building blocks.

06:35 But the idea is we have asyncio in Python since 3.4 and it's got this event loop and allows a sort of asynchronous programming through callbacks.

06:46 But David said, well, what if we actually started from scratch and we had async and await available?

06:53 We had these asynchronous coroutines as the primary concept in an async library for Python.

07:00 What would that look like?

07:01 Well, that's Curio.

07:02 So it's a library for performing concurrent IO operations and system programming tasks like launching subprocesses or threads or whatever.

07:11 And it's solely concerned with the execution of async coroutines.

07:15 It's cool, right?

07:16 Yeah, very.

07:17 And it's got all these really neat data structures.

07:20 So it has this thing called a universal queue.

07:22 And queues are one of the primary ways to communicate between threads without locking.

07:27 So you don't have, you're not sharing the data.

07:29 You make a copy, put it on the queue.

07:30 So you're not worried about race conditions and things like that.

07:33 But one of the problems is the model that works for, say, queues between threads is not the same as the one that works between asyncio tasks.

07:43 And that's also not the same as the one that works in Curio.

07:45 So he added this thing called a universal queue that spans all three of those worlds and lets them intercommunicate with each other.

07:52 He has async threads for sort of managing computational execution and threads as if they were asynchronous coroutines and all sorts of stuff like that.

08:03 And you really learn a lot about async if you dig into this thing.

08:05 Okay.

08:06 And you said it is a sort of halfway between something low level and high level.

08:12 Yeah.

08:12 Yeah, exactly.

08:13 So if you want to make, say, like a TCP/IP game server with TCP or UDP or something, it's actually got constructs to, say, launch a TCP server that's async and plug in the callbacks and things like that.

08:24 Oh, wow.

08:25 Yeah, but there's no HTTP layer, right?

08:28 So it's not like a web framework.

08:29 It's not like Sanic or Japronto or one of those things.

08:32 But it's not super, super low level.

08:35 It has all these building blocks.

08:36 So it's like if I was going to build a framework thing of some sort that did a bunch of asynchronous and I wanted it asynchronous at its core, maybe using Curio as the core of your project to build that framework on top of might be perfect.

08:51 But it's not itself a framework, really.

08:53 Not yet, anyway.

08:54 Okay.

08:54 Interesting.

08:55 I'll have to keep an eye on it.

08:56 Yeah.

08:57 The built-in concepts for tasks and threads and queues and whatnot.

09:01 It's very, very neat.

09:02 Neat.

09:02 Yep.

09:03 All right.

09:04 You have some package src-ery.

09:05 Yeah, I was playing with the src and trying to figure out if this section should be Package src-ery, or Use the src, Luke, or To src or not to src.

09:18 Anyway.

09:18 These are all great questions.

09:19 Well, speaking of great questions, in episode 15, we talked about Python packaging.

09:26 And one of the listeners, Kristof, had a question last week.

09:31 And the question really was that he was reading some other articles that were in conflict, actually starting with the pytest documentation and pointing to an article by Ionel.

09:46 I think it's Yonel.

09:47 It's called Packaging a Python Library.

09:49 And the conflict really is whether or not a distribution package, like if you're going to push it up onto PyPI, should have all of the modules or the source packages as top-level directories, or whether they should be under a src directory.

10:08 And some people recommend, and pytest did as well, using this src directory instead.

10:16 And the argument that Ionel puts forth is referenced in lots of places.

10:21 And it actually all sounds good.

10:25 Some of the problems are based on using tox and other testing tools to be able to install things a lot and uninstall easily and not muck with the namespace too much.

10:37 So I went out and tried to find some examples that used this.

10:41 And I actually had difficulty finding some.

10:44 If this is such a good idea, I guess my question is, why isn't this promoted more by the Python Packaging Authority in their documentation?

10:54 And pytest itself, even though it recommends this, that package doesn't use it.

11:00 So I guess that's my question to the Python community: should we be using a src directory or not?

11:07 Okay.

11:08 Yeah.

11:08 Well, this is a good place to point out at the bottom of every episode, we have a discuss section.

11:15 And people come in and they ask questions and give us feedback on the various episodes.

11:20 And this one came out of one of those, right?

11:22 Yes.

11:22 Yeah, this conversation.

11:23 So if you want to comment on one of these shows, like this is episode 22, pythonbytes.fm/22.

11:28 Boom.

11:28 Go to the bottom and pick it up, right?

11:30 So feel free to jump in.

11:32 And that's probably on episode 21 where this conversation was happening.

11:35 Yeah.

11:36 And I get why a lot of existing projects like requests don't have this in it because they didn't before.

11:43 But there are a lot of people like Kristof out there that are trying to come up with some new code to share with people, and they want to do it the right way.

11:52 And it's a legitimate question: what is the right way?

11:54 Yeah, absolutely.

11:55 Yeah.

11:56 Great.

11:56 Great topic.

11:57 Now, this last one I have is a bit of a downer, but I just want to, I wanted to cover it because I think it's an important topic and we've talked about it a few times.

12:05 And I'm going to say two companies' names.

12:07 But before you think anything negative about these companies, these are the two companies that were funding a thing that nobody else was funding.

12:13 So thank you to them for doing it.

12:16 But it turns out that Intel is pulling funding from its OpenStack effort, like a sort of an initiative that it started with Rackspace.

12:26 So Intel and Rackspace were collaborating on a project called OpenStack Innovation Center.

12:30 And that started back in 2015.

12:33 And it was supposed to be funded through 2018, but actually it got pulled out early.

12:38 Right.

12:39 There was a lot of good things that got done there.

12:42 They said the objective was to create the world's largest OpenStack developer cloud and develop enterprise capabilities within OpenStack.

12:49 And it quickly accomplished the first goal and made great progress towards the second.

12:53 So that's all good that that was done.

12:57 But it turns out there are 30 Rackspace employees who had been working on this who now have two weeks to find another job within the company.

13:04 And so I guess the story here is just, you know, we all need to be vigilant and careful about how we fund our open source work.

13:13 Maybe a little bit of diversification, not in terms of what people are doing, but in terms of the companies that we have supporting us.

13:20 So the more companies contributing smaller amounts, I feel is probably a safer place to be than a few companies contributing huge amounts.

13:28 Like we have the same problem with PyPA.

13:29 Yeah, definitely.

13:30 Or PyPI and the packaging authority and all those guys, right?

13:33 Like that whole set of projects.

13:35 Yes.

13:35 So I don't know.

13:37 I'm going to try to talk about this a little bit at PyCon this year.

13:40 I'll try to do an open session or something if I can pull it together.

13:43 There's not really a good time of the year to try to find another job in two weeks.

13:48 No.

13:50 Anyway, sorry for you guys.

13:51 Definitely not.

13:51 Yeah, that's a real bummer.

13:53 And OpenStack is awesome.

13:54 And oh, this is not them pulling their support from OpenStack.

13:57 This is this initiative that was on top of it, but still.

13:59 Okay, so it's not all of OpenStack.

14:01 It's just one part of it?

14:02 No, this was an initiative specifically that they were doing, trying to bring people together around OpenStack and some other stuff.

14:10 Okay.

14:10 Yeah.

14:11 All right.

14:11 Well, that's our news for the week, Brian.

14:13 You got anything in particular?

14:14 No, I don't.

14:15 Yeah, no worries.

14:16 We're still awaiting that book release.

14:18 That's going to be a good day.

14:19 Well, I'm frantically in the middle of edits.

14:22 And there's some incredible, I'm finally working through a lot of the feedback I got from people.

14:27 And I'm just still very humbled by the help I've received from the community.

14:33 It's great.

14:33 Yeah.

14:34 Yeah, that's awesome.

14:34 So I have a quick piece of news for everyone.

14:37 If you are in Europe or you would like to spend a little time in Europe, there's EuroPython, at europython.eu.

14:45 Be sure to check that out and get your tickets because they've already sold out the early bird tickets and the main tickets are on sale.

14:53 I've already heard from some people.

14:54 I've heard from some people who were hoping to go to PyCon this year in the U.S.

14:57 Not going.

14:58 Tickets are sold out.

14:59 So if you're in Europe and you want to go, don't wait.

15:03 These things sell out and then you'll be sad.

15:04 Are you going to go, Michael?

15:05 No, not this year.

15:06 I was supposed to go last year.

15:08 I really wanted to, but we were moving back to the U.S. within like a few days of it running.

15:12 So it just didn't work out.

15:13 Yeah.

15:14 Maybe next time.

15:15 Yeah.

15:15 Maybe we can both go.

15:16 That would be awesome.

15:17 All right.

15:18 Well, Brian, thank you for sharing everything with us.

15:22 Thank you.

15:23 Yeah.

15:23 It was fun.

15:24 See you all later.

15:24 Thank you for listening to Python Bytes.

15:27 Follow the show on Twitter via @pythonbytes.

15:29 That's Python Bytes as in B-Y-T-E-S.

15:32 And get the full show notes at pythonbytes.fm.

15:36 If you have a news item you want featured, just visit pythonbytes.fm and send it our way.

15:40 We're always on the lookout for sharing something cool.

15:43 On behalf of myself and Brian Okken, this is Michael Kennedy.

15:46 Thank you for listening and sharing this podcast with your friends and colleagues.
