#184: Too many ways to wait with await?

Published Fri, Jun 5, 2020, recorded Wed, May 27, 2020

Sponsored by DigitalOcean: pythonbytes.fm/digitalocean - $100 credit for new users to build something awesome.

Michael #1: Waiting in asyncio

by Hynek Schlawack
One of the main appeals of using Python’s asyncio is being able to fire off many coroutines and run them concurrently. How many ways do you know for waiting for their results?

The simplest case is to await your coroutines:

    result_f = await f()
    result_g = await g()

Drawbacks:
1. The coroutines do not run concurrently. g only starts executing after f has finished.
2. You can’t cancel them once you started awaiting.

asyncio.Tasks wrap your coroutines and get independently scheduled for execution by the event loop whenever you yield control to it:

    task_f = asyncio.create_task(f())
    task_g = asyncio.create_task(g())

    await asyncio.sleep(0.1) # <- f() and g() are already running!
    result_f = await task_f
    result_g = await task_g

Your tasks now run concurrently, and if you decide that you don’t want to wait for task_f or task_g to finish, you can cancel them using task_f.cancel() or task_g.cancel().
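
To make the cancellation point concrete, here is a minimal sketch. The coroutines `slow` and `fast` are hypothetical stand-ins for f() and g(), not code from the article, and the sleeps only simulate I/O.

```python
import asyncio


async def slow() -> str:
    # Hypothetical stand-in for f(): pretends to do one second of I/O.
    await asyncio.sleep(1.0)
    return "slow done"


async def fast() -> str:
    # Hypothetical stand-in for g(): finishes almost immediately.
    await asyncio.sleep(0.01)
    return "fast done"


async def main() -> tuple:
    task_slow = asyncio.create_task(slow())
    task_fast = asyncio.create_task(fast())

    # Both tasks make progress while we wait for the fast one.
    result_fast = await task_fast

    # We decide not to wait for the slow task after all.
    task_slow.cancel()
    try:
        await task_slow
    except asyncio.CancelledError:
        pass  # awaiting a cancelled task raises CancelledError

    return result_fast, task_slow.cancelled()
```

Running `asyncio.run(main())` returns well before the one second that `slow` would have needed, because the task is cancelled instead of awaited to completion.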
asyncio.gather() takes one or more awaitables as *args, wraps them in tasks if necessary, and waits for all of them to finish. It then returns the results of all awaitables in the same order:
```
    result_f, result_g = await asyncio.gather(f(), g())
```
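The ordering guarantee is easy to check with a small runnable sketch. `fetch` is a hypothetical coroutine invented for this illustration; the delays are arbitrary.

```python
import asyncio


async def fetch(name: str, delay: float) -> str:
    # Hypothetical coroutine: sleeps to simulate I/O, then returns its name.
    await asyncio.sleep(delay)
    return name


async def main() -> list:
    # "a" finishes last, but its result still comes first, because
    # gather() returns results in the order of its arguments.
    return await asyncio.gather(
        fetch("a", 0.03),
        fetch("b", 0.01),
        fetch("c", 0.02),
    )
```

Note that gather() hands back a list, which is why it can be unpacked into separate names as in the one-liner above.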
asyncio.wait_for() lets you pass a timeout; if it expires, the awaitable is cancelled and a TimeoutError is raised.
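
A minimal sketch of wait_for() in action, assuming a hypothetical `slow_operation` coroutine that would take far longer than we are willing to wait:

```python
import asyncio


async def slow_operation() -> str:
    # Hypothetical coroutine that takes much longer than the timeout below.
    await asyncio.sleep(10)
    return "finished"


async def main() -> str:
    try:
        # Give up after 50 ms; wait_for() cancels slow_operation() for us.
        return await asyncio.wait_for(slow_operation(), timeout=0.05)
    except asyncio.TimeoutError:
        return "timed out"
```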
A more elegant approach to timeouts is the async-timeout package on PyPI. It gives you an asynchronous context manager that allows you to apply a total timeout even if you need to execute the coroutines sequentially:
```
    async with async_timeout.timeout(5.0):
        await f()
        await g()
```

asyncio.as_completed() takes an iterable of awaitables and returns an iterator that yields asyncio.Futures in the order the awaitables are done:

    for fut in asyncio.as_completed([task_f, task_g], timeout=5.0):
        try:
            await fut
            print("one task down!")
        except Exception:
            print("ouch")

Michael’s Async Python course.

Brian #2: virtualenv is faster than venv

virtualenv docs: “virtualenv is a tool to create isolated Python environments. Since Python 3.3, a subset of it has been integrated into the standard library under the venv module. The venv module does not offer all features of this library, to name just a few more prominent:
- is slower (by not having the app-data seed method),
- is not as extendable,
- cannot create virtual environments for arbitrarily installed python versions (and automatically discover these),
- is not upgrade-able via pip,
- does not have as rich programmatic API (describe virtual environments without creating them).”
pro: faster: under 0.5 seconds vs about 2.5 seconds
con: the --prompt is weird. I like the parens and the space, and 3.9’s magic “.” option for prompt to name it after the current directory.
pro: the pip you get in your env is already updated

conclusion:

I’m on the fence for my own use. Probably leaning more toward keeping the built-in venv. But not having to update pip is nice.
For teaching, I’ll stick with the built-in venv.

The “extendable” and “has an API” parts really don’t matter much to me.

    $ time python3.9 -m venv venv --prompt .

    real 0m2.698s
    user 0m2.055s
    sys 0m0.606s
    $ source venv/bin/activate
    (try) $ deactivate
    $ rm -fr venv
    $ time python3.9 -m virtualenv venv --prompt "(try) "
    ...
    real 0m0.384s
    user 0m0.202s
    sys 0m0.255s
    $ source venv/bin/activate
    (try) $
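
The measurement can also be scripted with the standard library. This is only a sketch: `time_venv_creation` is a hypothetical helper, and the idea that installing pip accounts for much of venv's time is an assumption to check on your own machine (the virtualenv docs quoted above attribute the gap to how packages are seeded).

```python
import tempfile
import time
import venv
from pathlib import Path


def time_venv_creation(with_pip: bool = False) -> float:
    """Create a throwaway environment with the stdlib venv module and
    return how many seconds the creation took."""
    with tempfile.TemporaryDirectory() as tmp:
        env_dir = Path(tmp) / "venv"
        start = time.perf_counter()
        venv.create(str(env_dir), with_pip=with_pip, prompt="try")
        elapsed = time.perf_counter() - start
        # pyvenv.cfg is written for every environment that venv creates.
        assert (env_dir / "pyvenv.cfg").exists()
        return elapsed
```

Calling it once with `with_pip=False` and once with `with_pip=True` gives a rough idea of how much of the wait is spent installing pip into the new environment.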

Michael #3: Latency in Asynchronous Python

Article by Chris Wellons
Was debugging a misbehaving Python program that makes significant use of Python’s asyncio.
The program would eventually take very long periods of time to respond to network requests.
The program’s author had made a couple of fundamental mistakes using asyncio.
Scenario:
- Have a “heartbeat” async method that beats once every ms:
  - heartbeat delay = 0.001s
  - heartbeat delay = 0.001s
  - …
- Have a piece of computational work that takes 10 ms
- Need to run a bunch of these computational jobs (say 200)
- But starting them all at once blocks the asyncio event loop, so the heartbeat stalls until they finish
- See my example at https://gist.github.com/mikeckennedy/d9ac5a600f91971c6933b4f41a8df480
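
The scenario can be reproduced in a few lines. This is a sketch under stated assumptions, not the code from the article or the gist: `heartbeat`, `compute`, and `worst_heartbeat_gap` are hypothetical names, and `time.sleep()` stands in for the 10 ms of computation.

```python
import asyncio
import time


async def heartbeat(gaps: list) -> None:
    # Tries to beat every millisecond and records the real time between beats.
    last = time.perf_counter()
    while True:
        await asyncio.sleep(0.001)
        now = time.perf_counter()
        gaps.append(now - last)
        last = now


async def compute(job_time: float) -> None:
    # Assumption: time.sleep() stands in for CPU-bound work. It never yields
    # to the event loop, so nothing else can run while it executes.
    time.sleep(job_time)


async def worst_heartbeat_gap(n_jobs: int = 200, job_time: float = 0.01) -> float:
    gaps = []
    beat = asyncio.create_task(heartbeat(gaps))
    await asyncio.sleep(0.02)  # let a few normal beats happen first

    # Scheduling every job at once queues them all ahead of the next heartbeat.
    await asyncio.gather(*(compute(job_time) for _ in range(n_jobs)))
    await asyncio.sleep(0.02)  # give the heartbeat a chance to run again

    beat.cancel()
    return max(gaps)
```

With the defaults (200 jobs of 10 ms each), `asyncio.run(worst_heartbeat_gap())` should report a worst gap of around two seconds instead of roughly one millisecond.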
Unsync fixes this and improves the code! Here’s my example: https://gist.github.com/mikeckennedy/f23b9b5abd9452cdc8b3bacaf1c3da20
Need to limit the number of “active” tasks at a time.
What does work is a job queue: create a queue populated with coroutines (not tasks), and have a small number of tasks run jobs from the queue.
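
A minimal sketch of that job-queue pattern with asyncio.Queue follows. The names `worker`, `run_limited`, and `demo` are hypothetical, and the sleep inside `job` is a placeholder for real work.

```python
import asyncio


async def worker(queue: asyncio.Queue, results: list) -> None:
    # Each worker pulls one coroutine at a time and runs it to completion.
    while True:
        coro = await queue.get()
        try:
            results.append(await coro)
        finally:
            queue.task_done()


async def run_limited(coros, n_workers: int = 3) -> list:
    queue = asyncio.Queue()
    for coro in coros:
        queue.put_nowait(coro)  # plain coroutine objects: nothing runs yet

    results = []
    workers = [asyncio.create_task(worker(queue, results)) for _ in range(n_workers)]
    await queue.join()  # returns once task_done() was called for every job

    for w in workers:
        w.cancel()  # the workers are idle in queue.get(); stop them
    await asyncio.gather(*workers, return_exceptions=True)
    return results


async def demo() -> tuple:
    running = 0
    peak = 0

    async def job(n: int) -> int:
        # Tracks how many jobs are in flight at the same time.
        nonlocal running, peak
        running += 1
        peak = max(peak, running)
        await asyncio.sleep(0.01)
        running -= 1
        return n

    results = await run_limited([job(n) for n in range(10)], n_workers=3)
    return sorted(results), peak
```

Because only the workers ever start a job, the number of active jobs never exceeds the number of workers, which leaves the event loop room to service other tasks between jobs.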

Brian #4: How to Deprecate a PyPI Package

Paul McCann, @polm23
A collection of options for getting people to stop using your package on PyPI. Also includes code samples or example packages that use some of these methods.
Options:
- Add deprecation warnings: Useful for parts of your package you want people to stop using, like some of the API, etc.
- Delete it: Deleting a package or version is OK for quick oops mistakes, but it allows someone else to grab the name, which is bad. Probably don’t do this.
- Redirect shim: Add a setup.py shim that just installs a different package. Cool idea, but a bit creepy.
- Fail during install: Intentionally failing during install and redirecting people to use a different package or just explain why this one is dead. I think I like this the best.
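
For the first option, a deprecation warning on part of an API might look like the sketch below. `old_api` and `new_api` are hypothetical functions made up for this illustration, not taken from the article.

```python
import warnings


def new_api(x: int) -> int:
    # Hypothetical replacement function.
    return x * 2


def old_api(x: int) -> int:
    # Hypothetical function being phased out in favour of new_api().
    warnings.warn(
        "old_api() is deprecated; use new_api() instead",
        DeprecationWarning,
        stacklevel=2,  # point the warning at the caller, not at this line
    )
    return new_api(x)
```

The old entry point keeps working, so existing callers get a nudge toward the replacement without their code breaking.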

Michael #5: Another progress bar library: Enlighten

by Avram Lubkin
A few unique features:
- Multicolored progress bars: It's like many progress bars in one! You could use this in testing, where red is failure, green is success, and yellow is an error. Or maybe when loading something in stages such as loaded, started, connected, and the percentage of the bar for each color changes as the services start up. Has 24-bit color support.
- Writing to stdout and stderr just works! There are a lot of progress bars. Most of them just print garbage if you write to the terminal when they are running.
- Automatically handles resizing! (except on Windows)
See the animation on the home page.

Brian #6: Code Ocean

Contributed by Daniel Mulkey
From Daniel: “A peer-reviewed journal I read (SPIE's Optical Engineering) has a recommended platform for associating code with your article. It looks like it's focused on reproducibility in science.”
Code Ocean is a research collaboration platform that supports researchers from the beginning of a project through publication.
This is a paid service, but has a free tier.
Supports:
- C/C++
- Fortran
- Java
- Julia
- Lua
- MATLAB
- Python (including jupyter) (why is this listed so low? should be at the top!)
- R
- Stata
From the “About Us” page:
- “We built a platform that can help give researchers back 20% of the time they spend troubleshooting technology in order to run and reproduce past work before completing new experiments.”
- “Code Ocean is an open access platform for code and data where users can develop, share, publish, and download code through a web browser, eliminating the need to install software on personal computers. Our mission is to make computational research easier, more collaborative, and durable.”

Extras:

Brian:

Python 3.9.0b1 is available for testing

Michael:

SpaceX launch, lots of Python in action.

Joke:

Sent over by Steven Howell
https://twitter.com/tecmint/status/1260251832905019392
From Bert, https://twitter.com/schilduil/status/1264869362688765952
But modified for my own experience:
“What does pyjokes have in common with Java? It gets updated all the time, but never gets any better.”

Episode Transcript

00:00 Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to

00:04 your earbuds. This is episode 184, recorded May 27th, 2020. I'm Brian Okken. And I'm Michael

00:12 Kennedy. And this episode is sponsored by DigitalOcean. Thank you, DigitalOcean. Yeah,

00:17 thanks, DigitalOcean. Really appreciate it. Well, I'm waiting, Michael. You're waiting?

00:21 What are you waiting for? I'm waiting for the next AsyncIO story.

00:24 I guess it's my turn, isn't it? Okay. Sorry, let me see if I can get this right.

00:30 So the topic I want to talk about is waiting in AsyncIO. Yeah, so the magic of AsyncIO,

00:37 which was introduced in Python 3.4, never really appeared until Python 3.5 when the async and await

00:44 keywords came into being, which let you write code that looks like standard single-threaded serial code,

00:50 but actually is multi-threaded, or at least parallel concurrent to some degree. Depends on

00:57 how you're running it, whether it's truly multi-threaded. Anyway, there's a lot of options,

01:02 let's say, on how you can interact with these co-routines and these tasks that are generated by the AsyncIO

01:10 framework. And anytime there's like four ways to do something in programming, you should be asking

01:16 yourself, one, why are there four ways to do this? But more importantly, when does way one apply best?

01:25 And what scenario should I use way three? And what is the trade-off between two and four?

01:29 And so on, right? So that's the case with AsyncIO. There's tons of ways to wait or await things.

01:37 And Hynek, get the pronunciation right for me. I know you got it.

01:42 You got it. That's good.

01:43 Sorry.

01:47 That's just a running thing. I can never get it right.

01:50 No. Well, the problem is, I think I had it right. And then we went back and forth so many times with so

01:56 many variations. Now it's broken. Sorry about messing up your name there. So he wrote a great article,

02:02 though, called Waiting in AsyncIO that does exactly that, that says, here are all the ways,

02:07 here's the pluses, and here's the minuses, and the situations in which you apply them.

02:11 So if you're serious about using AsyncIO and you're building real things, basically this podcast

02:18 episode is for you because I have this and another one that bring two really cool ideas together.

02:22 But let's talk about the waiting one first.

02:26 So it's really easy to start doing some work, right? I can have two coroutines, let's say F and G,

02:33 and I could say I want the result of F by saying, you know, result equals await calling F,

02:40 and then result G equals await calling G. And that's fine if what you're looking for is more

02:47 concurrent execution of that part of code, right? So this is, say, in a web method, like a view,

02:54 someone makes a request, and there's not a lot you can do to make things go faster,

02:59 potentially for that one request, but you can say, let the server be less busy,

03:03 so it could handle like 10 or 20 times more of the same request, right? So this real simple,

03:08 like just await calling these functions, these Async functions, it will allow your system to scale

03:14 more, but it won't make things faster. Like for example, if you're trying to crawl 20 web pages,

03:19 this will not make it any faster, it'll just make your code more complicated. So don't do that.

03:24 So there's other ways which you want to do that. Another thing that I think a lot of people

03:28 don't quite get is when you call one of these async functions, like async def function_name,

03:34 when you call it, it doesn't actually start it until you either await it or create a task from it.

03:43 So if you call like F and then G, and you think you're going to come back and get to them later,

03:49 now they're running. No, actually, they're not unless you've created a task, which starts them.

03:54 So you either have to await them, which blocks, or create these tasks to like kick them off.

03:59 That's pretty interesting, right? Like that's not super obvious. I think normally when you call a

04:02 function, it does a thing. But here, not so much. So some other options, if you could call them both as

04:09 create them as tasks, and then you could await those tasks, right? Because their tasks are already

04:13 running, and then you await them both, whichever one first finishes first doesn't matter. You wait

04:17 till the first one's done, and maybe the second one's already done. So that's probably the pattern

04:21 that most people are going to be using. But you can also use asyncio.gather that takes one or more

04:27 awaitables as a star args, and then it waits for them all to finish, which is pretty cool. And that

04:33 itself is a future thing that you can await, right? So you would say await asyncio.gather.

04:39 Oh, that's cool.

04:40 Yeah, gather is awesome, because I can create all these tasks. And I just say, I just need them all

04:44 to be done. And when that's done, we can get the results and carry on. And what's cool is when you

04:49 await gather, you get a tuple of results. So if I say asyncio.gather function one or task one,

04:55 task two, then it returns the result one comma result two as a tuple. So you can gather them up and get

05:02 all the answers back, which is pretty cool. Yeah, that's really neat. Yeah, one of the problems

05:06 with gather, though, is you're saying I'm willing to wait forever for this to finish. And sometimes

05:12 that's fine. But sometimes things don't return correctly or ever or in the right amount of time.

05:18 So you can use wait_for, which is nice and allows you to pass a timeout. But what's a little bit

05:24 better than wait_for is there's an async-timeout package on PyPI, which I'd not heard of. And you can

05:30 basically create a block, a with block, that will time out. So you can say I want to have

05:36 an async with timeout of five seconds and then do a whole bunch of function calls and awaiting and all

05:42 that. And either they're all going to finish or when five seconds passes, everything gets canceled

05:47 that hasn't finished. That's cool. That's pretty cool, right? Another one that's really interesting

05:51 is I start a bunch of work. And then I would like to say I kick off, I'm doing web scraping,

05:56 and I want to go get the results of 20 web pages. I kick off 20 requests and then I want

06:01 to process them as they complete. Like the first one that's done, I want to work on that. Then

06:06 the next one, then the next one. So you can create a task or an iterable rather from saying asyncio.as_completed

06:14 and you give it a bunch of tasks and it gives you an iterator that you can for in

06:19 over that gives you the first completed one, then the second completed one and just blocks until

06:25 the next one is completed. So you kick a bunch off and then you just say for completed tasks in asyncio.as_completed

06:32 and you give it your running tasks. That's really slick, isn't it?

06:35 That is slick. Looks like it has a timeout also that you can add to it.

06:39 Yeah, very nice. Yep. You can give it a timeout indeed.

06:42 Now, there's a few more things covered in there and I didn't go over the trade-offs too much.

06:48 You know, here's the scenario where you use this and that. So if you really care about this,

06:52 two things to do. One is check out the article. It's got a lot of details and each subsection has

06:58 a little trade-offs. Here's the good, here's the bad, which is nice. And also you can check out my

07:02 Async course, which talks about this and a whole bunch of other things on Async as well. So I'll put

07:07 a link in the show notes for that as well. Nice. Yep. So I was talking about waiting. You're talking

07:11 about what, being faster? That sounds better than just waiting around.

07:14 Yes. Being faster. Well, maybe being faster. Not sure. So I'm still on the fence. Anyway,

07:21 so virtual environments. I use virtual environments. Do you use virtual environments?

07:25 Anytime I have to install any, if pip install has to be typed, there's a virtual environment

07:30 involved. Yeah.

07:31 Yeah. I use it for everything. Even if I've got a machine, like a build machine that really only has

07:36 one Python environment and I'm only using it for one thing, I still set up a virtual environment.

07:41 It's just always. And I've been, since the Python 3 started, Python 3 packaged VENV with Python. So

07:50 you can create virtual environments just with the built-in VENV package. And I've been using that.

07:57 Now, before that was in there, and if you were in Python 2 land, you needed to use the pip installable

08:05 virtualenv package. Now, it is still updated and it is still being maintained. And I noticed,

08:14 this was a conversation that started on Twitter this morning, that virtualenv was still

08:20 around and it was, maybe you should use that. So I went and checked it out again, the documentation

08:26 for it. And it says, virtualenv is a tool to create isolated Python environments. We know this.

08:32 Since Python 3.3, I guess, a subset of it has been integrated into the standard library. Yep.

08:38 The VENV module does not offer all the features of this library. Just to name some of the prominent ones,

08:47 VENV is slower. And it's not extendable. And it cannot create environments with multiple Python

08:55 versions. And you can't update it with pip. And it doesn't have a programmatic API. Now, most of that,

09:03 I just really don't care about. But the slower part, I do care about. So I gave it a shot this morning. I

09:10 used time, the time function on the command line, just to time a couple of commands, created virtual

09:16 environments with both VENV and virtualenv. And yeah, VENV takes a little over two seconds,

09:23 two and a half seconds to finish, whereas the virtualenv version takes like quite a bit under

09:29 half a second. So that's a lot. And I mean, if I'm doing a lot of virtual environments, I might care.

09:36 Now, one of the things I was like, coming back and forth, why would I use VENV then if virtualenv

09:43 is faster? Well, you have to pip install virtualenv. And so I'll have to remember to do that.

09:50 I don't think I'll start teaching people this because it's just one more complication thing.

09:56 And a couple of seconds isn't that big of a deal. And I still like the prompt, the --prompt.

10:01 Virtualenv supports that too, but it handles it different. It doesn't wrap your prompt in

10:06 parentheses. And maybe that's just a nicety, but I kind of like it. I'm not sure. I'm on the fence as

10:12 to whether I should switch or use it.

10:15 To me, it feels like I'm going to stick with VENV. For a long time, I saw virtualenv as just like,

10:20 it's legacy stuff. It's there because before Python 3.3, you didn't have VENV built in,

10:26 so you're going to need it. And a lot of the tutorials talked about it and whatnot.

10:29 We recently covered it about why it got a big update and a lot of the things that it does that

10:34 are nice. And the speed is cool. Maybe it wouldn't be that hard to adopt the --prompt

10:39 dot feature, right? It's open source, right? It could get a PR that does that. That would be pretty

10:45 cool, actually. It probably should just so it's consistent. But to me, the idea of having another

10:51 thing I've got to install somewhere, probably into my user profiles, Python packages, so that then I can

10:57 then create virtual environments so that I can then install things over into that area, it's just

11:01 it's fine for me. But as somebody who does courses and teaching and other stuff like presentations,

11:08 like it just seems like, okay, you just lost how many people out of that, right? Like,

11:13 you know, what is the value? Like you say, it's two seconds. One of the things what I would like to see,

11:19 and it would be really nice, maybe that would even push me over the edge is it drives me crazy that I

11:25 create a new virtual environment from the latest version of Python that I can possibly get on the planet.

11:30 And it tells me that pip is out of date. Now virtualenv didn't do that for me this morning.

11:35 So it created a virtual environment with the newest pip in it.

11:40 Oh, okay. See, now that's pretty nice, because it's annoying to say, okay, what you do is create this virtual

11:45 environment, and you pip install this thing. Oh, look, there's always going to be a warning. So every single

11:49 time, what you're going to do is you're going to fix that warning by doing this, right? So if it grabs the latest,

11:55 that's actually kind of cool. Now that I think of it, I have an alias in my shell startup where I just type

12:03 venv and it does the python -m venv. And then it does an upgrade of pip.

12:11 And first it activates it. And then it does an upgrade of pip and setup tools all in like,

12:16 four characters. So that's what I've been doing these days.

12:20 Yeah, I've got like a little snippet in my profile also that I'm using. Funny enough,

12:25 I shared it recently on Twitter, just my like two line snippet that I used. And then people kept on

12:31 telling me to use all these other tools. Oh, you could just use this. It's not just use this.

12:36 It's just a two line snippet in a profile. It's not a big deal. I don't even have to know what it

12:40 is. I just typed these three characters. I'm good. Why are you bothering me? Right? Why is it such a

12:44 big deal? I know it's crazy. Yeah. Yeah. What I would really like to see, and none of these address it,

12:49 is something kind of like Node.js, where it just has the virtual environment at the top level

12:58 and it just walks up until it finds a virtual environment and maybe complains if it doesn't,

13:03 or it does something like that. Right? Like something to the effect where you say here,

13:08 I know that this is a feature. I forgot what it's called. It's like the local Python or something like

13:12 that, but it's not just built into Python. So if I just went into that directory and tried to run it,

13:17 it's not going to find and use that version of Python, you know? Oh, well, direnv, there's a few,

13:22 there are a few packages that do that. And that's one of the things that people are directing me to is

13:27 direnv. Direnv is cool. We should talk about it as a separate item. It's worth it. But yeah,

13:32 D-I-R-E-N-V is cool. Yeah. Yeah. We should talk about that sometime.

13:35 Yeah. Yeah. Yeah. Don't use up all our items all in one show, man. Come on.

13:38 Okay. Let's thank our sponsor. So DigitalOcean is sponsoring this episode and DigitalOcean just

13:45 launched their virtual private cloud and new trust platform. Together, these make it easier

13:51 to architect and run serious business applications with even stronger security and confidence.

13:56 The virtual private cloud or VPC allows you to create multiple private networks for your account

14:02 or your team instead of having just one private network. DigitalOcean can auto-generate your private

14:07 network's IP address range, or you can specify your own. You can now configure droplets to behave as

14:14 internet gateways. And Trust Platform is a new microsite that provides you one place to get all your

14:19 security and privacy questions answered and download their available security certifications.

14:24 DigitalOcean is your trusted partner in the cloud. Visit pythonbytes.fm/DigitalOcean

14:31 to get a $100 credit for new users to build something awesome.

14:36 Yeah. Awesome. Thanks for supporting the show, DigitalOcean. So before you had to wait on me,

14:40 didn't you, Brian? It was frustrating.

14:42 I did have to wait.

14:43 All right. Let me tell you about when you might not even be awaiting stuff and things are still slow.

14:47 So there's this cool analysis done by Chris Wellons in an article called Latency in Asynchronous

14:54 Python. I don't know if it talks about problems with asyncio directly; it's more about when you have a misconception of how something works over there, and then you apply a couple of patterns or behaviors to it, it might not do what you think.

15:10 So for probably the best example would be, he gives a couple. I'll focus on this one where it's basically generating too much work and what can be done about it.

15:22 So he says, I was debugging a program that was having some significant problems, but it was based on asyncio, and it would eventually take really long time for network responses to come back.

15:36 And it's made of basically two parts. One is this thing that has to send a little heartbeat or receive a heartbeat or something.

15:42 I don't remember if it's inbound or outbound, but it has to go beep, beep, beep once every, let's say, millisecond, right?

15:49 So there's an async function, and it just rips through, and just every one millisecond, it kicks off one of these heartbeats. Totally simple, right?

15:58 You just say await asyncio dot whatever, like sleep for one millisecond, then do the thing, and then go on, right?

16:06 You can basically allow other concurrent work to happen while you're awaiting this sort of timeout, and to do it on a regular time frame.

16:14 And then there's this other stuff that has to do some computational work that takes not very long, like 10 milliseconds.

16:20 So you're receiving a JSON request, you have to parse that JSON and do it like a little bit of work, right?

16:25 So because asyncio runs really on a single thread, that 10 milliseconds is going to block out and stop the heartbeat for 10 milliseconds, which is, you know, whatever, it's fine.

16:38 It's like there's some variability, but it's no big deal.

16:41 However, if you run a whole bunch of these, in his example, Chris said, let's run 200 of these computational things and like just start them up so that they can get put into this queue of work to be done.

16:52 Well, the way it works is it all gets scheduled.

16:54 It says, okay, we have a heartbeat and we have these 200 little slices of work, each of which is going to take 10 milliseconds.

17:00 And there's a bunch of stuff around them that makes them a little bit slower, the scheduling and whatnot.

17:06 And then we have a bunch of more heartbeats.

17:08 So it goes beep, beep, block for two seconds.

17:11 Beep, beep, beep, where you would expect, okay, I've got all these heartbeats going and I've got 200 little async things.

17:18 Let's like mix them up, right?

17:20 Like kind of share it fairly and it does not do that at all.

17:24 Oh, wow.

17:24 So it talks about basically what some of the challenges are there.

17:29 One is probably you shouldn't just give it that much work in some giant batch.

17:36 You should, you know, give it less work at a time, like some kind of like work queue or, you know, he said, let's see if a semaphore can work.

17:45 Now, I don't remember if semaphores are reentrant or not.

17:48 It didn't work.

17:49 The semaphore didn't help at all, actually.

17:52 He said, we can use a semaphore to limit it to 10.

17:54 But if semaphores are reentrant, this is all one thread.

17:56 It doesn't matter.

17:57 Like the semaphore won't block itself.

17:59 So that's like this normal threading, locking and stuff like that.

18:04 They kind of don't apply because there's not actual threading going on.

18:06 So that doesn't really help.

18:08 But he comes up with this example where asyncio has a job queue, which allows you to push work into it.

18:16 And then you can like wait for it to be completed.

18:18 And there's all sorts of cool patterns and like producer consumer stuff that you can put on there.

18:22 So I actually put together an example.

18:24 He has like little code snippets.

18:26 I put together a running example in one whole program that demonstrates this.

18:30 And I have a link to the gist in the show notes.

18:33 And I also would like to just point out how much a fan of unsync I am, which I always get it.

18:38 I always talk about when I can around this async stuff.

18:41 Like unsync is a library that is 120 lines of Python and it unifies multiprocessing, asyncio, threading, thread, like the, all these different APIs into like a perfect thing that fits with async await.

18:55 It's really, really nice.

18:57 But applying like the standard unsync adjustments to this code, like what you do is just put a decorator, @unsync, on the function.

19:06 That's it.

19:07 You still use await and async and await and all that kind of stuff.

19:11 The problem is gone.

19:12 Totally fixes it.

19:13 Oh, really?

19:14 Like you don't have to go to like crazy queues and all that.

19:16 Like the problem is gone.

19:17 It's fixed.

19:18 Well, it's alleviated.

19:20 It may still be like if you push it far enough under certain like more complex criteria.

19:25 But the example that showed the problem, you just make them unsync and you await them.

19:29 It just runs like you would have originally expected.

19:32 Like unsync is so beautiful.

19:33 That's cool.

19:34 It doesn't change the way asyncio works.

19:37 It basically says, okay, the async work is going to run on a background thread and this other computational stuff will fit into the API, but will technically run on its own thread.

19:46 So it's not like changing the internals, but you use the same code and then now this doesn't have this problem because the way it slices them together is better, I think.

19:56 Anyway, it's pretty interesting.

19:57 It's worth a look.

19:58 Also, I have a copy of that on the gist and you can check that out and run it too.

20:01 That's pretty cool.

20:02 So unsync allows you to possibly not think too much about whether you should have these things just be async or whether there should be threads or something.

20:11 Is it?

20:12 Yeah, cool.

20:13 Yeah, it's really neat.

20:13 It just cleans everything up.

20:15 But I sure hope they don't deprecate it though.

20:17 Oh, that's a better transition.

20:19 I was going to say, speaking of the cleaning things up, but...

20:23 That works too.

20:23 We'll just do both transitions.

20:25 So how to deprecate a PyPI package.

20:27 So you've put up a PyPI package and for some reason you don't want it to be up there.

20:33 I don't want a puppy anymore.

20:35 Why do I have to take care of this?

20:36 Yeah.

20:36 So there's lots of reasons why this might happen.

20:39 One of them might just be you accidentally...

20:42 You didn't use the test PyPI and use the live one and you put up Foo or some variant of Foo and you didn't mean to.

20:50 Maybe it's some other package that somebody took over and it's handling it better and you want people to use something else.

20:57 But anyway, there's lots of reasons why you might.

21:00 A guy named Paul McCann wrote a blog post about how to deprecate a PyPI package.

21:06 So he gives a few options and I think these are cool.

21:09 One of the interesting things that he mentions is that PyPI doesn't really give you direction as to what it should look like.

21:17 Which one you should use.

21:18 So he's giving his opinion, which is great.

21:21 You might use a deprecation warning.

21:23 And this doesn't really apply to entire packages.

21:26 But let's say you've changed your API.

21:28 So it might as well be listed here.

21:30 It's a good thing to, instead of just ripping out parts of your API, leave them in there, but make deprecation warnings in there.

21:37 They really should be errors instead of warnings.

21:39 If you're really taking them out and just having the warning, something like an assert is probably better.

21:46 But there is a good thing to think about whether, don't just rip it out, maybe.

21:50 I don't know.

21:50 But if you rip it out completely, an assert will happen automatically.

21:54 So maybe that's a good thing.

21:55 As far as packages, though, you could just delete it.

21:58 So you can, PyPI does allow you to remove packages.

22:01 I don't think that that's probably the right thing to do, usually, ever.

22:06 Unless you just push something up and it was an accident, then deleting it is fine.

22:11 But if it's been up there for a while and people are using it, deleting it has a problem that somebody else could take over the name.

22:17 And possibly a malicious package could take over the name and start having people install it.

22:23 So there's problems with that.

22:25 So it's probably not a good choice most of the time.

22:27 The last two options are more reasonable.

22:30 There's a redirect shim.

22:31 So this is an example.

22:33 Like, let's say there's an obvious package that is compatible, that is being better maintained, and you want to push people over to there.

22:42 If it's really very compatible, you can add a shim, and there are some code examples here, so that if somebody installs it,

22:50 it also installs the other package that people should be using.

22:54 I know if you want it. We'll give you that one.

22:55 Yeah.

22:55 And even having, if somebody imports your package, it really just imports the other package too.

23:01 That's a little weird, but it is interesting that it's an option.
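One way to picture the redirect shim is a final release of the old package whose only job is to depend on the replacement. The sketch below assumes hypothetical distribution names `old-package` and `new-package` and is not the code from the blog post.

```python
# Hypothetical names: "old-package" is being retired, "new-package" replaces it.
SHIM_METADATA = {
    "name": "old-package",
    "version": "9.9.9",
    "description": "Deprecated: please install new-package instead.",
    # Installing the shim pulls in the replacement as a dependency.
    "install_requires": ["new-package"],
}


def run_setup():
    """A real setup.py would call this at module level."""
    from setuptools import setup

    setup(**SHIM_METADATA)
```

The import-side variant discussed here would be a module in the old package that simply re-exports the new one, for example with `from new_package import *`.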

23:06 The thing I really like, probably the best, is just a way to fail during install.

23:11 And there's a code example here for if somebody pip installs something,

23:15 and all the packaging works, but the install part will throw an error, and you can put a message there redirecting people to use a different package,

23:27 or maybe just explain why you ripped this one out.
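A minimal sketch of the fail-during-install idea, assuming the package is uploaded only as a source distribution so that pip has to run `setup.py`. The package names and the helper `fail_unless_sdist` are hypothetical.

```python
import sys

# Hypothetical names for illustration only.
OLD_NAME = "old-package"
NEW_NAME = "new-package"

MESSAGE = (
    f"{OLD_NAME} is deprecated and is no longer installable.\n"
    f"Please install {NEW_NAME} instead: pip install {NEW_NAME}"
)


def fail_unless_sdist(argv):
    """Abort with a helpful message for every setup.py command except sdist."""
    if "sdist" not in argv:
        raise SystemExit(MESSAGE)


# In the real setup.py these lines would run at module level:
#     fail_unless_sdist(sys.argv)
#     setup(name=OLD_NAME, version="9.9.9")
```

Letting `sdist` through means the maintainer can still build and upload the package, while anyone who tries to install it gets the message instead of a working install.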

23:29 So I think I like the last one best.

23:31 So most of those are my commentary.

23:33 But there's some options for how to deal with it.

23:36 So I thought that was good.

23:37 Yeah, I really like the sort of, I tried to pip install it, and it gives you, instead of just failing or being gone,

23:44 it actually gives you a meaningful message.

23:46 Like, you should use this other package, we're done.

23:49 If you really intend to delete it, that's probably it.

23:51 Yeah, and one of the interesting things, the last couple, the redirect shim and the fail during install,

23:57 he gives example packages that do this.

24:00 And some of these are just mistyped things.

24:03 Like, if people mistype something, maybe they meant something else, and redirecting there.

24:10 Yeah.

24:10 It seems so write-only over at PyPI.

24:14 And, you know, if you make a mistake, it's not good.

24:17 So knowing what to do.

24:19 I mean, people depend on it, right?

24:20 If you yank it out, then it's trouble.

24:22 Yeah, but if it's mistake-driven, though, make sure you use the test interface first

24:26 to play with things before you push garbage up there.

24:29 Also, I'd really like people to not squat on names.

24:33 There's a lot of cool package names out there that really have nothing meaningful there

24:38 because somebody decided they wanted to grab a name and then didn't do anything with it.

24:42 That's lame.

24:43 Don't do that.

24:43 Yeah, that's definitely lame.

24:44 On the other hand, there are some times I'm like, how did you just get that name?

24:48 There'll be like a new package, like secure or something like that.

24:51 Like, how did you get that after all this time, right?

24:54 It's crazy.

24:54 Yeah, definitely.

24:56 Yeah, yeah.

24:56 We're up to 236,000 packages.

25:00 That's pretty insane.

25:01 Yeah.

25:01 So, Brian, would you like me to enlighten you a little bit, and the listeners?

25:04 Yes, please enlighten me.

25:06 So, last time you brought up a cool progress bar.

25:11 It was either the last time or the time before.

25:12 And it did all sorts of cool stuff.

25:15 But here's yet another one.

25:17 Again, an example of our listeners saying, oh, here's three cool things you talked about,

25:20 but did you know there are these four others you've never heard of?

25:23 Yeah.

25:23 So, Avram Lubkin sent over his progress bar package called Enlighten.

25:30 And it's actually pretty cool.

25:32 Like, there's a bunch of cool progress bars with nice animations and stuff.

25:35 But there's a few features of Enlighten that might make you choose it.

25:39 One is you can have colored progress bars, which is nice.

25:44 But more importantly, you can have multicolored progress bars.

25:47 So, let me throw out an example that I think would connect for you, given that you're a fan of pytest.

25:52 Like, if you run some sort of series or sequence of operations, and you want to show how far you're making it,

25:58 but they have multiple outcomes, like red is failure, green is success, and yellow is, like, skip or something like that.

26:03 You could have a progress bar that has three segments, a red segment, a yellow segment, and a green segment.

26:11 And it could be all one bar, but it could kind of, like, show you as it grows.

26:15 Here's the level of failure.

26:16 Here's the level of success, and so on.

26:18 All with color, 24-bit color, not just, like, eight colors either.
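To make the three-segment idea concrete, here is a plain-Python sketch that renders one bar with a red, a yellow, and a green section. It deliberately does not use Enlighten's API. It only illustrates the concept with basic ANSI colors, and every name in it is made up for the example.

```python
RESET = "\x1b[0m"
# Basic ANSI colors; Enlighten itself supports far more than these.
COLORS = {"failed": "\x1b[31m", "skipped": "\x1b[33m", "passed": "\x1b[32m"}


def segment_widths(counts, total, width):
    """Return the number of bar cells per outcome.

    Cumulative rounding keeps the filled part from ever exceeding `width`.
    """
    widths = {}
    done = 0
    filled = 0
    for name, count in counts.items():
        done += count
        upto = (done * width) // total
        widths[name] = upto - filled
        filled = upto
    return widths


def render_bar(counts, total, width=30):
    """Render a single bar made of one colored segment per outcome."""
    widths = segment_widths(counts, total, width)
    filled = sum(widths.values())
    segments = "".join(
        COLORS[name] + "█" * cells + RESET for name, cells in widths.items()
    )
    return f"|{segments}{' ' * (width - filled)}| {sum(counts.values())}/{total}"


print(render_bar({"failed": 2, "skipped": 1, "passed": 7}, total=20))
```

As more results come in, each segment grows in place, which is the effect described above for a test run with failures, skips, and passes.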

26:23 Oh, yeah.

26:23 That'd be great.

26:24 That's good.

26:25 And then I said, those just go off.

26:26 The other one is a lot of these progress bars, they'll sort of control, they'll be rewriting the screen, right?

26:33 They'll be putting stuff across as it's happening.

26:35 But if you happen to do, like, a print statement, effectively writing to standard out, or an exception that writes to standard error

26:42 or something like that, it, you know, messes them all up, right?

26:46 So this one works well, even allows you to write print statements while it's working.

26:52 So the print statements kind of flow by above it, but it's, you know, whatever part of the screen it's taken over,

26:57 it still is managing that as well.

26:59 So it overrides what print means, or standard out, and sends it where it belongs.

27:04 That's cool.

27:04 Yeah, that's pretty cool.

27:05 It also automatically handles resizing, except on Windows.

27:08 And where it says except on Windows, I'm not sure if that means on the new terminal,

27:13 Windows terminal that they came out with that's much closer to what we have over on Mac and Linux,

27:18 or if it just means it doesn't work on Windows at all.

27:21 I suspect it might work on the new Windows terminal that just went 1.0, but certainly not on CMD.

27:27 Okay.

27:28 Who uses CMD?

27:29 Okay.

27:30 I mean, that's what comes with Windows.

27:32 If you don't like go out of the way to get something, right?

27:35 Like Cmder or the new terminal or something like that.

27:37 Yeah, but you gotta install Git at least on Windows anyway, and then you got Bash.

27:41 Yeah, that's true.

27:42 It comes with it.

27:42 That's true.

27:44 So like all good things that have actions and behaviors and are visual, there's a nice little animation even on the PyPI.org page.

27:54 So if you go there, you can watch and actually see the stuff scrolling by.

27:57 It's like an animated GIF right on the PyPI page.

28:00 So very, very nice.

28:01 Well done.

28:01 You know, the multicolored progress bars, it does seem pretty awesome if that's your use case.

28:05 I want rainbow ones.

28:07 I want to do a rainbow.

28:08 Yes.

28:08 Maybe with like little unicorns just shooting out of it and just like all sorts of crazy.

28:13 Yeah.

28:14 Sounds good.

28:14 Yes.

28:15 Stars and unicorns.

28:16 Yes.

28:17 It would be perfect.

28:17 Yeah.

28:18 Let's have that.

28:19 And people are starting to catch on that we like animations because they'll include it in the suggestion.

28:24 And by the way, it has an animation.

28:26 Good job.

28:26 You can watch it here.

28:27 Yeah.

28:27 Good job.

28:27 Bringing it up.

28:28 Yeah.

28:29 You're part of it.

28:30 Awesome.

28:30 Well, nice work on that progress bar library.

28:33 It seems simple and well done.

28:34 Speaking of unicorns, I want to talk about oceans.

28:37 Wait, unicorns don't live in the ocean.

28:40 Mermaids.

28:41 Let's go with mermaids.

28:42 So I want to talk about Code Ocean.

28:44 So this was contributed by Daniel Mulkey.

28:47 So this is a pretty neat thing.

28:50 So Code Ocean is a paid service, but there's a free tier.

28:55 And it's a research collaboration platform that supports researchers from the beginning of a project through publication.

29:01 So this is kind of this neat thing.

29:04 I'm going to read a little bit from their about page.

29:07 We built a platform that can help give researchers back 20% of the time they spend troubleshooting technology in order to run and reproduce past work before completing new experiments.

29:18 And Code Ocean is an open access platform for code and data where users can develop, share, publish, and download code through a web browser, eliminating the need to install software, blah, blah, blah, blah.

29:30 Anyway, mission is to make research easier.

29:33 So this idea is you can have code snippets like Jupyter and Python and even things like MATLAB and C++ code running with the data in this kind of environment that you can collaborate with other people and just sort of build up these data sets and the science and the code and all bundled together.

29:55 And it's pretty cool.

29:56 It also collaborates with some.

29:59 One of the goals of it is to be able to have all of this reproducible code and data together in a form that's acceptable to journals.

30:09 And one of the reasons why it was contributed is Daniel said that one of the peer-reviewed journals that he reads, it happened to be SPIE's optical engineering journal, recommended this platform for associating code with the article.

30:23 So people trying to do science and be published, associating a Code Ocean space with it is an option.

30:31 That's cool.

30:31 And if it gets accepted by editors as, yeah, that's what you do, then it just makes it easier.

30:37 It's kind of like saying, oh, you have an open source project.

30:39 What's the GitHub URL?

30:41 Right?

30:41 Not is it on GitHub?

30:42 Just where on GitHub is it?

30:44 Yeah.

30:44 I do technically know that there's GitLab and other places, but most of the code lives on GitHub is what I was getting at, right?

30:50 Yeah.

30:52 Cool.

30:52 Well, this looks pretty neat.

30:53 I do think there's a lot of interesting takes on reproducibility in science, and that's definitely a good thing.

31:00 There's this, there's Binder, which is doing a lot of interesting stuff, although not as focused on exact reproducibility, but still nice.

31:10 There's Gigantum, which is also a cool platform for this kind of stuff.

31:14 So there's a lot of options, and it's nice to see more of them like Code Ocean.

31:17 Yeah.

31:17 Nice.

31:18 Well, that's our six.

31:20 Do you have anything extra for us today?

31:22 I'm going to try to connect this to Python because I am excited about it.

31:27 How long has it been since astronauts have been launched into space from NASA and from the U.S.?

31:34 It's been a really long time, ever since the space shuttle got shut down four or five years ago.

31:38 I heard it was like over 10, but I could have heard wrong.

31:42 It's very possible.

31:42 It's been a very long time.

31:44 So today, I know this doesn't help you folks listening because of the time it takes us to get this episode out, but hopefully this went well.

31:51 But I'm super excited for SpaceX's launch in collaboration with NASA to send two astronauts up into space.

31:58 Wow.

31:59 Are those guys brave to get on to one of these rockets?

32:04 But also, I think there's probably somewhere in the mix a lot of Python in action.

32:09 If you go to SpaceX, they had, last time I looked at one random point a couple months ago, they had 92 open positions for Python developers.

32:18 Oh, wow.

32:19 I don't know if that's 92 people they're looking for, but at least there's 92 roles they were trying to fill.

32:25 There could be multiple people into any one role.

32:27 So that's a lot of Python.

32:28 And so somehow in this launch, there's got to be some interesting stories around Python.

32:33 And this is mostly to say, one, it's awesome that SpaceX and NASA are doing this.

32:38 Hopefully this goes well.

32:40 Lots and lots of luck to that.

32:41 But also, if anyone knows how to connect us with the people inside SpaceX doing awesome rocket stuff with Python, those would make great stories.

32:51 We would love to hear about those and introductions and whatnot.

32:54 Yeah, that would be cool.

32:55 I'd love to hear more about that.

32:56 Yeah.

32:57 I do hope it goes well.

32:58 And I heard that possibly there was weather problems that might crop up, but we'll...

33:04 Well, maybe people will get to watch this.

33:06 Knock on wood.

33:07 Maybe it'll be delayed for a week.

33:09 We'll see.

33:09 Awesome.

33:11 How about you?

33:11 What do you got?

33:11 I just downloaded 3.9.0 beta 1.

33:14 So Python 3.9 beta 1 is available for testing.

33:18 If you are maintaining a package, or otherwise maintaining your application, you probably ought to download it

33:24 and make sure your stuff works with 3.9.

33:27 Oh, yeah.

33:27 That's cool.

33:28 And because it's a beta now, it should be frozen in terms of features and APIs and stuff, right?

33:33 It's no longer changing.

33:35 So it's now time to start making sure your stuff works and yelling if it doesn't.

33:40 Yeah.

33:40 Right.

33:41 And another reason to download it is the prompt, venv with the prompt with the magic dot that turns into your directory name.

33:49 Yeah.

33:50 That is in 3.9.
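On the command line this is `python -m venv --prompt . venv`. The same behavior can be seen from Python through the standard library's `venv.EnvBuilder`, sketched below. The version check is there because older interpreters keep the literal dot.

```python
import os
import sys
import venv

# EnvBuilder is the stdlib class behind `python -m venv`.
builder = venv.EnvBuilder(prompt=".")

if sys.version_info >= (3, 9):
    # On 3.9+ the "." is replaced with the name of the current directory.
    print(builder.prompt == os.path.basename(os.getcwd()))
else:
    # Older versions keep the literal "." as the prompt.
    print(builder.prompt)
```

Nothing is created on disk here. Calling `builder.create("venv")` would actually build the environment.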

33:51 Yeah.

33:51 Super cool.

33:52 Awesome.

33:53 Well, that's not very funny, but I could tell you something that is.

33:56 And it's very relevant to your item here, actually.

33:58 Okay.

33:58 You ready for this?

33:59 So open up this link here and I'll put the link in the show notes because this is a visual.

34:04 I got to describe it to you.

34:05 So this was sent over by Stephen Howell.

34:06 Thank you for that.

34:08 And this would be better during Halloween, but Halloween is far away.

34:12 So we're going to do it this way.

34:13 So there's a person standing around and there's a ghost standing behind them, right?

34:18 Yeah.

34:18 And the ghost says, boo.

34:19 Person doesn't react.

34:21 Boo.

34:21 Person doesn't react.

34:22 Boo.

34:23 Person doesn't react.

34:25 The ghost says, Python 2.7.

34:27 Ah!

34:28 The person runs away.

34:29 Yeah, this is great.

34:33 It's good, right?

34:33 Yeah.

34:34 You got it as well.

34:35 What do you got here?

34:35 Well, I'm going to get haters for this, but I'm going to say it anyway.

34:38 So somebody named Bert sent us a meta joke because we have used PyJokes before.

34:44 We love PyJokes.

34:45 And I'm going to modify it a little bit.

34:47 So what does PyJokes have in common with Java?

34:50 It gets updated all the time, but never gets any better.

34:53 That's pretty funny.

34:56 I don't even really use Java, but I have a Java tool on my desktop.

35:00 And so I get like, Java's updated.

35:02 Do you want to do the update?

35:03 All the time.

35:04 Make it stop.

35:06 Yeah.

35:06 Make it stop.

35:07 Yeah, that's funny.

35:08 Yeah, PyJokes is good.

35:09 If you all need some programming jokes, just pip install --user pyjokes.

35:13 And then you can type pyjoke anytime you want.

35:15 Yeah.

35:16 I had to change it because the original joke was like about Flash, Adobe Flash.

35:21 And who has that?

35:22 Is that even a thing anymore?

35:23 Yeah.

35:24 I don't even think it gets updated anymore.

35:26 I don't know.

35:26 Maybe it does.

35:27 I sure hope it's not on my computer.

35:28 Yeah.

35:29 It's a security hole.

35:31 Yeah, it totally is.

35:33 Awesome.

35:33 All right.

35:34 Well, very funny.

35:34 All right.

35:35 Thank you.

35:35 Yep.

35:36 You bet.

35:36 Bye-bye.

35:37 Thank you for listening to Python Bytes.

35:38 Follow the show on Twitter @pythonbytes.

35:41 That's Python Bytes as in B-Y-T-E-S.

35:44 And get the full show notes at pythonbytes.fm.

35:47 If you have a news item you want featured, just visit pythonbytes.fm and send it our way.

35:52 We're always on the lookout for sharing something cool.

35:54 This is Brian Okken.

35:55 And on behalf of myself and Michael Kennedy, thank you for listening and sharing this podcast

35:59 with your friends and colleagues.
