Transcript #351: A Python Empire (or MPIRE?)
00:00 Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds.
00:04 This is episode 351, recorded September 5th, and I am Brian Okken.
00:09 And I'm Michael Kennedy.
00:11 And this episode is also sponsored by Sentry. Thank you, Sentry.
00:14 And if you want to reach either of us or the show,
00:19 we're @mkennedy, @brianokken, and @pythonbytes, all at fosstodon.org.
00:25 And if you're listening later and you'd like to join the show live sometime,
00:31 you can join us on YouTube at pythonbytes.fm/live.
00:36 And we'd love to have you here.
00:37 Indeed.
00:38 What do you got for us first, Michael?
00:39 Let's talk about multiprocessing.
00:42 MPIRE: not Pyre, but mpire, right?
00:46 Pyre is the type checker from Meta.
00:48 MPIRE is something entirely different; it's about multiprocessing.
00:53 And so it's a Python package for easy multiprocessing that's faster than the built-in
00:58 multiprocessing, but has a similar API, has really good error reporting and a bunch of
01:04 other types of reporting, like how well did that session go?
01:07 You know, like how well did you utilize the multiprocessing capabilities of your machine
01:13 and so on?
01:13 So yeah, let's dig into it.
01:15 So the name is an acronym for MultiProcessing Is Really Easy, which is not what most people say, right?
01:22 No.
01:23 But it's a package that's faster than multiprocessing in most cases, has more features,
01:29 and is generally more user-friendly than the default multiprocessing package.
01:34 It has APIs like multiprocessing.Pool, but it also has the benefits of things like copy-on-write
01:41 shared objects.
01:42 We're going to come back to that later as well.
01:45 But also the ability to have, like, init and exit, setup and teardown, for your workers, some more
01:52 state that you can use and so on.
01:53 So pretty cool.
01:55 It has a progress bar.
01:57 It has tqdm progress built right into it across the multiple processes.
02:02 So you can say things like, here is some work I want you to do.
02:05 There's a hundred items.
02:06 Split that across five cores.
02:10 And as those different processes complete the work for individual elements, give me a unified
02:15 progress bar, which is pretty awesome, right?
02:18 Yeah.
02:18 Yeah.
02:19 Very cool.
02:19 Yeah.
02:20 Yeah.
02:20 It's got a progress dashboard.
02:22 Actually, I have no idea what that is.
02:23 It has worker insights that you can ask for when it's done.
02:27 Like how well did, you know, how efficient was that multiprocessing story?
02:31 Graceful and user-friendly exception handling.
02:33 It has timeouts.
02:34 So you can say, I would like the execution of the work to not take more than three seconds.
02:41 And actually you can even say things such as if the worker process itself takes 10 seconds
02:47 or more to exit, maybe there's like some, something happening over there.
02:50 That's like a hung connection on a database thing or who knows, right?
02:55 Some network thing.
02:55 You can actually set a different timeout for the process, which is pretty cool.
02:59 It has automatic chunking.
03:01 So instead of saying, I have a hundred things, let's go individually one at a time, hand them
03:06 off to workers.
03:07 It can break them up into bigger blocks, including numpy arrays, which is pretty cool.
03:13 You can set the maximum number of tasks that are allowed to run at any given moment.
03:18 I guess, you know, you can set the workers, but also like, if it does this chunking and
03:21 you can say how many things can run to avoid memory problems.
03:24 You could even say, I want to use five processes, but after,
03:29 you know, 10 bits of work on any given process,
03:33 give me a new worker and shut down the others in case there's like leaky memory or other
03:38 things along those lines.
03:39 You can create a pool of them through a daemon option.
03:42 A whole bunch of stuff.
03:44 It uses dill to serialize across the multiple processes, which is cool because it gives you
03:51 more exotic serialization options for, say, objects that are not picklable: lambdas, functions, and other
04:00 things in IPython and Jupyter notebooks.
04:01 So all that's pretty awesome.
04:03 Right?
04:03 Yeah.
04:04 Yeah.
04:05 So the API is super, super simple: from mpire import WorkerPool, with WorkerPool n_jobs equals
04:11 five, pool.map, some function, some data, go.
04:14 So this n_jobs here tells you how many processes to run, basically. For the progress bar,
04:19 you just set progress_bar equals True.
04:22 That's not too bad.
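The call pattern described here might look like the following minimal sketch. `WorkerPool`, `n_jobs`, `map`, and `progress_bar` are the names mentioned in the episode; `square`, `run`, and the input range are made-up placeholders, and the serial fallback is only there so the snippet still runs where mpire is not installed.

```python
# A sketch of the mpire usage described above, not a definitive recipe.
try:
    from mpire import WorkerPool  # pip install mpire
except ImportError:  # assumption: keep the sketch runnable without the package
    WorkerPool = None


def square(x):
    """Placeholder for 'some function'."""
    return x * x


def run(data, n_jobs=5, progress_bar=False):
    """Map `square` over `data`, in parallel when mpire is available."""
    if WorkerPool is None:
        return [square(x) for x in data]  # serial fallback, same results
    with WorkerPool(n_jobs=n_jobs) as pool:
        # progress_bar=True would show one unified bar across all workers.
        return pool.map(square, data, progress_bar=progress_bar)


if __name__ == "__main__":
    print(run(range(10)))
```

Whether this beats the standard library for a given workload is something to measure rather than assume.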
04:23 Another thing that's cool is you can have shared objects.
04:26 So you can have some shared data that's passed across, basically using shared
04:33 memory,
04:33 I think, is how it works, so that it's more efficient instead of trying to pickle it
04:37 across.
04:38 I think they have to be read only or something.
04:39 And there's a whole bunch about it.
04:40 Oh, interesting.
04:41 But you pass it into the worker pool.
04:43 Okay.
04:44 Yeah.
04:44 You say worker pool, these things, I want you to set them up in a way to be shared.
04:49 And I think, like I said, in a read only way across, across all the processes instead
04:54 of trying to copy them over.
04:55 You have a setup and a teardown thing that you can do to, like, prepare the worker
05:03 when it gets started. You can ask it for the insights, like I said, and then
05:08 benchmarks.
05:09 It shows it's significantly faster.
05:11 Not just compared against multiprocessing, but they say, here's how you do it.
05:16 Here's what happens if you do it in a serial way.
05:17 Here's what multiprocessing and ProcessPoolExecutor look like.
05:21 But it also compares against Joblib, Dask, and Ray.
05:24 Mm.
05:24 And it's, it's pretty much hanging there with the best of them, isn't it?
05:28 Yeah.
05:29 It does a titch faster than Ray everywhere.
05:31 Yeah.
05:32 Just, yeah.
05:33 Just, just a bit.
05:34 One other thing.
05:35 I don't remember where it was, in this huge, long list of things.
05:40 But you can also pin the workers to CPU cores, which can be really valuable when you're thinking
05:47 about taking advantage of, you know, L1, L2 CPU caches.
05:52 So if your processes are bouncing around back and forth, one's working with some data, then
05:56 it switches to another core.
05:58 And then it has to pull new data into the L2 cache from real memory, which is like hundreds
06:05 of times slower than the cache.
06:06 And that's how that slows it down.
06:08 Then they switch back and they keep running.
06:09 So you can say, you know, pin these workers to these CPUs and you know, you've got a better
06:14 chance of them not redoing their cache all the time.
06:17 So that's kind of cool.
06:18 So there's just a bunch of neat little features in here.
06:19 Yeah.
06:20 If you're already using multiprocessing, you might check this out.
06:23 Oh, if you care about performance for real, you know, why are you using multiprocessing if you don't care about performance?
06:30 Well, I mean, you're, you're looking to pull out the final little bits of performance, I
06:36 suppose.
06:36 Yeah.
06:37 Yeah.
06:37 Yeah.
06:37 Right.
06:38 Like these benchmarks are cool, but they're doing, you know, computation on the workers.
06:42 Right.
06:42 Whereas a lot of what you're doing is like talking to queues and talking to networks and talking
06:47 to databases.
06:47 Like it doesn't matter what framework you used to do that as long as you're doing something
06:51 parallel.
06:51 Right.
06:52 Yeah.
06:52 Well, yeah.
06:53 Well, I don't know.
06:54 That's why you have to do your own benchmarks.
06:55 Yeah, for sure.
06:56 And then there's that article over on Medium by the creator as well.
07:01 That gives you a whole lot of background on all this stuff.
07:03 Oh, neat.
07:04 Nice.
07:04 Yeah.
07:05 This is quite a long article.
07:06 And I think it's actually more relevant.
07:08 Like, for example, it's got screenshots where it shows, you know, if you use something like,
07:13 let me read really quickly.
07:14 Ray or Joblib, and you get some kind of exception.
07:19 It just says exception occurred.
07:20 Yikes.
07:21 Whereas with this one, with mpire, you get things like here's the call stack that goes
07:26 back a little more correctly to the right function.
07:28 And here's the arguments that were passed.
07:30 Oh, interesting.
07:31 Over.
07:32 So when it crashes, you know, because you have like five processes, potentially all doing
07:37 stuff.
07:37 One of them crashed.
07:38 Like what data did they have?
07:39 I don't know.
07:40 It's a parallel.
07:41 It's hard.
07:41 Right.
07:42 So having the arguments like these are the ones that cause the error is pretty excellent.
07:46 Yeah.
07:47 Cool.
07:47 Anyway, mpire.
07:49 That's what I got for number one.
07:50 All right.
07:52 Cool.
07:52 I have something else that starts with M.
07:56 MOPUp.
07:58 So MOPUp is something that I learned about from an article by Glyph.
08:05 So let me jump to the article first.
08:07 So Glyph wrote an article saying, get your Mac Python from python.org.
08:12 That's what I already do.
08:13 I've tried all the other stuff and I just like just the downloader from python.org.
08:19 So this article talks about reasons why that's probably what you want.
08:23 And if you're writing a tutorial, that's probably what your users need
08:28 to do, too,
08:29 if they're using a Mac. And I won't go through all the details, but he goes through
08:35 reasons why you probably want this one and not things like, what are the others?
08:42 Homebrew.
08:42 You can brew install your Python, but he doesn't recommend it.
08:46 And you can read all the... pyenv.
08:48 I've tried it.
08:49 It like messes up other stuff for me.
08:52 So I like the downloader from python.org.
08:54 But one of the things that I don't like is that, say, I had Python 3.11.4 installed and
09:01 now Python 3.11.5 is out.
09:03 How do I get that on my computer?
09:05 Do I just reinstall it?
09:06 Yes, you can.
09:07 But Glyph made a new thing called MOPUp.
09:11 So what MOPUp does is, you just pip install mopup.
09:16 And it's like the only thing I install on my global Python versions, like 3.11: pip install.
09:23 I update pip and install this and that's it.
09:26 Everything else goes into a virtual environment.
09:29 Or pipx install this.
09:32 Exactly.
09:32 But MOPUp.
09:34 What's the usage?
09:35 So I just tried it this morning.
09:36 I didn't pass it any flags.
09:38 I just installed it and ran it.
09:40 And it updated me from Python 3.11.4 to Python 3.11.5 without me having to re-download anything
09:48 other than this.
09:49 So I'm going to set up something that goes through.
09:52 I've got a lot of versions on my computer.
09:54 I've got, I think, well, I've got 3.7 through 3.12 installed.
09:58 And I want all of them to be on the latest bug fix release.
10:02 So I'm just going to probably use Brett Cannon's py launcher, the Python Launcher, py, on
10:12 my Mac to go to each of the versions and run MOPUp on all of them to update them.
10:17 So that's what I'd like to do.
10:19 Anyway it's cool.
10:20 I'm really excited about this because this was like the one hole in using
10:25 the python.org installer: how you update it.
10:29 So nice.
10:31 Yep.
10:31 Interesting.
10:32 Interesting.
10:32 I got to admit, I'm still a brew install python3 sort of person.
10:37 Okay.
10:38 And the main drawback that Glyph makes an argument for, which is valid, is you
10:43 don't necessarily control the version of Python that you get.
10:46 Because if you brew install, I don't know, some other, you know, YouTube downloader app or whatever
10:53 rando thing, it might say, well, I need Python 3.12 and you only have 3.8, right?
10:59 And it'll auto-upgrade on you without you knowing.
11:02 But I'm always running the absolute latest Python anyway.
11:05 And so, you know, when those other packages say greater than 3.10, like, I don't care, I already
11:10 have greater than 3.10.
11:11 So I don't know that's the world I'm living in now but that's okay for me.
11:15 Oh okay.
11:16 So yeah, I'm a package maintainer, so I have multiple versions on my box,
11:23 and a lot of people like pyenv for that reason, but I, yeah, I don't.
11:28 But anyway, I've had trouble with pyenv too, especially around the Apple Silicon
11:32 Rosetta compiler mismatch. It just wouldn't install for me. And so, yeah, I think the
11:41 python.org one is a good recommendation, though.
11:43 Okay cool.
11:44 Yep yep.
11:45 All right.
11:46 Before we move on to our next topic, Brian.
11:49 Well, I'd like to thank Sentry for sponsoring this episode of Python Bytes.
11:55 You know Sentry for their error tracking service, but did you know that you can take it all
11:59 the way through your multi-tier and distributed app with their distributed tracing feature?
12:04 How cool is that?
12:05 Distributed tracing is a debugging technique that involves tracking the requests of your
12:10 system, starting from the very beginning, like the user action, all the way to the back-end
12:15 database and third-party services.
12:16 This can help you identify if the cause of an error in one project is due to an
12:24 error in another project.
12:25 That's very useful.
12:27 Every system can benefit from distributed tracing, but it is especially useful for microservices.
12:33 In a microservice architecture, logs won't give you the full picture, so you can't debug every request
12:40 in full by reading the logs, but distributed tracing with a platform like Sentry
12:45 can give you a visual overview of which services were called during the execution of certain requests.
12:50 Aside from debugging and visualizing architecture, distributed tracing also helps you identify
12:56 performance bottlenecks through a visual, like a Gantt chart.
12:59 You can see if a particular span in your stack took longer than expected and how it could be causing slowdowns in other parts of your app.
15:13 So if you have x equals a string, and then I say y equals x, it goes up to that thing
15:19 and says plus plus, you know, plus equals one on the reference count.
15:22 And when y goes away, then it minus minuses it, right?
15:26 When that number hits zero, it gets cleaned up.
15:27 There's also stuff on the object for cycles and garbage collection.
15:31 So there's a lot of stuff that's happening there, right?
15:33 Yeah.
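The reference-count bookkeeping described here can be watched from Python with `sys.getrefcount`. This is a small sketch; `refcount_steps` is a made-up helper name, and the absolute numbers are a CPython implementation detail, so only the differences are meaningful.

```python
import sys


def refcount_steps():
    """Return the refcount of one object before, during, and after aliasing it."""
    x = object()                   # a fresh object that nothing else refers to
    before = sys.getrefcount(x)    # absolute value varies between CPython versions
    y = x                          # a second name for the same object: count goes up by one
    during = sys.getrefcount(x)
    del y                          # dropping that name brings the count back down
    after = sys.getrefcount(x)
    return before, during, after


before, during, after = refcount_steps()
print(during - before, after - before)
```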
15:34 And so what they're doing is they're running a lot of Django for Instagram, which is pretty awesome.
15:41 However, what they're trying to take advantage of is the fact that there's a lot of similar data, similar memory usage.
15:49 When I load up Python, so if I type python in the terminal, and then I open up a new terminal and type python, it's gone through exactly the same startup process, right?
15:58 So it's loaded the same shared libraries or DLLs.
16:01 It's created,
16:02 you know, its negative 255 to 255 flyweight numbers
16:09 that it's going to reuse. When you say the number seven, it doesn't always create a new seven.
16:13 You always have the seven that was created at startup. Exceptions, those kinds of things, right?
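The shared "seven" can be observed directly. A minimal sketch, assuming CPython: as far as I know the preallocated range in current CPython is -5 through 256, which is an implementation detail and not a language guarantee. `same_object` is a made-up helper name.

```python
def same_object(value_text):
    """Build the same integer twice at runtime and report whether both names share one object."""
    a = int(value_text)   # parsed at runtime, so no compile-time constant sharing
    b = int(value_text)
    return a is b


print(same_object("7"))        # small values come from the preallocated cache
print(same_object("100000"))   # larger values are separate objects each time
```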
16:17 Well, if you have a web server that's got 10 or 20 or a hundred worker processes that all went through the same startup for a Python app, you would want to have things like that number seven or some exception type or whatever modules, right?
16:32 Core modules that are loaded.
16:33 You would like to have one copy of those in memory on Linux, and then have a copy-on-write thing for the stuff that actually changes.
16:40 But those other pieces, you want them to stay the same.
16:42 Yeah.
16:43 Yeah.
16:44 Like there's no point in having like a different representation of the number four for every process.
16:49 If there's some way to share that memory that was created at startup.
16:52 And we don't need reference counts updated and all that stuff.
16:55 Exactly, exactly.
16:56 So what they found was, while many of their Python objects are practically or effectively immutable, they didn't actually behave that way over time.
17:07 So they have graphs of private memory and shared memory.
17:10 And what you would hope is that the shared memory stays pretty stable over time, or maybe even grows.
17:16 Maybe you're doing new stuff.
17:17 That's like pulled in similar things, but that's not what happens in practice.
17:20 In current Python,
17:22 the shared memory goes down and down and down, because even though that object, or let's say that flyweight number, got created to be shared, it's still got its reference count number changing.
17:32 So throughout the behavior of one app, it might go, well, four was used for a long time.
17:36 Four was used 300 times here and 280 over there.
17:39 So those are not the same four.
17:41 Cause on the reference count, they have 281 and 301 or whatever it is.
17:45 Right.
17:46 And so that shared memory is falling down because the garbage collector and just the interaction with the ref count are, in very meaningless and small ways, changing pieces of the shared memory to make them fall out of the shared state.
18:02 So this whole PEP, this whole idea, is we're going to make those types of things
18:06 so that their reference count can't change.
18:08 And their GC structures can't change.
18:10 They cannot be changed.
18:11 They're just always set to some magic number for like this thing's reference count is unchanged.
18:17 Right.
18:18 So if you look at, like, the object header, it's got a GC header, reference count, object type, and then the actual data. For the ones that don't change,
18:26 now these new ones can be set
18:28 so even their GC header and the reference counts don't change.
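One way to see the effect from Python, as a sketch: on CPython 3.12 and later, `None` is one of the immortal objects, so creating more references to it should not move the reported count, while an ordinary object still counts every reference. `refcount_delta` is a made-up helper, and the exact value reported for an immortal object is an implementation detail.

```python
import sys


def refcount_delta(obj, copies=1000):
    """How far does the reported refcount move after creating `copies` new references?"""
    before = sys.getrefcount(obj)
    refs = [obj] * copies          # `copies` extra references to the same object
    after = sys.getrefcount(obj)
    del refs
    return after - before


# On 3.12+ the first delta is expected to be 0; on older versions it is `copies`.
print(sys.version_info[:2], refcount_delta(None), refcount_delta(object()))
```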
18:31 Cool.
18:32 Right.
18:32 Yeah.
18:33 Yeah.
18:33 And what that means is if you come down here, it says there's some challenges.
18:38 First, they had to make sure that applications wouldn't crash
18:41 if some objects suddenly had different ref counts.
18:44 Second, it changes the core memory representation of a Python object, which, if you work at the C level, just directly with the memory, you know, pointers to the object,
18:55 that can be tricky.
18:56 And finally, the core implementation relies on adding checks explicitly to the increment and decrement, you know, add ref, remove ref, decrement ref, which are two of the hottest bits of code in all of Python.
19:08 Yeah.
19:09 As in the most performance critical.
19:10 So if you make a change to it, and you make all of Python slower for this, that's bad.
19:15 And they did make Python slower, but only 2%.
19:18 And they believe that the benefits they get are actually worth it.
19:22 Cause you bring in, you know, for like heavy workloads, you get actually better performance.
19:26 So it's a trade off, but there it is.
19:28 I was reading this article,
19:30 and one of the things that confused me was, is this just something internal to Python that's going to happen under the hood, or do I need to change my syntax in any way?
19:40 Yes.
19:41 I was looking for that as well.
19:42 I went and read the PEP, and everything I remember from reading the PEP, and maybe I missed something, but everything I got from the PEP was at the C layer.
19:53 It was, you know, here's the Py immortal, you know, call that you make in the C API.
19:59 So what I would like to see is something where you set a decorator, kind of like a data class.
20:04 Like, this thing is outside of garbage collection.
20:06 This class is out, or, I don't know, some way to say in Python, this thing is immortal for now, at least.
20:14 Yeah.
20:15 But I didn't see it either.
20:16 It also would be good even if we could just do like, like, that would be kind of like a constant then also.
20:22 We could set up some, some constants in your system that, that are immortal or something.
20:28 Yeah.
20:29 Okay.
20:30 Yeah.
20:31 Like the dictionary of a module that loads up, if you're not dynamically changing it, which you almost never do, unless you're like mocking something out, like, let it be, you know, just tell it it's the same.
20:41 Don't reference count it.
20:42 Yeah.
20:43 I'd be curious to see, as they're implementing it, it does seem like parts of the system are going to go a little bit slower, but also parts of it are going to go faster because you don't have to do all that work, but.
20:53 Exactly.
20:54 Right.
20:55 Yeah.
20:56 You don't have to, you don't have to do a lot of stuff.
20:57 Okay.
20:58 Like the garbage collection cycles that happen over time, right?
21:00 Yeah.
21:01 These things will just be excluded from garbage collection entirely.
21:04 So that's cool.
21:05 So they have some graphs of what happened, the before and after, and on the after, in the shared memory...
21:10 Well, sorry, on the before, it went almost to zero.
21:13 Like, it went from, you know, Y axis with no numbers, really high, to Y axis low with no numbers.
21:20 But I don't know exactly what this is.
21:22 Maybe a percent, but like I said, it doesn't really say. But after processing as few as 300 requests, like a tenth of the original shared memory was left.
21:33 And that was it. Now, after, it's 75, 80% still shared, which is pretty excellent.
21:39 Okay, cool.
21:40 But as you said, this is like one of the internal core things from what I can tell.
21:46 Yeah.
21:47 They do say that this is foundational work for the per-interpreter GIL, that's PEP 684, as well as making the global interpreter lock optional in CPython, PEP 703.
22:20 So that's why it's there to support.
22:22 And that's why it's relevant for some of these parallelism PEPs.
22:26 So anyway, pretty cool.
22:28 This is coming in 3.12, I guess.
22:31 Nice. Cool.
22:32 Well, I'd like to talk about something that I don't really think about that much, and
22:37 that is docstrings, or docstring formats.
22:40 And I just ran across this article and I'm covering it partly just as a question to the audience.
22:47 So the article is from Scott Robinson and it's called Common Docstring Formats in Python.
22:54 And docstrings, people forget what they are.
22:58 Like, let's say you have a function called add_numbers or something.
23:01 You can really use any kind of quote, but the first string in a function,
23:06 if it's not assigned to a variable, is the docstring.
23:09 And it's the first element.
23:11 Anyway, the first line is usually
23:16 one line, and then maybe a blank line, and then some other stuff.
23:19 And apparently there are several common formats for this.
23:24 You can also get access to it through the dunder doc attribute of something.
23:29 So if you have a reference to a function, you can say dot dunder doc, and you can see the docstring.
23:37 And a lot of, like, IDEs use this to pop up hints and stuff.
23:42 That's one of the reasons why you want to have at least the first line be an explanation that is good for somebody to see if it pops up on them and stuff.
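As a quick illustration of what is being described, here is a sketch using the `add_numbers` name from the conversation; the function body and wording are made up.

```python
def add_numbers(a, b):
    """Add two numbers and return the sum."""
    return a + b


# The first string literal in the function body is stored on the function object.
print(add_numbers.__doc__)
```

Tools such as `help()` and editor tooltips read the same attribute.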
23:51 So anyway, which formats should this be?
23:54 So it covers a handful of different formats.
23:57 There's a reStructuredText docstring format.
24:02 So you've got all these, like, descriptions of parameters and the types and stuff.
24:08 This is scary looking to me.
24:10 There's the, let's go through, next, Google docstring format.
24:15 This one makes a little more sense, but again, I don't know.
24:18 It talks about the different parameters.
24:21 And if you really have to describe them, this is probably one of my favorites; it looks pretty good.
24:26 Returns:
24:27 what is the information that it returns?
24:30 What are the arguments and why? Some, like, one-liner explanations. Not too bad.
24:36 There's a NumPy/SciPy docstring format.
24:39 This is also pretty clear.
24:40 Maybe a little... let's compare the two.
24:44 I guess this has got an extra line because you're doing the underline line, which is, I guess.
24:50 Okay.
24:51 It looks sort of, I don't know.
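For comparison, here is the same toy function documented in the three styles mentioned. This is a sketch with made-up function names and wording, not a copy of the article's examples.

```python
def add_rest(a, b):
    """Add two numbers.

    :param a: The first number.
    :type a: int
    :param b: The second number.
    :type b: int
    :return: The sum of a and b.
    :rtype: int
    """
    return a + b


def add_google(a, b):
    """Add two numbers.

    Args:
        a (int): The first number.
        b (int): The second number.

    Returns:
        int: The sum of a and b.
    """
    return a + b


def add_numpy(a, b):
    """Add two numbers.

    Parameters
    ----------
    a : int
        The first number.
    b : int
        The second number.

    Returns
    -------
    int
        The sum of a and b.
    """
    return a + b
```

The NumPy style spends an extra line per section on the dashed underline, which is the vertical space being remarked on.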
24:53 This is a lot of space, but I would, I'm just curious if people are really using this.
24:58 Looking at this, I can see the benefit of describing if it's not clear, from the name of your function, describing stuff.
25:08 And I also like type hints.
25:09 So this seems like a great argument for type hints, because the types,
25:13 the types would be great
25:14 just right in the...
25:15 Right.
25:16 Right.
25:16 Exactly.
25:17 The parameters.
25:18 And then if you don't have to describe the type, maybe just have variable names that are more clear.
25:25 So my personal preference really is: use type hints, and then also have a description.
25:34 If you're going to do a docstring and it's not obvious from the name of the function, then have a description of what the function does.
25:40 And that's it.
25:41 And then if it's unclear what the stuff really is, the behavior of different parameters, then add that.
25:48 But again, I'd love to hear back from people. Go ahead and send me a message at @brianokken at fosstodon.org.
25:57 This worked great last week.
25:59 I got some great feedback.
26:00 And so I'd love to hear what people are doing for their docstring formats.
26:04 Do you use docstring formats, Michael?
26:06 I'm familiar with docstring formats and I've played with them.
26:09 I like the Google one best, I think, but I'm with you.
26:12 Like if you have good variable names, do you need the parameter information?
26:16 If you use type hints, do you need the parameter information to say the type?
26:19 If you have a return declaration with a type, do you need to have the returns?
26:23 The function has a good name, like get_user, you know, arrow, angle bracket, Optional[User], like, oh, well, it returns a user or it returns None.
26:31 How much more do you need to say about what it returns?
26:33 You know?
26:34 Right.
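The point about names and type hints carrying most of the documentation can be sketched like this. `User`, `_USERS`, and the lookup logic are hypothetical stand-ins; only the `get_user` name and its `Optional[User]` return type come from the conversation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class User:
    user_id: int
    name: str


# Hypothetical in-memory "database", for illustration only.
_USERS = {1: User(1, "Ada")}


def get_user(user_id: int) -> Optional[User]:
    """Look up a user by id."""
    return _USERS.get(user_id)


print(get_user(1), get_user(2))
```

The signature already says that the result is either a `User` or `None`, so the docstring can stay at one line.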
26:34 Like there's a lot of it's, it's a little bit of a, a case study and yes, you want to be very thorough, but also good naming goes a really long way to like limit the amount of comments and docs you got to put onto a thing.
26:49 There are times when it makes sense though.
26:50 Like if you're talking about range or
26:55 something like that.
26:56 Is it inclusive of both numbers?
27:00 And if I say one to 10, do I get one, two, three, four up to 10, or do I get one, two, three, four up to nine?
27:05 Right.
27:05 Like those situations where you might need to say the non-inclusive upper bound of the range.
27:12 I don't know, whatever.
27:13 Something like that.
27:14 Right.
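A case like this is where one extra sentence in the docstring earns its place. In this sketch, `count_up` is a made-up wrapper around the built-in `range`, whose upper bound is exclusive.

```python
from typing import List


def count_up(start: int, stop: int) -> List[int]:
    """Return the integers from start up to, but not including, stop."""
    return list(range(start, stop))


print(count_up(1, 10))
```

Neither the name nor the type hints say whether `stop` is included, so the docstring is the only place a reader can find out without running the code.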
27:14 Yeah.
27:15 Yeah.
27:15 I do like an explanation of what's returned, though.
27:19 Often it's not obvious.
27:20 And even if you are doing a type hint and you can get the type of what's returned, what's the meaning of what's returned? If that's not obvious, please put that in a docstring.
27:31 But yeah, anyway, cool.
27:32 I wonder, I don't, I, this is an honest question.
27:35 I have no idea if you express it in any of this documentation or if the editors consume it.
27:41 But what would be really awesome is if there was a way to express all possible exception types in the entire call stack, right?
27:48 Like you could get a value error, you could get a database connection error, or you could get a uniqueness constraint exception with any of those three.
27:57 Then you could have editors where you just hit like alt enter, right?
28:00 The error handling goes, bam, bam, bam.
28:02 Here's the three types of things you might catch.
28:05 Right.
28:05 That would be awesome.
28:06 But I don't know if you can express the possible range of exceptions in there or not.
28:11 Or unless you've, yeah.
28:13 And especially if you're calling any extra functions within, within a function.
28:17 Yeah.
28:18 You don't know if it's going to raise an exception possibly, but.
28:21 Possibly.
28:22 Yeah.
28:23 Anyway, that's something I could see being actually really useful there that you don't express in, like, the type information or the name or any of those things.
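One existing convention for this is a "Raises:" section in a Google-style docstring. The sketch below assumes that style. The names add_user and DuplicateUserError are hypothetical, and whether a given editor reads this section is not something the sketch claims.

```python
class DuplicateUserError(Exception):
    """Hypothetical error standing in for a uniqueness constraint violation."""


def add_user(users: dict[str, str], email: str, name: str) -> None:
    """Add a user keyed by email.

    Raises:
        ValueError: If email is empty.
        DuplicateUserError: If email is already registered.
    """
    if not email:
        raise ValueError("email must not be empty")
    if email in users:
        raise DuplicateUserError(email)
    users[email] = name
```

The section only covers exceptions the author knows about. Anything raised by functions called further down the stack still has to be listed by hand.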
28:28 Yeah.
28:29 Cool.
28:30 Yeah.
28:31 Cool.
28:32 Well, those are our items.
28:33 Michael, do you have any extras for us?
28:34 I have an extra for you in particular.
28:37 How about that?
28:38 Okay.
28:39 Let's start with that one then.
28:40 So last week you asked about GitHub releases.
28:44 Who uses these?
28:45 Should I be bothered?
28:46 Yeah.
28:47 There's this person that seems to be telling everyone on GitHub.
28:49 They should use releases if they're not.
28:51 Do I care?
28:52 And Rhet Turnbull, who's been on Talk Python to talk about building Mac apps with Python,
28:58 said, on the GitHub releases question:
29:01 I use them and I like them so people can subscribe to be notified of new releases.
29:06 I use gh, the GitHub command line.
29:10 gh release create, to create one on the command line every time I push to PyPI.
29:14 I'm sure this can be done as an action, but I don't push that often.
29:18 So it's fine with me.
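A rough sketch of scripting that step from Python, assuming the gh CLI is installed and authenticated. The function names are hypothetical. The gh release create command and its --notes and --generate-notes flags are part of the gh CLI.

```python
import subprocess


def build_release_command(tag: str, notes: str = "") -> list[str]:
    """Return the gh command line that creates a release for the given tag."""
    command = ["gh", "release", "create", tag]
    if notes:
        command += ["--notes", notes]
    else:
        # Let GitHub write the notes from merged changes.
        command.append("--generate-notes")
    return command


def create_release(tag: str, notes: str = "") -> None:
    """Run gh to create the release; raises CalledProcessError on failure."""
    subprocess.run(build_release_command(tag, notes), check=True)
```

Calling create_release("v1.2.3") right after publishing would keep the release in step with the package version.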
29:19 Anyway, there's some feedback for you.
29:21 Thanks.
29:22 Yeah, I actually got quite a few people reaching out and I really appreciate it.
29:27 And it did convince me that I'm going to start trying to figure out using GitHub releases, but I also want to make sure that it's automated as much as possible.
29:38 I don't want to add redundant work just for the heck of it.
29:40 So you're going to set up some automation to go around and tell everyone on GitHub who doesn't have releases going yet?
29:45 No.
29:46 They should do releases.
29:47 Think of all the contributor badges you're going to get.
29:50 Yeah.
29:52 I'm just kidding.
29:53 All right.
29:57 Let's talk about one more thing.
29:58 We've heard about PyPI issues where people are uploading malicious packages, and a lot of times it's crypto kiddies and other idiots who are doing that, or researchers just trying to prove a concept.
30:09 A proof of concept that it can be done.
30:10 But Lazarus hackers, who are, I'm pretty sure,
30:14 Yeah.
30:14 a North Korean state-sponsored hacking group, have uploaded a fake VMware VMConnect library targeting IT professionals.
30:25 So it only had 237 downloads.
30:29 But when you start to think about state actor hacking level of stuff getting installed onto your machine, that's like an at-minimum format
30:38 the OS, maybe just throw it in the trash.
30:41 I don't know.
30:42 It's like a pretty bad level of being infected.
30:44 So I don't know.
30:45 I have no action or further thoughts.
30:47 Just like a, Hey, that's worth checking out.
30:49 Yeah.
30:50 And maybe we do need to care about our, you know, pipeline and whatever.
30:54 But yeah, the supply chain. But we do have the new security person, Mike, that was hired.
31:01 Right.
31:02 So that's excellent.
31:03 Yes.
31:04 Yeah.
31:05 He was in the audience.
31:06 So that was great.
31:07 I believe it was Mike.
31:08 Right.
31:09 Hopefully I got the name, right?
31:10 Yeah.
31:11 Over to you.
31:12 That's what I got.
31:13 It's actually this, I don't know how to pronounce that, JnyJny, on the PyBites Slack.
31:20 We were talking about using TRS-80 computers.
31:26 And I said, Hey, I remember typing in Lunar Lander on my TRS-80 way back when, copied it out of the back of a magazine.
31:35 And he said, Oh, I've got a copy of Lunar Lander that works on Python.
31:41 I'm like, Oh, I want to try it.
31:42 And I still can't, I'm going to get back to him, but I can't get his to work.
31:48 And then I looked around and there was this other cool one.
31:51 Lunar Lander Python.
31:54 I found that it's four years old, and apparently it was done as part of a fundamentals of computing course,
32:01 which is pretty impressive.
32:03 I couldn't get it to work, but their website looks great.
32:06 So they have a website with it.
32:08 It's got, like, screenshots attached to it, and it shows good fonts too.
32:13 Yeah.
32:14 Yeah.
32:15 And it looks exactly like the Lunar Lander that I typed into my TRS-80.
32:18 So I'm pretty excited about that.
32:20 Anyway, but I can't get that to work either.
32:22 So if anybody's got like a Lunar Lander copy or something that works with modern Python, I would love to play with it.
32:32 I also want to like hack with it with my daughter and stuff.
32:35 So anyway, that's the only extra thing I got: bring on the Lunar Lander.
32:40 That's really cool.
32:41 I like it.
32:42 Yeah.
32:43 Mike, Mike Fiedler is here.
32:44 Fiedler is here,
32:45 the security guy, and people are thanking him and stuff for all the security work.
32:50 So just getting started, but yeah, it's, it's not an easy job.
32:53 I'm sure.
32:54 Yeah.
32:55 And we're pretty excited.
32:56 I can't think of a better person to do this job.
32:58 So indeed.
32:59 So shall we play some bingo?
33:01 Sure.
33:02 All right.
33:03 This is our joke.
33:04 Programmer bingo.
33:05 I love it.
33:06 So, you know how bingo works.
33:07 Everybody gets a different card with different options.
33:10 Typically it's numbers, but in this case it's programmer actions or statements you call out or that have happened.
33:16 And as they get called out, you mark them off and whoever completes a row or column or I dunno, something diagonal.
33:23 I don't play that much bingo, but you win.
33:25 Right.
33:26 And so this is a possible programmer bingo card.
33:29 We should come up with one, a whole bunch of them.
33:32 So I'll just read you some of the options out of this card.
33:34 Okay.
33:35 Brian.
33:36 So we've got number one: written code without comments.
33:39 Everybody could check that one off. For all of the C-inspired language
33:43 people: forgot a semicolon at the end of a line.
33:46 That's good.
33:47 I can certainly relate with number three: closed 12 tabs after fixing an issue.
33:51 Oh yeah.
33:52 Oh yeah.
33:53 Also related.
33:54 Number four, 20 warnings, zero errors.
33:57 Works on my machine, man.
33:59 Yeah, exactly.
34:00 Number five is: program
34:02 didn't run on someone else's computer.
34:04 Yeah.
34:05 An instantiation of the works on my machine problem.
34:08 And then number six: to-do list greater than completed tasks.
34:11 Number seven: copied code from Stack Overflow.
34:13 I'm pretty sure we can all check that one off.
34:15 Closed program without saving it.
34:17 Okay.
34:18 Number nine: asked to fix a laptop because you're a programmer.
34:21 I have a problem with my computer.
34:22 Like, please don't, please don't.
34:24 Number 10 turned your bug into a feature.
34:26 11 deleted block of code and regretted it later.
34:29 Finally learned a new programming language, but never used it.
34:32 Hello TypeScript.
34:33 We could come up with so many of these.
34:35 We should totally do that.
34:36 We should do more.
34:37 So good.
34:38 Aren't they?
34:39 You can just go on and on.
34:40 Yeah.
34:41 Yeah.
34:42 You can do a backup copy of your code repository, even though it's on, like, GitHub.
34:47 Yeah.
34:48 Zip is my source control.
34:50 Yeah.
34:51 And then there's usually a free one in the middle that could just be: need
34:57 to update pip.
34:59 That's exactly.
35:01 pip is out of date.
35:02 Yeah.
35:03 Awesome.
35:04 Well, as usual, pleasure to talk with you, Michael, and thank you so much Sentry for sponsoring
35:11 this episode.
35:12 Again, everybody check out Sentry and go to, what was that link again?
35:16 Pythonbytes.fm/sentry.
35:17 Thanks Brian.
35:18 Thank you.
35:19 Bye.
35:20 See y'all.
35:21 Thank you. Bye.