Transcript #249: All of Linux as a Python API
00:00 Hey there, thanks for listening. Before we jump into this episode, I just want to remind you
00:03 that this episode is brought to you by us over at Talk Python Training and Brian through his pytest
00:09 book. So if you want to get hands-on and learn something with Python, be sure to consider our
00:14 courses over at Talk Python Training. Visit them via pythonbytes.fm/courses. And if you're
00:21 looking to do testing and get better with pytest, check out Brian's book at pythonbytes.fm/pytest.
00:27 Enjoy the episode. Hello and welcome to Python Bytes, where we deliver news and headlines
00:32 directly to your earbuds. This is episode 249, recorded September 8th, 2021. And I am Brian
00:40 Okken. Hey, I'm Michael Kennedy. And I am Eric Costanz. Hey, Eric, thanks for joining us today.
00:45 Yeah, thank you so much for having me. So tell us a little bit about who you are.
00:49 So first of all, I'm a longtime listener to the show. I just told Michael that I've been listening since
00:54 episode one of this podcast, actually. Also listening to Michael's podcast, obviously. And
00:59 then once I got to know it, I started listening to your podcast as well. So basically everything
01:06 that's out there, I'm listening. What I'm doing, I'm currently leading the competence center for
01:11 AI and data science at Data Drivers, which is a consultancy firm from Hamburg, Germany. Our focus
01:17 is mainly on building big data platforms and applications, mostly using cloud native services.
01:22 And we try to apply best DevOps and MLOps practices to wherever we are.
01:29 That's super cool. Do you have a favorite cloud?
01:31 In all honesty, probably Google Cloud. Gotta say it.
01:35 Yeah. Yeah, nice.
01:36 Well, Michael, why don't you kick us off with our first item?
01:40 Yeah. This one's a little fickle. Comes to us from Ollie. He sent that in. Thank you, Ollie.
01:44 And sort of indirectly from Patrick Gray over at Risky Business, which is a cool security focused
01:50 podcast. Python supports security. They talk about it over there. So you've heard of pickles,
01:56 obviously pickling in Python. It's like, I want to take this binary, this binary Python object graph
02:03 and turn it into a blob that I can stash away and then later get it back.
02:07 Sometimes it's real simple. Stash it in Redis and other systems can pull it out real quick as a cache,
02:12 maybe save it to a file. But where it's become really popular as a means of data exchange is actually
02:18 in machine learning.
02:20 So the people who built this thing I'm going to tell you about really built it around
02:25 the machine learning use case, because people are handing around these models,
02:29 these pre-trained models, and it's like, here's the model, load it up and roll. And load it up and roll
02:34 may mean you have an amazing artificial intelligence that drives a car, or it may mean that you have a
02:39 virus because pickles can contain all sorts of bad things. All right. So this thing I'm going to tell
02:44 you about is called fickling like pickling. It's a decompiler, a static analyzer and a bytecode
02:51 rewriter for Python pickle object serializations. So you take these pickle files, these object graphs of
02:57 Python things, and you can pull them apart and look at them. You can ask questions like,
03:02 is it a virus? And you can even say things like, let's put a virus in it. So all of these are possible
03:09 with this tool. And it's made by a security pen testing company called Trail of Bits for basically
03:15 that purpose. Right. So it's kind of either side, the attacking pen testing side or the defensive side
03:22 of the story. So it works on Python 3.6 and above, and you can see it's super simple. You say, you
03:29 basically do pickle stuff and you say from fickling.pickle import Pickled. And then you can kind of,
03:35 as if you would use the dis module to disassemble Python code, you can do that with this Pickled class.
03:43 And it'll print out something that's kind of like an abstract syntax tree of the pickle. And they've got a
03:48 real simple example on the GitHub repo. It's like a list of four numbers, one, two, three, four. And then
03:53 it just shows you, look, we're assigning the results of creating a list and setting these constants in it.
03:59 Another thing that is nice about this is it's not specifically built for Python developers.
04:04 So it's also kind of something you can integrate into other tooling and say continuous integration
04:10 and stuff like that. So you can run it off the command line as well. You can just on the, you know,
04:15 terminal type fickling and give it the data, and then out comes an answer. The one that people might
04:21 want to do is --check-safety. And that will try to look and see if it's doing bad things,
04:27 like, for example, talking to os.system or doing other malicious stuff like that. So that's good,
04:34 but I wouldn't trust that entirely. Like how well is it checking, right? If you, for example, were to encode
04:39 Python code and then decode it and then take that decoded stuff and it did os something, right? Feed that to
04:46 eval or whatever. There's all sorts of layers here, right? So it can check for obvious things, but you know,
04:51 it's not like an absolute guarantee. And then finally, you can inject arbitrary Python code that will run on
04:58 unpickling into an existing pickle file with --inject.
05:03 Seems fine, right? Everything's fine.
05:05 That's the fun part.
05:06 Yeah. So if there's no malicious code present, here you go.
05:10 Yeah, exactly. So maybe I'm imagining something like a little thing that counts that like prints out in,
05:17 in like flashing bright colors. We told you you shouldn't unpickle untrusted data.
05:22 Don't do it.
05:24 It's like a little bit of a "beginning hard drive format." It has a loud beeping sound, like three, two, one. Obviously don't really do it, but that would get your attention, right? That'd be a mean trick. But absolutely.
05:35 This is interesting. And you know, I didn't really put it together with the ML data exchange, model exchange story until I heard the folks talking about it over on Risky Business. So it seems like, especially in the ML story, you want to have a look at these kinds of things.
05:51 Yeah. So I've heard about the use case before, actually, but I didn't know that somebody would, would solve it in this way. So pretty nice.
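The risk described above, that a pickle can name arbitrary callables to run on load, can be seen without unpickling anything. What follows is a minimal sketch using only the standard library's pickletools, not Fickling's own analysis. The class Suspicious and the helper imported_globals are hypothetical names, and the check assumes pickle protocol 3, where imports appear as a single GLOBAL opcode (newer protocols use STACK_GLOBAL instead).

```python
import pickle
import pickletools
from typing import List


class Suspicious:
    """Hypothetical object whose pickle asks the loader to call a function."""

    def __reduce__(self):
        # print is a harmless stand-in; an attacker would name something
        # like os.system here instead.
        return (print, ("this would run during unpickling",))


def imported_globals(data: bytes) -> List[str]:
    """List the module.name pairs a pickle would import, without loading it."""
    found = []
    for opcode, arg, _position in pickletools.genops(data):
        if opcode.name == "GLOBAL":  # protocol <= 3 style import opcode
            found.append(arg.replace(" ", "."))
    return found


safe = pickle.dumps([1, 2, 3, 4], protocol=3)
risky = pickle.dumps(Suspicious(), protocol=3)

print(imported_globals(safe))   # []
print(imported_globals(risky))  # ['builtins.print']
```

A scan like this only catches direct imports. As noted in the discussion, code that is encoded and later fed to eval would need deeper analysis.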
05:59 Yeah. I mean, Eric, this is sort of your world, right? The machine learning stuff. So how does this sit with you? What do you think?
06:07 Yeah. So it comes up all the time that you pick up some random model that someone has built. So as security issues become more prevalent, this might be a thing.
06:18 Yeah. Well, are there better ways to store it? Like JSON or something else?
06:23 Models don't have to exist that way, do they?
06:26 Yeah. I mean, even if there were there, there are some projects that focus on building like some reusable interface across all these different frameworks and stuff. But in reality, people just use pickle.
06:38 Yeah, really? Yeah, they do.
06:40 I just didn't know anybody was really using it for much.
06:43 No, it's absolutely common. So within, say, scikit-learn, which is probably the most used library ever, you just use pickle on the build, store your files.
06:54 Yeah. All right. Well, cool. So this is a useful library from Trail of Bits. People can check it out, and we're going to start with everything is fine and we'll end with everything is fine as well, Brian. But over to you.
07:04 Okay. Well, this is something it's a blast from the past a little bit, about a year ago. Anyway, I want to talk about virtual environments and directories.
07:16 So there's an article from Hynek that's called Python Project-Local Virtualenv Management. That's a mouthful.
07:28 But the idea and we've talked about wanting this before is to be able to still want it. Yeah. So just to go if I've got several projects going on, whenever I like CD into the into a directory with this project, I just want the virtual environment to activate automatically.
07:47 And then when I leave it and go to another one, it's just automatically switched. Apparently that already works and we've already covered it, but I missed it.
07:55 So actually in episode 185, you brought up direnv, and part of it is the ability to have per-project isolated development environments.
08:09 Yes.
08:10 But I didn't pick that up yet, but Hynek just said, this is how you do it. And how you do it really is just, you have to install direnv first.
08:23 And then you put a .envrc file in a directory and say layout python and then what Python version. So like layout python python3.9. And then that's it. That's all you got to do.
08:38 And I, I'm like, that can't be that easy. And it was, I did it this morning and it's like, man, this is great. So on my Mac, it's all solved.
08:47 But it doesn't work on Windows. So, oh, well.
08:52 Unless you use a Linux subsystem for Windows, or Windows Subsystem for Linux, WSL, I guess it is.
09:00 Oh, okay.
09:00 I mean, that sort of semi solves it.
09:03 Yeah. Yeah. So I probably have this need more within Windows than I have on my Mac, but I have it in both places.
09:12 So I'm, I'm going to start using it. It's great.
09:16 Plus like you covered last time you can also have a bonus. You can put environmental variables in there too. So that in the project, you've got your, like your, perhaps your secrets or, or just different environmental settings you want to use.
09:31 Yeah. I think people will look in your .rc, whatever, your .bashrc, .zshrc, whatever files for your secrets. But I suspect they're much less likely to go hunting through virtual environments and looking for their activate scripts to see what's in them.
09:46 Yeah.
09:47 People, people know, but fewer people know that stuff gets stashed in there. So that's probably good.
09:52 Right. So, I guess mainly the story is, I knew that you could do it, but I didn't realize how easy it was. So this is super simple. It just took a little bit. And then my second thought was, it's not that hard to create virtual environments though.
10:06 So is this saving any time? I still got to create this file and put this stuff in it. It actually is a little bit more typing, but it didn't take me long to realize that when you're switching between different directories, you save a ton of time. So.
10:20 Yeah. So going back and forth between projects, right?
10:23 Yeah. So that's it really just kind of neat.
10:27 Yeah. Brett out in the live stream has got a comment for us. If you use pyenv, you can run pyenv local env name in your project folder and get this behavior as well. How do you do that? How do you get it to activate by just changing directory into it is what I'm not totally sure of. Yeah.
10:43 Yeah.
10:44 I think you get the Python version that way, right? But not the actual virtual environment.
10:48 Yeah. Possibly if you've installed Python through pyenv as well. Yeah. And then David has a comment back on
10:57 the first topic out there in the live stream. Hey David. The irony of legacy object serialization being used on cutting edge machine learning.
11:03 Like that one.
11:05 Yeah. And then Teddy in the live stream. Hey Teddy. He says, does it work with an IDE?
11:09 It changes the interpreter based on the folder you're in within a workspace in VS Code, for example. That I don't know,
11:16 but I was going to add the personal comment that I don't need this nearly as much as I felt like I used to,
11:23 because the way I jump between projects is usually jump, open them up in PyCharm and jump between them there.
11:29 And that always activates.
11:30 If you go to the terminal in PyCharm, it activates that environment for that project.
11:34 I don't know.
11:35 I'm on the command line all the time.
11:37 So definitely.
11:38 Yeah.
11:38 If you're on the command line and bounce around a lot, then that's useful. Then both Brett and Alvaro have a follow up.
11:45 pyenv adds a shim that intercepts the calls to Python.
11:48 So yeah, very good.
11:49 So it must be that you have to install Python through pyenv, but then it'll also do this.
11:53 Very cool.
11:54 Good to know.
11:54 I didn't know that.
11:55 Me too.
11:56 Yeah.
11:56 Nice.
11:57 All right.
11:57 Eric, first one is for you.
11:59 Yeah.
12:00 So I brought with me the testcontainers-python library, which, and let me quote this one from the description,
12:09 because I think it's a pretty good summarization.
12:11 So testcontainers-python is a port of testcontainers-java that allows using Docker containers for functional and integration testing.
12:20 It provides capabilities to spin up Docker containers, such as databases, Selenium web browsers, and any other containers for testing.
12:27 So maybe not that many new things in here, but we use this in a project lately.
12:35 And especially we use this in integration pipelines using cloud native services.
12:42 So there's a container for Google Cloud Pub/Sub, for example, which is pretty amazing.
12:46 Also for like your Kafka.
12:48 This is originally a Java project.
12:50 So there's still a lot to do for the Python community in order to catch up on a bunch of interfaces that need to be implemented and stuff.
12:57 One example, it is here.
13:03 Let me just show you that one.
13:05 So there's in the repo, you can find an example of how to use this within your CI pipeline.
13:15 So what's happening here is actually that if you have like a standard CI pipeline for your integration tests, which consists of Docker containers, then we use Docker-in-Docker to actually run the integration tests.
13:27 So all your standard 2021 stuff in here, I guess.
13:31 Yeah, this is super cool.
13:33 And the way you do it is just create a context manager, right?
13:36 Exactly.
13:37 You just say something like with MySqlContainer, here's a connection string.
13:40 And then you can just do your normal database stuff over to it.
13:45 Yeah.
13:45 So it integrates perfectly fine with pytest.
13:47 We did that a lot.
13:50 And so, yeah, the syntax is pretty cool.
13:52 It's super easy to use.
13:53 The integration with CI/CD works fine.
13:56 So, yeah.
13:57 Brian, we could use this with a test fixture and a little yield action, something like that.
14:03 Yeah.
14:03 Yeah.
14:04 I can't wait to try to play with something like this.
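The "context manager plus a fixture with a little yield" idea can be sketched without Docker. The example below is an illustration only: it uses an in-memory SQLite database as a stand-in for a container, and temporary_database, db_fixture, and count_users are hypothetical names. With testcontainers, the with block would wrap one of the library's container classes, and in pytest the generator would carry the @pytest.fixture decorator.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def temporary_database():
    """Stand-in for a container: set up on enter, tear down on exit."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE users (name TEXT)")
        yield conn
    finally:
        conn.close()  # teardown runs even if the test body fails


def db_fixture():
    """Shape of a yield-style fixture: everything after yield is cleanup."""
    with temporary_database() as conn:
        yield conn


def count_users(conn) -> int:
    """The code under test talks to a real database engine, not a mock."""
    return conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
```

The design point is that the resource's lifetime is tied to the test: the database exists only while the fixture is suspended at its yield.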
14:07 Yeah.
14:08 We talked about this way long ago.
14:10 I brought this up, I believe.
14:11 But I'm glad you brought it back, Eric, because it's really useful and it's really neat.
14:15 And there's more stuff than actually is listed on the readme for some reason.
14:19 Exactly.
14:20 Like, if you flip through the actual documentation, you can see that there's other containers, right?
14:27 For example, I believe there's a MongoDB one, for example, but that's not listed in the documentation.
14:32 And then the cloud emulators are probably neat for you for testing there, right?
14:36 Yeah.
14:37 I mean, that's one of the things that I find off-putting from like cloud native type stuff is if you don't have access to the cloud, you're dead in the water, right?
14:46 And that can be a problem for continuous integration and for all sorts of things.
14:50 So things like this are pretty neat.
14:51 It's definitely challenging.
14:53 So stuff like this helps.
14:54 Yeah.
14:54 You know, to me, it's an interesting trade-off because on one hand, sure, you can mock out your database and then just test against your test data.
15:03 But then if your data model and the database changes, but you don't think to update the test data, well, then your code's going to, like SQLAlchemy, for example, will freak out and crash if the schema is not a perfect match.
15:15 Whereas you wouldn't find that in testing if you weren't letting it talk a little bit to the database.
15:20 I think there's just interesting things like this.
15:22 Brian, you even had an episode about not mocking out your database, didn't you?
15:26 Yeah.
15:27 I think as little as you can, I guess, let's do it the reverse.
15:34 As close as you can have to the real environment, the better.
15:36 And this is when people are deploying on containers.
15:39 Testing with containers makes total sense.
15:41 Yeah.
15:42 Absolutely.
15:43 Absolutely.
15:43 All right.
15:44 Want to talk a little more infrastructure?
15:45 Yeah.
15:47 All right.
15:47 So I have the one, it's got to be the shortest named thing for a featured item.
15:53 jc.
15:54 Two letters.
15:55 jc.
15:55 So jc comes to us from Garrett.
15:58 Thank you, Garrett, for sending that in.
15:59 And at first I was like, I don't know if this is relevant to me or if this is interesting.
16:03 But the more I looked at it, I'm like, yeah, this is actually pretty awesome.
16:06 To me, let me, I'll read what jc describes itself as in a moment.
16:10 But to me, what this is, is it is basically what web scraping is to the web,
16:16 jc is to Linux.
16:18 So there's not a nice API for it, but I'd like to somehow wrap a little Python magic around it and then have an API for it.
16:26 Okay.
16:26 So its official story is it's a CLI tool and Python library that converts the output of popular command line tools and file types to JSON.
16:34 And it allows piping one thing to the next, obviously, because it's Linux like.
16:39 So the idea is, you know, the example they have on their site there is dig.
16:43 So dig is a command that'll give you information about a domain.
16:48 So you could do something like dig example.com pipe jc, and then you tell jc what it's expecting output from, just whatever the print output to the terminal is in dig.
17:00 And it will parse that and turn it into a Python dictionary.
17:04 Right.
17:04 So I could sub process run dig, but then I just get a huge blob of text and I've got to basically go through it, try to understand it and so on.
17:14 And this knows the exact format and turns it into like structured data.
17:18 So think of all of these different Linux commands you may run.
17:22 You find a whole bunch of them.
17:23 They're like a huge list down here.
17:24 So airport, arp, crontab, date, CSV, free, du, hash, history, hosts, ifconfig, netstat, all those types of commands, systemctl.
17:37 So for example, if you're automating daemons and stuff like that, you can now do that from Python.
17:42 And then instead of getting just a text blob and an exit code, you get a dictionary back that you can then check out and program against.
17:49 What do you think?
17:50 Oh, that's pretty cool.
17:51 Yeah.
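To make "text blob in, dictionary out" concrete, here is a hand-rolled sketch for one simple format, the hosts file. This is not jc's parser or its API. The function parse_hosts and the sample text are made up for illustration, and the output shape (an ip plus a list of hostnames) is just one reasonable choice.

```python
from typing import Dict, List


def parse_hosts(text: str) -> List[Dict[str, object]]:
    """Turn hosts-file style text into a list of dictionaries."""
    entries = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        ip, *names = line.split()
        entries.append({"ip": ip, "hostname": names})
    return entries


sample = """# local entries
127.0.0.1   localhost
192.168.1.10 build-box ci   # office machine
"""

for entry in parse_hosts(sample):
    print(entry["ip"], entry["hostname"])
```

The same function could be fed the stdout of a subprocess call. The appeal of a tool like the one discussed is that someone else has already written and maintained this kind of parser for many commands.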
17:52 Yeah.
17:53 There's a bunch of built ins.
17:55 Hopefully the thing you're looking for is one of these.
18:01 Yeah, exactly.
18:01 I suspect it's not extraordinarily hard to do to add another one.
18:07 Yeah.
18:07 Yeah.
18:08 But you can also run it on the command line.
18:10 You don't have to use it in Python, which is what I was scrolling around looking for.
18:14 So if you want to, like, let's suppose I want to go and run dig and I just want to go to the answers and get the data, which would be the IP address of some domain.
18:25 You can say jc, run this thing, and then jq -r, or there's like a way to just pass over a string.
18:34 And basically the string you pass in is the object dereferencing, the traversal of the dictionary.
18:41 So dot, bracket, dot answer, bracket, dot data, that is .[].answer[].data, and it'll go and pull that all apart, which is pretty neat.
18:47 So it's got a cool command line, terminal automation aspect, just like Fickling.
18:53 This is a nice wizard effect so that if you know how to do this well and people come over and watch you do this, they will be amazed.
19:00 Yeah, just make sure you spin up your third or fourth terminal while you do that.
19:05 Yeah, yeah, yeah.
19:06 Exactly.
19:07 Eric, what do you think?
19:08 Yeah.
19:08 So it sounds like I found something that I can put my usual Sunday afternoon time into.
19:15 So I'll play around with it.
19:17 Yeah, yeah, yeah.
19:18 Exactly.
19:19 Yeah, because every now and then I want to do some subprocess thing and it needs to call some kind of Linux command.
19:26 I'm like, what am I going to do?
19:27 Am I just going to check the status code, the return code and hope it works and then just say it didn't work if it didn't work?
19:33 Or, you know, you could do so much more with this.
19:34 Sorry, Brian.
19:35 Well, there's some stuff that's less Unix-y that other people might need.
19:41 Like you can parse pip list and pip show and YAML and XML with this as well.
19:51 So that's pretty cool.
19:52 Yeah, yeah, very cool.
19:54 All right.
19:54 How about some ellipses or I don't know how else to say it.
19:59 Dot, dot, dot.
20:00 The next thing.
20:01 Do say more.
20:02 So this was a surprise to me.
20:08 I guess I haven't run into this yet.
20:10 Or maybe just I forgot.
20:12 But Python has an ellipsis, and it has the built-in constant Ellipsis.
20:17 Ellipses?
20:18 This is ellipses?
20:19 Ellipses.
20:20 Ellipsi.
20:20 Ellipsi.
20:21 Keep going.
20:23 And it's an actual object within Python.
20:25 Who knew?
20:26 And then also you can just do dot, dot, dot.
20:30 And that's a valid thing.
20:32 An identifier.
20:35 So it's a special value.
20:36 But you can use it for all sorts of stuff.
20:40 Like the...
20:42 Oh, by the way, I'm referencing an article called What Is Python's Ellipsis Object? from Florian
20:48 Dahlitz.
20:49 Thanks, Florian, for writing that.
20:50 So it's...
20:52 The Python docs definition really is: it's the same.
20:57 Ellipsis is the same as the ellipsis literal dot, dot, dot.
21:01 It's a special value used mostly in conjunction with extended slicing syntax for user-defined container
21:09 data types.
21:10 I don't know.
21:11 What does that mean?
21:12 I guess Pandas uses it maybe.
21:15 But the article comes up, has some interesting things.
21:19 You can use it in place of pass because it is a valid, has a valid value.
21:24 You can kind of do a dictionary or a function definition.
21:30 And instead of saying pass, just do three dots.
21:34 And that's valid Python.
21:35 I'm kind of liking that.
21:37 I'm sure it's...
21:38 It's cool.
21:38 People will be like, what are you doing?
21:39 But at the same time, it's like, that's really what you wanted to put down there.
21:43 It's like, I just don't want to put anything, but Python won't work unless I kind of close
21:46 this off.
21:46 So here's a pass, right?
21:48 Well, also, one of the things I was thinking about is, no, I would probably use pass all
21:52 the time in that case.
21:54 But when writing documentation and you really want to have a working code example, but you
22:00 want to just indicate there's going to be more code there, that's a cool thing to put
22:04 in.
22:04 Anyway, so there's that.
22:07 And then there's also using it in type information.
22:10 So with type information, for instance, apparently, like, let's say I've got a function that returns
22:16 a tuple or tuple.
22:18 I've got these words today.
22:20 Anyway, a tuple with two integers, you can just say a tuple with two int, but if you don't
22:26 know how many integers are going to be there, you can do the three dots.
22:29 And apparently that works with typing.
22:32 That's neat.
22:34 That's very neat.
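The two uses just described, three dots as a placeholder body and three dots in a tuple type hint, look like this in practice. This is a small sketch with made-up function names.

```python
from typing import Tuple


def not_written_yet():
    ...  # a valid body, just like pass: the expression is evaluated and dropped


def pair() -> Tuple[int, int]:
    """Exactly two integers."""
    return (1, 2)


def any_length() -> Tuple[int, ...]:
    """Any number of integers, all of the same type."""
    return (1, 2, 3, 4)


print(... is Ellipsis)    # True: the literal and the built-in name are one object
print(not_written_yet())  # None
```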
22:35 There's not a lot.
22:36 Apparently, it's used also within FastAPI and Typer, but it's there.
22:41 And if you want to use to implement a certain feature where that might make sense, it is a
22:47 thing that's available to you.
22:50 Like, maybe you could have an operator, a dot, dot, dot operator on your something.
22:54 So I learned this just the other day from a tweet from Raymond Hettinger, where he was
23:01 asking people, like, how would you do this?
23:03 And he brought up the exact same example using the documentation and the pass or the ellipsis
23:11 instead.
23:11 And I didn't even know that this was a Python object.
23:15 I knew it from the typing.
23:18 So the question is, can you pass this object around?
23:22 Can you return from a function value like dot, dot, dot?
23:26 I imagine.
23:27 I don't know.
23:28 It should work, right?
23:29 It should work.
23:30 Yeah, it should work.
23:30 Yeah.
23:31 Nice.
23:32 I'll try it out while we go on to the next topic.
23:38 Yeah.
23:38 That one surprised me.
23:40 Well done, Florian.
23:41 Yeah.
23:42 So the last one that I brought with me, actually, since I lead the data science and AI team, I got to bring something with me that has to do with it.
23:51 So I brought with me the PyTorch forecasting library.
23:56 So, Michael, you just used this analogy a couple of minutes ago.
24:03 So I'm going to use an analogy now.
24:05 So for me, PyTorch Forecasting looks like this: what fastai does for computer vision and natural language processing,
24:15 it does for time series forecasting.
24:16 Because there was like a lack of deep learning for time series forecasting.
24:25 And actually, I think that PyTorch forecasting is going to close this gap.
24:32 So it comes in with a bunch of important features, actually.
24:37 So it's built on top of PyTorch Lightning, which allows training on CPUs, single and multiple GPUs, basically out of the box.
24:47 So there's been a lot of software engineering involved for the data scientists in the past.
24:52 And this library just makes it pretty simple.
24:57 So you have to work very hard in order to mess things up.
25:02 with this library, I guess.
25:03 So what it also brings is an implementation of a model that is called the temporal fusion transformers.
25:17 So this is from Google Research.
25:19 Actually, there's also a TensorFlow-based implementation.
25:23 I'm going to put the link to the paper in the show notes.
25:27 This is a very interesting model that has performed pretty well on a dozen prominent benchmarks very lately.
25:38 And it has a very huge benefit, which is that it is pretty interpretable.
25:43 So it does actually calculate feature importance for you.
25:48 So this is, in the real world applications, very important.
25:52 Because whenever you stick your data into these models and something good comes out, people will always ask you,
25:59 so what was the important part of the data?
26:01 How does it influence the model and the outcome?
26:05 So temporal fusion transformers, they do this for you.
26:08 Also, the PyTorch forecasting comes with Optuna, which is a popular library for hyperparameter tuning,
26:16 which is also implemented in here.
26:20 Right, there might be.
26:21 So this does multivariate time series, multivariable time series.
26:26 Yeah, so the multivariable part of it is pretty important, actually.
26:31 So go ahead.
26:32 I was going to say, so the hyperparameter tuning might say, this part actually doesn't make any difference in the prediction,
26:37 but this other part does.
26:38 So pay attention to that, right?
26:39 Yeah, absolutely.
26:40 Yeah.
26:41 Yeah, this looks really good.
26:42 So if you want to predict the future about sales, home prices, heart rate, whatever, right?
26:50 It comes up all the time.
26:51 It comes up all the time.
26:52 And I know from a couple of guys who work for the Google Clouds of this world and the AWSs,
27:01 that within these software as a service offerings or these APIs that they provide for,
27:06 let's say, a demand forecast, they use these temporal fusion transformers under the hood.
27:11 So.
27:11 Yeah, this looks great.
27:13 Just spin it up and use it.
27:14 Yeah, great recommendation.
27:15 A follow-up from the previous one, Brian, Will McGugan.
27:18 Hey, Will.
27:18 In the live stream he says the dot, dot, dot
27:21 ellipsis sometimes is used as a sentinel value to mean no value when None is a valid value.
27:26 So, yeah.
27:28 Yeah, and also, yes, you can return it from a function.
27:31 Nice.
27:32 Just fine.
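The sentinel use and the "can you return it" question can both be checked in a few lines. This is a sketch with hypothetical names (update_nickname, echo_ellipsis). The idea is that the default ... means "argument not supplied", which leaves None free to mean "clear the value".

```python
def update_nickname(user: dict, nickname=...):
    """Change a nickname; None is a real value, ... means 'not supplied'."""
    if nickname is ...:
        return user  # caller passed nothing, so leave the record alone
    user["nickname"] = nickname  # None explicitly clears the nickname
    return user


def echo_ellipsis():
    """Ellipsis is an ordinary object, so returning it works."""
    return ...


user = {"name": "Ada", "nickname": "ada"}
print(update_nickname(dict(user)))        # unchanged copy
print(update_nickname(dict(user), None))  # nickname cleared
print(echo_ellipsis())                    # Ellipsis
```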
27:33 And then, let's see, someone out in the live stream asked if it has methods.
27:39 Does it have methods or anything that you can do to it?
27:41 That was Teddy.
27:41 Yes, but only the built-ins, right?
27:44 I don't think it, from object, I don't think it does anything interesting besides just be dot, dot, dot.
27:48 Yeah.
27:49 And then Anderson, hey, Anderson, it's a pity the ecosystem is moving towards PyTorch Lightning.
27:54 The separation of concerns there is not very nice.
27:57 In my opinion, PyTorch Ignite does a better job in that aspect.
28:00 Eric, that's all you.
28:01 Yeah, fair enough.
28:02 Fair enough.
28:03 Still, I mean, one thing that you've got to keep in mind.
28:08 So, speaking of separation of concerns, right?
28:11 There's so many data scientists out there that if you throw like separations of concerns at them,
28:16 they just answer like, yeah, here's my model.
28:18 So, what is separation of concerns in this sense, right?
28:22 So, if this works, if people use it, it's probably good.
28:25 Yeah.
28:25 Cool.
28:26 Brian, extras?
28:27 Extras.
28:28 Oh, I just wanted to bring up that Python 3.10 RC2 is out.
28:34 So, the second release candidate for Python 3.10 is out.
28:38 So, people can play with it.
28:39 Apparently, we're like maybe a month away from getting 3.10.
28:42 So, I'm excited about that.
28:44 Yeah, that's me.
28:45 Very exciting.
28:45 Nice.
28:46 Awesome.
28:47 All right.
28:47 I got a couple to throw out there.
28:49 Really?
28:49 What a surprise.
28:50 Can you imagine?
28:51 What a surprise.
28:52 Can you imagine?
28:52 So, remember we talked about several things.
28:56 I talked about how I turned off all of the tracking stuff and all those things on the website,
29:05 which I think is good because so many people run ad blockers.
29:08 It was like pretty inconsistent data anyway, and inaccurate.
29:11 Then I mentioned goaccess.io.
29:14 I said, that'd be cool.
29:15 Maybe we should apply it.
29:16 I ended up writing a ton of automation to apply this to Python Bytes, Talk Python, Talk Python
29:20 training, all the things.
29:21 And it's pretty cool.
29:22 I built some automation that will download all the Nginx log files, some of which are text, some of which are gzipped, and then run this thing across it
29:30 and it will build like one giant monthly log thing.
29:34 And then goaccess can then turn into nice, beautiful reports.
29:37 So, very excited to have goaccess working well.
29:40 And instead of running on the server, I actually just download and then run it on like a monthly
29:45 report locally, which I think is kind of cool.
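The merge step described here, combining plain-text and gzipped log files into one file for a report tool to read, might look roughly like the following. This is a guess at the shape of such automation, not the script from the show. merge_logs is a hypothetical name, and it assumes UTF-8 logs and that gzipped files end in .gz.

```python
import gzip
from pathlib import Path
from typing import Iterable


def merge_logs(sources: Iterable[Path], destination: Path) -> int:
    """Concatenate plain and gzipped logs into one file; return lines written."""
    lines_written = 0
    with destination.open("w", encoding="utf-8") as out:
        for source in sorted(sources):
            # gzip.open and open share the same text-mode interface
            opener = gzip.open if source.suffix == ".gz" else open
            with opener(source, "rt", encoding="utf-8") as handle:
                for line in handle:
                    out.write(line)
                    lines_written += 1
    return lines_written
```

A call such as merge_logs(Path("logs").glob("access.log*"), Path("monthly.log")) would then produce the single file, assuming the destination lives outside the set of sources.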
29:47 Yeah.
29:48 All right.
29:48 One, we had some feedback about Caffeinate.
29:54 Remember Caffeinate?
29:54 You can type Caffeinate on the macOS terminal and it'll keep your system alive.
29:59 Nathan Henry said, you mentioned over macOS, the Caffeinate tool says you can follow it with
30:08 a long running command to keep awake.
30:11 So, you can say like caffeinate python -c import time, time.sleep, or so, give it some
30:18 kind of, so you could say Caffeinate Python and some script you want to run.
30:21 So, you could reverse it if that script doesn't use keep awake or I think that's what it was.
30:27 Right?
30:28 So, you could apply Caffeinate to your Python code and just say, no, stay awake while you're
30:31 doing this.
30:32 Or you can even apply it to a running process using a PID.
30:35 So, it just stays awake while that process is running then?
30:39 Yeah.
30:39 And then it'll go away.
30:40 Yeah.
30:40 Oh, okay.
30:41 Nice.
30:42 Yeah.
30:42 So, it's like the reverse of what we talked about then.
30:44 Then Sean Tibor from Teaching Python said, isn't this what we were asking for?
30:49 Remember, we were talking about the keyboards?
30:51 Keyboards.
30:52 And here's a Python one.
30:55 This is the M60 mechanical keyboard, the open source, USB and BLE (Bluetooth Low Energy) 5,
31:02 hot-swappable, 60% keyboard powered by Python.
31:06 So, this one comes with Python built in, which is pretty excellent.
31:09 So, if people want to play with that, they definitely can.
31:12 The next one I want to throw out there real quick comes to us from Mark Little, a friend
31:17 of mine here in Portland.
31:18 And basically, the subtitle is that, this is an article from CNBC Finance News, that open
31:25 source is booming.
31:25 So, the headline has to do with MongoDB, but it's more broad.
31:29 So, if people are interested in kind of following up on that, it's kind of cool.
31:32 So, MongoDB surged on Friday, which was last Friday.
31:36 It's now worth as much as IBM paid for Red Hat.
31:40 Databricks raised a private financing round at a $30 billion valuation.
31:45 And just, you know, these are the mega open source companies, but it's pretty interesting.
31:49 To just give you a sense, like, I read this article, I got it.
31:53 It's pretty interesting.
31:53 These numbers kind of just like bounce off me.
31:55 But the one that made it stick for me was MongoDB was a private company for a while.
32:00 Then it became, then it IPO'd, right?
32:02 It had VC money, then it IPO'd.
32:04 Do you have a sense?
32:05 Either of you have a sense for how much it IPO'd for?
32:07 It seemed crazy, right?
32:09 Like, like a $1.2, $1.4 billion.
32:12 MongoDB is worth $30 billion now, right?
32:16 So, even after like the crazy IPO, you know, $1.2 billion to start and now over $30 billion.
32:21 Wow.
32:22 So, that is an insane amount of growth in these.
32:25 And then they talk about Confluent and JFrog and a bunch of others, like Elastic.
32:29 If you kind of want to dig into the business side of open source, that's pretty interesting.
32:34 All right.
32:35 Two more.
32:35 I've been doing a ton of video encoding lately.
32:38 I use FFMPEG for some of the audio processing and other types of things around both the podcasts and the courses.
32:45 So, attribution here.
32:47 This is from Jim Anderson.
32:49 Sent this over.
32:49 Thanks, Jim.
32:50 FFMPEG.wasm.
32:52 So, here's FFMPEG, which is a very popular tool in that world, but as a WebAssembly thing, which is pretty awesome.
32:59 And I'm trying to remember what the name of the library was.
33:04 But over in, we did talk about on Python Bytes, I think with Cecil Phillip on one time.
33:09 Maybe it was even him that brought it up.
33:11 But there's a Python library that will run WebAssemblies.
33:16 So, not run WebAssembly in their browser or put Python in their browser, but reverse it.
33:20 Like, I have a WebAssembly library that does cool stuff.
33:22 Put it in my Python code and run it here.
33:24 So, you could take FFMPEG.wasm and pure Python and have like a no dependency sort of audio video processing tool in Python,
33:33 which I think is pretty cool.
33:34 Cool.
33:34 All right.
33:35 Last one.
33:36 I told you we'd start with everything is fine.
33:38 I'm going to end with everything is fine.
33:39 Credit card stealing backdoored packages found in Python's PyPI library hub.
33:45 What?
33:45 That's not good.
33:46 This is not good.
33:48 This is not good.
33:50 When you hear people talk about remote code execution, that typically is bad.
33:56 Like, I'm on the internet.
33:57 People send me bad stuff.
34:00 Now they have my computer and I don't even necessarily know it.
34:03 So, apparently, in addition to this, these were found and removed.
34:06 It was something, what was it?
34:08 It was something along the lines of Noblesse, N-O-B-L-E-S-S-E, and a couple of variations on that spelling.
34:16 That was the problem.
34:17 So, I'm happy to see I didn't install that.
34:18 But this doesn't make me happy.
34:20 It looks like it's fixed.
34:21 So, the PyPI team also just patched a remote code execution hole in their platform, which
34:26 potentially could have been exploited to hijack the entirety of PyPI.
34:30 That one makes me way more nervous than typosquatting or other weirdness.
34:34 And it was a vulnerability in the way that they were doing GitHub Actions with PyPI, which
34:41 allowed a malicious pull request to execute arbitrary code over there, which is not ideal.
34:47 Nice.
34:48 Yeah.
34:48 But I'm glad to hear that's fixed.
34:49 Anyway, everything's fine.
34:50 Doesn't feel fine.
34:53 No, not at all.
34:55 More like a nightmare, to be honest.
35:00 Yeah, to be honest.
35:01 Eric, anything else you want to share with us?
35:04 No, just thank you guys again for having me on the show.
35:08 Pretty fun.
35:09 And make sure that you guys follow me on Twitter.
35:12 And yeah.
35:14 Awesome.
35:15 We'll put a link in the show notes for your Twitter.
35:18 Now, we are done.
35:19 Are we, Brian?
35:20 No, we need to.
35:21 One thing is missing.
35:22 Yeah, yeah, yeah.
35:23 It's important.
35:23 So this one is more of a, not an ML one, is more of a web API type thing.
35:30 So, so often people will write web APIs and just return some kind of message in a JavaScript
35:36 dictionary that says things like bad response or whatever.
35:39 But you're supposed to use HTTP status codes, right?
35:42 Like if there's a bad request, you should return the status code 400.
35:46 If it's not found as an entity, you should return 404 or whatever.
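The point about status codes can be sketched with just the standard library. This is a minimal example, not tied to any web framework: the `ITEMS` data and the `get_item` handler are hypothetical, and the handler simply returns a status code and a JSON body so the idea stays visible.

```python
import json
from http import HTTPStatus

# Hypothetical in-memory data for the example.
ITEMS = {1: "keyboard"}


def get_item(raw_id: str):
    """Return (status_code, json_body) for a lookup by id.

    The error is signalled through the HTTP status code itself,
    rather than a 200 response with an error message in the body.
    """
    try:
        item_id = int(raw_id)
    except ValueError:
        # Malformed input from the client: 400 Bad Request.
        return HTTPStatus.BAD_REQUEST, json.dumps({"detail": "Bad request"})
    item = ITEMS.get(item_id)
    if item is None:
        # Well-formed request, but no such entity: 404 Not Found.
        return HTTPStatus.NOT_FOUND, json.dumps({"detail": "Not found"})
    return HTTPStatus.OK, json.dumps({"item": item})
```

A client can then branch on the status code first and only read the body for details.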
35:51 So here's like two kids at school exchanging messages and it has server on one of them, client
35:57 on the other, and 200 on the message exchange here.
36:00 And then at the bottom, the one kid that got the message reads the JSON as: status
36:05 code 400, detail bad request.
36:07 He's like, why?
36:07 Why did you do this to me?
36:08 This is good.
36:12 Yeah.
36:13 This is like little Bobby tables.
36:14 Let this be a lesson to you.
36:16 You don't pass messages like that.
36:17 Come on.
36:18 It's so true.
36:19 It's totally true.
36:20 Totally true.
36:22 All right.
36:22 Well, that's it for our jokes and everything, Brian.
36:24 Yeah.
36:25 We'll have another fun Wednesday on Python Bytes.
36:28 Absolutely.
36:29 Thanks, Eric.
36:29 Thanks, Brian.
36:30 Yeah.
36:30 Thanks, Eric, for being here.
36:31 Thanks a lot, guys.
36:32 See you around.
36:33 Bye, all.
36:33 Bye.
36:34 Thanks for listening to Python Bytes.
36:36 Follow the show on Twitter via at Python Bytes.
36:39 That's Python Bytes as in B-Y-T-E-S.
36:42 Get the full show notes over at Pythonbytes.fm.
36:45 If you have a news item we should cover, just visit Pythonbytes.fm and click submit in the
36:50 nav bar.
36:50 We're always on the lookout for sharing something cool.
36:52 If you want to join us for the live recording, just visit the website and click live stream
36:57 to get notified of when our next episode goes live.
37:00 That's usually happening at noon Pacific on Wednesdays over at YouTube.
37:04 On behalf of myself and Brian Okken, this is Michael Kennedy.
37:07 Thank you for listening and sharing this podcast with your friends and colleagues.