Brought to you by Michael and Brian - take a Talk Python course or get Brian's pytest book


Transcript #249: All of Linux as a Python API

Return to episode page view on github
Recorded on Thursday, Sep 9, 2021.

00:00 Hey there, thanks for listening.

00:01 Before we jump into this episode, I just want to remind you that this episode is brought to you by us over at TalkBython Training and Brian through his pytest book.

00:10 So if you want to get hands on and learn something with Python, be sure to consider our courses over at TalkBython Training.

00:17 Visit them via pythonbytes.fm/courses.

00:21 And if you're looking to do testing and get better with pytest, check out Brian's book at pythonbytes.fm/pytest.

00:28 >> Enjoy the episode.

00:29 >> Hello and welcome to Python Bytes, where we deliver news and headlines directly to your earbuds.

00:34 This is episode 249, recorded September 8th, 2021.

00:39 And I am Brian Okken.

00:41 >> Hey, I'm Michael Kennedy.

00:42 >> And I am Edith Christensen.

00:43 >> Hey, Eric, thanks for joining us today.

00:45 >> Yeah, thank you so much for having me.

00:47 >> So tell us a little bit about who you are.

00:49 >> So first of all, I'm a long-time listener to the show.

00:52 I just told Michael, well, I'm listening since episode one of this podcast, actually.

00:56 >> Wow.

00:57 listening to Michael's podcast, obviously. And then once I get to know it, I started listening to your podcast as well. So basically everything that's out there, I'm listening.

01:07 What I'm doing, I'm currently leading the competence center for AI and data science at Data Drivers, which is a consultancy firm from Hamburg, Germany. Our focus is mainly on building big data platforms and applications, mostly using cloud-native services. And we try to apply best DevOps and MLOps practices to wherever we are.

01:29 That's super cool.

01:30 Do you have a favorite cloud?

01:32 In all honesty, probably Google cloud.

01:35 Gotta say it.

01:36 Yeah.

01:37 Yeah.

01:38 Nice.

01:39 Well, Michael, why don't you kick us off with our first item?

01:40 Yeah.

01:41 This one's a little fickle.

01:42 Comes to us from Ollie.

01:43 He sent that in.

01:44 Thank you, Ollie.

01:45 And sort of indirectly from Patrick Gray over at Risky Business, which is a cool security focused podcast.

01:51 Python supports security.

01:52 they talk about it over there.

01:54 So you've heard of Pickles, obviously, Pickling in Python.

01:58 It's like, I wanna take this binary Python object graph and turn it into a blob that I can stash away and then later get it back.

02:07 Sometimes it's real simple, stash it in Reddit and other systems can pull it out real quick as a cache, maybe save it to a file.

02:14 But where it's become really popular as a means of data exchange is actually in machine learning.

02:20 So the people who built this thing I'm gonna tell you about were really built it around focusing on the machine learning use case because people are handing around these models, these pre-trained models, and like here's the model loaded up in roll and loaded up in roll may mean you have an amazing artificial intelligence that drives a car, or it may mean that you have a virus because pickles can contain all sorts of bad things.

02:43 All right, so this thing I'm gonna tell you about is called Fickling, like pickling.

02:47 It's a decompiler, a static analyzer, and a byte code rewriter for Python pickle object serializations.

02:54 So you take these pickle files, these object graphs of Python things, and you can pull them apart and look at them.

03:01 You can ask questions like, is it a virus?

03:04 And you can even say things like, let's put a virus in it.

03:07 So all of these are possible with this tool.

03:10 And it's made by a security pen testing company called TrailerBits for basically that purpose, right?

03:16 So it's kind of either side, the attacking pen testing side or the defensive side of the store.

03:23 So it works on three, six and above and you can see it's super simple.

03:27 You say, basically do pickle stuff and you say from fickling.pickle import pickled and then you can kind of as if you would use the dis module to disassemble Python code, you can do that with this pickled library and it'll print out something that's kind of like an abstract syntax tree of the pickle.

03:48 And they've got a real simple example on the GitHub repo.

03:50 It's like a list of four numbers, one, two, three, four.

03:53 And then it just shows you, look, we're assigning the results of creating a list and setting these constants in it.

03:59 Another thing that is nice about this is it's not specifically built for Python developers.

04:05 So it's also kind of something you can integrate into other tooling and say continuous integration and stuff like that.

04:11 So you can run it off the command line as well.

04:13 you can just on the terminal just type fickling and give it the data and then outcomes some answer.

04:20 The one that people might wanna do is the dash dash check safety.

04:24 And that will try to look and see if it's doing bad things like for example, talking to os.system or doing other malicious stuff like that.

04:33 So that's good, but I wouldn't trust that entirely.

04:36 Like how well is it checking, right?

04:37 If you, for example, were to encode Python code and then decode it and then take that decoded stuff and it did OS something, right?

04:45 You feed that to a Val or whatever.

04:47 There's all sorts of layers here, right?

04:49 So it can check for obvious things, but it's not like an absolute guarantee.

04:53 And then finally, you can inject arbitrary Python code that will run on unpickling into an existing pickle file with dash dash inject.

05:02 Seems fine, right?

05:04 Everything's fine.

05:05 - That's the fun part.

05:06 - Yeah.

05:07 Well, if there's no malicious code present, here you go.

05:11 - Yeah, exactly.

05:12 So maybe I'm imagining something like a little thing that count that like prints out in, in like flashing bright colors.

05:20 We told you shouldn't unpickle untrusted data.

05:22 Don't do it.

05:24 Beginning hard drive format.

05:25 It has a, like a loud beeping sound.

05:27 It was three, two, one.

05:28 And just like, obviously not really do it, but like that would get your attention, right?

05:33 That'd be a mean, mean trick.

05:34 But this, this is interesting.

05:37 And you know, I didn't really put it together with the ML data exchange model exchange story, until I heard the folks talking about it over on Risky Business.

05:46 So it seems like, especially in the ML story, you want to have a look at these kinds of things.

05:51 Yeah.

05:52 So I've thought about the use case before, actually, but I didn't know that somebody would solve it in this way.

05:58 So pretty nice.

05:59 Yeah, I mean, Eric, this is sort of your world, right?

06:02 The machine learning stuff.

06:03 So how does this sit with you?

06:06 What do you think?

06:07 Yeah, so it comes up all the time that you pick up some random model that someone has built.

06:12 So as security issues become more prevalent, this might be a thing.

06:18 Yeah.

06:18 Well, is there better ways to store it?

06:21 Like JSON or something else?

06:23 So even if models don't have to exist that way, do they?

06:26 Yeah.

06:26 I mean, even if there was there, there are some projects that focus on building like some reusable interface across all these different frameworks and stuff.

06:36 But in reality, people just use pickle.

06:38 And yeah, really?

06:39 Yeah.

06:40 Yeah.

06:40 They do.

06:40 I just didn't know anybody was really using it for much.

06:43 No, it's absolutely common.

06:44 So within like say scikit-learn, which is probably most used library ever.

06:49 it just, use pickle on the pivot, store your files.

06:53 Yeah.

06:54 All right.

06:55 Well, cool.

06:55 So this is a useful library from trail a bit so people can check out and we're going to start with everything is fine.

07:01 And we'll end with everything is fine as well, Brian, but over to you.

07:04 Okay.

07:05 >> Well, this is something, it's a blast from the past a little bit, about a year ago.

07:11 Anyway, I want to talk about virtual environments and directories.

07:18 There's an article from Hinnick that's called Python Project Local Virtual Env Management.

07:26 That's a mouthful.

07:28 But the idea, and we've talked about in the morning this before, is to be able to-

07:34 >> I still want it.

07:34 >> Yeah. If I've got several projects going on, whenever I CD into a directory with this project, I just want the virtual environment to activate automatically, and then when I leave it and go to another one, it's just automatically switched.

07:51 Apparently, that already works and we've already covered it, but I missed it. Actually, in Episode 185, you brought up DuraEnv, and in part of it, it's the ability to you can have per project isolated development environments.

08:10 But I didn't pick that up yet, but Hennig just said, this is how you do it.

08:17 How you do it really is you have to install direnv first, and then you put a .envrc file in a directory and say, layout Python and then what Python version.

08:33 Like layout Python, Python 3.9, and then that's it.

08:37 That's all you got to do.

08:39 I'm like, that can't be that easy.

08:42 It was, I did it this morning and it's like, man, this is great.

08:46 On my Mac, it's all solved, but it doesn't work on Windows.

08:51 >> Must use Linux subsystem for Windows or Windows subsystem for Linux WSL, I guess it is.

09:00 >> Okay.

09:01 >> I mean, that semi-solves it.

09:03 >> Yeah. I probably have this need more within Windows than I have on my Mac, but I have it in both places.

09:12 I'm going to start using it. It's great.

09:15 Plus, like you covered last time, you can also have a bonus.

09:21 You can put environmental variables in there too, so that in the project, you've got perhaps your secrets or just different environmental settings you want to use.

09:31 >> Yeah, I think people will look in your .rc, whatever, your bashrc, zshrc, whatever files for your secrets.

09:40 But I suspect it's much less likely to go hunting through virtual environments and looking for their activate scripts and see what's in them.

09:47 People know, but fewer people know that stuff gets stashed in there.

09:50 So that's probably good.

09:52 >> Right. So I guess mainly the story is, I knew that you could do it, but I didn't realize how easy it was.

09:58 So it's super simple.

10:00 It just took a little bit.

10:01 And then my second thought was, it's not that hard to create virtual environments though.

10:06 Is this saving any time?

10:08 I still got to create this file and put this stuff in it.

10:10 It actually is more typing, a little bit more, but it didn't take me long to realize that it's when you're switching between different directories, you save a ton of time.

10:21 - Yeah, it's like going back and forth between projects, right?

10:24 - Yeah.

10:24 So that's it, really.

10:26 - Just kind of neat.

10:27 - Yeah, Brett out in the live stream has got a comment for us.

10:30 If you use pyenv, you can run pyenv localenv name in your project folder and get this behavior as well.

10:36 How do you do that?

10:37 How do you get it to activate by just changing directory into it is what I'm not totally sure.

10:43 Yeah.

10:44 - Yeah, I think you get the Python version that way, right?

10:46 But not the actual virtual environment.

10:48 - Yeah, possibly if you've installed Python through pyenv as well, yeah.

10:53 - And then David has a comment back, the first topic out there in the live stream.

10:58 Hey David, the irony of legacy object serialization being used on cutting edge machine learning.

11:04 - Like that one?

11:05 - Yeah, and then Teddy at live stream.

11:07 Hey Teddy, he says, does it work with an IDE?

11:10 Changes the interpreter based on the folder you're in within a workspace in Viscose, for example.

11:15 That I don't know, but I was gonna add the personal comment that I don't need this nearly as much as I felt like I used to, because the way I jump between projects is usually jump, open them up in PyCharm and jump between them there.

11:29 And that always activates, if you go to the terminal in PyCharm, it activates that environment for that project.

11:34 I don't know.

11:35 - I'm on the command line all the time, so definitely.

11:38 - Yeah, if you're on the command line busting around a lot, then both Brett and Alvaro have a followup.

11:45 Pyenv adds a shim that intercepts the calls to Python.

11:48 So yeah, very good.

11:49 So it must be that you have to install Python through PyMV, but then it'll also do this.

11:54 Very cool.

11:54 Good to know I didn't know that.

11:55 - Oh, me too.

11:56 - Yeah. - Nice.

11:57 - All right, Eric, first one is for you.

12:00 - Yeah, so I brought with me the test containers Python library, which, and let me quote this one from the description, because I think it's a pretty good summarization.

12:12 So test containers Python is a port for test containers Java that allows Docker containers for functional integration testing.

12:20 It provides capabilities to spin up Docker containers, such as databases, Selenium web browsers, and any other containers for testing.

12:27 So maybe not that many new things in here, but we use this in a project lately, and especially we use this in integration pipelines using cloud-native services.

12:42 So there's a container for Google Cloud Pub/Sub, for example, which is pretty amazing, also for like your Kafka.

12:49 This is originally a Java project, so there's still a lot to do for the Python community in order to catch up a bunch of interfaces that need to be implemented and stuff.

12:59 One example, it is here.

13:03 Let me just show you that one.

13:05 There's in the repo, you can find an example of how to use this within your CI pipeline.

13:15 So what's happening here is actually that if you have like a standard CI pipeline for your integration test, which consists of Docker containers that we use Docker in Docker to actually run the integration tests.

13:27 So all your standard 2021 stuff in here, I guess.

13:31 - Yeah, this is super cool.

13:33 And the way you do it is you just create a context manager.

13:36 Right? - Exactly.

13:37 - You just say something like, with my SQL container, here's a connection string, and then you can just do your normal database stuff over to it.

13:44 Yeah.

13:45 So it integrates perfectly fine with pytest.

13:47 We did that a lot.

13:51 And yeah, the syntax is pretty cool.

13:52 It's super easy to use.

13:54 The integration with the CI/CD works fine.

13:56 So, yeah.

13:57 Yeah, Brian, we could use this with a test fixture and a little yield action, something like that.

14:03 Yeah, yeah.

14:05 I can't wait to try to play with something like this.

14:08 Yeah. We talked about this way long ago.

14:10 I brought this up, I believe, but I'm glad you brought it back, Eric, because it's really useful and it's really neat.

14:15 And there's more stuff than actually is listed on the readme for some reason.

14:19 - Exactly.

14:20 - Like if you flip through the actual documentation, you can see that there's other containers, right?

14:27 For example, I believe there's a MongoDB one, for example, but that's not listed in the documentation.

14:33 And then the cloud emulators are probably neat for you for testing, right? - Absolutely.

14:38 - I mean, that's one of the things that I find off-putting from like cloud native type stuff is if you don't have access to the cloud, you're dead in the water, right?

14:46 Like, and that can be a problem for continuous integration and for all sorts of things.

14:50 So things like this are pretty neat.

14:52 - It's definitely challenging.

14:53 So stuff like this helps.

14:54 - Yeah.

14:55 You know, to me, it's an interesting trade-off because on one hand, sure, you can mock out your database and then just test against your test data.

15:04 But then if your data model and the database changes, but you don't think to update the test data, well, then your code's gonna, like SQLAlchemy, for example, will freak out and crash if the scheme is not a perfect match.

15:15 Whereas you wouldn't find that in testing if you weren't letting it talk a little bit to the database.

15:20 There's just interesting things like this.

15:23 Brian, you even had an episode about not mocking out your database, didn't you?

15:26 >> Yeah. I think as little as you can, I guess, let's do it the reverse.

15:34 As close as you can have to the real environment, the better.

15:37 This is when people are deploying on containers, testing with containers makes total sense.

15:42 Yeah, absolutely. Absolutely. All right. Want to talk a little more infrastructure?

15:46 Yeah.

15:47 All right. So I have the one, it's got to be the shortest named thing for a featured item.

15:53 JC, two letters. JC. So JC comes to us from Garrett. Thank you, Garrett for sending that in.

15:59 And at first I was like, I don't know if this is relevant to me or if this is interesting.

16:03 But the more I looked at it, I'm like, yeah, this is actually pretty awesome. To me, let me, I'll read what JC describes itself as in a moment.

16:10 But to me, what this is, is it is basically what web scraping is to the web, JC is to Linux.

16:18 So there's not a nice API for it, but I'd like to somehow wrap a little Python magic around it and then have an API for it, okay?

16:27 So it's official story is it's a CLI tool in Python library that converts the output of popular command line tools and file types to JSON.

16:35 and it allows piping one thing to the next, obviously, 'cause it's Linux-like.

16:40 So the idea is, you know, the example I have on there, the site there is dig.

16:44 So dig is a command that'll give you information about a domain.

16:48 So you could do something like dig example.com/pipe/jc, and then you tell jc what it's expecting, output from just whatever the print output to the terminal is in dig, and it will parse that and turn it into a Python dictionary, right?

17:04 So I could sub-process run dig, but then I just get a huge blob of text and I've got to basically go through it, try to understand it and so on.

17:14 And this knows the exact format and turns it into like structured data.

17:18 So think of all of these different Linux commands you may run, you find a whole bunch of them.

17:23 They're like a huge list down here.

17:24 So airport, ARP, crontab, date, CSV, free, DU, hash, history, hosts, IP config, netstat, all those types of commands, syscontrol.

17:37 So for example, if you're automating daemons and stuff like that, you can now do that from Python.

17:42 And then instead of getting just a text blob and an exit code, you get a dictionary back that you can then check out and program against.

17:49 What do you think?

17:50 - Oh, that's pretty cool.

17:51 - Yeah.

17:53 - Yeah, there's a bunch of built-ins.

17:55 Hopefully the thing you're looking for is one of these.

18:01 - Yeah, exactly.

18:02 I suspect it's not extraordinarily hard to do, to add another one.

18:08 Yeah, but you can also run it on the command line.

18:10 You don't have to use it in Python, which is what I was scrolling around looking for.

18:14 So if you want to, like let's suppose I wanna go and run dig and I just want to go to the answers and get the data, which would be the IP address of some domain.

18:26 You can say JC run this thing and then jq-r or there's like a way to just pass over a string.

18:34 And basically, the string you pass in is the object dereferencing, the traversal of the dictionary.

18:41 So dot bracket dot answer bracket dot data, and it'll go and pull that all apart, which is pretty neat.

18:47 So it's got a cool command line, terminal automation aspect, just like Fickle.

18:53 >> This is a nice wizard effect, so that if you know how to do this well, and people come over and watch you do this, they will be amazed.

19:00 Yeah, just make sure you spin up your third or fourth terminal while you do that.

19:05 Yeah, yeah, yeah.

19:06 Exactly. Eric, what do you think?

19:07 Yeah, so sounds like I found something that I can put my usual Sunday afternoon time into.

19:16 So I'll play around with it.

19:18 Yeah, yeah, yeah, yeah.

19:19 Absolutely.

19:19 Yeah, because every now and then I'll want to do some sub process thing and it needs to call some kind of Linux command. I'm like, "What am I going to do? Am I just going to check the status code, the return code and hope it works and then just say it didn't work if it didn't work or you know you could do so much more with this.

19:34 Sorry Brian.

19:35 Well there's some stuff that's less Unix-y that other people might need like you can parse PipList and PipShow and YAML and XML with this as well so that's pretty cool.

19:52 Yeah, very cool.

19:54 Alright, how about some ellipses?

19:57 or I don't know how else to say it, dot, dot, dot, the next thing.

20:01 >> Do say more.

20:03 >> This was a surprise to me.

20:08 I guess I haven't run into this yet or maybe just I forgot.

20:12 But Python has ellipses and it has the keyword ellipses.

20:17 Ellipses, ellipses, ellipses.

20:20 >> Ellipsi.

20:21 >> Ellipsi.

20:21 >> Keep going.

20:23 >> It's an actual object within Python. Who knew?

20:27 Then also you can just do dot, dot, dot, and that's a valid thing, an identifier.

20:35 It's a special value, but you can use it for all sorts of stuff.

20:41 By the way, I'm referencing an article called, what is Python's ellipses object from Florian Dollitz?

20:49 Thanks Florian for writing that.

20:51 The Python or the definition really is, It's the same, the ellipsis literal is the same as the literal dot, dot, dot.

21:01 It's a special value used mostly in conjunction with extended slicing syntax for user-defined container data types.

21:10 I don't know, what does that mean?

21:13 I guess pandas uses it maybe, but the article has some interesting things.

21:19 You can use it in place of pass because it has a valid value.

21:24 you can do a dictionary or a function definition, and instead of saying pass, just do three dots and that's valid Python.

21:35 >> I'm liking that.

21:37 I'm sure people will be like, what are you doing? But at the same time, it's like that's really what you wanted to put down there.

21:43 It's like, I just don't want to put anything, but Python won't work unless I close this off.

21:46 So here's a pass.

21:48 >> Also, one of the things I was thinking about is, no, I would probably use pass all the time.

21:53 when in that case.

21:54 But when writing documentation and you really want to have a working code example, but you want to just indicate there's going to be more code there, that's a cool thing to put in.

22:04 Anyway, so there's that.

22:07 Then there's also using it in type information.

22:11 With type information, for instance, apparently, let's say I've got a function that returns a tuple.

22:18 We've got these words today.

22:20 Anyway, a tuple with two integers, you can just say a tuple with two int, but if you don't know how many integers are going to be there, you can do the three dots, and apparently that works with typing.

22:32 That's neat.

22:34 >> That's very neat.

22:35 >> There's not a lot. Apparently, it's used also within FastAPI and Typer, but it's there.

22:42 If you want to use to implement a certain feature where that might make sense, it is a thing that's available to you.

22:50 Maybe you could have an operator, a dot-dot-dot operator on your something.

22:54 I learned this just the other day from a tweet from Raymond Hetchinger, where he was asking people, "How would you do this?" And he brought up the exact same example using the documentation and the pass or the ellipsis instead.

23:12 And I didn't even know that this was a Python object.

23:16 I knew it from the typing.

23:18 But so the question is, can you pass this object around?

23:23 Can you return from a function value, like dot dot dot?

23:27 I imagine. I don't know.

23:28 – It should work, right? – It should work, yeah.

23:31 – Yeah. – Nice.

23:33 I'll try it out while we go on to the next topic.

23:38 Yeah, that one surprised me. Well done, Florian.

23:41 Yeah, so the last one that I brought with me Actually, since I lead the data science and AI team, I gotta bring something with me that has to do with it.

23:50 So I brought with me the PyTorch forecasting library.

23:56 So, Michael, you just used this analogy a couple of minutes ago, so I'm going to use an analogy now.

24:05 So for me, PyTorch forecasting looks like What fast AI does for computer vision and natural language processing, it does for time series forecasting.

24:16 Because there was a lack of deep learning for time series forecasting.

24:26 And actually, I think that PyTorch forecasting is going to close this gap.

24:32 So it comes in with a bunch of important features, actually.

24:37 So it's built on top of PyTorch Lightning, which allows training on CPUs, single and multiple GPUs, basically out of the box.

24:47 So there's been a lot of software engineering involved for the data scientists in the past, and this library just makes it pretty simple.

24:57 So you have to work very hard in order to mess things up with this library, I guess.

25:04 So what it also brings is an implementation of a model that is called the Temporal Fusion Transformers.

25:17 So this is from Google Research.

25:19 Actually, there's also a TensorFlow-based implementation.

25:23 I'm going to put the link to the paper in the show notes.

25:27 This is a very interesting model that has performed pretty well on a dozen prominent benchmarks very lately.

25:38 And it has a huge benefit, which is that it is pretty interpretable.

25:43 So it does actually calculate feature portents for you.

25:48 So this is, in the real world applications, very important, because whenever you stick your data into these models and something good comes out, people will always ask you, So, okay, so what was the important part of the data?

26:01 How does it influence the model and the outcome?

26:05 So, Temporal Fusion Transformers, they do this for you.

26:09 Also, the PyTorch forecasting comes with Optumener, which is a popular library for hyperparameter tuning, which is also implemented in here.

26:19 Right, so this does multivariate time series, multivariable time series?

26:27 - Yeah, so the multi-horizon part of it is pretty important actually.

26:32 So go ahead, Sven.

26:33 - I was gonna say, so the hyperparameter tuning, my save, this part actually doesn't make any difference in the prediction, but this other part does.

26:38 So pay attention to that, right?

26:40 - Yeah, absolutely.

26:41 - Yeah, yeah, this looks really good.

26:43 So if you wanna predict the future about sales, home prices, heart rate, whatever, right?

26:50 - Oh, it comes up all the time, comes up all the time.

26:53 And I know from a couple of guys who work for the Google Clouds of this world and the AWS that within these software as a services or these APIs that they provide when like say a demand forecast, they use this temporary fusion transformers under the hood.

27:11 So. - Yeah, this looks great.

27:13 - Just spin it up and use it.

27:14 - Yeah, great recommendation.

27:16 Follow up from the previous one, Brian, Will McGugan, hey Will, the live stream says it's the dot, dot, dot, ellipsis sometimes is used as a sentinel value mean no value when none is a valid value.

27:27 So, yeah.

27:28 - Yeah, and also, yes, you can return it from a function.

27:31 So, just fine.

27:34 - And then let's see, someone out in the live stream asked if it has methods.

27:39 Does it have methods or anything that you can do to it?

27:41 That was Teddy.

27:43 Yes, but only the built-ins, right?

27:44 I don't think it, from object, I don't think it does anything interesting besides just B dot dot dot.

27:48 And then Anderson, hey, Anderson says, it's a pity the ecosystem is moving towards PyTorch Lightning, the separation of concerns there is not very nice.

27:57 In my opinion, PyTorch Ignite does a better job in that aspect.

28:00 Eric, that's all you.

28:01 - Yeah, fair enough.

28:03 Still, I mean, one thing that you've got to keep in mind, so speaking of separation of concerns, right?

28:11 There's so many data scientists out there that if you throw like separations of concerns at them, they just answer like, yeah, here's my model.

28:19 So what is separation of concerns in this sense, right?

28:22 So if this works, if people use it, it's probably good.

28:25 Yeah, cool. Brian, extras?

28:27 Extras! Oh, I just wanted to bring up that Python 3.10 RC2 is out.

28:34 So the second release candidate for Python 3.10 is out, so people can play with it.

28:39 Apparently, we're maybe a month away from getting 3.10.

28:42 So I'm excited about that.

28:44 Yeah, that's me. Very excited.

28:46 Awesome. All right, I got a couple to throw out there.

28:49 - Really? What a surprise. - Can you imagine?

28:52 Can you imagine, so remember we talked about several things.

28:56 I talked about how I turned off all of the tracking stuff and all those things on the website, which I think is good because so many people run ad blockers, it was like pretty inconsistent data anyway, inaccurate.

29:11 Then I mentioned goaccess.io, and I said, that'd be cool, maybe we should apply it.

29:16 I ended up writing a ton of automation to apply this to Python by stock Python, stock Python training, and all the things, and it's pretty cool.

29:22 I built some automation that will download all the NGINX log files, some of which are text, some of which are gzipped, and then run this thing across it, and it will build like one giant monthly log thing, then Go Access can then turn into nice, beautiful reports.

29:37 So very excited to have Go Access working well.

29:40 And instead of running it on the server, I actually just download and then run it on like a monthly report locally, which I think is kind of cool.

29:48 All right, one, we had some feedback about Caffeinate.

29:53 Remember Caffeinate?

29:55 You can type Caffeinate on the macOS terminal and it'll keep your system alive.

30:00 Nathan Henry said, you mentioned over in macOS the Caffeinate tool says you can follow it with a long running command to keep awake.

30:11 So you can say like Caffeinate Python dash C import time, time.sleep or so give it some kind of, so you could say caffeinate Python and some script you wanna run.

30:21 So you could reverse it if that script doesn't use keep awake or I think that's what it was, right?

30:28 So you could apply caffeinate to your Python code and just say, no, stay awake while you're doing this.

30:32 Or you can even apply it to a running process using a PID.

30:36 - So it just stays awake while that process is running then?

30:39 - Yeah, and then it'll go away, yeah.

30:40 - Oh, okay, nice.

30:42 - Yeah, so it's like the reverse of what we talked about then.

30:44 - Then Sean Taver from Teaching Python said, "Isn't this what we were asking for?" Remember, we were talking about the keyboards.

30:52 - Keyboards.

30:53 - And here's a Python one.

30:55 This is M60 Mechanical Keyboard, the open source USB BLE Bluetooth Low Energy 5, hot swappable, 60% keyboard, powered by Python.

31:06 So this one comes with Python built in, which is pretty excellent.

31:10 So if people wanna play that, they definitely can.

31:13 The next one I want to throw out there real quick comes to us from Mark Little, a friend of mine here in Portland.

31:18 And basically the subtitle is that, this is an article from CNBC Finance News, that open source is booming.

31:26 So the headline has to do with MongoDB, but it's more broad.

31:29 So if people are interested in kind of following up on that, it's kind of cool.

31:32 So MongoDB surged on Friday, which was last Friday.

31:36 It's now worth as much as IBM paid for Red Hat.

31:39 Databricks raised private financing around at $30 billion valuation.

31:45 And just, you know, these are the mega open source companies but it's pretty interesting.

31:50 To just give you a sense, like I read this article, I go, "That's pretty interesting.

31:53 These numbers kind of just like bounce off me." But the one that made it stick for me was MongoDB was a private company for a while, then it became, then it IPO'd, right?

32:02 It had VC money, then it IPO'd.

32:04 Do you have a sense?

32:05 Either of you have a sense for how much it IPO'd for?

32:08 Seemed crazy, right?

32:09 like a 1.2, 1.4 billion dollars, MongoDB is worth 30 billion now.

32:16 Right, so even after like the crazy IPO, yeah, 1.2 billion to start and now over 30 billion.

32:22 So that is an insane amount of growth in these.

32:25 And then they talk about Confluent and JFrog and a bunch of other elastic.

32:30 If you kind of wanna dig into the business side of open source, that's pretty interesting.

32:34 All right, two more.

32:35 I've been doing a ton of video encoding lately.

32:38 I use FFmpeg for some of the audio processing and other types of things around both the podcasts and the courses.

32:45 So attribution here, this is from Jim Anderson, sent this over, thanks Jim.

32:50 Ffmpeg.wasm.

32:52 So here's FFmpeg, which is a very popular tool in that world, but as a web assembly thing, which is pretty awesome.

33:00 And I'm trying to remember what the name of the library was, but over in, we did talk about on Python Bytes, I think with Cecil Philip on one time, maybe it was even him that brought it up.

33:11 But there's a Python library that will run web assemblies.

33:16 So not run web assembly in their browser or put Python in their browser, but reverse it, like I have a web assembly library that does cool stuff.

33:22 Put it in my Python code and run it here.

33:24 So you could take FFmpeg.wasm and pure Python and have like a no dependency sort of audio video processing tool in Python, which I think is pretty cool.

33:34 Cool.

33:34 All right.

33:35 Last one, I told you we'd start with everything is fine, and we're going to end with everything is fine.

33:39 Credit card stealing backdoored packages found in Python's PyPI library hub.

33:45 What? That's not good?

33:46 This is not good.

33:49 This is not good.

33:50 When you hear people talk about remote code execution, that typically is bad.

33:56 Like, I'm on the internet, people send me bad stuff, now they have my computer, and I don't even necessarily know it.

34:03 So apparently, in addition to this, These were found and removed.

34:06 It was something, what was it?

34:08 It was something around the line of noblesse, N-O-B-L-E-S-S-E, and a couple of variations on that spelling.

34:16 That was the problem.

34:17 So I'm happy to see I didn't install that, but this doesn't make me happy.

34:20 It looks like it's fixed.

34:21 So the PyPI team also just patched a remote code execution hole in their platform, which potentially could have been exploited to hijack the entirety of PyPI.

34:31 That one makes me way more nervous than typo squatting or the weirdness.

34:35 And it was a vulnerability in the way that they were doing GitHub actions with PyPI, which allowed a malicious pull request to execute arbitrary code over there, which is not ideal.

34:47 Nice.

34:47 Yeah.

34:48 But I'm glad to hear that's fixed.

34:49 Anyway, everything's fine.

34:51 Doesn't feel fine.

34:53 No, not at all.

34:55 More like a nightmare to be honest.

35:00 Yeah.

35:01 To be honest, Eric, anything else you want to share with us?

35:04 Oh, no, just thank you guys again for having me on the show.

35:08 pretty fun.

35:09 And, make sure that, you guys follow me on Twitter and, yeah.

35:14 Awesome.

35:15 I'll put a link in the show notes for your Twitter.

35:18 No, we aren't done.

35:19 Are we, Brian?

35:20 No, we need to joke.

35:21 One thing is missing.

35:22 Yeah.

35:22 Yeah.

35:23 Yeah.

35:23 It's important.

35:23 So this one is more of a, not an ML one is more of a web, web API type type thing.

35:30 So, so often people will write web APIs and just return some kind of message in a JavaScript dictionary that says things like bad response or whatever, but you're supposed to use HTTP status codes, right?

35:42 Like if there's a bad request, you should return the status code 400.

35:47 If it's not found as an entity, you should return 404 or whatever.

35:51 So here's like two kids at school exchanging messages and has server on one of them, client on the other, and 200 on the message exchange here.

36:00 And then at the bottom, the one kid that got that message reads the JavaScript and says, "Status code 400, detail bad request." He's like, "Why, why did you do this to me?" (Brian and Eric laughing)

36:12 - This is good.

36:13 - Yeah, this is like little Bobby Tables.

36:15 Let this be a lesson to you.

36:16 You don't pass messages like that.

36:17 Come on.

36:18 - It's so true.

36:20 - It's totally true, totally true.

36:22 All right, well, that's it for our jokes and everything, Brian.

36:24 - Yeah, we'll have another fun Wednesday on Python Bytes.

36:28 - Absolutely.

36:29 Thanks, Brian. Yeah, thanks, Eric, for being here.

36:31 Thanks a lot, guys. See you around.

36:33 Bye all. Bye.

36:35 Python bytes. Follow the show on Twitter via @Pythonbytes. That's Python bytes as in B-Y-T-E-S.

36:42 Get the full show notes over at Pythonbytes.fm. If you have a news item we should cover, just visit Pythonbytes.fm and click submit in the nav bar. We're always on the lookout for sharing something cool. If you want to join us for the live recording, just visit the website and click live stream to get notified of when our next episode goes live. That's usually happening at noon Pacific on Wednesdays over at YouTube. On behalf of myself and Brian Okken, this is Michael Kennedy. Thank you for listening and sharing this podcast with your friends and colleagues.

Back to show page