#300: A Jupyter merge driver for git

Published Tue, Sep 6, 2022, recorded Tue, Sep 6, 2022

About the show

Sponsored by Microsoft for Startups Founders Hub.

Special guest: Seth Larson

Brian #1: Test your packages and wheels

I’ve been building some wheels the last couple of weeks with various tools:
- flit, flit-core, and flit build
- hatch, hatchling, and hatch build
- setuptools, build_meta, and python -m build
There are a few projects I’ve used to make sure my projects are in good shape
- wheel-inspect - inspect a wheel from Python code with the inspect_wheel() function, which returns JSON-serializable data, or from the command line with wheel2json
- check-wheel-contents - a linter for wheels
- tox - easily test the building, installation, and running of a package locally
  - I actually start here, then utilize the other two tools
Should have been obvious, but it wasn’t to me
- Projects hosted with Git (such as on GitHub) don’t keep wheels in the repository. (This was obvious.)
- When installing from git using pip install git+https://path/to/git/repo.git
  - Your local pip will run the packaging backend to build the wheel before installing.
  - Yet another way to test packaging.

Michael #2: The Jupyter+git problem is now solved

Jupyter notebooks don’t work with git by default (they inherently have meaningless conflicts).
With nbdev2, the Jupyter+git problem has been totally solved.
Uses a set of hooks which provide clean git diffs, solve most git conflicts automatically, and ensure that any remaining conflicts can be resolved entirely within the standard Jupyter notebook environment.
The techniques used to make the merge driver work are quite fascinating

Seth #3: Help us test system trust stores in Python

A package called “truststore” that aims to replace certifi by using system trust stores for HTTPS instead of a static list of certificates.
Problem truststore is solving usually manifests in corporate networks: “unable to get local issuer certificate”.
Experimental support added to pip to prove the implementation
Users can try out the functionality and report issues.

Brian #4: Making plots in your terminal with plotext

Bob Belderbos
Tutorial on using plotext - that’s one t in the middle
With the rise of CLI usage, plots are a nice addition.
Bob’s plot is great, but check out the options in the plotext docs
- lots-o-plots
- streaming data
- images
- subplots
so fun

Michael #5: jinja2-fragments

Carson from HTMX (see podcast and course) wrote about template fragments.
My jinja_partials project sorta fulfills this, but not really.
I had a nice discussion with Sergi Pons Freixes who uses jinja_partials about this.
He created Jinja2 fragments

Seth #6: SLSA 3 Generic Builder for GitHub Actions GA

Supply chain Levels for Software Artifacts, or SLSA (“salsa”)
Tools to attest to and verify the “provenance” of artifacts, i.e. “where it came from”
Prove cryptographically that artifacts are built from a specific GitHub repository, commit, tag. Another future defense against stolen PyPI credentials/accounts.
Generic builder means you can sign anything, like wheels/sdists

Extras

Brian:

Bring your pytest books to PyBay, if you want them signed.
- I’m only bringing a small amount.
I’ll be presenting
- “Sharing is Caring - pytest fixture edition” at 3:05
- “Experts Panel on Testing in Python” at 7:00
And be a zombie on my 8 am flight back unless I can change my reservation.
That’s this weekend, Sat Sept 10, in SF

Michael:

Seth:

Pyxel, retro game engine for Python, v1.8.0 added experimental web support with WASM

Joke: Dev just after work

Episode Transcript

00:00 Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds.

00:05 This is the big episode 300, recorded September 6, 2022.

00:11 I'm Michael Kennedy.

00:12 And I'm Brian Okken.

00:13 And I'm Seth Larson.

00:14 And this episode is brought to you by Microsoft for Startups Founders Hub.

00:18 More about them later.

00:19 Seth, welcome to the show.

00:21 Thanks for having me.

00:22 This is so exciting.

00:22 I didn't realize it was going to be a 300.

00:24 Yeah, well, you hit the jackpot.

00:27 This is the big one, a big one for at least two more years, I would say.

00:30 And Brian, how about that?

00:32 300 episodes.

00:33 That's amazing.

00:35 When did we start this?

00:36 We should look this up.

00:37 It must have been a while ago.

00:38 I don't know.

00:38 I mean, that's 5.7692307 years.

00:44 Like, that's almost six years.

00:45 It's amazing.

00:46 Actually, a reason that I'm so focused on floating point numbers and large numbers.

00:51 We're going to get to that at the end of the show.

00:53 2016.

00:54 We started November 2016.

00:55 That's pretty cool.

00:56 Absolutely.

00:57 Anyway.

00:57 Yeah.

00:57 Very cool indeed.

00:59 David says, congrats on 300.

01:01 Thank you, David.

01:02 Thank you for being here.

01:03 Indeed.

01:03 Awesome.

01:04 All right.

01:05 Well, I've been thinking about wheels and packages lately.

01:09 Yeah.

01:10 You were thinking about the phrase, rolling wheel gathers no moss or something like that?

01:16 Is that how it goes in programming?

01:17 No, I wasn't thinking about that at all.

01:19 All right.

01:19 What were you thinking about?

01:20 Tell us about it.

01:21 Okay.

01:22 So I was thinking about actually using different packaging tools because pyproject.toml is supported

01:27 like by tons of stuff now.

01:29 Well, by tons of stuff, I mean like three that I know of.

01:32 So we've got flit.

01:36 Well, poetry also, but I don't use poetry.

01:38 Anyway, I've been using flit and hatch and setuptools, which are all really easy to use with

01:44 pyproject.toml lately.

01:46 And I've been using the flit method of building wheels, and hatch, and setuptools with the build package. If you just pip install build, you can do python -m build, which is fun.

02:01 But since I've been building all these, I've been using a lot of tools to check these wheels, to make sure that what's inside the packages and wheels is what I expect.

02:11 So there's this, there's a few tools I'm using.

02:14 One is wheel-inspect.

02:16 And this one actually, it's kind of cool.

02:20 You can use it programmatically if you want.

02:22 I'm not. I'm using the thing it comes with, called wheel2json.

02:28 And if you run that and give it a wheel name, it just dumps all the JSON information about the wheel.

02:38 And I've been using this to build things in different ways, dump this into a file, and do a diff to just sort of see what's going on, to make sure I got the description correct and everything's right.

02:54 And just because I'm curious whether all of these tools are building kind of the same thing, and they kind of are. There are slight differences, but it's neat that there are so many options now.
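
A minimal sketch of that dump-and-diff workflow, using only the standard library. The two dictionaries are made-up stand-ins for what wheel2json might print for wheels built by two different backends; the field names and the `diff_json` helper are hypothetical, not wheel-inspect's actual output format.

```python
import difflib
import json


def diff_json(left, right, left_name="flit.json", right_name="hatch.json"):
    """Return unified-diff lines between two JSON-serializable objects."""
    # Sorting keys keeps the diff focused on real differences, not key order.
    left_lines = json.dumps(left, indent=2, sort_keys=True).splitlines()
    right_lines = json.dumps(right, indent=2, sort_keys=True).splitlines()
    return list(
        difflib.unified_diff(left_lines, right_lines, left_name, right_name, lineterm="")
    )


# Hypothetical metadata dumps for the same project built two ways.
flit_dump = {"project": "demo-pkg", "version": "0.1.0", "summary": "A demo"}
hatch_dump = {"project": "demo-pkg", "version": "0.1.0", "summary": "A demo package"}

diff_lines = diff_json(flit_dump, hatch_dump)
print("\n".join(diff_lines))
```

An empty diff would suggest the two builds agree on the fields being compared.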

03:04 So wheel-inspect is really cool for wheels.

03:07 I'm also using a thing called check-wheel-contents.

03:12 And this is kind of like a linter for wheels.

03:15 Because it's possible to make valid wheels that don't really have anything in them, or that don't have the thing that you thought was in there.

03:24 So this is a linter that goes through and gives you a whole bunch of warnings and stuff.

03:31 You can kind of look through them, like W001, wheel contains .pyc and .pyo files.

03:41 Like somehow you've configured it wrong to grab that.

03:44 And I don't know how you would do that with a lot of tools, but with flit, possibly, if you accidentally threw those in your Git repo,

03:52 because flit just grabs anything that's checked in, I think, or committed, duplicate files, it checks for that.

03:59 So it checks for a whole bunch of stuff.

04:01 So this is handy just to check as well.

04:03 But the powerhouse that I'm using, of course, is just tox.

04:07 I kind of wanted to cover the other ones because they're fun, but I wanted to remind people that one of the great things about tox is that it builds things on its own.

04:15 And so when you run tox on a package, it will build the package, then install it into an environment.

04:21 And then you run your tests.

04:24 We think of it as more of a test runner, but it does that whole packaging loop also.

04:28 And then the fourth way, I don't have a slide for this, but the fourth way that I've been doing it is you can just push them into a Git repo.

04:37 And then you can do pip install git+, and then the repo URL.

04:43 And pip will use your packaging tools to create the wheel before it installs it.

04:49 So that's another way to check your packaging.

04:52 So I'm doing a lot of packaging.

04:54 So anyway, I'm always super paranoid whenever I configure something to do with packages.

04:58 So my, my method tends to be just unzip the wheel as a, as a zip file and see what's in there.

05:04 See what landed.

05:05 I didn't try that.

05:07 So what does that do?

05:08 You just unzip it.

05:08 Way number five, Brian.

05:09 Yeah.

05:10 So does it just zip, unpacks it in place then?

05:14 Yeah.

05:14 Wheels are technically zip files.

05:16 So you can unzip them and just inspect what made it in there.
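
A small sketch of that "just unzip it" check, using only the standard library. Since no real wheel is at hand here, it first writes a tiny wheel-shaped archive; the `demo_pkg` name and its file layout are made up for illustration, and `list_wheel_contents` is a hypothetical helper.

```python
import tempfile
import zipfile
from pathlib import Path


def list_wheel_contents(wheel_path):
    """Return the sorted file names stored inside a wheel (a zip archive)."""
    with zipfile.ZipFile(wheel_path) as wheel:
        return sorted(wheel.namelist())


def build_demo_wheel(directory):
    """Write a tiny wheel-shaped archive for demonstration purposes only."""
    wheel_path = Path(directory) / "demo_pkg-0.1.0-py3-none-any.whl"
    with zipfile.ZipFile(wheel_path, "w") as wheel:
        wheel.writestr("demo_pkg/__init__.py", "__version__ = '0.1.0'\n")
        wheel.writestr(
            "demo_pkg-0.1.0.dist-info/METADATA",
            "Name: demo-pkg\nVersion: 0.1.0\n",
        )
    return wheel_path


with tempfile.TemporaryDirectory() as tmp:
    demo_contents = list_wheel_contents(build_demo_wheel(tmp))

for name in demo_contents:
    print(name)
```

Pointing `list_wheel_contents` at a wheel from your own `dist/` directory should show whether the files you expected actually landed in it.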

05:19 Okay.

05:20 Yeah.

05:20 Put a dot zip extension on it and then you can just put zip tools on it and off it goes.

05:25 So it must, it must store the metadata somewhere then also though.

05:28 But yeah, there's a top level like metadata file that says all the things that it's about.

05:33 Ah, I love the pun.

05:35 In the chat, we got from Pai Lang, wheel good stuff, Brian.

05:39 Brian, that was wheel good stuff.

05:43 Thanks.

05:45 Thanks for bringing it.

05:46 Yeah.

05:48 So on, onto the next one for mine, huh?

05:49 Yeah.

05:50 Before we, before we jump onto it, you see, I have my, my race jersey on because the Portland

05:56 Grand Prix IndyCar race was here this weekend.

05:58 So people listening who were close by, they missed it, but next September, be sure to

06:03 go.

06:03 It was really, really fun.

06:03 Three days of racing.

06:04 Very nice.

06:05 Were they, were they fast cars?

06:07 They were. They were IndyCars.

06:08 They were like, they were very fast, but they, they had no AI.

06:13 Okay.

06:14 No artificial intelligence yet from what I understand.

06:16 But if you look over on fast.ai, there's something that anybody who does proper data science is

06:24 going to be pretty jazzed about.

06:26 So Jupyter notebooks are notoriously bad citizens of source control and Git and tools like

06:34 that.

06:34 The reasons are basically whenever you have a notebook file, if you've ever run it, the output

06:40 and the order in which the cells were run and the number of times the cells were run is stored

06:46 in there.

06:47 That's not great.

06:48 If someone gets the file and runs it, someone else gets it and runs it.

06:52 And then you try to put it into source control.

06:54 That's a problem.

06:55 Right.

06:56 I mean, when you and I work on our code, we have Python files, the output goes somewhere.

07:01 We check it in.

07:01 The source code goes in, but with Jupyter, the outputs go in, not just the outputs, but

07:07 the memory address of some of the object used in the address.

07:11 So even if it's you running it twice, you get merge conflicts, which is not the coolest

07:16 thing ever.

07:17 I suspect that this goes by the name the Jupyter+Git problem, where really it should be the

07:23 Jupyter plus version control system VCS, because it doesn't matter what you're using.

07:28 Anything that just diffs files is going to hate this.

07:31 Right.

07:32 Anyway, the article and the feature really that I want to talk about is "The Jupyter+git problem

07:36 is now solved," from Jeremy Howard over at fast.ai.

07:40 The solution may surprise you.

07:42 So it talks a little bit about the challenges here.

07:45 And it says it's interesting.

07:48 It speaks in terms that are not really developer oriented.

07:53 It speaks more in terms of like end users.

07:55 So like the way that maybe a first year science student might experience what the problem is,

08:00 not the way a seasoned data scientist would.

08:04 Like, for example, here's the problem.

08:06 The problem is when you're collaborating with others over Git, you literally can't load your

08:12 notebook if you both try to check it in because it's broken.

08:15 Well, what does broken mean?

08:16 Broken means it has merge conflicts written into it.

08:19 And that's really the problem is you can easily solve this problem if you accept their changes

08:24 or accept your changes.

08:26 But then you're losing data.

08:27 Right.

08:27 So anyway, it says, OK, let's look inside.

08:30 Well, there's JSON and then there's the HEAD and then the SHA, like a diff error.
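
To illustrate why a conflicted notebook will not open: a notebook file is JSON, and Git's conflict markers are not. The sketch below uses a hand-written, heavily simplified notebook; the machine names and the `other-branch` label are invented for the example, and `is_loadable` is a hypothetical helper.

```python
import json

# A minimal, valid notebook-shaped document.
clean_notebook = json.dumps(
    {"cells": [], "metadata": {}, "nbformat": 4, "nbformat_minor": 5}, indent=1
)

# The same document after a textual merge left conflict markers in it.
conflicted_notebook = "\n".join(
    [
        "{",
        ' "cells": [],',
        "<<<<<<< HEAD",
        ' "metadata": {"machine": "laptop-a"},',
        "=======",
        ' "metadata": {"machine": "laptop-b"},',
        ">>>>>>> other-branch",
        ' "nbformat": 4,',
        ' "nbformat_minor": 5',
        "}",
    ]
)


def is_loadable(text):
    """Report whether the text parses as JSON, as a notebook file must."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


print("clean:", is_loadable(clean_notebook))
print("conflicted:", is_loadable(conflicted_notebook))
```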

08:35 So I kind of already described this, but they do go into examples of like when you're talking

08:40 about matplotlib or something like that.

08:43 You'll have things like a matplotlib.axes._subplots.AxesSubplot at some memory address.

08:49 Right.

08:50 Which is suboptimal, let's say.

08:52 Yeah, that's to that.

08:54 There's a lot of axes.

08:55 That's right.

08:55 Then non-deterministic outputs and so on.

08:58 It says, OK, we identified two categories of problems here.

09:02 And I would I would like to say this is only accurate if you have zero based indexes when

09:09 you start counting.

09:10 So we've identified in Michael's term three problems here.

09:15 One, Jupyter notebook formats are fundamentally incompatible with version control.

09:21 Problem zero.

09:22 Problem one, Git conflicts lead to broken notebooks.

09:25 There we go.

09:26 And many of these, almost all of these conflicts are unnecessary because metadata, like the environment,

09:34 the machine name and stuff that it was run on, as well as the memory address of the objects

09:38 is stored inside the file.

09:41 What do you do?

09:42 Well, there was this thing called nbdev that would allow you to clean the file.

09:48 I think it was nbdev that will let you clean it.

09:49 There's other ways to clean it within Jupyter as well.

09:51 You can say, I'm only going to commit to version control, the empty version.

09:55 Right.

09:56 You can say clear all cells and then commit that.

09:58 Then that would be fine because you're wiping all that data out.

10:02 However, sometimes that data is incredibly hard to compute.

10:05 Right.

10:06 I have a picture.

10:06 The picture comes from an hour of training machine learning models and then processing a gig of

10:12 data and then looking at this picture.

10:14 If I don't clear it and I check it in, the picture's right there.

10:18 You know what I mean?

10:19 Or some of the outputs are right there.

10:21 So there's a huge reason to not clear it because it might be incredibly hard to regenerate it.

10:26 Maybe on the system you're on, you can't even run the code necessary.

10:30 Right.

10:30 You don't have access to the database or whatever.

10:32 So here's what they did.

10:33 There's a new nbdev named nbdev2, with the two as part of the name, not a version, but the name.

10:39 And this comes from the folks at fast.ai.

10:43 And here's how it works.

10:44 It has a new merge driver for Git.

10:46 Okay.

10:47 Instead of, like, processing the files, it says, what we're going to do is we're going to set

10:52 up hooks in Git.

10:53 So when there is a merge, our special Python code that understands notebooks will present

10:58 a different view for you.

10:59 Wow.

11:00 I know.

11:02 And there's a new save hook for Jupyter that automatically removes the unnecessary metadata

11:08 and non-deterministic cell output.
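
A rough sketch of what such a cleaning step could look like, working on the notebook as a plain dictionary. This is not nbdev's implementation: the `clean_notebook` helper, the choice of which fields to clear, and the address-scrubbing regex are all assumptions made for illustration.

```python
import copy
import re

# Matches the " at 0x7f..." suffix that default object reprs carry.
ADDRESS = re.compile(r" at 0x[0-9a-fA-F]+")


def scrub(value):
    """Remove memory addresses from a string or a list of strings."""
    if isinstance(value, str):
        return ADDRESS.sub("", value)
    return [ADDRESS.sub("", line) for line in value]


def clean_notebook(nb):
    """Return a copy with run counters, cell metadata, and addresses removed."""
    cleaned = copy.deepcopy(nb)
    for cell in cleaned.get("cells", []):
        cell["metadata"] = {}
        if cell.get("cell_type") != "code":
            continue
        cell["execution_count"] = None
        for output in cell.get("outputs", []):
            if "execution_count" in output:
                output["execution_count"] = None
            data = output.get("data", {})
            if "text/plain" in data:
                data["text/plain"] = scrub(data["text/plain"])
    return cleaned


# Hand-written sample notebook; the values are invented.
sample_nb = {
    "cells": [
        {
            "cell_type": "code",
            "execution_count": 7,
            "metadata": {"collapsed": False},
            "source": ["ax = df.plot()"],
            "outputs": [
                {
                    "output_type": "execute_result",
                    "execution_count": 7,
                    "metadata": {},
                    "data": {
                        "text/plain": [
                            "<matplotlib.axes._subplots.AxesSubplot at 0x7fbc113dbe90>"
                        ]
                    },
                }
            ],
        }
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5,
}

cleaned_nb = clean_notebook(sample_nb)
print(cleaned_nb["cells"][0]["outputs"][0]["data"]["text/plain"])
```

Two people running the same cell would then produce identical saved text, so a line-based diff has nothing spurious to flag there.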

11:10 So what you'll get is when you open up this conflicting notebook in, in Jupyter, you'll

11:16 actually have the diff shown instead of having a corrupted notebook.

11:19 Additionally, it drops out the metadata.

11:22 So these unnecessary ones are just kind of gone.

11:25 So it talks about some interesting things that you can do there.

11:27 You've got to run nbdev_install_hooks to get it set up, and some other various things.

11:34 There's also a lot of history on what has been done before.

11:36 What are some of the other alternatives?

11:38 But the big takeaway is the folks over at fast.ai have been using this internally for several

11:44 months and they say it has transformed their workflow.

11:46 It's totally solved this problem.

11:48 And the reason they care so much is almost all of their work, their unit tests, their documentation,

11:53 their actual code, everything is in notebooks.

11:55 They're like all in on notebooks.

11:56 So having Git be a first class citizen is obviously important.

12:01 So I recommend people check this out.

12:02 Postscript side bonus here is there's another thing called ReviewNB.

12:08 ReviewNB is about reviewing, say, a GitHub pull request.

12:12 So somebody fixes a bug in a notebook and they do a PR and say, oh, you were generating this

12:17 graph wrong.

12:19 You should have passed this parameter, which means a totally different thing.

12:22 Wouldn't it be nice to have a picture of the before graph and the after graph with this

12:26 ReviewNB?

12:27 That's exactly what you get.

12:28 So you get your code diff, but then you also get the output diff, which might be a matplotlib

12:34 picture.

12:34 Isn't that cool?

12:35 That's really cool.

12:36 I'd be surprised if GitHub doesn't have this eventually.

12:39 I mean, yeah.

12:40 Well, this seems like a logical next step.

12:42 Yeah, it sure does.

12:43 Right.

12:43 Notebooks are so important.

12:44 Yeah.

12:45 Right.

12:45 But it's not just GitHub, though.

12:47 So some people are using Git just straight.

12:49 Exactly.

12:50 Right.

12:50 Or GitLab or whatever.

12:53 Yeah.

12:53 This is pretty neat.

12:55 And this is, yeah, one of the things I really like about this is all the other solutions

13:02 that we've tried and everything.

13:03 I mean, data science people are really good about covering that sort of stuff where a lot

13:08 of other people are like, hey, I came up with a problem.

13:10 I solved it.

13:11 Maybe some other people have solved it also, but yeah, whatever.

13:14 Exactly.

13:17 I will say this, this set of tools like exactly solves a problem I had not that long ago.

13:22 So.

13:23 Okay.

13:24 So this really resonates with you, huh?

13:25 This resonates with me.

13:26 Yeah.

13:27 Using notebooks for documentation and as part of like an integration test suite.

13:31 Like this is great.

13:32 Yeah.

13:33 Very cool.

13:34 Pai Lang in the audience says,

13:35 Ah, so it looks like you can actually resolve merge conflicts inside the notebooks rather

13:40 than traditionally ignore conflicts.

13:41 I believe so as well.

13:42 I think there's like a merge, merge inside of Jupyter type of thing you can do.

13:46 Hmm.

13:47 Neat.

13:47 Yeah.

13:48 That said, I haven't, I haven't totally used it.

13:49 All right.

13:50 Anyway, if you're into data science, if you do Jupyter and you care about

13:55 source control, this looks really helpful.

13:57 Which you should care about source control.

13:59 Yes, exactly.

14:01 Yeah.

14:02 So if you use Jupyter.

14:03 Full stop.

14:04 Go.

14:04 There you go.

14:04 Awesome.

14:05 All right.

14:06 Seth, over to you. Before we jump into the first topic you want to talk about though,

14:10 Just real quick.

14:11 We were so excited about episode 300.

14:14 I didn't give you a chance to introduce yourself properly.

14:16 So give us a quick background on you and then tell us about your item.

14:20 Yeah.

14:20 So I'm currently an engineer at Elastic working on the language clients team.

14:26 Previously, I was the maintainer of a client well known within the Python community, the

14:31 Elasticsearch client.

14:33 Now I'm doing tech leadership for that same team.

14:36 And then in terms of open source work, I am a maintainer of many different Python packages,

14:42 most notably urllib3, which is the most downloaded Python package.

14:46 And it's one of the dependencies of Requests and boto and a whole bunch of other really foundational

14:52 packages.

14:53 That's incredible.

14:54 Well, does it make you nervous to make changes to it?

14:56 Oh, yeah.

14:57 So the very first time that I became lead maintainer and had to make a release, it was, I actually

15:03 spent multiple hours just kind of looking through the wheels and the source distributions and making

15:08 sure that everything was right.

15:09 It was a tough day, honestly.

15:11 Yeah.

15:11 So that topic that Brian opened with, you've been there as well.

15:15 All right.

15:16 All right.

15:17 Well, what's your first item for us?

15:18 Yeah.

15:18 So my first item is about trust stores.

15:21 So this is about like certificates that you use to verify HTTPS connections.

15:27 And so this is a library that me and David Glick have worked together to implement.

15:33 And it's essentially trying to solve the problem of certifi with Python and how it kind of interacts

15:41 with certificates that aren't necessarily trusted by the greater world.

15:45 So, for example, if you have like a corporate proxy, if your company is installing a certificate

15:51 on your behalf in order to do proxying of some sort, certifi just doesn't work with that.

15:57 And you get these errors that are kind of insurmountable.

16:00 You get errors that require really low level debugging knowledge to figure out.

16:04 And so we went and implemented this.

16:06 Anything that has to do with certificates.

16:07 If it goes wrong, it's just like, well, that's never going to work.

16:10 I guess we're done here.

16:11 You know, it's just so hard to understand, right?

16:14 I'm on a campaign to make it so no one in the world needs to type verify=False ever again.

16:20 That's my mission.

16:21 Awesome.

16:22 Also, you spoke about certifi.

16:24 Like, give us the background.

16:26 I'm not sure we all know what certifi does.

16:28 Sure.

16:28 Yeah.

16:28 So essentially, every web browser, like Chrome and Firefox and all that,

16:34 has a bundle, a group of certificates that they are marking as trusted.

16:41 And they kind of bundle those along with every single web browser.

16:45 Right.

16:45 And so Mozilla, because it's open source, it open sources its trust store.

16:49 And so what certifi is, is it's a small, really thin wrapper Python package around that bundle.

16:56 And it allows Python to make HTTPS connections to websites, essentially, without having to rely on a certificate trust store being configured manually by the user.

17:09 And so a lot of times, because Python is installed on Windows or macOS, but is relying on OpenSSL for a lot of its TLS, it really requires a file to be there.

17:22 Like, OpenSSL doesn't know anything about the system certificate trust store or any of that.

17:27 It requires a file to be there.

17:29 And so certifi is solving that problem.

17:31 I see.

17:32 So if I went and installed it, if I was on, like, Windows and installed it into the trusted root store or something like that, it wouldn't, that wouldn't count?

17:39 That wouldn't be enough?

17:40 It wouldn't be enough.

17:41 Yeah.

17:41 Oh, okay.

17:41 You would, there is a whole bunch of other things that you get also by using these native operating system APIs for certificates, like auto updates.

17:50 It can be centrally managed, so, you know, your IT department can click a button and update everyone's system trust store.

17:57 So, yeah, there's a lot of really good benefits to using the system trust store instead of this Python managed file.

18:04 And this, this article kind of goes into the nitty gritty of that.

18:07 But the big announcement for this project was that pip, with the version 22.2 release, added experimental support for using this library instead of certifi to verify HTTPS.

18:22 And so what this will allow people to do is try out truststore optionally, right, instead of switching it to a default.

18:30 And if they're experiencing this class of errors with, you know, installing Python packages or upgrading Python packages, they can use one flag.

18:39 It's, I believe it's listed.

18:40 Either way, it would be listed here.

18:44 So you do --use-feature=truststore.

18:47 And you'll recognize that --use-feature flag from the 2020 resolver.

18:52 That's another feature flag that they use.

18:54 So this truststore feature flag is the same thing.

18:57 If truststore is installed on your system, it will use that instead of certifi.

19:03 And it allows you to get around the errors that you can see when you have a corporate network involved.

19:07 So yeah, this is kind of the big thing that I'm really excited about.

19:12 And we're really hoping that in the future we can add this to Python, maybe make this a default for requests.

19:19 Like there's a whole bunch of different really interesting things that we can go forward with if we can prove that, hey, this is useful to these users, right?

19:28 Yeah.

19:29 Yeah.

19:29 Fantastic.

19:30 So if I say --use-feature=truststore, do I have to have previously pip installed truststore or something like that?

19:38 You do have to have previously installed truststore.

19:40 So the package is relatively new.

19:43 It's less than a year old.

19:44 And so to ensure that we're able to keep things moving because it's experimental, we didn't want to bundle with pip.

19:53 Their release cycle is a lot longer.

19:55 I collaborated with Tzu-ping for a good long while on this, making sure that everything was all good to go for pip, since shipping with pip is a big deal.

20:04 So yeah, it's been a long, a long road.

20:08 So yeah, this looks super useful.

20:10 Kim out in the audience says, I'd love to never need verify=False again on my internal network.

20:16 Seth's mission is fantastic.

20:17 Yeah.

20:18 Yeah.

20:19 I'm very grateful that this work is going on.

20:21 And I hope that that's true because it drives me nuts.

20:24 Is this something you have to deal with internally as well, Brian?

20:26 Yeah.

20:27 Because we've got, you know, an internal network, a corporate firewall, and the trust stores on Windows systems.

20:36 And it is an issue. I mean, one of the ways we get around it is to have an internal PyPI.

20:45 We've got, we've got a mirror inside.

20:47 Yeah.

20:48 But sometimes I want to try out stuff that's not there.

20:51 So having something like this work would be good.

20:55 But it's not just PyPI, it's other places too.

20:57 So yeah, the entire outside internet is usually impacted when you have that sort of situation of a corporate proxy.

21:06 So, yeah.

21:06 And I'm guessing that this truststore, I mean, using it within pip, it'd be great for a lot of people to try it.

21:13 But trying out truststore for applications that depend on trusted sites, that would be helpful as well.

21:21 Right?

21:21 Yeah.

21:22 So actually, per the documentation, if you're trying to use it manually with other things, we support urllib3, aiohttp, and Requests, and I'm sure it'll work with other libraries as well.

21:34 Awesome.

21:34 Like HTTPX.

21:36 Yeah.

21:36 It should work with anything that uses the standard SSLContext-like API. As long as it can use that API, it should work with it.
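
As a hedged sketch of that manual use: truststore exposes an `SSLContext` class meant to stand in for `ssl.SSLContext`. The `make_ssl_context` helper below is hypothetical, and it assumes truststore has been installed separately with pip; if it is missing (or the Python version is too old for it), the sketch falls back to the standard library's default context so it still runs.

```python
import ssl


def make_ssl_context():
    """Prefer the operating system trust store, else use ssl's defaults."""
    try:
        # Third-party package, assumed to be installed via `pip install truststore`.
        import truststore
    except ImportError:
        # Fallback: the usual OpenSSL-based default context.
        return ssl.create_default_context()
    return truststore.SSLContext(ssl.PROTOCOL_TLS_CLIENT)


ctx = make_ssl_context()
print(type(ctx))
```

The resulting context can then be handed to any client that accepts one, for example `urllib3.PoolManager(ssl_context=ctx)`.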

21:45 This is great.

21:46 Awesome.

21:47 Very cool.

21:48 Nice work.

21:49 Thanks for coming on and sharing it.

21:50 Hopefully it makes corporate Python a little better.

21:53 You know, this was long ago when I first started the podcasts, this one and Talk Python.

22:00 There was a lot of debate or discussion, I guess, whether Python was an appropriate enterprise software type of language.

22:08 You know, I think that debate is largely over.

22:11 And I think the reason it's over is because the data scientists said it's, this is not a debate.

22:16 Do you want us to do the job or not do the job?

22:18 Do the job.

22:19 That's right.

22:19 Okay.

22:19 Well, so let's use Python.

22:20 And then it kind of spread from there internally through acceptance. That said, now that it does live in these environments that Brian described much more frequently, it's really important to have the support.

22:32 Yeah.

22:32 It's actually really funny because, to put this in perspective for Java folks, Java trust stores are like certifi.

22:41 Where you have this manual thing that shipped with Java as opposed to just using the system.

22:46 And I got that comment on Lobsters or something that was talking about this article, and they're just like, wow, this is like getting rid of Java trust stores.

22:54 This is great.

22:54 I'm like, okay.

22:55 I didn't even know that existed.

22:56 That's right.

22:57 We really hate it over there.

22:58 And we hate this.

23:00 So this is great.

23:00 I was like, okay, thank you.

23:03 Cool.

23:04 All right.

23:05 Well, before we get to the next topic, Brian, let's talk about our sponsor for this week and many weeks this year, Microsoft for Startups Founders Hub.

23:15 If you are starting a business, doing a startup, you are a little ways along, or you're just thinking about it,

23:21 you should really check this out, because Microsoft for Startups set out to understand the challenges that we all have creating startups in this digital cloud age.

23:30 And they created Microsoft for Startups Founders Hub to help solve many of them.

23:35 So that includes getting cloud resources, GitHub credits, other credits like AI credits, for example, from OpenAI that you can run your code on. But maybe even more important than that,

23:51 it has support for connecting you with mentors and experts to make sure that you go in the right direction when you're young and getting started.

23:59 So, so often you see the successful startups being in places where there are a lot of mentors, where there's these networks and people have connections to get funding, the marketing side of things, the product market fit, all of those things are super hard.

24:17 So if you are part of Microsoft for Startups Founders Hub, you'll have access to their mentorship network, which gives you access to hundreds of mentors across a range of disciplines, like the ones I just named and more.

24:29 As well as that, there's up to a little bit over a hundred thousand dollars worth of credits in Azure and GitHub and OpenAI and other places.

24:37 As you go through certain checkpoints, as you sort of grow with this program.

24:41 So really tons of super support that you can get for your startup.

24:45 It doesn't have to be investor backed.

24:47 It doesn't have to be third party verified to participate.

24:50 All you have to do is go to pythonbytes.fm/foundershub2022 and apply.

24:56 And if you're accepted, you'll get all of this support from them.

24:58 So make your idea a reality with Microsoft for Startups Founders Hub.

25:02 Apply today for free.

25:03 Get in.

25:04 You'll get tons of support.

25:05 So very nice.

25:06 Also nice, Brian.

25:08 Plots.

25:08 Tell us about these plots.

25:10 Plots and command lines.

25:12 So I like command line stuff.

25:15 And actually with the thanks of Will McCoogan, we've got a lot of people excited about CLIs, but apparently Bob is also Bob Bilderbos from the PyBytes duo.

25:27 So I like this article.

25:30 So I actually kind of skimmed the article.

25:33 Sorry, Bob.

25:34 But it's about making plots in your terminal with plotext. That's what you install, plotext.

25:42 I can see the typosquatting happening right now.

25:46 Yeah.

25:46 So if you pip install it, there's one T in the middle.

25:49 So it's P-L-O-T-E-X-T.

25:51 So he had some code where he was plotting the frequency of their blog articles on the terminal.

26:02 So he's using some of their own data to plot stuff.

26:05 And it's kind of cool walking through how he grabbed the data and everything.

26:11 But I was looking at this plot going, oh, this is a pretty nice looking plot.

26:15 I mean, it's totally blocky, of course, but, but it's a bar chart.

26:19 So it's supposed to be blocky.

26:21 So that's okay.
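
A minimal sketch of the kind of script described above: count posts per period, then draw the counts as a terminal bar chart with plotext. The sample dates and the helper names `posts_per_year` and `show_bar_chart` are made up for illustration; this is not Bob's actual code. The plotext import is kept inside the plotting function so the counting helper works even without the package installed.

```python
from collections import Counter
from datetime import date


def posts_per_year(published):
    """Count how many posts were published in each year, sorted by year."""
    counts = Counter(d.year for d in published)
    years = sorted(counts)
    return years, [counts[y] for y in years]


def show_bar_chart(years, totals):
    """Draw the counts as a terminal bar chart (needs `pip install plotext`)."""
    import plotext as plt  # lazy import: only required when actually plotting

    plt.bar([str(y) for y in years], totals)
    plt.title("Posts per year")
    plt.show()


# Made-up publication dates, purely for illustration.
sample = [date(2020, 1, 5), date(2020, 6, 1), date(2021, 3, 9), date(2022, 8, 30)]
years, totals = posts_per_year(sample)
```

Calling `show_bar_chart(years, totals)` in a terminal would then print the blocky bar chart.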

26:22 And so then I went over and looked at this package, plotext.

26:27 And it's cool.

26:30 Look at all these awesome plots.

26:32 I was looking at some of the various things you can do.

26:36 It's got basic plots for, you know, just like sine waves and things like that.

26:41 But you can also do filled-in plots and then multicolor.

26:46 There's kind of a lot.

26:47 Kind of cool stuff you can do on the command line.

26:50 And then even data streams. Look at that.

26:54 It's a data stream going on in a plot in your terminal.

26:57 It's pretty great.

26:58 Images even.

27:00 So there's a cat image.

27:02 You can do lolcats all day long.

27:04 Yeah.

27:05 The people that put together those examples knew what the internet wants.

27:08 And to do cat pictures.

27:10 Yeah.

27:11 So, and then even subplots.

27:14 So the first example we saw has kind of all this. It's

27:19 not actually that bad of an interface.

27:22 It looks pretty, you know, it's tedious to put together plots anyway, but this isn't

27:26 too bad.

27:27 But that cover image that we saw is not a combination of images.

27:32 That's one plot with subplots in it.

27:35 So I see.

27:36 That's cool.

27:37 So within one terminal window, you can do almost like a dashboard view with different plots and

27:41 they could probably be updating live.

27:43 And yeah.

27:44 Yeah.

27:44 So this is pretty exciting.

27:46 I like it.

27:48 So anyway, I just wanted to say, hey, if you want to plot on the command

27:53 line, you can use this.

27:54 I'm loving this terminal renaissance. It's so fun.

27:58 So yeah, it makes us feel like hackers again, you know?

28:03 So it does absolutely make you feel like a hacker.

28:05 I love it.

28:06 It's so good.

28:07 So, all right.

28:10 On to the next item.

28:10 Yeah.

28:11 Just, hadn't really planned to talk about this, but I just yesterday did an episode with

28:17 Will McGugan, "7 lessons from building a modern TUI framework."

28:21 Brian, you covered that article last week on this show.

28:23 So I reached out to Will and said, hey, we should absolutely cover this stuff

28:27 as, like, a deep dive.

28:29 So, Oh, I can't wait to listen.

28:30 This is great.

28:31 People can go check that out as well.

28:32 All right.

28:33 But let's talk about one of my very favorite things.

28:37 HTMX.

28:38 If people are not familiar with HTMX, you really owe it to yourself to check this out.

28:43 It's what the web should have been forever, but it wasn't for some reason.

28:47 It's like it stalled in the late mid nineties.

28:49 I don't know.

28:50 And, you know, hyperlinks and forms are the only things that can make requests.

28:54 You can only click on them to make it happen.

28:57 And so on.

28:58 Why should the entire screen have to be replaced on every interaction, and all those things?

29:02 So HTMX is awesome.

29:04 You can just put in little fragments of declarative code and it does all the cool work.

29:10 You can have a class on it.

29:11 People want to check that out, but that's not the topic of today.

29:13 The topic is template fragments.

29:15 So Carson Gross over there wrote this essay called Template Fragments.

29:20 He said, one thing you might consider: in HTMX, you very frequently have to first

29:26 show the page.

29:26 And then as little sections of it update, it goes back to the server and says, I just need

29:31 the HTML block that goes into this fragment here.

29:34 Because somebody moused over something else.

29:36 So refresh its related item or whatever.

29:38 He's a big fan of this design principle called locality of behavior, where instead of

29:43 having a bunch of pieces that cling together and reassemble themselves, like if it could just

29:48 all be right there, wouldn't that be great?

29:50 So he says, normally the way that you would have to do this is you would have to have your

29:55 full HTML and then a little subsection.

29:58 And then that subsection has the optional element, but some frameworks, some template libraries allow

30:06 you to define a fragment.

30:07 And then when the code is requested on the server, it can either show the whole thing or just peel

30:13 that fragment out of the HTML, but you don't have to split it into a bunch of small files.

30:17 Cool.

30:18 Huh?

30:18 It's really useful if there's no reuse.

30:21 Like if the only reason you would make that little fragment is so that you could return it

30:25 separately.

30:25 This is great because basically it means you can just write the page once and it's, it can

30:30 interact with different data, different elements.

30:32 If for some reason that fragment were being used in multiple places, all of a sudden it's like code

30:37 duplication and that's not ideal.

30:38 But so we talked about this and, hey, there are some known implementations of this.

30:44 Apparently Django has the django-render-block extension.

30:47 I created the Jinja partials and Chameleon partials, which I'm not really sure about.

30:51 I'm thinking I might actually take them out now that there's something better for Jinja,

30:55 which I'm about to talk about.

30:56 But nonetheless, those kind of sort of allow this, but more in the second way I described,

31:02 where you have a fragment that's separate, but included.

31:04 But I was talking with Sergi Pons Freixes, and he said, between Jinja2 fragments

31:10 and Michael's Jinja partials, htmx plus Flask is so awesome.

31:17 So he created this library called Jinja2 fragments, which does exactly what I described.

31:23 So in Jinja, you have blocks, like you might have your main HTML and you say, here's a block of main content.

31:28 With his library,

31:29 what you can do is you can say either just render the template, or you can now call render_block and name

31:36 just part of your Jinja template.

31:38 And that part comes back with the data you supply to it.

31:40 That's pretty awesome.

31:41 Right?

31:41 Like this, this one paragraph is the whole response from the server.

31:45 If you call render block instead of render template.
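
To make the idea concrete, here is a small sketch of rendering either a whole template or only one named block of it. It uses Jinja2's own `Template.blocks` and `Template.new_context` rather than the Jinja2 fragments library itself, so the helper `render_single_block`, the template text, and the variable names are all assumptions for illustration, not that library's code.

```python
from jinja2 import DictLoader, Environment

# A hypothetical one-file page; the "content" block is the fragment
# that an htmx request would ask the server for.
TEMPLATES = {
    "page.html": (
        "<html><body><h1>Site</h1>"
        "{% block content %}<p>Hello, {{ name }}!</p>{% endblock %}"
        "</body></html>"
    )
}

env = Environment(loader=DictLoader(TEMPLATES))


def render_single_block(environment, template_name, block_name, **context):
    """Render just one named block of a template with the given data."""
    template = environment.get_template(template_name)
    block_func = template.blocks[block_name]  # KeyError if the block is missing
    ctx = template.new_context(context)
    return "".join(block_func(ctx))


# First page load: the whole document.
full_page = env.get_template("page.html").render(name="Ada")
# Later partial update: only the fragment, from the same template file.
fragment = render_single_block(env, "page.html", "content", name="Ada")
```

Both responses come from a single template file, which is the locality-of-behavior point: the fragment never has to be split out into its own file just so it can be returned separately.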

31:47 This is, yeah, this is super great.

31:50 Honestly, on Twitter, every time I see htmx, I'm just like, I am so prepared to write a website,

31:56 because I've not had the use case for a while, but I'm very excited for the next time

32:01 I will have.

32:02 I'm exactly the same.

32:03 I've worked on projects that have been around for six or seven years.

32:06 I'm like, if I rewrite this thing, it's getting htmx all over it, but I just can't bring

32:10 myself quite, quite to do it.

32:12 But yeah, it's, it's so good.

32:13 One day.

32:15 A couple of comments from the chat.

32:18 Vincent from calmcode says htmx is the bee's knees and that calmcode uses it a whole

32:24 bunch.

32:25 I am not surprised.

32:26 Fantastic.

32:26 It's awesome.

32:28 Yeah.

32:28 Any website I create after knowing about htmx is likely going to be using htmx.

32:33 If you thought the answer was Vue.js or React or something like that, you may really, really,

32:39 really want to check this out first.

32:41 Well, especially if you're somebody like me. I'm like, yeah, I want to put

32:45 this interactive stuff in here.

32:47 I'm not an expert in JavaScript, though.

32:52 So I'm not sure.

32:53 But I do know somebody that knows a lot about htmx.

32:57 You might know someone. You're venturing very close to getting me off onto, like,

33:01 a very long rant about htmx.

33:04 But it's so good because even if you know JavaScript, wouldn't it be better to not

33:08 have to think about now I'm running client code.

33:10 Now I'm running server code.

33:11 Now I'm running the APIs to connect the client code to the server code.

33:14 And this one's in this language.

33:15 It knows this. That one's in that language, in this location.

33:18 It knows that. In htmx, you just write it all in one place, in one language, with the

33:22 same context and security model and everything, access to the database, for example.

33:27 And then you just do what you need to do.

33:29 It's, it's perfect.

33:30 Well, and it's not really just about thinking about two languages

33:33 either.

33:34 There are a lot of people like me that already have to think in two

33:39 languages.

33:39 I'm thinking in C++ and Python.

33:41 So thinking about it in a third language or a fourth language.

33:46 That's, it's like, you know, come on having a place to stop plus.

33:49 Yeah.

33:50 Yeah.

33:51 A final comment I'll make on this is even people who are using Node.js like htmx, where

33:56 it's the same language.

33:57 It's like, it's also just about the context and location switch.

34:00 Oh yeah.

34:01 That's, I didn't hear, hadn't heard that.

34:03 That's pretty cool.

34:04 Yeah.

34:04 Seth, it sounds like you were going to say something.

34:06 Maybe I'll let you have the last word here.

34:07 Oh no.

34:08 I was honestly just going to say that like the more we can stay in HTML, the better because

34:12 you have to know HTML.

34:14 So you might as well stay in it.

34:15 Right?

34:15 Yeah, absolutely.

34:16 Absolutely.

34:17 So well done, Sergi.

34:19 Check out his Jinja2 fragments framework.

34:22 It is super new.

34:24 Like, I don't know when it got released, but a couple of days ago. These are like two and three

34:29 days old on all the commits here.

34:30 I was going to say that long stream.

34:32 Very, very new.

34:32 Two, three days.

34:34 Yeah.

34:34 Well done.

34:35 Well done.

34:36 All right.

34:36 Seth, over to you for the final one.

34:39 Sure thing.

34:40 Yeah.

34:40 This article was announcing something that's been getting worked on for a while,

34:46 which is generic generators for SLSA 3.

34:50 So what you're seeing there, SLSA, stands for, if I can remember, Supply-chain

34:57 Levels for Software Artifacts.

35:01 So SLSA, and you pronounce it "salsa."

35:05 And of course, it's essentially a great, great way to say that acronym.

35:09 Yeah.

35:09 Right.

35:10 Makes you hungry every time, which is the best part.

35:12 But yeah, it's basically a set of tools and standards to attest and verify

35:19 the provenance of artifacts.

35:21 So essentially, where did this thing come from?

35:24 This file, this wheel, this jar, it's ecosystem independent. Whatever

35:29 artifact you're building, where did it come from?

35:32 How was it built?

35:33 So it uses a whole bunch of different cryptographic primitives and OpenID Connect,

35:39 which is basically magic.

35:41 And it basically allows you to prove, in effect, okay, this was built from this specific

35:48 GitHub repository, this commit, this tag.

35:51 And someone can then later take this file, this artifact that got built, and verify that that was the case.

35:59 And so this will, hopefully, in the future be used as a defense against, maybe, stolen credentials on the Python Package Index.

36:08 That would never happen.

36:09 That would never happen.

36:10 That, that would, that would never happen.

36:11 Right.

36:11 That's never happened.

36:12 And that would, that has never happened.

36:14 Other than at the time of the recording never has happened, I would say.

36:17 So yeah, it gives a good defense against this.

36:21 Right.

36:21 Cause if you, let's say you have a package and the Python package index knows that this package came from, you know, github.com/sethmlarson/whatever.

36:31 Right.

36:31 And then in the future, it received something that doesn't come from that GitHub repository.

36:37 It can flag that and say, Hey, this isn't right.

36:39 Like this didn't come from the place that it came from before or wherever it's, you know, supposed to come from.
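
The check described here can be sketched in a few lines. This is a deliberately simplified illustration: it only compares an artifact's SHA-256 digest and its source repository against a provenance document, and it skips the signature verification that real tooling performs. The dictionary layout is loosely modeled on in-toto/SLSA provenance and should be treated as an assumption, as should the repository URL, file name, and function names.

```python
import hashlib

# Hypothetical source repository that the index expects releases to come from.
EXPECTED_REPO = "git+https://github.com/example-org/example-project"


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the artifact bytes."""
    return hashlib.sha256(data).hexdigest()


def matches_provenance(artifact: bytes, provenance: dict, expected_repo: str) -> bool:
    """True if both the artifact digest and the source repo match the provenance."""
    digest = sha256_hex(artifact)
    subjects = provenance.get("subject", [])
    digest_ok = any(s.get("digest", {}).get("sha256") == digest for s in subjects)
    source_uri = (
        provenance.get("predicate", {})
        .get("invocation", {})
        .get("configSource", {})
        .get("uri", "")
    )
    # The URI is assumed to look like "<repo>@refs/tags/<tag>".
    repo_ok = source_uri.split("@", 1)[0] == expected_repo
    return digest_ok and repo_ok


wheel_bytes = b"pretend this is a wheel file"
provenance = {
    "subject": [
        {
            "name": "example-1.0-py3-none-any.whl",
            "digest": {"sha256": sha256_hex(wheel_bytes)},
        }
    ],
    "predicate": {
        "invocation": {"configSource": {"uri": EXPECTED_REPO + "@refs/tags/v1.0"}}
    },
}

genuine = matches_provenance(wheel_bytes, provenance, EXPECTED_REPO)
tampered = matches_provenance(b"a swapped-in wheel", provenance, EXPECTED_REPO)
```

An upload made directly with stolen credentials would have either no provenance at all or a digest that does not match, which is the mismatch an index could flag.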

36:45 And the fact that this is generic is the big deal.

36:48 The part that ties us back to Python is that you can use it for wheel files and source distributions.

36:54 You can sign, like, anything.

36:56 And so, for example, one of the Python projects that is featured in here is urllib3.

37:01 I've been trying to get into this and it's been really successful.

37:05 urllib3 now does this, and you can actually verify that it came from a specific repo and that the wheel came from a specific tag.

37:13 And yeah, it's really interesting.

37:15 And this ecosystem is just getting started.

37:17 And so if you're interested in anything about supply chain security and all of that, this is a great place to start.

37:24 Doing some learning about what the future might look like.

37:26 Yeah, this is great.

37:27 I, when I first saw this, I thought, okay, this is cool, but how does that really help act against somebody sabotaging a package?

37:35 But then again, if you think back to what happened with some of those other packages, somebody got ahold of the PyPI account, not the GitHub account.

37:44 And they just published a new version directly, not through the CI, right?

37:49 Right.

37:49 Yeah.

37:50 So this is making, it just makes the amount of things that need to get compromised even larger.

37:55 Right.

37:55 Like, right.

37:56 It closed.

37:56 No longer

37:57 do you need to only compromise the email account on PyPI.

38:01 You have to also compromise GitHub.

38:03 And then if you have, you know, GitHub environments configured, you need to compromise a second account to like review the deployment.

38:10 And so it just makes it even harder to actually get that attack off essentially.

38:16 Yeah.

38:18 And if you had to publish the actual vulnerability to a popular GitHub repository to trigger it, it would be discovered sooner.

38:25 Right.

38:26 Because people are like, oh, what's, oh, that's, that's unusual.

38:30 Who is this that made this commit?

38:33 And now it's doing this URL thing over to haxor.com.

38:36 Right.

38:37 Like that's just another out-in-public thing.

38:41 Whereas if the direct account gets attacked, somebody can just use twine or something directly to push a bad wheel up.

38:47 Yeah, exactly.

38:48 Yeah.

38:48 No more pushing bad wheels.

38:50 You have to go through so many different hoops just to do something.

38:53 You need to flatten those bad wheels.

38:54 Yes.

38:55 Got to inspect them too.

38:57 Exactly.

38:59 All right.

39:00 Awesome.

39:00 This is, this is good stuff.

39:01 Well, Brian, that's no.

39:04 Do you have any more?

39:05 No, that's all of them.

39:05 Do you have any extras for us?

39:07 I do.

39:08 Although I'm going to try to make it quick because now I'm hungry for some salsa.

39:11 So I'm super excited for this upcoming weekend.

39:17 I can't believe it.

39:18 So on Saturday, September 10th,

39:24 I will be in San Francisco.

39:26 And I've got two events going on at PyBay.

39:29 So PyBay, awesome conference.

39:32 I haven't been there before, but you were there last year or something like that.

39:35 Last year.

39:36 And I absolutely loved it.

39:37 I would go this year if I wasn't on single parent duty and had kids that had to go to school.

39:42 So I'm doing two events.

39:45 So one of them is Sharing is Caring: pytest fixture edition.

39:48 I'm going to talk about building.

39:50 Actually, I'm just going to talk about packaging, but it's not really about packaging.

39:55 It's about sharing fixtures with other people.

39:58 And because I think that that's a bigger need than, than people realize.

40:02 So anyway, love fixtures.

40:04 We're going to talk about that.

40:05 And then, and then I got asked to be on this experts panel.

40:09 We've got Zac Hatfield-Dodds, me, Andy Knight, who is, he's got a good automation

40:19 pin.

40:19 Automation Panda.

40:20 That's right.

40:21 Joshua Grant and Nishat Khan.

40:24 So it should be a fun panel.

40:26 And it's at seven o'clock at night.

40:28 I'm like, wow.

40:29 I think I really need to change my flight because I was planning on flying out at 8 a.m.

40:35 the next day.

40:36 And it's going to be tough.

40:37 So, so that's going on next weekend.

40:39 I'm pretty excited.

40:40 Yeah.

40:40 By length says, good luck on the talk, Brian.

40:42 Oh, thanks.

40:43 So how about you?

40:45 Do you have any extras?

40:46 I do.

40:48 I do a bunch.

40:49 I'll make them pretty quick.

40:50 So Heroku, you know, the platform-as-a-service place.

40:54 For 13 years or something they have had a free plan where people can go and create, what

40:59 are they called?

41:00 Dynos or something.

41:01 I don't use.

41:01 Yeah.

41:02 Dynos.

41:03 I don't use Heroku, so I don't know the terminology and how all the plans break down.

41:09 But for a long time, they've had free plans, but now they are canceling them and you will

41:15 either have to pay or delete your projects.

41:20 So that's going to affect a lot of people.

41:22 They have something like 13 million.

41:24 What's the right number here?

41:26 Claims, yeah, that it's been used to develop 13 million apps.

41:31 So I bet many of those are free and are going to be suffering this.

41:35 There's an interesting discussion on Y Combinator, so you can check that out.

41:39 I'm sure it's very civil over there in the comments, as it always would be.

41:44 Yes.

41:44 Yeah.

41:45 But basically, you know, Heroku was purchased by Salesforce. They claim, and it may be

41:51 true.

41:52 I'm sure that it is somewhat true.

41:53 They want to cancel this because of fraud and abuse.

41:56 It may be more that they have to spend so much money to fight the fraud and abuse that it's

42:00 just not worth it to them.

42:01 I don't know what it is.

42:02 But however you land on whether it's a good idea or a bad idea, it's going to cost money if you

42:07 want to use this.

42:08 And it's pretty pricey, by the way.

42:10 Like, this change will roughly double the cost of a basic plan that uses Redis.

42:14 Going from free up to, like, $50 a month. You start bringing in your Redis cache and your Postgres

42:20 hosting and your dynos, and they all add up.

42:23 And then you got to scale this one or that one, right?

42:25 One of the reasons I'm not using it, but not the only reason.

42:29 I just want a little more control as well.

42:30 But anyway, so if you have a free thing running on Heroku, or you were thinking about it, you

42:35 have to think again, find something else.

42:36 At the bottom, there's actually a bunch of platform-as-a-service things that I've never

42:41 heard of.

42:41 There's Porter, Railway, Render, Fly.io, and CleverCloud.

42:44 All of these things vying for this business.

42:47 They all look kind of interesting.

42:48 I know nothing about them.

42:50 You can check it out.

42:50 I've seen Fly.io all over the place on Python Twitter, at least.

42:54 Yeah.

42:54 Okay.

42:55 So if I were personally picking one, I would check that one out first.

42:59 But I don't know anything about any of them, to be honest with you.

43:02 The last time I used Heroku was a long time ago.

43:05 I'd like to see some real comparisons among some of these.

43:11 If somebody's just saying, there's still a place for hobby projects.

43:16 I want to try something out, or do something live, even as a high school app,

43:25 or something like that.

43:25 I know.

43:26 Oh, good.

43:27 You're going to show PythonAnywhere.

43:29 I was going to.

43:29 I got to find the right link.

43:30 Here we go.

43:32 So I think they still have a free tier.

43:34 I think so.

43:35 I don't know if they advertise it much.

43:36 I think it's free.

43:38 Yeah.

43:38 The part that bothers me really isn't that it's...

43:41 I don't...

43:42 There's a comment in the chat about...

43:46 It's hard to complain about people.

43:51 It's a free service, so they can do whatever they want, right?

43:54 Essentially.

43:54 Yeah.

43:56 Oh, there's that right.

43:57 That's the right one.

43:58 Yeah.

43:58 However, the jump between free and $50 a month is a big jump.

44:04 And that's my gripe about it.

44:07 So anyway.

44:08 Yep.

44:09 Yeah, not to turn this into a recommendation, but yeah, I feel like a lot of the cloud services

44:14 have really pushed how easy it is to deploy.

44:18 Because I remember initially starting with Heroku, the ease of deployment was the big win

44:23 for a lot of people.

44:24 And so, yeah, a lot of cloud services where, you know, you pay for everything you use, but

44:29 what you use ends up being a few cents a month, which is a lot more surmountable than $50 a month.

44:35 So yeah, there's definitely a gap there, but there's not as much of a gap there as there

44:39 was before.

44:39 Yeah, for sure.

44:41 Brian out in the audience says, at my last company, we had to disable our free tier due

44:45 to crypto miners.

44:46 Yeah, of course, I'm sure.

44:48 And Kim also has something, yeah, stealing the computation there.

44:53 But all right.

44:53 Anyway, again, I didn't want to go too far down that one, but for sure, check out some of the

44:58 options below.

44:59 DigitalOcean and Linode are also really, really good options.

45:03 This one, I'm full of rants today, potential rants.

45:07 This one comes to us from ExtremeTech.

45:09 White House, as in the US, bans paywalls on taxpayer-funded research.

45:14 It has always felt super creepy and wrong that we have the NSF, which pays billions of dollars

45:23 a year, millions for individual research projects, to come up with scientific research that all three

45:29 of us and many people listening actually pay for.

45:32 I'm glad to pay it.

45:34 I think this is really important.

45:35 It's important for the country.

45:37 It's important for the world.

45:38 And yet, those results get locked up behind really expensive for-pay scientific journals.

45:45 Right?

45:46 Like, you've got to pay $5,000 a year to subscribe to this journal so that you can read the article

45:50 that, wait, we paid to create that and we can't even get access to it?

45:54 So this article here is, the White House has updated federal rules to close a loophole that

46:00 enabled journals to keep taxpayer-funded research behind a paywall, which I think is great.

46:05 So if you're on the data science side, I think this might be relevant to you.

46:09 Huh.

46:09 Yeah.

46:10 I'm curious how that's going to get implemented.

46:12 Yeah.

46:13 So, yeah.

46:14 Me too.

46:14 All right.

46:14 Anyway, there's that.

46:15 And then, Seth, back to some of the stuff you were talking about.

46:19 I mean, it would never happen that someone would try to phish.

46:22 Wait.

46:23 No.

46:23 Last week, somebody tried to phish PyPI.

46:26 Maybe it was the week before when it started, but not too long ago.

46:29 So over on darkreading.com, there's an article that says, threat actor phishing PyPI users

46:35 has been identified.

46:36 JuiceLedger has escalated a campaign to distribute its information stealer by now going after developers

46:42 who publish code widely used on the Python code repository.

46:45 Don't want to go too much into it, but there's this group who had originally tried to do

46:50 typosquatting, if I'm correct.

46:52 They wrote some information-stealing malware, written in .NET, by the way, which Will was

46:59 joking about, it only running on Windows.

47:01 Hey, if they use .NET Core, they could expand out the open source version.

47:05 Anyway, I don't want to give them ideas, but they were distributing this malware through

47:10 these malicious packages.

47:12 And then they said, well, what if we could get really popular ones, hack their accounts,

47:16 and then upload bad wheels?

47:17 So anyway, there's a bunch of background on the actual people behind this.

47:21 So it's pretty interesting.

47:23 You can check out that article if you want.

47:24 There's also an Ars Technica article, but it doesn't have as much depth as the dark reading

47:29 one.

47:29 Nice.

47:30 All right.

47:30 Last one.

47:31 I think this is the last one.

47:33 Brian Skinn, former co-host on the show, who always contributes many interesting things,

47:38 says, Python Bytes will definitely want to check this out.

47:41 This is a tweet by Steve Dower that says, we have published the details of a critical security

47:48 problem for Python.

47:49 It is very rare that we have direct vulnerabilities in Python.

47:54 Like, it was all fun to have the lols about JNDI and Log4j.

48:01 But this is not exactly that, but it's a denial of service at that kind of scale.

48:07 So if you've ever thought, I have a string and it needs to be an integer, and that string came

48:12 from user input, that's really bad, it turns out, because there's a denial of service thing

48:18 that you can do by passing very, very long strings to that integer parsing.

48:22 Seth, you're shaking your head like, oh boy.

48:24 Yes.

48:25 Yeah.

48:26 If you've been waiting to upgrade from Python 2, now's the time to upgrade to Python 3, I would

48:31 say.

48:31 Exactly.

48:32 The security support.

48:33 And you shouldn't say, eh, just go to one of the older ones.

48:37 You need to get 3.10.7 ASAP.

48:41 I suspect they'll roll this back to some of the supported ones as well.

48:44 So they'll probably backport it to 3.9 and 3.8.

48:47 But if you're on, say, 3.6, that's a problem.

48:50 That's a big, big problem.

48:52 Yeah.

48:52 So expect releases for 3.7 plus in the next week.

48:55 This came out a few days ago.

48:57 This has now been done.

48:59 But this Twitter thread is super interesting, and that's what I'm linking to.

49:03 So y'all can check that out.

49:04 There was also some feedback like, what are you doing?

49:08 How dare you fix this?

49:10 The way they fixed this is they said, if you're doing base 10 parsing, you can only use 4,300

49:16 digits.

49:16 Not the number to 4,000, but places in the number, 4,000 places.

49:21 That's a really large number.

49:23 If it's bigger than that, basically Python won't be able to parse it before.
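
As a rough illustration of how application code might guard against this on any Python version, the sketch below rejects oversized input before it ever reaches `int()`. The helper name `parse_user_int` is hypothetical, and 4,300 simply mirrors the default mentioned above. On patched interpreters the limit can also be read through `sys.get_int_max_str_digits()`, which older builds do not have, so the code checks for it.

```python
import sys

# 4300 mirrors the default digit limit mentioned above; treat it as an
# assumption and tune it to what your application actually needs.
MAX_DIGITS = 4300


def parse_user_int(text: str, max_digits: int = MAX_DIGITS) -> int:
    """Parse untrusted text as a base-10 integer, rejecting huge input early."""
    cleaned = text.strip()
    # Allow one extra character for an optional leading sign.
    if len(cleaned) > max_digits + 1:
        raise ValueError(f"integer string too long ({len(cleaned)} characters)")
    return int(cleaned)  # may still raise ValueError for non-numeric input


# Patched interpreters expose the limit; older ones simply lack the attribute.
get_limit = getattr(sys, "get_int_max_str_digits", None)
interpreter_limit = get_limit() if get_limit is not None else None
```

The length check is cheap and runs before any parsing, so the expensive conversion never starts for an input like `"9" * 1_000_000`.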

49:26 Brian, you do C++ all the time.

49:28 You have to think about, is this over 32,000?

49:31 Is it signed or unsigned?

49:32 Okay, it's unsigned.

49:34 All right, we can get to 64,000.

49:35 This is not that level of thinking, but you kind of do have to think about what the heck's

49:40 going on here.

49:41 I think it's a fair fix.

49:44 I do too.

49:45 People are freaking out for no reason.

49:46 Yeah, this one was really, this one's wild too, because you just pass a long number.

49:51 It's not something sophisticated or anything.

49:53 It also feels almost, not log4j, but kind of log4j a little bit, where you can just do

49:59 denial of service by doing something very trivial.

50:02 Exactly.

50:03 Yeah, you just try to set your username to jndi, colon, slash, slash, hackster.com.

50:10 This is like, well, the number is A1722117.

50:13 Yeah, and then boom, now it goes to the website, right?

50:15 This is denial of service versus remote code execution, which is clearly better, but it's

50:20 not good.

50:21 Yeah.

50:21 Just hold down the zero key for a little longer.

50:23 Exactly.

50:24 Or if you're writing Python code, you can just do times 10,000, caret 10,000, you know, power

50:30 to 10,000 or something and send that.

50:31 Yeah, string extension really coming in handy here.

50:34 RPAD.

50:34 Exactly.

50:36 Or zfill, right?

50:38 ZPAD.

50:39 zfill.

50:39 Exactly.

50:40 Yeah, piling wants to send pi across.

50:43 Yeah, that's going to upset it.

50:44 Anyway, I upgraded my servers to 3.10.7.

50:47 It was not available from Ubuntu directly.

50:50 It was still the old 3.10.6, which is unnerving.

50:53 But because I build mine from source, I just changed the number to 3.10.7, rebuild and redeploy

50:59 Python.

50:59 I'm good to go.

51:00 I imagine everybody listening to this podcast is on 3.7 or above if they at any chance can

51:08 be.

51:08 I mean, if they're below, it's not because they haven't tried.

51:13 Yeah.

51:14 Well, let me point this out.

51:15 I would say, actually, I want to follow up with a couple of things because this is, maybe

51:18 this should have been the main item, but whatever.

51:20 One, we've talked about the reason you should upgrade to Python 3 for a long time.

51:26 And Brian, you and I had lots of fun calling it Legacy Python.

51:29 Although we've had people go into iTunes and like post negative reviews of the podcast because

51:35 I had said disparaging things of Python 2, but it's okay.

51:38 I'm willing to stick by them.

51:41 Oh my goodness.

51:42 That is wild.

51:43 More reviews.

51:45 Awesome.

51:45 If you have good things to say, also consider posting a review, not just if you're angry

51:50 that I called it Legacy Python.

51:51 But if you're on old Legacy code, which is even 3.5, but is very seriously Python 2 because

51:59 the gap to upgrade is really hard.

52:00 These are the types of things that we warned about that could be a problem.

52:04 Yeah.

52:05 And there will be no fix, right?

52:06 You better just say, well, we're going to make sure the strings that are really destined

52:11 to be integers are really, really checked.

52:12 And, you know, I mean, it's not good.

52:15 It's not good.

52:16 So just one more reason to be on a shipping version of Python, even if it's just 3.7.

52:22 Yeah.

52:22 All right.

52:23 Yeah.

52:25 That's it.

52:25 Let's see.

52:26 Yeah.

52:27 The changelog.

52:28 One other really quick.

52:29 Yeah.

52:30 So you can see it's like actually described quite well here.

52:32 Patched by Gregory P.

52:34 Smith and Christian Heimes.

52:36 Feedback by a bunch of great folks.

52:38 Sebastian Ramirez said, I sent a tweet out when this got fixed saying, please be kind

52:43 to your open source contributors.

52:45 They just wrote 800 lines of code in a PR so that you can parse strings to integers.

52:51 So apparently it wasn't easy to fix.

52:53 But yeah, I agree.

52:54 Cool.

52:55 Ready for a joke or actually, Seth, you got anything extras you want to throw out first?

52:59 Yeah.

53:00 I had a real hopefully quick one.

53:02 Yeah.

53:03 So I follow a whole bunch of game art accounts on Twitter because I just love

53:08 it.

53:08 Mm hmm.

53:09 Seeing what people create.

53:10 And one came by.

53:11 It was using the hashtag Pyxel, P-Y-X-E-L.

53:15 Did a little ding.

53:16 I'm like, wait a second.

53:17 That's Python.

53:17 And then I just went back in this developer's Twitter a few tweets back and they just

53:23 released WASM support for this Python game framework.

53:27 I'm like, this is incredible.

53:29 So, yeah, it was a very fast

53:33 journey of wow.

53:34 Wasm is everywhere at this point.

53:36 That's kind of kind of wild that it's popping up so fast.

53:38 So, yeah, version 1.8.0 of this retro game engine for Python, which they had a whole

53:45 bunch of really beautiful like examples.

53:47 I think you all have covered this framework before, but I'm not.

53:49 We have.

53:50 Yeah, this is really cool.

53:52 Yeah.

53:52 So apparently they have a whole bunch of demos that you can just play in the browser.

53:56 And I was really blown away that I didn't even know this existed.

53:59 And suddenly there's wasm support for it.

54:01 So awesome.

54:03 I love it.

54:04 OK, that's a great one.

54:05 Yeah.

54:06 All right.

54:07 How about we close it out with a bit of a joke?

54:08 Have you ever felt like you've had a hard day at work?

54:11 There's one of these problems like parsing integers.

54:13 You're like, how could this possibly go wrong?

54:15 I just don't understand what is happening.

54:17 Well, here we have a joke of a guy at a nighttime soccer game.

54:20 Apparently it's a little cool, but he's been running really hard.

54:25 So it's a picture of a guy whose head is literally steaming, like not a little bit, a lot, a lot.

54:32 I think that's a visualization of like integer being parsed into a string right there.

54:37 Exactly.

54:37 The before.

54:38 I'll read what the tweet really says.

54:40 And then maybe we can play with it a little.

54:42 It says, the tweet says, just a JavaScript developer after work.

54:45 You know, like, what do you mean I have to do a new framework?

54:47 I just did a new framework last month.

54:49 I feel like this could be Christian Heimes after going, what do you mean parsing?

54:55 Parsing an integer is a denial of service.

54:57 I just can't.

54:59 The ints are wrong.

55:01 The ints are cursed.

55:02 Exactly.

55:04 Anyway, I'll just leave this here for people to appreciate.

55:07 We can call it a show 300.

55:09 Yeah.

55:09 Nice.

55:10 Thanks.

55:11 Yeah.

55:12 Thank you, Brian.

55:13 Seth, thanks so much for being here and sharing the work you've been doing.

55:16 Yeah.

55:17 Thanks so much for having me.

55:18 Yeah.

55:18 It's been great.
