Transcript #294: Specializing Adaptive Interpreters in Full Color
Return to episode page view on github00:00 I am pulling off a very, very cool trick.
00:03 I just want to point out before we get started.
00:05 >> Okay.
00:05 >> On the Talk Python channel, I'm doing a podcast with Anthony Shaw and Shane from Microsoft about Azure and Python and some CLI stuff they built in FastAPI.
00:15 At the exact same time, I'm doing this one here and they're both streaming live.
00:19 >> I don't know how that's happening.
00:22 >> The other one was recorded two months ago and we couldn't release it because some of the things weren't finished yet.
00:27 So I hit go on that.
00:29 The real one, if you're bouncing around, the real one is here.
00:32 >> Okay.
00:33 >> So, Jonah's here. Anyway, with that, you're ready to start a podcast?
00:36 >> Yeah, definitely.
00:37 >> Hello, and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds. This is episode 294, recorded July 12th, 2022. I'm Michael Kennedy.
00:48 >> And I am Brian Okken. It's just us this weekend, or this-
00:51 >> It's just us.
00:52 >> Yeah.
00:52 >> Yeah, it's, I don't know. Dean of the audience asks, is this a daily podcast show now?
00:58 I'm a little bit torn about I feel like we almost could do a daily show, but then I think what it might take to do a daily show knowing how much work a weekly show is.
01:08 I know it's not a day.
01:09 It's not a daily pocket.
01:10 No, might be fun to do sometime.
01:12 Just do like a full week or something.
01:14 Just right.
01:15 Exactly.
01:15 Just a super this.
01:17 There's so much news.
01:18 We're seeing every day for the week.
01:20 Cool.
01:20 Just like the same topics like six days in a row.
01:23 Over.
01:24 Yeah.
01:25 Exactly.
01:27 Exactly.
01:28 >> All right. Am I up first this week?
01:30 >> You are. Yes.
01:32 >> Right on. Well, let me tell you about something special, specialist.
01:37 >> Okay.
01:37 >> Just last week, I believe it was, I interviewed Alex Waygood who did the write-up for the Python Language Summit.
01:45 As part of the topics we were discovering, the Python Language Summit and Python this year is focusing a lot on performance and what's called the Shannon plan.
01:54 So this is Mark Shannon's plan to make Python five times faster over five releases.
01:59 It's got a ton of support at Microsoft.
02:02 Hito van Rossum's there working on it, but they've hired like five or six other people who are full-time working on making Python faster now.
02:08 So awesome, awesome.
02:10 Thank you for that.
02:11 However, one of the things that made Python 3.11 fast is some of the early work they did.
02:18 And it comes down to PEP 659, a specializing adaptive interpreter.
02:24 So let me tell you about this feature, this performance improvement first, and then we'll see what specialist is about, 'cause it's about understanding and visualizing this behavior.
02:35 So one of the things that is a problem with Python, because it's dynamic and its types can change and what can be passed could vary.
02:44 I mean, you could have type hints, but you can violate the type hints all day long and it's fine.
02:49 So what the interpreter has to do is say, well, we're going to do all of our operations super general.
02:55 So if I have a function and it's called add and it takes X and Y and it returns X plus Y, seems easy, but is that string addition?
03:03 Is that numerical addition?
03:05 Is that some custom operator overloading with a dunder add or whatever it is in some type?
03:12 If it fails in one way, you kind of got to reverse it.
03:14 Like there's all this unknown, right?
03:15 >> Yeah.
03:16 >> What if you knew?
03:17 What if you knew those were integers and not classes or not strings?
03:22 You could run different code.
03:24 You wouldn't have to first figure out what they are.
03:26 Are they compatible?
03:27 Do you do the add in low level CPython internals or do you go to like some Python class and do it?
03:34 You could be much more focused.
03:37 Additionally, if it was adding for a list, you could say, well, if I know their list, what we just do is go list.extend and we give it the other list.
03:46 we don't hunt around and figure out all this other stuff.
03:48 So that's the general idea of the specializing interpreter is it goes through and it says, look, we don't know for sure what could be passed here, but if it looks like over and over, we're running the same code and it's always the same types, is there a way we could specialize those types?
04:07 Is there a way that we could put specific code for adding numbers or specific code for combining lists?
04:14 And this is called adaptive and speculative specialization.
04:19 Okay?
04:20 - Okay.
04:21 - And my favorite part of it when it's performed, it's called the quickening.
04:25 Quickening is the process of replacing slow instructions with faster variants.
04:31 So kind of like I said, it has some advantages over immutable bytecode.
04:35 It can be changed at runtime.
04:36 Like you see, we're always adding integers.
04:39 It can use super instructions that span lines or take multiple operands.
04:43 and it does not need to handle tracing as it can fall back to the original byte code for that.
04:49 So there's a whole bunch of stuff going on here.
04:51 Like the example they give is you might wanna specialize load adder.
04:56 So load adder is a way to say, give me the value that this thing contains, but what is the thing?
05:02 One of the things you might do is you might realize it's a instance class and then you would call load adder instance value.
05:09 You might realize it's a module and you might call load adder module or slot or so on, right?
05:14 But if you knew, you don't have to go through first the abstract step and then figure out which of these it is.
05:20 You just do the thing that it is.
05:21 Okay, so that's the idea of this pep.
05:24 This is one of the things that's making Python 3.11 faster.
05:28 Awesome, so to the main topic.
05:30 - Okay, and I'll just as a note, I'm saying okay as if I understand what you just said, but most of it just went.
05:37 - It's all right, I think we'll, let's look at pictures.
05:40 - Okay.
05:40 So this thing by Brant Boucher, it's called Specialist, and it's about visualizing this specializing adaptive interpreter.
05:50 Oh, okay, good.
05:51 Okay.
05:52 So, it says Specialist uses fine-grained location information to create visual representations of exactly where and how CPython 3.11's new specializing adaptive interpreter optimizes your code.
06:04 And it's not just interesting, it has actionable information.
06:08 Okay.
06:09 So for example, see here, and if you got to pull up this, the website, if you're just listening, if you see in that website, you'll see some color.
06:18 You'll see green, less green, yellow, orange, and all the way to red.
06:23 So there's two aspects, there's sort of a darkness as well as a color.
06:27 So the most, like where Python could take advantage of this feature, you see green, where it can't, you see red.
06:35 and imagine a spectrum it goes like green, yellow, orange, red.
06:40 So it's not on or off, it's how much could it specialize.
06:44 So what you see here, for example, is it's able to take some numbers, an integer and a string, and then use the fact that it knows what those are to make certain things like appending an output and doing some character operations on it.
07:02 Right. It was able to replace that with a different runtime behavior because of this quickening.
07:07 So let's skip down here, give you a bit of the background.
07:10 So let's look at this example. We have F to C, which converts Fahrenheit to Celsius.
07:16 And what it does is, okay, we're going to take an F, and it has type hints that say float, float.
07:21 Okay, so, but those don't matter.
07:23 So it says, we're going to take an F and subtract 32 from it.
07:26 And then we're going to do simple math.
07:28 We're going to take that result, that range, to that size of temperature there based on zero and then multiply it by five and divide it by nine.
07:36 We all learned this in chemistry class or somewhere or we talked about converting different measurement.
07:43 - Yeah, of course.
07:44 - Right, so these are straightforward but there's actually problems in here that make it slower and prohibit Python from quickening it as much as it can be quickened, okay?
07:55 So if we take this code, it just runs F to C and C to F and it gives us some test values and says just do it and tell us what happened.
08:02 We can run specialists on it and it says, okay, this X here, the green areas indicate regions of code that were successfully specialized where red areas are unsuccessful.
08:13 Like it tried and it failed.
08:14 So it says one of the problems is, start out the X equals F minus 32.
08:20 It says, well, we can quicken operations on numerical types that are the same, but for now there's not a float int and float variant of this.
08:29 It's gotta be float float.
08:30 Oh right.
08:31 So it says, right, you could have gotten a faster operation there.
08:34 But because the types didn't match, you won't.
08:37 But then what it did get out is an x, and that's great.
08:39 An x which is a float, and it's gonna do some stuff, and it could sort of make it better, but it said, look, here's some multiplication again by an integer and a float.
08:47 So that's not quickened.
08:48 And this division, division is apparently never quickened.
08:52 So what can we do?
08:53 Well, with that information, you can say, well, what's the problem with subtracting 32?
08:57 Well, it wasn't a float. What if I said 32.0?
09:00 Oh, yes. All right. That gets replaced by faster code.
09:03 - Oh, nice. - Right?
09:04 So that's pretty nice. And if you want to return, it was adding like x plus 32 for the other direction.
09:09 And now it's 32.0. That's faster.
09:11 Okay. Well, what else?
09:12 What if we...
09:13 Now you can see when we did that part of the conversion x times 5 divided by 9, if we put a 5.0, that gets faster still, but the divide is never quickened.
09:24 Okay. Well, what if we put the divide in parentheses?
09:27 It doesn't really matter if it's X times five divided by nine or X times five divided by nine, right?
09:32 It's these are mathematically equivalent, but they're not equivalent to Python because that that operation results in It leverages constant folding right five divided by nine is pre-computed in Python to be a float Okay, right at parse time, right?
09:48 That's just how it works with constants if it says that can do math with constants ahead of time It does it so that becomes a float and then float times float is now quickened, right?
09:55 Isn't this cool the way you can apply this and actually make your code faster, not just go, "Oh, it's interesting.
10:00 It must be quick in it there." >> Yeah.
10:02 >> But it's actionable.
10:03 >> It is really pretty cool and I'd really like to see this incorporated into an editor or something to say, "Your code will be faster if you just add a 0.0 here," or something like that.
10:13 >> It's going to become a float anyway. It doesn't matter.
10:16 Why would you write 32.0 when you just meant 32?
10:19 Seems more precise to say 32.
10:22 >> Because I'm used to doing that, to thinking if it's, okay.
10:26 Well, me personally, if I know it's going to be a float math, I usually do 0.0, but maybe that's not a normal thing.
10:32 >> You're such a C programmer.
10:35 All right. Well, I think this is really cool, this specialist.
10:42 >> It is pretty cool.
10:42 >> I don't know if I have any code that does math at that fine or greater level that I really care, but maybe if you're in charge of a library where you've got a tight loop or you do a lot of math science stuff where it matters, then this can be really useful.
10:56 And what's cool is it's not like, and switch to Rust or switch to C or switch to Cython and it'll take effect.
11:03 Like no, this is like straight Python code.
11:06 This is just how do I take most advantage of what is already happening for performance boosts in 3.11 that we haven't had before.
11:13 - And I think it's gonna be just one more workflow step.
11:16 So you've got, you profile your code, your whole thing is a little bit slower than you'd like it to be.
11:23 You throw a profiler on it, you see the bottleneck areas that you could improve, and you think, should I like rewrite some of this in Rust or C or, you know, what should I do?
11:33 Well, first off, let's try doing this, like throw this at it and have the optimizer from 3.11 help you out.
11:42 And yeah, so I think this, I can definitely see that this is gonna be part of people's workflow.
11:50 But yeah, profile first.
11:51 - I agree that you want a profile first.
11:52 Yes, exactly.
11:54 'Cause while it's fun to do this, only focus where it's gonna matter.
11:58 Don't optimize a bunch of stuff that doesn't.
12:01 So Brian out in the audience says, different Brian, "Is there a plan to do lossless type conversion "or maybe Flake 8 can make this kind of suggestion?" - Yeah, exactly.
12:11 That'd be cool.
12:12 - Yeah, I'm not really sure if, you don't want to write the code where you get different outputs, probably, right?
12:18 But everything that was happening here, you ended up with the same outcome anyway.
12:24 It's just like, well, do I do the division first or the multiplication?
12:26 Or do I start with an int that results after some addition, subtraction with a float, or as I just make them all floats?
12:34 I feel like it's, in most cases, it shouldn't be changing the outcome.
12:38 Yep, cool, anyway.
12:42 That's what I got for the first one.
12:43 How about you?
12:44 - Well, kind of sticking with a 3.11 theme so far.
12:49 Well, we can use Toml now, but in 3.11, we are gonna have a Tomlib be part of Python 3.11 with PEP 680.
13:00 And we covered that in episode 273.
13:03 But one of the things we didn't mention was that the Tomlib is, and I think we did mention it, is based on Tomly, But Tomli, you can use right now.
13:15 So a lot of projects are switching to use Tomli as their Toml parser to read like pyproject.toml or read their own config file.
13:29 And so I just wanted to highlight it.
13:32 It's Tomli is the, a little Toml parser.
13:36 It's a cute little thing on the project.
13:39 - That is cute.
13:40 - But I was reminded of it because The real Python people put out actually, looks like gear, Arne, sorry, I'm not gonna try to pronounce that name.
13:52 Real Python wrote an article called Python and Toml New Best Friends.
13:59 And I really love, it's a very comprehensive article, but I really love at least the first three parts of it.
14:06 Using Toml as a config format, getting to know key value pairs, and load Toml with Python, because this is kind of what you're gonna do with it.
14:15 You're gonna write config files for something.
14:17 And I just kind of, this is a great introduction of Toml for Python, and that's kind of what we care about, right?
14:23 So it goes through, like just getting used to what Toml looks like, what a config file looks like, talking about how all the keys, even if you, it's like key value stuff.
14:35 And even if you put a number there or something, it's gonna be a string.
14:40 All the keys get converted to strings, even if they don't look like them.
14:43 And they are, they're UTF-8, so you can use Unicode in there as well, which is kind of neat.
14:51 - Put your emojis in there.
14:52 - Yeah, well, can you, is, are emojis UTF-8?
14:56 - I think mostly, many of them are.
15:00 - Interesting, that'd be fun to put emojis in there.
15:03 (laughing)
15:05 I don't know.
15:05 - What mode are we running?
15:06 Are we running in cow mode or lizard mode?
15:08 I'll do lizard.
15:09 Okay, well, if you're running in Lizard--
15:11 - Lizard is true.
15:12 Okay, I gotta try that to see.
15:15 I should have done that before.
15:16 - Oh my gosh, I think I almost, it's both horrible and amazing to imagine writing like config files to like put it in Lizard mode, do it.
15:24 - Yeah, one of the things that I didn't, before reading this article, one of the things I didn't know you could do in Toml because I just started cursory, I use it with pyproject.toml and that's about it.
15:34 But you can do, So talks about normal how to read stuff.
15:41 But one of the things is, oh, what was I gonna talk about?
15:44 Arrays.
15:45 And you can do arrays of things, which are neat, and tables and arrays of tables, which is like, so you have arrays of tables or these bracket bracket things.
15:56 And then you can do dot stuff.
15:59 So if you have like, what was it?
16:02 User and user.player, these will show up as like, you know, sub dictionary key things.
16:10 And so one of the things that I, and I played with it this morning, and it really, I should have had something to show, but the thing I like to do is to just read it, just like this article talks about reading it, just read the tomo file into Python and print it.
16:28 And then you can, and it'll print out as a dictionary, and then you can create whatever format you want for your Tomo file and then you can just see what it's going to look like and then you know how to access it.
16:39 It's one of the best ways to do that.
16:41 That's awesome.
16:42 Yeah.
16:43 What an interesting format.
16:44 That's pretty, that's pretty in depth.
16:45 And a blast from last week passed.
16:48 Actually, hey, actually says UTF-8 can encode any Unicode character.
16:53 Emoji your heart emoji.
16:55 Heart of it.
16:56 Oh yeah.
16:57 You could do like, you know, is it in heart mode?
16:59 Heart equals true, heart equals false.
17:02 >> For optimizer, you could do a flame emoji equals true.
17:07 >> Exactly. I love it.
17:09 >> Yeah.
17:09 >> I think we have not leveraged the configuration as emoji sufficiently.
17:13 >> No. Yeah. I think pytest should rewrite all of its configs as emoji items.
17:19 >> Just do a PR. I'm sure they'll take it.
17:20 >> Yeah.
17:21 >> Yeah. It'll be good.
17:23 >> All right. Yeah. All right.
17:25 Let me tell you about our sponsor for this week before we move on.
17:28 This week is brought to you by Microsoft Founders Hub.
17:31 In fact, they are supporting a whole bunch of upcoming episodes.
17:34 So thank you a whole bunch to Microsoft for Startups here.
17:37 Starting business is hard by some estimates, over 90% of startups go out of business within their first year.
17:43 With that in mind, Microsoft for Startups set out to understand what startups need to be successful and to create a digital platform to help overcome those challenges.
17:51 Microsoft for Startups Founders Hub.
17:53 Their hub provides all founders at any stage with a bunch of free resources to help solve challenges.
17:59 and you get technology benefits, but also really importantly, access to expert guidance and skilled resources, mentorship and networking connections and a bunch more.
18:08 So unlike a bunch of other similar projects in the industry, Microsoft for Startup Founders Hub does not require startups to be investor backed or third party validated to participate.
18:20 It's free to apply, and if you apply, get in, then you're in, it's open to all.
18:25 So what do you get if you join or apply and then get accepted?
18:28 So you can speed up your development with access to GitHub, Microsoft Cloud, the ability to unlock credits over time, as in it gets over $100,000 worth of credits over time over the first year if you meet a bunch of milestones, which is fantastic.
18:41 It'll help your startup innovate.
18:42 FounderHub is partnering with companies like OpenAI, the global leader in AI research and development to provide benefits and discounts too.
18:50 - Neat.
18:50 - Yeah, through Microsoft Startup Founders Hub, becoming a founder is no longer about who you know.
18:55 You'll have access to the mentorship network, giving you access to a pool of hundreds of mentors across a range of disciplines, areas like idea validation, fundraising, management coaching, sales marketing, as well as specific technical stress points.
19:08 To me, that's actually the biggest value is the networking and mentor side.
19:12 So you'll be able to book a one-on-one meeting with these mentors, many of whom are founders themselves.
19:17 Make your idea a reality today with the critical support you'll get from Microsoft for Startups Founders Hub.
19:22 Join the program at pythonbytes.fm/foundershub.
19:25 link will be in your player show notes.
19:27 - Nice, yeah, cool.
19:29 - Indeed.
19:30 All right, I guess I'm up next with this order we got.
19:32 And oh my goodness, Samuel Colvin, take a bow.
19:35 (Samuel laughs)
19:36 Because he put out a plan for what's happening with Pydantic version two.
19:42 But the reason I say take a bow is this is one detailed plan that is really, really thought through, thought out, backed up with a bunch of GitHub discussions and so on.
19:53 - Wow.
19:53 So the idea is, Pydantic started out as an interesting idea and surprise, surprise, a bunch of people glommed onto it probably more than it was originally envisioned to be so.
20:04 So for example, SQL model from Sebastian Ramirez is like, Pydantic models are now our ORM to the database with all the interesting stuff that ORMs have.
20:14 And Roman Wright said, guess what?
20:16 We could do that for MongoDB as well.
20:19 Same with the Pydastic thing we recently spoke about.
20:21 And then Sebastian Ramirez is like, also like, hey, FastAPI, this can be both our data exchange as well as our documentation.
20:29 Sure, so I was like, oh my goodness, what's going on here?
20:31 So, so there's a bunch of stuff on the insides that could be better, let's say, or, you know, maybe time to rethink this.
20:40 So in this plan, it talks about what they'll add, what they'll remove, what will change, some of the ideas for how long it will take and so on.
20:46 - Interesting.
20:47 - Yeah, here's a pretty significant thing.
20:49 I'm currently taking a kind of sabbatical after leaving my last job to work on this, which goes until October.
20:56 So that's a big commitment to I'm going to help make Pydantic better.
21:01 So it sounds familiar.
21:03 It sounds a bit like a rich and textual and those types of things as well.
21:07 But this is a big, big commitment from Samuel and he's really doing a ton of work.
21:12 It says people seem to care about my project.
21:14 It's downloaded 26 million times a month.
21:17 >> Wow.
21:18 >> It's insane. Yeah, it's awesome.
21:20 >> That's incredible.
21:22 >> It is. It says, "Here's the basic roadmap.
21:25 Implement a few features in what's now called the Pydantic Core." We just had Ashley, who as we saw is out in the audience.
21:31 Hey, Ashley, who gave a bit of a shout out to this feature.
21:34 Also, I do want to also credit a couple of other people because Douglas Nichols and John Thagen also let me know that this was big news coming.
21:42 So thank you all for that.
21:43 The Pydantic Core is being rewritten in Rust, which doesn't mean you have to know or do anything, it just means you have to pip install something, you get a binary compiled thing that runs a lot faster.
21:54 Okay, so more on that in a second.
21:56 First, they're working to get 1.10 out and basically merge every open PR that makes sense and close every PR that doesn't make sense and then profusely apologize to why your PR that you spent a long time making was closed without merging.
22:10 Some other bookkeeping things, start tearing the pedantic code apart and see how many existing tests can still be made to pass and then release eventually Pydantic.
22:19 The goal is to have this done by October, probably by the end of year for sure.
22:22 A couple of things worth paying attention to, there are a bunch of breaking changes in here.
22:26 A lot of things are being cleaned up, reorganized, renamed, some removed, like from ORM, people might be using that with SQLAlchemy, that's being removed for example and so on.
22:37 So there's, if you depend heavily on Pydantic, especially if you build a project like Beanie that depends heavily on Pydantic, you're going to need to look at this because some of the stuff won't work anymore.
22:47 But let's highlight a couple things here.
22:49 Performance. This one is really important because this is the data exchange level at for FastAPI.
22:57 This is the data base transformation level.
23:00 When I do a query from the database, what comes back comes back in some raw form and then is turned into a Pydantic model.
23:05 And those are computationally expensive things that happen often.
23:09 And in general, PyDandic version 2 is about 17 times 1700% faster than v1 when validating models in a standard scenario.
23:19 Says between 4 to 50 times faster than PyDandic 1.
23:22 >> Wow.
23:23 >> That's cool, right?
23:24 >> Yeah.
23:25 >> That alone should make your ears perk up and go, excuse me, my ORM just got 17 times faster?
23:30 Wait a minute, I'm liking this.
23:32 I know that this is not the only thing that happens at ORLM level, But the ones that the ones I called out that depend heavily on it, like that's in the transformation path. So this is important.
23:42 Yeah, this is actually I'm super impressed. I have not I normally don't even see this sort of advanced planning in commercial projects.
23:51 Yes. Oh, yeah. You could do a whole business startup that doesn't have the amount of thought that went into like what's happening in the next version of it. It's ridiculous.
24:00 Incredible. Yeah.
24:01 It's incredible. I was serious when I said take a bow.
24:04 It really lays out, opens a discussion about certain things and so on.
24:09 So like another one is strict mode.
24:11 I think I even saw a comment in the chat about it.
24:14 So one of the things I actually like about Pydantic, but under certain circumstances, I can see why you would not want it, is if you have something you say is an integer field, and then you pass 1, 2, 3, the number, great.
24:26 But if you also pass "1, 2, 3", Pydantic will magically parse that for you.
24:31 Like this happens all the time on the internet.
24:32 Like a query string has a number, but query strings are always strings.
24:35 There's no way to have anything but strings.
24:37 Yeah.
24:38 So you got to convert them.
24:39 Right.
24:39 And so this automatically does that.
24:41 But if you don't want that to happen, you say you gave me a string.
24:44 It's invalid.
24:45 You can turn on strict mode, which is off by default.
24:47 I believe.
24:47 So there's a bunch of play.
24:49 Good.
24:49 So strict mode does the conversion or strict mode won't do the conversion.
24:54 It says you said it's an ant.
24:56 You gave me a string.
24:57 Nope.
24:58 Rather than could it be an integer?
25:00 Let's try that first.
25:01 You know what I mean?
25:02 >> Yeah.
25:03 >> Maybe one of the things you do is, in the ORM level, one of those things you might put it in strict mode so it doesn't do as much work trying to convert stuff.
25:11 I don't know if that actually would matter, but formalizes a bunch of conversions.
25:15 It has built-in JSON support and different things.
25:19 Another big thing is this Pydantic core will be able to be used outside of Pydantic classes now.
25:27 So you can do a significant performance to improve stuff like adding validation to data classes or validating arguments in query strings or a type DIC or a function argument or whatever.
25:39 - Yeah. - Yeah.
25:41 Let's see, next up.
25:43 And let's see, this one, strict mode.
25:46 We talked about strict mode.
25:47 Another one is required versus nullable.
25:50 It was a little bit of ambiguity of, if you said something's a string, that means it's required and it can't be none.
25:56 If you say it's a string type none, or it's an optional string or something like that, then basically the behaviors were a little bit different.
26:06 So originally, I think this is when typing was pretty new, said, "Pydenic previously had a confused idea "of required versus nullable.
26:14 "This is mostly resulted from Sam's misgivings "about marking a field as optional, "but requiring a value to be provided to it, "but allowing it to be set to none." Or something along those lines.
26:24 Anyway, there's minor changes around that.
26:28 Let's see, final one that I want to cover is namespace stuff.
26:32 This is like a whole bunch of things are now getting renamed.
26:36 For example, if you implemented or overrode validateJSON, it's now model_validateJSON.
26:43 If you had isInstance, it's now model isInstance.
26:46 There's a bunch of these changes all over the place that look like they're going to cause breaking changes.
26:51 They're easy to fix, just change the name, but it's not nothing.
26:54 Also parse file.
26:56 I still love his hander here.
26:59 Parse file. This was a mistake.
27:01 It should have never been in Pydantic or removing it.
27:03 >> Okay.
27:04 >> Parse raw, partially replaced by this other thing.
27:07 Anything else it did was a mistake.
27:09 From ORM, this has been moved somewhere else.
27:11 Schema and so on.
27:13 There's a lot of stuff that people were using here.
27:15 So just have a look, try it out.
27:17 Don't just go, "Oh, then version 2 is out.
27:19 Is this going to work?" This is going to have some significant changes.
27:22 >> Another reason why it's really awesome that he goes through so much detail is because there's going to be stuff that breaks.
27:30 It's a breaking interface change.
27:33 Yeah, it's cool that it's this detailed.
27:36 A couple of things to notice.
27:39 Let's see, somebody else in the chat mentioned, Richard mentioned, and he has emojis in the headers.
27:46 Yeah, there's emojis in the headers.
27:48 I gotta say like the navigation in the table of contents, very cool, it goes to like light gray for areas you've already seen and then--
27:59 >> That's interesting.
28:00 >> It's a cool thing.
28:02 >> Yeah, it's quite cool. I think going on and on, but two real quick things.
28:06 One, there'll be no pure Python implementation of the core.
28:10 It's always Rust, but they list out the platforms where it'll be compiled to, including WebAssembly.
28:15 >> Nice.
28:16 they previously had some Cython in what was supposed to be pure Python's Pydantic.
28:22 So now, a bonus is the Pydantic model, the Pydantic package becomes a pure Python package, whereas previously it wasn't.
28:30 So they've taken all of that behavior and put it under this core thing that ships as a Rust binary and now, instead of doing some Cython middle ground, it's pure Python again.
28:40 So that's interesting refactoring, I think.
28:42 >> Yeah.
28:43 >> Yeah. Finally, documentation.
28:45 When you get a validation error, it gives you a link to the documentation in the JSON error message.
28:52 That's pretty cool.
28:53 Yeah.
28:54 That's nice.
28:55 All right.
28:56 Yeah.
28:57 Anyway, that's quite a plan, isn't it, Brian?
28:58 Yeah, quite a plan.
28:59 All right.
29:00 Well, I'm excited for it.
29:02 Okay.
29:03 Well, next topic is a little more lighthearted.
29:08 It's about fish.
29:09 Pike, to be specific.
29:12 No, it's about PDFs.
29:15 So it's just a cool project I saw, I noticed, PycPDF.
29:20 It's a Python library for reading and writing PDF files.
29:24 What's the big deal?
29:25 We've had these before.
29:26 But this is, it's based on QPDF, which is a C++ based library, and it's presently continued being maintained.
29:38 So it's kind of pretty fast.
29:41 Well, actually, I'm assuming it's fast if it's C++ in the background.
29:44 But it's also pretty just nice and elegant to do things.
29:52 The documentation has this nice fish, which is good. I always like cool diagrams, cool logos.
30:00 But some of the neat things that you can do with it, so it's recommending that you not use it if you're just writing PDF files.
30:08 That there's other things that you can use, what was it, like Report Lab to write PDFs.
30:14 But if you're having to read or modify PDFs, then this is where it shines.
30:19 You can do things like copy pages from one PDF to another, split and merge PDFs, extract content out of PDFs.
30:27 Like if you're using it for data stuff, you get a report in PDF and you're trying to pull that information out, you can use it for that.
30:37 Or images, you can pull all the images out of a PDF file.
30:39 Or, this is kind of cool, you can replace images in a PDF file and generate a new one without changing anything else about the file.
30:47 It's neat. Just a neat, if people are working with reading or modifying PDF files, maybe check this one out.
30:56 >> Yeah, this looks great. The fact that it's in C++, I'm guessing it's probably standalone.
31:02 I remember I've done some PDF things before and it felt like I had to install some OS level thing that it shelled out to, so this is cool.
31:09 Yeah, and the some nice on the readme.
31:13 It has a comparison of some of the different PDF doc or PDF libraries that you could use.
31:20 And some of the reasons why you might want this one, like it supports more versions.
31:24 I didn't realize that like one of these libraries I've heard of before.
31:28 PDFRW doesn't support the newer versions.
31:31 So bummer.
31:33 And then also password protected files.
31:38 It supports that except for but not public key ones, but just normal passwords.
31:43 >> Straight passwords. Yeah, that's great.
31:45 >> It's going to be.
31:45 >> Also like the measure of actively maintained, the commit activity per year over the years.
31:52 >> All right. That's interesting.
31:54 >> Yeah, it's an interesting metric.
31:55 It seems good. I haven't really thought about it lately, but yeah.
31:59 >> Okay.
32:00 >> Yeah, this is a great one.
32:01 >> Well, so that's it for our main items.
32:05 >> Yeah. What else you got? Any extras?
32:07 >> Well, last week we talked about the critical packages.
32:13 >> Critical packages.
32:14 >> Or at some recent.
32:16 Yeah, last week we talked about the critical packages.
32:20 >> Either yesterday or last week, depending on how you consume this.
32:23 >> Exactly.
32:23 >> I'm sure it really is.
32:24 >> Yeah. I was surprised to find out that pytest-check, the plugin I wrote, was one of those.
32:31 I'm like, "Really?" Because it's like the top 1 percent.
32:35 If anybody's curious, I wanted to just highlight that a little bit.
32:39 pytest-check is a plugin that allows multiple failures per test.
32:44 One of the best ways, it's a secondary way that one of the contributors added, is you can use it as a context manager.
32:51 You can say like with check and then do an assert, then you're going to have multiple of those within a test.
32:56 >> I like the one-liner, that's pretty nice.
32:59 >> This is totally like black, we'll totally reformat this if you ran it through black.
33:03 but it's nice, you'd have to block it out.
33:06 Anyway, I was like, how could it be?
33:10 I'm curious what on the list it was.
33:13 There's a place called HugoVK, has a top PyPI packages list and it's updated.
33:23 I think it's just updated once a month or something.
33:25 But you can do the top 5,000.
33:29 It's the top 5,000 or 1,000 or 100.
33:33 I was curious about where on the list I was.
33:38 I'm number 1,677, so far down the list, but hey.
33:44 >> It's still in the top third of the top 1 percent.
33:48 That's pretty awesome.
33:49 >> The pytest is number 72.
33:51 That was pretty neat. Pydantic, which we covered, I just checked, 117.
33:58 But there are 57 pytest plugins that show up in the top 3,500.
34:03 So that's pretty neat.
34:05 - That is pretty neat.
34:06 - That's all I got for extras.
34:08 - All right, well, I have zero extras.
34:10 So mine are finished as well.
34:11 How about a joke?
34:12 - Yeah, great.
34:13 - All right, I told you we're coming back to it.
34:15 So this one comes from Netta, Netta, code girl @netta.mk.
34:22 And let me just pull this one up here, right?
34:24 So this one is, there's this colleague here.
34:28 Can I make this, there we go, make it a little bigger.
34:30 There's the two women who are developers, Netta and her unnamed friend who always has gotten in trouble with the elevator last time, basically.
34:39 And there's this sort of weird manager looking guy that comes in and says, "I tested your chatbot, but some of its replies are really messed up." "Well, that's what testing's all about.
34:50 I'll go through the logs later," says one of the girls.
34:52 "No, no, no, no, no, no, no, no, no need." (both laughing)
34:56 Right, check out the faces.
34:58 She was like, "Excuse me, I'm not even sure I want to open the logs now.
35:02 >> Yeah, don't look at the logs.
35:04 >> That's what testing is for.
35:06 >> I'll go through the logs later.
35:09 >> She's got some good ones in her list there. Love it.
35:14 >> Yeah. I like the art too. Nice art.
35:17 >> I do too.
35:18 >> Me too.
35:19 >> Also nice was our podcast. Thanks for being here.
35:22 >> Thank you.
35:23 >> Yeah, you bet.
35:23 >> See you next week.