Brought to you by Michael and Brian - take a Talk Python course or get Brian's pytest book


Transcript #236: Fuzzy wuzzy wazzy fuzzy was faster

Recorded on Wednesday, Jun 2, 2021.

00:00 Hello and welcome to Python Bytes where we deliver Python news and headlines directly to your earbuds.

00:04 This is episode 236 recorded June 2nd 2021. I'm Michael Kennedy.

00:10 And I'm Brian Okken.

00:11 And I'm Anastasiia Tymoshchuk.

00:12 Hey Anastasia, so great to have you here. Nice to have you on the show.

00:17 Thank you for inviting me.

00:18 Yeah, absolutely. Why don't you tell people a little bit about yourself before we get into the topics?

00:22 So I'm joining from Germany, Berlin remotely right now.

00:25 And I have a little one, a baby dog joining as well.

00:30 You might hear him on the stream.

00:33 I am originally from Ukraine.

00:35 I'm not German.

00:35 I moved to Germany around five years ago, maybe five and a half.

00:40 And my passion is Python.

00:42 I used to be a C++ developer, a game developer, and so many more languages, but the best one I think for me is Python.

00:49 So I decided to stick with it for around eight years now.

00:53 - Oh, how cool.

00:54 I started out doing my professional programming in C++ and I know Brian still touches a little bit of C and C++ in his world, so that's cool.

01:02 - Yeah, it's half my life.

01:03 - Nice, and what kind of games?

01:07 - Well, they were adapted first for iPad.

01:11 They were like 2.5D games, and then later on it was mostly 3D games with Unreal Engine.

01:19 - Cool, yeah, that's awesome.

01:22 All right, well, once again, welcome, welcome.

01:24 So glad to have you here.

01:26 Brian, do I have the first item this time around?

01:28 No, you do.

01:28 Go for it.

01:29 What do you got for us?

01:31 - Well, accessibility isn't really something, I probably should think about accessibility more, but I don't really, but I probably should.

01:38 So I was excited to see, there was a tweet recently by Matthew Feickert that said, "I need to give some serious praise to a fellow Scikit-HEP dev, Hans Dembinski, and his excellent monolens tool for interactive simulations of colorblindness."

01:56 So I checked this out.

01:57 So monolens is a Python package, and you can pip install it.

02:02 And as Matthew said, you can pipx install it, so you just always have it around, which is nice.

02:08 And it just pops up this really cool window, and you can just drag it around, and it makes whatever the windows are over, all over your desktop, it just makes it black and white instead of color.

02:27 So you can see what it looks like in grayscale.

02:31 One of the things I really liked about this is the example showing it with Matplotlib and plots, because plots are really where you're using color to distinguish between two different sets of data.

02:48 You really want that data to look different even if people don't see color.

02:54 That's an important thing. That was neat.

02:57 Then somebody replied to that and said, "Hey, I always try to use CMasher" (or "smasher," I'm not sure how you say it) "to make sure they're colorblind friendly." I'm like, "I've never heard of this." I went and checked out CMasher.

03:16 What it is is it's a bunch of color maps.

03:20 You don't really have to think about it.

03:23 There's all these great named color maps, and they're actually fairly attractive color changes.

03:30 But it shows you what they look like in black and white also.

03:35 This is a little demo at the top that we're looking at on the stream.

03:41 But the code that you have to write is minimal, since color map support is just built into Matplotlib already.

03:47 CMasher is an extension to Matplotlib and other things that use color maps.

03:51 When you're plotting, you can just specify a color map like rainforest or something, and it automatically is a colorblind-friendly color map.

04:02 You can do your plots and have it still look nice everywhere.
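
Editor's sketch: the grayscale check described above can be approximated in a few lines. This is only a rough illustration, not how monolens or CMasher work internally. It assumes the Rec. 709 luma weights applied directly to 8-bit RGB values, and the `min_gap` threshold is an arbitrary, hypothetical choice.

```python
# Rough grayscale check for plot colors (illustrative only).
# Assumption: Rec. 709 luma weights on plain 0-255 RGB values, ignoring gamma.

def to_gray(rgb):
    """Approximate the gray level (0-255) a color turns into."""
    r, g, b = rgb
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def gray_gap(color_a, color_b):
    """Absolute difference between the gray levels of two colors."""
    return abs(to_gray(color_a) - to_gray(color_b))


def hard_to_tell_apart(color_a, color_b, min_gap=25.0):
    """True when two colors collapse to nearly the same gray.

    min_gap is a made-up threshold for this sketch, not a standard value.
    """
    return gray_gap(color_a, color_b) < min_gap


# Pure red and a dark green differ in hue but land on almost the same gray.
print(hard_to_tell_apart((255, 0, 0), (0, 76, 0)))
# Dark blue and yellow stay clearly separated in grayscale.
print(hard_to_tell_apart((0, 0, 128), (255, 255, 0)))
```

With CMasher installed, its README describes selecting one of its maps by name when plotting (for example `cmap="cmr.rainforest"` after `import cmasher`); that call is not exercised in this sketch.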

04:07 >> Yeah. This is really cool.

04:08 Matthew, friend of the show, thanks for sending that in.

04:11 I never really thought about this and I should have, you know, I mean, I feel like maybe I should go over my websites and go, do they look terrible for people who have, you know, color vision impairments and whatnot.

04:25 So really cool.

04:27 And it looks like it's this independent thing that will just go over anything.

04:30 You just move your mouse around.

04:31 It works on anything.

04:32 It doesn't necessarily have to do with Jupyter or Matplotlib or something like that.

04:36 Right.

04:36 Great.

04:37 So the monolens is just something that works on anything.

04:40 I dragged it over even my desktop, my background, and it showed the picture in black and white.

04:46 So it is cool.

04:48 >> The other thing is, wait, there's color maps I can just add to Matplotlib?

04:54 That's cool. Like rainbows and stuff, how neat.

04:56 >> I didn't know you could just do that.

04:58 So that's a neat thing.

05:00 Then, for instance, one of the examples that they have on the CMasher README, it's just sort of a simple plot.

05:09 And Matplotlib kind of just picks colors for you unless you specify colors for different plot lines.

05:16 But you can just give it a color map instead of a specific list for each item.

05:24 So and that's just kind of nice.

05:28 Why not do it?

05:29 Yeah, why not do it?

05:30 Anastasia, what do you think?

05:31 It looks amazing, really.

05:33 And it's super helpful.

05:35 Yeah, when you were doing the video games?

05:37 But that would be great to use it as well.

05:40 For sure. When you were doing games, did you have to think about this kind of stuff?

05:43 No, actually, we were not that far at that time. It was around seven years ago, eight.

05:49 Yeah.

05:50 Yeah.

05:51 On the Monolens site, one of the examples they show is having one of the plots use some sort of pattern underneath and not just color. And I'm not sure how to do that. So people that are great at Matplotlib probably know how to do that right away.

06:09 But that's kind of a neat idea also to have like, one of the graphs has hashes versus stars or slant lines or something like that.

06:16 - Oh yeah, I have it like some sort of ASCII differentiator.

06:20 Yeah. - Yeah.

06:21 - So that's nice.

06:22 Yeah, this is super helpful.

06:23 Matthew, again.

06:24 Thanks for sending in.

06:26 And Joy, yeah, welcome to the live stream.

06:28 Thanks for being here for the recording.

06:31 So the next one I wanna talk about is something called RapidFuzz.

06:36 - RapidFuzz.

06:37 - Yeah, so last time, when we had Vincent on, I saw FuzzyWuzzy, the fuzzy text matching for that chatbot that he was showing off.

06:47 I thought, oh, FuzzyWuzzy is cool.

06:48 So Mikael Honkala sent in RapidFuzz.

06:53 And it's very much like FuzzyWuzzy, but it turns out to be a whole lot faster.

06:59 And it uses some of the same ideas, but you know, coming back to the, some of the things we were talking about.

07:03 It is basically written in C++ using the Levenshtein distance algorithm for word similarities, but obviously has a Python API that we all work with.

07:14 And so yeah, it's pretty neat.

07:16 It's really easy to work with.

07:17 You just, again, pip install it, and then you come down here and do things like fuzz.ratio, and you can then give it two sentences.

07:25 "This is a test" or "this is a test!" with an exclamation mark, and it says that's 96.5% the same. Or you have "fuzzy wuzzy was a bear."

07:34 I guess these are, yeah, fuzzy wuzzy was a bear.

07:37 I guess those are, those are the same?

07:39 - No, wuzzy fuzzy.

07:40 - Oh, wuzzy fuzzy, yeah, I gotta read better.

07:42 Wuzzy fuzzy was a bear versus fuzzy wuzzy was a bear.

07:45 Oh my goodness.

07:46 That's 90% the same.
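
Editor's sketch: to see where numbers like 96.5% and 90% can come from, here is a small pure-Python version of a Levenshtein-style similarity. It is an assumption that the ratio is computed as a normalized distance in which a substitution costs as much as a delete plus an insert; that choice happens to reproduce the figures quoted here. RapidFuzz itself is implemented in C++ and its real code may differ, and the function names below are hypothetical.

```python
def levenshtein(a: str, b: str, sub_cost: int = 1) -> int:
    """Edit distance with unit insert/delete cost and a configurable substitution cost."""
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into an empty string takes i deletions
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else sub_cost
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # match or substitute
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Similarity in percent, normalized by the combined length of both strings."""
    total = len(a) + len(b)
    if total == 0:
        return 100.0  # two empty strings are identical
    return 100.0 * (1 - levenshtein(a, b, sub_cost=2) / total)


print(round(similarity("this is a test", "this is a test!"), 2))
print(round(similarity("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear"), 2))
```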

07:48 Given a bunch of phrases, you can sort them by similarity.

07:52 You can use it for selection, like, you know, sort of call center type of automation.

07:59 Given three choices and some text, you can say find which one, you know, like Atlanta Falcons, New York Jets, New York Giants, and so on.

08:07 Somebody says, you know, lowercase New York Jets instead of uppercase, it'll say, well, here's the likelihood that that's a match, but here's another possible match that's, you know, and it gives you the ratios of how good of a match it is.

08:19 So if you've got a select set of choices and you're asking for input on it, you can just say, well, give me the closest match.

08:25 And if it's anywhere close, you can just run with that.
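
Editor's sketch: the "give me the closest match" idea can be tried with nothing but the standard library. This uses `difflib.SequenceMatcher` as a stand-in scorer, so the scores will not match RapidFuzz's, and the helper names and the `cutoff` value are hypothetical.

```python
from difflib import SequenceMatcher


def rank_choices(query, choices):
    """Return (choice, score) pairs, best first; scores run from 0 to 100."""
    q = query.lower()  # compare case-insensitively
    scored = [
        (choice, 100 * SequenceMatcher(None, q, choice.lower()).ratio())
        for choice in choices
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def closest(query, choices, cutoff=60.0):
    """Return the best-scoring choice, or None if nothing scores at least cutoff."""
    ranked = rank_choices(query, choices)
    if not ranked:
        return None
    best, score = ranked[0]
    return best if score >= cutoff else None


teams = ["Atlanta Falcons", "New York Jets", "New York Giants", "Dallas Cowboys"]
print(closest("new york jets", teams))
for team, score in rank_choices("new york jets", teams):
    print(f"{score:5.1f}  {team}")
```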

08:27 So yeah, pretty neat, right?

08:29 - That is pretty cool.

08:30 - Yeah, and the other thing that's interesting is the performance.

08:35 And before people tell me that all benchmarks are broken and they don't work, sometimes at least they give you a sense.

08:41 So here's some of the things that they've got in terms of performance, say versus Fuzzy Wuzzy, and the numbers are like 10 or 20 times faster.

08:49 - Definitely broken.

08:51 - It's definitely broken.

08:53 I think it's because it's written in C++ instead of Python at most of its core, you know, probably.

08:58 But anyway, if you're looking for fuzzy text matching, FuzzyWuzzy is a good option.

09:03 And apparently, thanks to Mikael, RapidFuzz is as well.

09:06 So yeah, pretty neat.

09:07 - Yeah, we probably should do a segment on benchmarks at some point.

09:11 - No, no.

09:12 - No. (laughs)

09:14 - No, we should do it.

09:15 But I've written blog posts and stuff on it, and it's just an endless battle of you're doing it wrong.

09:20 Your situation is not my situation.

09:22 In my situation, it's not as good or it's worse or it's better or you're, yeah, no, I hear you.

09:28 It would be interesting, but at the same time, yeah.

09:30 - Okay, there we go.

09:31 We just had a section on benchmarks.

09:33 - Yeah, I've already just explained the emotional trauma that I'll go through from receiving all the feedback.

09:38 Now it's, (laughs)

09:39 Anastasia, what do you think about this fuzzy text matching?

09:43 - Well, maybe next time we can organize a battle between them.

09:46 (both laugh)

09:47 - That's right, yeah, we'll bring some in.

09:49 - Yeah, sure.

09:50 - Do you have any use for this fuzzy text matching, string matching stuff?

09:53 - Well, actually, yes, at work.

09:55 We have lots of matching algorithms, but we are using different tools, and I'm not a data scientist person, but I would love to try that, actually.

10:06 Looks super cool.

10:07 Yeah, we use some C++ libraries as well.

10:10 - Cool. - Yeah.

10:11 - Yeah, Robert out there in the live stream says, we would have to benchmark the episode if we had an episode about benchmarking.

10:17 You see, it's like recursion.

10:19 Save that thought for the end of the show, by the way.

10:21 - All right, Anastasia, you're up next.

10:23 Structured logging, tell us about it.

10:25 - Well, a few years ago, I went to a meetup and I heard a talk from Markus Holtermann about structlog.

10:33 That's the first time when I heard about this and I decided to give it a try.

10:37 And actually, I fell in love with it, and I've been using it for at least two years, maybe two and a half.

10:45 It's an awesome way to bring a bit of structure to your logs to make them more visible and more usable because usually how we log, it's like just one huge sentence, which is readable by humans, but it's not machine readable.

11:02 And the idea is here to bring more structure, to build some dashboards based on different keys and then values, and then see what's actually happening with the system without touching the logs, without scrolling through the whole log and then just reading a whole bunch of things.

11:23 And I already used it in production.

11:26 It works pretty well.

11:27 If you try using the JSON format, it's just fantastic.

11:32 - Oh, how cool.

11:34 Yeah, you can pass it all these processors and type stuff.

11:37 So you can say, render out the stack info, the log level, timestamp, all those kinds of things.

11:44 That's neat.

11:45 - We added a bunch of custom-made processors, which were specifically designed for our applications, which made the life of our DevOps people parsing the logs way easier because they didn't have to write them by hand.

12:00 And if you use structured logs for all applications, not just one, but for example, microservices, and you pass the key ID or like trace ID or something that will identify the path which the log goes through, then you might see what happened before the bug happened.

12:21 Or maybe, because if you want to see how the system is working, you also need to be either one of the detectives of the system or use structlog.

12:33 - Yeah, it's interesting when you log out stuff, it looks like you can just pass keyword arguments and those will add to the log really nicely.

12:43 So you don't have to create a message that you're going to send that embeds, you know, variable equals value, variable equals value, you just pass them to the log message and they become part of the message like that.

12:55 That's cool.

12:56 - Yeah, and you can also use the initial message, which is an event like greeted here, as some kind of key, which would give more clues where this message is coming from and what type of event happened instead of a usual message.
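
Editor's sketch: the ideas in this segment (an event name plus key/value pairs, a chain of processors, a bound trace ID, JSON output) can be mimicked with the standard library. This is not structlog's API; `KeyValueLogger`, `add_timestamp`, and `render_json` are hypothetical names used only to show the shape of the technique.

```python
import json
from datetime import datetime, timezone


def add_timestamp(event: dict) -> dict:
    """Processor: stamp the event with the current UTC time."""
    return {**event, "timestamp": datetime.now(timezone.utc).isoformat()}


def render_json(event: dict) -> str:
    """Final processor: turn the event dictionary into one JSON line."""
    return json.dumps(event, sort_keys=True)


class KeyValueLogger:
    """Tiny stand-in for a structured logger (hypothetical, for illustration)."""

    def __init__(self, processors, **context):
        self._processors = list(processors)
        self._context = dict(context)

    def bind(self, **extra):
        """Return a new logger that attaches extra key/value pairs to every event."""
        return KeyValueLogger(self._processors, **{**self._context, **extra})

    def info(self, event, **fields):
        """Build the event dictionary, run it through each processor, and emit it."""
        record = {**self._context, **fields, "event": event, "level": "info"}
        for processor in self._processors:
            record = processor(record)
        print(record)
        return record


# Bind a trace ID once so every later event can be followed across services.
log = KeyValueLogger([add_timestamp, render_json]).bind(trace_id="abc123")
log.info("greeted", whom="world", attempts=1)
```

Because each line is a JSON object with stable keys, a dashboard can filter on `event` or `trace_id` instead of parsing free-form sentences.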

13:12 - Yeah, nice.

13:13 Very cool.

13:14 The other thing it says is if you have Colorama installed, it will automatically render in nice colors.

13:20 That's very neat.

13:21 I love Colorama and I love having colors in the code that we look at.

13:26 It really makes a nice difference.

13:27 So yeah, you get things like the colored, whether it's an info message or an error and whatnot.

13:34 Yeah, very neat. I like it.

13:35 - I keep meaning to use this more and I know, I'm glad you brought it up 'cause I definitely want to try this.

13:41 - Definitely try this.

13:42 - Yeah, this is a really good one.

13:44 This is new to me, but quite neat.

13:46 All right, not new to me, but also quite neat, is our sponsor for this episode.

13:50 So this episode is brought to you by Sentry.

13:54 So how would you like to remove a little stress from your life?

13:57 Do you worry that users may be having difficulties and encountering errors with your app right now?

14:02 Would you even know until they send that support email?

14:04 Yes, maybe using structlog, but are you watching the structlog output now?

14:08 You don't know, right?

14:08 So how much better would it be if you had that error or performance details immediately sent to you with the call stack and local variables and active user and all that stuff.

14:18 And with Sentry, it's not just possible, it's easy.

14:21 We use Sentry on all of our web apps, Python Bytes, Talk Python Training, all those kinds of things.

14:26 And we know if there's some kind of problem.

14:28 It's unfortunate if someone hits a problem, but it's better to know and be able to fix it right away.

14:31 In fact, one time somebody ran into a problem over at Talk Python Training, getting a course and got the message.

14:38 I could see who was logged in when they had the problem and I actually fixed the bug and was about to push out the changes and I got an email, "Hey, I'm having a problem with your site." I'm like, "Yeah, I know, I just fixed it.

14:48 "Try again, please." And they were quite surprised.

14:51 So surprise and delight your users today.

14:53 Create your Sentry account at pythonbytes.fm/sentry.

14:56 And please, when you're signing up, click the "Got a promo code?" redeem option and enter Python bytes.

15:01 It's not automatic.

15:03 So make sure that you enter Python bytes as the promo code, otherwise they won't know it's from us.

15:07 You'll get a bunch of cool stuff.

15:08 Two free months of the team plan with many more errors and events and other features as well.

15:13 So check them out at pythonbytes.fm/sentry.

15:15 That's pretty awesome.

15:16 Brian, I guess you should probably also test your code, maybe before you end up with errors.

15:22 What do you think?

15:23 Definitely. And actually, before we go on, I think I've mentioned this before, but the graphic that is on the Sentry page is so cool.

15:30 I know, I really like it too. I love the upset console terminal reading of paper.

15:35 >> Yeah. This is like inside baseball maybe.

15:40 But I don't know, maybe three people might care about this.

15:43 But anyway, I'm one of them.

15:45 XFail now works with pytest-subtests.

15:51 It's neat, but I got to explain it a little bit.

15:55 Subtests are this weird feature of unittest that came along in Python 3.4, and it's a context manager so that you can have possibly several places where your test might fail, but it continues, it doesn't stop if it fails.

16:12 That was within unittest.

16:15 pytest had, well, pytest had pytest-check, the plugin that I wrote that allows something similar with a context manager.

16:22 But then pytest-subtests came out, which was a plugin that started in about 2019, that allowed you to run the unittest subtests from pytest, but there's also a pytest style of doing subtests.

16:40 They're a bit quirky.

16:42 I'm linking to two resources, an article by Paul Ganssle and an episode of Test & Code where he and I talked about subtests.

16:54 Before you jump in and use them right away, you should know some of the quirks about them, but they're still cool if they work for you.

17:00 But one of the quirks that was around for a long time was that XFail didn't work.

17:04 XFail is a way to say, I know my test is going to fail, and then you get to decide whether you want to mark it as an XPASS or mark it as a fail if it unexpectedly passes.

17:17 Anyway, XFail didn't work with subtests, but it does now as of the start of the month.

17:26 So somebody named maybe siber on GitHub.

17:30 - Maybe. - Maybe.

17:31 Merged a fix or submitted a fix as a pull request and it got merged and it's now in version 0.5.0.

17:39 So XFail, if you wanted to use subtests, XFail now works with them.

17:43 So that's the good news.

17:45 - Yeah, yeah, this is so interesting.

17:47 So the basic idea is I wanna loop over a bunch of scenarios or whatever, maybe test them all and then have the test fail if any of them did, but actually just go through them all before?

17:57 >> Yeah. On the subtests site, there's a little example.

18:03 Let's say you're looping through a range and you want to run all of them, not as a parametrized test, just within the test, you're doing several things.

18:12 If something fails, you want to actually report all of the failures.

18:17 This is helpful with loops, but why not just use parameterization?

18:23 But the one part where it does really help is if you really are checking four or five different things and you really want to know, like let's say you're measuring something or you're checking several dimensions of something.

18:39 Having all of the failures together would help you determine what the real problem is.

18:44 So when you have to have all the information, this is a good idea.
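
Editor's sketch: the loop scenario described here can be shown with the standard library's `unittest.TestCase.subTest` context manager, which is the feature pytest-subtests builds on. The class and method names are hypothetical, and the check is designed to fail for odd numbers so that several failures get reported from a single test.

```python
import unittest


class EvenNumberChecks(unittest.TestCase):
    # Named without the usual test_ prefix so this deliberately failing demo
    # only runs when called explicitly below, not through test discovery.
    def check_all_even(self):
        for i in range(6):
            # Each subTest records its own failure and lets the loop continue.
            with self.subTest(i=i):
                self.assertEqual(i % 2, 0)


def run_checks():
    """Run the single test method and return the collected result."""
    suite = unittest.TestSuite([EvenNumberChecks("check_all_even")])
    result = unittest.TestResult()
    suite.run(result)
    return result


result = run_checks()
print("tests run:", result.testsRun)
print("failures reported:", len(result.failures))
for subtest, _traceback in result.failures:
    print("failed:", subtest)
```

Without `subTest`, the first failing value would end the test and hide the other failures.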

18:50 >> Very cool. Anastasia, what's the testing story in your world?

18:54 Well, we use mostly parameterized testing because we don't have the sub-test need.

18:59 We don't need to test it multiple times.

19:01 Maybe in the future.

19:03 Yeah.

19:04 When we use it.

19:05 Parameterized works, so I'd stick with it.

19:07 Yeah, it's definitely good.

19:09 All right.

19:10 Another thing that I think is really neat to talk about, but I feel like it's almost down to the benchmark type of situation, is what do you do with the secrets in your application. There's shhgit, s-h-h-git, which is always terrifying.

19:28 If you go here, you can see, oh, here's all the code that we found in this branch of this GitHub repository. For example, here's your database connection string with username and password right there.

19:39 Right? So, you can see all kinds of issues if you go over here.

19:43 Like even a live stream, if it doesn't feel bad enough, you can watch the live stream of all the things that are coming in. Like right now, apparently, there's some username and password and a URI and some kind of private key and whatnot.

19:55 You don't want that. So what do you do?

19:57 Well, there's all kinds of things you can do.

19:59 Do you encrypt those secrets and put them in source code?

20:03 Well, then where do you store the encryption key?

20:05 There are certain types of vaults you can install on your server, kind of like 1Password, but for servers, you could do that kind of thing.

20:13 There's just leaving it in there and hoping for the best.

20:17 There's putting them in environment variables, that's a very, very common one.

20:21 Right. But still, no matter what you pick, you kind of got to get that data back and deal with it. So I want to introduce you to Pydantic. Brian, you've heard of Pydantic, right?

20:31 Yeah.

20:32 In fact, I didn't know this had anything to do with secrets.

20:36 Yeah. If you go to Pydantic right here at the top, I believe there might be some nice little comment here. Oh, yeah.

20:45 I thought you were in here, but apparently I'm in here right now.

20:47 I think it toggles between us. Anyway, yeah, so the point is we've really talked about Pydantic a lot.

20:54 It's a really cool way to create these classes that are kind of like data classes.

20:58 Point them at some data source and then they validate it and adapt it, right? So if I've got like a JSON document and it has a field in it, and that field is a list of something, I could say in my model, this thing has a list of integers. And if it happens to be, quote a string or a number that has quotes on it, it'll just automatically do the int parse type of thing to get it fixed.

21:19 Or it'll tell us that it couldn't figure out what to do with the third value, something like that. It's really fantastic.

21:25 But what I also didn't know was that it has built-in support for working with these user secrets.

21:30 So Dennis Roy pointed this out to me.

21:33 And there's all kinds of things.

21:34 You can have the .env file, you can have Docker secrets, you can have environment variables and all of these things as your secrets.

21:45 And if, instead of deriving from BaseModel, you derive from BaseSettings, then this will automatically determine any of the fields that are not passed to it from the environment or from .env files.

21:57 What do you think?

21:58 >> Well, that's cool. Where do the .env files go?

22:01 >> Not in GitHub.

22:03 >> Okay.

22:05 >> You know, you store them somewhere else, right?

22:07 Probably what ideally I think you do is you would store like an env template file that has, you know, put this value and then the real value here, this value and the real value there, and then you of course .gitignore the other one, the real one, right? So you at least have a structure.

22:22 But so the idea is you come down here and say I've got these settings, and we've got like an API key and auth key, we've got a Redis connection, all those kinds of things. And you can even say, I'm going to put a prefix on it.

22:37 So environment variables are fine if you've got one app and one server.

22:41 But if you've got 10 apps running or 10 APIs running on your server, what does the API key refer to?

22:48 What does the database connection string with the database name in it refer to?

22:52 Which one of those 10 apps, right?

22:53 So you can put a prefix so you could have like login app API key or you know, login app API key.

23:02 And you put that in there and it automatically will just let you access it as if it's API key.

23:07 So you can sort of configure an environment a little bit better.

23:10 There's just lots of really neat things that you can do in here to make that work.

23:13 You can say whether it's case sensitive.

23:16 Let's see, let me pull up...

23:17 I had to take notes of other things I thought were super cool.

23:20 So it's a regular Pydantic model, which means it'll do all the conversions and the validation.

23:25 So if something is missing that's required from your environment, it'll let you know exactly what's missing.

23:31 It'll do those conversions.

23:32 Yeah, all sorts of stuff.

23:34 It has support for secrets files as well, which is like a slightly different way to do it.

23:40 You can have differently named env files like a prod.env versus qnad.env or whatever.

23:48 All sorts of settings. So I've always thought Pydantic is amazing and I had no idea it had this built-in support for working with this.

23:56 The other thing that's really cool about this is if you go back to the top where it describes it, it says it will try to get these values from the environment if you don't pass them over.

24:05 So if you're in, say, a testing environment, and you want to actually pass values that would control it, you could just explicitly pass them along instead of having them come from the environment.

24:15 So it's really easy to test, set the test values instead of trying to configure a test environment.
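
Editor's sketch: the behavior described in this segment (values come from the environment unless passed explicitly, an optional prefix, type conversion, and a clear error for missing required values) can be imitated with a dataclass. This is not Pydantic's implementation or API; `AppSettings`, `load_settings`, and the `login_app_` prefix are hypothetical, and the conversion here only handles simple types such as `str` and `int`.

```python
import os
from dataclasses import MISSING, dataclass, fields
from typing import get_type_hints


@dataclass(frozen=True)
class AppSettings:
    api_key: str            # required: no default
    redis_port: int = 6379  # optional: falls back to the default


def load_settings(cls, prefix="", environ=None, **explicit):
    """Build cls from explicit values first, then prefixed environment variables."""
    environ = os.environ if environ is None else environ
    hints = get_type_hints(cls)
    values, missing = {}, []
    for field in fields(cls):
        env_name = f"{prefix}{field.name}".upper()
        if field.name in explicit:
            values[field.name] = explicit[field.name]  # explicit value wins
        elif env_name in environ:
            # Convert the raw string using the annotated type, e.g. int("6380").
            values[field.name] = hints[field.name](environ[env_name])
        elif field.default is MISSING and field.default_factory is MISSING:
            missing.append(env_name)
    if missing:
        raise ValueError("missing required settings: " + ", ".join(missing))
    return cls(**values)


# Production-style: read LOGIN_APP_API_KEY and LOGIN_APP_REDIS_PORT.
fake_env = {"LOGIN_APP_API_KEY": "s3cr3t", "LOGIN_APP_REDIS_PORT": "6380"}
print(load_settings(AppSettings, prefix="login_app_", environ=fake_env))

# Test-style: pass the value directly instead of configuring an environment.
print(load_settings(AppSettings, environ={}, api_key="test-key"))
```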

24:20 Nice.

24:21 We do use it, by the way, BaseSettings.

24:24 But we didn't use prefixes.

24:25 Yes.

24:26 Yeah.

24:26 Which is a good idea.

24:28 A really good idea.

24:29 Yeah, the prefixes are cool if you have a bunch of apps.

24:30 So if you just have one, it doesn't really matter, right?

24:33 Yeah, of course.

24:34 Cool, you like this? It's working well for you?

24:36 Yeah, it's working perfectly well.

24:37 And we are committing only the development version with some dummy keys just to have them around, of course.

24:43 Of course. Oh, wow. How neat. Okay.

24:46 Well, cool. Well, that's neat that you're using it.

24:48 Brian, you got the next one? Is that right?

24:50 - You've already done it. - No, but I just wanted to mention the...

24:54 Oh, wait.

24:55 Never mind. I hit the wrong thing.

24:58 Oh, here we go.

25:00 The quote I think you were looking for was from FastAPI.

25:03 Oh, yes, yes, of course, of course.

25:06 Yeah, it is. I'm over the moon.

25:09 Yeah, super excited about it. Yeah, FastAPI. Thanks.

25:12 Yeah, we use it.

25:14 I love FastAPI as well. And to me, like, Pydantic and FastAPI, they go together because I learned about them at the same time.

25:20 I know there are different people and different projects, but, you know.

25:23 It works like magic.

25:24 Yeah, yeah, absolutely.

25:26 And if it's not magic, maybe you should document it.

25:29 Or maybe it is magic, you should document it.

25:31 Definitely.

25:33 Definitely.

25:35 Actually, I'm the one who is usually bringing this topic to the team.

25:39 How to write documentation.

25:41 And first, the question is why to write documentation?

25:43 Everyone knows that we need documentation, but it's hard, it's time consuming, it's annoying, and how it usually happens, someone leaves the team, and then the last days are about handing over everything. Oh my gosh, I remember I've had this experience twice at least.

26:03 Writing? Where you've given your two weeks' notice that you're going to leave, and your next two weeks will be spent writing documentation for everything you've ever worked on and anything that people might need to do. So your next two weeks are to begin writing the documentation that you should have been writing the whole time. - In Germany, you will have a notice period of three months, so like it's... - Oh, that's a lot of documentation writing.

26:28 - Yeah.

26:29 (both laughing)

26:31 Just kidding, but normally, even if you leave the team, like you, for example, move from one team to another, it doesn't mean that you have to leave the company.

26:40 Still, you have to hand over everything that you worked for, let's say, in a year or even half of the year.

26:47 And for example, in my experience, when I started with Python, I didn't know any Python, I had to learn it.

26:53 And of course, I didn't know about Sphinx or Read the Docs or any kind of documentation for Python.

26:58 And what did I do?

27:00 Nothing, I didn't write it.

27:01 And half a year later, I was wondering who wrote this code.

27:05 So I did git blame, and of course it was me.

27:07 And I was like, what a stupid person.

27:09 (laughing)

27:11 So yeah.

27:12 And I suggest to start writing documentation now, even if you're not leaving the team.

27:18 The reason why I'm bringing up Sphinx and Read the Docs is that it will allow you to have continuous documentation.

27:26 And with Sphinx, you can easily write just some docstrings which will explain what the function does, what the class is doing, add some input and output parameters, and then it will be generated automatically.

27:42 So there's no need to write it somewhere on Confluence or any other source, because if there are too many sources, that's where the documentation will die because no one will go and check it.

27:54 And during the handover, usually it happens like that.

27:57 You write documentation somewhere where nobody knows where and nobody reads it.
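
Editor's sketch: a docstring of the kind described above, written with the reStructuredText field lists (`:param:`, `:returns:`, `:raises:`) that Sphinx can render when its autodoc extension is pointed at the module. The function itself is a hypothetical example, not code from the episode.

```python
def percent_similar(matches: int, total: int) -> float:
    """Return how many items matched, as a percentage of the total.

    :param matches: Number of items that matched.
    :param total: Number of items compared; must be positive.
    :returns: A value between 0.0 and 100.0.
    :raises ValueError: If ``total`` is not positive.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * matches / total


# The same text Sphinx would pick up is available at runtime.
print(percent_similar(1, 4))
print(percent_similar.__doc__)
```

Keeping the explanation next to the code is what lets the generated documentation stay in step with it, instead of living in a separate document that nobody finds.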

28:01 - Yeah, you pointed out that you've got it in Jira and you've got it in GitHub and you've got it in all the different places.

28:07 - Google Docs, yes.

28:08 - Yeah, yeah.

28:09 - Especially Google Docs. - Oh yes.

28:11 (laughing)

28:12 - And then you share like 10 Google Docs with different people and then they lose the links and people are leaving.

28:20 It's nice when people are leaving the team, but it's not nice to the people who are leaving the team to another team because they are getting all the questions for a year.

28:29 (laughing)

28:30 Where to find this?

28:31 How can I get this function?

28:33 How to get this data?

28:35 - Yeah, yeah, very good advice.

28:37 You know, for a long time, Sphinx was like synonymous with reStructuredText, but now we've also got Markdown with the MyST parser there.

28:46 So that's very cool as well.

28:48 I'm a fan of Markdown instead, yeah.

28:51 - And also Sphinx itself,

28:54 it supports different types of documentation.

28:57 For example, you can write code reference, then you can go through all the code, and then you can also write extra documentation, like Markdown.

29:05 Even the README can be included in the documentation.

29:08 And you can also style it.

29:10 - Oh, nice.

29:11 Yeah, yeah, very cool.

29:12 - Yeah, there's lots of great themes to it too now.

29:14 It really looks attractive.

29:16 - Yeah, you did recently cover that, right, Brian, the Sphinx themes.

29:19 - Yeah, and actually when the Markdown, the support came on, that's when I went back and started looking at Sphinx.

29:27 So some of our documentation is done in Sphinx now because it does Markdown.

29:33 And you can even make it do, it's not built in, but you can make it read docstrings and interpret docstrings as Markdown.

29:41 - Yeah, very cool, very cool.

29:44 Robert on the live stream has an interesting addition to continuous integration and continuous delivery.

29:50 So can we deploy yet?

29:51 Only if the documentation is complete.

29:53 - Definitely.

29:54 - Very cool.

29:56 All right, well, that's it for our main topics.

29:59 Brian, you got anything you want to share?

30:00 Any extra stuff you want to throw out there?

30:02 - Mostly I'm curious about pytest uses.

30:06 So I'll drop a link in the show notes, but basically I've got a pinned tweet on my Twitter.

30:13 And I'd like to have people tell me where they see where they're using pytest.

30:20 So I've got some examples and then I kind of went, my first question was people, projects that have switched.

30:27 But I was looking at just the guts of how Python works.

30:32 And there's some amazing projects that use pytest, like wheel, pip, setuptools, Warehouse, those all use pytest, that's pretty cool.

30:40 >> Wow, how interesting.

30:41 Yeah. And those are sort of almost inside of Python, which is interesting because they're not using unittest, right?

30:47 >> Yeah. So then I just learned about recently, even if it's proprietary, that'd be interesting.

30:52 I just learned that Stripe and Lyft went through a pytest conversion recently, so that's going to be neat.

30:57 >> That's cool. Yeah, very cool.

30:58 Anastasia, anything else you want to throw out there or let people know about while we're here?

31:03 >> Yeah, maybe using exceptions.

31:06 Don't use BaseException.

31:08 >> Yeah, create custom ones that are for your app or have certain.

31:12 Absolutely, I definitely second that idea.
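
A minimal sketch of that advice, with hypothetical names throughout (`AppError`, `ConfigError`, `load_setting`, and the helper functions are invented for illustration): define app-specific exceptions, and avoid catching `BaseException`, since that also swallows `SystemExit` and `KeyboardInterrupt`.

```python
class AppError(Exception):
    """Hypothetical base class for errors raised by our own application."""


class ConfigError(AppError):
    """Hypothetical error for a missing or invalid setting."""


def load_setting(settings: dict, key: str) -> str:
    # Raise a specific, app-level error instead of a generic one.
    try:
        return settings[key]
    except KeyError as exc:
        raise ConfigError(f"missing setting: {key}") from exc


def swallow_everything(func) -> str:
    # Anti-pattern: BaseException also covers SystemExit and
    # KeyboardInterrupt, so a request to exit is silently ignored here.
    try:
        func()
    except BaseException:
        return "swallowed"
    return "ok"


def try_to_exit() -> None:
    # Stand-in for code that legitimately wants to stop the program.
    raise SystemExit(1)
```

Callers can then write `except ConfigError:` or `except AppError:` and leave exit signals alone.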

31:14 All right, this, Brian, this was in danger of almost being an extra, extra, extra, extra, extra, hear all about it. So I'll just go quick.

31:22 So Matthew Feickert's getting a couple of shout outs on this show. So he also pointed out that whoa, super cool pipx, which we've talked about on the show before, it lets you install Python tools kind of like Homebrew or apt.

31:35 They're not part of a project, but you want to have them managed and installed in their own isolated environment. So you pipx instead of pip install a thing, which is great.

31:42 That is now officially part of PyPA, the Python Packaging Authority.

31:46 So yeah, pretty cool.

31:48 So pipx is now officially part of Python.

31:51 Not Python, the distribution, but the group, you know.

31:54 Next, I will be presenting-ish.

31:57 It's recorded, but then there's like a live Q&A afterwards.

32:00 Manning is having a conference on developer productivity.

32:04 I don't honestly remember what my talk is going to be about.

32:08 Oh, yes, here it is.

32:09 It's 10 tips and tools you can adopt in 15 minutes or less to level up your developer productivity.

32:14 So I'm going to be speaking on that.

32:16 All sorts of fun things.

32:18 So if you want to check that out, it's free to register for.

32:20 It's later this month, I guess.

32:22 Here's just a thought I would throw out there for you.

32:24 I don't expect an answer, but yikes, cloud bills can pile up.

32:28 Alex Chan, who is teaching, I guess.

32:32 I couldn't figure out exactly the context of this, but put out a tweet that said, I have a panicked student in my DMs who accidentally racked up an $8,000 AWS bill.

32:43 My suggestion of talk to support is no good. Apparently, they won't issue a billing adjustment.

32:48 Anyone got ideas out there?

32:49 Could you imagine as a student? I mean, as a professional, it's still a lot of money, but as a student, $8,000 is like a ton of money.

32:59 Yeah, it's like a term of bills. It depends on...

33:03 Yes, exactly. Yeah, like a semester of studies or something.

33:06 So maybe other students and basically all people out there, put up billing alerts on whatever cloud thing you're doing, on whatever places I have, including AWS.

33:18 I get periodically, I get an announcement like, "Your bill is now at $50, your bill is at $100, your bill is now at $500, your bill is now at $1,000." If it goes beyond that, I'm going to have to start paying a lot of attention to what's going on with my AWS account.

33:32 So just put these alerts on there.

33:34 It's usually easy with whatever platform you're on.

33:37 Anyway, don't be that poor student.

33:40 All right, what's next?

33:41 Brian Skinn shouted out.

33:43 Hey, this might not be a total new item, but maybe we can mention it, maybe it's interesting.

33:47 Developed a, mentioned a Flake8, he didn't develop it, I don't believe, a Flake8 plugin for FastAPI.

33:54 So if you're doing FastAPI, there's different ways to do things like routes and whatnot.

33:59 And there's like the natural way, and there's sort of a clumsy way.

34:03 And so here's a Flake8 thing to make sure you're doing FastAPI.

34:06 Nice.

34:07 Interesting.

34:08 Yep.

34:08 And I think, yeah, yeah.

34:10 And I think this is my last one.

34:12 It is my last one here.

34:13 So Saul Shanabrook tweeted JupyterLab 3 will have localization.

34:19 So localization means like the menus and the help text and the button hover tips and all that kind of stuff are localized for different languages.

34:27 So JupyterLab 3 will have localization making it more approachable for people who don't want to work in an English UI and they're crowdsourcing translations.

34:38 So if you wanted to contribute to Jupyter and you were good at programming and in a language that's not English, that's already done in English, you know, go check that out.

34:47 That would be kind of cool.

34:48 What if anybody just messes with people and like does wrong translations just for fun?

34:52 I'm so afraid of that.

34:54 Yeah, I think they do.

34:56 I bet they do.

34:57 I bet they do.

34:58 And maybe not really obvious, maybe in real subtle ways.

35:00 Yeah.

35:01 Yeah.

35:01 Yeah.

35:02 Never mind. Don't have any ideas.

35:04 Brian, don't give people ideas. This is not...

35:06 That's a good one.

35:08 [laughter]

35:10 Alright, well that's all the extras as well. So how about a joke? Yeah.

35:14 Okay. So, imagine you're learning programming, you're learning Python.

35:18 Take one of these computer science courses where they talk about weird things like recursion.

35:24 So, recursion is the idea that the function calls itself with different parameters, right? Like a really common example would be factorial. So if I'm going to calculate a factorial, it's just n times n minus 1 times n minus 2. So that's just n times factorial of the smaller number. You just like work your way back, right? But there should be an exit condition, like if n equals 1, return. Don't keep recursing. So here's a nice little graphic under the banner of "Only programmers would understand." And it's got the four squares. It's kind of like screen sharing. We got that infinite view. So it's learning to program in one corner. Next corner, make recursive function.

36:01 Third corner, no exit condition. And then it just repeats and repeats and repeats down to smaller and smaller and smaller. I love it. This is bad. No, this is good. That's how you learn.

36:13 That's right. No. Yeah, exactly. It's like when you share your screen in Zoom or maybe Google Meet, but you still got the window up or something like that. But it's about recursion. It's beautiful.

36:24 And then you silence base exceptions and you cannot exit the program.

36:27 Yes, that's right.
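
The factorial example from the joke can be sketched as follows. The function names are illustrative only. The second version deliberately leaves out the exit condition, and Python stops it with a `RecursionError` once the call stack gets too deep.

```python
def factorial(n: int) -> int:
    # Exit condition: stop recursing once n reaches 1 (or below).
    if n <= 1:
        return 1
    # Otherwise it's n times the factorial of the smaller number.
    return n * factorial(n - 1)


def factorial_no_exit(n: int) -> int:
    # No exit condition: this keeps calling itself until Python gives up.
    return n * factorial_no_exit(n - 1)


print(factorial(5))  # 120

try:
    factorial_no_exit(5)
except RecursionError as exc:
    print("RecursionError:", exc)
```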

36:29 Do you know if Python has a tail recursion optimization?

36:34 I'm thinking no.

36:37 So the whole point is here, Brian, that we would run out of a call stack space really quickly.

36:41 And that's usually the error stack overflow, error if you recurse too deep type of thing.

36:45 But with tail recursion, it basically becomes an infinite loop.

36:49 So you run out of time instead of memory.

36:51 Okay.

36:52 So that would be the advantage of tail recursion. I have no idea if it is or not.

36:56 Yeah, I mean, there's some languages that do the optimization so they don't generate a new call stack because there's nothing to save.

37:04 Yeah. I don't know. I'm sure we will find out before next week.

37:09 Yeah. One of the reasons why I like asking open-ended questions on the podcast.

37:14 Yeah, that's awesome. Yep.
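
On the tail recursion question: CPython, the reference interpreter, does not perform tail-call elimination, so a tail-recursive function still uses one stack frame per call. The sketch below uses made-up function names to show this; the loop version is what a tail-call-optimizing language would effectively turn the recursion into.

```python
import sys


def count_down_tail(n: int) -> int:
    # Tail call: the recursive call is the very last thing the function does.
    if n == 0:
        return 0
    return count_down_tail(n - 1)


def count_down_loop(n: int) -> int:
    # Same logic as a loop: constant stack usage regardless of n.
    while n != 0:
        n -= 1
    return n


# Pick a depth well beyond the interpreter's recursion limit.
depth = sys.getrecursionlimit() * 10

try:
    count_down_tail(depth)
    hit_limit = False
except RecursionError:
    hit_limit = True

print("tail-recursive version hit the recursion limit:", hit_limit)
print("loop version result:", count_down_loop(depth))
```

So in CPython a runaway tail-recursive function runs out of stack depth rather than looping forever.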

37:16 Well, Brian, thank you as always. And Anastasia, thank you for being here. It was great to have you as a guest.

37:20 Thank you for inviting me.
