Brought to you by Michael and Brian - take a Talk Python course or get Brian's pytest book


Transcript #270: Can errors really be beautiful?

Recorded on Wednesday, Feb 9, 2022.

00:00 Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds.

00:04 This is episode 270, recorded February 9th, 2022.

00:09 I'm Michael Kennedy.

00:10 And I'm Brian Okken.

00:12 And I'm Dean Langsam.

00:13 Dean, so great to have you on the show.

00:15 Thank you.

00:16 So often you help me with that start in the live chat.

00:19 I know you're a big participant in the show, so we pulled you in and now here you are.

00:25 Welcome.

00:26 Thank you.

00:27 I've been a fan actually since episode one. I've been hearing this weekly.

00:31 That goes back years, like five years.

00:34 Yeah, it's about five years. I remember I moved apartments back then and I listened to Python. I didn't know Python as well back then.

00:42 And I actually grew with the show. So that's very nice.

00:44 That's fantastic.

00:45 That's incredible. We've heard that from other people and that's just like mind-blowing to me. But yeah, it's cool.

00:52 I was taking intro to data science classes on Coursera while listening to the show, and now other people call me a senior Python. So that was very nice. That's fantastic. And it does go fast. Yeah. So awesome. Thank you so much for joining us on the show. It's awesome. Before we get into it, I also want to say this episode is brought to you by Datadog. Check out their awesome stuff at pythonbytes.fm/datadog. I'll tell you more about that later.

01:20 Right now, I want to hear about a better Pygame loop.

01:25 Brian, tell us about it.

01:26 - Yeah, so this is an article from Glyph, and Pygame is a package that's used for game programming a lot.

01:37 And it's, I mean, a lot.

01:39 And programming games is definitely, I think it's one of the things I tried to do early on when I was a developer, and I think it's something that I think I encourage a lot of new developers to try out things like simple games because it's fun to learn coding that way.

01:55 And it's a, anyway, it's a big part of learning programming and the programming space.

02:02 And with Python, it's pretty easy with Pygame.

02:04 And there are a lot of tutorials out there, but one of the things that Glyph points out is that a lot of the tutorials have this sort of simple "while 1" loop, the main loop of a game, where you just spin and wait for events and then handle the event or draw things or whatever and then go back.

02:23 And draw, you know, keep going.

02:25 And this just happens forever.

02:27 Well, a "while 1" loop in programming is a busy loop, and it's generally something that has some issues.

02:34 So Glyph is pointing out that some of the issues with this are that it wastes power, for one: your CPU is just spinning all the time and you're really not gonna get events that fast.

02:47 And then also there's a thing that I didn't know about called screen tearing, which is when you're drawing the screen at the same time you're writing to the screen buffer.

02:58 - Right, you're not waiting for the Vsync 60, 100 frames a second, whatever it is, right?

03:03 - Yeah, and that can cause glitches in the game and it doesn't look as good.

03:09 Pygame does allow a Vsync option, but apparently there's some problem with that.

03:15 So really, the article walks through both of these problems and the VSync fix and the problems with that.

03:26 But the end result really is he's got, it's actually an interesting discussion about really what's going on in PyGame.

03:35 And he talks about that there's really three jobs going on, the drawing and the game logic and the input handling all at once.

03:42 And so there are three things.

03:44 It's probably a good idea to do maybe async stuff so things can work together.

03:49 And the solution he came up with is still, I mean, it's definitely a larger loop, but it's not that big of a loop, more complicated.

04:01 And it's an async version to have some sleeps in there with some delays possibly, but a better loop for gaming.

04:10 And it's not that complicated.

04:12 And actually, if you're learning gaming while programming, hearing about these sorts of issues and learning how to solve it, it's probably just gonna make you a better developer faster.

04:23 So I think it's a good thing to look at this.
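
A rough sketch of the idea described above, not Glyph's actual code: the loop does the three jobs once per frame and then sleeps until the next frame is due instead of spinning. It uses only asyncio, and handle_input, update_logic, draw_frame, and the 60 frames per second target are hypothetical stand-ins for whatever a real Pygame program would call.

```python
import asyncio
import time

FRAME_SECONDS = 1 / 60  # assumed target frame rate


async def game_loop(handle_input, update_logic, draw_frame, max_frames):
    """Run the three jobs once per frame, sleeping instead of spinning."""
    frames = 0
    next_frame = time.monotonic()
    while frames < max_frames:
        handle_input()
        update_logic()
        draw_frame()
        frames += 1
        next_frame += FRAME_SECONDS
        # Yield to the event loop until the next frame is due,
        # rather than burning CPU in a busy loop.
        await asyncio.sleep(max(0.0, next_frame - time.monotonic()))
    return frames


# Stand-in callbacks that only record the order in which they ran.
calls = []
frames_drawn = asyncio.run(
    game_loop(
        lambda: calls.append("input"),
        lambda: calls.append("logic"),
        lambda: calls.append("draw"),
        max_frames=3,
    )
)
```

Because the loop awaits between frames, other coroutines could run in the gaps, which is roughly the point of going async here.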

04:26 - Yeah, this looks really interesting.

04:28 This gaming loop stuff, it's so often very much the same and there's like these core elements like process input, if the key's down or if there's a joystick attached, draw the scene, do the hit detection and AI game logic.

04:43 It's almost always the same.

04:45 This looks great as a way to tell me what I should be doing.

04:50 Maybe the next step would be create a class that I just override, do the AI logic, draw the screen, and just let that not even be something I ever see.

04:58 This is ripe for a little bit of hiding away even this cool stuff.

05:03 That's true. Maybe Pygame can extend a better built-in loop to hook into or something.

05:10 Yeah.

05:11 I always think about... I'm not actually used to a lot of gaming on Python, but I always think about browsers, which are also kind of a loop that runs forever and renders stuff on your screen.

05:23 And I think, well, the front-end guys got it so easy.

05:26 They just write the code and the browser does it for them.

05:30 And I'm not sure if it works exactly the same, but maybe if someone manages to implement something that's like, just write your game and put it in this thing, maybe this could attract more people into writing small games in Python.

05:48 Yeah, absolutely. And my thought is if you just sort of abstract that away, it's just 2D stuff, right, which it's pretty easy to get into.

05:56 I just listened to or watched a Netflix series called High Score, which is the history of video games going way back to the Atari 2600 and Asteroids and whatnot.

06:08 There's this woman in here talks about how she got so inspired about just text-based games.

06:13 If you're learning to program, I definitely think games are a fun way and often I think people might perceive that as like, "Well, I've got to write Angry Birds or something." Which is fine, you can write that and that's super fun, but you can do a lot of stuff with just sort of text-based little fun story adventure type stuff as well.

06:32 I gotta check out that Netflix series. That sounds great.

06:35 Yeah.

06:36 I was just helping a friend writing this small game and he's written this with one thread and everything for this school project.

06:44 And then he told me, "Well, but how do I show a score that updates with the game?" And then I thought, "No, for that you'll need multiple threads, a Pygame loop maybe, and stuff like that." So if that could have been easier on him while learning Python, this could have been awesome.

07:01 Yeah, absolutely. There's a lot of nice comments out in the live stream. Anthony says, I teach Pygame in my code club after school class. Smart kids, Pygame is great. So is Arcade, which is an OpenGL-based alternative to Pygame. That's very cool.

07:18 I do think having something visual for people when they're learning, it just, it reinforces things so much, right?

07:25 Like writing that API back end that talks to a database is great when you see the next three steps down the line, how it's going to enable something.

07:32 But when you're getting started, you need quick feedback.

07:35 Absolutely.

07:36 All right.

07:37 Well, let's talk about something else that's awesome here.

07:40 I want to talk about SQLAlchemy.

07:42 SQLAlchemy has been getting a lot of attention lately and that's super cool.

07:47 Mike Bayer released SQLAlchemy 2, which was the first async API version.

07:53 So now you can use async and await with SQLAlchemy, which opens up lots of possibilities.

07:57 Sebastian Ramirez released SQLModel, which is like a marriage of Pydantic and SQLAlchemy, which is also super neat.

08:05 But there are many other things that you can do with SQLAlchemy that are really handy.

08:10 So as all the awesome lists go, here's one for a curated list of SQLAlchemy.

08:16 Now, first, just a word of warning.

08:19 From what I can tell, including the PR that I added yesterday, all the way back to the one in June, 2020, it doesn't seem to be getting a whole lot of love, which is unfortunate.

08:31 So it seems like it might be sort of stalled out, but that said, it's still a really good list of things.

08:36 So I'll pull out a couple that I think are nice here.

08:39 Which ones did I want to highlight?

08:40 The first one is called Continuum, SQLAlchemy Continuum, and this is versioning.

08:46 So imagine you would like to have a history or a record of changes to your database.

08:52 Like maybe this is some sort of financial thing.

08:54 And if you see changes, you want to be able to say, this person made this change on this date when they said, you know, update, get the record, make a change, and call commit on the SQLAlchemy session.

09:07 So what this does is it will create versions of inserts, updates, and deletes.

09:12 It won't store those if there's not actually a change.

09:16 It supports Alembic migrations, you can revert data objects and so on.

09:21 So if you want that, SQLAlchemy Continuum, it's just like one of the many, many, many things in here, which is pretty awesome.

09:28 Another one I wanted to highlight is UTC.

09:31 So one of the challenges that people often run into is when you're storing stuff in the database, a date in particular, what time is that?

09:39 Is that the time of the user who might be in a different time zone than the API endpoint that it was running at, right?

09:45 So it might be nice to be able to store zone, like time zone aware things and store them as UTC values.

09:54 So they're always the same.

09:55 And then you can convert them back to like the time zone, which is pretty cool.
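
To make the store-as-UTC idea concrete, here is a minimal sketch using only the standard library. It is not the API of the SQLAlchemy package mentioned above; the fixed five-hour offset and the sample date are assumptions picked for illustration.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical user five hours behind UTC (a fixed offset keeps this stdlib-only).
user_tz = timezone(timedelta(hours=-5))
local_time = datetime(2022, 2, 9, 9, 30, tzinfo=user_tz)

# Normalize to UTC before storing, so every stored value means the same instant.
stored = local_time.astimezone(timezone.utc)

# Convert back to the viewer's zone on the way out.
shown = stored.astimezone(user_tz)
```

The stored value and the displayed value are the same instant in time; only the offset attached to them differs.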

09:58 Another one is a SQLAlchemy utils is pretty cool.

10:01 So it's got things like a choice type, which I'm guessing is basically an enum, but country, JSON, URL, UUID, all of these different data types, data ranges, all kinds of stuff, ORM helpers, utility classes, and different things like that.

10:19 So that's kind of a grab bag of them.

10:21 Let's see, one also is called File Depot.

10:25 There's cool stuff for processing images.

10:27 You've got File Depot, which is a framework for easily storing and serving files out of your database on the web, as well as SQLAlchemy Image Attach, which is specifically about storing images in your database, which, by the way, we do right on Python Bytes.

10:41 - Cool.

10:42 - You know, if you go to any page, any episode page, and you see like that, that watch it on YouTube, that little thumbnail, we go get that dynamically from YouTube and then serve it up so we don't have to depend on YouTube.

10:53 So anyway, that's pretty cool.

10:55 Let's see, maybe two more.

10:58 There's searchable.

11:01 So if you wanna add full text search to your model, you can use this. It only supports Postgres, 'cause I'm sure it depends upon some core element there, but there's also another one for MySQL as well, which is pretty cool.

11:15 And then the last one is schema display, which generates basically graphs of your models and how they relate to each other, stuff like that, which is kind of neat.

11:25 - Nice.

11:26 - What do y'all think?

11:27 Cool stuff, right?

11:28 - Yeah, very cool.

11:29 - Yeah.

11:31 So if you're really bought into SQLAlchemy, you owe it to yourself to just flip through this list to just go like, wait, it can do that?

11:39 I had no idea that it could do that, right?

11:41 And just sort of see what are the other things that people built on top of here that I think would be super, super helpful.

11:48 And by the way, my PR was really to say, there's a layer called thin abstractions.

11:56 And it says, you know, under the thin abstractions, we really should have SQLModel, because that thing is super popular straight out of the gate, right?

12:04 So people should check this out.

12:06 It's already got almost 7000 stars and it's, what, a month old or something. That's crazy. Yeah.

12:12 Maybe six weeks, but really, really new. Yeah. And but the author I mean. Yeah. Exactly.

12:21 I know. Brandon on the audience says there should be a meta awesome list like an awesome list of awesome lists. I'm sure there is. There is. I'm sure. And yeah quite quite fun.

12:35 I definitely recommend people check that out.

12:37 All right, Dean, that brings us to your first item.

12:40 Tell us about it.

12:41 - Yeah, so at work, I needed to write something that required threading, and I was very afraid of threading at the beginning.

12:49 Basically what we needed to do, we have some mechanism.

12:51 I'm a data scientist, and we need to take many queries at once and get them as pandas data frames and save them to disk, and later take all of them and work with them.

13:02 And instead of writing, like sending them sequentially, I wanted to send a bunch of them together.

13:08 And the bonus thing I found is that when you release them to threads, if you don't lock the threads or wait for the threads, you can actually still work with the Jupyter notebook while waiting for the queries.

13:20 So that was my main reasoning.

13:21 And eventually, after I'd written most of the code, I found this blog post called "ThreadPoolExecutor in Python: The Complete Guide." So this is by Jason Brownlee.

13:33 He's also the guy from Machine Learning Mastery, so I'm very familiar with him.

13:38 It's a very long blog post, so you could read it as an e-book or just access the stuff you need, because it's a two-hour read, maybe.

13:50 And he explains everything from the beginning.

13:53 He explains what are Python threads, how to work with them.

13:57 Then he introduces the ThreadPoolExecutor, which is a more convenient way to use threads.

14:03 He explains about the life cycle of the… What does it do? How to do it then with a context manager and stuff like that.

14:11 And eventually, what he talks about that other people do not, when you search for a threading tutorial, is actually about the complete life cycle and then the usage patterns.

14:22 And then he explains about I/O bound versus CPU bound and everything.

14:26 And he finishes off with the common questions.

14:29 So this is like the link I've saved because I will forget it in a week, but the next time I need to, I just know I can come back to this and like read the common questions part.

14:40 And yes, there's a question, the question is like, - "How do you stop running?" - There's a lot there, yeah.

14:46 - There is a lot there in this article, isn't there? - Yeah, it's a lot.

14:49 But the thing is you can come back later and just take the stuff you need.

14:52 Like I remember, I know I'm working, then I can ask myself, "How do you set a chunk size in map?" Well, it says there that you don't because that's for the process pool.

15:01 But then I have another question, maybe, "How do you cancel a running task?" And the answer is there.

15:05 So I think that's a good thing to have, like, to quickly access when you need to.

15:11 And it finishes off with, like, "What's the difference from asyncio, from threading.Thread, from ProcessPoolExecutor?" So that is a very helpful guide, very complete.

15:24 And it's actually an entire blog dedicated to the ThreadPoolExecutor and the ProcessPoolExecutor.

15:33 I love that it's covering the thread pool and process pools because it's easy for things to just completely get out of control.

15:43 You know, as you throw more work at it, stuff can completely back up. So if you just say, create me a new thread and run that, and then in another place, create me more threads, and I got a bunch more, oh look, now I have a thousand items to process, create a thousand threads.

15:55 - Yep.

15:56 - Each thread takes a lot of context switching to switch between and they take a decent amount of memory and all sorts of stuff, right?

16:02 Through the thread pool, you can say, queue up the work and run 10 at a time.

16:06 Same for processes, which sort of sets an upper bound on how much concurrency you can deal with, right?
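
A small sketch of that upper bound, using the standard library's ThreadPoolExecutor. The work function, the 100 items, and the short sleep are made up for illustration; the counter is only there to observe how many tasks ran at once.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

stats = {"active": 0, "peak": 0}
lock = threading.Lock()


def work(item):
    """Stand-in for an I/O-bound task; tracks how many run at the same time."""
    with lock:
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
    time.sleep(0.01)  # pretend to wait on a network or database call
    with lock:
        stats["active"] -= 1
    return item * 2


# 100 items are queued, but at most 10 worker threads ever exist.
with ThreadPoolExecutor(max_workers=10) as pool:
    results = list(pool.map(work, range(100)))
```

pool.map returns results in input order, and the peak concurrency never exceeds max_workers no matter how much work is queued.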

16:12 - Yep.

16:13 - Yeah, this is cool.

16:15 So you talked about solving some problems in Jupyter Notebook using this.

16:19 What in particular were you trying to do?

16:21 So basically, I can send, I know, a thousand queries.

16:26 And once they get, like, we have big data, and then I have a query that takes a part of it, like, after maybe some group bys and limitations and stuff like that, and I want to take the data frame and save it.

16:38 - Right. - And then, once I have the entire data from all the queries, I want to join them, or maybe do some, I don't know, some processing and then join everything.

16:49 The thing is, after 10 of those came back, I have a sample of my data that I can work with and try to manage, and then have code written while the other stuff is still running. I want to have that. I can play with it.

17:03 So if I release the other things to the threads and they work in the background, the main thread of the Jupyter notebook is open.

17:13 And you can start working on the same notebook.

17:16 Before then, I used to open a notebook that's querying stuff, open a notebook that I'm playing with, and see that the file paths are the same, so I'm not confused with some other directory of the other versioning of this data.

17:32 And now it just works.

17:34 - Oh, that's really cool. - And you can also add a thread for, I don't know, some visualizations of what's finished, what's errored, and everything.

17:43 Fantastic.

17:44 >> That sounds really good. I'm sure there's a lot of concurrency and parallelism in the data back-end.

17:49 It's just how do you access that from Python, so how do you issue all those commands?

17:54 Excellent. All right. Let's see.

17:57 Brian, anything you want to add before I talk about Datadog?

18:00 >> No. Some comments, like Sam Morley: concurrent.futures is a much less painful way to work with them at a higher level.

18:09 So maybe we could get an article on concurrent.futures on the show sometime.

18:14 Yeah, for sure.

18:15 So the ThreadPoolExecutor gets you back futures.

18:21 And then part of what's explained in the blog post is how to work with futures like as completed or sequentially or like you decide your strategy, but you work with the futures.
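
As a sketch of the "as completed" strategy, assuming a hypothetical run_query function and made-up durations in place of real database queries:

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_query(name, seconds):
    """Hypothetical stand-in for a slow database query."""
    time.sleep(seconds)
    return f"{name}: done"


queries = {"slow": 0.2, "fast": 0.01}

finished = []
results = {}
with ThreadPoolExecutor(max_workers=2) as pool:
    # submit() returns a Future immediately; map each one back to its query name.
    futures = {
        pool.submit(run_query, name, seconds): name
        for name, seconds in queries.items()
    }
    # as_completed() yields futures in the order they finish, not the order submitted.
    for future in as_completed(futures):
        name = futures[future]
        finished.append(name)
        results[name] = future.result()  # re-raises here if the query failed
```

Handling results as they arrive is what makes it possible to start working with the first few data frames while the rest are still running.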

18:32 Nice. Okay, cool.

18:34 Yeah, nice.

18:35 And of course, requisite shout out to Unsync, which is all sorts of awesome for this stuff.

18:41 It unifies the API for direct threads, for processes, and asyncio.

18:49 But what I want to tell you all about now is Datadog.

18:52 Datadog is really awesome.

18:54 You should really have insight into your applications and that's what Datadog brings you.

18:58 So Datadog is real-time monitoring that unifies metrics, traces, and logs into one tightly integrated platform.

19:06 Their APM empowers developers to identify anomalies and resolve issues, especially around performance.

19:13 You can begin collecting stack traces, visualize them as flame graphs, and organize them into profile types such as CPU bound, IO bound, and so on.

19:22 And teams can even search specific profiles and correlate them to distributed traces to find things across different parts of your infrastructure and microservices and identify slow or underperforming code and then make it faster.

19:35 Plus you can use their APM live search and you can search across the full stream of all the traces over the last 15 minutes.

19:42 So try Datadog APM for free with a 14 day trial.

19:47 And then Datadog will send you one of these very cute doggy t-shirts, which who wouldn't want one of those, right?

19:52 So visit pythonbytes.fm/datadog or just click the link in your podcast player show notes to get started.

19:58 Thanks Datadog.

19:59 And Brian, back to you.

20:01 - Back to me.

20:04 I'm going to apologize to whoever tweeted this, but somebody tweeted this out.

20:10 It was a link to this article, talking about chaining operators.

20:15 So this is an article by Rodrigo Girão Serrão.

20:21 Pydon'ts.

20:22 Yeah, so I don't know what the Pydon'ts are about.

20:28 Just I don't know.

20:29 Maybe he started blogging about things you shouldn't do in Python, but anyway, this article is called chaining comparison operators and I use chaining all the time.

20:38 Mostly I use it for simple things like, let me find one, A is less than B, less than C.

20:45 So ranges like min, my X value is between min and max.

20:50 >> Yeah, that's really nice.

20:51 >> Yeah. My hint on that, just tip for anybody doing that, always do them less than, don't do greater than because it's hard to do that.

21:02 Anyway, so keep them like that.

21:04 But this article is talking about other stuff.

21:06 So this is pretty easy to think about, like the less than operator.

21:10 So A is less than B, less than C, is really the same as A is less than B, and B is less than C.

21:18 It is that combination.

21:19 That's what chained operators are.

21:21 And the importance there is it doesn't really work for some operations.

21:28 And it gets into like the equal operator.

21:31 So you can do A equals B equals C, which means they're all equal.

21:36 Great.

21:37 What about not equal?

21:38 Does that work the same way?

21:39 And it doesn't, because if you've got like, A is not equal to B is not equal to C, it doesn't mean they're all different because A and C still could be the same and have that pass.

21:53 So this, this article, if you're working with chained expressions, which I think you should, if you're doing complicated things, it's way, I like it better than having a bunch of ands in there, as long as you can keep it readable.

22:07 But this article talks through some of the gotchas and things to watch out for, like side effects and non-constants and things like that.

22:17 So, great discussion of chained operators.
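
The points above can be sketched in a few lines of plain Python. The values and the middle function are made up for illustration.

```python
a, b, c = 1, 2, 3

# A chained comparison is shorthand for joining each neighbouring pair with "and".
chained_lt = a < b < c
spelled_out = a < b and b < c

# Chained != only compares neighbours, so the outer values may still be equal.
x, y, z = 1, 2, 1
neighbours_differ = x != y != z      # passes even though x == z
all_distinct = len({x, y, z}) == 3   # one way to really check "all different"

# Side effects: the middle expression is evaluated only once in the chained form.
calls = []


def middle():
    calls.append("called")
    return 2


in_range = 1 < middle() < 3
```

Writing 1 < middle() and middle() < 3 by hand would call middle twice, which is one of the gotchas around side effects and non-constant values.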

22:21 - I hadn't even thought of doing this, not equal to, this seems wrong.

22:24 (laughing)

22:25 It just looks wrong.

22:27 >> Yeah, but don't do chained not equal.

22:31 That's just, even if that's what you meant, that A is not equal to B and B is not equal to C, but it's okay for A and C to be equal.

22:39 That would be a terrible expression because it's confusing. So don't do that.

22:42 >> It is.

22:43 >> Yeah.

22:43 >> My favorite one of these chainings like X, seven less than X less than 10, something like that, that's nice.

22:52 My favorite is converting x if x is not None else y to just x or y.

22:59 Boom. That's so clean and so nice.

23:01 And coming from a C++ background and C#, I never thought that was possible.

23:07 And that's great.
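
A quick sketch of that shortcut, with hypothetical function names. One thing worth knowing: the two forms agree whenever x is None or truthy, but x or y also falls back to y for other falsy values such as 0 or an empty string.

```python
def pick_explicit(x, y):
    """Fall back to y only when x is None."""
    return x if x is not None else y


def pick_short(x, y):
    """Fall back to y whenever x is falsy (None, 0, "", [], ...)."""
    return x or y


same_for_text = pick_explicit("name", "default") == pick_short("name", "default")
same_for_none = pick_explicit(None, "default") == pick_short(None, "default")

# The two differ when x is falsy but not None.
explicit_zero = pick_explicit(0, 10)
short_zero = pick_short(0, 10)
```

So the short form is a clean fit when a falsy x should be treated the same as a missing one.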

23:08 Yeah.

23:09 Dean, what do you think about this?

23:10 I love it.

23:12 I use it a lot.

23:13 It didn't always work.

23:15 I think it's still not working with Pandas data frames or Pandas series and arrays.

23:20 And I do wait for this to finally work.

23:24 Arrays, when you do an array, like in NumPy or Pandas, when you do an array that's less than some number, it returns a new array, like a Boolean array, which is true and false.

23:35 And last time I checked, a few months ago, but the last time I checked, it didn't work.

23:40 I couldn't do one is less than the series, is less than two, and get the Boolean array.

23:46 So I'm waiting for this, but I love the concept a lot.

23:50 Okay, that's good.

23:51 Yeah, I hadn't really considered the integration into pandas.

23:54 Yeah, but of course.

23:55 I'm not sure how would you implement that with the regular data model of like dunder dunder EQ or is this something else?

24:03 LTE, yeah, possibly. I'm not sure either.

24:06 Yeah, there's probably some magic method and it might just expand out to less than and then and, you know, like the two tests basically.

24:14 Probably does.
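
On the question of how this interacts with array types: Python evaluates 1 < s < 2 as (1 < s) and (s < 2), and the "and" asks the first result for a single truth value. The sketch below uses a tiny hypothetical Series class as a stand-in (it is not pandas or NumPy) to show why an elementwise Boolean result cannot pass through that step, and how spelling the comparison out with & gets around it.

```python
class Series:
    """Tiny hypothetical stand-in for an array type with elementwise comparisons."""

    def __init__(self, values):
        self.values = list(values)

    def __gt__(self, other):
        return Series(v > other for v in self.values)

    def __lt__(self, other):
        return Series(v < other for v in self.values)

    def __and__(self, other):
        return Series(p and q for p, q in zip(self.values, other.values))

    def __bool__(self):
        # Mirrors the idea that one True/False for a whole array is ambiguous.
        raise ValueError("truth value of a Series is ambiguous")


s = Series([0, 1.5, 3])

# The chained form needs bool() of the intermediate result, which raises here.
try:
    1 < s < 2
    chained_ok = True
except ValueError:
    chained_ok = False

# Combining the two comparisons with & stays elementwise.
mask = (s > 1) & (s < 2)
```

There is no magic method for the chain itself; each comparison goes through the usual methods such as __lt__, and the "and" between them cannot be overloaded.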

24:15 - Cool, cool.

24:16 - We could ask, we should ask Brett Cannon to do a deep dive into chained operators.

24:23 - He's pulling apart all the different parts of Python syntax, right?

24:26 - Yeah.

24:27 - All right, I wanna give a quick shout out to Rich because it's one of our episodes, so we talk about Rich.

24:34 I was gonna talk about Anthony Shaw, but I didn't have enough information, so I mean, he's the other person who needs a shout out in every show.

24:41 So I wanna talk about this article, highlighting some tools, by Martin Heinz.

24:48 Yeah, Martin Heinz.

24:49 Well, creating beautiful tracebacks with Python's exception hooks.

24:53 So two things that I wanna point out here.

24:55 One, Python has an exception hook mechanism, which is pretty cool.

24:59 So what you can do is you can create a function that has this signature of exception type, the actual exception and the traceback.

25:08 So three arguments.

25:10 And if you have a function like that, you can just go to sys and say sys.excepthook equals that function, not calling it, of course, just passing the function as the value.

25:20 And then whenever there's an exception, this will be called by Python.

25:22 That's pretty cool, right?

25:23 - Yeah.

25:24 - So depending on what you wanna do, like you could say, well, we're gonna store all the errors.

25:29 Like let's imagine here's a scenario where you might make use of this.

25:33 I'm gonna create an app and I'm gonna send it out.

25:36 I'm gonna use py2app or py2exe, or just let people install it somehow.

25:41 And then when it runs, I want it, it's gonna run on their computers, but I want to gather up all the exceptions of all the users across the company or the research team or whatever.

25:50 You could have this, submit this error along with other details right back to a database over an API, right?

25:56 And then you could do like analytics, like well here's the most common error and so on.

25:59 Of course you could use Sentry or something like that, but maybe you're trying to gather some specific information that's different, right?

26:05 So that's one of the types of things you could do with this.

26:08 - So I got a question before I go on.

26:10 - Yeah.

26:11 - So this doesn't catch the exception, it just, it doesn't interrupt the flow, it just gets called when it happens?

26:19 - It doesn't catch the exception, it lets you basically change what kind of output comes from Python.

26:27 So if you just wanted to print out like, here's a file where there was an error and here's the error message.

26:32 - Okay.

26:33 - Like you could do that, right?

26:34 Or the type and then the message.

26:36 - I'm just noticing the example doesn't rethrow it, so you don't have to do that then.

26:41 - No, I don't believe so.

26:42 And I'm not 100% sure.

26:44 I think the app, I think the process still ends if it's just a regular running script rather than a web app.

26:52 I think it still ends, but--

26:54 - Anyway, sorry.

26:55 - You get a different kind of output.

26:56 Yeah, yeah, no, you just don't get the standard print output that Python gives you, right?

26:59 So you could say, avoid printing the trace back if you wanted.

27:03 You could just say this file on this line had this error.

27:05 - Oh, right, okay, nice.
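
A minimal sketch of such a hook, not the code from the article. The name short_hook and the one-line output format are assumptions for illustration; sys.excepthook and the three-argument signature are standard Python.

```python
import sys
import traceback


def short_hook(exc_type, exc_value, tb):
    """Print one line (file, line number, error) instead of the full traceback."""
    frames = traceback.extract_tb(tb)
    if frames:
        last = frames[-1]  # innermost frame, where the error was raised
        where = f"{last.filename}:{last.lineno}"
    else:
        where = "<unknown location>"
    message = f"{where}: {exc_type.__name__}: {exc_value}"
    print(message, file=sys.stderr)
    return message


# Assign the function itself; Python calls it for any uncaught exception.
sys.excepthook = short_hook

# Call the hook by hand once, just to preview what it would print.
try:
    {}["missing"]
except KeyError as exc:
    preview = short_hook(type(exc), exc, exc.__traceback__)
```

The hook only changes what gets reported. An uncaught exception in a plain script still ends the process after the hook returns, so there is nothing to rethrow.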

27:07 Okay, so it's easy enough to do.

27:10 Like for example, they have this function that they call that cause an error and all you see when this crashes is there's a trace back.

27:18 You know, this file, this line and this module, here's the error message, right?

27:22 Instead of the huge stack trace that might scare people.

27:25 Okay, so I mean, obviously you can use try and except, but this is global, right?

27:29 So even if some library's calling something and you're not catching it and like, right, it's catching everywhere.

27:35 Okay, so then you could do more work about breaking that apart, and they talk about doing that, but the real interesting part is if you go and look at some options.

27:44 So there are five, I believe there are five libraries mentioned in here that do really cool stuff for solving this.

27:49 The first one is Will McGugan's Rich library.

27:54 So you can just go from rich.traceback import install, and then say install(show_locals=True), and then this also basically installs one of those global exception hooks.

28:05 But with the benefit being, when you get the errors, what you get is a nice rich output.

28:11 - It's super pretty.

28:12 - It's pretty and it's useful.

28:14 I mean, it's color highlighted so you can see where the error happened, but it also will print out in a really nice way with formatting and highlighting the locals, right?

28:23 So, well, what values were passed to that function when it's crashed?

28:27 Well, here's a little table of those and so on.

28:29 So this is really easy to identify and at the very bottom, like a nice clear way to like, Okay, what happened?

28:35 So you can do this super simple version here.

28:38 There's also some manual ways to make rich print this type of stuff.
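
Written out, the simple version looks like this. The import guard is an addition for this sketch, so that it still runs where Rich is not installed; with Rich available, the two-line import and install call are all that is needed.

```python
try:
    from rich.traceback import install
except ImportError:
    # Rich is a third-party package (pip install rich); skip if it is missing.
    install = None

if install is not None:
    # Installs Rich's handler as the global exception hook and
    # includes local variables for each frame in the output.
    install(show_locals=True)
```

After this runs, any uncaught exception is rendered by Rich instead of the default traceback printer.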

28:42 Number two is better exceptions, which does similar stuff.

28:47 You can see that it doesn't quite take over how the look and feel is so much, but it basically colorizes the standard look and feel of errors.

28:55 So you can see, you know, which function, which error and so on.

28:59 So that's pretty good.

28:59 And there's pretty errors.

29:01 Check out pretty errors.

29:02 This looks pretty good, right?

29:04 It's got a lot of like bold and highlights.

29:07 You can really call out the error messages and the functions involved and the modules involved.

29:12 Here's one for you, Dean, the built-in one to IPython.

29:15 It has ultratb, for ultra traceback.

29:20 And this is pretty nice, right?

29:21 Actually, the IPython one's pretty good.

29:23 - Yeah, the IPython one is really nice.

29:26 And also I was planning to talk about it in the extras, but in IPython 8, which is pretty new, they even have this improved with some coloring of exactly where the error happened.

29:38 I think this uses the 310 part or something like that.

29:42 - Oh, awesome.

29:43 Yeah, that's cool.

29:44 We'll hear more about it when we talk about IPython 8 as well.

29:47 Cool.

29:48 Yeah, so that's built in, kind of if you're already on the data science stack.

29:52 And then finally, stack printer, which you can give it a trace back and it will print that out.

29:57 So you can sort of do like rich, you can say set exception hook and give it a theme like dark or whatever.

30:03 And then it does this pretty nice printout as well.

30:06 So these are all great.

30:07 I'm personally liking the rich tracebacks version best, but this is really nice.

30:13 Yeah, Connor out there in the audience says, wow, using show_locals=True would have saved me hours and hours of time.

30:19 And you and me both.

30:20 - Yeah, I feel the pain.

30:21 - I do too.

30:23 So because a lot of times you're like, I know it crashed and it says NoneType does not have attribute whatever, but like, why is it None? I need to go back three levels, right? Like, yes, so good.

30:34 - And then you find out you just forgot to return from the function.

30:38 - Yes, exactly.

30:40 - I was just debugging a test failure the other day, and pytest has an option to show locals with every failure. And I forgot that the particular thing I was testing had variables that were storing 1,000 element arrays.

31:06 It just went on for.

31:08 >> I believe Rich has a truncate variables, where it'll do an ellipsis or something like that.

31:17 >> That's nice.

31:17 >> I think. I mean, yeah, I'm not a 100 percent sure because I've been looking at all five of these today.

31:21 >> Will's in the chat. We'll have to ask him.

31:22 >> Will's in the chat. You'll have to give us a shout out, Will.

31:25 I think truncate is out there, right?

31:27 I'm not 100% sure.

31:28 - I think of how can I actually, so I talk with databases and sometimes the errors from the databases are like this big Java trace and then you need to like, a lot of go, a lot of apps, sorry, something, some noise here, sorry.

31:47 You need to get a lot up in the browser to actually see the error.

31:51 And if I could just shut it down and just give me the Python stuff.

31:57 - Yeah, I don't know what setting you set for that, but certainly with this mechanism, you could set it up so that if the word Java appears, you just stop.

32:06 (laughing)

32:07 You just stop going back.

32:09 And Will says, yes, that's right.

32:10 Thank you, Brian, for pulling that up.

32:12 Yeah, you can truncate it so the printing won't go completely insane.

32:16 Because it could be gigabytes.

32:17 I mean, it could be out of control, right?

32:19 - Yeah, but even if they have a reasonably large limit, sometimes it's just like, oh, I forgot that huge array was there, and it's hard to see stuff.
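The truncation idea can be sketched with the standard library's `reprlib`, which shortens large containers with an ellipsis. This is a stand-in to show the effect, not Rich's own option; the conversation confirms Rich can truncate but does not name the setting, and the limit used here is an arbitrary choice for the example.

```python
import reprlib

# A "forgotten" 1,000-element list like the one described above.
big_values = list(range(1000))

# Configure a Repr object that shows at most five list items.
short = reprlib.Repr()
short.maxlist = 5

truncated = short.repr(big_values)
print(truncated)             # [0, 1, 2, 3, 4, ...]
print(len(repr(big_values)))  # the untruncated repr is thousands of characters
```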

32:28 - Yeah, absolutely.

32:30 - Yeah.

32:31 - All right, over to you Dean.

32:33 Speaking of testing, Brian was talking about testing stuff and looking at the color and so on.

32:37 - Yeah, so I thought Brian, this would be up your alley.

32:40 So it's called "Ways I Use Testing as a Data Scientist." It's by Peter Baumgartner.

32:47 And I'm a data scientist, but I also love testing.

32:51 The thing about testing with data science is sometimes it's not that clear what you should test for, right?

32:58 Because some things we do are stochastic, and then you could not actually test for stuff or stuff like that.

33:06 So this blog talks about the art of testing, because sometimes it's not clear what you should test, and the more experience you get, you can actually see what's coming your way.

33:19 And it talks about data validation, and he is throwing many packages that could help you, packages like Pandera and Great Expectations that I think we've talked about before in the podcast.

33:34 And also, like, NumPy has some stuff, like isclose, which checks for two numbers that are close to each other, or array_equal, and assert_frame_equal in pandas for data frames.

33:47 So he talks a lot about that.
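A short sketch of the NumPy checks mentioned here, assuming NumPy is installed. The values are invented for illustration; `np.isclose`, `np.allclose`, and `np.array_equal` are existing NumPy functions, and `pandas.testing.assert_frame_equal` plays the comparable role for data frames.

```python
import numpy as np

# Floating-point arithmetic rarely lands exactly on the "expected" number.
total = 0.1 + 0.2
exact_match = total == 0.3                   # exact comparison
close_match = bool(np.isclose(total, 0.3))   # tolerance-based comparison

# The same idea for arrays: a tiny perturbation breaks exact equality.
x = np.array([1.0, 2.0, 3.0])
y = x + 1e-9
arrays_identical = np.array_equal(x, y)
arrays_close = np.allclose(x, y)

print(exact_match, close_match, arrays_identical, arrays_close)
```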

33:49 He also talks about using assert in your code.

33:52 Like, even if you had some ad hoc stuff of analysis, use assert within the code.

33:58 Don't think about the tests later.

34:00 Just think, like, where could this thing hurt me?

34:03 He gives an example.

34:05 Maybe if I'm trying to join two data frames and I think they have the same shape, I want to check if they have the same IDs, so that way I know that the join works correctly.

34:16 So he asserts that the length of the IDs is the same within the two data frames.
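The inline-assert idea might look like the sketch below. To keep it dependency-free it uses plain lists of dictionaries instead of pandas data frames, and the table contents and column names are made up; the point is only the asserts placed right before the join.

```python
# Two small "tables" that are supposed to describe the same people.
left = [{"id": 1, "age": 31}, {"id": 2, "age": 45}, {"id": 3, "age": 27}]
right = [{"id": 1, "score": 0.8}, {"id": 2, "score": 0.4}, {"id": 3, "score": 0.9}]

left_ids = [row["id"] for row in left]
right_ids = [row["id"] for row in right]

# Ad hoc checks written directly in the analysis code, no test framework needed.
assert len(left_ids) == len(right_ids), "tables have a different number of IDs"
assert set(left_ids) == set(right_ids), "tables do not contain the same IDs"

# Only join once the assumptions above have held.
right_by_id = {row["id"]: row for row in right}
joined = [{**row, **right_by_id[row["id"]]} for row in left]
print(joined)
```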

34:22 And this is not even like real testing, we would say.

34:25 He doesn't use some testing framework. He just says, "Write it within your code." And then continues to Hypothesis, which basically bombards the functions with a lot of ways to actually try to fail it.

34:41 It continues with some other packages, and it eventually goes into pytest and shows how it would work with pytest and with an approach that I haven't heard of, but it sounds good.

34:55 Arrange, act, assert.

34:58 Arrange the data, then act on the thing you want to check, and then just assert if they are equal or almost equal in the thing you want it to check for.
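A minimal arrange, act, assert sketch in the shape pytest would collect. The `mean` helper and the numbers are hypothetical; the comparison uses `math.isclose` from the standard library so that tiny floating-point differences do not fail the test.

```python
import math


def mean(values):
    """Hypothetical function under test."""
    return sum(values) / len(values)


def test_mean_of_fractions():
    # Arrange: set up the input data.
    values = [0.1, 0.2, 0.3]

    # Act: call the one thing being checked.
    result = mean(values)

    # Assert: compare with a tolerance rather than with ==.
    assert math.isclose(result, 0.2, rel_tol=1e-9)


# pytest would discover and run this automatically; calling it directly also works.
test_mean_of_fractions()
```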

35:09 - Yeah, it's such a easy mistake to make, like this number equal, equal that number.

35:15 - Yep.

35:16 - And it's, but when you're doing science or data science.

35:19 - I'm glad he talks about structure because a lot of people that get into testing get these giant tests that do a little work, test something, do a little more work, test something.

35:29 And then if it breaks, you're not sure where the failure is.

35:33 So this looks, sounds fascinating.

35:36 And actually I'm not sure how I missed it, but I really want a way to compare an array for almost equal.

35:42 So I'm gonna have to go read that.

35:45 - Yeah, so NumPy and Pandas both have mechanisms for that.

35:49 It's pretty great.

35:50 - Nice, cool.

35:52 - Yeah, very nice.

35:54 I know this will be helpful to people.

35:55 It's really, I always wonder about testing data science stuff and machine learning things and so on, where you get small perturbations, but they're fine, right?

36:04 it's off by one millionth of some unit, but like, that's totally good.

36:10 Those are equal, but it's, it takes, I think, an extra level of thinking about it.

36:14 So many people focus on, but how do you get rid of your dependencies?

36:17 And how do you make sure that you don't talk to the real database when you do this?

36:20 So it's, right.

36:21 And that's one aspect that people focus on, but this working with like sciency type stuff is its own specialty.

36:28 - Yeah, I think that the entire community is, it's a fairly new community, although it's not as new as it was.

36:35 And I'm not sure like we're on top of how to do tests in machine learning.

36:40 Like many, we have many packages for that.

36:43 We have many theories for that, but I'm not sure like that we have like actually one solid good way and maybe we shouldn't have, but it's a debate.

36:54 - Yeah, for sure.

36:56 - Same with the rest of the software world.

36:58 So welcome.

36:59 (all laughing)

37:02 - Thanks.

37:02 - Yeah, and Sam out in the live stream says, "NumPy has an assert_array_almost_equal in numpy.testing." - Nice.

37:09 - I just learned there's a numpy.testing, that's cool.
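For comparing arrays for "almost equal", a sketch using `numpy.testing`, assuming NumPy is installed. The arrays and tolerances are illustrative; `assert_array_almost_equal` is the function Sam mentions, and `assert_allclose` is a sibling in the same module that takes relative and absolute tolerances.

```python
import numpy as np
from numpy.testing import assert_allclose, assert_array_almost_equal

expected = np.array([1.0, 2.0, 3.0])
actual = expected + 1e-7  # a small perturbation that should be acceptable

# Both checks pass silently when the arrays agree within tolerance.
assert_array_almost_equal(actual, expected, decimal=6)
assert_allclose(actual, expected, rtol=1e-6)

# A larger difference raises AssertionError with a report of the mismatch.
mismatch_detected = False
try:
    assert_allclose(expected + 0.01, expected, rtol=1e-6)
except AssertionError:
    mismatch_detected = True

print(mismatch_detected)
```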

37:11 - Yeah.

37:12 - Awesome, all right, Dean, while you have your screen up, do you have any extras you wanna talk about?

37:18 I know IPython 8 was a thing.

37:20 - Yeah, so IPython 8 was released like last month after three years of waiting for a major version.

37:27 It has a lot of new features, but this is the extra part, so I won't go over them. Just two and a half things I wanted to mention.

37:35 It says that it's less code, and I love that.

37:38 Once you get better in a programming language, you understand that you shouldn't write more code, you should delete code.

37:45 And that's what those guys do.

37:48 And the way they could have done that is by hiring a person through the NumFOCUS Small Development Grants.

37:56 And I think this is important.

37:57 It's actually been talked a lot about after the Log4j stuff.

38:02 It's been talked about like, well, those are three guys who worked tirelessly.

38:06 They have their full-time jobs and they couldn't fix the Log4j stuff maybe as quickly as some other people wanted.

38:13 But then you realize that they got donations of like a few hundred dollars within 10 years.

38:19 And then after the Log4j, suddenly they got thousands.

38:22 So this, I think, shows you how, like, money could help open source stuff.

38:28 And maybe if you use some package in a company, in some corporate, maybe try and think how you can give back money.

38:36 Or even if you give back code, if you free up your developers to actually contribute.

38:41 This is awesome.

38:43 And the half thing just mentioned, because it talks about the tracebacks, it shows that you can now see it's colored.

38:53 You can see on the screen.

38:54 It's colored, the part where the error actually was, it's colored now.

39:00 So it's very nice to see, like the example shows you, you add the function three times, but it fails on just one of the inputs.

39:10 So it shows you which of the three times the function failed.

39:14 - Right, you call it the same thing like foo of zero plus foo of one plus foo of two.

39:19 And it's the middle one that failed, not just line seven, but the second invocation with the value one where it failed, which that's awesome.
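The situation described here can be reproduced with a few lines. This sketch only sets up the failing expression and prints the standard traceback; the body of `foo` is invented so that just the middle call fails. How precisely the failing call is highlighted depends on the tool and version (IPython 8 colors it, and newer CPython releases mark the failing sub-expression), so no exact output is shown.

```python
import traceback


def foo(x):
    # Hypothetical function that fails for exactly one input, x == 1.
    return 1 / (x - 1)


def total():
    # Three calls on one line: a line number alone cannot say which one broke.
    return foo(0) + foo(1) + foo(2)


trace_text = ""
try:
    total()
except ZeroDivisionError:
    trace_text = traceback.format_exc()

print(trace_text)
```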

39:26 - Yeah, exactly.

39:27 And, well, sorry.

39:30 - I was gonna say, the same thing for indexing into, what is that, a data frame or something like that.

39:34 Like, you're chaining together like bracket zero, bracket one, bracket zero, it's the second one.

39:40 Trying to get to the one of zero, that was the one that failed there.

39:43 That's really, those are hard to come back and find if you're not in a debugger.

39:48 Like, well, which one of these failed?

39:49 Like, great, array index out of bounds on line three.

39:53 Well, there's three of those happening, which one?

39:56 - Yeah. - Yeah, that's cool.

39:57 - And another thing is a tweet by Victor Stinner, who is a core dev.

40:02 And he says, I mean, it's now time to deprecate the standard lib urllib module.

40:08 And this has brought a lot of haters and fans.

40:13 And I'm not sure what's my opinion yet.

40:15 I'm not a heavy user of urllib.

40:19 But it opened up a debate, like we know how to do.

40:22 - Yeah, that's really interesting.

40:26 There are certain things in the standard library you're like, yeah, yeah, I know what that's there and you could use it, but you probably shouldn't use it.

40:31 There's like so many better external choices that are so good that it would be kind of silly to bite them, right?

40:37 That's sort of the recommendation here.

40:39 - Yeah, but also some people don't like it.

40:42 They have people there that say, they hate dependencies and sometimes you can do most of the work with the standard lib.

40:51 And some of the tweets said, like, maybe deprecate the major parts that requests can do, but there are some other parts that are actually really needed.

41:01 So maybe deprecate half of it.

41:03 - Yeah, I'm not sure if I'm about deprecating it, but, you know, it's one thing to say there are better choices and we as a community recommend you probably just don't use this, but to deprecate it means to people who would rather go with a dependence, a lower level of dependencies, you're giving them warnings that they shouldn't be doing this when maybe it's unlikely it's gonna actually vanish, right?

41:27 - There's like a fallacy though that I think some people have that if they don't have dependency and it's in the standard, they're using something in the standard library, it's more solid.

41:38 But I don't know if there's that many people working on urllib right now.

41:43 And some of the other parts that maybe people want to stop supporting.

41:50 There's a, that's something very valid.

41:51 Python still is an open source project and we can make those decisions.

41:56 - Yeah, Victor actually says there are four-year-old security issues in urllib.

42:01 So maybe it's better to use something outside of it.

42:05 - Yeah.

42:06 - Yeah, people want it to stay, but there's these issues.

42:10 Yeah, I wonder if there's a way to go, well, let's look at some of the libraries that are out there try to bring them in and just use their core to replicate that functionality.

42:19 Not to say, you know, like, you could, let's just pick on requests.

42:21 Like, bring requests in, like, vendor a little bit of it in so it does what urllib does.

42:27 And just go look, okay, this is the latest, greatest that we got and everyone's been looking at requests already.

42:32 I don't know, could be interesting.

42:35 Yeah.

42:35 And then Brandon out in the audience points out, there are also maybe environments where you can't install dependencies for security reasons.

42:41 And so having things like urllib allows you to do more with Python in those situations.

42:46 - But if there's security problems with urllib, yeah, anyway.

42:50 (Brian laughs)

42:51 - Yeah, just in some of the functions, you don't call those, no, I'm just kidding.

42:55 All right, Brian, how about you, extras?

42:57 - Just one extra, I brought this up last week.

43:00 I'm currently not writing a book.

43:02 So-- - Yay!

43:04 - So I want to write more blog posts.

43:06 So one of the things I wanted to make sure that my blog, I migrated to pythontest.com and now it has a blog setting.

43:16 And I-- - I like it, looks pretty too.

43:19 - Instead of just pulling everything over from my old WordPress blog, I'm trying to edit it.

43:25 So I'm up through 2012.

43:29 I'm gonna go oldest to newest and gradually do things, bring things in.

43:35 So that's one of my side projects I'm working on.

43:38 - Yeah, that's a great side project.

43:40 Nice.

43:41 What's that running on?

43:42 like some static site generator or other hosted thing.

43:45 - It's Hugo hosted by a free Netlify account.

43:49 - Yeah, Netlify's pretty awesome.

43:51 All right, I got a couple things.

43:52 I wanna give a quick shout out to, yeah, Brandon has the same question, but we got it.

43:58 All right, first of all, I have two new, my Python shorts, two new versions, two videos from there.

44:03 I got beyond the list comprehension, so basically set and dictionary comprehensions, fun stuff.

44:07 - Nice picture. - There.

44:09 Thank you.

44:10 and it's like just a screenshot out of an animation.

44:13 And then combining dictionaries, the Python 3.10 way is the title of the article.

44:17 It really should be 3.9, but I kind of want to communicate like if you're on the latest Python, how should you be doing it?

44:23 It came out in 3.9, the features that are actually in there.

44:26 Anyway, the pipe stuff, dictionary one, pipe, dictionary two, pipe, dictionary three, which is all fun.
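The pipe syntax mentioned here is the dictionary union operator added in Python 3.9. A small sketch with made-up configuration dictionaries; when keys collide, the value from the right-most dictionary wins.

```python
# Three hypothetical layers of settings, from least to most specific.
defaults = {"host": "localhost", "port": 8000}
env = {"port": 9000}
cli = {"debug": True}

# dictionary one | dictionary two | dictionary three (Python 3.9+)
merged = defaults | env | cli
print(merged)  # {'host': 'localhost', 'port': 9000, 'debug': True}

# The in-place form updates an existing dictionary.
settings = dict(defaults)
settings |= cli
print(settings)  # {'host': 'localhost', 'port': 8000, 'debug': True}
```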

44:32 And then I wanted to talk about a feature over on pypi.org.

44:35 I don't even know how I found this.

44:37 Probably just like an accident, like bump the keyboard or something.

44:39 But if I'm over here and you just want to search for something forward slash.

44:43 Now you can search.

44:44 What?

44:45 So now I have Vim in the browser.

44:47 Exactly.

44:49 So if you were on pypi.org and you want to search, forward slash.

44:53 Yes.

44:54 So that's pretty cool.

44:55 Yep.

44:56 All right.

44:56 That's it for the extras.

44:58 Nice.

44:59 I don't even remember what my joke is, so that's good.

45:01 That'd be fine.

45:02 You're ready?

45:06 Yeah.

45:06 All right.

45:07 Yeah, here we go.

45:09 Oh yeah, this is another one of these sort of like frustration type of things, that's great.

45:14 This comes from the Programming Humor, Twitter account, you know, twitter.com/programminghumor, which is, there's a lot of good stuff in there.

45:21 Some that I really liked, I didn't want to necessarily put on the show, but this one is developers really frustrated that they're sucking in on their lips, they're pulling on their cheeks, they're going, "Oh, I hate this job.

45:32 "I hate my life.

45:33 "Why is this happening to me?

45:36 Never mind, I misspelled a variable.

45:38 [laughter]

45:40 Good to go.

45:42 Yeah, linting is good.

45:44 Indeed, indeed.

45:46 If you just flip through the programming humor one, it's pretty good.

45:52 This 8-year-old is learning Python after dealing with the syntax bug, she asks, "If the computer knows it's missing a semicolon here, why won't it add it itself?" I don't know. I really don't know.

46:02 Yeah.

46:04 >> Yeah, and so he follows up and says what he meant, he meant colon, not semicolon.

46:10 But so many people are like, "Semicolon? We're using semicolon for Python." >> Exactly. There are uses.

46:18 They're rare though. All right.

46:19 Well, fantastic.

46:22 >> That last one.

46:25 >> See?

46:27 >> Yeah.

46:28 >> It shall not be spoken, but it's good, right?

46:29 >> Yeah.

46:29 >> Okay. There's a lot of good stuff.

46:31 I recommend people go flip through that Twitter account.

46:34 - Nice.

46:35 - Brian, thank you as always.

46:36 It's good to be back with you.

46:37 - It's good to be back.

46:39 - And Dean, thanks for coming on this side of the presentation and joining us for the show.

46:44 - Thanks for having me.

46:46 - Thanks for listening to Python Bytes.

46:48 Follow the show on Twitter via @PythonBytes.

46:51 That's Python Bytes as in B-Y-T-E-S.

46:54 Get the full show notes over at PythonBytes.fm.

46:57 If you have a news item we should cover, just visit pythonbytes.fm and click Submit in the nav bar. We're always on the lookout for sharing something cool. If you want to join us for the live recording, just visit the website and click live stream to get notified of when our next episode goes live. That's usually happening at noon Pacific on Wednesdays over at YouTube. On behalf of myself and Brian Okken, this is Michael Kennedy. Thank you for listening and sharing this podcast with your friends and colleagues.
