Transcript #180: Transactional file IO with Python and safer
00:00 Hello and welcome to Python Bytes, where we deliver news and headlines directly to your earbuds.
00:04 This is episode 180, I can't believe it, recorded April 29th, 2020. It's almost May Day.
00:11 I am Brian Okken.
00:13 And I'm Michael Kennedy.
00:14 And this episode is brought to you by DigitalOcean, and we'll talk more about them a little later.
00:19 Well, I have some very timely news, because not very long ago, a couple days ago, Ubuntu 20.04 is out.
00:27 What?
00:29 That's cool, right?
00:29 Yeah, the new Ubuntu.
00:31 And it's, why is this big news?
00:33 Well, there's a lot of releases of Ubuntu and whatnot.
00:36 But this is the first new LTS long-term support version in two years.
00:42 So basically, this is the first real production-grade version of Ubuntu that's been out in two years.
00:49 So that's a big deal, I think.
00:50 Oh, yeah, really big deal.
00:51 And it's got something special in it.
00:54 It does.
00:55 It hates legacy Python, but it loves modern Python.
00:58 So one of the things that's bugged me about 18.04, which is what I've been using for production, is it was stuck on 3.6.
01:05 I mean, imagine April 2018.
01:08 It's using Python 3.6.
01:09 It didn't change.
01:10 Well, what's the current version now?
01:13 Well, that's Python 3.8.
01:14 Sadly, 3.9 is going to be out really soon.
01:18 And this is 3.8.
01:19 But nonetheless, hey, 3.8 is awesome.
01:22 It has a bunch of cool new features that we can use.
01:25 And yeah, it comes included.
01:27 I don't even think you have to pip install Python.
01:29 I think 3.8 is already there.
01:30 That's really cool.
01:31 Yeah.
01:31 And to get legacy Python, you can get it, but you have to go like apt install it explicitly to say, no, no, I want the old one.
01:37 Yeah.
01:37 Python 3.8.
01:39 Automatic.
01:40 It is now.
01:41 F-strings everywhere.
01:41 That's right.
01:42 Like, it's time to hit all your code with flynt and auto f-string all the things.
01:47 So I upgraded all of the servers for Python Bytes.
01:51 The servers are pretty small and simple.
01:53 But if you look at all the stuff that I'm running, there's actually a ton of servers.
01:57 And I actually talked about that with Dan Bader on Talk Python episode 215.
02:02 So if people really want to look at what we're doing, what I'm doing here, what we're doing for Python Bytes
02:07 in terms of infrastructure, they could do that.
02:09 But upgraded a bunch of servers to 20.04.
02:12 There's a bunch of stuff that kind of built up cruft.
02:15 And I'm like, oh, we could do this way better.
02:17 Redid all that stuff over the weekend.
02:18 And so now everything's on the shiny new versions of Python 3.8 and Ubuntu 20.04.
02:24 And it went really well for me.
02:25 So that's great.
02:26 Yeah.
02:27 And the kernel has been upgraded to 5.4, which adds support for WireGuard VPNs, better support for like Raspberry Pis and Intel and AMD hardware.
02:36 New version of GNOME.
02:39 You can install the desktop on top of the ZFS file system if you care about that.
02:43 And you talked about DigitalOcean at the top.
02:46 You can go to DigitalOcean right now and just check off.
02:49 I want a new 20.04 droplet.
02:52 Boom.
02:53 Off it goes.
02:53 That's how I got ours.
02:54 Oh, that's great.
02:55 Nice.
02:55 Yeah.
02:56 And actually, we've already had the kernel upgraded.
02:59 The 5.4 kernel upgraded to like a new version.
03:02 I just had to like apply some patches.
03:03 So I guess it's pretty active.
03:06 But here you go.
03:07 So yeah, different topic.
03:09 But for our servers, you have to like pay attention to kernel upgrades and stuff?
03:13 Yeah.
03:13 Apparently, yeah.
03:14 They're not like on a platform as a service type of thing.
03:17 It's not a big deal.
03:17 It's pretty regularly like once a month or whatever.
03:20 I'm usually logged in to one of them every couple of days doing something.
03:24 And it'll say, oh, like either it's already applied.
03:27 It says you got a reboot.
03:27 It'll be like obvious that there's an update when you log in.
03:30 And then I'm like, oh, I should just run that thing that upgrades all of them.
03:34 Okay.
03:35 Once I notice it.
03:36 Yeah.
03:36 So pretty much.
03:37 Yeah.
03:37 Okay.
03:38 Neat.
03:38 Yeah.
03:38 Okay.
03:39 Well, I'm going to switch hats.
03:40 So I want to talk about warnings.
03:44 So warning, I'm going to switch a hat.
03:45 And so Reuven Lerner is a friend of the show and great guy, teaches Python.
03:50 And he wrote an article called Working with Warnings in Python.
03:53 And I like this because I don't think we've talked about warnings much.
03:57 No, not much at all, actually.
03:58 It's a good introduction, but he talks about exceptions and the class hierarchy and printouts
04:03 and stuff.
04:03 But if you want to like, if something goes wrong, you kind of want, you've got options
04:08 of like printing out to the user or throwing an exception, but you also have warnings.
04:12 And how should you treat those?
04:15 And I love what he wrote.
04:16 He said, most of the time, warnings are aimed at developers rather than users.
04:22 Warnings in Python are sort of like the service needed light on a car.
04:25 The user might know that something is wrong, but only a qualified repair person will know
04:30 what to do.
04:31 Developers should avoid showing warnings to end users.
04:35 But one of the things that the warning system is used for is deprecation warnings.
04:40 A lot of projects do this where they kind of want to get rid of a feature so they can
04:44 refactor some stuff, and/or it just doesn't fit in the API very well.
04:49 So they'll deprecate it and they'll issue a deprecation warning when somebody uses it.
04:54 So it's an alert.
04:55 It doesn't stop working, but it alerts people to that.
04:58 There's a warning.
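The deprecation pattern described here can be sketched with the standard library's `warnings` module. This is a minimal illustration rather than code from the article: `old_total` and `new_total` are hypothetical names, and the `catch_warnings(record=True)` block is only there so the warning can be inspected instead of printed.

```python
import warnings


def new_total(values):
    """Preferred replacement (hypothetical name)."""
    return sum(values)


def old_total(values):
    """Deprecated (hypothetical) function: it still works, but it warns."""
    warnings.warn(
        "old_total() is deprecated; use new_total() instead",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller's line
    )
    return new_total(values)


# Record warnings instead of printing them so they can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = old_total([1, 2, 3])

print(result, caught[0].category.__name__, caught[0].message)
```

The call still returns its result; the warning is a side channel aimed at the developer calling the old function.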
04:59 One of the things I love about warnings is by default, pytest turns on warnings.
05:05 And so you can see those, and you can also configure pytest so that it fails on warnings.
05:10 So this is a good thing to pay attention to, but it doesn't stop your project.
05:15 That's cool.
05:16 I didn't know you can make pytest, like observe and use the warnings as an error.
05:21 Yeah.
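Treating warnings as errors comes down to a warning filter with the `"error"` action. pytest exposes this through its `-W error` command-line option and the `filterwarnings = error` ini setting; the sketch below shows the same idea with only the standard library, using a hypothetical `legacy_call` function.

```python
import warnings


def legacy_call():
    """Hypothetical function that emits a deprecation warning."""
    warnings.warn("legacy_call() is deprecated", DeprecationWarning)
    return "still works"


with warnings.catch_warnings():
    # With the "error" action, a matching warning is raised as an exception.
    warnings.simplefilter("error")
    try:
        legacy_call()
        outcome = "no error"
    except DeprecationWarning as exc:
        outcome = f"raised: {exc}"

print(outcome)
```

Outside the `catch_warnings()` block the previous filters are restored, so the stricter behavior stays local to the check.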
05:21 The warning system gives you a whole bunch of stuff.
05:23 Python's warning system, it treats warnings as separate types of output so that we don't
05:28 confuse them with either exceptions or printed text.
05:31 It lets us indicate what kind of warning we're sending the user.
05:34 So we have different types.
05:35 It's like the exception hierarchy.
05:37 You can have a warning.
05:38 There's a warning hierarchy.
05:39 You can create your own and you can filter on them so that you can screen out stuff that
05:45 you don't care about, selectively fix things.
05:48 Anyway, it's a very powerful system.
05:50 People should use it when they need it.
05:52 The article goes on to give specifics on the syntax of how to use them, how to create
05:57 custom warnings, and how to filter on them.
06:00 And it's a good intro.
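A custom warning hierarchy and a filter on it might look like the following sketch. The class names `ConfigWarning` and `SlowPathWarning` and the `load_settings` function are hypothetical, chosen only to show that a filter can silence one category while its parent category still gets through.

```python
import warnings


class ConfigWarning(UserWarning):
    """Hypothetical base class for an application's warnings."""


class SlowPathWarning(ConfigWarning):
    """Hypothetical, more specific warning category."""


def load_settings():
    warnings.warn("cache disabled, using slow path", SlowPathWarning)
    warnings.warn("unknown option 'colour'", ConfigWarning)
    return {}


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # show everything by default...
    # ...except this one category; later filters take precedence.
    warnings.filterwarnings("ignore", category=SlowPathWarning)
    load_settings()

messages = [str(w.message) for w in caught]
print(messages)
```

Only the `ConfigWarning` message is recorded, because the `ignore` filter matches `SlowPathWarning` but not its parent class.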
06:01 Yeah, this looks super interesting and like something I should be paying more attention to
06:05 than I have so far.
06:06 Is this something I'm not really using?
06:07 I'm more a consumer of warnings.
06:09 I'm like, oh, that library, it started issuing warnings about something.
06:13 And sometimes it's really frustrating because it's like the library being used by the library
06:19 I'm actually trying to use is doing something wrong.
06:22 It says, well, this is going to be deprecated and now you've got to do this.
06:25 I'm like, well, but I'm not doing that.
06:27 I don't want to see this.
06:28 But nonetheless, it looks like way simpler than maybe.
06:32 I just haven't looked at it.
06:32 It looks great.
06:33 So I should use this more.
06:34 One of the cool use cases that I heard recently is using pytest warnings or pytest's knowledge
06:40 of warnings and testing your system when you're upgrading Python so that you can say,
06:45 oh, when we're, because Python will deprecate things too.
06:49 And then you can have a heads up that you need to start fixing your code because it'll
06:55 pinpoint you exactly.
06:56 It's kind of like the exception system.
06:57 It tells you exactly where it's coming from.
06:59 So yeah, that is really nice.
07:01 Do you want to know something else that's nice?
07:02 DigitalOcean.
07:03 DigitalOcean is very nice.
07:05 And DigitalOcean just launched their virtual private cloud or VPC system and new trust platform.
07:14 Ooh, a trust platform.
07:15 Together, these make it easier to architect and run serious business applications with
07:19 even stronger security and confidence.
07:22 VPC allows you to create multiple private networks for your account or your team instead of having
07:29 just one private network.
07:30 DigitalOcean can auto-generate your private networks, IP address range, or you can specify
07:37 your own IPs.
07:39 You can now configure droplets to have, to behave as internet gateways.
07:44 That's cool.
07:44 Yeah.
07:45 It's like your own little baby internet.
07:46 Yeah.
07:47 That's neat.
07:47 And a trust platform is a new microsite that provides one place to get all your security
07:53 and privacy questions answered and download our available security certifications.
07:59 DigitalOcean is your trusted partner in the cloud.
08:01 Visit pythonbytes.fm/DigitalOcean to get $100 credit for new users to build something
08:08 awesome.
08:08 Yeah.
08:08 We love it.
08:09 Like I just said at the outset, put Ubuntu 20.04 on there, and it's been working great
08:14 for so many years.
08:14 Now, one thing that I ran across, there's a few little libraries that are so simple, and
08:20 yet when you come across them, you're like, oh yes, this is so cool.
08:24 One of those that I go on and on about is unsync, how that unifies all the different APIs that
08:31 do asynchronous programming, like asyncio, threaded stuff, multiprocessing stuff, and whatnot.
08:37 Right?
08:38 So this is one I think that kind of is like that.
08:41 It's not about unification, but it's about solving a problem in a way that's kind of transparent
08:46 to the user, but is really, really awesome because it just adds some nice durability to
08:52 your code.
08:52 So there's different levels of like exception handling if you look at it, right?
08:59 So if you look at code, there's probably like the beginner level that has no try except blocks
09:05 anywhere in the code.
09:06 It's just like, I don't know what you call it.
09:08 Is that optimistic programming?
09:09 Like I don't need to do error handling.
09:11 It's going to be fine.
09:12 Everything's fine.
09:13 This is fine.
09:13 That's one way.
09:15 The next level would be to say, okay, I'm going to have some exception handling.
09:19 I'm going to do a try, do a bunch of stuff, except handle the error, right?
09:24 That's good.
09:24 And maybe you're catching different errors.
09:26 Like maybe that's another level.
09:27 I don't know, I'm making these levels up a little bit, but even if you are catching
09:31 an error, something could have gone terribly, terribly wrong and corrupted your data along
09:37 the way.
09:37 So there's like durable error handling and there's, it isn't technically crashing at the moment
09:42 error handling, right?
09:43 So the durable error handling, I don't think a lot of people think about nearly as much.
09:48 So simple example is what you would maybe use a transaction for in a database is like,
09:53 I'm going to transfer money from this account to that account.
09:56 But what happens if the transfer to the second account fails?
10:00 I want to make sure I don't actually take the money from the first account, right?
10:03 Or I want to write some piece of data to a file.
10:07 So I'm going to open the file and I'm going to make sure there's a try except.
10:11 I'm going to put it in a with block.
10:12 So the file pointer gets closed.
10:13 Everything's going to be good.
10:14 I'm going to make one change and another change.
10:17 And then a third change, like write these three things to the file.
10:20 What if the exception happens after the second line?
10:23 You've half written to the file.
10:25 Now what?
10:26 I don't know.
10:27 Wow.
10:27 That's bad, right?
10:28 Yeah.
10:29 So there's all these ways in which, like you still have a try except.
10:31 You still catch it.
10:32 You still close the file pointer.
10:33 It doesn't matter.
10:34 It's corrupted, right?
10:35 So there's like this another level of error handling of like kind of treating memory and
10:40 files and whatnot as transaction, transactional type things, right?
10:44 If there's an error, they just go back the way they were.
10:47 And so this thing that this long winded introduction is about is called safer.
10:52 So a safer file writer.
10:54 And it's this cool, simple little thing.
10:58 Instead of saying with open file name as file pointer, you say with safer dot open.
11:04 File name as file pointer.
11:06 And then otherwise all your code is identical.
11:08 Okay.
11:08 Okay.
11:09 Here's what it actually does.
11:10 So as you write to the file pointer, it's writing to a temporary file behind the scenes.
11:16 And then if, you know, when you exit a with block, the exit, the dunder exit
11:22 takes whether or not there was an error on the way out the door.
11:26 So, you know, as you exit the with block, am I leaving because of a crash or am I leaving
11:31 because everything is cool and we're done?
11:35 So it uses that information to either throw away the temp file or move the temp file over
11:40 top the thing you thought you were writing on.
11:42 Oh.
11:42 Isn't that cool?
11:43 So if there's an exception in your with block, it still closes up the file pointer and everything,
11:48 but your data is unchanged.
11:49 It's kind of like a transaction with an auto rollback for files.
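The write-to-a-temp-file-then-swap idea can be sketched with the standard library alone. This is not safer's actual source; `atomic_open` is a hypothetical helper that follows the behavior described above: commit the temp file on a clean exit, discard it if the block raises.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def atomic_open(path, mode="w"):
    """Write to a temp file; replace `path` only if the block succeeds."""
    directory = os.path.dirname(os.path.abspath(path))
    # Keep the temp file next to the target so the final move stays on
    # one filesystem.
    fd, temp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, mode) as fp:
            yield fp
        os.replace(temp_path, path)  # success: swap the new file into place
    except BaseException:
        with contextlib.suppress(FileNotFoundError):
            os.remove(temp_path)  # failure: throw away the partial write
        raise


workdir = tempfile.mkdtemp()
target = os.path.join(workdir, "settings.txt")
with open(target, "w") as fp:
    fp.write("original\n")

try:
    with atomic_open(target) as fp:
        fp.write("first change\n")
        raise RuntimeError("simulated crash after the first write")
except RuntimeError:
    pass

with open(target) as fp:
    after_failure = fp.read()  # the original contents are untouched

with atomic_open(target) as fp:
    fp.write("new contents\n")
with open(target) as fp:
    after_success = fp.read()
```

With the library itself, the change described in the episode is swapping `open(...)` for `safer.open(...)` in the `with` statement and leaving the rest of the code alone.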
11:53 That's pretty cool.
11:54 Isn't that cool?
11:54 And it's like 28 lines of code that does that little bit.
11:57 Yeah.
11:57 Any idea what the time hit is?
11:59 It's got to be a little bit, but.
12:01 It's pretty small because it just uses shutil to replace the file.
12:04 Like it writes to the file just as you would write to the file.
12:07 Okay.
12:08 And then at the very end, it goes, move this file to this destination and overwrite.
12:12 So it's basically adds a file move, which in an SSD is like nothing, right?
12:17 Okay.
12:18 It doesn't matter how big it is.
12:19 It probably just like updates the, I don't know, like the table in the drive, whatever
12:24 that means.
12:25 Yeah.
12:26 Isn't that cool?
12:26 That is very cool.
12:27 I like it.
12:28 Yeah.
12:28 So it seems so easy to use.
12:31 It looks like something that might be worth looking at.
12:32 So I'm linking to a couple of things.
12:34 I'm linking to an article that introduces this.
12:36 And in the beginning, apparently there was like some edge case where something wasn't
12:41 working quite right.
12:42 If you passed like an integer representing a file handle or something funky like that, it
12:48 didn't deal with that.
12:48 Right.
12:49 So there's a, another, like an updated article that doesn't have all the motivation, but then
12:54 talks about this fix.
12:56 And there's also a GitHub repo and you can just pip install it.
12:58 So all those things are good.
12:59 And the final in this section, I'm linking to the actual 28 lines of code.
13:03 Do you have that open?
13:04 I did.
13:05 Click on that really quick.
13:06 Cause I want to talk about a couple of really interesting patterns here.
13:10 Like if you wanted to study 28 lines of code that took and brought together a bunch of interesting
13:15 ideas, like, Whoa, this is pretty crazy.
13:18 So it has a generator expression on an infinite sequence of numbers to find the temporary file,
13:27 which is pretty interesting because it just says, I'm going to call it dot one, dot two,
13:31 dot three.
13:32 And in case those exist, we're just going to go through all of them until one doesn't.
13:36 Isn't this crazy?
13:37 So that's pretty fun.
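The "generator expression over an infinite sequence" trick can be illustrated as follows. This is a sketch of the pattern, not the library's code; `first_free_name` is a hypothetical helper.

```python
import itertools
import os
import tempfile


def first_free_name(path):
    """Return path.1, path.2, ... whichever does not exist yet."""
    # itertools.count(1) never ends; next() stops at the first free name.
    candidates = (f"{path}.{n}" for n in itertools.count(1))
    return next(name for name in candidates if not os.path.exists(name))


workdir = tempfile.mkdtemp()
base = os.path.join(workdir, "data.txt")
open(base + ".1", "w").close()  # pretend .1 and .2 are already taken
open(base + ".2", "w").close()
chosen = first_free_name(base)
print(os.path.basename(chosen))
```

Because generators are lazy, only as many candidate names are built as it takes to find one that is unused.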
13:39 And that uses shutil to copy the file over, which is pretty cool.
13:43 It uses yield to automatically return the inner file pointer.
13:48 So when you say with thing as whatever, even though you said safer dot open, it actually
13:54 yields out the underlying pointer file pointer that came from open.
13:59 And there's just a bunch of different layers of, Oh, that's interesting.
14:02 Oh, that's neat.
14:03 Yeah.
14:03 Anyway, I think this is really clever.
14:05 And it seems like a cool little library.
14:08 The reason I think it would probably be useful and not going to give you a big hit.
14:13 It's like, this is literally it.
14:14 You can see it's creating the temp file.
14:15 It writes to the temp file and then it uses os.rename to move the temp file to the actual thing.
14:21 So, you know, not a whole lot of magic going on, but really quite useful.
14:25 I think not a lot of code either.
14:27 Yeah.
14:27 Just.
14:27 Yeah.
14:28 Isn't that crazy?
14:28 Yeah.
14:29 That's pretty cool.
14:29 I love it.
14:30 Useful gives you that sort of durable error handling, almost like transactional files.
14:35 And yet super simple.
14:36 Very good.
14:37 Uses unittest as its test runner, though.
14:39 Oh, my God.
14:40 Well, all right.
14:41 I retract all of my endorsements of this thing.
14:43 All right.
14:47 What's the next one?
14:48 Okay.
14:48 I'm on the other tab.
14:49 So did I distract you?
14:51 Oh, yeah, you did.
14:52 And new article, new hat.
14:54 So codespell.
14:56 So I got this from Christian Clauss. That silly little project I play with on the side called
15:03 Cards.
15:04 I got a pull request against the project to add a pre-commit hook to run codespell.
15:10 And I had never heard of codespell.
15:12 So I was excited to have a new topic for the podcast.
15:16 Also, it's just neat.
15:18 So codespell.
15:19 What it does is it fixes common misspellings in text files.
15:22 And specifically, it's designed primarily for checking misspelled words in source code.
15:27 But it can be used with other files as well.
15:31 When Christian applied this to the cards project, it noticed that in one of the documentation files I've got, one of the markdown files, I had spelled arguments with an extra U in the middle of it.
15:44 And one of the problems with spelling, I mean, it's embarrassing to do and distracting to have spelling errors in your code or your comments or anything.
15:55 It's hard to deal with because a lot of source code doesn't have, you can't just throw normal spell checkers at source code because it'll just, it'll warn you on your variable names and all sorts of stuff.
16:06 Right.
16:06 You can't drop it in Grammarly.
16:08 That's not going to go well.
16:09 It's not going to work.
16:10 But so I'm really excited to try this and to start using it because if it can work for just about anything, it might be able to work for, you know, non-Python programs as well.
16:21 Why not?
16:21 So yeah, it's pretty cool.
16:23 Yeah, all sorts of documentation.
16:24 That's cool.
16:24 It's an open source project.
16:25 The GitHub repo has the entire dictionary so you can scan through it.
16:29 And there's ways to ignore certain words if you're like, no, that's the correct spelling and it keeps doing stuff.
16:35 You can ignore it.
16:36 Nice.
16:37 Well, that's a really good one.
16:38 The most embarrassing misspelling I've ever done in code was I'd misspelled like a namespace or a class name or package name or something like that.
16:50 I can't remember quite where it was.
16:52 But it was on a project I had been working on for like a year and I misspelled it, but everything was autocomplete.
17:00 And so I don't care.
17:00 I'm like, da, da, da.
17:01 It is just like, okay, autocomplete.
17:02 I'm not even like ever typing that again.
17:04 Right.
17:05 Yeah.
17:05 I guess I just wasn't paying attention to like that.
17:07 I kind of suck at spelling.
17:08 That was like an extra bad case.
17:11 Some new person came on the team and said, dude, why is this misspelled all over the place?
17:16 And I'm like, oh, we got to fix it.
17:17 But it was like other applications depended on that library and they used the misspelling.
17:22 It was so bad because it was like it had become pervasive throughout like all these different things.
17:28 So like we may have to leave that misspelled.
17:30 I think we eventually fixed it.
17:32 But it was like it was quite a bit of work considering what it should have been.
17:36 That's funny.
17:37 That's awesome.
17:38 Well, at least you didn't have like both of the spellings be valid symbols in your program and mean completely different things.
17:45 That's true.
17:45 Yeah.
17:47 So one of the things that's awesome about this podcast is we'll say we'll find some random thing or maybe somebody will send it to us and we'll say, oh, did you even know that this was a thing?
17:57 I had never heard of this.
17:59 And then like five other people shoot us a message and say, yeah, and this variation or this other thing.
18:04 And that's cool.
18:05 But there's also X, Y and Z.
18:06 Right.
18:06 Isn't that awesome?
18:07 Yeah.
18:07 Yeah.
18:08 I learned so much by doing this.
18:10 Yes, I know.
18:10 We just got to throw something we vaguely know about and like people will correct us.
18:14 Yeah.
18:14 Awesome.
18:15 So, no, seriously, we talked about profilers and I talked about Scalene, how it was really nice and fast and it did memory profiling and all that.
18:25 Well, friend of the show, Anthony Shaw said, hey, since you're on this kick for profilers, have you heard about Austin?
18:31 To me, Austin is either a guy's name or a town in Texas.
18:35 I hadn't heard about Austin.
18:37 Have you?
18:37 I got a neighbor named Austin.
18:38 Yeah.
18:39 I don't think this is the same thing.
18:41 So this, like Scalene, is a frame stack sampler for CPython, meaning it doesn't have this huge effect where once you run it on your code, it becomes 10 times slower, as it would with instrumenting.
18:54 You know, it just asks like, hey, what are you up to really quickly?
18:58 So, that's cool.
18:59 It's nice and fast.
19:00 It also is just pure C code.
19:02 There's no real dependencies like other than like the C runtime, which is in all the operating systems.
19:07 So, it looks at running Python code at intervals and then it dumps out whatever it finds, which is cool.
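The sampling idea (look at the running code at intervals and record what it finds) can be sketched in pure Python. This is only an in-process illustration and not how Austin works: Austin is a separate C program that inspects the interpreter from outside. The names `sample_thread` and `busy_work` are hypothetical, and the semicolon-joined stack is just a convenient text form for feeding a flame graph tool.

```python
import collections
import sys
import threading
import time
import traceback


def sample_thread(thread_id, interval, stop_event, counts):
    """Periodically record the call stack of one thread."""
    while not stop_event.is_set():
        frame = sys._current_frames().get(thread_id)
        if frame is not None:
            names = [entry.name for entry in traceback.extract_stack(frame)]
            counts[";".join(names)] += 1
        time.sleep(interval)


def busy_work(seconds):
    """Stand-in workload (hypothetical) that burns CPU for a while."""
    deadline = time.monotonic() + seconds
    total = 0
    while time.monotonic() < deadline:
        total += 1
    return total


counts = collections.Counter()
stop_event = threading.Event()
sampler = threading.Thread(
    target=sample_thread,
    args=(threading.get_ident(), 0.005, stop_event, counts),
)
sampler.start()
busy_work(0.3)
stop_event.set()
sampler.join()

# Stacks seen most often are where the time went.
for stack, hits in counts.most_common(3):
    print(stack, hits)
```

The workload itself is never modified or instrumented; the sampler only peeks at it, which is why this style of profiler adds little overhead.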
19:13 It has a really simple output, but as you will learn, it has all these interesting ways to visualize that output.
19:20 So its sort of base, its atomic unit of output, is a flame graph.
19:24 So, flame graphs are like stacked up sort of things that are colorful and they also have information.
19:31 So, like the color communicates information and the height.
19:34 So, it's kind of like a graph with like color bars type of thing and it has the parts of code they're running.
19:39 If you want to see what that is, just click on the link and it has it right there at the top.
19:43 And that's cool.
19:44 So, it puts that out, but you can build other tools to analyze that or you could even make like a little player application that replays the execution of your application in like slow motion.
19:54 Like replays that flame graph over time.
19:56 Oh, that's neat.
19:57 Isn't that cool?
19:57 Yeah.
19:58 So, now is where it gets really fun because there's a couple of user interfaces on top of this like simple output that can be interpreted.
20:05 So, the first one is called the TUI, the Terminal User Interface.
20:11 Do you see this animated in our little show notes?
20:13 And we'll be in there.
20:13 It's nice.
20:14 Yeah, it's really cool.
20:15 Yeah.
20:15 So, let me try to describe it.
20:17 Like imagine you've opened, I don't know, Emacs or something like that.
20:21 But the top part of it shows the process information, the CPU it's using, the memory it's using, how long it's been running.
20:30 And then a graph, an active like interactive flowing graph across the top of like the performance analysis.
20:38 And then it has something that's a little bit like top maybe, showing you like what it's currently running, how much time it's using.
20:45 Is this time being spent on a sub function call?
20:49 Like did I call a thing that called requests that is talking to the network and that's why it's slow because we're waiting on the internet?
20:54 Or is it actually computationally my stuff running in Python or whatever, right?
20:59 So, what do you think?
21:00 That's cool, huh?
21:01 Yeah, that bottom part reminds me of the thing that you put the process explorer on Windows where you look at all your processes.
21:08 Yeah, a little bit.
21:09 Like task manager.
21:10 But it's actually for like your functions instead of other processes.
21:13 Yeah, yeah.
21:14 It's nice.
21:15 That's cool.
21:15 So, that's the TUI, which is going to be a popular one.
21:18 But you may also want to be on the web.
21:20 So, there's WebAustin, which is another example of making this for the web.
21:24 So, you basically can log into wherever you're running it, connect to it, and it has a D3 flame graph that's like animated of what your web app or whatever process you're watching on that remote system is up to.
21:36 So, it's kind of the same thing, but like more visual, more graphical.
21:39 Like the flame graph is there and whatnot.
21:42 So, that's pretty cool.
21:44 People can check that one out.
21:46 You can even pause it and whatnot.
21:48 Then finally, there's this other format called SpeedScope, which can be visualized in other tools.
21:55 And you can convert Austin output into the SpeedScope JSON format.
22:00 And there's a sample for that in the repo.
22:03 If you go look at that, you can load it into the SpeedScope visualizer type of things and have another way to view the data.
22:09 So, this is really nice because so many of these profilers are like, we collected all this information.
22:14 How would you like it as a CSV?
22:17 Or how would you like it as just like random columns in a terminal?
22:22 And this is so much like, I would not like it that way.
22:25 I really like the visualization because it's one thing to gather the information.
22:30 It's another to go, oh, I see.
22:33 Right there is actually where it's slow.
22:35 And if it's just a dump of a bunch of numbers, I mean, yeah, you can like sort it and whatnot.
22:40 And you can use cProfile with different sorting options and get it to mean stuff.
22:44 It is not the same.
22:45 It's like, aha, there's the picture.
22:47 I see.
22:47 It's red right there.
22:48 And it's really tall.
22:49 Let's go figure that out.
22:49 Yeah.
22:50 And the web one, the logo is awesome.
22:52 It's good.
22:53 It is really good.
22:55 Yeah.
22:55 It's like a 70s thing.
22:56 It reminds me of Austin Powers.
22:58 Yeah.
22:59 A little bit, right?
23:00 Yeah.
23:01 Yeah.
23:01 In a non-copyright infringing way.
23:04 Anyway, that's it.
23:06 If people are looking for a profiler, Austin looks pretty cool.
23:09 Check it out.
23:10 Definitely.
23:11 It seems like it's definitely one of the contenders.
23:13 Thanks, Anthony, for sending that in.
23:14 I want to talk about numbers.
23:15 Does this fit in the screen?
23:17 Oh, yeah.
23:17 You got your mathematician hat on now or your wizard hat.
23:20 I can't decide.
23:21 I got this from a man.
23:23 He writes great stuff too.
23:24 First name Moshe.
23:26 Is the last name Zadka?
23:27 Zadka?
23:27 Moshe, you got to contact me and find out.
23:30 Tell me how to pronounce your name.
23:32 But numbers in Python.
23:33 Really great article.
23:35 In Python, you don't really have to think about numbers too much.
23:37 They just sort of work.
23:38 Yeah.
23:39 But you do kind of need to think about them.
23:41 And this article is a really good, quick tutorial about the different things that you need to know.
23:45 Like integers, they turn into floats really easily.
23:48 Like any time there's a division, it'll turn into a float.
23:52 Right, which is unlike other languages, which are like truncating sort of things, right?
23:57 That basically take the floor of whatever the result would be.
24:00 Yeah.
24:01 In earlier Python versions, like 2.7, it was like that.
24:04 Yeah.
24:05 Yeah.
24:05 Like truncated off.
24:06 Right.
24:07 If you want that old type, now you got to double divide, like the two slashes.
24:11 Yeah.
24:11 And I forget about the two slashes thing.
24:13 Yeah.
24:13 I never use the two slashes because that seems wrong.
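The two division operators can be compared directly. The values below are just illustrative; note that `//` floors (rounds toward negative infinity), which differs from truncation for negative numbers.

```python
true_quotient = 7 / 2      # "/" always produces a float in Python 3
even_division = 4 / 2      # still a float, even when it divides evenly
floor_quotient = 7 // 2    # "//" keeps integers as integers
negative_floor = -7 // 2   # floors, so this is -4 rather than -3

print(true_quotient, even_division, floor_quotient, negative_floor)
```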
24:15 Anyway, so the implications are weird though.
24:20 And the other thing, okay, so you got integers, they turn into floats if you divide them.
24:24 You got floats, which are things with decimal points in them.
24:27 They're not the only things with decimal points in them though.
24:30 One of the things you learn early on in programming, but some people are new to programming or numbers,
24:36 so it's a good thing to remember, is floats don't behave like real numbers in math.
24:41 Like the subtraction and addition are not inverses.
24:44 And addition is not associative always.
24:47 And you can't multiply and then divide and get the same number.
24:51 Those are weird things you should be aware of.
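A couple of these properties are easy to check in a quick sketch. The specific values 0.1, 0.2, and 0.3 are chosen here for illustration, since none of them is exactly representable in binary floating point.

```python
# Addition is not always associative: grouping changes the rounding.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
associative = left == right

# Subtraction does not always undo addition.
round_trip = (0.1 + 0.2) - 0.2
inverse_holds = round_trip == 0.1

print(associative, inverse_holds)
```

Both comparisons come out false, even though the equivalent statements about real numbers are true.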
24:53 The normal thing that I mostly need to remember is don't try to compare floating point numbers with the double equals.
25:02 You have to use something like approximate or something.
25:04 Yeah.
25:05 That's the one that can really catch people out.
25:07 I mean, okay, so I thought I was going to get 14 and I got 13.9999999978.
25:13 Okay.
25:14 Well, it's computers.
25:16 We know that stuff's truncated, but it's really easy to go.
25:19 If X equal equals some number I'm looking for and that never ever happens.
25:25 Right.
25:25 It looks right and it is so wrong.
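A tolerance-based comparison is the usual fix. The sketch below uses `math.isclose` from the standard library; in tests, `pytest.approx` serves the same purpose. The values are illustrative.

```python
import math

total = 0.1 * 3

exact_match = total == 0.3               # looks right, but is False
close_enough = math.isclose(total, 0.3)  # compares within a relative tolerance

print(exact_match, close_enough)
```

`math.isclose` defaults to a relative tolerance of `1e-09`; when comparing against zero, pass an explicit `abs_tol` as well.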
25:28 And I think just our training for so many years in theoretical mathematics means that it's hard to look at that and go, that's wrong.
25:36 Yeah.
25:37 Well, it's interesting that when you see it in numbers, you can, like, for instance, one of the examples is 0.1 plus 0.2 minus 0.2 minus 0.1 is zero.
25:46 Obviously.
25:47 Of course it is.
25:48 If it's floating point numbers, though.
25:50 So floats don't end up with zero.
25:53 You end up with a very small number, but it's not zero.
25:56 Okay.
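A sketch of that kind of cancellation follows. Whole numbers such as 1.0 and 2.0 are exactly representable as floats, so the effect only shows up with values like 0.1 and 0.2, which are the values assumed here.

```python
import math

residue = 0.1 + 0.2 - 0.2 - 0.1

is_zero = residue == 0.0
# Comparing against zero needs an absolute tolerance.
near_zero = math.isclose(residue, 0.0, abs_tol=1e-12)

print(residue, is_zero, near_zero)
```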
25:57 So floats are weird.
25:58 Be careful.
25:58 Fractions.
25:59 So if you don't use floats, there's fractions.
26:02 Python has built in fractions.
26:03 I actually have never really used these.
26:06 It's neat.
26:06 They're there.
26:07 They're there.
26:07 I've never used them either.
26:09 But yeah, there's like a class called fraction with a numerator and a denominator or takes another fraction or a floating point.
26:15 Yes.
26:16 Even takes a string.
26:17 How about that?
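The constructor forms mentioned here can be shown with the standard library's `fractions.Fraction`. The values are illustrative only.

```python
from fractions import Fraction

from_pair = Fraction(3, 6)           # numerator and denominator, reduced to 1/2
from_fraction = Fraction(from_pair)  # another Fraction
from_float = Fraction(0.5)           # exact, because 0.5 is representable in binary
from_string = Fraction("3/7")        # a string

third_plus_sixth = Fraction(1, 3) + Fraction(1, 6)  # exact arithmetic: 1/2

print(from_pair, from_fraction, from_float, from_string, third_plus_sixth)
```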
26:18 The warning in this article is they, fractions take a lot longer than you expect they would for algorithms.
26:25 So you can represent things as fractions.
26:28 It's cool that you can do that.
26:29 Be very careful with any sort of algorithm because it can explode in memory and size and time and stuff like that.
26:37 So probably use floating point.
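How quickly exact fractions can grow is easy to see with a small iteration. The recurrence below is an arbitrary example picked for illustration, not one taken from the article.

```python
from fractions import Fraction

x = Fraction(1, 3)
for _ in range(10):
    # Squaring doubles the size of the denominator on every step.
    x = x * x + Fraction(1, 3)

digits_in_denominator = len(str(x.denominator))
print(digits_in_denominator)  # hundreds of digits after only ten steps
```

Every operation stays exact, so the numerator and denominator keep all of their digits, and both memory use and running time grow with them.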
26:39 That doesn't surprise me because when I have to do like fraction algorithms, in my mind, it takes a lot longer too.
26:46 Yeah.
26:47 The last one that he talks about, and it's something that some people don't realize right away, is that decimals are built in.
26:54 So there's a decimal module, which is probably not surprising.
26:58 One of the reasons it's in there is for financial transactions.
27:01 They're set up to be correct with precision and do the right thing.
27:05 And so I'm really glad it's there.
27:06 Otherwise, we'd have like competing decimal third-party libraries or something like that.
27:11 And we probably do.
27:11 But this one's built in.
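A small sketch of why the built-in `decimal` module suits money. The price, quantity, and tax rate here are made-up values for illustration.

```python
from decimal import Decimal, ROUND_HALF_UP

# Build Decimals from strings so no binary float error sneaks in.
price = Decimal("19.99")
quantity = 3
tax_rate = Decimal("0.0825")  # hypothetical rate

subtotal = price * quantity
# Round the tax to whole cents with an explicit rounding rule.
tax = (subtotal * tax_rate).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
total = subtotal + tax

print(subtotal, tax, total)                               # 59.97 4.95 64.92
print(Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))  # True
```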
27:13 I'm glad the article was written, though, because something weird about decimals that I didn't know about was there's a global state variable called context that holds the precision that's being used for decimal division and stuff.
27:30 It could be anywhere in your program that the precision gets changed.
27:35 So the recommendation in this article is to use a local context.
27:39 So you can do one of those blocks, context, what are those things called?
27:43 Context manager.
27:45 Context manager, yep.
27:47 You can use the context manager, localcontext, to set a local precision for your arithmetic.
27:54 So that's good.
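A minimal sketch of the local-context approach using `decimal.localcontext()`. The precision of 50 is an arbitrary value picked for the example.

```python
from decimal import Decimal, getcontext, localcontext

# Precision of the ambient context (28 unless something changed it).
default_prec = getcontext().prec

with localcontext() as ctx:
    ctx.prec = 50                        # applies only inside this block
    precise = Decimal(1) / Decimal(7)

# Outside the block the previous precision is back in effect.
after = Decimal(1) / Decimal(7)

print(precise)
print(after)
print(getcontext().prec == default_prec)
```

Because the change is scoped to the `with` block, code elsewhere in the program keeps whatever precision it was already using.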
27:55 That seems like that should be the required way.
27:58 Because just setting it globally seems really, I don't know, it seems wrong.
28:03 Because, you know, think of the race condition there.
28:06 I was doing math, and then the precision got cut in half, and then it wasn't what I expected anymore.
28:11 Yeah.
28:13 Or, I don't know, maybe there should be a minimum precision.
28:17 This is interesting, though.
28:18 Like, I didn't realize that you could even change the precision of decimals.
28:21 So, like, in the docs it says, unlike hardware-based binary floating point numbers, the decimal module has a user-alterable precision defaulting to 28 places, which can be as large as needed for a given problem.
28:34 So, yeah, you can change it.
28:36 The example in the Python docs shows just globally changing it halfway through a calculation, which seems like a bad idea, like let's kick them down the stairs instead of teaching them to hold on to the handrail.
28:48 But this is really cool, like, this local context, change it, you can set it really high.
28:53 That's cool.
28:54 I had no idea that you could actually change that to grow as you need it, which is cool.
28:57 Yeah, I guess you could still use the global context as long as you, maybe this isn't safe, but as long as you always remember to set it before you do decimal arithmetic.
29:07 It's safe as long as you're not doing threading.
29:09 Oh, yeah.
29:10 Okay.
29:11 Yeah, because what if somebody, some other thread has the same idea and changes it?
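On the threading worry, the standard library documentation describes `getcontext()` as returning a separate context for each thread, so one thread changing its precision should not affect another. A small sketch to check that behavior; the `worker` function and `seen` dictionary are just for illustration.

```python
import threading
from decimal import getcontext

getcontext().prec = 28      # precision for the main thread
seen = {}


def worker():
    # This changes only the worker thread's own context.
    getcontext().prec = 5
    seen["worker"] = getcontext().prec


t = threading.Thread(target=worker)
t.start()
t.join()

seen["main"] = getcontext().prec
print(seen)
```

Within a single thread the context is still shared by every function and library that does decimal math, so a scoped `localcontext()` block remains the safer habit.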
29:14 I think the idea is maybe, it seems to me like possibly it would be better if once set, it couldn't be set again.
29:21 Like, you could set it at the beginning of your program, but it couldn't be altered and altered and altered.
29:25 Yeah.
29:25 Like, right, something along those lines, like, okay, we set it, we're done, it's an exception if you try to set it again to something else and so on.
29:32 I mean, it probably is a convenience.
29:34 Yeah, this is a whole world I didn't even know about.
29:37 This is cool.
29:37 Yeah, I would probably set up some sort of, like, hook or something to make sure that you're only setting it one place if you're doing that.
29:43 Yeah.
29:44 I don't know.
29:44 Yeah, sounds good to me.
29:45 Cool.
29:46 Anyway, yeah, this is actually more interesting than I thought because, like, as usual, I've learned something, which is cool.
29:51 Last thing on this, we're going to link to the standard library documentation for fractions and decimals because you may not have heard of them.
29:57 And then a very old article that if you really care about floating point numbers, you should at least know this article exists.
30:05 Although I don't think I've actually gotten through the whole thing ever.
30:08 But it's "What Every Computer Scientist Should Know About Floating-Point Arithmetic."
30:12 That's a good article.
30:13 Yeah, cool.
30:14 All right.
30:15 Any extras for us today, Michael?
30:17 You know, not too much.
30:18 I don't have too much to share right now.
30:21 Nothing personal.
30:21 But I do want to say thank you to everyone who subscribed to the YouTube feed of this podcast because we're breaking every segment.
30:29 We've just covered six things.
30:30 We're breaking that into six different videos.
30:32 And you can see us on video, which is kind of cool.
30:35 A bunch of people are subscribing at pythonbytes.fm/YouTube.
30:39 You all can check that out.
30:40 And you will see that Brian has awesome hats for every segment.
30:43 Well, at least this episode.
30:45 This episode.
30:46 Which, yeah, you got to wait.
30:48 Yeah, for sure.
30:49 This episode.
30:49 So eventually, you'll get to see the hats.
30:52 Okay.
30:52 I'm glad we mentioned that.
30:53 I wanted to mention also that Python 3.9.0 alpha 6 is the last alpha release before we go into betas, I believe.
31:03 And it is available.
31:05 And it has the PEG parser that we talked about a little bit, I think, last week.
31:10 Yeah.
31:11 Yeah.
31:11 Guido was here and talked about that, which was really cool.
31:13 The work he's been doing there.
31:14 That's a big long-term upgrade, right?
31:18 That's something that got written in the original version of Python.
31:21 It was unchanged and obviously can be better, right?
31:24 Basically, the syntax was limited by the parser and how much it looked ahead and stuff.
31:29 And so this should open up the language for more complex concepts or make it easier to add concepts to it.
31:35 Yeah.
31:36 All right.
31:36 So I see we have some competing jokes here.
31:38 You want to go first?
31:39 Yeah.
31:39 I just put a call out on Twitter and said I need some more jokes.
31:42 And boy, I got a whole bunch of great ones back.
31:45 I'm going to pick one.
31:46 This one's from James Zabel.
31:48 If you put 1,000 monkeys at 1,000 computers, eventually one will write a Python program and the rest will write Perl.
31:55 That's right.
31:56 I think maybe like 950 of them will write Perl.
32:00 A couple of them are just going to be writing regular expressions like all on their own.
32:03 Yeah.
32:04 That's true.
32:05 All right.
32:07 I have one that's maybe in a similar vein here.
32:09 So, you know, like we talked about Austin.
32:11 It has all these different user interfaces and it's very user friendly.
32:14 Well, you could say that Unix is very user friendly as well.
32:19 It's just very particular about who its friends are.
32:22 Yeah.
32:25 I got friends like that.
32:26 Yeah.
32:26 I got that one from the pyjokes package.
32:29 So pip install pyjokes and you can have it too.
32:31 Yeah.
32:32 That's good.
32:32 Yeah.
32:33 Anyway.
32:34 Cool.
32:34 Thanks.
32:34 All right.
32:35 Fun as always.
32:36 Great to be here with you.
32:37 Bye.
32:37 Bye.
32:38 Thank you for listening to Python Bytes.
32:40 Follow the show on Twitter at Python Bytes.
32:42 That's Python Bytes as in B-Y-T-E-S.
32:45 And get the full show notes at pythonbytes.fm.
32:48 If you have a news item you want featured, just visit pythonbytes.fm and send it our way.
32:53 We're always on the lookout for sharing something cool.
32:55 This is Brian Okken and on behalf of myself and Michael Kennedy, thank you for listening and sharing this podcast with your friends and colleagues.