Transcript for Episode #189:
What does str.strip() do? Are you sure?
00:00 Hello, and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds. This is Episode 189, recorded July 2, 2020. I'm Michael Kennedy. And I am Brian Okken. And this episode is brought to you by us: our courses and our books and all those sorts of things. We'll tell you more about that later. Right now, let's talk about improving Python exceptions. Yeah, this came from Ram Rachum, and I'm probably getting his last name wrong. But this is a pretty interesting article. Actually, I had seen this in code. So exceptions are a big part of programming in Python. I think some people don't like to use them because they're kind of difficult in some languages, but it's pretty easy in Python. So let's say you're dealing with an exception, you're inside the except part of a try/except block, and another exception happens. If an exception is raised from an except clause, which exception should be reported? I really don't know what the mechanism was in Python 2, but there were changes made for Python 3, associated with PEP 3134 and others, so that both of them will be reported. So you get both exceptions. But often you get something that says, "During handling of the above exception, another exception occurred." Right, which is super frustrating, because often it's not actually that an error occurred while trying to handle it. It's more like, I'm trying to add more information to it, or give you more detail and more specificity, or something like that. Right, right. So what Ram is pointing out is that if an exception is raised within an except clause, it can really be one of two things. It can be that I called some code that I didn't expect to raise any exceptions, and an exception happened there. And that's kind of what the default message implies. 
But it might also be, and there's this long block, I'm just going to read it: "An exception was raised, and we decided to replace it with a different exception that will make more sense to whoever called this code. Maybe the new exception will make more sense because we're giving a more helpful error message. Or maybe we're using an exception class that is more relevant to the problem domain, and whoever's calling our code could wrap the call with an except clause that's tailored to this failure mode." So, a long mouthful, but basically it's trying to be nice and raise a better exception or a better message or something. And in order to do that, all you have to do is change one thing, and it's the "from e". So within your except clause, you write except something as e, or any variable name, but e is often used, and you raise the other exception with a different message "from e". And that little "from" will get you a traceback that has both exceptions. It'll say, "During handling of the above exception, another exception occurred." Oh, no, wait, that's what we had before. Right, that's the confusing one, because it looks like we tried to handle it but actually crashed again inside the handling of it. Right, right. So the new thing, if you do the "from e", is it'll say, "The above exception was the direct cause of the following exception." So that's a little clearer, I think. The one on the top caused the second one to happen. So you know that we're in the case of somebody trying to wrap an exception with a friendlier exception, which is subtle. I don't know, I just like it. Ram likes it, but he saw that nobody was using it. So he went even further and wrote a script that looks for probable places where that should be happening. And he did. 
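As a rough illustration of the two traceback messages discussed above, here is a minimal sketch. The ConfigError class, the two read_port functions, and the failure_report helper are all made up for this example; only the raise ... from e syntax and the traceback module come from Python itself.

```python
import traceback


class ConfigError(Exception):
    """Hypothetical domain-specific error, invented for this sketch."""


def read_port_implicit(settings):
    try:
        return int(settings["port"])
    except KeyError:
        # No "from": the KeyError is only recorded as __context__.
        raise ConfigError("no port configured")


def read_port_explicit(settings):
    try:
        return int(settings["port"])
    except KeyError as e:
        # "from e" marks the KeyError as the direct cause (__cause__).
        raise ConfigError("no port configured") from e


def failure_report(func):
    """Call func with empty settings and return the formatted traceback text."""
    try:
        func({})
    except ConfigError as err:
        return "".join(traceback.format_exception(type(err), err, err.__traceback__))
    return ""


implicit_report = failure_report(read_port_implicit)
explicit_report = failure_report(read_port_explicit)

print(implicit_report)
print(explicit_report)
```

The first report should contain the "During handling of the above exception" wording, and the second the "direct cause" wording, which is the difference a reader of the traceback actually sees.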
He did pull requests to setuptools, SciPy, Matplotlib, pandas, pytest, IPython, mypy, Pygments, and Sphinx, which is awesome, that he just went out and did pull requests against all of these projects. That's really cool. The other thing he did was write a new rule for Pylint. It's W0707, raise-missing-from, which is kind of a weird name, but basically Pylint will now catch it. It's not released yet, but if you run Pylint from GitHub you can see that rule, and the next time they release Pylint, it'll be in there. Yeah, this is neat. I like it. I think it's confusing the way it is now, if you don't do it the right way. And I also often get messages from students saying, oh, I got this really bizarre error, some other thing. And the problem is buried in the other part of this whole story, right? And it's not clear what part they should be paying attention to. So I think this makes it more clear, and it's nice. All right, well, the next thing I want to talk about is publishing data science and other interactive stuff on the internet. Now, there's a couple of ways you can do this. You could go to something like Google Colab and whatnot. But Tim Pogue sent over this
05:00 new place that's basically custom built for this story. And it's called Datapane. So Datapane is an open source framework which makes it easy to turn scripts and notebooks into interactive reports. You can export a Jupyter notebook as a report, but then it's just static, you can't interact with it. Or if you try to run it, then there's a bunch of code in there, and there's all kinds of stuff that you probably don't want people to see. So Datapane is a way to create and host interactive HTML that runs against APIs and runs data and so on. It's pretty cool. Do you go there and use sliders to adjust what's happening in, say, a graph, or filtering, or things like that? That's neat. So this is one of those things where it's a paid service, but they have free accounts, right? Yes, exactly. I'm not entirely sure of the whole spectrum of what's going on. It says it's an open source framework, so I think there's an API that you bring in to make this happen, and there might be a self-hosting story. They also have a platform for publishing to, and on that platform it's free for individuals. I can't find a price anywhere, but it looks like they're going to eventually charge for a team, commercial, corporate version. I don't really know what's going on with that. But it's free for individuals, and you can use their open source framework to generate these reports. Okay. And they also have the gallery there. I was looking for it for a while, and obviously it's right at the top. Galleries, for looking at pictures. Exactly. Yeah. So the idea is that you can use the tools that you already know and love. You can use Jupyter or Colab or Airflow, and you can build reports using their framework, using Markdown as well, and cool libraries that people are probably already using for visualization, like Altair, which I've been using for my visualizations lately. 
And then you can either export these as standalone HTML files, which is one option, that's the totally disconnected one, you just take the library. The other one is you can publish them to Datapane for free, and people can share them and embed them. And there's probably this paid corporate thing that I don't have details on. Nice. Yeah, you can add forms to help filter and drive the report, and have it talk to APIs. They also have APIs that you can use to deploy your scripts to their server. And you can even have it integrated with something like GitHub Actions, or Airflow, or standard CI/CD stuff, to update your reports on datapane.com. So, pretty cool. Like you said, there's a gallery; I'll link to the gallery at the end of the section. With these kinds of things, you just want to poke around the gallery and get a sense before you mess with it, right? I like pictures. We both like pictures. Now, before we go on to the next thing and look at the ways in which pickle might make you sour, let's talk about things that won't make you sour: your pytest book. If people are interested in learning pytest or getting into testing, not just scratching the surface but really deeply understanding pytest, they can check out your book, which we'll link to in the show notes. And if they would like to take some courses, we have a whole bunch of courses over at Talk Python; we're almost up to 200 hours of courses over there. That's a lot. Nice. It is a lot. But like I've said before, one of the things I like about it is you don't have to spend hours at a time; you can spend a few minutes and not lose your place. Nice way to set it up. Yeah, thanks. On the pytest side, I wanted to reach out to people. I'm pretty available on Twitter or through the contact form on Python Bytes, not Talk Python. However, I've had some people try to email me through the contact form on Talk Python, and Michael just sends it to me, nicely. 
So anyway, I want people to know that I'd like to hear from them if they want to learn how to test more, whether or not they've read the pytest book. What's stopping you? I don't know how to get in people's heads about why they're not testing more. So please reach out to me and let me know your questions. Absolutely. Now you want to make me sour? I love pickles. Do you like pickles? I'm coming around to them. I used to really not like pickles, but I'm okay. Okay, I have a really fun pickle story to tell you, but it's like 10 minutes, so it's better for next time. But let's talk about Python's pickle. I kind of like it sometimes. It's one of the first things I learned about for saving data: you can pickle stuff. So if you've got whatever object, or a bunch of objects in a collection, you can use pickle to serialize it somewhere. I think we've brought it up here on this podcast, but you will occasionally run into people saying, just be careful with pickle, or never use pickle, or something. And Ned Batchelder put together a post called "Pickle's nine flaws" where he says it's not "never use it," but use pickle if you're okay with these flaws. And I like this list because I've heard some of these before, but never in one place. First off, it's insecure. So if you're only pickling and unpickling your own data within your own program, and there's not a chance of having somebody else
10:00 feed you bad pickles, then you're okay. But it's not secure. There are ways for malicious pickles to be generated that can cause the unpickling to run arbitrary code, which is just not what you want. And then he links to a more thorough article on the security, if you're worried about that. The second one is "old pickles look like old code." The gist of it is that you can't really version this stuff. So if you change the way your objects look, your old pickles aren't going to work, right? Change your data structure, and you've pretty much got to throw out all the data and start over. Yeah, this one is really tricky, because it takes the entire object graph. You say, I'm saving this list; this list contains these things; these things contain pointers to other things; it's going to save all of that stuff. And if the code that defines the fields of that changes, if you rename a field and you try to load it, you're done. It's like this blind binary blob that doesn't fit together anymore. There's no "well, let me adjust for this." It's just, nope, you can no longer load that file. So it depends, but not being able to load your file at all, or even open it in a text editor, might be a bad place to be. Basically, if you're trying to save stuff between versions of your code, it's not a good thing. There may be cases where you're just using it to serialize within a live, running application, and you're not going to save that data anywhere. I think there was an interesting problem here. I think it was Instagram, somebody that had done a big talk at PyCon. They were doing caching in something like Redis: they would pickle stuff, put it in the cache, and pull it out. Then they switched from Python 2 to 3, and they could no longer read their cache. And it, like, took down the system or something. 
Yeah, so there are really weird cases like that, like deploying a new version of the site. Yeah. I'll read through the whole list and we can pick out a few. It's implicit, so it serializes everything, even if you don't want it to. Oh, that's "it over-serializes." The implicit part is that it serializes things as class objects, even if that's not really what you'd want. His example is datetime: you really wouldn't want to serialize it as a datetime object, you'd want to serialize it as a string or something. __init__ isn't called. Normally when your class is created, __init__ gets called, but during unpickling it just creates the object without calling __init__. So if there are any side effects that you need to have happen, that ain't gonna happen. It's Python only, so you can't share it with other languages. It's unreadable. You mentioned it's binary, so if anything goes wrong, good luck with debugging. It appears to pickle your code, but it really doesn't. So if you've got a list of functions and classes and objects, those will be in there, but it's just the names of stuff. And then when you unpickle it, it creates those objects without calling __init__. It's slow. And so the binary part gets me, but basically, these are all the things that are wrong with it. But there are some ways to get around it. And a lot of people will say, well, there are ways to get around it. And Ned's comment is, if you're going to write a bunch of code to work around the problems of pickle, then why are you using pickle? There are other things out there. And my favorite alternative is just JSON. It mostly works most of the time, and you can read it. So you can look at what's wrong and go, oh, I see why that didn't work, because I was putting this garbage in there, or whatever. I haven't tried these others, but there are other suggestions like marshmallow, cattrs, and protocol buffers. I haven't used those before. 
But the one I'm going to bring up: you brought up something about it being binary. And that's one of the parts that gets me about it as a serializer. Being binary makes sense if it's really fast, but a binary serializer that's also slow? Come on.
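The "__init__ isn't called" flaw from the list above is easy to see in a small sketch. The Connection class and its counter are hypothetical, invented only to make the skipped side effect visible; the sketch also assumes it is run as an ordinary script so pickle can find the class by name.

```python
import pickle


class Connection:
    """Hypothetical class whose __init__ has a side effect we can observe."""

    init_calls = 0  # counts how many times __init__ has run

    def __init__(self, host):
        Connection.init_calls += 1
        self.host = host


original = Connection("example.org")  # __init__ runs once here
blob = pickle.dumps(original)         # only unpickle blobs you created yourself
restored = pickle.loads(blob)         # attributes come back, __init__ is skipped

print(restored.host, Connection.init_calls)
```

The restored object has its host attribute, but the counter still reads 1, so anything __init__ was supposed to set up beyond plain attributes would be missing after unpickling.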
13:59 Yeah, one of the things about the over-serializing that I found interesting is caching. Caching is something we do for long, hard computations, or just to save some computation. You cache the result, and that stuff will get saved in your pickle also. And that's surprising. So, just weird. Yeah. So thanks, Ned. Yeah, that's a cool article. You know, the one place that pickle kind of makes sense to me is if I'm doing a really quick prototype, and what I need to save is fairly complicated. I'm just like, you know what, I just want to save this. And if we really invest, and I like what we've built here, we're going to go and rewrite this with a proper serialization, data-saving story. Yeah, I could see that. All right, just pickle it. Now, I don't really often do that, it's not very common, but I would certainly consider it: I just need to save this complicated data structure, and I don't really know if I even need this forever, but I'm just going to do it for now. I might consider pickling, but in general it's not super awesome. And also, with the malicious bits, you've got to be careful. So the reason why you can't grab JSON
15:00 right away is because there are some objects that are not serializable, right? Yes, exactly, and some of them are simple things like datetimes. If you had some really complicated structure that had, say, datetimes or other objects in there, you can't just dump the whole thing to JSON, right? So if you were just in a hurry, like, I'm just trying to spend 30 minutes to figure this out, and I just need to save this and then load it again, then maybe pickling makes sense. But the fact that it's very possible you upgrade Python and can never load it again, or things like that, and that you can't look at it, that makes me never want to invest in this as a long-term option. Yeah. The other thing I thought might be reasonable, and I'd like to hear other people's thoughts on what reasonable uses for pickle are, is in-memory stuff. So you're serializing it just to get it from one part of your system to another, or something. Yeah. Like a super deep copy sort of thing might make sense, potentially. Right.
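To make the datetime point concrete, here is a small sketch of why json.dumps can't be grabbed right away, and one common workaround. The record dictionary and the encode_extra helper are made up for illustration; converting datetimes to ISO strings is just one possible choice, not something the hosts prescribe.

```python
import json
import pickle
from datetime import datetime

# Hypothetical record mixing plain values with a datetime.
record = {"name": "episode 189", "recorded": datetime(2020, 7, 2)}

# json.dumps refuses the datetime out of the box.
try:
    json.dumps(record)
    json_error = None
except TypeError as err:
    json_error = str(err)


def encode_extra(obj):
    """Fallback used by json.dumps for types it doesn't know (assumed helper)."""
    if isinstance(obj, datetime):
        return obj.isoformat()
    raise TypeError(f"cannot serialize {type(obj).__name__}")


as_json = json.dumps(record, default=encode_extra)

# pickle handles the same structure with no extra code,
# which is the "quick prototype" appeal mentioned above.
restored = pickle.loads(pickle.dumps(record))

print(json_error)
print(as_json)
```

The JSON route costs a few lines up front but leaves a readable file; note that loading it back gives a string, so turning it into a datetime again is a separate, explicit step.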
15:55 Yeah. Speaking of things that might change because you get a new version of Python, that is now going to be more common. Yeah, yeah. So Łukasz Langa, he's been in charge of release management for Python recently, and he has gotten PEP 602 accepted. And PEP 602 says we're switching from a 17-to-18-month release cycle to a yearly release cycle. Yay. That makes sense. Yeah, remember, we were talking about something that got into Python 3.9, and we were looking at the conversation on the dev channel. And it was like, I really, really tried, I submitted this before the deadline, but you guys weren't able to review the PR before the window closed for 3.8. So because of that, that feature missed by a couple of weeks, and it was a year and a half before it could actually get accepted. Right. It was really long. So on one hand, this is to shorten that cycle. The other thing is, in a given year, when will Python come out? I don't know. Is it an even year or an odd year? That's just a weird way to think of things, right? We have conferences yearly, in theory, things like that. But Python had been on this 18-month, sort of alternating cycle. This way they can schedule it so that things like the feature freeze or beta stage happen right after PyCon, so the sprints at PyCon can always focus on getting that last bit into CPython, and it's much more predictable. So I think that'll be nice. Yeah, and there's other stuff in life, like holidays and things like that, so you can just figure out when a good time is for the whole schedule, and then do that every year. Absolutely, absolutely. So there's a little bit of interesting detail. It says the PEP proposes that Python 3.whatever be developed across a 17-month period, which doesn't sound annual, does it? No. 
So the way it works is the first five months will overlap with the previous version's beta period. Then there are seven months spent on alpha releases, then three months on beta releases, and then finally two months on release candidates. So instead of "we're on 3.8, and now that 3.8 is out we're focusing on 3.9," there's an overlap. Basically, as soon as the features freeze on the version that's coming out, the next one starts. Right now 3.9 is coming out, so as soon as it hit feature freeze, 3.10 would start, right? So there's this overlap: as the version that's coming out is getting stabilized and finalized, new features and development are already happening on the other one. And with that overlap, it's going to result in a new version every 12 months. There'll be a five-month period where new features go to the new one and stabilization goes to the old one. And this is kind of how commercial software is done anyway. Yeah, it's never like no one touches the new one until we've shipped the gold version. Some people are fixing stuff that they need to fix, and some people are putting in the new features. So in terms of advantages, they call out that it makes the releases smaller. Doubling the cadence doesn't double the available development resources, and so on, so consecutive releases are going to be smaller in terms of features. So that's good. It puts features and bug fixes in the hands of users sooner, right? Six months sooner. It creates a more gradual upgrade path: instead of adopting 18 months of changes at a time, you could adopt 12 months at a time. Say you're always going to be one version back from what just got released. So if you're saying we're going to run on 3.7 in today's world, that's not an 18-month gap, that's a 12-month gap. So that's good. 
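The arithmetic behind "17 months of development, but a release every 12" can be checked in a few lines. The phase lengths are the ones quoted above; treating the gap between releases as the total minus the overlap is this sketch's reading of the explanation, not wording taken from the PEP.

```python
# Phase lengths in months, as quoted in the discussion of PEP 602.
phases = {
    "overlap with the previous version's beta period": 5,
    "alpha releases": 7,
    "beta releases": 3,
    "release candidates": 2,
}

total_months = sum(phases.values())
overlap_months = phases["overlap with the previous version's beta period"]

# Each version's first months run in parallel with the previous one,
# so new versions arrive this many months apart.
months_between_releases = total_months - overlap_months

print(total_months, months_between_releases)  # 17 12
```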
You're a little less far behind if you're on the laggard side of things. And there's the predictable calendar: the final release is always going to be in October, after the annual core sprint, and the beta phase always starts in May. So as
20:00 to the PyCon US sprints, there's a nice lineup for that as well. So my prediction is that people won't be doing the two-version thing, though; they'll actually do three. I recommend everybody do this anyway. Let's say right now you've got a project on Python. You should make sure that it works on 3.7 and 3.8, of course, but also, 3.9's around, so test on 3.9 also. And then when 3.9 is the official one, maybe you keep 3.7, maybe you don't, but you're at least making sure you're compatible with the next one, the current one, and the last one. So it'll make more sense. Absolutely. Very good. Now, we spent a lot of last episode on Git, right? Yeah. Well, in Episode 187 we talked about "Oh Shit, Git!", a zine from Julia Evans. And I mentioned that I wanted to buy this for my team, but I was concerned about whether HR would like the naming of it. And John Place, one of our listeners, reached out and said, you know, there's a non-swearing version. He actually sent us a ton of great information. So the same little magazine, you can get it as "Dang It, Git!" Aw shoot, gosh,
21:11 darn it, Git,
21:13 all of those would be good. But it's "Dang It, Git!" And then also, when I was looking around at these, these are zines that you can buy, but they're inspired by two websites that were put together by Katie Sylor-Miller: dangitgit.com and ohshitgit.com. And those are actually really cool. These are free websites built around the idea of "something went wrong, how do I fix it?" And every single one of the topics is clickable and linkable. I didn't understand the links at first. I'm like, oh, these are links, they go to articles? Oh, it just takes me right to where I already was. I don't understand this. Why would I do this? Well, they're anchors, so you can click on one, get the URL, and send it to somebody. So if somebody asks you a question and the answer's here, you can send them the link and say, here's how to fix it. And then next time they can fix it themselves. So these are neat things in the community. And then also, okay, do you see the link for the Git Cheatsheet? Have you looked at this yet? I'm looking at it now. I thought this was going to be a PDF or something with a bunch of commands. But this is an incredible resource. The Git Cheatsheet is an interactive single-page website that's just beautiful to look at, for one. I'm going to pull it up also. It's got five columns: the stash, workspace, index, local repository, and upstream repository. And when you hover over or click on the different columns, it shows you all of the different Git commands that affect that. 
So if you click on the local repo, for instance, it'll show you how to get information from it: diff, how you compare between the workspace and the local repo, what reset does, switch, rebase, cherry-pick, and all sorts of stuff. And then there are different commands that go between the index and the local repo, and between the local repo and the upstream one, with pushes and stuff. It has this nice visual of where your stuff is and how you get it back and forth. And if you hover over any of the commands, it shows you the information at the bottom. This is just great. I'm going to share this with my team, because one of the things that takes a while to get your head around with Git is this idea of what the workspace and the index and the local repo are. So having a visual of which commands affect which part of Git is pretty darn cool. The last thing I wanted to share is a one-pager that you could print out. It's Git Pretty, which is kind of like the "Dang It, Git!" and "Oh Shit, Git!" offerings, where it's a single-page PNG flowchart that starts with "So you have a mess on your hands," and then it asks you a bunch of questions that funnel you to what commands might fix it. I don't know if I'd use this much, but I totally want to print this out and put it on my cubicle wall, because it's kind of entertaining. Yeah, this is really cool. I like it. More Git resources. These are a bunch of great ones. Yeah, a bunch of good resources. I'm going to close our main items out, Brian, with a simple one. Okay. So there's a new PEP around strip. Basically, that's the short version, before I tell you the long version. Okay. So suppose you have a string, and the string is "saturday is the 1st", so Saturday is the number one, s-t. And suppose for whatever reason, I'm not sure it's a really normal use case, but for whatever reason, I want to get the "st" off of the end here. 
And so I might say .strip("st"). What comes out of there?
21:13 Yeah, that's interesting. So what I thought, when I first encountered the strip function, was that I would pass it a string, and that string would be taken away from the string that I was stripping from. So if I say .strip("st"), it's going to take the "st" off the end. But what it really means is take away all the s's and take away all the t's until you hit something else. So it would be "aturday is the 1", right? It would take the "s" off the front, even though that's not "st", it's just "s". Right? Yeah, yeah. I mean, I guess the idea is you give it a space, a tab, a backslash-n, and a backslash-r, and you say, take those away, and that destroys the whitespace, right? That's probably the foundation. But you give it a string, and it doesn't take the string away; it turns it into a bunch of characters and just takes the characters away. But what if you wanted to take away the "st"? Well, you've got to write a bunch of code that goes and finds the "st", makes sure it's at the end, and then takes it away. Or you wait a little while and use a new function that's going to be in 3.9, called removeprefix or removesuffix. So Dennis Sweeney has got a PEP accepted to add these two functions that are actually probably what you first thought strip would do: given a string, it's going to take that string off the end. Okay, not a big deal. But I suspect there are probably people out there that type more than one character into strip and expect it to be treated as a substring, when it's not. And so there are going to be two functions in 3.9 that do that. Yeah, I mean, I totally get the need to have a set of things. If you want to pass in a space, a tab, a newline, there's a bunch of things you want to strip off the end. Well, there are easier ways to strip the whitespace. But let's say there's other random stuff that might appear at the end. Yeah, stripping that makes sense, but having it be per character, I was totally surprised by that. 
I didn't believe you. I just typed it in. Because I'm like, well, what about the s's in "is"? But no, it's a set of characters, and strip only takes those at the ends. Yeah, it takes the "s" off "saturday" and the "st" off "1st", but it leaves the other ones alone. Yeah, yeah. So there was not really a great way in Python to do that without writing your own function that checks that it's only on the ends and then takes it away. And so now there's removeprefix and removesuffix, and it's nice. And just finally, this will be applied to all the string-like things in Python. So, you know, Unicode str, binary bytes and bytearray objects, and collections.UserString. So not just str, it covers all these other things that are string-like. Oh, nice. String-ish. Is that a technical term? I think if we get the PEP accepted, it'll be technical. Okay.
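Here is a short sketch of the behavior described above, using a lowercase version of the example string so that the leading "s" is actually in the character set. str.removeprefix and str.removesuffix need Python 3.9 or later; the remove_suffix helper is a hypothetical stand-in for older versions, not something from the PEP.

```python
text = "saturday is the 1st"

# strip treats its argument as a set of characters to trim from both ends.
stripped = text.strip("st")
print(stripped)          # aturday is the 1

# removesuffix / removeprefix (Python 3.9+) treat it as a whole substring.
suffix_removed = text.removesuffix("st")
prefix_removed = text.removeprefix("saturday ")
untouched = text.removesuffix("nd")   # no match, so the string is returned as is
print(suffix_removed)    # saturday is the 1
print(prefix_removed)    # is the 1st

# The same methods exist on bytes and bytearray.
bytes_result = b"1st".removesuffix(b"st")


def remove_suffix(s, suffix):
    """Hypothetical helper for Python versions before 3.9."""
    if suffix and s.endswith(suffix):
        return s[: -len(suffix)]
    return s
```

The helper checks for an empty suffix first because s[:-0] would return an empty string rather than the original.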
21:13 No, yeah. All right. So that's it for our main items. What else you got? I've just been working a lot. So I got nothing. I bet you bet nothing. Well, actually a couple things going on. So Manning, the book publisher, they are going to have a Python conference. In a couple days, I'll pull it up really quick to see exactly what day that is. I think, actually, I do know what date is. I think it's the 17th. Okay, I think it's the 17th of July. Anyway, we're gonna be there. Yeah, not the 17th. There's a for it to the 14th, the 14th of July. So we're going to be doing a live Python bites at the Manning conference, I'll put a link to it in the show notes, so you guys can sign up. So you want to see us do a live event. You know, we've done that at Python. We've done that at some other events. And it'll be fun to do it virtually, which is basically all we get for our conferences, right. By live you mean, like in real time, but like online, right with audience questions and stuff like that. So that'll be fun. Also, I did a talk on 10 techniques for web developers with a little bit of a focus on pyramid, but more generally, just for web developers, tips and tricks and tools and whatnot. At the Python virtual conference, that was last month, or maybe the big, yeah, last month, for sure. In June, to the recording for that is out and I'll put a link to that as well. We also have a humble bundle running right now. It's probably when this episode comes out got another week or so. So there we've got a couple of courses for me, but tons of stuff, stuff from Matt Harrison stuff from real Python, and we've been learner and on and on. Great long list. There's like 1400 dollars worth of Python content and get it for 25 bucks. So let's go in like crazy right now. So if you're thinking about maybe taking one of the courses for either for me or one of these other guys, check it out. It's pretty much as good of a deal as you're gonna get on those. 
And then lastly, lastly, I'll be check pednekar. Hopefully I got your name close to right there, actually built something really cool for us. So at Python by setup him and talk by thunder FM, we have a Search API and click on the little search box in the top right and there's like a link to a JSON API you can use. So he did of course what you should do when you find an API and use it. He wrote to telegram bots, and you
21:13 You can just speak to the telegram bot. And you can either install the talk Python line or the Python bytes. telegram bot, if you just text or speak to it in telegram, it'll search talk Python and give you relevant responses about stuff we've covered. What? That's cool. Isn't that cool? Yeah, it's super, super cool. So I'll put the links to get those those telegram bots for if you want. Nice. Alright, you ready for a joke? I am. Alright. This one comes from Karen ci. I just don't think she submitted it to us. I think we just found it on Twitter. But here's a quick adaptation. It's not exactly what she said. I explained it a little she did it on Twitter. So it had to fit within a couple of characters. So is this conversation and Brian, let's just have it together. Okay. Okay. Who do you want me to be? Oh, I'll be you know,
21:13 I'll be me. Wait, okay. Yeah, I'll start off. How's that? Because it's very confusing. The way I phrased that. All right. Hey, famous engineer. inventor is coming over tonight for dinner. You want to join us? Sure. Who is it? Oh, his name is Rube Goldberg. That name rings a bell which sets off a trap under the that undoes a buckle and releases a ball that rolls down the pipe and yeah.
21:13 And on and on and on. And I did Rube Goldberg I love this is I should probably go check the internet for some really fun ones. And with all the the the Coronavirus stuff. There's actually some awesome Rube Goldberg that people are putting together on YouTube lately because people are at home with their time in their hands and their kids are home.
21:13 Absolutely, absolutely. All right. Well, I'm definitely going to go check YouTube now. Okay, all right. See you, Brian. Thanks. Follow the show on Twitter via @pythonbytes. That's Python Bytes as in B-Y-T-E-S. And get the full show notes at pythonbytes.fm. If you have a news item you want featured, just visit pythonbytes.fm and send it our way. We're always on the lookout for sharing something cool. On behalf of myself and Brian Okken, this is Michael Kennedy. Thank you for listening and sharing this podcast with your friends and colleagues.