

Transcript for Episode #176:
How Python implements super long integers

Recorded on Wednesday, Apr 1, 2020.

00:00 Hello, and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds. This is Episode 176, recorded April 1, 2020. April Fools'. I'm Brian Okken. And I'm Michael Kennedy. And this episode is brought to you by DigitalOcean. We'll talk more about them later. But yeah, let's just catch up a little bit. What's going on? Oh, you know, this is one of the few bits of human interaction outside of my house that I get, talking to you here on Skype video, right? It's a weird time, isn't it? It's totally weird. We normally don't record video, but we do see each other. And we noticed as we connected today that both of us have new haircuts, but they're homegrown haircuts. I never cut my own hair, and it was getting bigger and bigger. My hair doesn't get long, it just gets big and poofy, and it's more like a tip on top of my head. And eventually I just broke down, like, I can't do this. So at first I looked at my daughter, who's 19, and I'm like, here's a shaver, you've got this covered. She's like, no, no, no. But she did it. Okay job, and that was really good. And then I decided to go back, because it was still quite a bit longer than I wanted, so I gave it a second haircut as well. And, you know, honestly, it's about as good as the haircuts I pay for. So I'm feeling like this might be a trend that I was forced into. And you're looking good yourself. Yeah. So I normally have a short goatee thing that I keep, and when I bought the trimmer for it, I only needed one guard, because, I mean, I didn't think ahead. And I don't know where all the rest of the guards are. So I only have that little short guard for the trimmer. And my wife was willing to cut my hair, but it's a little shorter than I was expecting. But I think it's all right, I don't have to worry about it anymore. My kids say it's like peach fuzz.
So they come by and rub my head all the time. Oh, it looks good. And that's the weird world we find ourselves in. You know, I'm here drinking coffee out of this Pittsburgh mug, but there will be no Pittsburgh, right? PyCon is now officially canceled. Yeah, that's a bummer. This might be the first recording since it was actually fully canceled, or at least the first time we've spoken about it. Possibly we mentioned it last time, but there's certainly a tightening up on getting together for conferences and that kind of thing. And, you know, we spoke a little bit about which of these changes we've had to make are going to stick, like a lot of meetings could just be emails, or they could be remote, or whatever. Right. And I wonder what that's going to do to the conference space. Yeah, I wonder too. One of the reactions I heard from somebody was about moving towards a virtual conference; they were talking about that a little bit. But that wasn't the thing. It's essentially canceled. Well, though, some people are going to be able to send in their presentations as videos. So that's neat. But it's the meeting people. We interact with people online all year long, and it's the one chance to get together. So at least with PyCon, I don't think that will go away. No, I don't think so. For me, it's not about the talks. I go to talks, and they're interesting. But it's about the stuff that happens in between. It's about, you know, hanging out with you, hanging out with people from other places that I don't normally see, making those connections, and just stumbling onto things that are like, oh, I had no idea that this was happening, let's talk about that. That's the value, and that's really tough to replace. Yeah, it really is tough to replace. Hey, maybe more video chats will help. Absolutely. All right.
Well, I guess everyone be safe out there. Yeah. And follow our lead. Stay home, even if you've got to cut your own hair.

03:35 Well, let's move on to some topics. So first off, I was hoping to talk about pyproject.toml. I'm sure we've covered it before. But Brett Cannon, is he still part of the core? Did he get voted out? I think he's still part of the core head honchos. I think he's on the steering council. Yeah, yeah. He wrote an article, released recently, called "What the heck is pyproject.toml?" And one of the reasons why he wrote it, and I think there's an interesting side effect, is this. If you're just joining us, pyproject.toml came out of the efforts with PEP 517 and PEP 518 to define what the file looks like, and it kind of defines what tools are needed to build a project and how to build it. That used to be mostly just setuptools, but now there are lots of other tools like Flit and Poetry, and you can make your own if you want to. One of the side effects is that people started adding other tool configuration to this file, such as for coverage and tox. Now you can put those configurations into pyproject.toml for those tools, even though it has nothing to do with building. It just saves you from having an extra file, even though those tools have their own configuration files. And then Black came around

05:00 Black only has a few configuration options available. So instead of creating its own configuration file, Black uses pyproject.toml for looking up configuration, if you want to configure line length or something. And so there have been projects that added a pyproject.toml file just because they're using Black, and now their builds don't work. The reason is, if there's a pyproject.toml, even if you're using setuptools, pip will look for the build information within that file. So Brett has gone and shown the few lines you need to add to the TOML file to specify how to build with setuptools. That's the main contribution of this. But I also think people should go read the article, because it's a good summary of where we're at right now. Yeah, pyproject.toml is great. And it's super cool that you can specify things like, here's how you build the package, right, it requires wheel and whatnot. Yeah, it's cool. I really like it. I honestly need to start working with it. I'm still on the requirements.txt side of the fence, just waiting for things to, you know, shake out, but it certainly seems like there's a lot of energy here. Yeah, definitely. Yeah, so definitely a good pick there. You know, sometimes we talk about awesome things on Python Bytes, right? Yeah. And we've even talked about awesome lists, which are lists organizing various other things, projects, websites, tools, whatever, around some topic. But last time, in passing, we mentioned that there was a cool article by Jack McKew, where he blogged about the Awesome Python Bytes awesome list. Yeah. Right. Yeah. So now Jack has made this a proper GitHub repository, where it has a clear way to contribute to it. You can do PRs and all sorts of stuff. And he said he'll be adding personally to the repo whenever he hears about awesome things like this.
Maybe we'll add his own list to it, which would be very meta, right, a link back to itself. No, just kidding. But that's really awesome. And there are already five PRs from listeners accepted on this list. Oh, wow. That's great. That's cool. People seem to be liking it. After we announced it, we had other people commenting, and I saw some commentary that there were some things missing that he forgot about. Yeah, well, it's just the stuff that really stood out to Jack in the beginning, right? So if you want to go back and look through what some people thought was awesome out of what we talked about a year and a half ago, you can go there. And if you look at it, it comes with graphics, like all good things that present things. If you can have graphics, guess what? It's like 1000 words, isn't it?
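For readers following along, here is a rough sketch of the pyproject.toml fix discussed in this segment: a build-system table that tells pip to build with setuptools, next to a Black setting. The file content is embedded in a Python string and parsed with the standard-library tomllib (Python 3.11 or later) only to show its structure. The version pin and the line-length value are illustrative assumptions, not quotes from Brett's article, so check the article and the setuptools documentation before copying them.

```python
# A minimal pyproject.toml for a setuptools-based project that also configures Black.
# The version pin and the line-length value are illustrative assumptions.
PYPROJECT_TOML = """\
[build-system]
requires = ["setuptools>=40.8.0", "wheel"]
build-backend = "setuptools.build_meta"

[tool.black]
line-length = 100
"""

try:
    import tomllib  # standard library from Python 3.11 onward
except ModuleNotFoundError:
    tomllib = None

if tomllib is not None:
    # Parse the text the same way a tool would read the real file.
    config = tomllib.loads(PYPROJECT_TOML)
    print(config["build-system"]["build-backend"])
    print(config["tool"]["black"]["line-length"])
else:
    # Older interpreters: just show the file content.
    config = None
    print(PYPROJECT_TOML)
```

The point of the build-system table is that once the file exists, the build requirements are stated explicitly instead of being inferred, so a file added only for Black no longer leaves the build section empty.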

07:42 Oh, yeah. Isn't that cool? This is great. Yeah, it's really, really polished. And you just flip through and, like, we talked about D-Tale, the visualization for pandas DataFrames. And, you know, there are some nice examples and pictures there. Really, really good stuff. So I think it's a great way to explore some of the stuff that we talked about. A couple of fun projects I saw on there that I just remembered, oh yeah, we did talk about that, that was cool: Great Expectations, for validating, documenting, and profiling data. And pandas-vet, which is a flake8 plugin, a linter for opinionated pandas code. GeoAlchemy, which is like geospatial databases on top of SQLAlchemy. And vue.py, which provides in-browser Python runtimes interacting with Vue.js. So just, you know, all sorts of cool stuff there. And I guess related to this, although not technically anything to do with it, we have a super cool search over at pythonbytes.fm/search. And it's really fast. It even searches our spoken words. So if you want to find something that maybe we talked about a long time ago, or see if it's been covered, throw it in there and it'll pop up. That's really neat. And I love this because I'm flipping through this going, oh, I want to go back and check that out again. Exactly. Like, that was a year and a half ago, and we only spent, you know, half an hour playing with it before we talked about it, and then maybe we didn't use it again. But yeah, it was cool. So I legitimately enjoyed going through the list here, and I thought listeners might as well, if they heard it before, and even if they didn't. Yeah, yeah, that's awesome. Yeah. Well, I guess you're going to continue a theme a little bit, from pyproject.toml onwards. I've been playing with a project for a while, off and on, associated with testing code, called the cards project. And it's a little to-do app thing.
It's just a little sample thing, mostly to play with all of the stuff around Python and distribution and testing and all that. And somebody,

09:39 Oh, gosh, it was months ago, and I should have their name here, but somebody else contributed a pull request to add GitHub Actions. So GitHub Actions are a way to, you know, you can use them for CI/CD workflows with all sorts of stuff, but especially with Python projects, because you

10:00 know that the build pipeline is kind of short with Python. It's not as complicated as other things. So often it might be overkill to use another CI system. And so GitHub Actions are really pretty cool for that. And so I incorporated that. And then the last step I was looking into was, how do I publish to PyPI? I'd really like to add that. So I was looking it up, and the PyPA, the Python Packaging Authority, actually has an article written called "Publishing package distribution releases using GitHub Actions CI/CD workflows." And so I'm following through this. I actually ran into some hiccups, but I wanted to cover it anyway. My promise is that by the time this goes live, the cards project will actually be complete, because it's almost there. There are a few hiccups I've run into. PyPI requires you to have an email associated with the package, which is annoying, because Flit and pip don't anymore, but PyPI still does. So I had to do that. And there are a few other hiccups. If it's just these, I'll fix them and have it ready for people to look at. But if there are too many more, we'll have to write a new blog post and associate that with this as well. Yeah, this looks great. And I had no idea about the GitHub secrets. So one of the challenges is you have your personal PyPI account, where you are the maintainer or admin of that package on PyPI, and you might not want to commit that to the source code, especially for an open source project, right? I like that idea. Yeah, that's going to result in some bad things happening eventually. It talks about how to use the secrets settings in GitHub to store those there, and then how to pull those out as replacements in your GitHub Actions, which I had never done. That's pretty cool. Yeah.
And the workflow that they're suggesting, which I think is a great idea, is to go ahead and have every pull request or merge to your master branch, or to any branch, go through the entire thing and try to publish to the TestPyPI server. And only the ones where you change the tag, which is where you'd change your version number, would push to PyPI, the real one. But to have that workflow going through TestPyPI every time you push something, I think that's a great idea. Yeah, it is. But somehow you've got to change the version number, or it's not going to let you publish even to test, right? Yeah. So I haven't figured out how to get around that yet. So we'll see. Yeah, cool. One of the things I love is really great tutorials. The internet, I think, is working because of the great tutorials around. And one of the groups that has put out some awesome tutorials is DigitalOcean. So DigitalOcean is sponsoring this episode. And one of the things that they offer, of course, is a way to get started with Kubernetes clusters. But getting started with hosting and running Linux servers with Kubernetes clusters can be a little tricky. And that's why we want to highlight that DigitalOcean has launched their new Support Center. It makes it easier to find answers to your questions, get up to speed, and get what you need. And you can search across product documentation, community tutorials, and forums all in one place. And especially with something tricky like a Kubernetes cluster, or really even if you're new to any of this, their Support Center is awesome. It's got great tutorials. So visit Python, slash doc support, to see their tutorials, and of course use pythonbytes.fm/digitalocean for a $100 credit for new users. Yeah, absolutely.
Their tutorials are so good that you can even select different operating systems and different versions so that the steps line up exactly. If you're on, say, Ubuntu 16, you don't have to try to patch the tutorial steps to be different, which is taking it to the next level. Very good stuff. I heard from somebody once that they were actually using a different host and using DigitalOcean tutorials to help them set it up. And then they finally realized, why am I giving money to somebody else? Let's use DigitalOcean, because they help out. That's where our infrastructure is. Very good. Now this next one, Brian, this one is best seen through pictures. So make sure you open this up as we talk through it. Like I said, all good presentations, especially stuff that's memorable, have pictures, people. This is amazing. So this thing that I want to talk about is called Rich, rich text for terminals. We've talked a lot about how it's great to have GUIs and web apps and stuff like that, things that are very visual for presentation, which is still true. But a lot of times you just want something in a little terminal app, a command-line app. And it would be nice if it wasn't just all one plain color, or just, you know, all text left-aligned.

15:00 Because there's this cool project called Rich, and Rich lets you have up to 16.7 million colors for your terminal, not just like eight, or whatever it is, you know, red and light red. Like 16.8 million colors. It supports bold, dim, italics, underline, strikethrough, and even, please don't use it, the blink tag.

15:24 You can put the blink tag out there. You can have text that's left-aligned, centered, right-aligned, or justified. It supports, like, Chinese and Korean, and it has emojis, so you can put colon, apple, colon, and an apple shows up. You can, as part of a string, put little escapes. So you can say this word is bold magenta, and then here's the rest of the words, and just print that thing out. So you have inline styling. Tables, beautiful tables, like in the terminal. Those are really nice. Syntax highlighting for code, so you can print out Python code with line numbers, and it'll highlight it. And it even has Markdown support. So you could write Markdown and it renders it as its own version of rich text, not quite HTML or something like that, but it has, you know, bulleted lists and titles and whatnot. It even has progress bars and logger support and all kinds of stuff. Isn't that cool? Yeah, the logger handling is actually pretty great. Yeah. So if you're working on a terminal app, and you're like, we're just going to keep it this way, but I want it to look nicer, I want it to look a little more professional, this is kind of your all-in-one, do a bunch of cool stuff here. You can even have multiple progress bars all updating in parallel on the screen. So if you've got a bunch of downloads, you can indicate them all happening, or a bunch of jobs running, something like that. So yeah, this is a cool project. I might have to add this to cards. That'd be cool. That would be awesome. Yeah, it definitely would. It even has all of that support for Windows, if you use the new Windows console, or terminal, I guess is the word that they're now using, because console, I guess, is the old-school thing, and Microsoft's like, no, no, we all have terminals now. You probably don't have that on Windows yet. It's in preview, but you can get it from the Windows Store for free. And I link to that as well.
So if you're on Windows, and you want a better terminal in general that supports this kind of stuff, check that out too. Hmm. Interesting. Checking that out. Do you know if that'll roll into the normal releases of Windows at some point? I would hope so. Maybe it's still in beta; I guess we'll have to see. But it would be nice to be one step closer to parity across all these platforms, which, you know, is always good. Yeah, I just use the bash terminal; Git for Windows comes with a bash terminal. Right. Nice. Speaking of Windows, well, I guess it's any operating system. I'm actually kind of surprised, because this has been on our to-do list for a long time: a library called psutil. It's a cross-platform library for process and system monitoring in Python. And I'm actually surprised that we haven't really covered it. We must have covered it in passing a couple of times, probably. But I wanted to highlight it because it is an amazing little tool. It's not something I really love to have to use, but there are times, around controlling multiple computers or services running on a different machine, with Windows, and like I said, it's a cross-platform thing, where you can use this to grab CPU utilization, memory utilization, disk usage, network, and what ports are being used. And you can even see all of this information per process, so you can get a per-process list of CPU usage and everything within that. And then, around processes, you can suspend and kill and signal different processes. One of the things that we use this for is monitoring our build servers. I know there are other ways to do it, but it's a pretty easy way to go over to another server and grab the CPU usage and memory usage, so we can keep track and make sure all of our build servers are working and not overloaded. So we use this. It's a pretty cool little tool. Yeah, this looks really great.
And I think we also talked about it last week, briefly, when we covered pytest-monitor, and that was built on top of psutil, amongst other things. We didn't cover psutil itself, but it's that kind of tool that this enables, right? Yeah, pytest-monitor used that also. I was like, man, I think we covered this recently. But yeah, it was last week we covered that. By itself, it's still a really great thing to use. And the README is huge, but it shows you a lot of different examples for how to use it. Yeah, it looks super useful if you're going to do any automation and sort of admin-portal work. That's great. One of the great things that I love is the cross-platform part, because this sort of stuff you can do directly with Windows and with Unix, but, you know, it's different on everything. And so I'm pretty impressed that it can even exist as a cross-platform thing, but it can help.
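To make the inline styling and tables described above a little more concrete, here is a small sketch using Rich's Console and Table classes and its square-bracket markup. The `styled` helper, the `demo` function, and the table contents are hypothetical names and data made up for this illustration; only `Console`, `Table`, and the markup syntax come from Rich itself. Rich is a third-party package (`pip install rich`), so the import sits inside `demo()`.

```python
def styled(text: str, style: str) -> str:
    """Wrap text in Rich console markup, e.g. [bold magenta]...[/bold magenta]."""
    return f"[{style}]{text}[/{style}]"


def demo() -> None:
    """Print a styled line and a small table to the terminal."""
    # Imported here so the helper above can be used without Rich installed.
    from rich.console import Console
    from rich.table import Table

    console = Console()

    # Inline styling: only the wrapped words are rendered bold magenta.
    console.print("This word is " + styled("bold magenta", "bold magenta") + ", the rest is plain.")

    # A table rendered with box-drawing characters; the rows are made-up data.
    table = Table(title="Downloads")
    table.add_column("File")
    table.add_column("Size", justify="right")
    table.add_row("report.pdf", "1.2 MB")
    table.add_row("data.csv", "340 KB")
    console.print(table)
```

Calling `demo()` in a terminal that supports color should show the styled sentence followed by the table. On a terminal without color support, Rich is designed to degrade to plain text.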
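As a hedged sketch of the kind of build-server check described above, the snippet below gathers CPU, memory, disk, and per-process numbers with psutil. The function names, the 90 percent limit, and the choice of the five largest processes are assumptions for illustration; the transcript does not describe how the hosts' actual monitoring is written or how the numbers are collected from each server. psutil is a third-party package (`pip install psutil`), so it is imported inside `snapshot()`.

```python
def is_overloaded(cpu_percent: float, mem_percent: float, limit: float = 90.0) -> bool:
    """Return True when either CPU or memory usage is at or above the limit."""
    return cpu_percent >= limit or mem_percent >= limit


def snapshot() -> dict:
    """Collect a few system-wide and per-process numbers for the local machine."""
    import psutil  # imported lazily so is_overloaded() works without psutil

    processes = [
        p.info for p in psutil.process_iter(["pid", "name", "memory_percent"])
    ]
    # memory_percent can be None when access to a process is denied.
    processes.sort(key=lambda info: info["memory_percent"] or 0.0, reverse=True)

    return {
        "cpu_percent": psutil.cpu_percent(interval=1),   # sampled over one second
        "mem_percent": psutil.virtual_memory().percent,
        "disk_percent": psutil.disk_usage("/").percent,  # adjust the path as needed
        "top_processes": processes[:5],
    }
```

A script on each build server could call `snapshot()` and pass the CPU and memory values to `is_overloaded()`; how the result is reported back to a central place is left open here.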

20:00 No kidding. Apparently it can, and it's awesome. Yeah, definitely, definitely a good one too. Let me ask you this question. If I am going to store some numbers, let's say up to 100,000, how big of an integer do I need to make? In Python, I don't have to make an integer.

20:16 Or they're just there. It's beautiful, right? But I know that you've done a lot of C and C++, and I have as well back in the day, and you used to have to really think about that, right? Like, if you saw negative 32,000 for a number that you thought you were adding to, you're like, oh, it was a short and it overflowed. Darn. Well, I guess maybe it should be a ushort, an unsigned short. It could be. It could also be 65,000. Right? This is something you always had to think about, and in Python, you don't. And how that happens is pretty interesting. I didn't really know the internals. I kind of guessed maybe something like what's happening was happening, but I didn't know. And so there's a cool article by Arpit Bhayani, and he wrote something called "How Python implements super long integers." So for example, if you tried to take two and raise it to the power of 20,000 in C, you would get infinity.
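The contrast described here can be reproduced from Python itself. This sketch uses the standard-library ctypes fixed-width types, which do no overflow checking, to imitate a C short wrapping around, and then shows that a plain Python int keeps growing. The exponent 20000 is chosen because 2**20000 has 6,021 decimal digits, the figure quoted in this episode. One difference from C is worth noting: converting the value to a float raises OverflowError in Python rather than producing infinity.

```python
import ctypes
import math

# ctypes integer types wrap silently, like fixed-width integers in C.
wrapped_short = ctypes.c_int16(32767 + 1).value     # signed 16-bit
wrapped_ushort = ctypes.c_uint16(65535 + 1).value   # unsigned 16-bit

# A Python int has no fixed width, so this is just a (large) exact integer.
big = 2 ** 20000
decimal_digits = math.floor(20000 * math.log10(2)) + 1

# A C double cannot represent the value; Python refuses instead of returning inf.
try:
    as_float = float(big)
except OverflowError:
    as_float = None

print(wrapped_short, wrapped_ushort)
print(big.bit_length(), decimal_digits, as_float)
```

The digit count is computed with a logarithm rather than `len(str(big))` because recent Python versions limit int-to-string conversion to 4,300 digits by default.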

21:08 Right, because it's bigger than we can hold, so it must be infinity as far as we're concerned. But in Python, it's fine. It just gives you a 6,021-digit integer. And you don't have to declare, like, I'm working with really big numbers or anything like that. And so Python is pretty cool in how it's transparent about this, right? Yeah, yeah. So this article digs into the CPython source code. It talks about the algorithms and the data structures that make this happen. So basically, the numbers in Python are represented as what's called a PyVarObject. So PyObject, that's the core type of things in Python, but this is a variable-length one, right? And there are a couple of different types that are like this. We've got lists, we've got tuples, and we also have numbers. And that indicates that they can be of different sizes, and they can basically grow as they need to. Right. So Python's numbers ultimately are represented by this thing called a long object. It has a PyObject base, but then it also has a size and an array of digits. And these digits are, I think, four or eight bytes long. I don't remember; it doesn't say here. It's a macro that would expand to a type like that. But basically it uses a list of digits. And, you know, initially it just uses one of those, but then when it gets full, it adds another and another and another. What's interesting is they're base 2 to the 30. Not 10. Not 16. 2 to the 30. That's weird. Okay, yeah, because apparently it can most efficiently use exactly that space of its four bytes, or whatever, as individual elements in that base. So it's pretty interesting. Like, there's this ginormous number, and if you were trying to represent it in base 2 to the 30, it's 100... I'm not going to read it off, because it's really long. But it's pretty interesting how it uses this.
But then when you get into operations, right, if I'm going to add two numbers and they're base 2 to the 30, what algorithm do you use? It's not like base 10, where you normally do the thing, or division or something. So that's also interesting to think about. If you look at addition, it's pretty straightforward. You just add the digits, and if you overflow 2 to the 30, you do a carry, like you learned in elementary school. Subtraction is like that, but you do the borrow, so it's the reverse. But then multiplication, in order to keep things efficient, uses an algorithm called the Karatsuba algorithm, which is an interesting way to multiply two n-digit numbers in different bases and stuff. And yeah, so if you've ever wondered how come you don't have to worry about numbers overflowing in Python, here's a cool look inside the CPython source code and some of the algorithms. That's amazing. Yeah, actually, that's pretty cool. It's pretty cool. I'm so glad I don't have to worry about it. Right. It's just
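The digit layout described above can be imitated in pure Python. This is a simplified sketch, not CPython's C code: it splits a non-negative integer into base 2**30 "digits", least significant first, and rebuilds it. The names `to_digits` and `from_digits` are invented for this example, and the 30-bit digit size is an assumption that holds on typical 64-bit builds; `sys.int_info.bits_per_digit` reports what the running interpreter actually uses.

```python
import sys

SHIFT = 30          # bits per digit assumed here; compare sys.int_info.bits_per_digit
BASE = 1 << SHIFT   # 2**30
MASK = BASE - 1


def to_digits(n: int) -> list:
    """Split a non-negative int into base-2**30 digits, least significant first."""
    if n < 0:
        raise ValueError("this sketch only handles non-negative integers")
    digits = [n & MASK]
    n >>= SHIFT
    while n:
        digits.append(n & MASK)
        n >>= SHIFT
    return digits


def from_digits(digits: list) -> int:
    """Rebuild the integer as the sum of digit * BASE**position."""
    return sum(d << (SHIFT * i) for i, d in enumerate(digits))


print(sys.int_info.bits_per_digit)   # digit size of the running interpreter
print(to_digits(2**30))              # the smallest value that needs a second digit
print(len(to_digits(2**20000)))      # number of 30-bit digits behind 2**20000
```

With 30-bit digits, 2**20000 needs 667 of them, which is why the object simply gets a longer digit array instead of overflowing.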
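Both ideas from this passage, addition with a carry and Karatsuba multiplication, fit in a short sketch. This is an illustration of the techniques under simplifying assumptions, not a port of CPython's implementation: `add_digits` works on little-endian lists of base 2**30 digits, while `karatsuba` splits Python ints by bits and falls back to the built-in `*` for small inputs. The function names and the 64-bit cutoff are choices made for this example. CPython itself only switches to Karatsuba for sufficiently large operands.

```python
SHIFT = 30
BASE = 1 << SHIFT
MASK = BASE - 1


def add_digits(a: list, b: list) -> list:
    """Add two little-endian base-2**30 digit lists, carrying like schoolbook addition."""
    if len(a) < len(b):
        a, b = b, a
    result = []
    carry = 0
    for i, digit in enumerate(a):
        total = digit + (b[i] if i < len(b) else 0) + carry
        result.append(total & MASK)   # the part that fits in one digit
        carry = total >> SHIFT        # 0 or 1, carried into the next position
    if carry:
        result.append(carry)
    return result


def karatsuba(x: int, y: int, cutoff_bits: int = 64) -> int:
    """Multiply non-negative ints using three recursive products instead of four."""
    if x.bit_length() <= cutoff_bits or y.bit_length() <= cutoff_bits:
        return x * y  # small enough: use ordinary multiplication
    half = max(x.bit_length(), y.bit_length()) // 2
    low_mask = (1 << half) - 1
    x_hi, x_lo = x >> half, x & low_mask
    y_hi, y_lo = y >> half, y & low_mask
    hi = karatsuba(x_hi, y_hi, cutoff_bits)
    lo = karatsuba(x_lo, y_lo, cutoff_bits)
    # (x_hi + x_lo)(y_hi + y_lo) - hi - lo equals x_hi*y_lo + x_lo*y_hi.
    mid = karatsuba(x_hi + x_lo, y_hi + y_lo, cutoff_bits) - hi - lo
    return (hi << (2 * half)) + (mid << half) + lo


print(add_digits([MASK], [1]))
print(karatsuba(3**500, 7**400) == 3**500 * 7**400)
```

The saving comes from the `mid` line: the two cross products are recovered from a single extra multiplication, which is what brings the cost below the schoolbook method for large numbers.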

21:08 one of the definite benefits of Python is this notion of, you know, developer time is way more valuable than computer time. So let's figure all this stuff out for everybody else, and then we can stop thinking about it so much. Yeah, for sure. I had a project many years ago where I had to work with an FPGA. And the clock system was such that, to set the current time, I had to divide that out for multi-radix numbers. So it wasn't like base 10 or base 2 to the 30. Each digit had its own base. Oh my goodness. That's crazy. It was kind of a beast. It was a cool algorithm, and it was a fun thing. Yeah, I love this sort of stuff. Yeah, this is really cool. Cool, a peek behind the curtain, a little bit of the red pill.

21:08 You go down inside, see what's happening. One final comment here is there are some funny little tricks you can play on people. You can, like, create the number 10 and then create the number 10

21:08 somewhere else and ask if those are the same number, like with the word "is". It's true for small numbers, but it's false for large numbers. And that's because Python preallocates the numbers negative five to 256. Oh, wow. Okay. When you have 100 in your program, it's the same 100 everywhere. But if you have 1,226,411, that was made on the spot, because these are pointers, these are not just like four bytes on the stack. These are getting allocated in complicated ways each time. And so they said, look, negative five to 256, we use these all the time, let's just make them when Python starts. Why minus five? Well, because who uses minus six? Come on.
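The "is" trick can be tried with a few lines. This sketch parses each number from a string at runtime, because two identical literals in one piece of code may be merged by the compiler and would hide the effect. The `same_object` helper is a name invented for this example, and the behavior shown is a CPython implementation detail rather than a language guarantee, so other interpreters may answer differently.

```python
import platform


def same_object(text: str) -> bool:
    """Parse the same integer twice and report whether both results are one object."""
    first = int(text)
    second = int(text)
    return first is second


print(platform.python_implementation())
for text in ("-5", "100", "256", "257", "1226411"):
    print(text, same_object(text))
```

On CPython, the values from -5 through 256 should report True because they come from the preallocated cache, while 257 and 1226411 should report False because each call to `int` builds a fresh object. This is also why integers should be compared with `==` and not `is`.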

21:08 I have no idea. I can see why zero to 256, but beyond that, I don't really know. They must have some reason. Yeah, I don't know why. It probably started with negative one. Somebody said, well, let's do minus two also. Oh, let's go hog wild and go all the way down to minus five. Exactly. Exactly. Anyway, hey, that's our six. Yeah. Any news for us, or any extras? I have two really quick things. You know how Microsoft bought GitHub? That was kind of interesting news a while ago. People were contemplating the effects that would have on open source and all. Yeah. Well, npm, the pip equivalent for JavaScript, was acquired by GitHub as well. So just interesting moving and shaking over there behind the scenes. That's really interesting. Unlike PyPI and pip, npm was like a commercial venture, or something to that effect, to try to bring order to the JavaScript side. And so since it was commercial, it's been acquired. So npm is the JavaScript thing? npm is how JavaScript spells pip.

21:08 Yeah.

21:08 Okay. Okay, quick. Other one is, we're going to try to set up a YouTube channel where people can see both of us talking, the all the silly stuff that we do in a fairly uncut, unedited way. But we're going to try to put a video for each topic. So we just talked about the Python number thing, like just a single video on that it'd be easy to share with friends. It'll be something we put up there. So we'll have more details with you soon on the YouTube channel. But I'm looking forward to I think people, it'll give people a new look, I think it'd be fun. Yeah. And then people will be able to recognize our faces when we are not walking around, because everybody should stay home. Exactly. Assuming that we continue to just give ourselves haircuts. So we look the same.

21:08 Yeah, yeah. All right. Anything you want to share with folks? Nope, nope. Just working from home. All right. Well, I had this joke about C and numbers overflowing, like, oh, why is it negative $32,000? Because it's 37,000 positive, or something like that. Right? Well, here's another one. And this one is just to kind of make you feel good about yourself as a Python developer, right? Yeah. This one's visual, as a lot of them seem to be lately, but we'll do our best to describe it. Right, I'm going to let you be the developer, so you do the first three. You've got to give a little description, though. Okay, so the dude, it's 8:30 in the morning, is staring at the screen, and the comment is: stupid bug. And then seven hours later, and this guy apparently grows facial hair really fast, because he's already got stubble at seven hours: Oh, it must be Linux. And then the next day, his face is red and he's got even more hair, and he says: JavaScript's broken. Okay. Oh, yeah, Bob. Yeah, Bob comes in, or the other guy comes in, looks, and says: Oh, hey, Bob, looks like you forgot a semicolon. Ah, fixed.

21:08 Oh, man. Yeah. So that'll be in the show notes. You could check out the little comic. It's fun. by Eric Burke. Nice fun. Yeah. Don't miss semicolons. Now. Oh, there's a lot of things I don't miss.

21:08 Although I'm doing a lot more C now, C++. I've got to do it again. Well, it just makes you appreciate when you get to do Python. Yeah, sure does. So, appreciate my time with you. Because this, yeah, is special. Absolutely. Thank you. It's great to get together and chat about this and share with everyone. Thanks. Thank you. Yep. Bye. Bye. Thank you for listening to Python Bytes. Follow the show on Twitter at @pythonbytes. That's Python Bytes as in B-Y-T-E-S. And get the full show notes at pythonbytes.fm. If you have a news item you want featured, just visit pythonbytes.fm and send it our way. We're always on the lookout for sharing something cool. This is Brian Okken, and on behalf of myself and Michael Kennedy, thank you for listening and sharing this podcast with your friends and colleagues.
