Transcript for Episode #45:
A really small web API and OS-level machine learning
00:00 KENNEDY: Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds. This is episode #45, recorded September 27th, 2017. I’m Michael Kennedy.
00:00 OKKEN: And I’m Brian Okken.
00:00 And as usual we’ve got a bunch of cool news items lined up for you. So, Brian, let’s first say thanks to Rollbar.
00:00 Thanks, Rollbar.
00:00 Yeah, thanks for sponsoring this episode. We’ll tell you more about Rollbar later, if you don’t know about them.
00:00 Let’s start with something super small. I don’t want to start with anything big.
00:00 Ohh, wow. I missed that. That’s cool.
00:00 Yeah, isn’t that cool. So, instead of having to define the REST call and actually just do that, like with direct Ajax calls or whatever framework you’re using and how you do that. If you have something that has a hello function and takes a string, you can create one of these clients and say .hello and pass it a string. Then it gives you like, a promise, which is really cool; I think that’s kind of unique.
00:00 Yeah, it’s a simple little boilerplate. I was looking at this because if I had some services at work trying to pull out some database objects, I think non-developers could maintain it fairly well. Probably not non-developers but not web developers.
00:00 Right. I think it’s pretty interesting. I hadn’t heard of it so I don’t know how durable it is and how good it is for building a rich application that has lots of requirements, but it looks pretty cool to me. It’s definitely worth checking out. And it’s small and easy to get started with so that’s always nice. There’s not a lot of mental overhead to use the thing, right?
00:00 Yeah, and the link we have – and thanks, Ivan for this also – is to the exact part of the Lightning Talk, so it’s just a few minutes of one of the maintainers talking about this project and it’s a really good overview.
00:00 That’s cool. You always hear, ‘Why did I build it? Yes, I know Django, REST framework and other things exist. I still built it.’ Things like that.
00:00 Yeah. It’s specifically not REST-compliant but for a lot of cases you don’t really care.
00:00 Yeah, it’s interesting. It’s more like traditional XML web service, proxy-type/looking thing. Anyway, it’s cool. Definitely check that out if that sounds interesting to you.
00:00 I have a question for you and everyone. Brian, have you installed MacOS High Sierra?
00:00 It came out yesterday. I already installed it. It was a bit of a risk but I had stuff backed up, so why not give it a shot.
00:00 And we’re talking today, so apparently it went okay.
00:00 We’re talking on the same computer I installed it on, so it went okay. It was all pretty smooth and seamless so, super excited to have a new MacOS. But this one is more like a foundational release, so there’s a lot of underlying systems and things that have changed to make it able to build more cool stuff. So, one of the popular things people talk about is APFS, an Apple file system that is like a modern, built-in, 2017-type file system. Not like a 30-year-old file system. So, really cool stuff like that.
00:00 The reason we’re talking about it on this show is that it comes with something that I think is kind of a big deal. It comes with something called CoreML. That’s Core, like all the systems, like CoreStorage. CoreML is Core Machine Learning. So, the latest Apple operating system may be the first major OS to come with built-in machine learning.
00:00 Is that crazy, or what? Well, CoreML is a set of APIs that you can use and basically, it packages up a lot of the stuff that they’re already doing, like Photos, where Photos can identify people. So, you can say, ‘Show me all the pictures of Brian.’ And it would just find those magically in all my photos. Siri, text to speech, all those types of things. They want to make it possible for you to use some of those. Basically, CoreML comes with pre-built machine learning models. So, you can create your own models and package them up with your app and send it on. You can train it to do whatever. They even offer some default ones. It’s pretty cool.
00:00 The thing that’s pretty sweet about it is it will use – on any of the Macs from 2012 or later – a mix of CPU processing and GPGPU processing, depending on the task. It will just figure that out for you.
00:00 I’m guessing that makes it slicker.
00:00 Well, how many cores does your Mac have?
00:00 I have no idea.
00:00 Probably four with hyper threading. Probably. It’s either two or four, plus hyper threading, which would double that. Some of the GPGPUs have thousands of cores. Thousands.
00:00 If you want to do something in parallel, which a lot of machine learning is, whether you have eight cores or 2,000 cores is a big difference. So, it’s really cool that that’s built-in. I think this might be the first major OS to come with machine learning built-in. It’s just a sign of the times.
00:00 You’ve probably got to log your code and figure out what’s happening when your machine learning models don’t do what you want, right?
00:00 I don’t have a list, but we’ve covered several simple logging modules on the show so far. But right now we’re just talking about plain old logging, the built-in logging library. I think it’s just logging, right?
00:00 Yep, it’s the logger.
00:00 Yeah, the logger. Import logger.
00:00 The reason why I haven’t really used it too much before, to be honest, is I’ve had trouble getting my head wrapped around all the little pieces. It’s a fairly complex module, and for good reason, it does a lot. I’m pointing to a blog article called, “A Guide to Logging in Python” and it walks through using logging very simply and then changing the mental model to include all the different pieces like logging file handlers and memory handlers and filters and all that stuff. It’s the first time I’ve read about logging from start to finish. I wasn’t lost the entire time, so it’s a good introduction.
00:00 Yeah, it’s cool. And it talks about why not just do print. There’s all sorts of things like multithreading support, categorization and different logging levels, rotating files, all kinds of stuff better than just print. This is really cool. I do feel like there’s a lot of configuration and stuff in the built-in logging module that kind of tries to do everything, so it can make it tricky. This is nice.
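A minimal sketch of the pieces mentioned here (a logger, a handler, a formatter and levels), using only the standard library `logging` module. The logger name `demo.app`, the helper `build_logger` and the messages are made up for illustration; they are not taken from the article.

```python
import io
import logging


def build_logger(stream):
    """Return a logger that writes WARNING-and-above records to `stream`."""
    logger = logging.getLogger("demo.app")  # hypothetical logger name
    logger.setLevel(logging.DEBUG)          # the logger itself accepts everything
    logger.propagate = False                # keep records away from the root logger
    logger.handlers.clear()                 # avoid duplicate handlers on re-runs

    handler = logging.StreamHandler(stream)
    handler.setLevel(logging.WARNING)       # the handler filters by level
    handler.setFormatter(
        logging.Formatter("%(levelname)s %(name)s: %(message)s")
    )
    logger.addHandler(handler)
    return logger


buffer = io.StringIO()
log = build_logger(buffer)
log.info("starting up")          # below the handler's level, so dropped
log.warning("disk almost full")  # passes the handler's level
output = buffer.getvalue()
```

Swapping `StreamHandler` for a file or memory handler changes where records go without touching the code that calls `log.warning(...)`, which is most of what print cannot do.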
00:00 There’s some things that it does that I didn’t even know it did, like automatic file rotation built-in. That’s cool.
00:00 Yeah, that’s really nice.
00:00 If you’re trying to figure out the built-in logging modules, check this one out.
00:00 I can tell you that a time-based rotating file is important when your website generates gigabytes of log files. You don’t want that to be one file. (Laughs)
00:00 Speaking of websites, it kind of sucks when your websites crash for your user. They don’t like it, but they might not tell you; they might just go away. So, that’s why you want to get Rollbar. Like we said, Rollbar’s supporting the show. Visit them at pythonbytes.fm/Rollbar. And you can install it in just a few minutes. Pip install Rollbar, a few lines of configuration and all the errors in your website are captured with tons of detail. Things like local parameters, arguments passed and methods when it crashed. All that kind of stuff. Notifications, Slack, email, whatever. It’s beautiful. So, definitely install that if you’re running a web app based on Python.
00:00 Speaking of web apps, you might care about memory, right? A lot of times one of the things that puts a lot of pressure on your web apps is not the CPU, but is actually memory. I’d say that’s certainly true for my web apps. It seems like memory is more of a pressure than CPU by quite a bit.
00:00 One of the things that I thought was interesting is somebody wrote an article called, “Let Me Introduce: Slots.” Slots are an alternative backing store for class data, I guess is maybe the simplest way to put it. Have you played with these, Brian?
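A hedged sketch of time-based rotation with the standard library's `logging.handlers.TimedRotatingFileHandler`. The logger name `demo.web`, the file name `site.log`, the helper `build_rotating_logger` and the choice of seven backups are assumptions for illustration only.

```python
import logging
import logging.handlers
import os
import tempfile


def build_rotating_logger(log_path):
    """Return (logger, handler) writing to log_path and rotating at midnight."""
    logger = logging.getLogger("demo.web")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.handlers.clear()

    # Start a new file each midnight and keep the seven most recent old files.
    handler = logging.handlers.TimedRotatingFileHandler(
        log_path, when="midnight", backupCount=7
    )
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    )
    logger.addHandler(handler)
    return logger, handler


log_dir = tempfile.mkdtemp()
log_path = os.path.join(log_dir, "site.log")
web_log, file_handler = build_rotating_logger(log_path)
web_log.info("request handled")

# Detach and close the handler so the file is flushed before reading it back.
web_log.removeHandler(file_handler)
file_handler.close()
with open(log_path) as log_file:
    contents = log_file.read()
```

`RotatingFileHandler` in the same module rotates by file size instead of by time, which may suit a site whose traffic is very uneven.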
00:00 This is really crazy. When you create a regular Python class, and you implement a dunder init, __init__, and you say self.name = something passed in and self.email = some email address passed in, that goes into __dict__, right? Each instance of that class has a dictionary that has the keys name and email and then the two values that you passed in. And every instance of the class gets a separate instance of the dictionary. They’re one to one. That makes it super easy to do lookups, for one, and it’s super easy to make it dynamic: if you just interact with the class and you try to add new stuff to it, it just goes into that dictionary, so that’s cool. But what’s not so cool is if I have ten million instances of that class, I have ten million copies of that dictionary, which has ten million strings that each say email, and another ten million strings that say name. Why do I need to store those? I probably don’t, right? If I’m really not going to be dynamic, I probably don’t. So, you can use this thing called __slots__ and you would say, ‘The slots of this class are name and email.’ And then that slot is stored on the type, not the instance. Instead of having ten million names and ten million emails in terms of the field name, you just have the two, and otherwise they’re stored like in an array, a positional thing.
00:00 It’s super good for performance. The test they did in this article, 57% less memory usage just by adding that one line. And it’s a little bit faster for access but it’s definitely better on memory.
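A small sketch of the two layouts being described. The class names `RegularContact` and `SlottedContact` and the sample values are made up for illustration; only `__init__`, `__dict__` and `__slots__` come from the discussion.

```python
class RegularContact:
    """Ordinary class: each instance carries its own __dict__."""

    def __init__(self, name, email):
        self.name = name
        self.email = email


class SlottedContact:
    """Same fields, but declared once on the type via __slots__."""

    __slots__ = ("name", "email")

    def __init__(self, name, email):
        self.name = name
        self.email = email


regular = RegularContact("Brian", "brian@example.com")
slotted = SlottedContact("Brian", "brian@example.com")

# The regular instance stores its fields in a per-instance dictionary.
regular_fields = regular.__dict__

# The slotted instance has no per-instance dictionary at all.
has_instance_dict = hasattr(slotted, "__dict__")
```

Attribute access looks identical from the outside (`slotted.name`, `regular.name`); only the storage behind it differs.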
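One way to check a claim like this on your own machine is the standard library's `tracemalloc`. This is a rough sketch, not the article's benchmark: the class names, the helper `allocated_bytes` and the instance count are assumptions, and the exact saving depends on the Python version and the number of fields.

```python
import tracemalloc


class RegularContact:
    def __init__(self, name, email):
        self.name = name
        self.email = email


class SlottedContact:
    __slots__ = ("name", "email")

    def __init__(self, name, email):
        self.name = name
        self.email = email


def allocated_bytes(cls, count=50_000):
    """Return the bytes allocated while `count` instances of cls are alive."""
    tracemalloc.start()
    before, _peak = tracemalloc.get_traced_memory()
    # The same two string objects are shared by every instance, so the
    # difference measured is the per-instance overhead, not the field values.
    instances = [cls("Brian", "brian@example.com") for _ in range(count)]
    after, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before


regular_bytes = allocated_bytes(RegularContact)
slotted_bytes = allocated_bytes(SlottedContact)
saving = 1 - slotted_bytes / regular_bytes  # fraction of memory saved
```

Printing `saving` gives a figure to compare against the number quoted in the article, with no guarantee it will match.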
00:00 Can you use both?
00:00 No. Well, you could still do the self.whatever and assign to it, but basically if you try to assign to something that’s not declared in the slots, it will say it doesn’t have that property; it wasn’t pre-allocated in the type, basically pre-defined in the type.
00:00 It’s pretty cool. I actually go into this in-depth in my “Write Pythonic Code” course and you’ll see that this is even better in terms of memory than a namedtuple. You wouldn’t think you could do better than a namedtuple for space, but this is actually even better. And you get all the type class inheritance behavior that you would expect. Very cool.
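A short sketch of that behavior, with a hypothetical `SlottedContact` class and an undeclared `phone` attribute invented for the example: assigning to a declared slot works, assigning to anything else raises `AttributeError`.

```python
class SlottedContact:
    __slots__ = ("name", "email")

    def __init__(self, name, email):
        self.name = name
        self.email = email


contact = SlottedContact("Brian", "brian@example.com")

# Declared in __slots__, so reassignment works as usual.
contact.email = "brian@example.org"

# Not declared in __slots__, so there is nowhere to store it.
try:
    contact.phone = "555-0100"
except AttributeError:
    rejected = True
else:
    rejected = False
```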
00:00 Seems like more of the mental model of classes I have in the first place.
00:00 Yeah, for sure. It’s very much like C++, C#, traditional, ‘these are the things that are in here and they never change.’ Static language-type of thought.
00:00 Wow, I’m definitely going to have to go and rewatch your “Seasoned Developer” course and do these again.
00:00 Yeah, it’s pretty cool. It’s super easy. You shouldn’t use it all the time but when it makes sense, it can save you tons of memory.
00:00 Well, that’s cool. A long time ago in Episode #11, we covered Pipenv, from Kenneth Reitz.
00:00 Maybe that was one of the ten things he did that week.
00:00 Yeah, he’s been doing a lot.
00:00 The first time I took a look at that, I gave it an honest college try and to be honest, I didn’t know why I needed it.
00:00 You were like, ‘I already got this covered.’
00:00 Yeah, I’ve got this covered. But one of the things that changed my mind is not too long ago, he put up a video. So, if you go to docs.pipenv.org, there’s a four minute screencast of him just using it. That video got me convinced. I’m like, ‘Oh, wow. This is really a lot easier than I’ve done before.’ And actually, I’ve been doing a lot more virtual environments than I used to and I kind of lose track of which ones are where, so this helps. Pipenv, if you haven’t listened to Episode #11, is something that deals with your virtual environments and pip and install and all that for you. It’s just a way of working. If you give it a try, you might like it.
00:00 The video is one thing that’s new that convinced me, but there’s also a bunch of other stuff that he’s done recently. He also included security checks, so our scare from last week of whether or not you’re going to install a problem package, this will look through with pipenv check. You can look through all your dependencies and make sure you don’t have any security vulnerabilities installed in any of them.
00:00 That’s awesome. That’s not like you have an old version of Django so it has a security vulnerability. That’s like, somebody called it, ‘Django without the D.’ (Laughs) That type of thing, right.
00:00 Yeah. What it’s had from the start is a lot of packages have these hash values to compare your actual install against what’s been published. Pipenv does deal with that and checks those, which is hard to do manually. And one of the things it does recently is it allows multiple package indexes so you could have PyPI of course, but also maybe a company index, group index and maybe even one for your project.
00:00 That’s really cool.
00:00 The features are piling up and he recently said that it’s nine months old but it’s had 192 releases. So, he’s not sleeping a lot, I don’t think.
00:00 No, probably not. (Laughs) That’s really cool. My favorite thing is pipenv lock -r will generate a requirements.txt file. That’s cool.
00:00 Right, and that’s actually the thing that turned me off the first time, and it’s because Pipenv uses a thing called Pipfile and Pipfile.lock, which I don’t really follow why I need those, but I know that sometimes I need a requirements file. You can use this and still get your requirements file.
00:00 Yeah, very cool.
00:00 The final thing I want to talk about is a little bit of a softer more squishy concept, not just an API or something. StackOverflow, they’re doing some interesting data science. I think they actually have full-time data scientists just mining these and generating reports and analysis on the industry. It’s pretty cool.
00:00 What I’m actually pointing to for this is not StackOverflow but an Ars Technica article, which is the follow-up to the kind of unfortunate article they did called, “Tabs and Spaces: Who Gets Paid More?” Or something like that. They made the claim that people who are uninformed use spaces and for some reason, they get paid more than people who use tabs. I don’t know why. That was something they found in a survey.
00:00 The reason why is because those are Python developers, whereas the other ones aren’t. So, that’s an interesting thing in and of itself. But this is like a follow-up to say, ‘Let’s look at not programming languages but just different locations.’ So, if you live in New York, versus you just live in a random place in the U.S., versus Germany or France. Basically, the U.S. versus Europe – well, the U.S. and Europe all compared against each other. So, it talks about these different places. If you’re in dev ops or a data scientist, you earn really well, probably using Python. Surprisingly, if you do graphics programming like OpenGL or something, you’re not paid very well even though that’s super hard to do. The reason is, I think and they sort of hinted at this as well, is you’re probably working at a game company and there’s a lot of young people working at game companies who just want to work on games. It doesn’t matter if they have to work 80 hours a week and get paid little for it, right?
00:00 So, that’s pretty tough. I’ve heard the game industry is a pretty hard place to work but that’s sort of one part of it that you don’t get paid tons. But the most surprising fact was that in the U.S., developer pay is significantly more than in Europe. It’s not like 10% more or something like that. It’s close to double. It's really quite a bit more. They say things like, ‘People in the U.S. have substantially higher median income, even regardless of experience.’ For example, a median salary of a developer in the US is comparable to somebody with 20-years experience in Canada or Germany and it is even quite higher than people in France or the UK for people with 20-years experience. Like a new – ‘Hey, I just graduated. What can I do now?’ – sort of job.
00:00 The comments are also super interesting because people are coming from all over the place and they’re thinking about, ‘Well, okay. Salary is not everything. There’s cost of living, there’s cost of healthcare, there’s social support, there’s a lot of stuff.’ This is partly interesting for the article, but also partly interesting for the way people are analyzing it.
00:00 Yeah, well actually it’s kind of nice to have some good news for being an American.
00:00 (Laughs) Yeah, it’s been a little sketchy lately. But hooray.
00:00 We’ve got the weirdest president and highest healthcare cost, but hey, we get paid a lot.
00:00 Yeah and healthcare actually makes a big part of the conversation. And like, ‘Hey, you guys pay so much more for healthcare but the salary doesn’t offset it.’ We don’t pay half our salary, yet.
00:00 It’s pretty interesting. If you’re thinking about this kind of stuff, here’s an article with a lot of data to back it.
00:00 That’s our news items. Brian, do you have anything else you want to share with the folks?
00:00 Oh, my gosh, you’re not doing anything, right? You’re just chilling now? The book is done and you’re just kicking back?
00:00 I think some people have already received the book, although I haven’t. I’m waiting for my box to show up this afternoon.
00:00 Oh, how exciting. I’ve seen a lot of Twitter messages, people posting that they’ve shipped and things like that. It’s great, congrats.
00:00 Thank you. How about you?
00:00 Not too much going on. Right now I’m working on a free MongoDB course and that is super close to done so, I’m hoping to have some announcements soon but I’m not there yet.
00:00 I’m going to try to start some new projects and not talk about the book so much in every episode, but I’d really love to hear from people when they get them and what they think. Go ahead and send me a shoutout on Twitter @brianokken and say, ‘Hey, I got your book.’ That’d be cool to hear from people.
00:00 Yeah, it’s really cool. People are excited about it. I’ve been watching from the sidelines.
00:00 Well, thanks for joining me for another one of these chats.
00:00 Thank you. Talk to you later.
00:00 You bet. Bye.
00:00 Thank you for listening to Python Bytes. Follow the show on Twitter via @pythonbytes and get the full show notes at pythonbytes.fm. If you have a news item you want featured, just visit pythonbytes.fm and send it our way. We’re always on the lookout for sharing something cool. On behalf of myself and Brian Okken, this is Michael Kennedy. Thank you for listening and sharing this podcast with your friends and colleagues.