#45: A really small web API and OS-level machine learning

Published Fri, Sep 29, 2017, recorded Wed, Sep 27, 2017

This episode is brought to you by Rollbar: pythonbytes.fm/rollbar

Brian #1: pico

"a very small web application framework for Python"
Recommended by Ivan Pejić
lightning talk from EuroPython 2017
This would be a good web framework for building internal services and tools that non-web developers need to interact with and modify.
Very simple.
Not REST, but not confusing either.
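
To make the "expose a function, get an endpoint" idea concrete, here is a minimal standard-library sketch of the pattern. It is not pico's implementation or API: the names `expose`, `call`, and `_EXPOSED` are hypothetical, and a real framework would add HTTP routing, error handling, and the generated JavaScript client on top.

```python
import json

# Hypothetical registry of exposed functions (illustration only, not pico internals).
_EXPOSED = {}


def expose(func):
    """Register a plain function so it can be called by name."""
    _EXPOSED[func.__name__] = func
    return func


@expose
def hello(who="world"):
    return "Hello " + who


def call(name, json_args="{}"):
    """Look up an exposed function, call it with JSON-encoded keyword
    arguments, and return the result encoded as JSON."""
    if name not in _EXPOSED:
        raise LookupError("no exposed function named " + repr(name))
    kwargs = json.loads(json_args)
    return json.dumps(_EXPOSED[name](**kwargs))


print(call("hello", '{"who": "Brian"}'))
```

The point of the pattern is that the function author writes ordinary Python and the framework handles the request and response encoding, which is why it suits people who are not web developers.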

Michael #2: High Sierra ships, first major OS with machine learning built in?

September 26th macOS High Sierra was released (yay)
Mostly a foundational release with barely visible changes but awesome things like APFS replacing HFS+, etc.
Comes with CoreML
- Apple’s intent with the new CoreML framework is to package up prebuilt ML models and execution engines and make them possible for third-party apps to use.
- Developers can take a trained machine learning algorithm, package it up as an MLModel, and integrate it into their apps.
- Apple offers a few default machine learning models that developers can download and use too
Rather than sharing your data with a central server, grouping it together with a lot of other people's data, and improving machine learning models that way, Apple stresses that everything CoreML does is happening on the device.
On Macs that support Metal—generally, Macs from 2012 and later—CoreML uses a mix of CPU processing and GPGPU processing, depending on the task.
Add on the fact that High Sierra has external GPU support now and you have a sweet combo.

Brian #3: A guide to logging in Python

Simply put, the best logging introduction I've read so far.
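
As a rough sketch of the pieces discussed on the show (a logger, a handler with its own level, a formatter, and time-based file rotation), something like the following works with the standard library alone. The logger name `demo_app`, the helper `build_logger`, and the temporary log path are assumptions for illustration and do not come from the article.

```python
import logging
import logging.handlers
import os
import tempfile


def build_logger(log_path):
    """Create a logger that writes WARNING and above to a file rotated at midnight."""
    logger = logging.getLogger("demo_app")  # hypothetical application name
    logger.setLevel(logging.DEBUG)          # the logger itself accepts everything
    logger.propagate = False                # keep this example's output out of the root logger

    formatter = logging.Formatter(
        "%(asctime)s %(levelname)s %(name)s: %(message)s"
    )

    # Start a new file at midnight and keep the seven most recent old files.
    file_handler = logging.handlers.TimedRotatingFileHandler(
        log_path, when="midnight", backupCount=7
    )
    file_handler.setLevel(logging.WARNING)  # the handler filters by its own level
    file_handler.setFormatter(formatter)
    logger.addHandler(file_handler)
    return logger


log_path = os.path.join(tempfile.mkdtemp(), "demo.log")
logger = build_logger(log_path)
logger.debug("not written: below the handler's WARNING threshold")
logger.warning("disk space is low")
```

Compared with print, the level check, timestamped format, and rotation all come from configuration rather than from code at each call site.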

Michael #4: Let me introduce: slots

So what are slots? __slots__ are a different way to define the attribute storage for classes in Python.
For normal Python classes, a dict is used to store the instance's attributes.
With __slots__ we don't have an attribute called __dict__ inside our instance. But we have a new attribute called __slots__.
But why would you need to use slots when you have a dict? Well the answer is that __slots__ are a lot lighter and slightly faster.
Outcome:
- ~57% less memory usage thanks to just one line of code.
- __slots__ are also slightly faster.
Covered in depth in my Write Pythonic Code Like a Seasoned Developer course.
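
A small sketch of the difference, assuming CPython. The class names `PlainUser` and `SlottedUser` are made up for this example, and the sizes it computes will vary by Python version, so it does not try to reproduce the article's 57% figure.

```python
import sys


class PlainUser:
    """Regular class: each instance carries its own __dict__."""

    def __init__(self, name, email):
        self.name = name
        self.email = email


class SlottedUser:
    """Same fields, but declared once on the class via __slots__."""

    __slots__ = ("name", "email")

    def __init__(self, name, email):
        self.name = name
        self.email = email


plain = PlainUser("Brian", "brian@example.com")
slotted = SlottedUser("Brian", "brian@example.com")

# The plain instance has a per-instance dict; the slotted one does not.
print(hasattr(plain, "__dict__"), hasattr(slotted, "__dict__"))

# Shallow sizes only: the instance itself plus its attribute dict, if any.
plain_size = sys.getsizeof(plain) + sys.getsizeof(plain.__dict__)
slotted_size = sys.getsizeof(slotted)
print(plain_size, slotted_size)

# Assigning an attribute that was not declared in __slots__ fails.
try:
    slotted.age = 42
except AttributeError as exc:
    print("AttributeError:", exc)
```

The trade-off is the one mentioned on the show: the slotted class gives up adding attributes dynamically in exchange for a smaller per-instance footprint.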

Brian #5: pipenv revisited

Covered in episode 11. However, there are some notable changes since then.
Reminder:
- pipenv handles virtualenv and pip interaction for you
- pipenv install creates a virtualenv (if one doesn't exist) and installs stuff into a virtualenv.
- pipenv shell uses the virtualenv
- exit allows you to get out of the virtualenv
- pipenv lock -r will generate a requirements.txt file for you, so you can use it even if you need a requirements.txt file.
Notable changes:
- New 4 minute screencast with Kenneth demonstrating how to use it. Watching him use it makes it very simple to understand.
- Specify multiple package indexes, and even specify a particular index for particular packages. So you can combine both pypi with a company index and a group index and maybe one for your project.
- pipenv check will tell you about any known security vulnerabilities in your installed packages
- 9 months old with 192 releases, so keep an eye on it for new features all the time.

Michael #6: Stack Overflow gives an even closer look at developer salaries

Tabs and spaces aren't the only things that influence developer pay…
Some of the broad trends are no big surprise; for example, the chosen cities tend to pay more than their respective nations do.
DevOps specialists and data scientists both earn well.
Other aspects of the data are a little more surprising. Graphics programmers, for example, aren't particularly well paid, in spite of having a relatively specialized, complex niche.
And while earnings in four of the countries are surprisingly similar, those in America are substantially higher, regardless of experience; in fact, the median salary of a developer in the US is comparable to that of someone with 20 years of experience in Canada or Germany and markedly higher than 20-year veterans in France and the UK. Even after taking into account the US' higher healthcare costs, America is the place to be if you're a programmer.
Comments
- I do have to wonder how much Silicon Valley skews that salary chart, as the Web 2.0 companies pay HUGE comparatively [ref]
- I asked Stack Overflow's data scientist that question, and she said not much; even without its outlier cities, the US pays much more than the rest of the world. [ref]
- Healthcare cost are only part of it. I got paid about $600/month 9 months a year by my government to study in university. [ref]
- I feel like a lot of IT people lack soft skills, and it caps their salary at a lower end. [ref]

Our news:

Hardcopies of Python Testing with pytest now shipping on Amazon, as well as Pragmatic.
- When you get your copy, let me know. Send a pic to @brianokken

Episode Transcript

00:00 Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds.

00:04 This is episode 45, recorded September 27th, 2017.

00:10 I'm Michael Kennedy.

00:11 And I'm Brian Okken.

00:12 And as usual, we got a bunch of cool news items lined up for you.

00:15 So, Brian, let's first say thanks to Rollbar.

00:18 Yeah, thanks, Rollbar.

00:19 Yeah, thanks for sponsoring this episode.

00:20 We'll tell you guys more about Rollbar if you don't know about them later.

00:22 But let's start with something super small.

00:25 Like, I don't want to start anything big.

00:26 This was recommended by a listener, Ivan.

00:29 I'm not going to try his last name.

00:31 But thanks, Ivan.

00:32 A little micro framework called Pico.

00:35 And there was a lightning talk given at EuroPython 2017.

00:41 And we have a link to it.

00:42 But this is just a very simple, very, well, I don't know how simple the code is.

00:48 I haven't looked.

00:48 But it's simple to use.

00:49 It's a little web framework that you can use for actual web pages.

00:53 It does have some CSS and JavaScript serving, I think.

00:58 But the main idea of it is a very simple, easy-to-use web framework for people that are not web developers.

01:06 So, let's say, I think it was developed in a scientific community.

01:09 So, people that can just hook up.

01:11 You really just hook up a web endpoint with a decorator that says pico.expose.

01:18 And you've got a function.

01:19 And there you've got a service, a web service you can use.

01:23 So, it's pretty amazingly simple.

01:26 Yeah, it is quite simple.

01:27 And one of the things that is unique about it, well, relatively unique, is that it comes with a JavaScript client that automatically generates a proxy for objects described in your API.

01:41 And that's pretty trick.

01:42 Oh, wow.

01:43 I missed that.

01:43 That's cool.

01:44 Yeah, isn't that cool?

01:45 So, instead of having to, like, define the REST call and then actually just do that, you know, like with direct AJAX calls, whatever framework you're using, how you do that.

01:54 If you have something that has, like, a hello function and takes a string, you can create one of these clients and just say .hello and pass it a string.

02:05 And then it gives you, like, a promise, which is really cool.

02:08 I think that's kind of unique.

02:08 Yeah.

02:09 It's one of the simplest, like, very little boilerplate you have to throw in some code.

02:14 I was looking at this because if I had some services at work trying to pull out some database objects, I think non-developers could maintain it fairly well.

02:23 I mean, probably not non-developers, but not web developers.

02:27 Right, right.

02:27 So, I think it's pretty interesting.

02:30 I actually haven't heard of it, so I don't know, like, how durable it is, how good it is for building rich applications.

02:35 That have lots of requirements, but it looks pretty cool to me.

02:38 It's definitely worth checking it out.

02:39 And it's small and easy to get started with, so that's always nice.

02:42 Like, there's not a lot of mental overhead to use the thing, right?

02:44 Yeah, and the link we have, which I thank Ivan for this also, is to the exact part of the Lightning Talk.

02:52 So, it's just, I don't know, a few minutes of one of the maintainers talking about this project, and it's a really good overview.

02:59 That's cool.

03:00 You always hear, like, the why did I build it?

03:02 Yes, I know Django REST framework and other things exist.

03:05 I still built it.

03:06 Things like that, right?

03:07 Yeah, and it's specifically not REST compliant, but for a lot of cases, you don't really care.

03:13 Yeah, it's interesting.

03:14 It's almost more like a traditional XML web service proxy type looking thing.

03:18 Anyway, pretty cool.

03:19 Definitely check that out if that sounds interesting to you.

03:21 So, I got a question for you and everyone.

03:25 Brian, have you installed macOS High Sierra?

03:27 No.

03:28 It came out yesterday.

03:29 I already installed it.

03:29 It was a bit of a risk, but I had stuff backed up, so why not give it a shot, right?

03:33 And we're talking today, so apparently it went okay.

03:35 And we're talking on the same computer I installed it on, so it all went okay, and it was all pretty smooth and seamless.

03:40 So, super excited to have a new macOS.

03:43 But this one is actually more like a foundational release.

03:46 So, there's a bunch of underlying systems and things that have been changed to make it able to build more cool stuff.

03:52 So, like one of the popular things people will talk about maybe is APFS, a new Apple file system that is like a modern, built-in-2017 type file system, not like, you know, a 30-year-old file system.

04:03 So, really, really cool type stuff like that.

04:05 But one thing, the reason we're talking about it on this show is it comes with something that I think is actually kind of a big deal.

04:10 It comes with something called Core ML.

04:12 So, that's Core is like all the systems like, you know, Core Storage, Core whatever, right?

04:18 Core ML is Core Machine Learning.

04:20 So, here with the latest Apple operating system, maybe the first major OS to come with like built-in machine learning.

04:27 Wow.

04:28 Is that crazy or what?

04:29 Well, Core ML is a set of APIs that you can use.

04:32 And basically, it packages up a lot of the stuff that they're already doing in things like photos where the photos can like identify, you know, people.

04:43 So, you can say, show me all the pictures of Brian.

04:45 And it would just like find those magically in all my photos, Siri, text-to-speech, all those types of things, right?

04:51 So, they want to make it possible for you to use some of those.

04:54 So, basically, with Core ML, it comes with pre-built machine learning models.

04:59 You can create your own models and then package it up with your app and send it on so you could train it to do whatever.

05:05 And they even offer some default ones.

05:07 It's pretty cool.

05:07 Yeah.

05:08 Yeah.

05:08 So, another thing that's pretty sweet about it is it will use basically on any of the Macs from 2012 or later,

05:15 it will use a mix of CPU processing and GPGPU processing depending on the task.

05:19 And it will just figure that out for you.

05:21 So, this whole, do I use them?

05:23 I'm guessing that makes it slicker.

05:25 Well, how many cores does your Mac have?

05:28 I have no idea.

05:28 Probably four with hyperthreading, right?

05:30 Probably.

05:31 So, it's either two or four plus hyperthreading, which would double that, right?

05:35 Some of the GPUs have like thousands of cores.

05:38 Thousands.

05:39 So, if you want to do something in parallel, which a lot of machine learning is, like if you have either eight cores or 2,000 cores, that's a big difference.

05:47 So, it's really cool that that's built in.

05:48 Yeah.

05:49 Anyway, so, I think this might be the first major OS to like come with machine learning built in.

05:52 It's just a sign of the times, right?

05:54 All right.

05:54 You probably got to log your code and figure out what's happening when your machine learning models don't do what you want, right?

05:59 I don't have a list, but we've covered several simple logging modules on the show so far.

06:05 But this, right now, we're just talking about plain old logging.

06:09 The built-in logging library.

06:12 Am I getting that right?

06:13 I think it's just logging.

06:14 Yep.

06:15 Just the logger.

06:16 Yep.

06:16 Logger.

06:17 Import logging.

06:18 Yeah.

06:18 The reason why I haven't really used it too much before, to be honest, is I have had trouble getting my head wrapped around all the little pieces.

06:26 And it's a fairly complex module.

06:29 And for good reason, it does a lot.

06:32 But this is the first.

06:34 I'm pointing to a blog article called A Guide to Logging in Python.

06:38 And it walks through using logging very simply and then adding on, changing our mental model to include all the different pieces like logging file handlers and memory handlers and filters and all that stuff.

06:54 And it's the first time I've read about logging where, from start to finish, I wasn't lost the entire time.

07:00 So it's a good introduction.

07:02 Yeah, it's cool.

07:03 And it talks about, like, why not just do print, right?

07:06 There's all sorts of things like multi-threading support, categorization and different logging levels, time-rotating files, all kinds of stuff better than just print.

07:15 So, yeah, this is really cool.

07:17 I do feel like there's a lot of configuration and stuff in the built-in logging module.

07:21 And it kind of tries to do everything.

07:23 So it can make it tricky.

07:24 And this is nice.

07:25 Yeah.

07:25 And there's some things that it does that I didn't even know it did.

07:28 I didn't know it does, like, automatic file rotation just built in.

07:31 That's cool.

07:31 Yeah, that's really nice.

07:33 Anyway, if you're trying to figure out the built-in logging module, check this one out.

07:37 I can tell you that time-based rotating file is important when your website generates gigabytes of log files.

07:43 You don't want that to be one file.

07:44 Yeah.

07:45 Speaking of websites, it kind of sucks when your website's crashed for users, right?

07:49 Yeah.

07:50 They don't like it, but they might not tell you.

07:51 They might just go away.

07:52 So that's why you want to get Rollbar, right?

07:54 So like we said, Rollbar is supporting the show.

07:56 Visit them at pythonbytes.fm/rollbar.

08:00 And you can install it in just a few minutes.

08:03 pip install rollbar, a few lines of configuration, and all the errors in your website are captured with tons of detail.

08:10 Things like local parameters, arguments passed to methods when it crashed, all that kind of stuff.

08:16 Notifications, Slack, email, whatever.

08:18 It's beautiful.

08:18 So definitely install that if you're running a web app based on Python.

08:22 So speaking of web apps, you might care about memory, right?

08:27 A lot of times one of the things that puts a lot of pressure on your web apps is not the CPU, but it's actually memory.

08:32 And I'd say that's true certainly for my web apps.

08:35 It seems like memory is more of a pressure than CPU by quite a bit.

08:39 So one of the things that I thought was interesting is somebody wrote an article called, Let Me Introduce Dunder Slots.

08:47 So slots are alternative backing store for class data, I guess is maybe the simplest way to put it.

08:56 Have you played with these, Brian?

08:57 No, I haven't.

08:58 This is really crazy.

08:59 When you create a regular Python class and you implement a dunder init, and in there you say self.name equals something passed in, self.email equals some email address passed in, that goes into dunder dict, right?

09:15 Like each instance of that class has a dictionary that has the key "email", the key "name", and then the two values that you passed in.

09:22 And every instance of the class gets a separate instance of the dictionary.

09:26 They're one-to-one.

09:26 That makes it super easy to do lookups, right?

09:31 Or one.

09:31 It's super easy to make it dynamic.

09:34 Like if you just interact with a class and you try to add new stuff to it, it just goes into that dictionary.

09:39 So that's cool.

09:40 But what's not so cool is if I have 10 million instances of that class, I have 10 million copies of that dictionary, which has 10 million strings, each one that says email, and another 10 million strings that say name.

09:53 Why do I need to store those?

09:54 I probably don't, right?

09:56 If I'm really not going to be dynamic, I probably don't.

09:58 So you can use this thing called Dunder Slots.

10:00 And you would say the slots of this class are name and email.

10:03 And then that slot is stored on the type, not the instance.

10:07 So instead of having 10 million names and 10 million emails in terms of the field name, right?

10:13 You just have the two.

10:15 And otherwise, they're just stored like in an array, in a positional thing.

10:19 So super good for performance.

10:21 Like the test they did in this article, 57% less memory usage just by adding that one line.

10:27 And it's a little bit faster for access, but it's definitely better on memory.

10:30 Can you use both?

10:32 No.

10:32 Well, you could still do the self dot whatever and assign to it.

10:35 But basically, if you try to assign to something that's not declared in the slot, it'll say it doesn't have that property.

10:41 It wasn't pre-allocated in the type, basically, or predefined in the type.

10:45 So yeah, it's pretty cool.

10:47 I actually go into this in depth in my write Pythonic code course.

10:51 And you'll see that this is even better in terms of memory than a namedtuple.

10:58 Like you wouldn't think you could do better than a namedtuple for like space.

11:01 But like this is actually even better.

11:02 And you get all the type class inheritance behavior that you'd expect.

11:07 Very cool.

11:08 Seems like more of the mental model of classes I have in the first place.

11:12 Yeah.

11:12 Yeah, for sure.

11:13 It's very much like C++, C#, traditional.

11:16 Like these are the things that are in here and they never change.

11:20 A static language type of thought of a class.

11:22 Yeah.

11:22 Well, I'm definitely going to have to go and re-watch your seasoned developer course and do these again.

11:29 Yeah, it's pretty cool.

11:30 Yeah, it's super easy.

11:31 Like you shouldn't use it all the time.

11:32 But when it makes sense, it can save you tons of memory.

11:35 Well, that's cool.

11:36 Hey, a long time ago in episode 11, we covered pipenv from Kenneth.

11:41 And I always get his name wrong.

11:42 So you say it.

11:43 Kenneth Reitz.

11:44 Okay.

11:45 Maybe that was one of like the 10 things he did that week.

11:48 I don't know.

11:48 Yeah.

11:49 So he's been doing a lot.

11:50 But the first time I took a look at this, I gave it an honest college try.

11:57 And to just be honest, I didn't know why I needed it.

12:02 You're like, I already got this covered.

12:04 Whatever.

12:04 Yeah, I already got this covered.

12:06 But one of the things that changed my mind is not too long ago, he put up a video.

12:11 And so if you go to docs.pipenv.org, there's a four minute screencast of him just using it.

12:18 And that video got me convinced.

12:21 I'm like, oh, wow.

12:22 This is really a lot easier than I've done before.

12:26 And actually, I've been doing a lot more virtual environments than I used to.

12:30 And I kind of lose track of which ones are where.

12:34 So this helps.

12:35 So pipenv, if you haven't listened to episode 11, is something that deals with your virtual environments and pip install and all that for you.

12:44 And it's just a way of working that if you give it a try, you might like it.

12:49 So the video is one thing that's new that convinced me.

12:52 But there's also a bunch of other stuff that he's done recently.

12:54 He also included security checks.

12:58 So, with our scare from last week about whether or not you're going to install a problem package, pipenv will look through that with pipenv check.

13:08 You can look through all your dependencies and make sure that you don't have any security vulnerabilities installed in any of them.

13:14 That's awesome.

13:15 And that's not like you have an old version of Django, so it has a security vulnerability.

13:19 That's like somebody called it Django without the D and put a virus in it, right?

13:25 That type of thing, right?

13:27 Yeah, and the other thing that one of the things that it has that it's had from the start is a lot of packages.

13:32 So packages have these hash values to compare your actual install from what's been published.

13:39 And pipenv deals with that and checks those, which is hard to do manually.

13:45 And then one of the things it does recently is also it allows multiple package indexes.

13:52 So you could have PyPI, of course, but also maybe a company index and a group index and maybe even one for your project.

13:59 That's really cool.

14:00 The features are piling up.

14:01 And he recently said that it's nine months old, but it's had 192 releases.

14:07 So he's not sleeping a lot, I don't think.

14:10 No, probably not.

14:11 Yeah, that's really cool.

14:12 My favorite thing is pipenv lock -r will generate a requirements.txt file.

14:17 That's cool.

14:17 Right.

14:17 And that's actually the thing that turned me off the first time.

14:21 And it's because pipenv uses a thing called Pipfile and Pipfile.lock, and I don't really follow why I need those.

14:29 But I know sometimes I need a requirements.

14:31 And this allows you, you can use this and still get your requirements files.

14:35 Yeah, very cool.

14:36 Very cool.

14:37 All right.

14:38 So the final thing I want to talk about is a little bit of a softer, more squishy concept, right?

14:44 Not just an API or something.

14:46 But Stack Overflow, they're doing some interesting data science.

14:51 I think they actually have like full-time data scientists that are just like mining these and like generating reports and analysis on the industry.

14:57 So that's pretty cool.

14:59 And what I'm actually pointing to for this is not Stack Overflow, but to an Ars Technica article, which is a follow-up to this kind of unfortunate article they did called Tabs and Spaces, Who Gets Paid More or something like that.

15:16 And they made the claim that like, well, people who are uninformed use Spaces.

15:20 And for some reason, they get paid more than people who use Tabs.

15:23 Don't know why.

15:24 That was something they found in the survey.

15:26 Well, the reason why is because those are Python developers, right?

15:31 Whereas the other ones aren't.

15:33 So that's an interesting thing in and of itself.

15:35 But this is like a follow-up to say like, let's look at not programming languages, but like just different locations.

15:42 So if you live in New York versus you just live in a random place in the U.S. versus Germany or France.

15:48 Basically, the U.S. versus Europe.

15:50 Well, U.S. and Europe all compared against each other.

15:53 So it talks about like in these different places.

15:56 If you're in DevOps or a data scientist, you earn really well.

15:58 Probably using Python.

16:00 Surprisingly, if you do graphics programming like OpenGL or something, you're not paid very well, even though that's super hard to do.

16:06 The reason is, I think, and they sort of hinted this as well, is you're probably working in a game company.

16:13 And there's a lot of young people working at game companies who are just, they want to work on games, period.

16:18 It doesn't matter if they have to work 80 hours a week and get paid little for it, right?

16:22 Okay.

16:22 Yeah.

16:23 So that's pretty tough.

16:23 I have heard that the game industry is a pretty hard place to work.

16:27 But, you know, that's sort of one part of it, right?

16:30 You don't get paid tons.

16:31 But the most surprising fact was really that in the U.S., developer pay is significantly more than in Europe.

16:39 And it's not like 10% more or something like that.

16:43 It's like, I don't know, close to double.

16:46 It's really like quite a bit more.

16:48 So they say things like, hey, people in the U.S. have substantially higher median income, even regardless of experience.

16:56 So they say, for example, a median salary of a developer in the U.S. is comparable to somebody with 20 years experience in Canada or Germany.

17:03 And it's even quite a bit higher than people in France or the U.K. with 20 years experience.

17:08 Like a new, like, hey, I just graduated.

17:09 What can I do now?

17:10 Sort of job.

17:11 So pretty interesting.

17:13 The comments are also super interesting because people coming from all over the place and they're thinking about like, well, okay, salary is not everything.

17:20 There's cost of living.

17:22 There's cost of health care.

17:23 There's social support.

17:24 There's a lot of stuff.

17:25 So like this is kind of partly interesting for the article, but also partly interesting for the way people are analyzing it.

17:32 Yeah.

17:32 Well, actually, it's kind of nice to have some good news for being an American.

17:38 Yeah.

17:38 It's been a little sketchy lately, but hooray.

17:41 We've got the weirdest president, the highest health care costs.

17:44 But hey, we get paid a lot.

17:46 Yeah.