Transcript #81: Making your C library callable from Python by wrapping it with Cython
Return to episode page view on github00:00 Hello and welcome to Python bites where we deliver Python news and headlines directly to your buds. This is episode 81 recorded May 25th 2018 I'm Michael Kennedy and I'm Brian Okken and we have a bunch of great stuff for you as always Super excited to talk about it. But before we do Brian, let's say thanks to digital ocean. Thanks to the ocean. Yeah Thank you digital ocean. They're Supporting the show by sponsoring this episode and a number of others and they're giving you all something awesome as well $100 free credit if you're a new user at Python by set FM slash digital ocean more about that later I'd like to learn learn some stuff right now, right?
00:37 learn about learning We hear a lot about machine learning and I came across them a couple ways to learn it So I've got a couple topics here. First one is a single page site called. Hello tensorflow and And it's a kind of fun little demo with one machine learning example of a bad guess at the coefficients for a polynomial and having the machine learning back end, I think it's TensorFlow, figuring out what the real answer is.
01:10 And it's got like a graph on there where you can watch it zoom or narrow in on it.
01:18 So it's fun.
01:19 - That is cool.
01:21 Yeah, so it has a secret formula And it says, "Here's some equation which you don't know.
01:25 Machine learning system, go learn how to predict where on that line it's gonna be given a few points." And it's like this nice little animated thing that rolls in.
01:35 I love the real-timeness of it.
01:37 That's great.
01:37 - And it doesn't quite get it right away, and if you run it again, it's even better.
01:42 So it increases its guesses.
01:44 It's kinda neat.
01:45 - Oh, nice.
01:45 - The other topic I wanted to bring up is I found out that there's both an article about this that we'll link to, but there is a Google provided machine learning crash course that actually looks pretty slick.
01:58 It's got like 15 hours of course material and a bunch of lessons and some exercises.
02:03 So Google has put together a free course on getting started with machine learning.
02:08 - Oh, that's really cool. - It's kind of fun.
02:10 - Yeah, so one, it's kind of like gets you interested in the idea of it, the other is like, let's actually teach it to you.
02:14 - Yeah, neat.
02:15 - Yeah, so remember last week when we talked about PySystemD with Dan Bader?
02:19 - Yeah.
02:20 So he pointed out that it was interesting that PySystemD was implemented in Cython, even though it had no real performance issues.
02:28 I mean, it just asked, like, hey, has this service started?
02:31 And it just delegated to the C API.
02:33 So I came across an article called "Making Your C Library Callable from Python by Wrapping It in Cython" by Stav Shamir, which is a really nice, short article.
02:43 Interesting.
02:43 Yeah, this is my guess why they use Cython for PySystemD.
02:47 So Cython is known for its ability increase performance of Python code, of course. But it's also a really interesting way to sort of bring Python syntax into the realm of C directly, right. And because of that, it makes calling C code from what looks like Python, and then exposing that part that looks like Python directly to Python really easy. So let me give you a quick example. Suppose I have a function, it's called Hello, it takes a character pointer called name, right, this is C. And then you do do printf hello, percent s name, right? So super simple. It's like a function, it takes character pointer and returns void. So if I wanted to call that function from Python, I could write just like a couple of lines of code in Cython. So I would say c def external, right define here's an external function, void hello const car, right, just the there's the signature of that C function. And then I'd write a function like pi underscore hello I want to say name colon bytes goes to none and just say hello name.
03:50 That's it.
03:50 Not all these crazy extensions or py objects or whatever.
03:54 Just like it's basically Python code with type hints or type annotations applied.
04:01 Right?
04:01 I mean that's what Cython more or less is.
04:03 This is a really nice small example of how to get some.
04:06 I thought I had to go through some like figure out how to link to the DLL and stuff like that.
04:12 And yeah.
04:13 Yeah, and so the final step you have to actually build, like you have to compile your Cython code so that it can be imported from regular Python code.
04:22 And so they provide a setup py which does that. And you just run Python setup py build, right? And it goes and does its magic.
04:30 And then you have it, you can work with it. And by the way, it happens to be fast.
04:33 Even if that's not the point.
04:34 Okay, I'm going to have to go play with this. It's neat.
04:36 Yeah, it's cool, isn't it? It's cool and it's easy.
04:38 Yeah.
04:39 Yeah, it's way better than like flipping into C land and doing all sorts of stuff.
04:42 to stuff, I think your point is only to consume this code.
04:45 Maybe even just to write regular Cython code.
04:48 There's a lot of good things going on there.
04:49 - Yeah. - Awesome.
04:50 So what's this next one you got?
04:52 Feature flags?
04:53 - Yeah, feature flags.
04:54 And I can't remember who was the, there was a, now we're going back in time, but a year or so, but was it Instagram that talked about--
05:02 - Yeah, it was Instagram that moved from Python 2 to Python 3 on their Django deploy, on deployment, which is, I think, the largest Django deployment in the world, on the same branch and without anything, right?
05:15 Just by using feature flags.
05:16 - I don't remember if they used feature flags in that or not, but I know a lot of people, a lot of teams that do have a model of merging to the master frequently or just working off master for everybody often use feature flags.
05:31 And I know how to do that in C++, but I wasn't quite sure.
05:35 I could have hacked together something for Python, but this is a nice article about really how to do feature flags in Python well, and why you would do it and how to do it.
05:45 So a few of the benefits they talked about is improving Teams response time to bugs because if you add a feature with a feature flag, you can just turn it off by flipping the flag if you need to without having to redeploy everything.
06:00 - I really like this idea.
06:01 It's great to be able to just keep everything in the same code base and also keep the database schemas in sync, which is nice as well.
06:10 So a lot of cool stuff going on here.
06:12 - You have just one code base to test, and you can turn on and off features to test different parts, 'cause you want both with the feature and without the feature to still work.
06:21 And you can migrate user groups with A/B testing or group splitting or however you wanna migrate the feature in.
06:29 But then it went on to talk about some of the ways to implement it nicely so that it's a maintainable flag system, how to measure your success with different analytics, and then using third party tools to make your flag support clean and not reinventing the wheel and other people have figured this out.
06:49 And then also just at the end a comment to say, once you've really decided that a feature's in place, you have to go through and do feature flag cleanup.
06:57 So make sure that you remove the flags and have the features be permanent when you're ready to have them permanent, clean up your code base.
07:05 It was just a nice write-up for this.
07:06 - Yeah, and it's really nice, I like it.
07:08 And they have maybe one of the best visualization of flat is better than nested, with like some kind of Mortal Kombat type character.
07:16 How do you throw it down?
07:17 It's a crazy nested if statement.
07:20 That's the cleanup conversation, right?
07:22 Like, don't do this.
07:24 - Yeah, and the example of how not to do feature flags.
07:26 Yeah, definitely. - Exactly.
07:28 Don't do this, yeah, it's quite cool, nice.
07:31 All right, so speaking of quite cool, it's quite cool that Digital Ocean is sponsoring this podcast.
07:35 And I want to just tell you guys quickly about them.
07:37 So they're a hosting company.
07:38 They've got data centers throughout the world.
07:41 And I think one of the cleanest, nicest ways to create a set of virtual servers and get them up and running and configure them.
07:49 So if you want to create one that's already pre-configured for some infrastructure like Disqus, you can just click a button and say, create me a Disqus virtual machine based on Ubuntu, you know, whatever version you're looking for, or create a fresh one and set up however you like.
08:02 So they provide a lot of the infrastructure for us.
08:05 We actually pay for it, but they are the people we pay for getting you this podcast, which is pretty awesome.
08:12 They've been really good.
08:12 We're happy customers.
08:13 And so if you want to be new happy customer, you can get $100 to try them out at pythonbytes.fm/digitalocean.
08:23 And that's for new customers.
08:24 Check it out and hopefully you create something cool and run it there.
08:28 - Nice.
08:29 - Hey, Brian, I got one that I think you're gonna like.
08:31 - Okay.
08:32 - I like testing.
08:33 - Recently I had a Talk Python episode on the release of PyPI and the inside story of how that got revised.
08:40 And finally, pypi.org is the official thing, not like a weird, scary place that also matches to the same database.
08:46 So that's really awesome.
08:47 And I can't remember who said it on the show.
08:49 I'm sorry, 'cause there were three folks.
08:50 But one of the libraries brought up was Pretend, which is a stubbing library.
08:56 - Neat.
08:57 - Yeah, so stubbing is like mocking, but it's different.
08:59 How's that, Brian?
09:00 (Brian laughs)
09:00 You know more about that than I do.
09:01 Oftentimes in mocking you want to check the behavior of the code.
09:05 Like if you're interacting with some system that's not really there, you want to make sure that you've called it in certain ways or you called it a certain number of times or the order of calls, things like that.
09:17 And stubbing is really like, I just want to have my code be able to call something and have the return value be like some pre-canned data.
09:26 So it's more about pre-canning than about the behavior.
09:30 - I see, so with mocking, maybe I'm gonna say, I'm gonna call this login API, and I want to make sure that it checks that my password is correct, or something like that.
09:39 And that would be a mocking thing, whereas for stubbing, I just like, I need it to give back a password so it doesn't crash, 'cause it's like a, you know, non-type attribute error type of crash if nothing comes back, so we gotta give it something, so let's create a stub to do that.
09:53 - Stubbing also is a great way to do things like, if there's error conditions, like when you're connecting to a third party that goes out to some, if you wanna like, if the server crashes or something, you'll get an error code.
10:06 So how do you simulate that?
10:08 You can't go out and crash the server.
10:10 It's a really great way to pretend bad things happen.
10:14 - Yeah, so let me give people a sense of the API.
10:16 It's real simple.
10:16 So from pretend import stub, and then you just say stub and say like, here's a function name equals some lambda, and now you can just start, you pass that around.
10:25 If somebody calls that function, It returns the value the lambda returns, done.
10:29 It's like, I don't think that it could really be simpler, to be honest, right?
10:33 - Well, that's one of the reasons why it's pretty cool is that it's simpler than using mock.
10:37 Mock can do this too, but this is simpler.
10:40 - Yeah, really nice, yeah.
10:41 You could probably use that with Flask, couldn't you?
10:44 - Yeah, you could probably use it with Flask, yep.
10:47 And my next topic, this is a surprise to me, I forgot that I put this down.
10:52 But one of the things I got out of PyCon was I got to sit next to one of the people at dinner that works on the Flask project.
11:00 And he reminded me that they just went through and rewrote a bunch of the Flask tutorial.
11:06 So I went through and took a look, and yeah, the official Flask tutorial is got a lot of updates.
11:12 It's the code that goes along with it's been updated, and just everything's been simplified, updated.
11:18 It's a little cleaner.
11:19 It includes a, I don't know if it had this before, but it includes a section on testing, which highlights pytest, of course, and coverage, which is good.
11:28 And one of the things I learned also with that discussion was Flask is a part of the, of Pallets.
11:35 I'm not sure what their entity is really, but Pallets is a collection of people that work on a collection of projects.
11:43 And it's some pretty important stuff.
11:45 We've got Flask, Click, it's dangerous, I'm not sure what that is.
11:49 - It's a request validation foundation of Flask.
11:52 Okay, it's dangerous. I already said that, but Jinja and Jinja2. Is there a Jinja? Is it just Jinja2? I've not seen any Jinja in the wild lately, but there probably was at some point.
12:03 Okay, but and then markup safe, which is an HTML markup safe string for Python library and then works. So we work so that's how I pronounce that. We're doing with a V. Everybody relies on a lot some of these things even if you don't use Flask.
12:22 And the Pallets project has a donate page now.
12:25 So if you wanna donate, and you can donate through their donation page.
12:31 Pretty neat.
12:32 - It's really nice.
12:33 So we've had a lot of news around Flask lately.
12:34 Flask went 1-0 a while ago.
12:36 I talked to one of the guys, I'm sorry, I'm so forgetting his name, 'cause I met so many people, but who is basically responsible for that whole progression.
12:45 David, I believe.
12:46 Anyway, he's like, I know people think because the Flask went 1.0 the week after you guys did the Zerover episode, actually that was in the works for like a year.
12:56 I would love for you to be doing that, but it was actually just a coincidence.
13:00 So anyway, glad to see it go 1.0.
13:02 But yeah, this looks like Flask is getting some, a renewed love, which is good.
13:06 - Yeah, and I didn't know Click was by the same people.
13:09 I use Click all the time, so that's pretty neat.
13:12 - Yeah, for creating CLIs, very nice.
13:16 All right, so let's round it out with a little bit of an internals look here as well.
13:20 I feel like a lot of stuff I'm covering this week is like deep in the guts with the Cython and if you're not doing Cython, you're doing regular Python, then you're operating in the bytecode space.
13:30 So do you think people would be surprised if you told them that Python is compiled?
13:34 - Yes.
13:35 - A lot of them, I think they would.
13:36 But it's not compiled to machine instructions or even JIT compiled unless you're using PyPy.
13:40 But it's compiled to bytecode.
13:42 So those PYC files, those are like the instructions to the Python virtual machine, not instructions to your processor, but they're still compiled and there's still this bytecode.
13:52 And so understanding it's pretty interesting to just know like how the internals of Python is working.
13:57 So there's this nice article called "An Introduction to Python Bytecode." So if this bytecode concept is kind of new to you or you just want to play around with a little, check it out, it's pretty accessible.
14:07 So I feel like there's a lot of hello world examples in my topics.
14:12 So back to hello.
14:14 So they have a function def hello, and it just prints the static string hello world, right?
14:19 So it's okay, well, what does this actually mean?
14:21 And then they show you all the byte codes.
14:24 It's okay, we're gonna load the global, which is the print function.
14:26 We're gonna load the constant, which is hello world.
14:28 Then we're gonna call the function that is on the stack.
14:31 So CPython uses a stacked-based virtual machine.
14:35 And so they like load these things under the stack.
14:37 If you have like a function that takes two arguments, they might load two arguments on the stack and then call the function and things like that.
14:43 So that's how your Python source code gets into executable form.
14:49 And then these steps are actually sent down to the CPython runtime and this giant wild loop with a switch statement that's like literally 3000 lines of C code that says, what's the bytecode?
14:59 Let's go do that.
15:00 So it's pretty interesting.
15:02 And they talk about how you can take your Python code and look at this.
15:05 So you just import dis as in disassembly and you say, dis.dis, and you pass it like a callable or something like that.
15:12 it'll show you the bytecode instructions that make it up.
15:15 So it's pretty nice.
15:16 - New phone, who dissed?
15:17 (laughing)
15:18 You can't actually just look at the PYC files though, right?
15:20 - No, they're like bytes.
15:22 That's why you need this diss thing.
15:23 I mean, I guess if you can like parse the bytes you can, but it's not strings, I don't think.
15:29 - Okay.
15:30 - Yeah, I mean the PYC files are basically the compiled steps from your hello world text into the bytecode instructions in terms of bytes.
15:39 And then that's why those cache things laying there, the next time you hit that app, right, it's just going to go and say, "Well, let's just load up that PYC so we don't have to reparse and validate it." Yeah, this is kind of neat.
15:50 I'm definitely going to go check this out because I just want to know more about this.
15:55 It seems like something I should know about even though I probably don't need to on a daily basis.
15:59 Yeah, I mean, I'm not sure how helpful it is, but it's helpful in your conceptualization of how stuff works, I think.
16:04 Yeah, definitely.
16:05 Very cool.
16:06 Yeah, a little bit deeper down into the...
16:08 Is that the red pill or the blue pill that takes you farther down?
16:11 That's a red pill, right? I don't remember.
16:13 Awesome. Well, anyway, it's definitely worth checking out if you haven't played with Bytecode before.
16:17 It's a really nice, simple way to get introduced to it.
16:19 Brian, you got anything, any other news you want to share with everyone this week?
16:22 I don't think so. Do you?
16:24 No. I'm all out of news this week other than the ones I found.
16:27 So, I just want to say thank you.
16:28 Thank you for being part of this episode and sharing everything with everyone.
16:31 Thank you. Bye.
16:32 Yeah, you bet. Bye.
16:34 Thank you for listening to Python Bytes.
16:37 Follow the show on Twitter via @pythonbytes.
16:39 bytes that's Python bytes as in B Y T E S and get the full show notes at Python bytes dot FM if you have a news item you want featured just visit Python bytes dot FM and send it our way we're always on the lookout for sharing something cool on behalf of myself and Brian Okken this is Michael Kennedy thank you for listening and sharing this podcast with your friends and colleagues