Transcript #81: Making your C library callable from Python by wrapping it with Cython
Return to episode page view on github00:00 Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your buds.
00:05 This is episode 81, recorded May 25th, 2018.
00:10 I'm Michael Kennedy.
00:11 And I'm Brian Okken.
00:12 And we have a bunch of great stuff for you, as always.
00:14 Super excited to talk about it.
00:16 But before we do, Brian, let's say thanks to DigitalOcean.
00:18 Thanks, DigitalOcean.
00:19 Yeah, thank you, DigitalOcean.
00:20 They're supporting the show by sponsoring this episode and a number of others.
00:25 And they're giving you all something awesome as well.
00:27 $100 free credit if you're a new user at pythonbytes.fm/digitalocean.
00:33 More about that later.
00:34 I'd like to learn some stuff right now, Brian.
00:37 Learn about learning.
00:38 We hear a lot about machine learning.
00:41 And I came across a couple ways to learn it.
00:45 So I've got a couple topics here.
00:47 First one is a single page site called Hello TensorFlow.
00:52 And it's a kind of fun little demo with one machine learning example of a bad guess at the
01:01 coefficients for a polynomial and having the machine learning back end.
01:06 I think it's TensorFlow.
01:07 Figuring out what the real answer is.
01:10 And it's got like a GUI or a graph on there where you can watch it zoom or narrow in on it.
01:17 So it's fun.
01:19 That is cool.
01:20 Yeah.
01:21 So it has a secret formula.
01:22 And it says, here's some equation which you don't know.
01:25 Machine learning system.
01:27 Go learn how to predict where on that line it's going to be given a few points.
01:31 And it's like this nice little animated thing that rolls in.
01:34 I love the real timeness of it.
01:36 That's great.
01:37 And it doesn't quite get it right away.
01:39 And if you run it again, it's even better.
01:42 So it increases its guesses.
01:43 It's kind of neat.
01:44 Oh, nice.
01:45 The other topic I wanted to bring up is I found out that there's both an article about this
01:49 that we'll link to.
01:50 But there is a Google-provided machine learning crash course that actually looks pretty slick.
01:58 It's got like 15 hours of course material and a bunch of lessons and some exercises.
02:03 So Google has put together a free course on getting started with machine learning.
02:08 That's really cool.
02:09 It's kind of fun.
02:09 Yeah.
02:10 So one is kind of like gets you interested in the idea of it.
02:12 The other is like let's actually teach it to you.
02:14 Yeah.
02:14 Neat.
02:15 Yeah.
02:15 So remember last week when we talked about PySystemD with Dan Bader?
02:19 Yeah.
02:20 Yeah.
02:20 So he pointed out that it was interesting that PySystemD was implemented in Cython even though
02:26 it had no real performance issues.
02:28 I mean, it just asked like, hey, is this service started?
02:31 And it just delegated to the C API.
02:32 So I came across an article called Making Your C Library Callable from Python by Wrapping
02:38 It in Cython by Stav Shamir, which is a really nice short article.
02:43 Interesting.
02:44 Yeah.
02:44 This is my guess why they use Cython for PySystemD.
02:47 So Cython is known for its ability to increase performance of Python code, of course.
02:52 But it's also a really interesting way to sort of bring Python syntax into the realm of C directly.
02:59 Right.
03:00 And because of that, it makes calling C code from what looks like Python and then exposing
03:06 that part that looks like Python directly to Python really easy.
03:09 So let me give you a quick example.
03:11 Suppose I have a function.
03:12 It's called hello.
03:13 It takes a character pointer called name.
03:16 Right.
03:16 This is C.
03:17 And then you do printf hello, percent S name.
03:20 Right.
03:21 So super simple.
03:22 It's like a function.
03:23 It takes a character pointer and returns void.
03:25 So if I wanted to call that function from Python, I could write just like a couple of lines
03:31 of code in Cython.
03:33 So I would say C def external.
03:35 Right.
03:36 Define.
03:37 Here's an external function.
03:38 Void.
03:39 Hello.
03:39 Cons car.
03:39 Right.
03:40 Just the, there's the signature of that C function.
03:42 And then I'd write a function like pi underscore hello.
03:45 I'd say name colon bytes goes to none.
03:48 And just say hello name.
03:50 That's it.
03:50 Not all these crazy extensions or pi objects or whatever.
03:53 Just like it's basically Python code with type hints or type annotations applied.
04:01 Right.
04:01 I mean, that's what Cython more or less is.
04:03 This is a really nice small example of how to get some.
04:06 I thought I had to go through some like, like figure out how to link to the DLL and stuff
04:11 like that.
04:12 And yeah.
04:13 Neat.
04:13 Yeah.
04:13 And so the final step you have to actually build, like you have to compile your Cython
04:18 code so that it can be imported from regular Python code.
04:21 And so they provide a setup py which does that.
04:25 And you just run Python setup py build.
04:27 Right.
04:28 And it goes and does its magic.
04:30 And then you, then you have it.
04:31 You can work with it.
04:31 And by the way, it happens to be fast.
04:33 Even if that's not the point.
04:34 Okay.
04:34 I'm going to have to go play with this.
04:36 It's neat.
04:36 Yeah.
04:36 It's cool.
04:37 Isn't it?
04:37 It's cool.
04:38 And it's easy.
04:38 Yeah.
04:38 Yeah.
04:39 It's way better than like flipping into sea land and doing all sorts of stuff.
04:42 I think if your point is only to consume this code, maybe even just a regular Cython code.
04:47 There's a lot of good things going on there.
04:49 Yeah.
04:49 Awesome.
04:49 Cool.
04:50 So what's this next one you got?
04:51 Feature flags?
04:52 Yeah.
04:53 Feature flags.
04:54 And I can't remember who was the, there was, now we're going back in time, but a year or
05:00 so, but was it Instagram that talked about?
05:02 Yeah.
05:03 It was Instagram that moved from Python 2 to Python 3 on their Django deploy deployment, which
05:08 is I think the largest Django deployment in the world on the same branch and without anything,
05:14 right?
05:14 Just by using feature flags.
05:16 I don't remember if they used feature flags in that or not.
05:19 But I know a lot of people that, a lot of teams that do have a model of merging to the
05:25 master frequently or just working off master for everybody often use feature flags.
05:31 And I know how to do that in C++, but I wasn't quite sure.
05:34 I could have hacked together something for Python, but this is a nice article about really how to
05:40 do feature flags in Python well and why you would do it and how to do it.
05:45 So a few of the benefits they talked about is improving Teams response time to bugs, because
05:51 if you add a feature with a feature flag, you can just turn it off by flipping the flag if
05:57 you need to without having to redeploy everything.
06:00 I really like this idea.
06:01 It's great to be able to just keep everything in the same code base and also keep the database
06:07 schemas in sync, which is nice as well.
06:10 So a lot of cool stuff going on here.
06:11 You have just one code base to test.
06:14 I mean, you can turn on and off features to test different parts because you want both
06:18 with the feature and without the feature to still work.
06:20 And you can migrate user groups with A-B testing or group splitting or however you want to migrate
06:27 the feature in.
06:29 But then it went on to talk about some of the ways to implement it nicely so that it's a
06:35 maintainable flag system, how to measure your success with different analytics, and then using
06:41 third-party tools to make your flag support clean and not reinventing the wheel.
06:47 And other people have figured this out.
06:49 And then also just at the end of comment to say, you know, once you've really decided
06:53 that a feature's in place, you have to go through and do feature flag cleanup.
06:57 So make sure that you remove the flags and have the features be permanent when you're ready
07:02 to have them permanent clean up your code base.
07:04 So it was just a nice write-up for this.
07:06 Yeah, and it's really nice.
07:07 I like it.
07:08 And they have maybe one of the best visualizations of flat is better than nested with like some
07:13 kind of Mortal Kombat type character.
07:15 It's a crazy nested if statement.
07:20 That's the cleanup conversation, right?
07:22 Like, don't do this.
07:23 Yeah, and the example of how not to do feature flags.
07:26 Yeah, definitely.
07:27 Don't do this.
07:29 Yeah, it's quite cool.
07:29 Nice.
07:30 All right.
07:31 So speaking of quite cool, it's quite cool that DigitalOcean is sponsoring this podcast.
07:35 And I want to just tell you guys quickly about them.
07:37 So they're a hosting company.
07:38 They've got data centers throughout the world.
07:41 And I think one of the cleanest, nicest ways to create a set of virtual servers and get
07:48 them up and running and configure them.
07:49 So if you want to create one that's already pre-configured for some infrastructure like
07:53 Disqus, you can just click a button and say, create me a Disqus virtual machine based on
07:57 Ubuntu, you know, whatever version you're looking for.
08:00 Or create a fresh one instead of PowerView like.
08:02 So they provide a lot of the infrastructure for us.
08:05 We actually pay for it, but they are the people we pay for getting you this podcast, which
08:10 is pretty awesome.
08:11 They've been really good.
08:12 We're happy customers.
08:13 And so if you want to be new happy customer, you can get $100 to try them out at pythonbytes.fm
08:20 slash digital ocean.
08:22 And that's for new customers.
08:24 Check it out.
08:25 And hopefully you create something cool and run it there.
08:28 Nice.
08:28 Hey, Brian, I got one that I think you're going to like.
08:31 Okay.
08:31 It's about testing.
08:32 I like testing.
08:33 Recently, I had a Talk Python episode on the release of PyPI and the inside story of how
08:39 that got revised.
08:40 And finally, pypi.org is the official thing.
08:43 Not like a weird, scary place that also matches to the same database.
08:46 So that's really awesome.
08:47 And I can't remember who said it on the show.
08:49 Sorry, because there were three folks.
08:51 But one of the libraries brought up was pretend, which is a stubbing library.
08:56 Neat.
08:56 Yeah.
08:57 So stubbing is like mocking, but it's different.
08:59 How's that, Brian?
08:59 You know more about that than I do.
09:01 Oftentimes in mocking, you want to like check the behavior of the code.
09:05 Like if you're interacting with some system that's not really there, you want to make sure
09:11 that you've called it in certain ways or you called it a certain number of times or the order
09:15 of calls, things like that.
09:17 And stubbing is really like, I just want to have my code be able to call something and have
09:23 the return value be like some pre-canned data.
09:26 So it's more about pre-canning than about the behavior.
09:30 I see.
09:31 So with mocking, maybe I'm going to say, I'm going to call this login API and I want
09:34 to make sure that it checks that I, my password is correct or something like that.
09:39 And that would be a mocking thing.
09:40 Whereas for stubbing, I just like, I need it to give back a password so it doesn't crash
09:44 because it's like a, you know, none type attribute error type of crash if, if nothing comes back.
09:51 So we got to give it something.
09:51 So let's create a stub to do that.
09:53 Stubbing also is a great way to do things like if there's error conditions, like when, when you're
09:59 connecting to a third party that goes out to some, if you want to like, if the server
10:04 crashes or something, you'll get an error code.
10:06 So how do you, how do you simulate that?
10:08 You can't go out and crash the server.
10:09 It's a really great way to, to pretend bad things happen.
10:14 Yeah.
10:14 So let me give people a sense of the API.
10:16 It's real simple.
10:16 So from pretend import stub, and then you just say stub and say like, here's a function name
10:22 equals some Lambda.
10:23 And now you can just start, you pass that around.
10:25 If somebody calls that function, it returns the value of the Lambda returns.
10:28 Done.
10:28 It's like, I don't think that it could really be simpler to be honest.
10:33 Right.
10:33 Well, that's one of the reasons why it's pretty cool is the, it's simpler than using mock.
10:37 Mock can do this too, but this is simpler.
10:39 Yeah.
10:40 Really nice.
10:40 Yeah.
10:41 You could probably use that with flask, couldn't you?
10:42 Yeah.
10:44 You could probably use it with flask.
10:46 Yep.
10:47 And my next topic is a surprise to me.
10:50 I forgot that I put this down.
10:51 But one of the things I was, I got out of PyCon was I got to sit next to one of the people
10:56 that at dinner that works on the flask project.
11:00 And he reminded me that they just went through and rewrote a bunch of the flask tutorial.
11:05 So I went through and took a look and yeah, the flask, the official flask tutorial is,
11:10 got a lot of updates.
11:11 It's, the code that goes along with it's been updated in the, just everything's been
11:16 simplified, updated.
11:17 It's a little cleaner.
11:19 It includes a, I don't know if it had this before, but it includes a section on testing,
11:23 which, highlights pytest, of course, and coverage, which is good.
11:28 And, one of the things I learned also with, with that discussion was flask is a part
11:33 of the, a pallets.
11:35 I'm not sure what their entity is really, but pallets is a collection of people that work
11:41 on a collection of projects.
11:42 And it's some pretty important stuff.
11:45 We've got flask, click.
11:47 It's dangerous.
11:48 I'm not sure what that is.
11:49 It's a request validation foundation of flask.
11:51 Oh, okay.
11:52 It's dangerous.
11:53 I already said that, but Jinja and Jinja2.
11:56 Is there a Jinja?
11:57 Is it just Jinja2?
11:59 I've not seen any Jinja in the wild lately, but there probably was at some point.
12:03 Okay.
12:03 But, and then, markup safe, which is an HTML markup safe string for Python library.
12:10 And then workzug, workzug, I don't know how to pronounce that.
12:15 Workzug.
12:16 With a V.
12:17 Everybody relies on a lot, some of these things, even if you don't use flask and the pallets,
12:23 project has a donate page now.
12:25 So if you, if you want to donate and you can donate through their donation page, pretty
12:31 neat.
12:31 It's really nice.
12:32 So we've had a lot of news around flask lately.
12:34 Flask went one Oh, a while ago.
12:36 I talked to one of the guys.
12:38 I'm sorry.
12:39 I'm so forgetting his name, but because I met so many people, but who is basically responsible
12:43 for that whole progression.
12:44 David, I believe.
12:46 Anyway, he's like, I know people think because the flask went one Oh, the week after you guys
12:51 did the zero over episode actually that was in the works for like a year.
12:56 I would love for you to be doing that, but, it was actually just a coincidence.
12:59 So anyway, glad to see you go one Oh, but yeah, this looks like flask is getting some renewed
13:05 love, which is good.
13:06 Yeah.
13:06 And, I didn't know click was by the same people.
13:09 I use click all the time.
13:10 So that's pretty neat.
13:12 Yeah.
13:12 For, creating CLIs.
13:14 Very nice.
13:15 All right.
13:16 So let's round it out with a little bit of an internals, look here as well.
13:20 I feel like a lot of stuff I'm covering this week is like deep in the guts with the Cython.
13:24 And if you're not doing Cython, you're doing regular Python, then you're operating in the
13:29 bytecode space.
13:30 So do you think people would be surprised if you told them that Python is compiled?
13:34 Yes.
13:34 A lot of them, I think they would, but it's not compiled to machine instructions or even
13:38 JIT compiled unless you're using PyPy, but it's compiled to bytecode.
13:42 So those PYC files, those are like the instructions to the Python virtual machine, not instructions
13:48 to your processor, but they're still compiled and there's still this bytecode.
13:52 And so understanding it's pretty interesting to just know like how the internals of Python
13:56 is working.
13:56 So there's this nice article called an introduction to Python bytecode.
14:01 So if you know, this bytecode concept is kind of new to you, or you just want to play around
14:05 with a little, check it out.
14:06 It's pretty accessible.
14:07 So I feel like there's a lot of hello world examples in my topics.
14:12 So back to hello.
14:13 So they have a function def hello, and it just prints the static string hello world, right?
14:19 It's just, okay, well, what does this actually mean?
14:21 And then they show you all the bytecode.
14:24 So, okay, we're going to load the global, which is the print function.
14:26 We're going to load the constant, which is hello world.
14:28 And we're going to call the function that is on the stack.
14:31 So CPython uses a stacked based virtual machine.
14:34 And so they like load these things under the stack.
14:36 If you have like a function that takes two arguments, they might load two arguments on
14:40 the stack and then call the function and things like that.
14:43 So that's how your Python source code gets into executable form.
14:48 And then these steps are actually sent down to the CPython runtime and this giant wild loop
14:54 with a switch statement.
14:55 That's like literally 3000 lines of C code that says, what's the bytecode?
14:59 Let's go do that.
14:59 You know, so it's pretty interesting.
15:01 And they talk about how you can take your Python code and look at this.
15:05 So you just import this as in disassembly and say this dot this and you pass it like a callable
15:11 or something like that.
15:12 And it'll show you the bytecode instructions that make it up.
15:15 So it's pretty nice.
15:15 New phone who dis you can't actually just look at the PYC files though, right?
15:20 No, they're like bytes.
15:21 That's why you need this dis thing.
15:23 I mean, I guess if you can like parse the bytes, you can, but it's not strings.
15:28 I don't think.
15:28 Okay.
15:29 Yeah.
15:30 I mean, the PYC files are basically the compiled steps from your hello world text into the
15:36 bytecode instructions in terms of bytes.
15:38 And then that's why those cache things are laying there.
15:42 The next time you hit that app, right?
15:44 It's just going to go and say, well, let's just load up that PYC so we don't have to reparse
15:48 and validate it.
15:49 Yeah.
15:49 This is kind of neat.
15:50 I'm going to definitely going to go check this out because I just want to know more about
15:54 this.
15:55 It seems like something I should know about, even though I probably don't need to on a
15:58 daily basis.
15:58 Yeah.
15:59 I mean, I'm not sure how helpful it is, but it's helpful in your conceptualization of how
16:03 stuff works.
16:03 I think.
16:04 Yeah, definitely.
16:05 Very cool.
16:05 Yeah.
16:06 A little bit deeper down into the, is that the red pill or the blue pill that takes
16:10 you farther down?
16:11 That's the red pill, right?
16:12 I don't remember.
16:13 Awesome.
16:13 Well, anyway, it's definitely worth checking out if you haven't played with bytecode before.
16:17 It's a really nice, simple way to get introduced to it.
16:19 Brian, you got anything, any other news you want to share with everyone this week?
16:22 I don't think so.
16:23 Do you?
16:24 No, I'm all out of news this week other than the ones I found.
16:27 So I just want to say thank you.
16:28 Thank you for being part of this episode and sharing everything with everyone.
16:31 Thank you.
16:32 Bye.
16:33 Yeah, you bet.
16:33 Bye.
16:35 Thank you for listening to Python Bytes.
16:37 Follow the show on Twitter via at Python Bytes.
16:39 That's Python Bytes as in B-Y-T-E-S.
16:42 And get the full show notes at pythonbytes.fm.
16:46 If you have a news item you want featured, just visit pythonbytes.fm and send it our way.
16:50 We're always on the lookout for sharing something cool.
16:53 On behalf of myself and Brian Okken, this is Michael Kennedy.
16:56 Thank you for listening and sharing this podcast with your friends and colleagues.