WEBVTT

00:00:00.001 --> 00:00:05.160
Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds.

00:00:05.160 --> 00:00:10.560
This is episode 89, recorded August 2nd, 2018. I'm Michael Kennedy.

00:00:10.560 --> 00:00:11.540
And I'm Brian Okken.

00:00:11.540 --> 00:00:12.440
Hey, Brian. How you doing?

00:00:12.440 --> 00:00:14.760
I'm doing great. It's good to talk to you again.

00:00:14.760 --> 00:00:17.900
Yeah, you as well. And we got some cool stuff lined up this week.

00:00:17.900 --> 00:00:21.220
We always say it, but there's really so much happening in the Python space.

00:00:21.220 --> 00:00:22.900
It's more exciting every week.

00:00:22.900 --> 00:00:26.440
Yeah, I think it's because we're here and then people are excited about Python.

00:00:26.440 --> 00:00:29.320
And then they do more and then it builds on itself.

00:00:29.400 --> 00:00:32.260
It's this awesome feedback loop. I can totally feel it.

00:00:32.260 --> 00:00:36.100
What's also awesome is Datadog sponsoring the show.

00:00:36.100 --> 00:00:41.980
So if you need infrastructure monitoring or monitoring of your apps, check them out at pythonbytes.fm/Datadog.

00:00:41.980 --> 00:00:43.380
Tell you more about them later.

00:00:43.380 --> 00:00:46.640
Brian, you found some code that's pretty tenacious out there, didn't you?

00:00:46.640 --> 00:00:47.640
It just won't quit.

00:00:47.640 --> 00:00:49.220
Tenacious is a fun word.

00:00:49.220 --> 00:00:54.820
And there's a project called Tenacity that is pretty cool.

00:00:54.820 --> 00:00:57.140
We'll, of course, have a link in the show notes.

00:00:57.680 --> 00:01:06.560
But Tenacity describes itself as a general-purpose retrying library to simplify the task of adding retry behavior to just about anything.

00:01:06.560 --> 00:01:08.760
And there's a little code snippet.

00:01:08.880 --> 00:01:12.100
The gist of it is you import retry.

00:01:12.100 --> 00:01:18.200
It can take a lot of options, but the defaults just sort of work, too.

00:01:18.200 --> 00:01:20.740
You can just put this around a function.

00:01:20.740 --> 00:01:28.760
And if your function raises an exception, it just tries it again and eats the exception.
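
To make that concrete, here's a minimal sketch of the default behavior (the function and its failure are made up for illustration; only the tenacity import is real):

    import random
    from tenacity import retry

    @retry   # defaults: keep retrying, on any exception
    def talk_to_flaky_service():
        # Hypothetical flaky operation, just for illustration.
        if random.random() < 0.7:
            raise ConnectionError("service not ready")
        return "ok"

    print(talk_to_flaky_service())   # retries until it returns "ok"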

00:01:29.540 --> 00:01:31.740
So for a lot of stuff, this is terrible.

00:01:31.740 --> 00:01:33.780
It's a terrible idea in a lot of places.

00:01:33.780 --> 00:01:43.780
But in a few places this might be reasonably good, especially if, for instance... well, I guess I'm not even going to come up with a for instance.

00:01:43.900 --> 00:01:53.940
You know the places where retrying is a good idea, like maybe saving something to a file system or connecting to a service that sometimes has contention.

00:01:53.940 --> 00:01:55.120
Yeah, or even a database.

00:01:55.120 --> 00:02:00.840
Like, what if you have to, like, restart the database server or something to that effect, right?

00:02:00.840 --> 00:02:05.320
Or you need to really quickly apply a migration, right?

00:02:05.400 --> 00:02:09.620
You could say, just keep retrying the database until you get there.

00:02:09.620 --> 00:02:12.580
Yeah, and then there's a whole bunch of extra conditions you can put on it.

00:02:12.580 --> 00:02:20.320
You can say, try so many times, or set a wait time so that if it doesn't work after a while, it gives up.

00:02:20.320 --> 00:02:24.880
Then you can customize which exceptions it catches or doesn't catch.
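
A hedged sketch of those conditions, roughly: cap the attempts, back off between tries, and only retry certain exceptions (the function itself is hypothetical):

    from tenacity import (retry, stop_after_attempt, wait_exponential,
                          retry_if_exception_type)

    @retry(
        stop=stop_after_attempt(5),                      # give up after 5 tries
        wait=wait_exponential(multiplier=0.5, max=10),   # exponential backoff, capped at 10s
        retry=retry_if_exception_type(ConnectionError),  # only retry this exception
    )
    def save_record(record):
        ...  # hypothetical operation against a sometimes-busy service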

00:02:24.880 --> 00:02:29.260
It even has retry on coroutines, which is kind of fun.

00:02:29.260 --> 00:02:31.780
I haven't tried that, but I've tried the simple case.

00:02:31.780 --> 00:02:33.960
But I'm going to use it right away.

00:02:33.960 --> 00:02:42.780
We've got several cases where we're testing devices and sometimes it takes a device a while. We're doing, like, little Wi-Fi devices, for example.

00:02:42.780 --> 00:02:47.460
It might be asleep, so it might take a while for the thing to wake up and respond.

00:02:47.460 --> 00:02:50.220
So a number of retries is good.

00:02:50.220 --> 00:02:53.820
And we have, like, retry code all over our code base.

00:02:53.820 --> 00:02:57.660
And so having a decorator do that for us is just pretty slick.

00:02:57.660 --> 00:02:58.280
I like it.

00:02:58.280 --> 00:02:58.980
Yeah, it's really cool.

00:02:58.980 --> 00:03:03.780
And, you know, like I said with the database thing, you wouldn't just put retry on it and continue to hammer the database.

00:03:03.780 --> 00:03:06.540
And definitely not if you keep running into problems.

00:03:06.540 --> 00:03:14.220
But, you know, something like try five times with an exponential backoff, just so you could basically handle five-second downtimes.

00:03:14.220 --> 00:03:15.100
Things like that.

00:03:15.100 --> 00:03:15.620
That'd be nice.

00:03:15.620 --> 00:03:15.880
Yeah.

00:03:15.880 --> 00:03:25.160
There are definitely parts of, like you said, a distributed system where you've got things that sometimes aren't available for a little while.

00:03:25.160 --> 00:03:30.440
Yeah, especially when you don't control everything, like other services you depend on.

00:03:30.440 --> 00:03:30.740
Yeah.

00:03:30.740 --> 00:03:31.400
Yeah, pretty cool.

00:03:31.400 --> 00:03:39.240
So I'm going to bring a theme into the show here for the next couple of topics.

00:03:39.240 --> 00:03:42.340
And I think it's actually going to touch on every one of the things I'm bringing.

00:03:42.340 --> 00:03:45.860
So let's start with Anthony Shaw, of course.

00:03:45.860 --> 00:03:48.400
We've got to cover an article by Anthony Shaw, right?

00:03:48.400 --> 00:03:50.220
Yeah, he writes a lot of good stuff.

00:03:50.340 --> 00:03:50.780
Yeah, he does.

00:03:50.780 --> 00:03:55.580
And one of the ones that I came across here is called, Why is Python so slow?

00:03:55.580 --> 00:04:00.320
So I think it's interesting to even ask the question, like, is Python slow?

00:04:00.320 --> 00:04:01.680
And explain how that is.

00:04:01.680 --> 00:04:08.920
So Anthony looks at some benchmarks comparing, say, Python to Java, to C#, to C++, and things like that.

00:04:08.920 --> 00:04:13.660
And talks about, well, in some cases it is slower, and why might that be, right?

00:04:13.660 --> 00:04:23.000
So basically answering the question, like, Python completes a comparable operation, like, two to ten times slower than, say, Java or C#.

00:04:23.000 --> 00:04:23.640
Why?

00:04:23.640 --> 00:04:25.060
And why can't we make it faster?

00:04:25.060 --> 00:04:32.060
So he has three main hypotheses, which are pretty interesting.

00:04:32.740 --> 00:04:36.000
One is, it's the GIL, the global interpreter lock.

00:04:36.000 --> 00:04:39.840
The other is, it's because it's interpreted rather than compiled.

00:04:39.840 --> 00:04:43.360
And the final one is, it's a dynamic language.

00:04:43.360 --> 00:04:47.440
So those three are pretty interesting.

00:04:47.440 --> 00:04:48.780
What do you think?

00:04:48.780 --> 00:04:52.960
Okay, so he's not saying that these are the reasons, but these are theories.

00:04:52.960 --> 00:04:53.820
These are theories.

00:04:54.040 --> 00:05:00.220
And then he goes through each one of them and, like, pretty deeply looks at how they work and then compares them.

00:05:00.220 --> 00:05:02.400
So, for example, it's interpreted, not compiled.

00:05:02.400 --> 00:05:04.480
Well, what does that mean in terms of tradeoff?

00:05:04.480 --> 00:05:08.660
Let's compare that to, say, the way C# does things and the way C++ works.

00:05:08.660 --> 00:05:10.160
What are the tradeoffs there?

00:05:10.160 --> 00:05:13.960
So let's go through some of them.

00:05:13.960 --> 00:05:17.780
One is, it's the GIL, right?

00:05:17.780 --> 00:05:23.000
Now, modern computers, modern processors have multiple cores.

00:05:23.000 --> 00:05:27.760
And if you actually look at Moore's Law, Moore's Law is still alive and well, right?

00:05:27.760 --> 00:05:30.620
The number of transistors in a chip.

00:05:30.620 --> 00:05:37.560
But people sort of correlated that indirectly to, well, that means computers are getting faster and faster and faster.

00:05:37.560 --> 00:05:43.920
But it turns out that a number of years ago, four, five, I don't know how many years ago, not too long ago,

00:05:43.920 --> 00:05:49.180
the actual clock speed no longer kept doubling along with Moore's Law.

00:05:49.180 --> 00:05:51.340
It sort of went flat, more or less.

00:05:51.460 --> 00:05:55.240
And what started happening was we got two core machines and then four core machines.

00:05:55.240 --> 00:05:56.680
And I just got a new MacBook.

00:05:56.680 --> 00:05:58.940
It has six hyper-threaded cores.

00:05:58.940 --> 00:05:59.740
It's crazy, right?

00:05:59.740 --> 00:06:07.200
But on Python, if I want to take advantage of that computationally, it's super hard within one process because of the GIL.

00:06:07.760 --> 00:06:12.640
So, he said, well, if you want to take advantage of modern hardware, maybe the GIL is the problem.

00:06:12.640 --> 00:06:16.880
So, he talks about some of the trade-offs there, when it matters, when it doesn't matter.

00:06:16.880 --> 00:06:22.140
So, for example, if what you're doing is IO bound, it basically doesn't much matter, right?

00:06:22.140 --> 00:06:25.620
The GIL is released when you're waiting on network calls and stuff like that.

00:06:25.740 --> 00:06:28.880
So, in some sense, like the GIL is not the problem.

00:06:28.880 --> 00:06:36.820
If you created a bunch of threads and they all started, you know, reading, writing files, talking over the network, it should just automatically handle that.

00:06:36.900 --> 00:06:47.720
But if, on the other hand, you tried to create six threads and do computational stuff, you'd still probably get 12% CPU usage on my machine because, you know, only one of them really gets to run at a time.
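
A rough sketch of that tradeoff with the standard library (the workload numbers are arbitrary): threads stay GIL-bound on CPU work, while processes can spread across cores.

    from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

    def crunch(n):
        # CPU-bound pure-Python work, so the GIL serializes it across threads.
        return sum(i * i for i in range(n))

    if __name__ == "__main__":
        work = [2_000_000] * 6
        with ThreadPoolExecutor(max_workers=6) as pool:
            list(pool.map(crunch, work))    # roughly one core's worth of CPU
        with ProcessPoolExecutor(max_workers=6) as pool:
            list(pool.map(crunch, work))    # can use multiple cores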

00:06:47.720 --> 00:06:49.520
So, that's one of the theories.

00:06:49.520 --> 00:06:54.920
And I think this theory applies more in some places and less in others.

00:06:54.920 --> 00:06:56.520
And I kind of touched on that a little bit.

00:06:56.520 --> 00:07:02.620
Like, if your goal is to do computational mathematical things, the GIL can really, really matter.

00:07:02.620 --> 00:07:04.200
It makes a big difference, right?

00:07:04.700 --> 00:07:07.760
Because while you're executing your Python code, it doesn't let go of the GIL.

00:07:07.760 --> 00:07:11.940
But if, say, you're building a web app, it probably doesn't matter, right?

00:07:11.940 --> 00:07:17.060
There are some ways and you can do some things that would be better, but it doesn't really matter.

00:07:17.060 --> 00:07:30.420
So, for example, I looked at the various servers before we came on today, and the training.talkpython.fm site has 16 worker processes, all running parallel copies of the website handling requests.

00:07:30.420 --> 00:07:32.740
Talk Python itself has eight.

00:07:33.520 --> 00:07:35.340
I think maybe Python Bytes has eight as well.

00:07:35.340 --> 00:07:37.960
So, anyway, there's these eight processes.

00:07:37.960 --> 00:07:45.320
And sure, one of them may, like, lock up something with the GIL, but there's a whole bunch of others that can leverage those other CPU cores and just keep on rocking.

00:07:45.320 --> 00:07:49.380
So, if you're doing web stuff, it matters less in this sense, I think.

00:07:49.760 --> 00:07:53.000
I mean, even if you are, there's ways to get around it.

00:07:53.000 --> 00:07:56.780
Yeah, if you can break up your algorithm and do sub-process type parallelism.

00:07:56.780 --> 00:07:58.320
Okay, so that's the GIL.

00:07:58.320 --> 00:08:01.300
The other could be it's an interpreted language.

00:08:01.300 --> 00:08:05.720
And I think this one is the most interesting, probably.

00:08:06.580 --> 00:08:08.360
So, it is an interpreted language.

00:08:08.360 --> 00:08:11.340
But actually, it does compile code to bytecode.

00:08:11.340 --> 00:08:13.660
But it just doesn't JIT compile it, right?

00:08:13.660 --> 00:08:20.120
And so, one of the main considerations around JIT-compiled languages versus not is startup time.

00:08:20.960 --> 00:08:30.240
So, if our Python code is going to start and run for a while, then doing a whole bunch of JIT optimization would be maybe a little slower to start, but then faster to run.

00:08:30.240 --> 00:08:39.660
But if we want to just do some CLI stuff that starts really quick, does a tiny thing and goes away, a whole bunch of JIT stuff might be, you know, sort of counterproductive.

00:08:40.060 --> 00:08:45.060
So, there's a pretty interesting comparison against C# and Java and CPython here.

00:08:45.060 --> 00:08:52.320
The other thing I think is worth throwing in here is C extensions and things like that, versus the idea that it's purely an interpreted language.

00:08:52.320 --> 00:09:00.420
I think that's a simplistic view that only holds if you just take straight Python code, run it on Python, and don't interact with any libraries.

00:09:00.420 --> 00:09:11.020
But if you work with NumPy or if you work with SQLAlchemy or a whole bunch of stuff that has C extensions to make certain parts fast, well, all of a sudden, it's not interpreted, right?

00:09:11.020 --> 00:09:12.220
So, there's these weird blends.

00:09:12.220 --> 00:09:13.140
All right.

00:09:13.140 --> 00:09:13.620
Last one.

00:09:13.620 --> 00:09:15.820
It's because it's dynamically typed.

00:09:15.820 --> 00:09:19.080
So, this is also, I think this is really interesting.

00:09:19.080 --> 00:09:21.260
I think, actually, this is probably why.

00:09:21.260 --> 00:09:24.340
And I'm going to throw in another one unless I get distracted and forget it.

00:09:24.660 --> 00:09:30.420
So, I think this is really the biggest reason it's probably slower: it's a dynamic language.

00:09:30.420 --> 00:09:38.360
And it's not that you can't make a dynamic language fast, but because it's so flexible, it's hard to know how to optimize it.

00:09:38.360 --> 00:09:40.120
Right?

00:09:40.120 --> 00:09:40.220
Yeah.

00:09:40.220 --> 00:09:47.440
Like, you might want to inline a function, but somebody could monkey patch that function and then you wouldn't be inlining the right thing, for example.

00:09:47.440 --> 00:09:49.220
And then you monkey patch it only sometimes.

00:09:49.220 --> 00:09:50.700
What do you do then, right?

00:09:50.700 --> 00:09:56.940
So, things like method inlining, which can really make things faster, is super hard because that could actually change.

00:09:56.940 --> 00:10:01.160
Where, say, in C# or C++ or whatever, the method won't change.
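
A tiny illustration of the point (made-up names): the function a call site resolves to can be rebound at runtime, so an optimizer can't safely inline it.

    def greet():
        return "hello"

    def caller():
        return greet()      # can't be safely inlined...

    print(caller())         # hello

    def patched():
        return "patched"

    greet = patched         # ...because the name can be rebound (monkey patched)
    print(caller())         # patched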

00:10:01.160 --> 00:10:01.720
Yeah.

00:10:01.720 --> 00:10:02.300
Okay.

00:10:02.300 --> 00:10:02.660
All right.

00:10:02.660 --> 00:10:03.120
Final one.

00:10:03.120 --> 00:10:03.940
This is mine.

00:10:03.940 --> 00:10:04.940
I'm adding this to his.

00:10:04.940 --> 00:10:07.760
And it sort of has to do with this dynamic typing thing.

00:10:07.760 --> 00:10:12.320
It's that everything is a reference type allocated on the heap in Python.

00:10:12.320 --> 00:10:13.560
Right?

00:10:13.560 --> 00:10:20.700
Some of the stuff that makes C++ and C# really, really fast is things like numbers and other stuff are allocated on the stack.

00:10:20.700 --> 00:10:24.200
And when you work with them, you never do pointer dereferencing.

00:10:24.200 --> 00:10:25.720
You never do reference counting.

00:10:25.720 --> 00:10:27.880
You never do garbage collection or memory management.

00:10:27.880 --> 00:10:30.500
You just work with little bits on the stack.

00:10:30.500 --> 00:10:39.520
And because that little thing on the stack could become a full-blown list or something just by changing what it points at, I think probably that also makes a big difference.
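
You can see the cost of that boxing directly; even a tiny integer is a full heap object with type and reference-count headers (sizes vary by build, typically 24-28 bytes on 64-bit CPython versus 4-8 bytes for a C int):

    import sys
    print(sys.getsizeof(0))          # ~24 bytes on a 64-bit build
    print(sys.getsizeof(10**100))    # grows as the integer grows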

00:10:39.740 --> 00:10:40.000
Okay.

00:10:40.000 --> 00:10:45.160
There's also the fact that function calls are slower than they need to be.

00:10:45.160 --> 00:10:50.560
And I think that's one of the things the Python core team is working on trying to improve.

00:10:50.560 --> 00:10:54.220
And luckily, there have been some advances there in the latest version of Python.

00:10:54.220 --> 00:10:56.340
I think they got 20% or something.

00:10:56.340 --> 00:10:58.440
Was that 3.7, I think, that they got that much faster?

00:10:58.440 --> 00:11:00.200
So work is being done.

00:11:00.200 --> 00:11:02.440
But, yeah, it could be more, I guess.

00:11:02.440 --> 00:11:07.700
But what's really interesting is the tradeoffs and sort of comparisons in the article.

00:11:07.860 --> 00:11:16.040
The thing that I want to get at, there's a forest-for-the-trees sort of issue I have here, is that I don't think Python is slow.

00:11:16.040 --> 00:11:16.920
Yeah, I'm with you.

00:11:16.920 --> 00:11:27.000
And for me, people time is way more expensive than computer time for, like, 98% of the applications in the world, probably.

00:11:27.000 --> 00:11:34.740
Well, and somebody made that comment below in the comment section of this article that said, well, if you're optimizing milliseconds versus nanoseconds, yes, maybe.

00:11:34.740 --> 00:11:41.760
If you're optimizing weeks versus months from idea to shipped, you know, Python's not slow at all.

00:11:41.760 --> 00:11:42.380
It's really fast.

00:11:42.380 --> 00:11:42.780
Yeah.

00:11:43.040 --> 00:11:51.340
And, I mean, that's what I see: the development time, the maintenance time, all of those extra people-time things.

00:11:51.340 --> 00:11:53.240
Python is way faster.

00:11:53.720 --> 00:11:59.380
And a lot of these things we say, like, the GIL is a problem or multiprocessing is difficult.

00:11:59.380 --> 00:12:01.540
Well, multiprocessing isn't easy.

00:12:01.540 --> 00:12:03.440
Maybe it's easy in some languages.

00:12:03.440 --> 00:12:07.100
I mean, Go is sort of designed to do that from the start.

00:12:07.100 --> 00:12:13.620
But getting a complex algorithm in C++ to utilize multi-cores, that's not a trivial task either.

00:12:13.760 --> 00:12:14.520
No, it's not.

00:12:14.520 --> 00:12:15.500
It's definitely not.

00:12:15.500 --> 00:12:21.060
And I would say on the web framework side, actually, Python is really quite fast.

00:12:21.060 --> 00:12:31.160
I've compared it against, like, C#-based JIT-compiled things like ASP.NET and stuff for a conference talk I did comparing Python to the .NET stuff.

00:12:31.160 --> 00:12:37.100
And, actually, Python was not just as fast, but faster than the JIT compiled C# stuff.

00:12:37.100 --> 00:12:37.380
Yeah.

00:12:37.520 --> 00:12:43.400
And also, just for people who do think that maybe Python's speed is the problem, really measure it.

00:12:43.400 --> 00:12:44.600
And you can optimize.

00:12:44.600 --> 00:12:47.000
There's ways in Python to optimize the parts that are slow.

00:12:47.000 --> 00:12:47.940
Yeah.

00:12:47.940 --> 00:12:51.300
And my next item will come back to this and show you some more ways to make it faster.

00:12:51.300 --> 00:12:51.600
Okay.

00:12:51.600 --> 00:12:53.200
All right.

00:12:53.200 --> 00:12:55.840
So, what's this Mu thing all about?

00:12:55.840 --> 00:12:57.960
Mu, you talked about Mu before, right?

00:12:57.960 --> 00:13:00.480
It's like a simplified IDE type thing.

00:13:00.480 --> 00:13:00.960
Is that right?

00:13:00.960 --> 00:13:01.320
Right.

00:13:01.320 --> 00:13:07.020
So, I'm going to highlight Mu again, partly because I think it's a neat project that's going on.

00:13:07.100 --> 00:13:08.880
And there's a lot of cool people working on it.

00:13:08.880 --> 00:13:12.320
But there was an article called Keynoting in Mu.

00:13:12.320 --> 00:13:14.840
And Mu being M-U.

00:13:14.840 --> 00:13:21.480
And at EuroPython 2018, David Beazley did a talk and a demo called Die Threads.

00:13:21.480 --> 00:13:25.360
And what amused me is my first thought was, is this a German joke?

00:13:25.360 --> 00:13:27.700
And he actually addressed it early on.

00:13:27.700 --> 00:13:28.800
It's not a German joke.

00:13:28.800 --> 00:13:30.380
But it's a good demo.

00:13:30.520 --> 00:13:33.920
But he used Mu during his Python talk.

00:13:33.920 --> 00:13:39.120
And this article talks about, and also asked him about it, why he did that.

00:13:39.120 --> 00:13:41.560
And it's just a simple thing.

00:13:41.560 --> 00:13:44.060
There's not much to it, and it's the same experience for everybody.

00:13:44.060 --> 00:13:45.660
Like, for instance, I use PyCharm.

00:13:46.080 --> 00:13:54.120
But my environment in PyCharm, all the different colors I like or the plugins I use, it's going to be different than everybody else's.

00:13:54.120 --> 00:13:55.780
It's one of those customizable things.

00:13:55.860 --> 00:14:05.460
And so a very simple interface like Mu works as a learning tool, but also shows people exactly what you're doing.

00:14:05.460 --> 00:14:12.800
So one of the features is, if you've got a little script that you want to run, you can just push Run at the top.

00:14:12.800 --> 00:14:19.420
And then automatically a little window pops up at the bottom and shows you the output of the thing you're running.

00:14:20.160 --> 00:14:22.040
And that's just really handy.

00:14:22.040 --> 00:14:26.740
And so it's kind of fun to watch that being used in a talk.

00:14:26.740 --> 00:14:30.740
And I think that would be handy. I do little demos for people at work.

00:14:30.740 --> 00:14:37.520
And I think I might try this also just to not have to answer questions like, what plugin are you using or whatever.

00:14:37.520 --> 00:14:40.640
Yeah, just sort of keep the distractions to a minimum, huh?

00:14:40.640 --> 00:14:40.960
Yeah.

00:14:41.100 --> 00:14:46.880
And it's a really clean interface, and it looks like you can change the font size. It looks fun.

00:14:46.880 --> 00:15:00.400
Plus, if you haven't played around with Mu yet, it also automatically has hooks into things like running a micro:bit and Raspberry Pi and stuff.

00:15:00.400 --> 00:15:03.100
The MicroPython and embedded IoT things?

00:15:03.100 --> 00:15:03.660
Yeah.

00:15:03.660 --> 00:15:04.460
Yeah, very cool.

00:15:04.460 --> 00:15:05.440
Yeah, I like it.

00:15:05.440 --> 00:15:06.100
That's a nice one.

00:15:06.380 --> 00:15:11.740
So before we get on to my next performance thing, let me tell you guys about Datadog.

00:15:11.740 --> 00:15:14.360
So Datadog is sponsoring this episode like they have many others.

00:15:14.360 --> 00:15:16.140
Thank you to them for that, of course.

00:15:16.140 --> 00:15:21.180
So they have infrastructure monitoring, distributed tracing, and they've added logging.

00:15:21.760 --> 00:15:29.980
They provide end-to-end visibility for requests across different parts of your infrastructure, as well as health and performance monitoring of your Python apps.

00:15:29.980 --> 00:15:33.920
So get started with them today for a 14-day free trial.

00:15:33.920 --> 00:15:37.300
And, of course, they'll send you a cool Datadog t-shirt with it.

00:15:37.300 --> 00:15:40.240
Just go to pythonbytes.fm/Datadog and check it out.

00:15:40.240 --> 00:15:42.380
So, yeah, very cool stuff from Datadog.

00:15:42.380 --> 00:15:46.640
Brian, I told you about the performance thing and whether Python is slow.

00:15:46.640 --> 00:15:47.760
I agree with you.

00:15:47.820 --> 00:15:50.880
I generally find that for what I do, it's actually blazing fast.

00:15:50.880 --> 00:15:54.800
So not a problem, but sort of depends on choosing the right infrastructure.

00:15:54.800 --> 00:15:55.340
Yeah.

00:15:55.340 --> 00:15:55.640
Yeah.

00:15:55.640 --> 00:16:06.700
So there's this interesting proof of concept done by this, I forgot the name, European Open Source Consortium.

00:16:07.260 --> 00:16:14.880
And the basic theme of this article is sort of exploring a response to the question of, so I've heard Python is slow.

00:16:14.880 --> 00:16:15.780
Is it?

00:16:15.780 --> 00:16:18.580
And I think it depends.

00:16:18.580 --> 00:16:29.240
So what they've done is they've created a multi-core, and we talked about how that can be hard to take advantage of, but here it is, a multi-core Python HTTP server that is much faster than Go.

00:16:29.640 --> 00:16:35.840
So you've heard a lot of people say, well, we're going to go to Go because its parallelism is so much better than Python's and it's fast.

00:16:35.840 --> 00:16:37.720
So we can, you know, do things fast.

00:16:37.720 --> 00:16:44.620
And of course, there's that big tradeoff with sort of the functionality and speed to market or speed to idea completion and so on.

00:16:45.180 --> 00:16:48.760
But this thing is like, hey, we could even go as fast or faster.

00:16:48.760 --> 00:16:55.000
So they compare it against actually two Go web servers and it's faster than both of them.

00:16:55.000 --> 00:16:59.660
So what it is, is these guys have gone and said, we've talked about this idea before.

00:16:59.660 --> 00:17:11.120
They said, we want to go look at all the various C-based, not Python, C-based coroutine libraries that will let us write, like, low-level HTTP servers.

00:17:11.240 --> 00:17:17.140
And it turns out that there were not that many good options and the best option that they found was still slower than Go.

00:17:17.140 --> 00:17:19.940
So they're like, well, maybe this isn't going to work.

00:17:19.940 --> 00:17:25.320
But then it turned out they found this thing called lwan, L-W-A-N, which is a C library.

00:17:25.320 --> 00:17:33.400
And they used Cython to create a Cython-based web server that wraps it up and exposes it to Python.

00:17:33.400 --> 00:17:34.060
Ooh, neat.

00:17:34.060 --> 00:17:34.680
Cool, right?

00:17:34.680 --> 00:17:37.660
So basically it's not really ready to ship or anything.

00:17:37.660 --> 00:17:41.020
It just validates the concept of creating a high-performance thing with Cython.

00:17:41.020 --> 00:17:43.700
And I think that's a big part of the article.

00:17:43.700 --> 00:17:46.280
So it says, here's some interesting things about Cython.

00:17:46.280 --> 00:17:55.740
It's both an optimizing static compiler and a hybrid language that gives you the ability to write Python code that can call back and forth with C and C++.

00:17:55.740 --> 00:17:56.380
Really easy.

00:17:56.940 --> 00:18:04.940
It has static type declarations that make Python code faster because it doesn't have to treat everything like a reference type.

00:18:04.940 --> 00:18:08.040
It can, you know, put integers as integers on the stack and whatnot.

00:18:08.040 --> 00:18:14.060
And the other thing is it releases the GIL by just having, like, a keyword in Cython.

00:18:14.060 --> 00:18:16.060
So you can say, this part actually doesn't need the GIL.

00:18:16.480 --> 00:18:18.060
I'm doing this in C, leave me alone.
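
As a rough Cython sketch of those two ideas (a hypothetical module, not from the article): C-typed locals plus a nogil block, so the loop compiles down to plain C and runs without holding the GIL.

    # sum_squares.pyx -- hypothetical example
    def sum_squares(int n):
        cdef long total = 0
        cdef int i
        with nogil:                  # this part doesn't need the GIL
            for i in range(n):       # compiles to a plain C loop
                total += i * i
        return total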

00:18:18.060 --> 00:18:18.780
Oh, okay.

00:18:18.780 --> 00:18:22.460
Isn't it interesting how that actually hits so many of the things that Anthony brought up, right?

00:18:22.460 --> 00:18:22.860
Yeah.

00:18:22.860 --> 00:18:27.060
The typing, optimizations, and the GIL stuff are all right there.

00:18:27.060 --> 00:18:32.960
So it generates super efficient C code that is then compiled into a Python module.

00:18:32.960 --> 00:18:37.280
So it's like really great for wrapping up C libraries and exposing them to Python.

00:18:37.280 --> 00:18:39.860
I hope they continue on with work on this.

00:18:39.860 --> 00:18:41.380
Yeah, it looks pretty awesome.

00:18:41.380 --> 00:18:41.720
Neat.

00:18:41.720 --> 00:18:42.020
Yeah.

00:18:42.020 --> 00:18:46.980
Anyway, I think, you know, if you're interested in sort of checking out the performance, definitely have a look.

00:18:46.980 --> 00:18:47.720
It's pretty cool.

00:18:47.720 --> 00:18:50.080
So are you going to cover something on testing in this episode?

00:18:50.080 --> 00:18:51.360
Oh, yeah.

00:18:51.360 --> 00:18:52.860
It's my turn again.

00:18:52.860 --> 00:18:54.200
It's your turn for a theme.

00:18:54.200 --> 00:18:55.400
Where am I at?

00:18:55.400 --> 00:18:56.480
I'm trying to get on here.

00:18:56.480 --> 00:18:57.740
Oh, yes.

00:18:57.740 --> 00:19:01.660
I am very excited about some news that came out last week.

00:19:01.660 --> 00:19:07.580
So I've been playing with... I started using PyCharm a while ago.

00:19:07.580 --> 00:19:09.120
You've been using it off and on.

00:19:09.120 --> 00:19:10.560
I think you use lots of stuff now.

00:19:10.560 --> 00:19:11.140
Yeah, yeah.

00:19:11.480 --> 00:19:12.700
I've done it for quite a while, yeah.

00:19:12.700 --> 00:19:17.500
The pytest support has been getting better and better in the last year or so.

00:19:17.500 --> 00:19:28.860
And PyCharm released 2018.2, I think it was last week, and it totally beefs up support for pytest fixtures.

00:19:29.620 --> 00:19:38.420
And that's, I think, I just wanted to mention that because anybody that's using both PyCharm and pytest definitely needs to update to 2018.2.

00:19:39.380 --> 00:19:41.760
There are a few things that didn't work before that work now.

00:19:41.760 --> 00:19:54.680
If you have a fixture, so a fixture, if anybody's not familiar, is just a little piece of code, a little function that you declare as a fixture, and you put it as a parameter to your test.

00:19:54.900 --> 00:19:58.320
And it gets run before your test gets run at various levels.

00:19:58.320 --> 00:20:00.300
That's a simplification, but that works.

00:20:00.920 --> 00:20:05.440
It shows up as a parameter, but you don't actually have to use it in your function.

00:20:05.440 --> 00:20:08.520
It just tells pytest to run that code.
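
A minimal sketch of that pattern (the fixture and test names are made up for illustration):

    import pytest

    @pytest.fixture
    def db_connection():
        conn = {"connected": True}      # setup runs before the test
        yield conn                      # hand the object to the test
        conn["connected"] = False       # teardown runs after the test

    def test_query(db_connection):
        # Naming the fixture as a parameter is enough for pytest to run it,
        # even if the test body never touched it.
        assert db_connection["connected"]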

00:20:08.520 --> 00:20:13.200
Well, PyCharm used to flag that as a variable that isn't referenced.

00:20:13.580 --> 00:20:15.020
So it gives you a warning.

00:20:15.020 --> 00:20:20.940
And you also couldn't do code completion with it if it was an object that had methods on it.

00:20:20.940 --> 00:20:28.720
And you also couldn't use it to look up where that fixture is defined, because it's often defined in a different file, in a conftest.py file or something.

00:20:28.720 --> 00:20:30.000
But now you can.

00:20:30.000 --> 00:20:31.280
All those things now work.

00:20:31.280 --> 00:20:35.380
So big thank you to the PyCharm team for getting that out so quickly.

00:20:35.380 --> 00:20:39.360
And yeah, I just wanted to let people know about that.

00:20:39.360 --> 00:20:40.120
Yeah, that's really awesome.

00:20:40.120 --> 00:20:41.820
PyCharm just keeps getting better and better.

00:20:42.080 --> 00:20:47.200
All right, so speaking of getting better, let's go back to Python performance in an indirect way.

00:20:47.200 --> 00:20:51.100
You got like this bone that you won't let go of, man.

00:20:51.100 --> 00:20:52.440
No, it's good.

00:20:52.440 --> 00:20:57.120
So this one actually hits on two themes that I think we've touched a lot on.

00:20:57.120 --> 00:21:01.080
One is this performance kick that I'm on today, but the other is packaging.

00:21:01.080 --> 00:21:04.740
Like we've talked about how it's not super easy to package up Python.

00:21:04.740 --> 00:21:09.500
So for example, Go compiles to a single executable binary with zero dependencies.

00:21:09.500 --> 00:21:12.080
You take that, you give it to somebody, they run it.

00:21:12.080 --> 00:21:17.160
That's not how it works for complicated apps that have package dependencies and stuff in Python.

00:21:17.160 --> 00:21:20.020
You can't just easily go, here, double click this.

00:21:20.020 --> 00:21:21.960
You're like, yeah, good luck.

00:21:21.960 --> 00:21:22.660
Hold on.

00:21:22.660 --> 00:21:23.860
First pip install requirements.

00:21:23.860 --> 00:21:24.500
Now what else?

00:21:24.500 --> 00:21:25.720
Oh, you got the wrong version of Python.

00:21:25.720 --> 00:21:26.260
Hold on.

00:21:26.680 --> 00:21:32.620
So the whole packaging thing, you know, there are some attempts to deal with it, like cx_Freeze and stuff like that.

00:21:32.620 --> 00:21:34.320
And it's somewhat working.

00:21:34.320 --> 00:21:39.840
But here's an interesting thing from Facebook called XAR.

00:21:41.040 --> 00:21:44.160
So a XAR is an executable archive.

00:21:44.160 --> 00:21:56.220
And basically what it is, is you can package up a bunch of code into a single executable file that you can then mount as, like, a separate file system that's read-only.

00:21:56.220 --> 00:21:58.720
Like a CD or something, I guess.

00:21:59.260 --> 00:22:01.400
And so you can mount this and then execute it.

00:22:01.400 --> 00:22:12.120
But because it came as a whole block, basically, it's sort of a native file system where, you know, you can read files that are next to your Python files.

00:22:12.120 --> 00:22:14.700
And you just package up your dependencies and run it.

00:22:14.700 --> 00:22:15.520
Yeah.

00:22:15.520 --> 00:22:22.240
So it's this read-only file system which, when you mount it, looks like a regular directory to user space.

00:22:22.240 --> 00:22:25.660
The one drawback... like, when I saw this, I was like, oh my gosh, is this it?

00:22:25.660 --> 00:22:29.960
Is this the thing that is going to make it that we can just go here, double click this in Python?

00:22:29.960 --> 00:22:33.360
It turns out there's a minor little step here.

00:22:33.360 --> 00:22:38.260
This requires a one-time installation of a system level device driver for the file system.

00:22:38.260 --> 00:22:40.740
So maybe not so much just double click.

00:22:41.260 --> 00:22:50.840
But if you're willing to install this, you know, maybe as your organization you install it, or on your servers, for whatever reason, you're willing to install this thing called SquashFS.

00:22:50.840 --> 00:22:55.220
Then XARs become these things you can just pass around and run.

00:22:55.220 --> 00:22:56.240
So that's pretty cool.

00:22:56.240 --> 00:22:57.900
So that's a good caveat.

00:22:57.900 --> 00:22:59.520
But, you know, think of Facebook.

00:22:59.520 --> 00:23:04.140
Their goal is to make it super easy to deploy these applications onto servers and run them.

00:23:04.140 --> 00:23:05.200
Right.

00:23:05.200 --> 00:23:11.120
So they must pre-configure their servers with SquashFS and then just make this part of their deploy mechanism.

00:23:11.120 --> 00:23:15.160
So they say there are basically two primary use cases for these XARs.

00:23:15.160 --> 00:23:20.920
One is simply collecting a number of files for automatic atomic mounting somewhere in the file system.

00:23:20.920 --> 00:23:21.500
Cool.

00:23:22.340 --> 00:23:26.980
And so you can use this thing called the XAR exec helper.

00:23:26.980 --> 00:23:31.480
It becomes a self-contained package of executable code and its data.

00:23:31.480 --> 00:23:40.540
So an example might be a Python app that archives all of its source code and its native libraries and configuration and all that kind of stuff.

00:23:40.540 --> 00:23:41.700
I still think that's pretty cool.

00:23:41.700 --> 00:23:42.040
Yeah.

00:23:42.140 --> 00:23:45.040
It's kind of a focused use, but still pretty cool.

00:23:45.040 --> 00:23:45.360
Yeah.

00:23:45.360 --> 00:23:46.640
But then it's a rabbit hole.

00:23:46.640 --> 00:23:49.820
Now I've got to go and read about SquashFS and what that is.

00:23:49.820 --> 00:23:51.460
Yes, it's true.

00:23:51.460 --> 00:24:02.280
So actually on this, it seems like it's generally, it could be a general mechanism for, I don't know, Ruby or JavaScript or Node or something like that.

00:24:02.280 --> 00:24:06.900
But they particularly call out Python on the GitHub page from the Facebook incubator.

00:24:06.900 --> 00:24:15.660
And it says some of the advantages of using it for Python are that it looks like regular files on disk to Python.

00:24:15.660 --> 00:24:18.360
So that means you can just run CPython and it doesn't know any better.

00:24:18.360 --> 00:24:19.140
Same thing.

00:24:19.140 --> 00:24:20.780
It looks like regular files to you.

00:24:20.780 --> 00:24:25.780
You don't need to use, like, weird package-import or package-resource tricks or anything like that to pass stuff around.

00:24:26.780 --> 00:24:36.600
Standard compression, and it doesn't require unpacking .so files, apparently, which, like, if you try to use one of the PEX mechanisms, was a problem.

00:24:36.600 --> 00:24:40.960
So more or less, CPython just works there, basically.

00:24:40.960 --> 00:24:47.080
Your idea of like, okay, so there's this dependency, but if you're using it within an organization, that totally makes sense.

00:24:47.080 --> 00:24:47.620
It's fine.

00:24:47.620 --> 00:24:52.400
And I, yeah, that sort of use case or within your own servers or whatever.

00:24:52.400 --> 00:24:55.720
Yeah, there's a lot of use cases where I think this would be very useful for people.

00:24:55.720 --> 00:25:00.360
Yeah, I mean, this, like those places you described, that's where a lot of Python is used.

00:25:00.360 --> 00:25:04.760
So it also has some interesting performance benefits, which is coming back to that.

00:25:04.760 --> 00:25:14.720
Because of the way SquashFS works, you're actually reading a smaller set of data off the disk, a smaller set of binaries, because it's compressed.

00:25:14.720 --> 00:25:17.320
So there's less disk activity and it's still really fast.

00:25:17.320 --> 00:25:25.580
So the startup time can actually be faster for your app than even native Python, which is pretty cool.

00:25:25.580 --> 00:25:28.520
And once it's started once, it's pretty cool.

00:25:28.520 --> 00:25:40.120
There are some statistics in there that show that the second time you've interacted with that XAR, because the file system actually decompresses and caches that stuff in memory,

00:25:40.120 --> 00:25:42.300
it can actually run as fast or even a little bit quicker.

00:25:42.440 --> 00:25:44.600
So there's a lot of interesting things around performance there.

00:25:44.600 --> 00:25:44.940
Yeah.

00:25:44.940 --> 00:25:49.560
And finally, this file system thing, these XARs, right?

00:25:49.560 --> 00:25:58.820
They're read-only, which means the integrity of your app is guaranteed as opposed to, say, a virtual environment or like a folder where people could mess with it or change the Python system.

00:25:58.820 --> 00:26:00.160
Like it's read-only.

00:26:00.160 --> 00:26:01.580
So what you gave them is what they're running.

00:26:01.580 --> 00:26:02.860
That's a cool thing too.

00:26:02.980 --> 00:26:03.120
Yeah.

00:26:03.120 --> 00:26:04.320
There's a lot of neat stuff here.

00:26:04.320 --> 00:26:08.120
I don't think it fits my use case, but maybe some listeners, it'll work well for them.

00:26:08.120 --> 00:26:08.540
Definitely.

00:26:08.540 --> 00:26:08.860
Yeah.

00:26:08.860 --> 00:26:10.560
A lot of people seemed excited about it on the Twitter.

00:26:10.560 --> 00:26:11.280
On the Twitters.

00:26:11.280 --> 00:26:12.260
Exactly.

00:26:12.260 --> 00:26:13.040
All right.

00:26:13.040 --> 00:26:14.240
Well, that's it for all of our items.

00:26:14.240 --> 00:26:15.720
You got some extras you want to throw out there?

00:26:15.720 --> 00:26:29.840
Oh, just somebody mentioned to me, speaking of Twitters, that NumPy 1.15.0 was just released recently, and they completely overhauled their testing to use pytest.

00:26:29.840 --> 00:26:30.600
Yay.

00:26:30.600 --> 00:26:31.200
Yay.

00:26:31.540 --> 00:26:32.580
Another win for pytest.

00:26:32.580 --> 00:26:32.980
That's awesome.

00:26:32.980 --> 00:26:33.380
Yeah.

00:26:33.380 --> 00:26:37.160
And then you've got a couple lists of some videos that are out.

00:26:37.160 --> 00:26:37.480
Yeah.

00:26:37.480 --> 00:26:38.600
I have a couple of events.

00:26:38.600 --> 00:26:40.800
I have two in the future and two in the past.

00:26:40.800 --> 00:26:46.560
Maybe, depending on when you listen to this, actually all of them in the past, but right now, two in the future and two in the past.

00:26:46.560 --> 00:26:54.600
So SciPy 2018, the Data Science Python World Conference, the videos for that are now out on YouTube.

00:26:54.600 --> 00:27:00.280
So if you couldn't make it to SciPy and you want to catch a bunch of the presentations there, here's a link to the videos.

00:27:00.740 --> 00:27:02.120
Also, PyOhio.

00:27:02.120 --> 00:27:11.880
We have a lot of these regional Python conferences, and somehow PyOhio has actually gained quite a bit of momentum, which is interesting with PyCon being there last year and next.

00:27:11.880 --> 00:27:14.180
Anyway, the videos for that are also out.

00:27:14.260 --> 00:27:17.880
So a bunch of good YouTube watching coming up here over the weekend.

00:27:17.880 --> 00:27:18.260
Yeah.

00:27:18.260 --> 00:27:20.040
I've already started a couple of these.

00:27:20.040 --> 00:27:21.080
Nice.

00:27:21.080 --> 00:27:22.880
And then in the future, we've got...

00:27:22.880 --> 00:27:25.740
In the future, PyCon Canada is coming.

00:27:26.100 --> 00:27:28.840
So the call for papers on that is open for about a month.

00:27:28.840 --> 00:27:41.020
And in the future, PyBay 2018, so the regional San Francisco Python conference, is happening in a couple of weeks.

00:27:41.020 --> 00:27:43.000
And I would love to go to that, but I can't.

00:27:43.000 --> 00:27:44.020
Yeah, I still haven't decided.

00:27:44.020 --> 00:27:44.860
Are you thinking about going?

00:27:44.920 --> 00:27:48.160
I was thinking about going to PyBay, but there's a lot of stuff going on in the fall.

00:27:48.160 --> 00:27:49.020
So we'll see.

00:27:49.020 --> 00:27:51.080
Yeah, I have daughters going to college right around then.

00:27:51.080 --> 00:27:54.720
So it would probably be better if I helped them do that instead of just go hang out in San Francisco.

00:27:57.040 --> 00:27:57.540
All right.

00:27:57.540 --> 00:28:03.980
So the last thing is I just want to let people know that I have another course out building data-driven web apps with Pyramid and SQLAlchemy.

00:28:03.980 --> 00:28:05.180
It's super fun.

00:28:05.180 --> 00:28:06.700
Nine hours of awesomeness.

00:28:06.700 --> 00:28:08.000
So there's a link in there.

00:28:08.000 --> 00:28:09.040
Check that out as well.

00:28:09.040 --> 00:28:09.820
People are interested.

00:28:09.820 --> 00:28:11.060
That looks fun.

00:28:11.060 --> 00:28:12.580
Yeah, it's definitely a good one.

00:28:12.580 --> 00:28:19.260
It covers some things that I've wanted to cover for a while, like ORM migrations and managing stuff over time with databases and stuff.

00:28:19.260 --> 00:28:19.800
Pretty cool.

00:28:19.800 --> 00:28:20.220
Cool.

00:28:20.220 --> 00:28:20.540
All right.

00:28:20.540 --> 00:28:22.120
Well, Brian, thanks as always.

00:28:22.120 --> 00:28:22.900
It's been fun.

00:28:22.900 --> 00:28:23.320
Thank you.

00:28:23.320 --> 00:28:23.560
Yep.

00:28:23.560 --> 00:28:23.920
Bye.

00:28:25.420 --> 00:28:27.240
Thank you for listening to Python Bytes.

00:28:27.240 --> 00:28:29.800
Follow the show on Twitter via at Python Bytes.

00:28:29.800 --> 00:28:32.680
That's Python Bytes as in B-Y-T-E-S.

00:28:32.680 --> 00:28:36.100
And get the full show notes at pythonbytes.fm.

00:28:36.100 --> 00:28:40.460
If you have a news item you want featured, just visit pythonbytes.fm and send it our way.

00:28:40.460 --> 00:28:43.140
We're always on the lookout for sharing something cool.

00:28:43.140 --> 00:28:46.540
On behalf of myself and Brian Okken, this is Michael Kennedy.

00:28:46.540 --> 00:28:50.180
Thank you for listening and sharing this podcast with your friends and colleagues.

