


Transcript for Episode #89:
A tenacious episode that won't give up

Recorded on Thursday, Aug 2, 2018.

0:00 KENNEDY: Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your ear buds. This is Episode 89, recorded August 2nd, 2018. I am Michael Kennedy.

0:10 OKKEN: And I am Brian Okken.

0:11 KENNEDY: Hey, Brian. How are you doing?

0:12 OKKEN: I'm doing great. It's good to talk to you again.

0:14 KENNEDY: Yeah, you as well. Man, we got some cool stuff lined up this week. We always say that, but there's really so much happening in the Python space. It's more exciting every week.

0:22 OKKEN: Yeah, I think it's because we're here and people are excited about Python and then they do more and it builds on itself.

0:29 KENNEDY: It's this awesome feedback loop, I can totally feel it. What also is awesome is Datadog for sponsoring the show, so if you need infrastructure monitoring or monitoring of your apps, check them out at pythonbytes.fm/datadog. I'll tell you more about them later. Brian, you found some code that's pretty tenacious out there, didn't ya? It just won't quit.

0:47 OKKEN: Tenacious is a fun word, and there's a project called Tenacity that is pretty cool. We'll of course have a link in the show notes. Tenacity is a general purpose retrying library to simplify the task of adding retry behavior to just about anything. There's a little code snippet, but the gist of it is that you import retry; it can take a lot of options, but the defaults just sort of work too. You can just put this around a function, and if your function raises an exception, it just tries it again and eats the exception. So for a lot of stuff, this is terrible. This is a terrible idea in a lot of places. But in a few places this might be reasonably good, you know the places where retrying is a good idea, like maybe saving something to a file system, or connecting to a service that sometimes has contention.
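
A rough sketch of the basic usage he's describing (the function name is hypothetical):

```python
from tenacity import retry

@retry  # with no arguments: retry forever, on any exception
def save_to_disk():
    # hypothetical function; if this raises, Tenacity eats the
    # exception and simply calls the function again
    ...
```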

1:54 KENNEDY: Yeah, or even a database. Like what if you have to restart the database server? Or something to that effect, right? Or you need to really quickly apply a migration, right? You could say, "just keep retrying the database" until you get there.

2:09 OKKEN: Yeah, and then there's a whole bunch of extra conditions you could put on it. You could say try so many times, or set a wait time, so if it doesn't work after a while, then give up. Then you can customize which exceptions it catches or doesn't catch. It even has retry on coroutines, which is kind of fun. I haven't tried that, but I've tried the simple case. I'm going to use it right away. We've got several cases where we're testing devices, little Wi-Fi devices, for example, and the device might be asleep, so it might take a while for the thing to wake up and respond. So a number of retries is good, and we have retry code all over our code base, so having a decorator do that for us is just pretty slick. I like it.

2:58 KENNEDY: Yeah, it's really cool, and you know, like I said, the database thing, you wouldn't just put retry and continue to hammer the database indefinitely if you run into any problems. But you know, like, try five times with an exponential falloff, just so you could basically handle five-second downtimes, things like that. That would be nice.
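
That five-tries-with-backoff idea might look something like this with Tenacity (the function body here is a hypothetical placeholder):

```python
from tenacity import retry, stop_after_attempt, wait_exponential

@retry(stop=stop_after_attempt(5),            # give up after 5 tries
       wait=wait_exponential(multiplier=0.5,  # waits 0.5s, 1s, 2s, 4s...
                             max=10))
def query_database():
    # hypothetical placeholder; if this keeps failing, Tenacity
    # raises a RetryError once the stop condition is reached
    ...
```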

3:16 OKKEN: Yeah, that would definitely be part of, like you said, a distributed system, where you've got things that sometimes are not available for a little while, so.

3:25 KENNEDY: Yeah, especially stuff where you don't control everything, like other services you depend upon.

3:30 OKKEN: Yeah.

3:31 KENNEDY: Pretty cool. So I'm going to bring a theme into this show for the next couple of topics, and I think it's actually going to touch on every one of the things that I'm bringing. So, let's start with Anthony Shaw, of course. We've got to cover an article by Anthony Shaw, right?

3:48 OKKEN: Yeah, he writes a lot of good stuff.

3:50 KENNEDY: Yeah, he does. One of the ones that I came across here is called "Why is Python so Slow?" So I think it's interesting to even ask the question: is Python slow? And to explain how that is. So Anthony looks at some benchmarks comparing, say, Python to Java, to C#, to C++, and things like that, and talks about, well, in some cases it is slower, and why might that be? So basically, answering the question, Python completes a comparable operation, like, two to ten times slower than, say, Java or C#. Why, and why can't we make it faster? So he has three main hypotheses, which are pretty interesting. One is, it's the GIL, the Global Interpreter Lock. The other is, it's because it's interpreted rather than compiled. And the final one is, it's a dynamic language. So, those three are pretty interesting. What do you think?

4:49 OKKEN: Okay, so he's not saying that these are the reasons, but these are theories?

4:53 KENNEDY: These are theories, and then he goes through each one of them and pretty deeply looks at how they work and then compares them. So for example, it's interpreted, not compiled: well, what does that mean in terms of trade-offs? Let's compare that to, say, the way C# does things, the way C++ works, what are the trade-offs there? So we'll go through some of them. One is, it's the GIL, right? Now, modern computers, modern processors, have multiple cores. And if you actually look at Moore's law, Moore's law is still alive and well, right? The number of transistors in a chip keeps growing. But people sort of correlated that indirectly to, "Well, that means computers are getting faster and faster and faster." But it turns out that a number of years ago, four, five, I don't know how many years ago, not too long ago, the actual clock speed no longer kept doubling along with Moore's law. It went flat, more or less, and what started happening was we got two-core machines, and then four-core machines. I just got a new MacBook, it has six hyper-threaded cores. It's crazy, right? But in Python, if I want to take advantage of that computationally, it's super hard within one process because of the GIL. So he says, "If you want to take advantage of modern hardware, maybe the GIL is the problem." So he talks about some of the trade-offs there: when it matters, when it doesn't matter. So, for example, if what you're doing is IO bound, it basically doesn't much matter, right? The GIL is released when you're waiting on, like, network calls and stuff like that. So in some sense, the GIL is not the problem. If you created a bunch of threads and they all started reading and writing files, talking over the network, it should just automatically handle that. But if, on the other hand, you tried to create six threads and do computational stuff, you'd still probably get 12% CPU usage on my machine, because you know, you only get to run one at a time. So that's one of the theories, and I think this theory applies more in some places and less in others, and I'll touch on that a little bit. If your goal is to do computational, mathematical things, the GIL can really, really matter. It makes a big difference, right? Because while you're trying to execute your Python code, it doesn't let go of the GIL. But if, say, you're building a web app, it probably doesn't matter, right? There are some ways you could do some things that would be better, but it doesn't really matter. So for example, I looked at the servers before we came on today: the training.talkpython.fm site has 16 worker processes all running parallel versions of the website handling requests. talkpython itself has eight. I think maybe pythonbytes has eight as well. So anyway, there's these eight processes, and sure, one of them may, like, lock up something with the GIL, but there's a whole bunch of others that can leverage those other CPU cores and just keep on rockin'. So if you're doing web stuff, it matters less in a sense, I think.
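
As a rough illustration of the computational case (timings vary by machine): on CPython, six CPU-bound threads finish in roughly the time it would take to run the same work six times in a row, because only one thread can execute bytecode at a time:

```python
import threading
import time

def busy():
    # CPU-bound work: the thread holds the GIL while running this,
    # so only one thread executes Python bytecode at any moment
    n = 0
    for _ in range(10_000_000):
        n += 1

start = time.perf_counter()
threads = [threading.Thread(target=busy) for _ in range(6)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(f"6 CPU-bound threads took {time.perf_counter() - start:.2f}s")
```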

7:49 OKKEN: I mean, even if you are doing computational stuff, there are ways to get around it.
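
The standard library's multiprocessing module is the usual way around it: each worker is a separate process with its own interpreter and its own GIL. A minimal sketch, with a hypothetical workload:

```python
from multiprocessing import Pool

def busy(n):
    # hypothetical CPU-bound workload; each call runs in its own
    # process, so six of these really can use six cores at once
    total = 0
    for i in range(n):
        total += i * i
    return total

if __name__ == "__main__":
    with Pool(processes=6) as pool:
        results = pool.map(busy, [10_000_000] * 6)
    print(sum(results))
```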

7:53 KENNEDY: Yeah, if you can break up your algorithm into sub-process type parallelism. Okay, so that's the GIL. The other is, it could be that it's an interpreted language. I think this one is the most interesting, probably. So it is an interpreted language, but actually it does compile code to byte code; it just doesn't JIT compile it, right? And so one of the main considerations around JIT compiled languages versus not is startup time. So if our Python code is going to start and run for a while, then doing a whole bunch of JIT optimization would be maybe a little slower to start, but then faster to run. But if we want to just do some CLI stuff that starts really quick, does a tiny thing, goes away, a whole bunch of JIT stuff might be, you know, sort of counterproductive. So there's a pretty interesting comparison against C# and Java and CPython here. The other thing I think is worth throwing in here, because of C extensions and things like that, is that "it's an interpreted language" is a simplistic view. That's like, well, if you just take straight Python code and run it on Python, and you don't interact with any libraries. But if you work with NumPy, or if you work with SQLAlchemy, or a whole bunch of stuff that has C extensions to make certain parts fast, well, all of a sudden it's not interpreted, right? So there are these weird blends. Alright, last one: it's because it's dynamically typed. This is also, I think, really interesting. I think actually this is probably why, and I'm going to throw in another one unless I get distracted and forget it. So I think this is really why it's probably the slowest: it's a dynamic language. And it's not that you can't make a dynamic language fast, but because it's so flexible, it's hard to know how to optimize it. You might want to inline a function, but somebody could monkey patch that function, and then you wouldn't be inlining the right thing, for example. Maybe they monkey patch it only sometimes. What do you do then, right? So things like method inlining, which can really make things faster, are super hard, because that method could actually change. Whereas say, in C# or C++ or whatever, the method won't change.
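
A tiny example of why inlining is risky in a dynamic language; nothing stops code from swapping a method out at runtime:

```python
class Circle:
    def __init__(self, r):
        self.r = r

    def area(self):
        return 3.14159 * self.r ** 2

c = Circle(2)
print(c.area())   # 12.56636

# Perfectly legal Python: replace the method at runtime. A runtime
# that had inlined area() at the call site would now be wrong.
Circle.area = lambda self: 42
print(c.area())   # 42, same call site, different behavior
```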

10:01 OKKEN: Yeah, okay.

10:02 KENNEDY: Alright, final one. This is mine, I'm adding this to his, and it sort of has to do with his dynamic typing thing: everything is a reference type allocated on the heap in Python, right? Some of the stuff that makes C++ and C# really, really fast is that things like numbers are allocated on the stack, and when you work with them, you never do pointer dereferencing, you never do reference counting, you never do garbage collection or memory management. You just work with little bits on the stack. And because that little thing on the stack could become a full-blown list or something just by changing what it points at, I think that probably also makes a big difference.
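
You can see a hint of that from CPython itself: even a small integer is a full heap object, and the same name can be rebound to anything, which is exactly why values can't just live as raw machine words on the stack:

```python
import sys

x = 1
print(sys.getsizeof(x))   # 28 bytes on a typical 64-bit CPython,
                          # versus 4 or 8 bytes for a C int

# The name can now point at a list instead; the runtime has to
# allow for that, so everything stays a heap-allocated object.
x = [1, 2, 3]
print(sys.getsizeof(x))
```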

10:39 OKKEN: Okay. There's also, like, function calls are slower than they need to be, and I think that's one of the things the Python core team is working on.

10:49 KENNEDY: And luckily, there have been some advances there in the latest version of Python. I think it was 3.7 where they got that something like 20% faster. So, work is being done, but yeah, it could be more, I guess. But what's really interesting is the trade-offs, the sort of comparisons, in the article.

11:07 OKKEN: The thing that I want to, there's a forest for the trees sort of issue I have here, is that I don't think Python is slow.

11:15 KENNEDY: Yeah, I'm with you.

11:16 OKKEN: And people time is way more expensive than computer time for, like, 98% of the applications in the world, probably.

11:27 KENNEDY: Well, and somebody made that comment below in the comments section of this article that said, "Well, if you're optimizing milliseconds versus nanoseconds, yes, maybe. If you're optimizing weeks versus months from idea to shipped, Python's not slow at all, it's really fast."

11:42 OKKEN: Yeah, I mean, that's what I see: the development time, the maintenance time, all of those extra people-time things, Python is way faster. And a lot of these things that we say, like the GIL is a problem, or multi-processing is difficult, well, multi-processing isn't easy anywhere. Maybe it's easier in some languages. I mean, Go is sort of designed to do that from the start, but getting a complex algorithm in C++ to utilize multiple cores, that's not a trivial task either.

12:13 KENNEDY: No, it's not. It's definitely not. And I would say, on the web framework side, actually Python is really quite fast. I've compared it against other, like, C#-based, JIT-compiled things like ASP.NET for some conference talk I did comparing Python to the .NET stuff, and actually Python was not just as fast, but faster than the JIT-compiled C# stuff.

12:37 OKKEN: Yeah, and also, for people that do think that maybe Python's speed is the problem: really measure it, and then you can optimize. There are ways in Python to optimize the parts that are slow, so.

12:47 KENNEDY: Yeah, in my next item we'll come back to this and show you some ways to make it faster.

12:51 OKKEN: Okay. Alright.

12:53 KENNEDY: So, what's this Mu thing all about? You talked about Mu before, right? It's like a simplified IDE type thing, is that right?

13:01 OKKEN: Right, so I'm going to highlight Mu again, partly because I think it's a neat project that's going on and there are a lot of cool people working on it, but there's an article called "Keynoting in Mu," Mu being M-U. At EuroPython 2018, David Beazley did a talk and a demo called "Die Threads," and what amused me is my first thought was, "Is this a German joke?" And he actually addressed it early on. It's not a German joke, but it's a good demo. But he used Mu during his Python talk, and this article talks about, and also asks him about, why he did that. It's just a simple thing. It's the same experience for everybody. For instance, I use PyCharm, but my environment in PyCharm, all the different colors I like or the plugins I use, is going to be different from everybody else's. It's one of those customizable things. And so you have a very simple interface like Mu that works as a learning tool but also shows people exactly what you're doing. One of the features is, if you've got a little script that you want to run, you can just push Run at the top, and then automatically a little window pops up at the bottom and shows you the output of the thing you're running, and that's just really handy. So it's kind of fun to watch that being used in a talk. I used to do little demos for people at work, and I think I might try this also, to not have to answer questions like, "What plugin are you using?" or whatever.

14:37 KENNEDY: Yeah, just sort of keep the distractions to a minimum, huh?

14:40 OKKEN: Yeah, and it's a real clean interface, and it looks like you can change the font size. It looks fun. Plus, Mu, if you haven't played around with it yet, is also something that automatically has hooks into things like running on the Micro:bit and Raspberry Pi and stuff, so.

15:00 KENNEDY: MicroPython and embedded IoT things.

15:03 OKKEN: Yeah.

15:04 KENNEDY: Yeah, very cool. Yeah, I like it. That's a nice one. So before we get on to my next performance thing, let me tell you guys about Datadog. So Datadog is sponsoring this episode like they have many others, thank you to them for that, of course. So they have infrastructure monitoring, distributed tracing, and they've added logging. They provide end-to-end visibility for requests across different parts of your infrastructure, as well as health and performance monitoring of your Python apps. So get started with them today with a 14-day free trial and of course, they'll send you a cool Datadog t-shirt. Just go to pythonbytes.fm/datadog and check it out. So, yeah, very cool stuff from Datadog. Brian, I told you about a performance thing and whether Python is slow. I agree with you, I generally find for what I do, it's actually blazing. So not a problem, but it sort of depends on choosing the right infrastructure.

15:54 OKKEN: Yeah.

15:55 KENNEDY: So there's this interesting proof of concept done by, ooh, I forgot the name, it's a European open source consortium, and the basic theme of this article is sort of exploring a response to the question of, "So I've heard Python is slow. Is it?" And I think it depends. So what they've done is they've created, we just talked about how multi-core could be hard to take advantage of, but here it is, a multi-core Python HTTP server that is much faster than Go. So you've heard a lot of people say, "Well, we're going to go to Go, because its parallelism is so much better than Python and it's fast, so we can, y'know, do things fast." And of course, there's that big trade-off with sort of the functionality and speed to market, or speed to idea completion, and so on, but this thing says hey, we can even go as fast or faster. So they compare it against actually two Go web servers, and it's faster than both of them. So what it is is, these guys went and said, we've talked about this idea before, they said, "We want to go look at all the various C-based, not Python, C-based sort of coroutine libraries that'll let us write low-level HTTP servers." And it turned out that there were not that many good options, and the best option that they found was still slower than Go. So they're like, "Well, maybe this isn't going to work." But then they found this thing called LWAN, L-W-A-N, which is a C library, and they used Cython to create a Cython-based web server that wraps it up and exposes it to Python.

17:33 OKKEN: Dude, neat!

17:34 KENNEDY: Cool, right? So basically, it's not really ready to ship or anything, it just validates the concept of creating a high-performance thing with Cython, and I think that's a big part of the article. So it says, here are some interesting things about Cython. It's both an optimizing static compiler and a hybrid language that gives you the ability to write Python code that can call back and forth with C and C++ really easily. It has static type declarations that make Python code faster, because it doesn't have to treat everything like a reference type; it can, y'know, put integers as integers on the stack, and whatnot. And the other thing is, it releases the GIL with just, like, a keyword in Cython. So you can say, this part actually doesn't need the GIL, I'm doing this in C, leave me alone.
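
Here's a rough Cython sketch of those two ideas, static C types plus releasing the GIL; sum_squares is a hypothetical function, not something from the article:

```cython
# sketch.pyx (hypothetical module) -- compile with cythonize
def sum_squares(long n):
    cdef long i, total = 0
    # i, total, and n are plain C longs: no heap-allocated
    # Python objects, no reference counting inside the loop.
    with nogil:
        # No Python objects are touched in this block, so Cython
        # lets us release the GIL entirely while it runs.
        for i in range(n):
            total += i * i
    return total
```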

18:18 OKKEN: Hm, okay.

18:18 KENNEDY: Isn't it interesting how that actually hit so many of the things that Anthony brought up, right?

18:21 OKKEN: Yeah.

18:22 KENNEDY: The typing, optimizations, and the GIL stuff is all right there. So it generates super efficient C code that is then compiled into a Python module, so it's really great for wrapping up C libraries and exposing them to Python.

18:37 OKKEN: I hope they continue on with this work.

18:39 KENNEDY: Yeah, it looks pretty awesome. Yeah, anyway, I think if you're interested in sort of checking out performance, definitely have a look, it's pretty cool. So, are you going to cover something on testing this episode?

18:50 OKKEN: Oh, yeah! It's my turn again!

18:52 KENNEDY: It's your turn for a theme!

18:54 OKKEN: Where am I at? I've been tryin' to get on here. Oh! Yes, I am very excited about some news that came out last week. So I started using PyCharm a while ago. You've been using it off and on, I think you use lots of stuff now.

19:10 KENNEDY: Yeah, yeah. I have for quite a while, yeah.

19:12 OKKEN: PyTest support has been getting better and better in the last year or so, and PyCharm released 2018.2, I think it was last week, and it totally beefs up support for PyTest fixtures. And I just wanted to mention that, because anybody that's using both PyCharm and PyTest definitely needs to update to 2018.2, because there are a few things that used to not work but do now. So, a fixture, for anybody who's not familiar, is just a little piece of code, a little function, that you declare as a fixture and that you put as a parameter to your test, and it gets run before your test gets run, at various levels. That's a simplification, but it works. It shows up as a parameter, but you don't actually have to use it in your function; it just tells PyTest to run that code. Well, PyCharm used to flag that as a variable that isn't referenced, so it would give you a warning. You also couldn't do code completion with it, if it was an object that had methods on it, and you also couldn't use it to look up where that fixture is defined, because it's often defined in a different file, a conftest file or something. But now you can! All those things now work! So, big thank you to the PyCharm team for getting that out so quickly. And, yeah. Just that. I just wanted to let people know about that.
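
For anybody who hasn't seen one, a minimal fixture sketch; the file and fixture names here are hypothetical:

```python
# conftest.py (hypothetical): fixtures defined here are visible
# to tests in this directory without any import
import pytest

@pytest.fixture
def device():
    dev = {"name": "wifi-unit", "awake": True}  # setup, before the test
    yield dev                                   # handed to the test
    dev["awake"] = False                        # teardown, after the test

# test_device.py
def test_device_is_awake(device):
    # 'device' appears only as a parameter; PyCharm 2018.2 resolves
    # it to the fixture instead of warning that it's unused
    assert device["awake"]
```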

20:39 KENNEDY: Yeah, that's really awesome. PyCharm just keeps gettin' better and better. Alright, so speaking of getting better, let's go back to Python performance.

20:48 OKKEN: You got like, this bone that you won't let go of, man! No, it's good.

20:53 KENNEDY: So this one actually hits on two themes that I think we've touched a lot on. One is this performance kick that I'm on today, but the other is packaging. We've talked about how it's not super easy to package up Python. So, for example, Go compiles to a single executable binary with zero dependencies. You take that, you give it to somebody, they run it. That's not how it works for complicated apps that have package dependencies and stuff in Python. You can't just easily go, "Here, double-click this!" You're like, "Yeah, good luck! Hold on, first pip install the requirements, now what else? Oh, you got the wrong version of Python, hold on." So for the whole packaging thing, there are some attempts to deal with it, like cx_Freeze and stuff like that, and it's somewhat working. But here's an interesting thing from Facebook called XAR. So a XAR is an executable archive, and basically what it is is, you can package up a bunch of code into a single executable file that you can then mount as, like, a separate file system that's read-only. Like a CD or something, I guess. So you can mount this and then execute it, and because it came as a whole block, it's sort of a native file system, so you can read files that are next to your Python files, and you just package up your dependencies and run it.

22:14 OKKEN: Interesting.

22:15 KENNEDY: Yeah. So it's a read-only file system which, when you mount it, looks like a regular directory to user space. The one drawback: when I saw this article, I was like, "Oh my gosh, is this it?! Is this the thing that is going to make it so we can just go 'Here, double-click this' in Python?" It turned out there's a minor little step here. This requires a one-time installation of a system-level device driver for the file system. So, maybe not so much just double-click. But if you're willing to install this, you know, maybe as your organization you install it, or on your servers for whatever reason you're willing to install this thing called SquashFS, then XARs become these things you can just pass around and run. So that's a big caveat, but think of Facebook. Their goal is to make it super easy to deploy these applications onto servers and run them. So they must preconfigure their servers with SquashFS, and then just make this part of their deploy mechanism. So there are basically two primary use cases for these XARs. One is simply collecting a number of files for automatic, atomic mounting somewhere in the file system. Cool. The other, you can use this thing called XARExecHelper, and it becomes a self-contained package of executable code and its data. So an example might be a Python app that archives all of its source code and its native libraries and configuration and all that kind of stuff. I still think that's pretty cool.

23:41 OKKEN: Yeah!

23:42 KENNEDY: It's kind of a focused use, but still pretty cool.

23:45 OKKEN: Yeah, but then it's a rabbit hole. Now I've got to go and read about SquashFS and what that is.

23:49 KENNEDY: Yes, it's true. Actually, it seems like it could be a general mechanism for, I don't know, Ruby or JavaScript or Node or something like that, but they particularly call out Python on the GitHub page from the Facebook incubator, and it says some of the advantages for using Python are: it looks like regular files on disk to Python, so that means you can just run CPython and it doesn't know any better. You don't need to use weird import tricks or anything like that to pass stuff around. Standard compression. And it doesn't require unpacking .so files, which, apparently, if you try to use one of the pex-style mechanisms, was the problem. So more or less, CPython just basically works there.

24:41 OKKEN: Your idea of, like, "Okay, so there's this dependency, but if you're using it within an organization, that totally makes sense, that's fine." Yeah, that sort of use case, or within your own servers or whatever. You know, there are a lot of use cases where I think this would be very useful for people.

24:55 KENNEDY: Yeah, I mean, those places you described, that's where a lot of Python is used. So it also has some interesting performance benefits, which is coming back to that theme. Because of the way SquashFS works, you're actually reading a smaller set of binaries off disk, because it's compressed, a smaller set of data, so there's less disk activity and it's still really fast. The startup time can actually be faster for your app than even native Python, which is pretty cool. There are some statistics in there that show it's either as fast, or sometimes, like the second time you've interacted with that XAR, because the file system decompresses and caches that stuff in memory, it can actually run a little bit quicker. So there's a lot of interesting things around performance there.

25:44 OKKEN: Yeah.

25:45 KENNEDY: And finally, these file system things, these XARs, right? They're read-only, which means the integrity of your app is guaranteed, as opposed to, say, a virtual environment or a folder, where people could mess with it or change the Python system. It's read-only, so what you gave 'em is what they're running.

26:01 OKKEN: That's a cool thing, too.

26:02 KENNEDY: Yeah. There's a lot of neat stuff here. I don't think it fits my use case, but maybe some listeners', and it'll work well for them.

26:08 OKKEN: Definitely.

26:08 KENNEDY: A lot of people seemed excited about it on the Twitter.

26:10 OKKEN: On the Twitters.

26:11 KENNEDY: Exactly. Alright, well that's it for all of our items. You got some extras you want to throw out there?

26:15 OKKEN: Somebody mentioned to me on Twitter that NumPy 1.15 was just released recently, and they completely overhauled their testing to use PyTest.

26:30 KENNEDY: Yay! Another win for PyTest, that's awesome.

26:33 OKKEN: Yeah. And then you've got a couple lists of some videos that are out.

26:37 KENNEDY: Yeah, I have a couple events. I have two in the future and two in the past. Maybe, depending on when you listen to this, actually all of them in the past, but right now, two in the future and two in the past. So SciPy 2018, the data science Python world conference, the videos for that are now out on YouTube. If you couldn't make it to SciPy and you want to catch a bunch of presentations there, here's a link to the videos. Also, PyOhio. We have a lot of these regional Python conferences, and somehow PyOhio has actually gained quite a bit of momentum, which is interesting with PyCon being there last year and next. Anyway, the videos for that are also out, so a bunch of good YouTube watching comin' up here over the weekend.

27:17 OKKEN: Yeah, I've already started a couple of these, so. And then in the future, we've got...

27:22 KENNEDY: In the future, PyCon Canada is coming, so the call for papers on that is open for about a month. Also in the future, PyBay 2018, the regional San Francisco Python conference, is happening in a couple of weeks, and I would love to go to that, but I can't.

27:43 OKKEN: Yeah, I still haven't decided.

27:44 KENNEDY: Are you thinking about going?

27:44 OKKEN: I was thinking about going to PyBay, but there's a lot of stuff goin' on in the fall, so we'll see.

27:49 KENNEDY: Yeah, I have daughters going to college right around then, so it would probably be better if I helped them do that instead of just going to hang out in San Francisco. Alright, so the last thing is, I just want to let people know that I have another course out: Building Data-Driven Web Apps with Pyramid and SQLAlchemy. It's super fun, nine hours of awesomeness, and there's a link in the show notes, so check that out as well if you're interested.

28:09 OKKEN: That looks fun.

28:11 KENNEDY: Yeah, it's definitely a good one. Covers some things that I've wanted to cover for a while, like ORM migrations and managing stuff over time with databases and stuff. Pretty cool.

28:20 OKKEN: Cool.

28:20 KENNEDY: Alright, well, Brian thanks as always! It's been fun!

28:23 OKKEN: Thank you.

28:24 KENNEDY: Yep, bye. Thank you for listening to Python Bytes. Follow the show on Twitter via @pythonbytes. That's pythonbytes as in B-Y-T-E-S. And get the full show notes at pythonbytes.fm. If you have a news item you want featured, just visit pythonbytes.fm and send it our way. We're always on the lookout for sharing something cool. On behalf of myself and Brian Okken, this is Michael Kennedy. Thank you for listening and sharing this podcast with your friends and colleagues.
