#89: A tenacious episode that won't give up
Published Sat, Aug 4, 2018, recorded Thu, Aug 2, 2018
Python Bytes 89
Sponsored by Datadog -- pythonbytes.fm/datadog
Brian #1: tenacity
- “Tenacity is a general-purpose retrying library to simplify the task of adding retry behavior to just about anything.”
Example (Also, nice Trollhunters reference):
    import random
    from tenacity import retry

    @retry
    def do_something_unreliable():
        if random.randint(0, 10) > 1:
            raise IOError("Broken sauce, everything is hosed!!!")
        else:
            return "Awesome sauce!"  # Toby says this frequently

    print(do_something_unreliable())
Features:
- Generic Decorator API
- Specify stop condition (e.g. limit by number of attempts)
- Specify wait condition (e.g. exponential backoff sleeping between attempts)
- Customize retrying on Exceptions
- Customize retrying on expected returned result
- Retry on coroutines
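To make the stop and wait conditions concrete, here is a minimal standard-library sketch of what a retrying decorator does. It is not tenacity's API or implementation; `simple_retry`, `max_attempts`, `base_delay`, and `retry_on` are hypothetical names chosen for illustration, and `flaky` is rigged to fail exactly twice so the behavior is deterministic.

```python
import functools
import time


def simple_retry(max_attempts=3, base_delay=0.01, retry_on=(Exception,)):
    """Hypothetical helper: retry a function with exponential backoff."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except retry_on:
                    if attempt == max_attempts:
                        raise  # stop condition reached: give up
                    # wait condition: the delay doubles after each failure
                    time.sleep(base_delay * 2 ** (attempt - 1))
        return wrapper
    return decorator


calls = {"count": 0}


@simple_retry(max_attempts=5, retry_on=(IOError,))
def flaky():
    # Fails twice, then succeeds, to exercise the retry loop.
    calls["count"] += 1
    if calls["count"] < 3:
        raise IOError("Broken sauce, everything is hosed!!!")
    return "Awesome sauce!"


result = flaky()
print(result, "after", calls["count"], "attempts")
```

A library like tenacity packages these same knobs (when to stop, how long to wait, which failures to retry on) behind a single decorator, so you do not have to hand-roll the loop.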
Michael #2: Why is Python so slow?
- Answer this question: When Python completes a comparable application 2–10x slower than another language, why is it slow and can’t we make it faster?
- Here are the top theories:
- “It’s the GIL (Global Interpreter Lock)”
- “It’s because it’s interpreted and not compiled”
- “It’s because it’s a dynamically typed language”
- “It’s the GIL”
- Modern computers come with CPU’s that have multiple cores
- For web apps, it might not matter (e.g. https://training.talkpython.fm/ has 16 worker processes, https://talkpython.fm/ has 8 workers)
- “It’s because it’s an interpreted language”
- I hear this a lot and I find it a gross simplification of the way CPython actually works.
- JIT vs. NonJIT is interesting (startup time too)
- “It’s because it’s a dynamically typed language”
- In a “Statically-Typed” language, you have to specify the type of a variable when it is declared. Those would include C, C++, Java, C#, Go.
- In a dynamically typed language, there is still the concept of types, but the type of a variable is dynamic.
- Not having to declare the type isn’t what makes Python slow
- It’s this design that makes it incredibly hard to optimize Python.
- Conclusion
- Python is primarily slow because of its dynamic nature and versatility. It can be used as a tool for all sorts of problems, where more optimized and faster alternatives are probably available.
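As a rough illustration of the dynamic-typing point, the sketch below uses only the standard library. The `add` function and the variable names are made up for the example. It shows one function handling several operand types, a name being rebound to a value of a different type, and the bytecode CPython compiles the function to (the exact instruction names vary between CPython versions).

```python
import dis


def add(a, b):
    # No declared types: the same bytecode must handle every operand type,
    # so the interpreter looks up how to add at run time.
    return a + b


results = [add(1, 2), add("py", "thon"), add([1], [2])]

# The same name can be rebound to a value of a different type.
x = 42
types_seen = [type(x).__name__]
x = "forty-two"
types_seen.append(type(x).__name__)

# CPython compiles the function to bytecode before interpreting it.
opnames = [ins.opname for ins in dis.get_instructions(add)]
print(results, types_seen, opnames)
```

Because `add` cannot know ahead of time whether it will see integers, strings, or lists, the generic add instruction has to inspect its operands on every call, which is the kind of work a statically typed compiler can resolve once at compile time.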
Brian #3: Keynoting with Mu
- David Beazley gave his EuroPython talk/demo “Die Threads” using Mu.
- Article also notes that simple tools are great not just for learning but also for teaching, since they avoid the distracting clutter of a full-power editor.
Michael #4: A multi-core Python HTTP server (much) faster than Go (spoiler: Cython)
- Exploring the question, “So, I’ve heard Python is slow… is it?”
- A multi-core Python HTTP server that is about 40% to 110% faster than Go can be built by relying on the Cython language and LWAN C library.
- It is just a proof of concept, but it validates the possibility of high-performance system programming in the Cython language.
- Primarily interesting as a highlight of Cython
- Cython is both an optimizing static compiler and a hybrid language. It mainly gives the ability to:
- write Python code that can call back and forth from and to C/C++;
- add static typing using C declarations to Python code in order to boost performance;
- release the GIL in some code sections.
- Cython generates very efficient C code, which is then compiled into a module that Python can import. So it is an ideal language for wrapping external C libraries, and for developing C modules that speed up the execution of Python code.
- However, all experiments we are aware of that rely on Cython for system programming fall short in at least two ways:
- as soon as some Python code is invoked (as opposed to pure Cython cdef code), performance degrades by one or two orders of magnitude;
- benchmarks are most of the time provided for single-core execution only, which is somewhat unfair considering Golang's ability to scale up on multiple cores.
Brian #5: PyCharm 2018.2 beefs up pytest support
- Honestly, I’m super excited about this release to help my team navigate to all of the fixtures I create on a regular basis.
- This is the release I’ve been waiting for.
- I can now fully utilize the power of pytest from PyCharm
- Here are the few things that were missing that now work great:
- Autocomplete fixtures from various sources
- Quick documentation and navigation to fixtures
- Renaming a fixture from either the definition or a usage
- Support for pytest’s parametrize
- See also: PyCharm 2018.2 and pytest Fixtures
- But if you really want to understand fixtures quickly, read chapters 3 and 4 of the pytest book.
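For readers new to the two pytest features mentioned above, here is a small sketch of a fixture and a parametrized test. The fixture name `sample_user` and the test names are hypothetical; this is the kind of code where fixture autocomplete, navigation, and renaming in an editor would come into play.

```python
import pytest


@pytest.fixture
def sample_user():
    # Hypothetical fixture: gives a fresh dict to each test that requests it.
    return {"name": "Ada", "active": True}


def test_user_is_active(sample_user):
    # pytest injects the fixture by matching the parameter name.
    assert sample_user["active"]


@pytest.mark.parametrize(
    ("a", "b", "expected"),
    [(1, 2, 3), (0, 0, 0), (-1, 1, 0)],
)
def test_add(a, b, expected):
    # Runs once for each tuple in the parametrize list.
    assert a + b == expected
```

Saved in a file such as `test_example.py` (the file name is an assumption) and run with `pytest`, this should collect four tests: one for the fixture-based test and three for the parametrized cases.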
Michael #6: XAR for Facebook
- XAR lets you package many files into a single self-contained executable file. This makes it easy to distribute and install.
- A .xar file is a read-only file system image which, when mounted, looks like a regular directory to user-space programs. This requires a one-time installation of a driver for this file system (SquashFS).
- There are two primary use cases for XAR files.
- Simply collecting a number of files for automatic, atomic mounting somewhere on the filesystem.
- By making the XAR file executable and using the xarexec helper, a XAR becomes a self-contained package of executable code and its data. A popular example is Python application archives that include all Python source code files, as well as native shared libraries, configuration files, and other data.
- Advantages of XAR for Python usage
- SquashFS looks like regular files on disk to Python. This lets it use regular imports which are better supported by CPython.
- SquashFS looks like regular files to your application, too. You don't need to use pkg_resources or other tricks to access data files in your package.
- SquashFS with Zstandard compression saves disk space, even compared to a ZIP file.
- SquashFS doesn't require unpacking of .so files to a temporary location like ZIP files do.
- SquashFS is faster to start up than unpacking a ZIP file. You only need to mount the file system once. Subsequent calls to your application will reuse the existing mount.
- SquashFS only decompresses the pages that are used by the application, and decompressed pages are cached in the page cache.
- SquashFS is read-only so the integrity of your application is guaranteed compared to using virtualenvs or unpacking to a temporary directory.
- Performance is interesting too
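The point about data files can be sketched as follows: when a package's files look like regular files on disk, plain file I/O is enough to read bundled data. The helper `read_data_file` and the file name `settings.txt` are hypothetical, and a temporary directory stands in for a mounted package directory so the sketch runs anywhere.

```python
import tempfile
from pathlib import Path


def read_data_file(package_dir, name):
    """Hypothetical helper: read a bundled data file with plain file I/O."""
    return (Path(package_dir) / name).read_text(encoding="utf-8")


# Stand-in for a mounted package directory; a temp dir keeps this runnable.
with tempfile.TemporaryDirectory() as package_dir:
    data_path = Path(package_dir) / "settings.txt"
    data_path.write_text("debug=false\n", encoding="utf-8")
    content = read_data_file(package_dir, "settings.txt")

print(content)
```

In a real package, `package_dir` would typically be something like `Path(__file__).parent`, which works only when the package's files are real paths on the file system, as they are on a mounted image.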
Extras:
Brian:
- NumPy 1.15.0 was released recently. It switched its testing to pytest.
Michael:
- SciPy 2018 videos are out
- PyOhio 2018 videos are out
- Call for papers at PyCon Canada in Toronto
- PyBay 2018 conference in a few weeks
- My latest course, Building data-driven web apps with Pyramid and SQLAlchemy, is out!