Brought to you by Michael and Brian - take a Talk Python course or get Brian's pytest book

#243: Django unicorns and multi-region PostgreSQL

Published Wed, Jul 21, 2021, recorded Wed, Jul 21, 2021

Watch the live stream:

Watch this episode on YouTube

About the show

Sponsored by us:

Special guest: Simon Willison

Michael #1: MongoDB 5

  • Native time series: designed for IoT and financial analytics. New time series collections, clustered indexing, and window functions make it easier, faster, and cheaper to build and run time series applications
  • MongoDB automatically optimizes your schema for high storage efficiency, low latency queries, and real-time analytics against temporal data.
  • The Versioned API future-proofs your applications. You can fearlessly upgrade to the latest MongoDB releases without the risk of introducing backward-breaking changes that require application-side rework
  • New MongoDB Shell: adds syntax highlighting, intelligent auto-complete, contextual help, and useful error messages for a more intuitive, interactive experience (use mongosh rather than mongo on the CLI).
  • Also launched preview release of serverless instances on MongoDB Atlas
  • You can watch the MongoDB keynote here.
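
As a rough sketch of what a time series collection looks like from Python: the snippet below only builds the `create` command document and a sample measurement. The collection name `sensor_readings`, the field names, and the helper `time_series_create_command` are made up for illustration, and the commented-out pymongo calls assume a running MongoDB 5.0+ server.

```python
from datetime import datetime, timezone


def time_series_create_command(name, time_field, meta_field=None, granularity="seconds"):
    """Build the `create` command document for a time series collection.

    Hypothetical helper for illustration only; it does not talk to a server.
    """
    options = {"timeField": time_field, "granularity": granularity}
    if meta_field is not None:
        # metaField labels the source of each measurement (for example, a sensor)
        options["metaField"] = meta_field
    return {"create": name, "timeseries": options}


command = time_series_create_command(
    "sensor_readings", "timestamp", meta_field="sensor", granularity="minutes"
)

# One sample measurement; the time field must hold a date value.
reading = {
    "timestamp": datetime(2021, 7, 21, 12, 0, tzinfo=timezone.utc),
    "sensor": {"id": 42, "type": "temperature"},
    "temp_c": 21.5,
}

# With pymongo installed and a MongoDB 5.0+ server available, this could be sent as:
#   from pymongo import MongoClient
#   db = MongoClient()["iot"]
#   db.command(command)
#   db["sensor_readings"].insert_one(reading)
```

Keeping the command as a plain dict makes it easy to inspect before anything touches a real database.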

Brian #2: Python 3.11: Enhanced error locations in tracebacks

  • Yes, 3.11. Even though 3.10 is still in Beta, we’re already excited about 3.11
  • tracebacks now point to the exact expression that caused the error within the line:

        Traceback (most recent call last):
          File "distance.py", line 11, in <module>
            print(manhattan_distance(p1, p2))
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^
          File "distance.py", line 6, in manhattan_distance
            return abs(point_1.x - point_2.x) + abs(point_1.y - point_2.y)
                                   ^^^^^^^^^
        AttributeError: 'NoneType' object has no attribute 'x'
    
  • even deeply nested calls

        Traceback (most recent call last):
          File "query.py", line 37, in <module>
            magic_arithmetic('foo')
            ^^^^^^^^^^^^^^^^^^^^^^^
          File "query.py", line 18, in magic_arithmetic
            return add_counts(x) / 25
                   ^^^^^^^^^^^^^
          File "query.py", line 24, in add_counts
            return 25 + query_user(user1) + query_user(user2)
                        ^^^^^^^^^^^^^^^^^
          File "query.py", line 32, in query_user
            return 1 + query_count(db, response['a']['b']['c']['user'], retry=True)
                                       ~~~~~~~~~~~~~~~~~~^^^^^
        TypeError: 'NoneType' object is not subscriptable
    
  • and math expressions:

        Traceback (most recent call last):
          File "calculation.py", line 54, in <module>
            result = (x / y / z) * (a / b / c)
                      ~~~~~~^~~
        ZeroDivisionError: division by zero
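
To try this yourself, here is a small sketch that reproduces the first traceback above. The `Point` dataclass and the `capture_traceback` helper are assumptions added to make it self-contained. Run it from a file on Python 3.11 or later and the printed traceback should include the marker lines; older versions print the same error without them.

```python
import traceback
from dataclasses import dataclass


@dataclass
class Point:
    x: float
    y: float


def manhattan_distance(point_1, point_2):
    return abs(point_1.x - point_2.x) + abs(point_1.y - point_2.y)


def capture_traceback():
    """Trigger the AttributeError from the example and return the formatted traceback."""
    p1 = Point(1, 2)
    p2 = None  # stands in for a point that was never created
    try:
        manhattan_distance(p1, p2)
    except AttributeError:
        return traceback.format_exc()
    return ""


report = capture_traceback()
print(report)
```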
    

Simon #3: fly.io multi-region PostgreSQL and last mile Redis

  • fly.io are a hosting provider that specialize in running your code in containers that are geographically close to your users
  • What I find interesting about them is that they are taking something that used to be INCREDIBLY hard, like geographically sharding your database, and describing patterns for doing it that make it easy enough that I might actually do it
  • Their writing is really good. I’m learning a ton from them about designing code to run globally that applies even if I don’t end up using their service

Michael #4: django-unicorn

  • A magical full-stack framework for Django
  • Quickly add in simple interactions to regular Django templates without learning a new templating language.
  • Building a feature-rich API is complicated. Skip creating a bunch of serializers and just use Django.
  • Early days if you want to contribute

Brian #5: Blue: The somewhat less uncompromising code formatter than black

  • Suggested by Chris May
  • Code from Black, mods by Grant Jenks and Barry Warsaw
  • It’s not a fork, it’s a patched version of black. Kind of a “containment over inheritance” thing.
  • Deltas:
    • blue defaults to single-quoted strings.
      • except docstrings and triple quoted strings (TQS). Those are still double quotes.
    • blue defaults to line lengths of 79 characters. black is 88.
    • line lengths are customizable with all related tools.
    • blue preserves the whitespace before the hash mark for right hanging comments.
      • making comment blocks off to the side possible
    • blue supports multiple config files: pyproject.toml, setup.cfg, tox.ini, and .blue.
  • Interesting quote from the docs: “We’d prefer not to fork or monkeypatch. Instead, our hope is that eventually we’ll be able to work with the black maintainers to add just a little bit of configuration and merge back into the black project.”
  • My take
    • Probably stick with black most of the time.
    • For some large existing projects that have already standardized on single-quoted strings, black is jarring.
    • Also, strings with double quotes in them are untouched by black, so if you have lots of those, strings will be inconsistent, making the code harder to read and confusing to maintain.
    • And the choice isn’t really black or blue. It’s often nothing due to the non-starter of switching to double quote strings by default. blue is better than nothing.
  • See also # fmt: off, # fmt: on for both blue and black
        # tell black/blue to not reformat this table
        # fmt: off
        some_table = [
            1,     2,   3,
            100, 200, 300
        ]
        # fmt: on
    

Simon #6: Organize and Index Your Screenshots (OCR) on macOS

  • I’ve been wanting to figure out how to use Tesseract OCR for years, and this post finally unlocked it for me
  • brew install tesseract
  • tesseract image.png output-file -l eng pdf
  • (use txt instead of pdf to get plain text)
  • I wrote a TIL about this at https://til.simonwillison.net/tesseract/tesseract-cli
  • It’s really good! Even works against photos I’ve taken. And the PDFs it produces have copy-and-paste text in them (despite looking visually identical to the image) and can be searched using Spotlight.
  • There’s a pytesseract library but it actually just works by running that Tesseract CLI tool in a subprocess
  • Extra: Using SQL to find my best photo of a pelican according to Apple Photos
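
Since the CLI does all the work, calling it from Python needs very little code. The sketch below wraps the command shown above with `subprocess`. The function names `build_tesseract_command` and `run_ocr` are hypothetical, and this is not how pytesseract is implemented internally, just the same general idea.

```python
import shutil
import subprocess


def build_tesseract_command(image_path, output_base, lang="eng", output_format="pdf"):
    """Return the argument list for: tesseract IMAGE OUTPUT_BASE -l LANG FORMAT."""
    return ["tesseract", image_path, output_base, "-l", lang, output_format]


def run_ocr(image_path, output_base, lang="eng", output_format="pdf"):
    """Run the Tesseract CLI and return the path of the file it should have written."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract not found on PATH (try: brew install tesseract)")
    command = build_tesseract_command(image_path, output_base, lang, output_format)
    # check=True raises CalledProcessError if tesseract exits with an error
    subprocess.run(command, check=True)
    # Tesseract appends the extension for the chosen output format
    return f"{output_base}.{output_format}"


# Example usage (requires tesseract to be installed and image.png to exist):
#   pdf_path = run_ocr("image.png", "output-file")
#   txt_path = run_ocr("image.png", "output-file", output_format="txt")
```

Passing the arguments as a list rather than a shell string avoids quoting problems with file names that contain spaces, which screenshots on macOS usually do.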

Extras

Michael:

Simon

Joke

A “Query tale”?

Song from Brett Cannon (take on Pinky and the Brain theme song)

  • It's Michael and the Brain!
  • Yes, Michael and the Brain!
  • They're both into making ...
  • Python seem sane!
  • They're into Python podcasting,
  • and woke open-sourcing!
  • They're Michael,
  • they're Michael and the Brain, Brain, Brain, Brain, Brain!
