Brought to you by Michael and Brian - take a Talk Python course or get Brian's pytest book

Episode #259: That argument is a little late-bound

Published Wed, Nov 17, 2021, recorded Wed, Nov 17, 2021.



Watch the live stream:


About the show

Sponsored by Shortcut - Get started at shortcut.com/pythonbytes

Special guest: Renee Teate

Michael #1: pypi-changes

Brian #2: Late-bound argument defaults for Python

  • Default values for arguments to functions are evaluated at function definition time.
  • If a value is a short expression that uses a variable, that variable is in the scope of the function definition.
  • The expression cannot use other arguments.
  • Example of what you cannot do:
    def foo(a, b = None, c = len(a)):
        ...
  • There’s a proposal by Chris Angelico to add a =: operator for late default evaluation.
    • syntax still up in the air. => and ?= also discussed
  • However, it’s non-trivial to add syntax to an established language, and this article notes:
    • At first blush, Angelico's idea to fix this "wart" in Python seems fairly straightforward, but the discussion has shown that there are multiple facets to consider. It is not quite as simple as "let's add a way to evaluate default arguments when the function is called"—likely how it was seen at the outset. That is often the case when looking at new features for an established language like Python; there is a huge body of code that needs to stay working, but there are also, sometimes conflicting, aspirations for features that could be added. It is a tricky balancing act.

Renee #3: pandas.read_sql

  • Since I wrote my book SQL for Data Scientists, I’ve gotten several questions about how I use SQL in my python scripts. It’s really simple:
  • You can save your SQL as a text file and then import the dataset into a pandas dataframe to do the rest of my data cleaning, feature engineering, etc.
  • Pandas has a built-in way to use SQL as a data source.
  • You set up a connection to your database using another package like SQL Alchemy, then send the SQL string and the connection to the pandas.read_sql function.
  • It returns a dataframe with the results of your query.

Michael #4: pyjion

  • by Anthony Shaw
  • Pyjion is a JIT for Python based upon CoreCLR
  • Check out live.trypyjion.com *to see it in action.*
  • Requires Python 3.10, .NET Core 6
  • Enable with just a couple of lines:
    >>> import pyjion
    >>> pyjion.enable()

Brian #5: Tips for debugging with print()

  • Adam Johnson
  • 7 tips altogether, but I’ll highlight a few I loved reading about
  • Debug variables with f-strings and =
    • print(f``"``{myvar=}``")
    • Saves typing over print(f``"``myvar={myvar}") with the same result
  • Make output “pop” with emoji (Brilliant!)
    • print("👉 spam()")
    • Here’s some cool ones to use
      • 👉 ❌ ✅
  • Use rich.print or pprint for pretty printing
    • Also, cool rename example to have both print and rich.print available
      • from rich import print as rprint
    • Both rich.print and pprint.pprint are essential for printing structures nicely
  • Brian’s addition
    • In pytest, failed tests report the stdout contents by default from the test
    • I love the idea of using rich.print and emoji for print statements in tests themselves.
    • Even though you can use --showlocals to print local variables for failed tests, having control of some output to help you debug something if it ever fails is a good thing.

Renee #6: SHAP (and beeswarm plot)

  • Brought to my attention by my team member Brian Richards at HelioCampus, and now they’re becoming a standard part of some of our model evaluation/explanation outputs
  • SHapley Additive exPlanations
    • Shapley values from game theory
  • Additive: “SHAP values of all the input features will always sum up to the difference between baseline (expected) model output and the current model output for the prediction being explained”
    • Negative/positive - pushing the prediction towards one class or the other
      • There’s a SHAP value for every feature for every prediction
    • Waterfall plots
    • Scatterplots of input value vs SHAP value
    • SHAP value can be outputted and pulled into other tools (I use them in Tableau)
  • Correlation not causation
  • Beeswarm plots for feature importance with input value vs SHAP value

Extras

Brian:

  • Matthew Feickert recommended pip index and specifically pip index versions as a cool thing to try.
    • Example. pip index versions pyhf reports
      • all versions of pyhf available on pypi
      • the latest version
      • your installed version
    • It’s currently “experimental” so conceptually the pypa could yank it. But we like it. I hope it stays.

Michael:

Renee:

Joke: git messages


Want to go deeper? Check our projects