Brought to you by Michael and Brian - take a Talk Python course or get Brian's pytest book

Episode #240: This is GitHub, your pilot speaking...

Published Fri, Jul 2, 2021, recorded Fri, Jul 2, 2021.




About the show

Sponsored by us:

Special guest: Chris Moffitt

Brian #1: Subclassing in Python Redux

  • Hynek Schlawack
  • Prefer composition over inheritance.
  • But if you must subclass, there are 3 types:
    • Subclassing for code sharing
      • Bad. Don’t do it.
      • Read the article and the articles it links to if you aren’t convinced.
    • Interfaces / Abstract Data Types
      • Can be useful, but Python has tools that make this work without subclassing.
    • Specialization
      • Exception hierarchies
      • There’s also an interesting discussion of structuring data classes with common elements.
      • This is the only type of subclassing that Hynek deems worthy.
  • This is a well-written, useful, and long-ish article that I cannot summarize without shortchanging it.
  • My summary: If you even consider subclassing other than for exceptions, read this article first.
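To make the two ends of that advice concrete, here is a minimal sketch: sharing behavior through composition, and subclassing only to build an exception hierarchy. All names here (Engine, Car, AppError, ConfigError, load_config) are hypothetical and made up for illustration; none of them come from Hynek's article.

```python
from dataclasses import dataclass, field


class Engine:
    """Hypothetical component that owns the shared behavior."""

    def start(self) -> str:
        return "engine started"


@dataclass
class Car:
    """Holds an Engine (composition) instead of inheriting from one."""

    engine: Engine = field(default_factory=Engine)

    def drive(self) -> str:
        # Delegate to the component rather than to a parent class.
        return f"{self.engine.start()}, car moving"


class AppError(Exception):
    """Base of a small, hypothetical exception hierarchy."""


class ConfigError(AppError):
    """More specific error; callers can catch it or the AppError base."""


def load_config(path: str) -> dict:
    # Always fails here, just to show the hierarchy in action.
    raise ConfigError(f"cannot read {path}")


print(Car().drive())
try:
    load_config("settings.toml")
except AppError as exc:  # catching the base class also catches ConfigError
    print(type(exc).__name__, exc)
```

Car stays free to swap in a different engine object, while the exception classes use subclassing for the one thing it does cleanly here: letting callers catch a broad or a narrow category of error.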

Michael #2: Extra, Extra, Extra*7, Hear all about it!

Chris #3: klib

  • Perform automated cleaning and analyzing of data in a pandas DataFrame
  • Missing value plot and correlation data plots are similar to other tools but the visualizations are nicely done and useful.
  • The data cleaning functions are really nice. In some testing, the automated data type conversion saved a meaningful amount of memory.
  • For large data sets, you can drop columns with lots of null values or highly correlated values.
  • The clean_column_names function also performs several cleanups on column names such as removing spaces, standardizing CamelCase, etc.
  • You have control to use as much or as little of the automated process as you want.

Brian #4: Don’t forget about functools

  • “functools — Higher-order functions and operations on callable objects”
    • in English: cool decorators and other functions that act on functions
  • A recent article by Martin Heinz reminded me to review functools
  • We’ve talked about singledispatch recently, and I’m sure we’ve talked about lru_cache before. These are in functools.
  • functools is an interesting library in that you kind of use it more and more as you gain Python experience. As a new Python dev, I would have been rather lost looking at this. But as you work through different projects, come back and have another look: it’ll have stuff you probably could have used, and will use in the future.
  • What’s in there? Here’s a few:
    • @singledispatch & @singledispatchmethod - function/method overloading
    • @wraps - A must for creating your own decorators. It makes the decorated function act just like the original function (attributes, docstring, and all), with just the behavior you are adding.
    • @lru_cache - memoization made easy
      • LRU = least recently used. It’s what it throws away when it’s full
    • @cache - like @lru_cache but with no max size. New in 3.9
    • @cached_property - only run the read code once. New in 3.8
      • del(obj.property) to clear it. Yes this is weird, but also cool.
    • @total_ordering - Define __eq__() and one other ordering function and get the other ordering functions for “free”.
      • not free. cost is slower execution and confusing stack traces if things go wrong. but still, when prototyping something, or when comparisons are very rare, this is cool
    • partial / partialmethod - create a new function with some of the arguments of the old function already filled in.
      • super cool for callbacks or defining convenience functions
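Here is a short sketch that exercises several of the tools listed above, using only the standard library. The function and class names (logged, greet, fib, describe, Version, Report, power_of_two) are invented for this example and are not taken from Martin Heinz's article.

```python
from functools import (
    cached_property,
    lru_cache,
    partial,
    singledispatch,
    total_ordering,
    wraps,
)


def logged(func):
    """Count calls; @wraps copies func's name and docstring to wrapper."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)
    wrapper.calls = 0
    return wrapper


@logged
def greet(name):
    """Return a greeting."""
    return f"hello {name}"


@lru_cache(maxsize=128)
def fib(n):
    """Memoized Fibonacci; repeated sub-calls are served from the cache."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


@singledispatch
def describe(value):
    return "something"  # fallback for unregistered types


@describe.register
def _(value: int):
    return "an int"


@describe.register
def _(value: list):
    return "a list"


@total_ordering
class Version:
    """Only __eq__ and __lt__ are written; <=, >, >= are generated."""
    def __init__(self, major, minor):
        self.key = (major, minor)

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self.key == other.key

    def __lt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self.key < other.key


class Report:
    def __init__(self):
        self.runs = 0  # how many times the property body actually ran

    @cached_property
    def total(self):
        self.runs += 1
        return sum(range(10))


# partial pre-fills the first argument: power_of_two(n) == pow(2, n)
power_of_two = partial(pow, 2)

report = Report()
report.total, report.total   # body runs once, second read is cached
del report.total             # clears the cached value
report.total                 # body runs again

print(greet.__name__, greet("brian"), fib(30), describe(3), power_of_two(10))
print(Version(1, 2) <= Version(1, 3), report.runs)
```

Without @wraps, greet.__name__ would report "wrapper" and the docstring would be lost, which is the main reason it is treated as a must for hand-written decorators.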

Michael #5: GitHub Copilot

  • Get suggestions for whole lines or entire functions right inside your editor.
  • Available today as a Visual Studio Code extension.
  • The technical preview does especially well for Python, JavaScript, TypeScript, Ruby, and Go, but it understands dozens of languages and can help you find your way around almost anything.
  • You can cycle through alternative suggestions
  • Powered by Codex, the new AI system created by OpenAI
  • Based on GPT-3.

Chris #6: Kats

  • New tool from Facebook for time series analysis
  • Can use Facebook’s Prophet as well as other algorithms such as SARIMA and Holt-Winters for prediction. Here’s my old post on Prophet.
  • Some controversy about how well Prophet performs in real life. Very detailed article here.
  • Provides utilities for analyzing time series including outlier and seasonality detection
  • Offers advanced ensemble methods and access to deep learning algorithms

Extras

Chris

  • Unyt - library for working with units of measure. Pint is another similar one with a different API.

Jokes

Italian Async (from Dean Langsam)

Q: Why aren't cryptocurrency engineers allowed to vote? A: Because they're miners!

