Brought to you by Michael and Brian - take a Talk Python course or get Brian's pytest book

#115: Dataclass CSV reader and Nina drops by

Published Sat, Feb 2, 2019, recorded Tue, Jan 29, 2019

Sponsored by pythonbytes.fm/datadog

Special guest: Nina Zakharenko

Brian #1: Great Expectations

  • A set of tools intended for batch time testing of data pipeline data.
  • Introduction to the problem doc: Down with Pipeline debt / Introducing Great Expectations
  • expect_[something]() methods that return json formatted descriptions of whether or not the passed in data matches your expectations.
  • Can be used programmatically or interactively in a notebook. (video demo).
  • For programmatic use, I’m assuming you have to put code in place to stop a pipeline stage if expectations aren’t met, and write failing json result to a log or something.
  • Examples, just a few, full list is big:
    • Table shape:
      • expect_column_to_exist, expect_table_row_count_to_equal
  • Missing values, unique values, and types: - expect_column_values_to_be_unique, expect_column_values_to_not_be_null
    • Sets and ranges
      • expect_column_values_to_be_in_set
    • String matching
      • expect_column_values_to_match_regex
    • Datetime and JSON parsing
    • Aggregate functions
      • expect_column_stdev_to_be_between
    • Column pairs
    • Distributional functions
      • expect_column_chisquare_test_p_value_to_be_greater_than

Nina #2: Using CircuitPython and MicroPython to write Python for wearable electronics and embedded platforms

  • I’ve been playing with electronics projects as a hobby for the past two years, and a few months ago turned my attention to Python on microcontrollers
  • MicroPython is a lean and efficient implementation of Python3 that can run on microcontrollers with just 256k of code space, and 16k of RAM. CircuitPython is a port of MicroPython, optimized for Adafruit devices.
  • Some of the devices that run Python are as small as a quarter.
  • My favorite Python hardware platform for beginners is Adafruit’s Circuit PlayGround Express. It has everything you need to get started with programming hardware without soldering. All you’ll need is alligator clips for the conductive pads.
    • The board features NeoPixel LEDs, buttons, switches, temperature, motion, and sound sensors, a tiny speaker, and lots more. You can even use it to control servos, tiny motor arms.
      • Best of all, it only costs $25.
  • If you want to program the Circuit PlayGround Express with a drag-n-drop style scratch-like interface, you can use Microsoft’s MakeCode. It’s perfect for kids and you’ll find lots of examples on their site.
  • Best of all, there are tons of guides for Python projects to build on their website, from making your own synthesizers, to jewelry, to silly little robots.
  • Check out the repo for my Python-powered earrings, see a photo, or a demo.
  • Sign up for the Adafruit Python for Microcontrollers mailing list here, or see the archives here.

Michael #3: Data class CSV reader

  • Map CSV to Data Classes
  • You probably know about reading CSV files
    • Maybe as tuples
    • Better with csv.DictReader
  • This library is similar but maps Python 3.7’s data classes to rows of CSV files
  • Includes type conversions (say string to int)
  • Automatic type conversion. DataclassReader supports str, int, float, complex and datetime
  • DataclassReader use the type annotation to perform validation of the data of the CSV file.
  • Helps you troubleshoot issues with the data in the CSV file. DataclassReader will show exactly in which line of the CSV file contain errors.
  • Extract only the data you need. It will only parse the properties defined in the dataclass
  • It uses dataclass features that let you define metadata properties so the data can be parsed exactly the way you want.
  • Make the code cleaner. No more extra loops to convert data to the correct type, perform validation, set default values, the DataclassReader will do all this for you
  • Default fallback values, more.

Brian #4: How to Rock Python Packaging with Poetry and Briefcase

  • Starts with a discussion of the packaging (for those readers that don’t listen to Python Bytes, I guess.) However, it also puts flit, pipenv, and poetry in context with each other, which is nice.
  • Runs through a tutorial of how to build a pyproject.toml based project using poetry and briefcase.
  • We’ve talked about Poetry before, on episode 100.
  • pyproject.toml is discussed extensively on Test & Code 52.
  • briefcase is new, though, it’s a project for creating standalone native applications for Mac, Windows, Linux, iOS, Android, and more.
  • The tutorial also discusses using poetry directly to publish to the test-pypi server. This is a nice touch. Use the test-pypi before pushing to the real pypi. Very cool.

Nina #5: awesome-python-security *🕶🐍🔐, a collection of tools, techniques, and resources to make your Python more secure*

  • All of your production and client-facing code should be written with security in mind
  • This list features a few resources I’ve heard of such as Anthony Shaw’s excellent 10 common security gotchas article which highlights problems like input injection and depending on assert statements in production, and a few that are new to me:
  • OWASP (Open Web Application Security Project) Python Resources at pythonsecurity.org
  • bandit a tool to find common security issues in Python
    • bandit features a lot of useful plugins, that test for issues like:
      • hardcoded password strings
      • leaving flask debug on in production
      • using exec() in your code
      • & more
  • detect-secrets, a tool to detect secrets left accidentally in a Python codebase
  • & lots more like resources for learning about security concepts like cryptography
  • See the full list for more

Michael #6: pydbg

  • Python implementation of the Rust dbg macro
  • Best seen with an example. Rather than printing things you want to inspect, you:
        a = 2
        b = 3
    
        dbg(a+b)
    
        def square(x: int) -> int:
            return x * x
    
        dbg(square(a))
    

outputs:

    [testfile.py:4] a+b = 5
    [testfile.py:9] square(a) = 4

Extras:

Brian:

  • pathlib + pytest tmpdir → tmp_path & tmp_path_factory

Michael:

  • The Art of Python is a miniature arts festival at PyCon North America 2019, focusing on narrative, performance, and visual art. We intend to encourage and showcase novel art that helps us share our emotionally charged experiences of programming (particularly in Python). We hope that by attending, our audience will discover new aspects of empathy and rapport, and find a different kind of delight and perspective than might otherwise be expected at a large conference.
  • StackOverflow Survey is Open! https://stackoverflow.az1.qualtrics.com/jfe/form/SV_1RGiufc1FCJcL6B
  • NumPy Is Awaiting Fix for Critical Remote Code Execution Bug
    • via Doug Sheehan
    • The issue was raised on January 16 and affects NumPy versions 1.10 (released in 2015) through 1.16, which is the latest release at the moment, released on January 14
    • The problem is with the 'pickle' module, which is used for transforming Python object structures into a format that can be stored on disk or in databases, or that allows delivery across a network.
    • The issue was reported by security researcher Sherwel Nan, who says that if a Python application loads malicious data via the numpy.load function an attacker can obtain remote code execution on the machine.
  • Get your google data

Nina:

  • I’m teaching a two day Intro and Intermediate Python course on March 19th and 20th. The class will live-stream for free here on each day of or join in-person from downtown Minneapolis. All of the course materials will be released for free as well.
  • I recently recorded a series of videos with Carlton Gibson (Django maintainer) on developing Django Web Apps with VS Code, deploying them to Azure with a few clicks, setting up a Continuous Integration / Continuous Delivery pipeline, and creating serverless apps. Watch the series here: https://aka.ms/python-videos
  • I’ll be a mentor at a brand new hatchery event at PyCon US 2019, mentored sprints for diverse beginners organized by Tania Allard. The goal is to help underrepresented folks at PyCon contribute to open source in a supportive environment. The details will be located here (currently a placeholder) when they’re finalized.
  • Catch my talk about electronics projects in Python with LEDs at PyCascades in Seattle on February 24th. Currently tickets are still for sale.
  • If you haven’t tried the Python extension for VS Code, now is a great time. The December release included some killer features, such as remote Jupyter support, and exporting Python files as Jupyter notebooks. Keep up with future releases at the Python at Microsoft blog.

Jokes:

  • Q: What do you call a snake that only eats desert? A: A pie-thon. (might not make sense read out loud)
  • Q: How do you measure a python? A: In inches. They don't have any feet!
  • Q: What is a python’s favorite subject? Hiss-tory!

Episode Transcript

Collapse transcript

00:00 Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to

00:05 your earbuds. This is episode 115, recorded January 29th, 2019. I'm Michael Kennedy.

00:12 And I'm Brian Okken.

00:13 And Brian, we got a special guest, don't we?

00:15 Yeah.

00:15 Nina Zagarenko. Say hello, Nina. How you doing?

00:18 Hey, everyone. I'm very proud of you for pronouncing my last name correctly.

00:22 Thank you. Brian and I, we specialize in mangling people's last names on this show,

00:27 but we try.

00:29 This time you nailed it.

00:30 Thank you.

00:31 For those who don't know me, I'm a senior cloud developer advocate at Microsoft,

00:35 and my focus is on Python. You can find me on Twitter at NNJA. That's like ninja, but without

00:42 the I.

00:42 Yeah, that's a pretty cool Twitter handle. Awesome. And also, this episode is brought to you by

00:46 Datadog. Check them out at pythonbytes.fm/Datadog. Tell you more about that. You know,

00:51 Brian, I expect quite a bit of stuff out of Datadog. I would say I have great expectations

00:56 when I go use their stuff. What do you think? How about you?

00:59 Yeah, yeah. I do too. Yes. For our first item, we have great expectations. And it is a package

01:06 that was shared by a listener. And I don't know the listener because they shared it with you,

01:11 Michael, and you didn't tell me who it was.

01:12 I just sent you over the link and I forgot. But thank you for sending it in. Next time,

01:16 we'll do a better job.

01:17 Well, we're just also trying to make it so that sharing information with everybody else is not

01:22 an ego-boosting exercise because we won't remember your name.

01:26 Or we'll mispronounce it.

01:27 Unless it's important to you, then let us know.

01:28 Yeah.

01:29 Yeah.

01:29 No, great expectations. It's kind of cool. It's this idea that we have a lot of tools out,

01:35 for instance, pytest, to test your code. But there's in a lot of stuff, the data that you're

01:42 running your code through, like in data science or a lot of data pipelines, the data is important too.

01:50 And being able to check to make sure your data fits what you expect it to fit, what look like, is important.

01:56 So these are some really cool expect calls.

02:00 They're a bunch of functions that start with expect.

02:02 They aren't assertions, so they aren't going to throw an exception.

02:07 But what happens is you, like maybe on some of them, you pass them a data frame.

02:11 Like I'll just give one as an example. You expect a column to exist.

02:16 So you give it a data frame and you give it a specific column, and you want that column to actually be there.

02:21 And if it's not, you get, actually, regardless of whether the answer, it comes back in the form of a JSON object.

02:28 And it has like a valid, and you can say whether or not, you know, if it failed or passed the test.

02:35 But it also shows you the parts where it didn't.

02:37 So this is, there's a little video, actually, demo video that they have, where you can see it in action using a Jupyter notebook.

02:44 And it's kind of cool. It shows you exactly where it's failing.

02:48 So I imagine doing this interactively to look at your data would be helpful.

02:52 But also you could probably put this in place in some data cleaning steps to make sure things are around.

02:58 Like making sure there aren't any nulls in a column.

03:00 There's a whole bunch of different things you can assert on or expect on your data.

03:05 It's pretty fun.

03:06 Yeah, it's pretty cool.

03:07 Some of them are totally straightforward.

03:08 Like expect these values to be in the set.

03:11 Others, a little more data science, mathematical focused, like expect the chi-squared test p-value to be greater than such and such, right?

03:20 I honestly haven't done a lot of chi-squared lately, but I don't do that much data science, you know, on the web.

03:26 It's more we use addition and like stuff like that.

03:30 Yeah, but some of the fun things, like it was in the video example, he expected some data from a column to be in a particular set.

03:42 And it was either male or female.

03:45 And first off, like how binary of this.

03:47 But anyway, the males, a lot of them came back with spaces in them.

03:51 So doing some of this exploratory thing might tell you where you need to add some cleaning steps or dealing with nulls or things like that.

03:59 The follow-up, I'd like to hear how people might be able to use this, how it's used exploratory-wise.

04:06 But how, I'd like to see somebody using it in their pipeline stages and how that works.

04:12 Whether you, I guess I imagine if you had it in production, you'd have some code that would, if it failed an expectation, you'd write a log entry or something?

04:21 Or I don't know what you'd do.

04:22 Yeah, or maybe return, if you were accepting data over a web method, right?

04:26 Somebody's doing ETL and they're like, here, we're submitting this new set of data.

04:30 You could return like 400 bad requests to indicate, no, no, something's wrong with the data you sent me.

04:35 Things like that, possible.

04:36 Yeah.

04:37 You know, what do you think about this?

04:38 I think it's very cool and is probably going to be pretty helpful.

04:42 I like it as well.

04:43 What I like about it is it lets you take what would be a little algorithm you'd have to write to, say, go through and test all, say, the chi-square values and then compare them and then assert on that.

04:52 Now you just do one line and it does it on the whole data set, the whole data frame.

04:56 That's pretty sweet.

04:56 Yeah, I'm going to be watching the video demo after the show.

04:59 Yeah, right on.

04:59 So the thing that you are going to cover next, the timing is incredible, Nina.

05:04 I literally got a notification that I had ordered one of these and it shipped today and it's on its way to my house.

05:11 So I'm very excited.

05:12 Tell people what is on its way.

05:14 Would that be the Circuit Playground Express?

05:17 It would be.

05:18 Yeah.

05:19 Awesome.

05:19 Yeah.

05:20 So I wanted to chat a little bit about using CircuitPython and MicroPython to write Python for wearable electronics and embedded platforms.

05:28 I've been playing with electronics projects as a hobby for probably about the past two years now.

05:34 In the past few months, I've been focusing my attention on Python for microcontrollers.

05:39 Right on.

05:39 And what kind of little things are you making with your projects?

05:42 My last one was Python powered LED earrings.

05:45 And I dropped a link to the repo for the code as well as a photo.

05:49 So you can see that.

05:50 Okay.

05:51 So what I know I've heard of MicroPython.

05:53 And I think I've heard of CircuitPython.

05:55 Are they the same?

05:56 Are they different?

05:56 Tell everyone about it.

05:57 Yeah, there's definitely a little bit of confusion about that.

06:00 So MicroPython is the original.

06:02 It's a lean and efficient implementation of Python 3 that can run on these tiny little microcontrollers.

06:09 And all it needs is 256 kilobytes of code space and 16 kilobytes of RAM.

06:15 It's incredible.

06:16 It's truly awesome.

06:17 Yeah.

06:17 It's so super low level too.

06:19 Yeah.

06:19 And CircuitPython is a port of MicroPython.

06:23 And it's optimized for Adafruit devices.

06:26 Cool.

06:26 So I guess that's the one I'm going to be learning about.

06:29 Yeah.

06:29 Yeah.

06:30 And some of these, the devices that Adafruit sells, they're as small as a quarter.

06:34 That would be the trinket.

06:36 But of course, my favorite Python hardware platform for beginners is that the Adafruit Circuit Playground

06:42 Express.

06:42 It has everything you need to get started with programming hardware without even needing to

06:47 learn how to solder.

06:48 All you need is some alligator clips for the conductive pads.

06:52 And the board has a ring of neopixel LEDs.

06:56 It has buttons, switches, temperature sensors, motion detectors, sound sensors, a tiny little

07:02 speaker, and a lot more stuff.

07:04 Like you can even use it to control servos, which are those tiny little motor arms.

07:09 Wow.

07:10 That's really awesome.

07:11 Yeah.

07:11 It's a tiny little thing.

07:13 Yeah.

07:13 What I really like about this and the reason that I ordered it was I can go to Adafruit and

07:17 look around and I'm just like, this is too low level for me.

07:20 I don't really know what I need.

07:22 I don't even know if I have a power supply, if I get this little chip or that.

07:25 Like how to put together.

07:26 I'm like, okay, give me like one thing that has all the stuff to do little projects like

07:31 you're describing.

07:31 And I think it's awesome.

07:33 And doing it in Python is super cool.

07:35 It only costs $25.

07:36 Yeah.

07:37 It's not too expensive.

07:38 No.

07:38 And then if you don't want to use Python to program this, there's a tool that you can use

07:44 called Microsoft MakeCode.

07:46 And it lets you program these little devices with a drag and drop style kind of scratch

07:51 like interface.

07:52 So that's perfect for kids.

07:54 And you'll find a lot of examples on their site.

07:57 Oh, that's awesome that you point that out because I might get my daughter to do some kind

08:01 of little game, you know, like a Simon Says or something with the LEDs.

08:04 Who knows?

08:05 That'd be fun.

08:05 Yeah.

08:06 The Mu editor has some tutorials on using these as well.

08:09 Right.

08:09 That's awesome.

08:10 Super cool.

08:10 All right.

08:11 And I like you threw in the link.

08:12 Yeah.

08:13 To your code.

08:14 So people can do the earrings as well.

08:16 Yeah.

08:16 And like you said earlier, there are tons of guides for Python projects on the Adafruit

08:20 website.

08:21 Stuff from making your own synthesizers to jewelry to silly little robots.

08:26 So definitely make sure to check that out.

08:28 Cool.

08:28 Yeah.

08:28 Well, I can almost recommend it.

08:31 It looks really, really good.

08:32 And when I get it, I'm going to play with it and let everyone know what I think.

08:35 But yeah, it seems like a great little package that you can get.

08:38 25 bucks.

08:39 You know, it's like, you know, as software developers, you could just try it, right?

08:42 It's not that big of a risk.

08:43 It's not like you're getting a new MacBook or something.

08:45 Mm-hmm.

08:45 And Adafruit has a Python for microcontrollers mailing list.

08:51 And they always drop kind of hot news and interesting new things.

08:54 So there's a link to sign up for that in the show notes.

08:57 Awesome.

08:57 That's a great one.

08:58 So this next one, I think, is a really interesting use of Python 3.7.

09:04 So when you think of the main features of Python 3.7, certainly data classes has to be

09:10 one of them.

09:11 Have either of you found any use for data classes yet?

09:13 Yeah, I just love them.

09:14 I use them like named tuples.

09:15 Yeah.

09:16 Yeah.

09:16 They're awesome.

09:17 Trey Hunter has a great talk on data classes.

09:20 Definitely.

09:20 So this is a library that's derived from data classes.

09:24 And it's a CSV file reader.

09:26 Now, Python comes with pretty good support for CSV readers.

09:29 You import CSV and then you create a dict reader based on a file stream.

09:33 And it'll read the header and figure out the columns.

09:36 And then it gives you little dictionaries based on the column names.

09:38 And that's pretty sweet.

09:39 But what you get back is a whole bunch of strings corresponding to those values.

09:44 Right?

09:45 And with this, what you can do is you can actually define a data class that maps the schema of

09:51 your CSV file.

09:53 Right?

09:54 So I could define a data class that maybe has an ID and a name and a value or price.

09:59 Right?

09:59 Maybe it's like products.

10:00 So like the ID could be an integer defined in the data class.

10:04 The name could be a string.

10:05 And the value, the price could be, say, a float.

10:07 And it'll actually do all those conversions for you and give you meaningful errors if like

10:13 it's a non-parsable float or something like that.

10:15 Isn't that nice?

10:15 That's incredible.

10:16 Working with CSV is the bane of my existence.

10:19 Yeah.

10:20 So you can just define these things.

10:21 You get autocomplete in, you know, PyCharm or Visual Studio Code for your types, for your

10:26 rows, because they come back as these data classes.

10:28 And you get validation.

10:31 And on top of that, you can actually do cool stuff.

10:33 Like you can say, with data classes, you can specify either just the type or the type and

10:39 a value.

10:40 And that value becomes the default value.

10:42 So if like only sometimes the price is there, you could put zero or minus one.

10:45 And it'll just go through and substitute that value.

10:48 So a lot of cool little nice touches here.

10:50 Nice.

10:50 Yeah.

10:50 I'm definitely going to check this out the next time that I have to do some sort of CSV parsing

10:54 because it's way better to let this thing do the validation and the type conversion and

10:59 all that and just, you know, not worry.

11:00 I don't think I've been this excited about CSV files in a long time.

11:03 I know.

11:04 They are.

11:05 They are pretty amazing.

11:07 No, this really makes working with them nice.

11:09 And I'm excited, too.

11:11 All right.

11:12 Speaking of excited, I do want to tell you about Datadog.

11:14 They're helping make this show possible.

11:15 So before we get to our next item, let me tell you about them.

11:18 This show is brought to you by Datadog.

11:21 They're a cloud scale monitoring platform that brings like metrics and logs, distributed traces

11:25 all together.

11:26 You can do auto instrumenting of frameworks like Django and Flask and PostgreSQL, which

11:32 means you can track requests across service boundaries, across machines, things like that, which is

11:37 awesome.

11:38 It makes it really easy to troubleshoot your slow Python apps and figure out overall where

11:43 the time is being spent.

11:44 So you can get started for free at pythonbytes.fm/Datadog.

11:48 And they'll also give you a cool t-shirt with a Datadog character on it, which is nice and

11:53 cute.

11:53 So check them out.

11:54 It helps keep the show going.

11:56 Now, Brian, I want to come to a topic that we haven't really covered very much on the show,

12:01 but maybe I think a while ago we did talk about packaging ones, right?

12:05 Yeah.

12:06 You want to catch us up on it?

12:08 I think it's only second to GUIs so far.

12:11 That's right.

12:12 There's a fun article called How to Rock Python Packaging with Poetry in Briefcase.

12:17 Plus, it has the phrase how to rock something, and I'm a sucker for that.

12:22 It's actually kind of a nice tutorial on packaging.

12:25 So for those of you that just joined us and haven't learned all of our discussions on packaging,

12:30 this is a nice introduction to packaging and how it fits in Python and also kind of how

12:36 the changing of things, changing of packaging, like where Flit, Pipenv, and Poetry sort of fit in

12:43 with all of this.

12:44 It's kind of a nice run through of that.

12:46 Poetry is one of those things for packaging, and it uses the pyproject.toml file.

12:53 And of course, we talked about poetry in episode 100.

12:57 And the nitty-gritty details of pyproject.toml was in testing code 52 with Brett Cannon.

13:04 But the neat thing, one of the neat things why I picked this is it also talks about Briefcase.

13:09 And we haven't talked about Briefcase yet.

13:12 And Briefcase is one of those tools from Py, the Beware project.

13:17 And it's something that you can, it packages your, a Python application as a standalone native

13:24 application for lots of stuff.

13:27 It claims Mac, Windows, Linux, even iOS and Android, which is interesting.

13:31 I haven't tried any of this.

13:33 The tutorial talks about desktop distribution of code through Briefcase.

13:38 And it's kind of cool.

13:40 And how to get that done with poetry.

13:42 I think poetry is definitely nice.

13:43 It's, you know, sort of an alternative philosophy to PipInv, which is, we talked about that as well.

13:49 And Briefcase is part of that whole set of small independent tools from Beware, which also is

13:56 pretty cool.

13:57 And it's nice to see them working together.

13:58 Also, we got Cookie Cutter doing some magic in here as well.

14:02 Definitely.

14:03 And then one of the things that I like at the end of this, there's several tutorials on publishing,

14:08 like how to push your new package to PyPI.

14:12 But this one I really like because instead of telling you exactly how to push it to PyPI,

14:17 they tell you how to push it to the test server.

14:19 And I think that's an important step for people to do.

14:23 Before you subject the world to your code, try it out at the test server first.

14:28 So it's nice.

14:29 Of course.

14:29 Yeah, of course.

14:31 Why don't more of them do that?

14:32 I don't know.

14:34 But please, everybody, like save the world and push it here first.

14:39 Yeah.

14:39 The packages on PyPI cannot be changed when they're uploaded.

14:42 You can only add newer ones.

14:44 So although you've got to admire the rate at which you'll increase your version of your package

14:50 if you screw it up a few times trying to publish it.

14:52 Yeah.

14:56 That's productivity, right?

14:57 That's right.

14:57 So Nina, this next one that you found for us is one of these awesome lists.

15:04 And I think these awesome lists are coming along really faster and faster these days, right?

15:08 We've got awesome Python, awesome Python applications.

15:10 What's the next one that's in that category?

15:13 And this is a new one I came across called Awesome Python Security.

15:17 It's a collection of tools, techniques, and resources to make your Python more secure.

15:22 Oh, that's cool.

15:23 Yeah.

15:23 And it's got a lot of stuff for web apps like the secure.py, which is awesome.

15:27 We've covered that.

15:28 And the Flask, Flask Talesman, Django sessions, all kinds of stuff, right?

15:32 But not just the web.

15:34 There's a whole bunch of other ones.

15:35 I think, and hopefully all of you agree, that all of your production and client-facing code

15:39 should be written with security in mind.

15:42 And this list features a few resources that I'd come across before, like Anthony Shaw's excellent

15:47 10 common security gotchas article that highlights problems like input ejection and depending on

15:54 assert statements in production.

15:56 And I also came across a few that were new to me.

15:58 So the OWASP Python resources.

16:01 OWASP stands for Open Web Application Security Project.

16:04 And there's tons of OWASP resources out there.

16:08 I didn't know that there was a Python-specific one.

16:10 You can find that one at pythonsecurity.org.

16:13 I came across Bandit, which is a tool to find common security issues in Python.

16:17 And now Bandit has a lot of really useful plugins that test for some issues like hard-coded password

16:24 strings in production, leaving Flask debug on in production, using exec in your code, and

16:31 a lot more.

16:31 I linked to the full list in the show notes.

16:33 And then a few other cool ones like Detect Secrets, which is a tool to detect secrets that

16:40 were accidentally left in your Python code base.

16:42 That's cool.

16:42 Yeah.

16:42 Yeah.

16:43 Let's open source that.

16:44 Oh, wait.

16:44 Was our full access AWS or Azure key in there?

16:48 Whoopsie.

16:48 Oops.

16:50 Oopsie doops.

16:52 We'll just check it in again without that in there.

16:54 I'm sure it'll be fine.

16:55 There's no history.

16:56 And something I really like about this list in particular is it also includes resources

17:00 for learning about security concepts like cryptography.

17:03 Yeah.

17:04 You know, out of that cryptography section, they listed one of my all-time favorite packages,

17:09 which is PassLib.

17:10 So if you're going to store user secrets and you want to hash them, like passwords are

17:16 probably the most common, but there could be other things as well that you don't want

17:19 to store directly, but you want to accept from user and see if you have it.

17:23 All right.

17:23 Like you can hash it and that's a good idea.

17:26 But what you really should do is like take that result, add some salt, then hash it again,

17:30 take that, do it again, maybe a hundred thousand times, right?

17:33 PassLib, that's like one function, like dot encrypt rounds equal 150,000, 200,000, whatever.

17:39 It's really nice.

17:40 That's awesome.

17:41 Yep.

17:41 To make sure your password doesn't end up on have I been pwned.

17:44 Exactly.

17:44 Exactly.

17:45 So basically you can say, I want it to take, you know, 0.2 seconds to determine, to brute

17:51 force or to check each version of the password.

17:53 And it automatically, because of that, will slow down dictionary attacks against your site

17:57 because you can only do them point, you know, it'd take only five per second, right?

18:00 So, and then there's another one called let's be bad guys.

18:03 That's interesting.

18:04 So yeah, a lot of cool stuff here.

18:07 It's a great project name.

18:08 I know.

18:08 It's like a hacker playground.

18:10 Yeah.

18:10 So check out the full list on GitHub.

18:12 And then if there's something missing that you think should be there, maybe open a pull

18:16 request.

18:17 Yeah, absolutely.

18:17 That's a cool one.

18:18 I'm glad you found it.

18:19 All right.

18:20 The last official item I want to cover is PyDBG, which is the implementation of a, of a

18:27 Rust macro called DBG.

18:29 So just put the pie on the front.

18:31 Now I haven't done that much Rust.

18:34 I've actually been wanting to learn Rust.

18:35 It looks pretty interesting to me, but the basic idea of this DBG macro is instead of

18:41 just printing out, like I'm here, I'm here, the value is, you know, printing X as the value,

18:47 it'll actually give you a higher level statement without doing more work.

18:51 So if you're trying to debug something through print statements and that kind of thing, this

18:54 makes it a lot easier.

18:56 So you can go and say, like, I have a equals two, b equals three.

19:00 If I could say DBG of a plus b, the output is the file of the line a plus b equals five.

19:06 Things like that.

19:08 Really, really nice.

19:09 It sort of shows you in your message where you are in the file, what thing is you're actually

19:14 printing without having to like come up with elaborate print statements.

19:18 So pretty cool.

19:19 Oh man, I'm going to use this like every day.

19:20 Do you still use print statements to debug brain?

19:25 Yes, I do.

19:26 I love print statements.

19:29 I use the debugger a lot, but every now and then I'm just like, you know, I just want

19:32 to print this out and just see what is happening.

19:35 Like I, I don't primarily use print statements for debugging, but sometimes I do when I'm kind

19:41 of exploring that I want it to run, but I kind of want to see what's happening.

19:45 I'm like, Oh, what am I getting back from that API?

19:46 What is this value?

19:48 Things like that.

19:49 Have either of you used watch statements?

19:51 No, tell us about it.

19:52 You can just set up a variable or an expression to watch.

19:55 And, you know, every time you hit a break point, you're like, Oh, I see what's in there.

19:59 I don't have to type it again.

20:00 With VS Code or PyCharm?

20:02 Yeah.

20:02 I believe you can set up watches with PDB2, but I don't know.

20:06 I usually do those in a graphical debugger.

20:08 Yeah, me too.

20:09 Yeah, I definitely have used those.

20:10 I was, I was thinking something different, but so what, where, where I'm going to use this

20:14 DBG, PyDBG thing is a lot of times I've got, we've got test code that generates

20:20 huge amounts of data, like trace data.

20:23 And, these are stored and the, the test runs are really long and throwing a couple

20:30 of these extra ones for intermediate values.

20:33 So the failing tests or failing test runs, we can take a look at those postmortem, things

20:39 like that.

20:40 It'd be just save.

20:41 It's an elegant way to, to have that be done.

20:44 Yeah.

20:44 Yeah.

20:44 It's pretty cool.

20:45 What I like about it is this like so simple, right?

20:47 Like you place the word print with DBG and you, you kind of got something going on here.

20:50 It also kind of like that.

20:51 It's more explicit.

20:52 You're like, this is not really supposed to be a print statement.

20:54 This is just here till I figure out what's going on.

20:57 And then we're going to stop this.

20:59 Yeah.

20:59 But cool.

21:00 People can check it out.

21:01 Thanks for sending that in to our listeners.

21:04 All right.

21:05 So I guess that's it for all of our main topics, but looking at our show notes here, Brian,

21:09 we all like kind of had a second round in the extra.

21:12 So maybe we'll do like a lightning round one more time.

21:14 What do you got for us?

21:14 This is just a quickie that pytest has temporary directories and temperature factory fixtures for dealing with temporary files.

21:24 But they've added, as of pytest 3.9, there's path versions that return pathlib path objects.

21:33 And those are just quite fun.

21:35 And I'll drop a link in the show notes.

21:37 Okay.

21:37 That's great.

21:38 So I want to bring your attention to something, let's say, non-standard in terms of conference presentations.

21:44 So this is something at PyCon US.

21:46 So in May, Cleveland, 2019, there's a project called The Art of Python, which is a miniature arts festival focusing on narrative performance and visual art around programming and Python.

22:01 And basically showcase novel art that helps us share our emotionally charged programming experiences, particularly to do with Python.

22:08 So it's like five to 20 minute presentations in a separate little track.

22:12 And the call for papers are open.

22:13 So if you've always been a theater fan and you program, here you go.

22:17 Oh, this looks very cool.

22:19 Yeah.

22:20 Yeah.

22:20 That's pretty interesting.

22:21 So people can check that out if that connects with them.

22:23 The other one is one of my favorite surveys and sort of put your finger on the pulse of the community items is the Stack Overflow survey.

22:31 Well, the 2019 one is open.

22:33 So everyone should go out there and represent for Python and fill out the 2019 Stack Overflow survey.

22:39 So that's good.

22:40 And then finally, this gets a little bit back to your pick, Nina.

22:44 NumPy is awaiting a fix for a critical remote, remote code execution bug.

22:50 That's bad.

22:52 That doesn't sound super good.

22:55 So, yeah, I don't know if it has been entirely fixed yet.

22:59 I don't, you know, this is a couple days ago.

23:01 It was not.

23:02 So the idea is basically there's a problem with the pickle module.

23:06 Have you ever, could you imagine there'd be a problem with accepting user input straight in pickle form?

23:11 I can't imagine.

23:13 So the idea is there's some part of NumPy that you can load pickled data.

23:19 And, you know, for those who don't know, like part of the pickle statement is here's some Python code as a module and here's arbitrary code to run as part of deserializing that.

23:28 So good luck.

23:30 Oh, boy.

23:31 Yeah, yeah, that's not so, that's not so good.

23:33 So this goes up to at least version 110 through 116, which at least is January 14th.

23:39 This hadn't been fixed.

23:40 So hopefully it's been fixed.

23:42 But more importantly, if you're using NumPy and you're accepting user input through the load function, you want to upgrade and you want to be a little careful around that.

23:50 All right.

23:51 Then last one I just want to throw out there really quick.

23:53 I ran across this and I've known about it for a long time, but it turns out to be more useful than I thought.

23:58 So I use Google Docs a lot and I have like sheets in there and I've got Word, like documents and stuff.

24:07 But the problem is if you use like Google Drive, what ends up on your hard drive is like a hyperlink back to the actual sheet, right?

24:13 So how do you back that stuff up?

24:14 It turns out if you go to takeout.google.com slash settings slash takeout, that's a lot of repetition.

24:20 Anyway, you go there, you can say, give me all my document format, all my documents, and it'll give them to you in Microsoft Office format.

24:26 Like it'll convert the sheets to Excel, it'll convert the docs to Word docs, and then let you download them so you have like a sort of permanent version.

24:33 Anyway, I thought that was cool and people might find that useful too.

24:36 Cool.

24:36 Yeah.

24:36 All right, Nina, you got some as well.

24:38 You're teaching a class that looks really interesting.

24:40 That's right.

24:41 Yeah, I'm teaching a two-day introduction to an intermediate Python course on March 19th and 20th.

24:47 And that class is going to live stream for free at Front End Masters on each day.

24:52 And all the course materials I'm going to release for free as well.

24:55 That's really excellent.

24:56 And it has an in-person component if you happen to be, where is it, Minneapolis?

25:00 In Minneapolis, that's right.

25:01 You can come to the class.

25:02 Will Minneapolis be thought out by then?

25:04 The class size is about 20 people.

25:06 That I cannot promise.

25:08 Hopefully in March it's a little warmer.

25:10 Yeah, so up to 20 people could drop in in person.

25:12 That'd be really cool.

25:13 The next thing, I recently recorded a series of videos with Carlton Gibson.

25:17 He's a Django maintainer, maintains a lot of other projects on developing Django web apps with VS Code, deploying them to Azure with just a few clicks, setting up continuous integration and continuous delivery, as well as creating serverless applications.

25:33 You can watch that video series at aka.ms.com.au.

25:36 Python-virus.com.au.

25:40 Check that out, too.

25:40 Yeah, it's great.

25:41 We got to film in the Microsoft Channel 9 studio.

25:44 And it's a very well-done series.

25:47 All the bright lights and everything, huh?

25:49 Not just screencasts.

25:50 That's cool.

25:51 Yeah, we feel like newscasters.

25:53 I'm also planning on being a mentor at a brand new hatchery event at PyCon US 2019.

25:59 That's going to be mentored sprints for diverse beginners organized by Tanya Allard.

26:04 The goal is to help underrepresented folks at PyCon contribute to open source in a supportive environment.

26:10 The details aren't out yet, but I dropped a link to where they'll be when they're finalized.

26:15 Oh, that's super cool.

26:16 And there's also things like scholarships or something like that to help folks get actually physically to the event if they need some help as well, right?

26:25 They can apply for that at the PyCon site.

26:27 Yeah, PyCon US offers a lot of financial aid.

26:30 Lastly, if you're interested in Python for hardware, like we talked about earlier, you can catch my talk about electronics projects in Python with LEDs at PyCascades in Seattle on February 24th.

26:42 Currently, tickets for that are still on sale.

26:44 Excellent.

26:44 Yeah, and Brian and I are definitely going to be there.

26:46 We're all going to PyCascades, so we're going to catch it.

26:50 Hopefully, everyone else does as well.

26:51 Great.

26:51 I'm excited to see you there.

26:52 And I do have one last thing to sneak in, and that is if you haven't tried the Python extension for VS Code yet,

27:00 now is a really good time.

27:01 The December release included some really killer features like remote Jupyter support and exporting Python files as Jupyter notebooks.

27:10 And if you're interested in keeping up with future releases, I dropped a link to the Python at Microsoft blog.

27:15 Nice.

27:16 And didn't you as a group, I'm speaking to you as a Microsoft, as VS Code,

27:21 didn't you guys just release like an AI-powered autocomplete backend for Python as well?

27:28 That's been around for a few months.

27:30 I believe it's still in preview mode, but it works really well.

27:34 The dataset was trained on a bunch of open source projects.

27:37 Yeah, it looks super cool.

27:38 So I definitely want to check that.

27:40 I think I installed it just the other day, so it should be fun.

27:42 Yeah.

27:43 Try it.

27:44 Let me know what you think.

27:45 Absolutely.

27:45 Brian, we've come to our joke section, right?

27:47 Yeah.

27:47 Nina, you want to kick us off?

27:50 Yeah.

27:51 I found a bunch of really cheesy snake jokes, so here they go.

27:55 What do you call a snake that only eats dessert?

27:58 I don't know.

27:58 It's a pie-thon.

28:00 Nice.

28:02 I'll do the next one.

28:03 How do you measure a Python?

28:05 In inches.

28:06 They don't have any feet.

28:07 Brian, what's the last one?

28:10 What is a Python's favorite subject?

28:13 I don't know.

28:13 What is it?

28:14 History?

28:14 That's bad.

28:15 These are all bad.

28:16 History.

28:17 Lovely.

28:19 And I will not apologize.

28:21 No, those are great.

28:22 Thank you for finding those, Nina.

28:23 We're coming up with them.

28:24 Either way, they're great.

28:25 All right, folks.

28:26 Well, thank you for listening.

28:27 And Nina and Brian, thank you for being here today, of course.

28:30 Thank you.

28:30 Thanks for having me.

28:31 You bet.

28:32 Bye.

28:32 Bye.

28:32 Bye.

28:33 Thank you for listening to Python Bytes.

28:35 Follow the show on Twitter via at Python Bytes.

28:37 That's Python Bytes as in B-Y-T-E-S.

28:40 And get the full show notes at pythonbytes.fm.

28:43 If you have a news item you want featured, just visit pythonbytes.fm and send it our way.

28:48 We're always on the lookout for sharing something cool.

28:51 On behalf of myself and Brian Okken, this is Michael Kennedy.

28:54 Thank you for listening and sharing this podcast with your friends and colleagues.


Want to go deeper? Check our projects