Brought to you by Michael and Brian - take a Talk Python course or get Brian's pytest book


Transcript #185: This code is snooping on you (a good thing!)

Recorded on Thursday, Jun 4, 2020.

00:00 Hello and welcome to Python Bytes where we deliver Python news and headlines directly to your earbuds.

00:05 This is episode 185 recorded June 4th, 2020. I'm Michael Kennedy.

00:10 And I am Brian Okken.

00:11 And this episode is brought to you by Datadog. More on that later.

00:14 Check them out at pythonbytes.fm/datadog.

00:17 Brian, I feel like we're all working from home. Everyone's life is scrambled.

00:21 Even like my sleep schedules are scrambled.

00:23 Like some crazy stuff happened and I slept from like 6 to 9:30, and I was up for like four hours, and then I slept again, like it's just, it's weird.

00:30 Don't we need more structure in our life?

00:31 (laughing)

00:32 - Nice, nice intro.

00:34 Yes, more structure.

00:36 Yeah, I'm a fan of Markdown also.

00:38 Believe it, trust me, it's not a tangent.

00:41 So we have just a repo that we want to point people to called MyST, it's gotta be called Myst, don't you think?

00:47 - Oh yeah, definitely.

00:49 - M-Y-S-T, which is Markedly Structured Text.

00:53 And what this is, is a fully functional Markdown parser for Sphinx; it's Markdown plus a whole bunch of stuff from reStructuredText.

01:02 So MyST allows you to write Sphinx documentation entirely in Markdown, and things that you could do in reStructuredText but could not do in Markdown have been put into a new flavor of Markdown, so you can do all of your directives and all sorts of cool things. Anything you could do in reStructuredText with Sphinx you can now do in Markdown. It's based on CommonMark and some other tools, so they're standing on other tools that are already doing things really well and just extending them a bit. But this is pretty powerful. One of the things I like about this is, I particularly don't use a lot of Sphinx, but this also includes a standalone parser, so you can see how somebody's extended Markdown for these extra directives and even use some of them in your own code if you want.

01:56 >> Yeah, this looks really, really nice.

01:58 Restructured text is good and all, but I don't know.

02:01 If I'm going to write something like reStructuredText, my heart just wants to write Markdown, I got to tell you.

02:07 >> Yeah, me too. I think one of the things that was holding a lot of people back is some of the extra directives like information boxes and other things like that, that you can't necessarily do in Markdown off the shelf, but some extensions are nice.

02:23 I played with it a little bit. I didn't pull it down with Sphinx.

02:28 I just pulled it down so that I could run some Markdown through it and try some of the extra directives to see what it has.

02:35 So for instance, some of the directives like I tried like an information box, you can have structure around putting an information box somewhere.

02:42 And what you end up with is a div that has a class to it.

02:46 Oh, nice.

02:47 If you're not using Sphinx, then you'll have to use your own CSS, I guess to style it.

02:53 But it puts in enough hooks for you to be able to do that.

02:56 - That's really nice.

02:57 I do wish you could sort of indicate CSS styles in Markdown because, wow, that would just, that would be the end of what you need HTML for, for many, many things.

03:07 That would be nice.

03:08 So last week, you brought up direnv.

03:11 We were talking about how do you store your secrets, how do you activate and configure different environments.

03:18 I think I even said something about like specifying where Python was running.

03:22 I don't remember what the context was exactly, but you're like, "direnv." And actually, I've been meaning to cover this.

03:27 DunderDan, I linked him on Twitter.

03:29 Don't know what his last name is.

03:30 Thanks, Dan.

03:31 Sent this over to us as a recommendation.

03:34 And I'm like, "Yeah, you brought it up.

03:36 It seems definitely cool." So let me tell you about direnv, D-I-R-E-N-V.

03:41 So it's an extension that goes into your shell.

03:43 And normally what you do is you open your shell, and it runs your .bashrc, .zshrc, whatever, and sets up some stuff.

03:52 Or if you're over on Windows, it works a little bit different, but I think direnv is only for the POSIX-type systems.

03:59 Anyway, it'll set up some values that you put in there, like environment variables and whatnot, and that's just global, right?

04:07 You can also set up when you activate a virtual environment to export other values, that's pretty cool.

04:14 But what it doesn't really do is allow you to have like a hierarchy of values.

04:18 So if I'm in the sub directory over here, I want this version of Python active or this version of where the Flask app lives.

04:27 And then if I change to another directory, I want it to automatically go, well, that means different values and dir env basically does that.

04:34 - Oh, nice.

04:35 - Yeah, so as you go into different parts of your folder system, it'll look for certain files, .envrc.

04:44 And if it finds that, it'll automatically grab all the, basically all the exports and then jam them into whatever your shell is.

04:51 And it's also cool because it's not a shell, right?

04:54 It's not like, well, here's a shell that has this cool feature.

04:56 It works with Bash, Zsh, tcsh, Fish, and others.

05:01 Right, so it's basically a hook that gets installed. Like, I use Oh My Zsh because, oh my gosh, it's awesome.

05:09 And then I would just plug this into it and as I do stuff with Zsh, it will just apply its magic.

05:16 - Yeah, and so one of the things you can do with this is to automatically set a virtual environment if you go into special directories.

05:24 That's not the only thing it can do, but that's one of the reasons why a lot of people use it.

05:29 - Right, you basically, well, I guess you can't do aliases.

05:31 You can't change what Python means, but you can say where the Python path is, yeah.

05:35 Yeah, and that's one of the limitations of this that people should be aware of. The way to think of it is not as a sub-rc, right?

05:43 It's not a sub-.bashrc where it runs aliases and all sorts of stuff.

05:48 The way it works is it runs a bash shell, like a little tiny hidden bash shell.

05:52 It imports that as the .bashrc and it captures what the exported variables are, throws away that shell, and then jams those into whatever active shell you have, like Zsh or Bash or Fish or whatever.

06:04 Yeah.

06:05 I would probably use this all the time if I wasn't somebody that uses Windows, Mac, and Linux frequently.

06:13 - You know, probably, I bet somebody could come up with this thing for Windows as well.

06:17 It's just gotta be like totally from scratch, different type of thing, right?

06:20 - People have already pointed me to Windows versions of it, but it's one of those things of like, you gotta jump through hoops to make it work.

06:28 And it's just not, for me, it's not solving a big enough problem that I have that I need to jump through the hoops.

06:35 - I agree, I agree.

06:36 It is cool, but it's not like life-changing in that regard.

06:40 I guess one more thing to point out is you don't have to go to the directory where the .envrc file is.

06:48 It looks up the parent directories until it finds one.

06:51 So you have this hierarchy.

06:52 I'm down here in the views part of my website, and at the top level of that Git repo I have one of these .envrc files.

07:00 It would find that and activate that for you.

07:02 So that's pretty cool. It's kind of like Node.js and where the node_modules live, in that regard.

07:07 That's pretty cool.

07:08 - Yeah, that's a really nice feature.

07:09 - Yeah, for sure.

07:11 Also nice, Datadog.

07:12 So before we get to the next thing, let me talk about them real quick.

07:15 They're supporting the show.

07:16 So thank you, they've been sponsors for a long time.

07:18 Please check them out and see what they're offering.

07:20 It's good software and it helps support the show.

07:22 So if you're having trouble visualizing bottlenecks and latency in your app, and you're not sure where the issues are coming from or how to solve it, you can use Datadog's end-to-end monitoring platform with their customizable built-in dashboards to collect metrics and visualize app performance in real time.

07:37 They automatically correlate logs and traces at the level of individual requests, allowing you to troubleshoot your apps and track requests across tiers.

07:46 Plus their service map automatically plots the flow of these requests across your application architecture.

07:51 So you can understand dependencies and proactively monitor performance of your apps.

07:56 So be the hero that got that app at your company back on track.

08:00 Get started with a free trial at pythonbytes.fm/datadog.

08:04 You can get a cool shirt.

08:05 All right, Brian, what's next?

08:06 Yep, thanks Datadog.

08:07 I had a problem.

08:08 So my problem was a little application that had a database, and I was using TinyDB just for development.

08:15 You could use Mongo, similar.

08:16 It's a document database.

08:18 I'd thrown some data into it, no problems.

08:20 But there was one of the values that I decided to change to use Python enums, because I thought enums are cool.

08:28 I don't use them very often.

08:29 I'll give these a shot 'cause they seem like perfect.

08:32 And then everything blew up because I couldn't save it to the database because enums are not serializable by default.

08:41 So I'm like, there's gotta be an easy workaround for this.

08:44 And I first ran into questions about, or topics about creating your own serializer.

08:51 That just didn't seem like something I wanted to do.

08:54 - You could do it, but it's not so fun, right?

08:56 - Yeah, well, so I ran across an article, a little short article written by Alexander Hultner called "Convert a Python Enum to JSON." And I didn't need it converted to JSON, but I did need it serializable.

09:08 And the trick is just, when you use enums, you do from enum import the capital-E Enum type, and then you have a class that derives from that, and then you have your values.

09:22 Well, if you also derive from not just Enum, but another concrete type like int or str, and in my case I used str so that my string values would be stored.

09:35 Now it is serializable and it works just the same as it always did before.

09:39 It's just, it uses the serializer from the other type.

09:43 And it just works. Incredible. So, for instance, I'm going to put a little example in the show notes about a Color enum with red and blue values.

09:51 If you just derive from Enum, you can't convert it to JSON because it's not serializable. You can either do an IntEnum, which is a built-in one, or combine str and Enum.

10:03 Now it serializes just to the strings red and blue, if those are the values. And then that's what's stored in your database too. So when I'm using it, it's really handy for debugging to be able to have these readable values as well. Yeah, this is really cool. It's a little bit like abstract base classes versus concrete classes or something like that, right? You've got like the sort of general Enum, but if you do the IntEnum, then it has this other capability, which is cool.

10:30 Yeah, multiple inheritance, str plus Enum is the one you went for, right?

10:34 Yeah, so the multiple inheritance is the thing that Alexander recommended in his post. That's what I'm using, and it works just fine. But I was interested to find out that in the Python documentation for IntEnum, IntEnum is almost just there as an example to say, we realize that it might not be integers that you want, you might want something else. There's an example right in the Python documentation on using multiple inheritance to create your own. It doesn't talk about serializability there, but that's one of the benefits.
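For reference, here's a minimal sketch of the trick, using made-up Color names and the standard library's json module (TinyDB's default storage goes through the same JSON machinery):

```python
import json
from enum import Enum

class Color(str, Enum):       # mixing in str makes members look like plain strings to serializers
    RED = "red"
    BLUE = "blue"

class PlainColor(Enum):       # a plain Enum, for comparison
    RED = "red"
    BLUE = "blue"

print(json.dumps({"favorite": Color.RED}))       # {"favorite": "red"}

try:
    json.dumps({"favorite": PlainColor.RED})
except TypeError as err:
    print(err)                # Object of type PlainColor is not JSON serializable
```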

11:04 Yeah, it seems like it works anyway. Awesome.

11:06 How much time did it take you to figure that out? Was it a long time?

11:09 No, I don't know. About 10 minutes of Googling?

11:11 Yeah, that's pretty cool.

11:12 Well, you could compute it with Python, of course, but the datetimes in Python and time spans, they're pretty good, actually, but they're a little bit lacking.

11:20 There are certain types of things you might want to do with them.

11:23 And so there's a couple of replacement libraries, and one that Tucker Beck sent over, it's called Pendulum.

11:30 That's pretty cool.

11:31 Have you played with Pendulum?

11:32 - I haven't, but I like the name.

11:33 - Yeah, I do too.

11:33 It's really good.

11:35 I've played with Arrow.

11:36 So this is a little bit like Arrow, but it doesn't seem like it tries to solve exactly the same problem.

11:41 It's just like, let's make Python date times and time deltas better, which is kind of the goal of both of them.

11:46 So it's more or less a drop-in replacement for standard date time.

11:50 So you can create like time deltas, which are pretty cool.

11:53 like I could say pendulum.duration, days equals 15, I have this duration.

11:58 It has more properties than the standard date time, or the time delta.

12:03 You get like total seconds or something like that, but that's not that helpful.

12:07 So this one has like duration.weeks, duration.hours, and so on, which is pretty cool.

12:13 You can ask for the duration in hours, like the total number of hours, not just the hours part, like three hours and two days or whatever.

12:22 but you also have this cool like human friendly version.

12:25 So I can say duration in words and give it a locale and say like locale is US English.

12:31 And it'll say that's two weeks and one day.

12:33 - Nice.
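To make that concrete, here's a rough sketch of the duration calls being described, assuming Pendulum 2.x (the printed values are approximate):

```python
import pendulum

dur = pendulum.duration(days=15)

print(dur.weeks)                   # 2 -- friendlier accessors than timedelta
print(dur.in_hours())              # 360, the total across the whole span
print(dur.in_words(locale="en"))   # "2 weeks 1 day"
```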

12:34 - You can also, like, let's suppose I'm trying to do some work with like calendars or some kind of difference.

12:39 I say the time from here to there, I wanna do something for every weekday that appears, right?

12:45 So skip Saturday and Sunday, but if it's like from Thursday to Wednesday, I need to go Thursday, Friday, Monday, Tuesday, Wednesday.

12:51 Yeah?

12:52 So I could say pendulum.now, and then I could go from that and subtract three days, so that would be a period of three days.

12:59 And that gives you what they call a period, which is a little bit different.

13:07 And then I can go to it and say, convert yourself to weekdays with in_weekdays.

13:07 Okay.

13:08 Right.

13:09 Not interesting.

13:10 Then you can loop over it.

13:11 You can say for each day or each time period in this period and go, it would go, you know, over the weekdays that are involved in that time span.

13:20 That's pretty cool.

13:21 that would not be so much fun to do yourself.
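And a similar sketch for the period and weekday idea, again assuming Pendulum 2.x:

```python
import pendulum

end = pendulum.now()
start = end.subtract(days=3)
period = end - start               # subtracting two datetimes gives a Period

print(period.in_weekdays())        # number of weekdays in the span
for day in period.range("days"):   # iterate over each day in the period
    print(day.format("dddd"))      # e.g. "Thursday", "Friday", ...
```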

13:23 There's a bunch of stuff that it does and I don't want to go like read all the capabilities and whatever.

13:27 But that gives you a sense, like if these are the kinds of problems you're trying to work through and you're like, "Man, this is a challenge to do with the built-in one." Check out Pendulum. Also check out Arrow.

13:37 I think we covered Arrow a long time ago.

13:39 If we haven't, I'll cover it at some point. It's a good one.

13:41 Yeah, and I think, actually, I don't think it's a matter of which one's the best either.

13:45 It's whatever seems to speak to you and has an API that thinks like you do.

13:52 - Yeah.

13:53 - It's good that lots of people have solved things like this.

13:54 - Yep, absolutely.

13:56 All right, well, what's this next one?

13:58 You trying to be like a private detective or what's going on with this?

14:02 (laughing)

14:03 - Yeah, a private detective.

14:04 Looking into and spying on your code.

14:07 So this was sent over by a Twitter account called PyLang, and this is PySnooper.

14:13 The claim is: never use print for debugging again. And I have to admit, I am one to lean on the print statement every once in a while, especially if I'm just--

14:23 sometimes I don't really want to use breakpoint because I've got some code that's getting hit a lot.

14:29 And I really do want to see what it looks like over time.

14:32 So one of the things that people often do is throw a print statement somewhere in a line just to say, hey, I'm here.

14:38 The other thing they do is print out a variable name right after an assignment so that they can see when it changes.

14:44 But that's exactly--

14:45 - It was this and now it's that.

14:47 - Yeah, so this is exactly kind of what it does.

14:49 So by default, it's just a, you can throw a decorator onto a function.

14:54 That's the easiest way to apply it for PySnooper, decorate a function.

14:59 And now every time that function gets run, you get a play by play log of your function.

15:04 And what it logs is, it logs the parameters that get passed to your function.

15:08 It logs all the output of your function, but also every line of the code of the function that gets run.

15:15 And every time a variable changes, it logs its new value.

15:18 And then even at the end, it tells you the elapsed time for the function.

15:22 So that's quite a bit.
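For reference, this is roughly what the decorator form looks like, along the lines of the PySnooper README (the function is just a toy example):

```python
import pysnooper

@pysnooper.snoop()   # logs arguments, every executed line, variable changes, return value, elapsed time
def number_to_bits(number):
    if number:
        bits = []
        while number:
            number, remainder = divmod(number, 2)
            bits.insert(0, remainder)
        return bits
    return [0]

number_to_bits(6)    # the trace goes to stderr by default
```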

15:23 If that's great for you, great.

15:25 But if it's too much information, you can also isolate it with a with block and just take a section of your function under test and just log a subset.

15:35 And then if local variables are not enough and you're changing some global variable, you can tell it to watch that as well.

15:43 Anyway, it's a pretty simple API.

15:44 And there's actually quite a few times I think I'll probably reach for this.
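And here's a sketch of the narrower forms Brian mentions: snooping on just a section, watching an extra name, and writing the log to a file instead of stderr (the names here are placeholders):

```python
import pysnooper

API_CALLS = 0                      # made-up global, just to show watch=

def handle(text):
    # ... code you don't care about ...
    with pysnooper.snoop("/tmp/debug.log", watch=("API_CALLS",)):
        result = text.upper()      # only this block gets traced
    return result
```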

15:48 - When I first saw this, I'm like, ah, yeah, it's kind of cool.

15:51 There's a lot of these replacements where I think like, you know what?

15:55 You've got PyCharm or you've got VS Code, you're better off just sitting in a breakpoint.

15:59 And the tooling is so much better than like say, PDB or something like that, right?

16:05 - Yeah.

16:06 - This though, this solves a problem that always frustrates me when I'm doing debugging, which is you're going around, you've got to keep a track in your mind.

16:13 Okay, this value was that, now it's this, and then it became that, like sort of the flow of data. At any frozen point, you can see really well with the visual debuggers, right, like PyCharm or whatnot, what the state is, you can even see what's changed, but not like when this list was empty, then this was added, then this was added, and here's how it evolved over time.

16:33 Yeah, people should check out the readme for this, because that view of it is like, there's a loop where it shows going through the loop four times and all the values and variables like build up.

16:43 So you can just like review it and see how it flows.

16:45 I think it's pretty sweet actually.

16:47 - Yeah, one of the other things that I forgot to mention is if you're like debugging a process on a server, maybe you've got a small service that's running and instead of standard out, you can pipe these logs to a file and review them later.

17:03 - Yeah, definitely for a server as well, it would be nice to flip that on.

17:07 And I guess with a conditional, you could probably even in code say, do you feel like you're running into trouble?

17:14 Turn on the PySnooper for a minute and then turn it off.

17:16 You know, like there's probably options there, but yeah, you definitely wouldn't want to attach a real debugger to like production.

17:22 Dude, why isn't the site working?

17:25 Oh, somebody's got to go back to their desk and hit F5 or continue or whatever.

17:31 That's not going to go well.

17:32 So I have something that's pretty similar to follow this up with that's, you know, this is about debugging and seeing how your code is running.

17:39 Like per usual, we talk about one tool and people are like, "Oh yeah, but did you know about?" So we've talked about Austin and we've talked about some of the other cool debugger profilers.

17:50 And so over on PyCoders, they talked about Fil, which is a new memory profiler for data scientists and, well, general scientists.

18:02 And you might wonder, like, why do data scientists or biologists, why can't they just use our existing memory profilers?

18:10 Like why is Austin not their thing, right?

18:13 And it may or may not be, like it may answer some great questions for them.

18:16 Like obviously, they do a lot of computational stuff, and making that go much faster lets them ask more questions, right?

18:22 So maybe profilers in general are like things they should pay attention to.

18:26 But when they talk about this, they say, look, there's a really big difference between servers and, like, data pipelines, or sort of imperative, just top-to-bottom code, we're-just-gonna-run-scripts sort of thing, right?

18:39 And that's what scientists and data scientists do a lot.

18:42 It's like, I just need to do this computation and get the answer.

18:45 So with servers, if you're worried about memory, remember this is a memory profiler, what you're worried mostly about is, you know, this has been running for three hours.

18:54 Now the server's out of memory.

18:57 That's a problem, right?

18:58 Like it's probably an issue of a memory leak somewhere.

19:01 something is hanging on to a reference that it shouldn't, and it builds up over time like cruft, and it just eventually wears it down, and it's just bloated with too much memory.

19:14 So that's the server problem, and I think that's what a lot of the tooling is built for.

19:18 But data pipelines, they go and they just run top to bottom, and for the most part they don't really care about memory leaks, because they're only gonna run for 10 seconds.

19:27 But what they need to know is if I'm using too much memory, what line of code allocated that memory?

19:33 Like I need to know what line, where I'm using too much memory, and how can I maybe use a generator instead of building a whole list or something like that, right?

19:43 So that's what the focus of this tool is: it's gonna show you exactly what your peak memory usage is and what line of code is responsible for it.
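As a toy example of the kind of script you'd point it at (the filename and sizes are made up; `pip install filprofiler` and the `fil-profile run` command are the documented entry points):

```python
# bigarray.py -- run as:  fil-profile run bigarray.py
import numpy as np

def make_big_array():
    # ~800 MB of float64 zeros; Fil's report should flag this line as the peak allocation
    return np.zeros(100_000_000)

def process(data):
    return data.sum()

if __name__ == "__main__":
    print(process(make_big_array()))
```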

19:53 - This is actually pretty cool.

19:54 - It is, right?

19:55 At first I thought, what is this?

19:56 Like why do they need their own thing?

19:57 But as I'm looking through, I'm like, yeah, this is actually pretty cool. And if you go to the site, you can actually see, they give you this graph, like a nice visualization of like, here are the lines of code.

20:08 And then it's like more red or less red, depending on how much memory it allocated.

20:12 - Oh, wow.

20:13 - Yeah, and then the total amount, and you can like dive into like, okay, well, I need to see like this loop or this sub-function that I'm calling.

20:19 How much is it?

20:20 So you can like navigate through this visual, like red, pink, gray of like memory badness, I guess.

20:26 I don't know, memory usage.

20:27 - Yeah. - Yeah, it's not bad, right?

20:28 - No, yeah, and when you're staring at code, it's not obvious where the huge array might get generated or used.

20:33 >> Yeah, and the example they have here, it's like, okay, well, they have a function called makeBigArray, okay, so probably.

20:39 You might look there.

20:40 And there's also things like using NumPy, like okay, here we're creating a bunch of stuff with NumPy.

20:45 You might say, well, here's the NumPy thing that we're doing that makes too much.

20:49 But you could be doing a whole bunch of NumPy and pandas work and one line is actually responsible.

20:56 But you're probably pretty sure it has to do with pandas, but that you're not sure where exactly, right?

21:00 So you could, you know, dig into it and see.

21:02 I think it's cool.

21:03 - Yeah, we thought we were using arrays and suddenly we have this huge matrix that accidentally.

21:08 - Exactly.

21:08 Why is all this stuff still in here?

21:10 Yeah, yeah, cool.

21:11 Well, anyway, if you're doing data science and you care about memory pressure, this thing seems super easy.

21:18 It even has like a try it on your own code on the website, which I don't know what that means, but that's crazy.

21:23 (laughing)

21:24 Not uploading my code there, but it's fine.

21:26 All right, well, Brian, that's it for our main items.

21:28 You got anything?

21:29 - I don't.

21:30 I've just been trying to get through the day lately.

21:33 - Yeah, I hear you.

21:34 Well, I have one really quick announcement and then an unannouncement in a sense.

21:38 So I sent out a message to a ton of people.

21:40 So unannouncement is for them.

21:43 So what I'm trying to do is I'm trying to create some communities for students going through the courses to go through them together.

21:50 And I'm calling these cohorts, right?

21:50 So I set up like a beginner Python cohort and a web Python cohort, and I had like 20 or 30 slots, let's say, for people to go through over like three months or so, where they all work on the same part of the course at the same time and they're there to help each other.

22:10 There's like private Slack channels and other stuff around it.

22:14 So that's really fun, but it turns out that after one day of having that open, I got many hundreds of applicants for like 20 spots.

22:22 So I had to stop taking applications.

22:24 So if people got those messages and like, I want to apply, but it looks like the form is down, it's because there's like an insane number of applicants per spot.

22:34 So those will come back and people can sign up to get notified.

22:37 There's a link in the show notes, but I just want to say like, that's what I was doing, which is fun.

22:42 But for those of you who didn't get a chance to apply, cause it got closed right away, that's why.

22:46 - And that's for training at talkpython.fm.

22:49 - Yes, exactly.

22:50 So there's like certain courses, and if you got one of the courses and you want to go through it with a group of students all on the same schedule.

22:58 This was like a free thing that I was doing to try that out.

23:01 Yeah, I think it's a neat idea.

23:02 Yeah, thanks.

23:03 Yeah, people seem to like it.

23:04 Yeah, too many.

23:05 But yeah, we've got to give it a try, get it dialed in, then we can open up some more groups.

23:10 Yeah.

23:11 All right, well, I've got a joke I kind of like for you here.

23:13 I love this one.

23:14 Are you ready for it?

23:15 Yeah.

23:16 You want to be...

23:17 Why don't I be the junior dev?

23:18 You can be the senior dev.

23:19 Junior Dev and Senior Dev are having a chat.

23:21 And I feel like you may be a little skeptical of what I've done here.

23:25 Let's just do this.

23:26 All right, why don't you hit me with a question.

23:27 - Okay, so where did you get the code that does this?

23:31 Where did you get the code from?

23:32 - Oh, I got it from Stack Overflow.

23:33 - Was it from the question part or the answer part?

23:36 (laughing)

23:38 - Isn't that so good?

23:43 It's like people say copying from Stack Overflow is bad.

23:43 I think this is the real--

23:45 - Copy from the question.

23:46 - You definitely don't want to copy from the question part.

23:49 - Yeah, but actually I've never heard anybody like, you know, spell that out.

23:52 You know, you can look up stuff on Stack Overflow, but at the top with the question, don't copy that.

23:58 That's the code that somebody's saying, this doesn't work.

24:01 Yeah.

24:01 (laughing)

24:02 - Exactly, exactly.

24:04 - That's funny.

24:04 - All right, yeah, this is a good one.

24:06 - Ah, it's too funny.

24:09 - It's too funny.

24:10 All right, well, thanks as always.

24:12 Great to chat with you and share these things with everyone.

24:14 - Thank you.

24:15 - Yeah, bye-bye.

24:16 Follow the show on Twitter via @pythonbytes.

24:18 That's Python Bytes as in B-Y-T-E-S.

24:21 And get the full show notes at PythonBytes.fm.

24:24 If you have a news item you want featured, just visit PythonBytes.fm and send it our way.

24:28 We're always on the lookout for sharing something cool.

24:31 On behalf of myself and Brian Okken, this is Michael Kennedy.

24:34 Thank you for listening and sharing this podcast with your friends and colleagues.
