Brought to you by Michael and Brian - take a Talk Python course or get Brian's pytest book


Transcript #186: The treebeard will guard your notebook

Return to episode page view on github
Recorded on Wednesday, Jun 10, 2020.

00:00 Hello and welcome to Python Bytes where we deliver Python news and headlines directly to your earbuds.

00:05 This is episode 186 recorded June 10th, 2020. I'm Brian Okken.

00:11 And I'm Michael Kennedy.

00:12 And this episode is actually brought to you by us. And we'll talk more about some of the ways you can support myself and Michael a little later in the show. But first, let's side table that for a little bit.

00:23 Side table. Yeah, let's put it to the side and talk about side table. Yeah, So, Sightable is something that I noticed, this new project from Chris Moffat, and long-time listeners of the show will definitely know that I'm inspired by visuals.

00:39 And this is one of those that's really nice, right?

00:42 Like, not long ago when Guido was on the show, we talked about a missing number visualizer for pandas.

00:50 So you could have a quick view of just like, I got this data, I just need to really quickly see like kind of what it looks like, what's missing, correlate missing elements and whatnot.

00:59 And so side table is in this general zen of things.

01:02 It's like, I've loaded up some data.

01:04 I just want to quickly ask some questions and get a sense of what's going on.

01:08 Like I've got a pandas data frame and I want to be able to say, can you just break this down by like, show me the top 20% of this and then group the other stuff into just like an other category.

01:18 Also, instead of just getting like a plain text output, you get a cool, like alternating row color, nice table with extra information and whatnot.

01:28 And it's usually something really, really simple.

01:31 Like I could go to the data frame and say, just give me the frequency of state and just group it by that or something.

01:38 And it does a group on those and a whole bunch of cool stuff.

01:41 So really, really neat visualization.

01:44 There's a picture in the show notes that shows you without it and with it.

01:48 And given that the nicer version requires even less typing than the not nice version, I kind of like it.

01:54 - Yeah, and just out of the box, having just like the alternating gray and white stripes is good.

02:02 - Yeah, absolutely.

02:03 So basically all you have to do is a pip install, of course, but then import side table and it adds an STB functionality to data frames, to Panda data frames.

02:12 And then you can ask it questions like frequency.

02:14 There's other stuff that you can also ask.

02:16 There's like a bunch of different functionality there.

02:19 So really nice for exploring new data sets.

02:21 And it's basically a supercharged version of Pandas value counts with a little cross tab mixed in.

02:27 So yeah, it's easy to use.

02:29 And if you're working with Pandas, especially in Jupyter context, you know, that's really where this makes sense.

02:35 Give it a shot.

02:36 I think it looks great.

02:36 - It does look good.

02:37 - Good job, Chris.

02:38 Yeah, go ahead.

02:39 - No, I totally didn't even intend to do another table one back to back, but.

02:44 - We're kicking it off with all the tables.

02:45 Yeah, which one you got here?

02:47 - So this was a suggestion from Tom McDermott And for the tabulate package, this is not for, it's not intended for Jupyter stuff, it's intended for just standard out sort of things.

03:01 So you want a pretty printy tabular data in Python for command line utility.

03:06 Actually, I've been using this for years.

03:09 I was like, I'm sure we've covered this, and I looked it up, and I don't think we have, or at least I can't find it.

03:14 - I don't remember us covering it either, and it's really sweet.

03:16 It's like it generates nice formatted tables, but in ASCII.

03:21 So like before I said, you know, side table is awesome if you're going to be doing this within Jupyter, but this is like, if you're doing it within just a terminal command line app.

03:30 - By default, you've just got like a matrix of like a list of lists or a list of tuples or something to represent the rows.

03:38 And you just want to print it with tabulate.

03:41 It just does it automatically, but you can also, I usually use it with headers.

03:44 So you have pass in the headers separate.

03:46 So header information, and by default, it just prints stuff out with, prints the headers and then dashed lines, and then your columns underneath.

03:57 But it also like spaces it correctly, 'cause that, I mean, actually that, trying to get that right by yourself by hand is just a pain to try to figure out how wide things are supposed to be and whatever.

04:07 So this just does it, and it's great.

04:09 - Not only does it do it by text, like the example that you have in the show notes really illustrates like the nuance here.

04:16 So it's got a list of planets, their radius and masses.

04:20 And for the sun, it has in scientific notation, like 1.8, 1.989 times 10 to the ninth.

04:27 And then for the other ones, it's like 5,973.6.

04:32 It aligns the decimal places, not all to the right.

04:36 I mean, it's glorious.

04:37 - Yeah, the alignment is neat.

04:38 I really appreciate that.

04:39 So you have control over some of your number formatting in your alignment, but also if you're outputting for different things, there are multiple different formats, including like a simple markdown type table, but it also does GitHub flavored markdown tables and pipes that just look nice if they just kind of make it look like boxes, and there's Jira style and MediaWiki and HTML and just plain if you don't want any sort of stuff in between, just spaces in between, it looks nice.

05:09 - That's cool.

05:10 like output this in JIRA format and like paste it into a JIRA issue as like, here's what we're doing now, or here's the problem or here's the data.

05:17 Yeah, definitely. It's a good one for keeping track of tables.

05:21 Yeah. Wow.

05:22 Another good thing is all the stuff that you and I have to offer people to learn more information about lots of stuff.

05:29 Yeah, absolutely. We have the podcast, but we also have other things as well.

05:32 Yeah. So if you want to support what we're doing, one of the things you can do is become a Patreon supporter.

05:36 So there's a link on the page where you can throw a couple bucks at us a month.

05:40 if you want, but also I've got a book.

05:44 If anybody was not aware of that, there's a pytest book.

05:47 - You've written a pytest book?

05:48 That's awesome. - Yeah, I did.

05:50 - It's good, I really like it.

05:51 - Another podcast called Test & Code.

05:54 I'd love to have more people go check that out and suggest what you want.

05:57 So I'd like to have people tell me about what other topics should be covered there.

06:03 You also offer quite a few learning opportunities for people.

06:07 - Yeah, absolutely.

06:08 The main thing for me, if you want to support me, like obviously we have the Patreon and that's great, but if you want to support us and get something back, you could take one of our courses over at Talk Python Training.

06:18 We're doing all sorts of cool stuff there.

06:20 We've got like 120 hours of Python courses and exercises beyond that.

06:24 But we recently just kicked off a cohort thing where people can go through as groups.

06:30 So that's something I'm trying to put together and you know, we'll probably be more opportunistic to do that as well.

06:35 So yeah, check it out if you want to learn Python.

06:37 that's where I recommend people go.

06:38 - Yeah, I wanna bring something up about your courses.

06:41 There's a lot of the courses that are, there's a lot of content there and it's wonderful information.

06:46 One of the things I really love, especially in this working from home environment where I don't often have a lot of time, is the way you've broken up all the courses into little tiny pieces.

06:57 So there's a table of contents so you can go through the course and see what you've seen and see what you haven't, but you can keep track of what you haven't and there's often just, If you've got like three to five minutes, you can fit in a little extra video.

07:09 - Yeah, thank you so much.

07:10 That's awesome.

07:11 - And I like that you've done that.

07:12 - Yeah, I really want to try to make the courses have meaning as a reference afterwards as well.

07:17 And like, nobody wants to go back and scan a 30 minute video for that 30 second clip you're looking for.

07:22 - Yeah, that's good.

07:23 - Awesome.

07:23 You know what else is really good?

07:24 Tree beards.

07:26 Yeah, for real, tree beards are pretty awesome.

07:28 - Is that like a neck beard?

07:29 - But for a tree.

07:32 - Okay.

07:33 - Okay, yeah.

07:33 So I actually have no idea the relationship of the neck beard to the tree beard, but tree beard is continuous integration for notebooks, which is pretty cool actually.

07:41 That is cool.

07:42 So this was recommended by Brian Skinn and it's continuous integration for a particular subset of notebooks.

07:50 Those are the notebooks that are binder ready.

07:52 So if you're not familiar with binder, I recently did a talk by Thon episode on this and.

07:56 Came to appreciate binder way more than I originally did.

08:00 So, Binder is a place where you can basically point a GitHub repo or some repo at, go to Binder, point it at your repo, say, "Here's the notebook, here's the dependencies files and everything." And then you just click a button and say, "Let me run this on Binder." Because if you go to GitHub, you see the possibly the output from the notebook, but that's like cached what was run the last time.

08:23 If you want to actually run it and play with it, you can click launch a Binder, it'll fire up a little docker image somewhere magically in the cloud and it'll just run it.

08:32 So you basically configure the repo to describe to binder what it needs to run successfully, right?

08:39 So that's how this works is Treebeard basically says if there's something that can be run on binder, then it will use that same functionality to automatically install the dependencies, which could be like conda or pip or whatever.

08:53 And then it'll run the notebook using that cool library called Paper Mill, which sort of converts notebooks into kind of function type things.

09:01 It'll upload the output and do an NB convert on the notebook to save it and create like a version stamped last run of your notebook that you can go back through your continuous integration and see the history of the outputs saved as HTML, which is pretty awesome.

09:18 And it integrates with a GitHub app that'll like push notifications back to your repo.

09:22 it integrates with Slack, it has all kinds of interesting things like this.

09:26 So really a neat mechanism to make sure that your code just keeps running if it's a notebook.

09:32 - Yeah, it's even got like secret management so you can, if you have to connect with different things with passwords and stuff.

09:38 - Don't you just put those in the notebook?

09:40 - No.

09:41 - No, darn it.

09:42 Yeah, no, that's really cool.

09:44 It has secret management and all kinds of stuff.

09:46 And basically when I first saw this, I thought, okay, well, what's the criteria of success, right?

09:52 Like how do I write a test to indicate a successful notebook experience?

09:56 The way it works is basically it runs all the cells and if all the cells run without exceptions, then it's successful.

10:03 So it's not like it's making assertions, but it's kind of like a smoke test.

10:07 Like it didn't entirely explode, so we think it's probably okay.

10:11 That's not bad for a starter.

10:12 - I mean, conceptually, you could put asserts in there and that would throw an exception.

10:16 - Exactly, right.

10:16 You could build in the test at like some layer in there.

10:19 Like have even a Python file that you import that like does the tests, I don't know, whatever.

10:23 There's a lot of options.

10:24 So you're right, you could make your notebook report out.

10:27 - Make some cells in there.

10:28 - Yeah, make some cells that'll blow up if things go wrong, for sure.

10:30 - Yeah, somebody should get ahold of us and tell us why beard.

10:33 - Yeah, 'cause trees generally don't have beards.

10:36 (laughing)

10:38 - Well, okay, we live in Oregon, so they're often very mossy, so.

10:41 - That's true, they've got that little moss thing if it's just right, actually, it could totally do that.

10:46 You're sure?

10:46 - Yeah, okay.

10:47 - So one thing that surprised me, Ryan, that seems to keep coming up and up, and both of us are talking about it next.

10:52 Like, I feel like we've aligned perfectly so far.

10:56 - Oh my gosh, we are.

10:57 - Dude, we're both talking about virtual environments.

10:59 You go first.

11:00 - Okay, so there's a couple of things that we, in episode 184, we discussed virtualenv, N-V-E-N-V, and actually I learned quite a bit to find out that virtualenv is still pretty cool and fast.

11:13 But that was in 184, but we had people get ahold of us and say, "Hey, there's more information that you guys don't know.

11:20 And I love that.

11:21 Please keep it coming.

11:22 If we do half of the story, give us the rest of it.

11:25 In Python 3.9, so VENV, the built-in one, it has a cool new flag called "upgradeDepth" for upgrading your dependencies.

11:34 It's like not all of your dependencies, but it's for virtual environments.

11:38 Every time you create one, we commented that you have to upgrade PIP.

11:42 And this new flag allows it so when you install, create a new virtual environment, It automatically upgrades set of tools and pip for you.

11:52 Yeah, that's just nice.

11:53 That's in Python 3.9.

11:55 I tried it out already.

11:57 I tried it on Beta 1.

11:58 Beta 3 is already out.

12:00 So you can try it out if you want.

12:02 The other news is the Virtual Env is getting something new.

12:07 And it's not there yet.

12:08 I'm not sure when it's coming.

12:10 But I think it's soon.

12:11 It's getting a feature called periodic update, which is super cool.

12:15 So one of the things, so Virtualenv, since it's separate from your Python, or you can have it install, make virtual environments for multiple Pythons, for instance, but it also keeps its own cache of new pip, new setup tools, and new wheel, that package you need if you're creating wheels.

12:33 And so it has those upgraded already, but the periodic update, it will just have this extra thing that in the background goes off and checks to see if there's new ones around.

12:44 So whenever you actually need to create a new virtual environment, it'll automatically have an updated one that it can install right away, which is neat.

12:53 Yeah, that's pretty cool.

12:54 Nice.

12:55 And if you don't want it to go off and do the background, you can manually say, okay, right now I want you to go off and upgrade it right now.

13:02 So, okay, that's a cool idea.

13:03 I like it.

13:04 So you've got a better chance of having updated stuff.

13:07 If like you're working without an internet connection at the moment or something.

13:10 It already had a kept it its own version of it that would upgrade it.

13:15 So you already can.

13:17 It's newer than if you're using VNV.

13:19 But I'm excited about it.

13:21 And one of the other things I wanted to mention is I kind of complained about that the the prompt is different.

13:28 And I got a little bit of the skinny about why the prompt is different in virtual M versus VNV.

13:34 and it had to do with the prompt formatting on different operating systems was different, which is weird, but they coalesced it and made it a single prompt.

13:45 And the need for, like sometimes you actually want to not have a space, you might not want to have those parentheses.

13:52 So there may be reasons to not have the parentheses in space.

13:56 So there's reasoning behind it.

13:58 It just still annoys me, but that's okay.

14:01 (laughing)

14:02 - It's cool to actually know why though.

14:03 That's really nice.

14:04 So all these things that make working with virtual environments better are great.

14:09 But how about we just don't have virtual environments, but we still do?

14:12 Wouldn't that be better?

14:13 I don't know.

14:14 So let me tell you what I'm thinking.

14:17 So a while ago, for the 3.8 timeframe, there was a proposal called PEP 582.

14:25 And PEP 582 is put together by a bunch of folks, Steve Dower, and four or five other people, I'm forgetting, Donald Stuff.

14:33 stuff, then I know there's two other folks that I'm forgetting.

14:36 Sorry about that.

14:37 But anyway, it was put together.

14:38 And the idea is that it proposes to add a mechanism to automatically recognize a Dunder Pi packages and prefer importing packages installed in there over global packages.

14:50 So the idea is you just go to your project and say at the top of your project, go, here's the top of my project.

14:57 And then when you pip install stuff, it will put things there.

15:01 You won't have to activate a virtual environment because you're not changing anything outside the global system.

15:08 It's just going to drop it in right there.

15:10 Basically this is how Node.js works.

15:12 So if I'm by npm install a thing, it just traverses up the directory until it finds a node modules.

15:20 And it's kind of like that.

15:21 So it says, if you have this folder here, we're going to automatically install stuff there and then Python will automatically know to look there.

15:28 So if you're anywhere in a subfolder without even activating the virtual environment and you type Python something to run a command, as long as you're in the folder structure, it's going to use that environment.

15:39 Oh, that's pretty cool.

15:40 Yeah, that's pretty cool, right?

15:41 So the motivation at least is it's like every time someone's new to Python, they're like, well, I can't install this thing.

15:48 It says access denied.

15:49 You're like, you know, permission denied.

15:50 Like, well, okay, let me talk to you about virtual environments and why you need them.

15:54 And also to activate the environment on the different shells and the different platforms like Windows versus POSIX, you know, source versus not source and bin versus scripts is different.

16:06 And so that's kind of a pain.

16:07 So the idea also, every time you open up a new terminal or command prompt, you've got to reactivate it.

16:14 Like I've all, all for all of these things, I have aliases that make this happen.

16:18 Right.

16:19 So the idea here is that you don't have to worry about any of that stuff.

16:22 You just have to like init your Python project somehow.

16:25 It doesn't, I don't remember seeing how that was supposed to happen.

16:28 But once that PyPackages folder is there, it's like, well, that's the top of the project and we're going to install there.

16:34 And you presumably could have like a fallback one at the top of your user profile or something along those lines.

16:39 - Yeah, you have that.

16:41 So that's for the packages.

16:42 But what about in virtual environments, you also have local scripts that come along, entry points.

16:48 - Yeah.

16:48 - Do you know if it deals with that?

16:49 - I don't know.

16:51 I don't know about it.

16:51 Well, I didn't read like every word of it.

16:53 So it's in draft mode, but I was a little confused because it says its version is Python three, eight.

16:59 I'm like, well, three, it's shipped.

17:00 It should either be closed or or published.

17:04 That seems weird.

17:05 So I sent a message to Steve Dower just a moment ago on Twitter.

17:08 And he said that, Michelle Doss, one of the folks proposing it, I think the primary guy, still working on it, the text itself hasn't been updated before 3.8's release, which is why the header is still a little bit out of date.

17:22 So it's probably more like a 3.10 thing or something, but it's still pretty cool.

17:28 If you want to try to live in this world and see what it's like, David O'Connor has this thing called PyFlow, and PyFlow basically does this.

17:36 It integrates with PyProject.com.

17:38 Oh man, we lined it up good this week.

17:41 And you go through, instead of saying pip install, you say PyFlow install.

17:45 instead of saying Python run, you say, or Python script, you say PyFlow script, because it has to like, re initialize that every time because it's not actually changing something.

17:55 Anyway, it's interesting, I would like to see something kind of like this. I think it's pretty neat. There's also some interesting possibilities around direnv that I'm looking into just talking to someone, Chris, who has got some cool ways to have direnv automatically activate virtual which would be kind of cool as well.

18:16 So there's a lot of stuff happening here.

18:18 It still kind of blows my mind.

18:19 There's so much action around something that feels like it's just a, I don't know, so plumbing and foundational.

18:26 - Yeah, but like you said, it is plumbing and foundational, but it's also one of those things that's, it's one of those tripping things.

18:33 It's like the loose stone on the sidewalk that trips up all the new people all the time.

18:40 - So far what we've managed to do is we've managed to spray paint a yellow line on both sides of it.

18:45 You know?

18:46 Somebody needs to shave that bad boy down.

18:48 But right now, at least it's got a little marker on it.

18:51 And I just want to say thanks to Louise Erbier on here.

18:56 Sent that over and let me know about this whole project.

18:58 So thank you for that.

18:59 - Yeah, that's nice.

19:00 So speaking of pyprojects.toml, I actually really love, I kind of like this.

19:05 I like awesome lists.

19:06 So awesome lists are a thing.

19:08 We've covered many of them in the past.

19:10 - There's even a Python bytes awesome list.

19:12 - Yeah, this one is awesome, pyproject.toml projects.

19:17 So this is one of the great things about different sorts of source code lists is to go and look at examples.

19:24 So this is a list of other projects that are out there that already use pyproject.toml so you can look to see how other projects are doing it.

19:34 So if you want to figure out for your own project, this is helpful.

19:38 For instance, a lot of the testing and formatting stuff came along early.

19:41 So covers.py is in there, pytest, tox, black, isort.

19:46 I know, knew all of those.

19:48 Ward was a new one to me.

19:49 So ward is apparently a way to test things without like string named test functions instead of function names.

19:57 I haven't really played with it much other than looking at the documentation, but it looks neat.

20:01 But there's a code analysis like pylint and unimport and the really long titled, we make Python style guide which is a linter and other stuff, but it's pretty cool.

20:12 And then it has a couple of links to articles about pyproject.toml.

20:16 And then what I think is also neat is a, a list of projects that are discussing switching to pyproject.toml.

20:25 So you can, that's probably pretty interesting if you're deciding, if you're trying to decide yourself, right?

20:30 Yeah.

20:30 To figure out what sort of discussions are going on in other projects as to why to switch and why not.

20:35 So yeah, for sure.

20:37 Pretty cool.

20:38 Very cool.

20:38 Yeah.

20:39 I think people switch, I'm using it everywhere because it's just, it's sort of easier.

20:44 What confused me for a little while was that it isn't, I thought it was something you needed Flit or Poetry to be using, but you can use byproject.toml with setup tools projects also.

20:56 - Okay, interesting.

20:57 Yeah, I didn't know about that.

20:58 Yeah, I kind of thought it was tied to some of these higher order management, things like Poetry and Flit and so on.

21:04 - Yeah.

21:05 - Cool, cool, cool.

21:06 - And like you said, there's a Python bytes awesome list.

21:08 if people like awesome lists.

21:10 - Sorry, I put that at the end there.

21:12 People can check that out.

21:12 Thanks, Jack, for doing that.

21:14 - Yeah, so that's our six items.

21:16 Michael, anything extra to share with us?

21:18 - I got something for everyone.

21:19 I got two things, actually.

21:20 One follow-up and one new thing.

21:23 First of all, we had Calvin on a while ago, a couple shows ago, was that last show?

21:29 Show before, I think a couple shows ago, and we were talking about secrets, and he also, he's in your camp.

21:35 He doesn't put them in the notebook or in the right there in the source code, he's doing something else.

21:39 But what he talked about is actually using one password as like a vault, right?

21:45 So one password has awesome encryption and security.

21:49 And so a lot of the challenges revolve around, well, if I'm gonna put them somewhere else, if I just put them straight in the virtual into an environment variables, well, people can grab them there.

21:58 So maybe I wanna put them some other place where it's like encrypted or something, right?

22:03 So he talked about his mechanism of finding all those environment variables at launch and then like just as you run your virtual environment, injecting them there, but storing them in one password instead of just on the file system or something like that.

22:16 So he did a blog post about how he's doing that.

22:19 And so I'm gonna just link back to that.

22:21 - Yeah, nice.

22:21 - That looks pretty cool.

22:22 And also I wanna give a shout out to Talk Python, specifically the last episode, at least the time of recording, it'll probably not be by the time we publish this, but nonetheless, just recently, You were a guest on Talk Python, where we talked about 15 awesome pytest plugins, mostly a few extensions like using with or alongside, but mostly pytest plugins, and went through things like pytest Sugar and Freeze Gun and all sorts of fun stuff.

22:48 So people can't get enough of us.

22:50 They can hear you being a guest over there talking about pytest the entire time.

22:54 - Yeah, it's nice.

22:55 - Yeah, that was fun.

22:56 Thanks for coming on there.

22:57 - I like to hear myself talk so much that I also, we cross-posted that on a testing code as well.

23:03 - Yep, sounds good.

23:04 - And one of the things, so as an extra bit, did you know that I wrote a book?

23:08 - Yes, yeah, I've heard of that.

23:10 No, it's a great book, I have it.

23:11 - I published through Pragmatic Publishers and I just wanted to bring up that Pragmatic has a shiny new website.

23:18 So the Pragmatic site is a little different and there's an FAQ up there if people wanna know why or what's different about it.

23:25 And for the most part, it looks a lot the same to me but the entire backend is different and it's a little faster. - Yeah, yeah, cool.

23:32 Faster is always nice.

23:33 Makes it happy.

23:34 Nice to work with.

23:35 All right, I have a joke.

23:36 Let's pretend we're roommates.

23:37 You can be the first person and I'll be the second person, okay?

23:40 - Okay.

23:42 Okay.

23:43 Stop by the store on the way home from work.

23:48 Please stop at the market and buy one bottle of milk.

23:52 If they have eggs, bring six.

23:53 - I came back with six bottles of milk.

23:56 - Why the hell did you buy six bottles of milk?

23:59 I just said, it's just the two of us.

24:01 - What do you think, man?

24:02 because they had eggs.

24:03 (laughing)

24:05 Obviously, taking this programming logic a little strong, right?

24:08 Stop by the store, if they have eggs, and get a bottle of milk, if they have eggs, get six.

24:13 Cool.

24:14 - That's funny. - Yeah, pretty good one.

24:16 - Takes a little bit of thinking, so glad we have it written down for people.

24:20 - Yeah, yeah, we can go back and study it, right?

24:21 - Yeah. - All right, well, thanks a bunch, huh? - Cool.

24:24 All right, thank you. - Yep, bye.

24:26 - Thank you for listening to Python Bytes.

24:28 Follow the show on Twitter @pythonbytes.

24:30 That's Python Bytes, as in B-Y-T-E-S.

24:33 And get the full show notes at pythonbytes.fm.

24:36 If you have a news item you want featured, just visit pythonbytes.fm and send it our way.

24:41 We're always on the lookout for sharing something cool.

24:43 This is Brian Okken, and on behalf of myself and Michael Kennedy, thank you for listening and sharing this podcast with your friends and colleagues.

Back to show page