WEBVTT

00:00:00.001 --> 00:00:04.000
Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to

00:00:04.000 --> 00:00:09.940
your earbuds. This is episode 238, recorded June 15th, 2021. I'm Michael Kennedy.

00:00:09.940 --> 00:00:11.060
And I'm Brian Okken.

00:00:11.060 --> 00:00:12.160
And I'm Julia Signell.

00:00:12.160 --> 00:00:14.460
Hey, Julia. Thanks for coming on the show.

00:00:14.460 --> 00:00:15.920
Yeah, thanks for having me.

00:00:15.920 --> 00:00:19.420
Yeah, it's great. Why don't you tell folks a bit about yourself?

00:00:19.420 --> 00:00:25.680
Yeah, so I'm the head of open source at Saturn Cloud and a maintainer of Dask. So I split my time

00:00:25.680 --> 00:00:31.300
half and half. I spend half my time just doing regular like maintenance stuff on Dask. And then

00:00:31.300 --> 00:00:37.240
half my time doing like engineering and product management on Saturn Cloud. Saturn Cloud is a

00:00:37.240 --> 00:00:42.620
data science platform that really specializes in distributed Dask clusters and Jupyter and making

00:00:42.620 --> 00:00:46.800
it really easy for people to get up and going with those things on AWS.

00:00:46.800 --> 00:00:52.280
Yeah, Dask is really interesting. You know, when I first heard about it, I thought, okay,

00:00:52.280 --> 00:00:57.080
this is a like a grid computing scale out thing, which I probably don't have a lot of use for.

00:00:57.080 --> 00:01:01.520
But then I was speaking with Matthew Rocklin about it. And it has a lot of applicability,

00:01:01.520 --> 00:01:07.700
even if you have not huge data, huge clusters, right? Like you can say even on your local machine

00:01:07.700 --> 00:01:12.520
scale this out across my cores or, you know, allow me to work with more data than will fit in RAM

00:01:12.520 --> 00:01:15.040
on my laptop and stuff like that. Right. It's cool.

00:01:15.040 --> 00:01:19.540
Yeah. Yeah. It has like a whole different, a whole number of different ways of interacting with it.

00:01:19.540 --> 00:01:25.040
Right. Like there's that there's like, just make this thing go faster by parallelizing it. There's all the data

00:01:25.040 --> 00:01:30.540
framey stuff. There's all the array stuff for more dimensional data. So it's got a, it's got a large API.

00:01:30.540 --> 00:01:36.700
Yeah. Cool. And we're going to touch on a couple of topics that are not all that unrelated to, to those things

00:01:36.700 --> 00:01:41.460
here. And so, yeah. Speaking of data science, Brian, you want to kick us off?

00:01:41.460 --> 00:01:48.320
Sure. Yeah. The first thing I want to cover is an article called Practical SQL for Data Analysis.

00:01:48.320 --> 00:01:57.580
This is by Haki Benita. So I, one of the things I like, liked about this is it was, it's kind of talking

00:01:57.580 --> 00:02:03.960
about the first bit of the article was talking about basically that with, with data science,

00:02:03.960 --> 00:02:08.960
you've got pandas and, and NumPy and stuff. And you also have often you're dealing with a database.

00:02:08.960 --> 00:02:17.520
So the, and SQL on the backend. So there's the first part of the article talks about how some

00:02:17.520 --> 00:02:26.860
things you can do both in pandas and in, in, in SQL, like SQL queries it's faster in SQL. So there's a big

00:02:26.860 --> 00:02:33.520
chunk that's just talking about how that's faster. But then, but then, you know, he also talks about just

00:02:33.520 --> 00:02:38.700
basically there's, there's a lot of benefits to the flexibility and the comfortableness you can have with

00:02:38.700 --> 00:02:44.440
pandas though. So trade offs as to where, you know, you can, where you're going to push the, push it too far

00:02:44.440 --> 00:02:50.340
into SQL or, or have a nice split is good. But then he goes through and talks about a whole bunch of

00:02:50.340 --> 00:02:57.060
great examples of different things like pivot tables and roll-ups and, and choices and different things

00:02:57.060 --> 00:03:02.500
you can do with either pandas or SQL. And really what his recommendations are for whether it should be

00:03:02.500 --> 00:03:08.400
in pandas or in, in SQL query, and then how to do those queries, because I mean,

00:03:08.400 --> 00:03:15.120
really the gist of the article and, and this problem space is people are comfortable with pandas,

00:03:15.120 --> 00:03:21.560
but they don't really understand SQL queries. So this sort of a good cheat sheet for, for, for how to do

00:03:21.560 --> 00:03:24.880
the queries is a, is a, I think really kind of a cool thing. So.

00:03:24.880 --> 00:03:30.380
Yeah. I think it's really neat. And you, you have these problems, you know how to solve them in one

00:03:30.380 --> 00:03:35.900
or the other. And I think this compare and contrast is really valuable, right? Like I know how to take

00:03:35.900 --> 00:03:40.680
the mean of some column in SQL, but I haven't done it in pandas yet. Let's go see how to do that. Or

00:03:40.680 --> 00:03:45.620
I'm really good at doing pivot tables in pandas, but boy, always kind of avoided joins in SQL. They

00:03:45.620 --> 00:03:49.640
scared me. And then how does that even translate? Right. I think that back and forth is really valuable.
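
NOTE
A minimal sketch of the compare-and-contrast idea, not taken from the article: the mean of one column computed once in SQL and once in plain Python.
The table name sales, the column price, and the sample values are invented for illustration; sqlite3 stands in for whatever database you actually use, and the pandas spelling appears only as a comment.
```python
import sqlite3
from statistics import fmean
# Hypothetical sample data; the table and column names are illustrative only.
prices = [10.0, 20.0, 60.0]
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE sales (price REAL)")
connection.executemany("INSERT INTO sales (price) VALUES (?)", [(p,) for p in prices])
# SQL side: the database computes the aggregate and returns a single row.
sql_mean = connection.execute("SELECT AVG(price) FROM sales").fetchone()[0]
# Python side: pull the raw values out and aggregate them in memory.
# With pandas this step would typically be df["price"].mean().
rows = connection.execute("SELECT price FROM sales").fetchall()
python_mean = fmean(price for (price,) in rows)
connection.close()
print(sql_mean, python_mean)
```
Both paths give the same number; the difference is where the work happens and how many rows have to leave the database.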

00:03:49.780 --> 00:03:53.740
Yeah. Yep. And I, then it covers, covers things that I don't even know what they are,

00:03:53.740 --> 00:03:58.680
like aggregate expressions. I don't even know what that is, but apparently that's a, that's a thing

00:03:58.680 --> 00:04:03.660
that people do. So I can help you out at aggregate stuff. No, just kidding. Julia, what do you think

00:04:03.660 --> 00:04:08.740
of this? Yeah, no, it seems, it's really cool. Like, I agree that like the, having the, the,

00:04:08.740 --> 00:04:13.860
having it in pandas and then in SQL, that like comparison is super helpful. Like SQL is always super

00:04:13.860 --> 00:04:18.640
scary to me. And I always end up like Googling a bunch of stuff whenever I have to mangle my SQL.

00:04:18.640 --> 00:04:23.100
but I know it's so fast, so it's cool to see a way to access that.

00:04:23.100 --> 00:04:26.400
Yeah, absolutely. This is a good one, Brian. I think a lot of people will find it useful.

00:04:26.400 --> 00:04:31.580
I also want to just give a quick shout out to the past a little bit, not too long ago. We caught,

00:04:31.580 --> 00:04:37.320
we talked about efficient SQL on pandas with DuckDB, where you actually do the SQL queries against

00:04:37.320 --> 00:04:43.280
pandas data frames. So if you're finding that you're trying to do something and maybe it would be

00:04:43.280 --> 00:04:47.920
better in SQL, but you don't want to say completely switch all your, your data over to a relational

00:04:47.920 --> 00:04:52.240
database, you just kind of want to stay on the pandas side, but it's that one or two things, like,

00:04:52.240 --> 00:05:00.640
this is really cool. This sort of upgrade your data frame to execute SQL with the DuckDB query

00:05:00.640 --> 00:05:06.680
optimizer is also a kind of a nice intermediary there. Yeah. Dask also does some, I'm going to try not

00:05:06.680 --> 00:05:11.440
to make everything about Dask, but Dask does some things that are kind of, that kind of

00:05:11.440 --> 00:05:16.760
take some of the ideas in the, from this article of like doing predicate pushdown of like, of

00:05:16.760 --> 00:05:22.660
pushing down some of the like filters into the read because it, because it evaluates lazily. It doesn't

00:05:22.660 --> 00:05:27.280
have to like grab all the data greedily upfront. It can like do that later. so you can get some

00:05:27.280 --> 00:05:31.420
of the benefits. That's cool. And it can also distribute the filter bit, I guess at that point.

00:05:31.420 --> 00:05:37.840
Yeah. Nice. All right. I want to talk about the usual suspects. So, okay. That was, that was a

00:05:37.840 --> 00:05:41.860
pretty good show. Was that Quentin Tarantino or something like that? It was not actually about

00:05:41.860 --> 00:05:47.300
this. This comes to us from Ruslan Portnoy. And thank you for sending this in. He mentioned

00:05:47.300 --> 00:05:55.060
an article that has this really interesting idea. How do you apply git blame when you encounter a

00:05:55.060 --> 00:06:01.080
Python traceback? So here's the scenario, your code crashes and you either print out the traceback or

00:06:01.080 --> 00:06:06.720
Python does it for you because it's just crashed. And normally it says, here's the value. Here's the

00:06:06.720 --> 00:06:12.100
line of code. Here's the file. It's in here's the next line in the call stack. Here's a line of

00:06:12.100 --> 00:06:18.980
code. It's in. The idea is you can take git blame, which is a command that says, show me who changed

00:06:18.980 --> 00:06:24.180
this line of code or who wrote this line of code, at least touched it last on every single line of

00:06:24.180 --> 00:06:28.360
code. And I love this whole idea of like, all right, who did this? And sometimes I'll come across code.

00:06:28.360 --> 00:06:33.820
I'm like, this is so crappy. Like who did this? Oh wait, that's me. Okay. Well, at least I know

00:06:33.820 --> 00:06:38.820
what I would feel about it. But the idea is what if your traceback on each line where it had an

00:06:38.820 --> 00:06:45.260
exception could also show who wrote that line of code. Cool. Huh? Yeah. So let's check it out.

00:06:45.260 --> 00:06:51.800
It's pretty straightforward. This is an article by Ofer Koren, and it basically uses two libraries

00:06:51.800 --> 00:06:57.380
that are themselves both pretty straightforward. So like here's a straightforward example of a traceback,

00:06:57.380 --> 00:07:02.960
like trying to pop something off of an empty list says on this line in the function pop some,

00:07:02.960 --> 00:07:06.940
you know, there's this line here in the call stack. And then the next line, this line,

00:07:06.980 --> 00:07:12.240
the call stack and eventually raise a ValueError, you know, empty range, can't pop nothing off,

00:07:12.240 --> 00:07:16.400
you know, something off of nothing basically. But this doesn't show you any information about like

00:07:16.400 --> 00:07:22.040
maybe who wrote that line and who wrote this other line up here. Right. So, what they did is they

00:07:22.040 --> 00:07:28.660
took a couple of modules, traceback and then linecache. And it turns out when traceback shows you this

00:07:28.660 --> 00:07:34.540
traceback, that line, it uses linecache to figure out, okay, from this actual, I'm guessing,

00:07:34.940 --> 00:07:41.480
bytecode that it's going to run this, CPython interpreter code. Where did it like,

00:07:41.480 --> 00:07:48.300
what line of file did this actually come from? Right. So here's the insight or the thing

00:07:48.300 --> 00:07:54.580
you can actually change what's in the cache. And because it's a cache, once it's figured out what

00:07:54.580 --> 00:08:00.260
the lines are, it's not going to read it again. So it's like, like a list for each line that you

00:08:00.260 --> 00:08:05.320
get back and you can just change the value. So it said, okay, well, here's like return random.

00:08:05.320 --> 00:08:08.740
That's what the line of text was. They're like, no, no, no, there's nothing to see here. Move along.

00:08:08.740 --> 00:08:13.480
If you make that and then you cause it to crash again, what comes out is, if you go a little

00:08:13.480 --> 00:08:17.800
bit further down, normal code, normal code, or normal traceback, normal traceback, then it just,

00:08:17.800 --> 00:08:20.960
instead of the line of code, it says nothing to see here. Please move along.
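
NOTE
A minimal sketch of the cache-editing trick described above, not the article's own code.
It assumes CPython, where linecache.cache maps a filename to a (size, mtime, lines, fullname) tuple; that dict is an implementation detail rather than a documented API.
The two-line module, the replacement text, and the helper name traceback_line_after_patch are invented for illustration, and this version pops from an empty list, which raises IndexError.
```python
import linecache
import os
import tempfile
import traceback
# Hypothetical two-line module; the function name follows the example in the discussion.
SOURCE = "def pop_some(items):\n    return items.pop()\n"
REPLACEMENT = "    nothing to see here, please move along\n"
def traceback_line_after_patch():
    """Return the source text the traceback machinery reports for the failing line."""
    # Write the module to disk so linecache has a real file to read.
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as handle:
        handle.write(SOURCE)
        path = handle.name
    try:
        namespace = {}
        exec(compile(SOURCE, path, "exec"), namespace)
        # Prime the cache, then overwrite line 2 of the cached list in place.
        # The file on disk is untouched, so its size and mtime still match
        # and linecache keeps serving the edited text.
        linecache.getlines(path)
        cached_lines = linecache.cache[path][2]
        cached_lines[1] = REPLACEMENT
        try:
            namespace["pop_some"]([])
        except IndexError as error:
            frames = traceback.extract_tb(error.__traceback__)
            return frames[-1].line
    finally:
        os.unlink(path)
        linecache.clearcache()
reported = traceback_line_after_patch()
print(reported)
```
The last frame now reports the replacement text instead of the real source line, which is the hook the blame annotation builds on.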

00:08:21.880 --> 00:08:25.740
All right. So what are you going to do with that? Now that you realize like you can actually

00:08:25.740 --> 00:08:33.200
change what appears in the traceback. So you write a little regular expression to go and execute,

00:08:33.200 --> 00:08:39.580
git blame on the various files, and then to re-inject that back into linecache. And so what they do is

00:08:39.580 --> 00:08:45.060
they just put, if they know the blame, they just put, you know, like 80 lines, 80 characters up to 80

00:08:45.060 --> 00:08:50.180
characters of the line and then edited on such and such date by such and such person.

00:08:50.180 --> 00:08:56.600
And here's the commit message. Right. And so just basically shelling out to git blame when it

00:08:56.600 --> 00:09:02.000
crashes. Now you get some really cool stuff. Like on this line, it says this is edited by, you know,

00:09:02.000 --> 00:09:07.940
many, many days ago by so-and-so in this git commit and so on. And what's interesting,

00:09:07.940 --> 00:09:13.220
like this is already in itself useful, I think, but what's more interesting is other tools use this as

00:09:13.220 --> 00:09:19.180
well. So for example, if you use PuDB, which is a sort of visual debugger, kind of, it's like a

00:09:19.180 --> 00:09:24.320
command line one that I know visual in the sense of like Emacs is visual, not like PyCharm is visual,

00:09:24.320 --> 00:09:29.960
but it will actually pull up that data. So you can see they, they jumped into the PuDB debugger and it's

00:09:29.960 --> 00:09:35.220
actually showing all of this git blame attribution as well that they've added. So yeah, pretty interesting.
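
NOTE
A rough sketch of the shelling-out step, under stated assumptions: it uses the porcelain output of git blame rather than the regular expression the article is described as using.
The helper names run_blame and summarize_blame are hypothetical, and the sample text is hand-written in the porcelain layout with an invented commit, author, and message.
Only summarize_blame is exercised here, because run_blame needs git and a real repository.
```python
import subprocess
from datetime import datetime, timezone
def run_blame(path, lineno):
    """Ask git who last touched one line; requires git and a repository."""
    command = ["git", "blame", "-L", f"{lineno},{lineno}", "--line-porcelain", "--", path]
    return subprocess.run(command, capture_output=True, text=True, check=True).stdout
def summarize_blame(porcelain):
    """Turn porcelain output for a single line into a short annotation."""
    fields = {}
    lines = porcelain.splitlines()
    fields["commit"] = lines[0].split()[0][:8]
    for line in lines[1:]:
        if line.startswith("\t"):
            break  # the tab-prefixed line is the source text itself
        key, _, value = line.partition(" ")
        fields[key] = value
    when = datetime.fromtimestamp(int(fields["author-time"]), tz=timezone.utc)
    return f"{fields['author']} on {when:%Y-%m-%d} in {fields['commit']}: {fields['summary']}"
# Hand-written sample in the porcelain layout; the commit, author, and message are invented.
SAMPLE = (
    "0123456789abcdef0123456789abcdef01234567 2 2 1\n"
    "author Example Author\n"
    "author-mail <author@example.com>\n"
    "author-time 1623715200\n"
    "author-tz +0000\n"
    "committer Example Author\n"
    "committer-mail <author@example.com>\n"
    "committer-time 1623715200\n"
    "committer-tz +0000\n"
    "summary Handle empty lists\n"
    "filename example.py\n"
    "\treturn items.pop()\n"
)
annotation = summarize_blame(SAMPLE)
print(annotation)
```
A string like this, appended to the first 80 characters of the source line, is the kind of text that could be written back into the line cache.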

00:09:35.220 --> 00:09:40.580
What do you all think? Yeah, I think that looks really cool. I mean, I always do git blame whenever I run

00:09:40.580 --> 00:09:43.980
into something that's weird with the hope that someone else will be able to explain it to me.

00:09:43.980 --> 00:09:47.700
Exactly. Who knows about this or who do I talk to about breaking this?

00:09:47.700 --> 00:09:51.820
Right. Yeah. You could even put like PR numbers and stuff in here. Right. And that'd be pretty cool.

00:09:51.820 --> 00:09:54.620
Yeah. Yeah. That'd be super cool.

00:09:54.620 --> 00:09:59.660
Yeah. One of the things I like, I don't really like the name git blame, but it's there.

00:10:00.280 --> 00:10:05.980
But I agree with Julia that the main thing I use it for isn't to try to figure out who broke it,

00:10:05.980 --> 00:10:09.040
but who to ask about this, this chunk of the code.

00:10:09.040 --> 00:10:13.960
Yeah. I agree. Cause usually when you see something that's really confusing and weird, you're like,

00:10:13.960 --> 00:10:19.020
I know they didn't just pick the hard way of doing this because they didn't want to do the easy way.

00:10:19.020 --> 00:10:24.520
There's something that I don't fully understand. Some edge case that's crazy here. I'm going to go talk to

00:10:24.520 --> 00:10:29.360
that person. So yeah. Also the, how long ago it was edited. So if there was something edited

00:10:29.360 --> 00:10:33.800
yesterday, that's probably the problem. Yeah, exactly. Like in this little screenshot here,

00:10:33.800 --> 00:10:41.120
some of these are edited like 1,427 days ago. That's probably not the problem. Maybe, but probably not.

00:10:41.120 --> 00:10:44.760
I feel like I have the opposite assumption. Like if something is from six years ago and it's weird,

00:10:44.760 --> 00:10:48.760
I'm like, well, probably things were different back then. And like, you know.

00:10:48.760 --> 00:10:53.480
Yeah. Yeah. Yeah. It's no, no longer applicable to the new data, new situation. Yeah.

00:10:53.940 --> 00:10:58.240
Oh, that'd be an interesting thing also is to have like a tool that would tell you if something's like

00:10:58.240 --> 00:11:02.820
over a thousand days old or something like that, you probably should go refactor it to make sure

00:11:02.820 --> 00:11:08.560
somebody understands that code. Yeah. Yeah. For sure. All right. Jumping back to the first item

00:11:08.560 --> 00:11:13.160
really quick in the live stream, Alexander out there. Hey, Alexander says, I wonder if graph databases

00:11:13.160 --> 00:11:18.500
with Gremlin queries could be more suitable for data science. You know, SQL joins are way harder.

00:11:18.500 --> 00:11:22.420
Yeah. Graph databases are pretty interesting. If you're trying to understand the relationships,

00:11:22.420 --> 00:11:25.460
that may well be better. I don't know. Julia, do you got any thoughts on this?

00:11:25.460 --> 00:11:30.000
I don't know anything about graph databases. So out of my league.

00:11:30.000 --> 00:11:36.400
I didn't have a desire to understand graph databases until I found out that there were Gremlin queries.

00:11:36.400 --> 00:11:38.160
Now I think I want them.

00:11:38.160 --> 00:11:44.280
Well, Brian, they don't start out as Gremlin queries. They're Mogwai inserts. And then if you

00:11:44.280 --> 00:11:48.800
insert them after midnight, then they become a Gremlin query. I mean, come on, we all know how it goes.

00:11:49.080 --> 00:11:53.920
You definitely don't want to get them wet. Oh, that's an old show. I'm not sure if everyone's

00:11:53.920 --> 00:11:57.860
going to get that reference, but yeah, that was, I love that show. Okay. Anyway, let's,

00:11:57.860 --> 00:12:01.080
let's move on to the next one. The next one is you, Julia.

00:12:01.080 --> 00:12:08.640
Yeah. So I wanted to highlight fsspec. So file system spec for people who can't hear letters very well.

00:12:08.640 --> 00:12:19.100
So this is the basis for s3fs. FS, I'm not getting the letters right, but there's, there's one for GCP.

00:12:19.240 --> 00:12:27.560
There's one for S3 and basically it's a file system storage interface or like the basis for a file system.

00:12:28.020 --> 00:12:41.020
And so you can do things like you can open just files as you can just take a path and open it as a, as a, as a file object in Python and read it with all the normal, like read, write operations.

00:12:41.020 --> 00:12:42.680
Oh, interesting.

00:12:43.020 --> 00:13:02.580
But from anywhere. So like there's all these different ones for S3, for GCS, and for, for like, even for like HTTP and just basically anything you've, you can imagine anywhere you can imagine a file being either there's already been one of these written.

00:13:04.020 --> 00:13:11.340
It's kind of like a, it's an interface and then you write different packages on top of it that are like drivers or some, they have some name for it.

00:13:11.340 --> 00:13:18.120
And it allows you to treat the file system as like this interchangeable building block.

00:13:18.120 --> 00:13:26.380
So you don't get, you don't end up writing like boto3 code or something that's like very specific to a specific cloud storage.

00:13:26.580 --> 00:13:38.240
You write like this more general code and then it's really useful for like a lot of free datasets that are hosted on different clouds, but like they'll sometimes be on one cloud and sometimes be on another, but like basically it's the same data.

00:13:38.240 --> 00:13:45.000
Or if you're at a company and you want to like switch clouds, it just makes that whole thing so much easier.

00:13:45.000 --> 00:13:50.300
It looks really, really useful, especially for avoiding cloud lock-in.

00:13:50.300 --> 00:13:51.620
Yeah. Yeah.

00:13:51.760 --> 00:13:54.680
And you can always write, like you can always write your own one.

00:13:54.680 --> 00:13:57.900
If something else pops up, you can write your own implementation of that.

00:13:57.900 --> 00:13:58.360
Right.

00:13:58.360 --> 00:14:08.080
So there's an example here talking about using a file system in the docs that says something to the effect of, well, you want to open up a CSV and feed it off to pandas read_csv.

00:14:08.080 --> 00:14:14.500
So normally you would say open CSV file, and then you just say pandas read_csv and give it the file stream.

00:14:14.500 --> 00:14:16.000
But what if that's on the internet?

00:14:16.000 --> 00:14:18.260
What if that's on S3 with authentication?

00:14:18.260 --> 00:14:18.900
What's that?

00:14:18.900 --> 00:14:20.700
What if that's somewhere else?

00:14:20.700 --> 00:14:20.960
Right.

00:14:21.020 --> 00:14:25.080
And so with this one, you can just say fsspec, file system spec, open.

00:14:25.080 --> 00:14:26.260
Here's a URL.

00:14:26.260 --> 00:14:28.120
And now that's a stream, right?

00:14:28.120 --> 00:14:32.080
Or that could be, here's an S3 location, S3 bucket.

00:14:32.080 --> 00:14:32.840
Go get that, right?

00:14:32.840 --> 00:14:33.760
Yeah.

00:14:33.760 --> 00:14:34.120
Yeah.

00:14:34.120 --> 00:14:39.880
So instead of passing the path directly into the read function, you pass in the file object.
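
NOTE
A minimal standard-library sketch of the pattern just described: the reading logic takes a file object, so it does not care where the bytes come from.
fsspec is deliberately not imported so the sketch runs anywhere; it appears only in the closing comment, and the bucket name there is a placeholder. The function name count_rows and the sample CSV are invented for illustration.
```python
import csv
import io
import os
import tempfile
def count_rows(file_object):
    """Count CSV data rows from any text-mode, file-like object."""
    reader = csv.reader(file_object)
    next(reader, None)  # skip the header row
    return sum(1 for _ in reader)
CSV_TEXT = "name,value\na,1\nb,2\n"
# Source 1: an in-memory buffer standing in for a remote object.
memory_count = count_rows(io.StringIO(CSV_TEXT))
# Source 2: a real file on the local disk.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False, newline="") as handle:
    handle.write(CSV_TEXT)
    path = handle.name
try:
    with open(path, newline="") as local_file:
        local_count = count_rows(local_file)
finally:
    os.unlink(path)
# With fsspec installed, the same function could take a remote file, for example:
#     with fsspec.open("s3://some-bucket/data.csv", "rt") as remote_file:
#         count_rows(remote_file)
# The bucket name above is a placeholder.
print(memory_count, local_count)
```
Because count_rows never sees a path, swapping local disk for cloud storage changes only the line that opens the file.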

00:14:39.880 --> 00:14:43.220
And it's really powerful.

00:14:43.220 --> 00:14:45.860
It seems like a thing that we shouldn't need.

00:14:45.860 --> 00:14:50.680
But files get like the file locations can get so crazy so quickly.

00:14:50.680 --> 00:14:57.980
And this just really helps simplify and like, make it so you don't have to think about this stuff, which I think is what most people want.

00:14:57.980 --> 00:14:58.720
It's what I want.

00:14:58.720 --> 00:15:00.080
Yeah, for sure.

00:15:00.200 --> 00:15:02.660
So like there's a local file system option.

00:15:02.660 --> 00:15:05.400
But then you could also have an FTP file system.

00:15:05.400 --> 00:15:07.420
Or you could have something else, right?

00:15:07.420 --> 00:15:08.460
All sorts of different options.

00:15:08.460 --> 00:15:09.020
Yeah.

00:15:09.020 --> 00:15:09.360
Yeah.

00:15:09.360 --> 00:15:10.400
All sorts of stuff.

00:15:10.400 --> 00:15:10.940
Yeah.

00:15:11.360 --> 00:15:11.660
Okay.

00:15:11.660 --> 00:15:12.140
That's cool.

00:15:12.140 --> 00:15:13.360
Brian, what do you think?

00:15:13.360 --> 00:15:15.220
Does it have any applicability for you?

00:15:15.220 --> 00:15:15.980
Oh, yeah.

00:15:15.980 --> 00:15:16.440
Definitely.

00:15:16.440 --> 00:15:24.760
And that's a great abstraction layer to put in place to just have reading as if it was a file and have it moved.

00:15:24.760 --> 00:15:30.580
It also helps you develop tools locally and then be able to deploy them into a larger space.

00:15:30.580 --> 00:15:31.540
So it's cool.

00:15:31.540 --> 00:15:32.380
Yeah, for sure.

00:15:32.380 --> 00:15:38.080
One of the things that always makes me a little hesitant when I hear people say things like, we're cloud native.

00:15:38.080 --> 00:15:40.000
Like my app is cloud native.

00:15:40.340 --> 00:15:41.800
That's always code word for me.

00:15:41.800 --> 00:15:44.860
Like I will never be able to run my app unless I'm connected to the internet.

00:15:44.860 --> 00:15:47.540
You know, it's like it depends on all these services together.

00:15:47.540 --> 00:15:49.880
And there's no way I can recreate that locally.

00:15:49.880 --> 00:15:54.480
But something like this could allow you to say, well, we're going to have a local file system version.

00:15:54.480 --> 00:15:57.880
But then when we go to production, we'll switch to S3 or, you know, pick it.

00:15:57.880 --> 00:15:58.520
Pick something.

00:15:58.520 --> 00:16:03.900
I've always wanted to make it either a t-shirt or a sticker or both that says not a cloud native, just visiting.

00:16:03.900 --> 00:16:06.660
Nice.

00:16:06.660 --> 00:16:09.600
I also think, Brian, there might be testing opportunities here.

00:16:09.860 --> 00:16:10.380
Yeah, definitely.

00:16:10.380 --> 00:16:12.400
Give it a test file system.

00:16:12.400 --> 00:16:13.080
That'd be cool.

00:16:13.080 --> 00:16:13.560
Yeah.

00:16:13.560 --> 00:16:19.960
And like Julia said, swapping things out to just have your logic not have to care where it's coming from.

00:16:20.960 --> 00:16:29.260
But I guess you'd have to make sure all of the interfaces, the different storage systems really are equal.

00:16:29.260 --> 00:16:32.040
But I guess you should try that out yourself.

00:16:32.420 --> 00:16:34.440
Yeah, there's like kind of a bucket, right?

00:16:35.220 --> 00:16:38.600
There's kind of like a dict that you can pass, which is like storage options.

00:16:38.600 --> 00:16:44.540
So I think that can, that might get a little wonky depending on what the different backends need.

00:16:44.540 --> 00:16:47.320
But the general principles are the same.

00:16:47.320 --> 00:16:58.160
And it also, I should have said this originally, but it also allows, like fsspec itself can contain logic to do things that are general to all the different libraries, like caching and things like that.

00:16:58.400 --> 00:16:59.320
It's all the different.

00:16:59.320 --> 00:16:59.780
Oh, interesting.

00:16:59.780 --> 00:17:03.600
Like you could put a caching layer on top of arbitrary things like S3.

00:17:03.600 --> 00:17:04.340
Yeah.

00:17:04.520 --> 00:17:07.520
Google storage and Azure buckets or blob storage.

00:17:07.520 --> 00:17:07.900
Yeah.

00:17:07.900 --> 00:17:08.400
Yeah.

00:17:08.400 --> 00:17:12.320
Maybe even save money on bandwidth there if you can do some caching.

00:17:12.320 --> 00:17:12.860
Yeah.

00:17:12.860 --> 00:17:13.540
If you can do it right.

00:17:13.540 --> 00:17:14.420
Yeah.

00:17:14.420 --> 00:17:15.560
Super, super neat.

00:17:15.560 --> 00:17:18.380
Brian, you're going to tell us about how to slim down our Docker containers.

00:17:18.380 --> 00:17:23.320
But before you do, I want to tell people about our sponsor for this episode brought to you by Sentry.

00:17:23.320 --> 00:17:28.460
So how would you like to remove a little stress from your life in addition to just abstracting your file system?

00:17:28.460 --> 00:17:30.140
Maybe tracking down some errors.

00:17:30.220 --> 00:17:35.760
So do you worry that your users may be having difficulties or encountering errors with your app right now?

00:17:35.760 --> 00:17:38.340
And would you even know it until they send that support email?

00:17:38.340 --> 00:17:44.740
How much better would it be if you got the error or performance details sent right away with all the call stack?

00:17:44.740 --> 00:17:46.160
Maybe even git blame in there.

00:17:46.160 --> 00:17:51.700
The local variables, the active user who was logged in while this happened, all that kind of stuff.

00:17:51.700 --> 00:17:55.200
So with Sentry, it's not only possible, it's actually really simple.

00:17:55.380 --> 00:17:57.040
I've used this on Sentry.

00:17:57.040 --> 00:17:59.360
I've used Sentry on our websites before.

00:17:59.360 --> 00:18:02.360
So it's on Python Bytes, Talk Python Training, all those different sites.

00:18:02.360 --> 00:18:07.200
And I've actually had someone encounter an error trying to buy a course over on Talk Python Training.

00:18:07.200 --> 00:18:08.960
I got the Sentry notification.

00:18:08.960 --> 00:18:12.140
I said, oh, geez, I can't believe this problem crept in here.

00:18:12.140 --> 00:18:16.180
And I fixed it really quick and started to roll out the fix and actually got an email.

00:18:16.180 --> 00:18:18.340
They said, hey, we're having this problem buying a course.

00:18:18.340 --> 00:18:18.900
I know.

00:18:18.900 --> 00:18:19.980
I've almost got it fixed.

00:18:19.980 --> 00:18:21.840
Just give me a moment and try again.

00:18:21.840 --> 00:18:23.360
And they were just like, what?

00:18:23.360 --> 00:18:24.420
That doesn't make sense.

00:18:24.500 --> 00:18:25.840
So they were very surprised.

00:18:25.840 --> 00:18:30.220
So surprise and delight your users. Create your Sentry account at pythonbytes.fm slash

00:18:30.220 --> 00:18:30.620
Sentry.

00:18:30.620 --> 00:18:32.900
And when you sign up, there's a little "got a promo code?"

00:18:32.900 --> 00:18:37.780
Make sure that you put Python Bytes, all one word, all caps with a Y in there.

00:18:37.780 --> 00:18:41.200
And you'll get two free months plus a bunch of extra features and so on.

00:18:41.200 --> 00:18:45.780
So also, it really lets them know that you came from us rather than just somewhere else.

00:18:45.780 --> 00:18:46.960
And that helps support the show a lot.

00:18:46.960 --> 00:18:50.340
So pythonbytes.fm/sentry and promo code Python Bytes.

00:18:50.340 --> 00:18:50.900
Awesome.

00:18:50.900 --> 00:18:53.020
Thanks for supporting the show, Sentry.

00:18:53.020 --> 00:18:55.320
And Brian, let's talk Docker.

00:18:55.320 --> 00:18:57.240
Yeah, let's talk Docker.

00:18:57.240 --> 00:19:00.080
I mean, I'm starting to use Docker more and more.

00:19:00.080 --> 00:19:02.120
And I like the experience.

00:19:02.120 --> 00:19:05.780
But I was interested when this article came up.

00:19:05.780 --> 00:19:07.040
So it was in June.

00:19:07.500 --> 00:19:11.400
I saw this article called The Need for Slimmer Containers.

00:19:11.900 --> 00:19:15.120
And this is from somebody, Ivan.

00:19:15.120 --> 00:19:17.220
Ivan, I'm not going to try his last name.

00:19:17.220 --> 00:19:17.860
Ivan something.

00:19:17.860 --> 00:19:20.400
But anyway, it's an interesting discussion.

00:19:20.400 --> 00:19:29.300
And the idea around the original post was that there's now a docker scan that you can use.

00:19:29.300 --> 00:19:34.680
So you can use docker scan to scan for vulnerabilities in your Docker containers.

00:19:35.200 --> 00:19:40.220
And this, Ivan thought, well, I'll look at some of the standard Python containers that are available.

00:19:40.220 --> 00:19:40.820
Right.

00:19:40.820 --> 00:19:49.180
Theoretically, some of the things that are nice is I can just go and say, in my Dockerfile, I can say FROM python:3.9.

00:19:49.340 --> 00:19:51.460
And I don't have to think about how do I install Python?

00:19:51.460 --> 00:19:52.700
How do I keep it up to date?

00:19:52.700 --> 00:19:58.260
You know, make sure that pip is there and that I'll be able to, you know, pip install stuff that needs to do build things.

00:19:58.260 --> 00:19:59.680
All that stuff will be there, right?

00:19:59.680 --> 00:20:02.120
So it seems like, of course, this is what you want.

00:20:02.120 --> 00:20:02.680
Yeah.

00:20:02.680 --> 00:20:06.640
Well, and also, that's kind of one of the neat things about Docker.

00:20:06.640 --> 00:20:08.940
I can just say I have these standard parts.

00:20:08.940 --> 00:20:11.800
Now I just want to put my custom stuff on top of it.

00:20:11.800 --> 00:20:13.900
And it's great.

00:20:13.900 --> 00:20:15.880
So, well, what did he find?

00:20:16.080 --> 00:20:23.980
So he used, so docker scan apparently uses a third party tool called Snyk, S-N-Y-K, Container.

00:20:23.980 --> 00:20:30.600
And we've covered Snyk before, not the container version, but we covered Snyk in episode 227.

00:20:30.600 --> 00:20:35.560
But so it's looking for vulnerabilities, and that's a good thing.

00:20:35.560 --> 00:20:37.420
But he found them in everything.

00:20:37.420 --> 00:20:43.780
And he found them in all of the standard Python ones, except for Alpine, I guess.

00:20:44.860 --> 00:20:47.340
And so he didn't really know what to make of it, really.

00:20:47.340 --> 00:20:53.280
He was just sort of reporting his results that maybe Alpine is the only one with few vulnerabilities.

00:20:53.280 --> 00:20:59.740
But then this went out on Hacker News, and there was a big discussion around it.

00:20:59.740 --> 00:21:05.380
So he updated the article, which I appreciate, with some of the feedback that he got.

00:21:06.180 --> 00:21:11.660
And so some of the feedback was that these vulnerability checkers sometimes give you false positives.

00:21:11.660 --> 00:21:23.920
And I don't really have enough experience to know what that means, but I don't have enough experience to know if these really are false positives or if they're actual vulnerabilities or not.

00:21:24.580 --> 00:21:33.460
The other thing that maybe some people suggested that these standard ones really aren't updated very much.

00:21:33.460 --> 00:21:35.560
So I don't really know much about that either.

00:21:35.560 --> 00:21:39.840
And if they're not, that's kind of a bummer because I think people are relying on them.

00:21:40.260 --> 00:21:45.680
So I actually just kind of am left with a little bit of a confusion as to what to do.

00:21:45.680 --> 00:21:54.280
I want to also mention that the Alpine, in his current one, or his original article, he says Alpine's pretty good for vulnerabilities.

00:21:54.280 --> 00:21:57.660
But then his follow-up says it doesn't really...

00:21:57.660 --> 00:22:01.680
Well, there's a lot of applications that can't run on Alpine because of some issues or another.

00:22:01.680 --> 00:22:03.960
So anyway, I'm not sure what to make of it.

00:22:03.960 --> 00:22:06.220
So I was hoping Michael might give us some insight.

00:22:06.220 --> 00:22:08.580
I did some thinking about this this morning.

00:22:08.580 --> 00:22:14.660
And in fact, I recently spoke a lot about this over on Talk Python.

00:22:14.660 --> 00:22:21.040
So I had Itamar on the show, and we talked about best practices for Docker packaging.

00:22:21.040 --> 00:22:24.660
And we talked a lot about both security and package size.

00:22:25.020 --> 00:22:28.180
So I can try to relay a couple of things from that.

00:22:28.180 --> 00:22:32.160
So we've got our official image over here, our Python official image.

00:22:32.160 --> 00:22:34.020
There's actually a bunch of options.

00:22:34.020 --> 00:22:42.040
As you can see, there's a few, like 3.10 beta 2 Buster or the 3.10 RC Buster.

00:22:42.040 --> 00:22:44.460
That sounds bad, but I think it's actually good.

00:22:44.460 --> 00:22:44.940
No, I'm just kidding.

00:22:44.940 --> 00:22:45.460
I know what it is.

00:22:45.460 --> 00:22:49.000
So these are by default based on Debian.

00:22:49.000 --> 00:22:51.880
And Buster is the latest version of Debian.

00:22:51.880 --> 00:23:02.340
And so you can do a Buster, which is like full Debian with 3.10, or you can do a 3.10 slim-buster, which is like a slimmed down version of Debian Buster that supports Python 3.10.

00:23:02.340 --> 00:23:02.900
Okay.

00:23:02.980 --> 00:23:07.300
So there's a lot going on here in terms of the options.

00:23:07.300 --> 00:23:14.380
One of, so the article talks about how Alpine had the fewest security vulnerabilities.

00:23:14.380 --> 00:23:23.640
And I actually, so python:latest, if you run the Snyk package scanner thingy on it, it says there's 364 vulnerabilities.

00:23:23.640 --> 00:23:32.660
That's if you just do python:latest, which is 3.9, and it's 353 after you run apt update, apt upgrade.

00:23:32.660 --> 00:23:38.040
So if you try to get the container to update itself, there's still 353 in that one.

00:23:38.040 --> 00:23:39.160
I don't use that.

00:23:39.160 --> 00:23:39.880
I use Ubuntu.

00:23:39.880 --> 00:23:45.680
So I use ubuntu:latest, and the bare version of that one had 31 vulnerabilities.

00:23:46.100 --> 00:23:58.100
But then if I either install Python through apt or build it from source and put in the necessary foundational bits, like build-essential and stuff to build Python, it goes up to 35 total problems.

00:23:58.100 --> 00:23:59.900
Where 28 of them are low.

00:23:59.900 --> 00:24:01.600
So seven are medium, nothing major.

00:24:01.820 --> 00:24:13.500
One thing I thought was weird was I actually ran another step where I said, okay, let's uninstall those intermediate tools like gcc and wget and stuff like that, that I needed to get stuff on the machine, but I'm not going to use again.

00:24:13.500 --> 00:24:15.040
And I took them away.

00:24:15.040 --> 00:24:19.300
And almost all those warnings were about those tools that I had apt uninstalled.

00:24:19.300 --> 00:24:21.660
So I don't know why Snyk is still showing them.

00:24:21.660 --> 00:24:26.440
Because if I go into the container and type wget, it says, nope, this thing is not installed.

00:24:26.440 --> 00:24:26.800
Sorry.

00:24:27.220 --> 00:24:31.400
But it still shows the warning that wget has a vulnerability in it, for example.

00:24:31.400 --> 00:24:31.740
Right.

00:24:31.740 --> 00:24:34.620
So there's like, there's like this over-reporting for sure.

00:24:34.620 --> 00:24:38.740
But I mean, the difference between 28 and 350 is not trivial.

00:24:38.740 --> 00:24:39.320
Right.

00:24:39.320 --> 00:24:39.920
Right.

00:24:39.920 --> 00:24:44.760
So running an apt install python3 type of thing is, you know, probably worth it.

00:24:44.760 --> 00:24:45.200
For example.

00:24:45.200 --> 00:24:54.660
When I switched from Python 3.9 to Python 3.9 slim-buster, it went from 350 to 69.

00:24:54.660 --> 00:24:56.060
So that's a lot better.

00:24:56.060 --> 00:24:56.380
Right.

00:24:56.380 --> 00:24:56.860
Yeah.

00:24:57.160 --> 00:25:00.240
It's still not as good as Ubuntu, but it's a lot better.

00:25:00.240 --> 00:25:02.100
It's still twice as many.

00:25:02.100 --> 00:25:09.220
I mean, you can't, it sounds better, but it could be like 359 low problems and then 69 critical ones.

00:25:09.220 --> 00:25:10.780
It totally could.

00:25:10.780 --> 00:25:11.480
It totally could.

00:25:11.480 --> 00:25:11.740
Yeah.

00:25:11.740 --> 00:25:19.300
Also, if we can't trust Snyk necessarily, then like maybe,

00:25:19.300 --> 00:25:25.120
you know, if you can't trust your reporting system, then like maybe none of this means anything.

00:25:25.120 --> 00:25:25.580
Right.

00:25:25.580 --> 00:25:25.840
Yeah.

00:25:25.840 --> 00:25:26.640
Yeah.

00:25:26.640 --> 00:25:32.800
I think one of the things the article originally started out to address was if you have fewer subsystems,

00:25:32.800 --> 00:25:36.780
there's no chance the missing subsystem could get hacked because it's not there.

00:25:36.780 --> 00:25:37.260
Right.

00:25:37.260 --> 00:25:42.000
So if there's a vulnerability in SSH, but you literally don't install SSH, who cares?

00:25:42.260 --> 00:25:46.420
Whereas if you, you know, you're going to have a lot of things that you're going to have to do with your own.

00:25:49.740 --> 00:25:49.900
Yeah.

00:25:49.900 --> 00:25:52.400
And then it went down this rabbit hole of like, well, let me scan it.

00:25:52.400 --> 00:25:52.880
And so on.

00:25:52.880 --> 00:25:54.540
So I want to add one more thing.

00:25:54.540 --> 00:26:01.460
Like Alpine did result in the best outcome from the scanner, but there's a lot of issues with Alpine and Python.

00:26:01.460 --> 00:26:10.680
So for example, there's this PEP here, 656, that right now, if I try to pip install something on Alpine.

00:26:10.680 --> 00:26:15.840
So especially in the data science world where things are large and the compiling takes a lot of steps and so on.

00:26:15.840 --> 00:26:21.820
The wheels that are built for Linux are built for, is it glib?

00:26:21.820 --> 00:26:22.720
glibc?

00:26:22.720 --> 00:26:23.420
I mean, hold on.

00:26:23.420 --> 00:26:24.600
I'll look over here.

00:26:24.600 --> 00:26:25.420
I wrote it down.

00:26:25.420 --> 00:26:25.960
So I know.

00:26:25.960 --> 00:26:27.000
No, I didn't write it down.

00:26:27.000 --> 00:26:27.140
Sorry.

00:26:27.140 --> 00:26:32.620
There's like, I think it's glibc, which is the C library on like Ubuntu and Debian.

00:26:32.620 --> 00:26:36.220
But there's a different one, musl, pronounced "muscle," on Alpine.

00:26:36.220 --> 00:26:39.720
And the wheels are not built for musl.

00:26:39.820 --> 00:26:41.220
They're built for glibc.

00:26:41.220 --> 00:26:44.020
And so you can't pip install that.

00:26:44.020 --> 00:26:46.600
You've got to download everything and then compile it.

00:26:46.600 --> 00:26:53.460
And it's like compiling matplotlib and Jupyter from scratch can take a really long time versus just downloading the wheel.

00:26:53.460 --> 00:26:54.600
And it takes up a lot of space.

00:26:54.600 --> 00:26:59.680
And there's a bunch of issues and things around that that make it slightly not Python friendly.

00:26:59.680 --> 00:27:08.420
That's why there's this PEP, PEP 656, to allow wheels to be tagged as supporting musl, not glibc.
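
To make the wheel-tag point concrete, here is a minimal sketch, using only the standard library, of how the platform tag at the end of a wheel filename indicates which C library the binary targets. The filenames are made up for illustration, and the helper name `libc_families` is hypothetical. `manylinux` tags target glibc; `musllinux` tags are the ones PEP 656 defines for musl.

```python
def libc_families(wheel_filename):
    """Return the set of C-library families a wheel's platform tags target."""
    if not wheel_filename.endswith(".whl"):
        raise ValueError("not a wheel filename")
    # Wheel names end with: ...-{python tag}-{abi tag}-{platform tag}.whl
    platform_tag = wheel_filename[: -len(".whl")].split("-")[-1]
    families = set()
    # A compressed tag set joins several platform tags with dots.
    for tag in platform_tag.split("."):
        if tag.startswith("manylinux"):
            families.add("glibc")
        elif tag.startswith("musllinux"):
            families.add("musl")
        elif tag == "any":
            families.add("pure-python")
        else:
            families.add("other")
    return families


# Hypothetical filenames, for illustration only.
glibc_wheel = "example_pkg-1.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl"
musl_wheel = "example_pkg-1.0-cp39-cp39-musllinux_1_1_x86_64.whl"

print(libc_families(glibc_wheel))
print(libc_families(musl_wheel))
```

An installer on Alpine would skip the first file and, if no musl or pure-Python wheel exists, fall back to building from source, which is the slow path described above.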

00:27:08.960 --> 00:27:10.400
Is that more than you wanted, Brian?

00:27:10.400 --> 00:27:10.840
Are you good?

00:27:10.840 --> 00:27:11.980
Okay.

00:27:11.980 --> 00:27:18.780
So the takeaway that I'm getting is probably not panic on some of these, but maybe at least pay attention to them.

00:27:18.780 --> 00:27:27.260
And it is good, like you said, to remove tools out of your Docker images that you're not using.

00:27:27.260 --> 00:27:30.140
If you're not using Wget in your application, take it off.

00:27:30.140 --> 00:27:30.900
Things like that.

00:27:31.180 --> 00:27:31.640
Yeah, exactly.

00:27:31.640 --> 00:27:33.280
I think Julia's point was great, right?

00:27:33.280 --> 00:27:35.840
It might be a false positive.

00:27:35.840 --> 00:27:43.780
But at the same time, if you're not going to use it again, because Docker, a lot of times you pip install all your stuff and then it's kind of ready to run.

00:27:43.780 --> 00:27:46.020
But you're not going to go and pip install something again.

00:27:46.020 --> 00:27:48.480
You're going to do a new Docker build from scratch.

00:27:48.480 --> 00:27:49.000
Right.

00:27:49.000 --> 00:27:55.500
Like one of the final lines could be remove, remove all those intermediate things that could have problems and make it larger and whatnot.

00:27:56.000 --> 00:28:02.000
Yeah, so I've only thought about this from like an image size perspective.

00:28:02.000 --> 00:28:02.400
Right.

00:28:02.400 --> 00:28:02.920
Like that.

00:28:02.920 --> 00:28:03.240
Yeah.

00:28:03.240 --> 00:28:07.160
That you want smaller images just because it takes forever to get them around.

00:28:07.160 --> 00:28:10.700
But it's interesting to think about from the vulnerability perspective.

00:28:10.700 --> 00:28:17.220
And I've always seen it done as you do whatever installation you need and then you do all these like cleaning steps.

00:28:17.220 --> 00:28:23.240
But what you said, Michael, about like not ever putting certain things on your image is interesting.

00:28:23.240 --> 00:28:24.960
I haven't heard of that before.

00:28:24.960 --> 00:28:25.860
Yeah, thanks.

00:28:25.860 --> 00:28:31.200
I also had Peter McKee, who works at Docker, on Talk Python a little while, like six months ago or something.

00:28:31.200 --> 00:28:36.980
And he talked about having these multi-stage builds, which maybe don't make as much sense with Python.

00:28:36.980 --> 00:28:38.100
I'll try to put it together.

00:28:38.100 --> 00:28:39.740
But like imagine you're building a Go library.

00:28:39.740 --> 00:28:44.560
You could put the Go runtime and build tools on a container, build your thing.

00:28:44.560 --> 00:28:48.320
But the thing you get from Go is an actual binary that's all self-contained.

00:28:48.320 --> 00:28:55.720
You could throw that container away and just copy the output of that into your actual container and never even put all those tools on the

00:28:55.720 --> 00:28:57.160
actual system that goes to production.

00:28:57.160 --> 00:29:04.260
With Python, that might look something like maybe using PEX to package up all the stuff inside of a virtual environment.

00:29:04.260 --> 00:29:08.420
And as long as Python, the runtime, is there, then you can like run the PEX on your other machine.

00:29:08.420 --> 00:29:11.860
But you could potentially not even ever install those, which might be good.
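
The "build elsewhere, ship only the artifact" idea can be sketched with Python's standard-library `zipapp` module. This is a stand-in, not PEX itself: `zipapp` bundles pure-Python code into one runnable archive, and the directory name, file name, and message below are invented for illustration. The machine that runs the archive only needs a Python runtime.

```python
import pathlib
import subprocess
import sys
import tempfile
import zipapp

with tempfile.TemporaryDirectory() as tmp:
    # "Build stage": a throwaway source tree with an entry module.
    src = pathlib.Path(tmp) / "app_src"
    src.mkdir()
    (src / "__main__.py").write_text('print("hello from the bundle")\n')

    # Bundle the tree into a single archive file.
    bundle = pathlib.Path(tmp) / "app.pyz"
    zipapp.create_archive(src, target=bundle)

    # "Production stage": only the interpreter and the archive are needed.
    result = subprocess.run(
        [sys.executable, str(bundle)],
        capture_output=True,
        text=True,
        check=True,
    )
    output = result.stdout.strip()

print(output)
```

In a Docker setting, the build stage would hold the compilers and build tools, and only the finished archive would be copied into the final image.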

00:29:11.860 --> 00:29:12.820
Yeah, that makes sense.

00:29:12.900 --> 00:29:17.020
There's a lot there that is sort of beyond my comfort level.

00:29:17.020 --> 00:29:20.280
But that's what I thought as I looked at this, Brian.

00:29:20.280 --> 00:29:21.700
Well, thanks for taking a look.

00:29:21.700 --> 00:29:22.320
Yeah, you bet.

00:29:22.320 --> 00:29:23.020
All right.

00:29:23.020 --> 00:29:26.680
We'd like to talk about GUIs on the show every now and then.

00:29:26.680 --> 00:29:32.000
And we want to talk about pandas and data frames and data science and all that.

00:29:32.000 --> 00:29:33.180
So let's put those together.

00:29:33.180 --> 00:29:36.980
There's this project over here called PandasGUI.

00:29:37.280 --> 00:29:41.260
And the documentation is sparse, let's say.

00:29:41.260 --> 00:29:41.800
It's pretty easy.

00:29:41.800 --> 00:29:43.300
There's a couple of examples or two.

00:29:43.300 --> 00:29:46.880
So I could come down here and I could like do my pandas stuff and create a data frame.

00:29:46.880 --> 00:29:49.920
And I could just import show from pandasgui.

00:29:49.920 --> 00:29:56.660
And within my notebook, it will pop open a separate window that it then allows me to cruise around and check it out.
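
As a minimal sketch of that workflow, assuming the `pandasgui` package is installed and a desktop GUI is available: the sample data is made up, and the wrapper function `open_in_gui` is a hypothetical helper that keeps the GUI import optional.

```python
import pandas as pd


def open_in_gui(frame):
    """Open an interactive PandasGUI window for the given DataFrame."""
    # Imported lazily so the rest of the script works without the GUI package.
    from pandasgui import show

    return show(frame)


# Made-up sample data, for illustration only.
df = pd.DataFrame({"car": ["a", "b", "c"], "speed": [90, 120, 150]})

# In a notebook or script, call open_in_gui(df) to pop open the window.
print(df)
```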

00:29:56.660 --> 00:30:04.580
So it does, you know, you can print out the data frame in a notebook and you get kind of a static Excel grid looking thing.

00:30:04.580 --> 00:30:05.880
And that's nice.

00:30:05.880 --> 00:30:10.460
But with this, you get an interactive one that lets you sort and select.

00:30:10.460 --> 00:30:15.500
You can actually copy and paste chunks out of there as if it was Excel and then paste it in other places.

00:30:15.500 --> 00:30:18.400
It also has a plotting library with like pictures.

00:30:18.400 --> 00:30:20.840
So I'm going to go click on the bar graph picture.

00:30:20.840 --> 00:30:24.600
And then there's a list of all the columns and the things that the bar graph needs.

00:30:24.600 --> 00:30:29.060
And you can drag and drop this column is the X axis and this column is the Y axis.

00:30:29.060 --> 00:30:35.780
And I want to color it, you know, group it by some other aspect of the data.

00:30:35.780 --> 00:30:40.500
And, you know, like group into multiple charts or multiple lines or plots on a chart.

00:30:40.500 --> 00:30:42.080
All sorts of cool stuff like that.

00:30:42.080 --> 00:30:43.520
There's a statistics section.

00:30:43.520 --> 00:30:49.020
You can export, or import, I guess, import CSV files with drag and drop.

00:30:49.020 --> 00:30:50.860
And there's also search that you can do.

00:30:51.200 --> 00:30:54.860
So it's a pretty neat, quick way to explore pandas.

00:30:54.860 --> 00:30:55.520
Yeah.

00:30:55.520 --> 00:30:57.060
It's a neat idea.

00:30:57.060 --> 00:31:05.600
Like when you, when you first encounter a data frame, like you really want to, you really want to just be able to like look at it without any assumptions.

00:31:05.880 --> 00:31:15.100
And there's a lot of stuff that like kind of goes towards that with like the .plot API in pandas, making it really accessible to make plots really quickly.

00:31:15.100 --> 00:31:18.020
But this is like kind of like the step beyond that, right?

00:31:18.020 --> 00:31:20.560
Of just visualizing it immediately.

00:31:21.060 --> 00:31:21.160
Yeah.

00:31:21.160 --> 00:31:29.820
Like one thing you get when you view the data frame is, you know, like I said, it looks kind of just like printing df or just typing df in the notebook.

00:31:29.820 --> 00:31:32.360
But then on the right, you can say, oh, I want to see the filters.

00:31:32.360 --> 00:31:38.320
And you could type in these filter expressions, these query expressions, and then turn them all, like pile them on.

00:31:38.320 --> 00:31:41.980
You can have little checkboxes to like optionally turn them off, but not delete them.

00:31:42.120 --> 00:31:45.360
And then of course you can sort within there like that.
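
The stacked, toggleable filters can be approximated in plain pandas, which also makes them reproducible. This is a sketch under assumptions: it supposes the GUI's filter expressions resemble the strings that `DataFrame.query` accepts, and the data, column names, and helper `apply_filters` are made up for illustration.

```python
import pandas as pd

# Made-up sample data, for illustration only.
df = pd.DataFrame(
    {
        "car": ["a", "b", "c", "d"],
        "speed": [90, 120, 150, 200],
        "color": ["red", "blue", "red", "blue"],
    }
)

# Each filter is an (expression, enabled) pair, like a checkbox next to it.
filters = [
    ("speed > 100", True),
    ("color == 'red'", True),
    ("speed < 140", False),  # kept in the list but switched off
]


def apply_filters(frame, filters):
    """Apply every enabled query expression in order."""
    for expression, enabled in filters:
        if enabled:
            frame = frame.query(expression)
    return frame


filtered = apply_filters(df, filters)
print(filtered)
```

Keeping the expressions in a list like this is one way to get the "export my filters" behavior asked about later, since the list itself is the record of what was applied.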

00:31:45.360 --> 00:31:49.100
And the graphing, I think the support for the graphing part is really, really helpful.

00:31:49.100 --> 00:31:53.920
So the fact that you can just go and click and say, oh, I want a box plot.

00:31:53.920 --> 00:31:56.280
And then the box plot needs these things.

00:31:56.280 --> 00:32:02.140
You can just drag and drop from the column, from your data frame definition over, and it just live updates.

00:32:02.140 --> 00:32:02.740
Yeah.

00:32:02.740 --> 00:32:11.340
I think that really like lets people visualize the data in the way that they want to sometimes, rather than like the way they already know how in Matplotlib.

00:32:11.660 --> 00:32:15.200
Which I think is what people end up doing, at least for exploratory stuff.

00:32:15.200 --> 00:32:16.100
Yeah, exactly.

00:32:16.100 --> 00:32:23.560
You could real quickly switch between a bar, a box, a scatter plot, back and forth without having to actually be familiar with how those work.

00:32:23.560 --> 00:32:29.540
Can you tell if there's a way to export the filters or is there any mechanism for that?

00:32:29.540 --> 00:32:31.440
There is, I don't think so.

00:32:31.440 --> 00:32:36.440
At least in the YouTube explainer video, there were some comments like, you know what would be awesome?

00:32:36.440 --> 00:32:40.920
Export this as code from here so that I can just turn it back into Python.

00:32:41.200 --> 00:32:43.540
I didn't see anything like that.

00:32:43.540 --> 00:32:44.240
Yeah.

00:32:44.240 --> 00:32:47.120
Sometimes GUIs are a little weird for me because of that.

00:32:47.120 --> 00:32:51.680
You know, like you end up in this GUI world and it's not, you can't reproduce anything.

00:32:51.680 --> 00:32:56.840
I clicked on a whole bunch of stuff and then it looked great, but don't touch it.

00:32:56.840 --> 00:32:57.700
Yeah, exactly.

00:32:57.700 --> 00:32:58.660
I can't do it again.

00:33:00.120 --> 00:33:13.240
Okay, but to be fair, it is a fairly quick way to look at the data and know what you, maybe you can't produce that exact plot again, but you know what the data looks like and you can use a different plotting mechanism to do that.

00:33:13.560 --> 00:33:14.040
Yeah.

00:33:14.040 --> 00:33:15.040
And visually it's pretty clear.

00:33:15.040 --> 00:33:18.660
Like, okay, well, X is assigned to speed and we know it's a histogram.

00:33:18.660 --> 00:33:26.520
And so you could pretty quickly, you know, with some Googling and Stack Overflowing, go, all right, how do I Matplotlib a histogram and get that going?

00:33:26.520 --> 00:33:26.840
You know?

00:33:26.840 --> 00:33:27.400
Right.

00:33:27.400 --> 00:33:28.560
That's a huge time saver.

00:33:29.240 --> 00:33:29.460
Yeah.

00:33:29.460 --> 00:33:34.320
But some, some, some sort of export of like, okay, give me the code to make this plot in my own code.

00:33:34.320 --> 00:33:34.960
That would be great.

00:33:34.960 --> 00:33:36.520
Yeah, absolutely.

00:33:36.520 --> 00:33:37.840
Absolutely.

00:33:37.840 --> 00:33:38.300
All right.

00:33:38.300 --> 00:33:39.200
On to the next.

00:33:39.200 --> 00:33:45.160
But before we get there, I do want to call out just a shout out by Pylang that fsspec is sweet.

00:33:45.160 --> 00:33:45.660
Good mention.

00:33:45.660 --> 00:33:46.020
Yeah.

00:33:46.020 --> 00:33:47.660
I like it as well.

00:33:47.660 --> 00:33:48.380
Cool.

00:33:48.380 --> 00:33:49.160
All right.

00:33:49.160 --> 00:33:49.840
Xarray.

00:33:49.840 --> 00:33:50.760
Xarray.

00:33:50.760 --> 00:33:51.660
Okay.

00:33:51.660 --> 00:33:57.220
So Xarray is, it's my favorite library.

00:33:57.980 --> 00:33:59.540
It's a, it's like a pandas.

00:33:59.540 --> 00:34:04.520
So it's a pandas-like API, but it's for N-dimensional data.

00:34:04.520 --> 00:34:21.860
So if you have like, a lot of times people talk about it in like geospatial data where there's lat, lon, time and others, but also for image data where there's maybe a bunch of different bands from like satellite imagery or other disciplines where you just have labeled data.

00:34:21.860 --> 00:34:23.040
That's not tabular.

00:34:23.040 --> 00:34:28.600
So the axes like mean something, but there's not just one or two of them.

00:34:28.980 --> 00:34:39.720
Then Xarray is like great for that because it lets you do things like you can select a certain subset of time or a certain subset of whatever your dimension is.

00:34:39.720 --> 00:34:42.880
And you can also aggregate across different dimensions.

00:34:43.620 --> 00:34:45.780
And you can use the labels directly.
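
A small sketch of label-based selection and aggregation, assuming `xarray` and `numpy` are installed. The temperature grid, coordinate values, and variable names are made up for illustration.

```python
import numpy as np
import xarray as xr

# Made-up values on a (time, lat, lon) grid, for illustration only.
data = np.arange(24, dtype=float).reshape(2, 3, 4)
temps = xr.DataArray(
    data,
    dims=("time", "lat", "lon"),
    coords={
        "time": ["2021-06-14", "2021-06-15"],
        "lat": [10.0, 20.0, 30.0],
        "lon": [100.0, 110.0, 120.0, 130.0],
    },
    name="temperature",
)

# Select by label rather than by integer position.
one_day = temps.sel(time="2021-06-15")

# Aggregate across named dimensions; the remaining labels stay attached.
mean_by_lat = temps.mean(dim=["time", "lon"])

print(one_day.dims)
print(mean_by_lat.values)
print(mean_by_lat["lat"].values)
```

Because the latitude labels travel with the result, there is no separate list of labels to reattach after the computation.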

00:34:45.780 --> 00:35:04.860
So if you don't have a tool like this, I see people doing this a lot with like machine learning workflows where they'll be, they'll have like separate, like a list of all their, they'll have like a list of all their labels and then they'll have their data and they'll do some manipulation and they'll try to like reattach them at the end.

00:35:04.860 --> 00:35:08.540
And it's just, it just turns into a mess.

00:35:08.540 --> 00:35:12.300
And Xarray actually just like takes care of that all for you.

00:35:12.300 --> 00:35:13.940
It's pretty great.

00:35:13.940 --> 00:35:18.740
And I think that it has applications that have not been fully realized yet.

00:35:18.740 --> 00:35:23.020
And it's starting to like take off in other spaces, but it really comes from this geospatial world.

00:35:23.020 --> 00:35:25.560
But I think it could be useful for all sorts of people.

00:35:25.560 --> 00:35:26.120
Right.

00:35:26.120 --> 00:35:29.660
Because in geospatial, sometimes you have three dimensions, not just two.

00:35:29.660 --> 00:35:30.440
Yeah.

00:35:30.460 --> 00:35:31.980
You almost always have three.

00:35:31.980 --> 00:35:32.520
Right.

00:35:32.520 --> 00:35:33.440
Sorry, Brian.

00:35:33.440 --> 00:35:34.020
Go ahead.

00:35:34.020 --> 00:35:35.560
No, the documentation looks great too.

00:35:35.560 --> 00:35:42.680
The documentation has like getting started guides and tutorials and videos and galleries and stuff.

00:35:42.680 --> 00:35:44.340
So definitely check out the documentation.

00:35:44.340 --> 00:35:45.160
Yeah.

00:35:45.160 --> 00:35:49.920
I think it got a major, it seems like I looked at it for this too, and it seems like it got a major facelift.

00:35:49.920 --> 00:35:51.640
So it looks really nice.

00:35:52.060 --> 00:35:55.000
It also has like plotting.

00:35:55.000 --> 00:36:01.300
It supports the .plot API or some different version of it that's like the pandas version.

00:36:01.300 --> 00:36:06.320
But you can plot in different, you know, three dimensions or aggregate and then plot.

00:36:06.320 --> 00:36:10.080
And so that's like a really nice way to get the visuals quickly.

00:36:11.080 --> 00:36:23.100
And then the last thing that I wanted to say about it is that it's normally backed by NumPy arrays, but it can also be backed by Dask arrays or Sparse arrays or all sorts of different arrays natively.

00:36:23.100 --> 00:36:32.520
So it's a really cool, it's another one of these like building block things where you can have Xarray's labeling and indexing and all the like nice stuff.

00:36:32.520 --> 00:36:38.100
And then down inside it can be NumPy or CuPy or Dask.

00:36:38.100 --> 00:36:39.060
How interesting.

00:36:39.060 --> 00:36:45.680
So it's, it can do that juggling and piecing back together that other people are manually doing and you just have this simple API.

00:36:45.680 --> 00:36:47.680
And if it has to do that, it'll figure it out.

00:36:47.680 --> 00:36:48.440
Yeah.

00:36:48.440 --> 00:36:48.820
Yeah.

00:36:48.820 --> 00:36:49.740
That's pretty cool.

00:36:49.740 --> 00:36:50.020
Nice.

00:36:50.020 --> 00:36:52.860
And you talked about CuPy and Dask.

00:36:52.860 --> 00:36:55.600
Like those are some pretty interesting backends for this.

00:36:55.600 --> 00:36:56.260
Yeah.

00:36:56.260 --> 00:36:56.820
Yeah.

00:36:56.820 --> 00:37:00.840
The Dask one is, I said CuPy.

00:37:00.840 --> 00:37:04.060
And now I'm wondering if maybe it's just like Dask and then CuPy.

00:37:04.060 --> 00:37:05.560
So don't quote me on that.

00:37:05.560 --> 00:37:11.380
But yeah, the Dask one is like really integrated with Xarray code.

00:37:11.380 --> 00:37:16.580
So they do some special things to make it so that it works with parallelizing and things.

00:37:16.580 --> 00:37:19.100
But, but from the user experience, it's the same.

00:37:19.100 --> 00:37:19.920
Yeah.

00:37:19.920 --> 00:37:20.380
Fantastic.

00:37:20.380 --> 00:37:23.220
And then I also noticed it requires Python 3.7.

00:37:23.220 --> 00:37:28.480
Really nice to see tools sort of keeping up with the latest, not, not, not really old stuff.

00:37:28.480 --> 00:37:30.420
Well, hopefully it's 3.7 and above.

00:37:30.860 --> 00:37:31.160
Well, yeah.

00:37:31.160 --> 00:37:31.620
Yeah.

00:37:31.620 --> 00:37:32.420
Greater than or equal to.

00:37:32.420 --> 00:37:35.100
Well, I mean, I ran into a library.

00:37:35.100 --> 00:37:38.960
It was an internal thing that, that was only 3.7.

00:37:38.960 --> 00:37:44.480
So I tried it on, I'm like, I assumed or above and I tried it on 3.9 and it like fell over.

00:37:44.480 --> 00:37:45.440
Like what's going on?

00:37:45.440 --> 00:37:46.960
It was only 3.7.

00:37:46.960 --> 00:37:47.720
It's weird.

00:37:47.720 --> 00:37:49.540
Okay.

00:37:49.540 --> 00:37:50.100
That is weird.

00:37:50.100 --> 00:37:55.760
That'd be interesting to think about what special features of 3.7 they're depending on that broke in 3.8.

00:37:56.160 --> 00:37:56.440
Yeah.

00:37:56.440 --> 00:37:57.040
That's what I was thinking.

00:37:57.040 --> 00:38:01.360
Like, how do you do that without just checking for equal, equal 3.7 on version?
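
The difference between "only 3.7" and "3.7 and above" comes down to the comparison operator. A minimal sketch, with hypothetical helper names `exact_match` and `at_least`:

```python
import sys


def exact_match(version, target=(3, 7)):
    """Brittle check: accepts exactly one minor version."""
    return tuple(version[:2]) == target


def at_least(version, minimum=(3, 7)):
    """Forward-compatible check: accepts the minimum and anything newer."""
    return tuple(version[:2]) >= minimum


for candidate in [(3, 6), (3, 7), (3, 9)]:
    print(candidate, exact_match(candidate), at_least(candidate))

# The running interpreter can be checked the same way.
print(at_least(sys.version_info))
```

A package guarded by the first style would refuse to run on 3.9 even if nothing in it actually broke, which is one way to end up with the behavior described here.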

00:38:01.360 --> 00:38:02.060
Yeah.

00:38:02.060 --> 00:38:03.260
So anyway.

00:38:03.260 --> 00:38:03.840
Yeah.

00:38:03.840 --> 00:38:04.440
All right.

00:38:04.440 --> 00:38:06.300
Well, that's it for our six main topics.

00:38:06.300 --> 00:38:08.340
Brian, you got anything else you want to throw out there quickly?

00:38:09.220 --> 00:38:09.980
Yeah, actually.

00:38:09.980 --> 00:38:22.480
So I didn't have this up, but there was a, on Twitter, somebody like reacted to me with an emoji and I didn't, didn't know what they meant.

00:38:22.480 --> 00:38:26.440
So I looked up, let me, let me pop this up.

00:38:28.380 --> 00:38:37.800
And it was helpful and you can just, you can just copy and paste the emoji that somebody uses in there and it tells you what it means.

00:38:37.800 --> 00:38:44.320
And the, you know, kind of not just what it's supposed to mean, but also what people are using it for.

00:38:44.320 --> 00:38:49.340
Anyway, for somebody that's sort of an old, old guy that is out of touch sometimes, this was helpful.

00:38:49.340 --> 00:38:50.860
Anyway.

00:38:51.120 --> 00:38:51.340
Yeah.

00:38:51.340 --> 00:38:54.900
I mean, sometimes it's obvious, like a heart, we know what a heart means.

00:38:54.900 --> 00:38:55.280
Right.

00:38:55.280 --> 00:39:00.900
But, you know, like hands together, it's not necessarily that that's like a thank you sort of bow type of thing.

00:39:00.900 --> 00:39:03.040
I mean, there's certain ones where you're like, ah, what does that mean?

00:39:03.040 --> 00:39:06.520
It was like a hands together with like arrows coming out of the top.

00:39:06.520 --> 00:39:10.680
And I'm like, I don't know what this is, but apparently it's just raising hands.

00:39:10.680 --> 00:39:12.740
Like, like you're saying hooray for somebody.

00:39:12.740 --> 00:39:13.540
Oh, okay.

00:39:13.540 --> 00:39:14.400
That's nice.

00:39:14.400 --> 00:39:14.880
So.

00:39:14.880 --> 00:39:15.200
Okay.

00:39:15.200 --> 00:39:15.760
It's good.

00:39:15.760 --> 00:39:18.820
I use Emojipedia all the time, but I think I use it in the opposite way.

00:39:18.820 --> 00:39:24.100
Like I use it to get an emoji to like put somewhere because I don't have like an emoji keyboard or whatever.

00:39:24.100 --> 00:39:25.000
Oh yeah.

00:39:25.000 --> 00:39:25.700
That would be good too.

00:39:25.700 --> 00:39:38.040
The other thing I wanted to bring up is I hopefully have some cool news to share tomorrow about the pytest book, and the news will show up on a revamped pytest book site.

00:39:38.040 --> 00:39:48.300
So if you go to pytestbook.com, you get redirected to this pythontest.com page where I'll talk about the second edition.

00:39:48.680 --> 00:39:51.740
Hopefully there'll be news about the second edition coming out tomorrow.

00:39:51.740 --> 00:39:54.600
Is your new static site magic?

00:39:54.600 --> 00:39:55.700
Yeah.

00:39:55.700 --> 00:39:56.160
Yeah.

00:39:56.160 --> 00:39:56.800
Static site.

00:39:56.800 --> 00:39:58.640
And I totally, and it goes dark and light.

00:39:58.640 --> 00:40:01.720
But I totally stole from Pragyun.

00:40:01.720 --> 00:40:06.280
So Pragyun has the same, he's got a really nice site.

00:40:06.280 --> 00:40:08.540
So it's a bunch of great, great.

00:40:08.540 --> 00:40:09.220
It looked great.

00:40:09.220 --> 00:40:10.260
And I'm like, that'll work.

00:40:10.260 --> 00:40:11.320
I'll just do what he's doing.

00:40:11.320 --> 00:40:12.680
So that's what I did.

00:40:12.680 --> 00:40:13.140
Yeah.

00:40:13.140 --> 00:40:13.660
Yeah.

00:40:13.660 --> 00:40:14.000
Very cool.

00:40:14.000 --> 00:40:17.800
I think we have exactly the same stack for our Saturn Cloud site now.

00:40:17.800 --> 00:40:18.900
Oh, how neat.

00:40:18.900 --> 00:40:19.360
So it's cool.

00:40:19.360 --> 00:40:20.380
Awesome.

00:40:20.380 --> 00:40:21.140
How about you, Julia?

00:40:21.140 --> 00:40:22.560
Anything else you want to give a shout out to?

00:40:22.560 --> 00:40:26.480
Well, I've been really into entry points recently.

00:40:26.480 --> 00:40:29.600
Just like the concept of them is very cool.

00:40:29.680 --> 00:40:34.220
As in like Python packages, you can give them almost like CLI command type entry points?

00:40:34.220 --> 00:40:34.900
Yeah.

00:40:34.900 --> 00:40:39.480
But the thing that I think is really cool is like, like, like Matplotlib.

00:40:39.480 --> 00:40:43.520
This is an example that made me first realize about entry points: pandas has

00:40:43.520 --> 00:40:44.200
this .plot.

00:40:44.200 --> 00:40:45.940
I think I mentioned this three times now.

00:40:45.940 --> 00:40:47.820
But you can swap out the backend.

00:40:47.820 --> 00:40:49.080
So you don't have to have Matplotlib.

00:40:49.080 --> 00:40:50.660
You can use other backends.

00:40:50.660 --> 00:40:57.180
And all the logic for that is in the other visualization libraries themselves, not in

00:40:57.180 --> 00:40:57.780
Pandas.

00:40:57.780 --> 00:41:02.320
So it's, it's just like, you can swap out other things.
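
Here is a rough sketch of how a host library can discover plugins through entry points, using the standard-library `importlib.metadata` (Python 3.8+). The group name `pandas_plotting_backends` is, to the best of my knowledge, the one pandas uses for plotting backends, but treat it as an assumption; the helper `discover` is hypothetical. The result depends on what is installed and may be empty.

```python
from importlib.metadata import entry_points


def discover(group):
    """Map entry point names to their targets for one group."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+ returns a selectable collection.
        matches = eps.select(group=group)
    else:
        # Python 3.8 and 3.9 return a dict keyed by group name.
        matches = eps.get(group, [])
    return {ep.name: ep.value for ep in matches}


backends = discover("pandas_plotting_backends")
print(backends)
```

The design point is that the plugin package declares the entry point in its own metadata, so the host never needs to know about the plugin ahead of time.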

00:41:02.320 --> 00:41:03.720
It's not just for CLIs.

00:41:03.720 --> 00:41:04.820
Okay.

00:41:04.820 --> 00:41:05.400
Yeah.

00:41:05.400 --> 00:41:05.860
How neat.

00:41:05.860 --> 00:41:06.260
All right.

00:41:06.260 --> 00:41:06.440
Yeah.

00:41:06.440 --> 00:41:09.340
I learned about entry points a year, year and a half ago.

00:41:09.340 --> 00:41:11.080
And ever since I'm like, oh yeah, this is awesome.

00:41:11.080 --> 00:41:14.100
I can now create these little commands that'll be part of just my shell.

00:41:14.100 --> 00:41:14.740
I love it.

00:41:14.740 --> 00:41:15.320
Yeah.

00:41:15.320 --> 00:41:18.280
The other thing I wanted to say was GitHub CLI is really cool.

00:41:18.280 --> 00:41:22.360
I think that's standalone, but it's, I've been using it a lot.

00:41:22.360 --> 00:41:27.020
I'm sure people know the Git CLI, but what's the story of the GitHub CLI?

00:41:27.020 --> 00:41:34.160
Oh, well, the GitHub CLI is, makes it, so if you have ever tried to check out a branch on

00:41:34.160 --> 00:41:38.960
someone else's fork, like if you want to like evaluate a PR that someone has put on a fork.

00:41:38.960 --> 00:41:39.400
Yeah, exactly.

00:41:39.400 --> 00:41:39.760
Yeah.

00:41:39.760 --> 00:41:44.380
That is the situation where the GitHub CLI is really great because you can just do like

00:41:44.380 --> 00:41:51.140
gh checkout pr, or rather gh pr checkout, whatever the number is, and then you're just on their

00:41:51.140 --> 00:41:51.780
branch then.

00:41:51.780 --> 00:41:55.840
And if you can push, if you have push access to their branch, if you're a maintainer and

00:41:55.840 --> 00:41:58.080
they've allowed it, you can just push directly.

00:41:58.080 --> 00:42:03.840
And you don't, I mean, I was always looking at that sequence of commands before, like I

00:42:03.840 --> 00:42:08.120
know people have like Git aliases and stuff, but yeah, I'd really recommend checking it

00:42:08.120 --> 00:42:09.420
out if you do a lot of GitHub stuff.
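
NOTE
A small sketch of the command mentioned above, wrapped in Python so it can be
scripted. The gh pr checkout subcommand is real; the function names and the
PR number 123 are hypothetical and only here for illustration.
```python
import shutil
import subprocess
# Build the argument list for checking out a pull request by number.
def pr_checkout_command(number):
    return ["gh", "pr", "checkout", str(number)]
# Run the command in the current repository (requires the GitHub CLI).
def checkout_pr(number):
    if shutil.which("gh") is None:
        raise RuntimeError("the GitHub CLI (gh) is not installed")
    subprocess.run(pr_checkout_command(number), check=True)
# Rough plain-git equivalent, for comparison:
#   git fetch origin pull/123/head:pr-123
#   git checkout pr-123
print(" ".join(pr_checkout_command(123)))
```
Calling checkout_pr(123) inside a clone would switch to that PR's branch; the
print line only shows the command without running it.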

00:42:09.420 --> 00:42:09.940
Okay.

00:42:09.940 --> 00:42:10.260
Awesome.

00:42:10.460 --> 00:42:10.580
Yeah.

00:42:10.580 --> 00:42:11.340
That's great advice.

00:42:11.340 --> 00:42:11.900
Yeah.

00:42:11.900 --> 00:42:15.640
I often want to like check out someone's pull request, I want to be able to like play with

00:42:15.640 --> 00:42:16.520
it and run their code.

00:42:16.520 --> 00:42:17.100
And yeah.

00:42:17.100 --> 00:42:18.420
And so, yeah.

00:42:18.420 --> 00:42:19.320
It's the best.

00:42:19.320 --> 00:42:20.360
Yeah.

00:42:20.360 --> 00:42:20.800
Awesome.

00:42:20.800 --> 00:42:21.340
All right.

00:42:21.340 --> 00:42:24.800
I got a couple of things to add, by the way, first of all, just that first practical SQL

00:42:24.800 --> 00:42:26.260
analysis that you talked about.

00:42:26.260 --> 00:42:29.780
It also is a similar theme that you were talking about, Brian.

00:42:29.780 --> 00:42:33.140
One of the things I thought was cool though, as you scroll through it, it has a progress bar

00:42:33.140 --> 00:42:34.280
for reading at the top.

00:42:34.280 --> 00:42:35.440
And that just made me so happy.

00:42:35.440 --> 00:42:38.000
I don't know why that was, that was really neat.

00:42:38.300 --> 00:42:38.500
All right.

00:42:38.500 --> 00:42:40.600
But I have a bunch of hear all about it sort of things.

00:42:40.600 --> 00:42:44.540
So really quick, Python, B2, I just got the center.

00:42:44.540 --> 00:42:44.740
Yeah.

00:42:44.740 --> 00:42:45.100
Okay.

00:42:45.100 --> 00:42:46.360
Live update.

00:42:46.360 --> 00:42:50.520
Python 3.10 beta 2 is out if people want to check that out.

00:42:50.520 --> 00:42:52.560
And you can go download that.

00:42:52.560 --> 00:42:59.060
It also highlights all the major features like the pipe operator for writing unions in type

00:42:59.060 --> 00:43:03.240
specifications and a bunch of other stuff that people might care about.

00:43:03.240 --> 00:43:04.620
Structural pattern matching.

00:43:04.620 --> 00:43:05.760
It's probably a big one.

00:43:06.120 --> 00:43:06.260
Yeah.

00:43:06.260 --> 00:43:07.940
Go to the completely different down.

00:43:07.940 --> 00:43:09.200
Is that on here?

00:43:09.200 --> 00:43:10.840
And now for something completely different.

00:43:10.840 --> 00:43:11.540
I love that part.

00:43:11.540 --> 00:43:12.700
So right above the files.

00:43:12.700 --> 00:43:14.200
Yeah.

00:43:14.200 --> 00:43:16.040
Oh, interesting.

00:43:16.040 --> 00:43:21.960
The Ehrenfest paradox concerns the rotation of a rigid disk in the theory of relativity.

00:43:21.960 --> 00:43:24.840
Its original 1909 formulation presented by.

00:43:24.840 --> 00:43:25.280
Yeah.

00:43:25.280 --> 00:43:25.500
Okay.

00:43:25.500 --> 00:43:27.760
That is unexpected, but very cool.

00:43:27.760 --> 00:43:29.660
And completely different and irrelevant.

00:43:29.660 --> 00:43:30.160
Yeah.

00:43:30.160 --> 00:43:30.720
Yeah.

00:43:30.720 --> 00:43:31.060
Awesome.

00:43:31.180 --> 00:43:31.380
Okay.

00:43:31.380 --> 00:43:34.340
So, takeaway: 3.10 beta 2 is out.

00:43:34.340 --> 00:43:35.160
People can check that out.

00:43:35.160 --> 00:43:37.840
There's also some security patches for Django.

00:43:37.840 --> 00:43:38.980
So be sure to check that out.

00:43:38.980 --> 00:43:45.120
One thing that surprised me is the Microsoft install Python from the Windows store is already

00:43:45.120 --> 00:43:49.140
like has a 3.10 beta store install.

00:43:49.140 --> 00:43:50.280
So, okay.

00:43:50.280 --> 00:43:52.660
That's pretty cool that they're keeping that up to date.

00:43:52.660 --> 00:43:54.160
And it's rated E for everyone.

00:43:54.540 --> 00:43:54.760
Yeah.

00:43:54.760 --> 00:43:56.700
Even kids can pip install.

00:43:56.700 --> 00:43:57.120
Awesome.

00:43:57.120 --> 00:44:03.900
So Frederick Bankston sent a message in response to our last show where we talked about the

00:44:03.900 --> 00:44:05.760
method overloading by type.

00:44:05.760 --> 00:44:08.520
Like if it takes an int or a string, it calls different functions.

00:44:08.520 --> 00:44:13.020
He also pointed us towards this other library, multimethod, that is similar.

00:44:13.020 --> 00:44:14.120
So people can check that out.
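
NOTE
A sketch of overloading by argument type, as described above. It uses
functools.singledispatch from the standard library as a stand-in, not the
multimethod library mentioned in the show; the show function is a made-up
example.
```python
from functools import singledispatch
# Fallback used when no more specific implementation is registered.
@singledispatch
def show(value):
    return f"object: {value!r}"
# Chosen when the first argument is an int (taken from the annotation).
@show.register
def _(value: int):
    return f"int: {value}"
# Chosen when the first argument is a str.
@show.register
def _(value: str):
    return f"str: {value}"
print(show(1))
print(show("a"))
print(show(2.5))
```
singledispatch only looks at the type of the first argument, so dispatching
on several arguments would need a different tool.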

00:44:14.120 --> 00:44:14.520
That's cool.

00:44:14.520 --> 00:44:14.720
Yeah.

00:44:14.720 --> 00:44:15.200
Neat.

00:44:15.400 --> 00:44:22.120
Speaking of the GitHub stuff, I've been starting to use PyCharm 2021.2 early access

00:44:22.120 --> 00:44:24.180
version, early access program version one.

00:44:24.180 --> 00:44:25.260
And it's been working fine.

00:44:25.260 --> 00:44:28.180
So if people want to try out the new features, there's a bunch of cool stuff.

00:44:28.180 --> 00:44:32.100
You have support for Python 3.10 and new stuff for pytest.

00:44:32.100 --> 00:44:38.080
I don't remember if this came in here, but one thing that I did learn about that recently

00:44:38.080 --> 00:44:44.500
that's in there that's super cool is they have in PyCharm, if you log in PyCharm into

00:44:44.500 --> 00:44:48.280
your GitHub account, there's a pull request section and you can just click it and it'll

00:44:48.280 --> 00:44:50.480
do those same steps that Julia was talking about.

00:44:50.480 --> 00:44:55.200
Like right there in PyCharm, just go, I want to try that PR before I accept it and just click

00:44:55.200 --> 00:44:55.700
that and go.

00:44:55.700 --> 00:44:57.260
You can even have comments.

00:44:57.260 --> 00:44:59.300
You see the conversation inside there and everything.

00:44:59.300 --> 00:44:59.680
It's cool.

00:44:59.680 --> 00:45:01.760
Never go to GitHub again.

00:45:01.760 --> 00:45:02.860
Exactly.

00:45:02.860 --> 00:45:05.040
And don't just forget how to use it basically.

00:45:05.040 --> 00:45:05.780
All right.

00:45:05.780 --> 00:45:07.200
That's it.

00:45:07.320 --> 00:45:08.340
That's all the items I got.

00:45:08.340 --> 00:45:11.280
So yeah, I've got other stuff that's just hanging around from before.

00:45:11.280 --> 00:45:11.780
Cool.

00:45:11.780 --> 00:45:12.680
All right.

00:45:12.680 --> 00:45:14.200
Well, you want to close it out with a joke?

00:45:14.200 --> 00:45:14.760
Yeah.

00:45:14.760 --> 00:45:15.700
A couple of jokes.

00:45:15.700 --> 00:45:16.340
Always.

00:45:16.340 --> 00:45:16.800
All right.

00:45:16.800 --> 00:45:22.080
So over at upjoke.com/programmer-jokes, you'll find many bad jokes.

00:45:22.080 --> 00:45:26.340
Some even that are not very appropriate or whatever, but there's a few that are funny.

00:45:26.340 --> 00:45:27.940
So I pulled out three here.

00:45:27.940 --> 00:45:29.580
I'll do the first one.

00:45:29.580 --> 00:45:31.300
Brian, you can do the second.

00:45:31.300 --> 00:45:33.120
Julia, you can do the third, I guess, if you're up for it.

00:45:33.120 --> 00:45:33.580
Okay.

00:45:33.580 --> 00:45:36.520
So this one we should have saved for six months from now.

00:45:36.560 --> 00:45:39.580
But I asked a programmer what her new year's resolution would be.

00:45:39.580 --> 00:45:41.480
She answered 1920 by 1080.

00:45:41.480 --> 00:45:42.820
That's so bad.

00:45:42.820 --> 00:45:44.460
No, that's awesome.

00:45:44.460 --> 00:45:45.560
It's really bad.

00:45:45.560 --> 00:45:45.780
All right.

00:45:45.780 --> 00:45:46.920
Well, you got to do the next one.

00:45:46.920 --> 00:45:51.940
How does a programmer confuse a mathematician?

00:45:51.940 --> 00:45:53.220
I don't know how.

00:45:53.220 --> 00:45:55.580
Just saying that X equals X plus one.

00:45:55.580 --> 00:45:59.700
All right, Julia.

00:46:00.180 --> 00:46:00.620
Okay.

00:46:00.620 --> 00:46:03.960
Why do Python programmers have low self-esteem?

00:46:03.960 --> 00:46:06.880
They're constantly comparing their self to other.

00:46:06.880 --> 00:46:10.780
Also bad.

00:46:10.780 --> 00:46:11.720
Probably the worst.

00:46:11.720 --> 00:46:12.660
Sorry we gave you that one.

00:46:12.660 --> 00:46:13.860
That's okay.

00:46:13.860 --> 00:46:15.500
I saw this.

00:46:15.500 --> 00:46:19.240
I saw the one that Brian did and I was like, oh, it should be X plus equals one.

00:46:19.300 --> 00:46:20.680
And I was like, no, that ruins the joke.

00:46:20.680 --> 00:46:22.820
Exactly.

00:46:22.820 --> 00:46:24.160
Yeah.

00:46:24.160 --> 00:46:25.880
Yeah.

00:46:25.880 --> 00:46:30.800
I actually often do the slow way or the non-obvious way.

00:46:30.800 --> 00:46:30.900
The proposed way.

00:46:30.900 --> 00:46:31.160
Yeah.

00:46:31.160 --> 00:46:35.920
X equals X plus one just to make it more obvious to people reading it sometimes.

00:46:35.920 --> 00:46:36.260
Yeah.

00:46:36.260 --> 00:46:36.840
Yeah.

00:46:36.840 --> 00:46:37.640
No, I agree.

00:46:37.640 --> 00:46:38.320
Yeah.

00:46:38.320 --> 00:46:41.740
At least it's not C++ with X, plus plus X.

00:46:41.740 --> 00:46:43.540
I love that.

00:46:43.540 --> 00:46:44.640
No, no.

00:46:44.900 --> 00:46:46.280
We should have that.

00:46:46.280 --> 00:46:50.800
I'm okay with X plus plus, but not that also plus plus X.

00:46:50.800 --> 00:46:51.680
Oh, the pre-increment.

00:46:51.680 --> 00:46:52.240
Yeah.

00:46:52.240 --> 00:46:53.180
The pre-increment.

00:46:53.180 --> 00:46:53.800
The slight.

00:46:53.800 --> 00:46:54.380
That's weird.

00:46:54.380 --> 00:46:55.140
Yes.

00:46:55.140 --> 00:46:55.840
Exactly.

00:46:55.840 --> 00:46:56.540
Exactly.

00:46:56.540 --> 00:46:57.280
But I could go for it.

00:46:57.280 --> 00:46:57.880
X plus plus.

00:46:57.880 --> 00:46:58.220
Come on.

00:46:58.220 --> 00:46:58.820
All right.

00:46:58.820 --> 00:47:01.200
Well, Julia, thanks for joining us this week.

00:47:01.200 --> 00:47:02.420
And Brian, thanks as always.

00:47:02.420 --> 00:47:03.280
Oh, it was a pleasure.

00:47:03.280 --> 00:47:03.920
Thanks, Julia.

00:47:03.920 --> 00:47:04.500
Yeah.

00:47:04.500 --> 00:47:04.600
Bye.

00:47:04.600 --> 00:47:04.740
Bye.

