WEBVTT

00:00:00.001 --> 00:00:05.380
Hello and welcome to Python Bytes, where we deliver news and headlines directly to your earbuds.

00:00:05.380 --> 00:00:11.120
This is episode 233, recorded May 12, 2021, and I'm Brian Okken.

00:00:11.120 --> 00:00:12.160
I'm Michael Kennedy.

00:00:12.160 --> 00:00:13.980
And I'm Marlene Mhangami.

00:00:13.980 --> 00:00:18.240
Well, welcome, Marlene. For people who don't know you, can you introduce who you are?

00:00:18.240 --> 00:00:26.100
So I am a Pythonista, of course, and I am based in Harare, Zimbabwe.

00:00:26.100 --> 00:00:29.800
I am also really involved with the Python community.

00:00:30.080 --> 00:00:34.160
So I'm currently the vice chair of the PSF Board of Directors.

00:00:34.160 --> 00:00:41.340
I've been on the board, I think, for coming up on four years now, which is really exciting.

00:00:41.340 --> 00:00:44.620
And it's been really a very cool experience for me.

00:00:44.620 --> 00:00:47.760
I'm also a software engineer.

00:00:47.760 --> 00:00:53.780
I work right now with the RAPIDS team at NVIDIA and have just been doing software engineering with them.

00:00:53.780 --> 00:00:55.600
I will talk a bit about that later.

00:00:55.600 --> 00:00:59.420
But yeah, I'm trying to think what else.

00:00:59.480 --> 00:01:04.940
I'm also a very avid reader and just like doing other things besides software.

00:01:04.940 --> 00:01:07.740
So yeah, that's pretty much me.

00:01:07.740 --> 00:01:08.180
Cool.

00:01:08.180 --> 00:01:08.980
That's awesome.

00:01:08.980 --> 00:01:10.240
You're doing a bunch of cool stuff.

00:01:10.240 --> 00:01:13.700
So I think RAPIDS seems like a really neat project to work on as well.

00:01:13.700 --> 00:01:17.120
And of course, the Python community side is great.

00:01:17.120 --> 00:01:18.840
So super happy to have you here.

00:01:18.840 --> 00:01:23.340
Brian, you know, having a good readme is really important to a project, wouldn't you say?

00:01:23.760 --> 00:01:24.960
Yeah, definitely.

00:01:24.960 --> 00:01:31.560
And I, for some reason, I don't know, readmes are not difficult to write, but I freeze up.

00:01:31.560 --> 00:01:32.980
It's a blank page syndrome.

00:01:32.980 --> 00:01:37.260
I think often I've gone through and just, like, copied from some other project

00:01:37.260 --> 00:01:38.780
what's in their

00:01:38.780 --> 00:01:39.100
readme.

00:01:39.940 --> 00:01:45.700
But I don't think that that's like the best way to go about it, really, because sometimes you forget stuff.

00:01:45.980 --> 00:01:49.400
So this was a, we have a recommendation from Johnny Metz.

00:01:49.400 --> 00:01:52.260
It's a tool called readme.so.

00:01:53.180 --> 00:01:55.820
And this is like totally fun.

00:01:55.820 --> 00:02:00.920
It's just this interactive thing where you get to add stuff.

00:02:00.920 --> 00:02:01.980
So we've got the title.

00:02:01.980 --> 00:02:08.680
You can, there's on the left-hand side, there's a bunch of sections where you can select what you want to go into the readme.

00:02:08.680 --> 00:02:15.160
And then it shows a preview on the right, but you can also see the raw markdown.

00:02:15.160 --> 00:02:17.060
And then in the middle, there's an editor.

00:02:17.060 --> 00:02:19.100
So you can actually just edit the whole thing here.

00:02:19.100 --> 00:02:24.020
But really, I don't know if I really would project title.

00:02:24.020 --> 00:02:30.120
What I'd probably do is go through and, you know, pick out to look at what sort of things I'd want.

00:02:30.120 --> 00:02:36.840
So I'd probably maybe some acknowledgements if I took the, if I got some help from somebody, maybe an API reference.

00:02:36.840 --> 00:02:40.800
If it's got a, if it's a library, how to contribute.

00:02:40.800 --> 00:02:42.020
Oh, badges.

00:02:42.020 --> 00:02:43.200
Definitely want badges.

00:02:43.200 --> 00:02:44.260
Oh, yeah.

00:02:44.260 --> 00:02:48.500
And then, you know, maybe like how to run tests if you want to contribute.

00:02:48.500 --> 00:02:54.360
If there are other cool projects using it, I'd want a "Used By" section, all these sorts of things.

00:02:54.360 --> 00:03:00.880
And then you can, the editor only selects, it only shows you the ones, the one at a time, which is nice.

00:03:00.880 --> 00:03:09.000
But then you've got this, this whole generated, really nice looking readme with tables and everything like built in.

00:03:09.060 --> 00:03:12.380
And you can either just copy it or download it and just run with it.

00:03:12.380 --> 00:03:14.860
I think this is really great.

00:03:14.860 --> 00:03:16.200
I'll probably use this in the future.

00:03:16.720 --> 00:03:17.500
I really love this.

00:03:17.500 --> 00:03:24.700
And I'm surprised about the psychological benefit of just showing the little, the section with the one heading.

00:03:24.700 --> 00:03:28.520
So for like example, acknowledgements, you just have hash, hash acknowledgements.

00:03:28.520 --> 00:03:34.180
And then the few things, even though you're editing the whole readme, it seems so much more like, oh, I'm going to just work on that section.

00:03:34.180 --> 00:03:35.260
It's really cool.

00:03:35.260 --> 00:03:36.020
Marlene, what do you think?

00:03:36.020 --> 00:03:37.560
It's really, really cool.

00:03:37.560 --> 00:03:38.300
I like it.

00:03:38.360 --> 00:03:40.100
I think I'm going to try it out.

00:03:40.100 --> 00:03:45.880
I have put no effort at all into this.

00:03:45.880 --> 00:03:49.320
So I think it's something I need to put more effort into.

00:03:49.320 --> 00:03:51.720
And this looks like a really good way to do that.

00:03:51.720 --> 00:03:52.600
So, yeah.

00:03:52.600 --> 00:03:59.480
I think that it would also be great to just like, if you have an existing readme and you want to add some new sections, you're not quite sure how it should look.

00:03:59.480 --> 00:04:04.840
Using this as a jumping-off point, just to grab sections of a readme to add to an existing one, too.

00:04:04.840 --> 00:04:05.540
This would be great.

00:04:05.540 --> 00:04:06.560
Yeah.

00:04:06.560 --> 00:04:08.120
This is really, really cool.

00:04:08.120 --> 00:04:12.080
How do you, how do you, can you start with a new one?

00:04:12.080 --> 00:04:13.940
Like, can I, well, sorry, let me take it back.

00:04:13.940 --> 00:04:15.420
Can I start with an existing one?

00:04:15.420 --> 00:04:17.720
Can I somehow upload an existing one?

00:04:17.720 --> 00:04:18.800
I don't see.

00:04:18.800 --> 00:04:19.180
I don't think so.

00:04:19.180 --> 00:04:21.380
Wait, I can go to raw.

00:04:21.380 --> 00:04:22.080
Hold on.

00:04:22.080 --> 00:04:24.060
Oh, you can probably just drop it into raw.

00:04:24.060 --> 00:04:24.360
Maybe.

00:04:24.360 --> 00:04:26.160
Yes, you can drop it into the raw.

00:04:26.160 --> 00:04:26.540
That's it.

00:04:26.540 --> 00:04:27.100
Okay, perfect.

00:04:27.100 --> 00:04:29.560
You go to raw, which it doesn't hide the sections.

00:04:29.560 --> 00:04:30.860
It's just pure markdown.

00:04:30.860 --> 00:04:31.880
And then you just throw it in there.

00:04:31.880 --> 00:04:32.080
Okay.

00:04:32.080 --> 00:04:34.020
But you can't edit there.

00:04:34.020 --> 00:04:35.640
No, no, but you can flip it back.

00:04:35.640 --> 00:04:37.200
I think probably once you edit it there.

00:04:37.620 --> 00:04:38.980
I don't think you can edit.

00:04:38.980 --> 00:04:41.600
So you can only edit in the, like the editor part.

00:04:41.600 --> 00:04:42.140
So.

00:04:42.140 --> 00:04:42.720
Yeah.

00:04:42.720 --> 00:04:43.860
It still looks really, really cool.

00:04:43.860 --> 00:04:46.020
I've, I've heard of platform as a service.

00:04:46.020 --> 00:04:48.860
I've heard of infrastructure as a service.

00:04:49.080 --> 00:04:52.400
I've heard of database as a service, but I guess now we have readme as a service.

00:04:52.400 --> 00:04:52.720
I don't know.

00:04:52.720 --> 00:04:53.520
You just go to the website.

00:04:53.520 --> 00:04:55.700
Exactly.

00:04:55.700 --> 00:04:56.160
That's pretty cool.

00:04:56.160 --> 00:04:57.420
Yeah.

00:04:57.420 --> 00:04:59.040
I'm, I'm, I'm pretty excited about this.

00:04:59.040 --> 00:05:02.660
Actually, I might play around with this for my next project.

00:05:02.780 --> 00:05:07.120
I've got some stuff that may end up on PyPI soon and it'd be cool to do it.

00:05:07.120 --> 00:05:07.640
All right.

00:05:07.640 --> 00:05:13.000
So I've got the next item, and it's a bit of a skateboarding dog type of thing.

00:05:13.000 --> 00:05:16.660
It's not something I think a lot of us will take advantage of, but it's something that is

00:05:16.660 --> 00:05:22.580
pretty interesting as we kind of look at how Python is finding its way into the larger

00:05:22.580 --> 00:05:23.500
computing space.

00:05:23.640 --> 00:05:24.080
Yeah.

00:05:24.080 --> 00:05:28.040
And oh, Sam Morley out there in the live stream, before we move on, says it'd be

00:05:28.040 --> 00:05:32.060
really cool if you could point this at a GitHub repo and edit your repo directly.

00:05:32.060 --> 00:05:34.000
The readme directly on your repo.

00:05:34.000 --> 00:05:35.640
Yes, absolutely.

00:05:35.640 --> 00:05:36.380
That's fantastic.

00:05:36.380 --> 00:05:37.860
Yeah.

00:05:37.860 --> 00:05:38.860
That's a really good idea.

00:05:38.860 --> 00:05:39.520
Really good idea.

00:05:39.520 --> 00:05:40.160
All right.

00:05:40.160 --> 00:05:41.480
Back to my skateboarding dog.

00:05:41.480 --> 00:05:48.780
So there's a company called Cerebras, and this was sent over to us by Galen

00:05:48.780 --> 00:05:53.040
Swint, who is a PhD researcher who does high performance computing.

00:05:53.040 --> 00:05:53.880
And stuff.

00:05:53.880 --> 00:05:57.460
So in that world, I think this, this may be a real thing.

00:05:57.460 --> 00:06:01.260
You look through the article here that talks about this announcement and it's like, well,

00:06:01.260 --> 00:06:05.900
there's like these 12 customers or 15 customers of this, this chip.

00:06:05.900 --> 00:06:10.100
But for those of you watching, there's, or you check out the article, there's a woman holding

00:06:10.100 --> 00:06:11.000
a chip.

00:06:11.000 --> 00:06:13.740
And normally we think of computer chips as little tiny things.

00:06:13.740 --> 00:06:20.360
This is a 12 inch by 12 inch computer chip, or if you want to go metric, 30 centimeters by 30

00:06:20.360 --> 00:06:20.720
centimeters.

00:06:20.920 --> 00:06:23.700
It is a big, big computer chip.

00:06:23.700 --> 00:06:29.340
And the idea is we've had small little chips come along to do special types of processing.

00:06:29.340 --> 00:06:35.960
We've had GPUs come along and do, be adapted, I guess, for things like machine learning, training

00:06:35.960 --> 00:06:36.960
machine learning models.

00:06:36.960 --> 00:06:41.460
This thing just takes that idea to an entire new level.

00:06:41.460 --> 00:06:48.200
So for example, I'm always going on and on and raving about my Mac mini, my M1, where,

00:06:48.200 --> 00:06:51.240
it's a cheap little computer relative to Apple stuff, I guess.

00:06:51.240 --> 00:06:57.120
but it's, it's super fast, but it has four performance cores and four efficiency cores.

00:06:57.300 --> 00:06:57.920
That's it.

00:06:57.920 --> 00:07:01.600
Your GPU, if you've got a really high-end one, might have 4,000 cores.

00:07:01.600 --> 00:07:08.140
This insane little chip here has 850,000 AI cores on one chip.

00:07:08.140 --> 00:07:09.100
Is that insane?

00:07:09.100 --> 00:07:10.380
What do you think?

00:07:10.380 --> 00:07:16.000
I, I'm, I'm curious how they, I mean, this is, this is some major advances in, in wafer technology,

00:07:16.000 --> 00:07:19.560
because how do you get that big of a chip with no defects in it?

00:07:19.560 --> 00:07:20.040
Yeah.

00:07:20.040 --> 00:07:23.180
And they have apparently 100% efficiency.

00:07:23.180 --> 00:07:28.640
Well, first of all, one of the ways you do it is you use the TSMC foundry, who seems to

00:07:28.640 --> 00:07:32.160
be taking over all these small high efficiency, type of things.

00:07:32.160 --> 00:07:36.640
And so they had a previous one that they've more than doubled the core count for.

00:07:36.640 --> 00:07:41.960
And another way to kind of appreciate like how much is going on in this chip, I, you know,

00:07:41.980 --> 00:07:47.880
go back to my M1, it has 0.016 trillion transistors.

00:07:47.880 --> 00:07:49.420
This has 2.6 trillion.

00:07:49.420 --> 00:07:50.720
Or is that another way?

00:07:50.720 --> 00:07:55.440
2,600 billion transistors versus 1.6 billion.

00:07:55.440 --> 00:07:58.660
Like it's a 2,000 times more on this chip.

00:07:58.660 --> 00:08:00.040
So super, super cool.

00:08:00.040 --> 00:08:03.000
And now you may be wondering, all right, all this is interesting and chips are neat.

00:08:03.000 --> 00:08:05.300
What is the Python angle?

00:08:05.300 --> 00:08:08.040
Like, why would I bother putting this on here?

00:08:08.040 --> 00:08:11.280
Cause you know, we don't really talk about chips that much, except for when I go on and

00:08:11.280 --> 00:08:12.000
on about my M1.

00:08:12.000 --> 00:08:12.740
Here's the deal.

00:08:12.740 --> 00:08:17.840
If you scroll down in this article a little bit, you'll see users program this insane machine

00:08:17.840 --> 00:08:23.860
transparently in machine learning frameworks, such as specifically TensorFlow and PyTorch.

00:08:23.860 --> 00:08:25.940
Isn't that crazy?

00:08:25.940 --> 00:08:27.260
That's really interesting.

00:08:27.260 --> 00:08:28.600
Isn't it?

00:08:28.600 --> 00:08:28.760
Yeah.

00:08:28.760 --> 00:08:31.820
I said, I was just thinking about you as I'm going through this, cause you're working on the

00:08:31.820 --> 00:08:36.040
RAPIDS project, which is, it's not the same thing obviously, but it's kind of in that space,

00:08:36.040 --> 00:08:36.300
right?

00:08:36.300 --> 00:08:38.120
Yeah, it is.

00:08:38.120 --> 00:08:39.440
Have you heard of this before?

00:08:39.440 --> 00:08:41.260
No, I haven't heard of it.

00:08:41.260 --> 00:08:45.260
This is, yeah, this is really big and I have not heard of it.

00:08:45.260 --> 00:08:46.220
Yeah.

00:08:46.220 --> 00:08:49.260
I will get into reading a bit more about it after.

00:08:49.260 --> 00:08:49.940
Yeah.

00:08:49.940 --> 00:08:51.240
Yeah, for sure.

00:08:51.240 --> 00:08:52.680
So there's a lot of interesting things.

00:08:52.680 --> 00:08:57.180
And one of the, I can't remember where exactly they spoke about it, but they basically say,

00:08:57.280 --> 00:09:00.960
what you do is you program in TensorFlow and PyTorch as normal.

00:09:00.960 --> 00:09:07.740
And then they have this custom compiler that, that rewrites, that extracts this execution

00:09:07.740 --> 00:09:14.660
graph to actually scale out to the 850,000 cores, so the developers don't have to think about

00:09:14.660 --> 00:09:16.820
how they program against something like this.

00:09:17.280 --> 00:09:20.460
I don't want to spend too much time on this because there's something, my next item is

00:09:20.460 --> 00:09:20.920
super amazing.

00:09:20.920 --> 00:09:22.440
I want to take the time to dive into it.

00:09:22.440 --> 00:09:25.580
But there's another thing that's really interesting.

00:09:25.580 --> 00:09:30.040
Just as you look at it, like this thing takes an insane amount of power.

00:09:30.040 --> 00:09:36.240
Like, oh, for this one chip, you're going to need a four kilowatt power supply with up to

00:09:36.240 --> 00:09:39.040
a peak power of 23 kilowatts.

00:09:39.040 --> 00:09:40.380
Oh, wow.

00:09:41.180 --> 00:09:45.780
When you plug in an electric car at one of the high speed home chargers, that's seven

00:09:45.780 --> 00:09:47.300
kilowatts, just to give you a sense.

00:09:47.300 --> 00:09:51.100
This is like insane amounts for one chip, right?

00:09:51.100 --> 00:09:52.120
You could think of it as a supercomputer.

00:09:52.120 --> 00:09:53.300
Like it's one chip.

00:09:53.300 --> 00:09:54.020
So anyway.

00:09:54.020 --> 00:09:58.020
Our entire lab doesn't draw that much.

00:09:58.020 --> 00:10:03.140
So the reason I said it's a skateboarding dog thing is I don't think most of us will be

00:10:03.140 --> 00:10:06.140
able to ever even interact with one of these, much less buy one.

00:10:06.140 --> 00:10:09.780
They're going to be shipping in the later part of this year.

00:10:09.780 --> 00:10:13.320
And the price is something like $3 million plus.

00:10:13.320 --> 00:10:15.700
So this is certainly supercomputer level.

00:10:15.700 --> 00:10:21.500
But I do think it opens the door for really interesting stuff going on in the high performance

00:10:21.500 --> 00:10:22.500
Python space.

00:10:22.500 --> 00:10:23.240
So yeah.

00:10:23.240 --> 00:10:25.040
Glad that Galen sent it over.

00:10:25.040 --> 00:10:29.940
Well, I'm totally going to put $25 into Dogecoin so that I can afford this later this year.

00:10:29.940 --> 00:10:31.420
Oh, speaking of which.

00:10:31.420 --> 00:10:33.120
Exactly.

00:10:33.120 --> 00:10:38.300
Well, what about, I think maybe you get this and you create an AI that can more intelligently

00:10:38.300 --> 00:10:40.560
mine Dogecoin and then you take over the world.

00:10:40.560 --> 00:10:42.500
Just an investment.

00:10:42.500 --> 00:10:43.300
Yeah.

00:10:43.300 --> 00:10:43.960
Just an investment.

00:10:43.960 --> 00:10:45.080
All right.

00:10:45.080 --> 00:10:49.420
So speaking of large scale, high performance computing, Marlene, take it away.

00:10:49.420 --> 00:10:50.160
Sure.

00:10:50.160 --> 00:10:52.720
I have the next item, which is RAPIDS.

00:10:53.340 --> 00:10:58.020
And I wanted to speak about this because I'm working on it.

00:10:58.020 --> 00:11:00.700
And it's what I've been working on for, I think.

00:11:00.700 --> 00:11:01.140
Yeah.

00:11:01.140 --> 00:11:01.460
Wow.

00:11:01.460 --> 00:11:06.120
It's been about a year since I have been with NVIDIA.

00:11:07.300 --> 00:11:12.180
working as a software engineer there and working specifically on the RAPIDS project.

00:11:12.180 --> 00:11:19.820
And so RAPIDS, I think, is really interesting because the goal of RAPIDS, similarly, like the

00:11:19.820 --> 00:11:23.600
last thing Michael just showed us, is to speed up data science.

00:11:23.600 --> 00:11:25.080
But this is with GPUs.

00:11:25.080 --> 00:11:32.860
So I think it's been really cool to work on the RAPIDS project.

00:11:32.860 --> 00:11:37.120
And I think it's really interesting as well because it's open source.

00:11:37.120 --> 00:11:40.180
It's also, there's a lot of Python involved.

00:11:40.180 --> 00:11:44.820
So it's, well, it's not entirely, it's not mostly Python.

00:11:44.820 --> 00:11:49.020
Actually, there's a lot of C++ and CUDA code in there as well.

00:11:49.380 --> 00:11:55.300
But, you know, personally, my aim is not to learn CUDA.

00:11:55.300 --> 00:11:57.840
It's to try and avoid that as much as possible.

00:11:57.840 --> 00:12:01.440
And also avoid as much C++ as possible.

00:12:01.440 --> 00:12:03.640
So that's a bit more reasonable.

00:12:03.640 --> 00:12:11.740
But one of the goals of the RAPIDS project is to allow people who are Pythonistas to work

00:12:11.740 --> 00:12:19.140
primarily with GPUs and to get those speedups without having to know any CUDA code or

00:12:19.140 --> 00:12:21.760
to know any C++.

00:12:21.760 --> 00:12:29.600
And so I have been working primarily on the Python side of things and have really been enjoying it.

00:12:29.600 --> 00:12:34.880
I work specifically with the cuDF DataFrame library.

00:12:34.880 --> 00:12:42.920
And cuDF is basically a GPU DataFrame library that mirrors pandas.

00:12:43.060 --> 00:12:49.520
So if you have a data set and you'd like to do computations on your data set or do different

00:12:49.520 --> 00:12:55.720
operations on your data set, if you can do that with pandas, you should be able, hopefully,

00:12:55.720 --> 00:12:57.680
to do the same thing with cuDF.

00:12:57.680 --> 00:13:03.080
But the good thing is that it will be, it will, it will probably be faster.

00:13:03.080 --> 00:13:10.600
I actually can't definitively say that it will be faster because I remember when I first joined

00:13:10.600 --> 00:13:17.300
the project as well, I was, I really, I'm very enthusiastic and I really enjoy sort of sharing

00:13:17.300 --> 00:13:18.980
when I'm learning something new.

00:13:19.140 --> 00:13:25.300
And I remember I was, like, going around and speaking and saying that, you know, cuDF is

00:13:25.300 --> 00:13:29.000
so much better than pandas because it's just so much faster.

00:13:29.000 --> 00:13:35.740
And then my manager was just like, you need to stop saying that because it's not true all

00:13:35.740 --> 00:13:36.480
of the time.

00:13:36.560 --> 00:13:38.560
It's true most, like, some of the time.

00:13:38.560 --> 00:13:46.640
So for smaller data sets, it's probably better to stick with pandas because.

00:13:46.640 --> 00:13:47.620
Yeah.

00:13:47.620 --> 00:13:48.980
There's always this overhead, right?

00:13:48.980 --> 00:13:53.960
Like as you scale things out and stuff, there's probably like, well, how do we convert this over

00:13:53.960 --> 00:13:55.240
and get it onto the GPU?

00:13:55.240 --> 00:14:00.440
And if that process takes half the time of just doing the computation, you might

00:14:00.440 --> 00:14:01.660
as well just do the computation, right?
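
NOTE
Editor's sketch, not from the episode: a toy cost model for the overhead point made here.
The function name gpu_is_worth_it and every timing below are hypothetical illustrations,
not measurements of pandas, cuDF, or any real GPU.
```python
# Hypothetical cost model: all timings are made-up numbers in milliseconds.
def gpu_is_worth_it(cpu_ms, gpu_ms, transfer_ms):
    """Return True when copying data to the GPU plus GPU compute beats CPU compute."""
    return transfer_ms + gpu_ms < cpu_ms
# Small dataset: moving the data costs more than the work itself.
small = gpu_is_worth_it(cpu_ms=2.0, gpu_ms=0.5, transfer_ms=5.0)
# Large dataset: compute dominates, so the fixed transfer cost is amortized.
large = gpu_is_worth_it(cpu_ms=200.0, gpu_ms=4.0, transfer_ms=20.0)
print(small, large)
```
The only claim is qualitative: a fixed transfer cost is easy to amortize on large inputs and hard to justify on small ones.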

00:14:01.900 --> 00:14:02.260
Exactly.

00:14:02.260 --> 00:14:05.980
Like if you're already, if you're mainly doing, right, I agree.

00:14:05.980 --> 00:14:11.060
Like if you're working with smaller data sets and you are fine with that and that works for

00:14:11.060 --> 00:14:16.640
you and your time is not like being wasted a lot, then I would say, please go ahead and

00:14:16.640 --> 00:14:17.560
stick with pandas.

00:14:17.560 --> 00:14:24.100
But if you have, if you are working on like larger data sets and the larger your data sets

00:14:24.100 --> 00:14:27.600
get, the more the difference is going to be in terms of your speed up.

00:14:27.600 --> 00:14:33.820
So with very large data sets, cuDF is going to take a much shorter time to do computations

00:14:33.820 --> 00:14:34.600
and things like that.

00:14:34.600 --> 00:14:35.120
Yeah.

00:14:35.120 --> 00:14:38.820
You've actually put a really interesting example in the show notes here, right?

00:14:38.820 --> 00:14:40.960
Showing how many zeros is that?

00:14:40.960 --> 00:14:44.340
A hundred million items or something like that?

00:14:44.340 --> 00:14:44.820
Yeah.

00:14:44.820 --> 00:14:45.640
It's a hundred million.

00:14:45.640 --> 00:14:51.460
I just kind of like randomly chose a number to try and make it like, I didn't also want to

00:14:51.460 --> 00:14:58.120
take a number that was too big because I didn't want to spend like a long time doing it.

00:14:58.120 --> 00:15:02.600
And I know like for a lot of data scientists, like I think increasingly people are working

00:15:02.600 --> 00:15:06.560
with larger and larger data sets, just depending which field you're in.

00:15:06.560 --> 00:15:10.160
For the example, I put it on the show notes and it's on the screen right now.

00:15:10.680 --> 00:15:16.300
But if you take a Pandas sort of data frame and try and calculate the mean and you take

00:15:16.300 --> 00:15:22.480
the same cuDF data set and try and calculate the mean, it will take, I think I'm trying to

00:15:22.480 --> 00:15:22.900
look at the notes.

00:15:22.900 --> 00:15:32.400
It's 105 milliseconds for Pandas and it's like 1.83 milliseconds for cuDF, which is just,

00:15:32.400 --> 00:15:33.480
That's awesome.

00:15:33.480 --> 00:15:37.900
And that's like a smaller scale, I would say, data sets compared to some people.
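
NOTE
Editor's sketch, not from the episode: the speedup implied by the two timings quoted above,
worked out as plain arithmetic. The variable names are illustrative, and the result says
nothing about other operations, data sizes, or hardware.
```python
pandas_ms = 105.0  # pandas mean over roughly 100 million values, as quoted
cudf_ms = 1.83     # cuDF mean over the same data, as quoted
speedup = pandas_ms / cudf_ms
print(f"about {speedup:.1f}x faster")
```
So for this one example the GPU run is a bit under 60 times faster.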

00:15:39.060 --> 00:15:40.660
Yeah, just a hundred million.

00:15:40.660 --> 00:15:42.760
It's just a hundred million.

00:15:42.760 --> 00:15:44.260
So it's not a lot.

00:15:44.260 --> 00:15:45.560
I mean, it depends.

00:15:45.560 --> 00:15:51.400
But yeah, I think it's definitely significant once you get to a certain threshold, which is

00:15:51.400 --> 00:15:51.820
pretty cool.

00:15:51.820 --> 00:15:52.360
Yeah.

00:15:52.360 --> 00:15:52.640
Yeah.

00:15:52.640 --> 00:15:55.360
Over on the RAPIDS site, it's rapids.ai.

00:15:55.360 --> 00:15:58.340
It says it scales out on multiple GPUs.

00:15:58.340 --> 00:16:04.880
So seamlessly scale from a GPU workstation to multi GPU servers and multi node clusters working

00:16:04.880 --> 00:16:05.840
with Dask as well.

00:16:05.840 --> 00:16:09.760
So Dask is also kind of about scaling Pandas and combining those.

00:16:09.760 --> 00:16:10.700
That's pretty awesome.

00:16:10.700 --> 00:16:14.720
So I actually saw that you have a Dask course out.

00:16:14.720 --> 00:16:19.480
Like I recently thought, yes, definitely going to take that because I mean, yeah, check it

00:16:19.480 --> 00:16:19.580
out.

00:16:19.580 --> 00:16:21.440
That one's awesome.

00:16:21.440 --> 00:16:22.560
Yeah.

00:16:22.560 --> 00:16:22.860
Yeah.

00:16:22.860 --> 00:16:27.660
We put that together with Matthew Rocklin and team over at Coiled.

00:16:27.660 --> 00:16:27.960
Yeah.

00:16:27.960 --> 00:16:28.820
And that's actually free.

00:16:28.820 --> 00:16:30.660
So people can just drop in and take that course.

00:16:30.660 --> 00:16:32.820
I think maybe I can put it in the show notes at the end.

00:16:32.820 --> 00:16:34.620
I think it was just announced.

00:16:34.620 --> 00:16:35.180
Let's see.

00:16:35.180 --> 00:16:36.000
Very cool.

00:16:36.000 --> 00:16:37.060
No, that was a lot.

00:16:37.060 --> 00:16:37.660
That was last week.

00:16:37.660 --> 00:16:38.920
But yeah, this is super cool.

00:16:38.980 --> 00:16:43.420
And this one is certainly within a normal person's reach.

00:16:43.420 --> 00:16:45.300
You get a GPU and you're good to go.

00:16:45.300 --> 00:16:45.540
Right?

00:16:45.540 --> 00:16:46.440
Yeah.

00:16:46.440 --> 00:16:46.960
I think.

00:16:46.960 --> 00:16:47.420
Yeah.

00:16:47.420 --> 00:16:50.220
I mean, I'm just using it on my laptop with the GPU.

00:16:50.220 --> 00:16:52.040
You can also use it like online.

00:16:52.180 --> 00:16:58.580
So there's also a Colab notebook on the RAPIDS site.

00:16:58.580 --> 00:17:02.580
I think that you can click and then you can like kind of experiment if you just wanted to

00:17:02.580 --> 00:17:03.240
do it online.

00:17:03.240 --> 00:17:08.460
Or I think you can use any sort of online GPU that you have access to.

00:17:08.460 --> 00:17:14.280
So it's very, I think it's trying to make it more accessible, which is great.

00:17:14.280 --> 00:17:14.800
Yeah.

00:17:14.800 --> 00:17:15.740
That's super cool.

00:17:15.740 --> 00:17:16.080
Yeah.

00:17:16.080 --> 00:17:16.580
Very neat.

00:17:16.580 --> 00:17:19.400
Well, like I said, I think this is a cool project to be working on.

00:17:19.400 --> 00:17:20.860
So thanks for sharing it with us.

00:17:20.860 --> 00:17:21.400
No problem.

00:17:21.400 --> 00:17:23.980
Brian, is it time for the next one?

00:17:23.980 --> 00:17:26.000
Is it time for the next one?

00:17:26.000 --> 00:17:27.480
Yes.

00:17:27.480 --> 00:17:32.060
This was recommended by a listener, Ira Horeca, I think.

00:17:32.060 --> 00:17:35.040
So he mentioned this, and it's kind of a rabbit hole.

00:17:35.040 --> 00:17:37.580
I spent a whole bunch of time playing with all this stuff last night.

00:17:37.580 --> 00:17:39.320
He recommended datefinder.

00:17:39.320 --> 00:17:43.680
So this is a Python utility and it kind of is amazing.

00:17:43.680 --> 00:17:45.620
So it's a combination of a couple of things.

00:17:45.620 --> 00:17:53.780
But so he pointed us to a Calmcode video, which, you know, I'm totally a fan of Calmcode stuff

00:17:53.780 --> 00:17:57.900
because they kind of go through some of the Python libraries and some of the other, a lot

00:17:57.900 --> 00:18:01.660
of other things, but just have kind of a quick demo of what it does.

00:18:01.780 --> 00:18:03.120
And I really appreciate that.

00:18:03.120 --> 00:18:08.580
Actually, the demo here is better than the datefinder readme.

00:18:08.580 --> 00:18:12.180
So maybe I guess a pull request is necessary, but anyway.

00:18:12.360 --> 00:18:17.840
So what datefinder does is it takes... I'm going to scroll down a little bit.

00:18:17.840 --> 00:18:23.240
So datefinder takes... it parses dates, or finds them.

00:18:23.240 --> 00:18:29.520
So you give it a string or a list of strings or something, and it can find

00:18:29.520 --> 00:18:31.140
where the dates are in there.

00:18:31.140 --> 00:18:36.620
So if you've got a sentence or a paragraph or an entire page that has

00:18:36.620 --> 00:18:42.360
a whole bunch of dates in it, it'll find all of them and then return you a list of

00:18:42.360 --> 00:18:43.680
dates that it found.

00:18:43.680 --> 00:18:48.080
It actually does a whole bunch of things, but that's the default or the one that

00:18:48.080 --> 00:18:49.320
we're talking about, find_dates.

00:18:49.320 --> 00:18:54.940
There's a bunch of other less, less documented features of datefinder,

00:18:54.940 --> 00:18:58.120
but this, this is the one that is demonstrated here and it's pretty cool.

00:18:58.120 --> 00:19:02.860
So what it does, it finds those dates and then it converts them to

00:19:02.860 --> 00:19:03.400
datetimes.

00:19:03.400 --> 00:19:06.020
So find_dates will find them and convert them to datetimes.

00:19:06.140 --> 00:19:09.380
And it does that by passing them off to the dateutil library.

00:19:09.380 --> 00:19:14.660
So this is just kind of a really cool demo. The little video is

00:19:14.660 --> 00:19:19.280
a good demo of showing how to do this.

00:19:19.280 --> 00:19:21.800
I also really kind of liked this way to play.

00:19:21.800 --> 00:19:26.940
So the video shows this way to play with things: it just had a list of strings

00:19:26.940 --> 00:19:32.960
and then used a comprehension to call a function on a whole bunch

00:19:32.960 --> 00:19:33.380
of strings.

00:19:33.380 --> 00:19:37.920
And I thought this was just kind of a clever way to just play with a function that translates

00:19:37.920 --> 00:19:38.400
things.
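
NOTE
Editor's sketch, not from the episode or the video: one way the "list of strings plus a
comprehension" trick might look. The helper to_iso is hypothetical and uses only the
standard library's datetime.strptime rather than datefinder, so it runs with no installs.
```python
from datetime import datetime
# Hypothetical helper standing in for "a function that translates things".
def to_iso(text):
    """Parse a 'Month day, year' string and return it as an ISO date string."""
    return datetime.strptime(text, "%B %d, %Y").date().isoformat()
samples = ["March 12, 2010", "May 12, 2021", "January 1, 2000"]
# One comprehension runs the function over every sample at once.
results = {s: to_iso(s) for s in samples}
for text, iso in results.items():
    print(f"{text!r} = {iso}")
```
Swapping in a different function or adding another sample string is a one-line change, which is what makes this a handy way to explore.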

00:19:38.400 --> 00:19:39.880
This is a neat thing to do.

00:19:39.880 --> 00:19:41.520
Parsing dates is probably so hard.

00:19:41.520 --> 00:19:41.820
Yeah.

00:19:41.820 --> 00:19:45.100
It's super hard, but normally because it's so picky, right?

00:19:45.100 --> 00:19:50.240
You've got to go to the date time parsing language, almost look up.

00:19:50.320 --> 00:19:55.020
So if I put percent D, D, D, D, D, that might mean the year, but if it's capital D, it might

00:19:55.020 --> 00:19:55.880
mean something different.

00:19:55.880 --> 00:20:01.280
So you might say month, day, comma year, but like, there's an example here with month, day,

00:20:01.280 --> 00:20:04.320
year without the comma is like March 12, 2010.

00:20:04.320 --> 00:20:06.460
But if they forget the comma, it won't parse.

00:20:06.460 --> 00:20:10.300
And like all those things are really annoying about working with reading, converting strings

00:20:10.300 --> 00:20:10.640
to dates.
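
NOTE
Editor's sketch, not from the episode: the "forget the comma and it won't parse" behavior,
shown with the standard library's datetime.strptime. The format string and variable names
are illustrative choices.
```python
from datetime import datetime
FORMAT = "%B %d, %Y"  # full month name, day, a literal comma, four-digit year
with_comma = datetime.strptime("March 12, 2010", FORMAT)
print(with_comma.date())
try:
    datetime.strptime("March 12 2010", FORMAT)  # same date, comma missing
    missing_comma_ok = True
except ValueError:
    # strptime requires the input to match the format exactly.
    missing_comma_ok = False
print(missing_comma_ok)
```
A fixed format string accepts exactly one layout, which is the pickiness being described here.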

00:20:10.920 --> 00:20:13.080
And this looks like it just, it doesn't care.

00:20:13.080 --> 00:20:13.580
It's nice.

00:20:13.580 --> 00:20:14.180
Yeah.

00:20:14.180 --> 00:20:18.140
And then it also has kind of a nice, clean interface to it as well.

00:20:18.140 --> 00:20:22.660
And with limited documentation, it's just a focused tool, which is nice.

00:20:23.020 --> 00:20:27.540
And it's interesting that this is just a focused tool that apparently a lot of people need

00:20:27.540 --> 00:20:31.580
because according to GitHub, there's 662 projects using this.

00:20:31.580 --> 00:20:34.340
So, it's used kind of all over the place.

00:20:34.340 --> 00:20:40.000
Behind the scenes, though, it's taking the dates that it found in the strings

00:20:40.000 --> 00:20:41.880
and passing those to dateutil.

00:20:41.880 --> 00:20:47.940
So if you want to avoid the finding part, this actually is also a good library to look at for

00:20:47.940 --> 00:20:52.860
the usage of how to use dateutil to easily convert dates.

00:20:52.860 --> 00:20:56.200
And dateutil is kind of an amazing tool as well.

00:20:56.200 --> 00:21:00.800
And it gives you... I told you this was a rabbit hole.

00:21:00.800 --> 00:21:05.520
One of the cool things about it is it doesn't just parse dates, but you can do relative

00:21:05.520 --> 00:21:05.900
dates.

00:21:05.900 --> 00:21:10.280
You can say like today plus three weeks or something, and it'll figure that out.

00:21:10.280 --> 00:21:16.180
And then you can take two dates and do date math with

00:21:16.180 --> 00:21:16.840
it really well.

00:21:16.840 --> 00:21:21.460
And also dateutil has amazing time zone support, probably the best in Python.

00:21:21.720 --> 00:21:24.260
So, this is pretty cool.

00:21:24.260 --> 00:21:27.020
Also, I was looking through the test code.

00:21:27.020 --> 00:21:33.860
The test code for dateutil is kind of a neat mix of unittest and

00:21:33.860 --> 00:21:34.120
pytest.

00:21:34.120 --> 00:21:36.560
It's a good example of how to do both.

00:21:36.560 --> 00:21:41.940
And I like that some of the newer stuff is using pytest with parametrization, but it's

00:21:41.940 --> 00:21:42.140
good.

00:21:42.140 --> 00:21:42.640
Yeah.

00:21:42.640 --> 00:21:43.580
I like this a lot.

00:21:43.580 --> 00:21:44.520
Marlene, what do you think?

00:21:44.520 --> 00:21:45.680
Yeah, I like it.

00:21:45.720 --> 00:21:48.800
I think it, I'm not actually working with dates quite often.

00:21:48.800 --> 00:21:54.840
So I'm, I'm trying to think of use cases for myself other than like maybe converting time

00:21:54.840 --> 00:21:56.540
zones, which is like a nightmare.

00:21:56.540 --> 00:22:00.520
so maybe, Oh, you can say that again.

00:22:00.520 --> 00:22:01.260
Oh my gosh.

00:22:01.260 --> 00:22:07.280
Maybe you said that, but it looks like it would be really useful for people that are, yeah.

00:22:07.280 --> 00:22:07.320
Yeah.

00:22:07.320 --> 00:22:08.420
Yeah.

00:22:10.020 --> 00:22:14.160
I'm showing some of the examples from dateutil of how to use it.

00:22:14.160 --> 00:22:20.320
And I imagine this is one of the reasons why datefinder is so used because,

00:22:20.320 --> 00:22:23.180
it's non-trivial even to use dateutil.

00:22:23.180 --> 00:22:25.560
So yeah, that's cool.

00:22:25.560 --> 00:22:26.060
Cool.

00:22:26.060 --> 00:22:26.220
Cool.

00:22:26.220 --> 00:22:26.500
All right.

00:22:26.500 --> 00:22:31.720
Well, I got the next one and this one doesn't exactly come to us from Anthony Shaw, but I was

00:22:31.720 --> 00:22:35.600
talking to Anthony about something else and he's like, Oh, have you heard of this?

00:22:35.600 --> 00:22:37.180
Have you heard of Cinder?

00:22:37.640 --> 00:22:39.940
And Cinder is pretty awesome.

00:22:39.940 --> 00:22:45.340
So Anthony's doing interesting work around Python and performance at the CPython level,

00:22:45.340 --> 00:22:49.660
especially now. I think he's given a talk on Pyjion or Pyston, Pyston,

00:22:49.660 --> 00:22:50.440
I believe it is.

00:22:50.440 --> 00:22:51.980
I'm not a hundred percent sure.

00:22:51.980 --> 00:22:55.700
I might be remembering the wrong one. That's at PyCon, which, you know, we're going to talk

00:22:55.700 --> 00:22:57.240
more about that in just a second as well.

00:22:57.240 --> 00:23:03.280
But Cinder is a really interesting fork of CPython from Instagram.

00:23:03.280 --> 00:23:06.000
So it's under the Facebook incubator project.

00:23:06.500 --> 00:23:08.540
And I think we've mentioned it before.

00:23:08.540 --> 00:23:12.520
I definitely have talked about it before other presentations that Instagram has done really

00:23:12.520 --> 00:23:16.960
interesting things like disable the garbage collector, just turn it off a hundred percent.

00:23:16.960 --> 00:23:23.260
And they got less memory usage, not more memory usage by just allowing the cycles to leak, which

00:23:23.260 --> 00:23:23.820
is insane.

00:23:23.820 --> 00:23:27.520
But this is like, speaking of insane, this takes it to a whole nother level.

00:23:27.520 --> 00:23:31.680
So this is that they've been doing all these low level things inside of CPython.

00:23:31.800 --> 00:23:33.100
This is based on 3.8.

00:23:33.100 --> 00:23:37.640
Hopefully some of these ideas can be brought forward and shared with everyone because there's

00:23:37.640 --> 00:23:38.340
a lot going on.

00:23:38.340 --> 00:23:39.920
So let me just cruise down here.

00:23:39.920 --> 00:23:44.500
I'll just read the little intro part because it's jam packed.

00:23:44.500 --> 00:23:45.600
And then I'll go into some of the details.

00:23:45.840 --> 00:23:50.820
So this is the internal performance oriented production version of CPython 3.8.

00:23:50.820 --> 00:23:53.540
And it contains a number of performance optimizations.

00:23:53.540 --> 00:23:56.680
I feel like performance is some sort of theme of this episode.

00:23:56.680 --> 00:24:04.400
It includes bytecode inline caching, eager evaluations of coroutines, a JIT, just in time compiler,

00:24:04.920 --> 00:24:10.100
an experimental bytecode compiler that uses type annotations in some incredibly interesting

00:24:10.100 --> 00:24:14.760
ways to emit type-specialized bytecode that performs better.

00:24:14.760 --> 00:24:22.220
So just to give you an example, one of the reasons that math in the pure Python layer is slower

00:24:22.220 --> 00:24:27.860
than, say, C++ or C# is that C++ and C# work with just the value.

00:24:27.860 --> 00:24:32.000
So if you have the value seven, you might have two or four bytes that represent the value seven

00:24:32.000 --> 00:24:38.340
in Python, you have a PyObject pointer, pointing out to a roughly 28-byte thing on

00:24:38.340 --> 00:24:40.440
the heap that represents the number seven.

00:24:40.440 --> 00:24:45.280
And it's a whole lot more work to interact with that and set the reference count on that

00:24:45.280 --> 00:24:47.940
and so on instead of just working with the value seven.
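
NOTE
Editor's sketch, not from the episode: a rough way to see the boxed-versus-raw difference on your own interpreter.
The exact sizes are implementation details of CPython and the platform, so treat the printed numbers as indicative only.
The array module is used here only as a stand-in for "just the value"; it is not how Cinder represents primitives.
```python
import sys
from array import array
boxed = 7
# Size in bytes of the whole Python int object (header plus digits); implementation-specific.
boxed_size = sys.getsizeof(boxed)
# An array of C ints stores raw machine values, with no per-element Python object.
raw = array("i", [7])
raw_size = raw.itemsize
print(f"Python int object: {boxed_size} bytes, raw C int: {raw_size} bytes")
```
On a typical 64-bit CPython build the first number is commonly reported as 28, which matches the figure mentioned above.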

00:24:47.940 --> 00:24:48.440
Right.

00:24:48.440 --> 00:24:54.980
So one of the things they do is they use Python type annotations

00:24:54.980 --> 00:24:56.720
to understand, oh, this is an integer.

00:24:56.720 --> 00:25:01.620
This is a long and so on type of thing and actually convert those to the,

00:25:01.780 --> 00:25:03.640
the machine oriented numbers.

00:25:03.640 --> 00:25:03.960
Right.

00:25:03.960 --> 00:25:05.760
So just the value four instead of a pointer.

00:25:05.760 --> 00:25:07.500
And then it will use what's called boxing.

00:25:07.500 --> 00:25:12.880
If something else that's outside of this world needs it, it'll up-level that to like a PyLong

00:25:12.880 --> 00:25:15.740
object pointer type thing and hand it off.

00:25:15.740 --> 00:25:17.920
So there's, there's all sorts of stuff like that going on.

00:25:17.920 --> 00:25:20.740
Interestingly, the first question is, is this supported?

00:25:20.740 --> 00:25:22.460
No, not supported.

00:25:26.460 --> 00:25:28.360
But there's some interesting things going on here.

00:25:28.360 --> 00:25:33.820
And all of this has to be taken with an understanding that it's in a very specific

00:25:33.820 --> 00:25:36.420
context and that may or may not be useful for you.

00:25:36.420 --> 00:25:40.640
You know, Brian had pointed out some articles and ideas around that.

00:25:40.700 --> 00:25:41.320
You're not Instagram.

00:25:41.320 --> 00:25:42.680
You're not Facebook.

00:25:42.680 --> 00:25:44.320
You're not Netflix.

00:25:44.320 --> 00:25:44.940
And so on.

00:25:44.940 --> 00:25:49.380
Most of the time people are building much smaller software with different constraints.

00:25:49.560 --> 00:25:54.720
So they start out by saying, look, Instagram uses a multi-process web server architecture

00:25:54.720 --> 00:26:00.620
where the parent process starts, performs initialization, and then forks 10 worker processes

00:26:00.620 --> 00:26:01.540
to handle requests.

00:26:01.540 --> 00:26:03.060
This is super common.

00:26:03.060 --> 00:26:06.080
Like for example, Talk Python training literally does exactly this.

00:26:06.080 --> 00:26:07.700
It uses uWSGI.

00:26:07.700 --> 00:26:11.420
It starts up and it creates 10 worker processes to handle like people wanting to take courses.

00:26:11.620 --> 00:26:15.980
So it's not uncommon in the web, but it's, it's not how all Python code runs.

00:26:15.980 --> 00:26:21.460
And so the first optimization they did is they created what are called immortal instances.

00:26:21.460 --> 00:26:26.420
The reason they were so focused on the garbage collector and all those sorts of things was

00:26:26.420 --> 00:26:32.280
when you fork these processes, initially there's a bunch of memory that can be shared and that

00:26:32.280 --> 00:26:37.140
helps with cache locality that helps with overall memory usage, all sorts of things.

00:26:37.540 --> 00:26:42.380
But as soon as something has changed about one of those items, it has to copy a whole page

00:26:42.380 --> 00:26:42.740
of memory.

00:26:42.740 --> 00:26:49.880
And they realized that when an object's reference count is modified in one of the processes, it

00:26:49.880 --> 00:26:55.340
has to copy, replicate, and sort of fork off a bunch of the memory that used to be shared

00:26:55.340 --> 00:26:56.400
across all those processes.

00:26:56.400 --> 00:27:02.240
So they created what they call immortal instances that don't participate in reference

00:27:02.240 --> 00:27:03.720
counting or garbage collection.

00:27:03.720 --> 00:27:07.320
And that prevents their reference count from changing, so they can be shared.

00:27:07.480 --> 00:27:12.020
So they can mark like a whole bunch of the startup stuff as like, just don't even look at this

00:27:12.020 --> 00:27:13.880
or change it and don't do reference counting on it.
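
NOTE
Editor's sketch, not from the episode: why reference counting defeats memory sharing after a fork.
In stock CPython, merely taking another reference to an object writes to that object's header, even if its contents never change.
The names config and reader are hypothetical; this shows the refcount write only, not the fork or Cinder's immortal instances.
```python
import sys
# A fresh list is an ordinary refcounted object (some built-in singletons are treated specially in newer CPython).
config = ["loaded", "at", "startup"]
before = sys.getrefcount(config)
reader = config  # only taking another reference; the list contents are untouched
after = sys.getrefcount(config)
print(f"refcount before: {before}, after: {after}")
```
That header write is what dirties a shared memory page in a forked worker, which is the cost immortal instances are meant to avoid.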

00:27:13.880 --> 00:27:17.820
So in their world, it made things faster, but it doesn't always. They said it's

00:27:17.820 --> 00:27:22.200
a little bit slower in straight-line code, but in this sort of fork world, it's

00:27:22.200 --> 00:27:22.380
better.

00:27:22.380 --> 00:27:27.340
The next one is shadow bytecode, which is an inline caching implementation.

00:27:28.000 --> 00:27:34.800
And it goes through and applies optimizations in certain cases for generic Python opcodes.

00:27:34.800 --> 00:27:41.000
And it'll observe those for functions that take a lot of time and dynamically replace those

00:27:41.000 --> 00:27:44.940
with specialized opcodes that it thinks are going to be better.

00:27:44.940 --> 00:27:49.660
Another thing it does that's pretty interesting is it will eagerly evaluate coroutines.

00:27:49.660 --> 00:27:54.520
So if I say this is an async method, and then in that method, I await some function

00:27:54.520 --> 00:27:58.000
call, normal Python is going to create a coroutine.

00:27:58.000 --> 00:28:01.540
It's going to schedule it on the asyncio event loop, and it's going to get to it.

00:28:01.540 --> 00:28:02.760
And that's a lot of overhead.

00:28:02.760 --> 00:28:05.380
But maybe the first thing that function says inside is:

00:28:05.380 --> 00:28:11.020
in this case, just return the cached answer; otherwise go to the database,

00:28:11.020 --> 00:28:13.100
await the response and so on.

00:28:13.100 --> 00:28:17.820
And what they realized is if it's going to go through that first case, it's not actually

00:28:17.820 --> 00:28:18.620
awaiting something.

00:28:18.620 --> 00:28:25.240
So they'll actually execute the awaited thing up until it actually needs to become async.

00:28:25.240 --> 00:28:29.760
So it'll like look or effectively look inside the function and say, is the path we're going

00:28:29.760 --> 00:28:31.900
on this time going to be async or not?

00:28:31.900 --> 00:28:37.300
And if the answer is no, it will run it without async, which means it skips all that context switching

00:28:37.300 --> 00:28:38.640
and all that stuff, which is pretty crazy.
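
NOTE
Editor's sketch, not from the episode: the cached fast path described above, driven by hand to show that it never suspends.
CACHE, lookup, fetch_from_db and completes_without_suspending are hypothetical names; asyncio.sleep(0) stands in for real I/O.
Cinder does its eager evaluation inside the interpreter; this only illustrates the observation it relies on.
```python
import asyncio
CACHE = {"a": 1}
async def fetch_from_db(key):
    await asyncio.sleep(0)  # stand-in for a real database round trip; this suspends once
    return len(key)
async def lookup(key):
    if key in CACHE:
        return CACHE[key]  # fast path: nothing is awaited here
    value = await fetch_from_db(key)
    CACHE[key] = value
    return value
def completes_without_suspending(coro):
    """Step the coroutine once; report whether it finished before suspending."""
    try:
        coro.send(None)
    except StopIteration as stop:
        return True, stop.value
    coro.close()  # it suspended, so discard it without finishing
    return False, None
print(completes_without_suspending(lookup("a")))
print(completes_without_suspending(lookup("missing")))
```
The cached call finishes on its first step, so scheduling it on an event loop would be pure overhead; the uncached call genuinely needs to suspend.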

00:28:38.640 --> 00:28:44.940
It also has the Cinder JIT, which is a method-at-a-time JIT compiler.

00:28:44.940 --> 00:28:48.980
Think C#, Java, maybe even JavaScript's V8.

00:28:48.980 --> 00:28:52.720
So it's enabled for every function that is called.

00:28:52.720 --> 00:28:54.560
Actually, it's not.

00:28:54.560 --> 00:28:54.740
Sorry.

00:28:54.740 --> 00:28:55.960
If it is, it'll make it slow.

00:28:55.960 --> 00:28:59.540
So you can basically say which functions should be optimized.

00:28:59.540 --> 00:29:03.840
But they say it supports almost everything that Python can do.

00:29:03.840 --> 00:29:09.140
And it has a 1.5 to 4 times speed up of the Python benchmarks, which is pretty interesting.

00:29:09.140 --> 00:29:15.700
They also have this thing called strict modules, which is actually a static analyzer capable of

00:29:15.700 --> 00:29:20.820
validating top level code to see if a module has side effects and can treat it differently.

00:29:20.820 --> 00:29:27.740
If it doesn't, you can have an immutable strict module type that is sort of a replacement for

00:29:27.740 --> 00:29:31.880
Python's regular module that behaves and loads differently and so on.

00:29:32.340 --> 00:29:36.820
And then the thing I talked about, the numbers more broadly is under this category of static

00:29:36.820 --> 00:29:37.200
Python.

00:29:37.200 --> 00:29:43.200
It's an experimental bytecode compiler that makes use of type annotations to emit better

00:29:43.200 --> 00:29:43.540
things.

00:29:43.540 --> 00:29:44.860
And check this out.

00:29:44.860 --> 00:29:48.180
It can deliver performance similar to mypyc or Cython.

00:29:48.180 --> 00:29:55.540
And this thing will go up to seven times faster than regular Python for the Richards benchmarks.

00:29:55.540 --> 00:29:59.100
And I don't know if the 4x improvement before is like in addition to this.

00:29:59.100 --> 00:30:01.280
So you get 28 or you just get seven.

00:30:01.360 --> 00:30:02.040
I don't really know.

00:30:02.040 --> 00:30:07.620
But there's a lot of things going on here and a lot of different ideas about how this

00:30:07.620 --> 00:30:07.920
works.

00:30:07.920 --> 00:30:12.980
So I'm just scratching the surface on the details, but I feel like I've gone on and on about it.

00:30:12.980 --> 00:30:14.340
That's really interesting.

00:30:14.340 --> 00:30:17.800
I saw, I think, is there a talk about it at Python?

00:30:17.800 --> 00:30:18.480
At PyCon?

00:30:18.480 --> 00:30:19.580
I think that's...

00:30:19.580 --> 00:30:20.560
It is coming up.

00:30:20.560 --> 00:30:20.900
Yes.

00:30:20.900 --> 00:30:22.680
They're going to give a talk on this at PyCon.

00:30:22.680 --> 00:30:22.940
Yeah.

00:30:22.940 --> 00:30:26.820
It was one of the talks I was looking forward to listening to.

00:30:26.820 --> 00:30:28.340
Yeah.

00:30:28.340 --> 00:30:34.700
Just because I think it's super interesting to be able to kind of play around with that they

00:30:34.700 --> 00:30:37.820
were able to kind of make their own version of Python.

00:30:37.820 --> 00:30:39.120
And it might...

00:30:39.120 --> 00:30:39.760
I don't know.

00:30:39.760 --> 00:30:41.080
Like, I think that there's...

00:30:41.080 --> 00:30:48.120
Like you mentioned, Anthony, and I also know Victor, I think, and someone else who are also

00:30:48.120 --> 00:30:51.880
working on, like, sub-interpreters and different things to make Python faster.

00:30:52.280 --> 00:30:58.140
So I'm really curious to see if, like, the core devs or people will also be, like, listening

00:30:58.140 --> 00:31:02.040
to this talk and maybe take some ideas from it.

00:31:02.040 --> 00:31:05.160
It would be really cool to kind of see.

00:31:05.160 --> 00:31:09.400
And I mean, it's always good to get speed-ups, even if they...

00:31:09.400 --> 00:31:09.800
I don't know.

00:31:09.800 --> 00:31:16.940
I don't know if it will help, like, general, like, normal Python users, but I think it's

00:31:16.940 --> 00:31:17.820
always good to look into.

00:31:17.820 --> 00:31:19.120
Yeah.

00:31:19.120 --> 00:31:19.520
Yeah.

00:31:19.520 --> 00:31:19.600
Yeah.

00:31:19.600 --> 00:31:20.220
I agree.

00:31:20.300 --> 00:31:26.800
I think some things here are absolutely transferable to regular general-purpose CPython,

00:31:26.800 --> 00:31:27.940
and some of them might not be.

00:31:27.940 --> 00:31:32.980
For example, the immortal instances, that might be a thing that just...

00:31:32.980 --> 00:31:36.820
They do that, and it makes sense for their large-scale farm of servers.

00:31:36.820 --> 00:31:43.740
But the JIT that takes the type information and does math many, many times faster, that...

00:31:43.740 --> 00:31:44.620
Everybody would want that.

00:31:44.620 --> 00:31:46.880
Like, we all work with numbers at some level or another.

00:31:46.880 --> 00:31:47.900
Right?

00:31:47.900 --> 00:31:50.620
Well, one of the things I love about the...

00:31:50.620 --> 00:31:54.000
I mean, this kind of applies to all of these sort of speed-up projects.

00:31:54.000 --> 00:31:57.680
One of the things I love about Python is just the generalness of it.

00:31:57.680 --> 00:31:59.020
You can throw it.

00:31:59.020 --> 00:32:00.620
Data structures can hold anything.

00:32:00.620 --> 00:32:02.580
So it's...

00:32:02.580 --> 00:32:09.280
But there are times where you really are using a huge array of floats or a huge array of integers

00:32:09.280 --> 00:32:12.940
or a huge array of, like, a fixed data size.

00:32:12.940 --> 00:32:15.860
Those are times where you...

00:32:15.860 --> 00:32:17.320
I don't need it to be generic.

00:32:17.320 --> 00:32:18.280
I just need it...

00:32:18.280 --> 00:32:19.520
I need it to be fast.

00:32:19.520 --> 00:32:21.800
So having something...

00:32:21.800 --> 00:32:26.280
That's the part where I think it would be interesting to pull into regular Python.

00:32:26.280 --> 00:32:30.860
But don't we get that with, like, some of the data science stuff anyway?

00:32:30.860 --> 00:32:32.980
Some of the number...

00:32:32.980 --> 00:32:34.320
With, like, NumPy and stuff?

00:32:34.320 --> 00:32:35.080
Yeah, you do.

00:32:35.080 --> 00:32:38.320
But you can't do generic programming with it, right?

00:32:38.320 --> 00:32:41.500
You do, like, sort of matrix math type of things.

00:32:41.500 --> 00:32:42.520
And this one...

00:32:42.520 --> 00:32:44.980
Like, the answer used to be, okay, well, this function is slow.

00:32:44.980 --> 00:32:48.320
This serialization deserialization section might be slow.

00:32:48.320 --> 00:32:50.440
So rewrite that in Cython, for example.

00:32:50.440 --> 00:32:57.340
And what's really cool about this is you can write regular Python and just put type annotations on it.

00:32:57.440 --> 00:32:59.240
And then it goes as fast as Cython.

00:32:59.240 --> 00:33:04.280
And you don't even have to do, like, a separate compiler, I believe, in this world, right?

00:33:04.280 --> 00:33:05.120
Because they have...

00:33:05.120 --> 00:33:06.360
The JIT just knows that.

00:33:06.360 --> 00:33:07.360
And then we'll...

00:33:07.360 --> 00:33:09.780
Like, as you run it, it'll just compile and run it.

00:33:09.780 --> 00:33:11.180
So, which is...

00:33:11.180 --> 00:33:16.640
I think it just sort of makes some of those ideas closer and more automatic for most people.

00:33:16.640 --> 00:33:23.920
I kind of think I foresee a future where we have sort of some types that affect runtime.

00:33:23.920 --> 00:33:35.120
There's, like, this tension that I sense in the Python core people of whether or not types should be just an afterthought or whether they should be really part of the runtime.

00:33:35.120 --> 00:33:41.660
And I think there are some cases where having them be part of the runtime might be a good thing.

00:33:42.020 --> 00:33:48.520
Yeah, and this is interesting because what they do is they define these static modules.

00:33:48.520 --> 00:33:51.020
And then in there, they can treat them differently.

00:33:51.020 --> 00:33:59.580
I feel like I always see on Twitter some people kind of, like, ranting about how they don't like that direction that Python is going in.

00:33:59.580 --> 00:34:03.200
Like, the idea of putting in, like, annotations and things like that.

00:34:03.280 --> 00:34:06.820
I've seen some people that are not super big fans of that.

00:34:06.820 --> 00:34:08.360
I'm not really sure why.

00:34:08.360 --> 00:34:12.280
I generally would like to understand, like, I think most people...

00:34:12.280 --> 00:34:14.180
Or not most people.

00:34:14.180 --> 00:34:19.500
But I think some people would prefer Python to maybe remain as it is.

00:34:19.500 --> 00:34:24.340
But I do think that there's, like, just having it be a bit lesser in a couple of cases would be helpful.

00:34:24.340 --> 00:34:25.560
So, I don't know.

00:34:25.560 --> 00:34:27.120
I don't know if it's in that direction.

00:34:28.140 --> 00:34:28.780
I'm with you.

00:34:28.780 --> 00:34:34.920
And one of the things they point out in this readme announcing the project is that you can still do gradual typing.

00:34:34.920 --> 00:34:37.680
So, you can, in some places, have no types.

00:34:37.680 --> 00:34:39.460
In some places, have some types.

00:34:39.460 --> 00:34:42.460
And the thing can convert and just deal with that automatically.

00:34:42.460 --> 00:34:48.640
And I think that's the reason that the types are really welcome in Python is because you can use them if you want, but you don't have to.
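
NOTE
Editor's sketch, not from the episode: what "optional" means in stock CPython, using two hypothetical functions.
Annotated and unannotated code can sit side by side, and the regular interpreter stores annotations without enforcing them.
Nothing here is Cinder-specific; it only shows the opt-in starting point that a type-aware compiler could build on.
```python
def scale(x: int, factor: int) -> int:
    return x * factor
def scale_untyped(x, factor):
    return x * factor
# Both versions behave the same when called with integers.
print(scale(3, 4), scale_untyped(3, 4))
# The annotations are recorded as plain metadata on the function object.
print(scale.__annotations__)
# Stock CPython does not check them at call time, so a string argument is accepted.
print(scale("ab", 2))
```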

00:34:48.800 --> 00:34:55.100
As opposed to places like TypeScript, which said, well, JavaScript doesn't have types, so we're going to add this very strict type system.

00:34:55.100 --> 00:34:59.860
And if you don't fit it exactly, we're going to not compile and complain, and it's going to be really not good.

00:34:59.860 --> 00:35:05.820
This feels like it continues that forgiving nature of Python to let you opt into it.

00:35:05.820 --> 00:35:07.360
But if you do, it can go faster.

00:35:07.360 --> 00:35:09.840
That's the direction I'd like to see.

00:35:09.840 --> 00:35:16.400
I personally would like to see types be really a full-fledged feature of Python.

00:35:16.400 --> 00:35:17.520
I agree.

00:35:17.520 --> 00:35:23.860
I love that they're optional, but if they're there, let's see how much we can do and improve things with them, right?

00:35:23.860 --> 00:35:24.620
A hundred percent.

00:35:24.620 --> 00:35:25.540
Yeah.

00:35:25.540 --> 00:35:26.180
All right.

00:35:26.180 --> 00:35:27.200
Marlene, you got the last one.

00:35:27.200 --> 00:35:28.240
I got it on screen for you.

00:35:28.240 --> 00:35:28.680
Okay.

00:35:28.680 --> 00:35:29.120
Yes.

00:35:29.120 --> 00:35:34.860
The last one for today is PyCon US, which I'm very excited about.

00:35:34.860 --> 00:35:39.060
It started today, which is really great.

00:35:39.060 --> 00:35:41.440
Are both of you attending?

00:35:41.440 --> 00:35:42.380
I don't know if you're attending.

00:35:42.380 --> 00:35:45.760
Yes, absolutely.

00:35:45.760 --> 00:35:46.000
Okay.

00:35:46.000 --> 00:35:48.220
Brian, are you attending?

00:35:48.220 --> 00:35:48.740
Yay.

00:35:48.740 --> 00:35:50.260
Yeah.

00:35:50.260 --> 00:35:58.660
I think it's such a great event in terms of the fact that I know it's PyCon US, but, at the moment, it's the largest

00:35:58.660 --> 00:36:08.440
Python gathering on Earth, I think, which is very cool because it means that you can meet people from all around the world.

00:36:08.440 --> 00:36:19.840
I remember, I'm really sad that it's not in person because last year, not last year, but the year before that, that's where I actually met you, Michael, for the first time.

00:36:19.840 --> 00:36:27.220
I think we were literally, I think we were at a table with you and Anthony Shaw and Łukasz Langa.

00:36:27.220 --> 00:36:29.800
And I was just randomly there.

00:36:29.800 --> 00:36:32.580
But it was such a cool discussion.

00:36:32.580 --> 00:36:40.780
And I really love the idea of being able to be in a room with people that are contributing to Python.

00:36:40.780 --> 00:36:43.740
That's my favorite part of PyCon.

00:36:43.740 --> 00:36:44.080
Yeah.

00:36:44.080 --> 00:36:45.580
It was so nice to meet you as well.

00:36:45.800 --> 00:36:48.280
That is actually my favorite part of PyCon.

00:36:48.280 --> 00:36:48.580
Yeah.

00:36:48.580 --> 00:36:54.800
It's just that you happen to end up at a table or out for a beer or coffee with this group of people.

00:36:54.800 --> 00:36:59.240
And you're like, wow, I got these connections and this experience that just, I wouldn't.

00:36:59.240 --> 00:37:01.460
So I'm very much looking forward to coming back in person.

00:37:01.580 --> 00:37:04.080
But there's a bunch of great talks coming up.

00:37:04.080 --> 00:37:04.480
Exactly.

00:37:04.880 --> 00:37:11.000
So this year, although it's online, the online platform is very cool.

00:37:11.000 --> 00:37:13.200
And there's still lots of great talks to watch.

00:37:13.200 --> 00:37:17.460
In the show notes, I put down a list of the talks that I'm excited to watch.

00:37:17.600 --> 00:37:23.300
But I also want to just put in a word for the things that I will be doing at PyCon US this year.

00:37:23.300 --> 00:37:29.340
And the first thing I'm going to be doing is I'm going to be hosting the diversity and inclusion work group discussion,

00:37:29.340 --> 00:37:36.560
along with four other really amazing women that are part of the diversity and inclusion work group.

00:37:36.560 --> 00:37:43.220
I do want to comment here because I got like, we got some comments about it, some feedback.

00:37:43.860 --> 00:37:49.940
I posted a picture of like a group that's going to be having this discussion or hosting this panel and it's all women.

00:37:49.940 --> 00:37:52.340
And someone was just like, why is it all women?

00:37:52.340 --> 00:37:53.440
How is this diversity?

00:37:53.440 --> 00:37:57.040
So I do want to throw it out there.

00:37:57.040 --> 00:38:03.020
I just want to throw it out there that we did try, like the work group itself has a lot of,

00:38:03.020 --> 00:38:05.780
it has a good balance of men and women in it.

00:38:05.780 --> 00:38:11.240
But then when I asked people if they want to come on the panel, it was only like women that volunteered.

00:38:11.240 --> 00:38:12.280
So it's not my fault.

00:38:12.280 --> 00:38:14.300
And I am aware of that.

00:38:14.300 --> 00:38:18.440
That's just a general or that's just general feedback there.

00:38:18.440 --> 00:38:24.260
But I think the panel will be really exciting.

00:38:24.260 --> 00:38:28.760
It's going to be on Saturday on the main stage at 12 p.m.

00:38:28.760 --> 00:38:30.900
EST, I think.

00:38:30.900 --> 00:38:34.120
If you're going to be there, I really would encourage you to attend.

00:38:34.120 --> 00:38:35.660
There's going to be questions and answers.

00:38:36.420 --> 00:38:38.600
And I just think it's such an important thing.

00:38:38.600 --> 00:38:45.960
I know that sometimes diversity can seem like a really tiring thing to talk about, especially like recently.

00:38:45.960 --> 00:38:54.140
I feel like sometimes people use it as like this buzzword and it can, and people can be like, oh my gosh, and just turn off when they hear the word diversity.

00:38:54.800 --> 00:38:56.840
But I really do think it's important.

00:38:56.840 --> 00:39:10.260
And particularly now as Python is growing in popularity, I think a few years ago, it was okay for the nucleus of Python to be based in the United States or based in Europe.

00:39:10.260 --> 00:39:12.880
But it's growing so quickly.

00:39:12.880 --> 00:39:18.200
Python for I don't know how many years now has been the most popular language in the world.

00:39:18.200 --> 00:39:25.920
And I know even for me, I'm in Zimbabwe right now, and it's one of the most popular languages here where I live.

00:39:25.920 --> 00:39:42.880
And so just providing the group, like our main purpose is to figure out how we can support the PSF to try and serve Pythonistas from around the world better and to connect the community better and have better representation and different things like that.

00:39:42.880 --> 00:39:44.760
So very excited about that one.

00:39:44.760 --> 00:39:46.060
That's awesome.

00:39:46.060 --> 00:39:47.740
And thanks for your work here.

00:39:47.740 --> 00:39:51.340
I definitely agree that we're stronger together, right?

00:39:51.340 --> 00:40:00.800
And one thing I would really like to see, and I think we're getting there, is when people look at Python and programming in general, but generally the Python space, we have influence over that.

00:40:00.800 --> 00:40:06.040
When people look at that world, I would like them to say, I can see myself being part of that.

00:40:06.040 --> 00:40:08.120
I can see that I could belong there, right?

00:40:08.120 --> 00:40:11.300
And if that's not the case, then how do we make that the case?

00:40:11.300 --> 00:40:12.040
Exactly.

00:40:12.040 --> 00:40:13.020
Absolutely.

00:40:13.020 --> 00:40:15.060
I think exactly that.

00:40:15.780 --> 00:40:19.180
And I would love to see that happening in the next two years.

00:40:19.180 --> 00:40:27.220
I would love to see, you know, one of my things is I'd love to see more like women core developers and more like global core developers as well.

00:40:27.220 --> 00:40:29.800
And also people on the board and different things.

00:40:29.800 --> 00:40:31.820
And those are all goals that we are working towards.

00:40:31.960 --> 00:40:37.460
Obviously, we don't know, like, the perfect way to achieve something or the perfect way to do things.

00:40:37.460 --> 00:40:42.220
But it's something that I think is really great and exciting to work on.

00:40:42.220 --> 00:40:45.580
So please attend if you are listening to this.

00:40:46.340 --> 00:40:49.420
And let me know if you, like, came from this podcast.

00:40:49.420 --> 00:40:51.460
It would be fantastic to see you there.

00:40:51.460 --> 00:40:52.320
Maybe just comment.

00:40:52.320 --> 00:40:53.380
Yeah, fantastic.

00:40:53.380 --> 00:41:04.960
And then, oh, another thing that I am doing this year as well is, one, I will be in the... so there's, like, a lounge area.

00:41:05.160 --> 00:41:07.200
Well, there's, like, a PSF booth.

00:41:07.200 --> 00:41:15.120
And if you would like to just, if you're going to be there in the morning on Saturday or on Friday, I will be hanging out in the PSF booth.

00:41:15.640 --> 00:41:20.100
And so, yeah, if you just want to talk about Python or the PSF or anything, I will be there.

00:41:20.100 --> 00:41:24.460
And I will also be hosting the EMEA meeting.

00:41:24.460 --> 00:41:30.280
So if you're in Europe, the Middle East, or Africa, there's a members meeting on Saturday.

00:41:30.280 --> 00:41:34.000
I think it's at 10 a.m. Central African time.

00:41:34.000 --> 00:41:36.260
I'm not sure what time that is in other places.

00:41:36.260 --> 00:41:38.620
But I know it's at 10 a.m.

00:41:38.620 --> 00:41:40.100
It's on the schedule, right?

00:41:40.100 --> 00:41:40.520
It's on the schedule.

00:41:40.520 --> 00:41:40.680
Yeah.

00:41:40.680 --> 00:41:42.680
We can use the datetime thing.

00:41:42.680 --> 00:41:43.500
I don't know.

00:41:43.500 --> 00:41:45.360
Exactly.

00:41:45.880 --> 00:41:46.680
Pull up the REPL.

00:41:46.680 --> 00:41:47.640
Throw it into datetime.

00:41:47.640 --> 00:41:48.220
Exactly.

00:41:48.220 --> 00:41:49.080
Please do that.
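
NOTE
The datetime suggestion above can be sketched roughly as follows.
This is a minimal sketch, not something shown in the episode. It assumes
the meeting falls on Saturday 2021-05-15 (the date is not stated here) and
that Africa/Harare is an acceptable stand-in for Central Africa Time.
```python
from datetime import datetime
from zoneinfo import ZoneInfo  # standard library in Python 3.9+
# Assumption: the Saturday in question is 2021-05-15; the episode does not give the date.
meeting = datetime(2021, 5, 15, 10, 0, tzinfo=ZoneInfo("Africa/Harare"))  # 10 a.m. CAT
def local_time(zone_name):
    """Return the meeting time converted to the given IANA time zone."""
    return meeting.astimezone(ZoneInfo(zone_name))
# The zones listed here are arbitrary examples; swap in your own.
for zone_name in ("UTC", "Europe/London", "America/Los_Angeles"):
    print(zone_name, local_time(zone_name).strftime("%a %H:%M"))
```
The published schedule remains the authoritative source for the time.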

00:41:49.080 --> 00:41:52.700
So I will be hosting that.

00:41:52.700 --> 00:41:54.600
And that's going to be in the morning.

00:41:54.600 --> 00:42:00.060
And if you would like, even if you're not a member, you can watch it on the PSF YouTube channel.

00:42:00.060 --> 00:42:01.340
It's going to be streaming there.

00:42:01.340 --> 00:42:02.540
Or you could join.

00:42:02.540 --> 00:42:04.640
There's a meetup link that I put in the show notes.

00:42:04.640 --> 00:42:06.760
So people could join that way as well.

00:42:06.760 --> 00:42:09.760
So, yeah, PyCon is going to be really exciting.

00:42:09.760 --> 00:42:11.420
And I'm really looking forward to it.

00:42:11.420 --> 00:42:14.400
So just encouraging people to come along for sure.

00:42:14.680 --> 00:42:15.480
Yeah, it should be fun.

00:42:15.480 --> 00:42:21.440
And even though it is super sad that it's not in person, it's not in Pittsburgh this year.

00:42:21.440 --> 00:42:25.100
I think in some ways it's more accessible to people around the world, right?

00:42:25.100 --> 00:42:26.340
They don't have to travel there.

00:42:26.340 --> 00:42:29.140
They can just log in and attend it.

00:42:29.140 --> 00:42:32.680
And that's so much less expensive than I flew to the U.S.

00:42:32.680 --> 00:42:34.880
And I paid $1,000 for a hotel.

00:42:34.880 --> 00:42:38.980
So there's a little silver lining, you know, out there in the live stream.

00:42:38.980 --> 00:42:42.520
Sam Morley really says, I really wish I could go to PyCon in person.

00:42:42.520 --> 00:42:45.940
Adam Parkin there says, me too, maybe in 2022.

00:42:45.940 --> 00:42:46.920
I think so.

00:42:46.920 --> 00:42:53.220
Finally, Sam also thinks it's great that we're having this diversity conversation and paying attention to it.

00:42:53.340 --> 00:43:00.000
One of the things I've noticed in 2020 is all the regional, actually last year also, though.

00:43:00.100 --> 00:43:07.400
But the 2020 and 2021, we've got all these PyCons going on all over the world.

00:43:07.400 --> 00:43:12.760
I used to think of like PyCon U.S. as the PyCon and everything else is regional.

00:43:13.200 --> 00:43:17.680
Now I think of PyCon U.S. as a regional conference also.

00:43:17.680 --> 00:43:21.320
It's the regional one that's close to the people that are in the U.S.

00:43:21.320 --> 00:43:23.660
It isn't necessarily better.

00:43:23.660 --> 00:43:26.020
It's, I love it.

00:43:26.020 --> 00:43:26.520
It's great.

00:43:26.520 --> 00:43:29.320
Anybody from that's hosting it, yes, it's better.

00:43:29.320 --> 00:43:34.080
But no, I like all of them.

00:43:34.080 --> 00:43:39.680
And I was excited to get to participate and watch videos from all over the world this year.

00:43:39.680 --> 00:43:40.840
That was pretty neat.

00:43:41.360 --> 00:43:45.980
But yeah, I'm on board with, I want to get back to regional stuff.

00:43:45.980 --> 00:43:47.940
I'd like to see people in person.

00:43:47.940 --> 00:43:48.980
I can't wait.

00:43:48.980 --> 00:43:56.320
Yeah, I will say for sure, like, even if people are feeling adventurous, there is a regional conference.

00:43:56.320 --> 00:44:01.740
I didn't mention it before that I am also part of the organizing team for, which is PyCon Africa.

00:44:01.740 --> 00:44:10.820
So if you would like to travel to another PyCon in a different part of the world, when we are able to travel and the world gets back to,

00:44:11.020 --> 00:44:19.060
some form of free travel, definitely recommend also hopping over to PyCon Africa.

00:44:19.060 --> 00:44:23.400
I think, like you said, I think PyCon US is fantastic.

00:44:23.400 --> 00:44:29.960
And one of the neat things about that is that it's a conference that has been there for so long.

00:44:29.960 --> 00:44:31.760
So a lot of people are going to be there.

00:44:32.640 --> 00:44:40.600
But there 100% are a lot of great conferences like PyCon Africa, which you should attend if you can.

00:44:40.800 --> 00:44:42.540
I think they're really just as exciting.

00:44:42.540 --> 00:44:45.800
And there's so many cool things that you get to experience.

00:44:45.800 --> 00:44:51.840
Like, I think for me, it's like whenever I go to the US, like last year, I'd never been to Ohio before.

00:44:51.840 --> 00:45:02.560
And like, I had like, I would never like in my, I would never have a reason that I would think to myself, let me go to America to go to Ohio.

00:45:03.680 --> 00:45:07.460
But I feel like it was such a good experience for me.

00:45:07.460 --> 00:45:08.880
And I really liked it.

00:45:08.880 --> 00:45:10.680
And I was really surprised by that.

00:45:10.680 --> 00:45:12.580
And so I think it's the same way.

00:45:12.580 --> 00:45:15.820
Like, PyCon is a great way as well to like experience new places.

00:45:16.140 --> 00:45:18.320
So yeah, definitely sticking in that.

00:45:18.320 --> 00:45:20.580
Well, that wraps up our six.

00:45:20.580 --> 00:45:24.440
Anybody got any extra information to share?

00:45:24.440 --> 00:45:31.540
Nothing else for me, other than the fact that if you do want to reach out to me, you can reach out to me on Twitter.

00:45:31.540 --> 00:45:34.480
I'm Marlene underscore ZW there.

00:45:34.480 --> 00:45:38.300
I'm also Marlene underscore ZW on GitHub, I think.

00:45:38.300 --> 00:45:40.500
And on my website.

00:45:40.500 --> 00:45:42.880
My website is MarleneMangami.com.

00:45:43.120 --> 00:45:46.700
So if you would like to reach out to me here, feel free to.

00:45:46.700 --> 00:45:49.120
I'm always happy to like chat about PyCon.

00:45:49.120 --> 00:45:50.320
Nice.

00:45:50.320 --> 00:45:50.840
Cool.

00:45:50.840 --> 00:45:51.780
So I got a couple.

00:45:51.780 --> 00:45:53.200
One made me really excited.

00:45:53.200 --> 00:45:55.040
This tweet from GitHub.

00:45:55.040 --> 00:45:57.040
Is your fork behind?

00:45:57.040 --> 00:46:01.640
You can now sync from your parent repo with just a single click.

00:46:01.640 --> 00:46:02.620
So check this out.

00:46:02.620 --> 00:46:09.960
If you go to your fork now, next to contribute for your PRs and stuff, there is now a fetch upstream button.

00:46:10.100 --> 00:46:16.420
And all you have to do is click it and then automatically your fork will become in sync with whatever you forked it from.

00:46:16.420 --> 00:46:19.020
You used to have to go and check it out,

00:46:19.020 --> 00:46:24.160
add an upstream remote, and then pull from that and then merge that wherever you want it to go to.

00:46:24.160 --> 00:46:27.040
Over here, you just click this button and boom.

00:46:27.040 --> 00:46:27.840
It's good to go.
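
NOTE
The manual steps described above (check out the fork, add an upstream
remote, pull, merge) might look roughly like this. It is a sketch only:
the remote name "upstream", the branch name "main", and the example URL
are assumptions, not details from the episode.
```python
import subprocess
def sync_commands(upstream_url, branch="main"):
    """Return the git commands for a manual fork sync, as argument lists."""
    return [
        ["git", "remote", "add", "upstream", upstream_url],
        ["git", "fetch", "upstream"],
        ["git", "checkout", branch],
        ["git", "merge", f"upstream/{branch}"],
    ]
def run_sync(upstream_url, branch="main"):
    """Run the commands inside an existing local clone of the fork."""
    for command in sync_commands(upstream_url, branch):
        subprocess.run(command, check=True)  # stop at the first failing step
# Hypothetical URL, used here only to print the commands without running them.
for command in sync_commands("https://github.com/example/project.git"):
    print(" ".join(command))
```
Calling run_sync from inside a local clone would execute the same steps.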

00:46:28.180 --> 00:46:31.320
So I think this will just lower the bar for people forking something.

00:46:31.320 --> 00:46:35.300
They want to get the current one and then make a change to see if they could contribute back.

00:46:35.300 --> 00:46:37.880
There's one fewer step in that process.

00:46:37.880 --> 00:46:41.520
Do you have any idea if it stays in sync or if you have to?

00:46:41.520 --> 00:46:43.920
No, it's a one-time type of thing, I believe.

00:46:43.920 --> 00:46:49.380
It says there's this many changes we'll pull over and it basically just automatically does the process at that time.

00:46:49.380 --> 00:46:49.880
Nice.

00:46:49.880 --> 00:46:51.240
But still pretty nice.

00:46:51.240 --> 00:46:53.980
You know, Flask 2.0 is out.

00:46:53.980 --> 00:47:01.640
And that one was sent in to us from Adam Parkin that, hey, heads up, this is now actually live.

00:47:01.640 --> 00:47:03.100
So very, very cool.

00:47:03.540 --> 00:47:06.180
Actually, everything from Pallets has been updated.

00:47:06.180 --> 00:47:07.440
So, yeah.

00:47:07.440 --> 00:47:21.900
I happen to have spoken, done a podcast recording with David Lord, who runs Pallets, and Phil Jones, who does Quart and contributes back to Pallets as well, about all the stuff coming in Flask 2.0, all the exciting stuff and their future plans as well.

00:47:21.900 --> 00:47:23.040
So, yeah.

00:47:23.040 --> 00:47:29.080
You can watch the live stream of that or wait a day or two until the episode is out and just listen at Talk Python as well.

00:47:29.080 --> 00:47:30.240
But, yeah, very, very cool.

00:47:30.240 --> 00:47:30.580
Yeah.

00:47:30.960 --> 00:47:33.760
And then Adam also at the live stream again says, this is super sweet.

00:47:33.760 --> 00:47:35.880
Always find it a headache to sync with upstream.

00:47:35.880 --> 00:47:37.080
Yeah, about the GitHub thing.

00:47:37.080 --> 00:47:37.860
That's cool.

00:47:37.860 --> 00:47:38.400
Cool.

00:47:38.400 --> 00:47:39.460
Close it out with a joke.

00:47:39.460 --> 00:47:41.960
Well, I got a couple of things I wanted to mention.

00:47:41.960 --> 00:47:42.880
Go for it.

00:47:42.880 --> 00:47:43.120
Sorry.

00:47:43.120 --> 00:47:51.080
I had Brett Cannon on last week on Test & Code and huge feedback from everybody that it was a great episode.

00:47:51.080 --> 00:47:52.120
We talked about packaging.

00:47:52.120 --> 00:47:56.260
I'll have Ryan Howard on this week talking about Playwright.

00:47:56.260 --> 00:47:57.540
So that'll be fun.

00:47:57.960 --> 00:48:03.300
And I wanted to mention a thank you to the 71 patrons that we have on Patreon.

00:48:03.300 --> 00:48:05.300
So thank you for supporting the show.

00:48:05.300 --> 00:48:05.900
Thanks.

00:48:05.900 --> 00:48:06.480
Yeah.

00:48:06.480 --> 00:48:07.000
Thank you, everyone.

00:48:07.000 --> 00:48:08.060
How about a joke?

00:48:08.060 --> 00:48:09.040
Yes.

00:48:09.040 --> 00:48:11.820
Sorry for almost skipping over your extras.

00:48:11.820 --> 00:48:13.300
Here.

00:48:13.300 --> 00:48:14.140
No worries.

00:48:14.140 --> 00:48:15.040
You ready?

00:48:15.040 --> 00:48:15.740
Yeah.

00:48:15.800 --> 00:48:20.060
So this one, I talked about that crazy giant ship thing.

00:48:20.060 --> 00:48:22.060
And we've got Marlene doing Rapids.

00:48:22.060 --> 00:48:24.840
So I thought maybe some kind of machine learning joke.

00:48:24.840 --> 00:48:29.340
Here's a bunch of robots in school.

00:48:29.340 --> 00:48:32.260
And they go like little Android looking things.

00:48:32.260 --> 00:48:34.460
Small ones because they're students.

00:48:34.460 --> 00:48:34.900
They're kids.

00:48:34.900 --> 00:48:36.640
And they're in machine learning class.

00:48:37.040 --> 00:48:38.980
And there's a big box of dirty data.

00:48:38.980 --> 00:48:42.120
Like a bunch of bits that are like kind of gray.

00:48:42.120 --> 00:48:42.980
And I don't know.

00:48:42.980 --> 00:48:43.840
They just have dirt on them.

00:48:43.840 --> 00:48:47.480
And the teacher says, Robbie, stop misbehaving.

00:48:47.480 --> 00:48:49.440
Or I will send you back to data cleaning.

00:48:49.440 --> 00:48:53.800
Yeah.

00:48:53.800 --> 00:48:56.180
That's where they're spending half the day anyway.

00:48:56.180 --> 00:48:56.600
Yeah.

00:48:56.600 --> 00:48:58.240
They actually spend most of their time there.

00:48:58.240 --> 00:48:58.660
That's right.

00:48:59.220 --> 00:49:00.740
I don't know who is drawing them.

00:49:00.740 --> 00:49:03.060
Like one of the robots is looking the wrong way.

00:49:03.060 --> 00:49:04.780
I was like, why is it drawn like that?

00:49:04.780 --> 00:49:05.500
I don't understand.

00:49:05.500 --> 00:49:12.800
Hey, a more concrete, really quick closeout question I see in the live stream here is

00:49:12.800 --> 00:49:13.440
Akmos.

00:49:13.440 --> 00:49:18.420
Is there a difference between cuDF and pandas in terms of utilization?

00:49:18.420 --> 00:49:22.960
Like in terms of how you actually use them?

00:49:22.960 --> 00:49:26.740
Well, I don't think so.

00:49:26.860 --> 00:49:32.700
For the most part, when you're using cuDF, the way it's built is to mirror pandas.

00:49:32.700 --> 00:49:35.160
So the APIs are really similar.

00:49:35.160 --> 00:49:40.840
So ideally, the methods that you would use when you're using pandas are exactly the same

00:49:40.840 --> 00:49:43.420
methods that you would use when you're using cuDF.

00:49:43.420 --> 00:49:47.620
The only difference is like when you're creating a pandas data frame, for example,

00:49:47.620 --> 00:49:50.860
you would use pd.DataFrame, for example.

00:49:50.860 --> 00:49:54.080
But then with cuDF, you would say cudf.DataFrame.

00:49:54.400 --> 00:49:59.580
If you make it like into a variable or something like that, then the methods that you're going

00:49:59.580 --> 00:50:01.400
to pull are going to be totally identical.

00:50:01.400 --> 00:50:03.200
It's really easy to try.

00:50:03.200 --> 00:50:04.240
Yeah, that's it.

00:50:04.240 --> 00:50:05.000
Yeah, that's awesome.

00:50:05.000 --> 00:50:05.640
Yeah.

00:50:05.640 --> 00:50:07.500
And Dask has similar stuff as well, right?

00:50:07.500 --> 00:50:11.220
You create a Dask data frame instead of a pandas data frame, but the API looks quite similar.

00:50:11.220 --> 00:50:16.060
They're not always 100% compatible, but most of the mainstream things, right?

00:50:16.240 --> 00:50:17.120
Definitely.

00:50:17.120 --> 00:50:23.780
So yeah, it's built definitely to make it as easy as possible to switch between the two.

00:50:23.780 --> 00:50:24.560
So it's very similar.
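
NOTE
A small sketch of the point made above: the same DataFrame code can be
written once and run against either library. It assumes pandas is
installed, and it only uses cuDF if RAPIDS cuDF and a supported GPU are
available; the data and names below are made up for illustration.
```python
import pandas as pd
try:
    import cudf as xdf  # assumption: RAPIDS cuDF is installed and a supported GPU is present
except ImportError:
    xdf = pd  # fall back to pandas so the sketch still runs on a CPU-only machine
data = {"team": ["a", "b", "a", "b"], "score": [1, 2, 3, 4]}  # made-up illustrative data
def total_by_team(lib):
    """Build a DataFrame with the given library and sum the scores per team."""
    frame = lib.DataFrame(data)
    return frame.groupby("team")["score"].sum()
pandas_totals = total_by_team(pd)
other_totals = total_by_team(xdf)  # identical call, only the library differs
print(pandas_totals)
```
As noted in the conversation, compatibility is not always 100 percent, so less common methods may behave differently.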

00:50:24.560 --> 00:50:24.800
Yeah.

00:50:24.800 --> 00:50:26.640
Thanks a lot, everybody, for showing up.

00:50:26.640 --> 00:50:27.080
Yeah.

00:50:27.080 --> 00:50:27.720
Thanks.

00:50:27.720 --> 00:50:28.600
Thanks, Maya.

00:50:28.600 --> 00:50:29.400
Thank you, Marlene.

00:50:29.400 --> 00:50:30.320
It's really great to have you here.

00:50:30.320 --> 00:50:31.240
No problem.

00:50:31.240 --> 00:50:32.500
Thanks for having me.

00:50:33.020 --> 00:50:34.640
Thank you for listening to Python Bytes.

00:50:34.640 --> 00:50:37.200
Follow the show on Twitter via at Python Bytes.

00:50:37.200 --> 00:50:40.080
That's Python Bytes as in B-Y-T-E-S.

00:50:40.080 --> 00:50:43.500
And get the full show notes at pythonbytes.fm.

00:50:43.500 --> 00:50:47.840
If you have a news item you want featured, just visit pythonbytes.fm and send it our way.

00:50:47.840 --> 00:50:50.540
We're always on the lookout for sharing something cool.

00:50:50.540 --> 00:50:53.800
On behalf of myself and Brian Okken, this is Michael Kennedy.

00:50:53.800 --> 00:50:57.360
Thank you for listening and sharing this podcast with your friends and colleagues.

