#250: skorch your scikit-learn together with PyTorch

Published Wed, Sep 15, 2021, recorded Wed, Sep 15, 2021

About the show

Episode Transcript

00:00 Hey there, thanks for listening. Before we jump into this episode, I just want to remind you

00:03 that this episode is brought to you by us over at Talk Python Training and Brian through his pytest

00:09 book. So if you want to get hands-on and learn something with Python, be sure to consider our

00:14 courses over at Talk Python Training. Visit them via pythonbytes.fm/courses. And if you're

00:21 looking to do testing and get better with pytest, check out Brian's book at

00:27 pythonbytes.fm/pytest. Enjoy the episode. Hello and welcome to Python Bytes, where we deliver Python news and

00:32 headlines directly to your earbuds. This is episode 250, recorded September 15th, 2021. I'm

00:39 Michael Kennedy. And I'm Brian Okken. And I am Prayson. Prayson, welcome to Python Bytes. Yeah, it's a

00:46 pleasure. I've been looking so much forward to joining you guys. Yeah, you've been somebody out

00:51 there who's been giving us a lot of good ideas and topics and helping us learn about new things. So

00:56 you've been a big supporter of the show and now you are part of the show.

00:59 Yeah, I'm hurrah.

01:00 Hurrah.

01:01 Yeah, yeah, hurrah. I've been looking forward to it so much. Like, the first

01:08 time I saw, oh, we can take part in this, I go like, oh, I should try to just get myself

01:13 in there. And here I am.

01:14 Yeah, here you are. Thanks for, thanks for doing that. That's really nice. Tell people a bit

01:18 about yourself before we dive into Brian's first topic.

01:22 Yes. Well, my name is Prayson Daniel and I'm originally from Tanzania, but living in Denmark,

01:30 married with three awesome kids. Currently, I'm a principal data scientist at NTT Data

01:40 Business Solutions here in Copenhagen. And yeah, so I accidentally became a data scientist and somehow

01:50 discovered that I was really, really good at it. Then I just started climbing my way up

01:55 thanks to the Python community and everything that is out there.

02:00 Yeah, awesome. Congratulations. Nice to see you finding your way in the data science world.

02:04 Very cool. Accidentally becoming a data scientist. That's interesting.

02:08 Exactly. All right, Brian, have people been doing things wrong?

02:12 I think so.

02:13 Including race conditions with screen sharing.

02:17 Yeah. So I just couldn't resist this article. There's an article out called Exciting New Ways

02:24 to be Told That Your Python Code is Bad, which is just a great title. And the

02:30 gist is there are two new Pylint errors. So it's pretty simple, but it made me think

02:38 about my code a little bit. And the first one is an error to tell you to consider ternary

02:45 expressions. So if you've got like an if condition, and then you assign a variable

02:53 in both the if clause and the else clause, and it's a short thing, maybe use a conditional

03:00 expression instead and do it all in one line. Like, one of the examples in the blog post says,

03:07 x = 4 if condition else 5. So ternary operators are pretty cool, and they're pretty

03:14 easy to read in Python, but I was just curious what you thought: is a ternary expression

03:20 easier to read or more difficult? Well, for me, I think this is pretty nice. I'm always on the edge

03:26 about the ternary condition, the value, if condition else, other value. A lot of times it starts to

03:32 stretch out to be a little bit verbose. And then it's kind of, you know, it's not entirely obvious.

03:38 One thing I recently learned about, I don't know how it took me so long, is the simpler version of that,

03:43 like variable or other option, without the if else, just the thing or that thing. Right. So for

03:51 example, if you try to get a user back, and you just want to return the user or you want to return,

03:57 maybe you want to check if they're admin. If they are, you return them. Otherwise you might turn

04:01 them back. You could say something to the effect of like, you know, result equals

04:07 user or false, or something like that. It's not a totally good example here, but this like

04:13 super short version of value where you kind of have the return value and the test and then the fallback,

04:19 the else case, it wouldn't work in the example that I have here, but that one, I actually,

04:23 I started to really like because it's so concise. Yeah. I don't know. I think I'm very traditional.

04:29 I like reading my code up, going down. So whenever it started stretching sideways to me, I'm like,

04:35 Oh, okay. I think I just love the flow of if, then I know I have to look down for the else. Right.

04:41 But now I have to look for the else from the other side. So, yeah, well, one-liners are

04:47 good in some places, but in most of the cases, for readability, I usually just try to avoid them.
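
A minimal sketch of the two forms being compared, plus the shorter "or" fallback mentioned above. The function names are made up for illustration and are not taken from the blog post.

```python
def pick_if_else(condition):
    # The shape the check flags: the same variable assigned in both branches.
    if condition:
        x = 4
    else:
        x = 5
    return x


def pick_ternary(condition):
    # The equivalent conditional expression on one line.
    x = 4 if condition else 5
    return x


def first_truthy(value, fallback):
    # `value or fallback` yields value when it is truthy, otherwise fallback.
    # Caveat: falsy-but-valid values such as 0 or "" also trigger the fallback.
    return value or fallback
```

Both functions return the same values; which one reads better is the judgment call discussed here. The "or" form is only safe when every falsy value should be replaced.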

04:53 Yeah, I do as well. The one thing I was thinking is interesting on the data science side,

04:57 Prayson, is a lot of times, instead of multi-line statements,

05:02 you're trying to create little expressions that you can put together in like little list

05:06 comprehensions and other types of things. And these, these one liners become really valuable there.

05:11 Yeah. Yeah. Definitely. Definitely. Mostly when we're using lambdas everywhere, right?

05:16 Yes, exactly. Yeah, exactly.

05:17 So then the next error condition is funny, I think, and it's just that while is used. So it's just a

05:25 warning to say you have a while in your code. And the comment really is,

05:30 it's just not usually good to have a while because it can, like, never terminate.

05:38 It's not guaranteed to terminate if you've got a while loop. So I thought that

05:43 was interesting. I actually was just thinking about this the other day is that I don't,

05:48 I can't even remember the last time I've used a while loop in some code. So I think this,

05:53 I think this is actually pretty good just to warn people they've got a while loop.

05:56 It's pretty strong. It's a pretty strong warning to say you have used this language construct. That's

06:01 a problem. I certainly think it's, I'm on board with the Zen of the idea that most of the time,

06:09 a while means you're doing it wrong. Most of the time you could probably iterate over a collection

06:14 or you could enumerate and then iterate over the index and the value. But there are times where you

06:20 actually need to test for something and then break out and, and to put it as a full on warning just for

06:27 its existence. To me, it seems a bit too far, but it's, it's interesting to say the first one.

06:34 Yeah, I think these are both sort of in the eye of the beholder a bit.
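
As a rough illustration of the "iterate over a collection or enumerate" point, here is the same loop written both ways. The data and function names are invented for the example.

```python
names = ["ada", "grace", "linus"]


def label_with_while(items):
    # Index-driven while loop: it only terminates if we remember to increment i.
    labels = []
    i = 0
    while i < len(items):
        labels.append(f"{i}: {items[i]}")
        i += 1
    return labels


def label_with_enumerate(items):
    # Same result; the loop ends when the collection is exhausted.
    return [f"{i}: {item}" for i, item in enumerate(items)]
```

The enumerate version has no counter to forget, which is the termination concern behind the warning.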

06:37 Yeah. Yeah. Actually, in our team, or in my whole existence, I think we've used while

06:43 only once, and this is in computer vision. So you are trying to capture video from the

06:50 camera and then do analysis on it. So it says, while there's a frame, keep on doing this. And of course,

06:56 you always have to have some way to get out of this while loop. But I think that's the

07:03 only time we use while. And we usually warn people: never use while, except when you are

07:10 doing computer vision.
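
A sketch of the "while there's a frame" pattern with an explicit way out. FakeCamera is a hypothetical stand-in so the example runs without a camera; its read() method is modeled loosely on the (ok, frame) pair that OpenCV's VideoCapture.read returns, and max_frames is an assumed safety bound.

```python
class FakeCamera:
    """Hypothetical video source that yields a fixed number of frames."""

    def __init__(self, n_frames):
        self._remaining = n_frames

    def read(self):
        # Returns (ok, frame); ok becomes False once the stream is exhausted.
        if self._remaining == 0:
            return False, None
        self._remaining -= 1
        return True, f"frame-{self._remaining}"


def process_stream(camera, max_frames=1000):
    processed = 0
    while processed < max_frames:  # hard upper bound, so the loop cannot run forever
        ok, frame = camera.read()
        if not ok:  # no more frames: leave the loop
            break
        processed += 1  # real analysis of `frame` would go here
    return processed
```

The loop has two exits: the stream running dry and the frame cap.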

07:11 Interesting. Yeah. And especially if you got things like pandas and stuff, or maybe you shouldn't even

07:15 be looping at all.

07:16 No, no, not at all.

07:17 Not at all.

07:17 Yeah. Interesting.

07:18 Interesting.

07:19 A couple of thoughts from the live stream. So Sam Morley out there says x = y or z is really

07:25 handy for setting instance variables in a class, where they're using None. I totally agree.
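
A small sketch of that pattern, with a made-up Playlist class: the parameter defaults to None and "or" swaps in a fresh list.

```python
class Playlist:
    def __init__(self, name, tracks=None):
        self.name = name
        # A None default avoids sharing one mutable list between instances;
        # `or` substitutes a new list when nothing (or an empty list) is passed.
        self.tracks = tracks or []
```

Each instance gets its own list. Note that an explicitly passed empty list is also replaced by a new one, since an empty list is falsy.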

07:29 Chris May. Hey, Chris. Says ternary is a great idea if it's simple; else, not so much.

07:35 Yeah. Brandon Brainer out there agrees with you, that the traditional if else is probably easier to read. Henry Schreiner says ternary is much

07:46 better for type checking as well. Okay. Yeah. Probably because the type of print is more

07:51 obvious there. So yeah, pretty neat, pretty neat. Also speaking of neat stuff, what if you could have

07:58 all sorts of little placards and things for your README? So here is a project I want to tell people

08:04 about called GitHub Readme Stats, and GitHub Readme Stats is pretty interesting. It comes to us

08:12 from Poma. So thank you, Poma, for sending that in. And the description says it's

08:16 dynamically generated stats for your GitHub READMEs, but I feel like that scope is

08:23 actually way too short. It's dynamically generated little placards for wherever you want to put them

08:29 on the internet. You might want to put them on a project README so the project can describe itself

08:34 more dynamically, but you might also want to put it on your about page on your blog or something like

08:40 that. So give you all a sense of what's going on here. If you come down here, you can have these

08:45 different, there's a whole bunch of different options. You can get like a GitHub stats card.

08:49 You can get extra pins. You can get the languages. Like for example, we could say what the languages

08:56 you are most likely to use across all of your repositories, the WakaTime week stats.

09:02 There's a bunch of themes and visualizations and stuff. So I think the best way to get a sense of

09:06 this is to see an example. So I put a couple of my projects and myself in here to kind of pick

09:11 on me. So here's an image that I could add. I'll zoom that in. So I have this python-switch package

09:18 that I created a while ago when Python didn't have anything like a switch statement. So I wanted to add

09:23 a switch statement to the Python language. So I did. And apparently here are the stats of it. These are live,

09:28 right. If I refresh it, it'll regenerate it. And it gives you a little bit of info about the project,

09:33 like the name and its little description. It's mostly Python, as it says. It has 238 stars and 18

09:40 forks, which is pretty awesome. So all I got to do to get that is to go up here and say, I want to get

09:45 the pin and I want to have the username be mikeckennedy and the repo be python-switch.

09:50 And then this returns an image that I can put, like I said, anywhere, right? If you put this as the image

09:54 source, it'll go, it's not just like it'll only render on GitHub. It'll go wherever you put it.
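
A hedged sketch of building those image URLs. The base URL and the pin, stats, and top-langs paths are written from memory of the project's README and should be checked against it; the helper names are invented for this example.

```python
from html import escape
from urllib.parse import urlencode

# Assumption: the project's public demo instance; self-hosted deployments use another host.
BASE = "https://github-readme-stats.vercel.app/api"


def pin_card_url(username, repo):
    # Card for a single repository ("the pin").
    return f"{BASE}/pin/?{urlencode({'username': username, 'repo': repo})}"


def stats_card_url(username):
    # Overall stats card for a user.
    return f"{BASE}?{urlencode({'username': username})}"


def top_langs_url(username):
    # Most-used-languages card.
    return f"{BASE}/top-langs/?{urlencode({'username': username})}"


def as_img_tag(url, alt):
    # The response is an image, so it can be embedded in any HTML page.
    return f'<img src="{escape(url, quote=True)}" alt="{escape(alt, quote=True)}">'
```

For example, as_img_tag(pin_card_url("mikeckennedy", "python-switch"), "python-switch") produces a tag that could go on a blog's about page as easily as in a README.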

09:59 So I think that that's pretty cool. Another example would be your stats. I'll refresh this

10:04 because there's a little animation. I can get my Michael Kennedy's GitHub stats. Apparently I have an A plus

10:09 plus, but a two thirds closed red ring. I'm not totally sure what the ring means, but kind of a

10:14 cool little graphic here. Apparently I've got 3.5 thousand stars, which surprised me. A lot of commits,

10:20 73 PRs, 103 issues, 23 repositories that I've contributed to. I don't know if that's this year or maybe this

10:27 year, who knows? Or total. Anyway, that's kind of cool, right? You could put that on your blog or

10:31 somewhere where you're trying to talk about yourself, like you're trying to get hired or you do consulting

10:36 or something. And then the third one here is you can say your most used languages. So apparently I've

10:40 most used JavaScript, which is very much not true, but I've probably committed a ton of like

10:46 node modules to some projects that I don't actually want to have to, you know, re-NPM install.

10:52 I want to just make sure they're there for like a course or something like that. Right. But it'll show

10:55 you through the breakdown of your various languages and whatnot. So that gives you kind of a sense of

11:01 what these are all about, what the idea of this thing is. You generate these little cards and you can

11:06 put them, like I said, wherever you want. What do you think? Like on our resume page.

11:09 Yeah. Yeah. I really love it. But it's kind of sad because most of our time is spent in GitLab and all

11:17 these others, and all our commits are done there. And then when I come to my GitHub, it looks so empty and it makes my heart sick.

11:24 What has Prayson been doing? He hasn't committed anything for a week.

11:27 Yeah. Yeah. So it's really, really awesome.

11:30 Yeah. Cool. Yeah. I guess it really only works for GitHub and that's where it's really handy, but still pretty nice.

11:36 Do you know if the stats are only on public repos or are they public and private?

11:40 It's a good question. So you can choose as a user, if you go down here and like the stuff that shows in your contributions in your GitHub profile, you can check whether you want public and private contributions to appear in that little green.

11:55 Okay.

11:56 How much contributions have you made this year by day?

12:00 Okay.

12:00 So maybe it depends on whether you've checked that or not. You know what I mean?

12:04 Probably.

12:05 But it might not.

12:06 Cool.

12:07 Anyway. Yeah. Pretty cool. A little project.

12:10 Prayson, you're up next. What you got?

12:11 Yes. Yes. Yes. So I got this one here. Actually, it's something that has been, not covered, but mentioned.

12:20 So I could see it in footnotes when I searched through. Actually, Brian, you covered it in episode 182 with Hypermodern Python.

12:31 I think it's just a name that was there. Yeah. But it was not covered. I think it's just been, oh, this could be used in this Hypermodern Python way of doing awesome

12:40 stuff. And then in episode 248, it was mentioned again with the Hypermodern Python cookiecutter, but it's just like a footnote of, oh, it uses Nox instead of tox.

12:52 So this is really, really an awesome tool that we've been using recently because when we do machine learning, we are encountering a lot of problems where we have to test how our models are performing and how ethical they are.

13:11 So when we do tests of our pipelines, we're not just testing that the models are accurate or that they are doing the things that they're supposed to do, like the API.

13:24 So we're not just testing our API, where you need to have keys and all those. We actually also have to test the ethics of our models. So like if we say our model does not segregate between, let's say, genders.

13:39 So we test, we have counterfactual tests where we send different genders and see what the models respond. Are they responding with similar results?

13:48 So when we say it doesn't segregate between sexual orientations, then we send different inputs where it pretends to be either straight or homosexual and just try to see, do we receive the same results?
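
A toy sketch of such a counterfactual check. Everything in it is hypothetical: the two scoring functions stand in for real models, and the field names and tolerance are assumptions made for the example.

```python
def toy_score(applicant):
    """Hypothetical model that looks only at income and debt."""
    return round(0.7 * applicant["income"] - 0.3 * applicant["debt"], 2)


def biased_score(applicant):
    """Hypothetical model that (wrongly) lets gender shift the score."""
    bonus = 1.0 if applicant["gender"] == "male" else 0.0
    return toy_score(applicant) + bonus


def counterfactuals(applicant, attribute, values):
    # Copies of the same input that differ only in the protected attribute.
    return [{**applicant, attribute: value} for value in values]


def is_invariant(model, applicant, attribute, values, tolerance=0.0):
    # The model passes if its output barely moves across the counterfactuals.
    scores = [model(v) for v in counterfactuals(applicant, attribute, values)]
    return max(scores) - min(scores) <= tolerance
```

In a test suite, a call such as is_invariant(model, applicant, "gender", ["female", "male"]) would sit inside an assert, once per protected attribute.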

14:05 So we've been trying to run this in a very automatic way. And before that, we used tox a lot.

14:13 But the problem is, the way of defining your tox configuration is just not Pythonic. Like you don't write it in a Pythonic way.

14:24 It's similar to an issue we had with make. I really could not debug make. So whenever I made a makefile, I copied from someone else and then changed some things.

14:34 Because anything I touched, I'd have a syntax error. Oh, this thing is not in the right place.

14:40 And then I came across Invoke, which was almost Pythonic. I can write everything in a Python way.

14:48 So Nox is actually similar to what Invoke did to make, but it's doing exactly that to tox.

14:58 So in this case, you can create simple pipelines like this one here, where it creates a session, installs the packages that need to be installed, and then runs whatever experiments you're trying to run.

15:12 And this is really, really handy. At least we found it really handy because you can select that it actually uses a conda environment. Like, conda has been used a lot in data science. So you can say first create a conda virtual environment, install these packages, and then test them.

15:29 So what I like about this tool, it's almost similar to pytest. Like if you know how pytest works, then you know how this guy works because there's parameterization. And whenever you run tests, you can select which sessions need to be run.

15:47 Like in pytest, we use -k to run this kind of test. And here you use the same thing, -k to run only these kinds of builds, right? So it is dark. We really, really enjoy that. Like you can pass in environment variables. But I actually wanted to show you the coolest part here.

16:08 -Yeah. -Yeah, this does look nice.

16:09 -It's just amazing. I cannot, I mean, the guy who created this, I just give him all the thumbs up with everything that they have come up with. So it's really, really handy if you're not using it, or if you're using tox, you should probably consider changing to Nox.

16:31 -That's cool. You can, for example, write that you have a test and then say, I want this, you know, as a decorator, sort of parameterized. I want this to run on 2.7, 3.6, 3.7, 3.8, and it'll do that, right?

16:42 -Yeah. So you can see, it's like this example here, right? So you can see we are parameterizing different Django versions. So we want it to first install this version and then run the tests, right? And then later it will come and take this version and run the tests. But then in the command line, you can

17:01 actually just select it to run only the tests with this guy and skip this guy here. So it's really, I mean, the ability that it gives you, it's incredible. So if I could see, so you can see like here, right here, right? This is exactly where it goes into the pytest-ish world.

17:23 -I see. So you can run it and say, don't run the linter, or just lint it, don't run the tests. You can even put Python expressions. It looks like tests and not lint, for example.
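
A sketch of what such a noxfile might look like, pulling together the pieces mentioned: sessions, parametrization, a conda backend, and selection with -k. The session names, package choices, and version numbers are illustrative assumptions, not the example shown on screen, and the file is meant to be run by the nox command rather than executed directly.

```python
# noxfile.py (requires Nox to be installed; run with the `nox` command)
import nox


@nox.session
@nox.parametrize("django", ["2.2", "3.2"])
def tests(session, django):
    # Each parameter value gets its own run in a fresh virtual environment.
    session.install(f"django=={django}", "pytest")
    session.run("pytest")


@nox.session
def lint(session):
    session.install("flake8")
    session.run("flake8", ".")


@nox.session(venv_backend="conda")
def conda_tests(session):
    # Assumes conda is installed and on PATH.
    session.conda_install("numpy")
    session.install("pytest")
    session.run("pytest")


# Selecting what to run from the command line:
#   nox --list                     show the available sessions
#   nox -s lint                    run one session by name
#   nox -k "tests and not lint"    keyword expression, in the spirit of pytest -k
```

Because the configuration is ordinary Python, it can be debugged and refactored like any other module, which is the contrast with tox being drawn here.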

17:33 -It is, I mean, it's just insanely great.

17:38 -Nice. Brian, what do you think of this?

17:41 -Oh, I really like Nox. It's neat. The use of parameters is really cool. And the example of like using a couple of different Djangos is good, but you can also

17:52 build up matrices of testing easily, you can stack these so you can have two parameters together. It's a pretty cool project. I just really love tox. So I haven't switched. But I know that, you know, there's like Invoke also, people are using Invoke for automation, but people are using Nox for more than just automating testing.

18:21 -You can automate really whatever you want to. It's just running a command, right? So.

18:26 -Nice. Yeah.

18:28 -Prayson, you've got a lot of comments from the live stream on this one.

18:31 Henry Schreiner says, "I love Nox. Tox is mired in backwards-compatibility defaults. It is hard to tell what it's actually doing. Whereas Nox is simple. It doesn't hide or guess stuff. It's just program like pytest." Sounds great.

18:43 Sam Morley says, "This is the only way to write a make file with invoke."

18:49 -I had, I mean, I had, I had that one.

18:54 -Yeah. Henry also says, "The PyPA projects have some very powerful Nox files, cibuildwheel, pip, and so on," which is good. And then Sam Morley also has a question for you. "Can it also, Nox, run external tools? For example, build a C extension or run a C test suite?"

19:00 -Oh, I don't know, Brian. -I don't know that either. I assume so.

19:05 -It definitely can, because Python has subprocess, but can it do it without you forcing that into it? You know, but you could technically have Python call this other command.

19:10 -Oh, I don't know.

19:26 -You could put technically, you know, Python call this other command, right?

19:31 -Well, there's an example in one of the, in the tutorial of, of calling CMake.

19:36 -Yeah, I saw the CMake as well. So that probably counts, right?

19:39 -Yeah.

19:40 -Yeah, I think that would count.

19:41 -So it's just running a command.

19:43 -Yeah.

19:44 -Yeah.

19:45 -Of course.

19:46 -And then, Brian, Brandon out there has a comment for you. New lights look great.

19:48 - Thanks.

19:49 -I agree with him.

19:50 I actually need to adjust my, my camera a little bit, which is a little bit off on the lights.

19:54 Very cool.

19:55 -All right.

19:56 -Let's see. I think Brian, you got the next one.

19:58 -Oh, okay.

19:59 I forgot what I was talking about.

20:01 Yeah.

20:02 So I've got the old document there.

20:05 So I've got a couple of things I wanted to talk about.

20:07 So this is one of those extra, extra, extra things, but there's just two,

20:11 a couple of things around dealing with text. And I've been playing with my blog a little bit lately,

20:18 not really writing much, which is a problem, but, but actually dealing with some of the old things.

20:24 -The whole thing you wrote looks really good now.

20:27 Well, I'm trying to automate some of the parsing of some of the old stuff.

20:33 So I grabbed a whole bunch of blog posts from WordPress, and, yeah, nobody needs to throw eggs at me.

20:42 I'm already switching and using Hugo now, but I've got a whole bunch of automatically generated Markdown files, and there are problems with them.

20:51 So I have to keep track.

20:53 So I've got some scripts.

20:54 So a couple of tools are helping me.

20:56 Python Frontmatter is a really small package, and all it does is take, like, YAML-style front matter

21:10 and parse it. You can just load it.

21:14 So, I'm using Markdown files, the example shows a text file, and you can get at all the pieces of the file, like the content and stuff.

21:23 But, for instance, I can grab the title.

21:27 You can look at what the keys are, so for blog posts, I've got, you know, tags and the date, and it's all converted to Python objects.

21:38 So if I have a date listed in a blog post, it'll show up as a datetime object.

21:46 So you can do math on it and all sorts of stuff.
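
To make "converted to Python objects" concrete, here is a toy stand-in written with the standard library only. It is not the package's implementation or API: it handles flat key: value lines and ISO dates, whereas the real package parses full YAML. The sample post and the parse_front_matter name are invented for the example.

```python
from datetime import date

POST = """---
title: Testing with pytest
date: 2021-09-15
tags: python, testing
---
Body of the post in Markdown.
"""


def parse_front_matter(text):
    """Toy parser: flat `key: value` lines only, not real YAML."""
    _, header, body = text.split("---", 2)
    metadata = {}
    for line in header.strip().splitlines():
        key, _, value = line.partition(":")
        metadata[key.strip()] = value.strip()
    # Turn known fields into richer Python objects.
    if "date" in metadata:
        metadata["date"] = date.fromisoformat(metadata["date"])
    if "tags" in metadata:
        metadata["tags"] = [tag.strip() for tag in metadata["tags"].split(",")]
    return metadata, body.strip()
```

With the date held as a date object rather than a string, something like date.today() - metadata["date"] gives the age of a post directly.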

21:49 So this is pretty cool.

21:50 It's really small, but super handy

21:53 for what I need.

21:54 So it's good.

21:55 Yeah.

21:56 This looks nice.

21:57 The other tool I wanted to talk about, which is an even tinier use case, I think, is called ftfy, fixes text for you.

22:04 And really it just takes bad Unicode conversions and makes them good.

22:10 So it takes common problems with Unicode conversions and fixes them, like where it looks like you have Greek or Russian letters or something instead of a space or apostrophe or something like that.

22:22 Yeah.

22:23 Like one of the first examples, a quick example, there's like this weird AE character and really it was intended to be a check mark.

22:31 So it just converted it to the proper, what it was.

22:34 What it was.

22:35 I'm not sure how it's doing this, but it's pretty neat.

22:37 That is very cool.

22:38 This gets me all the time with stuff that comes from Word.

22:43 If I'm converting from Word or something, or copying and pasting, or other things.

22:49 There's a lot of different quote marks that word processors put in, and it just ends up being gross in a lot of places.

22:57 Yeah.

22:58 And one example is, the Mona Lisa doesn't have eyebrows, but instead of just the apostrophe T, it's this weird, ugly, big Unicode thing.

23:11 Yeah.

23:12 So just, just replacing that with an apostrophe is a good idea.
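
On the question of how this kind of repair can work at all: the garbling in both examples is what you get when UTF-8 bytes are decoded as Windows-1252. The sketch below reproduces and reverses only that one case with the standard library. It is not how ftfy is implemented; the library has to detect which mistake happened, and the function names here are made up.

```python
def make_mojibake(text):
    # Encode correctly as UTF-8, then decode the bytes with the wrong codec.
    return text.encode("utf-8").decode("windows-1252")


def undo_mojibake(garbled):
    # Reverse the mistake: recover the original bytes, then decode as UTF-8.
    return garbled.encode("windows-1252").decode("utf-8")
```

Feeding a check mark or a curly apostrophe through make_mojibake yields the multi-character junk described above, and undo_mojibake restores the original. Text that was garbled some other way would raise an error or come out wrong, which is where a dedicated tool earns its keep.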

23:15 Yeah.

23:16 Nice.

23:17 Does it change single quotes to double quotes and stuff like that as well?

23:20 I don't know.

23:22 I don't know if it should either.

23:26 I'm not sure.

23:27 Yeah, this is cool.

23:29 So you just run this across like your Markdown files or something like that.

23:32 Yeah.

23:33 So, I'm not using it really for the blog stuff, but there was some other text parsing I was doing where I was scraping some information from somewhere.

23:40 And it was just gross.

23:43 It had a bunch of gross Unicode stuff in it.

23:46 And I just wanted to, you know, have something easy to just convert it quickly.

23:51 And this does the trick.

23:53 Yeah.

23:54 Yeah.

23:55 Very cool.

23:56 Nice one.

23:57 Nice finds.

23:58 So I'd follow up on that.

23:59 I was playing with Oh My Posh and the new Windows Terminal and the new Windows

24:03 PowerShell on Windows 11 earlier this week, trying to set up some testing over there.

24:09 And I found they have all these cool themes that show you all kinds of neat stuff.

24:13 So you can see, like, the Git branch you're on, and they've got these little cool arrows and all these colors.

24:19 And they'll even do certain things for like showing the version of the Python virtual environment.

24:40 Oh My Posh is tested on Nerd Fonts.

24:42 But Nerd Fonts is full of all these amazing developer fonts that have font ligatures and all sorts of cool stuff.

24:49 And they're all free.

24:50 There's like 50 developer fonts and terminal fonts and stuff.

24:53 So, yeah.

24:54 One more thing along those lines to check out.

24:56 Very neat.

24:57 But what I wanted to talk about is stealing this idea from Prayson that he was going to cover.

25:02 But I got to it.

25:03 Got to it before.

25:05 So, there's this new project that recently is making traction.

25:10 It's been around for a couple of months.

25:13 Even, I guess it's about two years old, honestly.

25:15 But somehow it got discovered and is now getting some traction, called MPIRE.

25:20 M-P-I-R-E.

25:22 And the idea is it's a Python package for easy multiprocessing.

25:26 It's like the multiprocessing module, but faster, better, stronger.

25:30 It's like the Bionic one.

25:32 So, the acronym stands for MultiProcessing Is Really Easy.

25:37 I love that thought.

25:38 And it primarily works around taking multiprocessing pools, but then adding on some features that make it more efficient.

25:47 For example, instead of creating a clone, a copy of every object that gets shared across all the multiprocessing,

25:53 it'll actually do copy on write.

25:55 So, it won't make a copy of the objects you're just reading.

25:58 It'll only make a copy of the ones you're changing.

26:00 So, if you start like 10 sub-processes, you might not have to make copies, 10 copies of that, which can make it faster.

26:06 It comes with cool like progress bar functionality and insight to how much progress it's made.

26:11 It's also supposed to be faster.

26:13 I'll talk about it in a second.

26:14 But it has map, map_unordered, and things like that.

26:18 imap.

26:19 The copy on write I talked about, which is cool.

26:22 Each worker has its own state and some like startup shutdown type of behaviors you can add to it.

26:28 It has integration with tqdm, the progress bar.

26:32 What else does it have?

26:34 Like I said, some insights.

26:35 It has user-friendly exception handling, which is pretty awesome.

26:39 You can also do automatic chunking to break up blocks of queues across sub-processes and multiprocessing,

26:47 including NumPy arrays.

26:49 You can adjust the maximum number of tasks or restart them after a certain number.

26:54 Restart the worker processes after a certain amount of work.

26:57 So, in case there's like a memory leak or it just hasn't cleaned it up, you can sort of work on that.

27:02 And create pools of these workers with like a daemon option.

27:05 So, they're just up and running and they grab the work.

27:08 Let's see.

27:09 It can be pinned to a specific CPU or a range of CPUs,

27:15 which can be useful for cache invalidation.

27:19 So, if you're getting a lot of like thrashing and moving across different CPUs,

27:23 then the caches have to read different data, which is of course way, way, way slower.

27:27 So, a bunch of neat things.

27:28 I'll show you a quick example.

27:30 So, in the docs, if you pull their page up, there's a multiprocessing example.

27:35 So, you write a function and then you say with Pool(processes=5) as pool, pool.map,

27:40 and give the function and the data iterable and it runs each one through there.

27:44 With the MPIRE one, it's quite similar.

27:47 So, you just create an MPIRE WorkerPool and you specify the number of jobs.

27:51 And it says the differences in the code are small, you don't have to relearn anything,

27:54 but you get things like all the stuff I talked about, the more efficient shared objects,

27:59 the progress bar, if you want.

28:01 You can just say progress_bar equals True and you automatically get a cool little tqdm progress bar.

28:07 You get startup and shutdown methods for the workers so you can initialize them and whatever else you need to do.
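
A runnable version of the standard-library half of that comparison, with the MPIRE equivalent shown only in comments because it needs the third-party package. The function names are placeholders; the WorkerPool lines follow the project's README as recalled here and should be checked against it.

```python
from multiprocessing import Pool


def time_consuming_function(x):
    # Placeholder for real work; defined at module level so workers can import it.
    return x * x


def run_with_pool(data, processes=5):
    # Standard library version: a pool of worker processes mapping over the data.
    with Pool(processes=processes) as pool:
        return pool.map(time_consuming_function, data)


if __name__ == "__main__":
    print(run_with_pool(range(10)))
    # The MPIRE form described above is nearly the same shape (assumed from its README):
    #   from mpire import WorkerPool
    #   with WorkerPool(n_jobs=5) as pool:
    #       results = pool.map(time_consuming_function, range(10), progress_bar=True)
```

The main guard matters on platforms that start workers by re-importing the script; without it, each worker would try to create its own pool.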

28:15 So, yeah, pretty cool little project.

28:17 And benchmarks show it down here at the bottom in the fast area.

28:20 So, you all can check that out.

28:22 Prayson, what did you like about this?

28:24 Well, I think it's also going to transition really well to the other topic that I have.

28:31 I like when one creates an API that you can just easily plug to your existing code.

28:38 Yeah.

28:39 So, you can just import this as this and do not change the entire code and then you take care of that.

28:44 You know, like writing your code in a way that one can just plug and play.

28:48 That's the amazing thing.

28:50 So, it's easy that you don't have to relearn a lot of stuff, but it just gives you the power that you need.

28:55 So, this is why we move toward this one.

28:58 So, we gain the power without changing much of our code.

29:01 Yeah.

29:02 Yeah, definitely.

29:03 I love that as well.

29:04 You know, I think of like HTTPX and requests for a while.

29:08 I think they diverged at some point, but yeah.

29:10 Let's see some feedback from audience real quick.

29:13 I'll jump back to the nerd fonts.

29:15 Chris says they're amazing.

29:17 Henry Schreiner says, "Fish shell plus Fisher plus Oh My Fish." And then the theme, "bobthefish plus Source Code Pro Nerd Font is fantastic."

29:26 Oh my gosh.

29:27 I have no idea.

29:28 I haven't explored this yet.

29:29 These are great names.

29:30 Yeah.

29:31 You're going to send me down a serious rat hole.

29:32 I'm going to be losing like the rest of the day.

29:34 No, I'm afraid.

29:35 To just fiddle with that, I'm afraid.

29:36 Well, I keep on messing up my terminal every time I start fiddling around, right?

29:41 That's right.

29:42 Because I'm using WSL, Windows Subsystem for Linux, right?

29:47 Right.

29:48 So, whenever I fix something, then I get it right.

29:51 And before I know it, I broke it again.

29:54 And so, but yeah, it looks really awesome.

29:56 Yeah.

29:57 Fantastic.

29:58 And then on topic, what I was most recently talking about, Chris May says, "Whoa, mpire looks

30:03 nice." Alvaro asks, "Will it help to get logging working in multiprocessing?"

30:08 I don't know that it'll make any change.

30:11 I mean, it really is mostly still multiprocessing.

30:13 So, probably not.

30:14 Yeah.

30:15 Yeah.

30:16 Very cool.

30:17 All right.

30:18 Grayson, I think you got the last one here.

30:19 Yes.

30:20 Yes.

30:21 Yes.

30:22 So, I have this awesome tool here.

30:24 It's called skorch.

30:25 It's really like a mixture of scikit-learn and torch.

30:31 This is a really, really cool bit, since we were talking about building an API that's

30:36 easy to integrate.

30:37 So, if someone already knows scikit-learn and a bit of torch, then you don't really need

30:44 to learn anything in this tool because everything just fits in together.

30:48 So, basically, when you're using scikit-learn, so if you are not familiar with scikit-learn,

30:56 it's just this, what we call it, the must-have toolkit for data scientists, because here they

31:03 have created a really good tool with a really good API where you can build an entire pipeline

31:09 from cleaning your data to building interesting models and everything like that.

31:17 But the biggest problem which we keep on experiencing when working with scikit-learn is when it comes

31:23 to neural networks, that you really don't have a lot of power to customize your networks in

31:29 the way that you will, like, it's very limited with this input that you already have here.

31:38 And in most cases, someone says, "Well, just create your own neural network classifier

31:44 or a regressor and then wrap it in the scikit-learn wrapper." But then, oh, sometimes one does not

31:50 want to do that. But the nice thing is, another guy just came up with this project, which is really, really neat. So, basically, I think mostly I will just go about it, maybe I should shamelessly show you an example

32:06 in one of my gists, which is, I know this is a shameless way to do it, but it's easier, like doing a demo on how it works, right?

32:20 So, like, if you're using scikit-learn, you are very familiar with all these other tools that someone needs to have, like the way to split your data, et cetera, et cetera.

32:28 But then there's the pipeline and all that kind of stuff. But the coolest thing is, instead of using one of the scikit-learn models, you can create your own custom neural net, right? So this will be like a neural network where we decide how many nodes we want in the first layer,

32:49 how many nodes we want in the second layer. And here we can build as interesting a net as we see fit, right? And then basically here, we just do the calling of it. So this is the very standard PyTorch way of creating your net.

33:04 The awesome part is that now this net, forgetting about all this process, we can see. So we just create this net, wrap it up like this, and now we are using it as part of our pipeline. So you can see, I will just go down right here.

33:19 So I am having my preprocessor scikit-learn-ish and I'm having my net. And the coolest thing is, now I just call this thing as I will do with any scikit-learn model with my classifier.fit this. And later I will do my classifier.predict these things, right?

33:37 So in this example, we are trying to predict the species of penguin given the data that we have. So this whole thing is really, really cool because it hides the whole fuss of, when you do it in PyTorch, pure PyTorch, you will have to write this for loop with the optimizer, stepping and so on, all these things.

34:01 But here, you're just transferring to the scikit-learn world, where you just do fit, which just trains your model. And now you can just do predict as if you're predicting with any other scikit-learn tool.

34:15 So skorch is a really, really neat tool that just does that. It allows you to connect your torch net with the scikit-learn pipeline. So this is really, really awesome.

34:29 So I would just, I encourage people to take a look at it.

34:32 I love the idea of it, that basically you can create these PyTorch models and do what you need to do to set them up and then just hand them off to the rest of the scikit-learn world.

34:42 And I can see some really interesting uses for this. Like I've got some library and it can either integrate with PyTorch or it can integrate with scikit-learn and it just uses this little wrapper to pass it around. I like it.

34:54 Yeah. Yeah. So for me, it just gave me this ability to create these more extended algorithms and then just continue using my scikit-learn pipelines.

35:12 So that's the coolest thing, that I don't have to change my code, because I just want to replace one line and that is the model. So I get the model from skorch and then pass it in where I ordinarily had something like logistic regression.

35:24 Instead. Now I'm using a net.

35:26 Love it. Nice. Brian, what do you think? You like this pattern?

35:29 Yeah, I do. I like the pattern of not having to change your entire tool chain, just changing one piece. Nice and clean.

35:38 Yeah. I like it as well. So, that's it for our main items. Brian, I've got one. I feel like, I feel like I should have let you have this one, but I grabbed this little extra thing I wanted to throw out there. Cause I thought it would make you happy.

35:50 Neat. Can't wait. Yeah. So, Marco Gorelli sent over this thing and said, if you want to work in JupyterLab, right. I know that one of your requirements for working with tools and shells and stuff is that they're Vim-ish. You can do Vim keyboard things to it. I'm excited. Yeah. So he sent in this thing called jupyterlab-vim, which is Vim notebook cell bindings for JupyterLab. So if you're editing a notebook cell, you can do all of your magic Vim

36:19 keys to make all the various changes and whatnot that you want. So yeah. Cool. What do you think?

36:25 I'm definitely going to try this. Yes. Yeah. Awesome. All right. Let's see. What else do I have? I got, oh yeah, this, never mind my picture. I didn't really intend to put that up there, but I just want to point out that I'm going to be speaking. And the reason the picture is there is the conference, the PyBay conference that is running next month. They featured my talk that I'm doing. So that's why there's a picture of me, but the PyBay 2021

36:48 Food Truck Edition, they have rented out an entire, like, food cart-topia type place with a bunch of these pods and they're having a conference outdoors and putting up multimedia, like camera TVs and stuff, for each pod. So even if you're not at, like, a great line of sight, you can still see the live talks, but sit outside and drink and eat food cart food in California. Sounds fun.

37:10 So I'm going to be talking about, what did I say? My title, my talk is going to be HTMX + Flask: Modern Python Web Apps, Hold the JavaScript. So I'm looking forward to giving that talk there. So people, if they're generally in that area, they might want to check that out.

37:25 I might, that just sounds fun.

37:27 Yeah. Yes, indeed. All right. That's it for my extra items. You got any extras, Brian?

37:33 No. How about you, Grayson?

37:34 Yes, I got one. I had to actually search if this one has been covered and I was surprised that has not been covered.

37:43 I don't think it has. What is this?

37:45 So, you know, there is something called python-dotenv. So we've been using python-dotenv. Of course, one can say, why don't you just use os.environ and then get whatever that is? Why do we need to install another package just to get an environment variable or something? But this is pretty, pretty neat. It's quite a recent project, I think, and it's rising slowly.

38:11 And there are a lot of contributors and, yeah, it's very promising. So what it does, I think I can just bring it up somewhere here. It allows you to do all this type conversion, casting, et cetera, et cetera. Right. Like you can say, I'm going to get my debug here and then I will set the default and also I will do the casting here. Right. So this is really, really neat.

38:40 So often when you're reading config files, everything is a string and then you're like, oh, this one is a datetime. So I got to parse it. This one. Yeah. Yeah. Yeah. Yeah. So I got to parse it. Yeah. Okay. Yeah. But it's really even more than that. So there's another way where you can say from decouple import AutoConfig.

38:55 So it goes and searches for where the .env file is. Otherwise you can just tell it where the environment variables are, but it's just neat. It's very simple. It does what you want it to do.

39:09 So I will really encourage people to look at it. I've just changed every place where I've been using dotenv or os.environ to this one. And it's just helped me clean up some unnecessary steps in my code.
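
To make the "get the value, set a default, and cast it" idea concrete, here is a small standard-library sketch of the pattern. It is not the library's implementation: `get_setting`, `to_bool`, and the `fake_env` dictionary are hypothetical names used only for illustration. With the `decouple` package discussed here, the equivalent call is, to the best of my knowledge, `config("DEBUG", default=False, cast=bool)` after `from decouple import config`.

```python
import os

_TRUE = {"1", "true", "yes", "on"}
_FALSE = {"0", "false", "no", "off", ""}


def to_bool(value):
    """Turn common string spellings of booleans into a real bool."""
    if isinstance(value, bool):
        return value
    text = str(value).strip().lower()
    if text in _TRUE:
        return True
    if text in _FALSE:
        return False
    raise ValueError(f"not a boolean: {value!r}")


def get_setting(name, default=None, cast=str, environ=os.environ):
    """Read a setting, fall back to a default, then cast it to a type."""
    raw = environ.get(name)
    if raw is None:
        if default is None:
            raise KeyError(f"{name} is not set and no default was given")
        raw = default
    return cast(raw)


# A plain dict stands in for the real environment so the example is repeatable.
fake_env = {"DEBUG": "true", "PORT": "8000"}

debug = get_setting("DEBUG", default=False, cast=to_bool, environ=fake_env)
port = get_setting("PORT", default=5000, cast=int, environ=fake_env)
timeout = get_setting("TIMEOUT", default="2.5", cast=float, environ=fake_env)
```

Everything read from the environment starts life as a string, so doing the cast in the same call that reads the value is what removes the extra parsing steps mentioned above.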

39:26 That's pretty cool. Yeah. Yeah. Great, great idea. Definitely check that one out. All right. Well, I think that's it for all of our items. Well, what do you think? Should we do a joke?

39:35 Definitely. I love it. Cause I've almost forgotten what the joke is. So it's going to be new to me as well. All right. So the joke is called adoption. This comes from monkeyuser.com. And you've heard about the Python idea of you came for the language, but you stayed for the community.

39:51 Well, what if it is a little bit different? What if actually people get brought in unwillingly and then they kind of realize they like it. So here's a picture of, like, kind of an open field, you know, think gazelle or something. And there's a couple of developers just running and there's one who is fixated on a butterfly who doesn't actually see that there's a pack of Python developers coming to adopt them. It says, "A pack of Python developers spotting a junior dev away from its pack initiate their conversion assault."

40:21 Yeah. Silly, silly, silly, silly, silly, silly. Man, I'm that way even for non-programmers. So, and my family just sort of like rolls their eyes every time this happens, but every time I like get a, some, a young, somebody going, coming over either in college or high school or just out of college, I'll, I'll say, so if you haven't done it already, I, no matter what your field is, you really should learn how to code. and while you're at it, why not?

40:51 Just choose Python. So I'm trying to make Python developers out of every person I meet.

40:55 I think you're doing them a favor. It'll be their superpower amongst all their non-developer friends.

41:02 Yeah.

41:03 Awesome.

41:04 That's funny.

41:05 Brian, thanks as always. And Grayson, really great to have you on the show this week and thanks for being here.

41:10 Yeah. Thank you, Michael. Thank you, Brian.

41:12 Thank you.

41:13 You bet. Bye.

41:13 Bye.

41:14 Thanks for listening to Python Bytes. Follow the show on Twitter via @pythonbytes. That's Python Bytes as in

41:21 B-Y-T-E-S. Get the full show notes over at pythonbytes.fm.

41:25 If you have a news item we should cover, just visit pythonbytes.fm and click submit in the nav bar.

41:30 We're always on the lookout for sharing something cool.

41:32 If you want to join us for the live recording, just visit the website and click live stream to get

41:37 notified of when our next episode goes live. That's usually happening at noon Pacific on Wednesdays

41:43 over at YouTube.

41:44 On behalf of myself and Brian Okken, this is Michael Kennedy. Thank you for listening and sharing this podcast with your friends and colleagues.
