Transcript #250: skorch your scikit-learn together with PyTorch
00:00 Hey there, thanks for listening.
00:01 Before we jump into this episode, I just want to remind you that this episode is brought to you by us over at Talk Python Training, and Brian through his pytest book.
00:10 So if you want to get hands on and learn something with Python, be sure to consider our courses over at Talk Python Training.
00:17 Visit them via pythonbytes.fm/courses.
00:21 And if you're looking to do testing and get better with pytest, check out Brian's book at pythonbytes.fm/pytest.
00:28 Enjoy the episode.
00:29 Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds.
00:34 This is episode 250, recorded September 15th, 2021.
00:38 I'm Michael Kennedy.
00:39 And I'm Brian Okken.
00:41 And I am Prayson.
00:42 Prayson, welcome to Python Bytes.
00:45 Yeah, it's a pleasure.
00:46 I've been looking so much forward to joining you guys.
00:49 Yeah, you've been somebody out there who's been giving us a lot of good ideas and topics and helping us learn about new things.
00:56 So you've been a big supporter of the show and now you are part of the show.
00:59 Yeah.
00:59 I'm around.
01:00 Yeah.
01:02 Yeah.
01:02 For real.
01:02 I've been looking forward to it so much.
01:05 Like, the first time I saw, "Oh, we can take part in this,"
01:10 I was like, "Oh, I should try to just get myself in there."
01:13 And here I am.
01:14 Yeah.
01:15 Here you are.
01:15 Thanks for, thanks for doing that.
01:16 That's really nice.
01:17 Tell people a bit about yourself before we dive into Brian's first topic.
01:22 Yes.
01:22 Well, my name is Prayson Daniel.
01:25 I'm originally from Tanzania, but living in Denmark, married with three awesome kids.
01:33 Currently, I'm a principal data scientist at NTT Data Business Solution here in Copenhagen.
01:43 Yeah, so I accidentally became a data scientist and somehow discovered that I was really, really good at it.
01:53 Then I just started climbing my way up, thanks to the Python community and everything that is out there.
02:00 >> Yeah, awesome. Congratulations.
02:01 Nice to see you finding your way in the data science world.
02:04 >> Very cool.
02:05 >> Yeah, Brian.
02:05 >> Accidentally becoming a data scientist, that's interesting.
02:08 >> Exactly.
02:09 >> Yeah.
02:10 >> All right, Brian. Have people been doing things wrong?
02:12 >> I think so.
02:14 >> Including race conditions with screen sharing.
02:18 >> Yeah. I just couldn't resist this article.
02:21 There's an article out called "Exciting New Ways to be Told that Your Python Code is Bad," which is just a great title. And the gist is there's two new Pylint errors. So it's pretty simple. But it made me think about my code a little bit. And the first one is an error to tell you to consider ternary expressions.
02:46 If you've got like if condition, and then you assign a variable in both the if clause and the else clause, and it's a short thing, maybe use a conditional expression instead and do it all in one line. Like, one of the examples in the blog post says x equals four if condition else five.
03:11 Ternary operators are pretty cool, and they're pretty easy to read in Python.
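A minimal sketch of the two forms Brian describes, using the `x = 4 if condition else 5` example he quotes from the blog post. The function names are made up for illustration and are not from the article.

```python
def pick_with_if_else(condition: bool) -> int:
    # Statement form: the same variable is assigned in both branches.
    # This is the shape the new Pylint check is described as flagging.
    if condition:
        x = 4
    else:
        x = 5
    return x


def pick_with_ternary(condition: bool) -> int:
    # Conditional-expression form: one line, same behavior.
    x = 4 if condition else 5
    return x


if __name__ == "__main__":
    for flag in (True, False):
        print(flag, pick_with_if_else(flag), pick_with_ternary(flag))
```

Both functions return the same values; the difference is purely readability, which is what the hosts go on to debate.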
03:16 But I was just curious what you thought.
03:18 Is a ternary expression easier to read or more difficult?
03:22 >> Well, for me, I think this is pretty nice.
03:25 I'm always on the edge about the ternary condition, the value if condition else other value.
03:31 A lot of times it starts to stretch out to be a little bit verbose, and then it's not entirely obvious.
03:38 One thing I recently learned about, I don't know how it took me so long, is the simpler version of that, like variable or other option, without the if-else, just this thing or that thing.
03:50 Right.
03:51 So for example, if you try to get a user back, maybe you want to check if they're an admin, and if they are, you return them, otherwise you return something else; you could say something to the effect of, you know, result equals user or false, or something like that.
04:10 So it's not a totally good example here, but it's this super short version where you kind of have the return value and the test together, and then it falls back to the else case.
04:19 It wouldn't work in the example that I have here, but that one I've actually started to really like because it's so concise.
04:26 I don't know.
04:27 I think I'm very traditional.
04:29 I like reading my code from the top going down.
04:32 So whenever it starts stretching sideways, I go like, "Oh, okay."
04:36 I think I just love the flow of if, then I know I have to look down for the else, right?
04:41 But now I have to look for the else from the other side.
04:43 Then, yeah, but one-liners are good in some places, but in most of the cases, out of readability, I usually just try to avoid them.
04:53 - Yeah, I do as well.
04:54 The one thing I was thinking is interesting on the data science side, Preysen, is a lot of times you're trying to take, instead of statements, multiple lines, you're trying to create little expressions that you can put together in like little list comprehensions and other types of things.
05:08 And these one-liners become really valuable there.
05:11 - Yeah, yeah, definitely, definitely.
05:13 Mostly when we're using lambdas everywhere, right?
05:16 - Yes, exactly, exactly.
05:18 - So the next error condition is funny, I think, and it's just the while is used.
05:23 So it's just a warning to say you have a while in your code.
05:28 And the comment really is that it's usually not good to have a while, because it might never terminate.
05:38 It's not guaranteed to terminate if you've got a while loop.
05:42 I thought that was interesting.
05:44 I actually was just thinking about this the other day.
05:47 I can't even remember the last time I've used a while loop in some code.
05:52 I think this is actually pretty good, just to warn people they've got a while loop.
05:56 >> It's pretty strong. It's a pretty strong warning to say, you have used this language construct. That's a problem.
06:01 I certainly think it's, I'm on board with the Zen of the idea that most of the time, a while means you're doing it wrong.
06:11 Most of the time, you could probably iterate over a collection or you could enumerate and then iterate over the index and the value.
06:19 But there are times where you actually need to test for something and then break out.
06:25 To put it as a full-on warning just for its existence.
06:29 To me, it seems a bit too far, but it's interesting to say.
06:32 The first one, I think these are both sort of in the eye of the beholder a bit.
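To make the while discussion concrete, here is a small sketch of the same search written with a `while` loop and with `enumerate`, as Michael suggests. The functions are hypothetical examples, not taken from the article or from Pylint.

```python
def first_negative_while(values):
    # while version: we manage the index ourselves, and forgetting the
    # increment below would make this loop run forever.
    i = 0
    while i < len(values):
        if values[i] < 0:
            return i
        i += 1
    return None


def first_negative_for(values):
    # for/enumerate version: iterating over a finite collection is
    # guaranteed to end, and we still get the index and the value.
    for i, value in enumerate(values):
        if value < 0:
            return i
    return None


if __name__ == "__main__":
    data = [3, 7, -2, 9]
    print(first_negative_while(data), first_negative_for(data))
```

A `while` is still the natural fit when there is no collection to iterate over, such as the "keep going while the camera delivers frames" case mentioned next.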
06:37 Yeah, actually, in our team, or in my whole existence, I think we've used while only once, and that was in computer vision.
06:46 So you are trying to capture videos from the camera and then do analysis with them.
06:52 So it says, "While there is a frame, keep on doing this." And of course you always have to add some way to get out of this while loop.
07:02 But I think that's the only time we use while.
07:05 And we usually warn people, say, never use while except when you are doing computer vision.
07:11 Interesting. Yeah. Especially if you got things like pandas and stuff, where maybe you shouldn't even be looping at all.
07:16 No, no. Not at all. Not at all.
07:18 Yeah. Interesting.
07:18 Interesting.
07:19 A couple of thoughts from the live stream.
07:22 So Sam Morley out there says, "X equals Y or Z is really handy for setting instance variables in a class where they're using nones." I totally agree.
07:29 Chris May, hey Chris, says, "Ternary is a great idea if it's simple, else not so much." >> Nice, clever.
07:38 >> Brandon Brainerd out there agrees with you, Prayson, that the traditional if-else is probably easier to read.
07:43 Henry Schreiner says, "Ternary is much better for type checking as well." >> Okay. Yeah, probably because the type inference is more obvious there.
07:52 So yeah, pretty neat.
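Sam Morley's `x = y or z` point from the live stream can be sketched like this. The `Report` class and its defaults are invented for illustration.

```python
class Report:
    def __init__(self, title=None, tags=None):
        # "value or fallback": use the argument if it is truthy,
        # otherwise fall back to the default on the right.
        self.title = title or "Untitled"
        # Also avoids the mutable-default-argument trap: each instance
        # gets its own fresh list when no tags are passed.
        self.tags = tags or []


if __name__ == "__main__":
    print(Report().title, Report().tags)
    print(Report("Q3 numbers", ["finance"]).tags)
```

One caveat worth keeping in mind: `or` falls back on any falsy value, so an empty string, `0`, or an empty list passed on purpose is replaced too. When only `None` should trigger the default, `x if x is not None else default` is the safer form.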
07:55 Also speaking of neat stuff, what if you could have all sorts of little placards and things about your README?
08:02 So here is a project I want to tell people about called GitHub README stats.
08:07 GitHub README stats is pretty interesting.
08:10 It comes to us from Palma.
08:12 So thank you Palma for sending that in.
08:14 And the description says it's dynamically generated stats for your GitHub READMEs.
08:21 But I feel like that scope is actually way too short.
08:25 It's dynamically generated little package for wherever you wanna put them on the internet.
08:30 You might wanna put them on a project's readme so the project can describe itself more dynamically.
08:35 But you might also wanna put it on your about page on your blog or something like that.
08:40 So give you all a sense of what's going on here.
08:43 You come down here, you can have these different, there's a whole bunch of different options.
08:47 You can get like a GitHub stats card, you can get extra pins, you can get the top languages, like for example, it can say which languages you are most likely to use across all of your repositories, the WakaTime week stats, and there's a bunch of themes and visualizations and stuff.
09:05 So I think the best way to get a sense of this is to see an example.
09:08 So I put a couple of my own projects, and myself, in here to kind of pick on me.
09:11 So here's an image that I could add.
09:14 I'll zoom that in.
09:15 So I have this Python switch package that I created a while ago when Python didn't have anything like a switch statement.
09:22 So I wanted to add a switch statement to the Python language, so I did.
09:26 And apparently here are the stats of it.
09:27 These are live, right?
09:28 If I refresh it, it'll regenerate it.
09:30 And it gives you a little bit of info about the project, like the name and its little description.
09:35 That's mostly Python.
09:36 As it says, it has 238 stars and 18 forks, which is pretty awesome.
09:42 So all I gotta do to get that is go up here and say I wanna get the pin, and I wanna have the username be mikeckennedy and the repo be python-switch.
09:50 And then this returns an image that I can put, like I said, anywhere, right?
09:53 If you put this as the image source, it'll go, it's not just like it'll only render on GitHub, it'll go wherever you put it.
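As a rough sketch of what "say I wanna get the pin, with this username and repo" looks like in practice, the helpers below build the image URLs and a Markdown image tag. The base URL and the `pin` and `top-langs` paths are assumptions based on the project's publicly hosted instance and README; check the project's documentation before relying on them. The helper names are made up.

```python
from urllib.parse import urlencode

# Assumed address of the project's public instance (not stated in the episode).
BASE_URL = "https://github-readme-stats.vercel.app/api"


def pin_card_url(username: str, repo: str) -> str:
    """URL of the card for a single repository (the 'pin')."""
    query = urlencode({"username": username, "repo": repo})
    return f"{BASE_URL}/pin/?{query}"


def stats_card_url(username: str) -> str:
    """URL of the overall stats card for a user."""
    return f"{BASE_URL}?{urlencode({'username': username})}"


def top_langs_card_url(username: str) -> str:
    """URL of the most-used-languages card for a user."""
    return f"{BASE_URL}/top-langs/?{urlencode({'username': username})}"


def markdown_image(alt_text: str, url: str) -> str:
    """Markdown snippet that embeds the generated image anywhere."""
    return f"![{alt_text}]({url})"


if __name__ == "__main__":
    url = pin_card_url("mikeckennedy", "python-switch")
    print(markdown_image("python-switch", url))
```

Because the service returns an image, the same URL works as an `<img>` source on a blog or resume page, not only in a GitHub README.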
10:00 So I think that that's pretty cool.
10:01 Another example would be, your stats, I'll refresh this 'cause there's a little animation.
10:05 I can get my Michael Kennedy's GitHub stats.
10:08 Apparently I have an A++ but a two thirds closed red ring.
10:12 I'm not totally sure what the ring means, but kind of a cool little graphic here.
10:15 Apparently I've got 3.5 thousand stars, which surprised me.
10:19 A lot of commits, 73 PRs, 103 issues, 23 repositories I contributed to.
10:25 I don't know if that's this year or, who knows, total.
10:28 Anyway, that's kind of cool, right?
10:29 You could put that on your blog or somewhere where you're trying to talk about yourself, like you're trying to get hired or you do consulting or something.
10:37 And then the third one here is you can say your most used languages.
10:39 So apparently I have most used JavaScript, which is very much not true.
10:43 But I've probably committed a ton of like node modules to some projects that I don't actually want to have to, you know, re-NPM install.
10:52 I want to just make sure they're there for like a course or something like that, right?
10:55 But it'll show you through the breakdown of your various languages and whatnot.
10:59 So that gives you kind of a sense of what these are all about, what the idea of this thing is.
11:04 So it generates these little cards and you can put them, like I said, wherever you want.
11:08 - What do you think? - Like on a resume page.
11:10 - Yeah. - Yeah.
11:10 - I really love it, but it's kind of sad because most of our time is spent in GitLab and other places, and all our commits are done there.
11:20 And then when I come to my GitHub, it looks so empty and it makes my heart sink.
11:25 - What has Prayson been doing?
11:26 He hasn't committed anything for a week.
11:28 - Yeah, yeah.
11:28 So this is really, really awesome.
11:31 - Yeah, cool.
11:32 Yeah, I guess it really only works for GitHub and that's where it's really handy, but still pretty nice.
11:36 - Do you know if the stats are only on public repos, or are they public and private?
11:41 - It's a good question.
11:42 So you can choose as a user, if you go down here, for the stuff that shows in your contributions in your GitHub profile, you can check whether you want public and private contributions to appear in that little green graph of how many contributions you have made this year by day.
12:00 So maybe it depends on whether you've checked that or not.
12:04 You know what I mean?
12:04 - Probably.
12:05 But it might not.
12:06 But anyway, yeah, pretty cool little project.
12:09 Prayson, you're up next.
12:11 What you got?
12:12 Yes, yes, yes.
12:13 So I got this one here.
12:15 Actually, this is something that has been, not really covered, but mentioned before.
12:20 I could see it in the footnotes when I searched through.
12:26 Actually, Brian, you covered it in episode 182 with Hypermodern Python.
12:31 I think it's just a name that was there.
12:33 Yeah, but it was not really discussed.
12:34 I think it was just, "Oh, this could be used in this Hypermodern Python way of doing awesome stuff."
12:41 And then in episode 248, it was mentioned again with the Hypermodern Python cookiecutter, but just as a footnote of, "Oh, it uses Nox instead of tox."
12:53 So this is a really, really awesome tool that we've been using recently, because when we do machine learning, we encounter a lot of problems where we have to test how our models are performing and how ethical they are.
13:11 So when we test our pipelines, we're not just testing that the models are accurate, or that they are doing the things they're supposed to do, like with the API, that you cannot just ping our API, you need to have keys and all those.
13:28 We actually also have to test the ethics of our models.
13:32 So if we say our model does not discriminate between, let's say, genders, we have counterfactual tests where we send different genders and see what the models are responding.
13:46 Are they responding with a similar result?
13:48 So when we say it doesn't discriminate based on sexual orientation, then we send different inputs where it pretends to be either straight or homosexual and just try to see, do we receive the same results?
14:05 So we've been trying to run this in an automatic way.
14:12 And before that, we used tox a lot.
14:14 But the problem is, the way of defining your tox configuration is just not Pythonic.
14:20 You don't write it in a Pythonic way.
14:24 It's similar to, we had this issue with make.
14:27 I really could not debug make.
14:29 So whenever I made a make file, I copied from someone else and then changed some things because anything I touched, then I have a syntax error.
14:38 Oh, this thing is not in the right place.
14:40 And then I came across Invoke, which was almost Pythonic.
14:46 I can write everything in a Python way.
14:48 So Nox is actually similar to what Invoke did for Make, but it's doing exactly that for tox.
14:58 So in this case, you can create simple pipelines like this one here, where it creates a session, installs the package that needs to be installed, and then run whatever experiment you're trying to run.
15:12 And this is really, really handy, at least.
15:15 We found it really handy because you can select that it actually used the Conda environment, like the Conda world, it's been used a lot in data science.
15:24 So you can say first create a Conda virtual environment, install these packages and then test them.
15:30 So what I like about this tool, it's almost similar to pytest.
15:35 Like if you know how pytest works, then you know how this guy works because there's a parametrization and whenever you run tests, you can select which part of session needs to be run, Like in pytest, we use the -k, run this kind of test.
15:52 And here you use the same thing, -k, run only this kind of builds, right?
15:58 So it is dope.
16:00 We really, really enjoy that.
16:02 Like you can pass in a environment variable, but I actually wanted to show you the coolest part here.
16:08 - Yeah, this does look nice.
16:10 - It's just amazing.
16:12 I cannot, I mean, the guy who created this, I just give him all the thumbs up with everything that they have, they have come up with.
16:22 So it's really, really handy. If you're not using it, or if you're using tox, you should probably consider changing to Nox.
16:31 - That's cool.
16:32 You can, for example, write that you have a test and then say, as a decorator, sort of parameterize it: I want this to run on 2.7, 3.6, 3.7, 3.8, and it'll do that, right?
16:43 - Yeah.
16:44 So you can see it's like this example here, right?
16:47 So you can see we are parameterizing a different Django.
16:51 So we want it to first install this version and then run the tests, right?
16:56 And then later it will come and take this version and run the test.
16:59 But then in the command line, you can actually just select it to run only the test with this guy and skip this guy here.
17:06 So it's really, I mean, it's...
17:10 the ability that it gives you, it's incredible.
17:15 So if I could see, so you can see like here, right here, right?
17:19 This is exactly what like it goes into the pytest-ish world.
17:23 - I see, so you can run it and say, don't run the linter, or just lint it, don't run the test, or test.
17:29 You can even put Python expressions, it looks like, test and not lint, for example.
17:34 - Is it, I mean, it's just insanely great.
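For readers following without the screen share, a noxfile along the lines of what is being shown might look like the sketch below. It is not the file from the episode: the session names, Python and Django versions, and installed packages are illustrative assumptions. It only uses Nox features the speakers mention (sessions, multiple Python versions, `parametrize`, a Conda backend).

```python
# noxfile.py: a hypothetical example, not the one shown on screen.
import nox


@nox.session(python=["3.8", "3.9"])
def tests(session):
    # Runs once per listed Python version, each in its own virtualenv.
    session.install("pytest")
    session.run("pytest")


@nox.session
def lint(session):
    session.install("flake8")
    session.run("flake8", ".")


@nox.session
@nox.parametrize("django", ["2.2", "3.2"])
def django_tests(session, django):
    # One session per Django version: install it, then run the tests.
    session.install(f"django=={django}", "pytest")
    session.run("pytest")


@nox.session(venv_backend="conda")
def conda_tests(session):
    # Creates a Conda environment instead of a virtualenv.
    session.conda_install("numpy", "pytest")
    session.run("pytest")
```

From the command line, `nox -s lint` runs a single session, and `nox -k "tests and not lint"` selects sessions with a pytest-style keyword expression, which is the behavior Michael and Prayson are describing.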
17:39 >> Nice. Brian, what do you think of this?
17:42 >> I really like Nox. It's neat.
17:44 The use of parameters is really cool.
17:48 The example of using a couple of different Django's is good, but you can also build up matrices of testing easily with a couple.
17:58 You can stack these, so you can have two parameters together.
18:02 It's a pretty cool project.
18:04 I just really love tox, so I haven't switched.
18:09 But I know that there's Invoke also, people are using Invoke for automation, but people are using Nox for more than just automating testing. You can automate really whatever you want to.
18:24 You can run, it's just running a command, right?
18:27 - Nice, yeah.
18:29 Prayson, you've got a lot of comments from the live stream on this one.
18:31 Henry Schreiner says, "I love Nox.
18:33 Tox is mired in backwards compatibility defaults.
18:37 It is hard to tell what it's actually doing, whereas Nox is simple.
18:40 It doesn't hide or guess stuff.
18:43 It's just programmed like pytest," which sounds great.
18:45 Sam Morley says, "This is the only way to write a makefile, with Invoke." (laughing)
18:52 - I mean, I had that one.
18:55 - Yeah.
18:56 Henry also says, "The PyPA projects have some very powerful noxfiles: cibuildwheel, pip, and so on," which is good.
19:04 And then Sam Morley also has a question for you.
19:07 Can Nox also run external tools, for example, build a C extension or run a C test suite?
19:13 - Oh, I don't know, Brian.
19:15 - I don't know that either.
19:17 - I assume so.
19:18 - It definitely can because Python has subprocess, but can it do it without you forcing that into it?
19:26 But you could put technically, you know, Python call this other command, right?
19:31 - Well, there's an example in the tutorial of calling CMake.
19:36 - Yeah, I saw the CMake as well.
19:37 So that probably counts, right?
19:39 - Yeah.
19:40 - Yeah, I think that would count.
19:41 - So it's just running a command.
19:43 - Yeah. - Yeah.
19:44 - Of course.
19:44 - And then Brian, Brandon out there has a comment for you.
19:47 New lights look great.
19:48 (laughing)
19:49 I agree with him.
19:50 I actually need to adjust my camera a little bit, which is a little bit off on the lights.
19:54 Very cool.
19:55 All right, let's see.
19:57 I think, Brian, you got the next one.
19:59 - Oh, okay.
20:00 I forgot what I was talking about.
20:02 I've got the old document there.
20:06 I've got a couple of things I wanted to talk about.
20:08 One of those extra, extra, extra things, but there's just two.
20:12 A couple of things around dealing with text.
20:15 I've been playing with my blog a little bit lately, not really writing much, which is a problem, but actually dealing with some of the old text.
20:24 >> What you wrote looks really good now.
20:25 >> Well, I'm trying to automate some of the parsing of some of the old stuff.
20:33 I grabbed a whole bunch of blog posts from WordPress.
20:39 Nobody needs to throw eggs at me, I'm already switching and using Hugo now.
20:45 But I've got a whole bunch of files that I automatically generated Markdown files, but there's problems with them, so I have to keep track of them.
20:54 I've got some scripts, a couple of tools are helping me.
20:57 python-frontmatter is a package.
21:02 It's just a really small package, and all it really does is take YAML-style front matter and parse it.
21:13 You could just load it.
21:15 I'm using it on Markdown files; the example shows a text file.
21:19 You can get at all the pieces of the file, like the content and stuff, but for instance, I can grab the title, you can look at what the keys are.
21:29 For blog posts, I've got tags and the date, and it's all converted to Python objects.
21:39 If I have a date listed in a blog post, it'll show up as a datetime object.
21:46 You can do math on it and all sorts of stuff.
21:49 This is pretty cool.
21:50 It's really small, but super handy for what I need.
21:54 >> Yeah, this looks nice.
21:56 >> The other tool I wanted to talk about, which has an even tinier use case, I think, is called ftfy, "fixes text for you."
22:05 Really, it just takes bad Unicode conversions and makes them good.
22:11 It takes common problems with Unicode conversions and fixes them.
22:16 >> It looks like you have Greek or Russian letters or something instead of a space or apostrophe or something like that.
22:22 >> Yeah. The first example, a quick example, there's this weird AE character, and really it was intended to be a check mark.
22:31 So it just converted it to the proper what it was.
22:35 I'm not sure how it's doing this, but it's pretty neat.
22:37 >> That is very cool.
22:39 >> This gets me all the time.
22:41 My stuff goes from Word, if I'm converting from Word or something, or copy and pasting, or other things.
22:49 There's a lot of different quote marks that word processors put in, and it just ends up being gross in a lot of places.
22:58 Having that converted to just, one example is the Mona Lisa doesn't have eyebrows, but instead of just apostrophe T, it's this weird, ugly, big Unicode thing.
23:11 Yeah, so just replacing that with an apostrophe is a good idea.
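Brian says he isn't sure how ftfy does this. The most common cause of this kind of garbage is UTF-8 bytes that were decoded as Windows-1252, and the standard-library sketch below reverses exactly that one mistake. It is only an illustration of the idea, under that single assumption, and not how ftfy is implemented; the library's `ftfy.fix_text` handles many more cases with careful heuristics.

```python
def undo_mojibake(text: str) -> str:
    """Reverse one specific mix-up: UTF-8 bytes read as Windows-1252."""
    try:
        # Turn the wrongly decoded characters back into the original bytes,
        # then decode those bytes the way they were meant to be decoded.
        return text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # The text was not damaged in this particular way; leave it alone.
        return text


if __name__ == "__main__":
    # Simulate the damage: encode correctly, decode with the wrong codec.
    broken_check = "\u2714 No problems".encode("utf-8").decode("cp1252")
    broken_quote = "doesn\u2019t".encode("utf-8").decode("cp1252")
    print(repr(broken_check), "->", undo_mojibake(broken_check))
    print(repr(broken_quote), "->", undo_mojibake(broken_quote))
```

The two simulated strings correspond to the check mark and the curly apostrophe examples Brian mentions.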
23:16 >> Yeah, nice. Does it change single quotes to double quotes and stuff like that as well?
23:21 >> I don't know.
23:23 >> It's nice. I don't know if it should either. I'm not sure.
23:28 Yeah, this is cool. So you just run this across your markdown files or something like that?
23:33 >> Yeah. So I'm not using it really for the blog stuff, but there was some other text parsing I was doing where I was scraping some information from somewhere.
23:41 and it just was just gross.
23:43 It had a bunch of gross Unicode stuff in it, and I just wanted to have something easy to just convert it quickly.
23:51 And this does the trick.
23:53 - Yeah, very cool.
23:55 Nice one, nice finds.
23:56 So I'd follow up on that.
23:58 I was playing with my Oh My Posh shell and the new Windows Terminal and the new Windows PowerShell on Windows 11.
24:06 Ooh, earlier this week, trying to set up some testing over there, And I found they have all these cool themes that show you all kinds of neat stuff.
24:13 So you can see like the Git branch you're on, and they've got these little cool arrows and all these colors, and they'll even do certain things for like showing the version of the Python virtual environment that's active in the prompt and stuff like that if you activate the virtual environment.
24:29 And all that had a bunch of weird blocks and like squiggly junk like that.
24:33 And so it's not exactly the same problem.
24:35 I'm gonna talk more about this later, but I found that there's this place called Nerd Fonts, and apparently Oh My Posh is tested on Nerd Fonts, and Nerd Fonts is full of all these amazing developer fonts that have font ligatures and all sorts of cool stuff.
24:49 And they're all free.
24:50 There's like 50 developer fonts and terminal fonts and stuff.
24:54 So yeah, one more thing along those lines to check out.
24:57 Very neat.
24:58 But what I wanted to talk about is stealing this idea from Prayson that he was gonna cover, but I got to it.
25:04 Got to it before.
25:07 So there's this new project that recently is making traction.
25:11 It's been around for a couple of months, even I guess it's about two years old, honestly, but somehow it got discovered and is now getting some traction called MPIRE, M-P-I-R-E.
25:22 And the idea is it's a Python package for easy multiprocessing.
25:27 It's like the multiprocessing module, but faster, better, stronger.
25:31 It's like the Bionic one.
25:32 So the acronym stands for MultiProcessing Is Really Easy.
25:37 I love that thought.
25:39 And it primarily works around taking multiprocessing pools, but then adding on some features that make it more efficient.
25:47 For example, instead of creating a clone, a copy of every object that gets shared across all the multiprocessing, it'll actually do copy on write.
25:55 So it won't make a copy of the objects you're just reading, it'll only make a copy of the ones you're changing.
26:00 So if you start like 10 sub-processes, you might not have to make copies, 10 copies of that, which can make it faster.
26:06 It comes with cool like progress bar functionality and insight to how much progress it's made.
26:11 It's also supposed to be faster, I'll talk about in a second, but it has map, map unordered, and things like that, iterative maps.
26:19 The copy-on-write I talked about, which is cool.
26:22 Each worker has its own state and some like startup shutdown type of behaviors you can add to it.
26:28 It has integration with tqdm, the progress bar.
26:33 What else does it have?
26:34 Like I said, some insights.
26:36 It has user-friendly exception handling, which is pretty awesome.
26:39 You can also do automatic chunking to break up blocks of queues across sub-processes and multiprocessing, including NumPy arrays.
26:49 You can adjust the maximum number of tasks or restart them after a certain number.
26:54 Restart the worker processes after a certain amount of work.
26:57 So in case there's like a memory leak or it's just hasn't cleaned it up, you can sort of work on that and create pools of these workers with like a daemon option.
27:05 So they're just up and running and they grab the work.
27:08 Let's see, it can be pinned to a specific CPU or a range of CPUs, which can be useful for avoiding cache invalidation.
27:19 So if you're getting a lot of like thrashing and moving across different CPUs, then the caches have to read different data, which is of course way, way, way slower.
27:27 So a bunch of neat things, I'll show you a quick example.
27:30 So in the docs, if you pull their page up, there's a multiprocessing example.
27:35 So you write a function and then you say, with Pool, processes equals five, as pool, pool.map, and give it the function and the data iterable, and it runs each one through there.
27:44 With the MPIRE one, it's quite similar.
27:47 You just create an MPIRE WorkerPool and you specify the number of jobs.
27:51 It says the differences in the code are small, you don't have to relearn anything, but you get things like all the stuff I talked about, the more efficient shared objects, the progress bar, if you want.
28:01 You can just say progress bar equals true and you automatically get a cool little tqdm progress bar.
28:07 You get startup and shutdown methods for the workers so you can initialize them and what else you need to do.
28:15 So yeah, pretty cool little project.
28:17 And the benchmarks show it down here at the bottom in the fast area so you all can check that out.
28:22 Prayson, what did you like about this?
28:24 Well, I think it's also going to transition really well to the other topic that I have. I like when someone creates an API that you can just easily plug into your existing code.
28:39 Yeah.
28:39 So you can just import this as this and do not change the entire code and then you take care of that.
28:44 You know, like writing your code in a way that one can just plug and play.
28:49 That's the amazing thing.
28:50 So it's easy that you don't have to relearn a lot of stuff, but it just gives you the power that you need.
28:56 So this is why we moved toward this one.
28:59 So we gain the power without changing much of our code.
29:02 - Yeah, yeah, definitely.
29:03 I love that as well.
29:05 You know, I think of like HTTPX and requests for a while, and I think they diverged at some point, but yeah.
29:11 Let's see some feedback from audience real quick.
29:14 I'll jump back to the nerd fonts.
29:15 Chris says they're amazing.
29:17 Henry Schreiner says, "Fish shell plus Fisher plus oh my fish, then the theme Bob the fish plus Sauce Code Pro nerd font is fantastic." Oh my gosh. I have no idea.
29:29 >> These are great names.
29:30 >> You're going to send me down a serious rabbit hole.
29:32 I'm going to be losing the rest of the day.
29:35 >> No way.
29:35 >> I'm afraid.
29:36 >> Well, I keep on messing up my terminal every time I start fiddling around, right?
29:42 >> That's right.
29:43 Because I'm using, yeah, WSL, Windows Subsystem for Linux, right?
29:48 So whenever I fix something, then I get it right.
29:51 And before I know it, I broke it again.
29:54 And so, but yeah, it looks really awesome.
29:57 - Yeah, fantastic.
29:58 And then on the topic we were most recently talking about, Chris May says, "Whoa, MPIRE looks nice."
30:04 Alvaro asked, will it help to get logging working in multiprocessing?
30:09 I don't know that it'll make any change.
30:11 I mean, it really is mostly still multiprocessing.
30:13 So probably not.
30:14 Yeah.
30:14 Yeah.
30:15 Very cool.
30:15 All right.
30:16 Prayson, I think you got the last one here.
30:18 Yes, yes, yes.
30:19 So I have this awesome tool here.
30:22 It's called skorch.
30:25 It's really like a mixture of scikit-learn and torch.
30:30 This is a really, really cool bit, as we were talking about building an API that is easy to integrate.
30:38 So if someone already knows scikit-learn and a bit of PyTorch, then you don't really need to learn anything in this tool, because everything just fits in together.
30:48 So basically, when you're using Scikit-Learn, so if you are not familiar with Scikit-Learn, it's just this, what we call, the must-have toolkit for data scientists, because here they have created a really good tool with a really good API, where you can build an entire pipeline from cleaning your data to building interesting models and everything like that.
31:16 But the biggest problem which we've been experiencing when working with Scikit-Learn is when it comes to neural networks, that you really don't have a lot of power to customize your networks in the way that you will… Like, it's very limited with this input that you already have here.
31:37 And in most cases, someone says, "Well, just create your own neural network classifier or a regressor, and then wrap it in the scikit-learn wrapper." But then, sometimes one does not want to do that.
31:51 But the nice thing is, another guy just came up with this project, which is really, really neat.
32:00 So basically, it's just, I think mostly, I will just go about, maybe I should shamelessly show you an example in one of my gist, which is, I know this is a shameless way to do, but it's easier like giving a demo on how it works, right?
32:21 So like, if you're using scikit-learn, you are very familiar with all these other tools that someone needs to have, like the way to split your data, et cetera, et cetera.
32:29 But then it's the pipeline and all that kind of stuff.
32:33 But the coolest thing is, instead of using one of the scikit-learn models, you can create your own custom neural net.
32:42 This will be like a neural network where we decide how many nodes we want in the first layer, how many nodes we want in the second layer, and here we can build as many interesting nets as we see fit.
32:56 and then basically here we just do the calling of it.
33:00 So this is a very standard PyTorch way of creating your net.
33:05 The awesome part is that now this net, forgetting about all this process, we can see, so we just create this net, wrap it up like this, and now we are using it as part of our pipeline.
33:17 So you can see, I will just go down right here.
33:20 So I'm having my preprocessor, scikit-learn-ish, and I'm having my net.
33:25 And the coolest thing is, now I just call this thing as I will do with any scikit-learn model, with my classifier.fit on this, and later I will do my classifier.predict on these things. So in this example we're trying to predict the species of penguin given the data that we have. So this whole thing is really, really cool because it hides the whole fuss: when you do it in PyTorch, pure PyTorch, you will have to write this for loop with the optimizer, stepping up, stepping down, all these things.
34:02 But here, we're just transforming to the scikit-learn world, where you just do fit, which just trains your model, and now you can just do predict as if you're predicting with any other scikit-learn tool.
34:15 So, skorch is a really, really cool tool that just does that.
34:21 So it allows you to connect your Torch net with the Scikit-learn pipeline.
34:27 So this is really, really awesome.
34:29 So I would just encourage people to take a look at it.
34:32 I love the idea of it that basically you can create these PyTorch models and do what you need to do to set them up and then just hand them off to the rest of the Scikit-learn world.
34:42 And I can see some really interesting uses for this.
34:45 Like I've got some library and it can either integrate with PyTorch or it can integrate with scikit-learn and it just uses this little wrapper to pass it around. I like it.
34:54 Yeah, yeah. So for me, it just gives me this ability to create these more extended algorithms and then just continue using my scikit-learn, my scikit-learn pipelines. So that's the coolest thing, that I don't have to change my code, because I just want to replace one line, and that is the model.
35:18 So I get the model from skorch and then pass it in where I'd ordinarily have something like logistic regression instead.
35:25 Now I'm using a net.
35:27 - Love it.
35:28 - Nice. - Brian, what do you think?
35:29 You like this pattern?
35:30 - Yeah, I do.
35:31 I like the pattern of being able to use, not have to change your entire tool chain, just to change one piece.
35:38 Nice and clean. - Yeah.
35:39 I like it as well.
35:40 So that's it for our main items.
35:44 Brian, I've got one I feel like, I feel like I should have let you have this one, but I grabbed this little extra thing I wanted to throw out there 'cause I thought it would make you happy.
35:51 - Neat, can't wait.
35:53 - Yeah, so Marco Gorelli sent over this thing and said, "If you want to work in JupyterLab," right?
35:59 I know that one of your requirements for working with tools and shells and stuff is that they're Vim-ish.
36:05 You can do Vim keyboard things to it.
36:07 - I'm excited.
36:08 - Yeah, so he sent in this thing called JupyterLab-Vim, which is Vim notebook cell bindings for JupyterLab.
36:14 So if you're editing a notebook cell, you can do all of your magic Vim keys to make all the various changes and whatnot that you want.
36:23 So yeah, cool.
36:25 What do you think?
36:25 - I'm definitely gonna try this, yes.
36:28 - Yeah, awesome.
36:29 All right, let's see, what else do I have?
36:30 I got, oh yeah, this, nevermind my picture.
36:33 I didn't really intend to put that up there, but I just wanna point out that I'm gonna be speaking and the reason the picture's there is the conference, the PyBay conference that's running next month.
36:43 They featured my talk that I'm doing.
36:45 So that's why there's a picture of me, but the PyBay 2021 food truck edition, they have rented out an entire, like, food cartopia type place with a bunch of these pods, and they're having a conference outdoors and putting up multimedia like TVs and stuff for each pod.
37:02 Even if you're not at the, like a great line of sight, you can still see the live talks, but sit outside and drink and eat food cart food in California, sounds fun.
37:11 So I'm gonna be talking about, what did I say my title of my talk was?
37:15 It's gonna be HTMX plus Flask, Modern Python Web Apps Hold the JavaScript.
37:19 So I'm looking forward to giving that talk in there.
37:21 So people, if they're generally in that area, they might wanna check that out.
37:25 - I might, that just sounds fun.
37:28 - Yeah, yes, indeed.
37:29 All right, that's it for my extra items.
37:32 You got any extras, Brian?
37:33 - No, how about you, Preysen?
37:35 - Yes, I got one.
37:37 I had to actually search if this one has been covered, and I was surprised that it has not been covered.
37:43 - I don't think it has, what is this?
37:45 - There's something called py.inv.
37:49 So we've been using py.inv to, of course, one can say, why don't you just use os.environ, then get whatever that is?
37:58 Why do we need to install another package just to get the environment variable or something?
38:03 But this is pretty neat.
38:06 It's quite a recent project, I think, and it's rising slowly.
38:12 And there's a lot of contributors and it's very promising.
38:17 So what it does, I think I can just bring it somewhere here.
38:22 It allows you to do all this type conversion, casting, etc.
38:30 Like you can say, I'm going to get my debug here and then I will set the defaults and also I will do the casting here.
38:38 So this is really, really neat.
38:40 So often when you're reading config files, everything is a string and then you're like, oh, this one is a date time, so I got to parse it.
38:45 This one is a float, so I got to parse it.
38:49 But it's really even, it's so much more than that.
38:51 So there's another way where you can say from decouple import AutoConfig.
38:56 So it goes and searches where that .env file is.
39:01 So otherwise you can just tell where the environment variable is.
39:04 But it's just neat.
39:06 It's very simple.
39:07 It does what you want it to do.
39:10 So I would really encourage people to look at it.
39:15 We have just changed every place where I've been using dotenv or os.environ with this one.
39:21 and it's just helped me clean some unnecessary steps in my code.
39:26 - That's pretty cool.
39:27 - Yeah, yeah, great, great idea.
39:29 Definitely check that one out.
39:30 All right, well, I think that's it for all of our items.
39:34 Well, what do you think?
39:35 Should we do a joke?
39:36 - Definitely.
39:37 - I love it 'cause I've almost forgotten what the joke is, so it's gonna be new to me as well.
39:40 All right, so the joke is called adoption.
39:43 This comes from monkeyuser.com.
39:45 And you've heard about the Python idea of you came for the language, but you stayed for the community.
39:51 Well, what if it is a little bit different?
39:53 What if actually people get brought in unwillingly and then they kind of realize they like it.
39:59 So here's a picture of like kind of an open field, you know, think Gazelle or something.
40:05 And there's a couple of developers just running and there's one who is fixated on a butterfly who doesn't actually see what's, there's a bunch of like a pack of Python developers coming to adopt them.
40:15 It says a pack of Python developers spotting a junior dev away from its pack, initiate their conversion assault.
40:21 (laughing)
40:24 - Ah, yeah.
40:25 This is good. - Yeah, silly.
40:26 Silly, silly.
40:27 - Man, I'm that way even for non-programmers.
40:29 So, and my family just sort of like rolls their eyes every time this happens.
40:34 But every time I, like, get somebody young coming over, either in college or high school or just out of college, I'll say, "So if you haven't done it already, no matter what your field is, you really should learn how to code.
40:50 And while you're at it, why not just choose Python?"
40:52 So I'm trying to make Python developers out of every person I meet.
40:55 - I think that's, do them a favor.
40:58 It'll be their superpower amongst all their non-developer friends.
41:02 - Yeah, definitely.
41:05 - That's funny.
41:06 - Brian, thanks as always, and Preysen, really great to have you on the show this week, and thanks for being here.
41:10 - Yeah, thank you, Michael.
41:11 Thank you, Brian.
41:12 - Thank you.
41:13 - You bet, bye.
41:14 - Bye.
41:15 - Thanks for listening to Python Bytes.
41:16 Follow the show on Twitter via @pythonbytes.
41:19 That's Python bytes as in B-Y-T-E-S.
41:22 Get the full show notes over at pythonbytes.fm.
41:25 If you have a news item we should cover, just visit pythonbytes.fm and click submit in the nav bar.
41:31 We're always on the lookout for sharing something cool.
41:33 If you want to join us for the live recording, just visit the website and click live stream to get notified of when our next episode goes live.
41:40 That's usually happening at noon Pacific on Wednesdays over at YouTube.
41:44 On behalf of myself and Brian Okken, this is Michael Kennedy.
41:48 Thank you for listening and sharing this podcast with your friends and colleagues.