

Transcript #281: ohmyzsh + ohmyposh + mcfly + pls + nerdfonts = wow

Recorded on Wednesday, Apr 27, 2022.

00:00 Hello, and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds. This is episode 281, recorded April 27, 2022. I'm Brian Okken.

00:11 I'm Michael Kennedy.

00:12 And I'm hanging out story.

00:14 Welcome, Anna. Before we jump in, tell us a little bit about yourself.

00:18 Yeah, definitely. I'm a data engineer, at least at the moment; by training, I'm a linguist. I've studied both theoretical linguistics and computational linguistics, so I'm really interested in how information is encoded in our brains and how we share that information. Since I got my master's in computational linguistics, I worked in the Alexa AI org for a while. At first I was a language engineer, actually, so it was more on the linguistics side of things: dealing with extracting the semantic meaning and how to do that for Alexa. Then I gradually switched to data processing and moved into the role of data engineer, which I've had for about three or four years now. Currently I'm with a worldwide sports retailer, working with lots of data there.

01:18 Okay. Wow.

01:20 That is fascinating. Yeah, it's really neat how we can speak to our devices these days, and they actually work and do amazing things, right? I know when Alexa first came out, and Siri especially, it was like, I don't really want to do that; this thing is so not good. And now I talk to my devices all the time. It's amazing.

01:42 Yeah, there are some really sophisticated things that they can do now; sometimes I can't even believe we're actually getting there. So it's pretty exciting. And sometimes, I admit, there are simple things where you think, really, you can't do that? But having worked on it, I realize that something that seems trivial from the outside takes so much work behind the scenes, and with the actual device you sometimes just don't get those little corner cases right. One of the things I got to work on was actually helping Alexa know when she needs to stop: when she needs to stop talking and telling you about things. So yeah, it's funny.

02:42 So, well, for our first item, Michael, do you want to kick it off?

02:46 I will definitely kick it off. Let's take it to the next level with this one. So this is an article by luda called Take Your GitHub Repository to the Next Level. There are about 13 levels, but, you know, I guess it's a spectrum; you decide which level you want to take it to. So here are basically 13 ideas on how your GitHub repository can be better. There was a topic I was going to cover, and after I explored it more I decided not to. As part of it there was a conversation about some WebAssembly stuff in Python, and I checked it out; it's really cool. They said, we're going to use this library; this is the fundamental thing that makes it work. And I go to the GitHub repo for that, and it says, here's how you build it. And I said, wait, okay, great, but why do I want it? What can I do with it? How do I use it? I don't care about how to build it; that's the last thing. I'll just download the Wasm file. But what do I do with it once I get it? There was just none of that. And so this article helps you think through those ideas. Oh, nice. So number one, and you know it's Python friendly because it starts with step zero rather than one: make your project more discoverable. Every one of these comes with a recommendation, a bit of a description, and then examples, which is cool. Nice. So for example, this one says that to help people find your project, if the name of your project doesn't clearly describe what it is, you can put tags on it, basically. So refactoring, or science, or things like that might be something you put on there that's not immediately obvious from the name, right? You can tag subject areas and whatnot. And they have some examples. For example, there's this thing called Well App, which is like a mindfulness app for the Mac. Of course it's for the Mac, isn't it?
So it has tags such as macOS, productivity, happiness, and mental health, but also Flutter and web app, if people wanted to check out a Flutter web app, right? Okay. So there are other examples as well. That's step zero. Step one is choose a name that sticks: something that's available on PyPI, something that people can Google, something that people can say that doesn't sound silly or unprofessional if they were to use it. You wouldn't call your web app fancy-pants server, right? You wouldn't say, well, our fancy-pants server is really scaling today. You wouldn't want to speak that way, necessarily. So don't name it that way. Right.

05:07 Yeah. So choosing a name that sticks, and one that we can say on air. So, yes, exactly. And

05:13 is, you know, somewhat predictable in the pronunciation maybe, because that's also a challenge. But so there's some examples of Yeah, no, anything. Yeah,

05:24 absolutely. Just thinking about the names, something that I run into, particularly with Python: some services or applications, and libraries as well, have a name that ends in "py". And sometimes you don't hear the "py", and in that case it's confusing. Then you're talking to somebody else about the same thing, and they're constantly confused. So yeah,

05:49 yeah, I agree. It matters a lot. Let's see. So some of the tips are: conduct a thorough internet search for the name, avoid hard-to-spell names, get the .dev or .io domain if you really, really care about it. You know, is it some random small little package, or are you trying to create the next FastAPI? Right. Pick a name that conveys some meaning. I was thinking about Jupyter, for example. Jupyter is pretty interesting because it's kind of hard to spell, but once you know it, you just know it, and it very clearly works well in a search. There's probably no other domain name that's like a misspelled planet, you know? I mean, it was probably a really good choice, even though it kind of breaks the maybe-hard-to-spell rule

06:27 at first. Yeah, but it's easier to search. Right? So

06:31 yeah, so the example they give for this one is Size Limit; that's the name. And what does it do? It calculates the real cost to run your JavaScript app or lib to keep good performance, and it'll show an error in a PR if the cost, basically the file size, exceeds the limit. That's cool. The next one, I'm all about this: display a beautiful cover image. If you go to a repo and it's just text, that's not amazing. You want some color, and you don't necessarily have to have an amazing logo. So they come back to this Well App, and it's just a W with like a little connection smile or something under it. One thing I did learn from this, though, that I thought was interesting: how do they center this image but not have it go all the way across the readme? If you go to the readme and actually look at it, apparently GitHub will let you put full HTML inside your readme for the segments that need lots of formatting. I thought that they wouldn't. I know some Markdown does fall back that way, but I didn't think GitHub did. Anyway, apparently, yes, you can. Also, add some quick badges, like CI passing, what's the license, and so on. Is there a link to a YouTube channel that shows people how to use it? There are more of those as examples. Write a convincing description in a paragraph or two that covers things like: what is this repo or project? How does it work? Who will use it? What is the goal? And so on. Right, a real simple one. And again, they come back to Size Limit: it's a performance tool that will crash your CI if it gets too big. Here we go, getting into the ones that Brian and I love: record visuals to attract users. Yes. So you might think there's no UI aspect, but here's a full-on CLI example. That is Create Go App CLI.
And all it does, imagine this, is create Go apps on the CLI. It's a good name that conveys what it does. But if you go to see how to create one, it has the option, and then under it, it has an animated GIF doing the thing that creates the app and showing you the tree structure that results, you know, the file structure that results, and so on. Then there's a full video and documentation for that thing, and so on. So that's pretty awesome. And how about you, Brian? I'm always trying to quickly jump into a project and figure out: what is it about? Is it polished? And so on. But, you know, that's because we run this podcast. How do you see the sort of pictures and animations for repos?

08:47 Yeah, that's super helpful. I really like the idea of the animation, basically taking you through the kinds of things that a particular app, for instance, can do. That's super helpful. More and more people are doing it. I don't think it's super popular yet. I don't know about you guys, but I haven't seen it a whole lot of times. Yeah, it just looks nice.

09:14 Yeah, I really like it as well. All right, let's see. Another one is create a practical usage guide: how to use it, with some examples and some templates. Answer common questions, like an FAQ: can I use it on Windows, or does it require admin support? I don't know, something like that. Build a community. This is probably further down the line, but do you have a Discord community for your project? Or you can even just enable discussions on the GitHub repository. I'll end up with people opening issues on my various repositories saying, I have a question. A question isn't an issue; an issue is a thing that is wrong or a thing to be improved. But traditionally they didn't have another way to communicate. GitHub now has, in addition to issues, a discussions section that's more open ended. I think that's off by default, if I remember correctly; at least on the older repos it is. So I go and turn that on. Beyond that, add good contributor guidelines, and choose the right license. Remember, if you don't use a license at all, that means it's unlicensed, and people can't really use it. Add a roadmap, and create GitHub releases. One thing that I didn't pull up that's pretty cool is Release Drafter. I'm not sure if you all are familiar with this, but Release Drafter drafts your next release notes as PRs are merged into master or main, depending on how you set it up. That's pretty cool. Customize your social media preview: if somebody shares your project, you can control what is shown in that little Twitter card or other cards. That can be customized inside of your GitHub repository. And launch a website; for that, you can use GitHub Pages, or Netlify is really easy and free for static sites, and so on. So there's a bunch of things people can do to take their repo to the next level. What do you think?

11:04 I think it's great. Yeah. I don't do any of these things. And I probably should. So I

11:13 have a picture, I have a usage guide. Oh, there's also one that talks about how to install it that I somehow skipped, but most things don't need more than pip install. One of the things

11:21 that I see a lot is, I don't know if this covers it, but I see documentation that's on Read the Docs, which is great. But I still think a quickstart, or a little "this is how you install it, and this is how you can do a little bit of something with it", should be in the readme, even if you have other documentation, because I don't want to have to go to the documentation just to see if this is the right project for me. So yeah, this is great. So we have a question: how does one create a CLI animated GIF? I don't know if this article covers that; I don't think so. Okay. I'd love to research that and get back to you. Yeah.

12:04 Well, Alvaro, what I do is use Camtasia. You can record a Camtasia video of just the window, and then there are different output options, like just the audio, or just the video, or an animated GIF. So that's one of them. Jeremy Page points out there are a few tools that record that, like asciinema. ASCII cinema, basically; I don't know how to say that. It's often used. Pretty cool.

12:33 And difficult names.

12:36 Exactly. I'm at a loss on that one. Claudio, who I just had on Talk Python, has a blog post about many of those things, and he has a cookiecutter with Release Drafter and badges. Yeah, we covered that on Talk Python. Nice. Especially around Hypermodern Python. Awesome. Well, that's probably way more than people want to know about their GitHub repository. But so often GitHub repositories these days serve as your CV or your resume when you go to apply for developer jobs. And if your repo looks like what they describe here, rather than a bunch of things with weird commit messages and nothing else, that's going to make a different impression. Same if you want people to adopt it and start using it.

13:14 Yeah, if you don't, then don't put the stuff in. Yeah,

13:17 exactly. So Brian, let's go faster.

13:22 Well, let's go faster. Speaking of CLIs, this is a fun tool. We're talking about Fastero. Fastero? I'm going to go with that. So this is like timeit, but on the command line, and it's pretty neat. This is by Aryan Wasi, and we've covered something of his before. So

13:48 it was the type explainer thing. All right. I don't remember its exact name, but it was a type explainer where you put a typed thing in there and it humanizes what those meant.

13:58 So this is a simple little tool, but I'm loving it already. It times stuff, but it also compares times. We're showing the website here, but I can't tell what they're timing, so let's just pull over to the documentation; it does have a bunch of examples. So you run fastero with two code snippets, and the example we're showing is just producing a string with either str or an f-string, and timing those. So that's pretty neat. If you run those two code snippets, it'll run both of them a whole bunch of times and do some statistics. In this example, it's running them 20,000 and 50,000 times; no, 20 million and 50 million. Wow. Then it shows you a little progress bar, and then who wins. If you're not comparing two things, it'll just show one with the same graphics, but you can do more than two. I did three or four, just trying this out to time different things and compare them. And this is often why I'm timing something: I'm comparing two things, and I want to see which one's faster. So this is a really cool feature. You can either pass in code snippets, or you can give it Python file names and run both of those. It has a whole bunch of really cool features, actually. And one of the things I like is that if you've got some code snippet that needs some setup, but the setup part isn't the part you're timing, you can give it some setup code to run before it does the timed part. So that's pretty neat. Anyway, just a really nice-looking command line interface timing tool.
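
The compare-two-snippets-with-setup workflow described here can be approximated with the standard library. This is a minimal sketch that uses `timeit` rather than fastero itself; the `compare` helper, the two snippets, and the setup string are illustrative assumptions and are not taken from the fastero documentation.

```python
import timeit


def compare(snippets, setup="pass", number=100_000, repeat=5):
    """Time each snippet and return {snippet: best seconds per run}."""
    results = {}
    for stmt in snippets:
        # repeat() returns one total per round; the minimum is the least noisy.
        totals = timeit.repeat(stmt, setup=setup, number=number, repeat=repeat)
        results[stmt] = min(totals) / number
    return results


if __name__ == "__main__":
    # The setup code runs before each round and is excluded from the timing.
    timings = compare(["str(n)", "f'{n}'"], setup="n = 1")
    for stmt, seconds in sorted(timings.items(), key=lambda kv: kv[1]):
        print(f"{stmt!r}: {seconds * 1e9:.1f} ns per run")
```

Taking the minimum over several rounds is a common choice for micro-benchmarks, since the slower rounds mostly reflect interference from the rest of the system.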

15:45 Yeah, that's very cool. So you

15:47 can sort of isolate the things that you really want to time from the setup you don't care about? Yeah, I

15:54 haven't tried the setup part, but it's cool that it has it in there. The documentation is very nice and pretty thorough, actually, as well. There's quite a bit of customization available.

16:06 That's cool. Yeah, I agree. That is that is nice, that setup stuff. Because so often, if I want to profile like some web app or something, it's the thing I want to profile is dwarfed by just loading up the framework and scanning all the files, you're like, Alright, now I gotta hunt down for that little fragment that actually represents what I'm really after. So pretty cool.

16:24 Yeah, maybe I'll try. Sometimes. Yeah. And you can pass

16:27 in strings of Python, or you can pass in files. Yeah. And when I saw the strings bit, I'm like, all right, there's a good use case for semicolons in Python. Right. You'd be using them for good? Yeah. So exactly. If it makes you feel better. Awesome. Well, it's a good one. All right, on to you. What's your first one here?

16:46 Yeah. So I wanted to talk a little bit about language data, my line of business. I was thinking that it's something you could be really interested in, especially the part of the audience that works on data science projects. In general, when you're collecting data, in most cases you get some kind of noisy data that you need to clean up and filter in some way. I imagine you've seen that with your large international audience as well. And if you're working with data from social media, which is very popular right now, one of the things you have to do is identify the human language of the data you're working with. You may want to filter out the pieces of data that are, for example, not in English, if you're going through social media posts or something.

17:51 Right, like the little "translate this to your language" button at the end, if for some reason the popular post is in Spanish or something. Right? Exactly.

17:59 Yeah. And some of the platforms, or their APIs rather, do provide this kind of filtering on their back end. I know Twitter does that. But I also know that sometimes it's not as reliable; I could imagine that it's not really the ultimate goal of the platform, so maybe they're not putting as much love and care into this question. So that's something that I had to do a few times, and a couple of libraries that I've worked with are langid and langdetect. There are a few more out there. These ones have been out there for a while; langid hasn't actually been worked on for a few years now, but it's still one of those benchmark libraries for this kind of question. And both of those are super nice, actually. So langid is really popular, and one of the things that I really like about it is that it covers a lot of languages. I've actually seen different numbers depending on the documentation I was using, PyPI or the GitHub page. At some point it was covering 97; I think the GitHub page says 97.

19:28 That's a lot of languages. I couldn't do 97 languages.

19:34 I'm a linguist, and I would have trouble naming, you know, maybe seven languages off the top of my head. I definitely don't speak 97 languages. Some of the nice things about it are that you can use it as a standalone module, like a command line tool, for instance, but you can also use it as a web server. So that's really neat. And one of the things that was really helpful when I was trying it out for some of my projects is that when you try to identify the language of a text using langid, it actually outputs the raw score from its calculations, in log-probability space, which is very typical in this area: you get these funky numbers at the end that are hard to read, strictly speaking. But the good thing is that you can convert them to a more familiar confidence score, the kind that data scientists especially are used to. And that comes in super handy, because when you're trying to filter the data, and you know that this kind of tool is obviously not 100% reliable, you can use the confidence score as a gate: okay, I'm taking this answer and relying on it, or maybe I'll just drop this piece of data altogether, because it looks like the language identifier is not super sure what language it is, if you're targeting a specific language. And

21:09 Oh, yeah. So you basically might get: we're 80% sure it's English, but it might also be Spanish or something.

21:19 Exactly. Yeah. English can easily be confused with maybe German, or with contemporary French, with vocabulary now circulating between those languages. So the identifier is not going to be 100% sure that this is one particular language. The funny thing is, I'm not so sure about langdetect, but langid is statistical, actually, if I'm remembering correctly. And the flip side of that is that it works better the bigger the piece of data that you're feeding into it: the more data, the more confident it's going to be. That's how machine learning works, generally speaking. If you're working specifically with short tweets or social media posts, really short phrases interspersed with emojis and stuff, it's probably not going to be super confident. So the more data you feed it, the more confident it is and the better the performance of the language identification. That's something to keep in mind when you're working on language identification.
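
The conversion from raw log-space scores to a confidence value can be sketched as a softmax over the candidate languages. The scores below are invented for illustration and `to_confidences` is a hypothetical helper; this shows the general idea and is not a description of how langid is implemented internally.

```python
import math


def to_confidences(log_scores):
    """Turn {language: log-space score} into probabilities that sum to 1."""
    # Subtract the maximum first so exp() cannot underflow every term to zero.
    top = max(log_scores.values())
    weights = {lang: math.exp(score - top) for lang, score in log_scores.items()}
    total = sum(weights.values())
    return {lang: weight / total for lang, weight in weights.items()}


if __name__ == "__main__":
    # Hypothetical raw scores for one short text.
    raw = {"en": -54.4, "de": -56.0, "fr": -59.1}
    ranked = sorted(to_confidences(raw).items(), key=lambda kv: -kv[1])
    for lang, probability in ranked:
        print(f"{lang}: {probability:.2f}")
```

The langid README also describes an option for returning normalized probabilities directly, which would make a hand-rolled conversion like this unnecessary in practice.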

22:36 That makes sense. If you have one word or something, it's... Yeah, exactly. Also, this being one file is insane. It acts as a web server and does all sorts of stuff. It's crazy.

22:49 Yeah, and that's something that I really like about it. It's pretty lightweight, a low-dependency package, which is fascinating. It's based on a not super sophisticated naive Bayes algorithm, if I remember correctly. So yeah, it's really fun to go through, and it works so nicely. The other one that I wanted to juxtapose with it is langdetect, which is another package out there, and which I happened to find a little bit more robust when I worked with human language data in my project. It's also really neat and easy to use. The great thing is the basic API, which is very straightforward and quick to pick up; you don't need to dig through how it's doing things. If you can understand a library in five minutes, you know it's going to fit well into your project. The main methods are detect and detect_langs. So you can either just call it on a piece of data and get the most probable language, or you can have it return a list of possible languages, ordered by probability: maybe English first, and then a tiny fraction of probability assigned to German or something like that, and then you can decide for yourself. So overall, from my experience, langdetect works a little bit better than langid, but that's sort of an empirical

24:49 thing. Super useful for anyone that needs to parse text and can't be sure it's all in one language.
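
The two usage patterns mentioned above, taking the single most probable language or inspecting a ranked list and deciding for yourself, can be sketched on plain (language, probability) pairs. The candidate data, the helper names, and the 0.9 threshold are assumptions for illustration; with langdetect, pairs like these would come from the ranked results of `detect_langs`.

```python
def most_probable(candidates):
    """Return the language code with the highest probability."""
    return max(candidates, key=lambda pair: pair[1])[0]


def keep_text(candidates, target="en", threshold=0.9):
    """Keep a text only if the target language clears the threshold."""
    return dict(candidates).get(target, 0.0) >= threshold


if __name__ == "__main__":
    # Hypothetical detector output for a long post and for a short tweet.
    long_post = [("en", 0.97), ("de", 0.03)]
    short_tweet = [("en", 0.57), ("de", 0.43)]
    for name, candidates in [("long post", long_post), ("short tweet", short_tweet)]:
        verdict = "keep" if keep_text(candidates) else "drop"
        print(f"{name}: top={most_probable(candidates)}, {verdict}")
```

Dropping low-confidence items trades recall for precision, which fits the filtering use case described in the episode: short, emoji-heavy texts are the ones most likely to fall below the threshold.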

24:57 Yeah. So for anyone out there working on some kind of data science project with human language data, I would highly recommend it. One thing to note: while I think langdetect is a little bit more confident and robust, I know that it covers fewer languages. I think it's 55 languages total, compared to 97 for langid. But, yeah,

25:25 yeah, interesting. Not

25:26 that. Nice. Oh my God.

25:29 Let me tell you about our sponsor for this episode before we move on. It's a podcast. Amazing. So this episode of Python Bytes is sponsored by the Compiler podcast from Red Hat. Just like everyone out there, Brian and I are both fans of podcasts; we listen to podcasts all the time. That's why we started some: we like them. So I'm happy to share a new one from a highly respected open source company: Compiler, an original podcast from Red Hat. With more and more of us working from home or being more disconnected, it's important to keep our human connection with technology. Compiler unravels industry topics, trends, and things you've always wanted to know about tech through interviews with the people who know best. On Compiler, you'll hear a chorus of perspectives from the diverse communities behind the code. These conversations include questions like: What is technical debt? What are tech hiring managers actually looking for? Hint: see item one, to some degree. And do you have to know how to code to get started in open source? I was a guest on Red Hat's previous podcast, called Command Line Heroes, and that was a super produced and polished podcast; it was a really cool experience. Compiler follows along in that excellent tradition and that polished style. I checked out episode 12, how we should handle failure, which I found really interesting. I really value their conversation about making space for developers to fail so they can learn without fear of making mistakes, like taking down the production website, and so on, right? People grow through experimentation, but they also fail when they try new things, so you've got to make sure that they get a chance to grow. So learn about the Compiler podcast at pythonbytes.fm/compiler. The link is in your podcast player show notes, right at the top.
You can listen to it in all the places that you would think. So thanks to the Compiler podcast for keeping this podcast going strong. All right. Also, I just real quickly want to point out: I know people can just go to their podcast app, whether that's Pocket Casts or Overcast or whatever, and type in compiler and search. But please visit pythonbytes.fm/compiler; there's a place to subscribe with all of your various podcast destinations. That way they know it came from us rather than just out of the ether. So if you're going to subscribe or check them out, please do it through that link, just so people know.

27:48 Nice. Yeah. Yeah.

27:50 So how about we talk about watching some things, like files?

27:53 Yeah, we were listening. So now we're watching, we were listening. Now we're gonna watch,

27:57 but watch them for changes. Not watch what they are. So this one comes to us from Samuel Colvin of Pydantic fame, so, you know, there's some pretty cool experience behind developing this API. The idea is that it's a simple, modern, and high-performance way to watch files for changes. There are a lot of reasons you might want to do that. You might want to say, if somebody drops a file into this directory, I'm going to kick off a job to load it up and process it in some kind of batch processing. Or I want my web framework to automatically restart if any of the files in here get changed, a Python file or whatever. You could use it for things like that. But the modern part is pretty interesting. It hooks into the underlying OS file system notification systems, and that's done through the Notify Rust library. So basically it's a low-latency, high-performance, native, non-polling way of watching the files. It just goes to the operating system and says, hey, in this directory tree, if anything changes, call the callback. Nice. That's pretty awesome. Yeah. So there are real simple uses here: I can say from watchfiles import watch, and then just for changes in watch of some path, and then you can process those changes. So here's an example of an app that just starts, and its job is to pick things up as they change here. That might be an example of what I said about kicking something off to load it and parse it and decide what to do, and then maybe pass it to Celery for background work, right? On the other hand, you might want to do other things while you're watching for changes in your app, in which case there's also awatch, an asynchronous watch. So if you're doing other work and it's all asyncio based, you can just kick off the watching bit and wait for the changes to happen.
And then do other async processing, like FastAPI, or web or database calls, you know, web with HTTPX or database calls with Beanie or whatever other asyncio things, and it sort of lets you run them in parallel, which is cool, right? Yeah. And then if you want to go even further, you can kick off a separate process and say, start a process that will watch for changes here, and then call back this function if those things change. So that's pretty cool too. There are all these different ways in which you can use it. It's pretty neat. It's based on this Rust library, and it seems pretty powerful. There's also a CLI, and I did want to point out one other thing over here; I thought this might impress you, Brian. I can run a command line watchfiles command that will say, watch this directory, and if anything changes, rerun the failing tests. That's very cool. That's cool, right? So you just run watchfiles and give it the string pytest --lf, which is pytest's rerun-the-last-failures option, and it runs whenever anything changes. I think that's neat. The, the,

30:51 the command line stuff is actually cool. I'd check it out just for the command line usage. But the ability to use it programmatically too, with an API, that's impressive, and I'm very happy they included that.
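
For contrast with the OS-notification approach described in this segment, here is what a naive polling watcher has to do: snapshot the directory tree, snapshot it again later, and diff the two. This is a standard-library sketch with hypothetical helper names; it is explicitly not how watchfiles works, and it illustrates the repeated scanning that a non-polling watcher avoids.

```python
import os
import tempfile


def snapshot(root):
    """Map every file under root to its last-modified time in nanoseconds."""
    state = {}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            state[path] = os.stat(path).st_mtime_ns
    return state


def diff(old, new):
    """Return (added, modified, deleted) path sets between two snapshots."""
    added = set(new) - set(old)
    deleted = set(old) - set(new)
    modified = {path for path in set(old) & set(new) if old[path] != new[path]}
    return added, modified, deleted


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        before = snapshot(root)
        with open(os.path.join(root, "data.csv"), "w") as handle:
            handle.write("a,b\n")
        added, modified, deleted = diff(before, snapshot(root))
        print("added:", sorted(os.path.basename(path) for path in added))
```

Every poll walks the whole tree even when nothing has changed, so latency and cost both grow with the number of files, which is the trade-off the notification-based design sidesteps.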

31:02 Yeah, absolutely. If you're going to use it through the CLI, this is the perfect pipx install type of thing, right? pipx install watchfiles, and then it's not really tied to your projects; it's just always there. Anna, what do you think?

31:14 Yeah, this looks super nice. It made me immediately think about the file triggers that are built into cloud storage services as well. Yeah. I can imagine all the possible ways that it can be used. I haven't read the documentation; do they actually provide any example use cases? I'm curious if they do.

31:45 Yeah, I didn't see any in particular, just a couple of examples of how you might use it. But yeah. There's an older project called watchgod. I don't know anything about that one, but I'm glad I didn't learn about it, because now there's a new one called watchfiles. If you're using the old one, this is the successor to that as well.

32:01 It's a funny name, but I could see why some people might not want to use it. So

32:06 yeah, well, see item one, right? Pick a name that people are willing to talk about. Exactly.

32:11 Yeah. Well, I want to talk about a new tool as well, coverage, not. So hopefully all of us are familiar with coverage.pi. So it's maintained by Ned Batchelder really cool tool. But there's a new guy on the scene and the new person on the scene is slipcover. So slipcover. And actually, I heard about slipcover through the coverage, that pi Twitter account, which was interesting. And so not surprising, though, Ned's are pretty open minded guy. But so slipcover is, is coverage, but it's it's pretty new. So some of these commits, that's just within the last week or so that things this came in. So there's a it's still at like, I think the version is 0.1 point one or something like that. You even just got a new one out this morning. So why would you want to use something different? Well, the the the big selling point of this is it's really fast, it uses a different a different process for for getting the coverage information, and it supposedly is only a 3% overhead, which, depending on your code coverage that pi can be, can sometimes slow down your code significantly. And if you've got a really long running test suite, making it even 20% faster, but sometimes coverage can make it like twice as slow. So if you've got a five minute test suite, that makes it 10 minutes, and that's a little painful. So this might be worth checking out. It's quite a bit faster. I tried it against flask as an example, and the the flax numbers. So flask has just got a pretty tight test suite anyway. But so just straight pi test on my machine, it was like 2.7 seconds with coverage was a about four 4.3 seconds, and then was slipcover it was just a little slower than just pi test. So Bitez 2.7 With slipcovers 2.88. So just a little tiny bit more, and you get coverage information. That's pretty cool. It is in the early stages, though. There's some there's some kinks to work out still. So I would try it out and watch this space. I think they're doing some really cool things. 
Definitely worth watching. But for instance, I ran into issues on projects that use pytest plugins. I don't know why, but the plugins don't get loaded. Like for instance, I tried to run this Flask example, but with xdist, so that I could run all the tests in parallel to see if it sped up parallel runs. It didn't recognize the parallelism. So I'm not sure what's going on there. But I am in communication with one of the maintainers of this to let them know what I found out. I'm not just griping and not trying to make it better. I'd love to have this be a really cool tool.
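
[Editor's sketch] The timings quoted above can be turned into relative overhead with a little arithmetic. This is only a worked calculation on the three numbers mentioned in the episode (one machine, one run of the Flask suite), not a benchmark; the function and variable names are made up for illustration.

```python
# Wall-clock times quoted in the episode for the Flask test suite (seconds).
baseline_seconds = 2.7  # plain pytest
timings = {"coverage.py": 4.3, "slipcover": 2.88}


def overhead_percent(with_tool: float, baseline: float) -> float:
    """Extra run time relative to the baseline, as a percentage."""
    return (with_tool - baseline) / baseline * 100


overheads = {
    name: round(overhead_percent(seconds, baseline_seconds), 1)
    for name, seconds in timings.items()
}

for name, pct in overheads.items():
    print(f"{name}: {pct}% slower than plain pytest")
```

On this single run that works out to roughly 59% overhead for coverage.py and about 7% for slipcover.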

35:09 So it looks nice. Yeah, go ahead. Yeah. Yeah. And so

35:13 did they near their work? Mostly due to how do you manage to to provide that? Documentation? Really? Yeah.

35:25 Yeah, it's such low overhead, I'm tempted to think of a more diabolical use of it. Like I'm handed some crummy old app that doesn't really have tests, and I've got to figure out what part of this is dead. Because, I don't know if you've ever picked up some old app that's evolved and evolved, and there's just stuff people aren't going to take out because they're afraid to. Just run this in production for a while. Oh, yeah. And just go, okay, these things don't look like they're doing anything. There might be some case I need to track down, but this gray area over here that's not touched, let me look for things to delete over here. That'd be kind of fun.

35:58 That's my favorite use of coverage is looking for dead code.
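
[Editor's sketch] The dead-code idea can be illustrated with nothing but the standard library: record which functions are actually called while the app runs, then look at what was never touched. This uses `sys.settrace`, which is far slower than a real coverage tool and is not a description of how slipcover collects its data; the "legacy app" functions below are hypothetical.

```python
import sys


def record_called_functions(entry_point):
    """Run entry_point() and return the names of Python functions called."""
    called = set()

    def tracer(frame, event, arg):
        if event == "call":
            called.add(frame.f_code.co_name)
        return None  # function-level only, no per-line tracing

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        entry_point()
    finally:
        sys.settrace(previous)  # restore whatever tracer was active before
    return called


# Hypothetical legacy app: one of these functions is never reached.
def parse_order(raw):
    return raw.strip().split(",")


def legacy_discount(order):
    return order


def handle_request():
    parse_order(" a,b ")


called = record_called_functions(handle_request)
known = {"parse_order", "legacy_discount", "handle_request"}
deletion_candidates = known - called
print(deletion_candidates)
```

Anything left in `deletion_candidates` is only a candidate: as noted in the conversation, there may still be a rare case that needs tracking down before deleting.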

36:01 Yeah, exactly. For remove off this, Brian, of our ask, does it have a pytest plugin? I know you said it doesn't work to run plugins. But this is the reverse question.

36:11 I don't think so. So you're running slipcover and pytest at the same time; I don't think you really need a pytest plugin for it. It does work with pytest, so you can run pytest operations with it. But

36:28 nice, just not the bells and whistles yet. Right.

36:31 But I'm sure they'll get there. Yep. I would love to circle back to the data. Breach sounds like a broken record. But that's my theory topic.

36:41 Oh, it's great to have you on to talk about it. Because Brian and I don't live in the the data science world. Right. So it's great. Really cool. Yeah. Well, you're

36:49 welcome in our world. There's a lot of fun stuff happening here. And if you think about it from the actual very beginning, right, even before trying to wrangle the data and trying to infer interesting information from the data, you have to get it somehow. And sometimes, in particular if you're working on some passion side projects on your own, if you're doing like a machine learning project, a modeling approach, you usually need some very specific data to work on. And how do you get that data? Well, you have to actually go and find some examples of the data on your own. So something I wanted to talk about today was actually web crawling, not scraping, and a couple of tools for that. So one that is quite popular, and it's actually like an industrial-grade kind of tool, is Scrapy; I've heard both pronunciations. Yeah, and it's a pretty great tool. So one of the great things about it right from the get-go is that it actually has a built-in shell. So you can just go ahead and try out things in the CLI, get a response from a URL, for instance, and then try to poke around it and test out the behavior, which is real nice. And then see what kind of things you might want from there. And then you can go ahead and use it in your module to get some data. They provide all sorts of really nice functionality to begin with. For instance, there's a choice between using CSS selectors for the contents of pages, or XPath, which is obviously a little bit more flexible.
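
[Editor's sketch] To give a feel for XPath-style selection, here is a small standard-library example. It does not use Scrapy itself, so it runs anywhere; `xml.etree.ElementTree` supports only a limited XPath subset and needs well-formed markup. In Scrapy the comparable calls would be `response.css(...)` and `response.xpath(...)`. The page content and class names are invented.

```python
import xml.etree.ElementTree as ET

# A tiny, well-formed stand-in for a downloaded page (made up for this sketch).
PAGE = """
<html><body>
  <div class="quote"><span class="text">First quote</span><a href="/page/1">more</a></div>
  <div class="quote"><span class="text">Second quote</span><a href="/page/2">more</a></div>
  <div class="ad"><a href="/buy">buy</a></div>
</body></html>
""".strip()

root = ET.fromstring(PAGE)

# Select by attribute instead of by position, so unrelated blocks are skipped.
quotes = [
    span.text
    for span in root.findall(".//div[@class='quote']/span[@class='text']")
]
links = [a.get("href") for a in root.findall(".//div[@class='quote']/a")]

print(quotes)
print(links)
```

Selecting on attributes such as `class` tends to survive small layout changes better than position-based paths, though any selector can still break when the page changes.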

38:56 Is more fragile, though because if they make any change to the page, that also

39:00 Yeah. Well, it's part of the game. Yeah. And then some other really nice things about it: they actually do a lot of the heavy lifting for you in terms of project templating. There's a built-in startproject command, and you can run that and right away you have the whole structure and basic boilerplate code. You're just filling in certain pieces, like item processing, which is in the pipelines module, tweaking some settings, etc. And there you go, a huge amount of work is already preset for you. And then some other nice things about it: they also provide you with numerous choices for exporting the data and for storing the data as well, in a few places and in the format that you would love to use for it, all the typical standard formats, and some more, some less frequent options.
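
[Editor's sketch] A Scrapy item pipeline is a plain Python class with a `process_item(self, item, spider)` method, so its shape can be shown without installing Scrapy. The class name, fields, and sample items below are hypothetical, and the JSON Lines export is written by hand here only to keep the example self-contained; Scrapy ships its own feed exports for that.

```python
import json


class CleanQuotePipeline:
    """Shaped like a Scrapy item pipeline: a plain class with process_item()."""

    def process_item(self, item, spider):
        # Normalize whitespace and deduplicate tags before the item is stored.
        item["text"] = item["text"].strip()
        item["tags"] = sorted(set(item.get("tags", [])))
        return item


pipeline = CleanQuotePipeline()

# Stand-ins for items a spider might yield (invented data).
raw_items = [
    {"text": "  Hello  ", "tags": ["b", "a", "b"]},
    {"text": "World\n"},
]

cleaned = [pipeline.process_item(dict(item), spider=None) for item in raw_items]

# One JSON document per line, a common export layout for crawled items.
json_lines = "\n".join(json.dumps(item) for item in cleaned)
print(json_lines)
```

In a real project the pipeline would live in the generated pipelines module and be enabled through the project settings rather than called directly.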

40:14 Yeah, another thing that's pretty interesting about this whole project is that there's a web-scraping-as-a-service company behind it, right? It used to be called Scrapinghub. Now it's Zyte, Z-Y-T-E. Basically, you can go in there and just, you know, sign up and hand it one of these spiders, and it'll just run it on different servers, try to avoid getting blocked, all that crazy stuff.

40:39 Exactly, yeah. So therefore, it's so elaborate. They really put a lot of love and careful work into all those sorts of functionality, covering all those corners of what you might want from a web crawling tool. Some other examples that I found particularly useful, for instance, are the link extractor classes, really getting into the more granular parts of the tool, where you can extract further links from the page, but only those that adhere to your particular pattern, for instance. So once again, it can alleviate so much of the work on your part, and that's really great. And they do provide ways to interact with the pages as well. There's a form request that you can use, which provides some functionality where you can interact with the page, but I haven't used it as much myself, so I can barely show how fascinating it is. But it is probably well done as well. And another library that I wanted to touch on briefly today was Robox. That's actually something new for myself, something I'm in the process of exploring, so I haven't had a chance to work with it yet. But it's been really, really interesting, and I'd love to hear from somebody else who has tried it out with something. In the first place, it's built on top of httpx and Beautiful Soup, and they're super popular in the data processing line of work, and particularly for web scraping. But they add some really useful functionality, and it looks like it allows you more of this interaction with the pages in a very neat and clean way. You can probably find examples right here. Yeah, in the documentation. It looks so, you know, nice and clean and straightforward. Looks lovely. So yeah, I'm really excited about this package. 
I'm hoping to have an opportunity to test it out.
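
[Editor's sketch] The link-extraction idea (follow only links that match a pattern) can be approximated with the standard library. This is not Scrapy's link extractor, just an illustration of the same technique; the class name, base URL, and pattern are hypothetical, and skipping duplicate URLs is an extra choice made for this sketch.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect absolute links whose URL matches an allow pattern, once each."""

    def __init__(self, base_url, allow):
        super().__init__()
        self.base_url = base_url
        self.allow = re.compile(allow)
        self.links = []
        self._seen = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        url = urljoin(self.base_url, href)  # resolve relative links
        if self.allow.search(url) and url not in self._seen:
            self._seen.add(url)
            self.links.append(url)


# Invented page fragment with one off-pattern link and one duplicate.
HTML = (
    '<a href="/blog/1">one</a> <a href="/about">about</a> '
    '<a href="/blog/1">dup</a> <a href="/blog/2">two</a>'
)

collector = LinkCollector("https://example.com/", r"/blog/\d+$")
collector.feed(HTML)
print(collector.links)
```

A crawler would then queue each collected URL for a further request, which is the step a framework automates.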

43:13 Soon. Yeah, Robox looks very interesting. It looks very Selenium-like, where you could actually control the page, like fill in this, fill in the comments with this, fill in the first name with that, and then submit. The other thing that's cool about it is it has async support for doing exactly. Yeah, that's fantastic. Awesome. Yeah. Thanks.
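
[Editor's sketch] Under the hood, "fill in the first name, fill in the comments, submit" comes down to encoding form fields and sending them in a POST request. The standard-library example below only builds that request and never sends it; it is not Robox's API. The URL and field names are placeholders.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_form_submission(action_url, fields):
    """Build (but do not send) the POST request a form submit would produce."""
    body = urlencode(fields).encode("utf-8")
    return Request(
        action_url,
        data=body,  # supplying data makes urllib treat this as a POST
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


request = build_form_submission(
    "https://example.com/contact",
    {"first_name": "Ada", "comments": "Hello there"},
)

print(request.get_method())
print(request.data)
```

A browser-style library adds the parts left out here: fetching the page, locating the form, carrying cookies between requests, and parsing the response.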

43:34 Nice. Well, where are we at now? We have extra, extra, extra, extra, extra, hear all

43:39 about it. I only got one. How many you got? I got zero. All right. Anything else you want to give a quick shout out to while we're here? No? Okay, cool. Well, I wanted to tell you all about my terminal adventures, I suppose we'll call them. So I've been using Oh My Zsh, which is amazing. I love Oh My Zsh. But I also started playing with Oh My Posh and pls and some of these other things. And I thought, oh, well, how am I going to decide between, say, Oh My Posh and Oh My Zsh? Well, it turns out, Brian, you don't have to decide, you get both. So here's a little animated video I'll throw up for people who are watching. I'll put it in the links as well. So you can see this cool prompt, which is all driven by Oh My Posh, but you can see, like, autocomplete into local Git branches through Oh My Zsh for either branch or checkout. And then on top of that we can do, like, pls, which is amazing too. Oh, and McFly. We talked about McFly before, which gives you autocomplete into your history and sort of Emacs-style editor type of AI complete. And pls is an ls replacement that is developer friendly, with little icons for the file types, and it uses .gitignore to hide stuff that you don't want to see. And it's Python friendly, like it understands venvs and de-emphasizes them, all that kind of stuff. So anyway, people have been trying to decide between these things. It turns out they all go well together. You don't have to decide.

45:08 That's pretty cool. Yeah, yeah. Yeah. Really come with the assumption that. Yeah,

45:14 yeah, all the stuff that works. You don't have to give up any of it. The only thing that isn't there is the prompt. And the prompt is not all that great, honestly. I mean, I know you can customize it, but I think it's better with Oh My Posh, which is pretty amazing. So people who are listening can check out the little video; I'll somehow find a way to link to that in the show notes, so y'all can check it out. Okay, that's my extra. Yep. Yeah. How about a joke, I guess? How about a joke. So we're all starting to go back out to dinner, restaurants, COVID's over here? Not necessarily. But here's one from a slightly different perspective. This is "Hello, I'm your server today." Brian, can you just describe for people listening what's in this picture?

45:54 There's two robots at a restaurant sitting down, and there's, like, a server rack next to them.

46:02 Okay, and the subtitle is "When you go out for a byte," B-Y-T-E. The server is by the table where the robots are dining, and he says, "My name is DHX 005972, and I will be your server this evening." Alright, that's all I got for our joke today.

46:22 Nice. Well, thanks, Anna, for joining us today. Thank you. Thanks for having me.

46:28 Yeah, it was great. Thank you, Brian. As always, everyone out there listening. Thanks so much. You're been around
