Brought to you by Michael and Brian - take a Talk Python course or get Brian's pytest book


Transcript #219: HTMX: Dynamic and live HTML without JavaScript

Recorded on Wednesday, Feb 3, 2021.

00:00 Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to

00:04 your earbuds. This is episode 219, recorded February 3rd, 2021. I'm Brian Okken.

00:11 I'm Michael Kennedy.

00:12 And I'm Jennifer Stark. Hiya.

00:14 Yay!

00:15 Yay! Welcome, Jennifer. It's so great to have you here.

00:17 Thank you. It's really great to be here. Thank you.

00:20 Yeah, yeah. It's been great. It's great to have you here. You know, we had you or I had you as a

00:25 guest over on Talk Python. And that was really fun talking about data science stuff over there.

00:31 And now we're happy to have you here on Python Bytes as well. Yeah. So you really quickly want

00:36 to just tell people about yourself before we jump into the topics.

00:39 Sure. Yep. So I'm Jennifer. I work at LADbible as a lead data engineer on a really small team of

00:47 three, but we're a bigger data team for research and insights as well. We've been spending most of

00:53 our time working on engineering stuff, but we've been moving gradually to include more

00:58 data science tasks as well. So looking forward to doing some more of that.

01:03 Yeah, it sounds really fun. All right, Brian, you want to kick us off? I mean, I heard that you're

01:07 supposed to use virtual environments and not mess up what you're doing, but if you don't want to,

01:11 I guess you just don't do that.

01:12 Well, I use virtual environments.

01:15 I do.

01:16 All the time. But there was an article, so I wanted to cover this and there was some

01:20 discussion online. An article from Frost Ming titled, You Don't Really Need a Virtual Environment.

01:27 So what's the story there?

01:29 Yeah.

01:30 Yeah. So there's a little hint in the URL: the slug is "introducing PDM."

01:39 So I don't know if he's really saying that you don't need it, but PDM stands for, what does it stand for?

01:45 Python Development Master. Well, that's cool. I think I want to put that on my business card.

01:52 I'm a Python Development Master. Anyway, so let's just go back up a little bit. This is kind of a neat tool.

02:01 It's sort of Poetry-like, but it says it doesn't use virtual environments. It uses the

02:07 __pypackages__ directory. So what is that? Well, we do have this problem with virtual

02:13 environments. The main problem, I think, is that they're hard to teach people. So if you're teaching

02:18 somebody new and you want them to install something, you kind of have to say, okay,

02:23 well, type python -m venv, or, you know, python -m venv venv, and then you have to activate it.

02:33 And then the Mac people do this, the Windows people do something else. And then after you've activated

02:40 it, if you've got requirements, you need to install them. And,

02:44 you know, that's not a fun way to start teaching people how to use Python. So

02:49 I think we do need to solve that, but I'm not sure this is it, but there's-

02:53 Sure. Well, like Node.js has a similar problem, but they don't necessarily have as much of a

02:59 challenge because they have this directory structure, which I think is what __pypackages__

03:02 is trying to do, right? Like if you have a node_modules folder some directory up and you do

03:08 something with npm, it just goes up till it finds one and it's like, well, there's the top of

03:12 the project. We're good.
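
The upward directory search Michael describes for npm can be sketched in a few lines of Python. This is only an illustration of that lookup idea, not PEP 582's actual rules and not anything PDM is known to do; find_project_root and its marker parameter are hypothetical names chosen for this example.

```python
from pathlib import Path


def find_project_root(start, marker="node_modules"):
    """Return the nearest directory at or above `start` that contains `marker`.

    Hypothetical helper: mimics the "walk up until you find it" lookup and
    returns None when no ancestor contains the marker directory.
    """
    start = Path(start).resolve()
    # Check the starting directory first, then each parent up to the root.
    for directory in (start, *start.parents):
        if (directory / marker).is_dir():
            return directory
    return None
```

Calling find_project_root(Path.cwd(), marker="__pypackages__") would run the same search for the directory name discussed in this segment.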

03:13 Yeah, that's kind of it. So there's PEP 582, which we'll link to.

03:19 It's proposing to have this __pypackages__ directory

03:26 and that sort of behavior. So if you're in a directory with one of those around

03:31 and you do a pip install, I think this is how it's supposed to work: I think it's just

03:36 going to install stuff there instead of in your global one. So I actually think this would be cool.

03:41 Even, even if it's only used for teaching, it would be a cool thing to have because also you could,

03:48 you could possibly zip this whole thing up and give somebody a directory and they'd already have

03:53 all the packages and everything. That'd be kind of interesting. I wonder if that's kind of similar

03:57 to how Conda works? Can I use conda envs instead?

04:03 Yeah. So it feels easier. Yeah. Yeah. You probably live way more in the Conda world than

04:08 the pip world. Right. Yeah. I think Conda is sort of an intermediate, right? So with Conda,

04:15 you do have to say conda activate, right? Yeah. Manually, but you don't have to be in the right

04:21 place. Like with pip, you could be anywhere. Yeah, exactly. You just say, I have an environment

04:25 called this. Let's roll, like activate that and then go. Right. Whereas this is, it's like,

04:30 I'm in the right location, but I don't want to have to talk about environments. And it just happens

04:35 to be that that environment has a directory structure that Python will know about. Yeah. Yeah. Yeah. So

04:41 there's another part of virtual environments that's a little icky, I guess,

04:47 you know, and it's that if you create a virtual environment, you kind of tie it to a

04:53 particular Python version. Yes. And if you update your Python version, then you're not, your virtual

04:59 environments aren't pointing to the new one. And I don't really know how to,

05:03 well, actually, I don't deal with that. I just delete the virtual environment and recreate it.

05:07 I've gotten really good at doing that because, you know, every time I brew upgrade my Python

05:12 for a major version, it just stops working. I'm like, Oh, come on, here we go. You know,

05:16 it's just time to erase everything and start over. Yeah. So PEP 582 might fix that also. Well,

05:22 I don't know if it fixes that, because it's still in the directory structure. It does have

05:27 Python versions in the directory naming. So I think for minor upgrades

05:33 it would work, but for major ones like going from 3.9 to 3.10, I don't think it would help

05:38 you there. I don't know. Anyway, I don't know enough about 582 to comment on this,

05:44 but I do think it's cool that PDM is around so that you can play with __pypackages__ to see what

05:51 it's like. However, the workflow within PDM is way more complicated than virtual environments.

05:56 So, in my opinion, I don't think that it's going to fix the newbie problem.

06:02 Yeah, that's what I feel about all of these things. It's like,

06:07 it solves two problems and it adds three. You're just like, Oh, come on.

06:10 Do I really want to trade those? A couple of comments from the live stream.

06:16 Hi Lang says conda rocks mostly. So, right there with you, Jennifer. Yeah. And then, yeah,

06:22 yeah. And then Danelli says there was a way to set up a conda thing so it automatically

06:28 switches to the conda env, the environment; see the environment.yaml file. I don't know anything

06:34 about this. Have you seen that, Jennifer? I have not used that. No. It sounds like we should all check

06:40 that out. Helpful. Yeah. Yeah. Thank you, Daniel. All right. Well, I guess we should jump over to the

06:45 next one. Something else that's really helpful is Cookiecutter, right? So often we want to go

06:52 and say, well, I want to create a project and I don't want to just start from file, new project.

06:55 I want to have a certain structure. I want to maybe have some of the files there. So for example,

06:59 if I go and create a new Pyramid web app, I can use a cookiecutter to generate that. And it'll say

07:05 things like what template language do you want to use? Do you want to use SQLAlchemy? And you answer

07:09 a couple of questions and it generates a project already integrating those things with the right

07:13 directory structure and the right extra dependencies and whatnot. And that's cool, right? So I think

07:17 Cookiecutter has really taken over as the primary way of creating projects that are structured. It's not

07:23 just Python. You could even create like Atari 2600 assembly language projects and C++, other weird stuff

07:28 like that. Anything that has to do with projects: here's a bunch of files, replace some,

07:34 conditionally include others, and so on. That's what Cookiecutter does. And so that's not what I want

07:38 to talk about. What I want to talk about is this thing called Copier. Have you guys heard of Copier?

07:42 I have not. I have used Cookiecutter, but I've not heard of this one.

07:45 Yeah. Cookiecutter is cool, and it's way more popular than Copier. Copier's relatively unknown,

07:51 but I think it's worth checking out. I don't know that I'll replace what I'm doing with

07:55 Cookiecutter with Copier. They're not interchangeable. They should be; that would be a great feature,

08:00 but I don't think they can share each other's templates. That said, the thing that is interesting

08:04 about Copier primarily is that it allows you to upgrade working projects. So if I go and make

08:11 a decision to create, say some web application or whatever else application, it even works for data

08:17 science, like structuring notebooks and whatnot. If I make a decision and then I change my mind after

08:22 I've already worked on it for a while, too bad. You don't get any choice. Like it's,

08:26 you throw it away or you create another one and you kind of diff the files. You're like,

08:30 ah, well, what's the difference over here? Oh, I should include this thing. But with Copier,

08:33 you can rerun it on the project and make changes and apply those changes and different choices

08:38 to an existing project you're working on. That's why I think it's interesting.

08:42 That's cool. Does it have a prompt-like thing also? I mean, Cookiecutter asks you things. I believe it does. Yeah. It will ask you questions. If you look at it,

08:51 it has, yeah, it absolutely has prompts. I can't really see a great example here.

08:56 I believe Cookiecutter's like native Python that you program it in. The scripts

09:03 are Python and then they drive arbitrary text files and whatnot with replacement. And it's kind of like

09:09 Jinja. This actually uses YAML. So if you look at an example somewhere, I'm not sure exactly

09:17 where a good example is, but basically you set up YAML files and the YAML files have different types of

09:23 prompts and questions. You can say like here, I want to ask for a password and then confirm it,

09:26 but don't show the output. So there's a lot of configurability and interesting things like that.

09:30 And then if you rerun it again, it'll say, here's the project structure that you have. Here's the

09:35 project structure that we're creating. And if it runs into a file, it'll say,

09:39 this one already exists. Do you want to overwrite it? Use the one we're about to generate? Things like

09:44 that. So it's pretty neat. I think that's, it looks pretty cool. I definitely want to check it out.

09:48 Yeah. Yeah. It seems to solve a slightly different problem than Cookiecutter, but

09:53 I think Cookiecutter is the right conceptualization to have when you think about it.

09:56 Yeah. I created my own cookiecutter for some data science-y things that I was

10:02 working on. And there's a data science cookiecutter that exists already, templates exist,

10:06 but it wasn't completely fitting my needs. So I made my own and then I was going to make one

10:11 for, for projects in our team. Cause we do some, you know, like one-off data analysis,

10:16 advanced projects. And then I discovered that on GitHub now you can make a repo as a

10:21 template: you can set it as a template in GitHub and then you just clone it and name it something

10:26 else. So that solved part of it. It doesn't solve everything. You know, if you want something

10:31 different, then, you know, this might be really good, but yeah, yeah, yeah, that's right. So I

10:36 remember if you go to your GitHub repository under settings, there's a checkbox that's off by default

10:41 that says this repository is a template. That's what you're talking about, right?

10:45 Yeah.

10:45 I see.

10:46 So you can set up your file structure. It's got nothing

10:51 about, I guess, some of the things that you're setting up in this one; those are not what you set up

10:56 in the GitHub template. It's just the file structure really, and any files, you know,

11:00 that you want to pre-populate. But yeah, that's what I was going to

11:06 solve with Cookiecutter. Cookiecutter would have been overkill for this.

11:08 I see. Yeah. I had never really thought of those two things as being the same, but you're right.

11:12 They're, they're basically the same. Cause normally when you fork a repo, it's like, well, now you can

11:16 contribute back, but the templates are just, now you start from here and it's not really related back.

11:21 Right. Cool.

11:22 Yeah.

11:22 Nice. Well, that, yeah, that, that brings us to your first topic, right? Tell us about it.

11:28 Yeah. So I was thinking of data science. Our team had a data science project

11:36 that we started a couple of weeks ago and it had a deadline. So we weren't going to make anything

11:41 particularly pretty. We just wanted to get something analyzed and done. So we were using lots of

11:47 tooling that we hadn't used before because we were using a massive data set. I think it was a couple of

11:53 gigs worth of text. So we had to use Google AI Platform Notebooks, which is just Jupyter Notebook

12:01 on Google Cloud. But you can have different sizes of your machine. You can have as many

12:06 cores as you want, different types of machines if you want, and it would just run the notebook

12:10 for you. So we thought that would sort out the problem. We'd just have all these cores,

12:13 and we'd run our notebook on that and it'd be magical, but it wasn't.

12:19 We were trying to apply a pandas apply to this huge data frame. It just was not working

12:25 at all. We even had the progress bar on the bottom, like under the cell, and it would take,

12:31 I think it was like 10 minutes to do, and it was still on 0%. And I thought, Oh wow.

12:36 You don't have time for this. Yeah. Don't have time for this. We're already on a deadline and it's

12:41 like, this isn't working. And then I went over to the terminal and just checked top to see

12:48 what processes were going on. And there was like one Python thing. I thought, well,

12:53 we could speed that up. Let's see what we can do there. Even though you have a ton

12:57 of cores and a high-end machine, it's still just single threaded basically. Right.

13:02 Yes, it is. So I looked at a few alternatives and didn't want to get too much into it. I think some

13:07 people were suggesting there's some Dask-related modules we could use. Like I think Swifter

13:13 was one, but it didn't work instantly for me. So I looked for something else and found...

13:20 You have 30 seconds, library. Work for me. Yeah.

13:24 If I can't figure it out, bin it. Start again. Find something else. I found pandarallel,

13:31 which just parallelizes any pandas apply function. So you can tell it

13:40 how many cores you want to use. You might not want to use all of them. And it's not like a linear

13:45 or exponential improvement. Doubling your cores does not necessarily halve your time.

13:49 Yeah. There's some overhead. But yeah, you can tell it how many cores you want to use.

13:53 You can also opt to, I think, if you scroll down, it says you can also add like a progress bar to

13:59 it in the options. And then you've got some benchmarking there as well. And it's just really

14:04 easy to use. So that solved our problem. Again, the whole project was just quick and dirty,

14:11 but to get it done quickly, this is great. And then going over to terminal and doing top again,

14:16 it's like all Python, just Python, Python, Python, Python. I was like, yeah. Yeah. Yeah. Beautiful.

14:20 And by default, if you don't, you can specify the number of workers, but if you don't,

14:25 it's just the number of CPU cores. Yeah. All of them. So just a quick question. It looks like this is,

14:30 it says that it's mostly around the apply function. What does apply do as a reminder?

14:35 If you specify a function somewhere and then you hit

14:42 apply... I think there's an example that's actually a bit further down of the kinds of

14:46 applies that you do. So where you'd normally put apply(func), you can apply

14:51 a function to that whole, sorry, to that whole data frame or to a specific

14:57 column within that data frame. So any function you apply will be applied

15:02 row-wise in that column. Oh, okay. So the function only has to work on a single

15:07 row, essentially. So anywhere you'd put apply, if you're using

15:12 pandarallel, you just replace apply with parallel_apply, and then it will. Nice.
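
The idea Jennifer describes, a function that only needs to handle one row, with the rows then split across workers, can be sketched with the standard library alone. This is not pandarallel's implementation: parallel_apply, chunked and word_count below are hypothetical names, and threads are used purely to keep the sketch self-contained.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import partial


def word_count(text):
    """Per-row function: it only ever sees one value at a time."""
    return len(text.split())


def chunked(rows, n_chunks):
    """Split rows into at most n_chunks contiguous slices of similar size."""
    size = max(1, -(-len(rows) // n_chunks))  # ceiling division, at least 1
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def _apply_chunk(func, chunk):
    """Run func over every row of one chunk, preserving order."""
    return [func(row) for row in chunk]


def parallel_apply(func, rows, workers=2, executor_cls=ThreadPoolExecutor):
    """Hypothetical stand-in for a parallel apply: one chunk per worker."""
    with executor_cls(max_workers=workers) as pool:
        parts = list(pool.map(partial(_apply_chunk, func), chunked(rows, workers)))
    # Stitch the per-chunk results back together in the original order.
    return [value for part in parts for value in part]


rows = ["the quick brown fox", "jumps over", "the lazy dog"]
serial = list(map(word_count, rows))         # what a plain apply does, row by row
parallel = parallel_apply(word_count, rows)  # same answers, computed chunk by chunk
```

For CPU-bound work, passing concurrent.futures.ProcessPoolExecutor as executor_cls (from a script with the usual main guard) is what would actually spread the chunks over several cores; the per-chunk overhead is one reason doubling the workers does not halve the time.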

15:17 Yeah. That's cool. That's very cool. Yeah. That's super cool. Daniel asks,

15:22 whoops, not that one, this one. He asks, how does this compare to Dask?

15:27 I do not know.

15:27 You know, I have not used Dask a lot either, but I think Dask can be set up to run in parallel on a

15:35 given machine. It can also be set up to run, you know, in a distributed cluster basically.

15:42 Yeah. My first impression is probably like, this is about:

15:49 I've got to apply this function to every element. I want that to be fast and simple. Let's just do

15:54 that. That's my first thought.

15:57 I think the, the other option I used or looked at for 30 seconds was Swifter. And I think that is

16:03 a Dask-based module, I think, but I might be misremembering.

16:07 And then, somebody else said, that apply is like a map in base Python. So.

16:13 Yeah, absolutely. Yeah. Very cool. Brian, what else is cool before we get on the next thing?

16:17 What if I wanted to learn pytest? Say if I was on Jennifer's team, maybe I could get a book on

16:23 pytest, right? So this episode is brought to you by us. Yeah. So I highly recommend a book called

16:29 Python Testing with pytest. There's a small glitch with it though. It was written in 2017. So

16:35 if you go to pytestbook.com, there's a bunch of errata that will help you. There's

16:41 just some minor changes. I forgot to pin a version of one of the libraries, stuff like that. So

16:47 if you go to pytestbook.com, that page has some errata that helps you through learning pytest.

16:54 Awesome. And with the book, there is a second edition on its way, but it is

16:58 a long way out. So don't wait for it. That's a lot of work. Yeah. So, over at Talk Python

17:04 Training, we're working on a bunch of courses. Breaking news, never mentioned this before,

17:08 but we may be having a Dask course coming soon. So, just so you know. And Damon also

17:14 says, probably with more experience than definitely me or Brian, that Dask has more features.

17:19 It can do chunking on the data frame to work around the RAM size limit, for example, and whatnot,

17:24 which is pretty interesting. And also notice down over here, there's this option, use_memory_fs.

17:29 It will actually use this memory file system thing, which sounds like it's good for lots of data

17:34 as well. But I haven't been out much. I used to love to go out and get like a milkshake

17:40 or something, but you wanted me to use ice cream instead for Python.

17:44 Yes. What's going on here?

17:45 Yeah. So actually, I kind of love this. There's an article from

17:52 Khuyen Tran, Stop Using Print to Debug in Python, Use Icecream Instead. And I think we've

17:58 covered a couple of other print alternatives. But I went in and tried

18:03 this and it's pretty cool. So, with the new f-string stuff, there's,

18:09 I forget when it came in, but you can do an equals sign, and it prints a nice

18:15 variable name or whatever you want, and then the value of it next to it. So it's

18:21 nice, but it's still a lot of typing. So if you want to print something nice,

18:25 it's a lot of typing. It's not tons, but when you're throwing debugging in,

18:31 you're probably stressed, so doing it quickly is good. So icecream is just a way to do this faster.

18:36 With icecream, instead of typing print for your debugging output, you type ic.

18:41 So first of all, fewer characters, right? Right there. No curly brackets. You don't have to do

18:48 quotes. It's just ic, and then you give it whatever object or expression

18:53 you want to print. So that's cool. That's it. So even just at that, it's worth it. It's less

19:00 typing. I mean, you do have to import it, but there's that. But there's other stuff too.

19:04 If you don't give it any arguments, it logs. It's kind of like easy, deep flow

19:12 control debugging without having a debugger, because every time it hits an ic statement

19:18 without any arguments, it'll by default print the file and function and line number where

19:25 it's at. So you can kind of trace through your stuff. So that's nice. If you want to have

19:31 that tracing be part of all of your statements, even the ones where you pass something in, there's

19:37 a way to configure that too, which is very nice. Oh, nice. And you can set a custom prefix, which is

19:43 super powerful. I'm totally going to use it for this. So in the example that they

19:49 have, of course you can just put a string in or something, but it can be a

19:54 callback. So you have a callback function getting called. So you can use that. Their example is to

19:59 inject the date time, which is kind of neat. You can inject the date time in your print statement,

20:04 but I was thinking you could use that to encode system state, like which user's logged in or

20:11 whatever other system state you kind of want to track while you're debugging something.

20:16 This would be really cool to do since you can use a callback function. A couple other features

20:21 with it: it doesn't go to standard out. It goes to standard error by default. So it's not

20:26 cluttering up your output if you're storing your output somewhere. And then one of the other

20:32 things, and I'm glad this article lists this as a feature: it's not a print statement. So

20:38 when you want to clean out all your debugging, you can just search for all of your ic lines and clean

20:43 those out, because you might have print statements that are supposed to be there and

20:47 you don't need to clean those out. So definitely. Yeah. You could even sort of cancel it out with

20:47 you don't need to clean those out. So definitely. Yeah. You could even sort of cancel it out with

20:51 an import statement: just define ic to be a function that takes whatever and does

20:56 nothing. Yeah. Yeah. That's really cool. So for people who are listening, if I were

21:01 to say ic of a function call, like plus_five, and give it the number four, the output would be ic colon,

21:07 then plus_five with the argument values, colon, the return value. And so it's a really nice

21:12 way. Instead of just printing nine and 10, it prints, I called plus five with four and got nine.

21:17 I called plus five with five and got 10. And it just, it sort of summarizes the, the debug information

21:23 a little bit better. Jennifer, I think this might make sense inside even a Jupyter notebook.

21:26 Yeah, I think it will. I was just, I was just thinking this is even less typing than if you used

21:32 f-strings. Yeah. Yeah. Yeah. And a little more powerful. And like Brian said, you kind of know

21:36 it's intentionally a debug thing. You could even rename, you know, import ic as debug and just like

21:43 make it really clear if you really want it. Right. Oh yeah. Three extra letters to type. Yeah. I know.
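
A minimal sketch of the comparison in this segment: the f-string equals-sign form next to ic, plus the "define ic to do nothing" trick Michael mentions. It assumes the icecream package may or may not be installed, so it falls back to a stand-in that returns its arguments the way ic does; plus_five is the hypothetical function from the spoken example.

```python
try:
    from icecream import ic  # third-party: pip install icecream
except ImportError:
    # Fallback so the sketch still runs without the package: a quiet
    # stand-in that prints nothing and hands back its argument(s).
    def ic(*args):
        if not args:
            return None
        return args[0] if len(args) == 1 else args


def plus_five(n):
    """Hypothetical function used in the spoken example."""
    return n + 5


value = plus_five(4)

# The f-string "=" specifier (Python 3.8+) prints the name and the value.
print(f"{value=}")

# With icecream installed, this writes the expression and its result to
# standard error; either way, ic returns the value it was given.
result = ic(plus_five(4))
```

Because ic hands back its argument, it can be wrapped around an expression in the middle of existing code, and swapping in the quiet fallback silences the debugging output without touching the call sites.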

21:48 I know. Cause I'm not sure if I saw ic in my code, you know, without being familiar with this,

21:55 that I'd know what that's about. It's your code. It's not a pun. Yeah. Are they making a pun?

22:00 Like I see, I see my errors. Oh, I see. I see. Yes. They may be. They may be. And Piling is a fan of

22:08 the name. Brilliant name. Yeah. It's pretty clever. All right. Good one, Brian. So the last two,

22:14 by the way, pandarallel had a great visualization. This one has some great visuals.

22:20 And this next one I want to talk about, you know, I think that might be part of the reason we love

22:23 these things. It's like they communicate their value so clearly. This thing is called htmx,

22:28 high power tools for HTML. So Brian, I know you fall into this realm. I do some of the time as well.

22:35 Jennifer, maybe, not sure. If you're going to write some interactive web pages, do you really want to drop

22:41 in and write a ton of JavaScript and do all that interaction by hand? Or would you rather just have

22:45 it like magic its way into interactivity? And so that's a little bit... Yeah. Who doesn't want the magic,

22:51 right? So this is kind of what this is like. Normally, if you have a web page, you have two

22:58 options. You can have a form and that form could like post back and submit some data. And then you

23:03 could write some JavaScript. So if I click on some element, something happens. But what this does is

23:07 it lets you go to almost any element in your page, a picture, paragraph, whatever. And you can say,

23:14 if you interact with this, here is a CSS transition to run. Here's a WebSocket call to do. Here is an

23:20 Ajax JavaScript call. And then it does something in response. So what I could say, for example, is when

23:26 somebody clicks on this picture, replace it with whatever HTML fragment I get back from the server

23:32 that I told it to call. So the picture could be like, click this for this bit of data report. And

23:37 then what it does is return actually the HTML for a graph. That's like a live graph with the data

23:43 prefilled. And all you have to do is touch the picture and teach the server how to return the HTML.

23:48 And now you have this interactive page that's like live with animations and stuff. Super cool.

23:52 Probably the best way to see this is through an example. So for example,

23:57 there's a button. If you just include the script, that's all you've got to do. And you say button, and

24:01 instead of having it in a form, you just say hx-post. That's the htmx thing. You give it a URL.

24:07 And when you click it, it says call /clicked. And when it comes back, replace what the button is,

24:13 the outer HTML, like button and everything in it with whatever you got back from the server. Okay.
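
The server side of that button example can be sketched as a tiny WSGI app in Python. This is an assumption-laden illustration, not code from the htmx site: the page markup, the script URL, the fragment text and the app function are all made up for the sketch; only the hx-post and hx-swap attributes and the /clicked path come from the example being described.

```python
# Page served at "/": loads htmx and declares the button. The attributes
# tell htmx to POST to /clicked and swap the button's outer HTML with
# whatever fragment the server returns.
PAGE = """<!doctype html>
<script src="https://unpkg.com/htmx.org"></script>
<button hx-post="/clicked" hx-swap="outerHTML">Click me</button>
"""

# Fragment returned by POST /clicked; it replaces the whole button.
FRAGMENT = "<p>Clicked! This paragraph replaced the button.</p>"


def app(environ, start_response):
    """Minimal WSGI app: one full page plus one HTML-fragment endpoint."""
    method = environ.get("REQUEST_METHOD", "GET")
    path = environ.get("PATH_INFO", "/")
    if method == "POST" and path == "/clicked":
        status, body = "200 OK", FRAGMENT
    elif method == "GET" and path == "/":
        status, body = "200 OK", PAGE
    else:
        status, body = "404 Not Found", "not found"
    data = body.encode("utf-8")
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(data))),
    ]
    start_response(status, headers)
    return [data]
```

To try it locally, something like wsgiref.simple_server.make_server("127.0.0.1", 8000, app).serve_forever() from the standard library would serve it; note that the server returns a plain HTML fragment rather than JSON, which is the point Michael is making.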

24:18 And it even has a haiku in here, which is pretty cool: JavaScript fatigue,

24:23 longing for a hypertext already in hand. But let's go look at some examples. These are cool.

24:28 For example, let's do lazy load. If I go over to lazy load, which is a little slow. So probably

24:34 it's lazy. It's quite lazy. It is indeed. Maybe we will. Here we go. So we come over here and

24:41 if you just scroll down, you can see like it automatically loaded in this by refreshing.

24:46 See, there's a little action, boop, boop, boop. And then off it goes. And all I've got to do is say,

24:51 this image has this indicator. Here's the image to show while you're loading. And then what

24:56 you want to do is just show whatever you get from /graph. Isn't that slick? And like,

25:01 that's literally what you have to write. Let me show you another one.

25:04 I might do this just for the lazy loading. That's great.

25:07 I know. Here, let's go do... Yeah. Look at the active search. So over here, I can type

25:11 J E. Okay. There you go. And as you type, all you've got to do to type in this little text

25:17 box and have all these search results come up lazily is just say, here's what you call, hx-post. And the

25:23 trigger is keyup changed. There's even a delay. So as you type really fast, it doesn't go insane on

25:27 the server. It like waits. Really, really cool. Yeah. It has a little indicator. And then if you notice at the

25:32 bottom, there's this thing you can show and it shows all the requests and the responses

25:37 that have gone back and forth. There's like a little debug toolbar here for the whole Ajax

25:41 interaction. Oh, sweet. Isn't that sweet? Yeah.

25:44 It is nice. How did you find this?

25:46 Gosh, I don't remember. I feel like maybe, maybe somebody, some listener told us about it, or I just,

25:52 I ran across it on Twitter or something. I feel like I found it from the community somewhere.

25:56 Cool.

25:57 But I don't remember where, but I'm, I'm excited about this. Include a JavaScript file,

26:01 put one line, and then it becomes this cool interactive thing all over the place. So yeah,

26:06 definitely digging it.

26:07 Yeah. Totally can use that.

26:09 Yeah, for sure.

26:10 Yeah. Same.

26:11 I'm sorry.

26:11 It might, it might encourage me to update my website.

26:16 Exactly. You're like, oh, it's super interactive. Look at all this. I rewrote it completely.

26:20 So much fun.

26:21 Yeah. Yeah. Cool. All right. What's next?

26:24 pyLDAvis. Yes. This is also part of that quick turnaround data science project that my team did

26:32 a couple of weeks ago. We were looking at doing some topic analysis on text. And our first approach

26:42 was to use latent Dirichlet, Dirichlet. Nobody knows how to say it. Just got to say it with confidence.

26:47 Latent Dirichlet. Oh no, I can't remember what the A stands for. Analysis. Maybe that's the A.

26:54 We applied that. But it's hard to understand the output you're looking at. You know,

27:00 you can have it print out what the topics are and what words are contained in that topic. And,

27:05 but, you know, you can't, it's really hard to sort of get into the output of your model to evaluate

27:10 if it's a good model or not. So what some wonderful people in the R community did was they made LDAvis,

27:18 which just displays the different elements of the LDA output in a really, really intuitive way. So even if

27:28 you're not too sure on the math behind LDA and, you know, what everything means, what, what Lambda is and what all the different,

27:36 like complex interactions are, it's quite intuitive. If you, if you spend a bit of time exploring the visualization.

27:42 So that was then ported into Python, and in Python it's called pyLDAvis, but the visualization is exactly the same.

27:52 So, yeah. In this little partial screenshot of the visualization, we have some bars, the blue bars and the red bars.

28:03 So the blue bars are like, you'd have all the words in all the topics, in all the documents.

28:11 So baseball: of all the words in all the documents,

28:17 how much is baseball represented in all those documents?

28:19 And the red bar represents how much in that topic, topic number 19, how much topic number 19 is made up of baseball.

28:29 Oh, I see. So you have these different topics on the left that you can like click on them and it'll generate the bars to explain more detail.

28:36 Yes.

28:36 Okay.

28:37 You can click in all the different topics.

28:39 The number of topics is determined already in the model that you've already created.

28:42 And you can change that, rerun the model and get that many topics out.

28:46 So yes, you can like, you can click on the different topics and explore the top words,

28:51 either top words based on how represented they are across all the documents or within that, that one topic.

28:56 And then there's a slider as well.

28:58 I don't know if it's an example, if you scroll down, but there'll be a slider,

29:02 which goes between zero and one.

29:05 And at one, it's the word order, the topic words order is ordered by representation across all the documents.

29:14 And if you slide the slider all the way down to zero, it's shuffled all the words to be more specific,

29:22 to show the words that are more specific to that topic, that are exclusive to that topic and not in other topics.

29:30 Whereas if you have it to one, it's prioritizing the words in the list of words that are like everywhere.

29:35 So yeah, that's just really easy and nice to play around with and explore your model.
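
The slider described above can be sketched with the relevance score from the LDAvis paper (Sievert and Shirley, 2014), where lambda weights a word's probability within the topic against its lift, meaning that probability divided by the word's overall probability. This is only a rough illustration, not pyLDAvis's own code: the function names and the probabilities below are made up for the example.

```python
import math


def relevance(p_word_given_topic, p_word, lam):
    """Relevance of a word to a topic for a slider value lam in [0, 1].

    lam = 1 ranks purely by the word's probability within the topic;
    lam = 0 ranks purely by lift, favouring words specific to the topic.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must be between 0 and 1")
    lift = p_word_given_topic / p_word
    return lam * math.log(p_word_given_topic) + (1.0 - lam) * math.log(lift)


def rank_words(topic_probs, overall_probs, lam):
    """Return the topic's words sorted from most to least relevant."""
    return sorted(
        topic_probs,
        key=lambda word: relevance(topic_probs[word], overall_probs[word], lam),
        reverse=True,
    )


# Hypothetical probabilities for one topic (not taken from any real model).
topic_probs = {"game": 0.10, "baseball": 0.05, "pitcher": 0.02}
overall_probs = {"game": 0.05, "baseball": 0.006, "pitcher": 0.001}

for lam in (1.0, 0.0):
    print(lam, rank_words(topic_probs, overall_probs, lam))
```

With these made-up numbers, a common word like "game" tops the list when the slider is at one, while a word like "pitcher" that is rare elsewhere moves to the top when the slider is at zero.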

29:40 It seems like such a powerful way to explore these models around NLP stuff.

29:45 Yeah.

29:46 It just looks, it's just nice.

29:47 It's just well designed and makes you feel happy playing with it.

29:51 Yeah.

29:51 These pictures and live interactions are great.

29:54 Yeah.

29:54 And there's really good documentation as well.

29:55 So they've got links to easy to read documents that explain way better than I did what everything means

30:02 and how to interpret stuff.

30:05 So definitely take a look at that.

30:07 And there's some, I think, links to some YouTube videos and whatnot as well.

30:10 So yeah, the docs are really nice.

30:12 Links to academic papers explaining what everything is and topic models.

30:16 And yeah, good.

30:17 Yeah.

30:17 And there's some linked videos there.

30:19 I didn't pull them out because I think they're probably like talks or something.

30:21 But yeah, those look good as well.

30:23 Yeah.

30:23 Cool.

30:24 Nice.

30:24 So it says this package was ported over from R.

30:28 And I know there's a fair number of things in the Python data science world that's like that.

30:32 Do you see that still happening a lot?

30:34 Like what's this interplay between R and Python these days?

30:37 I think I've actually not seen that in a while.

30:40 To me, I'm not very aware of it, but it seemed like that was really popular a couple of years ago.

30:47 And I hear less of it now.

30:48 Yeah.

30:49 Yeah.

30:49 Well, yeah, same.

30:50 I think R is really, because R is a stats language.

30:53 It's a very popular language.

30:54 So, and it's been around in stats longer, I think, than Python has been.

30:59 It's much more mature when it comes to stats.

31:02 And I think like very specific statistical applications are more advanced in R just because they've been around for longer.

31:10 Python is definitely catching up, though.

31:11 But, you know, with something like this, I think it's nice that rather than reinventing the wheel in Python, they've just taken something that already works and made it work in Python.

31:20 Exactly.

31:21 You're like, we like this.

31:22 Yeah.

31:22 We'll just do this.

31:22 This is great.

31:23 Yeah.

31:23 Like, why change it?

31:24 Yeah.

31:25 Daniel Chen threw out there, going back a few topics, that there's conda-auto-env, that project, which works, I think, probably like the PDM thing that you had, Brian?

31:38 What do you think?

31:39 Say that again?

31:39 I think we're talking about whether conda has this idea of automatically activating environments under PyPackages.

31:45 I think this project, conda-auto-env probably does.

31:49 And apparently there's a tie-in with RStudio as well.

31:51 I think I agree with it.

31:52 Nice.

31:53 I was just looking at R.

31:54 So R looks like it's been around since 93.

31:58 I didn't know it was that old.

31:59 Oh, wow.

32:00 Yeah.

32:01 Based on S, apparently.

32:02 Based on what?

32:04 F?

32:05 Based on S?

32:05 S.

32:06 Yeah.

32:08 Just one character, please.

32:10 One character is all we need.

32:11 Yeah.

32:11 See what's ahead.

32:12 We'll go with it.

32:13 What was up to follow on?

32:14 Yeah.

32:15 R was my first programming language for data analysis.

32:19 But I'm really out of touch with it.

32:20 Now that we've got tidyr, which is supposed to be really amazing and great for, I guess, it makes it easier for people new to programming to get up and running quicker.

32:30 But I look at R now and I think, oh, I don't know how that works or what that is.

32:35 I've just been out of it for so long.

32:37 Okay.

32:37 So you're completely in Python now?

32:39 Yes.

32:39 Yeah.

32:40 But I'm not like no to R.

32:43 I don't know.

32:44 You see it sometimes when people are like, yay to Python and no to R or the other way around.

32:47 And I think it's just silly.

32:49 Yeah.

32:49 They're both really, really good at what they do.

32:52 Yeah.

32:53 If they're doing something cool like this, like LDAvis, do that here as well.

32:57 Yeah.

32:57 Yeah.

32:58 Speaking of visualization, I want to remind people that are listening to the podcast that we do live stream it.

33:06 So you can hop on on Wednesdays and watch with us.

33:09 Or you can catch it on our YouTube channel so that you can see the things we're looking at.

33:16 We highlight if we're looking at a web page or a cool visualization, we can see it.

33:20 Yeah, absolutely.

33:21 Is that it, Brian?

33:23 I think it is.

33:24 It is.

33:25 Do you have any extra news or anything?

33:27 Nothing super exciting, but I did want to tell people about the JetBrains survey.

33:31 And if you've ever gone to the JetBrains site, did you know that they have a little terminal command prompt for agreeing to the cookie policy, which is kind of cool?

33:40 Anyway.

33:40 Yeah, I love it.

33:41 I'm like, oh, I hate these cookie things, but that's kind of cool.

33:45 I'm going to do that.

33:45 So they are launching the developer ecosystem survey for 2021.

33:50 And if you participate, you get some prizes.

33:52 It does take a little while.

33:54 It took me like 15 minutes.

33:55 It's a non-trivial amount of questions.

33:57 But I'm sure that we'll cover this in three months or whenever the report comes out, and there'll be all sorts of cool stuff we can talk about.

34:02 So, you know, Python people, get your voice heard.

34:04 Nice.

34:05 Yeah.

34:05 Got to remember to take that.

34:07 Yeah.

34:07 How about you?

34:08 I've got a couple exciting bits.

34:10 I am going to be speaking next week at a couple places.

34:14 So I'm going to be speaking at the Python user group for Aberdeen, which it's in the UK.

34:22 That's about all I know.

34:25 It's online.

34:26 It's a virtual thing.

34:27 It's online.

34:27 Yeah.

34:27 It's online now.

34:28 Yeah.

34:28 It's an online thing.

34:29 And so I'm going to teach.

34:34 We kind of did a survey of the people going, and there's a lot of people new to testing and new to pytest.

34:40 So I'm going to do sort of an intro to pytest sort of a thing, or at least a topic around pytest that's introductory.

34:47 And then I'm going to do a similar talk, but targeted a little bit closer to what they're doing to NOAA, which I'll probably get that wrong.

34:56 National Oceanic something, something.

34:59 So I'm going to talk with a group of those people next week.

35:02 That'll be fun.

35:03 And, oh, it's in Scotland.

35:06 Aberdeen is in Scotland.

35:07 Sorry.

35:08 Thank you, Alex.

35:11 So I told my kids about both of the things, and they're like, yeah, Aberdeen, that sounds neat.

35:16 But NOAA, really?

35:17 You're going to be talking to them?

35:19 So my kids are excited about that.

35:20 Yeah, that's super cool.

35:22 Jennifer, anything you want to throw out to people listening?

35:24 You run a user group, right?

35:25 Yeah.

35:26 We've got PyData in Manchester that we have going on.

35:30 And our next, it's on Meetup, and it's obviously on YouTube, because where else are we going to be?

35:38 But then our next one coming up is on agent-based models.

35:42 So that's going to be really cool.

35:43 Looking forward to that one.

35:44 And hopefully, and I'm not going to promise too much, but we did put our own podcast on hold for a little bit.

35:52 So hopefully, we will start that up again this year.

35:56 So one reason why I'm pretty interested in the tools that you guys use for your podcast, because I think this makes it really interesting and engaging.

36:05 Yeah, well, I think some of these tools, like we're using, for example, StreamYard for our live streams and stuff,

36:11 I do think there's a lot of low bar to adopt those kind of things for a lot of meetups and stuff.

36:17 So yeah, that's cool.

36:18 If people want to know what we're doing, they can shoot us a message, and we'll let them know.

36:21 I just looked it up.

36:23 Scotland is in the UK, so I wasn't completely wrong.

36:26 It is.

36:26 No, you're not wrong at all.

36:27 It's like squares and rectangles.

36:29 Come on.

36:30 You said it was a rectangle.

36:31 It's all good.

36:32 All right.

36:33 Well, with that, you think we should close it out with a joke?

36:35 Yes.

36:36 You think, Brian?

36:37 Yeah.

36:37 All right.

36:38 So this one comes to us from Edward Orochena, who sent us this cool picture here.

36:44 And this has to do with an engineer helping a designer fix a problem.

36:49 I kind of feel like I want to be the developer.

36:52 Do you mind being the designer, Brian?

36:54 Sure.

36:55 All right.

36:55 So Brian comes to me, the designer comes and says, there's a problem with this design.

37:01 So I say, oh, no problem.

37:02 No problem.

37:03 We can fix this here in the terminal.

37:04 I pull open Z shell, and I'm rolling along.

37:07 Whoa, you're a hacker.

37:09 No, no.

37:09 It's just the terminal.

37:11 But where are all the buttons and icons and drop-down menus?

37:14 Is this the Matrix?

37:15 Yes.

37:18 Have you ever had one of those experiences?

37:22 I had one of those experiences at a coffee shop.

37:25 I was doing something with the terminal, and I had three of them open.

37:28 And one of them was tailing a log, and one was running a pip install script with a bunch

37:34 of progress bars, and people were like, are you trying to hack us here on the Wi-Fi?

37:38 I'm like, no, I'm just working.

37:40 Leave me alone.

37:40 So I was on the other side of it to start with.

37:45 I was a grad student.

37:47 I shared an office with a couple other people.

37:51 And one of the women that shared the office with us was a Vim user, or vi user at the time.

37:58 And it was with tags and everything.

38:00 And I was an Emacs person at the time with menus and stuff.

38:04 And so I was watching her code once, and it's just jumping all over the place.

38:09 So she'll go to a very, and her hand's on the keyboard, nothing, no mouse.

38:13 And the windows are popping back and forth, and she's going all over the place.

38:17 I'm like, oh, my God.

38:18 She's like thinking into the computer.

38:21 So I learned vi because of that experience.

38:26 Oh, that's awesome.

38:26 Yeah.

38:27 When you see people just using Vim or any of Emacs, whatever, like that's mind-blowing.

38:33 I still need my, I like to use my mouse.

38:35 Yeah, I like a blended experience as well.

38:40 Cool.

38:40 Well, thanks, everybody.

38:41 Yeah, thanks, Brian.

38:42 Thanks, Jennifer.

38:43 Thank you.

38:43 Yeah, this was really fun.

38:45 Thanks for having me.

38:46 Thank you for listening to Python Bytes.

38:48 Follow the show on Twitter via @pythonbytes.

38:51 That's Python Bytes as in B-Y-T-E-S.

38:54 And get the full show notes at pythonbytes.fm.

38:57 If you have a news item you want featured, just visit pythonbytes.fm and send it our way.

39:02 We're always on the lookout for sharing something cool.

39:04 On behalf of myself and Brian Okken, this is Michael Kennedy.

39:08 Thank you for listening and sharing this podcast with your friends and colleagues.
