

Transcript #211: Will a black hole devour this episode?

Recorded on Wednesday, Dec 2, 2020.

00:00 Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds.

00:04 This is episode 211, recorded December 2nd, 2020.

00:08 I'm Michael Kennedy.

00:10 And I'm Brian Okken.

00:11 And we have a special guest.

00:12 Yeah.

00:13 Matthew Feickert, welcome.

00:14 Yeah, thanks so much for having me on.

00:15 Yeah, it's great to have you here.

00:17 You've been over on Talk Python before, right?

00:20 Yeah.

00:20 Talking about some cool high energy physics and all that kind of stuff.

00:24 Yeah, I looked that up last night just to try and remind myself.

00:28 That was episode 144.

00:29 I was on with my colleagues, Michela Paganini and Michael Kagan, to talk with you about machine learning applications at the LHC.

00:37 Yeah.

00:38 And you do stuff over with CERN at the Large Hadron Collider and things like that?

00:42 Yeah.

00:42 Yeah.

00:42 So I'm a postdoctoral researcher at the University of Illinois at Urbana-Champaign.

00:47 And so there I split my time between working on the ATLAS experiment and working as a software researcher at the Institute for Research and Innovation in Software for High Energy Physics.

00:57 And so like on ATLAS, ATLAS is this like huge five story tall particle detector that lives 100 meters underground at CERN's Large Hadron Collider.

01:07 That's just outside beautiful Geneva, Switzerland.

01:11 And so there I work with a few thousand of my closest colleagues and friends to try and look for evidence of new physics and make precision measurements of physics we do know about.

01:22 And then my IRIS-HEP work is kind of focused on working in an interdisciplinary and inter-experimental team to try and improve the necessary cyberinfrastructure and software for us to be able to use in upcoming runs of the Large Hadron Collider.

01:37 And in what we call like a high luminosity run, which is going to be way more collisions than normal.

01:41 Have you guys ever turned it up to full power?

01:43 Have you turned it up to full power yet?

01:45 So, yeah, the design luminosity or the design energy of the LHC is at something called 14 teraelectronvolts, 14 TeV.

01:53 And we've been running intentionally at a lower operating energy for the last couple of years, just a little bit below that.

02:00 But in the late 2020s, we're going to suck the entire Earth into it.

02:05 You know, no, no experimental evidence of black hole creation.

02:08 Yeah.

02:08 But kind of the cool thing is that if we even did make a black hole at the LHC, due to something called Hawking radiation, it would evaporate well before it could actually ever do anything interesting gravitationally.

02:18 But yeah, it's really exciting.

02:20 Really.

02:21 I'm joking, but it's such a cool place.

02:23 Such cool technology.

02:25 I mean, that's right out at the edge of physics these days.

02:27 And the technology side is neat, too.

02:29 Yeah.

02:30 No, it's super fun.

02:31 Well, welcome over to Python Bytes.

02:34 Yeah, it's great to be here.

02:35 So I yeah, it's good to have you.

02:36 Thanks for coming.

02:37 And Brian, I think let's let's start with another one of my favorite topics.

02:41 Farms.

02:42 I love farming.

02:43 You know, you see the bumper sticker.

02:45 No farms, no food.

02:46 I like food a lot.

02:48 So I love farms.

02:49 No, no.

02:50 But the FARM stack.

02:51 We've heard the LAMP stack, other stacks.

02:54 Like LAMP is not as useful as FARM.

02:56 Right.

02:57 Farm sounds more useful.

02:57 So tell us about farm.

02:59 So Aaron Bassett, he's I'm not sure.

03:02 I think he's one of the spokespeople for Mongo or something like advocate or something like

03:08 that.

03:09 Anyway, he's he's doing he he wrote this article, but they've also done.

03:15 I think that there's been some some talks given.

03:18 But this is this is a nice article.

03:19 It's called Introducing FARM Stack, which is FastAPI, React, and MongoDB.

03:25 So I really actually appreciated the article and the code with it, because there's a GitHub,

03:34 a little GitHub to-do CRUD app that they've put together.

03:37 And it the article describes basically all of the pieces of the application using it like

03:45 a little to do app.

03:46 But with FastAPI, you've got this is interactive, interactive documentation mode where you can

03:53 interact with the application just almost immediately.

03:57 You don't have to really do much to put it all together.

04:00 And and then for all your endpoints, you can actually interact with them, send send data,

04:05 do queries.

04:06 And there's a little animated gif to show how that's done.

04:09 But the article then, you know, goes through and says, you know, basically how how the endpoints

04:15 and routes get hooked up and then uses Uvicorn to set up an async event loop and

04:23 get that going shows how easy it is to connect to a database and then defining models with and

04:30 how easy it is to set up a schema goes through.

04:34 And then it kind of hooks up, talks through the code discussion of you do have to write

04:39 code for the endpoints and and really how easy those are with all of these these pieces.

04:44 The React application is it's kind of a minimal app React application.

04:49 I'm not sure why they kind of included that, but it's kind of a neat addition.

04:53 There's a React application that's running that just sort of shows some of the interaction with

04:59 the the CRUD app and it gets updated while, you know, while you're changing things through

05:05 the interactive API.

05:07 And I just I like the demonstration of working through working with an API and working through

05:14 changing things and seeing it show up having a like a React app at the other end.

05:20 It's kind of a fun way to kind of experiment with an API.

05:24 This is a really neat thing.

05:26 And one of the other major stacks that's been used around Mongo is the MEAN stack.

05:29 And the FARM stack is way nicer than the MEAN stack, not just because it uses Python and not

05:34 JavaScript.

05:34 But there's some interesting things here.

05:36 One of the examples is actually kind of blowing my mind in that it's an if statement using the

05:43 walrus operator awaiting an asynchronous call in an API method.

05:48 like the walrus operator and async, the await keyword.

05:51 I've never seen those together.
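
A minimal sketch of the pattern being described here: an if statement that applies the walrus operator to the result of an awaited call. It uses only asyncio so that it runs on its own (Python 3.8 or later). The names `find_task`, `show_task`, and the in-memory `_TASKS` dictionary are hypothetical stand-ins and not code from the article, which pairs FastAPI with Motor.

```python
import asyncio
from typing import Optional

# Hypothetical in-memory stand-in for a database collection.
_TASKS = {"1": {"id": "1", "name": "Record episode"}}


async def find_task(task_id: str) -> Optional[dict]:
    """Pretend to query a database asynchronously."""
    await asyncio.sleep(0)  # yield to the event loop, as a real driver would
    return _TASKS.get(task_id)


async def show_task(task_id: str) -> dict:
    # The await runs first, the walrus operator binds its result to `task`,
    # and the if statement then checks whether anything came back.
    if (task := await find_task(task_id)) is not None:
        return task
    return {"error": f"Task {task_id} not found"}


found = asyncio.run(show_task("1"))
missing = asyncio.run(show_task("42"))
```

The parentheses around the assignment expression are required so that the comparison applies to the awaited result rather than being captured by the walrus.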

05:52 And it's kind of like it's inspiring.

05:55 It's nice.

05:56 It's good.

05:56 Yeah.

05:57 Yeah.

05:57 Such succinct code as well.

05:59 It's super nice.

06:00 I mean, it uses FastAPI, which is fantastic.

06:02 It's using Motor, which is MongoDB's officially supported Python async library because you need an

06:09 async capable library in order to do things against MongoDB.

06:14 You know, this actually comes from the developer blog at MongoDB.

06:17 There also are some ORM like things, some ODMs, object document mapper stuff that also supports

06:24 async and await from MongoDB.

06:25 So if you're more in the ORM style, you might check that out.

06:28 But other than that, this looks pretty neat to me.

06:30 Yeah.

06:30 Yeah.

06:30 And I do know that a lot of people use the ORMs, but I like I appreciated the example without

06:35 an ORM for people because you throw an ORM example in there and then people that don't

06:41 use that particular one get lost.

06:44 Yeah.

06:44 Matthew, do you guys do anything with MongoDB?

06:46 Any of these kind of things?

06:47 FastAPI?

06:48 Yeah.

06:49 I have some friends that do.

06:50 I personally, myself, I'm not too versed in Mongo, but I've heard it on the show and

06:56 many, many times elsewhere.

06:58 So this is, I think, also just kind of paging through the article as Brian was talking about.

07:03 It is pretty impressive.

07:04 So it's really concise.

07:06 Like, here's your four lines to completely implement the API.

07:10 Yeah.

07:10 Type of things, right?

07:11 Asynchronous, fast, like all the cool stuff.

07:13 Yeah.

07:13 Yeah.

07:14 There was an example, a case study of MongoDB being used at the Large Hadron Collider, but

07:20 that was many years ago and I don't know if it still is.

07:22 So I've completely forgotten where that is.

07:24 Yeah.

07:25 I don't know.

07:25 Yeah.

07:26 Yeah.

07:27 Cool.

07:27 Cool.

07:27 So next thing I want to talk about another programming language.

07:31 Last time, Brian, I went on and on, maybe the time before, two times ago, about .NET and C-Sharp

07:37 because Anthony Shaw had done that work on Pyjion to get Python to run on .NET.

07:41 And we're like, well, why are we talking about C-Sharp on this project, right?

07:45 On this podcast.

07:47 Well, I want to talk about something even more advanced, AppleScript.

07:49 Wow.

07:52 Cutting edge.

07:52 Yes.

07:53 It's like the CMD shell script of Apple.

07:57 Have you ever programmed an AppleScript?

07:59 It's painful.

08:00 No, I have not.

08:01 It's like you say, like, tell this application that to like make a command.

08:06 Oh, it's bad news bears.

08:08 Let me tell you.

08:08 So what I've come across is this thing called PyAppleScript.

08:15 Now, this is not brand new, but it's brand new to me.

08:18 And there's a lot of talk about Macs and people may be getting new Macs.

08:23 So I thought I would say, hey, look, here's a cool way to automate your Mac or, you know,

08:27 Macs within your company or whatever.

08:29 With Python instead of this dreaded NSAppleScript.

08:32 Okay.

08:33 All right.

08:33 So basically it's a Python wrapper around NSAppleScript, allowing Python scripts or applications

08:39 to communicate with AppleScript and AppleScriptable applications.

08:44 So apps for which they basically implement AppleScript and let you do that.

08:48 So scripts get compiled either from source or they can be loaded from disk.

08:52 They have these, some of these ideas are from AppleScript as a standard run handler and user

08:57 defined handlers can be invoked with or without arguments.

09:00 They're automatically converted.

09:02 The responses to and from AppleScript are automatically converted either from AppleScript to Python

09:08 types like Python string versus AppleScript one or vice versa.

09:12 Right.

09:12 So you don't have to do the type coercion, which is cool.

09:14 And they're persistent.

09:16 So you can call your handler multiple times and it retains its state like AppleScript would.

09:20 And it also has no dependency on the legacy appscript library or the so-called flawed Scripting

09:28 Bridge framework, or the limited osascript executable.

09:32 So that's pretty cool.

09:33 If you want to automate things on your Mac, you obviously could use Bash.

09:38 But if you're talking to some kind of application that implements one of these scripts, like for

09:43 example, you want to tell this other application to grab something out of the clipboard and then

09:48 tell it to do something or something like that.

09:50 Right.

09:51 Like you couldn't reasonably do that with Bash.

09:52 Right.

09:53 Once it starts up, you kind of want to go back and forth with it.

09:56 So it sounds like AppleScript might be the thing to do.

09:58 Pretty cool.

09:59 Yeah.

09:59 Yeah.

10:00 Yeah.

10:00 I mean, not a lot to it.

10:02 Like if you've got to script your Apple macOS stuff, do it with Python.

10:07 You don't have to do it with that AppleScript stuff.

10:10 No, it's neat.

10:11 Yeah.

10:11 Yeah.

10:11 So Matthew, you probably brought something to do with physics, data science, I'm guessing.

10:16 What's your first one here?

10:17 Yeah, a bit.

10:18 So we currently live in this like really nice age of having awesome CI services and all these

10:24 really nice metrics for all your GitHub projects and everything.

10:27 So, you know, if you're, I'm thinking of like coverage.

10:30 So if you're, you know, using pytest and, you know, making sure that you're reporting your

10:35 coverage, you have all these really great services to also track your coverage and report that

10:39 in a nice shiny badge.

10:41 But let's say you're developing some tool or some library and you have some sort of performance

10:46 metric that you care about.

10:48 Let's say like how fast some, the speed of evaluation for certain expensive functions.

10:53 And you actually want to try and like track that through the entire history of your code

10:57 base.

10:57 And that's not something that's like traditionally very super easy to do.

11:01 So recently I was really happy to find something.

11:03 So like if you're making changes.

11:04 So if you're going to be adding some feature or whatever, you are refactoring it.

11:09 So it's easier to write, but you're not sure if that makes it faster or slower.

11:12 And this would sort of give you that information from week to week or something like that.

11:16 Exactly.

11:16 Yeah.

11:17 So you might like, you might go ahead and say like, okay, well, you know, I have like

11:21 some, some tests that make sure that this function evaluates and under some period of time, if

11:25 it's an expensive function for your test.

11:28 But let's say you actually want to like track across like different, different parameter,

11:33 parameterizations, how that function is actually performing when you evaluate it

11:39 in your whole code base.

11:39 So I've recently found this super cool tool written in Python called airspeed velocity.

11:46 And so from the docs, asv (airspeed velocity) is a tool for benchmarking Python packages over

11:53 their lifetime.

11:53 So it deals with runtime, memory consumption, and even custom-computed values.

11:59 And the results are then displayed in a super nice web front end that's interactive and basically

12:05 just requires like static webpage hosting.
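
As a rough illustration of what such a benchmark suite can look like: asv discovers benchmarks by naming convention, timing methods whose names start with `time_` and recording the return value of methods whose names start with `track_`, and it repeats each one for every value listed in `params`. The sketch below assumes a hypothetical function under test, `linear_interpolate`, and a hypothetical suite name; it is plain Python, so it can be run directly even without asv installed.

```python
import bisect


def linear_interpolate(xs, ys, x):
    """Hypothetical function under test: piecewise-linear interpolation."""
    i = bisect.bisect_right(xs, x)
    i = min(max(i, 1), len(xs) - 1)  # clamp to a valid segment
    x0, x1 = xs[i - 1], xs[i]
    y0, y1 = ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


class InterpolateSuite:
    # asv runs every benchmark once per parameter value.
    params = [100, 1_000, 10_000]
    param_names = ["n_samples"]

    def setup(self, n_samples):
        # Called before each benchmark; not included in the timing.
        self.xs = [i / n_samples for i in range(n_samples)]
        self.ys = [x * x for x in self.xs]

    def time_linear_interpolation(self, n_samples):
        # Methods prefixed with time_ are timed.
        linear_interpolate(self.xs, self.ys, 0.5)

    def track_sample_count(self, n_samples):
        # Methods prefixed with track_ report a custom value.
        return len(self.xs)
```

In an asv project a file like this would typically live in the benchmarks directory named in the project's asv configuration, and the results for each commit feed the web front end described above.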

12:07 So it's, it's pretty impressive.

12:10 And just if you click on the docs, you can see that's developed by a community of people,

12:15 but led by Michael Droettboom.

12:18 I'm probably getting your, your name wrong.

12:20 Very sorry.

12:21 And Pauli Virtanen.

12:23 But if you look at some...

12:25 He's the guy who was behind Pyodide at Mozilla.

12:29 Oh, really?

12:30 Oh, okay.

12:30 Yeah.

12:31 That's a super cool project.

12:32 Yeah, for sure.

12:34 Yeah.

12:34 Yeah.

12:34 yeah.

12:35 And so, I mean, if you look at the other people that are on the contributor list, you can spot

12:40 a lot of names that are common in the SciPy and Jupyter ecosystem.

12:44 So it's, you already know that this is a nice community built tool.

12:48 And then also, as kind of some example cases, they give, current projects that

12:54 are using it like NumPy and SciPy and AstroPy.

12:56 So pretty well established projects.

12:59 And just as kind of like an example, if you click on like the SciPy project and go

13:03 to the interpolate function there, you can, you can just kind of look at a very nice

13:08 visualization of the actual evaluation time on the vertical axis across a whole

13:14 bunch of parameterizations, such as like CPython version and number of samples that are being

13:19 run.

13:19 And you can see this for the entire lifetime of the code base and you can zoom in on any

13:24 section just with the mouse.

13:25 And something I think that is super, super cool is if you, if you're looking at the visualization

13:30 of the plot and you see that, oh, there's like one commit where all of a sudden things go

13:34 funky and the evaluation time just jumps up.

13:36 You can just click on that node and it immediately opens up to that commit in GitHub, which is,

13:42 I think, super awesome that you don't have to go and like search through your commit history

13:45 to figure out what, like where that corresponds to.

13:48 It's just boom, right?

13:49 I'm looking at, it shows the, the SHA from GitHub.

13:52 Yeah.

13:53 The, the, the, the unique identifier of the commit.

13:56 That's crazy.

13:57 Yeah.

13:57 So, wow.

13:58 Yeah.

13:59 So I've, I've, you know, a project that I'm working on, we've been interested in trying

14:03 to have the sort of like metric tracking for some of our, for some of our work.

14:07 So this is something that I'm actively kind of, looking at how we might be able to deploy

14:11 this for one of my projects with my coauthors.

14:14 But it's openly developed on GitHub.

14:16 It's up on PyPI as well.

14:19 So just pip install asv.

14:20 And then I think something that's kind of very cute and very kind of Pythonic is that,

14:25 if you, when you go to the reporting dashboard for the different libraries that you're actually

14:29 benchmarking, it will, up at the top, say the airspeed velocity of an unladen X.

14:35 So the airspeed velocity of an unladen like NumPy or an unladen SciPy.

14:40 So, you know, keeping very true to the, you know, Python's roots there.

14:44 There's some Monty Python, the, the show Zen in there for sure.

14:50 Exactly.

14:50 Yeah.

14:51 This is impressive.

14:52 I mean, Brian, how do you see this fitting into like testing and stuff?

14:55 I actually love this.

14:56 I, I could use this right away.

14:57 There's, there's lots of, well, a lot of times it's, it's not, yeah.

15:02 Performance of performance is always something we care about.

15:05 And, and benchmarking systems, and, you know, testing, it's always, it's something

15:11 you forget about sometimes, like running, running stuff and it still works, but like

15:17 over time things slow down and it's good to, good to know that.

15:21 Yeah.

15:21 And if this could just be automatic and just part of your CI, you just go back and see the

15:26 updates.

15:26 That'd be very cool.

15:27 Definitely.

15:28 Yeah.

15:28 I don't think that this is something that at the moment, and I'm happy to be corrected about

15:33 this.

15:33 I don't think at the moment there is some way that this is currently being, given

15:37 as like a CI service, but I think that this is something that you could like set up and

15:41 run for yourself pretty easily.

15:43 Yeah.

15:43 You could probably plug it in.

15:45 Yeah.

15:45 Yeah, exactly.

15:46 But you could probably do some kind of web hook when, when there's a check-in automatically

15:53 kick it off and then save a result.

15:54 Right.

15:54 You could just hook into the GitHub actions and then have it just call you back and start

15:59 your, you know, let's take a, take a record of this or whatever.

16:02 Yeah.

16:03 Yeah.

16:03 Very cool.

16:03 This is a great idea.

16:04 Yeah.

16:05 Something else that I'm, I haven't really investigated yet, but that I'm looking into is if this can

16:09 also be used to do like GPU benchmarking.

16:12 So like, let's say you have a library that, you know, also that is going to be, you

16:16 can transparently, use the APIs to transparently move from CPU to GPU.

16:21 Like you have something like JAX or TensorFlow or PyTorch, then this might be kind of a nice

16:26 way.

16:26 if it's, if it's based on those to be able to like benchmark your GPU performance

16:31 as well.

16:31 Yeah.

16:32 Well, and that's one of the things you might not test, right?

16:34 If it could run either way, you might just run it on your machine, whichever one of those

16:38 it is and forget to try the other style.

16:40 Right.

16:40 Exactly.

16:41 Yeah.

16:41 And I don't think there's too many CI services that are going to, you know, generously give

16:46 you some like really nice GPUs to be doing benchmarking on.

16:48 Yeah, that's for sure.

16:50 For sure.

16:50 All right.

16:51 Now for the next item, let me tell you about our sponsor.

16:54 This episode is brought to you by Techmeme, the Techmeme Ride Home Podcast.

17:00 They've been for two years recording episodes every single day.

17:05 And so they're Silicon Valley's favorite tech news podcast, and you can get them daily 15

17:11 to 20 minutes, exactly by 5 PM Eastern, all the tech news you want.

17:15 Right.

17:15 But it's not just headlines, much like Python Bytes.

17:17 Actually, it's a very similar show, but for the broader tech industry, you could have a robot

17:21 read the headlines or just flip through them, but it has the context and the analysis all

17:26 around it.

17:26 So it's like tech news as a service, if you will.

17:29 So the folks over at Techmeme, they're online all day reading to catch you up and just search

17:35 your podcast app for the Ride Home and subscribe to the Techmeme Ride Home Podcast or just visit

17:42 pythonbytes.fm/ride to subscribe.

17:45 I have a theory, a hypothesis about this.

17:48 I think that would probably actually be a ton of work to put together a show daily on a time like

17:53 that.

17:54 But it's great that they're doing it.

17:55 Do you have any other hypotheses, Brian?

17:57 Yes.

17:57 My hypothesis is that there's not enough examples out in the world of how people are using a

18:04 hypothesis in the field in real world applications.

18:07 So I'm excited that Parsec put it together.

18:12 So Parsec...

18:13 Well, let's take a real quick step back just for people who don't know.

18:15 What is Hypothesis?

18:17 Oh, okay.

18:18 Right.

18:18 Hypothesis is a testing framework.

18:20 Well, it's not really...

18:21 It attaches to other testing frameworks.

18:23 So you can use it with unittest or pytest.

18:25 You probably should use it with pytest.

18:27 But it's a way, instead of writing a declarative single test or test case, you can...

18:35 It's a property-based testing.

18:37 So you describe kind of...

18:40 It's not like...

18:41 I expect one plus two equals three.

18:43 I expect if I add two integers and they're both positive that the result is going to be

18:50 greater than both of them.

18:52 You have these properties that you describe what the answer is.
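
To make the idea concrete, here is a hand-rolled sketch of a property check for the addition example just described. It deliberately uses only the standard library `random` module as a stand-in, so it is not Hypothesis's API and it has none of Hypothesis's input shrinking or smarter input generation; the helper names `check_property`, `two_positive_ints`, and `sum_exceeds_both` are made up for illustration.

```python
import random


def check_property(prop, generate, runs=200, seed=0):
    """Run `prop` on many generated inputs.

    Returns the first input tuple for which the property fails,
    or None if every run passed.
    """
    rng = random.Random(seed)
    for _ in range(runs):
        args = generate(rng)
        if not prop(*args):
            return args
    return None


def two_positive_ints(rng):
    # Input generator: a pair of strictly positive integers.
    return rng.randint(1, 10**9), rng.randint(1, 10**9)


def sum_exceeds_both(a, b):
    # The property: the sum of two positive integers is greater than each one.
    return a + b > a and a + b > b


counterexample = check_property(sum_exceeds_both, two_positive_ints)
```

The test states what must hold for every input rather than listing individual expected answers, which is the shift from example-based to property-based testing.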

18:57 There's a...

18:58 The examples that Hypothesis and other tutorials on how to use Hypothesis have given are more

19:06 of these like A plus B sort of things.

19:08 They're simplistic things.

19:09 And I do see a lot of value in Hypothesis and I know a lot of people are using it.

19:14 But there haven't been a lot of good descriptions for really how it's being used.

19:19 Like a real world example of how it's being used.

19:23 Because I'm probably not going to...

19:26 I don't have those little tiny algorithm things.

19:28 I've got big chunks of stuff.

19:30 And Hypothesis does have to run the test many times.

19:34 So how do you do this effectively on a large project?

19:37 So I love seeing this article.

19:38 So Parsec is a...

19:41 It's a client-side encrypted file sharing service.

19:45 I'd never heard of them before.

19:46 This blog.

19:48 But it sounds cool.

19:49 It's cool.

19:49 They describe themselves as the zero trust file sharing service like Dropbox where it's

19:54 end-to-end encryption for Dropbox.

19:56 You could share the files, but it only matters if you actually have the key, right?

20:00 Right.

20:00 Actually, I have no idea.

20:02 Sure.

20:03 I suspect so.

20:05 Yeah.

20:05 It sounds like a cool service, actually.

20:07 It sounds pretty neat.

20:08 So they describe kind of what they're doing and some of the problems.

20:13 It's a large four-year-old asynchronous Python project.

20:18 And then they describe this RAID redundancy algorithm that they need.

20:24 It's fairly complex with a bunch of servers and stuff, a bunch of data stores going on.

20:31 And what they need to test is they need to check things like if the blocks can be split into chunks and if the blocks can be rebuilt from the chunks that were split up before.

20:41 And then if you can rebuild them if you've got missing chunks.

20:44 And so this all sounds fairly, you know, yeah, I can understand how you could try to test that.

20:50 But there's a lot of variables in there.

20:52 How big is the chunk size?

20:53 How many chunks?

20:55 How much stuff should be missing?

20:56 And all that sort of stuff.

20:58 And then they're thinking, yeah, Hypothesis would be good for that.

21:04 The normal tutorials talk about a stateless way to test with Hypothesis.

21:12 But they're saying that for them, the stateful method that is supported is very useful because they're an asynchronous system.

21:21 And they describe how to do that.

21:23 It's actually a fairly complex description.

21:25 And it's kind of a lot to get through.

21:28 But it's neat that the power's there.

21:30 So it does, you know, walks through how they exactly how they set up a test like this.

21:37 And this is something I think the testing community considering Hypothesis has been missing.

21:43 So this is great.

21:45 They end with some recommendations, which it's great.

21:51 So the recommendation is about which parts of your system you should throw Hypothesis at.

21:57 That's a really good question because you don't want to throw it at everything.

22:00 Right.

22:01 Because there is some expense to set it up and also to run everything.

22:04 So they describe it as if the piece you're testing is kind of an encoder decoder thing, like theirs is you're splitting things into chunks and then rebuilding things.

22:15 Hypothesis is a no-brainer for that because you can compare.

22:21 Is my input the same as the encoded then decoded output?
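
A small sketch of that round-trip comparison, under loud assumptions: the splitting and rebuilding functions below are trivial hypothetical stand-ins and not Parsec's redundancy algorithm, there is no recovery from missing chunks, and random inputs come from the standard library rather than from Hypothesis.

```python
import random


def split_into_chunks(data: bytes, chunk_size: int) -> list:
    """Hypothetical encoder: cut the data into fixed-size pieces."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def rebuild(chunks) -> bytes:
    """Hypothetical decoder: glue the pieces back together."""
    return b"".join(chunks)


def find_round_trip_failure(runs=300, seed=0):
    """Return the first (data, chunk_size) that fails the round trip, or None."""
    rng = random.Random(seed)
    for _ in range(runs):
        data = bytes(rng.randrange(256) for _ in range(rng.randint(0, 64)))
        chunk_size = rng.randint(1, 16)
        # The property: decoding the encoded input gives back the input.
        if rebuild(split_into_chunks(data, chunk_size)) != data:
            return data, chunk_size
    return None


failure = find_round_trip_failure()
```

The variables mentioned earlier, such as chunk size and input length, become generated inputs, so one property covers combinations that would otherwise need many hand-written cases.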

22:26 Yeah.

22:27 The other case is if you have a simple oracle, like it's simple to test the answer, but it's complex to come up with the answer.

22:35 I'm not sure what that is.

22:38 But in the case, you know, some of the cases are, you know, I've got a complex system.

22:43 And I just there's properties about the output that are easy to describe.

22:48 The other one is, yeah, I guess similar is if it's hard to compute, but easy to check.

22:53 Well, one example that just jumps out at me right away is anytime you have a file format, I'm going to save this thing, be able to save and load these files.

23:01 Right.

23:02 Because all you got to do is load up a whole bunch of random data, say save, load.

23:06 Is it the same?

23:06 If it's not, that's a problem.

23:09 Yeah.

23:09 Yeah.

23:10 Yeah.

23:10 And actually, I have talked with some people that that have thrown this at some of the standard library modules just on the side to test.

23:22 Because there's a lot of standard library stuff that's like kind of encoding, decoding sort of thing or two way conversions.

23:29 Yeah.

23:30 Cool.

23:30 Yeah.

23:30 This is super nice.

23:31 I'm going to have to really dig into this article in more detail.

23:34 I remember the first time I like learned about Hypothesis was when one of the core devs gave a talk at SciPy 2019.

23:42 And it just blew my mind then.

23:44 And so this is so cool to see this like very, very interesting application here.

23:49 Yeah.

23:49 Yeah.

23:50 It seems like there's a lot of uses in data science.

23:52 Data science seems tough to test like that scientific computation side because slight variations, you might not get perfect equality.

24:00 Exactly.

24:01 Close enough.

24:02 Right.

24:02 It's like, well, it's off, but it's like, you know, 10 to the negative 10th or something off.

24:08 Right.

24:08 That doesn't actually matter, but the equality fails.

24:10 Yeah.

24:11 Yeah.

24:11 You end up using NumPy's, you know, NumPy's approximation comparison schemes quite a bit in your.

24:19 Yeah.

24:20 And your pytest.
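
The equality problem being described shows up even in two lines of arithmetic. This sketch uses the standard library's `math.isclose` as a stand-in so it runs without extra packages; NumPy offers `numpy.isclose` and `numpy.allclose` for arrays, and pytest offers `pytest.approx`, in the same spirit. The tolerance chosen here is an arbitrary illustrative value.

```python
import math

total = 0.1 + 0.2

# Exact comparison fails because of binary floating-point rounding.
exactly_equal = total == 0.3

# Comparison within a relative tolerance succeeds.
close_enough = math.isclose(total, 0.3, rel_tol=1e-9)
```

Picking the tolerance is a judgment call that depends on how much numerical error the computation is expected to accumulate.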

24:20 I can imagine.

24:21 I can imagine.

24:22 Very cool.

24:23 All right.

24:24 Next one, Brian.

24:25 I told you about last time I talked about I'm still waiting on my Mac mini, right?

24:31 Yes.

24:31 I ordered the Apple, the M1 Mac mini maxed out.

24:35 And I'm a little bit jealous.

24:37 My daughter is getting a new Mac mini.

24:40 She doesn't, or Mac Air.

24:42 She doesn't know about it, but it's supposed to show up tomorrow and mine's still weeks away.

24:45 And I don't think that that's very fair.

24:47 But if you are an organization that depends on cloud computing and, you know, what organizations

24:53 don't these days, right?

24:55 They almost all do.

24:56 It was just announced at re:Invent that AWS is going to be offering Mac instances as a type

25:02 of VM.

25:03 So until now you've been able to get Windows, Linux, that's it.

25:06 So for all those people out there who are offering some kind of iOS app, even if they're not like

25:11 a Mac shop, they still have to have Macs around because you can't compile and sign your IPA,

25:16 your Mac, whatever iPhone app format is.

25:19 You can't create those without a Mac.

25:21 So there's all these Macs that are around for like continuous, you know, CI/CD or checking those

25:26 things and whatnot.

25:27 So now you can go to AWS and say, I'll take a Mac mini, please.

25:32 That's pretty cool.

25:32 That's cool.

25:33 Yeah.

25:33 So you can do your tests up there and they don't have M1 yet.

25:36 Those are the Intel ones, but the M1 chips are coming later.

25:40 So you'll be able to do it.

25:41 What's interesting about this offering from AWS as is basically any cloud service, you would

25:46 imagine it's a VM, right?

25:47 But these, when you say I want one of these, you actually get a dedicated Mac mini.

25:52 That's you get pure hardware.

25:53 Well, that's why you can't get yours because Amazon bought them.

25:57 They did.

25:58 They had a huge truck full of them.

26:00 Well, they bought the Intel one.

26:01 So those were on sale, I bet anyway.

26:03 But no, they have some interesting, what do they call it?

26:09 Nitro.

26:09 I think they call it their Nitro service or something like that, which allows them to virtualize

26:14 actual real hardware.

26:15 So this is pretty neat.

26:16 You can sign up.

26:17 The billing is interesting.

26:18 You have to pay for at least one day's worth if you get it, which I think is like $24.

26:24 If you're going to run it continuously all the time, this is one pricey sucker.

26:29 Like the Mac mini you can get now is $700.

26:33 This is $770 a month.

26:36 Oh, okay.

26:37 So if what you need is like a couple of Mac minis, you're probably, and you need them

26:42 on all the time.

26:43 You're probably better off just buying a few and sticking them in a closet, especially the

26:47 M1s.

26:48 But if you just need one on demand every now and then, or you need to burst into them or

26:51 something like that, that could be interesting.

26:52 Yeah.

26:53 Yeah.

26:53 If you're back old school and you only release like once every three months.

26:56 Well, there was some conversations like, well, if your data is already stored in S3 and you

27:03 have like huge quantity of data and what you need to run is actually running like some video

27:08 processing on the Mac, you could do it by the data instead of transferring that kind of stuff.

27:12 Things like that might be interesting.

27:13 I don't know.

27:14 I would go ahead and throw out there also that this is all interesting.

27:18 I have links to this kind of stuff and whatnot, like the blog post announcing it and so on.

27:23 But there's also this thing called MacStadium.

27:26 And if you look at MacStadium, it's pretty interesting.

27:28 You go over there and say, give me a dedicated bare mini, a bare metal Mac mini in their data

27:33 center, $60 a month.

27:35 So you can actually get like a decent one for a decent price over there.

27:41 So if you just want one running all the time, it might be good.

27:43 But the thing is, if you're already like deeply integrated to AWS, maybe this is a good thing.
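
A rough sketch of the arithmetic behind that comparison, using only the figures quoted in the conversation (about $24 for the one-day minimum, $770 a month always-on, $700 to buy a Mac mini, $60 a month at MacStadium). The function and constant names are made up for illustration, and real pricing will differ from these spoken estimates.

```python
# Figures as quoted in the conversation; treat them as approximate.
EC2_MAC_PER_DAY = 24.0        # one-day minimum charge mentioned for an EC2 Mac instance
EC2_MAC_PER_MONTH = 770.0     # quoted cost of running one continuously for a month
MAC_MINI_PURCHASE = 700.0     # quoted purchase price of a Mac mini
MACSTADIUM_PER_MONTH = 60.0   # quoted price of a dedicated Mac mini at MacStadium


def on_demand_cost(days_used: int) -> float:
    """Cost of renting by the day at the quoted one-day rate."""
    return days_used * EC2_MAC_PER_DAY


def break_even_days(alternative_cost: float) -> float:
    """Days of on-demand use at which renting costs as much as the alternative."""
    return alternative_cost / EC2_MAC_PER_DAY


if __name__ == "__main__":
    print(f"Days per month before MacStadium is cheaper: "
          f"{break_even_days(MACSTADIUM_PER_MONTH):.1f}")
    print(f"Total rented days before buying is cheaper: "
          f"{break_even_days(MAC_MINI_PURCHASE):.1f}")
    print(f"Months of always-on EC2 that cost one Mac mini: "
          f"{MAC_MINI_PURCHASE / EC2_MAC_PER_MONTH:.2f}")
```

On these numbers, occasional bursts of a day or two per month favor renting on demand, while anything close to always-on favors buying or a flat monthly host.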

27:48 Yeah.

27:48 Yeah.

27:49 Is there anything you, yeah, go ahead.

27:51 I was just going to say, this seems pretty interesting.

27:53 I mean, I know one of the reasons that I love using GitHub Actions and Azure Pipelines

27:58 is the ability to be able to get access to Mac VMs for builds.

28:04 But if you, I could also see this being really interesting and useful if you have like some

28:08 very huge application or some like very large stack that you want to be able to be able

28:13 to do CI or tests on, that this could be really, really nice.

28:18 Especially if you don't just want to be like, you know, pounding and destroying like one Mac

28:23 over and over and over again.

28:25 Yeah.

28:25 Yeah.

28:25 This is nice.

28:26 Especially if you have a distributed team.

28:28 Yeah.

28:29 Which every team is basically a distributed team.

28:32 Yeah.

28:32 Yeah.

28:32 Welcome to 2020.

28:33 One thing that's interesting about this is you can literally press a button or even just

28:37 through the AWS, probably the Boto API.

28:40 You can just make a new Mac instantly.

28:42 Like within seconds, you can have a clean, pre-configured Mac.

28:46 You can create AMIs, the Amazon machine image, which are like, install a bunch of stuff and

28:52 get it set up and then like save it so I can respawn new machines from it.

28:56 Those are pretty interesting options versus just having a Mac mini in the closet.

28:59 You know, push a button, make a brand new one, try this, throw it away, make it a different

29:03 way, throw it away.

29:04 Like there are some use cases here that could be interesting.
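
A minimal sketch of what "make a new Mac through the Boto API" and "save it as an AMI" might look like. It assumes `ec2` is a boto3 EC2 client (for example `boto3.client("ec2")`), that the instance type is `mac1.metal`, and that Mac instances have to be placed on a dedicated host that is allocated first. The helper names, the image ID, and the availability zone are placeholders, not anything from the episode.

```python
from typing import Any


def launch_mac_instance(ec2: Any, image_id: str, availability_zone: str) -> str:
    """Allocate a dedicated host, start one mac1.metal instance on it,
    and return the new instance ID."""
    host = ec2.allocate_hosts(
        AvailabilityZone=availability_zone,
        InstanceType="mac1.metal",
        Quantity=1,
    )
    host_id = host["HostIds"][0]

    reservation = ec2.run_instances(
        ImageId=image_id,              # placeholder: a macOS AMI ID
        InstanceType="mac1.metal",
        MinCount=1,
        MaxCount=1,
        Placement={"Tenancy": "host", "HostId": host_id},
    )
    return reservation["Instances"][0]["InstanceId"]


def snapshot_as_ami(ec2: Any, instance_id: str, name: str) -> str:
    """Save a configured instance as an AMI so new machines can be spawned from it."""
    image = ec2.create_image(InstanceId=instance_id, Name=name)
    return image["ImageId"]


# Hypothetical usage (needs AWS credentials and incurs charges):
#   import boto3
#   ec2 = boto3.client("ec2", region_name="us-east-1")
#   instance_id = launch_mac_instance(ec2, "ami-PLACEHOLDER", "us-east-1a")
#   ami_id = snapshot_as_ami(ec2, instance_id, "clean-build-mac")
```

Passing the client in as an argument keeps the helpers easy to exercise with a stand-in object before pointing them at a real account.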

29:06 That said, I won't be using it.

29:10 I'm just going to buy a Mac mini if I can ever get it.

29:12 All right, Matthew, what's this last one you got for us?

29:14 Yeah, I don't have any clever transition, but all right.

29:17 So maybe, I don't know about you, but I end up having to deal with a lot of JSON serializations

29:24 of different statistical models and different, and, you know, sometimes also getting, you know,

29:29 CSVs of different data sets that I want to be doing analysis on.

29:33 And, you know, your first instinct might just be to say, okay, I'm just going to open

29:38 this up in Pandas and start to get to work on it.

29:41 But if you're used to and comfortable working in the Linux command line kind of

29:47 ecosystem of data tools, you might be itching a little bit and want to kind of just, you

29:51 know, peek inside at the command line level and kind of get to work there.

29:54 And so in that case, you might be really interested in this tool called VisiData.

29:59 So VisiData is written on this.

30:01 This is blowing my mind, actually.

30:03 Like, when I saw this, my jaw was kind of on the floor.

30:07 So we'll make sure that this is linked in the show notes because it has some really cool

30:12 videos.

30:12 But so from the docs, so it's VisiData is described as data science without the drudgery.

30:18 So it's an interactive multi-tool for tabular data.

30:22 It combines clarity of spreadsheets with efficiencies of being at the terminal and also, you know,

30:27 the power of Python 3 on a really lightweight utility that can handle millions of rows with

30:33 ease.

30:33 I can attest to that personally.

30:35 I've opened up to like four gigabyte CSV files before and it just, you know, drops

30:40 right in and starts asynchronously loading like a champ.

30:43 In addition to that, it supports kind of a really astounding number of file formats

30:47 that it supports.

30:48 Currently on the website, it says it supports 42 different file formats.

30:52 So it, you know, supports things that you would expect like CSV and JSON, but then it also

30:58 supports things like JIRA.

30:59 I guess like whatever JIRA uses for their sort of like tabular stuff.

31:05 It also can, like, read MySQL.

31:07 And I guess it can also even deal with PNG, the image file format, which I was, you know,

31:13 impressed by.

31:13 So this is all openly developed.

31:15 The output is a terminal, right?

31:17 Yeah.

31:17 Not a thing like text.

31:19 Yeah.

31:21 Yeah.

31:22 So this is all openly developed on GitHub by a guy named Saul Pwanson, I think.

31:27 And if you go to the, if you go to the VisiData website, it also has plenty of links to live

31:36 demos of him doing kind of interactive examples of visualizations.

31:40 There's one lightning talk that he's given at, I think, PyCascades 2018 or something like

31:45 that, where he's able to just call up a CSV file of, of like, 311 complaints.

31:51 And then through the, through using VisiData, just kind of hone down onto certain boroughs

31:58 and then be able to do a filter on, on different complaint types to be able to basically find

32:05 complaints about rodents and then filter on rat complaints and then plot that inside of VisiData

32:10 still on the terminal to basically make a visualization of like rodent distribution in the New York City

32:17 boroughs.

32:17 So I thought that was, you know, quite amusing and really cool.
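
The filter-then-count workflow described there happens interactively inside VisiData. As a point of comparison, here is the same sequence of steps sketched in plain Python with the standard library. The column names and the five sample rows are invented for illustration and are not the real 311 data or its schema.

```python
import csv
import io
from collections import Counter

# Invented sample rows standing in for a 311 complaints CSV.
SAMPLE_CSV = """borough,complaint_type,descriptor
BROOKLYN,Rodent,Rat Sighting
MANHATTAN,Noise,Loud Music
BROOKLYN,Rodent,Mouse Sighting
QUEENS,Rodent,Rat Sighting
BROOKLYN,Rodent,Rat Sighting
"""


def rat_complaints_by_borough(csv_text: str) -> Counter:
    """Filter to rodent complaints, then to rat complaints, then count per borough."""
    rows = csv.DictReader(io.StringIO(csv_text))
    rodents = (row for row in rows if row["complaint_type"] == "Rodent")
    rats = (row for row in rodents if "Rat" in row["descriptor"])
    return Counter(row["borough"] for row in rats)


def text_bar_chart(counts: Counter) -> list:
    """Render the counts as one line of '#' marks per borough, largest first."""
    return [f"{borough:<10} {'#' * n}" for borough, n in counts.most_common()]


if __name__ == "__main__":
    for line in text_bar_chart(rat_complaints_by_borough(SAMPLE_CSV)):
        print(line)
```

Each generator mirrors one filtering step from the talk, and the final `Counter` plays the role of the plot.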

32:20 It's also, you know, this is a Python application.

32:24 So you might not want to, you know, continuously install this in every single virtual environment you make.

32:30 So, I mean, it is up on PyPI, so you can just do pip install visidata.

32:34 But since it's an application, you probably might also want it just kind of as a generic tool on your machine.

32:40 So it's distributed through a lot of, you know, nice common package managers.

32:45 So if you're on Linux, they've got it on apt, as well as things like Nix and Guix.

32:52 But I didn't see it on Yum.

32:54 So if you're on Fedora or CentOS, you might be a little bit out of luck.

32:57 You might have to do it manually.

32:58 It's, of course, on Homebrew and even conda-forge.

33:02 And it's not listed there, but a very, very cool tool that's been featured on the show before,

33:08 which is pipx by Chad Smith.

33:10 Yeah, pipx is awesome.

33:11 It's so good.

33:11 I love it.

33:12 I tested this last night.

33:14 I just fired up a Python 3.8 Docker container and, you know, went ahead and installed pipx

33:21 and then used pipx to install VisiData and was able to drop right into VisiData as expected.

33:25 So it's very, very cool.

33:28 And just the power that you can have with it, I think, is worth checking out for anybody who

33:33 is doing data analysis with tabular data.

33:36 This is super cool.

33:37 I love when people build these tools that are kind of, you don't really expect them to be

33:41 so powerful.

33:41 And you talked about how you just dropped in and grabbed some random data and started answering

33:44 questions.

33:45 And that's super neat.

33:46 Yeah.

33:47 Yeah.

33:47 The number of inputs.

33:48 And because it's open source and because of all the other examples of data types, I

33:54 think even if you have a different data type, it shouldn't be too hard to modify this to

33:59 handle something different.

34:01 I do notice I'm excited about it.

34:03 It does have PCAP files for packet capture.

34:05 These are for communication packets.

34:08 Talking to all your devices and all your hardware at your company, right?

34:12 Well, this is like even Wi-Fi packets and cellular packets.

34:16 That's how we debug those.

34:17 So nice.

34:18 It's very cool.

34:20 Yeah.

34:20 Very cool.

34:21 And pipx is great.

34:22 I install a bunch of apps like Glances, which is a fantastic, like visualize the state, you

34:28 know, like top, but way, way better.

34:29 HTTPie, which is great, it's a better, much, much better curl.

34:34 But the most important thing I install that way is pyjokes.

34:38 So now I can type pyjoke on my command line and they're always right there.

34:42 So speaking of which, move on to our extras.

34:47 That's all of our main topics.

34:48 Brian, you got anything this week?

34:50 Oh, I did.

34:51 I haven't dropped them in.

34:52 Where'd my extras go?

34:53 Yeah, well, you got it.

34:56 I just wanted to bring up that the PyCon 2021 is going to be virtual and there's a website

35:03 up.

35:04 It's us.pycon.org slash 2021.

35:07 And there's not a lot there yet, but you can check out what's going to happen.

35:12 It's not surprising that they have to start planning it and they may as well plan it as a virtual

35:19 event.

35:20 I'm just kind of hoping that we would have live, but I understand.

35:23 Yeah.

35:24 I mean, PyCon is my geek holiday.

35:27 I love it.

35:28 It's both work, but it's also just such a nice getaway to connect with everybody.

35:32 You, everyone else we know from the community, listeners, I'm going to miss not having it.

35:38 Yeah.

35:38 I'm glad.

35:39 Do you attend?

35:40 Sorry, Brian.

35:40 No, it's good that they, I always check whenever they announce the date to make sure it doesn't

35:45 overlap Mother's Day.

35:46 And it does not.

35:48 That's not good.

35:50 Yeah.

35:50 So I have unfortunately not attended PyCon yet in person or, I mean, it was canceled this

36:00 year.

36:00 So maybe, maybe I'll attend this year remote, but I'm a regular attendee of the SciPy conference,

36:06 which this, so this past year, SciPy 2020 was moved online.

36:11 And I thought that the organizers did a fantastic job of actually running it online while still,

36:16 you know, keeping kind of that SciPy community feel.

36:19 So that was helped a lot also by, you know, plenty of bad puns.

36:23 So I think that might, might be something that still comes through for PyCon 2021, maybe.

36:28 Yeah, absolutely.

36:30 One of the live listeners, Mohammed, asks if it's going to cost money or if it's going to

36:36 be free this year to attend.

36:38 Did you notice anything, Brian?

36:40 I haven't looked.

36:41 I'm looking around and I don't know that it costs anything.

36:45 It's from what I can tell.

36:46 I don't see any pricing.

36:48 What I saw was sponsor information to get sponsors to sign up to be part of whatever they're doing

36:53 there.

36:54 But I can't tell.

36:55 Yeah, not sure.

36:56 Somebody else throw in the chat or put it into the, you know, visit pythonbytes.fm/211

37:01 and put it in the comments down there.

37:02 All right.

37:03 I got a couple here.

37:04 First of all, we're trying out live streaming here and I think it's going pretty well.

37:08 Seems like it's working out.

37:10 There's a bunch of people watching.

37:11 So if you want to get notified and we happen to keep doing this, just visit pythonbytes.fm

37:17 slash YouTube and it should have like the scheduled upcoming live stream.

37:21 You can like get notified there.

37:22 So we'll, maybe we'll keep doing this.

37:23 It's been fun.

37:24 Thanks for everyone out there who's watching right now.

37:26 And in addition to PyCon, which you just announced or mentioned the announcement of, that is the

37:33 main way that the PSF is funded, but they're also doing a dedicated offering sort of fundraiser

37:40 thing with six companies to help raise some money for the PSF and Talk Python training is

37:46 being part of that.

37:47 And 50% of the revenue of a certain set of our courses that are sold during the month of

37:53 December goes directly to the PSF.

37:55 And people who buy those courses through the PSF fundraiser also get like a 20% discount.

38:00 So there's a link in the show notes for people to take some of our courses and donate to the

38:06 PSF.

38:06 If you'd rather just directly donate, that's fine.

38:08 But if you're looking to get some of our courses anyway, you can do it this way and support the PSF.

38:12 They're hoping to raise $60,000.

38:14 You know, hopefully we can do that for them and we'll see.

38:17 Brian, you announced Big PyCon.

38:20 Another thing that got announced is Small PyCon, PyCascades.

38:25 The Cascades being the mountain range that connects Portland, Seattle, and Vancouver.

38:28 And traditionally, this conference is cycled between those three cities.

38:32 I don't even remember anymore what it's supposed to be this year.

38:34 I think it's supposed to go back to Vancouver, but it's not going to Vancouver because nobody's

38:38 going anywhere.

38:39 So PyCascades is online.

38:42 And those do cost money.

38:42 It's $10 for students, $20 for individuals, and $50 for professionals to support that conference.

38:48 But I'll link to that one since that's one of our local conferences, if you will.

38:52 Yeah, they're trying to push.

38:53 They often push what's going on, try new things.

38:57 So it's a neat conference.

38:58 Yeah.

38:59 Yeah.

38:59 I enjoy my time there as well.

39:00 All right, Matthew, what have you got for us?

39:02 Anything else you want to get a shout out to?

39:03 Yeah, just a few items.

39:05 So Advent of Code 2020 has started now.

39:09 It's day two, but there's still plenty of time to get involved with that if you want to.

39:13 And for those of you who might not know, Advent of Code is just an annual coding challenge that

39:19 takes place every December.

39:21 And it's just basically 25 days of fun and interesting programming challenges.

39:26 So it's always a great opportunity to try and brush up on your Python and maybe learn about

39:31 some interesting collections that you might not have known about in the standard library.

39:38 So that's going on right now.

39:39 Worth checking out, I think.
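
As one small illustration of the kind of standard-library collections that tend to come in handy for puzzles like these, here is a sketch using `Counter`, `deque`, and `defaultdict` from the `collections` module. The three helper functions are invented examples, not actual Advent of Code puzzles.

```python
from collections import Counter, defaultdict, deque


def most_common_letter(text: str) -> str:
    """Counter tallies items; most_common(1) returns the top (item, count) pair."""
    counts = Counter(ch for ch in text if ch.isalpha())
    return counts.most_common(1)[0][0]


def sliding_sums(values: list, size: int) -> list:
    """deque(maxlen=size) drops the oldest item automatically, giving a sliding window."""
    window = deque(maxlen=size)
    sums = []
    for value in values:
        window.append(value)
        if len(window) == size:
            sums.append(sum(window))
    return sums


def group_by_length(words: list) -> dict:
    """defaultdict(list) creates the empty list for a new key on first access."""
    groups = defaultdict(list)
    for word in words:
        groups[len(word)].append(word)
    return dict(groups)
```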

39:41 And then I'm going to sneak in some very small physics-related follow-up to Python Bytes

39:48 episode 205, in which awkward arrays were talked about.

39:52 So the lead developer of awkward arrays is my friend and colleague, Jim Pivarski, who is one

39:59 of my Scikit-HEP co-collaborators, as well as a member of IRIS-HEP.

40:04 And as of today, the recording date of December 2nd, Awkward v1.0 has a release candidate that is up on PyPI.

40:13 So by the time that this goes live, if you just do pip install awkward, you should get awkward 1.0 releases

40:20 instead of having to do the...

40:21 No more awkward1.

40:22 Exactly.

40:22 No more awkward1.

40:24 No more awkward0.

40:25 All that jazz.

40:25 It's so good to have the actual install statement be awkward itself.

40:28 Exactly.

40:29 So that's a nice little tidbit.

40:32 And I think there's some nice links in episode 205 if people want to learn more about awkward.

40:38 But that's kind of a backbone of kind of the Pythonic ecosystem for physics right now.

40:43 And then finally, I just want to give some kudos to Python Bytes as well, specifically for making

40:49 full transcripts of the shows available to view on pythonbytes.fm.

40:54 Not only is this, I think, like a cool idea in general, but I think this also makes the show

40:58 more inclusive to the deaf Python community, which is definitely out there.

41:03 And one of my good friends and co-authors is deaf, and I know that he definitely appreciates

41:09 this.

41:10 So good job, you guys, for being more inclusive of the wider community.

41:14 Oh, that's so cool.

41:15 I didn't know anybody was utilizing it.

41:17 Yeah, that's awesome.

41:18 Thank you.

41:19 I think it's absolutely critical for that because the format is only audio.

41:24 But a lot of folks have reached out and said they also appreciate it if they're English as

41:28 a second language and they're not as good with English as well.

41:31 So that also helps, I think.

41:33 They're like, what was I saying again?

41:35 What a weird word.

41:36 Awkward array?

41:37 Why would they talk about that?

41:38 It doesn't make sense.

41:40 Yeah, transcripts and closed captioning is just more inclusive for everyone.

41:44 So that's awesome.

41:45 Yeah, thanks.

41:46 All right.

41:47 Well, let's wrap it up with a joke, Brian.

41:51 Yeah.

41:51 All right.

41:52 So you guys, I'm going to need your help here.

41:54 I'm going to let Matthew, I'm going to let you pick.

41:57 Do you want to be Windows or Apple?

41:58 I'll be Windows.

42:00 All right, Brian, you be Apple.

42:01 So the idea is like the title here is how to fix a computer, any computer.

42:07 So instructions for Windows.

42:10 Go ahead, Matthew.

42:10 So step one, reboot.

42:12 And then the flowchart goes to, did that fix it?

42:15 If no, proceed to step two.

42:17 Step two, format your hard drive and then reinstall Windows.

42:20 Lose all of your files and quietly weep.

42:23 Brian, Apple doesn't have that problem.

42:26 There's some totally different solution there.

42:28 Okay.

42:28 For Apple, it's step one.

42:30 Take it to an Apple store.

42:31 Did that fix it?

42:32 If no, proceed to step two.

42:34 Step two is buy a new Mac.

42:36 Overdraw your account and quietly weep.

42:39 That's me right now.

42:41 All right.

42:41 I got the Linux fix.

42:43 It's so easy.

42:43 It's totally like you don't need those things.

42:45 So you learn to code in C++.

42:48 You recompile the kernel.

42:49 You build your own microprocessor out of spare silicon you have laying around.

42:53 You recompile the kernel again.

42:55 You switch distros.

42:56 You recompile the kernel again, but this time using a CPU powered by the reflected light from

43:02 Saturn.

43:02 You grow a giant beard.

43:04 You blame Sun Microsystems.

43:06 You turn your bedroom into a server closet and spend 10 years falling asleep to the sound

43:10 of whirring fans.

43:10 You switch distros again.

43:11 You abandon all hygiene.

43:13 You write a regular expression that would make any other programmers cry blood.

43:17 You learn to code in Java.

43:19 You recompile again, but this time while wearing your lucky socks.

43:22 Did that fix it?

43:22 No.

43:23 Proceed to step two.

43:24 Revert back to using Windows and Mac.

43:26 Or Mac.

43:27 Quietly weep.

43:30 There's really no good outcome here.

43:31 They all end in quietly weep.

43:33 As a Linux user for the better part of a decade, I can neither confirm nor deny how accurate

43:38 that last part is.

43:39 Yeah, they all have their own special angle.

43:44 It just takes longer to get there with Linux to get to your destination, I guess.

43:49 Yeah.

43:49 All right.

43:50 Well, that's fun as always.

43:51 And everyone watching on YouTube, thanks for being here live and everyone listening.

43:54 Just thank you for listening.

43:55 Matthew, thanks for joining us.

43:57 Hey, thanks so much for having me.

43:58 This was really fun.

43:59 Yeah, yeah.

44:00 Great for the items you brought.

44:01 Enjoyed them.

44:02 And Brian, thanks as always, man.

44:04 Thank you.

44:05 It's been fun.

44:05 Yep.

44:06 See ya.

44:06 Bye.

44:07 Thank you for listening to Python Bytes.

44:09 Follow the show on Twitter via at Python Bytes.

44:11 That's Python Bytes as in B-Y-T-E-S.

44:14 And get the full show notes at pythonbytes.fm.

44:17 If you have a news item you want featured, just visit pythonbytes.fm and send it our way.

44:22 We're always on the lookout for sharing something cool.

44:24 On behalf of myself and Brian Okken, this is Michael Kennedy.

44:28 Thank you for listening and sharing this podcast with your friends and colleagues.
