

Transcript #284: Spicy git for Engineers

Recorded on Monday, May 16, 2022.

00:00 - Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds.

00:04 This is episode 284, recorded May 17th, 2022.

00:09 I'm Michael Kennedy.

00:10 - And I am Brian Okken.

00:11 - And I am Daniel Mulkey.

00:13 - Daniel, great to have you here.

00:14 - Thank you, it's an honor.

00:16 - Yeah, it's an honor to have you.

00:17 Now, before we get into our first topic that Brian's gonna tell us about, just give us a bit of your background.

00:23 - Sure, I am an optical engineer in Southern California, but I spend a significant amount of my time using Python for data analysis, instrument control, and other things.

00:34 So I've been doing it for the better part of the last five years.

00:38 And I've had a back and forth relationship with MATLAB and have finally married Python, so to speak.

00:43 - Fantastic.

00:44 You've finally been able to get out of your dysfunctional relationship with MATLAB.

00:48 - Yes, exactly.

00:49 - It sounds a little bit like you might live in a parallel universe to Brian.

00:53 - Yeah, it sounds like it.

00:55 We should definitely get you on Test & Code. Can we NPS about that?

00:58 - Sure, yeah, I'd love to.

01:00 - Brian, I would love to hear about our first topic.

01:03 You want to talk about it? It sounds very distinct, you know, distinctipy.

01:07 Yes, very distinct.

01:08 So I ran across this.

01:10 I can't remember how I ran across it.

01:12 Guess it doesn't matter.

01:13 But one of the things I like, it's a Python package called distinctipy.

01:17 And it's very simple.

01:19 It's a lightweight Python package to provide functions to generate colors that are visually distinct from one another.

01:28 So I was thinking like, you know, you got a chart, like maybe you're taking user data or something and you don't know how many lines you're gonna plot, but you're gonna plot a whole bunch of lines.

01:39 How do you pick the colors for what the lines are?

01:41 So this is a kind of a neat thing to just pick visually distinct colors.

01:48 Pretty focused, but it's pretty cool.

01:52 And all you do is give it the number of colors you want and it gives you back the colors. It also has display capabilities, though you have to install extra stuff to make that happen.

02:05 But you can display color swatches with it too, and I was looking at some of the different colors that are available. Like one of the ones was

02:14 15 different colors, I think it's 15 colors for normal vision versus some colorblindness.

02:21 So if you have colorblind people, you can pick based on some of that stuff.

02:26 There's a whole bunch of examples in the repo too that it's kind of fun to look at.

02:31 One of them was the normal colorblind one.

02:34 Oh, was that it?

02:36 No, that wasn't it.

02:37 But there's some really cool examples of different colors.

02:42 So if you just give it a few, it just grabs a few, of course.

02:44 But there's a whole bunch of neat ones, clusters and things.

02:49 So anyway, cool little library.
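The "give it a number, get back that many distinct colors" idea can be sketched with only the standard library. This is a minimal illustration of one possible approach (greedy selection of the random candidate farthest from the colors already chosen), not distinctipy's actual implementation. The function names, the candidate count, and the use of plain RGB distance are all assumptions made for this example.

```python
import random


def color_distance(a, b):
    """Euclidean distance between two RGB tuples with channels in [0, 1]."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def pick_distinct_colors(n_colors, n_candidates=500, seed=0):
    """Greedily pick colors that are far apart in RGB space.

    Hypothetical helper: plain RGB distance is only a rough stand-in
    for how different two colors look to a person.
    """
    rng = random.Random(seed)
    # Stay away from pure black and white, which are common backgrounds.
    avoid = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]
    chosen = []
    for _ in range(n_colors):
        candidates = [
            (rng.random(), rng.random(), rng.random())
            for _ in range(n_candidates)
        ]
        # Keep the candidate whose nearest already-used color is farthest away.
        best = max(
            candidates,
            key=lambda c: min(color_distance(c, other) for other in avoid + chosen),
        )
        chosen.append(best)
    return chosen


if __name__ == "__main__":
    for rgb in pick_distinct_colors(5):
        print(tuple(round(channel, 2) for channel in rgb))
```

Each returned tuple can be passed directly as a line color to a plotting library that accepts RGB values in the 0 to 1 range.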

02:52 >> It's great.

02:52 >> Yeah, I like that they have--

02:54 I noticed when I was looking through it, they have a function for generating a color palette.

02:57 And so you can generate a color blind-friendly palette.

03:00 So hypothetically, that works well for people who are color blind.

03:03 And if it's in print and you're doing black and white, so that was the most interesting thing to me.

03:06 >> Oh, do you mean this black and white?

03:08 That's interesting.

03:10 At least I think if you take a color blind palette and you make it black and white, typically it's still a decent contrast.

03:14 So you don't have to worry about printing things out.

03:17 - Oh, that's cool.

03:18 - Yeah, that's great.

03:19 And one of its functions is to take the colors that it generates and turn them into a Matplotlib colormap.

03:25 - Oh yeah, yeah, yeah.

03:26 - Which is cool.

03:27 - Well, that's what I was looking for.

03:29 - Yeah.

03:29 Oh, wow.

03:32 - And there's somebody in the audience who just found out they're color blind.

03:34 - Yeah, go ahead, Daniel.

03:35 - No, just kidding.

03:36 And there's somebody in the audience who just found out they're color blind.

03:39 (laughing)

03:40 They're like, is there a difference?

03:41 What is this?

03:41 (laughing)

03:42 - So, I, yeah, one of my kids found out like in high school that they were colorblind.

03:49 So, interesting.

03:51 - Yeah, how would you know?

03:52 - Yeah. - For a long time.

03:53 You're just like, people tell me that's a color.

03:54 I guess I'm not great at picking out that color.

03:56 - She got, she, an art teacher said, "I really love how you used both blues and greens in the sky." And she was like, "I intended to just use blue, but thanks." (laughing)

04:07 - I have a friend who went to art school and that was essentially his story that he always had really vivid color choices 'cause he didn't see the same as everybody else.

04:15 (laughing)

04:16 It was great, it was awesome.

04:17 - That's pretty cool.

04:19 - Yeah, cool, all right.

04:21 Brian, we ready for the next one?

04:22 - Definitely.

04:24 - Okay, so let's talk about SQL Soda, or Soda SQL.

04:29 So this is an open source CLI tool for when you're doing ETL, like extract, transform, load type of stuff, or doing other sorts of analysis or exploration of SQL data.

04:43 It allows you to connect to your data source like your database, and then define tests for what invalid data looks like, right?

04:51 Does this have to be a number?

04:53 Can it, does it just have to be not null?

04:56 You know, what is it?

04:57 So for an example here, they're talking about, here's the YAML file for a warehouse, a data warehouse reporting type thing for Postgres.

05:08 So you just set up like your connection and your host and all that kind of stuff and then off it goes.

05:14 So pretty neat.

05:15 And then you can scan your dataset to run tests against your data.

05:19 Isn't that cool?

05:20 - That's right.

05:21 It's soda cool.

05:22 - It's soda cool.

05:23 It is soda cool.

05:24 (both laughing)

05:26 Yeah, so you just say soda scan and you give it the YAML file for the connection information and then a YAML file for the types of things you wanna test.

05:37 So they've got this example of how you're talking to one of the data warehouses and it's going and pulling in these config files.

05:44 And it basically, this example, it's testing 54 different conditions.

05:48 Three tests were executed.

05:50 Everything's good to go.

05:52 So, you know, if you're getting kind of data dropped on you or you're scanning, you know, scraping data from other places on some kind of background job and you wanna bring it in, you know, if it's all automated, how do you know when it goes wrong, right?

06:03 So here's a nice, simple way to express that.

06:06 - Yeah, that's neat.

06:07 - Yeah, and Brandon out in the audience says, "I think we're looking at Great Expectations for this same thing." And yeah, this is kind of a, I guess, my first impression is this is a less-code way of doing what Great Expectations does, right?

06:21 So like you can just put together some YAML files that define what you wanna test for, right?

06:27 So for example, in this YAML file, I can say the metrics are row count, missing count, and missing percentage, and then I can test that the row count is greater than zero, right?

06:35 And then another one is for the column, for the ID: it's a UUID, and I'm allowing 0% of the values to have an invalid UUID format, right?

06:46 You know, that's got like a certain structure to it, right?

06:48 It's like a, either a straight UUID or a string that looks, that can be parsable over to one, I'm guessing, something like that.

06:54 So pretty cool.

06:55 I think that's probably the biggest difference.

06:57 So if you just want to define kind of declaratively, like here are the conditions under which I want it to test, and then you want to just set it up to continuously scan it, looks good.
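The checks described here (row count, missing count, missing percentage, and a cap on the percentage of invalid UUIDs) can be sketched in plain Python. This is only an illustration of the idea, not Soda SQL's API or YAML format. The function names, the row layout (a list of dicts), and the threshold parameter are assumptions made for this example.

```python
import uuid


def is_valid_uuid(value):
    """Return True if the value parses as a UUID."""
    try:
        uuid.UUID(str(value))
        return True
    except ValueError:
        return False


def scan(rows, column="id", max_invalid_percentage=0.0):
    """Compute a few measurements for one column and run tests on them.

    Hypothetical helper: rows is a list of dicts, one dict per table row.
    """
    row_count = len(rows)
    values = [row.get(column) for row in rows]
    missing_count = sum(1 for v in values if v is None)
    invalid_count = sum(
        1 for v in values if v is not None and not is_valid_uuid(v)
    )

    def percentage(count):
        return 100.0 * count / row_count if row_count else 0.0

    measurements = {
        "row_count": row_count,
        "missing_count": missing_count,
        "missing_percentage": percentage(missing_count),
        "invalid_percentage": percentage(invalid_count),
    }
    tests = {
        "row_count > 0": row_count > 0,
        "invalid_percentage <= max": (
            measurements["invalid_percentage"] <= max_invalid_percentage
        ),
    }
    return measurements, tests


if __name__ == "__main__":
    rows = [
        {"id": "12345678-1234-5678-1234-567812345678"},
        {"id": "not-a-uuid"},
        {"id": None},
    ]
    measurements, tests = scan(rows)
    print(measurements)
    print(tests)
```

Raising max_invalid_percentage lets a scan tolerate a limited share of bad rows instead of requiring none at all.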

07:06 - The invalid percentage looks interesting because it's an interesting addition of like, you know, there can be some bad rows, but we don't want more than like 20% bad rows or something like that.

07:19 - Right, right, maybe you can't have zero errors, right?

07:22 Like you just, sometimes the data is just not there.

07:25 But if it's 100% not there, then something's gone terribly wrong or the data formats change and it's not called that anymore or whatever.

07:33 JSON, who knows? Daniel, what do you think?

07:35 My data is always in CSV files, so I have, I guess there are pros and cons to never having touched SQL, as I've heard from some.

07:42 Much easier to version control.

07:45 Just put the CSV in version control.

07:47 Yeah, anyway, I think this one's pretty neat.

07:52 People can check it out if they're doing relational data stuff, especially if you're doing a lot of, like, on-demand, you know, not like you ask for it, but it's just on-demand processing, you're given a database and you want to check it out to see how it's doing.

08:05 So I won't go on anymore on that because I've got a ton of other extras.

08:09 So kick it over to you, Daniel.

08:11 Cool.

08:11 So let's see.

08:13 There was a review article back in 2020 published in the research journal Nature.

08:19 For anyone not in the research articles world, Nature is one of the top level ones.

08:27 For reference, in grad school, we had some fancy work we did with quantum entanglement.

08:31 and we got rejected by a sub-journal of Nature. So to get anything into Nature is highly non-trivial.

08:37 I will add the caveat that this- - It's like the JAMA, the Journal of the American Medical Association, of science basically.

08:44 - It's absolutely one of the top ones. - Yeah.

08:47 - And I will say it's a review article. So it's easier typically to get a review article than to say, "Hey, this is bleeding edge research that's gonna change the world." But still, the big news is two things.

08:56 One, that there is an article by Travis Oliphant and others on array programming with NumPy in Nature.

09:04 That's a big enough deal that they chose to publish this.

09:07 And they got through.

09:07 And it's, I think, very significant that that software was something that was good enough to publish.

09:14 The other-- and they go through, and they talk about the fundamentals of it all.

09:18 There's one diagram I really like that sort of shows how the whole ecosystem stacks up.

09:22 You've got NumPy as the base.

09:24 - That's a cool visualization.

09:25 - Yeah, and then you got SciPy and Matplotlib and the other plotting libraries.

09:29 So there's the foundation.

09:31 - Yeah, I was just gonna say, for people who are listening, it's like the tree of life for scientific libraries.

09:37 Sorry, go on, Daniel.

09:38 - Yeah, that's absolutely right.

09:40 So from that foundation as far as algorithms and plots, you go up to a specific method you're using.

09:45 Are you doing image processing?

09:47 Are you doing machine learning or something else?

09:49 And on to domain-specific packages like Astropy, and I think you've had those guys on Talk Python and gotten to talk to them, and then down to the very application-specific.

09:56 So it's NumPy serving almost everybody who does anything numerical, down to things like QuTiP, which is used by people working on quantum computers.

10:04 Very large breadth being discussed. Q-tip. That's so cute.

10:09 I like it.

10:10 And yeah, so it's notable that Python got into Nature.

10:18 And if you go search for Python, there are a lot of other articles, But it's also interesting to see that they're willing to publish software.

10:24 You guys have talked in the past about how you can't always publish a software package in any research journal, so how do you get credit for that if you're in academia?

10:32 But this is an interesting take to see that Nature chose to publish it.

10:35 - Yeah, this is super interesting.

10:36 I think it's very valuable to just raise awareness, right?

10:40 It's, you know, this is the water that we swim in, but not everyone is immersed in the Python data science tooling, right? - Yeah.

10:49 - There's a lot of authors on here.

10:50 Yeah, I was trying to understand.

10:52 I'm guessing those are the maintainers of the packages that were included, but I mean, you don't have 20 people write one paper.

10:58 So I don't know how, I think it's kind of like the LIGO papers or like the gravitational wave interferometer ones where like this crazy list, it's like the first page of the articles, almost all authors, just 'cause there's so many people that worked on this for so long.

11:11 So I'm guessing that's the story.

11:13 - And you can access it.

11:15 Some articles, some journals, you can't actually read it unless you have a subscription, but this one's available.

11:22 - Indeed, yeah, very cool pick.

11:26 Before we move on, maybe you know, Daniel: Alvaro in the audience asks, have any of you come across a way to validate Pandas data frames against a schema, much like SQL Soda, Soda SQL?

11:38 - Out of my scope.

11:40 - I feel like we have, but I don't remember.

11:42 - Yeah, I don't remember either.

11:46 Sorry, maybe something we should seek out for the next one.

11:50 And I think we might get some answers in the audience.

11:52 So we'll let them inform us as we move on.

11:56 So Brian, what's next?

11:58 - Well, this isn't Python specific, but I think a lot of Python people are using GitHub Actions.

12:05 So GitHub announced, I guess, recently, "Supercharging GitHub Actions with Job Summaries."

12:12 It's an article that we'll link to.

12:14 and basically it's pretty cool.

12:19 I can't wait to try this.

12:20 I'm using GitHub Actions.

12:21 And the gist is you can now have Markdown go directly into your GitHub job summary with this crazy environment variable called GITHUB_STEP_SUMMARY.

12:35 But it's got Markdown to it.

12:39 And I'm like, well, what can you do with this though?

12:41 But Simon Willison was tweeting about it.

12:46 And then Ned Batchelder said, "Hey, I'm using it too." So Ned has a little example on coverage.py that shows, what does it show?

12:59 It shows you get this nice total coverage percentage.

13:03 If you wanna put that in your, in the coverage for your repo, you can do that.

13:09 interesting that coverage.py is not 100% covered.

13:13 (laughing)

13:14 I don't know why I find that funny.

13:16 - The irony, I love it.

13:17 - And then Simon also listed Datasette as an example.

13:25 He's adding some extra stuff to, what is he adding?

13:31 Changed files.

13:32 Oh, he's got a tool that looks for how many files have changed recently.

13:40 And he actually just wrote a write-up for that.

13:44 So we'll link to that as well.

13:45 So GitHub action job summaries, and he shows how it works.

13:50 You can pop out stuff.

13:51 And I love Markdown.

13:52 - Yeah, even little code fences and all sorts of stuff.

13:55 That's very cool if you wanna structure something real nice like that.

13:57 - Yeah, it even has, so supposedly it's got a whole bunch of stuff.

14:02 It's got like, you can do tables even.

14:05 So that's neat and emojis.

14:07 Why not?

14:08 - Oh yeah.

14:09 - Pretty cool.

14:10 - You can put a little fire emoji in there.

14:11 Yes, do it.
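The job summary mechanism discussed here works by appending Markdown to the file whose path GitHub Actions puts in the GITHUB_STEP_SUMMARY environment variable. A minimal sketch of doing that from a Python step follows. The helper names and the coverage figure in the usage example are hypothetical and are not taken from coverage.py or from Simon's write-up.

```python
import os


def markdown_table(headers, rows):
    """Build a simple Markdown table from header names and row values."""
    lines = [
        "| " + " | ".join(headers) + " |",
        "| " + " | ".join("---" for _ in headers) + " |",
    ]
    for row in rows:
        lines.append("| " + " | ".join(str(cell) for cell in row) + " |")
    return "\n".join(lines)


def append_job_summary(markdown, env=os.environ):
    """Append Markdown to the job summary file, if one is configured.

    Returns False when GITHUB_STEP_SUMMARY is not set, for example when
    the script runs on a laptop instead of inside GitHub Actions.
    """
    path = env.get("GITHUB_STEP_SUMMARY")
    if not path:
        return False
    with open(path, "a", encoding="utf-8") as summary:
        summary.write(markdown + "\n")
    return True


if __name__ == "__main__":
    # Hypothetical numbers, only to show the shape of the output.
    table = markdown_table(["Metric", "Value"], [["Total coverage", "93%"]])
    append_job_summary("## Test results :fire:\n\n" + table)
```

Because the file is opened in append mode, several steps in the same job can each add their own section to the summary.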

14:12 - Is there any way to get images?

14:13 Like if you create an image during the action, can you reference it?

14:16 - I don't know.

14:18 It doesn't mention images, but.

14:21 - Maybe you could base 64 encode it and embed it as a data URL.

14:24 - Oh wow, it even does Mermaid, which is a way to do diagrams within it.

14:30 That's pretty neat.

14:31 - Very nice, like flowcharts.

14:32 - Yeah.

14:33 - Fantastic.

14:34 - This is a good one.

14:35 I need to learn to do more with GitHub Actions.

14:37 I don't do very much with them.

14:38 - I love them.

14:39 They're like, it was, I used to use Travis back in the day, but, and I think these are way easier, so.

14:46 - Daniel, do you do any of those sorts of things?

14:49 Any CI automation type stuff?

14:51 - A previous company, we used Azure DevOps and set up some stuff to build packages and build applications, but not at the moment.

14:59 There just don't happen to be any code bases I have that need that.

15:04 - Yeah, very cool.

15:05 All right, well, I've got an interesting one here.

15:08 I wanna dive into it, you guys.

15:10 So this one, let me give some attribution here.

15:14 This one was sent over by Itamar, I believe from Meta, and then this is a write-up by Alex Waygood.

15:22 And what it is, is it's basically the notes for all of us who were not there for the 2022 Python Language Summit.

15:30 So that's pretty cool.

15:31 There were around 30 core developers, triagers, and special guests gathered the day before PyCon.

15:38 And so they had a bunch of different talks and ideas they discussed.

15:43 Quick summary, really it's about, so much of this is about performance and parallelism right now.

15:50 And then there's a lot of maintainability, back channels, back flows here.

15:57 All right.

15:58 Coming to these first, Sam Gross made a huge splash last year when he gave a talk and introduced the nogil work that he had done for, I thought, 3.8, I believe. I can't remember, 3.8, 3.9. No, it was 3.9 for them. Cinder was 3.8. So for 3.9, and there's a lot of interesting optimizations and whatnot in that talk. So the idea is, could we live without a global interpreter lock? Larry Hastings tried the Gilectomy and sort of said, you know, it's too much of a penalty to try to live without it. But this nogil work that Sam Gross did actually had very small overhead in terms of what it added, but potentially removed some of the GIL things. So there's a lot of analysis of that. People were excited, but how is it written?

16:48 It says, "Robust," there was robust questioning.

16:52 (laughing)

16:54 One, I guess one of the biggest parts that they discussed was maybe this should be a fork of CPython.

17:00 There should be a nogil version of Python.

17:03 And, but Sam is like, "Mm, I really don't wanna have just another separate version of Python.

17:09 I really want this to just help everyone." So, pretty interesting.

17:14 I think originally it was maybe gonna be a runtime flag you could pass to Python, but it's looking like it more likely is gonna turn out to be a compiler flag.

17:23 So you'd have to have a nogil build, even though it's from the same source code.

17:27 So yeah, a bunch of interesting things, concerns about how it's gonna work with like C libraries and so on.

17:34 But that's, all these are pretty interesting readups, reads, write-ups.

17:39 So Eric Snow did a presentation on his per-interpreter GIL, which is interesting in how it approaches a slightly different problem than, say, Sam Gross.

17:51 So Sam is trying to get it out of Python.

17:53 Eric is saying, well, if we could just have a sub-interpreter, like a little mini in-process interpreter that runs per thread, then they can all GIL to their heart's content.

18:02 It doesn't matter 'cause it's all single threaded, right?

18:05 But what's interesting is if you go look at this one, in here, we've got this one.

18:11 It says something like way back in 1997, this idea of multiple sub-interpreters was added by Guido, but it really hasn't, nothing has been done with it.

18:23 And when somebody tries to do stuff with it, there were thousands of global variables.

18:30 And if you're going to have per-interpreter state, you have to somehow have those not shared, because otherwise you're going to have the GIL back on them, right, you have that locking.

18:37 So due partly to the deprecation of some of the old libraries and stuff, it's gotten a little simpler, but no, that was it for the next write-up.

18:47 But anyway, they reduced this to almost 1,000, to 1,200 remaining globals.

18:52 (laughing)

18:53 So needless to say, it is not totally solved here, right?

18:58 So again, one of the possible worries of all this stuff is, well, how are the C extensions going to deal with this?

19:03 Like, they don't know about multiple sub-interpreters.

19:07 Yeah, so anyway, that's another one of the main threads going on there, let's see.

19:12 Then this is probably the biggest deal.

19:14 This is Faster CPython, 3.12 and beyond, by Mark Shannon and Guido van Rossum.

19:20 So stepping back a release, Python 3.11, if you haven't heard, is fast.

19:26 It's supposed to be 1.25 times faster than 3.10.

19:30 How about that?

19:31 - Yikes.

19:31 - This blows me away.

19:32 In one year, they were able to make Python 1.25x faster, and it's been out for 30 years.

19:39 It's not like, oh, well, we released it last year, now we've learned some things.

19:42 You know, it's really, really, really solidified in the way that it is, and then still, there's a lot of work, and this apparently is just the beginning.

19:50 This is like a five-year plan to add all sorts of optimizing JIT compilers and all sorts of things.

19:58 - How did they quantify that, or what subset of the language was it tested on?

20:02 That's the tricky thing to say.

20:04 Python is 25% faster.

20:06 Doesn't matter what you do, even if you're just waiting on a database, it's still 25% faster.

20:10 Just overclock your computer in the background.

20:12 It liquid-cools it.

20:14 With this, I believe that number comes from the unit tests like all the tests for CPython.

20:22 I'm not 100% sure, but I believe.

20:24 That was the conversation, and so one of the big things coming is possibly a JIT, an optimizing JIT compiler.

20:31 So right now they've found a way to optimize individual byte code instructions to make the runtime smarter and go, oh, I see what you're trying to do.

20:40 We could have a specialized version of that.

20:43 But that's on a per line basis.

20:45 Like how about inlining this method?

20:47 'Cause I only see it called in two places or something like that, right?

20:51 So you need something that can look more broadly at the code.

20:53 So that's this idea of the JIT compiler and so on.

20:56 So yeah, this is really good, but all three of these things I've talked about are like both, they might help each other, but they also might inhibit each other, right?

21:05 So like the nogil work might interfere with some of the optimizations that they're doing over here, and with the multiple sub-interpreters there also might be some interplay that's gotta be worked out.

21:16 So I'll just summarize the rest.

21:18 WebAssembly: so we've talked about PyScript last time and Pyodide, and this is an official CPython build target, for just CPython.

21:33 So this is really interesting, that it's sort of a more from the core devs rather than somebody coercing CPython into a different build on their own.

21:40 So that's pretty neat.

21:42 f-strings: apparently the f-string parser is kind of this weird side parser thing that's not actually part of the Python code parser.

21:50 But now that we have PEG, the PEG parser, it can support more of this and sort of unify that.

21:55 So yeah, there's something like 1,400 lines of customized C code for parsing f-strings.

22:02 (laughing)

22:04 Well, the people who wrote it knew, they did a lot of work.

22:07 - There's like 600 of the global variables right there.

22:10 (laughing)

22:12 - Exactly.

22:14 - The most important 1400 lines in all of Python right now.

22:18 The f-string functionality.

22:19 - Then two of the big optimizations from Cinder, that's the Python 3.8 specialization from Meta.

22:28 One is, this is a presentation by Itamar Ostricher.

22:33 So this is the person who sent this in actually.

22:36 This is looking at async methods.

22:40 And if you can be sure it's not actually going to await, treat it like a regular method.

22:46 So you know, like if you have an async method, you might say, do this, do this, do this.

22:49 If I already have the value in the cache, return it, else await a database call, right?

22:55 If you already have it in the cache, why do you need to create a coroutine, schedule it on the loop, wait for the loop to get to it, and then return? Just call it, like just a regular method, just give us the answer.

23:07 That's the idea.

23:09 There's some interesting ideas that it might change runtime ordering, although I don't know that there were any promises of runtime ordering, but yeah.

23:17 So that one's interesting.

23:19 Also, the issue and PR backlog.

23:22 now that we've moved to GitHub, apparently there are issues that are still 20 years old that are still open.

23:29 And traditionally, the core devs and the triagers and so on have approached these things like, well, should we close this or probably we need to keep it open 'cause it's important for historical reasons.

23:42 And they're starting to talk about like, this is not helpful for anyone.

23:46 Maybe our first question is like, why should we keep this open?

23:49 And if the answer is not clear, just close it.

23:52 There's a lot of talk about, well, this historical stuff and maybe someone wants to pick it up.

23:55 Boy, if it were me, I got to pick and obviously I don't, so it doesn't really matter.

24:00 I would just go, if it's older than two years, just close it.

24:02 Like there's a script that just says, over two years, select all, close.

24:05 Now let's go through and figure it out because at some point, you know, if you've got 20 years of, you should make this change.

24:11 Maybe even, maybe these things aren't even relevant anymore, you know, or things have moved beyond it or it doesn't make sense in 2022.

24:19 I don't know.

24:20 But I'm just, mostly what I got out of that article is I'm thankful that I don't have to deal with 20 years of issues and PRs.

24:25 (both laughing)

24:27 - But also, they don't go away if you close them.

24:30 They're still there if people really wanna see 'em.

24:32 So I think they should be, maybe two years might be a little extreme, but at the very least, five or three or something like that.

24:41 - There should be a number where that's true.

24:42 That number should be less than 30.

24:43 - And it's a smaller number than 20, right?

24:47 - Yeah, all right.

24:49 This is a long section, last thing, I'll close it out with this.

24:53 Immortal objects, the path forward for immortal objects.

24:56 So let me ask you guys this, can you change None or True or False?

25:01 No, right?

25:02 Do you think it's ever gonna go away?

25:04 Like, are we done using True and then it's just gonna get garbage collected or reference counted out of memory?

25:10 Nope, but you know what?

25:11 Every time you interact with True and False, it's still incrementing its ref count.

25:16 (laughing)

25:17 - Interesting.

25:18 And None and stuff, because it's an object, right?

25:20 Oh, yeah.

25:21 And so this discussion is like, isn't there some that just shouldn't be participating in reference counting because they're just fundamental to, you know, like the idea of a class, like the structure of a thing that defines what a class is, true, false, the numbers, like the low numbers, like there should be some that are not consuming that memory because they don't need to keep track of that section and so on.

25:47 Right.

25:48 So anyway, this was the proposal.

25:50 Again, it's complicated is the story, but yeah, I do something a little bit like this on Talk Python, the training site.

25:59 So I've done a lot to tweak the garbage collection around there and really change the defaults of like, what are the triggers for garbage collection?

26:09 So if I've got this many allocations and so on, and one of the things you can do is you can tell it, from here on, whatever has existed up until now,

26:18 freeze that and don't look at it when you have to look for cycles.

26:22 Right?

26:22 So in my app startup, when it has kind of imported the things and it's about to start, it just says, okay, everything that you've done to come to life, just don't track that anymore.

26:33 Anything else I make from here on out, please clean that up.

26:35 And it, it seems it's kind of a super cheap, cheapo version, but you still get reference counting, right?
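The startup trick described here maps onto two calls in the standard library's gc module: gc.set_threshold to change when collections are triggered, and gc.freeze to move everything allocated so far out of the collector's view. A minimal sketch follows. The function name and the threshold value are arbitrary assumptions made for illustration, not the settings used on the Talk Python site.

```python
import gc


def tune_gc_after_startup(threshold0=50_000):
    """Call once, after imports and app setup, before serving requests.

    Hypothetical helper; the default threshold is an arbitrary example.
    """
    # Raise the allocation count that triggers a youngest-generation
    # collection, keeping the other two thresholds as they are.
    _, threshold1, threshold2 = gc.get_threshold()
    gc.set_threshold(threshold0, threshold1, threshold2)

    # Clear out startup garbage, then freeze the survivors so later
    # cycle collections skip them. Reference counting is unaffected.
    gc.collect()
    gc.freeze()
    return gc.get_freeze_count()


if __name__ == "__main__":
    frozen = tune_gc_after_startup()
    print(f"{frozen} objects frozen; thresholds are now {gc.get_threshold()}")
```

Objects created after the call are still tracked and collected as usual, and gc.unfreeze() undoes the freeze.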

26:40 Yeah.

26:41 That's definitely an optimization.

26:43 - I think it's worth it for some of these immortal objects.

26:46 Why not?

26:47 - Yeah, I mean, we shouldn't be reference counting on None.

26:49 That's kind of weird.

26:50 - Unless it slows things down by having like some--

26:54 - It does, that's the thing that's crazy.

26:55 So over here, they're like, all right, here's the deal.

26:57 We shouldn't have to worry about this.

27:01 And so, where was it?

27:03 - Except it adds an if statement to everything, right?

27:07 - Yeah, it says the naive implementation of this makes it 6% slower, not faster.

27:13 Like, oh no.

27:14 - It makes sense, yeah.

27:17 - And we think we can make it only 2% slower.

27:19 - It's gonna be slower though, yeah.

27:24 - Well, the thing is, normally you would just reference count it.

27:27 You just go None plus equals one, right?

27:29 Or plus plus, minus minus.

27:31 But here you're like, you have to have a test.

27:33 Like, if it's an immortal object, do this, else do that.

27:36 And it's just like that bit in the hot loop of the runtime is just apparently overhead, you know?

27:41 - Yeah, for everything.

27:42 So everything you reference has to check to see whether or not it's an immortal object before it does the reference counting, so.
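The trade-off discussed here, where skipping reference counting for immortal objects costs an extra check on every count update, can be shown with a toy model. This is not CPython's implementation, which lives in C. The sentinel value, the class, and the function names are hypothetical and only illustrate where the extra branch sits.

```python
IMMORTAL = -1  # hypothetical sentinel marking "never count, never free"


class ToyObject:
    """A stand-in for an object header holding a reference count."""

    def __init__(self, name, immortal=False):
        self.name = name
        self.refcount = IMMORTAL if immortal else 1


def incref(obj):
    # This test runs on every single increment, including for ordinary
    # objects: that is the overhead being discussed.
    if obj.refcount == IMMORTAL:
        return
    obj.refcount += 1


def decref(obj):
    """Drop one reference; return True if the object should be freed."""
    if obj.refcount == IMMORTAL:
        return False
    obj.refcount -= 1
    return obj.refcount == 0


if __name__ == "__main__":
    none_like = ToyObject("None", immortal=True)
    ordinary = ToyObject("some list")
    for obj in (none_like, ordinary):
        incref(obj)
        print(obj.name, obj.refcount)
```

The immortal object's count never changes, so nothing writes to its memory, but the ordinary object pays for the comparison on every update.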

27:49 - Yeah, maybe it has a no-op method on it, I don't know.

27:53 I think it probably works straight on the field though.

27:56 All right, much like Highlander, Alvaro says, there can only be one None.

28:01 (laughing)

28:03 All right, well. - That's a trade-off.

28:05 - Yeah, yeah, this is definitely an interesting trade-off.

28:08 All right, well, I think that's more than enough for the language summit, but it was really cool that Alex wrote that up and then Itamar sent it in, 'cause that's a good insight into what's next.

28:17 - Cool, so it's my turn, right, given that it's showing?

28:22 - Oh, sorry, yes, go, Daniel.

28:25 - Cool, so people in the software community are blessed with many options for doing source control.

28:31 You know, you've got Git, SVN, Mercurial, and other historical ones that maybe aren't as well used, but optical engineers, mechanical engineers, electrical engineers, everybody else doesn't have it nearly as good as the software community.

28:44 So anytime I see an option for that, it definitely sticks out in my mind.

28:48 So I don't remember how I found this, but I came upon AllSpice fairly recently, which is Git for people who are doing circuits.

28:56 - This is cool.

28:57 - And so it has, it looks exactly like GitHub.

29:00 You've got version control, you've got all the things you expect to have.

29:03 It's compatible with some of the common electrical design programs.

29:08 But it really just gives you the ability to do all these sorts of things that you take for granted if you're in a sophisticated workflow like software, but that you wish dearly you had for any other discipline.

29:18 - So when you put something in a source control and you diff it, what do you get?

29:22 Are you diffing graphics?

29:26 Are you diffing some sort of definition file that defines the circuit?

29:31 - One of the first things they have is the diff tool, 'cause they know that that's kind of one of the big questions, right?

29:36 Is how do you compare the schematics?

29:38 So they have a way to do it visually and you can look at all the changes and it looks like they're highlighting each commit to whatever change was made on the schematic.

29:46 - Oh, that's cool.

29:48 - Yeah.

29:49 - Oh, that is very cool, yeah.

29:51 - Yeah, so one potential question would be, well, great, it's nice that you can do that on the internet, but I work at a commercial company that doesn't want to do that.

29:59 But they do have both: they have self-hosting, and they have a government cloud version if you're subject to things like ITAR or EAR.

30:09 So, in the same sense that GitHub has an enterprise option, AllSpice also has an enterprise option.

30:15 Like an on-prem, self-hosted version.

30:17 Yeah.

30:18 So you don't have to give away your secrets.

30:20 Yes.

30:21 I have no personal experience with it, but it's very promising and exciting to see somebody trying to come up with better ways to do engineering work besides just software.

30:31 You can even configure it to integrate with TortoiseGit, like the Windows Explorer right-click type of Git.

30:38 - Yeah.

30:39 So exciting stuff.

30:40 Hopefully somebody helps out the Mechies and the optical engineers as well one day.

30:44 - Yeah, I mean, there's always large file support, but the diff is terrible, right?

30:51 So usually.

30:53 - Yeah, you're looking at binary files or stuff that's, yeah.

30:57 Humans are so good at processing images that if you have a visual comparison, that's orders of magnitude better than trying to look at lines of text, even if it is a plain text file that you can read through.

31:08 Yeah, definitely.

31:09 Yeah, here's your XML with its namespaces.

31:15 Good luck.

31:16 What?

31:17 What does this mean?

31:18 Yeah.

31:19 Well, cool.

31:20 I like it.

31:21 All right.

31:22 I do too.

31:23 Brian, you got any extras for us?

31:25 I yeah, actually.

31:26 So I've been busy.

31:27 I've got like this backstream of Test & Code episodes.

31:31 So the most recent one that I put out was with Will McGugan.

31:36 We're talking about Rich and Textual and Textualize.

31:39 It's really fun, really fun one.

31:41 But actually, so since we talked last Tuesday, I've got four extra episodes that have come out.

31:45 So we've got teaching, including testing with the web front end stuff, which was kind of an interesting story about like basically if you're college level students, but they're new to coding.

31:59 When do you include testing?

32:01 And Carl says right away, why not?

32:05 So also a developer and productivity episode.

32:09 I think that's, oh yeah.

32:11 And a Python, Django, Rich, and testing article, or episode.

32:16 So lots of goodness over on Test & Code.

32:19 - They have a Django rich package apparently.

32:23 - Yeah, that was just for other, like the CLI, the Django CLI stuff, including Rich with that, which is great, but they've incorporated that into the test runner so that the Django test runner can do rich tracebacks, which is pretty cool. - Perfect.

32:43 - So.

32:44 - Daniel, you got anything else you wanna give a quick shout out?

32:46 Sorry. - Sure.

32:47 So Adafruit's a well-known company for doing maker electronics and, oh yeah, I don't have the links up, sorry.

32:56 But you know, Adafruit's well known and they do a good job of focusing on the first five minute experience of getting you up to speed with something on electronics.

33:04 But there are other companies that do the same thing as well.

33:06 So I was gonna shout out SparkFun, Seeed Studio, and then other companies like OpenMV, who has a focus on machine vision.

33:13 And they're geared less toward the people at the entry level, so maybe if you're a little more comfortable with certain things, or a little more comfortable, you know, exploring those on your own, they can be good options. - Trying to build weird things, right?

33:24 More specialized maybe for people or trying to actually go to.

33:27 - Or if you go to, yeah, if you go to Adafruit and what you want is out of stock, you can check some other places too.

33:31 - Which unfortunately happens a lot these days.

33:33 It's, those things come and go, a lot of demand.

33:37 Awesome, all right, I do have some.

33:39 - Cool. - Yeah, that's right.

33:40 I do have some extra ones, but I kind of got a lot, so.

33:44 All right, let's see.

33:45 I'll go last.

33:46 All right, the first one is, I always love a good documentary on tech stuff, and sometimes these are super cheesy, but there's a documentary called "Power On: The Story of Xbox," which is a four hour video, which you can watch on YouTube, which I'll link directly to the YouTube video.

34:00 And it's really good.

34:01 It's really interesting.

34:03 Whether you love or hate the Xbox, honestly don't care that much one way or the other, but it's just an interesting sort of view of like the last 20 years of technology from the sort of the gaming side.

34:13 So if people are looking for something to watch and they want to spend four hours doing it or spread it out, you know, they can check this out.

34:20 All right, speaking of videos, not that one.

34:23 This one I took, so recently I released my Git course on sort of a pragmatic introduction to Git, and I decided I wanted to share one part of it with a broader world, so I released a video called the Four Reasons to Branch with Git, and I put that on YouTube, people can check that out, so it's like an hour-long video I posted this week.

34:43 And then this one comes to us from Jason Percor, saying how cool is it to see Python showing up like right on the front page of various places?

34:51 So there's this place called EasyPost, Easypost.com which lets you do labels and track your labels and stuff.

34:58 But if you just scroll down just a little bit, it says, you know what, why don't you just either buy labels or you can just use this Python API right here.

35:07 And it doesn't even sort of say, if you're a developer, click to reveal the secret, you know, it's just like, no, here's your Python code for our company.

35:16 So just kind of a cool little thing for that.

35:21 Let's see, Brian Skinn pointed out that the Stack Overflow 2022 Developer Survey is open for accepting comments, which is cool.

35:33 And I'm gonna put this up here on the screen first.

35:35 So Brian, do you see this?

35:36 It has all of this stuff I can't, if I click it, it'll just go away.

35:39 And like, this is an image, right, right here?

35:40 - Yeah. - Yeah.

35:41 What if I wanted that as text?

35:44 What if I wanted to somehow grab that?

35:47 So I've got this app, which I'm gonna tell you all about next,

35:50 called TextSniper.

35:52 Watch this.

35:53 So you can't quite see if I just drag over that, just like you would a screenshot.

35:57 And then let's see.

35:59 I need somewhere I can paste this anywhere there.

36:03 So what I got out of that is check this out.

36:05 Oh, wow.

36:06 Isn't that cool.

36:07 Yeah.

36:07 I just Control-C'd from, like, the picture on my screen, and it can do PDFs.

36:15 It can do screenshots.

36:17 Like, so for example, if you're watching a video presentation, you see a slide, you're like, I want to capture those bullet points or that shoot, grab it.

36:24 You got it.

36:25 So that is called TextSniper, which is super neat.

36:29 All it does.

36:30 It's just like the select region for screenshot.

36:34 That's great.

36:34 And then boom, what it doesn't matter what's under it.

36:37 It's just, if it's text, it OCRs it.

36:38 And then you got it.

36:39 Yeah.

36:40 So often like a small restaurant will put their address or their phone number, like in an image like come on, I gotta click on that sucker.

36:51 >> I want to just drop this, paste it into maps.

36:53 >> Yeah.

36:54 >> That's right. I think for doing research, if you're watching videos, you want to get something out of something that's on the screen like a slide or whatever.

37:01 This is pretty awesome.

37:03 It cost something like $11 once.

37:06 If it's useful to you, it'll be worth it.

37:08 If not, then it's not.

37:09 >> Then not.

37:10 >> It's got to be worth $11 or zero to you.

37:13 >> That was like a good OCR app.

37:15 - Yeah, yeah. - It's always novel.

37:16 - And it's just the ease of use, right?

37:19 Not take a screenshot and go find your app.

37:21 It's just like, slap, slap, drop.

37:22 Okay, so last one of my extras: Sam Lau and Philip Guo sent this over. I had them on to talk about Pandas Tutor, and they were talking about the challenges of running Pandas Tutor on the server side and letting people run code, but it's pretty limited because you don't want them to hack the various things.

37:46 You want to keep it pretty limited so they don't take advantage of, like, your compute resources.

37:51 So now they just posted a message saying, Pandas Tutor, if you go over here and say, visualize your code, it'll go and do all these cool visualizations.

38:00 I know we've spoken about this before, but notice what it says here.

38:03 I can scroll a little.

38:04 It says, initializing Pyodide, WASM download.

38:09 Pandas running, boom.

38:11 And so all of this is running in client side Python, which is just--

38:16 - Wow. - Yeah.

38:17 So we talked about that being one of the topics of the Language Summit, the WASM support.

38:22 And here you have it in action.
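
As a small illustration of "client side Python," here is a hedged sketch of how a script might tell whether it is running in a WebAssembly build such as Pyodide or in a native interpreter. It relies on `sys.platform` reporting `"emscripten"` in Emscripten-based builds; the function names `running_in_browser_wasm` and `describe_runtime` are made up for this example and are not part of Pandas Tutor.

```python
import sys


def running_in_browser_wasm():
    """Return True when this interpreter is an Emscripten (WebAssembly) build."""
    return sys.platform == "emscripten"


def describe_runtime():
    # Hypothetical helper: pick a label based on where the code is executing.
    if running_in_browser_wasm():
        return "client-side (WebAssembly)"
    return "native interpreter (" + sys.platform + ")"


print(describe_runtime())
```

Run in a terminal, this prints the native label; loaded into a Pyodide session in the browser, the same code would take the WebAssembly branch.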

38:24 So I said on the show, like, hey, have you guys considered this?

38:26 Like, ah, maybe we should.

38:27 And then like, this turns out to be a great idea.

38:30 (both laughing)

38:31 - That's pretty cool. - Like the message is, run code on the server.

38:34 That's slower.

38:35 We just recommend you run it here.

38:37 - Hmm, nice.

38:38 - All right, that's pretty neat.

38:40 Well done, you guys.

38:41 And that's it for my extras as well.

38:44 - That's a lot more real than I thought.

38:45 I guess I thought Pyodide and WebAssembly were a little bit further off, but that's like, hey, there's an application right now doing that today.

38:51 - Yeah, yeah.

38:52 Brian, the antigravity PyScript thing you showed last week was so cool.

38:57 - Yeah, I didn't even know it was doing that before we showed it, but it's pretty neat.

39:02 - Yeah, yeah.

39:03 A lot of the interactions are super, they're getting, starting to be real, yeah, we're getting there, Daniel, we're getting there.

39:09 (Daniel laughs)

39:10 All right, how about a joke to wrap it up?

39:12 - Definitely.

39:13 - So we've all been in, well, maybe we haven't all been, we can all imagine being in awkward situations, maybe on a weird date, so I don't go on dates, really being married for a long time, but imagine, imagine that you had, here's a graphic of a woman who's on a date, like maybe just woke up in the morning after the first day, first time together sort of thing, And the guy who's like sculpted, right?

39:41 He's like clearly like a super fake, probably a good looking guy, whatever.

39:46 But he's in the shower and she's like flipping through his phone and says, when she looks through your phone, but all she can find is fork a child and kill it.

39:54 Google search.

39:55 Kill child and fork parent.

39:57 Kill parent with fork.

39:59 Kill parent without killing child.

40:01 (laughing)

40:03 - Kill child without killing grandchild.

40:04 - And she got this face of like, oh, what's that, sorry?

40:08 Those are great.

40:09 Yeah, she's got this look like I thought it was going so well and he's a murderer.

40:13 I can't believe it.

40:14 No, he's just trying to figure out Linux.
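
For anyone wondering what those searches are actually about, here is a minimal sketch of "fork a child and kill it" on a POSIX system such as Linux, using Python's standard `os` and `signal` modules. The function name `fork_child_and_kill_it` is invented for this example, and it will not run on Windows, where `os.fork` is unavailable.

```python
import os
import signal
import time


def fork_child_and_kill_it():
    """Fork a child process, send it SIGTERM, and return the signal that ended it."""
    pid = os.fork()
    if pid == 0:
        # Child process: sleep until the parent terminates us.
        time.sleep(60)
        os._exit(0)
    # Parent process: terminate the child, then reap it so it does not
    # linger as a zombie.
    os.kill(pid, signal.SIGTERM)
    _, status = os.waitpid(pid, 0)
    return os.WTERMSIG(status) if os.WIFSIGNALED(status) else None


if __name__ == "__main__":
    print(fork_child_and_kill_it() == signal.SIGTERM)
```

Because `os.kill` signals a single process ID, a process that the child had itself forked would not receive the signal, which is the technical side of the "kill child without killing grandchild" search.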

40:17 Don't hold it against him. Kill child without killing grandchild.

40:22 It's so bad.

40:26 Can you do that?

40:27 Well, I don't know.

40:30 I haven't searched it, but I don't want to have to explain that search if I did search it in stealth.

40:35 - That's what incognito mode is for.

40:38 (laughing)

40:40 - This is totally benign, but if somebody sees it out of context, maybe they won't feel that way about it.

40:45 - It's like, there's what will get you on the FBI list, and then software engineers.

40:50 - Yeah, there's like a Venn diagram of that.

40:53 - Yeah, there's probably a small intersection there.

40:55 - It's probably pretty big, actually.

40:58 - Yeah, it's probably pretty big.

41:00 - Anyway, well, thanks everybody for a fun show again.

41:04 So yeah.

41:05 - You bet.

41:06 Thanks, Brian and Daniel.

41:07 It was great to have you here.

41:08 - Yeah, thanks.

41:09 - Thanks for coming.

41:10 Bye.
