Brought to you by Michael and Brian - take a Talk Python course or get Brian's pytest book

#302: The Blue Shirt Episode

Published Tue, Sep 20, 2022, recorded Tue, Sep 20, 2022
Watch this episode on YouTube
Watch the live stream replay

About the show

Sponsored by Microsoft for Startups Founders Hub.

Brian #1: Can Amazon’s CodeWhisperer write better Python than you?

  • Brian Tarbox
  • “Despite the clickbait-y title, whether CW’s code is better or worse than mine is at the margins and not really important. What is significant is that it has the potential to save me a ton of time and mental space to focus on improving, refactoring and testing. It’s making me a better programmer by taking on some of the undifferentiated heavy lifting.”
  • Some decent code generation, starting with Amazon API examples.
  • The generated dataclass method was neat, but really, the comment “prompt” probably took as much time to write as the code would have.
  • The generated test case is workable, but I would not consider that a good test.
    • Perhaps don’t lump together construction, attribute access, and tests for all methods in one test function.
    • That said, I’ve seen way worse test methods in my career. So, decent starting point.
  • Related and worth listening to: Changelog #506: Stable Diffusion breaks the internet w/ Simon Willison
    • Mostly an episode about AI generated art.
    • There is a bit of a tie in to AI code generation, the ethics around it, and making sure you walk up the value chain.
  • I’m planning on playing with GitHub Copilot.
    • I’ve been reluctant in the past, but Simon’s interview is compelling to combine experienced engineering skill with AI code generation to possibly improve productivity. Simon does warn against possible abuse by Junior devs and the “just believe the code” problem that we also see with “copy from StackOverflow” situations.
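
One way to read the note above about not lumping construction, attribute access, and method checks into a single test function, as a minimal sketch. The `InventoryItem` class, its field names, and the test names are hypothetical stand-ins, not the code from the article or CodeWhisperer's actual output:

```python
from dataclasses import dataclass


@dataclass
class InventoryItem:
    """Hypothetical stand-in for the dataclass discussed in the article."""

    name: str
    unit_price: float
    quantity: int = 0

    def total_cost(self) -> float:
        return self.unit_price * self.quantity

    def expensive(self) -> bool:
        return self.unit_price > 10


# One behavior per test function, so a failure points at exactly one thing.
def test_construction_sets_attributes():
    item = InventoryItem("widget", unit_price=10.0, quantity=5)
    assert (item.name, item.unit_price, item.quantity) == ("widget", 10.0, 5)


def test_total_cost_multiplies_price_by_quantity():
    assert InventoryItem("widget", 10.0, 5).total_cost() == 50.0


def test_item_at_exactly_ten_is_not_expensive():
    assert not InventoryItem("widget", 10.0, 5).expensive()


def test_item_above_ten_is_expensive():
    assert InventoryItem("gadget", 10.01, 1).expensive()
```

Run with pytest, each function shows up as its own pass or fail, and the boundary case at exactly $10 gets a name of its own.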

Michael #2: Apache Superset

  • Apache Superset is a modern data exploration and visualization platform
  • An intuitive interface for visualizing datasets and crafting interactive dashboards
  • A wide array of beautiful visualizations to showcase your data
  • Code-free visualization builder to extract and present datasets
  • A world-class SQL IDE for preparing data for visualization, including a rich metadata browser
  • A lightweight semantic layer which empowers data analysts to quickly define custom dimensions and metrics
  • Out-of-the-box support for most SQL-speaking databases
  • Seamless, in-memory asynchronous caching and queries
  • An extensible security model that allows configuration of very intricate rules on who can access which product features and datasets.
  • Integration with major authentication backends (database, OpenID, LDAP, OAuth, REMOTE_USER, etc)
  • The ability to add custom visualization plugins
  • An API for programmatic customization

Brian #3: Recipes from Python SQLite docs

  • Redowan Delowar
  • Expanding on sqlite3 Python docs with more examples, including
    • Executing individual and batch statements
    • Applying user-defined callbacks: scalar and aggregate
      • scalar example shows using a sha256 function to hash passwords as they’re inserted into the database
    • Enabling tracebacks when callbacks raise an error
    • Transforming types between SQLite and Python
    • Implementing authorization control
    • … much more …
  • This is great for not only learning SQLite, but also, since these kinds of topics exist in other databases, learning about databases.
  • AND a great example of learning a subsystem by creating little code snippets to check your understanding of something.
    • One mod I would do in practice is to write these examples as pytest functions, because I can then run them individually while keeping a bunch in the same file. 🙂
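
A rough sketch of that "examples as pytest functions" idea, using only the standard-library `sqlite3` module. The table, the data, and the `Product` aggregate are made up for illustration and are not taken from the article:

```python
import sqlite3


def make_conn():
    """Build a tiny in-memory table so each snippet starts from a known state."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE scores (player TEXT, points INTEGER)")
    # Batch statement: one INSERT executed once per parameter tuple.
    conn.executemany(
        "INSERT INTO scores VALUES (?, ?)",
        [("ann", 3), ("bob", 5), ("ann", 4)],
    )
    return conn


def test_executemany_inserts_every_row():
    conn = make_conn()
    (count,) = conn.execute("SELECT COUNT(*) FROM scores").fetchone()
    assert count == 3


class Product:
    """Aggregate callback: SQLite calls step() per row, then finalize() once."""

    def __init__(self):
        self.value = 1

    def step(self, x):
        self.value *= x

    def finalize(self):
        return self.value


def test_user_defined_aggregate():
    conn = make_conn()
    conn.create_aggregate("product", 1, Product)
    (result,) = conn.execute("SELECT product(points) FROM scores").fetchone()
    assert result == 3 * 5 * 4
```

Each function can be run on its own from the editor or with `pytest -k`, while all the experiments stay in one file.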

Michael #4: -ffast-math and indirect changes

  • Brendan Dolan-Gavitt downloaded 4 TB of Python packages containing native x86-64 libraries to see how many of them use -ffast-math, potentially altering floating point behavior in any program unlucky enough to load them!
  • Python packages built with an appealing-sounding but dangerous compiler option, -ffast-math, could end up causing any program that uses them to compute incorrect numerical results.
  • When -ffast-math is enabled, the compiler will link in a constructor that sets the FTZ/DAZ flags whenever the library is loaded — even on shared libraries, which means that any application that loads that library will have its floating point behavior changed for the whole process.
  • A total of 2,514 packages eventually depend on a package that uses -ffast-math.
  • Because of the highly connected nature of the modern software supply chain, even though a mere 49 packages were actually built with -ffast-math, thousands of other packages, with a total of at least 9.7 million downloads over the past 30 days, are affected.
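
A small, hedged sketch of how one might check from pure Python whether something in the current process has switched on flush-to-zero. The helper name `subnormals_flushed` is made up here, and this is not the detection method used in the blog post; it only relies on the fact that halving the smallest normal double should give a nonzero subnormal under default IEEE 754 behavior:

```python
import sys


def subnormals_flushed() -> bool:
    """Return True if tiny results are being flushed to zero in this process."""
    smallest_normal = sys.float_info.min  # smallest positive *normal* double
    # With default floating point settings this division yields a subnormal
    # (nonzero) value. With FTZ/DAZ set it compares equal to 0.0 instead.
    return smallest_normal / 2 == 0.0


if subnormals_flushed():
    print("Subnormals are being flushed to zero in this process")
else:
    print("Subnormal arithmetic looks intact")
```

Calling it before and after a suspect import is one cheap way to see whether that import changed the process-wide floating point mode.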

Extras

Brian:

  • Thinking about changelogs
  • Focusing on helping teams with specific goals
    • Working on an experiment in consulting with some lead engineers before the training to alter the content of a pytest course so the examples better match what the team will need.
      • Sharing packages through an internal system, as that’s usually different than the PyPI process.
      • Altering the database and API example of the TalkPython pytest course content to match a team’s external resources and responsibility scope.
    • It takes extra time and thought, but in the end might increase engagement and excitement about testing and keeping up on Python’s evolving common practices.

Michael:

Joke:

Episode Transcript

00:00 Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds.

00:05 This is episode 302, recorded September 20th, 2022.

00:11 I'm Michael Kennedy.

00:12 And I'm Brian Okken.

00:14 Hey, Brian. How are you doing?

00:15 I'm great. It's a nice day.

00:17 Yeah, it is a lovely fall day here in the Pacific Northwest.

00:20 Dry as can be. I just had a very nice walk with my dog.

00:24 Nice.

00:24 It's going to be hard to go back to work after this podcast, looking out the window.

00:29 I give myself 50-50 chances of making it.

00:31 Yeah, I got to go back to the other screen.

00:34 That's right. I'm going to be looking that way.

00:37 Awesome. Well, before we kick off the show, I also want to say thank you once again to Microsoft for Startups.

00:43 They're sponsoring this episode again.

00:45 And huge supporters of the show. Tell you more about that later.

00:48 Brian, could you just whisper to me about your next project here?

00:52 CodeWhisperer.

00:54 So we've talked about, I think we've talked about GitHub Copilot before.

00:59 And I'm not sure if we talked about Amazon's CodeWhisperer yet.

01:02 I don't think so.

01:04 Okay. So CodeWhisperer is a similar kind of thing, I think.

01:07 I haven't tried it myself, actually.

01:09 But there's an article by Brian Tarbox that says, Can Amazon's CodeWhisperer write better Python than you?

01:17 I brought this up because I've been thinking about it a lot, about these AI copilot sort of things and stuff.

01:24 So Amazon's offering looks like it's almost, I don't know if it's a similar sort of model.

01:31 In this example that he's giving, he has a bunch of examples.

01:36 He's going through, you write a description.

01:39 He's writing a description.

01:40 I don't know if this is the only way.

01:42 But basically, describe the function you want.

01:44 Like, function to open an S3 file.

01:46 And it writes one for you.

01:48 And even titles it.

01:50 So you give it a code comment and it, like, pops out some code.

01:54 Now, this is kind of an interesting thing, especially around Amazon services,

01:59 because there's a lot of Amazon services.

02:01 And, you know, you do a lot of API lookups and stuff.

02:03 So some help directly around APIs.

02:07 Actually, I think that that area makes kind of some sense.

02:11 Although, if you need an AI to figure out the API, maybe the API is a little complicated.

02:16 Just saying.

02:17 Exactly.

02:19 But the discussion is an interesting one through here about, basically, about the code that it gets out.

02:26 And it's really not talking about the morals of it or anything.

02:30 It's just really talking about using it and how good it is.

02:34 The punchline at the end.

02:37 So the author admits that the title was intended to be clickbaity.

02:44 And, you know, which is cool.

02:48 Because it's the internet.

02:49 Yeah.

02:49 But despite that, he, in walking through it, he thinks that it's actually, it's just making him a little bit better because it's more efficient.

03:01 And I'd like to quote a little bit.

03:03 Despite the clickbait-y title, whether CodeWhisperer's code is better or worse than mine is at the margins and not really important. What is

03:14 significant is that it has the potential to save me a ton of time and mental space to focus on

03:19 improving, refactoring, and testing. It makes me a better programmer by taking on some of the

03:24 undifferentiated heavy lifting. And I kind of like that idea of it kind of takes away the blank

03:31 canvas situation of like, you know, it might show you how it might one way to do it. And you can look

03:38 at it and go, Oh, no, I wouldn't do it that way. And then you can change it. But you've you now you're

03:42 on your second draft already, instead of so it's letting the AI do the first draft. It's kind of a

03:48 neat idea. I was looking he did this data class one, for instance, this kind of blew me away. He's got an

03:54 inventory item. And he writes a description for a function that returns whether or

04:03 not an item costs more than $10. And, and it returns, it writes a function called expensive,

04:10 like he didn't say expensive in the title at all. But it's interesting. It said expensive,

04:16 and then it returns whether or not the unit price is greater than 10.

04:19 And it realized it was within a class. And so it used self dot unit price and not just some

04:25 unassociated function that returns greater than 10.
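
As a rough reconstruction of what is being described here, not the article's actual output: a comment acting as the prompt, followed by a method that uses `self.unit_price` because it sits inside the class. The field names other than `unit_price` are assumptions.

```python
from dataclasses import dataclass


@dataclass
class InventoryItem:
    name: str
    unit_price: float
    quantity_on_hand: int = 0  # assumed field name, for illustration only

    # Function that returns whether or not this item costs more than $10
    def expensive(self) -> bool:
        # A method on the class, so it reads the price from self.
        return self.unit_price > 10


cheap = InventoryItem("pencil", unit_price=1.5, quantity_on_hand=100)
pricey = InventoryItem("keyboard", unit_price=45.0, quantity_on_hand=3)
print(cheap.expensive(), pricey.expensive())
```

The prompt comment never uses the word "expensive", which is the part that stood out in the discussion.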

04:28 Yeah. So it is interesting. Yeah, yeah. Anyway, interesting discussion. And then also interesting

04:36 looking at the code, he tried it against test code, he said, I want a function to test the inventory

04:41 class. Well, one, I think it was probably maybe this was a prompting problem. You shouldn't have one

04:47 function to test an entire class. My, my, my druthers, but it did a decent job of at least

04:53 giving you a first start of like, one of the things to test is you need to test the expensive thing.

04:57 You need to test the total cost. It just did it all in one function though. So

05:03 I mean, I guess that's what he asked for, but coming up with the total cost, which is computed.

05:06 That's kind of interesting. Yeah. Yeah. That is interesting. Yeah.

05:10 And the base item is a unit price of $10 and there's five of them. And so in the test,

05:16 it asserted the total cost is 50. Yeah, definitely. Interesting. Interesting to definitely look at

05:21 and good. And it might help you think about other test cases around it. So, so I guess cool. I wanted

05:28 to point out while I'm thinking about it, one of the reasons why I brought this up is I just listened

05:32 to a Changelog episode with Simon Willison called Stable Diffusion Breaks the Internet. And this is

05:38 focused on AI driven artwork, which is definitely interesting and interesting conversation,

05:45 but in it they talk, since these are all programmers, they talk, talk about how this,

05:50 the same sort of argument applies around, around code generation of the morality of it. And,

05:56 and then, morals aside and legal stuff aside, it's happening. So how do you,

06:02 Simon brings up the term of basically just you need to be one level of abstraction above the AI system. So

06:12 just to make sure that you were still adding value and the original author of this article talked about

06:18 this as well of it's, it's not about really not thinking it's about freeing up some of your brain

06:24 space to do other things. So in interesting. So, yeah, it is interesting. I mean, there's certain

06:29 things that you probably don't just don't need to remember. You know, I'm thinking of, do I really need to

06:35 remember all the steps in the connection string schema for connecting to SQLAlchemy? Probably not. I could

06:42 just say connect to SQL, you know, connect SQLAlchemy to a Postgres database and boom, it gives me,

06:47 you know, create the metadata, the metadata base class, and then create an engine and create a connection

06:55 and you're binding the engine, all those steps, right? Like if you could just kick that kind of

07:00 stuff out, that's something you want for a project and you just never do. It's not like, boy, I'm sure

07:05 I'm not good at connecting to SQLAlchemy. I'm just not a good programmer, I guess, right? You look it up,

07:09 you put it in there and you go. And so if you didn't have to take the step of looking up, that's kind of

07:13 cool. Yeah. I also like that. I didn't think about this before. And I think GitHub actually intended you

07:18 to think about it like this with naming it Copilot. It's not intended to take over your work, but it's

07:24 like sitting down with somebody that kind of knows what they're doing and being in pair programming with

07:29 them. Yeah. You can't turn off your brain, but maybe you can ease up a little bit. So anyway.

07:34 Wait, before you close this, scroll down to this black and white code editor. Boy, look at that.

07:39 If you check out this article, there is a, I don't even know what to make of it. Because to me,

07:43 it looks like a super retro early macOS, like macOS one type of UI, but then the file is c colon

07:51 backslash CD. It's just a mix of like beautiful retro. Yeah. Well, he was talking about the first

07:58 recorded code completion appears in the Pascal editor called Alice in 1985. So yeah. And I guess that's it.

08:07 Well, that's a, that's a heck of an editor. Super cool. All right. On to the next one.

08:12 Yeah. Two things real quick. I just want to point out or sort of make a comment. It's not pointed out

08:17 this morning. I had to make a new API because one thing I've learned about writing courses that depend

08:23 on other people's APIs, these other people suck at keeping their APIs running. They either decide,

08:29 you know what, this is costing me $10,000 a month and I'm going to have to charge for it. Boo hoo. No,

08:34 just kidding. That's a reasonable reason to change, but it changes like with the open weather API

08:37 or like this one for this Twilio course I was using. So I spent the morning a little bit of

08:44 yesterday and this morning, just doing a complete from scratch FastAPI API. And what a ton of fun

08:50 it is to just work with FastAPI and get to build out all sorts of neat, neat little things. And so,

08:56 you know, I just want to shout out if you're, if you're building something with FastAPI or you're

09:00 building an API, you can definitely give FastAPI a look. There's a lot of neat things you can do

09:04 to put together. Like here's a whole little website. It even does CSS and images and sort of,

09:09 sort of chameleon templates. I mean, it's basically static, but anyway, fun stuff and continues to be

09:14 fun. And so which course is this for? It is for the Python-powered chat apps with Twilio

09:21 and SendGrid, which is actually a free course, but it sets up a chat bot that you order from like a

09:26 bakery type thing over WhatsApp. And the problem is if you go to the APIs that the WhatsApp thing was

09:33 using, they just 500 or 404 or one of those two things, neither of which is super useful for the

09:39 course. So I recreated it in FastAPI this morning, which is cool. Now it lives on the internet,

09:43 but that's not what I want to talk about as super as that is. I want to talk about Apache

09:48 superset. Okay. Have you heard of superset?

09:51 No. Well, the word I know.

09:53 Of course. But Apache superset is a modern data exploration and visualization platform. And I came

09:58 across that the other day and I'm like, what the heck is this? I haven't even heard of this. It has

10:02 almost 50,000 GitHub stars. Okay. That's insane. And it's put together by Max Beauchemin,

10:10 also the creator of Apache Airflow, which is pretty cool, right?

10:15 So this is, this turns out to be a really interesting program and it's written in Python

10:21 and TypeScript. It's like really front end heavy because it has a lot of visualizations and stuff,

10:25 right? But all the backend stuff, it's all the things that you would know. It's Flask, it's Redis,

10:30 Celery, many of the, you know, pandas and data science tools you would know, but it's not exactly

10:35 a tool for developers like Jupyter. So Jupyter would be a way that data scientists who know Python would sit

10:42 down and leverage their Python skills to check out data and explore things. This one is really almost

10:48 meant for like people who would say, I'm going to fire up Excel and see what's going on, or I'm going

10:53 to fire up some BI tool like Tableau. And I want to look at it a little bit and see what's going on.

10:58 And it's also open source and written in Python, which means it has APIs and extensions and plugins in

11:06 Python, which is pretty excellent. So it has a way to explore your data. Like Brian, look at this

11:10 picture. What do you think? It's, I don't know what it is, but it's pretty.

11:13 It's glorious, right? Like it's a fantastic way to visualize. You know, here's 25 contributors to a

11:19 stream over time. You can sort of see like the growth of their contributions or not. And so the way you

11:24 generate this is you just connect it to a database. It gives you the table. You say, make a chart out of

11:29 this database and you draggy, droppy, the pieces over and boom, there it goes. And it doesn't have to

11:35 just be the data in the database. It can be a computed field. So you could say, I want to graph

11:41 the sum of this join onto like the orders of each customer, or I want to see the max order for each

11:48 customer, you know, things like that. Right. So that's pretty cool. So you can explore data like

11:52 that. You can create these dashboards, these live dashboards to see what's the state of our

11:57 business today. And it even comes with a SQL IDE, all of this in the browser, very Jupyter-esque.
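
To make the "computed field" idea concrete, here is the kind of query one might type into a SQL IDE, run through the standard-library `sqlite3` module so it is self-contained. This is plain SQL, not Superset's API, and the `orders` table and its rows are invented for the example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("ann", 20.0), ("ann", 35.0), ("bob", 12.5)],
)

# Two computed fields per customer: the sum of orders and the largest order.
rows = conn.execute(
    """
    SELECT customer, SUM(amount) AS total, MAX(amount) AS largest
    FROM orders
    GROUP BY customer
    ORDER BY customer
    """
).fetchall()

for customer, total, largest in rows:
    print(customer, total, largest)
```

A charting tool would then plot `total` or `largest` per customer instead of the raw rows.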

12:03 Pretty cool.

12:04 This is pretty neat. Yeah.

12:06 Yeah. Very, very neat. And it connects to, I told you it was Python. It connects to all of its databases

12:13 using SQLAlchemy. And so any database that can be a data source for SQLAlchemy, you know,

12:18 obviously Microsoft SQL server, Postgres, MySQL, but you know, things you might not think of like

12:24 Vertica or Druid or Amazon Redshift or Google BigQuery, all of these different data sources,

12:30 Databricks are available as a data source because SQLAlchemy knows how to talk to it. And this just

12:36 leverages SQLAlchemy. Yeah. Hey, hold it there for a sec. One of the things I learned recently,

12:40 which I don't know why I never got this before, but look at the SQLite logo. Yes. It's got a quill in

12:46 it. Did you, did you know that before that it's a quill for SQLite? Oh, quill. I did not put that

12:53 together now. How funny. Now we know. Cool. So anyway, yeah, people can check this out. It's

13:00 kind of a little bit intense to run, but you can pip install it, but probably the better way to do it,

13:06 you want to just try it out is to install it locally with Docker. So for me, for example, I just

13:13 put in the GitHub repo and then went in there and said Docker compose, gave it the YAML file and said,

13:17 pull and then up and off it goes.

13:19 So this is not a service. This is just something you can download and you run then.

13:24 It's something you can download and run, but it has a lot of infrastructure bits clicking together.

13:28 Okay. And so, when I interviewed Max Beauchemin, he actually is now the CEO and founder of Preset,

13:37 which is Superset as a service. So if you want to, if you want to have someone else host it for you,

13:42 you can go check it out with them. Right. But it's also a thing you can just run yourself,

13:47 but look how popular it is. Almost 50,000 GitHub stars, 10,000 forks. And I just learned of it.

13:51 That's nuts.

13:52 Well, I mean, you know, go figure. People actually want to know what's in their data.

13:56 I know.

13:57 Weirdos.

13:58 Yeah. It's so weird. What I think is cool about it is it lets you connect into like your live

14:04 operational data, not just like, Oh, I downloaded a CSV and now I can ask questions. Right. You can just

14:08 like whatever the current data is, let's get that and build a dashboard around it.

14:12 Pretty awesome. Yeah. Yeah. All right. Well, superset, if people need an alternative to Excel

14:17 or BI or Tableau or whatever, check out superset. It's very, very Python friendly and looks pretty

14:22 nice.

14:23 You know what else is nice?

14:24 Tell me.

14:24 Microsoft for startups.

14:27 Ah, they are. They are very nice. So yes, it's time to tell everyone about our sponsor,

14:33 isn't it, Brian?

14:34 Yeah.

14:34 Let me tell you all about Microsoft for Startups. They created Microsoft for Startups Founders

14:38 Hub to help give early stage startups the support that they need to be successful. If you are dreaming

14:46 of, or in the stages of, an early stage startup, you know, you should go apply. And the link at the

14:52 bottom in the show notes is pythonbytes.fm/foundershub2022, all one word. Go over

15:00 there and apply. It's completely free to apply. You don't have to be third party verified. You don't have

15:04 to be VC funded. If they think your startup has merit, you're in the program program comes with

15:10 many thousands of dollars of cloud credits. You can, you get some to start. And as you make your

15:17 way through different stages of your life cycle, you get a bunch more, but what's maybe even more

15:22 important is access to their mentorship network. So there's a reason that Silicon Valley is the

15:29 heart of so many startups. And it's not just, you know, the nice weather, if anything,

15:34 I don't encourage people to go outside and not work on their projects, right? It's the network

15:38 and it's the connections. And if you live somewhere else, or if you're not in that space, it's very

15:44 hard to get connected with the right people to make the right steps, right? So this program will get you

15:51 set up there. So in addition to all the cloud credits and so on, you have access to this founders

15:56 network where you can book one-on-one meetings with hundreds of different mentors, many of whom are

16:01 founders themselves that are experts in areas such as idea validation, fundraising, management and

16:08 coaching, sales and marketing. That's the one that's the toughest, I think. If you can nail that,

16:13 you're golden. So make your idea a reality today with the critical support from Microsoft for Startups

16:18 Founders Hub. Check them out at pythonbytes.fm/foundershub2022. Thanks again to Microsoft for

16:25 supporting our show. Yeah. Thank you. Yeah. Indeed. All right, Brian. Now what you got?

16:30 Well, I want to share something that Jeremy Page from the chat says, I thought SQL, always thought the

16:37 SQLite logo was an homage to TCL and I've got the logo for TCL. So maybe, I don't know.

16:44 Perhaps. Interesting. So, but I wanted to talk about recipes from Python SQLite again, recipes from Python

16:53 SQLite docs. So this is kind of a, there's a, this is an article by, I wrote it down, I promise I did.

17:00 Redowan Delowar, cool name. So he was going through the sqlite3 docs in the Python docs.

17:09 And there's a, there's a lot of examples, but some of them don't have actual examples. It just talks

17:15 about the API. And so he decided to write out some of the examples as little code snippets. And I really

17:21 like this. If you're learning SQLite or even, you just want to learn not SQLite in particular, but

17:27 databases. These are concepts that apply to a lot of things. So he's, he's got, of course, whether or not

17:33 you can execute individual statements or batch statements. So he's got little examples for that

17:38 goes into this is interesting. I thought was user defined callbacks. I thought this was really cool.

17:45 For instance, a scalar function, he defined a, and I knew that like you could put user defined functions

17:51 in databases, but I haven't ever done that really. He has a, a hash function, SHA256, that creates a hash

17:59 for passwords. And then he shows how to use that when he passes in a username and password into the

18:05 database, how it turns it into a hash, hashes it before it stores it.

18:10 That is cool. I never knew you could do that. Here's a Python function passed over as part of a

18:16 passed over to SQLite. And then the SQL statements can call it. That's, that's real cool.
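
A minimal sketch of that scalar-callback idea with the standard-library `sqlite3` module. The table layout and the sample username and password are made up, and a bare SHA-256 is used only because that is what the discussion mentions:

```python
import hashlib
import sqlite3


def sha256(text: str) -> str:
    """Plain Python function that SQLite will call once per value."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


conn = sqlite3.connect(":memory:")
# Register the Python function under the SQL name "sha256", taking 1 argument.
conn.create_function("sha256", 1, sha256)

conn.execute("CREATE TABLE users (username TEXT, password_hash TEXT)")
# The second placeholder is hashed inside the SQL statement before storage.
conn.execute("INSERT INTO users VALUES (?, sha256(?))", ("admin", "hunter2"))

stored = conn.execute("SELECT password_hash FROM users").fetchone()[0]
print(stored)
```

The raw password is passed in as a parameter, and only the 64-character hex digest ends up in the table.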

18:21 Yeah. I mean, there's a special syntax. So that's good that there's these examples of like insert into

18:26 users values, and then this question mark and sha256 of question mark.

18:32 Also, that's fantastic that that's being shown because that's the parametrized,

18:37 the anti-Little-Bobby-Tables version.

18:40 Okay.

18:41 Which is the best practice, right? The alternative is something worse.
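
To illustrate why the question-mark form is the safer habit, here is a small sketch with an invented `students` table and a hostile input string. The "something worse" half uses `executescript` on a throwaway connection purely to show what string-built SQL can do:

```python
import sqlite3

hostile = "Robert'); DROP TABLE students;--"

# Parametrized: the value is bound by the driver and never parsed as SQL.
safe = sqlite3.connect(":memory:")
safe.execute("CREATE TABLE students (name TEXT)")
safe.execute("INSERT INTO students VALUES (?)", (hostile,))
rows = safe.execute("SELECT name FROM students").fetchall()

# String-built SQL: the input closes the statement and appends a second one.
victim = sqlite3.connect(":memory:")
victim.execute("CREATE TABLE students (name TEXT)")
victim.executescript(f"INSERT INTO students VALUES ('{hostile}')")
table_survived = (
    victim.execute(
        "SELECT COUNT(*) FROM sqlite_master WHERE name = 'students'"
    ).fetchone()[0]
    == 1
)

print(rows)
print("table survived:", table_survived)
```

In the first connection the odd-looking name is just stored as text; in the second, the table is gone.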

18:46 Yeah. And then, you know, aggregate functions, which kind of got lost here, but there's a whole

18:53 bunch of really cool examples of using, using SQLite and, and they're really tiny examples. And so the,

19:00 one of the other things I wanted to share the reasons I wanted to share this article is I think

19:04 this is a really great way to learn an API or learn a service is to write these little example

19:11 things in little code snippets and try it out. Try it out with a table that you're creating that only

19:17 has two or three elements in it so that you can, you can play with it and, and you can get your head

19:22 around what you think the answer should be and what it does. The only thing I think I'd probably add,

19:27 of course, is if you're going to do little code snippets, these all have to be in separate files,

19:33 right? Unless you just write test functions. So this is a great use for pytest. I use it all the time.

19:38 If I'm learning something, I just do these little code snippets, but I do them within a test function.

19:43 And then it can be, it's not really testing anything except my own knowledge, but I can run

19:47 them just by right clicking on the, or clicking on the little arrow that the editor has for each

19:52 little function. So just rerun the failed test until, until I understand it. Yeah. Oh yeah. Very cool.

19:59 Anyway. All right. How about something we don't understand? Okay.

20:03 Let me take you over, let me take you over to a weird world of cascading consequences. So there's

20:08 this guy who is an assistant professor at NYU Tandon, security and reverse engineering person named

20:16 Brendan Dolan-Gavitt. And there's this tweet here over to his blog post saying, a new blog post in which I

20:24 download four terabytes of Python packages containing native x86 libraries, you know,

20:31 something that's done some C++ thing like gevent or pandas, one of those, NumPy, that then bundles

20:38 it into a wheel. And apparently there's a bug in one of the C compilers that if you pass

20:45 -ffast-math, it will potentially alter the floating point behavior of your program. If you

20:51 compile it with that. All right. So we're in Python, we don't compile things that often. What do we care?

20:56 Well, what this does is it reconfigures how the process uses like some low level registers,

21:04 but some feature of the CPU on how it does floating point math. And because when the library is loaded,

21:10 it changes that feature while it changes it for the entire program, AKA your program. That doesn't

21:16 sound great. Does it? No. So let's, let's dive in. So the article is called Someone's Been Messing

21:22 With My Subnormals, subnormals, I suppose, being an aspect of floating point computations. So here he

21:28 is in Python 3.8 and he says from transformers import CodeGenForCausalLM. And that's

21:36 all he wanted. This is in an IPython terminal. And it starts pumping out all these warnings.

21:41 NumPy core getlimits, UserWarning: the value of the smallest subnormal for class

21:46 numpy.float32 type is zero. Over and over and over these start popping out. It's like,

21:52 hmm, well, warnings about floating point numbers sounds bad. What do you think?

21:56 Yeah.

21:57 So it turns out that something, not numpy, but something that is in this library was compiled with

22:05 this -ffast-math flag. When it got imported, it changed how numpy was working. Okay.

22:13 So it says, what were the problems? It says, well, it changes the floating point unit behavior that's on

22:19 the CPU, the actual FPU. I remember when, by the way, CPUs didn't come with that. Like I was trying to

22:25 decide with my first computer to get a 486 SX or DX. And I got the DX because it came with a floating

22:31 point unit on the CPU. Anyway, that thing gets messed with and says for some algorithms that depend on the

22:39 behavior and will fail to converge if it's set to treat this as different. So it uses the FTZ/DAZ flags

22:47 in the MXCSR register. That's part of the part that I don't understand. I don't, I don't work that low

22:53 level, but it turns out it's not ideal. So it said, well, what is actually going on here? And apparently

23:00 there's a way, there's a whole bunch of stuff, how you can search through Linux and whatnot to figure out

23:05 what processes are doing this weird stuff. And also apparently if you compile with -Ofast, it

23:12 also like cascades over to having the same behavior. So there's some exploration, like he wrote some C code

23:18 and then imported it into Python, and it seemed all fine. And then did the same thing with -Ofast and was able to get

23:25 all these warnings. I've never seen this warning. So I guess that's good, but it turns out the culprit was

23:31 gevent of all things, which is an event-based asyncio networking library. Yeah. But somehow something was

23:39 using it. And when it got imported, it freaked everything out. So then the question becomes, well,

23:43 if gevent can be causing these problems because somebody thought it was awesome to compile

23:49 the fast version, not the slow version. What else is out there? So Brendan went through and decided

23:55 to download four terabytes of wheels for all the things that might have some kind of x86 binary in

24:02 them. And then there's a ton of analysis of trying to figure out like, well, how do you actually look for

24:08 and find whether or not this program has this feature or not? It turns out to be pretty tricky.

24:13 So there's a bunch of stuff about going through to just check to see like what, how do you test it for

24:19 this many packages? Cause the test he was using before was super slow. So anyway, it's, it's not ideal.

24:26 I think there was something like 49 different packages. Let's see. I wrote it down up here. I'll get this

24:32 number right. Yeah. There's 49 packages on PyPI that were built with this flag.

24:38 However, thousands of packages use those libraries and hence were also subject to that behavior with

24:45 10 million downloads in the last 30 days. So that's pretty nuts, huh?
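
A minimal sketch of how one might check for the behavior described above from inside Python, assuming standard IEEE 754 doubles. Halving the smallest normal float should give a nonzero subnormal, so a result of exactly zero suggests flush-to-zero is active in the current process. The function name `subnormals_flushed_to_zero` is made up for illustration and is not taken from the blog post.

```python
import sys


def subnormals_flushed_to_zero() -> bool:
    """Return True if halving the smallest normal double yields exactly 0.0."""
    smallest_normal = sys.float_info.min  # about 2.2e-308 for IEEE 754 doubles
    # With gradual underflow this is a nonzero subnormal; with FTZ set it is 0.0.
    return smallest_normal / 2 == 0.0


if __name__ == "__main__":
    print("subnormals flushed to zero:", subnormals_flushed_to_zero())
```

Calling it once before and once after importing a suspect package would show whether that import changed the floating point mode.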

24:49 Well, I mean, you're kind of scaring me. So how do I know if I need to care? I guess,

24:55 you know, I, are you doing iterative floating point math that goes down to like very small things?

25:01 Probably not. I don't think I need to care. I'm doing like, I just need to

25:06 know what 33% of, you know, 69 is. It should be fine. Right. But if you're doing, well,

25:12 you got to test, got to test your code. And I guess we have to test our math as well. I just sort of

25:17 trust that a lot of that works. Yeah. I suppose you would see those warnings, right? The ones about the

25:23 floating point subnormal coming in. Okay. Yeah. So there's a great long list here of packages.

25:31 Let's see. I'll just read some out. People might know. So for example, gevent, geventhttpclient, Flask-SocketIO, dagster, which is used in data science a lot for

25:43 like data engineering, websocket, gevent-websocket, locust for testing, interpret, pykafka,

25:51 and on and on, locust-plugins, parallel-ssh, right? So it doesn't matter if you're using that library

25:56 for the math, just if it gets imported, it changes all the math of the program.
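
To get a rough idea of whether any of the packages read out above are installed in the current environment, something like the following could work. The `NAMED_ON_THE_SHOW` set is an assumption transcribed from the spoken list; it is not the full list from the blog post, and the spellings may not match PyPI exactly. Being installed also says nothing about whether a given version was actually built with `-Ofast`.

```python
import re
from importlib import metadata

# Assumption: names as heard on the show, not the complete list from the blog post.
NAMED_ON_THE_SHOW = {
    "gevent", "geventhttpclient", "flask-socketio", "dagster",
    "gevent-websocket", "locust", "pykafka", "locust-plugins", "parallel-ssh",
}


def normalize(name: str) -> str:
    """Normalize a distribution name the way PyPI does (PEP 503)."""
    return re.sub(r"[-_.]+", "-", name).lower()


def find_named_packages(installed_names):
    """Return the sorted installed names that also appear in the set above."""
    wanted = {normalize(n) for n in NAMED_ON_THE_SHOW}
    return sorted({normalize(n) for n in installed_names if n} & wanted)


if __name__ == "__main__":
    # Distributions with broken metadata may report no name; those are skipped.
    installed = (dist.metadata["Name"] for dist in metadata.distributions())
    print(find_named_packages(installed))
```

This only inspects installed distributions; because the effect is triggered at import time, a package that is installed but never imported would not change the math.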

26:01 So anyway, there, there it is. People can check it out. The comments are pretty glowing about this

26:07 research. Matthew Adams, for example, says crazy, awesome work, bro. You should be knighted for this.

26:13 In our chat, Alvaro says, run your tests with -W error, which you should be

26:20 anyway. So cool. So warnings treated as errors basically. Yeah. Yeah. Or set that particular

26:26 one to be an error. All right. Well, I guess that's it for our four items that we're

26:31 covering today. Am I right? Yeah. I was just, I was giggling during part of that. 'Cause

26:36 the subnormal is just cracking me up. Like, why does Brian talk like that? I don't understand

26:42 most of his words. Oh, don't worry about him. He's subnormal. I don't know. I also, I also like the

26:50 title of the overall blog, Push the Red Button. Push the Red Button for research, malware

26:56 reverse engineering, pen testing on the blog. Yeah. Nice. Nice. All right. Well, how about some extras?
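
Alvaro's suggestion from the chat, `-W error`, is a standard Python interpreter option (for example `python -W error -m pytest`). The same idea can be sketched in code with the standard `warnings` module. The `noisy_import` function below is a hypothetical stand-in for an import that emits a warning, and using `UserWarning` as the category is an assumption for illustration.

```python
import warnings


def noisy_import() -> str:
    """Hypothetical stand-in for code that emits a warning when it runs."""
    warnings.warn("smallest subnormal is zero", UserWarning)
    return "imported"


def run_strict(func):
    """Call func with UserWarning promoted to an exception; other warnings unchanged."""
    with warnings.catch_warnings():
        warnings.simplefilter("error", category=UserWarning)
        return func()


if __name__ == "__main__":
    try:
        run_strict(noisy_import)
    except UserWarning as exc:
        print("warning raised as error:", exc)
```

Narrowing the filter to one category is the "set that particular one" variant, as opposed to failing on every warning.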

27:01 Yeah. I don't have anything I want to show, but, but I was just going to say a couple of

27:08 things I've been up to. I've been thinking about changelogs a lot, and for Test & Code,

27:13 instead of doing like one episode on changelogs, I thought I would talk to several people and do an

27:19 NPR style combined. So it might be, it might end up being a series of episodes that I'll release

27:25 together or one long episode. I'm not sure yet, but basically I'm thinking about changelogs

27:30 a lot. The other thing I've been doing is thinking about, so we had that pytest

27:36 course out, right? Last week. We did just awesome on Talk Python Training. And

27:43 it's cool. Anyway, Talk Python Training. I always get to it by just

27:48 remembering that I switched that and just say training.talkpython.fm and you can get there.

27:53 But I've had some requests to take some of the content and change it for individual

27:59 teams. So this is an interesting thing to me to think about, 'cause like in

28:05 this course we do a database and a command line interface, but we're mostly testing through the API.

28:09 So API with the database application. So we're doing the layered things,

28:15 but some people are like, well, I don't use the database. So maybe we could swap that out with

28:19 an example that uses one of the resources we have. And more of our example, we don't do the API. We do

28:26 these little, we're testing something else. So like, okay, we can cover the concept. So it's a neat idea

28:31 to try to focus that towards people. So I guess if you're interested in doing that,

28:35 check out pythontest.com and, under training, check me out. So yeah. Awesome. Yeah. There's a

28:42 lot of ideas in that course that can be applied to different industries, different ways. Yeah. Yeah.

28:46 Different ways for sure. Awesome. Yeah. So the pytest course is going super strong. People really

28:51 love it. Great work on that, Brian. I have another course to announce 'cause it's been a week.

28:55 It's been a week. It's been a week. Python Data Visualization. So this is a course by Chris

29:02 Moffitt over at Talk Python Training. And the idea is there's all these different choices. I mean,

29:07 we just talked about Superset today, and throw that in as another thing in the pile of general

29:12 visualization tools, right? So you might do matplotlib, or maybe you want to use something new

29:17 like Altair. So this course goes through and shows you what it's like to do visualizations

29:23 in these different frameworks, like matplotlib, Seaborn, even pandas and Plotly and Streamlit.

29:29 And then you can build out these different scenarios and say, well, in this case, it might make more

29:34 sense to use matplotlib, or I might choose Altair and it'll help you choose a visualization

29:38 framework, but also it'll show you how to use all of them. So it's a nice broad exposure to all these

29:43 different frameworks. So people can check that out: talkpython.fm, click on courses. Ooh, this,

29:48 this is definitely useful. I got a project that I need this for. So yeah, this is going to be a good

29:52 one. It is a good one. I've already seen it. I've seen it several times actually, but it's good.

29:57 Let me see. Do I have any more extras I want to give a shout out to? No, just those two things.

30:02 And then I have, I have two jokes for you this week because one is not enough.

30:05 No. Yeah.

30:06 The first one here has to do with people who maybe learned a different language, maybe are

30:13 hating a little on Python. So here's somebody says, me laughing at all the Python hate on this

30:19 subreddit as I study C#. Silly language. Come on. We all know C# is better. And then

30:26 that's like a smiling, laughing person. And then a more serious, somewhat concerned: starting a new

30:31 job and realizing on the job board, 95% of them are asking for Python. That's very fun.

30:37 Well, now I want to go over to, like, the C# subreddit and see if I can find some

30:44 anti Python jokes.

30:45 I know.

30:45 Wouldn't that be good?

30:46 All right. Well, that one's pretty good. And then were you affected by the recent, we have,

30:52 for people who are not in our area in Pacific Northwest, there was a massive windstorm, like

30:57 30, 40 mile an hour wind, 25% humidity, a hundred degrees. It was like, if somebody threw a cigarette

31:04 out the window, the entire Pacific Northwest would just instantly catch fire. It was like,

31:08 it was insanely bad. And so we had our power turned off in the West Hills here because the

31:14 trees were so likely to fall over and cause a fire from knocking over. So they just cut the power for

31:20 like a little bit. They also did that in California. There's like a big, it was a bit of an irony.

31:25 Like one day they said, we're going to only allow the sale of electric cars after 2030, 2035 or something,

31:32 whatever the date is. I mean, I'm, I'm going to support that. I'm a fan of electric cars and all,

31:36 but like the next week they said, Oh, we're going to turn off your power. Cause actually I think the

31:41 electric cars might help balance it out. But anyway, bit of an irony. So this next joke has to do with

31:45 that. So I got ahold of this from Kylie codes and she highlighted this tweet that says the governor

31:52 has declared that for California, the governor has declared a state of emergency and asks all

31:56 Californians not to run npm install between 4 PM and 9 PM today in an effort to save energy and fight

32:04 this wildfire danger. Oh, that's awesome. Isn't that good? Yeah. Yeah. I love it. So that's,

32:11 that's the two jokes I got for you. Yeah, nothing too deep. Well, then also, you missed one.

32:17 There was a, like the, the, the build on of that. The build. All right. Do tell us about it.

32:22 Okay. Governor declares a state of emergency and asks all Californians to not run

32:25 wasm-pack build between 4 PM and 9 PM. Exactly. Nice. Cool. And John Sheehan says,

32:33 it's funny because it's true. Didn't you just talk, the other day, about Ruff and having

32:43 our Python tools faster, like the JavaScript community is being concerned about faster tools.

32:48 Maybe not everywhere. Maybe not a hundred percent. Yeah. Awesome. All right. All right. Well,

32:53 good episode as always. Thank you. Thank you. We'll talk to you next week. Yeah. See you next week.

32:59 Thanks everyone for listening. Bye. Bye.

