#302: The Blue Shirt Episode
About the show
Sponsored by Microsoft for Startups Founders Hub.
Brian #1: Can Amazon’s CodeWhisperer write better Python than you?
- Brian Tarbox
- “Despite the clickbait-y title, whether CW’s code is better or worse than mine is at the margins and not really important. What is significant is that it has the potential to save me a ton of time and mental space to focus on improving, refactoring and testing. It’s making me a better programmer by taking on some of the undifferentiated heavy lifting.”
- Some decent code generation, starting with Amazon API examples.
- The generated dataclass method was neat, but really, the comment “prompt” probably took as much time to write as the code would have.
- The generated test case is workable, but I would not consider that a good test.
- Perhaps don’t lump together construction, attribute access, and tests for all methods in one test function.
- That said, I’ve seen way worse test methods in my career. So, decent starting point.
- Related and worth listening to: Changelog #506: Stable Diffusion breaks the internet w/ Simon Willison
- Mostly an episode about AI generated art.
- There is a bit of a tie in to AI code generation, the ethics around it, and making sure you walk up the value chain.
- I’m planning on playing with GitHub Copilot.
- I’ve been reluctant in the past, but Simon’s interview is compelling to combine experienced engineering skill with AI code generation to possibly improve productivity. Simon does warn against possible abuse by Junior devs and the “just believe the code” problem that we also see with “copy from StackOverflow” situations.
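The point about not lumping construction, attribute access, and method checks into one test can be sketched as follows. This is a minimal illustration, not the code from the article: the `InventoryItem` class, its field names, and the test names are assumptions reconstructed from the description (an item with a unit price and quantity, a computed total cost, and an "expensive" check for prices above $10).

```python
from dataclasses import dataclass


@dataclass
class InventoryItem:
    """Hypothetical stand-in for the inventory item discussed in the article."""

    name: str
    unit_price: float
    quantity: int = 0

    def total_cost(self) -> float:
        # Computed value: price per unit times units on hand.
        return self.unit_price * self.quantity

    def expensive(self) -> bool:
        # "Costs more than $10", as described in the comment prompt.
        return self.unit_price > 10


# One behavior per test function, so a failure points at exactly one thing.
def test_construction():
    item = InventoryItem("widget", unit_price=10, quantity=5)
    assert (item.name, item.unit_price, item.quantity) == ("widget", 10, 5)


def test_total_cost():
    assert InventoryItem("widget", unit_price=10, quantity=5).total_cost() == 50


def test_item_at_ten_dollars_is_not_expensive():
    assert not InventoryItem("widget", unit_price=10).expensive()


def test_item_above_ten_dollars_is_expensive():
    assert InventoryItem("gadget", unit_price=10.01).expensive()
```

Splitting the boundary case ($10 exactly) from the clearly expensive case is the kind of follow-up a generated first draft tends to miss.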
Michael #2: Apache Superset
- Apache Superset is a modern data exploration and visualization platform
- An intuitive interface for visualizing datasets and crafting interactive dashboards
- A wide array of beautiful visualizations to showcase your data
- Code-free visualization builder to extract and present datasets
- A world-class SQL IDE for preparing data for visualization, including a rich metadata browser
- A lightweight semantic layer which empowers data analysts to quickly define custom dimensions and metrics
- Out-of-the-box support for most SQL-speaking databases
- Seamless, in-memory asynchronous caching and queries
- An extensible security model that allows configuration of very intricate rules on who can access which product features and datasets.
- Integration with major authentication backends (database, OpenID, LDAP, OAuth, REMOTE_USER, etc)
- The ability to add custom visualization plugins
- An API for programmatic customization
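To make "custom dimensions and metrics" concrete, here is a rough sketch of the kind of aggregate a dashboard metric boils down to, such as the sum or the max order per customer. It uses Python's built-in `sqlite3` module only so that it runs anywhere; it is not Superset's API, and the `orders` table, its columns, and the `customer_metrics` helper are made up for illustration.

```python
import sqlite3

# Hypothetical table standing in for live operational data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("ada", 20.0), ("ada", 35.0), ("grace", 12.5)],
)


def customer_metrics(connection):
    """Return (customer, total spent, largest order), one row per customer."""
    return connection.execute(
        "SELECT customer, SUM(amount), MAX(amount) "
        "FROM orders GROUP BY customer ORDER BY customer"
    ).fetchall()


print(customer_metrics(conn))
# [('ada', 55.0, 35.0), ('grace', 12.5, 12.5)]
```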
Brian #3: Recipes from Python SQLite docs
- Redowan Delowar
- Expanding on sqlite3 Python docs with more examples, including
- Executing individual and batch statements
- Applying user-defined callbacks: scalar and aggregate
- The scalar example shows using a sha256 function to hash passwords as they’re inserted into the database
- Enabling tracebacks when callbacks raise an error
- Transforming types between SQLite and Python
- Implementing authorization control
- … much more …
- This is great for not only learning SQLite, but also, since these kinds of topics exist in other databases, learning about databases.
- AND a great example of learning a subsystem by creating little code snippets to check your understanding of something.
- One mod I would do in practice is to write these examples as pytest functions, because I can then run them individually while keeping a bunch in the same file. 🙂
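A couple of these recipes, written as pytest-style functions in the way suggested above, might look like this. It is a sketch in the spirit of the article rather than its actual code: the `users` table, the `Longest` aggregate, and the `make_db` helper are assumptions made for illustration, while `create_function`, `create_aggregate`, and `executemany` are the standard `sqlite3` calls.

```python
import hashlib
import sqlite3


def sha256(text: str) -> str:
    """Scalar callback: hex digest of the given string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


class Longest:
    """Aggregate callback: length of the longest string in a column."""

    def __init__(self):
        self.longest = 0

    def step(self, value):
        self.longest = max(self.longest, len(value))

    def finalize(self):
        return self.longest


def make_db():
    # Fresh in-memory database per test, with both callbacks registered.
    conn = sqlite3.connect(":memory:")
    conn.create_function("sha256", 1, sha256)
    conn.create_aggregate("longest", 1, Longest)
    conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
    return conn


def test_scalar_callback_hashes_on_insert():
    conn = make_db()
    # Parameterized statement: values are bound, never spliced into the SQL.
    conn.execute("INSERT INTO users VALUES (?, sha256(?))", ("admin", "hunter2"))
    (stored,) = conn.execute("SELECT password FROM users").fetchone()
    assert stored == sha256("hunter2")
    assert stored != "hunter2"


def test_batch_insert_and_aggregate():
    conn = make_db()
    conn.executemany(
        "INSERT INTO users VALUES (?, sha256(?))",
        [("al", "a"), ("barbara", "b")],
    )
    (result,) = conn.execute("SELECT longest(username) FROM users").fetchone()
    assert result == 7
```

Keeping each experiment in its own test function means a single file can hold many snippets, and an editor or `pytest -k` can run just the one being explored.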
Michael #4: -ffast-math and indirect changes
- Brendan Dolan-Gavitt downloaded 4 TB of Python packages containing native x86-64 libraries to see how many of them use -ffast-math, potentially altering floating point behavior in any program unlucky enough to load them!
- Python packages built with an appealing-sounding but dangerous compiler option, -ffast-math, could end up causing any program that uses them to compute incorrect numerical results.
- When -ffast-math is enabled, the compiler will link in a constructor that sets the FTZ/DAZ flags whenever the library is loaded — even on shared libraries, which means that any application that loads that library will have its floating point behavior changed for the whole process.
- A total of 2,514 packages eventually depend on a package that uses -ffast-math.
- Because of the highly connected nature of the modern software supply chain, even though a mere 49 packages were actually built with -ffast-math, thousands of other packages, with a total of at least 9.7 million downloads over the past 30 days, are affected.
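One quick way to see whether the current process has been affected is to compute a subnormal and check whether it survives. The sketch below is an assumption-laden illustration, not the detection method from the blog post (which inspected the binaries themselves); the function name is made up, and it relies only on ordinary IEEE 754 double behavior.

```python
import sys


def subnormals_flushed_to_zero() -> bool:
    """Return True if this process appears to flush subnormal doubles to zero.

    sys.float_info.min is the smallest positive *normal* double, so halving
    it yields a nonzero subnormal under default IEEE 754 settings. If
    flush-to-zero has been switched on for the process, the result is 0.0.
    """
    return sys.float_info.min / 2 == 0.0


if __name__ == "__main__":
    print("subnormals flushed to zero:", subnormals_flushed_to_zero())
```

Calling this before and after importing a suspect package gives a rough before/after comparison. Running the test suite with warnings promoted to errors (`-W error`) is a complementary way to surface NumPy's subnormal warning.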
Extras
Brian:
- Thinking about changelogs
- Focusing on helping teams with specific goals
- Working on an experiment in consulting with some lead engineers before the training to alter the content of a pytest course so the examples better match what the team will need.
- Sharing packages through an internal system, as that’s usually different than the PyPI process.
- Altering the database and API example of the TalkPython pytest course content to match a team’s external resources and responsibility scope.
- It takes extra time and thought, but in the end might increase engagement and excitement about testing and keeping up on Python’s evolving common practices.
Michael:
- New course: Python Data Visualization
- pytest course going strong
Joke:
Episode Transcript
00:00 Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds.
00:05 This is episode 302, recorded September 20th, 2022.
00:11 I'm Michael Kennedy.
00:12 And I'm Brian Okken.
00:14 Hey, Brian. How are you doing?
00:15 I'm great. It's a nice day.
00:17 Yeah, it is a lovely fall day here in the Pacific Northwest.
00:20 Dry as can be. I just had a very nice walk with my dog.
00:24 Nice.
00:24 It's going to be hard to go back to work after this podcast, looking out the window.
00:29 I give myself 50-50 chances of making it.
00:31 Yeah, I got to go back to the other screen.
00:34 That's right. I'm going to be looking that way.
00:37 Awesome. Well, before we kick off the show, I also want to say thank you once again to Microsoft for Startups.
00:43 They're sponsoring this episode again.
00:45 And huge supporters of the show. Tell you more about that later.
00:48 Brian, could you just whisper to me about your next project here?
00:52 Code Whisperer.
00:54 So we've talked about, I think we've talked about GitHub Copilot before.
00:59 And I'm not sure if we talked about Amazon's Code Whisperer yet.
01:02 I don't think so.
01:04 Okay. So Code Whisperer is a similar kind of thing, I think.
01:07 I haven't tried it myself, actually.
01:09 But there's an article by Brian Tarbox that says, Can Amazon's Code Whisperer write better Python than you?
01:17 I brought this up because I've been thinking about it a lot, about these AI copilot sort of things and stuff.
01:24 So Amazon's offering looks like it's almost, I don't know if it's a similar sort of model.
01:31 In this example that he's giving, he has a bunch of examples.
01:36 He's going through, you write a description.
01:39 He's writing a description.
01:40 I don't know if this is the only way.
01:42 But basically, describe the function you want.
01:44 Like, function to open an S3 file.
01:46 And it writes one for you.
01:48 And even titles it.
01:50 So you give it a code comment and it, like, pops out some code.
01:54 Now, for this is kind of an interesting thing around, especially around Amazon services,
01:59 because there's a lot of Amazon services.
02:01 And, you know, you do a lot of API lookups and stuff.
02:03 So some help directly around APIs.
02:07 Actually, I think that that area makes kind of some sense.
02:11 Although, if you need an AI to figure out the API, maybe the API is a little complicated.
02:16 Just saying.
02:17 Exactly.
02:19 But the discussion is an interesting one through here about, basically, about the code that it gets out.
02:26 And it's really not talking about the morals of it or anything.
02:30 It's just really talking about using it and how good it is.
02:34 The punchline at the end.
02:37 So the author admits that the title was intended to be clickbaity.
02:44 And, you know, which is cool.
02:48 Because it's the internet.
02:49 Yeah.
02:49 But despite that, he, in walking through it, he thinks that it's actually, it's just making him a little bit better because it's more efficient.
03:01 And I'd like to quote a little bit.
03:03 Whether Code Whisperer's code is better or worse than mine is at the margins and not really important. What is
03:14 significant is that it has the potential to save me a ton of time and mental space to focus on
03:19 improving, refactoring, and testing. It makes me a better programmer by taking on some of the
03:24 undifferentiated heavy lifting. And I kind of like that idea of it kind of takes away the blank
03:31 canvas situation of like, you know, it might show you how it might one way to do it. And you can look
03:38 at it and go, Oh, no, I wouldn't do it that way. And then you can change it. But you've you now you're
03:42 on your second draft already, instead of so it's letting the AI do the first draft. It's kind of a
03:48 neat idea. I was looking he did this data class one, for instance, this kind of blew me away. He's got an
03:54 inventory item. And he writes a description for a function that returns whether or
04:03 not an item costs more than $10. And, and it returns, it writes a function called expensive,
04:10 like he didn't say expensive in the title at all. But it's interesting. It said expensive,
04:16 and then it returns whether or not the unit price is greater than 10.
04:19 And it realized it was within a class. And so it used self dot unit price and not just some
04:25 unassociated function that returns greater than 10.
04:28 Yeah. So it is interesting. Yeah, yeah. Anyway, interesting discussion. And then also interesting
04:36 looking at the code, he tried it against test code, he said, I want a function to test the inventory
04:41 class. Well, one, I think it was probably maybe this was a prompting problem. You shouldn't have one
04:47 function to test an entire class. My, my, my druthers, but it did a decent job of at least
04:53 giving you a first start of like, one of the things to test is you need to test the expensive thing.
04:57 You need to test the total cost. It just did it all in one function though. So
05:03 I mean, I guess that's what he asked for, but coming up with the total cost, which is computed.
05:06 That's kind of interesting. Yeah. Yeah. That is interesting. Yeah.
05:10 And the base item is a unit price of $10 and there's five of them. And so in the test,
05:16 it asserted the total cost is 50. Yeah, definitely. Interesting. Interesting to definitely look at
05:21 and good. And it might help you think about other test cases around it. So, so I guess cool. I wanted
05:28 to point out while I'm thinking about it, one of the reasons why I brought this up is I just listened
05:32 to a changelog episode with Simon Willison called stable diffusion breaks the internet. And this is
05:38 focused on AI driven artwork, which is definitely interesting and interesting conversation,
05:45 but in it they talk, since these are all programmers, they talk, talk about how this,
05:50 the same sort of argument applies around, around code generation of the morality of it. And,
05:56 and then, morals aside and legal stuff aside, it's happening. So how do you,
06:02 Simon brings up the term of basically just you need to be one level of abstraction above the AI system. So
06:12 just to make sure that you were still adding value and the original author of this article talked about
06:18 this as well of it's, it's not about really not thinking it's about freeing up some of your brain
06:24 space to do other things. So in interesting. So, yeah, it is interesting. I mean, there's certain
06:29 things that you probably don't just don't need to remember. You know, I'm thinking of, do I really need to
06:35 remember all the steps in the connection string schema for connecting to SQLAlchemy? Probably not. I could
06:42 just say connect to SQL, you know, connect SQLAlchemy to a Postgres database and boom, it gives me,
06:47 you know, create the metadata, the metadata base class, and then create an engine and create a connection
06:55 and you're binding the engine, all those steps, right? Like if you could just kick that kind of
07:00 stuff out, that's something you want for a project and you just never do. It's not like, boy, I'm sure
07:05 I'm not good at connecting to SQLAlchemy. I'm just not a good programmer, I guess, right? You look it up,
07:09 you put it in there and you go. And so if you didn't have to take the step of looking up, that's kind of
07:13 cool. Yeah. I also like that. I didn't think about this before. And I think GitHub actually intended you
07:18 to think about it like this with naming it Copilot. It's not intended to take over your work, but it's
07:24 like sitting down with somebody that kind of knows what they're doing and being in pair programming with
07:29 them. Yeah. You can't turn off your brain, but maybe you can ease up a little bit. So anyway.
07:34 Wait, before you close this, scroll down to this black and white code editor. Boy, look at that.
07:39 If you check out this article, there is a, I don't even know what to make of it. Because to me,
07:43 it looks like a super retro early macOS, like macOS one type of UI, but then the file is c colon
07:51 backslash CD. It's just a mix of like beautiful retro. Yeah. Well, he was talking about the first
07:58 recorded code completion appears in the Pascal editor called Alice in 1985. So yeah. And I guess that's it.
08:07 Well, that's a, that's a heck of an editor. Super cool. All right. On to the next one.
08:12 Yeah. Two things real quick. I just want to point out or sort of make a comment. It's not pointed out
08:17 this morning. I had to make a new API because one thing I've learned about writing courses that depend
08:23 on other people's APIs, these other people suck at keeping their APIs running. They either decide,
08:29 you know what, this is costing me $10,000 a month and I'm going to have to charge for it. Boo hoo. No,
08:34 just kidding. That's a reasonable reason to change, but it changes like with the open weather API
08:37 or like this one for this Twilio course I was using. So I spent the morning a little bit of
08:44 yesterday and this morning, just doing a complete from scratch FastAPI API. And what a ton of fun
08:50 it is to just work with FastAPI and get to build out all sorts of neat, neat little things. And so,
08:56 you know, I just want to shout out if you're, if you're building something with FastAPI or you're
09:00 building an API, you can definitely give FastAPI a look. There's a lot of neat things you can do
09:04 to put together. Like here's a whole little website. It even does CSS and images and sort of,
09:09 sort of chameleon templates. I mean, it's basically static, but anyway, fun stuff and continues to be
09:14 fun. And so which course is this for? It is for the Python powered chat apps with Twilio
09:21 and SendGrid, which is actually a free course, but it sets up a chat bot that you order from like a
09:26 bakery type thing over WhatsApp. And the problem is if you go to the APIs that the WhatsApp thing was
09:33 using, they just 500 or 404 or one of those two things, neither of which is super useful for the
09:39 course. So I recreated it in FastAPI this morning, which is cool. Now it lives on the internet,
09:43 but that's not what I want to talk about as super as that is. I want to talk about Apache
09:48 superset. Okay. Have you heard of superset?
09:51 No. Well, the word I know.
09:53 Of course. But Apache superset is a modern data exploration and visualization platform. And I came
09:58 across that the other day and I'm like, what the heck is this? I haven't even heard of this. It has
10:02 almost 50,000 GitHub stars. Okay. That's insane. And it was put together by Max Beauchemin,
10:10 also the creator of Apache Airflow, which is pretty cool, right?
10:15 So this is, this turns out to be a really interesting program and it's written in Python
10:21 and TypeScript. It's like really front end heavy because it has a lot of visualizations and stuff,
10:25 right? But all the backend stuff, it's all the things that you would know. It's Flask, it's Redis,
10:30 Celery, many of the, you know, pandas and data science tools you would know, but it's not exactly
10:35 a tool for developers like Jupyter. So Jupyter would be a way that data scientists who know Python would sit
10:42 down and leverage their Python skills to check out data and explore things. This one is really almost
10:48 meant for like people who would say, I'm going to fire up Excel and see what's going on, or I'm going
10:53 to fire up some BI tool like Tableau. And I want to look at it a little bit and see what's going on.
10:58 And it's also open source and written in Python, which means it has APIs and extensions and plugins in
11:06 Python, which is pretty excellent. So it has a way to explore your data. Like Brian, look at this
11:10 picture. What do you think? It's, I don't know what it is, but it's pretty.
11:13 It's glorious, right? Like it's a fantastic way to visualize. You know, here's 25 contributors to a
11:19 stream over time. You can sort of see like the growth of their contributions or not. And so the way you
11:24 generate this is you just connect it to a database. It gives you the table. You say, make a chart out of
11:29 this database and you draggy, droppy, the pieces over and boom, there it goes. And it doesn't have to
11:35 just be the data in the database. It can be a computed field. So you could say, I want to graph
11:41 the sum of this join onto like the orders of each customer, or I want to see the max order for each
11:48 customer, you know, things like that. Right. So that's pretty cool. So you can explore data like
11:52 that. You can create these dashboards, these live dashboards to see what's the state of our
11:57 business today. And it even comes with a SQL IDE, all of this in the browser, very Jupyter-esque.
12:03 Pretty cool.
12:04 This is pretty neat. Yeah.
12:06 Yeah. Very, very neat. And it connects to, I told you it was Python. It connects to all of its databases
12:13 using SQLAlchemy. And so any database that can be a data source for SQLAlchemy, you know,
12:18 obviously Microsoft SQL server, Postgres, MySQL, but you know, things you might not think of like
12:24 Vertica or Druid or Amazon Redshift or Google BigQuery, all of these different data sources,
12:30 Databricks are available as a data source because SQLAlchemy knows how to talk to it. And this just
12:36 leverages SQLAlchemy. Yeah. Hey, hold it there for a sec. One of the things I learned recently,
12:40 which I don't know why I never got this before, but look at the SQLite logo. Yes. It's got a quill in
12:46 it. Did you, did you know that before that it's a quill for SQLite? Oh, quill. I did not put that
12:53 together now. How funny. Now we know. Cool. So anyway, yeah, people can check this out. It's
13:00 kind of a little bit intense to run, but you can pip install it, but probably the better way to do it,
13:06 you want to just try it out is to install it locally with Docker. So for me, for example, I just
13:13 put in the GitHub repo and then went in there and said Docker compose, gave it the YAML file and said,
13:17 pull and then up and off it goes.
13:19 So this is not a service. This is just something you can download and you run then.
13:24 It's something you can download and run, but it has a lot of infrastructure bits clicking together.
13:28 Okay. And so, when I interviewed Max Beauchemin, he actually is now the CEO and founder of Preset,
13:37 which is superset as a service. So if you want to, if you want to have someone else host it for you,
13:42 you can go check it out with them. Right. But it's also a thing you can just run yourself,
13:47 but look how popular it is. Almost 50,000 GitHub stars, 10,000 forks. And I just learned of it.
13:51 That's nuts.
13:52 Well, I mean, you know, go figure. People actually want to know what's in their data.
13:56 I know.
13:57 Weirdos.
13:58 Yeah. It's so weird. What I think is cool about it is it lets you connect into like your live
14:04 operational data, not just like, Oh, I downloaded a CSV and now I can ask questions. Right. You can just
14:08 like whatever the current data is, let's get that and build a dashboard around it.
14:12 Pretty awesome. Yeah. Yeah. All right. Well, superset, if people need an alternative to Excel
14:17 or BI or Tableau or whatever, check out superset. It's very, very Python friendly and looks pretty
14:22 nice.
14:23 You know what else is nice?
14:24 Tell me.
14:24 Microsoft for startups.
14:27 Ah, they are. They are very nice. So yes, it's time to tell everyone about our sponsor,
14:33 isn't it, Brian?
14:34 Yeah.
14:34 Let me tell you all about Microsoft for Startups. They created Microsoft for Startups Founders
14:38 Hub to help give early stage startups the support that they need to be successful. If you are dreaming
14:46 of, or in the early stages of, a startup, you know, you should go apply. And the link at the
14:52 bottom in the show notes is pythonbytes.fm/foundershub2022, all one word. Go over
15:00 there and apply. It's completely free to apply. You don't have to be third party verified. You don't have
15:04 to be VC funded. If they think your startup has merit, you're in the program program comes with
15:10 many thousands of dollars of cloud credits. You can, you get some to start. And as you make your
15:17 way through different stages of your life cycle, you get a bunch more, but what's maybe even more
15:22 important is access to their mentorship network. So there's a reason that Silicon Valley is the
15:29 heart of so many startups. And it's not just, you know, the nice weather, if anything,
15:34 that would encourage people to go outside and not work on their projects, right? It's the network
15:38 and it's the connections. And if you live somewhere else, or if you're not in that space, it's very
15:44 hard to get connected with the right people to make the right steps, right? So this program will get you
15:51 set up there. So in addition to all the cloud credits and so on, you have access to this founders
15:56 network where you can book one-on-one meetings with hundreds of different mentors, many of whom are
16:01 founders themselves that are experts in areas such as idea validation, fundraising, management and
16:08 coaching, sales and marketing. That's the one that's the toughest, I think. If you can nail that,
16:13 you're golden. So make your idea a reality today with the critical support from Microsoft for Startups
16:18 Founders Hub. Check them out at pythonbytes.fm/foundershub2022. Thanks again to Microsoft for
16:25 supporting our show. Yeah. Thank you. Yeah. Indeed. All right, Brian. Now what you got?
16:30 Well, I want to share something that Jeremy Page from the chat says, I thought SQL, always thought the
16:37 SQLite logo was an homage to TCL and I've got the logo for TCL. So maybe, I don't know.
16:44 Perhaps. Interesting. So, but I wanted to talk about recipes from Python SQLite again, recipes from Python
16:53 SQLite docs. So this is kind of a, there's a, this is an article by, I wrote it down, I promise I did.
17:00 Redowan Delowar, cool name. So he was going through the sqlite3 docs in the Python docs.
17:09 And there's a, there's a lot of examples, but some of them don't have actual examples. It just talks
17:15 about the API. And so he decided to write out some of the examples as little code snippets. And I really
17:21 like this. If you're learning SQLite or even, you're just want to learn not SQLite in particular, but
17:27 databases. These are concepts that apply to a lot of things. So he's, he's got, of course, whether or not
17:33 you can execute individual statements or batch statements. So he's got little examples for that
17:38 goes into this is interesting. I thought was user defined callbacks. I thought this was really cool.
17:45 For instance, a scalar function, he defined a, and I knew that like you could put user defined functions
17:51 in databases, but I haven't ever done that really. He has a, a hash function, SHA256, that creates a hash
17:59 for passwords. And then he shows how to use that when he passes in a username and password into the
18:05 database, how it turns it into a hash, hashes it before it stores it.
18:10 That is cool. I never knew you could do that. Here's a Python function passed over as part of a
18:16 passed over to SQLite. And then the SQL statements can call it. That's, that's real cool.
18:21 Yeah. I mean, there's a special syntax. So that's good that there's these examples of like insert into
18:26 user values, users values, and then this question mark and SHA256 question mark.
18:32 Also, that's fantastic that that's being shown because that's the parameterized,
18:37 the anti-Little-Bobby-Tables version.
18:40 Okay.
18:41 Which is the best practice, right? The alternative is something worse.
18:46 Yeah. And then, you know, aggregate functions, which kind of got lost here, but there's a whole
18:53 bunch of really cool examples of using, using SQLite and, and they're really tiny examples. And so the,
19:00 one of the other things I wanted to share the reasons I wanted to share this article is I think
19:04 this is a really great way to learn an API or learn a service is to write these little example
19:11 things in little code snippets and try it out. Try it out with a table that you're creating that only
19:17 has two or three elements in it so that you can, you can play with it and, and you can get your head
19:22 around what you think the answer should be and what it does. The only thing I think I'd probably add,
19:27 of course, is if you're going to do little code snippets, these all have to be in separate files,
19:33 right? Unless you just write test functions. So this is a great use for pytest. I use it all the time.
19:38 If I'm learning something, I just do these little code snippets, but I do them within a test function.
19:43 And then it can be, it's not really testing anything except my own knowledge, but I can run
19:47 them just by right clicking on the, or clicking on the little arrow that the editor has for each
19:52 little function. So just rerun the failed test until, until I understand it. Yeah. Oh yeah. Very cool.
19:59 Anyway. All right. How about something we don't understand? Okay.
20:03 Let me take you over, let me take you over to a weird world of cascading consequences. So there's
20:08 this guy who is an assistant professor at NYU Tandon, a security and reverse engineering person named
20:16 Brendan Dolan-Gavitt. And there's this tweet here over to his blog post saying, a new blog post in which
20:24 I download four terabytes of Python packages containing native x86 libraries, you know,
20:31 something that's done some C++ thing like gevent or pandas, one of those, NumPy, that then bundles
20:38 it into a wheel. And apparently there's a bug in one of the C compilers that if you pass
20:45 -ffast-math, it will potentially alter the floating point behavior of your program. If you
20:51 compile it with that. All right. So we're in Python, we don't compile things that often. What do we care?
20:56 Well, what this does is it reconfigures how the process uses like some low level registers,
21:04 but some feature of the CPU on how it does floating point math. And because when the library is loaded,
21:10 it changes that feature while it changes it for the entire program, AKA your program. That doesn't
21:16 sound great. Does it? No. So let's, let's dive in. So the article is called someone's been messing
21:22 with my sub normals, sub normals, I suppose being an aspect of floating point computations. So here he
21:28 is in Python 3.8 and he says from transformers import CodeGenForCausalLM. And that's
21:36 all he wanted. This is in an IPython terminal. And it starts pumping out all these warnings.
21:41 numpy/core/getlimits.py: UserWarning: The value of the smallest subnormal for <class
21:46 'numpy.float32'> type is zero. Over and over and over these start popping out. It's like,
21:52 hmm, well, warnings about floating point numbers sounds bad. What do you think?
21:56 Yeah.
21:57 So it turns out that something, not numpy, but something that is in this library was compiled with
22:05 this -ffast-math flag. When it got imported, it changed how NumPy was working. Okay.
22:13 So it says, what were the problems? It says, well, it changes the floating point unit behavior that's on
22:19 the CPU, the actual FPU. I remember when, by the way, CPUs didn't come with that. Like I was trying to
22:25 decide with my first computer to get a 486 SX or DX. And I got the DX because it came with a floating
22:31 point unit on the CPU. Anyway, that thing gets messed with, and he says some algorithms that depend on the
22:39 behavior will fail to converge if it's set to treat this differently. So it uses the FTZ/DAZ flags
22:47 in the MXCSR register. That's part of the part that I don't understand. I don't, I don't work that low
22:53 level, but it turns out it's not ideal. So it said, well, what is actually going on here? And apparently
23:00 there's a way, there's a whole bunch of stuff, how you can search through Linux and whatnot to figure out
23:05 what processes are doing this weird stuff. And also apparently if you compile with -Ofast, it
23:12 also like cascades over to having the same behavior. So there's some exploration, like he wrote some C code
23:18 and then imported it into Python, and it seemed all fine. And then he did the same thing with -Ofast and was able to get
23:25 all these warnings. I've never seen this warning. So I guess that's good, but it turns out the culprit was
23:31 gevent of all things, which is an event-based async networking library. Yeah. But somehow something was
23:39 using it. And when it got imported, it freaked everything out. So then the question becomes, well,
23:43 if gevent can be causing these problems because somebody thought it was awesome to compile
23:49 the fast version, not the slow version. What else is out there? So, Brendan went through and decided
23:55 to download four terabytes of wheels for all the things that might have some kind of x86 binary in
24:02 them. And then there's a ton of analysis of trying to figure out like, well, how do you actually look for
24:08 and find whether or not this program has this feature or not? It turns out to be pretty tricky.
24:13 So there's a bunch of stuff about going through to just check to see like what, how do you test it for
24:19 this many packages? Cause the test he was using before was super slow. So anyway, it's, it's not ideal.
24:26 I think there was something like 49 different packages. Let's see. I wrote it down up here. I'll get this
24:32 number, right? Yeah. There's 49 packages, 49 packages on PyPI that were built with this flag.
24:38 However, thousands of packages use those libraries and hence were also subject to that behavior with
24:45 10 million downloads in the last 30 days. So that's pretty nuts, huh?
24:49 Well, I mean, you're kind of scaring me. So how do I know if I need to care? I guess,
24:55 you know, I, are you doing iterative floating point math that goes down to like very small things?
25:01 Probably, probably not. I don't think I need to care. I'm doing like, I need, I didn't need to
25:06 know what 33% of, you know, 69 is. It should be fine. Right. But if you're doing, well,
25:12 you got to test, got to test your code. And I guess we have to test our math as well. I just sort of
25:17 trust that a lot of that works. Yeah. I suppose you would see those warnings, right? That about the
25:23 floating point subnormal coming in. Okay. Yeah. So there's a great long list here of packages,
25:31 let's see. I'll just read some out. People might know. So for example, gevent, geventhttpclient, Flask-SocketIO, dagster, which is used in data science a lot for
25:43 like data engineering, websocket, gevent-websocket, locust for testing, interpret, pykafka,
25:51 locust-plugins, parallel-ssh, right? So it doesn't matter if you're using that library
25:56 for the math, just if it gets imported, it changes all the math of the program.
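One rough way to see whether something imported into your process has switched this behavior on is to compute a value that should be subnormal and check whether it comes back as exactly zero. This is only a minimal sketch, not the detection method from the blog post. The function name subnormals_flushed is made up for illustration, and it only catches flushing of arithmetic results.

```python
import sys


def subnormals_flushed() -> bool:
    """Return True if a result that should be subnormal comes back as 0.0.

    Hypothetical helper for illustration only. It halves the smallest
    normal double; under default IEEE 754 behavior the result is a small
    nonzero subnormal, while with flush-to-zero enabled it becomes 0.0.
    """
    smallest_normal = sys.float_info.min  # smallest positive normal double
    half = smallest_normal / 2            # should be subnormal, not zero
    return half == 0.0


if __name__ == "__main__":
    # Run this once before and once after importing a suspect package
    # to compare the two results.
    print("subnormals flushed to zero:", subnormals_flushed())
```

Calling it before and after importing a suspect library shows whether that import changed the floating point behavior for the whole process.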
26:01 So anyway, there, there it is. People can check it out. The comments are pretty glowing about this
26:07 research. Matthew Adams, for example, says crazy, awesome work, bro. You should be knighted for this.
26:13 In our chat, Alvaro says, run your tests with -W error, which you should be
26:20 anyway. So cool. So warnings treated as errors basically. Yeah. Yeah. Or set that particular
26:26 one to be, warning. All right. Well, I guess that's it for our four items that we're
26:31 covering today. Am I right? Yeah. I was just, I was, I was giggling during part of that. Cause I,
26:36 the subnormal just cracking me up. Like, like why is, why is Brian talk like that? I don't understand
26:42 most of his words. Oh, don't worry about him. He's subnormal. I don't know. I also, I also like the
26:50 title of the overall blog, Push the Red Button. Push the red button for research, malware,
26:56 reverse engineering, pen testing on the blog. Yeah. Nice. Nice. All right. Well, how about some extras?
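The suggestion from the chat, treating warnings as errors, can also be tried from inside Python with the standard warnings module. What follows is a minimal sketch under stated assumptions: noisy_import is a made-up stand-in for whatever library emits the warning, and the message text is invented. On the command line the blanket version is python -W error.

```python
import warnings


def noisy_import() -> str:
    # Hypothetical stand-in for a library that warns when it is loaded.
    warnings.warn("subnormal support looks broken", UserWarning)
    return "imported"


def run_strict() -> str:
    """Turn every warning into an exception, like `python -W error`."""
    with warnings.catch_warnings():
        warnings.simplefilter("error")
        try:
            return noisy_import()
        except UserWarning as exc:
            return f"failed: {exc}"


def run_targeted() -> str:
    """Ignore warnings in general, but make one particular message fatal."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")
        # The message argument is a regex matched against the start of the
        # warning text; this filter takes precedence over the ignore above.
        warnings.filterwarnings("error", message="subnormal")
        warnings.warn("something unrelated", UserWarning)  # silently ignored
        try:
            return noisy_import()
        except UserWarning as exc:
            return f"failed: {exc}"


if __name__ == "__main__":
    print(run_strict())
    print(run_targeted())
```

The targeted variant is closer to setting just that one warning to be an error, which keeps unrelated warnings from breaking a test run.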
27:01 Yeah. I don't have anything I want to show, but, but I was just going to say a couple of
27:08 things I've been up to. I've been thinking about changelogs a lot, and for Test & Code,
27:13 instead of doing like one episode on changelogs, I thought I would talk to several people and do an
27:19 NPR style combined. So it might be, it might end up being a series of episodes that I'll release
27:25 together or, or one long episode. I'm not sure yet, but basically I'm thinking about changelogs
27:30 a lot. The other thing I've been doing is thinking about, so we had that pytest
27:36 course out, right? Last week. We did just awesome on Talk Python Training. And
27:43 I, I, I, it's cool. Anyway, Talk Python Training. I always get to it by just
27:48 remembering that I switched that and just say training.talkpython.fm and you can get there.
27:53 But I've had some requests to take some of the content and change it for individual
27:59 teams. So, and this is an interesting thing to me to, to, and to think about, to say, cause like in
28:05 this course we do a database and a command line interface, but we're mostly testing through the API.
28:09 So API with the database application. So we're doing things, the layered things,
28:15 but some people are like, well, I don't use the database. So maybe we could swap that out with
28:19 an example that uses one of the resources we have. And more of our example, we don't do the API. We do
28:26 these little, we're testing something else. So like, okay, we can cover the concept. So it's a neat idea
28:31 to try to focus that towards people. So if it, I guess if you're interested in doing that,
28:35 check out pythontest.com, and under training, check me out. So yeah. Awesome. Yeah. There's a
28:42 lot of ideas in that course that can be applied to different industries, different ways. Yeah. Yeah.
28:46 Different ways for sure. Awesome. Yeah. So the pytest course is going super strong. People really
28:51 love it. Great work on that, Brian. I have another course to announce cause it's been a week.
28:55 It's been a week. It's been a week. Python data visualization. So this is a course by Chris
29:02 Moffitt over at Talk Python Training. And the idea is there's all these different choices. I mean,
29:07 we just talked about Superset today, and throw that in as another thing in the pile of general
29:12 visualization tools, right? So you might do matplotlib, or maybe you want to use something new
29:17 like Altair. So this course goes through and shows you what it's like to do visualizations
29:23 in these different frameworks, like matplotlib, Seaborn, even pandas and Plotly and Streamlit.
29:29 And then you can build out these different scenarios and say, well, in this case, it might make more
29:34 sense to use matplotlib, or I might choose Altair and it'll help you choose a visualization
29:38 framework, but also it'll show you how to use all of them. So it's a nice broad exposure to all these
29:43 different frameworks. So people can check that out. talkpython.fm, click on courses. Ooh, this,
29:48 this is definitely useful. I got a project that I need this for. So yeah, this is going to be a good
29:52 one. It is a good one. I've already seen it. I've seen it several times actually, but it's good.
29:57 Let me see. Do I have any more extras I want to give a shout out to? No, just those two things.
30:02 And then I have, I have two jokes for you this week because one is not enough.
30:05 No. Yeah.
30:06 The first one here has to do with people who maybe learned a different language, maybe are
30:13 hating a little on Python. So here's somebody who says, me laughing at all the Python hate on this
30:19 subreddit as I study C#. Silly language. Come on. We all know C# is better. And then
30:26 that's like a smiling, laughing person. And then a more serious, somewhat concerned one: starting a new
30:31 job and realizing on the job board, 95% of them are asking for Python. That's very fun.
30:37 Well, that now I want to go over to the, like the, the C# subreddit and see if I can find some
30:44 anti Python jokes.
30:45 I know.
30:45 Wouldn't that be good?
30:46 All right. Well, that one's pretty good. And then were you affected by the recent, we have,
30:52 for people who are not in our area in Pacific Northwest, there was a massive windstorm, like
30:57 30, 40 mile an hour wind, 25% humidity, a hundred degrees. It was like, if somebody threw a cigarette
31:04 out the window, the entire Pacific Northwest would just go instantly catch fire. It was like,
31:08 it was insanely bad. And so we had our power turned off in the West Hills here because the
31:14 trees were so likely to fall over and cause a fire from knocking over. So they just cut the power for
31:20 like a little bit. They also did that in California. There's like a big, it was a bit of an irony.
31:25 Like one day they said, we're going to only allow the sale of electric cars after 2030, 2035 or something,
31:32 whatever the date is. I mean, I'm, I'm going to support that. I'm a fan of electric cars and all,
31:36 but like the next week they said, Oh, we're going to turn off your power. Cause actually I think the
31:41 electric cars might help balance it out. But anyway, bit of an irony. So this next joke has to do with
31:45 that. So I got ahold of this from Kylie codes and she highlighted this tweet that says, for California,
31:52 the governor has declared a state of emergency and asks all
31:56 Californians not to run npm install between 4 PM and 9 PM today in an effort to save energy and fight
32:04 this wildfire danger. Oh, that's awesome. Isn't that good? Yeah. Yeah. I love it. So that's,
32:11 that's the two jokes I got for you. Yeah, nothing too deep. Well then also, you missed one.
32:17 There was a, like the, the, the build on of that. The build. All right. Do tell us about it.
32:22 Okay. Governor declares the state of emergency and asks all Californians to not run
32:25 wasm-pack build between 4 PM and 9 PM. Exactly. Nice. Cool. And John Sheehan says,
32:33 it's funny because it's true. Didn't you just talk the other day about Ruff and having
32:43 our Python tools be faster, like the JavaScript community being concerned about faster tools?
32:48 Maybe not everywhere. Maybe not a hundred percent. Yeah. Awesome. All right. All right. Well,
32:53 good episode as always. Thank you. Thank you. We'll talk to you next week. Yeah. See you next week.
32:59 Thanks everyone for listening. Bye. Bye.