#216: Container: Sort thyself!

Published Wed, Jan 13, 2021, recorded Wed, Jan 13, 2021

Sponsored by Datadog: pythonbytes.fm/datadog

Special guest: Jousef Murad, Engineered Mind podcast (audio, video)

Play on YouTube

Watch the live stream replay

Brian #1: pip search. Just don’t.

pip search [query] is supposed to “Search for PyPI packages whose name or summary contains [query]”
It looks like the search feature is going to be removed from pip, along with the PyPI API behind it.
The alternative, and better, approach: just go to pypi.org and search there.

Right now it does this:

    $ pip search pytest
    ERROR: Exception:
    Traceback (most recent call last):
    ... [longish traceback omitted] ---
    xmlrpc.client.Fault: [Fault -32500: "RuntimeError: PyPI's XMLRPC API has been temporarily disabled due to unmanageable load and will be deprecated in the near future. See https://status.python.org/ for more information."]

The Python Infrastructure status page says, as of Jan 12: “Update - The XMLRPC Search endpoint remains disabled due to ongoing request volume. As of this update, there has been no reduction in inbound traffic to the endpoint from abusive IPs and we are unable to re-enable the endpoint, as it would immediately cause PyPI service to degrade again.”
This started becoming a problem in mid December.
The endpoint was just never architected to handle the scale it’s getting now.
There’s a current issue “Remove the pip search command”, open on pip.
- The comment thread is locked now, but you can read some of the history.
I personally don’t understand the need to hammer search with a CI system or other.
- Probably should be using a local cache or local pypi mirror for an active/aggressive CI system.
If you have scripts or jobs that run pip search, it ain’t gonna work, so probably best to remove that.
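One way to act on that last point is to scan your CI scripts for pip search calls. The snippet below is only a rough sketch using the standard library: the helper name find_pip_search, the regular expression, and the example job text are all made up for illustration, and the pattern only catches the plain pip / pip3 / python -m pip spellings.

```python
import re

# Hypothetical pattern: matches "pip search", "pip3 search", and "python -m pip search".
PIP_SEARCH = re.compile(r"\bpip3?\s+search\b")


def find_pip_search(script_text):
    """Return (line_number, line) pairs for lines that invoke pip search."""
    hits = []
    for number, line in enumerate(script_text.splitlines(), start=1):
        # Skip pure comment lines so notes about the outage are not flagged.
        if line.lstrip().startswith("#"):
            continue
        if PIP_SEARCH.search(line):
            hits.append((number, line.strip()))
    return hits


# Made-up CI job contents, for illustration only.
example_job = """\
pip install -r requirements.txt
pip search pytest
# pip search is disabled, see status.python.org
python -m pip search requests
pytest
"""

for number, line in find_pip_search(example_job):
    print(f"line {number}: {line}")
```

Anything it reports is a candidate for removal, or for a lookup against a local cache or mirror instead.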

Michael #2: QPython - Scripting for Android with Python

Python REPL on Android - interesting
Scripting Android tasks with Python - more interesting
Free, open source app that is ad supported.
Some people have commented that their phone is their only “computer”
With SL4A features, you can use Python programming to control Android work:
- Android Apps API, such as: Application, Activity, Intent & startActivity, SendBroadcast, PackageVersion, System, Toast, Notify, Settings, Preferences, GUI
- Android Resources Manager, such as: Contact, Location, Phone, Sms, ToneGenerator, WakeLock, WifiLock, Clipboard, NetworkStatus, MediaPlayer
- Third App Integrations, such as: Barcode, Browser, SpeechRecognition, SendEmail, TextToSpeech
- Hardware Manager: Camera, Sensor, Ringer & Media Volume, Screen Brightness, Battery, Bluetooth, SignalStrength, WebCam, Vibrate, NFC, USB

Jousef #3: Thesis: Deep Learning assistant for designers/engineers

PyTorch (3D) / TensorFlow
The thesis: what is it actually about & goal of the thesis
Libraries mainly used: numpy, pandas
(Reinforcement Learning & GANs)

Brian #4: sortedcontainers

Thanks to Fanchen Bao for the topic suggestion.

Pure-Python, as fast as C-extensions, sorted collections library.

    >>> from sortedcontainers import SortedList
    >>> sl = SortedList(['e', 'a', 'c', 'd', 'b'])
    >>> sl
    SortedList(['a', 'b', 'c', 'd', 'e'])
    >>> sl *= 10_000_000
    >>> sl.count('c')
    10000000
    >>> sl[-3:]
    ['e', 'e', 'e']
    >>> from sortedcontainers import SortedDict
    >>> sd = SortedDict({'c': 3, 'a': 1, 'b': 2})
    >>> sd
    SortedDict({'a': 1, 'b': 2, 'c': 3})
    >>> sd.popitem(index=-1)
    ('c', 3)
    >>> from sortedcontainers import SortedSet
    >>> ss = SortedSet('abracadabra')
    >>> ss
    SortedSet(['a', 'b', 'c', 'd', 'r'])
    >>> ss.bisect_left('c')
    2

“All of the operations shown above run in faster than linear time.”
Types:
- SortedList
- SortedKeyList (like SortedList, but you pass in a key function, similar to the key argument of Python’s sorted function.)
- SortedDict
- SortedSet
Great documentation and tons of performance metrics in the docs.
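To get a feel for what "stays sorted by a key function" means, here is a toy sketch built on the standard library's bisect module. It is not how sortedcontainers is implemented, and the TinyKeyList class and the order data are invented for illustration; it uses plain list inserts, so it will not match the performance the library's benchmarks show.

```python
import bisect


class TinyKeyList:
    """Toy list that stays sorted by key(item). Illustration only."""

    def __init__(self, key):
        self._key = key
        self._keys = []   # sorted keys, kept parallel to _items
        self._items = []

    def add(self, item):
        k = self._key(item)
        # Binary search finds the slot; items with equal keys keep insertion order.
        i = bisect.bisect_right(self._keys, k)
        self._keys.insert(i, k)
        self._items.insert(i, item)

    def __iter__(self):
        return iter(self._items)


# Hypothetical "order" records, sorted by price rather than by id.
orders = TinyKeyList(key=lambda order: order["price"])
for order in [{"id": 1, "price": 30}, {"id": 2, "price": 10}, {"id": 3, "price": 20}]:
    orders.add(order)

print([order["id"] for order in orders])  # [2, 3, 1]
```

The key function plays the same role as the key argument to sorted: it picks out the value to order by, so objects without a natural ordering can still live in a sorted container.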

Michael #5: Łukasz Langa Typed Twitter Thread

Let’s riff on typing for a bit.
Here is my philosophy: If I have to type more than three characters to complete a symbol in my editor, something is wrong.
e.g. to go from email_service. → email_service.send_account_email() I should only need to type .sae then tab/enter. These types of things are vastly better because of type hints.
Python type hints are more malleable than even TypeScript.
Łukasz is addressing this comment: Controversial take: Types in a Python code-base are a net negative.
Points
- put enough annotations and tooling connects the dots, making plenty of errors evident.
- The most common to me at least is when a None creeps in.
- The second bug often caught by type checkers is on the "return" boundary: one of your code paths forgets a return.
- squiggly lines in your editor
- Microsoft is now developing powerful type checking and code completion for Python in VSCode. This effort employs a member of the Python Steering Council, and possibly also the creator of Python himself soon. You think they would settle for "illusion of productivity"?
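The two bug classes from the thread are easy to reproduce. The following is a small sketch with made-up function names (find_email, shout, describe); the comments describe what a type checker such as mypy would be expected to complain about, while the code itself just shows the runtime behavior.

```python
from typing import Optional


def find_email(user_id: int, emails: dict) -> Optional[str]:
    """Return the email for user_id, or None when there isn't one."""
    return emails.get(user_id)


def shout(email: str) -> str:
    return email.upper()


def describe(count: int) -> str:
    if count == 0:
        return "none"
    if count == 1:
        return "one"
    # The final return is missing: this path falls through and returns None,
    # even though the annotation promises a str.


emails = {1: "brian@example.com"}

# A type checker is expected to flag this call, since find_email may return None:
# shout(find_email(2, emails))

# Checking for None first is the fix.
email = find_email(1, emails)
if email is not None:
    print(shout(email))

print(describe(5))  # None
```

Without the annotations, both problems would only show up at runtime, typically as an AttributeError on NoneType somewhere far from the real cause.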

Jousef #6:

Point Cloud operations → open3d

Extras:

Michael:

via Francisco Giordano Silva: On Brian's reference to using numpy's all for element-wise array comparison, also check out the numpy.allclose function. It lets you compare two arrays within a given tolerance.
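As a rough illustration of what "within a given tolerance" means, here is a pure-Python sketch of the element-wise test that the numpy.allclose documentation describes, abs(a - b) <= atol + rtol * abs(b), using the same default tolerances. The function name allclose_sketch is made up, and the sketch works on plain lists of equal length only; the real function also handles broadcasting and NaN options.

```python
def allclose_sketch(a, b, rtol=1e-05, atol=1e-08):
    """Pure-Python illustration of the tolerance test numpy.allclose documents."""
    return len(a) == len(b) and all(
        abs(x - y) <= atol + rtol * abs(y) for x, y in zip(a, b)
    )


measured = [0.1 + 0.2, 1.0]
expected = [0.3, 1.0]

print(measured == expected)                 # False: 0.1 + 0.2 is not exactly 0.3
print(allclose_sketch(measured, expected))  # True: equal within tolerance
```

This is why exact comparison is usually the wrong tool for floating-point arrays.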

Brian:

Just this: 2021 is exhausting so far.
Test & Code has shifted to every other week to allow time for other projects I’m working on.
- This is probably a short term change. But I don’t know for how long. It’s definitely not going away though. Just slowing down a bit.

Jousef: Engineered Mind podcast

Episode Transcript

00:00 Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds.

00:04 This is episode 216, recorded January 13th, 2021.

00:10 I'm Michael Kennedy.

00:12 I'm Brian Okken.

00:13 And Brian, we have a special guest, Jousef.

00:15 Welcome.

00:16 Hi.

00:16 Great to have you here.

00:17 You want to just take a quick moment and tell folks about yourself, maybe about your podcast real quick?

00:22 Yeah, sure.

00:22 Thanks for even being able to participate in this podcast.

00:26 So my name is Jousef.

00:28 I might not be well known as you guys are, for sure.

00:31 I'm a mechanical engineer from Germany.

00:33 I'm based in Germany as well.

00:34 And I'm working for an IT company called SimScale, who are providing cloud-based simulation technology.

00:40 On the side, I'm hosting a podcast called Engineered Mind.

00:43 And I'm working on a bunch of other stuff.

00:45 For example, my thesis, which we'll extensively cover, let's see, in this podcast.

00:49 Yeah, yeah.

00:50 You used a couple of cool libraries and stuff over there, which we'll feature here.

00:55 All right.

00:56 Well, the very first item, Brian, let's talk about maybe doing a pip search a lot.

01:04 Yeah.

01:05 Well, I kind of forgot pip search was the thing, because when I'm looking for PyPI stuff, I go to pypi.org.

01:12 Yeah, so do I.

01:13 Yeah, exactly.

01:14 It's really fast there.

01:16 Yeah, but there's a feature in pip.

01:20 You can do a pip search, which the documentation says it's supposed to search for PyPI packages whose name or summary contains whatever query.

01:30 So I can say pip search pytest, for instance, and it should show me if pytest is a package on PyPI.

01:39 But right now, if you do that, it comes back with a big traceback.

01:43 And it says, fault, what, minus 32500?

01:48 Runtime error.

01:50 The PyPI's XML.

01:52 Anyway, the API is broken.

01:53 So this is on purpose.

01:56 What happened is, actually, I don't know really what's happened, but the service is getting swamped.

02:03 The search endpoint is getting hit extremely hard.

02:07 Yeah, I saw some message or some tweet that was to the effect of, is somebody out there running an insane number of searches against this endpoint?

02:15 Please don't.

02:17 Yeah, I don't know what's going on.

02:18 So there's some guesses.

02:20 Maybe it's a rogue continuous integration server or something weird's going on.

02:26 But in the meantime, right now, we're going to link to a Python infrastructure status page, which has an update on this.

02:36 So if anybody wants to follow, you can check that out.

02:40 It says that the search endpoint remains disabled due to ongoing request volume.

02:46 And I think this really started becoming a problem mid-December.

02:51 And so I'm not sure what happened then.

02:55 And then there's a related issue on GitHub for pip.

03:00 So there's an issue open saying remove the pip search command.

03:05 So I think the end result is, and even the error message says the search endpoint will be deprecated in the near future.

03:13 So I think that this way to do pip search is just going to go away.

03:18 That's actually a little surprising because usually a lot of these things are so backwards compatible.

03:24 Yeah.

03:25 And there's quite a discussion on the issue thread.

03:29 But the gist of it is the current architecture was never designed to handle the volume it's getting right now.

03:36 So there's a comment at the end of the thread that says, if you've got an idea for how to do this algorithm better or a way to do it scaled, go ahead and discuss it.

03:49 But there's a link to it.

03:51 We're not going to put that link in the show notes.

03:53 But in the PyPI thread or the GitHub thread, there's a link to it if you want to comment on that.

04:00 But basically, we're bringing this up.

04:02 You may have figured out it might be a fluke or whatever, but it's really going on.

04:07 And a plea to look at your continuous integration scripts.

04:11 And if you're doing a pip search in there, take those out.

04:13 It ain't going to work anyway.

04:15 It's got to be some kind of bot, some automatic thing like this, because it's already given the error message.

04:21 Like, people would stop, you know, if it wasn't.

04:25 Maybe somebody's trying, like, constantly trying to scrape all of the PyPI data out.

04:30 I don't know.

04:30 Yeah, why do a search?

04:31 That's just weird.

04:33 Yeah, exactly.

04:33 I don't know what's going on here.

04:35 But I guess don't do it.

04:37 Don't do it.

04:38 Doctor, it hurts when I do this.

04:40 Stop doing that.

04:40 So the next one I want to talk about is QPython, not QTPython or Qt or anything like that, but QPython, which is a way to do Python on Android.

04:52 So we've talked about a couple of interesting applications.

04:55 We've talked about Carnets or Carnet.

04:57 I think it's French pronunciation, I've been told.

05:00 And that's a really cool way to do Jupyter on iPad.

05:03 So local.

05:04 All these are local, obviously, not just running in the browser.

05:07 There's Pythonista, which is really interesting.

05:11 And QPython is also an interesting one for a couple reasons.

05:14 Because you get an SDK and a REPL for your Android device, which is pretty interesting.

05:20 But the reason I'm covering it, I think it's interesting.

05:23 Somebody, I think somebody sent this over.

05:25 No, I ran across this myself.

05:26 Anyway, is it allows you just to integrate with the underlying Android APIs and features for automation.

05:35 Yeah, cool, right?

05:36 So you can do things like check the system.

05:41 You can send out toast notifications.

05:43 You can interact with applications.

05:45 You can mess with the clipboard.

05:46 You can do barcode scanning, speech recognition, send emails, like all those kinds of things around even, you know, screen brightness or checking your battery or whatever.

05:58 So if you want to get access and automate your Android things, Python, well, here's a cool little app to do it.

06:05 Okay.

06:06 Wait a second.

06:07 So I'm not an Android user that much.

06:10 I've got like one Android tablet.

06:11 But I didn't know it can make toast.

06:13 Yeah.

06:14 Well, it really prefers sourdough, but it will go even as far as rye if you have to.

06:21 No, what's toast?

06:23 Do you know what toast is?

06:24 It's like a pop-up notification, I think.

06:25 Oh, okay.

06:26 Jousef, are you an Android person or an iPhone person?

06:32 I have to confess I'm an iPhone person.

06:35 I used to be completely against iPhone, but once you're in the ecosystem, you never get out.

06:40 It's like the godfather.

06:42 They just keep pulling you back in, man.

06:43 Yeah.

06:44 I just recently got a new iPhone as well, and I'm general about it.

06:49 But because we have our mobile apps, the training for the courses, I've got an Android tablet and I've got an Android phone and so on.

06:56 Oh, also, we've got a comment here on YouTube.

07:00 So is it an own framework or can you use it in Android, Kotlin, and Java?

07:05 I believe it's more like an app that you run, and then within that, you can do little jobs and stuff.

07:11 So way to aesthetic.

07:12 It's not something you can bring in that I'm aware of because you install it from Google Play, for example, to get started and so on.

07:19 But maybe you can plug it in.

07:21 They do talk about having an SDK, so possibly, but I got the sense that it's more for writing code outside than getting it on your device.

07:27 But yeah, pretty cool.

07:29 So if you're into Android and you want to do Python automation on it, this is pretty cool.

07:34 It's free.

07:34 Get it on the Android store.

07:35 Apparently, it has ads, but it's also open source.

07:38 So go with that.

07:39 Do you know if there's a counterpart for iOS?

07:41 I don't know about the automation side.

07:44 There's a thing called Carnets, which is really cool.

07:48 Let's see if I can find that.

07:49 Carnets app.

07:51 I believe that's how you spell it.

07:52 Yes.

07:53 That's Jupyter on the App Store.

07:55 And that thing, I don't really want to open the App Store, but apparently I have to.

07:59 Well, so much for that.

08:00 But Carnets, it's here.

08:02 Oh, and it's also on Google Play.

08:06 Is that the same thing?

08:07 No, that's a totally different thing.

08:08 But Carnets or Carnet is a very cool app that lets you do something similar.

08:13 There's also Pythonista.

08:14 Those are the two I know for iOS.

08:16 All right.

08:17 So moving along, Jousef, maybe tell us a little bit about your research and then one of the libraries you've been working with here.

08:26 Yeah, sure.

08:27 So Open3D is one of the possibilities to visualize.

08:29 There you go.

08:30 3D.

08:31 I had it out of order.

08:32 Yeah, yeah.

08:32 Let's talk about PyTorch first.

08:34 Sorry.

08:34 That's fine.

08:35 So PyTorch 3D is basically an option.

08:37 Let's say if you work with meshes.

08:39 Let's say a mesh consists of edges and points, for example, and these edges connect all the points and what you get at the end is a mesh.

08:46 So PyTorch 3D, which is from Facebook, Facebook AI Research, and they created this framework, so to speak, to be able to work efficiently with 3D data.

08:57 So unfortunately, I'm using point cloud data.

08:59 But the beautiful thing is that if you use PyTorch native application, which you produce for your 3D geometry, it runs, I wouldn't say significantly, but roughly 10 times slower than this PyTorch 3D, which is implemented especially for 3D problems.

09:17 Oh, wow.

09:17 Okay.

09:18 So what kind of problem do people solve?

09:20 What problem are you solving when you're working with this?

09:23 Yeah.

09:23 So in the beginning, it was like I was doing some kind of research.

09:26 Unfortunately, there are papers coming out like every day, and not too many, actually, in the field of deep learning, especially when it comes to point cloud or like geometric data.

09:35 And the goal, just to inform the audience a bit, is my goal is basically to use deep learning and use some kind of or create an assistant system for engineers and designers.

09:45 That means, let's say you're an engineer and we have this CAD model.

09:48 So CAD, which stands for Computer Aided Design.

09:51 So you would create a model, for example, of a gear, and then you would have that gear.

09:54 But sometimes we have this differentiation between implicit knowledge and explicit knowledge.

09:59 Explicit knowledge means this is existing knowledge, which we already know about.

10:02 Let's say this knowledge can sit in a database.

10:05 And sometimes we are not making use out of it.

10:08 And then we have this implicit knowledge.

10:09 Let's say an engineer comes into a company, is completely new, and he brings knowledge with him to the company.

10:14 Now, the problem I want to tackle is because we're having so many data and we're accumulating geometric data in a company, we have to make use of that.

10:21 And my approach is, hopefully, when I'm at the end of the thesis, which is like in roughly two months, is that I have a system or web application as a front end where the engineer or designer picks or starts a design or picks a point cloud or a design.

10:35 And then it would suggest the engineer or designer with a probability of what they want to model.

10:41 Let's say he picks a gear.

10:42 Or maybe you want to have like an arrangement of gears or any specific big component.

10:48 Or let's say you take a wheel.

10:50 Okay, for example, what would you...

10:51 Like for a transmission or something?

10:52 Exactly, for a transmission or they pick a wheel and it could be a Tesla or it could be any other car.

10:57 And then it would give you a probability.

10:59 Okay, this wheel is maybe from a Tesla.

11:01 And then it would suggest you Tesla with a, for example, 89% probability.

11:05 And then you would click on the web application.

11:07 This is the idea.

11:08 And then it would pop the geometry into the web browser in the front end.

11:12 Oh, that's pretty cool.

11:14 So it basically, it's like image recognition, but instead of for pictures, it's image recognition for 3D CAD outlines.

11:21 Exactly.

11:22 It's so cool that you mentioned it because there's a big difference between doing a convolutional neural networks or deep learning for images because images are 2D.

11:29 It's like a 2D matrix.

11:31 But if you have a point cloud, then you have a tensor of higher dimensionality.

11:34 And then you are kind of forced to use, for example, NumPy and all these kind of things.

11:39 And if you're lucky, you could use something like PyTorch, PyTorch 3D, which you can also use CUDA on to be way more efficient.

11:45 Yeah.

11:46 Yeah.

11:47 Wow.

11:47 That's really cool.

11:47 So it looks like a neat thing.

11:49 This is, you know, I haven't done any 3D work for a while, but yeah, it looks, looks pretty cool.

11:55 I would love to see, I don't know, some pictures and stuff.

11:58 It would not be neat, but yeah.

11:59 They have a very good, like if someone is interested in seeing what PyTorch 3D can do, Facebook AI Research has an own YouTube channel and they pitched PyTorch 3D on that channel.

12:09 And they really do a nice, they show you what you can do with it.

12:11 So it's really interesting.

12:13 Yeah.

12:13 Oh, awesome.

12:14 Well, I guess I'd never really thought about applying, you know, AI, ML stuff to 3D meshes, but it makes perfect sense.

12:21 And I can see it's totally different than images.

12:23 Yeah.

12:24 Very cool.

12:24 Brian, do you guys do any, you don't do any CAD stuff with your devices, do you?

12:27 Well, I mean, yeah, some people do.

12:29 Not me either.

12:30 There's a lot of CAD that goes on in the ASIC design and stuff.

12:33 Yeah, I can imagine.

12:35 Yeah.

12:35 Cool.

12:36 All right.

12:36 Now, before we get to the next one, I want to get something sorted out, Brian.

12:40 Okay.

12:40 I want to talk about Datadog.

12:42 So they're back to support the show.

12:43 Thank you, Datadog.

12:44 Yay.

12:44 And so they're really about helping you troubleshoot latency, CPU, memory bottlenecks in your apps.

12:52 And if you don't know where it's coming from, Datadog will seamlessly correlate the logs and the traces at the level of individual requests, cross systems, allowing you to quickly troubleshoot your Python app.

13:03 And they have a continuous profiler that allows you to find the most resource consuming parts of your app in production, just running all the time at any scale.

13:12 And it has very little overhead.

13:14 So that's pretty cool.

13:14 Instead of trying to debug it and then deploy it and hope that kind of translates to production, just turn it on and watch.

13:20 Yeah.

13:20 So, yeah, that's cool.

13:21 So be the hero that got that app back on track at your company.

13:25 Get started with a free trial and support the podcast at pythonbytes.fm/datadog.

13:30 Or just click the link in your podcast player show notes.

13:33 Now that that's sorted out, Brian.

13:35 Yeah.

13:36 So sorting.

13:37 Sorting's a thing.

13:39 And the default Python containers are not sorted.

13:43 And there's reasons behind that.

13:47 But sometimes you need to sort stuff.

13:48 So there's a Python library or a package called sorted containers.

13:54 I like it.

13:55 It's a very, I mean, I like the name at least.

13:57 It's a very easy to remember sort of thing.

13:59 But this is amazing.

14:02 I looked into this.

14:04 So this was recommended by Fanchen Bao recently for us to take a look at.

14:09 And it's a pure Python based sorted collections library.

14:14 And it's as fast as other packages that are built using C extensions.

14:20 Wow.

14:21 That's the impressive part.

14:22 It's also fairly memory safe.

14:26 But the documentation is pretty cool.

14:29 There's a whole bunch of different benchmarks.

14:32 So you can take a look at how it deals with large, large things.

14:36 But it's really pretty zippy.

14:37 It was pretty cool.

14:39 The right on the front page, there's some, there's some example.

14:43 And we're going to throw this in the show notes too.

14:45 Of just, there's, you've got, it handles a handful of different data types.

14:49 It shows sorted lists, sorted dictionaries, and sorted set.

14:54 There's also a sorted key list.

14:57 And I had to look that up to figure out what that was.

15:00 So sorted, the sorted function within Python allows you to pass in a key,

15:06 which the key really is a function to use to create a key for sorting.

15:12 And.

15:14 Right.

15:14 Because the things in there might not have a natural sort, right?

15:17 Like if you put a bunch of order objects in there, well, how do you sort those?

15:21 Do you sort them by price?

15:22 Do you sort them by date?

15:24 Right.

15:25 So you select out that element.

15:26 Yeah.

15:26 You should have it selected out or you can, you can do something like they might be sortable

15:31 by default, but you want it to be like a reverse sort or something like that.

15:35 Right.

15:35 So, so, and there's some, some caveats listed so that you have to make sure that your,

15:40 the key that you pass in, it follows some conventions like two identical items should be,

15:46 should have the same key, stuff like that.

15:48 It's all reasonable things, but it's a, it's a fairly easy, easy and complete package to just

15:54 use.

15:55 It looks, it acts just sort of like a normal, the normal thing, containers like lists and

16:02 dictionaries and sets.

16:02 It just, it just remains sorted all the time.

16:05 And this is pretty incredible.

16:07 So.

16:07 Yeah.

16:08 I can totally see bugs get into your code because you're like, well, we put stuff into this list

16:13 and oh, I want the latest one.

16:14 So it's the last one, but maybe, you know, you forgot to sort it before you did that or

16:18 the first one's the last cause you reversed it or whatever.

16:21 So what are the things that confused me when I first look at the, at this, I was

16:24 scratching my head for a second.

16:25 Cause it looks like a fairly simple set of like examples with just like a small set of

16:31 elements in it.

16:32 So like the first one is a list of like E, A, C, D, B, just, you know, a few characters.

16:39 And, you know, it's a whole bunch of these examples with just a little small amounts.

16:43 And it says, underneath, this is all, all of the demo listed above, takes

16:49 a gigabyte of memory.

16:50 And I'm like, what the heck?

16:52 Why is it taking so much memory?

16:55 It's only five things.

16:56 Come on.

16:57 Yeah.

16:57 I mean, like why?

16:58 It's cheap.

16:58 Don't worry about it.

16:59 But there's hidden in there.

17:01 There's an example of that five-character sorted list that gets multiplied

17:09 by 10 million.

17:11 So it's like 50 million characters in a list that got sorted.

17:17 Right.

17:17 Yeah.

17:17 So, and then like things like, and then all the operations like count.

17:22 So you can say, count all the C's in there.

17:24 It'll tell you how many there are.

17:26 and all of, a lot of these operations like counting stuff with, with a sorted set take,

17:31 you know, less than linear time.

17:33 So, yeah.

17:34 So there's, there's, there's times you need sort and this is a cool one to check out.

17:38 Yeah.

17:39 It's cool.

17:39 It's nice that it's pure Python.

17:40 super easy to install.

17:42 Right.

17:42 And it's not going to have any like weirdness around that.

17:45 Like if you say got an M1 computer and thing won't compile or whatever.

17:49 No, this looks really cool.

17:51 Jousef, what do you think?

17:52 Yeah, this looks amazing.

17:53 I'm also, I'm in touch with my brother, on the side and he's also watching our podcast

17:59 at the moment.

17:59 And he's also saying because of the one gigabyte memory for sorting is, it's incredible.

18:03 It's crazy.

18:04 Yeah.

18:06 That's, that's pretty awesome.

18:07 Yeah.

18:07 I guess it's just showing like you can have a ton and it's all nice.

18:10 So it's, I mean, it seems really straightforward, but having these things sorted, we just got

18:14 dictionaries that would stay put.

18:16 So, having sorted dictionaries is also cool.

18:19 Yeah.

18:21 Right.

18:22 It used to be that they sort of, if they had the same keys and stuff or they wouldn't

18:26 necessarily retain their order, of the things you added, but now they do.

18:29 Right.

18:30 So if people are confused and think, well, aren't dictionaries already sorted?

18:33 No, they're there.

18:34 They just stay in the order that they were created.

18:36 Exactly.

18:38 So yeah.

18:39 Yeah.

18:39 Similar, but not exactly the same thing.

18:41 Yeah.

18:42 All right.

18:43 so I want to, this next one, I want to riff a little bit on typing and I want to do

18:48 that around a tweet, which I think I've got to put into a different type of, hold on.

18:54 For some reason, Twitter has stopped showing me like the entire conversation of things.

19:00 I don't know why, but I guess it doesn't really matter.

19:02 So Łukasz Langa responded to a tweet that went out there.

19:07 You know, Łukasz is obviously, he's a core developer.

19:09 He's been doing really important stuff, but one of the main focuses that he's been working

19:13 on is around type hints and typing with things like mypy.

19:17 He was instrumental in bringing typing to Facebook and the Instagram code bases and things like

19:23 that.

19:23 So, there's a tweet that says controversial take types in a Python code base are a net negative.

19:30 That's not Łukasz.

19:31 This is his, he's about to have a whole long conversation about this that I'm going to

19:34 talk about.

19:35 But, Brian, what do you think?

19:37 You retweeted this.

19:39 Them's fighting words.

19:40 Them's is fighting words.

19:42 Yeah.

19:43 So yeah, what do you think?

19:45 I think that, I think that they're good too.

19:48 Yeah, I do too.

19:49 I think when I first saw them, I was a little concerned, like, oh my goodness, this is going

19:54 to potentially, you know, turn Python into something like TypeScript.

19:59 And while I appreciate what TypeScript does to make JavaScript much better, I almost always

20:05 walk away from working with TypeScript with a feeling of like, ah, that kind of hurt and

20:08 was painful.

20:09 I wonder why it had to go that way, you know, because the TypeScript requires, it's like,

20:14 it's like C# or C++.

20:16 The types have to match and they have to be there.

20:18 And if they don't match at all, then it just won't work.

20:21 Right.

20:21 It's super frustrating.

20:22 Oh, this thing is not defined.

20:24 And, you know, there's, cause there's libraries that might not have types and then how do

20:27 you work with them?

20:28 And it's just, if I find it, there's always some little edge case.

20:30 It's like, ah, this is frustrating, but I never feel that way with Python.

20:33 And I really have come to love Python's type hints.

20:37 And obviously Łukasz starts out his conversation saying, this is easily disproven.

20:42 If you ever use PyCharm or VS Code, the code completion in there is based on type annotations.

20:48 If you've ever seen your editor highlight a function and squiggly say, this expects something

20:53 else than what you're giving it, you know, besides the number of variables, but like you're

20:57 giving it a string and it wants a number or something like that.

21:00 You're using type annotations and you can enhance your code by doing that.

21:05 Right. So I was actually talking to Jousef about this yesterday.

21:08 My philosophy, or maybe my, my rule of thumb is you don't have to always do it this way,

21:12 but you know, if you're working in your editor and you have to type more than say three characters

21:18 to get some kind of symbol to come up, you're probably doing it wrong.

21:21 So like if you have email service and you want to have email service, send account email,

21:27 you should be able to say dot S A E S A E, right.

21:31 Send account email and it should know the type that's been returned, what an email service

21:35 is, that it has this property and just write it for you.

21:38 Right. So to me, a lot of the typing stuff, I know, I mean, this comment is somewhat about

21:43 bugs. Like I never found a good bug because of this.

21:45 To me, that's almost like a side benefit.

21:48 It's about quickly generating code without stopping to go look at the code definition,

21:53 without going over to the documentation to see what I could have typed over here.

21:56 You know, it's for example, AWS people.

22:00 This is insanely frustrating to work with AWS because you get these like weird, create this

22:05 service and you give it a name and then you get an S3 service back, but it has no idea that it's

22:10 an S3 service. So you get zero help on what anything, even I think go to definition doesn't

22:16 quite work because it's, you know, use some factory method to reach down some weird place

22:20 and get the thing. So I think really driving the code generation experience without being in

22:25 documentation, without jumping around and reading all the source, just go forward. I think it's

22:30 super nice. So to me, like that is the biggest win of all of this stuff.

22:34 So let me give you the entire thread is very interesting. So yeah.

22:38 Yeah. So let me touch on a couple of the points of the thread because I can't get it to come up in

22:42 the screen share, but that's fine. I took notes luckily. So some of the things he pointed on, he's

22:48 like, here's tweet one of 10. So number one, put enough annotations and then the tooling will

22:54 connect the dots and make plenty of errors evident as well as like heighten this code

22:59 generation auto magic, right? That's one. The most common types of errors though, that'll creep in

23:05 is if None is being used where you expect a concrete type, and things like mypy

23:10 will say you're using a type that is an optional of something, but you're not checking to see if it's

23:16 None before you dereference it. You're probably going to end up at some point with an AttributeError:

23:21 NoneType does not have attribute whatever you tried to do, you know, upper or whatever,

23:25 right?

23:26 None is, None is not subscriptable or something.
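
As a rough illustration of the None problem described here, the sketch below uses a made-up lookup function, `find_email`, that returns `Optional[str]`. The names and data are hypothetical. A checker such as mypy would complain about the unchecked version because `email` may be None, while the checked version handles that case first.

```python
from typing import Optional


def find_email(user_id: int) -> Optional[str]:
    # Hypothetical lookup: returns None when the user is unknown.
    emails = {1: "brian@example.com"}
    return emails.get(user_id)


def shout_unchecked(user_id: int) -> str:
    email = find_email(user_id)
    # A type checker flags this line: email may be None here.
    return email.upper()


def shout_checked(user_id: int) -> str:
    email = find_email(user_id)
    if email is None:
        # Handle the missing case before touching the value.
        return ""
    return email.upper()
```

Calling `shout_unchecked(2)` raises an AttributeError at runtime, which is the kind of failure the annotation lets the tooling catch before the code runs.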

23:29 Yes. So yeah. Something like that, right. Or callable or any of the things. Also another

23:35 common bug is the return case. So if you've got a function, you can, you know, maybe check

23:42 something and return one value, check something else, return another value. But if you forget at the

23:47 end and you fall through and you don't put up some kind of concrete return type, Python

23:51 functions just return None. Like this actually blew me away when I learned Python and I learned

23:56 about functions that they always, always, always return something. There's no such thing as a

24:01 void function in Python.
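
A small sketch of the fall-through case, using a hypothetical `classify` function. The first version forgets the final return, so the zero case silently returns None even though the annotation promises a `str`. A type checker would typically report a missing return statement there.

```python
def classify(n: int) -> str:
    if n > 0:
        return "positive"
    if n < 0:
        return "negative"
    # n == 0 falls through, so Python implicitly returns None.


def classify_fixed(n: int) -> str:
    if n > 0:
        return "positive"
    if n < 0:
        return "negative"
    return "zero"


fell_through = classify(0)   # None, despite the "-> str" annotation
fixed = classify_fixed(0)    # "zero"
```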

24:02 Yeah.

24:02 As a C++ person, that probably surprised you too, right? Like with your C++ background.

24:06 I did a bunch of other Perl and stuff like that.

24:09 But the return type is actually one of the greatest documentation features as well, because

24:16 sometimes you can try to, you can kind of figure out what the parameters are going to look,

24:21 think, you know, you can guess, but what's the return type? Is it going to be a list? Is it going

24:26 to be a tuple? Is it going to be a single element? What if there's more than one element? Yeah.

24:31 Having type hints around the return types is a great feature.

24:35 Yeah, absolutely. All right. Let me touch on a couple more and I see some listener comments in

24:40 the stream as well. Squiggly lines in your editor. Anyone like I just got this the other day. I,

24:45 I thought I was supposed to pass an ObjectId, the primary key in Mongo, but we'd overridden it. And

24:51 it's actually a string. It said, you're passing an ObjectId when you expect a string. I'm like,

24:55 Oh yeah, I guess I am. All right. Well change that. Right. That's really nice. Instead of that

24:59 being a runtime error. And he talks about the work with TypeScript and Anders Hejlsberg and what he did

25:06 to help build that. And TypeScript, like I said, is pretty neat. But he also points out that, you know,

25:11 the same company, Microsoft is developing powerful type checking and code completion for Python with

25:16 VS Code. And they're, you know, they have one of the Python steering council folks working on there.

25:22 And maybe that's Brett. And also possibly the Python creator himself, Guido. So do you think those two

25:28 people would be working on something that just provides the illusion of productivity? Probably

25:32 not. So let's see a couple of comments. Chris May. Hey, Chris, happy to see you out there. He says,

25:38 code completion is such a confidence builder too. I think it's so awesome because for me,

25:42 it's both amazing for beginners because they can type dot and go now what? And for experts,

25:47 they can just blast out code so quickly because you just type dot the few things and you know,

25:52 like you said, with confidence, you just keep going.

25:55 A lot of these features, you get them if everybody around you writing the code that you're using is

26:03 using type hints. You don't necessarily need to use type hints yourself, but then you're being a bad

26:09 citizen and not helping the people out that you're sharing code with. So if you don't share code at all,

26:14 and you're only working on projects with yourself, then, you know, go ahead. Don't use type hints.

26:19 It's up to you, right? Yeah, absolutely. Jousef, what do you think? Do you guys use

26:24 type hints on your project?

26:25 No, not really. Like it's not something that's in our conscious mind, I would say. I'm not sure if

26:30 that's also something really because you're an engineer. I wouldn't want to generalize, but

26:35 engineers are usually bothered with the problem itself rather than digging down on the types,

26:39 for example. It depends. It depends on what language we use.

26:42 Yeah, it's a bit of a computer science-y topic, I can see. But yeah, I just, like I said, I love how

26:48 it generates the content so much easier. Magnuson also commented, I love, for example, Pydantic,

26:56 but I agree with Ramalho, Luciano Ramalho, who was in this thread. Hopefully it won't be required

27:01 in Python to help people get started. Yeah. So I think the typing stuff is really interesting. Like

27:06 Pydantic, we've talked about a bunch. It's a super interesting example of really using typing

27:10 to generate cool data ingestion and processing. Like if you say I've got a Pydantic model and one

27:17 of its fields is a list of integers, but you give it a list and the things in the list happen to be

27:21 strings that could be integers, it'll automatically convert it and stuff like that is really fantastic.
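
The conversion described here can be sketched roughly as follows, assuming Pydantic is installed. The `Reading` model and its `values` field are made up for illustration.

```python
from typing import List

from pydantic import BaseModel


class Reading(BaseModel):
    # Hypothetical model: one field declared as a list of integers.
    values: List[int]


# The input contains strings that happen to look like integers.
reading = Reading(values=["1", "2", "3"])
print(reading.values)
```

With Pydantic's default (non-strict) validation, the strings are converted to integers. A value that cannot be parsed as an integer raises a validation error instead.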

27:26 Yeah, I think that's always going to be an add-on type of thing.

27:29 Yeah, even though I'm a fan of type hints, I don't use them all the time and I would be very opposed

27:35 to having them be required.

27:37 Yes, I would too. I would too. I don't think they need to be on the whole code base. I mean,

27:41 it depends if your goal is to say, I want to use them for mypy or mypyc and like completely generate

27:47 stuff. But if your goal really is to get a little bit of help with editors, just having it on the

27:52 boundaries, like here's the data access layer, the things that come out of there return whatever,

27:56 like you don't have to do anything else and the editors will pick it up and run.

28:00 Yeah.

28:01 All right. One quick question. What does a function return if there's no return? It returns None.

28:06 It returns None. So that's why you don't have to just say whether there's a return type. It always

28:09 returns None. All right. Next up, I guess we got the one I tried to open with there,

28:15 Jousef, is Open3D. That looks fun.

28:18 Yeah. This is basically a library which you could use in Jupyter, which I tried to use,

28:23 but somehow they at the moment have problems using Open3D. So what you can do is you call Open3D in

28:28 your Jupyter notebook and then have the point cloud visualized. However, there are some ways around it,

28:33 but Open3D, I think if I would start all over again, I would probably use Open3D to visualize my

28:38 point cloud, which I'm actually working with in my Jupyter notebook. I'm not sure if using a Jupyter

28:43 notebook is also something you would recommend personally, maybe Brian and Michael,

28:47 if you're a fan of Jupyter notebooks. I think it depends on the application, right?

28:50 Yeah. I think it depends as well. And to me, it really depends on what I'm trying to do

28:55 and the kind of code. Am I trying to explore data and does it have a really strong visualization

29:01 component or is it like a utility type thing? So for example, one of the things that I wrote

29:07 recently that I would never put into a Jupyter notebook, but I find really helpful is we've got

29:12 literally thousands of video files, MP4s and whatnot for the online courses. And in order to

29:19 import them, one of the things I have to tell the database is how many, how long in seconds is each

29:25 file and where does it live and stuff. So I've got a little script and I just say, go, go to this

29:29 directory and generate a little JSON output for all of the files and parse them and tell me how

29:36 long they are like that kind of app doesn't belong there. Right. It's just, that's a command line type

29:40 of utility type of thing. But if I want to visualize something like this, I think it may well be really

29:45 good for it actually. So I think it varies.

29:48 Yeah. There's a lot of application parts of my work that I think using a Jupyter notebook actually might be

29:55 more beneficial. So I'm often taking big, huge trace data and stuff for like a spectrum traces and

30:02 there, those could easily be driven from a Jupyter notebook and with the visualization stuff would be good.

30:09 Yeah. Yeah. Cool. So this thing is a set of both C++ and Python libraries for basically working with 3D meshes, right?

30:19 Mostly 3D data. For example, if you use a LIDAR, so when you work with a laser and this looks, for example, great.

30:26 I never watched the video, but if you scan objects in your surroundings, usually what you get is a point cloud

30:32 and which you can then visualize using Open3D. And the big disadvantage with point clouds is that

30:38 they're kind of unstructured. So you could have one matrix representing one point cloud and you could

30:43 have the same matrix switching two points, but the matrix would be different. This is also a problem

30:48 that a lot of papers try to tackle and make sure that, you know, get around the spot.
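
The ordering issue mentioned here can be shown with plain NumPy, without Open3D. This is only a sketch with a made-up three-point cloud: swapping two rows gives a different matrix even though the set of points is the same. Sorting the rows into a canonical order is one simple way to compare them, and is not necessarily what the papers in question do.

```python
import numpy as np

# A tiny hypothetical point cloud: one row per point (x, y, z).
cloud_a = np.array([
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
])

# Same points, but the first two rows are swapped.
cloud_b = cloud_a[[1, 0, 2]]


def canonical(points: np.ndarray) -> np.ndarray:
    # Sort rows by x, then y, then z so row order no longer matters.
    order = np.lexsort(points.T[::-1])
    return points[order]


same_matrix = np.array_equal(cloud_a, cloud_b)
same_points = np.array_equal(canonical(cloud_a), canonical(cloud_b))
```

Here `same_matrix` is False while `same_points` is True.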

30:53 Yeah. Nice. Yeah. The example video here is using Open3D for 3D object detection, which is, that's pretty wild.

31:02 Yeah. Nice.

31:03 The things people do these days.

31:04 I know. I think it's really interesting, all this image processing and analysis stuff. Yeah.

31:10 Good question, Brian, by the way. This is what I ask myself when I listen to Python Bytes. As an engineer,

31:15 what are you guys doing? It's great.

31:17 Absolutely. Cool. All right. Well, that's it for all of our main items. Brian, you got anything extra

31:25 that you want to throw out there?

31:26 Yeah. I just wanted a couple of things. One, 2021 has been exhausting so far. Yeah. I don't know if

31:32 anybody else has got the same experience, but wow. And also I've got a lot of extra projects,

31:38 side projects that I'm working on right now. Python Bytes is one of them, but there's other stuff going

31:43 on as well, trying to do more writing. And because of that, Test & Code has shifted to an every other

31:50 week cadence. So it's not going away. I know a lot of, oddly enough, I've had a lot of feedback in the

31:57 last couple of months of people saying, thank you for the podcast. I've learned so much. So I do not,

32:02 I don't want to shut it down. I want to keep it going and there's no plans on shutting it down.

32:06 It's just slowing down so that I have room in my life for other, other projects as well. So just

32:11 wanted to let people know that.

32:12 Yeah. Yeah. I try to, for Talk Python, batch it up and do a whole bunch, just to say this week, I'm just

32:17 going to get nothing done, but I'll do a ton of recording and then just roll them out. I had three

32:22 months of stuff done in like a week and a half. I was, I really needed a break after that, but then I,

32:26 I was good.

32:27 Yeah, cool.

32:28 Yeah. Cool. Well, thanks for the update. Jousef, anything you want to share on the

32:32 way out of the show?

32:34 I just want to say thank you for letting me or being able to participate in this quick and brief podcast.

32:40 Keep doing what you do guys. I follow you both on Twitter and what you share and what you do is really

32:44 amazing. So it's really inspiring for an engineer who wants to delve into the field of Python and

32:48 all fancy kind of things to, to listen to your podcast, taking your courses or following you

32:53 on social media. It's really great. You learn a lot and I actually have to learn more to be honest

32:58 myself.

32:58 You know, I have to learn more.

33:00 Yeah.

33:01 It never stops. It never stops.

33:03 Yeah. But that's awesome. Thank you so much. Really appreciate that.

33:06 How about you, Michael?

33:07 I have one quick thing, driven by yours, a comment you had last time. So Francisco Silva

33:14 pointed out, we had talked about some of the NumPy-ic, Pythonic, the idiomatic NumPy stuff that you

33:22 might do and how like, instead of looping over stuff, you can just like add say like two NumPy arrays

33:27 and it'll add them or you can, you know, dot product them and whatnot. Right. So one of the things you can do,

33:32 I guess we also talked about like ones and zeros to generate a prebuilt list of those.

33:37 So one of the things he talked about is the allclose method. So if you've got floating point numbers,

33:43 one of the things that's really frustrating is like, are these equal? Well, what does it mean for floating point

33:47 numbers to be equal, right? Like they could be so nearly the same, but not the same, right? They could be within

33:53 an insane amount of closeness, right? Like 10 decimal places and then a one, right? So allclose is like,

34:00 well, if they're within, you know, one thousandth of each other, consider them the same.

34:05 Well, allclose takes a bunch of parameters, so you can specify the tolerance though. So yeah.

34:10 Anyway, I thought that was cool.
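
A short sketch of the comparison being described, using `np.allclose` with its `rtol` and `atol` tolerance parameters. The sample values are made up.

```python
import numpy as np

a = np.array([0.1 + 0.2, 1.0])
b = np.array([0.3, 1.0])

# Exact comparison fails because 0.1 + 0.2 is not exactly 0.3 in floating point.
exactly_equal = np.array_equal(a, b)

# allclose treats values within a tolerance as equal (default tolerances here).
close_enough = np.allclose(a, b)

# The tolerance can be set explicitly: absolute tolerance of one thousandth.
within_thousandth = np.allclose([1.0], [1.0005], rtol=0, atol=1e-3)

# A much tighter tolerance rejects the same pair.
within_millionth = np.allclose([1.0], [1.0005], rtol=0, atol=1e-6)
```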

34:11 Yeah. Hey, while we're on the topic, I may as well throw out, I've got it. I've got it. So I tried to use this method

34:17 of using NumPy and, I ran into a problem. So I'm hoping some data science people can help me

34:22 figure out how to solve it. So my problem is just the simple thing. If I've got two, two arrays,

34:27 I want to see if all of the elements are element wise, less than or equal to the other element in the

34:32 other array. Okay. I can do that with NumPy, but what I can't, that assumes that all of the elements are

34:39 the same data type, like comparable. If, if there are strings thrown in there, it doesn't work. So

34:46 obviously, I don't know if it's obvious, but so I gotta, I had to do some cleanup at a time,

34:51 but I don't know what the most, the best way is. So reach out to me if you've got an answer.
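
One possible way to do the cleanup mentioned here, offered only as a sketch and not as the best answer: convert every element to a float up front, then do the element-wise comparison. The helper names are hypothetical, and this assumes any strings in the data are numeric strings.

```python
import numpy as np


def to_float_array(values):
    # Convert a mixed list of numbers and numeric strings to a float array.
    # float() raises ValueError for strings that are not numeric.
    return np.array([float(v) for v in values])


def all_less_equal(left, right):
    # True if every element of left is <= the matching element of right.
    return bool(np.all(to_float_array(left) <= to_float_array(right)))


lower = [1, "2", 3.5]
upper = [1, 2.5, "4"]

result = all_less_equal(lower, upper)
reverse = all_less_equal(upper, lower)
```

Here `result` is True and `reverse` is False. Data containing non-numeric strings would need a different cleanup rule, which this sketch does not attempt.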

34:55 Awesome. Yeah. I don't have an answer, but I'm sure people do. And, quick, quick comment,

34:59 here, this is the one on my show. Magnus Carlson says, tip: I found out about

35:05 Copier, an alternative to Cookiecutter that can be run later as well to update the project to a newer

35:12 template. That's pretty cool. I hadn't heard of that. And also the TOML spec has reached 1.0;

35:18 a parser might be added to the standard lib. Also haven't covered that, but that's cool news.

35:22 Thanks for sharing, you guys. Yeah. And I guess thanks for being here. Jousef, thanks for joining

35:28 us and Brian. Thank you as always, man. Thank you. Thank you so much, guys. Bye everyone. Bye.
