Transcript #39: The new PyPI
00:00 Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds.
00:05 This is episode 39, recorded August 14th, 2017.
00:10 I'm Brian Okken, and again, Michael is on vacation, and we have a guest host, and this week we have Mahmoud Hashemi.
00:18 Hey, Mahmoud.
00:19 Hi there.
00:20 Great to be here.
00:22 Yeah, you've been on Test and Code, and you've been on Talk Python a couple times.
00:27 Yeah, a couple of my faves, for sure.
00:28 Yeah, well, when I was looking up Talk Python, I noticed that you were on episodes 4 and 54.
00:34 Yeah, and I don't know, when Guido was on, you know, Michael was kind enough to ask my question, and I did like a panel thing.
00:41 I don't know, I guess, yeah, it's been really nice to have repeat appearances.
00:46 People recognize me by my voice now.
00:49 It's kind of strange, but like, I'm very appreciative at the same time.
00:52 That's good.
00:53 That's great.
00:54 And so, thanks a lot for helping to do this today.
00:56 Yeah, hopefully I can do Michael right, taking his spot here.
00:59 Well, let's just jump right in.
01:00 I'm really excited about your first topic.
01:02 Oh, sure.
01:02 So, let's see.
01:04 First up, I mean, one thing that's been on my radar, I'm not sure if you guys have talked about this before.
01:08 Like, sometimes I'm listening to Python Bytes, and it's a little bit garbled or something.
01:12 Have you guys tried calling decode?
01:14 I'm kind of curious, like, why it's not Python strs.
01:17 But one thing that's been on my radar is the new PyPI.
01:22 So, if you haven't been on distutils-sig, you may not have seen that there's actually a new PyPI, pypi.org.
01:32 And this is going to be the Python package index going forward.
01:37 So, this is what we've been calling Warehouse before.
01:40 Is that right?
01:40 So, Warehouse is the software that runs PyPI, you know?
01:45 Okay.
01:45 And so, yeah, it's a package index.
01:48 It's going to be where all of your wheels and sdists live.
01:52 And there's basically a lot of development that's happening here.
01:55 My friend Donald Stufft is doing an amazing job with his team.
01:59 Basically, yeah, we're up to 114,598 projects at the moment.
02:05 This even lists the number of files, almost a million files with 230,000 users.
02:12 And so, yeah, I would definitely check out this pypi.org for yourself.
02:15 But for the most part, I wanted to talk about how they're deprecating the old PyPI.
02:21 So, pypi.python.org is now basically just a read-only interface.
02:26 And if you've tried to upload a package recently, then you may have seen an error, HTTP 410,
02:33 which is like a 404, but this is 410 Gone, meaning it was here, but now it's gone.
02:38 And so, yeah, you basically make sure to use a new version of setuptools,
02:42 and it'll automatically start using the new one.
02:45 As long as your configs don't state otherwise, you might have to update a config.
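For listeners wondering whether their own config "states otherwise", here is a minimal sketch of one way to check. The sample `.pypirc` contents and the helper name `sections_using_old_index` are hypothetical; the only detail taken from the episode is that the old host, pypi.python.org, no longer accepts uploads.

```python
import configparser

# Hypothetical ~/.pypirc contents that still name the old index explicitly.
SAMPLE_PYPIRC = """
[distutils]
index-servers =
    pypi

[pypi]
repository = https://pypi.python.org/pypi
username = example-user
"""

OLD_HOST = "pypi.python.org"


def sections_using_old_index(pypirc_text):
    """Return the names of sections whose repository URL points at the old host."""
    parser = configparser.ConfigParser()
    parser.read_string(pypirc_text)
    stale = []
    for name in parser.sections():
        repository = parser.get(name, "repository", fallback="")
        if OLD_HOST in repository:
            stale.append(name)
    return stale


print(sections_using_old_index(SAMPLE_PYPIRC))
```

Any section this reports would need its `repository` line removed or updated. The packaging documentation lists the current upload endpoint, so it is worth checking there rather than hard-coding a URL from memory.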
02:48 But this is a tremendous leap forward in a lot of ways.
02:52 And they need some help doing it too, you know?
02:55 So, it's all open source on GitHub.
02:57 There are issues.
02:58 I'm working on one right now.
02:59 Yeah, it's got a lot of cool features.
03:02 Have you taken a look, Brian?
03:04 I've looked around a little bit.
03:05 Now, one of the things I've noticed, like right off the bat, is it says up at the top,
03:09 there's a big red bar that says...
03:11 I know, it's kind of scary.
03:12 Yeah.
03:12 So, do you know, I'm guessing eventually at some point they'll, the other interface will just redirect to here, or is there...
03:20 I mean, you know, cool URLs don't change.
03:22 Personally, in my view, I'd like it if they just kept it up and put the red bar over there,
03:27 that this is a, you know, archive version of PyPI.
03:31 But for now, all those URLs are still working.
03:34 And if you ask me, pypi.org has been in use for so long, because actually, if you've paid
03:39 close attention, a lot of your downloads, pip is downloading from the new one.
03:44 Oh, okay.
03:45 So, yeah, it's been in production a long time.
03:47 In fact, they just hit, I think, a petabyte a month in bandwidth downloads.
03:52 So, yeah, just for a sense of the cost there, I think it's like in the tens of thousands,
03:57 like 30, 40,000 a month to host PyPI.
04:00 And that's kindly donated by the Fastly CDN.
04:04 Should they stop feeling so generous, you know, we got to support our community somehow.
04:10 So, there is a donate button here.
04:13 But I think that right now, what they need most is sort of like people to work on cool
04:18 features, like one that I saw being worked on that I'm very excited for, not strictly pypi.org,
04:24 but same team, the Python Packaging Authority.
04:27 They are working on making a dependency graph between all packages.
04:31 So, if you've ever wondered what depends on what ahead of time, then this would enable that.
04:37 So, yeah.
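As a rough illustration of the "what depends on what" question, the sketch below inverts a small table of requirements. All of the data is made up except the Twisted-to-hyperlink dependency, which comes up later in the episode, and the function name is hypothetical. It says nothing about how the Python Packaging Authority's own dependency graph is built.

```python
from collections import defaultdict

# Hypothetical "requires" data; only twisted -> hyperlink comes from the episode.
REQUIRES = {
    "twisted": ["hyperlink"],
    "example-app": ["twisted", "example-lib"],
    "example-lib": ["hyperlink"],
}


def reverse_dependencies(requires):
    """Map each package to the sorted list of packages that depend on it directly."""
    dependents = defaultdict(set)
    for package, deps in requires.items():
        for dep in deps:
            dependents[dep].add(package)
    return {name: sorted(users) for name, users in dependents.items()}


print(reverse_dependencies(REQUIRES)["hyperlink"])
```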
04:39 How do I start working on it?
04:40 Do I go to the GitHub page?
04:42 Yeah.
04:42 So, I think it's github.com/pypa, or I think it might be /warehouse.
04:48 Yeah.
04:49 Okay.
04:49 So, and, you know, Donald has been very candid about like, you know, the areas that need
04:55 development and he's been working very hard.
04:57 He's at Amazon now and he spends some time working on stuff there.
05:02 Oh, one last thing, like distutils, right?
05:04 So, so they still, there's an email list called distutils-sig, which stands for special interest
05:10 group.
05:11 And so, distutils-sig, you can just go join the listserv and you can read the archive and
05:16 see the conversations they're having.
05:18 If you care about packaging, you're probably already on there.
05:20 But if you aren't, definitely subscribe.
05:23 Oh, I didn't know about it.
05:24 Yeah.
05:25 So, we'll try to drop a link in the show notes for that.
05:28 So, okay.
05:30 Well, that's, that's really cool.
05:31 Pretty good for first topic, you know, I don't know.
05:33 Yeah, definitely.
05:35 And I, and the one, one thing I want to add is I know that Donald has been vocal before about
05:41 how awful the previous code was.
05:43 Yeah.
05:44 I mean, it's, it's pretty old code, right?
05:46 Like, I don't even know.
05:47 I, it may not predate WSGI, but it's pretty old.
05:50 You've looked at the new code.
05:52 I've looked at the new code.
05:52 I can talk about the new code if we got a second.
05:54 So, I've, I've looked at it.
05:55 I've used it.
05:56 It's got 100% coverage.
05:58 It's got a lot of CI stuff set up.
06:00 It uses Docker.
06:01 I had a little bit of trouble, like, you know, with the make-based approach to running the
06:07 thing, but it's pretty complex.
06:09 Like, it runs, I think, Elasticsearch and all this stuff.
06:12 So, basically, yeah, you just.
06:14 People shouldn't be afraid to help out just because they've heard bad things about the
06:17 old code.
06:18 No, the new code is, it's pretty idiomatic, I think.
06:21 And, you know, if you're familiar with SQLAlchemy, and I think it uses also maybe like
06:26 Pyramid, I think.
06:28 And it looks like the tests are in pytest, too.
06:31 Yeah, the test is definitely in pytest, which is, frankly, the only way I've heard and have
06:36 also found myself.
06:38 So, yeah, it's been good.
06:40 Oh, I could talk about this for a long time, but let's move on to the next topic.
06:44 Absolutely.
06:44 So, one of the things I just read about this yesterday.
06:47 There's a, I read about it on Make, I think it's the Make website, but it's CircuitPython
06:54 is now going to be, is supported by a whole bunch of Adafruit hardware.
06:59 It's great news for hardware hackers and also tinkerers like myself.
07:03 And so, we'll put a link in the show notes to the Make article.
07:06 But there's also, so I had heard Adafruit announced CircuitPython in January.
07:12 And it's a, it's an open source, it's based on MicroPython.
07:16 So, CircuitPython is also open source, but it's, so I'm not quite sure how they differ.
07:22 But they've added some things to make it easier to control hardware.
07:25 And they already had, like, two devices, the Metro M0 and Feather M0 Express versions, that support
07:35 CircuitPython right off the bat.
07:37 And they're, I guess they're working on a Circuit Playground Express.
07:41 All of these look like really fun things.
07:43 But the thing that really caught my attention was Gemma M0 that was announced at the end of
07:49 July.
07:50 And this thing is, like, the size of a quarter.
07:52 It's a little small thing that you can make wearable software projects with, like LEDs and
07:58 whatever.
07:58 And you just plug it in into your computer.
08:02 And you instantly, it's like an extra drive.
08:04 You can see a main.py and it just, you can just start programming Python right away.
08:09 Yeah, right.
08:10 So, basically, it just, like, it sort of functions kind of like a USB drive.
08:13 And there's a single main entry point in there.
08:16 And you can just modify it.
08:17 And then, you know, you don't need to install anything.
08:20 Or anything like that.
08:21 Yeah, there's no loading.
08:22 Apparently, it does support Arduino.
08:24 But you don't, like, right off the bat, you don't have to install anything.
08:28 You can just start programming.
08:29 And these are, they, right now, they're currently out of stock.
08:32 But I'm sure they get new stuff in pretty quick.
08:35 But they're, it's under 10 bucks to start programming some wearable programming.
08:39 So, I definitely have to get one of these.
08:41 Yeah.
08:41 I can't wait to start wearing some running Python.
08:43 That'd be taking it to the next level.
08:45 And I'm also going to link to what I thought was great was they realized that, I mean, they
08:51 are encouraging people to use Python if they can for programming hardware.
08:55 But they realized that a lot of people are new to the Python community.
08:58 So, there's a page called Creating and Sharing a CircuitPython Library.
09:05 And it's got a whole bunch of great links, like, basically just telling people what, what
09:09 we, when we call it, say library, we mean a package or a module with a setup file and doing
09:15 it all right.
09:15 And there's little intros to GitHub and Read the Docs and Travis.
09:20 So, is it like, when you say package or module, is this their own format?
09:24 Or is this like Python packages, wheels, that sort of thing?
09:27 Yeah, it's just Python stuff.
09:28 But it's just really quick tutorials to get people up to speed fast.
09:32 Sure.
09:33 So, it's like sort of a full, it's got like an end-to-end thing.
09:36 It doesn't just send you left and right to other sites.
09:39 Yeah, right.
09:40 It's really telling you everything.
09:42 And it's, they're pretty condensed.
09:43 Actually, they did a pretty good job condensing all that information.
09:47 Yeah, you don't need the whole context and history of Python packaging.
09:51 It's been, we've come a long way since, you know, eggs and that sort of stuff.
09:55 Yeah.
09:55 And then one of the things that is kind of interesting is they have a concept of bundles.
10:02 And really, all the bundle is, is a bunch of installable Python packages that are zipped up into a bundle.
10:10 Hmm, sure.
10:11 We normally don't really care about that because on a larger computer, it's not that big of a deal.
10:18 But these little tiny devices, you still have to care about how big it is.
10:22 So, you're only, you might want to get everything that somebody cool has made, but you don't need it all.
10:28 You just need like the little part that, you know, blinks the LED for you or whatever.
10:32 Sure.
10:32 So, it sort of freezes it all together.
10:34 Yeah.
10:34 These embedded applications are interesting.
10:36 So, now that, so I maintain this one library called hyperlink.
10:39 And I guess it's pretty widely used because Twisted depends on it.
10:43 And so, I've gotten some interesting feedback of a few things.
10:47 Like one code review I just went through.
10:49 I promise this is related.
10:51 Basically, I'm using pytest and I'm writing my assert statements.
10:55 And, you know, I love that pytest rewriting with the great error messages and so forth.
10:59 But I got a comment on my code review that these tests are not runnable in an embedded environment
11:05 because they will run with -OO, which elides all of those assert statements.
11:10 And I'm like, well, you're kind of running the tests wrong if you're not using pytest.
11:15 But in these embedded environments, I don't know, maybe the convention is different.
11:20 So, when you get yours, definitely like test it out.
11:22 Maybe you'll have to put a little caveat on your pytest recommendation if that's not what we can do on hardware.
11:30 I don't know.
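The effect described in that code review can be reproduced without a separate interpreter, because the built-in `compile()` takes an `optimize` argument that mirrors the `-O` and `-OO` flags. This is a small sketch: the `check` function and the `run_check` helper are made-up names, and a real test suite would behave the same way only for plain `assert` statements.

```python
# A tiny "test" whose only check is a bare assert statement.
SOURCE = """
def check():
    assert 1 + 1 == 3, "math is broken"
    return "no assertion ran"
"""


def run_check(optimize):
    """Compile SOURCE at the given optimization level and call check()."""
    namespace = {}
    code = compile(SOURCE, "<example>", "exec", optimize=optimize)
    exec(code, namespace)
    try:
        return namespace["check"]()
    except AssertionError as exc:
        return f"AssertionError: {exc}"


print(run_check(0))  # level 0 corresponds to a plain `python` run
print(run_check(2))  # level 2 corresponds to `python -OO`
```

At level 0 the assertion fails as expected. At level 2 the assert statement is compiled away, so the failing check passes silently.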
11:30 Oh, that's interesting.
11:31 Yeah.
11:32 Yeah, I'll definitely have to check that out.
11:34 So, I don't want the hardware people to not buy my book.
11:38 That would be terrible.
11:38 Well, that's the thing.
11:40 With something like hyperlink, which is for URLs, I'm like 99.9% sure it's going to run exactly the same everywhere.
11:48 So, I'm confident that if it runs on my machine, it runs on Travis CI, it runs on CodeVeyor or whatever.
11:52 It's AppVeyor, I think.
11:55 It'll be fine.
11:56 But at the same time, hardware people can be sticklers, as I'm sure you know.
12:01 So, I respect that.
12:03 I respect that.
12:04 Cool.
12:04 Yeah, neat.
12:06 Well, what do we got next, Mahmoud?
12:08 Oh, right.
12:08 It's back to me.
12:09 So, I don't know.
12:10 I mean, so I spend a lot of my time pretty deep into development of all sorts of infrastructural sorts.
12:18 And I find myself subscribed to python-dev, python-ideas, distutils-sig.
12:24 And, you know, you can't read everything there and still have a life.
12:27 So, only a few things catch my eye.
12:30 But this one in particular caught my eye because my friend Hynek has this great library called attrs.
12:34 If you haven't heard of it, my other friend Glyph has a whole blog post that tells you why you have to use this library, attrs.
12:43 And it's basically class decorators that make writing high-level classes very easy.
12:50 So, it sort of derives from this sort of tradition of namedtuples, right?
12:55 Raymond Hettinger had this great idea to make namedtuples, which let us define a class-like structured thing within just one line.
13:03 But the problem with namedtuples is that if you want to add methods to it, then you have to inherit from it.
13:09 And they're immutable by default.
13:10 And they don't really, even though they generate a dunder init for you, they don't do a whole heck of a lot of validation.
13:17 So, attrs comes along, fixes all these things, adds a bunch of other cool functionality, and does it with class decorators.
13:23 It doesn't pollute your final object with anything you don't want, right?
13:27 It doesn't, because you don't inherit from anything.
13:29 So, you just inherit from object.
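The namedtuple trade-offs listed above can be seen with the standard library alone. This is only an illustrative sketch: `Point` and `NormPoint` are made-up names, and attrs itself is not shown.

```python
from collections import namedtuple

# One line gives a class-like record with a generated __init__ and named fields.
Point = namedtuple("Point", ["x", "y"])


# Adding a method means inheriting from the generated class.
class NormPoint(Point):
    __slots__ = ()  # keep instances as lightweight as the base tuple

    def norm_squared(self):
        return self.x ** 2 + self.y ** 2


p = NormPoint(3, 4)
print(p.norm_squared())

# Instances are immutable: assigning to a field raises AttributeError.
try:
    p.x = 10
    mutated = True
except AttributeError:
    mutated = False
print(mutated)

# The generated __init__ does no validation, so any values are accepted.
odd = Point("three", None)
print(odd)
```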
13:31 After Glyph's post took off or something, the core Python devs sat up, took some notice of this, and said,
13:38 maybe we have been neglecting a higher-level interface for quickly defining classes.
13:42 You know, you just want to have four or five fields all sort of batched together.
13:48 And you don't want to have a lot of functions that everywhere have to define 15 arguments.
13:52 So, like, how can we quickly, in a nice, concise, Pythonic way, define a Python class?
13:59 And they came up with this new thing, which is still, I guess, kind of, this is what I mean, like, I don't know if this is a little bit too deep underground, but there's this GitHub repo that Eric V. Smith, who is a Python core dev, has, called dataclasses.
14:13 And the issues of this have been really interesting to watch because Hynek and a bunch of core devs have been kind of debating, like, hey, should we just use attrs?
14:23 If attrs is getting so popular, should it just be part of core Python?
14:28 And, you know, people seem to like it.
14:30 Why make something that's so close to it? That sort of thing.
14:33 There's sort of a draft PEP inside of the Data Classes repo, and there's some examples of how it's used.
14:40 Has some semantic differences, has some syntactic differences.
14:43 I think that it's pretty interesting to watch.
14:45 And, in fact, they seem to be encouraging more experimentation in this area.
14:50 Even though I like attrs, they seem to want even more options, at least from themselves.
14:56 So, I don't know.
14:57 I had a good time reading the issues.
14:58 Maybe other people enjoy it too.
15:00 Yeah.
15:00 So, is this, it's similar to attrs then?
15:03 Yeah.
15:03 It's pretty similar to attrs.
15:05 There are, the differences are sort of fine enough that you have to kind of look closely.
15:11 Basically, I think that what it is, is like, there's actually an issue called, why not just attrs?
15:18 And they sort of explain that they want to use, like, the new, I think, type hint syntax type stuff.
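For a sense of what the type-hint-based syntax looks like, here is a short sketch using the `dataclasses` module, which later shipped in the standard library with Python 3.7. It shows the released API, which may differ in details from the draft discussed in this episode, and the `Package` class and its fields are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Package:
    # Fields are declared with type-annotation syntax.
    name: str
    version: str
    downloads: int = 0

    def label(self):
        return f"{self.name}=={self.version}"


pkg = Package("example-pkg", "1.0")
pkg.downloads += 1  # mutable by default, unlike a namedtuple
print(pkg)
print(pkg.label())
```

The decorator generates `__init__`, `__repr__`, and `__eq__` from the annotated fields, and methods are added directly on the class with no base class involved.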
15:26 Okay.
15:26 So, yeah.
15:27 Other people, like, kind of said that, hey, maybe, like, naming-wise, dataclasses is a little bit clearer than attrs, because someone who is a new Python programmer doesn't know that attr means attribute or something like that.
15:42 That's true.
15:42 So, it has some syntactic differences, yeah.
15:45 And there are some big names in this discussion.
15:48 There are, there are.
15:49 So, that's what I mean.
15:50 It's sort of like the inner circle, right?
15:52 This is kind of like the sort of stuff that I have to follow.
15:56 Oh, that's awesome.
15:56 To be on the edge here.
15:58 And it happens kind of behind the scenes.
16:00 But I really do encourage people to join these email lists if you want to see the action happening.
16:05 You know, you don't have to be a spectator.
16:07 Or you don't have to sit maybe in the nosebleed section of the arena on open source, right?
16:12 You can get up close on the, on, like, you know, get the, get the front row seats.
16:16 And before you know it, you'll actually get involved.
16:18 It'll be fun.
16:19 Yeah, that's great.
16:20 Oh, thanks for bringing that up.
16:21 That's cool.
16:21 Well, speaking of trying to get involved, unless you've had your head under a rock, data science is a thing.
16:28 Is it, really?
16:31 It isn't something that I have to do, use on a daily basis, but it's definitely something I want to pay attention to.
16:37 And I ran across, there's a lot of books and tutorials that are huge because it's a huge topic.
16:44 And I ran across an article called Pandas in a Nutshell.
16:48 And it's, I like it because it's a, it's a Jupyter Notebook style post.
16:53 So you can just see the code working.
16:55 And it's mostly tutorial by example with, it's just a little bit of extra code for explanation.
17:01 And the big part of it is really just talking about a couple of data structures.
17:05 It's just talking about the series data structure, which is a one-dimensional array with indices.
17:11 So just kind of like a vector.
17:14 And then the data frame, which is like a two-dimensional array.
17:18 And all the sort of common things that you need to do with it, like specifying a custom index or adding, combining two series or with matrix stuff, adding columns, adding a column that's based on another column.
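The operations listed above look roughly like this in pandas. This is a small sketch with invented numbers and column names, not code taken from the article.

```python
import pandas as pd

# A Series is a one-dimensional array with an index; here the index is custom.
q1 = pd.Series([10, 20, 30], index=["a", "b", "c"])
q2 = pd.Series([1, 2, 3], index=["a", "b", "c"])

# Adding two Series lines the values up by index label.
total = q1 + q2

# A DataFrame is a two-dimensional table built from columns.
frame = pd.DataFrame({"q1": q1, "q2": q2})

# A new column computed from existing columns, with no explicit loop.
frame["share_q1"] = frame["q1"] / (frame["q1"] + frame["q2"])

print(total)
print(frame)
```

The last assignment works on whole columns at once, which is the spreadsheet-like style the discussion goes on to compare with Excel.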
17:36 Then this sort of stuff sort of seems like Excel, like working on a spreadsheet.
17:40 I think for a lot of people, like that is the natural next step, you know, when they want to get into programming.
17:46 It's either going to be doing, what is it, like, you know, a Visual Basic script of some sort inside of Excel or, you know, maybe moving into Python.
17:54 Yeah, and I guess that's one of the things I like about this little nutshell article is that it's, if somebody is already doing some things in spreadsheets and they want to switch to working with pandas, this might be a pretty good stepping point to try to get things going.
18:10 And it's actually something I'm going to grab some of the concepts in here to try to deal with some of the large amounts of data that I deal with on a daily basis as well.
18:19 Oh, for sure.
18:20 So I haven't used, and I bring this up because I'm just starting.
18:23 I'm trying to use pandas on a daily basis now.
18:27 And it is, like, I've actually faced a lot of the same challenges.
18:30 Just because it's Python doesn't mean that it, you know, doesn't require some sort of paradigm shift in your thought.
18:37 It's like thinking about data frames is very different than thinking about lists in Python or dictionaries in Python.
18:43 It's somewhere between Python and, like, full-blown relational databases.
18:48 And so you do have to change the way you think how to approach a problem, especially if you want to get some performance out of the thing, because it has all this great broadcasting logic that it can perform.
18:58 But it's not going to work if you just iterate over it in for loops.
19:02 Yeah, and I guess that's where the data frames and series stuff comes in is because you want to do some computation on everything and or, you know, searching on stuff.
19:13 So it's kind of like a combination of a database and an in-memory database and something else.
19:19 Where I work, some of our data scientists are, you know, coming from an R background.
19:24 And the data frame is based on an R construct, I believe.
19:27 So they, you know, find it quite natural.
19:30 And the Python is what they sort of struggle with.
19:33 And they come to me for that.
19:34 But a Python person would want to ramp up on the data frame itself.
19:38 And so this notebook seems like a great option to do that quickly.
19:41 Yeah.
19:41 So that's just a quickie.
19:43 So that's it.
19:45 Your last topic.
19:46 Oh, already.
19:47 So, yeah, basically, just yesterday I was at this conference, PyBay 2017.
19:54 It's sort of the Bay Area, Silicon Valley, regional Python conference.
19:58 Only the second annual one.
19:59 There's, it's surprising how long it took to spin up here.
20:02 Meanwhile, PyOhio has been going for who knows how long.
20:05 So anyways, but it was a great conference.
20:08 Almost 500 developers, pretty good turnout.
20:12 And a lot of great topics covered. I gave a packaging talk.
20:17 But the thing I'm going to talk about today is actually the opening panel was on static typing.
20:24 And it was quite an interesting mix.
20:27 They had, first of all, it was very international.
20:30 They had people from Germany, Russia, Poland, USA, and Netherlands.
20:33 It seems like Europeans are big fans of static typing for whatever reason.
20:38 Guido included.
20:39 So, yeah, they had people from, I think, let's see, PyCharm, University of California, Berkeley.
20:47 Then also Quora, Google, and I think another guy too.
20:52 So it was a really nice cross-section of the industry and also the world.
20:58 And they just talked about the state of static typing.
21:02 So right now, just to bring you up to date, I'm not sure how recently you covered this stuff on the podcast.
21:06 But there are currently three or four static type checkers.
21:12 So in Python 3, you can specify your types however you'd like.
21:17 Built into the language, it's not going to do a lot of complaining in case types don't match.
21:23 First of all, at runtime, nothing is checked.
21:26 So if you want to check it, it would be at a compile time step.
21:30 The annotations are still there at runtime.
21:33 And then you have a static type checker, the most popular of which is mypy, run over that and check it.
21:40 Kind of like a linter or any other, I mean, static analysis tool.
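The point that annotations are kept at runtime but never enforced can be seen in a few lines. This is a minimal sketch; the `total` function is made up for illustration.

```python
from typing import List


def total(values: List[int]) -> int:
    """Sum a list; the annotations document intent but are not enforced."""
    result = 0
    for value in values:
        result += value
    return result


# The annotations are stored on the function object at runtime...
print(total.__annotations__)

# ...but nothing checks them when the function is called.
print(total([1.5, 2.5]))
```

The call with floats runs without complaint. A static checker such as mypy, run as a separate step over the source, would be expected to flag it as a mismatch with `List[int]`.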
21:44 And so there are other ones too, though.
21:48 Google has one that is not super well documented, but they use it internally.
21:54 Then PyCharm has this functionality as well, which is also kind of built from scratch.
22:00 And they made a pretty good case why you would want one built into PyCharm,
22:05 which is that basically it can do incremental checking.
22:08 So while you're still writing, it can do sort of partial checks, maybe a little bit better than mypy.
22:13 Oh, right.
22:14 The last person on the panel, Łukasz Langa from Facebook.
22:18 He also comes to my meetup.
22:20 Anyways, so yeah, he's very opinionated about types.
22:23 We'll get to that in a second.
22:24 One that wasn't talked about was Pylint.
22:27 So I was actually blown away.
22:28 I updated my Emacs config recently, and I sort of integrated some more linting stuff.
22:34 And the default Pylint these days can do an amazing amount of inference.
22:38 It'll tell you you have the wrong number of arguments.
22:41 It'll tell you that, like, oh, this default doesn't match that type.
22:45 It'll do so many different things, in addition to its standard, very opinionated idea of how many arguments a function should even have and that sort of thing.
22:55 Anyways, so those are our four sort of type inference engines.
22:59 And they all are slightly different.
23:01 But everyone seemed to get along pretty well on stage.
23:04 And they talked about, you know, potentially in the future actually merging these things and making a PEP that would allow them to all sort of comply together, maybe even turn into a single project.
23:15 So that was really nice to see.
23:17 And one of the most interesting questions was basically from the audience.
23:24 They said, like, well, what is the real point behind the static typing?
23:28 Like, what is the biggest benefit that you see?
23:31 And there was a little bit of divergence on this, right?
23:33 Some people like it for the strictness of it all, being, you know, kind of the dictator of your own code base or whatever, right?
23:41 But everyone else seemed to be pretty much on the same page that this is for human readability.
23:48 This is a sort of documentation that can then be checked automatically at a rather large scale.
23:54 So it's attached to the function, but it's more than just a doc test.
23:59 And so the interesting side effect of this is that they, even though they all work on static typing stuff, they have a pretty nuanced view of how much static typing you should apply.
24:10 So they say that, like, you know, maybe a list of a certain type, right?
24:17 But actually defining, say, a completely recursive type is, one, not supported.
24:22 And two, maybe not even that desirable because you don't want your function signatures to get super, super complex.
24:28 So, yeah, I mean, it was interesting that they thought the human side of this was the most important part as opposed to, say, like a Haskell programmer or something where they want the mathematical correctness of it all.
24:40 It's also interesting that there's, I would have liked to listen to the discussion of how much you should use of it.
24:47 Well, it was at LinkedIn.
24:48 I think that they recorded it.
24:49 It should go up pretty soon.
24:50 Yeah, I'll definitely, you know, it was only a couple of days ago, but once the video is available, I'll maybe send it to you.
24:56 You can add it to the show notes.
24:57 Yeah.
24:58 Some interesting side effects of this, by the way, like some things to consider.
25:01 So Cython does not support the new Python type syntax.
25:06 So even though all these guys are kind of on the same page and buddy-buddy, like, you know, for us, people who really like Cython and have used it to achieve a lot of performance and type correctness to some degree, are a little bit out of luck at the moment.
25:19 There, I think that people are working on making a pull request to it or something that would support, add support for this.
25:24 But it's such a big change to the syntax.
25:27 And Cython has its own type syntax, which is less focused on semantic types than this is, and more focused on being in line with C types, which allows you to have more compact memory usage.
25:42 And the people on the panel were actually pretty clear that the static types advantage is not in performance.
25:48 So a project like PyPy, which actually can use types to achieve higher performance, they find that the JIT is faster without taking hints from the user in the code.
25:59 So it just disregards this stuff.
26:00 Oh, interesting.
26:01 Yeah.
26:01 Because the JIT has the actual types.
26:03 So just a real quick thought experiment.
26:06 Like, imagine that I say, I'm going to pass you a list of integers.
26:10 That list is three integers long.
26:13 Okay.
26:13 I can just check them.
26:14 One, two, three.
26:15 All integers.
26:16 Good to go.
26:16 No type error.
26:17 Right.
26:18 But if I pass you a list of 20,000 integers, right, every time I pass that to you, I have to check that every single one is an integer.
26:25 Otherwise, like, you know, I want to have a type error.
26:27 That sort of thing is going a little bit against the spirit of Python and being like sort of practical and duck typey and whatnot.
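The thought experiment can be made concrete. A runtime check of "list of integers" has to visit every element on every call, so its cost grows with the length of the list. The helper below is a hypothetical sketch, not something any of the type checkers mentioned here do.

```python
def require_ints(values):
    """Raise TypeError unless every element is an int; return how many were checked."""
    checked = 0
    for value in values:
        checked += 1
        if not isinstance(value, int):
            raise TypeError(
                f"element {checked - 1} is {type(value).__name__}, not int"
            )
    return checked


print(require_ints([1, 2, 3]))
print(require_ints(list(range(20000))))
```

Three checks for a short list, twenty thousand for the long one, repeated each time the list is passed. A static checker does that work once, ahead of time, over the source code instead.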
26:36 So a friend of mine from Intel, you know, was sitting next to me and he was saying how he came to Python so he wouldn't have to type everything.
26:44 But thankfully, you don't have to type everything.
26:46 Like the standard library itself, for instance, all the type definitions for that are available in this joint typeshed repo that all of these static type people sort of built together.
26:58 And I'll link to that in the show notes for sure.
26:59 Yeah.
27:00 My favorite use so far that I've come across for my own work is putting type hints in interface areas like an API module.
27:11 That's how you interact with the package.
27:12 So those are great places for type hints.
27:14 Oh, for sure.
27:15 And so wait, are you saying that so there is this old thing like they're trying to get rid of it.
27:20 Basically, Python has these sort of stub files, these interface files.
27:23 Some people call them the header files for Python.
27:26 Like I think it's a .pyi file.
27:28 Okay.
27:28 .pyi.
27:29 I was just thinking like I've got a package that has a whole bunch of internal code, but it has like an API module that people should interact with from the outside world.
27:42 That's a great place, pretty much any interfaces where it's not you that's going to use it, but somebody else.
27:50 Those are great places to put type hints if it matters.
27:52 Oh, definitely.
27:53 Definitely.
27:53 Cool.
27:55 But I'm pretty new to it too.
27:56 So thanks for bringing that up.
27:58 That was very interesting.
27:58 Yeah.
27:59 Yeah.
27:59 And I mean, I think that they're still changing this stuff quite a bit, right?
28:03 So I, you know, early adopters go nuts.
28:05 But for the rest of us that like a little bit more boring technologies, you know, I'm going to go ahead and let the auto inference engine of Pylint figure things out for me.
28:13 I'm not going to, you know, jump on the bandwagon so quickly.
28:16 And I'm glad you brought Pylint up.
28:17 I've been sort of dismissing it because I've been using Flake8.
28:22 But I'll have to take a look at Pylint again.
28:24 Oh, yeah.
28:24 They've definitely ramped up development on that again.
28:27 I mean, you have to, for me anyways, right?
28:30 I just blacklist a lot of the errors because I kind of don't agree with every single thing that they test for.
28:35 But they make it pretty easy to do.
28:37 You just change it in an INI file.
28:38 No big deal.
28:39 Last topic, again, comes back to me finally getting my head out of thinking about pytest 24 hours a day.
28:47 And one of the things I want to start looking at is some of the web frameworks like Django and Flask.
28:55 I haven't played with them much personally.
28:57 And there's a bunch of personal projects and work projects I'd like to do with them.
29:01 And also quite a few people that listen to Test and Code are web people.
29:07 And so just to kind of get more understanding of that, I'm trying to learn more frameworks.
29:12 And one of the things that I've had a hard time getting my head around is ORMs or object relational mappers.
29:18 So luckily I ran across an article on Full Stack Python, which is Matt Makai's site.
29:26 Amazing site.
29:26 Yeah.
29:27 And basically it's Full Stack Python.
29:30 I don't remember what it's called, but I think it's just object relational mappers.
29:35 And it goes through what they are.
29:38 So an ORM is some code that automates the transfer of data from your internal Python objects and classes to database tables.
29:50 And they're useful so that you can write Python code instead of writing SQL queries.
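A toy version of that idea, using only the standard library's sqlite3 module, might look like the sketch below. The `User` class, the `UserMapper` class, and the `users` table are all hypothetical, and this is a hand-rolled illustration of the mapping concept, not how SQLAlchemy, Django, or Peewee are implemented.

```python
import sqlite3

class User:
    """Plain Python object mirroring one row of a hypothetical users table."""
    def __init__(self, name, email, id=None):
        self.id = id
        self.name = name
        self.email = email

class UserMapper:
    """Toy mapper that translates between User objects and table rows."""
    def __init__(self, conn):
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS users "
            "(id INTEGER PRIMARY KEY, name TEXT, email TEXT)"
        )

    def add(self, user):
        # Object -> row: the caller never writes the INSERT statement.
        cur = self.conn.execute(
            "INSERT INTO users (name, email) VALUES (?, ?)",
            (user.name, user.email),
        )
        user.id = cur.lastrowid
        return user

    def get(self, user_id):
        # Row -> object: the caller gets a User back, not a tuple.
        row = self.conn.execute(
            "SELECT id, name, email FROM users WHERE id = ?", (user_id,)
        ).fetchone()
        if row is None:
            return None
        return User(id=row[0], name=row[1], email=row[2])

conn = sqlite3.connect(":memory:")
users = UserMapper(conn)
saved = users.add(User("Ada", "ada@example.com"))
loaded = users.get(saved.id)
print(loaded.name, loaded.email)
```

Code that calls `add` and `get` works with Python objects only; the SQL lives in one place behind the mapper.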
29:56 And he talks about that and then also talks about why you need them and some downsides.
30:01 And yeah.
30:03 So the downsides actually were interesting.
30:05 I didn't think that anybody would talk about what's wrong with using ORMs.
30:09 Yeah.
30:09 I mean, realistically, there are some definite engineering trade-offs.
30:13 So what did he say?
30:13 Well, he said a few things. One is impedance mismatch, which, coming from the electrical world, I was like, impedance mismatch?
30:22 So that's like 50 ohms to 75 ohms, right?
30:25 Yeah.
30:25 Yeah.
30:25 But it's basically that the way a developer is using the objects can be different from how the data is stored and joined in the tables in your database.
30:36 And especially if you've set up the tables in a way that's not like, it's contradictory to how it's being used all the time.
30:44 It might be slow, and maybe reshaping your data might speed that up.
30:50 And then potential for reduced performance.
30:53 And this isn't surprising to me.
30:55 If you stick some code in the middle, it's not free.
30:59 It's got to run.
31:00 And then also shifting complexity from database to the application code, which this is something that I didn't quite understand right off the bat.
31:09 But if you think about it, it's not too bad.
31:10 But databases are complex pieces of software that have things like stored procedures and a whole bunch of fancy join math and stuff.
31:21 Right.
31:21 Right.
31:21 That might not be supported by an ORM.
31:24 So if you have to do that stuff, you have to do it in your application instead.
31:29 So it's, it's using, using a database in a simpler way, but that complexity has to go somewhere and it'll go in your application code.
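One way to picture that shift is with a simple aggregation. The sketch below uses sqlite3 and a hypothetical `orders` table, and computes the same per-customer totals twice: once with the database doing the grouping, and once by fetching plain rows and grouping in application code.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("ann", 10), ("bob", 5), ("ann", 7)],
)

# Option 1: the database does the grouping and summing.
in_database = dict(
    conn.execute("SELECT customer, SUM(amount) FROM orders GROUP BY customer")
)

# Option 2: fetch plain rows and do the same work in application code.
in_application = defaultdict(int)
for customer, amount in conn.execute("SELECT customer, amount FROM orders"):
    in_application[customer] += amount

print(in_database)
print(dict(in_application))
```

Both give the same totals, but in the second version the grouping logic is code you now own, test, and maintain.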
31:36 Yeah.
31:37 Almost certainly.
31:37 But I mean, until you get like database specialists, then, you know, it makes it a little bit easier for you as, you know, a sole developer, for instance.
31:46 Yeah.
31:47 So I punted at first and used document databases because I didn't have to think about ORMs right off the bat.
31:53 But, but I mean, so, so, but the thing is that an ORM, like he's correct.
31:57 Like a database is definitely a very advanced, complex tool, but a lot of that advancement and complexity you retain even when using an ORM.
32:05 For instance, a lot of document databases don't have great transaction models, don't have great, you know, sort of multi-version concurrency models.
32:13 And, you know, so when they put all that work into Postgres or even like MariaDB or something like that, you can, just by using an ORM, it seems almost as simple as a document database, but you get that operational, you know, feature.
32:27 Yeah.
32:28 I'd definitely heard of SQLAlchemy, or "Sequel Alchemy," but I hadn't heard of a couple of the others that he listed here, Peewee and Pony and SQLObject.
32:39 Have you used any of these?
32:41 Yeah, so SQLAlchemy is definitely my go-to and I'll talk about why in a second.
32:47 But yeah, I mean, I've used Django's ORM because I did the Django tutorial and that's one of the first things they teach you.
32:53 Django has a serviceable ORM, but there are some issues with it that SQLAlchemy actually does a much better job with.
33:00 And I have used Peewee, in fact.
33:02 I like Peewee.
33:03 It's sort of like a simplified version of Django.
33:06 In my opinion, it basically says like, look, if you're not going to be SQLAlchemy, then, you know, you can just be plain simple.
33:13 And it does a pretty good job.
33:15 But these days, SQLAlchemy has gotten so good that, you know, I just reach for that every single time I'm going to work with a relational database in Python.
33:24 Okay.
33:24 So one thing that SQLAlchemy has is that it sort of has this working copy of all the models and they end up being kind of like singletons within a given process space.
33:35 So with Django, you can actually get two copies of the same thing from the database within the same request or the same process.
33:45 And that means that basically concurrently somewhere else in your program, it could change something, save it.
33:51 And then when you change it in your request handler you're actually trying to work on, that will overwrite the previous change.
34:00 You know, like if you change column A in one thread and column B in another thread, whichever thread saves first is going to overwrite the other unchanged value.
34:09 So there's a setting that's off by default, I think, in Django called ATOMIC_REQUESTS.
34:15 And you have to enable that to prevent that sort of situation.
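The overwrite scenario can be reproduced without any framework. The following sketch uses sqlite3 and a hypothetical `item` table; `load` and `save` are stand-ins for a mapper that hands out independent copies of a row and writes back every column. It simulates the pattern being described and is not Django code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, a TEXT, b TEXT)")
conn.execute("INSERT INTO item VALUES (1, 'old-a', 'old-b')")

def load(item_id):
    # Each call returns a fresh, independent copy of the row.
    row = conn.execute(
        "SELECT id, a, b FROM item WHERE id = ?", (item_id,)
    ).fetchone()
    return {"id": row[0], "a": row[1], "b": row[2]}

def save(obj):
    # Writes every column, including ones this copy never changed.
    conn.execute(
        "UPDATE item SET a = ?, b = ? WHERE id = ?",
        (obj["a"], obj["b"], obj["id"]),
    )

first = load(1)    # copy held by one piece of code
second = load(1)   # copy held by another

first["a"] = "new-a"
second["b"] = "new-b"

save(first)        # row is now ('new-a', 'old-b')
save(second)       # writes back the stale 'old-a' along with 'new-b'

final = load(1)
print(final)       # {'id': 1, 'a': 'old-a', 'b': 'new-b'}
```

The change to column `a` is silently lost because the second copy still carried the old value when it was saved.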
34:18 But Django is not alone in this.
34:19 I think that Rails, at least for a very long time, did the same thing.
34:23 And Django, of course, is sort of Python's response to Ruby on Rails.
34:26 So, yeah.
34:28 Does SQLAlchemy not have this problem?
34:31 So SQLAlchemy doesn't have this problem because basically, yeah, you only get one copy of that thing in your system.
34:36 It has this sort of local index of primary key to the object version of that row that you're representing, for instance.
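That local index of primary key to object is often called an identity map. Here is a toy sketch of the idea with hypothetical `Row` and `Session` classes; SQLAlchemy's real session does far more, so treat this only as an illustration of why two lookups return the same object.

```python
class Row:
    """Hypothetical in-memory object standing in for one database row."""
    def __init__(self, pk, a, b):
        self.pk = pk
        self.a = a
        self.b = b

class Session:
    """Toy identity map: at most one Row object per primary key."""
    def __init__(self, storage):
        self._storage = storage      # stands in for the database
        self._identity_map = {}      # primary key -> Row object

    def get(self, pk):
        # Only build a new object the first time a key is requested.
        if pk not in self._identity_map:
            self._identity_map[pk] = Row(pk, **self._storage[pk])
        return self._identity_map[pk]

storage = {1: {"a": "old-a", "b": "old-b"}}
session = Session(storage)

first = session.get(1)
second = session.get(1)
first.a = "new-a"

print(first is second)   # True
print(second.a)          # new-a
```

Because both names point at one object, a change made through either is visible through the other, so there is no stale copy to write back.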
34:44 Okay.
34:45 So, yeah.
34:46 So, yeah.
34:46 SQLAlchemy sort of has, it adds a lot of machinery, makes SQLAlchemy a little bit more complex.
34:51 But I had a friend who I think spent days tracking down this issue with Django.
34:56 And with SQLAlchemy it never would have happened.
34:59 So you pay some upfront costs with setup with SQLAlchemy.
35:01 But I think it's definitely worth it.
35:03 When it comes to this sort of ORM thing, though, like if I can provide some general advice, ORMs are sort of the tools of applications.
35:13 And if you want to see, if you want to form a real opinion on object relational mappers, you should look at and compare applications.
35:21 So I spent a fair amount of time reading Reddit source code, which does, I think, use SQLAlchemy.
35:28 And it uses it without the declarative object mapper.
35:31 It uses it with the sort of legacy or lower level SQLAlchemy tools.
35:36 But you still get a real sense for where they use an ORM and where they don't.
35:41 And SQLAlchemy actually makes it very easy to pass through normal SQL text.
35:45 That's another thing I really like about it.
35:47 It understands that ORMs are an abstraction that's useful 90% of the time.
35:51 And for that last 10%, you really want the full power of the driver or the database itself.
35:58 Okay, cool.
35:59 I don't have any opinion on these extra couple links that I put in here.
36:02 Matt has some dedicated pages for SQLAlchemy and Peewee.
36:07 And one of the things I like about Matt's site anyway, Full Stack Python, is he gives his opinion and information when he has it.
36:15 And when somebody else has already explained it well enough or better, he just links to their stuff and says, go read that.
36:22 Yeah, absolutely.
36:23 No, I mean, he's a real team player in that regard.
36:24 But I also, I just got to, you know, give a shout out to him.
36:27 Like he so consistently adds to the site.
36:30 It's become such a tremendous resource for someone who wants to develop an application.
36:34 I'm sure that listeners of this podcast are, for the most part, like already aware of it.
36:39 But yeah, definitely check it out.
36:41 Definitely.
36:41 Well, that's all of our topics so far.
36:44 We didn't address what you're up to lately other than helping out with podcasts.
36:51 Yeah, no, it's funny.
36:53 I'm also like prepping for another podcast as well, but partially examined life, I guess.
36:59 But basically, yeah, what am I up to lately?
37:02 Well, I had a talk at PyBay and because it was based on a blog post, I thought it'd be easy to put together slides.
37:07 Now, it still took like just full disclosure.
37:09 It took like another 40, 50 hours to make slides from that blog post.
37:14 But it seemed really well received.
37:15 And so I'm very relieved right now.
37:17 I got some nice life events coming through, parents coming to town, keeping me real busy.
37:22 I also am working on this hyperlink library, like I mentioned earlier, URLs in Python.
37:27 And it's used by Twisted and some other big projects.
37:31 So fixing bugs in there is always kind of contentious, which is why I got a lot of support for people who work on things like setuptools, which is even more widely used.
37:41 So then beyond this, let's see.
37:44 Yeah, writing blog posts.
37:46 I think my draft count is up to like 100 now.
37:49 But yeah, maybe more conferences, more talks.
37:53 I don't know why I keep signing up for these things, but it's great meeting people out there.
37:56 People out there should really look into PyBay and regional conferences, meetups.
38:01 Oh, well, I run a meetup too, the Pyninsula meetup, the hottest new meetup in the Bay Area, Silicon Valley.
38:08 And so, yeah, like, yeah, yeah, we were a pun.
38:12 Hey, this is programming, man.
38:14 It's all about the terrible puns.
38:16 So, but yeah, Pyninsula.
38:19 Yeah, I think we even have the site now, pyninsula.org.
38:23 And, you know, we're on Twitter and so forth.
38:25 I do my best to record the talks.
38:27 But for people who want to break into this type of, you know, speaking and that sort of thing, just look no further than your local meetup.
38:35 Right.
38:35 Go make a 15 minute, 30 minute talk.
38:38 See how it goes.
38:39 Iterate on it.
38:41 Right.
38:41 Have a brown bag at your company.
38:43 Just keep iterating on it.
38:44 And, you know, something will stick.
38:46 And then you can submit it to something like PyCon or whatever.
38:50 That's a great idea.
38:51 I think a lot of people think that you just have to work really hard on a talk and give it once and then it's done.
38:57 But a lot of people give them several times.
38:59 Yeah.
38:59 And also, like if there's not a meetup in your area, just maybe start one.
39:03 Python programmers are literally everywhere.
39:05 So we like, you know, even though there's a South Bay Python meetup, which is sort of like more towards Sunnyvale, like kind of south of Mountain View area.
39:16 And there's this SF Python meetup, which is up in San Francisco.
39:20 We put one right in the middle.
39:22 I guess California traffic is bad enough that we sort of have a captive audience, literally.
39:27 But we'll get like, you know, I think when Guido came, there were almost 100 people at the meetup.
39:32 And normally we get like 50.
39:34 But it's great because everyone can socialize and something a little bit more intimate.
39:38 It's a little bit less stressful when you're trying to give the talk yourself, too.
39:40 Yeah.
39:41 So it wouldn't be a Python Bytes episode if I didn't plug my book.
39:45 By all means.
39:46 So one of the things I want to bring up is that Python Testing with pytest has a nice discussion forum.
39:52 It's kind of built into what Pragmatic offers for all the books.
39:57 But if you ever ask a question on there, it pings me and emails me and says there's a question.
40:02 Just this morning, I answered a question.
40:05 Somebody got on and said that they were actually...
40:08 I love this.
40:09 They said that the book is helping them understand testing better.
40:13 And I love comments like that.
40:16 But he had a question about monkeypatch versus mock.
40:20 And I'm not going to get into it too much here.
40:23 But I did reply to him.
40:24 And it's all up there for everybody else to read, too.
40:27 So I'll have a link in the show notes to that.
40:29 That's great.
40:30 Yeah.
40:31 Those sorts of comments really keep you going.
40:32 I wish that my O'Reilly thing had had such a discussion forum.
40:36 Instead, I have to...
40:38 I got my feedback through reviews for a while.
40:40 Oh, yeah.
40:42 Yeah.
40:42 I mean, emails, too.
40:43 People email.
40:44 And I appreciate it.
40:45 Yeah.
40:45 I get them from all over the place.
40:46 I get it through the discussion forum.
40:48 I get it from Twitter.
40:50 And from...
40:51 We've got a Slack channel.
40:52 So people come and tell me what's wrong in the Slack.
40:55 So...
40:55 Yeah.
40:56 Definitely.
40:56 I don't know.
40:57 For just like sort of chatting here, right?
40:58 I've been really into like Riot.im, which is a Python-based open source Slack sort of thing.
41:06 And there's also Zulip, which is just everywhere these days.
41:09 They're doing an amazing job.
41:10 So what's the first one, Riot?
41:12 Yeah.
41:12 So Riot.im, and it runs a sort of protocol called Matrix.
41:16 And it's a very, very large thing.
41:20 It's basically like you can have end-to-end encrypted chats with people who are on it.
41:25 But I use it because it's an IRC bridge.
41:28 Like I said, if you want to be sort of in this inner circle, see the goings-ons, IRC is still
41:33 very much alive.
41:34 So you've got your listservs and IRC and so forth.
41:39 And Riot makes that pretty easy to get into.
41:43 You know, there's a Freenode bridge and you just join a Freenode thing and you can look
41:46 at IRC through your browser while having end-to-end encrypted chats with your other friends.
41:51 It also has a like sort of peer-to-peer video chat that works really, really well.
41:55 Because it's just the WebRTC open source protocol.
41:59 Works great in Firefox.
42:00 Well, I'm going to cut you off because we're right.
42:03 But oh, wait.
42:04 Yeah, we're way long.
42:05 Anyways, that's great.
42:06 Also, I think this is an awesome topic.
42:08 I think that you should come on to Test and Code and we can talk about IRC and communication
42:15 channels.
42:15 That'd be fun.
42:16 That's actually a great idea.
42:17 Yeah, for sure.
42:18 I'm always like coming up short with topics when the, when the, but yeah, here we are
42:22 just chatting.
42:23 That's great.
42:23 That's a great idea.
42:24 Again, thank you so much for coming on.
42:26 I love having new voices on, on here.
42:28 And it's been my pleasure.
42:30 And thank Michael.
42:31 You know, when he gets back, I'll send him an email.
42:33 This has been great.
42:34 Yeah.
42:34 And we'll keep in touch.
42:36 Thank you for listening to Python Bytes.
42:40 Follow the show on Twitter via @pythonbytes.
42:43 That's Python Bytes as in B-Y-T-E-S.
42:46 Get the full show notes, including links at pythonbytes.fm.
42:51 If you have a news story you'd like featured, visit pythonbytes.fm and send it our way.
42:56 We're always on the lookout for sharing something cool.
42:58 This is Brian Okken on behalf of myself and Michael Kennedy.
43:03 Thank you for listening and sharing this podcast with your friends and colleagues.