Brought to you by Michael and Brian - take a Talk Python course or get Brian's pytest book


Transcript #94: Why don't you like notebooks?

Return to episode page view on github
Recorded on Wednesday, Sep 5, 2018.

00:00 Hello and welcome to Python Bytes where we deliver Python news and headlines directly to your earbuds. This is episode 94 recorded September 5th 2018 I'm Michael Kennedy and I'm Brian Atkin. Hey Brian. How you doing? I'm doing really good Yeah, yeah, excellent doing very well. The Sun is shining summer has not left us yet It's not that great for productivity, but it's definitely good for keeping the spirits up Yeah, you know what else is keeping my spirits up is my digital ocean servers the ones running this site and many others, they've been working perfectly.

00:29 So they've been going really, really strong and we'll tell you more about them later.

00:33 But in the meantime, if you wanna check them out, pythonbytes.fm/digitalocean, get $100 credit for new users.

00:39 Brian, when I was in the C++ world, the C# world, design patterns were like this massive thing and you had to know all the design patterns and there was like dependency injection and IOC containers and all this stuff.

00:53 But I feel like Python doesn't have as much rigor around it because you don't have to jump through so many hoops to make certain things happen, I guess.

01:02 What do you think?

01:03 - Yeah, I think so.

01:04 It's actually something that's interesting 'cause I came from the C++ world.

01:08 So C patterns were a thing in C# also?

01:10 - Oh yeah.

01:11 - Okay.

01:12 Well, I don't even know who the gang of four are, but there were four authors that wrote the A Design Patterns book.

01:21 Let's see, Eric Gamma, Richard Helm, Ralph Johnson, and I'm not gonna try to pronounce that last one, John something.

01:28 Anyway, in, gosh, in the '90s, if you were in C++ or C#, apparently, you read this book or others around design patterns.

01:38 And then when I got into Python, I was a little curious whether that was a thing in Python or not, but I haven't really heard much other than I haven't really needed it.

01:48 A lot of this stuff isn't really needed.

01:49 What I think is interesting is there's those patterns that you see from the Gang of Four and the sort of derivative ones, derivative books and thinking.

01:58 And a lot of it, like you say, is not needed, but there are other patterns that are really useful and like come in, like for example, meta classes, for example, or decorators, there's other stuff, right?

02:08 Generator methods, all sorts of stuff that is in here that don't appear in the Gang of Four because you know, C++ or Smalltalk just didn't do that.

02:17 They were highly based on Smalltalk actually.

02:19 their patterns. Right. Well, one of the things that caught my attention today was a tweet by, gosh, who's this? Brandon Rhodes.

02:28 Brandon Rhodes. Yep. And he's doing, he's got a site called pythonpatterns.guide.

02:33 And it has, he's sort of going through a lot of, a lot of different, I think he's going through the Gang of Four book, but he might be also doing other, pulling together other design pattern things that he's talked, yeah, he's pulling together information from talks and writing, and I think he's creating more information too. But there are a whole bunch of these, trying to apply some of these patterns to Python and kind of sometimes different ways to do it.

03:03 So you can do things in different ways. And so far, he's got abstract factory pattern, the builder pattern, factory method, composite decorator. Yeah, we definitely have decorators.

03:15 And then things like monkey patches and iterators, things like that.

03:20 And how that applies, I'm glad that somebody that knows what they're talking about is trying to figure out how does this all apply in Python.

03:28 And I haven't really dug too much into this.

03:30 I just think it's a neat resource to try to read about some of these.

03:34 Yeah, I definitely think it's a really neat resource.

03:37 And Brandon has some interesting thinking on design patterns and architectures.

03:41 He gave a super not counterintuitive talk called clean architecture.

03:47 I think it was at PI Ohio a couple years ago.

03:49 And when I first started watching it, I was like, I just disagree with everything you're saying.

03:53 This just seems so wrong.

03:55 And then after 10 minutes, I'm like, but wait a minute.

03:58 I think it's right.

03:59 Like, I think I've been thinking about this all wrong.

04:01 And it really, really caught my attention because I, I didn't agree with it so much.

04:07 But then I I'm like, wow, this is really compelling what you're telling me.

04:10 So maybe I need to rethink what I'm thinking.

04:12 And whenever I have that feeling, I'm like, whoa, I need to pay attention because, you know, I might learn something really good here.

04:18 Yeah.

04:18 And yeah, so that's a that's a good point.

04:20 I'm not necessarily saying I since I haven't really dug through this too much.

04:24 I'm not sure.

04:26 I mean, I respect Brandon is a is a smart guy.

04:28 I expect that there's some really great stuff in here, but you may not may not agree with all of it.

04:34 So we'll try to dig up a link to that clean architecture, too, because that sounds interesting.

04:38 It's super interesting.

04:39 Yeah, it's definitely a good one.

04:40 Cool.

04:41 Well, thanks for bringing this up.

04:43 I love these Python patterns and I love sort of the how would, you know, these traditional more formalized patterns actually look in our language.

04:52 And there's a lot of interesting examples there.

04:53 Yeah.

04:54 What we got next?

04:55 Well, we got this thing called Arctic.

04:57 And Arctic is an API framework over top of MongoDB and pandas.

05:06 The idea is this is a thing that's been around since around 2012.

05:11 Its sole purpose is analyzing time series data super fast.

05:18 One of their headline is basically, "Archic millions of rows per second of time data in Python." That is really quite impressive.

05:27 I can tell you a lot of the ODMs and ORMs and stuff, they don't do millions of records per second.

05:32 So the idea is that it basically bakes in pandas and num pies and all those kinds of things.

05:39 And it has an underlying data store that's backed by MongoDB.

05:42 And it actually uses like the binary low level communication.

05:46 So instead of trying to like store all the data, and then bringing it back and de serializing each row, I think what it does is it actually just stores the binary data of pandas and and it'll pickle like NumPy arrays and stuff like that and just exchanges like the memory structure and just pulls it straight back and go, yep, here it is, let's look at it.

06:06 And it's pretty cool.

06:07 - Yeah, yeah, definitely.

06:09 And there's a lot of applications that use just huge amounts of time series data.

06:14 - Yeah, so they say the two big areas they think it's useful is IoT, little tiny IoT devices that maybe Python is running on, and financial analysis.

06:25 So they're, you know, it's sort of been extracted out of the work that this financial company called MAN, A-H-L, I've never heard of them, but I think they're mostly an Asian company, but also in the US, around investment and so on.

06:38 So they've been working on this and they actually have some numbers on how this thing performs relative to other types of projects that they pursued or other things that were available.

06:50 So they talk about the different kinds of data that they store and analyze for stock trading and analysis.

06:55 And they say, look, we have this sort of data that's for one day, a whole bunch of it, maybe 10,000 rows, and they can work with those 10,000 rows in four milliseconds.

07:05 And they say, compare that to what we were getting out of SQL Server, which was 2.2 seconds.

07:11 So, you know, 500 times slower, which is pretty incredible.

07:15 And they have this other like tick data, like, you know, the stock ticker type of data.

07:19 They can say in one second, they can process 3.5 megs worth of that data in Python or 15 megs in Java.

07:27 And there were some other projects that they were trying to improve over called OtherTig, which took like 40 seconds versus one.

07:33 So really, really interesting, high-performance, database-backed time series.

07:40 - Hmm, neat.

07:41 - Yeah, so if you're into pandas, NumPy, and you've got to store and query a bunch of time series, whatever the reason, this is probably worth checking out.

07:49 And it's also tested with pytest, which is pretty cool, right?

07:52 - Oh, well, of course.

07:53 Any real project's tested with pytest.

07:56 - That's right, of course.

07:58 So, one of the things I really like about the Python community is the fact that there's so much sharing of information out of conferences and meetups and things like that.

08:09 So we have another thing you found here for PyCon, right?

08:13 - Yeah, so the PyCon, I don't remember when it was, but PyCon Australia wasn't too long ago, And they've already got all the videos up.

08:22 And we have a link to the PyCon Australia videos.

08:25 And I've got quite a few of them queued up that I'd like to listen to.

08:31 I'm kind of bad about videos, actually.

08:33 I often just listen to them and then go back and look at the slide parts of information that I wanted to capture.

08:41 But I like listening to talks as well.

08:43 But there's one from Mark Smith, which he always amuses me his Twitter handle is Judy2k and he won't tell me why.

08:51 But his talk is how to publish a package on PyPI and that's the one I've watched so far.

08:57 There's a lot of great talks there though, but I think this one's a great one that it the end punchline is use cookie cutter.

09:05 But he blasts through not using cookie cutter, all the sort of stuff you have to do to get up and you know it's every little piece makes sense and it's not difficult, but there are a lot of different little pieces.

09:14 But he goes through this entire thing in like less than half an hour.

09:18 And so that's pretty impressive to watch him talk about all the different pieces and why they're there and what they're used for.

09:25 So that's a good one to sort of understand what's going on in the packaging world in a very short amount of time.

09:31 Oh, that's really cool.

09:32 Yeah, there's a bunch of cool ones here.

09:34 A couple in MicroPython, actually.

09:36 So one writing fast and efficient MicroPython code, and the other is async IO in MicroPython.

09:42 Both of those are pretty cool.

09:43 I'm kind of tying to what we were just talking about previously. Yeah, and then there's like gosh, there's solid api's and there's Looks like a real a lot of good stuff And I know that Australia is since it's big travel burden to other places other pythons There you'll you'll see some speakers there that you're not gonna see other places. So that's cool Yeah, absolutely, and they have 88 videos. So that's that's pretty solid. Yeah, quite cool. That's a good one So before we move on, I'll tell you about another cool thing, DigitalOcean.

10:13 So big fan.

10:15 And so one of the things that they've released--

10:17 we talked about this just a couple of times, not very much--

10:20 is this idea of projects.

10:22 So when you go into your--

10:25 name your cloud provider, you might have a bunch of servers, a bunch of ESP storage type things, virtual storage blocks, load balancers, all sorts of stuff.

10:35 And it's really hard to know what goes with what.

10:38 Do you have a staging environment, a production environment, all that kind of stuff, right?

10:42 So how do you organize that?

10:43 So DigitalOcean has come up with this feature called Projects that lets you group things like your droplets, that's virtual machines, and floating IPs, and back storage, like spaces, into these different use cases.

10:55 So, you know, yeah, actually we're done with this project so we can turn that server off and destroy it, and not like the fear of, "I don't think we're using this one, "but I'm not gonna destroy, I'm not going to delete it because what if I'm wrong, right?

11:08 So a very cool feature you can take advantage of for all of their stuff.

11:12 Check them out at Python bytes.fm/digitalocean and I'll give you a hundred dollars credit for new users.

11:18 That's awesome.

11:18 Hey, let's talk about another cloud provider.

11:20 Right.

11:22 I'm right on the back of that.

11:23 So one of the ways that you can run your code on the internet is like I just described with digital ocean.

11:29 like I do for our stuff, is to create some virtual machines and various other pieces and sort of use it as so-called infrastructure as a service, right?

11:38 IaaS.

11:39 But you might also use platform as a service, like here's my code, run it.

11:42 So Google App Engine, Heroku, those types of things.

11:45 So Google App Engine has a pretty interesting announcement, and it's interesting for both it's good now and like, oh my gosh, I can't believe it was like that.

11:55 So the announcement is that Google App Engine has released their second generation runtimes which the Python one is now based on Python 3.7.

12:03 That's pretty awesome, right?

12:04 It is.

12:05 You want to run some code.

12:06 Boom.

12:07 Here's my Python 3.7.

12:08 So that's really good.

12:09 You might think, "Oh, Michael, what was the previous one?

12:11 3.6?

12:12 3.5?" No, I believe the previous one was 2.7.

12:15 Oh, no.

12:16 Until now.

12:17 Like if you were using Google App Engine, I believe you had to use a legacy Python, period.

12:23 Yeah, that was like mid 2018 that I just said that that wasn't like a statement around 2012 or something that was just now.

12:31 But let bygones be bygones.

12:33 And now it's Python 3.7, which is pretty awesome.

12:36 So apparently, it's a pretty big upgrade, you get a bunch of new things like for example, it's based on their new sandbox container sort of Docker like things.

12:47 It removes a bunch of restrictions, like, in addition to only running on the old Python, legacy Python, you could only use a white labeled set of packages.

12:57 And now in the new Google App Engine, you can use arbitrary packages, just put them in a requirements file, which is pretty sweet.

13:03 That's a big change.

13:04 It is a big, pretty big change.

13:06 So a lot of cool things like auto scaling and things that are a little bit easier as well.

13:10 So anyway, if you're interested in Google App Engine's platform as a service for Python, it just got many, many times better.

13:16 Yeah.

13:17 Yeah.

13:18 So Brian, I typically write my code in Python files, not really in notebooks per se.

13:24 How about you?

13:25 - Yeah, mostly in files, but I'm trying to learn Jupyter notebooks some and utilize them.

13:31 They're kind of fun, especially in data science realms or looking at plotting data and stuff, notebooks are fun.

13:39 But there was a person named Joel Gruse that says he does not like notebooks.

13:46 - And Joel is notable 'cause he's not like a random dude on the internet.

13:50 But Joel Grus has written a book called Data Science from Scratch.

13:54 He's done a lot of work in data science, things like that.

13:57 I've even had him on Talk Python many moons ago.

13:59 - Yeah, and this wasn't just like a one-off comment.

14:04 He gave this talk at JupyterCon, and that's kind of hilarious.

14:10 But the video for that is not available yet as far as I couldn't find it.

14:14 So because that was just recently or still going on, I'm not sure but the the slides are up. He put the slides up and For one puts me to shame of it You know this presentation is got so many animations and pictures and stuff plus it's like I haven't even got through it yet It's like a hundred pages long or more, but it's really good but but it's a serious a serious discussion about some of the issues with with the problems with notebooks that people new to notebooks don't quite get and People old to notebooks just sort of know it and don't really think about it anymore. And one of the big ones is that the there's hidden state and so like all and essentially we think of files as Like you said we normally work in files. So they they get run from top to bottom Except for you know functions don't get run they get interpreted as functions and then when they are run that run top to bottom essentially and Notebooks are not like that. You can jump around and execute different bits of code in different orders if you feel like it and That stateness can lead to weird confusing things So it's just like a gotcha to know about and then he goes on to talk about some of the issues where if you Start learning how to code with notebooks You may end up, you know developing some bad habits like importing notebooks instead of just trying to I mean like that's a thing apparently you can do is you can define some functions in a notebook and then import them into another notebook.

15:43 Well, I mean, wouldn't it be better to just put them in a different library, in a package or a library?

15:49 >> Use the package, use the library, exactly.

15:51 >> Yeah.

15:52 So some of those, and I'm highlighting this not because I think notebooks are evil, but because I think it's important to listen to people saying, listen to a voice that says they aren't a silver bullet.

16:05 they have their issues also.

16:08 And we just need to be careful and talk, and make sure you don't fall into those traps.

16:12 - Yeah, these are really interesting, and these are certainly issues to look out for.

16:16 And wow, this is a funny presentation.

16:17 I cannot wait to watch this video.

16:19 Joel, if you're listening, please let us know when it's out, or if someone else sees it come out, shoot us a note, either email or Twitter, 'cause this is fantastic.

16:26 - Yeah, plus, also, I can't even imagine how long it took to put together this presentation, because it's, yeah, there's a lot of animations in there, it's quite a riot.

16:36 - It is quite a riot.

16:37 - Yeah, anyway, there's that.

16:38 Just the other side of maybe notebooks aren't awesome.

16:42 - Yeah, it's pretty interesting.

16:45 So we've had a couple of conversations around the various PEPs and stuff that have been maybe causing some kerfuffle in the community.

16:56 Obviously the biggest one was PEP 572 about the in-place assignments, and that was the thing with all the stress around it that Guido said, "Hey, after this, this is my last one.

17:08 "I've given my all, I'm out of here.

17:09 "You guys, it's up to you." We actually had Brett Cannon and Carol Willing on episode 87 to talk all about that, right?

17:16 And one of the things that we talked about was what comes next, right?

17:22 If it's not down to Guido to make the final decisions, which is how it has worked, how will the Python community decide what it's up to?

17:32 So yeah, so Barry Warsaw has published five peps at least around this.

17:39 And I don't think this is a decision.

17:41 It's sort of a structure to further the conversation and make a decision.

17:46 So he just published not too long ago, pep 8000, which is Python language, government proposal overview.

17:53 And I don't know if this is common in peps.

17:55 I haven't seen it that much.

17:57 But it's like a gathering of other peps that are specific details.

18:01 So there's PEP 8001, 8002, 810, and 811.

18:06 The first two are about voting and ways in which this government might work.

18:11 And then the higher ones, the 810s, are actual proposed models.

18:16 And there's a third one, an 812, that I forgot to put in the notes.

18:20 And so there's for the government styles, government styles, we have the BDFL governance is one of the proposed options, which is to elect a new person who is the final decider.

18:34 Basically, get your step down, who is going to take that place to now participate in that way?

18:41 We also have the council governance model, which we talked about interesting things, should there be an even or odd number of people on the council?

18:47 And then the last one, I think--

18:49 let me pull it up real quick-- I think it is community--

18:53 yeah, the community governance model.

18:55 and that one's a little more freeform.

18:57 So these are all different ways of possibly arranging and solving that problem.

19:02 And there's a lot of examples like, let's see how Rust did it.

19:05 Let's see how OpenStack manages their organization and so on.

19:09 So there's a lot of concrete stuff there.

19:12 So anyway, that's pretty cool.

19:14 If you have a strong thought on this and you wanna participate, get in there, make comments, let people know what you're thinking because it's still open.

19:23 It's not anything decided, right?

19:25 it's still up in the air.

19:26 So if you wanna have a say, now is the time to make statements.

19:30 - Wow, it's like government working in our own community.

19:34 - What?

19:35 Incredible, incredible.

19:37 - Okay.

19:38 - Yeah, so this is pretty cool.

19:39 I don't know where it's gonna go, but I like that it's all laid out like this.

19:44 My guess is it's gonna go down the council model, maybe with, I don't know, I think it's gonna go down the council model, but we'll see.

19:51 - Yeah.

19:52 - I think that whatever they do, they need, they should, if there's a council, they should have to meet together to make decisions and pass around a talking stick or something.

20:04 - Yes.

20:05 (laughing)

20:07 I love it.

20:08 - Oh, we could come up with something weird that they have to follow.

20:12 - How about the Python Staff of Power that you were carrying around?

20:15 - Yeah, but then, should it be the blue and yellow one or should it be the green and yellow one?

20:21 That is a big question.

20:23 Yeah, I don't know.

20:25 So, sorry, green and gold.

20:27 People in Australia say it's gold, not yellow.

20:30 But it looks yellow to me.

20:31 Yeah, I definitely thought that stick was a big hit.

20:33 I don't know if people don't know what you're talking about.

20:35 What should they Google to find this stick?

20:37 I think it's Pythonic Staff of Enlightenment.

20:39 I don't know.

20:40 That's got to do it.

20:42 How many hits on Google can there be for that?

20:44 I don't know.

20:45 Awesome.

20:46 So, yeah, they should have to pass that thing around.

20:48 All right, well, that's it for our items this week.

20:50 this week.

20:51 You got anything extra you wanna share with folks?

20:53 - I don't, actually.

20:54 Just trudging along, we got a couple more testing codes out.

20:58 - Yeah, very nice.

20:59 - How about you?

21:00 - I've got, of course, some Talkpilot stuff queued up to be released shortly.

21:04 I have been recording some courses, which are gonna be awesome, and I'm very excited about them, doing a bunch of stuff in parallel.

21:10 So I'll let you know when that's sort of further along.

21:13 But I do have two things I wanna talk about this week really quickly.

21:17 One is we got a message on Twitter, I don't have the name of who sent us this was John actually Thanks, John who sent us this heads up that?

21:26 Brian Granger one of the guys behind I Python and Jupiter and all that stuff from the very early days is Giving a free webcast and it's a ACM Sponsored thing says project Jupiter from computational notebooks to large-scale data science with sensitive data So that sounds interesting to you. I put the link in there. It's this Friday this episode Probably will come out on Thursday. So you got to take action right away. If you're listening, there's probably a recording or something afterwards, you can check that out. The other thing is, you know, we talk sometimes about the popularity of Python. Yeah, so I don't want to beat this one to death too much. It's not really worth its own item. But Python continues to climb yet another ranking. So the T o B index is one of the more well respected, more long running ways of ranking programming languages. And I think when we started this podcast, Python was either Fifth or sixth, I think it was sixth on this list.

22:21 It is now third.

22:22 - Probably because of the podcast.

22:24 - Certainly, partly because of it.

22:26 - Yeah.

22:27 - But that may be a very small part of it, or maybe it's meaningful.

22:30 But what's really interesting is it's now above C++, C#, JavaScript is way above JavaScript, and JavaScript's going down.

22:40 It's above Ruby, it's above many, many things.

22:43 What it's not above is it is not above Java or C.

22:47 And not only is it not above them, but it's like half.

22:51 So it's like 7.6% to C is 15.4%.

22:55 It's gonna be a long time, if ever, till it gets to a two or a one.

23:00 But it's definitely doing quite well.

23:02 - Yeah, so yeah, what is the Tyobi index?

23:06 - If you look into it, they talk about their philosophy and where they measure stuff from and so on.

23:12 It's been a long time since I read it, so I don't remember the details, but they do lay out where the ranking comes from.

23:17 Okay, cool.

23:18 Yeah.

23:19 All right.

23:20 Well, that's it for this week.

23:21 Thanks for chatting with me, Brian.

23:22 Thank you.

23:23 Bye.

23:24 Bye.

23:25 Thank you for listening to Python Bytes.

23:26 Follow the show on Twitter via @PythonBytes.

23:27 That's Python Bytes as in B-Y-T-E-S.

23:31 And get the full show notes at PythonBytes.fm.

23:34 If you have a news item you want featured, just visit PythonBytes.fm and send it our way.

23:39 We're always on the lookout for sharing something cool.

23:41 On behalf of myself and Brian Okken, this is Michael Kennedy.

23:45 Thank you for listening and sharing this podcast with your friends and colleagues.

Back to show page