Transcript #30: You are not Google and other ruminations
Return to episode page view on github00:00 Hello and welcome to Python Bytes where we deliver Python news and headlines directly to your earbuds.
00:06 This is episode 30 recorded on June 13, 2017. I'm Michael Kennedy.
00:11 And I'm Brian Okken.
00:12 And we have tons of good stuff to share with you today. I'm very excited about all of our items.
00:17 But first, we have a brand new sponsor. I want to say thank you.
00:20 That's exciting.
00:21 It's super exciting. So I want to welcome Datadog to Python Bytes as a sponsor.
00:26 And Datadog has this cool thing where if you do a little test integration with them, you actually get a free shirt. So I'll tell people how to get a free shirt later.
00:34 Wonderful. Cool.
00:35 I heard that not everybody writes software like at the scale of Google.
00:38 Yeah, I think that's probably true, especially of me. But there's a there's a couple articles that I want to talk about today. Actually, mostly one. There's an article by I think it's a Ozan Oneh, which is a pretty cool name, called "You are not Google" and also he goes on to say that you're also not Amazon and you're also not LinkedIn. But it's a, it isn't to say that the Google, Amazon, LinkedIn all have applications that might look similar to normal folks applications, but the scale is definitely different. So just be aware of that. And I guess it's a reaction to people chasing a lot of the shiny new technologies like asynchronous IO and other things.
01:24 And I'm guilty of that as well.
01:26 But he presents, when looking at solutions, he presents a model called UNPAT, I think, or maybe it's UNFAT, I'm not sure.
01:35 U-N-P-H-A-T, which is try to understand the problem that you're first.
01:41 Enumerate multiple possible candidate solutions.
01:45 read papers or articles about the solution that you're gonna try.
01:50 Look at the historical context of why the solution came to be in the first place.
01:55 List out the advantages and disadvantages and don't forget to think.
02:00 Make sure that that solution really fits your problem.
02:03 And anyway, I think it's a good discussion about how a lot of the architectures and stuff that people write about and how Google and others do things might not apply to you.
02:13 So, just a heads up.
02:15 I think it's a super interesting article and it's really it's a good reminder.
02:18 I mean, on one hand, it's really interesting to think about computation at like a massive, massive scale, like some of these data center things and the incredible failover and possibly like global redundancy that some of these companies like Google operate at.
02:38 And people read these things are like, oh, we're gonna do our startup and it's gonna to have like if we're lucky we'll have 1000 users after like three months.
02:45 Well maybe you don't need to architect around the same things that say Google or LinkedIn or Amazon are architecting around.
02:53 Maybe you need to link architect around.
02:55 Let's get this thing working as fast as possible and then we'll deal with the scale later.
03:00 So there's a quote that says and here this pretty nice says the thing is there's like five companies in the world that run jobs as big as Google for everybody else.
03:08 between all this IO and fault tolerance and things you don't really need.
03:12 It's really just this idea of like, it's cool to study these patterns, but these patterns were created within a context of a problem.
03:20 Do you have that same problem?
03:21 It's like the recurring theme.
03:22 I really like it.
03:23 - Yeah, there's also, even if you are huge, even if your problems are big, they still might not be the same.
03:29 And one of the things he talks about is maybe the, like your data store, maybe the number of writes is more important than how often it's read or vice versa.
03:40 So even at large scales, the problem might be different than somebody else's.
03:44 - Yeah, yeah, for example, Amazon optimized for write tolerance on the database that backs their shopping cart.
03:53 But is that your primary concern?
03:56 If it is, maybe do what they're doing.
03:57 If it's not, then maybe that's not the best database for a general database, right, for example.
04:02 - Yeah, yeah, anyway, it's Goodread.
04:04 - Speaking of databases, Oh, let's talk about your microservice one as well that ties into this.
04:08 - Oh, just along the same line, I ran across another article that just is called Enough with Microservices, and it's a similar type of article.
04:18 It's also well-written.
04:19 It's by Adam Drake, and we'll have a link up.
04:22 Mostly it's similar sort of discussion about microservices and dependencies and how that's a complexity that it adds to your cost, so make sure that you're aware of that before you try to jump into it.
04:36 - Yeah, I recently had Miguel Grinberg on Talk Python.
04:41 I haven't released that episode yet because I have like three months of backlog of stuff recorded that's gotta get out, but it's on its way and it's a really good episode around microservices.
04:50 So if you're really thinking about microservices, check that out, it's really enlightening.
04:54 Actually, I learned a lot from talking to him.
04:57 But I think one of the takeaways is that switching to microservices makes your application simpler than it otherwise would be.
05:04 Instead of having one complex application, you know, have like, say, six, very simple applications, but your DevOps and deployment and coordination and your infrastructure story gets way more complicated.
05:15 And so you're kind of pushing the complexity of your application around and does it make sense to push programming complexity into infrastructure DevOps complexity, I think it depends on your organization, how many people are on your team, how complex is your app, but certainly a lot of small apps probably shouldn't be microservices.
05:34 It's interesting you bring that up with what your organization looks like because there's a lot of small startups and small organizations and sometimes just individual people that you really pay attention to what you like to spend your time on and what your skills are because like when you just said that I was like wow I better be very careful about that because DevOps is not my strong suit.
05:57 Exactly, I'd rather have the complexity in my app and carefully factor that thing, rather than push it to a bunch of servers that coordinate.
06:04 Right?
06:05 Yeah.
06:06 Yeah.
06:07 Yep.
06:08 Pretty, pretty interesting.
06:09 So speaking of databases and things that are complex and things that scale super high, let's actually not talk about a database with this thing called no DB.
06:14 Okay, I haven't heard of this at all.
06:15 So tell me about no DB.
06:17 So no, no DB is a Pythonic object store that uses Amazon s3 as the back end.
06:25 So it as a programming interface, it looks like a simple NoSQL database.
06:32 But what is actually doing instead of running a server or something is it's talking to s3 and storing your objects there.
06:40 So you can like insert into the database and you configure the year like your connection string if you will for the database is here's my s3 account and here's the bucket I want to store it into a folder, think folder.
06:52 And then you just like insert, query, update and delete from this database.
06:58 But what actually happens is it stores it over there.
07:01 And I believe the default is actually to pickle the Python object.
07:05 So you even get like type preservation, like you could insert a customer, and then boom, outcomes a customer with its like functions and everything.
07:13 Okay.
07:14 Interesting.
07:15 Yeah.
07:16 Interesting, right?
07:17 So this was done by Rich Jones.
07:19 released it in April and it sort of ties in with some of the serverless architecture, right?
07:26 Like this is the guy that works on Zappa.
07:28 We talked about Zappa last time, which lets you run web applications on Amazon Lambda, which is already pretty interesting.
07:36 So this is like another, it's like you don't have a web server, so maybe you get away with not having a database.
07:42 And it can handle a decent amount of load, but it's not like a full on super database.
07:46 It's more like for prototyping and things like that.
07:48 Cool. Yeah. So some of the examples of he said it might be good for us prototyping, like I said, but also storing API event responses for like replay. So if you're doing microservices, you want to store all the traffic that goes back and forth, you could just do that here really easily capturing logs, simple data like here, Phil, add me your email to this thing. One of the more interesting things is if you're doing lambda AWS lambda, you can have triggers that call the function based on s3 events. This file was changed, this bucket had a new thing added. And what that means is you could insert into the database and it would call an AWS lambda function as a result of that. So you could like insert this thing and the act of storing it also kicks off some action.
08:38 Oh, neat. That's actually pretty cool.
08:39 Yes, it's pretty neat. So So plus the article has a nice picture of a fish skeleton.
08:45 I like it.
08:47 Yes.
08:48 Yeah.
08:49 Pictures are important.
08:50 The one that you're talking about next also has a cool picture.
08:52 Cool logo.
08:53 Yeah.
08:54 I've heard of Elizabeth a few times.
08:55 Yeah.
08:56 I do have to admit that the logo did bring me into this a little bit.
09:01 So yeah, we talked about Faker before, which would let you create like test data that looked real.
09:08 Like give me an address, give me an email, things like that.
09:10 So this is like a competitor to Faker, huh?
09:12 Yes, and it's-- so if you haven't listened to Faker, that was on Epsode 25.
09:16 I looked it up this time.
09:18 It's definitely a competitor to Faker.
09:20 They even have some comparisons.
09:24 And it looks like on their project page, one of the main features that they're going for for Elizabeth is performance.
09:32 Apparently, it's faster than Faker.
09:34 Yeah, Faker's kind of slow.
09:35 I mean, Faker's really nice, but it is--
09:37 I tried to generate a database that had like a couple million entries with Faker and it was a little, it took a while, let's just say.
09:43 Yeah, yeah, I haven't tried anything huge. I wonder, I'm curious to how Elizabeth compares.
09:48 But it definitely is a similar space, but I think it's just another project. Maybe it fits better for your project. And there's, the articles were really well written. So there's, we're linking up two part medium articles. And there's also a, it looks like the same person wrote a pytest plugin so that you can, and the pytest plugin is actually pretty darn cool. It allows you to within a test be able to, as a fixture, you can bring in different parts of the fake data. So yeah, that's really cool. Yeah, definitely looks nice. I'm, I feel like it's in some ways complimentary to Faker. I'm not sure you would use both in the same thing, but you can get kind of get slightly different data so depending on what you're after one or the other may be better.
10:34 It's a slightly different model of how to pull the data out so I think it's good for people to try both and see which style works best for them.
10:42 It also does different localization as well.
10:45 Yeah, the localization of all these is pretty impressive to me actually.
10:48 Yeah, I wouldn't want to try to do that project myself but I'm glad it's around.
10:53 I'm glad it exists.
10:54 Yes.
10:55 Are you ready to hear about how to get a free t-shirt?
10:57 I really want a free t-shirt actually.
11:00 So yeah, so data dog, those guys came along said, Hey, we'd love to sponsor and support the show and get the word out about out about our, our project that we got here.
11:09 So data dog is we've talked about roll bar before, right?
11:13 And roll bar monitors your application for errors.
11:15 Well, data dog kind of does something a little bit similar on a grand scale.
11:22 So data dog will look at your application, and all the layers of infrastructure on it.
11:29 Let's suppose we have, say, a flask app, we could integrate Datadog and it will give us metrics about that flask app.
11:36 But it'll also tell you about like the NGINX web server, and your database and the Linux machine that is running on and basically the entire stack of your application from the servers, the database servers, the web server, all those things and put all that stuff together.
11:53 So you can have a really holistic view of what you're doing.
11:58 And you can even integrate it with all these different things.
12:00 It'll integrate with things like AWS.
12:03 It integrates with Rollbar if you use those guys.
12:05 It integrates with many, many different things that you might already be using.
12:08 So it's super powerful.
12:09 It integrates with Postgres, with MongoDB, and so on.
12:13 So very, very cool.
12:14 Companies like Zendesk and Salesforce and even PagerDuty use it.
12:18 If you haven't heard of Datadog, if you haven't tried it, go to pythonbytes.fm/datadog and And they've got this little thing you try it out and you get a free t-shirt.
12:28 So pythonvice.fm/datadog support the show and get a shirt.
12:31 I think this shirt's cute also.
12:33 Yeah, that's nice.
12:34 So thanks Datadog.
12:36 And you know what, let's talk about what's coming in Python.
12:39 I feel like my next two items actually are both sort of future looking Python things.
12:46 So I feel like we just talked about Python 3.6, didn't we?
12:49 We've been talking about it since the beginning.
12:51 Yeah.
12:52 Yeah, it's been out for, I guess it's been out for a while now.
12:54 And so they're starting to talk about what's coming in Python 3.7.
12:57 Okay, I haven't looked at all.
13:00 So I'm kind of wanted to highlight that there's, there's a whole bunch of things that I put here that are interesting to that I think are really worth like super interesting.
13:09 And I'll just touch on the other ones.
13:11 The first one is an optimization.
13:13 Okay, so Python works by having a bunch of op codes and then interpreting those op codes in this like giant switch method in this file called cval.c.
13:26 And it basically is a loop in a switch method and it looks the opcodes and it figures out what to do.
13:31 So they've added two new opcodes, load method and call method.
13:35 And it allows them to skip some instantiation of a few objects.
13:40 And it results in potentially methods in Python 3.7 being 20% faster than Python 3.6.
13:46 Oh, cool.
13:47 So one of the big sort of trade-offs that you make in Python is function calls are relatively expensive compared to other operations.
13:57 And we obviously want to write smaller functions and break our code apart for usability and readability, but that can make things slow.
14:06 So having faster functions can actually make a really big difference in Python.
14:10 Okay, neat.
14:11 So 3.6 to optimize dictionaries a lot and we might optimize function calls in 3.7.
14:16 >> Yeah, absolutely.
14:17 Absolutely.
14:18 So there's some new modules, like there's a new remainder function in math, the dis function, which is a disassembly function.
14:25 If you've ever, if you haven't done this, it's pretty cool.
14:27 You can say import dis.
14:29 I think it's dis.dis, module.disassemble.
14:33 And you give it like a function or a class or something and it'll show you the opcodes, kind of like that load method, call method I was talking about.
14:40 Another really interesting thing that's coming in 3.7 is Async Context Manager.
14:46 So a context manager is a thing you can use in a with block, right, like a file handle, database transaction, those types of things.
14:55 Well you can have asynchronous context blocks and this async context manager lets you basically make the instantiation step in those context managers asynchronous, which is pretty cool.
15:07 Oh, that's cool.
15:08 Yeah, one more that's kind of for the crazy book is now functions can have more than 255 Apparently that was a limit that was bothering someone and they said, well, let's make it possible for functions have more than, you know, like 300 arguments because 250 wasn't enough.
15:28 Yeah, I run into that all the time.
15:29 I do too.
15:30 It's really frustrating.
15:33 Why would you need that?
15:34 I have no idea, especially when you got star args and star star Kadobe args.
15:38 So anyway, it's now a thing or it's going to be a thing in 3.7.
15:43 Yeah.
15:44 It looks like you wrote down bytes from hex and byte array from hex.
15:50 Yeah, so those are conversion functions that'll parse hexadecimal strings into bytes.
15:58 And the change is that it used to have an error if there was like white space on the beginning or end, which really didn't affect what the thing was, but it wouldn't accept them.
16:09 So now they basically strip off all the white space for you.
16:12 And so it's a little more tolerant of inputs.
16:14 - Okay, cool, that'll matter for some people.
16:16 - More tolerance is always good in my opinion.
16:20 - Yeah.
16:22 - I would love it if there was like an army of people or things could go test my code and find out what errors for me.
16:27 - Yeah, well, I was really glad.
16:30 So there's an article called "Unleash the Test Army." It is about a hypothesis and I'm glad this came around because since I talk about testing a lot, I get questions about hypothesis a lot, and I have never used it.
16:47 I know that you've had--
16:48 - Dave McKeever?
16:48 - Yeah, I think you've had him on the show.
16:51 - Yep, I have.
16:52 On DocPython episode 67.
16:55 - Oh, you're ready too.
16:56 Did you look that up?
16:57 - No, I was talking about it last night, actually.
16:58 - It's somebody's experience with working with hypothesis.
17:02 It's a good introductory article to kind of tell you what it is.
17:06 So hypothesis is a testing framework that will really just come up with a lot of different ways to throw, you set it up so that it throws different data at your code and it's more of a unit test type thing, I think.
17:21 You have to define the input and output of your functions and whatnot to make it work.
17:26 It's really pretty quick about being abusive and getting at where the problem areas might be.
17:33 This is the first article that I've read that kind of explained how to get into it quickly because it doesn't look, hypothesis doesn't look like something you can really just pick up right off the bat, but this is a short introduction.
17:44 One of the things I like is at the end, he talks about his conclusions with working with it.
17:50 And one of the conclusions he came up with is that it forced him to pin down his function specifications and really to consider special cases.
17:58 So really think about the interface to the function you're gonna test.
18:02 What are the good parameters?
18:03 What is the expected behavior?
18:05 And what are the bad outputs?
18:07 And what do those look like?
18:09 Making you think about your interfaces is a good thing.
18:11 So if hypothesis helps people think about interfaces, great.
18:14 - Yeah, I think it's really, hypothesis is interesting.
18:17 I haven't had a chance to do a ton with it, but basically, instead of choosing examples, like, well, let's see what's an edge case if the register value is false and the email address looks like this and the price looks like that.
18:31 That seems like a good example.
18:32 Let's pass that to my test and see what happens, right?
18:35 So instead of doing that, you can go to hypothesis, just write a regular test, but then add on to it this decorator that says, okay, that thing is like an email address, that thing is a Boolean.
18:45 And these are some numbers here's their range, go after it, and it'll just do a bunch of different examples, and record which examples worked and which ones failed and things like that and store that in a file.
18:57 And it's pretty cool.
18:58 It can find those edge cases and other things you might forget about.
19:01 And this example of kind of do it in an interactive way, like you're not really sure how you should test your, I mean you've written some tests but you're not really sure how, what inputs to throw at it, which test cases, and making you think about where the edges are and the different corner cases. I think that's a good thing.
19:19 That is a good thing. The edges and corner cases are a super important part of unit testing I think. Yeah I'm still trying to figure out where exactly what level of the development process and and what level of testing this makes the most sense at, but there's definitely algorithmic pieces that in your code that might be a little confusing. I don't think this would make sense to throw at every unit test in your system. But there's definitely places where this would make sense. Yeah, well, it's cool. People should check it out. And it's an approachable article for sure. The last thing is one of these Python versus legacy Python things and chalk up one more win for Python.
19:55 So most people have heard of Heroku. Heroku is a platform as a service cloud provider.
20:02 Kenneth writes works there, for example.
20:04 So he's his unofficial title is something to the effect of like Python overlord at Heroku.
20:10 That's like on his business card or something.
20:13 And so anyway, he and the crew there basically make it so you can say here's my app.
20:20 And here's my requirements dot txt.
20:22 Run this, please.
20:24 And until recently, the default has been when you say run this Python app.
20:29 It's like, cool, you mean to seven, right?
20:31 You could run it on Python 3 but you had to like configure it explicitly.
20:36 If you said nothing it ran on Python 2.
20:39 The big news is on what's that June 20th, 2017, Heroku is switching the default to Python 3.6.1.
20:47 Wow.
20:48 So hooray for Python 3.
20:50 So now if you go to Heroku and you say run this, it's going to be like awesome Python 3, right?
20:55 That's what you wanted.
20:56 And so this thing and I'm linking to basically links over or displays their blog post and their blog post is super short that talks about it just says, basically what I said, effective Tuesday, the default runtime is now Python 361.
21:10 Yes.
21:11 If you've already got job running there, it won't switch, right?
21:14 Exactly.
21:15 No, it is only for new projects.
21:17 So in the Reddit thing, there's a few interesting quotes.
21:20 Somebody said lots of new projects start out on Heroku all the time.
21:23 So this is really great news for Python three adoption.
21:26 - Yeah, and someone else said Python 3 is really happening.
21:27 Yay!
21:28 I was actually a little worried about the future of Python for a while, but I feel like it's all downhill from here.
21:34 - Yeah, apparently people that don't listen to our podcast.
21:36 - That's right.
21:37 Our listeners know better.
21:38 I mean, there's a lot of these examples, right?
21:40 We've got all the new frameworks that are exciting, but we also have like Django 2 dropping support for Python 2 and ironically those numbers match up, but the newest version of Django is only gonna be Python, is Python 3 only and things like that.
21:53 It's really starting to pick up speed.
21:56 - Yeah, one of the, that comment there was interesting is that a lot of new projects start on Heroku.
22:01 So must be people starting out a project and then later grabbing different server solutions or something.
22:08 - I haven't done a lot with Heroku to be honest, but I think it's really simple to basically just wire up a Git repository, do a push to it, and it'll just start running your app magically.
22:19 So it's really, really easy to get started.
22:21 And then maybe as you grow, maybe like costs become a concern or you just want more control or whatever, but it's super easy to get started.
22:28 And however you get started on whichever version of Python is probably where you're gonna stay.
22:32 So that's good news.
22:34 - Yeah, great.
22:35 Well, cool.
22:36 - Yeah, very cool.
22:37 And that's it for the news, Brian.
22:38 You got anything else you wanna share?
22:39 - No, no.
22:40 So, wow, number 30 in the can almost.
22:43 So. - 30, yeah.
22:44 That's awesome.
22:45 - I'm finishing up the last chapter this week, chapter seven for Python testing.
22:50 So that's gonna be done soon.
22:52 - Yeah, yeah, very, very cool.
22:54 You're one of these days, the book will be a thing that you've done in the past instead of a constant job of yours.
23:01 Yeah.
23:01 Yeah.
23:02 And hopefully, I can't wait till it's an actual physical copy.
23:05 So it'll be good to have some, a stack of copies with, with that.
23:08 Yeah.
23:08 That's awesome to hear you're making progress.
23:10 And so thanks for covering snooze with me.
23:12 how about you?
23:13 Do you have like now four months of a podcast ready?
23:16 I have, I have about three months of podcasts that I've recorded.
23:20 I'm going to go on vacation for a while in the later half of the summer.
23:23 So I'm trying to make sure that everything is going to be smooth, no interruptions.
23:28 And so I have, I think, 13 or 14 episodes of Talk Python already recorded.
23:33 There's tons of interesting stuff.
23:34 I'm really looking forward to sharing.
23:36 I don't want to hold it back, but I got to dull them out week over week or it'll, it won't solve the problem.
23:41 How about this?
23:42 And as for this podcast, if we, we haven't really decided yet, but if we do a break, we'll definitely let people know before that happens so that they're not just hanging out there waiting.
23:53 Yeah, absolutely. We'll try to keep it rolling, but we might miss a week or two with some trips there in the summer. All right. Well, thanks for sharing your news with everyone.
24:03 And thank you to Datadog. Get your t-shirt, pythonbytes.fm/datadog. Thanks, Brian. See you next week.
24:10 Thank you. Yep. See you.
24:13 Thank you for listening to Python Bytes. Follow the show on Twitter via @pythonbytes. That's Python Bytes as in B-Y-T-E-S. And get the full show notes at pythonbytes.fm. If you have a news item you want featured, just visit pythonbytes.fm and send it our way. We're always on the lookout for sharing something cool. On behalf of myself and Brian Aukin, this is Michael Kennedy. Thank you for listening and sharing this podcast with your friends and colleagues.