Brought to you by DigitalOcean - grab your $100 credit and deploy your first project for free


« Return to show page

Transcript for Episode #120:
AWS, MongoDB, and the Economic Realities of Open Source and more

Recorded on Thursday, Feb 28, 2019.

0:00 KENNEDY: Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds. This is Episode 120 recorded February 28th, 2019. I'm Michael Kennedy.

0:12 OKKEN: And I'm Brian Okken.

0:13 KENNEDY: And this episode is sponsored by Digital Ocean, check them out at pythonbytes.fm/digitalocean. Lots of good stuff at that URL, but more about that later. Brian welcome back to the show man.

0:23 OKKEN: Oh, thanks, nice episode last week.

0:24 KENNEDY: Thanks, yeah, we went to Seattle and we all had fun and sadly you were knocked out. But understandably.

0:29 OKKEN: Man I sent it to everybody I knew. My daughter and whole bunch of people said at least listen to the first bit of it because this is why I enjoy working with Michael.

0:38 KENNEDY: Wow.

0:38 OKKEN: It was so nice, nice of you to do a shout out to me. That was nice.

0:42 KENNEDY: Thanks, yeah, that's really great. Your welcome, that's the least I could do. Wish you were there, but that's okay. Speaking of people who were almost there, at least at the conference, so I got to spend a lot of time with, but I wasn't actually at that live recording, Nina Zakharenko. How about that? That's our first, the source of our first item, right?

0:57 OKKEN: Yes, how could I resist? She put together this wonderful resource for people. So, what I'm talking about is The Ultimate Guide To Memorable Tech Talks, and it is. She said it was going to be a blog post that turned into what she calls a book, but it's seven articles on medium. It's a seven-part series that covers things like choosing a topic, writing a topic proposal, tools that she uses. Planning, writing it, writing your talk out, practicing it and delivering the talk, and it's a phenomenal resource. I'm not done with it actually, I've read up through the tools section, and I'm looking forward to the rest of it. But one of the things, I mean it speaks right to me because I got into podcasting partly to get better about this public speaking thing and it's difficult, and especially for, I think, tool nerds and introverts and stuff like that. Jumping into the tools in the little details you can lose sight of stuff. So one of the I think she noticed she quote from the article is, "I noticed I'd procrastinate on making the slides look good, instead of focusing my time on making quality content."

2:04 KENNEDY: It's easy to get sucked into that, yeah.

2:05 OKKEN: Yeah, totally! When I should have been, like, I've given two talks, and many, many talks informally at work and stuff, but two conferences. I spent so much time just tweaking the different slides instead of practicing it and making sure I understand it well enough to even throw the slides away and just do the talk, and so I think focusing on planning and leaving all the time enough to get all that. This is actually the series that I wish I had when I started getting into trying to do talks, because, either you have, out there there's either stuff that doesn't have enough information, or it's just a quick gloss-over, or it focuses on one aspect, or on the other hand you got courses and books. This is just a series of blog posts that you can staple together with one staple. It's good.

2:55 KENNEDY: Not the big, heavy, super-duper like binder but like a regular one, right? Like the red one from office space.

3:01 OKKEN: Yeah, yeah, just one of those.

3:03 KENNEDY: Yeah, now this a great series, and nice work Nina, good pick Brian. I think it'll help a lot of people, I like that it's writing, it actually covers writing the talk proposal because it's great to get into public speaking but you have to get accepted to get into public speaking a lot of times. Although, although at the bigger conferences, say PyCon, you can always do an open space, and that's kind of like a dip-your-toe-in-the-water 'cause you don't have to lead the whole conversation, but you kind of are, sort of, leading it in spirit.

3:29 OKKEN: Yeah, and then you can go through all of these steps for, just to get ready, and you only have to be up there for, I don't know how long the lightning talks are, but they're not long.

3:37 KENNEDY: Yeah, for sure. Yeah, lightning talks are five minutes, open space is, I think, are 25, but you don't do the talking the whole time. We're definitely going to, you know, as you spoke about the live recording, we're definitely going to do more live recordings at the main PyCon as some open spaces if nothing else.

3:50 OKKEN: Yeah, definitely, I wanted to interview a bunch of people for stuff.

3:54 KENNEDY: Bringing the mics. Okay, super. So the next one is something I've been digging into lately. Have you done anything Kubernetes?

4:00 OKKEN: No.

4:01 KENNEDY: Let me take a step back. Have you done anything with Docker?

4:03 OKKEN: No.

4:03 KENNEDY: So these two things are sort of stacked on top of each other, and they often get complicated. So, Docker is a little bit of a complicated thing if you haven't done a lot of deployment and dev ops type stuff, because it's all about configuring Linux, isolated machines if you will. Then to actually use them properly, really, with scale out and fail over and clusters and connections between them and all that, like a Docker container for your web app, one for the database, you really need some kind of orchestration which is Kubernetes. Whew! So, there's a ton to learn about all this stuff. Luckily, Michael Herman from TestDriven.Io formerly of Real Python. I did a really nice write-up here. He has a course on TestDriven.io and this is like an extracted, long example tutorial to get started with Flask on Kubernetes.

4:52 OKKEN: Wow, that actually sounds neat.

4:54 KENNEDY: Yeah, it's actually pretty approachable, I said it's a complicated topic, but he really lays out the core ideas and you know, if you ask Pocket it says reading time is 16 minutes, so it's pretty hefty, but it's not book-size, it's not super, super long. Of course, the little steps that it's has take more than just one line, or whatever. You got to like let it configure a cluster or something. So there's a little wait period, but really nice, it talks about how to basically get a Vue JS front end, Flask back end, app that talks to a database, Postgres in particular, up and running on a Kubernetes cluster. So kind of micro-service style, which is pretty cool. I'll just run off some of the goals that you'll get out of it. So explain what a container and container orchestration is, pros and cons of you using Kubernetes over, say, things like Docker Swarm. All the primitive concepts of Kubernetes node, pod, service, deployment, etc. Spinning up a Python app, which is nice, locally with Docker Compose. There's a nice little utility called Minikube, which lets you basically in just a couple of command lines create a Kubernetes cluster pre-configured on your machine, which is a super pain, so that's really nice that you can just install that thing and have it going, and stuff like that, it's really quite nice.

6:06 OKKEN: Oh, I'm definitely going to check this out. Plus, in 16 minutes, I can now put like 75 new keywords on my resume.

6:13 KENNEDY: Oh yeah. This is dense in buzzword bingo hits. Nice! You and I both picked something that's a little on a little bit on the controversial side of the world in open source and the community. You go first.

6:27 OKKEN: Okay so this is just, I think it's still playing out, but in the end January, maybe they announced it earlier, but on the Travis CI blog, they announced that Travis CI is to join Idera. I don't know, another company. Basically they got bought by another company, and frankly I don't really care who owns what, but then in February, I started seeing the hashtag Travis Alums on Twitter, and they, it looks like, from the outside looking in, it looks like they're laying off a bunch of engineers, and I don't really know what's going on because they're not really talking about it. There hasn't been anything else on GitHub or the Travis CI blog to say what's going on. So I've used Travis for running tests on Python projects. I've wanted to do this anyway, so now is a good time to start looking at if I've got source code on GitHub and I want to use testing somewhere, and it isn't Travis, where would I do it? So, right now I'm trying to use GitLab and Azure Pipelines with the help of some other people to try to get those running on a couple of projects. But there's a lot more, I mean, GitHub lists 17 different options for continuous integration that you can hook up.

7:43 KENNEDY: Yeah, that's a big deal, right? Travis has been kind of the go-to, plug it into your open source, public, GitHub repo, and just let it churn away, and do all the checks on PRs and things like that, right?

7:54 OKKEN: They claim that our experience is not going to be different, however, with less engineers, well it seems it might stagnate and stuff.

8:04 KENNEDY: Yeah, something you also talked about here is Azure Pipelines, and that definitely seems like it's getting a lot of traction. I've heard of some major open source projects moving over to that, either partially or entirely.

8:18 OKKEN: I'm dropping a few links in here. One of the links is a couple things from Anthony Shaw, of course, it wouldn't be a show without Anthony. One of them is an article that he writes about Azure Pipelines with Python, by example, and also, he's got a Pytest plugin to help with testing on Azure. One of the people that's been helping me out is, I'm going to get his name wrong, but Anthony Sottile? Sorry, Anthony. But, he's got a whole bunch of different Azure templates on his GitHub repo that look neat.

8:51 KENNEDY: Yeah, cool, and as far as I understand it, Azure Pipelines are free for public repos and things like that, so it's interesting.

8:59 OKKEN: All the things I'm going to try are things that have reasonable, free levels for open source projects.

9:06 KENNEDY: That's kind of an interesting trend that's been going around. There sure are a lot of things that are free for public, open source repos, but paid for private stuff. That seems like a pretty good balance to me.

9:16 OKKEN: Out in the GitHub go-to allow private repos is interesting, and I'm not sure what drove the Travis change, but, anyway...

9:25 KENNEDY: Yeah, I'm pretty sure GitHub was pressured by Bitbucket, but I don't know anything about the roots of the Travis one. Hey, do you have more good new around DigitalOcean for any and all of your listeners? So, they've traditionally had these, like, high-compute instances called droplets, they're virtual machines, and they've had memory-heavy ones. But the memory-heavy ones didn't have dedicated CPUs, like didn't share with the VM host, right? I still couldn't be guaranteed of consistent workloads on that thing. So now they're announcing a general-purpose droplet that is a blend of dedicated CPUs and a wide range of memory configurations. So basically you can get quite a bit of RAM, I think the lowest one is 8 gigs and it goes up to 64 or 120 gigs of RAM, and a whole bunch of dedicated CPUs. So it's pretty cool, and they talk about some good examples of using it would be, like web applications, hosting an e-commerce site where latency matters, or a medium-size relational DB, like Postgres or a NoSQL like MongoDB database 'cause you want your database to be fast, right? That's the heart of your app, even if you scale web front end, and other sorts of analytics, and so on. So, you can check that out, it's in limited availability, but you can go and click a button and say, "I want to try this." So check them out at pythonbytes.fm/digitalocean. It really supports the show, and keeps us going each week.

10:49 OKKEN: Yes, very cool.

10:50 KENNEDY: Yeah, thanks. So, speaking of infrastructure, it's cool to be able to set up your code, and your websites, and all of your infrastructure to run on Linux, and you might even do that on Docker, you might even put those Docker containers on Kubernetes, but a lot of us are not using Linux as our dev machine, right? A lot of us are using Macs, at least the ones you'll see walking around the conferences. So this next item is called Python Server Setup for Mac OS, and it's cool, a little Apple emoji.

11:18 OKKEN: Nice!

11:19 KENNEDY: Yeah so if you want to run your Nginx, and uWSGI, or your Gunicorn, like your production software stack, but you want to run it locally on Mac OS, here's a little guide on how to do that. So it basically takes you through setting up Nginx, having Nginx serve your static assets, they're using Gunicorn and Flask, but you could pretty easily swap that out for uWSGI, or whatever. Whatever else you want to use, and then Nginx be the front-end proxy for like a scaled-out Gunicorn back-end. So it's pretty cool, now, I went through this recently, and not all the commands seem to be working, at least on my system, I don't know why or what they're going through? Maybe I missed a step? But, even if the commands don't all exactly work, I think it's still is a pretty good guide, and it's on GitHub, so if you find mistakes open an issue. It's not super popular, but I think a lot of people find it helpful if they're trying to develop in something closer to their actual configuration.

12:15 OKKEN: Okay, is this more for like for development environment sort of thing?

12:19 KENNEDY: Exactly, 'cause normally what you do is get something like Waitress, or you get some other built-in non-production, full-Python based web server. Then all the requests go through your Flask, or Pyramid, or whatever. But in production it doesn't work that way, you have Nginx doing SSL, and doing static assets, and then only passes the python requests over, and there's quite a bit of difference. So if you want to have sort of a QA-local thing on your Mac, here you go.

12:47 OKKEN: Huh, that's nice, okay. I'll have to check this out.

12:50 KENNEDY: What do you have for the next one?

12:51 OKKEN: When I first started using command-line interface stuff, I used Click. I did a quick search for different ways to build a command-line interface. Meaning not an interactive thing, although I know those are a thing, but just if I've got a script that I'm writing and I want to take values in, it's parsing the arguments passed into the script, or whatever. Click is great, I love it, but it's something extra, it's not built in to Python yet, it's an extra-dependency. That's fine for a lot of times, but sometimes you just want a couple different parameters that somebody could pass in, and so the built-in argparse is part of the center of the library, and that would be good to use.

13:34 KENNEDY: Yeah, it's cool, and it's better than just going to sys.argv and going after an array of, whatever, right? You need a little more help than that.

13:40 OKKEN: Right, I actually didn't realize that argparse was so easy, so there's Jeff Hale put together an article called, "Learn Enough Python To Be Useful argparse." That's kind of a long title, but it's a tutorial on argparse, and it's that I couldn't find a good intro guide to argparse when I needed one, so I read this article. I got to say, it's a very nice, quick introduction. So, if you want to throw some arguments on a application, try this.

14:07 KENNEDY: It definitely seems like a good idea. There's certainly times where I'm like, you know, I just don't feel like going to all the formality. I just need to know, basically did they pass some simple thing here, yes or no, right? And this is cool, especially for my little proof of concept ideas. Yeah, so it comes built in, which is really nice, nothing to install. So you don't necessarily want to take on Click and make people with virtual environments and pip, and all that. It's literally, this is the only, that would be the only thing, right? So here's another good chance to not do that.

14:37 OKKEN: Oh yeah, that's a great idea, it's if you don't already have any other dependencies, then yeah, using this.

14:42 KENNEDY: Exactly, 'cause that'll definitely increase the challenge of using your code. Alright, so you got to go first with Travis CI, I have another one that's a little bit of a sticky situation here, but I think it's really interesting to dig into, okay? This is actually from January 14th? Actually, let me look at the AWS blog and see when they published it. January 9th. So this is something I've been wanting to talk about for a little while, but it's involved, there's a lot of moving parts, and a lot of layers to this so I didn't want to just give it like a super peripheral, sort of skim the surface kind of thing. So, here's the deal. I'm talking about an article by a guy named Ben Thompson who runs Stratechery, which is a really great resource for understanding the business side of software, so it's super interesting work that he's doing there. This one's called, "AWS MongoDB and The Economic Realities of Open Source." Yeah, and he also runs a podcast called Exponent. Exponent is sort of the audio side. He does this with another guy named James, and I'm linking to an hour-long episode he did on this called inverted pyramids, which talks about demand and building platforms, and things like that. So, really nursing inter-play between open source, cloud computing, and so on. But it focused on this disagreement at a business level between MongoDB and AWS, okay? So MongoDB, they make databases, right? They make a document database, and they have kind of a unique API and they are definitely the most popular of the document databases. Regardless of whether you like MongoDB or not, certainly they are well known and well used in that space, in the document DB space. If you're not doing that, you're probably doing Postgres in the Python world. But, AWS wanted to have a MongoDB as a service type of offering, like they have RDS, and other manage-database options, they want to have, basically, a MongoDB-compatible one. But MongoDB, the company, their business model is basically three things. They sell an enterprise version of the on-premise software, that's kind of unrelated to this, we have a free-community version, and then they sell this thing called AtlasDB, which is MongoDB as a service. What you do is you go to a MongoDB Atlas, and you say, "I'd like to run this on AWS, or I'd like to run it on Azure, on Google's Cloud." So that's MongoDBs business model. AWS just wants to have a service. So MongoDB, who owns the IP, the licensing of MongoDB, changed it to something called AGPL. Have you heard of AGPL?

17:21 OKKEN: Well, I think, but I don't remember what it is.

17:22 KENNEDY: You've heard of GPL, right? Basically, if you use this software in yours, then you must make yours opens source. That's true when you directly depend on it, but what if it's hosted and managed by Amazon, and you just talk to it over the network? You're not interacting with it, you're not using it, right? Well, AGPL is like GPL plus network. So if you access this software over the network, it's like the GPL applies to you.

17:48 OKKEN: Oh, weird!.

17:49 KENNEDY: Isn't that weird? So basically, what that means is AWS cannot have a hosted MongoDB, because if they do, it triggers this AGPL and AWS becomes open source, or something crazy like this, right? The consequences are too high, and Apple's banned it from the App Store, Google has banned it from all the, any AGPL software, basically, is not allow within Google. It's really not something these cloud companies are interested in having interactions with it, right? Because it takes the GPL bit, and doesn't talk about shipping stuff, but accessing over the network. So, that brings us to the interesting part. So what does AWS do? Not just go, oh great, we just keep running Atlas. No, AWS says, "Today," they said this on January 9th, "Today, we are launghing Amazon DocumentDB with MongoDB compatibility, a fast, scalable, and highly-available database that's designed to be compatible with your existing MongoDB applications and tools." It's purpose-built, it has all this replication, et cetera, et cetera, and it's all for your production scale MongoDB workloads. So they took the exact API of MongoDB, and rebuilt a brand new server and service from it from scratch.

18:57 OKKEN: Okay, interesting.

18:58 KENNEDY: They didn't do it on the latest one, cause the latest one has the AGPL, so they went back to 3.6, the latest one if 4.05, or something like that. So they went back to the latest one, the newest one that doesn't have this restriction. They mimicked that, which should be good enough for most people, I would guess.

19:13 OKKEN: Yeah, weird.

19:14 KENNEDY: Is that interesting? You changed your license so we can't have it, so we literally copy your API like byte for byte, on the wire, identical, and rebuild a new service from scratch.

19:24 OKKEN: Okay, both sides are kind of being sneaky, and whatever.

19:30 KENNEDY: Yes, I feel a little more kinship towards the MongoDB side because they developed this whole thing from scratch, and they built it up, and they got it popular. But yeah, it's definitely interesting, tit and tat.

19:43 OKKEN: So what is the AGPL to air all of Mongo then? What if I build my website?

19:47 KENNEDY: Yeah, the community edition is separate for your self-hosting somehow, but there's clauses about accessing it over the network. Basically, you know very much like GDPR is like mostly built to fight Google and Facebook, right? This, my understanding is basically this is to fight the cloud providers, not meant to interact with regular people just running the community version.

20:08 OKKEN: Okay, but regular people are running on clouds too.

20:11 KENNEDY: I know, it's quite interesting.

20:12 OKKEN: I just would not want to pick a fight with one of these big players, but, whatever.

20:17 KENNEDY: Yeah, so I encourage people to both, maybe first listen to the podcast, and then read the article if you want to go all in, but you can skim the article. You can't really skim the podcast, and I know some people do, like 2.5x, but those people are crazy. It melts my brain, I can't do it.

20:31 OKKEN: I do 1.3.

20:32 KENNEDY: Yeah, that's a pretty good balance, literally just layer upon layer. Maybe I'll come to the conclusion real quick. So, thus we have arrived at a conundrum for open-source companies. MongoDB leveraged open source to gain mind share, right? Started as open source. MongoDB, Inc. started a successful company selling addition tools for enterprises to run MongoDB, but more and more enterprises don't want to run software, easy or not, they want to hire AWS or Microsoft or Google, or some other cloud provider to run it for them because they value performance and scalability, and all that. So it leaves MongoDB pretty much like a little outside the value chain, which is interesting, and there's just a lot of interesting trade-offs for open source, VC-funded companies, and traditional monetization strategies are looking harder, and harder in the face of cloud computing companies that just go, that's a cool API, we'll do that. I told you, lots of layers, I really, I don't know, it takes a lot of pondering, I really don't know where I am, but I think it's a big deal, honestly.

21:32 OKKEN: It is a big deal, and for the rest of us, the topic is important. Document databases are kind of amazing, and I think we need to push those forward, and having legal structures trying to get in the way of making this just better for everybody. I know that everybody that started it should be able to make money, but also it needs to move forward. So yeah, there's lots of sides to this.

21:57 KENNEDY: There're a lot of angle from a lot of sides, and Ben does a super job of covering it. So, anyway, link to them, if that sounds interesting, and check it out, it's not exactly the newest of news, but I think it's big news still in open-source world. Alright, you got some extra stuff for us this week?

22:12 OKKEN: I'm so terrible with names, but the people who do the Teaching Python podcast, I just released an episode of Testing Code where I interviewed them.

22:22 KENNEDY: Yeah, I saw that come out, that's great!

22:23 OKKEN: Yeah, so that was cool. Do you have any extras?

22:25 KENNEDY: I have a couple of extras, just mostly follow-ups and some news. So the folks running PyTexas in Austin, which sounds like a really fun place to be if you coulnd't come to PyCascades in Seattle. PyTexas is going to be April 13th and 14th, registration is open, and I'm linking to their page and stuff. It looks pretty cool.There's one, yeah, there is the article that we covered from Anthony Shaw last week. Apparently, it was like two years old, and somebody had sent it to us, so it felt like, oh yeah, here's a new article from Anthony. So we covered it as if without any caveats, Anthony Shaw, sorry we dug up your old, your very far-past predictions of the future and then presented them from today. But they're still good, I thought it was quite a good article, and it stands the test of time pretty well. And finally, remember when we talked about Rust Python and I said, the reason this makes me super, super happy Brian is that it enables this web assembly future of Python, because it's Rust, has a very strong web assembly story?

23:24 OKKEN: Yep.

23:25 KENNEDY: Well, on that episode page, one of the core developers posted a comment, said, "Thanks for covering this. By the way, rustpython.github.io/demo." Yeah, that's it running in the web. The CPython built on Rust, well Rust Python, running the web under web assembly. So if you open that bad boy up, boom. It's up and running, you can go and type stuff. There's not much of a standard library, that's something we already covered previously, so it's kind of mostly at language level thing, but here it is, two megabites web assembly binaries running in your browser, it's pretty cool.

24:01 OKKEN: That is really pretty cool.

24:03 KENNEDY: Yeah, yeah right, so that's all of it for my extra stuff as well. You got a joke for me? People are enjoying this jokes segment.

24:10 OKKEN: I am enjoying the jokes segment. I do not have one.

24:12 KENNEDY: Okay, I got two, I'll see what you think of these, okay? Why was the developer unhappy at their job?

24:17 OKKEN: I don't know, why?

24:18 KENNEDY: They wanted A-R-R-A-Y-S, arrays. That's pretty bad.

24:24 OKKEN: That's lists for all the Python people.

24:26 KENNEDY: Yes. Where did the parallel function wash it's hands?

24:30 OKKEN: I'm not sure, where?

24:30 KENNEDY: In async.

24:31 OKKEN: Was there a line, or did it have to wait?

24:35 KENNEDY: That's pretty good, yeah, these are bad, these are pretty bad but here they are.

24:42 OKKEN: Now it's good. I have kids, so I'll re-tell these. They won't understand it, but that's alright.

24:47 KENNEDY: You know, but they'll be in the style in which they should appreciate, they just won't, the jokes just won't make sense.

24:54 OKKEN: Yeah, anyway.

24:56 KENNEDY: Awesome, well, welcome back, and thank for doing the show with me today.

24:59 OKKEN: Yeah, thank you.

25:00 KENNEDY: You bet.

25:01 OKKEN: Thank you for listening to Python Bytes. Follow the show in Twitter via @PythonBytes, that's Python Bytes as in B-Y-T-E-S, and get the full show notes at pythonbytes.fm. If you have a news item you want featured, just visit pythonbytes.fm, and send it our way. We're always on the lookout for sharing something cool. On behalf of myself and Bryan Okken, this is Michael Kennedy. Thank you for listening and sharing this podcast with your friends and colleagues.

Back to show page