Brought to you by Michael and Brian - take a Talk Python course or get Brian's pytest book


Transcript #182: PSF Survey is out!

Return to episode page view on github
Recorded on Wednesday, May 13, 2020.

00:00 Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to

00:04 your earbuds. This is episode 182, recorded May 13th, 2020. And I am Brian Okken.

00:11 And I'm Michael Kennedy.

00:12 And this episode is brought to you by Datadog.

00:15 There's two surveys that I feel really do a good job of keeping their, you know,

00:20 they have their thumb on the pulse of the community. And I would say one is the PSF

00:25 JetBrains combination. That one's really good for Python in particular.

00:29 The other is the Stack Overflow survey. Well, the Stack Overflow survey recently came out,

00:33 you know, a couple months ago or something like that. So now it's the PSF JetBrains survey.

00:38 And when I first saw this, I thought, oh, this is last year's because it was 2019. But

00:42 they do a ton of analysis on it. And the survey was done in 2019, but it's released now. So anyway,

00:48 I want to talk about some of the results in there because there's some interesting takes. And also,

00:53 I want to say thank you to Jose Nario who sent this in because, like I said, I thought it was last year's.

00:58 I'm still waiting for the 2020 edition. All right. So let's talk about some of the results.

01:02 First of all, one of the first questions asked was, do you primarily use Python as your main language?

01:08 Or are you interested in Python because it's some other, like I also happen to use it in addition

01:13 to JavaScript or something? 84% of the people who took the survey, and that was mostly folks who

01:20 visited Python.org. So 84% said it's my primary language. And that's unchanged from last year.

01:28 Okay. Interesting.

01:29 A lot of the analysis here was, how has this changed over the last year or so? All right. So there's a lot

01:35 of interesting trends to be pulled out from here. They said, what other languages do you use? Well,

01:41 there was JavaScript, which has gone down in usage. There's Bash, which has gone down in usage.

01:46 There's HTML, which has gone down in usage relative to last year. And C++, which has gone down relative

01:54 to last year. So the people who took the survey, the same number of people said they're primarily

01:58 using Python, but the other languages they're employing, all the ones who are, which were popular

02:03 seem to be going down. And I would say this means maybe people are starting to lean more on Python,

02:09 those who use it. I don't know what conclusions we should draw there.

02:12 There's also some growth in some other languages as well. I'm guessing like interesting to see all

02:17 the other ones down.

02:18 Yeah. Well, this is down within the Python community, right? This is not necessarily down overall,

02:23 right?

02:24 Yeah.

02:24 So I would say we should look at the Stack Overflow survey to like really more like even

02:29 Stack Overflow trends. But if you look at that one, it's all the other languages are relatively

02:33 down as well. So interesting. Now, a lot of the divide happens around web versus data science

02:41 here. And so a lot of the questions said, are you a web developer primarily or data scientist

02:46 primarily? And then if you are, what type of tools do you use? So for example, if you're

02:51 a data scientist, use Python, of course, but you also make larger use of C++, Java, R and

02:58 C# as a data scientist. But if you are a web developer, you make more use of SQL, JavaScript

03:03 and HTML, which is, you know, no surprise because it's hard to write the web without HTML and

03:09 JavaScript. So those are pretty interesting. And they said 58% of the people use Python

03:15 for both work and personal projects. And they said, all right, well, what do you use it for?

03:20 Right? So there was a bunch of answers. I'm going to give you the top four because the average

03:26 person who filled this out chose four things that they primarily use Python for, right? It's

03:30 like I could use it for web and DevOps or something like that, right? It's not exclusive.

03:35 So the top four were data analysis, which is exactly the same as last year. 59% of the respondents

03:43 said that web development. This is pretty interesting. 51% of the people said they do web development,

03:49 but this is down 4% year over year.

03:52 Interesting.

03:53 That's a big drop. And I feel like web is a big component of what the Python space is.

03:57 Yeah.

03:57 So I don't really know what to drop from that. Like maybe just more people are sort of backfilling in

04:02 the data science side, maybe. Maybe the tools that they use are better. So things like streamlet and

04:08 stuff, they're like, well, I don't have to do web now. I'm not really sure.

04:10 Yeah.

04:11 Machine learning, 40%. That's up 1%. DevOps, 39%. That's down 4% as well, which is the same number

04:18 as a web development. And these are kind of similar things. So what you use Python for is

04:23 pretty interesting. And then more broadly, it says, what do you use it the most for? Web,

04:28 data analysis, and machine learning. Web is down 1% as the primary thing you use it for. Data analysis

04:35 is up 1%. Machine learning is up 2%. Also, not surprising, I guess, but good news on the Python

04:41 versus legacy Python story, 90% of people are using Python 3.

04:47 That's a huge change from five years ago when it was like, oh yeah, there's that one weird guy that

04:52 uses Python 3. But everyone else, no. More of the data scientists are using Python 3 as a percentage

05:00 over the web developers. And I think that's because data science, the libraries and the tools are

05:06 changing a lot. It's not like you have legacy data science. Like, oh yeah, we're using that machine

05:10 library from machine learning library from like 10 years ago because we just don't want to change it.

05:14 It's like those are fundamentally changed and you just have new tools. Whereas web development,

05:19 I think there's a lot of folks that are like, yeah, we're still got that, you know, Django 1 app

05:23 going on Python 2 or something.

05:25 I'm not surprised there's still some Python 2, but 10% still seems a little high.

05:29 I think it's mostly legacy code. I'm not entirely sure, but you can use your legacy Python for your

05:33 legacy code. All right. And then web frameworks. Flask was kind of neck and neck with Django last year.

05:41 Now it's like totally ahead. So 48% Flask developers and 44% Django developers and everything else is

05:47 pretty small. Data science, 63% of the people are using NumPy, 55% Pandas, 46% Matplotlib.

05:55 Here's one I threw in for you, Brian. Testing, 49% of the people are using pytest.

06:00 Yes.

06:01 Yeah. 30% of the people are using the built-in unit test.

06:05 And 34% are using this other framework I hadn't heard of before. It's called the Nun framework.

06:10 Yeah. That's unfortunately high, but it's funny that it's higher than unit test.

06:15 Yeah, I know. That's what I thought as well. Pretty funny. All right. A couple more and we'll

06:19 be done with this one. So cloud, the cloud platforms people are using, AWS is in the lead. No surprise

06:24 there with 55%. What did surprise me is Google Cloud, GCP and whatnot, is actually number two at 33%.

06:33 DigitalOcean, shout out to them, is at 22%. And then Heroku is 20%. And Azure is 19%. So pretty

06:43 interesting. And then how do you run your code in the cloud in a production environment? Do you run

06:48 in containers, VMs, platforms of service like Heroku or something? So containers, 47%. That's pretty high.

06:54 VMs, 46%. And then platform as a service is 25%. What editor do you use? PyCharm, 33%.

07:03 VS Code, 24%. Vim, 9%. Everything else is like a super small margin after that. Yeah. And that's pretty

07:10 interesting. Yeah. So I threw in an extra one on there at the end. Well, one, the containers, I thought

07:16 containers was number two below VMs last year. And now it's jumped to number one.

07:22 Oh yeah. Like that's probably Kubernetes, like hosted Kubernetes clusters, right? That people are throwing their Docker images into. Oh yeah. What's the last one? Take us to this one.

07:32 And then the tool use. They also listed a handful of things of people using. 90% people using version control. That's good. 80% write their tests. That's good. 80% code linting. And 65% people use type hinting, which actually is a little higher than I.

07:48 Oh yeah. Way to go.

07:49 That's nice. And about half the people using code coverage tools, 52%.

07:53 Yeah. Thanks for adding that. I'm very familiar with legacy code and non-legacy code, which I guess you call modern, but you're going to take it to 11. What's up? What's up with this?

08:04 Yes.

08:05 This level.

08:05 It's hyper modern.

08:06 Woo.

08:07 Yeah. This is from Claudio. Cool name. But anyway, so he wrote a, actually, it's like a book. This is an incredible series of blog posts. And he actually writes them out. They're all linked together called hyper modern Python and sets them up in six chapters. He's got setup, testing, linting, typing, documentation, and then the last chapter is CI, CD. And he just wrapped it up.

08:34 I've been watching this and kind of, I was going to announce it when, when he was done. It's a really fun series to learn about, you know, you've learned some of the basics of Python, but you want to like get some best practices and take it up a notch. And I think this, this is good. It's opinionated. Of course, I actually like opinionated things. And some of the opinions I don't quite follow. Like, he likes, PyE and V. I'm not really a fan of PyE and V, but, but that's all right.

09:02 also poetry uses poetry. I use flit or usually, but anyway, he does, recommend the source layout, which is good. But, for setup, there's a whole bunch of neat stuff in here. And, some tools that like for linting, he talks about flake eight black import order bug bear, which is fun. I don't know if we've covered that, but there's a whole bunch of tools that I haven't even heard of. Like safety and dessert.

09:26 And, data validation with dessert and marshmallow. So I'll have to look that sort of stuff up. That looks neat. So just a good run through it in the CICD section. He talked about using GitHub actions and, reporting your coverage with code cov uploading to PyPI and using test PyPI servers and documenting on read the docs. I think this is a fairly good representation for modern projects.

09:50 So, yeah, it covers a lot of stuff that people probably should be doing and maybe haven't taken the time to set up or dig into.

09:57 Yeah. And one of the things I want to, highlight also is an incredible use of pictures. The imagery that he uses on these, posts are like the CICD section was, some 1970s, space station or, you know, space, colony images. And they're beautiful. So it's worth it just for the pictures.

10:18 It is worth it just for the pictures. They're great. Yeah. Very nice one.

10:21 Well, another thing that's really nice is Datadog.

10:24 Indeed.

10:25 Yep. This episode is of Python Bytes is brought to you by Datadog. And let me ask you a question. Do you have an app in production that is slower than you like? It's performance all over the place. Sometimes fast, sometimes slow.

10:38 Now here's the important question. Do you know why? Well, with Datadog, you will. You can troubleshoot your app's performance with Datadog's end-to-end tracing. Use the detailed flame graphs to identify bottlenecks and latency in that finicky app of yours. Be the hero that got the app back on track at your company. Get started today with a free trial of Datadog by going to pythonbytes.fm

11:02 slash Datadog. And you even get a free t-shirt. So I really like my purple Datadog t-shirt. I wear it a little too much, probably.

11:11 Yeah, it's a cute one. Awesome. Yeah, thanks, Datadog.

11:12 This next one that I want to cover, Brian, is a little bit just kind of a fun thing. Dan Bader shared this with me the other day. He's like, man, you got to check this out. Look at this thing. And I'm like, what is it? It's called the Open AI Jukebox.

11:25 And so it's this AI that creates music. It doesn't just create music. It creates different genres of music in the styles of certain artists with lyrics and musical accompaniment.

11:45 It's wild, right? I mean, I had you listen to a couple of the samples and folks out there, you should just go click on the Open AI Jukebox link in the show notes and go to the curated samples and play a couple. How would you classify them, Brian?

12:00 I'd classify them as you can tell their music. None of these I would pick up want to go out, rush out and buy the album right away.

12:09 Yeah, neither would I. I mean, to me, they sound like sort of bad recordings of an artist, you know, maybe taking on like a phone at a live concert where you've kind of got like not a good audio setup.

12:22 There's like a little bit of a, I think I remember this song bit to it, even though there's no way.

12:28 Yeah, but it was created by an AI, which is insane. So one, they've got a country song in the style of Alan Jackson, which is a country singer.

12:38 And you could convince me that that was Alan Jackson singing that song if I didn't know better. I'm like, oh, I've never heard that song and I'm not really super familiar with his music, but I kind of know what the guy sounds like.

12:47 That's probably one of his songs because it sounds like he's singing it, which is crazy. It's got Elvis Presley. It's got Katy Perry.

12:54 It's got some heavy metal in the style of Rage. I've not actually familiar with him. You got some other crazy stuff like alternative metal in the style of Disturbed or a jazz like Ella Fitzgerald.

13:07 And it's really interesting how accurate these things reproduce what those artists' style of singing is, their voice, what their voice sounds like, the style of music they write.

13:20 I mean, I would definitely not want to go listen to this to like relax or whatever, but it's as a AI example, it's pretty crazy.

13:29 Yeah. And one of the things I really appreciate is the music while you're listening to it. It kind of, it shows you the words, the lyrics while you're listening to it.

13:36 Yeah. The lyrics highlight because some of them it's easy to understand, but others like the Disturbed one, it's not so much easy to understand.

13:44 It's like a highlighted thing, but my brain wanted to have, see the little bouncy ball, like, I don't know, the kids music.

13:51 Yeah, exactly.

13:53 So the code for this is available on GitHub. The data set they use, they train the model. To train it, they crawled the web and curated a data set of 1.2 million songs.

14:06 Oh, wow.

14:07 That's got to use some bandwidth to get a hold of those. And then 600,000 of those were in English. And then it paired those with lyrics and metadata from Lyric Wiki.

14:15 Okay.

14:15 So it went and found the songs and said, okay, we need the written text so we can teach the model what the words are so that it can make up its own lyrics, I guess.

14:23 Yeah. And then it says the top level Transformers trained on the task of predicting compressed audio tokens.

14:31 And they provide additional information like the artist and genre for each song and said they get two advantages from that. First, it reduces the entropy of the audio prediction.

14:39 So the model's able to get better audio quality for the given style. And at generation time, they're able to guide it in the style of their choosing.

14:48 Like, no, no, we want some like hard rock versus Elvis or whatever.

14:52 Anyway, if you're into AI, this seems like a pretty wild project to check out.

14:57 Yeah, definitely.

14:58 Yeah. I'd say it's even curious.

15:00 Curious.

15:01 Or if you're curious, you should go check it out.

15:02 You should. And you should also check out this next post called The Curious Case of Python's Context Manager.

15:10 So Redewan Deloire went through this because context managers are really important.

15:16 And when you start really making some elegant, really readable code, it's good to make use of context managers.

15:23 If you're not familiar with what they are.

15:25 Anytime you see a width, like with open file as F or something like that, that's a use of a context manager.

15:33 And the file one is probably the most notable one.

15:35 And so when you leave, what it does is it hooks up the keeping track of the data so that when you exit the block, it can clean up after itself.

15:45 And so really handy things.

15:46 I've seen a lot of different tutorials on how to write them.

15:49 And a lot of them also, they go through the class.

15:53 So in general, the punchline is use the context manager, context lib.contextmanager decorator, and use a yield statement in the middle.

16:02 And that will help you out.

16:04 That's really the working way to do it.

16:07 But if you want to write, you write a class-based one with the dunder init, dunder enter, dunder exit.

16:14 I think those are good examples.

16:16 One of the things I liked about this tutorial is that it went through that, and it didn't seem artificial.

16:20 It seemed, it's like, this isn't one way to do it.

16:23 But it went through it pretty quickly, and we went through some other stuff.

16:26 And it's a pretty quick tutorial.

16:28 But then it gets into some really fun stuff that I really appreciated, like context managers as decorators.

16:37 And writing your own.

16:38 And then create, so that you're not using a decorator to make a context manager.

16:43 You're actually creating a decorator that is a context manager.

16:46 And that's a pretty interesting thing.

16:48 And then wrapping things and nesting context managers with block or with statement.

16:54 Yeah, that's cool to have like three of them instead of just one.

16:57 Rather than like having three with blocks, right?

17:00 Somehow I just missed that.

17:02 Yeah.

17:02 Yeah.

17:02 And I've just nested the with blocks.

17:05 But you don't have to do that.

17:06 You can have them all on one line, which is cool.

17:09 Yeah, there's a lot of cool little tips here.

17:11 And then combining them.

17:12 So like creating context manager that's really a combination of two other context managers or more, more than one, which is nice.

17:18 And then you kind of get into this, right?

17:21 And you think, well, what am I going to do with this?

17:23 Where would I use it?

17:24 And so there's three examples that he lists and shows.

17:28 There's context managers with a SQLAlchemy session, which is a really cool idea.

17:34 Yeah.

17:35 I love this one.

17:36 Yeah.

17:36 Yeah.

17:36 How to use rollback and session so that when you're testing and stuff, you can just automatically undo the thing that was in the block.

17:44 Sweet idea.

17:45 Using it for exception handling so that you can end that in combination with using it as a decorator is a neat idea to have a policy for how to deal with certain exceptions during parts of your code.

17:58 And then just decorate those functions with exception handling.

18:02 That's pretty cool.

18:02 And then the last one was talking about persistent parameters across HTTP requests.

18:07 So it goes from very gentle to really deep into using this well pretty quick, but it's really easy to read.

18:15 Yeah.

18:16 Well, you know, I realize reading through this that I've not been using the decorator style nearly enough.

18:21 I'm always like, oh, I'll just add a class with an enter exit type of thing.

18:25 But yeah, the context managers, you just get a function and do, you know, it's pretty clean.

18:29 Yeah.

18:30 It reminds me a lot of pytest fixtures.

18:33 Yeah.

18:33 Yeah, for sure.

18:34 That's a great article.

18:35 I like it.

18:35 I'll definitely have to study that one up.

18:37 So previously we spoke about NB Dev, which takes Jupyter Notebooks and allows you to do a whole bunch of cool things with them, right?

18:46 You can export the stuff into a script.

18:50 You can have it strip out some of the metadata, the saved output, which is like the bane of all GitHub committed notebooks.

18:58 Because every time you rerun it, if it's taking variable input data, it's going to have different metadata.

19:03 So every run is a conflict, a merge conflict, which is no fun.

19:08 So that thing solved a bunch of those types of things.

19:12 But Clement Roberts sent over a message and said, hey, that's really cool.

19:15 And NB Dev is great.

19:17 But if you're looking just to do the stripping of the metadata, there's a project called NB Strip Out, which is pretty clever.

19:25 And yeah, you can just set it up as like a git pre-commit hook.

19:29 And then every time you commit your stuff to GitHub, it automatically just strips it out so that you never run into any of those merge conflicts.

19:37 No, hooking up as a pre-commit hook.

19:39 That's a great idea.

19:40 Yeah.

19:40 Yeah.

19:40 So you can either run it from the command line or once you're in a GitHub repo and you've pip installed NB Strip Out, you just say NB Strip Out --install.

19:51 That's it.

19:51 Now it's a git pre-commit hook and it'll take care of doing all the things for Jupyter Notebooks.

19:56 Oh, nice.

19:56 Yeah.

19:57 So it basically is the same as going to Jupyter saying clear all output in the UI, but it just does it as you try it, you know, only as you save it to GitHub, which is cool.

20:05 And then there's also a YouTube tutorial, right?

20:09 We've said that it's really cool to have screenshots of like UIs.

20:14 Well, this is also really nice.

20:15 So if you go actually to the PyPI listing, there's like a YouTube video right there that shows you like a four minute tutorial of like why you should care about this.

20:25 And I've done that super useful.

20:27 I'm like, oh, this is kind of interesting.

20:28 Let me watch it.

20:29 I'm like, yep, that looks useful.

20:30 We should talk about it.

20:31 Really nice.

20:32 Yeah.

20:32 Nice.

20:33 Yeah.

20:33 Anyway, so if people are working with Notebooks and they're having this merge conflict issue, it's the saved output.

20:39 People forgetting to clear the output before they commit it.

20:42 Don't make them remember.

20:43 Just do mbstrip out --install and you're golden.

20:47 In episode 179, we had Guido on, which was totally a lot of fun.

20:50 One of the things he brought up was the 2020 Python Language Summit.

20:56 And it was really interesting to listen about that.

20:58 But if you want to read more about it, there's a pretty good write up of all the topics I talked about.

21:04 So we're linking to a post that has links to other posts.

21:08 We've got things like should all strings become f strings and using the peg parser and replacing CPython's parser with the peg parser and different things like that.

21:18 And even some of the lightning talks, just little snippets of kind of what was talked about and kind of like a news article feed of what's going on.

21:28 And I found it really interesting and helpful to be able to kind of pay attention to what's going on at the Language Summit and what's going on with the language going forward to keep up with everything.

21:39 I also wanted to bring up that there's been notifications recently about there's a voting coming up for the board of directors for the Python Software Foundation.

21:49 And so the PSF and some of the board of directors did a video on what this feels like and what does it mean to do that.

21:56 So we're linking to the video and a link to the information about nominations because nominations are open for new board members up until the 31st of May.

22:05 Oh, that's cool.

22:06 Yeah, it is a little bit mysterious to me what the PSF board of director folks do.

22:11 I mean, I can imagine, but I don't really know for sure.

22:14 So it's cool that they've got a video on talking about that.

22:17 Yeah, if you know people who should be part of it, nominate them.

22:20 That's cool.

22:21 Yeah.

22:21 Assuming they want to be nominated.

22:22 Awesome.

22:25 All right.

22:25 Well, I guess that's it for all our items.

22:27 You got anything extra, Brian?

22:27 I don't.

22:28 I've just been working along.

22:29 How about you?

22:30 Two quick things.

22:31 Follow up.

22:32 We talked about Austin, the profiler, which is awesome.

22:37 Does CPU profiling, also memory profiling.

22:40 Remember, it had the TUI.

22:42 It had the web GUI.

22:43 It had all the different user interfaces.

22:46 And one of the ways you could view it, one of the many ways you could view it was through

22:50 SpeedScope.

22:51 It's cool.

22:53 So Wendell Bauman sent in a script that he'd created called PySpeedScope, which will let

23:00 you generate one of these SpeedScope files that you can then visualize from Python, from running

23:06 Python.

23:06 So anyway, I'll just link over to his GitHub repo for PySpeedScope, which looks pretty cool.

23:11 Also, I updated our search engine.

23:14 I realized that when you search for stuff over at Python Bytes, it would give you good results,

23:19 but it would just present them kind of as if they were all equal.

23:24 So now it ranks things.

23:25 So I added ranking to our little search engine.

23:27 So now if you search for stuff, it's more likely to give you good results.

23:31 Well, that's cool.

23:34 Without ranking, isn't it just pulling up random lists of things?

23:37 Well, I mean, everything fit in there, right?

23:40 If you search for everything that came up had, like if you search for, I don't know,

23:45 Jupyter, it would have to have Jupyter in it if it came up as a result.

23:49 But for example, if Jupyter was in the title or just like a random, and here's an example,

23:54 Jupyter notebook versus the topic is about Jupyter.

23:57 Like they would show up in whatever order they just came back.

23:59 Now it's like the stuff that's about it more specifically shows up first.

24:03 Oh, very helpful.

24:04 Yeah, indeed.

24:05 So hopefully that's a little bit helpful.

24:06 I use that all the time.

24:07 I know.

24:08 That's why I did it.

24:09 I'm like, there's this thing.

24:11 I know it's in here.

24:11 And why are there so many results?

24:13 It should be right at the top because the title is basically what I search for.

24:16 Yeah, exactly.

24:17 So this is for us, but people can benefit as well.

24:19 Definitely.

24:20 All right.

24:20 You got some jokes?

24:21 Yeah, these are definitely groaners, but they're submitted by friends of the show on Twitter.

24:27 So I appreciate this.

24:28 So a couple of them.

24:30 Due to social distancing, I wonder how many projects are migrating to UDP and away from TLS

24:36 to avoid all the handshakes.

24:38 Well, we have to, right?

24:40 Yeah.

24:41 You don't want to get a computer virus.

24:43 Next, a chef and a vagrant walk into a bar.

24:45 Within a few seconds, it was identical to the last bar they went into.

24:49 I got it.

24:52 So vagrant manages virtual machines and chefs helps set up, configure those environments.

24:58 Got it.

24:59 Okay.

25:00 Anyway.

25:01 So.

25:01 Yeah.

25:02 I like how the, you're leaving somewhat understanding these as an exercise of the reader.

25:07 And partially I'm ruining their joy of solving the problem.

25:11 Yeah.

25:11 No, it's all good.

25:14 I took me, both of these took me a little bit of Googling to understand.

25:18 Beautiful.

25:18 I like them.

25:19 They're, they're definitely groaners, but they do kind of make you think for a second as

25:22 well, which is great.

25:23 Yeah.

25:24 So thanks a lot.

25:25 Yeah.

25:25 You bet.

25:25 I think we're done.

25:26 Thanks.

25:26 See ya.

25:27 Bye.

25:27 Thank you for listening to Python Bytes.

25:29 Follow the show on Twitter at Python Bytes.

25:32 That's Python Bytes as in B-Y-T-E-S.

25:35 And get the full show notes at pythonbytes.fm.

25:37 If you have a news item you want featured, just visit pythonbytes.fm and send it our way.

25:43 We're always on the lookout for sharing something cool.

25:45 This is Brian Okken.

25:46 And on behalf of myself and Michael Kennedy, thank you for listening and sharing this podcast

25:50 with your friends and colleagues.

Back to show page