

Transcript #182: PSF Survey is out!

Recorded on Wednesday, May 13, 2020.

00:00 Hello and welcome to Python Bytes where we deliver Python news and headlines directly to your earbuds. This is episode 182 recorded May 13th, 2020.

00:10 And I am Brian Okken. And I'm Michael Kennedy. And this episode is brought to you by Datadog.

00:16 There's two surveys that I feel really do a good job of keeping their, you know, they have their thumb on the pulse of the community. And I would say one is the PSF JetBrains combination. That one's really good for Python in particular.

00:29 and the other is the Stack Overflow Survey.

00:31 Well, the Stack Overflow Survey recently came out, you know, a couple months ago or something like that.

00:35 So now it's the PSF JetBrains Survey.

00:38 And when I first saw this, I thought, oh, this is last year's 'cause it was 2019, but they do a ton of analysis on it.

00:44 And the survey was done in 2019, but it's released now.

00:48 So anyway, I wanna talk about some of the results in there because there's some interesting takes.

00:53 And also I wanna say thank you to Jose Nario who sent this in because, like I said, I thought it was last year's; I'm still waiting for the 2020 edition.

01:00 All right, so let's talk about some of the results.

01:02 First of all, one of the first questions asked was do you primarily use Python as your main language or are you interested in Python because it's some other like I also happen to use it in addition to JavaScript or something.

01:15 84% of the people who took the survey, and that was mostly folks who visited python.org, so 84% said it's my primary language and that's unchanged from last year.

01:27 Okay, interesting. A lot of the analysis here was how has this changed over the last year or so. Right, so there's a lot of interesting trends to be pulled out from here. They said what other languages do you use? Well, there was JavaScript which has gone down in usage. There's bash which has gone down in usage. There's HTML which has gone down in usage relative to last year. And C++ which has gone down relative to last year. So of the people who took the survey, the same number of people said they're primarily using Python, but the other languages they're employing, all the ones which were popular seem to be going down.

02:04 And I would say this means maybe people are starting to lean more on Python, those who use it.

02:10 I don't know what conclusions we should draw there.

02:12 There's also some growth in some other languages as well, I'm guessing.

02:16 Interesting to see all the other ones down.

02:18 Yeah.

02:19 Well, this is down within the Python community, right?

02:22 This is not necessarily down overall, right?

02:24 So I would say we should look at the Stack Overflow survey to like really more like even Stack Overflow trends.

02:31 But if you look at that one, it's all the other languages are relatively down as well.

02:34 So interesting.

02:36 Now, a lot of the divide happens around web versus data science here.

02:42 And so a lot of the questions said, are you a web developer primarily or a data scientist primarily?

02:47 And then if you are, what type of tools do you use?

02:50 So for example, if you're a data scientist, use Python, of course, but you also make larger use of C++, Java, R, and C# as a data scientist.

03:00 But if you are a web developer, you make more use of SQL, JavaScript, and HTML, which is, you know, no surprise because it's hard to write the web without HTML and JavaScript.

03:10 So those are pretty interesting.

03:12 And they said 58% of the people use Python for both work and personal projects.

03:18 And they said, all right, well, what do you use it for?

03:20 Right?

03:21 There is a bunch of answers.

03:23 I'm going to give you the top four because the average person who filled this out chose four things that they primarily use Python for.

03:31 I could use it for web and DevOps or something like that.

03:34 It's not exclusive.

03:35 So the top four were data analysis, which is exactly the same as last year.

03:42 59% of the respondents said that.

03:44 Web development.

03:45 This is pretty interesting.

03:46 51% of people said they do web development, but this is down 4% year over year.

03:52 - Interesting.

03:53 - That's a big drop, and I feel like web is a big component of what the Python space is.

03:57 So I don't really know what to draw from that.

03:59 Like maybe just more people are sort of backfilling in the data science side maybe.

04:04 Maybe the tools that they use are better.

04:07 So things like Streamlit and stuff, they're like, well, I don't have to do web now.

04:10 I'm not really sure.

04:11 Machine learning, 40%, that's up 1%.

04:13 DevOps, 39%, that's down 4% as well, which is the same number as web development. And these are kind of similar things. So what you use Python for is pretty interesting. And then more broadly, it says what do you use it the most for: web, data analysis, and machine learning. Web is down 1% as the primary thing you use it for, data analysis is up 1%, machine learning is up 2%. Also, not surprising, I guess, but good news on the Python versus legacy Python story: 90% of people are using Python 3.

04:47 That's a huge change from five years ago when it was like, "Oh yeah, there's that one weird guy that uses Python 3, but everyone else, nah." More of the data scientists are using Python 3 as a percentage over the web developers.

05:02 I think that's because data science, the libraries and the tools are changing a lot.

05:07 It's not like you have legacy data science.

05:12 Like, "Oh yeah, we're using that machine learning library from 10 years ago because we just don't want to change it." It's like those are fundamentally changed and you just have new tools.

05:21 Whereas web development, I think there's a lot of folks that are like, "Yeah, we've still got that Django 1 app going on Python 2 or something." - I'm not surprised there's still some Python 2, but 10% still seems a little high.

05:32 - I think it's mostly legacy code.

05:34 I'm not entirely sure, but you can use your legacy Python for your legacy code.

05:35 All right, and then web frameworks.

05:38 Flask was kind of neck and neck with Django last year.

05:42 Now it's like totally ahead.

05:43 So 48% Flask developers and 44% Django developers and everything else is pretty small.

05:49 Data science, 63% of the people are using NumPy, 55% Pandas, 46% Matplotlib.

05:55 Here's one I threw in for you, Brian, testing.

05:58 49% of the people are using pytest.

06:00 - Yes. - 30%.

06:01 Yeah, 30% of the people are using the built-in unit test.

06:05 And 34% are using this other framework I hadn't heard of before.

06:09 It's called the none framework.

06:11 - Yeah, that's unfortunately high, but it's funny that it's higher than unit test.

06:15 - Yeah, I know, that's what I thought as well.

06:17 Pretty funny.

06:18 All right, a couple more and we'll be done with this one.

06:20 So cloud, the cloud platforms people are using, AWS is in the lead, no surprise there, with 55%.

06:26 What did surprise me is Google Cloud, GCP and whatnot is actually number two at 33%.

06:34 DigitalOcean, shout out to them, is at 22% and then Heroku is 20% and Azure is 19%.

06:42 So pretty interesting.

06:44 And then how do you run your code in the cloud in a production environment?

06:48 Do you run in containers, VMs, platforms as a service like Heroku or something?

06:51 So containers, 47%, that's pretty high.

06:55 VMs 46% and then platform as a service is 25%.

07:00 What editor do you use?

07:02 PyCharm 33%, VS Code 24%, Vim 9%.

07:06 Everything else is like a super small margin after that.

07:08 Yeah, and that's pretty much it.

07:10 - Yeah, so I threw in an extra one on there at the end.

07:14 Well, one, the containers, I thought containers was number two below VMs last year.

07:20 And now it's jumped to number one.

07:22 - Oh yeah, like that's probably Kubernetes.

07:24 like hosted Kubernetes clusters, right?

07:28 That people are throwing their Docker images into.

07:30 Oh yeah, what's the last one?

07:31 Take us through this one.

07:32 - And then the tool use.

07:33 They also listed a handful of things of people using.

07:36 90% people using version control, that's good.

07:39 80% write their tests, that's good.

07:41 And 80% code linting.

07:43 And 65% people use type hinting, which actually is a little higher than I--

07:48 - Oh yeah, way to go.

07:49 - That's nice.

07:50 And about half the people using code coverage tools, 52%.

07:54 Yeah, thanks for adding that.

07:55 I'm very familiar with legacy code and non-legacy code, which I guess you call modern, but you're going to take it to 11.

08:03 What's up?

08:04 What's up with this level?

08:06 - It's hyper modern.

08:07 - Woo!

08:08 - Yeah.

08:09 This is from Claudio, cool name.

08:11 But anyway, so he wrote a, actually, it's like a book.

08:15 This is an incredible series of blog posts and he actually writes them out.

08:20 They're all linked together, called Hypermodern Python, and he sets them up in six chapters.

08:26 He's got setup, testing, linting, typing, documentation, and then the last chapter is CI/CD, and he just wrapped it up.

08:34 I've been watching this, and I was gonna announce it when he was done.

08:38 It's a really fun series to learn about.

08:41 You know, you've learned some of the basics of Python, but you wanna get some best practices and take it up a notch.

08:48 And I think this is good.

08:50 It's opinionated, of course.

08:52 I actually like opinionated things and some of the opinions I don't quite follow.

08:55 Like he likes pyenv, I'm not really a fan of pyenv, but that's all right.

09:02 Also poetry, he uses poetry.

09:04 I use Flit usually, but anyway, he does recommend the source layout, which is good.

09:10 But for setup, there's a whole bunch of neat stuff in here.

09:13 And some tools that, like for linting, he talks about Flake8, Black, import-order, Bugbear, which is fun, I don't know if we've covered that. But there's a whole bunch of tools that I haven't even heard of like safety and desert, and data validation with desert and marshmallow.

09:29 So I'll have to look that sort of stuff up.

09:31 That looks neat.

09:32 So just a good run through. In the CI/CD section, he talked about using GitHub Actions, and reporting your coverage with Codecov, uploading to PyPI and using test PyPI servers and documenting on Read the Docs.

09:46 I think this is a fairly good representation for modern projects.

09:50 Yeah, it covers a lot of stuff that people probably should be doing and maybe haven't taken the time to set up.

09:55 Or dig into. Yeah, and one of the things I want to highlight also is the incredible use of pictures. The imagery that he uses on these posts, like the CI/CD section, was some 1970s space station or, you know, space colony

10:14 images, and they're beautiful, so it's worth it just for the pictures.

10:18 It is worth it just for the pictures, they're great.

10:20 Yeah, very nice one.

10:21 Well, another thing that's really nice is Datadog.

10:24 Indeed.

10:25 Yep, this episode of Python Bytes is brought to you by Datadog and let me ask you a question.

10:31 Do you have an app in production that is slower than you like?

10:34 It's performance all over the place, sometimes fast, sometimes slow.

10:38 Now here's the important question, do you know why?

10:41 Well, with Datadog you will.

10:43 You can troubleshoot your app's performance with Datadog's end-to-end tracing.

10:47 Use the detailed flame graphs to identify bottlenecks and latency in that finicky app of yours.

10:52 Be the hero that got the app back on track at your company.

10:56 Get started today with a free trial of Datadog by going to pythonbytes.fm/datadog.

11:04 And you even get a free t-shirt.

11:06 So I really like my purple Datadog t-shirt.

11:09 I wear it a little too much, probably.

11:10 That's a cute one. Awesome. Yeah. Thanks, Datadog.

11:12 This next one that I want to cover, Brian, is a little bit just kind of a fun thing.

11:17 This, Dan Bader shared this with me the other day.

11:19 He's like, "Man, you got to check this out. Look at this thing." And I'm like, "What is it?" It's called the OpenAI Jukebox.

11:25 And so it's this AI that creates music.

11:31 And it doesn't just create music.

11:35 It creates different genres of music in the styles of certain artists with lyrics and musical accompaniment.

11:44 So it's wild, right?

11:46 I mean, I had you listen to a couple of the samples.

11:48 And folks out there, you should just go click on the Open AI Jukebox link in the show notes, go to the curated samples, and play a couple.

11:57 They're-- how would you classify them, Brian?

12:00 I'd classify them as you can tell they're music.

12:02 [LAUGHS]

12:05 None of these I would pick up, want to go out, rush out, and buy the album right away.

12:09 Yeah, neither would I. I mean, to me, they sound like sort of bad recordings of an artist, you know, maybe taken on like a phone at a live concert where you've kind of got like not a good audio setup.

12:22 There's like a little bit of a, I think I remember this song, bit to it, even though there's no way.

12:28 Yeah, but it was created by an AI, which is insane. So one, they've got a country song in the style of Alan Jackson, which is a country singer.

12:37 And you could convince me that that was Alan Jackson singing that song, if I didn't know better.

12:42 I'm like, oh, I've never heard that song, and I'm not really super familiar with his music, but I kind of know what the guy sounds like.

12:47 That's probably one of his songs, because it sounds like he's singing it, which is crazy.

12:51 It's got Elvis Presley.

12:52 It's got Katy Perry.

12:54 It's got some heavy metal in the style of Rage.

12:58 I'm not actually familiar with them.

13:00 You got some other crazy stuff, like alternative metal in the style of Disturbed or a jazz like Ella Fitzgerald.

13:07 And it's really interesting how accurate these things reproduce what those artists' style of singing is, their voice, what their voice sounds like, the style of music they write.

13:20 I would definitely not want to go listen to this to relax or whatever.

13:24 But as a AI example, it's pretty crazy.

13:29 Yeah.

13:29 And one of the things I really appreciate is that while you're listening to the music, it shows you the words, the lyrics. - Yeah, the lyrics highlight, 'cause some of them it's easy to understand, but others, like the Disturbed one, it's not so easy to understand.

13:44 - It's like a highlighted thing, but my brain wanted to see the little bouncy ball, like, I don't know, the kids' music.

13:52 - Yeah, exactly.

13:54 So the code for this is available on GitHub.

13:57 The dataset, they use, they train the model.

13:59 To train it, they crawled the web and curated a data set of 1.2 million songs.

14:06 Oh wow.

14:07 That's got to use some bandwidth to get a hold of those.

14:09 And then 600,000 of those were in English.

14:11 And then it paired those with lyrics and metadata from LyricWiki.

14:15 Okay.

14:15 So it went and found the songs.

14:17 It said, "Okay, we need the written text so we can teach the model what the words are so that it can make up its own lyrics, I guess." Yeah.

14:25 And then it says the top level transformer is trained on the task of predicting compressed audio tokens.

14:30 And they provide additional information like the artist and genre for each song.

14:35 And it said they get two advantages from that.

14:37 First, it reduces the entropy of the audio prediction.

14:40 So the model's able to get better audio quality for the given style.

14:44 And at generation time, they're able to guide it in the style of their choosing.

14:48 Like, no, no, we want some like hard rock versus Elvis or whatever.

14:52 Anyway, if you're into AI, this seems like a pretty wild project to check out.

14:57 Yeah, definitely.

14:58 Yeah, I'd say it's even curious.

15:00 Curious.

15:01 Or if you're curious, you should go check it out.

15:02 You should, and you should also check out this next post called "The Curious Case of Python's Context Manager." So Redowan Delowar went through this because context managers are really important.

15:16 And when you start really making some elegant, really readable code, it's good to make use of context managers.

15:23 If you're not familiar with what they are, anytime you see a with, like with open file as F or something like that, that's a use of a context manager.

15:33 The file one is probably the most notable one.

15:35 So what it does is keep track of the resource, so that when you exit the block, it can clean up after itself.

15:44 So really handy things.
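To make that cleanup concrete, here is a minimal sketch (the file name and variable names are made up for illustration, they are not from the article). It checks that a file opened in a with block is closed once the block exits, including when the block exits through an exception:

```python
import os
import tempfile

# Throwaway file path so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), "example.txt")

with open(path, "w") as f:
    f.write("hello")
    inside_closed = f.closed   # the file is still open inside the block

after_closed = f.closed        # leaving the block closed the file

# Cleanup also runs when the block exits through an exception.
try:
    with open(path) as g:
        raise ValueError("boom")
except ValueError:
    pass

closed_after_error = g.closed
```

The with statement is doing roughly what a try/finally with an explicit close() call would do, just with less ceremony.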

15:46 I've seen a lot of different tutorials on how to write them, And a lot of them also, they go through the class.

15:53 So in general, the punchline is use the context manager, contextlib.contextmanager decorator, and use a yield statement in the middle.

16:02 And that will help you out.

16:04 That's really the working way to do it.

16:07 But if you want to write, you write a class based one with the dunder init, dunder enter, dunder exit.

16:14 I think those are good examples.
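As a rough sketch of the two styles mentioned here, the same timing helper is written once as a class with dunder init, enter, and exit, and once with the contextlib.contextmanager decorator and a yield. The names Timer and timer are hypothetical and only for illustration:

```python
import time
from contextlib import contextmanager


class Timer:
    """Class-based context manager using __init__, __enter__, __exit__."""

    def __init__(self, label):
        self.label = label
        self.elapsed = None

    def __enter__(self):
        self._start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc, tb):
        self.elapsed = time.perf_counter() - self._start
        return False  # do not swallow exceptions


@contextmanager
def timer(label, results):
    """Generator-based version: code before yield is setup, code after is cleanup."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


with Timer("class-based") as t:
    sum(range(1000))

results = {}
with timer("decorator-based", results):
    sum(range(1000))
```

The try/finally around the yield matters: without it, the cleanup after the yield would be skipped if the body of the with block raised.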

16:16 One of the things I liked about this tutorial is that it went through that.

16:19 and it didn't seem artificial, it seemed it's like, this is the one way to do it, but it went through it pretty quickly and went through some other stuff and it's a pretty quick tutorial.

16:28 But then it gets into some really fun stuff that I really appreciated, like context managers as decorators and writing your own, so that you're not using a decorator to make a context manager, you're actually creating a decorator that is a context manager.

16:46 That's a pretty interesting thing.
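One way to get a decorator that is also a context manager is the standard library's contextlib.ContextDecorator. This is a sketch under that assumption; the article may build it differently, and the logged class here is a made-up example:

```python
from contextlib import ContextDecorator

calls = []


class logged(ContextDecorator):
    """Hypothetical helper: records entry and exit around a block or a function."""

    def __init__(self, name):
        self.name = name

    def __enter__(self):
        calls.append(f"enter {self.name}")
        return self

    def __exit__(self, exc_type, exc, tb):
        calls.append(f"exit {self.name}")
        return False  # let exceptions propagate


@logged("as decorator")
def add(a, b):
    return a + b


result = add(2, 3)

with logged("as with block"):
    pass
```

The same object works in both positions: wrapping a whole function, or wrapping just a few lines inside one.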

16:48 and then wrapping things and nesting context managers within a single with block or with statement.

16:54 Yeah, that's cool to have like three of them instead of just one.

16:58 Rather than like having three with blocks, right?

17:00 Somehow I just missed that.

17:02 Yeah.

17:02 Yeah.

17:03 And I've just nested the with blocks, but you don't have to do that.

17:06 You can have them all on one line, which is cool.
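A small sketch of the difference, using two files in a temporary directory (paths and contents are illustrative only). The nested form and the one-line form do the same thing:

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "src.txt")
dst_path = os.path.join(workdir, "dst.txt")

with open(src_path, "w") as f:
    f.write("line one\nline two\n")

# Nested form: one with block per resource.
with open(src_path) as src:
    with open(dst_path, "w") as dst:
        dst.write(src.read().upper())

# Equivalent single statement: managers are entered left to right
# and exited in reverse order.
with open(src_path) as src, open(dst_path, "w") as dst:
    dst.write(src.read().upper())

with open(dst_path) as f:
    copied = f.read()
```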

17:09 Yeah, there's a lot of cool little tips here.

17:11 And then combining them.

17:12 So like creating a context manager that's really a combination of two other context managers or more than one, which is nice.
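One possible way to combine them, sketched with contextlib.contextmanager (the step and combined names are hypothetical, and the article may take a different route such as contextlib.ExitStack):

```python
from contextlib import contextmanager

events = []


@contextmanager
def step(name):
    """Toy context manager that records when it starts and ends."""
    events.append(f"start {name}")
    try:
        yield name
    finally:
        events.append(f"end {name}")


@contextmanager
def combined():
    # Enter both managers; they unwind in reverse order on exit.
    with step("first") as a, step("second") as b:
        yield a, b


with combined() as (a, b):
    events.append(f"body {a}+{b}")
```

Callers only see one with statement, while setup and teardown of both pieces still happen in the right order.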

17:19 Then you get into this and you think, well, what am I going to do with this?

17:23 Where would I use it?

17:24 There's three examples that he lists and shows.

17:28 There's context managers with SQLAlchemy session, which is a really cool idea.

17:34 >> Yeah, I love this one.

17:36 >> How to use rollback and session so that when you're testing and stuff, you can just automatically undo the thing that was in the block.
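The article does this with a SQLAlchemy session. To keep the sketch free of third-party dependencies, the version here swaps in the standard library's sqlite3 connection instead, so treat it as the same idea on a different API. The rollback_scope helper is a made-up name:

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def rollback_scope(conn):
    """Hypothetical helper: always undo changes made inside the block."""
    try:
        yield conn
    finally:
        conn.rollback()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.commit()

with rollback_scope(conn) as c:
    c.execute("INSERT INTO users VALUES ('test-user')")
    rows_inside = c.execute("SELECT COUNT(*) FROM users").fetchone()[0]

# The insert was rolled back when the block exited.
rows_after = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
```

In a test suite this kind of block lets each test write whatever it needs and leave the database as it found it.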

17:44 Sweet idea. Using it for exception handling so that you can, and that in combination with using it as a decorator is a neat idea to have a policy for how to deal with certain exceptions during parts of your code, and then just decorate those functions with exception handling. That's pretty cool.
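A minimal sketch of that exception-policy idea. It relies on the fact that context managers built with contextlib.contextmanager can also be applied as decorators. The log_and_suppress name and the policy itself (record the error, then carry on) are assumptions for illustration:

```python
from contextlib import contextmanager

handled = []


@contextmanager
def log_and_suppress(*exc_types):
    """Hypothetical policy: record the listed exceptions, then carry on."""
    try:
        yield
    except exc_types as err:
        handled.append(f"{type(err).__name__}: {err}")


@log_and_suppress(ZeroDivisionError, KeyError)
def ratio(a, b):
    return a / b


ok = ratio(6, 3)        # normal return value passes through
failed = ratio(1, 0)    # exception is recorded; the call returns None
```

One trade-off to keep in mind: when the exception is swallowed, the decorated function returns None, so callers need to be ready for that.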

18:03 And then the last one was talking about persistent parameters across HTTP requests.

18:08 So it goes from very gentle to really deep into using this, well, pretty quick, but it's really easy to read.

18:15 Yeah, well, you know, I realize reading through this that I've not been using the decorator style nearly enough.

18:21 I'm always like, oh, I'll just add a class with an enter/exit type of thing, but yeah, the context managers you just get a function and do, you know, it's pretty clean.

18:29 Yeah, it reminds me a lot of pytest fixtures.

18:33 Yeah, for sure. That's a great article. I like it. I'll definitely have to study that one up.

18:37 So, previously we spoke about NBDev, which takes Jupyter notebooks and allows you to do a whole bunch of cool things with them, right?

18:46 You can export the stuff into a script, you can have it strip out some of the metadata, the saved output, which is like the bane of all GitHub committed notebooks, because every time you rerun it, if it's taking variable input data, it's going to have different metadata, so every run is a conflict, a merge conflict, which is no fun.

19:08 So that thing solved a bunch of those types of things, but Clement Roberts sent over a message and said, "Hey, that's really cool and NBDev is great, but if you're looking just to do the stripping of the metadata, there's a project called NBStripOut, which is pretty clever." And yeah, you can just set it up as like a Git pre-commit hook, and then every time you commit your stuff to GitHub, it automatically just strips it out so that you never run into any of those merge conflicts.

19:37 - Oh, hooking up as a pre-commit hook, that's a great idea.

19:40 - Yeah, yeah, so you can either run it from the command line or once you're in a GitHub repo and you've pip installed nbstripout, you just say nbstripout --install, that's it.

19:51 Now it's a Git pre-commit hook and it'll take care of doing all the things for Jupyter notebooks.

19:56 - Oh, nice.

19:57 - Yeah, so it basically is the same as going to Jupyter saying clear all output in the UI, but it just does it as you try it, only as you save it to GitHub, which is cool.
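To show what "clear all output" amounts to, here is a rough sketch that works directly on the notebook JSON. It is not nbstripout's actual implementation, just an illustration that assumes the nbformat 4 layout, where code cells carry outputs and execution_count fields. The strip_outputs function and the sample notebook are made up:

```python
import json


def strip_outputs(notebook):
    """Return a copy of a notebook dict with code-cell outputs cleared."""
    stripped = json.loads(json.dumps(notebook))  # cheap deep copy
    for cell in stripped.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return stripped


# Tiny in-memory notebook standing in for a real .ipynb file.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": 7,
            "source": ["1 + 1"],
            "outputs": [
                {
                    "output_type": "execute_result",
                    "execution_count": 7,
                    "data": {"text/plain": ["2"]},
                    "metadata": {},
                }
            ],
        },
    ],
}

clean = strip_outputs(notebook)
```

Because the source cells are untouched and only the run-dependent fields are reset, two people rerunning the same notebook end up committing identical files.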

20:06 And then there's also a YouTube tutorial, right?

20:09 We've said that it's really cool to have screenshots of like UIs.

20:14 Well, this is also really nice.

20:15 So if you go actually to the PyPI listing, there's like a YouTube video right there that shows you like a four minute tutorial of like why you should care about this.

20:25 And I've done that, it's super useful.

20:27 I'm like, oh, this is kind of interesting.

20:28 Let me watch it.

20:29 I'm like, yep, that looks useful.

20:30 We should talk about it.

20:32 Really nice. - Yeah, nice.

20:33 - Yeah, anyway, so if people are working with notebooks and they're having this merge conflict issue with the saved output,

20:40 people forgetting to clear the output before they commit, don't make them remember, just do nbstripout --install, and you're golden.

20:47 - In episode 179, we had Guido on, which was totally a lot of fun.

20:51 One of the things he brought up was the 2020 Python Language Summit.

20:56 And it was really interesting to listen about that.

20:58 But if you want to read more about it, there's a pretty good write up of all the topics they talked about.

21:04 So we're linking to a post that has links to other posts.

21:09 We've got things like, should all strings become f-strings and using the PEG parser and replacing CPython's parser with the PEG parser and different things like that.

21:18 And even some of the lightning talks, just little snippets of kind of what was talked about and kind of like a news article feed of what's going on.

21:28 I found it really interesting and helpful to be able to kind of pay attention to what's going on at the language summit and what's going on with the language going forward to keep up with everything.

21:39 I also wanted to bring up that there's been notifications recently about there's a voting coming up for the board of directors for the Python Software Foundation.

21:50 And so the PSF and some of the board of directors did a video on what this feels like and what does it mean to do that.

21:56 So we're linking to the video and a link to the information about nominations 'cause nominations are open for new board members up until the 31st of May.

22:06 - Oh, that's cool.

22:06 Yeah, it is a little bit mysterious to me what the PSF board of director folks do.

22:12 I mean, I can imagine, but I don't really know for sure.

22:14 So it's cool that they've got a video on talking about that.

22:18 Yeah, if you know people who should be part of it, nominate them.

22:20 That's cool.

22:21 - Yeah.

22:22 - Assuming they want to be nominated.

22:23 (laughing)

22:24 Awesome.

22:25 All right, well, I guess that's it for all our items.

22:27 You got anything extra, Brian?

22:28 - I don't, I've just been working along.

22:30 How about you?

22:31 - Two quick things.

22:31 Follow up: we talked about Austin, the profiler, which is awesome.

22:37 It does CPU profiling, also memory profiling.

22:40 Remember, it had the TUI, it had the web GUI, it had all the different user interfaces.

22:46 And one of the ways you could view it, one of the many ways you could view it was through Speedscope.

22:51 It's cool.

22:53 So Wendell Bauman sent in a script that he'd created called PySpeedscope, which will let you generate one of these Speedscope files that you can then visualize from Python.

23:05 Nice.

23:06 So, anyway, I'll just link over to his GitHub repo for PySpeedscope, which looks pretty cool.

23:12 Also, I updated our search engine.

23:14 I realized that when you search for stuff over at Python Bytes, it would give you good results but it would just present them kind of as if they were all equal.

23:24 So now it ranks things.

23:25 So I added ranking to our little search engine.

23:28 So now if you search for stuff, it's more likely to give you good results.

23:32 Well, that's cool.

23:35 Without ranking, isn't it just pulling up random lists of things?

23:37 Well, I mean, everything fit in there, right?

23:41 If you search for it, everything that came up had, like if you search for, I don't know, Jupyter, it would have to have Jupyter in it if it came up as a result.

23:49 But for example, if Jupyter was in the title or just like a random, and here's an example Jupyter notebook versus the topic is about Jupyter.

23:57 Like they would show up in whatever order they just came back.

24:00 Now it's like the stuff that's about it more specifically shows up first.
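The episode doesn't say how the ranking is actually implemented, so the following is only a guess at the general idea: count matches, and weight a match in the title more heavily than one in the body. The score and search functions, the weight of 10, and the sample items are all assumptions:

```python
def score(query, item):
    """Score an item: title matches count far more than body matches."""
    q = query.lower()
    title_hits = item["title"].lower().count(q)
    body_hits = item["body"].lower().count(q)
    return 10 * title_hits + body_hits  # assumed weights


def search(query, items):
    """Return only matching items, best match first."""
    matches = [item for item in items if score(query, item) > 0]
    return sorted(matches, key=lambda item: score(query, item), reverse=True)


items = [
    {"title": "Testing tips", "body": "Here's an example Jupyter notebook for the demo."},
    {"title": "Jupyter notebooks in production", "body": "All about Jupyter workflows."},
    {"title": "Web frameworks", "body": "Flask and Django."},
]

ranked_titles = [item["title"] for item in search("jupyter", items)]
```

With this weighting, the item that is about Jupyter comes back ahead of the one that only mentions it in passing, and the unrelated item is dropped.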

24:03 Oh, very helpful.

24:04 Yeah, indeed.

24:05 So hopefully that's a little bit helpful.

24:06 For me too.

24:07 I use that all the time.

24:08 I know.

24:09 That's why I'm like, there's this thing.

24:11 I know it's in here.

24:12 And why are there so many results?

24:13 It should be right at the top because the title is basically what I searched for.

24:16 Yeah, exactly.

24:17 So this is for us, but people can benefit as well.

24:19 Definitely.

24:20 All right.

24:21 You got some jokes?

24:22 These are definitely groaners, but they're submitted by friends of the show on Twitter, so I appreciate this.

24:29 So a couple of them.

24:30 Due to social distancing, I wonder how many projects are migrating to UDP and away from TLS to avoid all the handshakes.

24:38 >> Well, we have to, right?

24:40 >> Yeah.

24:41 >> You don't want to get a computer virus.

24:43 >> Next, a chef and a vagrant walk into a bar.

24:46 Within a few seconds, it was identical to the last bar they went into.

24:52 I got it. So Vagrant manages virtual machines and Chef helps set up and configure those environments.

24:59 Got it.

25:00 Okay.

25:01 Nice.

25:02 Anyway, so.

25:03 Yeah. I like how you're leaving understanding these somewhat as an exercise for the reader, and partially I'm ruining their joy of solving the problem.

25:11 Yeah. No, it's all good. Both of these took me a little bit of Googling to understand.

25:18 Beautiful. I like them. They're definitely groaners, but they do kind of make you think for a second as well, which is great.

25:23 Yeah.

25:24 So thanks a lot.

25:25 Yeah.

25:26 You bet.

25:27 I think we're done.

25:28 Thanks.

25:29 See ya.

25:30 Bye.

25:31 Thank you for listening to Python Bytes.

25:32 Follow the show on Twitter @pythonbytes.

25:33 That's Python Bytes as in B-Y-T-E-S.

25:35 And get the full show notes at pythonbytes.fm.

25:38 If you have a news item you want featured, just visit pythonbytes.fm and send it our way.

25:43 We're always on the lookout for sharing something cool.

25:45 This is Brian Okken, and on behalf of myself and Michael Kennedy, thank you for listening and sharing this podcast with your friends and colleagues.
