AI Saviorism and more
Hello fellow datanistas!
It’s another edition of the Data Science Programming Newsletter. This time round, I’ve found five wonderful articles and blog posts that I think are worth your time to check out. Here they are:
Get rid of AI Saviorism
Shreya Shankar has a wonderful, thoughtful post arguing against the saviorism syndrome plaguing many of us data scientists nowadays. In my opinion, the big takeaway is that applied machine learning is a team sport and must be done with stakeholder buy-in. Otherwise, we end up with big fancy models with no path to impact. 💯 a good read for all of us practitioners!
A picture is worth 1,000 false-positive bug reports
This blog post by Stitch Fix details how their Algorithms team came up with a clever solution to help their internal users, the Warehouse team, debug path routing algorithms for warehouse workers. Here, as it turns out, it was an innovative data visualization application that did the trick. One lesson we definitely can learn here is that deep empathy for the people we're serving will help us devise smart, simple, and accessible solutions to their problems.
Nines of safety: a proposed unit of measurement of risk
This comes from the blog of Terry Tao, a famed mathematician at UCLA, who provides a very cool way of expressing the chance of staying safe (in a binary outcome) in units of "nines of safety". 99.99% safe would get 4 nines, 99.999% safe would get 5, and so on, but he also provides a way to express arbitrary percentages in the same units. I won't spoil the fun here; I think it's a smart way to communicate ideas such as Service Level Agreements (SLAs) or how safe we are against certain causes of death. (Fun fact: we have 5.9 nines of safety against dying from an airplane crash but only 4 nines of safety against dying from a car accident.)
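For those who do want the spoiler: the conversion is a base-10 logarithm of the failure probability, so a success probability p corresponds to -log10(1 - p) nines. Here is a minimal Python sketch of that conversion. The function names are my own, made up for illustration, and are not from the original post.

```python
import math


def nines_of_safety(p_safe: float) -> float:
    """Convert a probability of a safe outcome into 'nines of safety'.

    Uses nines = -log10(1 - p_safe), so 0.9999 maps to roughly 4 nines.
    (Hypothetical helper name, for illustration only.)
    """
    if not 0.0 <= p_safe < 1.0:
        raise ValueError("p_safe must be in [0, 1); 100% safe means infinite nines")
    return -math.log10(1.0 - p_safe)


def safety_from_nines(nines: float) -> float:
    """Inverse conversion: nines of safety back to a probability of safety."""
    return 1.0 - 10.0 ** (-nines)


if __name__ == "__main__":
    # A few whole-number and in-between cases.
    for p in (0.5, 0.9, 0.9999, 0.99999):
        print(f"{p:.3%} safe -> {nines_of_safety(p):.2f} nines")
```

The nice part is that in-between values fall out naturally: a coin flip (50% safe) comes out to about 0.3 nines, and you can go the other way with `safety_from_nines` to turn a figure like 5.9 nines back into a percentage.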
Postmortem: A Year of Data Science Peer Review in Startups
Unlike software engineers and academics, data scientists have few formal outlets and incentives for peer review. In this excellent blog post, Shay Palachy details some of the lessons learned from implementing data science peer review in organizations that he has worked at. The lessons are golden nuggets on how, where, and when to solicit feedback from your fellow data scientists. I favour reviews of all kinds (code, strategy, etc.) at many stages of a project, as continual feedback helps with alignment, knowledge sharing, learning, and much more.
Where does bad code come from?
This is an excellent YouTube talk by YouTuber Molly Rocket (Casey Muratori). It was shared with me by fellow pyjanitor dev Logan Thomas, who commented that it provides a set of incisive observations on how we develop as programmers. In particular, Casey points out that experienced programmers use patterns heavily, a point corroborated in the book The Programmer's Brain by Felienne Hermans. I'm sharing this link in the spirit of both encouragement and reminder: as data scientists, our models are also ultimately code, just written a different way. If we get good at coding, we get good at making our code accessible to others. New data scientists, learn from the experienced programmers in your professional lives, and pay attention to the patterns of code that they use!