Are Data Scientists becoming obsolete in the LLM era? Not even close!
Alternatively titled: Why the 'scientist' in data science matters more than ever
Hello fellow datanistas!
Have you ever caught yourself wondering, 'If LLMs can code, what’s left for data scientists to do?' I’ve been hearing this question a lot lately, and I get the anxiety. Watching ChatGPT or Claude whip up Python scripts and debug code in seconds can make anyone pause and rethink their role.
In this post, I want to share my take after months of integrating LLMs into my own workflow. Rather than making us obsolete, these tools are fundamentally reshaping what it means to be a data scientist. I’ll break down how LLMs are enhancing our daily work and how they’re opening up new opportunities for us to build, experiment, and measure in ways that go back to our scientific roots.
Let’s start with the obvious: LLMs are incredible productivity boosters. Tools like GitHub Copilot and Cursor have made my coding faster and more fluid. Research agents like Elicit.org help me navigate the literature at a pace I never thought possible. I even use AI transcription to get my thoughts out almost as quickly as I can think them. But here’s the thing: being proficient with these tools is now table stakes. Just as spreadsheet fluency became a given for accountants, fluency with AI assistance is becoming a baseline expectation for data scientists. The real skill is knowing how to use these tools critically: verifying information, catching errors, and making sure the outputs actually make sense.
But there’s a deeper shift happening. The most exciting part of my work lately has been building custom LLM workflows with business partners. This isn’t just about coding—it’s about co-creating new tools that remove tedious work and drive real business value. Here, the scientist’s mindset is crucial: hypothesizing, defining metrics, designing experiments, and measuring outcomes. It’s not about being an app developer or a business analyst. It’s about being the person who can connect technical possibilities with rigorous measurement and business impact.
Taking Hamel Husain and Shreya Shankar’s course on LLM evaluation really crystallized this for me. The core of our role is to measure, evaluate, and design metrics—just like in discovery science. Whether we’re working with molecules or business processes, we hypothesize, define what matters, test, and measure. That’s the heart of being a scientist, and it’s more relevant than ever in the LLM era.
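To make "define what matters, test, and measure" a little more concrete, here is a minimal Python sketch of an evaluation loop for an LLM workflow. It is an illustration only, not material from the course: `EvalCase`, `exact_match`, `evaluate`, and the canned `stub_workflow` are all hypothetical names, and the stub stands in for a real model call. The one piece of real statistics is the Wilson score interval, which puts honest error bars on a pass rate measured from a small test set.

```python
import math
from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalCase:
    """One labeled example: an input prompt and the answer we expect."""
    prompt: str
    expected: str


def exact_match(output: str, expected: str) -> bool:
    """A deliberately simple metric: case-insensitive exact match."""
    return output.strip().lower() == expected.strip().lower()


def wilson_interval(passes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if n <= 0:
        raise ValueError("need at least one evaluation case")
    p = passes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


def evaluate(
    workflow: Callable[[str], str],
    cases: list[EvalCase],
    metric: Callable[[str, str], bool] = exact_match,
) -> dict:
    """Run the workflow on every case and report the pass rate with its interval."""
    passes = sum(metric(workflow(case.prompt), case.expected) for case in cases)
    low, high = wilson_interval(passes, len(cases))
    return {
        "n": len(cases),
        "passes": passes,
        "pass_rate": passes / len(cases),
        "ci_low": low,
        "ci_high": high,
    }


# Hypothetical stand-in for an LLM call: canned answers keyed by prompt.
CANNED = {
    "Classify: 'refund not received'": "billing",
    "Classify: 'app crashes on login'": "technical",
    "Classify: 'how do I upgrade?'": "sales",
    "Classify: 'cancel my account'": "billing",  # wrong on purpose
}


def stub_workflow(prompt: str) -> str:
    return CANNED[prompt]


cases = [
    EvalCase("Classify: 'refund not received'", "billing"),
    EvalCase("Classify: 'app crashes on login'", "technical"),
    EvalCase("Classify: 'how do I upgrade?'", "sales"),
    EvalCase("Classify: 'cancel my account'", "retention"),
]

report = evaluate(stub_workflow, cases)
print(report)
```

With only four cases the interval is very wide, which is the point: a pass rate without error bars can look far more convincing than the evidence behind it. In practice you would swap the stub for your real workflow and the metric for whatever your business partners agree actually matters.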
LLMs aren’t making data scientists obsolete—they’re pushing us back to our scientific roots, where our value lies in hypothesizing, measuring, and building systems that connect business value with statistical rigor.
How are you using LLMs in your own workflow? Have you started building custom tools, or are you still exploring what’s possible? I’d love to hear your experiences and what’s working (or not) for you.
If you’re curious about how to thrive as a data scientist in the LLM era, read the full blog post for more insights and practical tips. If you found this helpful, feel free to share or subscribe for more!
Cheers,
Eric

