How Can Organizations Think Differently to Get the Most Out of AI?

Cassie Kozyrkov
Jul 17, 2024

What is to become of the Data Scientist role?

I get this question often from executives and more recently was asked about the evolving role of data teams at DataRobot.

My TL;DR answer? Tools will change, but the foundation stays the same. Data science is an expanding universe of possibilities, so the role of the data scientist is more critical than ever.

Read on…

Looking beyond the AI-everything

With the recent explosion of consumer AI-everything, it’s unsurprising that data scientists are wondering what it all means for their careers. If the proliferation of consumer AI products feels overwhelming, data scientists can take a collective breath: every consumer AI product is built on data and creates an enormous volume of exhaust data, which smells of opportunity.

But in many ways, consumer AI is a distraction. The game-changing opportunities for data science careers lie in enterprise-scale automation.

Enterprise-scale automation means building gargantuan systems effectively, safely, and reliably. When it comes to scale, even something as simple as a barcode lookup can present implementation challenges. Complex enterprise AI systems at scale — like applications involving fleet-wide predictive maintenance or fully automated customer service — require superheroes. Enter: the data scientist!

Data scientists take a methodical approach to inference, and are experts at exploring data. Data scientists are the professionals best positioned to identify opportunities for automation, design approaches for testing and monitoring systems at scale, and work with cross-functional teams on managing the end-to-end process of bringing AI solutions from ideation to production.

This is important because once you’re doing enterprise-scale automation with no human to check the output before it leaves your system, your biggest concern should be: does it work? In other words, does this massive system that’s automating your process at scale do so safely and effectively? And that’s where the data scientist shines! Like superheroes waiting for their moment, you’ll see data scientists step up to help you figure out just how performant your shiny new generative tools are at enterprise scale.
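
To make "does it work?" a little more concrete, here is a minimal sketch of one way a data scientist might start answering it, assuming you can draw a random sample of the system's outputs and have each one reviewed as pass or fail. The function name, the sample numbers, and the choice of a Wilson score interval are illustrative assumptions, not a prescribed method.

```python
import math


def wilson_interval(failures: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a failure rate estimated from a random sample.

    failures: number of sampled outputs judged unacceptable
    n:        number of outputs sampled and reviewed
    z:        normal quantile (1.96 corresponds to roughly 95% confidence)
    """
    if n <= 0 or not 0 <= failures <= n:
        raise ValueError("need n > 0 and 0 <= failures <= n")
    p = failures / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return max(0.0, center - half_width), min(1.0, center + half_width)


# Hypothetical audit: 12 unacceptable outputs found in 1,000 sampled ones.
low, high = wilson_interval(failures=12, n=1000)
print(f"Plausible failure rate: {low:.2%} to {high:.2%}")
```

The point of the interval is that a sample only pins the failure rate down to a range. Whether the top of that range is tolerable is a decision for the business, not something the code can settle.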

Unlike traditional software systems, it’s not possible to “read the code” to figure out how well an automation solution is performing in your production environment, which is why you’ll need expert data scientists handling the process of understanding the efficacy and value of the AI solutions you’ll be relying on. And as businesses adopt AI tools for automating more and more of their software generation at scale — not tiny bug fixes but entire libraries — the absence of human eyes on individual coding tasks means that data science skills will be mandatory for effective debugging. Every convenience offered to data scientists by AI tools and products will be balanced by an explosion of responsibility that requires a data science mindset. There will be more demand for the data science profession than ever before. In fact, the scope of work will widen.

New skills for Data Scientists: The guardians of AI apps

To handle this automation at scale, data scientists will have to get good at new things. In addition to building, data scientists will be critical to securing what they’ve built.

Generative AI brings new risks.

Enterprises haven’t yet developed the muscle or best practices for securing AI applications from things like prompt injections or jailbreak attempts. There will also be more weight put on governance: ensuring that new AI-driven solutions and workflows impact the business positively.

Enterprise GenAI systems serving users at scale without the need for each output to be checked individually is a step change in how we work, and leaders will have to think about their challenges in new ways. All of this requires great data science. The workflows may look different, but data scientists might find something very familiar in the challenges presented by this brave new world. In many ways, this is the moment they’ve been training for their whole lives.

Building Great Data Teams in the Age of AI

So, will data scientists be automated out of jobs? No.

But here’s an important caveat: they must identify with the function of their role, rather than specific tools. Because tools evolve! We are a tool-building species. If you are defined by the tool you learn to use, you’ll feel stress as the ecosystem grows.

If you were to define yourself by punch cards, you wouldn’t feel like a modern-day data scientist.

But if you define yourself by the core foundations of your discipline, you’ll realize tools are your friends that can help you do better work.

In fact, we’ll need much more than great data scientists. Roles will evolve from the leadership level all the way down to entry level, so teams should be built intentionally to include all the skills and personalities needed to make enterprise automation work at scale.

Here’s how I see this evolving:

Business Leaders: Soft skills & regulation prevail

As AI becomes more ubiquitous, business leaders will be forced to excel in the very human endeavor of knowing the problem they want to solve, then choosing the tools and teams to make it happen.

Decision-making skills will be critical: What does success look like? Which mistakes are tolerable? What’s unacceptable? Is it more important that the performance of the AI system is great or that the safety nets are trustworthy? (Spoiler alert: it’s the safety nets.) Which data is used? At which scale will you deploy this system… and who takes responsibility for the output?

Business leaders must also contend with ever-changing regulation. I have one strong opinion about regulation, which is that for applications of substantive impact on society, there must be an auditable system of record for the core decisions that determined the system parameters during design, especially these four:

  1. What was the provenance and nature of the data used?
  2. How was success/failure defined and scored?
  3. How was the launch test specified and performed?
  4. Which user groups were considered? For example, was the system built and tested for adults only or was it also designed and tested to be safe for children?
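
As a rough illustration of what one entry in such a system of record might hold, here is a minimal sketch with one field per question above. Every name in it (the class, the fields, the example system, and the example values) is hypothetical, and a real audit system would need far more than this, such as access control, versioning, and sign-off.

```python
from dataclasses import dataclass, asdict
import hashlib
import json


@dataclass(frozen=True)
class DesignDecisionRecord:
    """One audit entry per system. All field names are illustrative assumptions."""
    system_name: str
    data_provenance: str      # 1. where the data came from and what it contains
    success_definition: str   # 2. how success/failure was defined and scored
    launch_test: str          # 3. how the launch test was specified and performed
    user_groups: tuple        # 4. which user groups the system was built and tested for

    def fingerprint(self) -> str:
        """SHA-256 hash of the record's contents, so later edits are detectable."""
        payload = json.dumps(asdict(self), sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Hypothetical example entry; none of these values come from a real system.
record = DesignDecisionRecord(
    system_name="support-chatbot",
    data_provenance="Internal support tickets, personal data removed",
    success_definition="Fewer than 1% of sampled replies rated unacceptable",
    launch_test="Blind review of 1,000 randomly sampled replies before launch",
    user_groups=("adults",),
)
print(record.fingerprint())
```

Freezing the record and storing its fingerprint alongside it is one simple way to show that the answers were written down at design time and not adjusted afterwards.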

My hunch is that governments are unwilling to let the last wave of Big Tech repeat itself: a wild west of no regulation where problems are addressed after they’ve already brought unintended consequences. To audit AI decisions in the future, we’ll need new infrastructure and tools to maintain those records. Companies like DataRobot are well positioned to enable just that.

Data Scientists: The best of the precision thinkers

I strongly question any comments suggesting that AI negates the need for humans to develop coding skills.

After all, what is coding if not providing very precise instructions, maintaining control over what you’re requesting, and understanding what you’re observing? The ease of programming, or even the potential rejection of Python in favor of a personalized programming language created with a Large Language Model (LLM), does not diminish the requirement for precise thinking and precise instructions.

But there’s an even stronger reason to let go of the urge to panic about code copilots encroaching on data science territory: if you’re a data scientist who thinks writing code is what you do for a living, you’ve wildly missed the point of your profession. Data scientists must broaden their view of their skills and their role beyond mere execution.

All those red hot LLM benchmarks are a red herring, because as a data scientist, your best skill is the way you think, not how many commands you know in R.

Raw coding isn’t going to be the marker for performance, since where data scientists shine is in their ability to optimize end-to-end technical communication. In other words, top data scientists begin their work long before their fingers touch a keyboard, working with domain experts and business stakeholders to ensure that they’re tackling the right problem. There’s no surer sign of data science incompetence than a headlong rush into using all the most sophisticated math and models to solve all the wrong problems. A truly great data scientist brings superlative ability to translate business needs into data-driven decision-making and AI applications.

Data science is a merciless game for the imprecise communicator and the profession brutally culls those whose thinking isn’t crisp. That’s the nature of the job; data science necessitates communicating with both machines and colleagues with utmost precision. So when it’s time to ask whether massive complex systems perform well at scale, who better to have in your corner than a professional whose thinking is as razor-sharp as the data scientist’s? When it’s time to optimize agentic workflows or design AI safety nets (multi-layered defenses with real-time intervention and moderation), whose training is as well-suited to the task as the data scientist’s? Theirs is the kind of thinking that will cut through ambiguity and help leaders get to the bottom of whether their technical requirements are indeed satisfied by their shiny new solutions. But the role doesn’t stop at technical requirements, since data scientists will be more responsible than ever for protecting their organization’s brand and reputation in an AI-fueled world.
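
For a sense of what a multi-layered safety net can look like in its simplest form, here is a toy sketch: each layer is a check that can veto an output before it leaves the system, and a veto swaps in a safe fallback. The check functions, the blocked terms, the length limit, and the fallback message are all placeholder assumptions; real moderation layers would be far more sophisticated.

```python
from typing import Callable, Optional

# A check returns a reason to block the output, or None to let it through.
Check = Callable[[str], Optional[str]]


def too_long(text: str) -> Optional[str]:
    """Layer 1 (placeholder): reject suspiciously long outputs."""
    return "output too long" if len(text) > 500 else None


def contains_blocked_term(text: str) -> Optional[str]:
    """Layer 2 (placeholder): reject outputs mentioning blocked terms."""
    blocked = {"password", "ssn"}  # hypothetical list
    lowered = text.lower()
    hits = sorted(term for term in blocked if term in lowered)
    return f"blocked term: {hits[0]}" if hits else None


def guarded(output: str, checks: list[Check],
            fallback: str = "Sorry, I can't help with that.") -> tuple[str, Optional[str]]:
    """Run every layer in order; the first veto replaces the output with the fallback."""
    for check in checks:
        reason = check(output)
        if reason is not None:
            return fallback, reason  # the reason can be logged for later monitoring
    return output, None


layers = [too_long, contains_blocked_term]
print(guarded("Your password is hunter2", layers))
print(guarded("Your order ships on Tuesday.", layers))
```

Returning the reason alongside the fallback matters as much as the block itself: those logged reasons are the data someone will later use to judge whether the safety nets are trustworthy.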

This role of the data-literate clear thinker is so crucial that there’s growing interest in making data processing tools accessible to any precise thinker, not just the folks who learned the typical tools. That’s why we see moves like DataRobot intentionally rolling out a suite of tools for non-technical subject matter experts working with data for the first time, while continuing to build tools that help the most skilled data scientists be even more precise.

My crystal ball has nothing to say on the question of whether data science will go through a rebranding and what the title du jour will be, but if there’s one thing I’d be willing to bet on, it’s this: the precision thinking skills which come with classic data science training programs are going to be more in demand with the rise of enterprise AI systems, not less. Someone needs to test whether these solutions work and that someone is likely a data scientist or a data scientist in disguise. (Statisticians count in that category too!) The need for those who think deeply and delve deeply into data is going nowhere, no matter what they call themselves.

Cassie Kozyrkov

Chief Decision Scientist, Google. ❤️ Stats, ML/AI, data, puns, art, theatre, decision science. All views are my own. twitter.com/quaesita