How Data Engineers Should Prepare For An AI World


There’s been a lot of chatter lately about how the AI revolution will diminish the role of data engineers. I don’t believe that’s the case — in fact, data expertise will be more critical than ever. However, data professionals will need to acquire new skills to help their organizations get the most from AI and enhance their career prospects for the future.

AI unlocks the opportunity for organizations to extract more value from their data, and to do so more efficiently, but this can’t happen by itself. Data engineers will need to learn how and where to apply the technology, along with which models and tools to use in which situations. 

Here are four areas where AI will transform data analytics in the coming year, and the skills data engineers must acquire to meet these needs.

Building smarter data pipelines

Data pipelines combine sources of data that can be raw, unstructured and disorganized, and the task of engineers is to extract intelligence from those sources to deliver valuable insights. AI is about to transform that work.

Inserting AI into data pipelines can greatly accelerate a data engineer’s ability to extract value and insights. For example, imagine a company has a database of customer service transcripts or other text documents. With a few lines of SQL, an engineer can plug an AI model into a pipeline and instruct it to surface the rich insights from those text files. Doing so manually can take many hours, and some of the most valuable insights may only be discoverable by AI.
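To make that concrete, here is a minimal sketch in Python rather than SQL of what plugging a model into a pipeline step might look like. Every name in it (Transcript, keyword_sentiment, enrich) is hypothetical, and the keyword scorer is only a stand-in for a real model call, which would come from whatever platform or vendor an organization uses.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator


@dataclass
class Transcript:
    ticket_id: int
    text: str


def keyword_sentiment(text: str) -> str:
    """Stand-in for a real model call (hypothetical, for illustration only)."""
    lowered = text.lower()
    negative = ("refund", "broken", "cancel", "frustrated")
    positive = ("thanks", "great", "resolved", "love")
    neg_hits = sum(word in lowered for word in negative)
    pos_hits = sum(word in lowered for word in positive)
    if neg_hits > pos_hits:
        return "negative"
    if pos_hits > neg_hits:
        return "positive"
    return "neutral"


def enrich(rows: Iterable[Transcript], model: Callable[[str], str]) -> Iterator[dict]:
    """Pipeline step: attach a model-derived label to each transcript."""
    for row in rows:
        yield {"ticket_id": row.ticket_id, "sentiment": model(row.text)}


transcripts = [
    Transcript(1, "The device arrived broken and I want a refund."),
    Transcript(2, "Thanks, the issue was resolved quickly. Great support!"),
    Transcript(3, "I have a question about my invoice."),
]

# Passing the model in as an argument keeps the step independent of any vendor.
results = list(enrich(transcripts, keyword_sentiment))
for row in results:
    print(row)
```

Because the model is passed in as a plain function, an engineer could swap the stand-in for a hosted model without touching the rest of the step.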

Data engineers who understand where and how to apply AI models to extract maximum value from data pipelines will be highly valuable to their organizations, but this requires new skills in terms of which models to choose and how to apply them.

Less data mapping, more data strategy

Different data sources often store information in different ways: One source system might refer to a state name as “Massachusetts,” for example, while another uses the abbreviation “MA.”

Mapping data to ensure it’s consistent and duplicate-free is a tailor-made job for AI. Engineers can construct a prompt that essentially says, “Take these 20 sources of customer data and build me a canonical customer database,” and the AI will complete the task in far less time than manual mapping would take.
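For a sense of what that mapping involves, here is a small hand-written sketch of the rules an engineer would otherwise maintain, and which could also serve as a spot check on a model's output. The field names, the three-state lookup table and the choice of email as the matching key are all assumptions made for illustration.

```python
# Hypothetical lookup table; a real one would cover every state and territory.
STATE_ABBREVIATIONS = {"massachusetts": "MA", "new york": "NY", "california": "CA"}


def canonical_state(value: str) -> str:
    """Map a full state name or an abbreviation to a two-letter code."""
    cleaned = value.strip().lower()
    if cleaned in STATE_ABBREVIATIONS:
        return STATE_ABBREVIATIONS[cleaned]
    # Assume anything else is already an abbreviation.
    return value.strip().upper()


def build_canonical(sources):
    """Merge records from several sources, keyed on a normalized email."""
    customers = {}
    for source in sources:
        for record in source:
            email = record["email"].strip().lower()
            merged = customers.setdefault(email, {"email": email})
            # Keep the first name seen; always store the normalized state.
            merged["name"] = merged.get("name") or record["name"].strip()
            merged["state"] = canonical_state(record["state"])
    return list(customers.values())


crm = [{"email": "Ana@Example.com", "name": "Ana Diaz", "state": "Massachusetts"}]
billing = [
    {"email": "ana@example.com ", "name": "Ana Diaz", "state": "MA"},
    {"email": "bo@example.com", "name": "Bo Chen", "state": "new york"},
]

canonical = build_canonical([crm, billing])
for customer in canonical:
    print(customer)
```

Even this toy version shows why the work is tedious by hand: every new source brings its own spellings, casing and stray whitespace, and each one needs a rule.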

That will require knowledge about how to write good prompts, but more importantly it frees up engineers’ time so they can spend fewer hours on data mapping and more on their organizations’ data strategy and data architecture.

Ultimately, the goal is to understand all the data sources available to an organization and how they can be best leveraged to meet the business goals. Handing tasks like data mapping off to an AI model will free up time for that higher-level work.

BI analysts must up their game

Business intelligence (BI) analysts spend a lot of their time today creating static reports for business leaders. When those leaders have follow-up questions about the data, the analysts must run a new query and generate a supplemental report. Generative AI will dramatically change those executives’ expectations.

As executives gain more experience with AI-driven chatbots, they will expect to interact with their business reports in a similar, conversational way. That will require BI analysts to up their game and learn how to provide those interactive capabilities. Instead of cranking out static charts, they’ll need to understand the pipelines, plug-ins and prompts required to build dynamic, interactive reports.

Cloud data platforms incorporate some of these capabilities in a low-code way, giving BI analysts a chance to extend their skills to address the new requirements. But there is a learning curve, and acquiring those skills will be their challenge in 2024.

Managing third-party AI services

When the cloud took off a decade ago, IT teams spent less time building infrastructure and software and more time managing third-party cloud services. Data scientists are about to go through a similar transition.

The growth of gen AI will require data scientists to work more with outside vendors that provide AI models, datasets and other services. Being familiar with the options, choosing the right model for the task at hand and managing those third-party relationships will be an important skill to acquire.

Looking forward to a lot more fun

Many data teams today say they are stuck in reactive mode, constantly responding to the latest job requests or fixing applications that broke. That’s no fun for anyone, but the influx of AI into data engineering will change that.

AI will allow engineers to automate the most laborious parts of their work and free up time to think about the bigger picture. This will require new skills, but it will allow them to focus on more strategic, proactive work, making data engineers even more valuable to their teams — and their work a lot more enjoyable.

Jeff Hollan is director of product management at Snowflake.

