In early 2024, a hiring manager at a Series B fintech described their hiring situation to me as “we have seven data science job reqs open and we can’t fill any of them, and we have one data engineering req that’s been open for three weeks and we already have twelve strong candidates.” I don’t have data on whether that anecdote generalizes. But I’ve heard variations of it enough times that I think it points at something real about where these two fields are right now.
This post is my attempt to reason through the data engineering vs. data science question as it actually stands in 2026, not as it looked in 2019.
What the job market actually shows
Data engineering roles have significantly outnumbered data science roles on most major job boards for several years now. The ratio varies by industry and company size, but the directional trend has been consistent since around 2021. Part of this is infrastructure: companies that built out data science teams in the 2017-2020 wave discovered they needed more data engineers to support those scientists before they could hire more scientists.
The Bureau of Labor Statistics (BLS) projects data scientist roles to grow 36% over the decade ending in 2033, which sounds enormous. But data engineering growth is harder to track because the BLS classification doesn’t cleanly separate data engineers from database administrators and software developers. Less formal evidence from LinkedIn job postings and survey data suggests data engineering demand has grown faster than formal projections capture.
The Stack Overflow 2024 Developer Survey found that data engineers earn a median salary notably higher than that of data analysts, and are increasingly indistinguishable from senior software engineers in compensation at larger companies. Whether that advantage will last is uncertain, since the GenAI wave is changing both fields, but as of mid-2026 the market for strong data engineers remains tighter than the market for data scientists with generalist skills.
The actual day-to-day work, which is less glamorous for both roles than advertised
Data scientists in job descriptions: building machine learning models, running experiments, deriving insights that drive business strategy.
Data scientists in practice: roughly 60-70% of the time goes to data cleaning, feature engineering, and trying to understand why the training data doesn’t match the production data. A significant fraction of DS roles at companies without mature data infrastructure end up being de facto data analyst roles or, occasionally, makeshift data engineering roles when no one else will fix the pipeline.
Data engineers in job descriptions: building and maintaining pipelines, warehousing, orchestration.
Data engineers in practice: debugging Airflow DAGs at 11pm, writing dbt models, arguing about schema design, handling the stakeholder who wants a new dashboard by Thursday, and occasionally doing the data science work nobody else got to. Also more on-call than most people expect going in.
This is, of course, not universal. At large tech companies with mature data organizations, both roles are well-scoped. At most companies, the lines blur.
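To make the "training data doesn’t match production data" problem a little more concrete, here is a minimal sketch of one common way to quantify it for a single numeric feature: the Population Stability Index (PSI). This is an illustration, not a claim about how any particular team does it. The function name `psi`, the equal-width binning, and the `1e-6` floor for empty bins are all assumptions chosen for the example.

```python
import math
from typing import Sequence


def psi(train: Sequence[float], prod: Sequence[float], bins: int = 10) -> float:
    """Population Stability Index between two samples of one numeric feature.

    Bins are equal-width and derived from the training sample (an assumption;
    quantile bins are another common choice).
    """
    lo, hi = min(train), max(train)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if the feature is constant

    def shares(values: Sequence[float]) -> list[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Clamp production values outside the training range into the edge bins.
            idx = min(max(idx, 0), bins - 1)
            counts[idx] += 1
        # A small floor avoids log(0) when a bin is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    expected, actual = shares(train), shares(prod)
    return sum((a - e) * math.log(a / e) for e, a in zip(expected, actual))


# Hypothetical data: production values are shifted relative to training.
train_sample = [i / 100 for i in range(1000)]
prod_sample = [v + 3 for v in train_sample]
print(f"PSI, same data:    {psi(train_sample, train_sample):.3f}")
print(f"PSI, shifted data: {psi(train_sample, prod_sample):.3f}")
```

A frequently quoted rule of thumb treats a PSI above about 0.25 as a meaningful shift, but that threshold is a convention rather than a guarantee, and the right cutoff depends on the feature.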
Skills and what it takes to switch between them
Both roles require SQL and Python. That’s most of the overlap. Beyond that, the divergence is real.
Data engineering leans on software engineering fundamentals: distributed systems concepts, pipeline orchestration (Airflow, Prefect, Dagster), cloud warehouse administration (Snowflake, BigQuery, Redshift), and increasingly dbt for transformation work. Strong Python is required. Spark knowledge is expected for roles at companies with significant data volume. System design for data infrastructure shows up heavily in senior interviews.
Data science leans on statistics and machine learning: regression, classification, hypothesis testing, experiment design, model evaluation. Python is required and the specific libraries matter (pandas, scikit-learn, PyTorch or TensorFlow for deep learning work). Communication of results to non-technical stakeholders is a bigger part of the job than most job descriptions acknowledge.
Switching from data science to data engineering is more common than the reverse, and generally more straightforward. Most data scientists already have the Python skills; they need to develop the systems thinking, the infrastructure knowledge, and a tolerance for on-call work. Switching from data engineering to data science requires building genuine statistical depth, which takes longer.
GenAI is changing both roles, but not evenly
If you’d asked me this in 2022, I’d have said data science was clearly the higher-status field with the better long-term trajectory. I’d probably say something different now.
The rise of large language models and foundation models has done two things. First, it’s raised the floor of what non-specialists can do with data (many basic analysis tasks are now one ChatGPT prompt away). Second, it’s dramatically raised demand for data infrastructure that can support AI-powered products, which means data engineers.
Data science roles are bifurcating. On one end: ML engineering roles that require production system skills (model serving, monitoring, inference optimization) and pay like senior software engineers. On the other end: analyst-adjacent DS roles where the “model” is increasingly a fine-tuned foundation model, not a custom-built one. The middle is under pressure.
Data engineering is being changed by AI too, but differently. LLMs can help write boilerplate SQL and pipeline code faster. But the judgment work — what schema design makes sense for this access pattern, how should this pipeline handle failure, where is the quality assertion missing — isn’t going away. If anything, as data infrastructure becomes more critical to AI product development, the judgment layer in data engineering gets more valuable, not less.
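As a rough illustration of what that judgment layer looks like in code, here is a minimal sketch of two of the decisions named above: how a pipeline step handles failure, and where a quality assertion sits. Everything in it is hypothetical (`DataQualityError`, `assert_quality`, `run_with_retry`, and the row shape are names invented for this example). The design choice it shows is that transient connection errors are retried with backoff, while bad data fails immediately, because retrying will not fix it.

```python
import time
from typing import Callable, Sequence


class DataQualityError(Exception):
    """Raised when a batch fails a quality assertion; retrying will not help."""


def assert_quality(rows: list[dict], required: Sequence[str], min_rows: int = 1) -> list[dict]:
    """Fail loudly if the batch is too small or a required column is missing."""
    if len(rows) < min_rows:
        raise DataQualityError(f"expected at least {min_rows} rows, got {len(rows)}")
    for i, row in enumerate(rows):
        for col in required:
            if row.get(col) is None:
                raise DataQualityError(f"row {i}: missing value for {col!r}")
    return rows


def run_with_retry(
    extract: Callable[[], list[dict]],
    attempts: int = 3,
    base_delay: float = 1.0,
) -> list[dict]:
    """Retry only transient failures (ConnectionError), with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return extract()
        except ConnectionError:
            if attempt == attempts:
                raise  # out of retries: surface the original error
            time.sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError("unreachable")  # loop always returns or raises
```

A step would then read something like `assert_quality(run_with_retry(fetch_orders), ["id", "amount"])`, where `fetch_orders` stands in for whatever extract function the pipeline actually has. An LLM can write either function on request; deciding which errors are retryable and which columns must never be null is the part that needs someone who knows the data.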
How to actually choose between them
The career advice I’d give is not “pick whichever pays more.” The fields are close enough in compensation that you’ll optimize for the wrong thing if you go by salary alone. The more useful question is which kind of problem you find more interesting to be stuck on for a week.
If you’re more interested in being stuck on “why is this model underperforming on this segment of users?” or “what does this customer behavior pattern actually mean?”, data science is probably the right fit.
If you’re more interested in being stuck on “why is this pipeline producing different results when it runs at 3am vs. noon?” or “how do I design this schema to handle three years of history without breaking query performance?”, data engineering is probably the right fit.
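For a taste of the 3am-vs-noon kind of problem, here is a minimal sketch of one frequent cause and its usual fix. The cause assumed here is my own example, not a diagnosis of any specific pipeline: the job computes its time window from the wall clock, so its output depends on when it happens to run. Passing in an explicit logical date makes the result the same at any hour. The function names and the event shape are hypothetical.

```python
from datetime import date, datetime, timedelta, timezone


def window_for(logical_date: date) -> tuple[datetime, datetime]:
    """Half-open UTC window [start, end) covering one logical day."""
    start = datetime(
        logical_date.year, logical_date.month, logical_date.day, tzinfo=timezone.utc
    )
    return start, start + timedelta(days=1)


def daily_total(events: list[tuple[datetime, float]], logical_date: date) -> float:
    """Sum amounts for one logical day, independent of when the job runs.

    The fragile version of this function would call datetime.now() here and
    look back 24 hours, which gives different answers at 3am and at noon.
    """
    start, end = window_for(logical_date)
    return sum(amount for ts, amount in events if start <= ts < end)


# Hypothetical events with timezone-aware timestamps.
events = [
    (datetime(2026, 3, 1, 23, 59, tzinfo=timezone.utc), 5.0),
    (datetime(2026, 3, 2, 0, 0, tzinfo=timezone.utc), 7.0),
    (datetime(2026, 3, 2, 12, 0, tzinfo=timezone.utc), 1.0),
]
print(daily_total(events, date(2026, 3, 2)))
```

If tracking down that kind of bug sounds satisfying rather than tedious, that is a decent signal about which side of the fence you belong on.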
Many people are interested in both, which is real and valid, but the market generally rewards specialists over generalists at the mid-to-senior level. Picking one to go deep on first, and treating the other as adjacent knowledge, is a more workable approach than trying to be equally strong in both from the start.