Data Science Interview Questions That Trip People Up

A recruiter at a mid-size fintech told me something that stuck: “We lose good candidates in the stats round every single time. Not because they can’t do the math. Because they can’t explain it to a PM.” That framing changed how I think about data science interview prep entirely.

The questions below aren’t a Q&A cheat sheet. They’re the ones I’ve seen consistently show up across roles at companies like Google, Stripe, and smaller Series B shops, along with notes on what interviewers are actually listening for when you answer.

The SQL questions that filter out more candidates than algorithms

Most data science job postings list Python and ML frameworks up front. But the screening round is almost always SQL. Get comfortable with window functions before anything else.
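If window functions are new to you, here is a minimal sketch of one, run through Python's built-in sqlite3 module so you can try it without a database server. The table daily_signups and its columns are names I made up for illustration, and it assumes your Python ships SQLite 3.25 or newer (the first version with window functions).

```python
import sqlite3

# Hypothetical table and data, invented for this example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE daily_signups (signup_date TEXT PRIMARY KEY, signups INTEGER)"
)
rows = [(f"2024-01-{day:02d}", day * 10) for day in range(1, 11)]
conn.executemany("INSERT INTO daily_signups VALUES (?, ?)", rows)

# The frame covers the current row plus the 6 rows before it.
# That equals "7 days" only if there is exactly one row per calendar day;
# with gaps in the dates you would need to fill the missing days first.
query = """
SELECT signup_date,
       signups,
       AVG(signups) OVER (
           ORDER BY signup_date
           ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
       ) AS avg_7d
FROM daily_signups
ORDER BY signup_date
"""
running_avg = conn.execute(query).fetchall()

for signup_date, signups, avg_7d in running_avg:
    print(signup_date, signups, round(avg_7d, 2))
```

The first six rows average over fewer than seven days because the frame is still filling up. Whether to show or hide those partial windows is a good thing to ask the interviewer about.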

These come up in some form at almost every interview:

Find the second-highest salary in a table without using LIMIT or TOP.
Write a query that identifies users who were active in month N but not in month N+1 (churn identification).
Given a table of events with timestamps, calculate the running 7-day average of daily signups.
Deduplicate a table where the same user_id appears multiple times with slightly different email capitalizations.

On that last one, interviewers aren’t just checking syntax. They want to see you ask a clarifying question: “Should I keep the most recent row, or the one with the lowercase email?” Asking that question is worth more than getting the query right.
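Once you have an answer to that clarifying question, the query itself is short. This is a sketch under the assumption that the answer was “keep the most recent row, and store the email in lowercase.” The updated_at column is hypothetical; the original question doesn't say what timestamp, if any, the table has.

```python
import sqlite3

# Hypothetical schema: updated_at is an assumed column used to pick a winner.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user_id INTEGER, email TEXT, updated_at TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [
        (1, "Ana@Example.com", "2024-01-01"),
        (1, "ana@example.com", "2024-03-01"),
        (2, "bo@example.com", "2024-02-01"),
        (2, "BO@example.com", "2024-01-15"),
        (3, "cy@example.com", "2024-02-10"),
    ],
)

# Number the rows within each user_id, newest first, then keep row 1.
query = """
WITH ranked AS (
    SELECT user_id,
           LOWER(email) AS email,
           updated_at,
           ROW_NUMBER() OVER (
               PARTITION BY user_id
               ORDER BY updated_at DESC
           ) AS rn
    FROM users
)
SELECT user_id, email, updated_at
FROM ranked
WHERE rn = 1
ORDER BY user_id
"""
deduped = conn.execute(query).fetchall()
print(deduped)
```

If the interviewer had said “keep the lowercase one” instead, only the ORDER BY inside the window would change, which is a nice thing to point out loud.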

One pattern I’d flag: a lot of people prep JOIN types exhaustively but forget about NULL handling in aggregations. AVG() skips NULLs rather than treating them as zero, so if a column has NULLs the result can look correct while silently excluding a chunk of your data (say, 23% of the rows). That kind of thing trips people up more than any advanced window function.
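To see the NULL behavior for yourself, here is a small sketch using an invented orders table. Comparing COUNT(*) with COUNT(column) is a quick way to find out how many rows an aggregate is quietly ignoring.

```python
import sqlite3

# Hypothetical table: two of the four orders have a NULL amount.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 100.0), (2, 200.0), (3, None), (4, None)],
)

total_rows, non_null_rows, avg_skipping_nulls, avg_nulls_as_zero = conn.execute(
    """
    SELECT COUNT(*),                 -- counts every row
           COUNT(amount),            -- counts only non-NULL amounts
           AVG(amount),              -- averages only non-NULL amounts
           AVG(COALESCE(amount, 0))  -- treats NULL as 0 before averaging
    FROM orders
    """
).fetchone()

print(total_rows, non_null_rows, avg_skipping_nulls, avg_nulls_as_zero)
```

Neither average is automatically the right one. Whether a NULL means “zero” or “unknown” is a question about the data, and saying so is part of the answer.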

Statistics questions, and the ones where “I’d need to test that” is the right answer

Here’s my honest opinion on this section: most data science interviewers ask statistics questions they themselves would answer inconsistently. The goal isn’t a textbook-perfect answer. It’s showing you know when you’re in uncertain territory.

Questions that come up regularly:

What’s the difference between Type I and Type II error, and which is worse for a medical trial versus a spam filter?
You run an A/B test and p = 0.049. Your boss wants to ship. What do you do?
Explain the Central Limit Theorem to a product manager who studied marketing.
Your model shows 94% accuracy on the test set. Why might that number be completely misleading?
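For the Central Limit Theorem question, it can help to have a tiny simulation in your head before you try to explain it in plain language. The sketch below is my own illustration, not a standard interview answer: it draws from a skewed exponential distribution (mean 1, standard deviation 1) and checks that averages of 50 draws cluster around 1 with a spread close to 1 / sqrt(50).

```python
import random
import statistics


def sample_means(sample_size, n_samples, seed=0):
    """Return the mean of each of n_samples samples from an exponential(1)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(sample_size))
        for _ in range(n_samples)
    ]


means = sample_means(sample_size=50, n_samples=2000)

mean_of_means = statistics.fmean(means)    # should sit near the true mean, 1.0
spread_of_means = statistics.stdev(means)  # should sit near sigma / sqrt(n)
predicted_spread = 1.0 / 50 ** 0.5

print(round(mean_of_means, 3), round(spread_of_means, 3), round(predicted_spread, 3))
```

The PM-friendly version: individual users are wildly different, but averages over enough users behave predictably, and that predictability is what lets us put error bars on a metric.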

On the A/B test question, the answer is not “ship it.” The follow-up questions they’re listening for are: What’s the sample size? How long did the test run? Did you fix the hypothesis before the test started, or look at the data first and then decide what to test? Are there multiple comparisons we haven’t accounted for? Interviewers at data-mature companies want to hear you push back on the framing, not just answer it.
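The multiple-comparisons point is easy to make concrete. Below is a rough sketch, assuming a pooled two-proportion z-test and a Bonferroni correction; both are my choices for illustration, and a real experimentation platform may use something different. The function names and the figure of five metrics are hypothetical.

```python
import math


def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for a pooled two-proportion z-test.

    Assumes large samples and a pooled rate strictly between 0 and 1
    (otherwise the standard error is zero and this raises ZeroDivisionError).
    """
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (conv_b / n_b - conv_a / n_a) / std_err
    # P(|Z| > |z|) for a standard normal Z.
    return math.erfc(abs(z) / math.sqrt(2))


def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_comparisons


# The p-value from the interview question, judged two ways.
p_value = 0.049
passes_single_test = p_value < 0.05
# Hypothetical: the team actually looked at 5 metrics, not 1.
passes_after_correction = p_value < bonferroni_alpha(0.05, 5)

print(passes_single_test, passes_after_correction)
```

Bonferroni is conservative, so I wouldn't present it as the only option. The point to land is that a result barely under 0.05 stops looking like a clear win once you admit how many things were checked.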

The accuracy question is a classic because 94% sounds great until you realize the dataset is 94% one class and your model just predicts that class every time. That’s a particularly common failure mode in fraud detection and medical diagnosis work. Mentioning precision/recall tradeoffs here lands well.
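Here is that failure mode in a few lines of plain Python. The labels are made up to match the 94% figure: 94 negatives, 6 positives, and a “model” that always predicts the majority class.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


# Invented data: 6 fraud cases out of 100, and a model that never flags fraud.
y_true = [1] * 6 + [0] * 94
y_pred = [0] * 100

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
# With no positive predictions, precision is undefined; report 0.0 by convention.
precision = tp / (tp + fp) if (tp + fp) else 0.0
recall = tp / (tp + fn) if (tp + fn) else 0.0

print(accuracy, precision, recall)
```

Accuracy comes out at 0.94 while recall is 0.0, meaning the model catches none of the cases you actually care about.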

Machine learning: what they’re really testing

According to the Stack Overflow Developer Survey 2024, machine learning frameworks like PyTorch and TensorFlow consistently rank in the top tools data professionals use. But interviews at most non-FAANG companies don’t go deep on framework internals. They go deep on intuition.

Expect questions like:

Your gradient boosting model is overfitting. Walk me through your diagnostic process.
Compare L1 and L2 regularization. When would you prefer one over the other?
You’re building a recommendation system for a new product with no user history. What do you do in the first 30 days?
Your feature importance scores put “user_id” at the top. What went wrong?

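That last one is a data leakage question dressed up differently. A user ID is an arbitrary identifier, so it shouldn’t predict anything meaningful. If it does, the model is probably memorizing individual users, which usually means the train/test split was done incorrectly and the same users ended up on both sides. Catching that without being told is the kind of thing that moves you from “technically solid” to “would trust with production data.”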
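One common way a split goes wrong is splitting row by row when each user has many rows, so the same user lands in both train and test. That scenario is my assumption; the question doesn't say how the split was done. A minimal sketch of a user-level split, with a hypothetical split_by_user helper and made-up rows:

```python
import random


def split_by_user(rows, test_fraction=0.2, seed=0):
    """Split rows so that every user_id lands entirely in train or in test."""
    user_ids = sorted({row["user_id"] for row in rows})
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    rng.shuffle(user_ids)
    n_test = max(1, round(len(user_ids) * test_fraction))
    test_users = set(user_ids[:n_test])
    train = [row for row in rows if row["user_id"] not in test_users]
    test = [row for row in rows if row["user_id"] in test_users]
    return train, test


# Invented data: 10 users with 5 events each.
rows = [{"user_id": uid, "event": i} for uid in range(10) for i in range(5)]
train, test = split_by_user(rows)

train_users = {row["user_id"] for row in train}
test_users = {row["user_id"] for row in test}
print(len(train), len(test), train_users & test_users)
```

With a split like this, a model that memorizes user IDs gets no credit on the test set, so the leak shows up as a drop in test performance instead of hiding in the feature importance chart. Dropping user_id from the features is the other half of the fix.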

I genuinely don’t know how much model architecture depth matters for non-research roles. My read is that most product DS interviews care more about feature engineering intuition than transformer internals. But I’m sure that’s wrong for some companies.

The business case round most people under-prepare for

A lot of data science candidates nail the technical sections and then stumble on questions like: “How would you measure whether our new onboarding flow is working?” or “We’re seeing a 12% drop in Day 7 retention. Where do you start?”

These aren’t trick questions. They’re checking whether you can connect data work to decisions a product team can act on. A few things that help:

Practice structuring your answer before jumping into methods. Something like: “First I’d clarify what ‘working’ means and who the stakeholder is. Then I’d look at what data we already have. Then I’d figure out what’s measurable in the next sprint versus what needs instrumentation we don’t have yet.” That framework, stated out loud, does more for your candidacy than jumping straight to “I’d run a Cox proportional hazards model.”

The BLS Occupational Outlook for Data Scientists projects 36% employment growth from 2023 to 2033, well above the average for all occupations. More people are competing for these roles every year. The candidates who get offers tend to be the ones who can talk about data in a way that product managers find useful, not just technically impressive.

What Craqly is actually useful for here

If you’re running through practice questions alone, the hardest thing to replicate is the feedback loop. Did your explanation of gradient descent make sense? Was your business case answer too vague? Craqly’s AI interview mode lets you run through data science questions out loud and get feedback on whether your explanations are actually landing, not just whether the answer is technically correct. For the communication-heavy rounds, that gap matters.

A few things worth knowing before the day

Bring a list of 3 questions that show you’ve read their recent engineering or data blog posts. Companies that publish data team work notice when you reference it. Companies that don’t publish much will still appreciate that you looked.

Ask about the ratio of time data scientists spend on data cleaning versus modeling versus stakeholder communication. The answer tells you more about the role than the job description does.

And if they ask “what’s your biggest weakness,” the worst answers are the ones that are actually strengths in disguise (“I work too hard”). The second worst is “I don’t really have one.” Something like “I’m slower than I’d like at building first drafts of dashboards, so I’ve been using templates more intentionally” is real and recoverable. That’s what they’re looking for.

What question type consistently trips you up most? That’s probably where to spend the next two hours.
