A hiring manager at a mid-size SaaS company told me something I didn’t expect: in about 47 interviews she’d run over three years for analyst roles, fewer than a third of candidates could correctly explain the difference between a LEFT JOIN and an INNER JOIN on a whiteboard. Not solve a complex window function. Just explain the basics.
That gap between what candidates think they’ll be asked and what actually shows up in analyst interviews is real. This post covers the questions that keep appearing across company sizes, the ones that trip people up, and what a strong answer actually sounds like.
The SQL questions you will almost certainly face
SQL is still the baseline. I’ve seen companies say they want “Python-first” analysts and then open the interview with a SQL problem. Don’t skip this.
The most common questions fall into a few buckets. Basic aggregation gets asked at every level. “Write a query that finds the top 5 customers by revenue in the last 90 days” is not an advanced question, but a surprising number of people stumble on the date filtering or forget that LIMIT 5 only returns the top five when the query also sorts by revenue in descending order.
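Here is a minimal sketch of that question, run against an in-memory SQLite database from Python so it is easy to try. The orders table, its columns, the sample rows, and the fixed "as of" date are all assumptions made up for illustration; date functions and LIMIT syntax differ between SQL dialects.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per order.
conn.execute("CREATE TABLE orders (customer_id INTEGER, order_date TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, "2025-03-01", 500.0),
        (1, "2025-03-15", 250.0),
        (2, "2025-02-20", 900.0),
        (3, "2024-11-01", 5000.0),  # big spender, but outside the 90-day window
        (4, "2025-03-10", 100.0),
        (5, "2025-01-15", 300.0),
        (6, "2025-03-20", 50.0),
        (7, "2025-02-01", 700.0),
    ],
)

# A fixed reference date keeps the example reproducible; a real query
# would usually use the current date instead.
AS_OF = "2025-03-31"

query = """
SELECT customer_id, SUM(amount) AS revenue
FROM orders
WHERE order_date > date(:as_of, '-90 days')   -- the date filter people forget
  AND order_date <= :as_of
GROUP BY customer_id
ORDER BY revenue DESC, customer_id            -- without this, LIMIT is arbitrary
LIMIT 5
"""
top_customers = conn.execute(query, {"as_of": AS_OF}).fetchall()
for customer_id, revenue in top_customers:
    print(customer_id, revenue)
```

Customer 3 has the largest lifetime spend in this toy data but drops out because the order is older than 90 days, which is exactly the kind of detail worth saying out loud in the interview.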
Window functions come up constantly for mid-level and senior roles. Expect something like: “Given a table of daily sales, write a query that calculates a 7-day rolling average.” The key is knowing how OVER(), PARTITION BY, and ORDER BY work together, plus the frame clause (ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) that actually defines the seven-day window. If you’ve only memorized the syntax but never used it on a real dataset, it shows.
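A rough sketch of the rolling-average question, again using SQLite from Python. The daily_sales table and its values are invented for the example, and it assumes a SQLite build with window function support (3.25 or newer) and exactly one row per day with no gaps.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: one row per calendar day, no missing days.
conn.execute("CREATE TABLE daily_sales (sale_date TEXT PRIMARY KEY, revenue REAL)")
days = [(f"2025-01-{d:02d}", float(d * 10)) for d in range(1, 11)]
conn.executemany("INSERT INTO daily_sales VALUES (?, ?)", days)

query = """
SELECT sale_date,
       AVG(revenue) OVER (
           -- With several stores or products you would add
           -- PARTITION BY store_id here so windows do not mix groups.
           ORDER BY sale_date
           ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
       ) AS rolling_avg_7d
FROM daily_sales
ORDER BY sale_date
"""
rolling = conn.execute(query).fetchall()
for sale_date, avg_7d in rolling:
    print(sale_date, round(avg_7d, 2))
```

Two things are worth mentioning unprompted: the first six rows average fewer than seven days, and a ROWS frame counts rows, not calendar days, so missing dates would silently stretch the window.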
Self-joins are rarer but memorable when they appear. “Find all customers who placed an order in January 2025 but not in February 2025” is a classic. You can solve it with a self-join (a LEFT JOIN back to the same table with an IS NULL check), a NOT IN subquery, or a NOT EXISTS clause. Knowing two approaches and explaining the trade-offs, such as NOT IN returning no rows at all when the subquery contains a NULL, will distinguish you.
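To make the two-approaches point concrete, here is a small sketch that answers the January-but-not-February question twice, once with NOT EXISTS and once with a LEFT JOIN anti-join, and checks that they agree. The orders table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id INTEGER, order_date TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [
        (1, "2025-01-05"), (1, "2025-02-10"),  # ordered in both months
        (2, "2025-01-20"),                     # January only
        (3, "2025-02-03"),                     # February only
        (4, "2025-01-02"), (4, "2025-01-28"),  # January only, twice
    ],
)

# Approach 1: NOT EXISTS with a correlated subquery.
not_exists_sql = """
SELECT DISTINCT jan.customer_id
FROM orders AS jan
WHERE jan.order_date >= '2025-01-01' AND jan.order_date < '2025-02-01'
  AND NOT EXISTS (
      SELECT 1 FROM orders AS feb
      WHERE feb.customer_id = jan.customer_id
        AND feb.order_date >= '2025-02-01' AND feb.order_date < '2025-03-01'
  )
ORDER BY jan.customer_id
"""

# Approach 2: self-join. The February condition lives in ON, not WHERE,
# so customers without a February match survive with NULLs on the right.
anti_join_sql = """
SELECT DISTINCT jan.customer_id
FROM orders AS jan
LEFT JOIN orders AS feb
  ON feb.customer_id = jan.customer_id
 AND feb.order_date >= '2025-02-01' AND feb.order_date < '2025-03-01'
WHERE jan.order_date >= '2025-01-01' AND jan.order_date < '2025-02-01'
  AND feb.customer_id IS NULL
ORDER BY jan.customer_id
"""

via_not_exists = [row[0] for row in conn.execute(not_exists_sql)]
via_anti_join = [row[0] for row in conn.execute(anti_join_sql)]
print(via_not_exists, via_anti_join)
```

The half-open date ranges (greater than or equal to the first of the month, less than the first of the next) avoid off-by-one mistakes with month lengths and timestamps.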
In the Stack Overflow Developer Survey 2024, more professional developers reported using SQL than Python. Yet most interview prep materials treat SQL as an afterthought.
Statistics questions that people underprepare
Stats is where analysts get separated from spreadsheet jockeys, and also where the questions get surprisingly philosophical.
You should be ready to explain p-values in plain English to a non-technical stakeholder. Not the textbook definition. Something like: “If the null hypothesis were true, a p-value of 0.03 means we’d see results at least this extreme only about 3% of the time by chance.” The slip to avoid is saying there’s a 3% chance the null hypothesis is true, which is a different claim. Most hiring managers aren’t looking for textbook precision here; they’re testing whether you’d confuse a non-technical colleague.
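If the “only by chance” wording feels abstract, a permutation test makes it literal. The sketch below uses two tiny made-up groups and counts how many of the possible relabelings of the data produce a difference at least as large as the one observed. The numbers are invented, and this is one illustrative way to get a p-value, not the only one.

```python
from itertools import combinations

control = [10, 12, 11, 13, 9]      # hypothetical metric values, variant A
treatment = [14, 15, 13, 16, 12]   # hypothetical metric values, variant B


def mean(values):
    return sum(values) / len(values)


observed = mean(treatment) - mean(control)
pooled = control + treatment
group_size = len(treatment)

extreme = 0
total = 0
# Under the null hypothesis the labels are arbitrary, so every way of
# choosing which 5 of the 10 values count as "treatment" is equally likely.
for chosen in combinations(range(len(pooled)), group_size):
    chosen_set = set(chosen)
    group_b = [pooled[i] for i in chosen_set]
    group_a = [pooled[i] for i in range(len(pooled)) if i not in chosen_set]
    diff = mean(group_b) - mean(group_a)
    # Two-sided: count differences at least as large in either direction.
    if abs(diff) >= abs(observed) - 1e-12:
        extreme += 1
    total += 1

p_value = extreme / total
print(f"observed difference: {observed}, p-value: {p_value:.4f}")
```

Here 10 of the 252 possible relabelings are at least as extreme as the real split, so the p-value is about 0.04. That ratio is the whole idea: how often pure label shuffling does as well as what you saw.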
Confidence intervals are a fixture of A/B testing questions. “What does a 95% confidence interval mean?” is a common one. The wrong answer (which many candidates give) is “there’s a 95% chance the true value is in this range.” The right answer treats the 95% as a property of the procedure, not of the parameter: if you reran the experiment many times and built an interval each time, about 95% of those intervals would contain the true value.
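A quick simulation shows what “property of the procedure” means in practice. This sketch assumes normally distributed data with a known standard deviation (so a simple z-interval applies); the true mean, sample size, and trial count are arbitrary choices for illustration.

```python
import math
import random

rng = random.Random(42)  # fixed seed so the run is repeatable

TRUE_MEAN = 10.0   # known here only because we are simulating
SIGMA = 2.0        # assumed known, which is rarely true in real work
N = 50             # observations per simulated experiment
TRIALS = 2000      # number of simulated experiments

# Half-width of a 95% z-interval for the mean with known sigma.
half_width = 1.96 * SIGMA / math.sqrt(N)

covered = 0
for _ in range(TRIALS):
    sample = [rng.gauss(TRUE_MEAN, SIGMA) for _ in range(N)]
    sample_mean = sum(sample) / N
    # Each experiment produces a different interval; the true mean never moves.
    if sample_mean - half_width <= TRUE_MEAN <= sample_mean + half_width:
        covered += 1

coverage = covered / TRIALS
print(f"share of intervals containing the true mean: {coverage:.3f}")
```

The share should land close to 0.95. Any single interval either contains the true mean or it doesn’t; the 95% describes how often the recipe works across repeats.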
Correlation vs. causation is a staple of business case questions. Interviewers want to see you think about confounding variables without being prompted. “Our premium users spend more time on the platform. Should we make everyone a premium user?” is a trap question about selection bias, and the correct response starts with “that correlation could be explained by…”
BI tools and the visualization round
Not every company asks this. Startups often skip it entirely. But at companies running Tableau, Power BI, or Looker, there’s frequently a practical component where you’re given a dataset and asked to build something in 30-45 minutes.
The skill being tested isn’t how fast you click around. It’s whether you ask the right questions before building. “What decision does this dashboard need to support?” and “Who’s the primary audience?” are the two questions that show seniority before you write a single formula.
If you’re preparing for a Tableau-specific role, get comfortable with calculated fields and LOD expressions. FIXED LOD expressions in particular trip people up because Tableau computes them before ordinary dimension filters are applied, so a filter that changes a regular aggregation can leave a FIXED result untouched unless that filter is added to context.
Business case questions are where most candidates leave points on the table
These questions sound softer than SQL, but they’re often the deciding factor. A typical one: “Our signup conversion dropped 12% last week. Walk me through how you’d investigate.” Interviewers are watching for structure, not necessarily the right answer.
A good structure: start by segmenting. Is the drop happening in one channel, one device type, one geography? Then check for external factors. Did we ship something? Did a competitor run a promo? Then dig into funnel steps. Where exactly in the signup flow did users drop off?
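The segmenting step can be sketched in a few lines. The channels and counts below are made up so that the overall conversion rate falls by 12% relative to the prior week, matching the prompt; the point is only to show how a headline drop can come from a single segment.

```python
# (week, channel, visitors, signups): hypothetical numbers for illustration
rows = [
    ("prev", "organic", 4000, 400),
    ("prev", "paid",    3000, 300),
    ("prev", "email",   1000, 100),
    ("last", "organic", 4000, 400),
    ("last", "paid",    3000, 204),
    ("last", "email",   1000, 100),
]


def overall_rate(rows, week):
    """Signups divided by visitors across all channels for one week."""
    visitors = sum(v for w, _, v, _ in rows if w == week)
    signups = sum(s for w, _, _, s in rows if w == week)
    return signups / visitors


def rate_by_channel(rows, week):
    """Conversion rate per channel for one week."""
    return {ch: s / v for w, ch, v, s in rows if w == week}


overall_change = (overall_rate(rows, "last") - overall_rate(rows, "prev")) / overall_rate(rows, "prev")

prev = rate_by_channel(rows, "prev")
last = rate_by_channel(rows, "last")
change_by_channel = {ch: (last[ch] - prev[ch]) / prev[ch] for ch in prev}
worst_channel = min(change_by_channel, key=change_by_channel.get)

print(f"overall: {overall_change:+.1%}")
for channel, change in change_by_channel.items():
    print(f"{channel}: {change:+.1%}")
print("largest drop:", worst_channel)
```

In this toy data, organic and email are flat and paid is down about 32%, which turns “conversion dropped 12%” into a much narrower question about one channel. The same cut works for device type, geography, or funnel step.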
The mistakes I see most often: jumping straight to a hypothesis without first checking for data quality issues (a broken tracking event can look exactly like a real drop), and not mentioning time-of-day or day-of-week patterns as a sanity check. Those two habits, ruling out bad data and ruling out ordinary weekly patterns before theorizing, are the marks of an analyst who’s actually worked in production.
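The day-of-week check is simple enough to show. This sketch uses invented daily signup counts with a weekday/weekend pattern and compares a Saturday against the previous day versus against earlier Saturdays. The helper name same_weekday_baseline and all the numbers are assumptions for illustration.

```python
from datetime import date, timedelta
from statistics import mean

start = date(2025, 3, 3)  # a Monday

# Hypothetical data: three weeks of signups, 100 on weekdays, 60 on weekends.
daily = {}
for offset in range(21):
    day = start + timedelta(days=offset)
    daily[day] = 60 if day.weekday() >= 5 else 100


def same_weekday_baseline(daily, day, weeks=2):
    """Average of the same weekday over the preceding `weeks` weeks."""
    prior = [daily[day - timedelta(weeks=k)] for k in range(1, weeks + 1)]
    return mean(prior)


saturday = date(2025, 3, 22)
friday = saturday - timedelta(days=1)

# Naive view: compare against yesterday and see a frightening drop.
naive_change = (daily[saturday] - daily[friday]) / daily[friday]

# Sanity check: compare against what a Saturday normally looks like.
baseline = same_weekday_baseline(daily, saturday)
adjusted_change = (daily[saturday] - baseline) / baseline

print(f"vs. previous day: {naive_change:+.0%}")
print(f"vs. prior Saturdays: {adjusted_change:+.0%}")
```

Against Friday the Saturday looks 40% down; against earlier Saturdays it is flat. Saying you’d make this comparison before forming a hypothesis is a cheap way to show production experience.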
What hiring managers are really evaluating
I think the “data analyst interview process” at most companies is actually testing one thing more than SQL or stats: whether you’ll ask good questions when the brief is vague.
Real analyst work is mostly ambiguous. The business question is underspecified, the data is messy, the stakeholder doesn’t know what they want until they see the wrong answer. Technical skills get you in the room. The clarifying question habit gets you the offer.
If you want to practice that part specifically, tools like Craqly let you run mock interviews with AI feedback on how clearly you’re communicating your reasoning process, not just whether your SQL is correct. That gap, between getting the right answer quietly and explaining your logic as you work, is what most solo prep misses.
The BLS occupational outlook for data and information analysts projects 23% job growth through 2032, which is faster than almost any other white-collar field. There will be more openings. The interview standards will probably get stricter, not easier, as the pool grows.
One last thing about preparation
Most candidates overprepare SQL syntax and underprepare the communication piece. You can look up DENSE_RANK() syntax. You can’t look up how to explain your logic clearly under pressure without practicing it.
Before your real interview, do at least five mock business case walkthroughs where someone can interrupt you and ask “why did you assume that?” If you can answer that question well mid-stream, you’ll be fine.