Backend Engineering Interview Mastery: Scalability & Architecture Knowledge 2026

The list below is built from publicly documented interview patterns at companies like Stripe, Shopify, and various mid-size SaaS shops. These aren’t hypothetical questions. They come up often enough that you’d be surprised not to see them in a backend engineering loop.

According to the Stack Overflow Developer Survey 2024, roughly 48% of professional developers work primarily on backend systems. The interview process for those roles has converged around a recognizable set of topics: API design, database theory, caching, security, and system architecture. What follows covers all of them.

REST APIs and web services (questions 1-10)

  1. What’s the difference between REST and GraphQL? When would you choose one over the other?
  2. Explain HTTP status codes 200, 201, 400, 401, 403, 404, 409, and 500. When is 409 appropriate vs. 400?
  3. What is idempotency? Which HTTP methods are idempotent and which aren’t?
  4. How would you design a rate limiter for a public API? What data structures and storage would you use?
  5. What’s the difference between synchronous and asynchronous API design? Give an example of when you’d use a webhook instead of polling.
  6. How do you handle versioning for a REST API that already has thousands of active clients?
  7. What is HATEOAS? Have you ever worked with an API that implemented it? Would you implement it yourself?
  8. How does cursor-based pagination differ from offset-based pagination? What are the failure modes of offset pagination on live data?
  9. What headers does a well-designed API response include beyond Content-Type? Which request headers, beyond Authorization, should the API honor?
  10. How would you implement request tracing across multiple microservices? What gets logged and where?
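
For the rate limiter question, a token bucket is one common starting point. The following is a minimal in-process sketch, not a definitive design: the class name `TokenBucket`, its parameters, and the injectable `clock` are all illustrative assumptions. A limiter for a public API would usually keep this state in a shared store rather than in one process.

```python
import time


class TokenBucket:
    """Illustrative token bucket: holds up to `capacity` tokens, refilled at `rate` per second."""

    def __init__(self, capacity, rate, clock=time.monotonic):
        self.capacity = capacity
        self.rate = rate
        self.clock = clock  # injectable so the behaviour can be tested without sleeping
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self, cost=1.0):
        """Return True and spend `cost` tokens if the request fits, else False."""
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

The follow-up an interviewer is likely to push on is where the bucket lives: one bucket per API key in a shared store keeps limits consistent across servers, at the cost of a network round trip per request.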

Database design and optimization (questions 11-20)

  11. What is database normalization? When is it appropriate to denormalize?
  12. Explain the difference between a clustered index and a non-clustered index.
  13. What is the N+1 query problem? How do you detect and fix it in a codebase you’ve just joined?
  14. What does EXPLAIN or EXPLAIN ANALYZE output tell you? What do you look for in the output?
  15. What’s the difference between optimistic and pessimistic locking? Give a concrete use case for each.
  16. How would you handle a database migration on a table with 50 million rows without downtime?
  17. What is a database transaction? What’s the difference between READ COMMITTED and REPEATABLE READ isolation levels?
  18. When would you choose DynamoDB over PostgreSQL? What are you giving up?
  19. How do database connection pools work? What happens when all connections are exhausted?
  20. What is sharding? What problems does it introduce that weren’t present with a single-node database?
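
For the optimistic locking question, it helps to be able to write the version-column pattern from memory. Here is a small sketch using Python's built-in `sqlite3` module. The `accounts` table, its columns, and the helper names `make_demo_db` and `update_balance` are made up for illustration.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with one hypothetical account row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER, version INTEGER)"
    )
    conn.execute("INSERT INTO accounts VALUES (1, 100, 0)")
    conn.commit()
    return conn


def update_balance(conn, account_id, new_balance, expected_version):
    """Optimistic write: succeeds only if the row still has the version we read."""
    cur = conn.execute(
        "UPDATE accounts SET balance = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_balance, account_id, expected_version),
    )
    conn.commit()
    # Zero rows updated means another writer got there first; the caller should re-read and retry.
    return cur.rowcount == 1
```

No lock is held between the read and the write, which is why this suits low-contention data. Under heavy contention the retries pile up, and a pessimistic lock such as `SELECT ... FOR UPDATE` (in databases that support it) may be the better trade.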

Caching and performance (questions 21-27)

  21. What’s the difference between cache-aside, write-through, and write-behind caching? When would you use each?
  22. How do you handle cache invalidation when underlying data changes? What strategies exist beyond TTL?
  23. What is a cache stampede? How would you prevent one in a high-traffic application?
  24. What are the trade-offs between in-memory caching (local to the process) and a shared cache like Redis?
  25. How would you cache an API response that’s personalized per user? What are the cache key design considerations?
  26. How does CDN caching interact with your API? What HTTP headers control it?
  27. What is a bloom filter and when is it useful in a caching or database context?
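
The bloom filter question is easier to answer if you have built one. Below is a minimal sketch under stated assumptions: the class name `BloomFilter`, the default sizes, and the choice to derive positions from salted SHA-256 digests are illustrative, not a recommendation for production sizing.

```python
import hashlib


class BloomFilter:
    """Illustrative Bloom filter: no false negatives, some false positives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item):
        # Derive one bit position per hash by salting a single hash function with the index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        # False means "definitely absent"; True only means "possibly present".
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))
```

In a caching context the typical use is as a cheap pre-check: if `might_contain` returns False for a key, the lookup against the cache or database can be skipped entirely.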

Security and authentication (questions 28-33)

  28. What’s the difference between symmetric and asymmetric cryptography? JWTs are signed rather than encrypted by default: when would you sign with HS256 (symmetric) vs. RS256 (asymmetric)?
  29. How do JWTs work? Where are they stored on the client, and what are the security implications of each storage option?
  30. What is OAuth 2.0? How does it differ from OpenID Connect?
  31. How would you prevent SQL injection in an application that can’t use parameterized queries for a specific reason?
  32. What is CSRF and how is it prevented? How does the SameSite cookie attribute help?
  33. How do you handle secrets in a backend application? What’s wrong with environment variables as a secrets strategy?

On question 33: environment variables are better than hardcoded secrets in source code, but they’re not a strong secrets management strategy. They can be read by other processes running as the same user (on Linux, through /proc/<pid>/environ), are inherited by child processes, get included in error reports, and can leak through logging. Tools like AWS Secrets Manager, HashiCorp Vault, or even GitHub Actions secrets with rotation are the real answer. Many teams know this and still use env vars out of convenience. I’d say so in an interview rather than pretend there’s a clean answer.
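
To make the secrets answer concrete, here is a rough sketch of what fetching secrets at runtime could look like instead of reading them from the environment at startup. Everything named here is an assumption for illustration: `SecretCache` is a hypothetical wrapper, and `fetch` stands in for whatever client call your secrets manager provides.

```python
import time


class SecretCache:
    """Illustrative wrapper: fetch a secret on demand and re-fetch after `ttl` seconds."""

    def __init__(self, fetch, ttl=300.0, clock=time.monotonic):
        self._fetch = fetch  # hypothetical callable: secret name -> secret string
        self._ttl = ttl
        self._clock = clock
        self._cache = {}     # name -> (value, fetched_at)

    def get(self, name):
        entry = self._cache.get(name)
        now = self._clock()
        # Re-fetch once the entry is older than the TTL so rotated values get picked up.
        if entry is None or now - entry[1] >= self._ttl:
            entry = (self._fetch(name), now)
            self._cache[name] = entry
        return entry[0]

    def __repr__(self):
        # Show only secret names, never values, so the object is safe to log.
        return f"SecretCache(names={sorted(self._cache)})"
```

The short TTL is the design choice worth talking through: it bounds how long a rotated secret stays stale without hitting the secrets manager on every request.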

Microservices and system design (questions 34-40)

  34. What are the main arguments for and against a microservices architecture? What organizational conditions make microservices a bad fit?
  35. How would you handle distributed transactions across two microservices that each own their own database?
  36. What is the saga pattern? What’s the difference between choreography and orchestration in saga implementations?
  37. How does a message queue differ from a message broker? Give a concrete example of each.
  38. What is a circuit breaker? How does it prevent cascading failures in a distributed system?
  39. How would you design a notification system that needs to deliver to email, SMS, and push, with at-least-once delivery guarantees?
  40. How do you decide when a new feature should be a new service vs. added to an existing service?
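
The circuit breaker question often turns into "sketch one on the whiteboard." A minimal version might look like the following. The names `CircuitBreaker` and `CircuitOpenError`, the thresholds, and the single-trial half-open behaviour are illustrative assumptions, and the sketch ignores thread safety.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without trying the dependency."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit open; failing fast")
            # Timeout elapsed: half-open, so let this one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            raise
        # A success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

Failing fast is what stops the cascade: callers get an immediate error instead of tying up threads and connections waiting on a dependency that is already down.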

The backend engineering mindset interviewers look for

These 40 questions aren’t all equally likely to come up. A startup doing three-round interviews will probably hit 12 to 15 of them. A large tech company running a five-round loop might hit more. The point isn’t to memorize answers.

The best backend interview performances I’ve seen share a pattern: the candidate treats the question as a starting point, not an endpoint. They give an answer, then immediately explain what they’d change given different constraints: “That’s the general approach. If we’re talking about a write-heavy workload, I’d probably…” or “If we’re doing this at Stripe’s scale vs. a 10-person startup, the answer looks different because…”

That kind of contextual reasoning is hard to fake and easy to recognize.

The System Design Primer on GitHub remains one of the more complete free references for the architecture questions in this list. It’s not a substitute for experience, but it’s a solid starting point for questions 34 to 40 if system design is where you’re weakest.

What’s the question in this list that you’d least like to be asked right now? That’s probably where to start your prep.
