What Are Sticky Sessions in Load Balancing? (Session Affinity)

Your app scales from 1 server to 10 servers… and suddenly your users start getting randomly logged out.

Nothing in your code “changed.” But your traffic did.

What changed is where requests land. With multiple app instances behind a load balancer, one request might hit server A and the next request might hit server B. If your session state lives in server A’s memory, server B has no idea who the user is.

Sticky sessions are a common quick fix.

If you’re following the learning path, start with What Is Load Balancing and How It Works first.

For the full roadmap, use the System Design Foundations series as the pillar page and the System Design tag as a category page.

Table of Contents

What Are Sticky Sessions? (Definition)
Why Sticky Sessions Exist
How Sticky Sessions Work (Step-by-Step)
Common Sticky Session Techniques (Cookie, IP Hash, Header)
Sticky Sessions vs Stateless Apps
When Sticky Sessions Are Actually Useful
Why Sticky Sessions Can Be Risky
Better Alternatives (Shared Sessions, Tokens)
- Option 1: Put session state in a shared store
- Option 2: Use stateless auth tokens (JWT)
A Practical Example: Nginx Session Affinity
- Option 1: IP hash (simple)
- Option 2: Hash a stable value (more explicit)
A Practical Example: Kubernetes Service Session Affinity
Migration Plan: From Sticky to Stateless
Interview Questions
Conclusion
References
YouTube Videos

What Are Sticky Sessions? (Definition)

Sticky sessions (also called session affinity or session persistence) mean that a load balancer tries to send all requests from the same user to the same backend instance.

A simple way to remember it:

Without stickiness: one user → many servers (over time)
With stickiness: one user → the same server (until the session expires)

Sticky sessions are often used when the backend is stateful (it keeps user-specific data in memory), and the team isn’t ready to move that state into a shared place yet.

Why Sticky Sessions Exist

In a perfect world, your app servers are stateless: any instance can handle any request.

In the real world, many apps start out storing things like these in server memory:

session data (logged-in user, roles)
shopping cart state
in-progress checkout data
rate-limit counters

That works on a single server.

But after you put multiple servers behind a load balancer, you’ve created a new problem: the “next request” might go to a different server.

Sticky sessions exist because they let teams scale out without immediately rewriting the app.
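The problem described above can be reproduced in a few lines. This is a minimal sketch, assuming a toy `AppServer` class and a round-robin rotation standing in for the load balancer; neither is taken from a real framework.

```python
from itertools import cycle
from typing import Dict, Optional


class AppServer:
    """A toy app server that keeps sessions in its own process memory."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.sessions: Dict[str, str] = {}  # session_id -> username

    def login(self, session_id: str, username: str) -> None:
        self.sessions[session_id] = username

    def whoami(self, session_id: str) -> Optional[str]:
        # Returns None when this server has never seen the session.
        return self.sessions.get(session_id)


servers = [AppServer("A"), AppServer("B")]
round_robin = cycle(servers)  # hypothetical load balancer: alternate servers

first = next(round_robin)    # request 1 lands on server A
first.login("sess-123", "alice")

second = next(round_robin)   # request 2 lands on server B
print(first.name, first.whoami("sess-123"))    # A alice
print(second.name, second.whoami("sess-123"))  # B None
```

Server B returns `None` for a session that server A knows about, which is what a user experiences as a random logout.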

How Sticky Sessions Work (Step-by-Step)

Here’s the typical flow when the load balancer uses a cookie-based sticky session:

flowchart TD
    U[User Browser] -->|1. Request| LB[Load Balancer]

    LB -->|2. Choose backend| A[App Server A]
    LB -->|2. Choose backend| B[App Server B]

    A -->|3. Response + Set-Cookie| U

    U -->|4. Next request with cookie| LB
    LB -->|5. Route to same backend| A

    A --> R[(Shared Session Store\nOptional: Redis/DB)]
    B --> R

What’s important is what the cookie represents:

It might directly encode “this user belongs to server A” (not common in modern managed LBs)
More commonly, it’s an opaque “affinity” cookie that the load balancer understands

Either way, the effect is the same: the load balancer keeps routing the user back to the same backend.
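The flow above can be sketched as a small routing function. This is an illustration under stated assumptions, not how any particular load balancer is implemented: the backend names and the `lb_affinity` cookie name are made up, and the cookie stores the backend name directly for readability, whereas managed load balancers usually use an opaque value.

```python
import random
from typing import Dict, List, Optional, Tuple

BACKENDS: List[str] = ["app-a", "app-b", "app-c"]  # hypothetical backend names
AFFINITY_COOKIE = "lb_affinity"                     # hypothetical cookie name


def route(cookies: Dict[str, str], healthy: List[str]) -> Tuple[str, Optional[str]]:
    """Return (backend, cookie_value_to_set).

    cookie_value_to_set is None when the existing cookie is still usable.
    """
    pinned = cookies.get(AFFINITY_COOKIE)
    if pinned in healthy:
        return pinned, None          # steps 4-5: honor the existing affinity
    chosen = random.choice(healthy)  # step 2: pick any healthy backend
    return chosen, chosen            # step 3: ask the client to remember it


cookies: Dict[str, str] = {}
backend, set_cookie = route(cookies, BACKENDS)
if set_cookie is not None:
    cookies[AFFINITY_COOKIE] = set_cookie  # the browser stores the cookie

# Later requests carry the cookie and are routed to the same backend.
print(backend, route(cookies, BACKENDS)[0])
```

If the pinned backend drops out of the healthy list, the function picks a new backend and issues a new cookie, which mirrors the re-pinning that happens after a failure.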

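Common Sticky Session Techniques (Cookie, IP Hash, Header)

There are a few common ways to implement session affinity. Each one has very different failure modes.

1. Cookie-based stickiness (most common for HTTP)

The load balancer sets an affinity cookie and uses it on the next request.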

Works well for browser traffic.
Doesn’t break when many users share an IP.
The cookie has a TTL, and the affinity can expire.

2. IP hash affinity (simple, but risky)

The load balancer hashes the client IP and uses that to pick a backend.

This is a popular “simple” technique, but it can behave badly:

NATs and mobile carriers can make thousands of users appear as the same IP.
One “big” IP can become a hotspot and overload a backend.
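The hotspot effect is easy to see in a small simulation. This sketch assumes a generic SHA-256 based hash and made-up addresses from documentation ranges; real load balancers use their own hash functions.

```python
import hashlib
from collections import Counter
from typing import List

BACKENDS: List[str] = ["app-a", "app-b", "app-c"]  # hypothetical backend names


def pick_backend(client_ip: str, backends: List[str]) -> str:
    """Map a client IP to a backend with a stable hash (same IP, same backend)."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    return backends[int.from_bytes(digest[:8], "big") % len(backends)]


# 900 office users behind one NAT address, plus 100 users with their own IPs.
nat_users = ["203.0.113.7"] * 900
home_users = [f"198.51.100.{i}" for i in range(1, 101)]

load = Counter(pick_backend(ip, BACKENDS) for ip in nat_users + home_users)
print(load)  # one backend receives at least 900 of the 1000 requests
```

Because every NAT user hashes to the same value, no amount of extra backends spreads that traffic out.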

3. Header-based affinity (advanced)

Some systems route based on a stable request header (like a user ID or a tenant ID).

This is powerful, but be careful: if clients can spoof the header, you can create security and traffic-routing issues. Usually you’d only trust this header if it’s injected by a gateway you control.
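One way to picture the "trusted gateway" rule is a gateway that discards whatever routing header the client sent and sets it from the identity it has already verified. The sketch below assumes a hypothetical `x-tenant-id` header and made-up backend names; how the tenant gets authenticated is out of scope here.

```python
import hashlib
from typing import Dict, List

BACKENDS: List[str] = ["app-a", "app-b", "app-c"]  # hypothetical backend names
ROUTING_HEADER = "x-tenant-id"                      # hypothetical header name


def gateway_headers(client_headers: Dict[str, str], authenticated_tenant: str) -> Dict[str, str]:
    """Drop any client-supplied routing header, then set it from verified identity."""
    forwarded = {k: v for k, v in client_headers.items() if k.lower() != ROUTING_HEADER}
    forwarded[ROUTING_HEADER] = authenticated_tenant
    return forwarded


def pick_backend(headers: Dict[str, str], backends: List[str]) -> str:
    """Route on the gateway-injected header using a stable hash."""
    digest = hashlib.sha256(headers[ROUTING_HEADER].encode("utf-8")).digest()
    return backends[int.from_bytes(digest[:8], "big") % len(backends)]


spoofed = {"X-Tenant-Id": "someone-else", "accept": "text/html"}
safe = gateway_headers(spoofed, authenticated_tenant="tenant-42")
print(safe[ROUTING_HEADER], pick_backend(safe, BACKENDS))  # routed as tenant-42
```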

Sticky Sessions vs Stateless Apps

Sticky sessions don’t make your app stateless. They only hide statefulness behind routing.

A good mental model:

Sticky sessions are a routing decision.
Statelessness is an architectural property.

If server A dies, routing can’t save you. The session data in server A’s memory is gone.

That’s why most teams treat sticky sessions as:

a compatibility mode
a migration stepping stone
not the end state

When Sticky Sessions Are Actually Useful

Sticky sessions have a bad reputation because teams overuse them, but they’re not always wrong.

They can be a reasonable choice when:

you need a quick stability fix (users are being logged out)
your session store isn’t ready yet (Redis/DB sizing, operational readiness)
you’re supporting a legacy app that can’t be made stateless quickly
you’re using long-lived connections where reconnecting is expensive (some WebSocket setups)

In these cases, stickiness buys time.

The key is to treat it like a temporary bridge while you move state out of memory.

Why Sticky Sessions Can Be Risky

Sticky sessions introduce coupling between user experience and a specific machine.

1. Failures become user-visible

If a backend instance crashes, every user “stuck” to it experiences a disruption. Even if the load balancer routes them somewhere else on retry, the state that lived in memory may be lost.

2. Hotspots and uneven load

Some users are “heavier” than others. If a few heavy users end up sticky to the same backend, you get uneven CPU and memory utilization.

This is especially common with IP hash affinity.

3. Deployments and scaling can be weird

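When you add or remove instances, the pool changes but existing affinities don’t rebalance on their own (and with hash-based affinity, some mappings shift to different backends).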

New instances might receive almost no traffic for a while.
Old instances might stay hot because existing users are pinned to them.

This makes autoscaling slower to help and can make rolling deployments drag out longer than you expect.
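A toy simulation shows why a freshly added instance stays cold. It assumes a simplified affinity table (`pins`) and a least-loaded choice for new users; the numbers are invented for illustration.

```python
from collections import Counter
from typing import Dict, List


def assign(user: str, pins: Dict[str, str], backends: List[str]) -> str:
    """Reuse an existing pin; otherwise pin the user to the least-loaded backend."""
    if user not in pins:
        counts = Counter(pins.values())
        pins[user] = min(backends, key=lambda b: counts[b])
    return pins[user]


pins: Dict[str, str] = {}
backends = ["app-a", "app-b"]   # hypothetical backend names
for i in range(100):            # 100 existing users get pinned
    assign(f"user-{i}", pins, backends)

backends.append("app-c")        # autoscaling adds a third instance
for i in range(100):            # the same users come back and keep their pins
    assign(f"user-{i}", pins, backends)
for i in range(100, 110):       # only 10 new users arrive
    assign(f"user-{i}", pins, backends)

print(Counter(pins.values()))   # app-a: 50, app-b: 50, app-c: 10
```

The new instance only picks up new users, so the two original instances stay as busy as they were before scaling out.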

4. Debugging becomes harder

In production, stickiness changes how you debug:

“It only happens for some users” might really mean “it only happens on one backend.”
Your metrics might look fine on average, while one instance is overloaded.
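The second point is worth checking with numbers. The CPU figures below are made up for illustration; the idea is simply to look at per-instance values alongside the fleet average.

```python
from statistics import mean
from typing import Dict

# Made-up CPU utilization (percent) per backend, for illustration only.
cpu_by_backend: Dict[str, float] = {"app-a": 95.0, "app-b": 30.0, "app-c": 25.0}

fleet_average = mean(cpu_by_backend.values())
hottest, hottest_cpu = max(cpu_by_backend.items(), key=lambda kv: kv[1])

print(f"average={fleet_average:.0f}% hottest={hottest} at {hottest_cpu:.0f}%")
# average=50% hottest=app-a at 95%
```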

Better Alternatives (Shared Sessions, Tokens)

The cleanest fix is to make your app servers interchangeable.

Option 1: Put session state in a shared store

Instead of keeping session state in server memory, keep it in:

Redis (common)
a database (can work, but latency is usually higher than an in-memory store)
a managed session store or identity provider

Now any backend can handle any request, and stickiness becomes unnecessary.
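Here is a minimal sketch of that idea. `SharedSessionStore` is an in-process stand-in for an external store such as Redis, and both class names are hypothetical; a real deployment would call the store over the network and handle timeouts and expiry.

```python
from itertools import cycle
from typing import Dict, Optional


class SharedSessionStore:
    """In-process stand-in for an external store such as Redis."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def set(self, session_id: str, username: str) -> None:
        self._data[session_id] = username

    def get(self, session_id: str) -> Optional[str]:
        return self._data.get(session_id)


class AppServer:
    """Keeps no session state of its own; every lookup goes to the shared store."""

    def __init__(self, name: str, store: SharedSessionStore) -> None:
        self.name = name
        self.store = store

    def login(self, session_id: str, username: str) -> None:
        self.store.set(session_id, username)

    def whoami(self, session_id: str) -> Optional[str]:
        return self.store.get(session_id)


store = SharedSessionStore()
servers = cycle([AppServer("A", store), AppServer("B", store)])

next(servers).login("sess-123", "alice")  # handled by server A
print(next(servers).whoami("sess-123"))   # handled by server B, prints: alice
```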

Option 2: Use stateless auth tokens (JWT)

If your “session” is mostly authentication, you can store user identity and claims in a signed token.

This reduces server-side session storage, but it’s not a universal replacement:

tokens have to expire
revocation is harder (you often need a denylist)
you might still need server-side state for carts, workflows, rate limits, etc.

In other words: JWT can remove some session state, not all state.
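To show the mechanics, here is a simplified signed token built only from the Python standard library. It is not a spec-compliant JWT, and the secret and function names are assumptions for illustration; in production you would normally use a vetted JWT library instead.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Any, Dict, Optional

SECRET = b"demo-secret-change-me"  # hypothetical key shared by all app servers


def _sign(body: str) -> str:
    digest = hmac.new(SECRET, body.encode("utf-8"), hashlib.sha256).digest()
    return base64.urlsafe_b64encode(digest).decode("ascii")


def issue_token(claims: Dict[str, Any], ttl_seconds: int, now: Optional[float] = None) -> str:
    """Serialize claims with an expiry and append an HMAC-SHA256 signature."""
    issued_at = time.time() if now is None else now
    payload = dict(claims, exp=issued_at + ttl_seconds)
    raw = json.dumps(payload, sort_keys=True).encode("utf-8")
    body = base64.urlsafe_b64encode(raw).decode("ascii")
    return f"{body}.{_sign(body)}"


def verify_token(token: str, now: Optional[float] = None) -> Optional[Dict[str, Any]]:
    """Return the claims if the signature is valid and the token is unexpired."""
    body, _, signature = token.partition(".")
    if not hmac.compare_digest(signature.encode("utf-8"), _sign(body).encode("utf-8")):
        return None  # tampered or signed with a different key
    claims = json.loads(base64.urlsafe_b64decode(body))
    current = time.time() if now is None else now
    return claims if claims["exp"] > current else None


token = issue_token({"sub": "alice", "role": "admin"}, ttl_seconds=900)
print(verify_token(token))  # any server holding SECRET can verify this
```

Any instance that holds the key can validate the token without a session lookup, but nothing here can revoke a token before it expires, which is the trade-off noted above.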

A Practical Example: Nginx Session Affinity

Open-source Nginx doesn’t provide cookie-based “sticky” sessions out of the box in the same way some managed load balancers do, but you can still get a basic form of session affinity.

Option 1: IP hash (simple)

upstream app_upstream {
  ip_hash;
  server 10.0.0.11:3000;
  server 10.0.0.12:3000;
  server 10.0.0.13:3000;
}

server {
  listen 80;

  location / {
    proxy_pass http://app_upstream;
  }
}

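This tends to keep the same client IP routed to the same backend, but remember the risks: NAT and shared IPs can create hotspots. For IPv4, ip_hash keys on only the first three octets of the address, so all clients in the same /24 network land on the same backend.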

Option 2: Hash a stable value (more explicit)

If you have a stable cookie you control (like a user ID or a session ID), you can hash that value instead.

upstream app_upstream {
  hash $cookie_session_id consistent;
  server 10.0.0.11:3000;
  server 10.0.0.12:3000;
  server 10.0.0.13:3000;
}

This works best when the cookie is present and stable. If the cookie is missing, Nginx falls back to a different routing behavior (so treat this as a technique, not a guarantee).
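The `consistent` parameter asks Nginx to use consistent hashing, so that changing the server list remaps only a fraction of keys. The sketch below is a generic consistent-hash ring written for illustration; it is not Nginx’s implementation, and the fourth server address is made up.

```python
import bisect
import hashlib
from typing import Dict, List


def _hash(value: str) -> int:
    return int.from_bytes(hashlib.sha256(value.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Minimal consistent-hash ring with virtual nodes."""

    def __init__(self, servers: List[str], vnodes: int = 100) -> None:
        self._ring: Dict[int, str] = {}
        for server in servers:
            for i in range(vnodes):
                self._ring[_hash(f"{server}#{i}")] = server
        self._points = sorted(self._ring)

    def lookup(self, key: str) -> str:
        # Walk clockwise to the first point at or after the key's hash.
        idx = bisect.bisect_left(self._points, _hash(key)) % len(self._points)
        return self._ring[self._points[idx]]


sessions = [f"session-{i}" for i in range(1000)]
before = HashRing(["10.0.0.11", "10.0.0.12", "10.0.0.13"])
after = HashRing(["10.0.0.11", "10.0.0.12", "10.0.0.13", "10.0.0.14"])

moved = sum(before.lookup(s) != after.lookup(s) for s in sessions)
print(f"{moved} of {len(sessions)} sessions changed backend")
```

With a plain modulo hash, adding a fourth server would remap most sessions; with a ring, only the sessions that now belong to the new server move.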

A Practical Example: Kubernetes Service Session Affinity

Kubernetes Services can do basic session affinity at the Service layer.

apiVersion: v1
kind: Service
metadata:
  name: checkout
spec:
  selector:
    app: checkout
  ports:
    - port: 80
      targetPort: 3000
  sessionAffinity: ClientIP

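This is essentially IP-based affinity. By default the affinity lasts 10800 seconds (3 hours), and you can tune it with sessionAffinityConfig.clientIP.timeoutSeconds. It can be fine for small internal services, but the same “shared IP” caveat applies if your traffic comes through a NAT or a gateway.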

Migration Plan: From Sticky to Stateless

If your system currently relies on sticky sessions, here’s a safe way to migrate without breaking users.

1. Identify what state is in memory
- sessions, carts, workflow state, rate limits, caches
2. Move the state to a shared store
- start with the smallest, most painful piece (usually auth session)
3. Make the app idempotent where possible
- retries and failovers happen more often once you remove stickiness
4. Reduce reliance on stickiness gradually
- shorten the affinity TTL
- canary a portion of traffic without stickiness
5. Turn off sticky sessions completely
- only after you have strong confidence and monitoring
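For the canary step, one common approach is to bucket users deterministically so the same users stay in the canary between requests. This is a sketch under assumptions: the `in_canary` helper and the user ID format are hypothetical, and how your load balancer consumes the decision depends on your setup.

```python
import hashlib


def in_canary(user_id: str, percent: int) -> bool:
    """Deterministically place a stable share of users in the no-stickiness canary."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # bucket in 0..99
    return bucket < percent


users = [f"user-{i}" for i in range(10_000)]
canary = [u for u in users if in_canary(u, percent=10)]
print(f"{len(canary)} of {len(users)} users routed without affinity")
```

Raising `percent` only adds users to the canary and never removes anyone already in it, which keeps the rollout predictable.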

If you want a deeper refresher on why statelessness matters for scaling out, revisit Horizontal vs Vertical Scaling Explained (Scale Out vs Up).

Interview Questions

1. What are sticky sessions?

Sticky sessions (session affinity) are a load-balancing technique where requests from the same client are consistently routed to the same backend instance for some period of time. This is commonly used when the backend is stateful and stores user session data in memory. It can prevent issues like users being logged out when requests land on different servers. The trade-off is that you couple user experience to a specific machine.

2. What’s the difference between sticky sessions and a stateless service?

Sticky sessions are a routing rule, not an architectural property. A stateless service can handle any request on any instance because user-specific state is stored externally (Redis/DB/tokens), so a load balancer can freely distribute traffic. Sticky sessions keep sending the same user to the same instance so that in-memory state “works,” but the service is still stateful. If that instance dies, the session state can be lost.

3. How do load balancers implement session affinity?

The most common approach for HTTP systems is cookie-based affinity, where the load balancer sets an affinity cookie and uses it to route subsequent requests. Another approach is IP hash, where the client IP is hashed to pick a backend, which can be simple but risky due to NAT and shared IPs. Some systems also support header-based affinity when a trusted gateway injects a stable identifier. Each method works, but each one comes with different security and hotspot risks.

4. Why can sticky sessions be risky in production?

They create uneven load and fragile coupling. A subset of users pinned to one instance can overload it, and if that instance fails, many users experience disruption simultaneously. Sticky sessions can also make autoscaling less effective because new instances may receive very little traffic until old affinity mappings expire. Over time, they can hide the need to move state to a shared store.

5. When would you choose sticky sessions anyway?

I’d consider sticky sessions as a short-term fix when users are experiencing session-related bugs after scaling out and we need stability quickly. They can also be useful for some long-lived connection patterns or legacy systems that can’t be made stateless quickly. The key is to define an exit plan: move session state out of memory into a shared store and gradually reduce reliance on affinity. In interviews, I’d frame sticky sessions as a bridge, not an end state.

6. What’s a better long-term design than sticky sessions?

The long-term approach is to make app servers stateless by moving session state and workflow state into shared infrastructure like Redis, a database, or a managed identity provider. Then any instance can handle any request, which improves failover behavior, rolling deployments, and autoscaling effectiveness. If authentication is the main concern, stateless tokens can reduce server-side session storage, but they don’t replace all forms of user state. In practice, most real systems use a combination: tokens for auth and shared stores for server-side state.

Conclusion

Sticky sessions (session affinity) route a user to the same backend instance across requests.
They exist because many apps store session state in memory and break when traffic spreads across multiple servers.
Cookie-based affinity is common for HTTP; IP hash is simpler but can create hotspots.
Sticky sessions can make failures more user-visible and can make autoscaling and deployments behave oddly.
The long-term fix is statelessness: move state to shared stores (Redis/DB) or use tokens where appropriate.

The next topic in this series covers Caching - why it improves performance, and where it fits in a real architecture.

If you want the “routing layer” context again, revisit What Is Load Balancing and How It Works.

References

Load balancer stickiness - AWS Prescriptive Guidance
https://docs.aws.amazon.com/prescriptive-guidance/latest/load-balancer-stickiness/welcome.html
Session affinity - Cloudflare Developers
https://developers.cloudflare.com/load-balancing/understand-basics/session-affinity/
ngx_http_upstream_module (hash) - Nginx Docs
https://nginx.org/en/docs/http/ngx_http_upstream_module.html#hash

YouTube Videos

“AWS ALB Stickiness Explained: Target Group Session Affinity Secrets 🔍☁️” - Network Ninja
https://www.youtube.com/watch?v=XwMVbJ_y2iA
