System Design Interview: Design a URL Shortener

Designing a URL shortener like TinyURL or Bitly is one of the most common system design interview questions. It seems simple at first - mapping a long string to a short one - but quickly becomes complex when you consider scale, concurrency, and hash collisions.

Table of Contents

Interview Framework: How to Approach This Problem
Step 1: Clarifying Requirements
Step 2: Core Assumptions and Constraints
Step 3: High-Level Architecture
- System Flow Diagram
- API Design (REST)
Step 4: The Hardest Problem - Generating Unique IDs
Step 5: Database Design and Storage
Step 6: Key Technical Decision - 301 vs 302 Redirects
Step 7: Scaling the System
- 1. Caching (Critical)
- 2. Database Sharding
Step 8: Security and Permissions
Step 9: Handling Edge Cases
Step 10: Performance Optimizations
Real-World Implementations
- TinyURL
- Bitly
Common Interview Follow-Up Questions
Conclusion
References
YouTube Videos

Interview Framework: How to Approach This Problem

In a system design interview, when asked to design a URL Shortener, follow this structure:

Clarify requirements (5 minutes) - Functionality and volumes.
Back-of-envelope math (3 minutes) - This is crucial here. It tells you how much storage five years of links will need.
High-level design (10 minutes) - API and Data Flow.
Deep dive (20 minutes) - Focus on the shortening algorithm (Base62 vs Hashing).
Scale (10 minutes) - Caching and cleanup.

Key mindset: Show that you understand the trade-offs between collision handling and throughput.

Step 1: Clarifying Requirements

Questions to Ask the Interviewer

Q: Can users choose their own alias (custom URL)?

Answer: Yes, but optional. Default is random.

Q: Do the links expire?

Answer: Standard links last 5 years.

Q: What is the scale?

Answer: 100M new URLs per month. 10 Billion reads per month (100:1 read/write ratio).

Functional Requirements

Shorten: Given a long URL, return a unique short URL.
Redirect: Given a short URL, redirect to the original long URL.
Custom Alias: Allow specific custom aliases (tinyurl.com/MyResume).
Analytics: Track click counts (optional but good to mention).

Non-Functional Requirements

Low Latency: Redirection must be extremely fast (< 20ms).
High Availability: If the service is down, all redirects fail. Aim for 99.99%.
Unpredictable: Ideally, the next short URL shouldn’t be guessable (security).

Step 2: Core Assumptions and Constraints

Traffic Estimates

Writes: 100 Million / month ≈ 40 writes/sec.
Reads: 10 Billion / month ≈ 4,000 reads/sec.
Ratio: Read-heavy (100:1).

Storage Estimates

Duration: 5 Years.
Total Objects: 100M/month × 12 × 5 = 6 Billion URLs.
Object Size: 500 bytes (ID, LongURL, UserID, Timestamp).
Total Capacity: 6B × 500 bytes ≈ 3TB.
- Conclusion: We can easily fit the metadata in a NoSQL cluster or sharded SQL.

Bandwidth Estimates

Reads: 4,000 req/sec × 500 bytes ≈ 2 MB/sec. (Trivial)
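
The estimates in this step can be reproduced with a few lines of arithmetic. This is a minimal sketch: the 30-day month and the variable names are assumptions made for illustration, while the input figures come from the requirements in Step 1.

```python
# Back-of-envelope numbers from Steps 1 and 2.
# Assumption: a month is treated as 30 days.
SECONDS_PER_MONTH = 30 * 24 * 3600  # 2,592,000 seconds

writes_per_month = 100_000_000
reads_per_month = 10_000_000_000
retention_years = 5
object_size_bytes = 500

writes_per_sec = writes_per_month / SECONDS_PER_MONTH
reads_per_sec = reads_per_month / SECONDS_PER_MONTH
total_objects = writes_per_month * 12 * retention_years
total_storage_tb = total_objects * object_size_bytes / 1e12
# Bandwidth uses the rounded figure of 4,000 reads/sec, as in the text.
read_bandwidth_mb = 4_000 * object_size_bytes / 1e6

print(f"writes/sec     ~ {writes_per_sec:.0f}")
print(f"reads/sec      ~ {reads_per_sec:.0f}")
print(f"total objects  = {total_objects:,}")
print(f"storage (TB)   = {total_storage_tb}")
print(f"read MB/sec    = {read_bandwidth_mb}")
```

The exact rates come out to roughly 39 writes/sec and 3,858 reads/sec, which the estimates round up to 40 and 4,000.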

Step 3: High-Level Architecture

System Flow Diagram

flowchart TD
    Client["Client Browser"] --> LB["Load Balancer"]
    LB --> WebSvc["Shortener Service<br/>Web Server"]

    WebSvc --> Cache[("Cache<br/>Redis")]
    WebSvc --> DB[("Database<br/>NoSQL/SQL")]

    subgraph "Write Path"
    WebSvc -.-> KGS["Key Generation Service<br/>(Optional)"]
    KGS -.-> KGS_DB[("Key DB<br/>Used / Unused Keys")]
    end

    classDef service fill:#e1f5fe,stroke:#01579b,stroke-width:2px,color:#000000
    classDef storage fill:#fff3e0,stroke:#e65100,stroke-width:2px,color:#000000
    classDef client fill:#f3e5f5,stroke:#4a148c,stroke-width:2px,color:#000000
    classDef infrastructure fill:#f5f5f5,stroke:#616161,stroke-width:2px,color:#000000

    class Client client
    class WebSvc,KGS service
    class Cache,DB,KGS_DB storage
    class LB infrastructure

API Design (REST)

1. Create Short URL POST /api/v1/data/shorten

Input: { longUrl: "https://...", customAlias: "optional" }
Output: { shortUrl: "https://tiny.url/xyz123" }

2. Redirect (Get) GET /api/v1/{shortUrl}

Output: HTTP 301/302 Redirect to Long URL.
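
To make the two endpoints concrete, here is a minimal sketch of their request and response shapes as plain functions. Everything beyond the paths and payload fields is an assumption for illustration: the function names, the in-memory dictionary standing in for the database, the random 7-character placeholder code, and the 201/400/404/409 status codes.

```python
import json
import secrets
import string

ALPHABET = string.ascii_letters + string.digits  # 62 characters
BASE_URL = "https://tiny.url"  # host taken from the example output
_store = {}  # short code -> long URL (stand-in for the real database)


def create_short_url(body: str):
    """Sketch of POST /api/v1/data/shorten. Returns (status, response_body)."""
    payload = json.loads(body)
    long_url = payload.get("longUrl", "")
    if not long_url.startswith(("http://", "https://")):
        return 400, json.dumps({"error": "longUrl must be an http(s) URL"})
    # Placeholder code generation; Step 4 discusses real strategies.
    code = payload.get("customAlias") or "".join(
        secrets.choice(ALPHABET) for _ in range(7)
    )
    if code in _store:
        return 409, json.dumps({"error": "short code already taken"})
    _store[code] = long_url
    return 201, json.dumps({"shortUrl": f"{BASE_URL}/{code}"})


def redirect(code: str):
    """Sketch of GET /api/v1/{shortUrl}. Returns (status, headers)."""
    long_url = _store.get(code)
    if long_url is None:
        return 404, {}
    return 302, {"Location": long_url}
```

A real service would put these behind a web framework; keeping them as pure functions makes the contract easy to discuss on a whiteboard.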

Step 4: The Hardest Problem - Generating Unique IDs

We need a short string like http://tt.com/u7yX9. How do we generate the u7yX9 part?

Constraints: Length

Using Base62 encoding (A-Z, a-z, 0-9):

Length 6: 62⁶ ≈ 56.8 Billion combinations.
Length 7: 62⁷ ≈ 3.5 Trillion combinations.
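Length 6 already covers the 6 Billion URLs from Step 2 (about 10% of its keyspace), but Length 7 leaves far more headroom: the keyspace stays sparse, so valid codes are harder to guess and random collisions are rarer. We use Length 7.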
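
The keyspace figures and the encoding itself can be checked with a short sketch. The alphabet ordering (digits, then uppercase, then lowercase) and the function names are assumptions; any fixed ordering of the 62 characters works as long as encode and decode agree.

```python
import string

# Assumed ordering: 0-9, A-Z, a-z.
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase


def base62_encode(n: int) -> str:
    """Convert a non-negative integer to its Base62 string."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n > 0:
        n, rem = divmod(n, 62)
        chars.append(ALPHABET[rem])
    return "".join(reversed(chars))


def base62_decode(s: str) -> int:
    """Convert a Base62 string back to an integer."""
    n = 0
    for ch in s:
        n = n * 62 + ALPHABET.index(ch)
    return n


for length in (6, 7):
    print(f"length {length}: {62 ** length:,} combinations")
```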

Approach 1: Hashing the Long URL (MD5/SHA)

Hash the long URL: MD5(long_url).

Problem: MD5 produces a 128-bit digest (32 hex characters), which is too long for a short URL.
Truncation: Encode the digest in Base62 and take the first 7 characters.
Critical Flaw: Collisions. Different long URLs might produce the same first 7 characters.
Resolution: Check DB -> if the code exists for a different URL, append a salt -> re-hash. (This adds a DB round trip to every write and slows writes significantly.)

Approach 2: Key Generation Service (KGS) - Recommended

We pre-generate unique 7-character strings offline and store them in a “Key DB”.

Key DB: Has two tables: UsedKeys and UnusedKeys.
Process:
1. KGS loads 1000 unused keys into memory.
2. Web Server requests a key.
3. KGS hands one out and marks it used.
Pros: Zero collisions. Extremely fast (no hashing on the fly).
Cons: Need to handle concurrency so KGS doesn’t give the same key twice (use SELECT ... FOR UPDATE or strict locking).
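
A minimal in-process sketch of the KGS hand-out logic follows. It is only an illustration: a deque and a set stand in for the UnusedKeys and UsedKeys tables, a threading.Lock stands in for the row locking (SELECT ... FOR UPDATE) a real database would provide, and the class and method names are hypothetical.

```python
import threading
from collections import deque


class KeyGenerationService:
    """Hands out pre-generated keys, never the same key twice."""

    def __init__(self, unused_keys, batch_size=1000):
        self._unused = deque(unused_keys)  # stand-in for the UnusedKeys table
        self._used = set()                 # stand-in for the UsedKeys table
        self._batch = deque()              # keys currently loaded in memory
        self._batch_size = batch_size
        self._lock = threading.Lock()

    def _load_batch(self):
        """Step 1: move up to batch_size unused keys into memory."""
        for _ in range(min(self._batch_size, len(self._unused))):
            self._batch.append(self._unused.popleft())

    def next_key(self) -> str:
        """Steps 2 and 3: hand one key out and mark it used."""
        with self._lock:  # one caller at a time, so no key is issued twice
            if not self._batch:
                self._load_batch()
            if not self._batch:
                raise RuntimeError("key space exhausted")
            key = self._batch.popleft()
            self._used.add(key)
            return key
```

One design question worth raising in the interview: if a KGS instance crashes with keys still loaded in memory, those keys are lost unless they are returned to the unused pool. With 3.5 trillion combinations available, losing a batch is usually acceptable.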

Step 5: Database Design and Storage

Schema:

{
  "hash": "u7yX9", // Primary Key
  "originalUrl": "https://google.com/search?q=...",
  "creationDate": "2026-02-12T...",
  "expirationDate": "2031-02-12T...",
  "userId": "UserID_OPTIONAL"
}

Which DB Type?

NoSQL (DynamoDB / Cassandra):
- ✅ Highly scalable for billions of rows.
- ✅ Fast Key-Value lookups.
- ✅ Easier to shard.
SQL (PostgreSQL):
- ✅ Transactional support (good if tracking complex user analytics).
- ❌ Harder to scale horizontally to 100TB+.

Verdict: NoSQL (for example DynamoDB) is a common choice for this read-heavy, simple key-value structure.

Step 6: Key Technical Decision - 301 vs 302 Redirects

301 (Permanent Redirect)

The browser caches the mapping. The next time the user visits tiny.url/xyz, the browser goes directly to the long URL without hitting your server.
Pro: Reduced server load.
Con: You lose analytics. You don’t know if the user clicked it again.

302 (Temporary Redirect)

Browser always hits your server first for the new location.
Pro: Accurate analytics (click tracking).
Con: Higher server load.

Decision: Use 302 if Analytics is a business requirement (usually yes). Use 301 if reducing server cost is the priority.
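
The decision can be expressed as a small helper that builds the redirect response. This is a sketch: the function name, the boolean flag, and the specific Cache-Control values are assumptions chosen to make the caching behavior explicit.

```python
def build_redirect(long_url: str, analytics_enabled: bool):
    """Return (status, headers) for a redirect to long_url."""
    if analytics_enabled:
        # Temporary redirect, not cached: every click reaches the server.
        return 302, {"Location": long_url, "Cache-Control": "no-store"}
    # Permanent redirect, cacheable: repeat visits may skip the server.
    return 301, {"Location": long_url, "Cache-Control": "public, max-age=86400"}


status, headers = build_redirect("https://example.com/page", analytics_enabled=True)
print(status, headers["Location"])
```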

Step 7: Scaling the System

1. Caching (Critical)

Since the 80/20 rule applies (20% of URLs generate 80% of traffic), we should cache the most popular mappings.

Technology: Redis / Memcached.
Eviction: LRU (Least Recently Used).
Flow: GET request -> Check Redis -> If Miss, Check DB -> Write to Redis -> Return.
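
The read flow above is the cache-aside pattern. The sketch below uses a small in-process LRU cache as a stand-in for Redis and a dictionary as a stand-in for the database; the class and function names are hypothetical.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny LRU cache; evicts the least recently used entry when full."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used


def resolve(code: str, cache: LRUCache, db: dict):
    """Check cache -> on miss, check DB -> write to cache -> return."""
    long_url = cache.get(code)
    if long_url is not None:
        return long_url
    long_url = db.get(code)
    if long_url is not None:
        cache.put(code, long_url)
    return long_url
```

Unknown codes are not cached in this sketch, so repeated lookups for a nonexistent code always reach the database.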

2. Database Sharding

How to store 3TB of data?

Hash-Based Sharding: hash(short_url) % n.
Determine which shard holds the data based on the short URL ID.
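
A shard lookup along these lines might look like the sketch below. The function name and the choice of SHA-256 are assumptions; the important detail is using a stable hash, because Python's built-in hash() for strings is randomized per process and would send the same code to different shards on different servers.

```python
import hashlib


def shard_for(short_code: str, num_shards: int) -> int:
    """Map a short code to a shard index with hash(short_url) % n."""
    digest = hashlib.sha256(short_code.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


print(shard_for("u7yX9", 16))
```

Note that with plain modulo sharding, changing the number of shards remaps most keys, so the shard count should be chosen with growth in mind.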

Step 8: Security and Permissions

Prediction Attack: If using auto-increment IDs (tiny.url/1, tiny.url/2), competitors can scrape all your data by iterating numbers.
- Fix: Use KGS with random distribution, or simple Base62 + large random offset.
Malicious Links: Users shortening phishing links.
- Fix: Integrate with Google Safe Browsing API to scan long URLs before shortening.

Step 9: Handling Edge Cases

KGS Failure: If the Key Gen Service dies, writes stop.
- Solution: Run redundant KGS instances with standby keys in memory.
Redirect Loops: User shortens tiny.url/abc which points to tiny.url/abc.
- Fix: Detect domains that match your own service and reject them.
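
The own-domain check can be sketched with the standard library's URL parser. The host set and the function name are assumptions for illustration; a real service would list every domain it serves redirects from.

```python
from urllib.parse import urlsplit

# Assumed set of hosts that belong to the shortener itself.
OWN_HOSTS = {"tiny.url", "www.tiny.url"}


def points_to_own_service(long_url: str) -> bool:
    """Return True if the destination is one of our own short-link hosts."""
    if "://" not in long_url:
        # Inputs like "tiny.url/abc" have no scheme; add one so the host parses.
        long_url = "http://" + long_url
    host = urlsplit(long_url).hostname  # lowercased by urlsplit, or None
    return host in OWN_HOSTS


print(points_to_own_service("tiny.url/abc"))
print(points_to_own_service("https://example.com/tiny.url"))
```

This only catches direct self-references. A loop that passes through another shortener would need a check that follows redirects, which is outside this sketch.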

Step 10: Performance Optimizations

Geo-Distribution: Run read replicas of the DB (or Edge Caching via Cloudflare) in multiple regions to reduce latency for global users.
Cleanup: A “Lazy Cleanup” process. Don’t constantly scan for expired links. Only check expiration when a user clicks a link. If expired, delete it and return error. Run a background sweeper monthly for the rest.
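
The lazy cleanup idea can be sketched as a read-time check plus a separate sweeper. The field names follow the schema in Step 5; the function names, the dictionary standing in for the database, and the 410 status for expired links are assumptions.

```python
from datetime import datetime, timezone


def resolve_with_expiry(code: str, db: dict, now=None):
    """Check expiration only when the link is clicked."""
    now = now or datetime.now(timezone.utc)
    record = db.get(code)
    if record is None:
        return 404, None
    if record["expirationDate"] <= now:
        del db[code]      # lazy delete on access
        return 410, None  # assumed status for an expired link
    return 302, record["originalUrl"]


def sweep_expired(db: dict, now=None) -> int:
    """Background sweeper for expired links nobody clicked. Returns count removed."""
    now = now or datetime.now(timezone.utc)
    expired = [code for code, rec in db.items() if rec["expirationDate"] <= now]
    for code in expired:
        del db[code]
    return len(expired)
```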

Real-World Implementations

TinyURL

Originally used simple base62 conversion of database IDs.
Suffered from predictable URLs (enumerability).

Bitly

Started as a utility, pivoted to Enterprise Analytics.
Uses heavy caching and 302 redirects to track every interaction for marketing insights.

Common Interview Follow-Up Questions

Q: How would you support custom aliases (for example, `tiny.url/sale-2026`)?

Answer: “I’d treat custom alias creation as a transactional write:

Validate alias format and reserved keywords.
Check uniqueness with conditional write (put-if-absent) in the alias table.
Enforce namespace rules for enterprise accounts.
Add abuse checks to block impersonation and trademark misuse.

Trade-off: Alias flexibility improves product value, but requires stricter moderation and conflict handling.”
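
The validation and put-if-absent steps from this answer can be sketched as follows. The alias pattern, the reserved words, and all names are assumptions; the lock stands in for the conditional write a real datastore would provide.

```python
import re
import threading

ALIAS_PATTERN = re.compile(r"^[A-Za-z0-9_-]{3,30}$")  # assumed format rule
RESERVED = {"api", "admin", "login"}                   # assumed reserved keywords


class AliasTable:
    """In-memory stand-in for an alias table with conditional writes."""

    def __init__(self):
        self._rows = {}
        self._lock = threading.Lock()

    def put_if_absent(self, alias: str, long_url: str) -> bool:
        """Insert only if the alias is free; check and write happen atomically."""
        with self._lock:
            if alias in self._rows:
                return False
            self._rows[alias] = long_url
            return True


def claim_alias(table: AliasTable, alias: str, long_url: str) -> str:
    """Validate the alias, then try to claim it."""
    if not ALIAS_PATTERN.fullmatch(alias):
        return "invalid"
    if alias.lower() in RESERVED:
        return "reserved"
    return "created" if table.put_if_absent(alias, long_url) else "taken"
```

Doing the uniqueness check and the write as one operation is what prevents two users from claiming the same alias at the same moment.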

Q: How do you add expiring links without hurting redirect performance?

Answer: “Expiration should be evaluated at read time with lightweight metadata:

Store expiresAt with each mapping.
Keep it in cache payload so redirect path can reject expired links quickly.
Return a dedicated expired-link page and analytics event.
Run async cleanup jobs to remove old rows from storage.

This keeps the hot redirect path fast while still honoring expiration contracts.”

Q: How would you detect and block malicious destination URLs?

Answer: “I would combine pre-check and post-check controls:

Scan long URLs with threat-intel APIs before creation.
Re-scan high-traffic links periodically since reputation can change.
Maintain internal deny lists and domain risk scores.
Add click-time interstitial warnings for suspicious but not fully blocked links.

Trade-off: Strict blocking reduces abuse, but false positives require fast appeal workflows.”

Q: How do you provide detailed analytics without increasing redirect latency?

Answer: “Separate redirect serving from analytics ingestion:

Redirect service returns immediately after lookup.
Emit click events asynchronously to Kafka/PubSub.
Aggregate metrics in batch/stream jobs for dashboards.
Use sampled logs plus exact counters for high-value enterprise links.

This protects p95 latency while still delivering rich reporting.”
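
A minimal in-process sketch of this separation is shown below. A queue.Queue stands in for Kafka/PubSub, a Counter stands in for the aggregation job, and the function names are hypothetical; the point is only that the redirect path enqueues an event and returns without waiting for it to be processed.

```python
import queue
import threading
from collections import Counter

click_events = queue.Queue()  # stand-in for a Kafka/PubSub topic
click_counts = Counter()      # stand-in for the aggregated metrics store


def handle_redirect(code: str, db: dict):
    """Look up the URL, enqueue a click event, and return immediately."""
    long_url = db.get(code)
    if long_url is None:
        return 404, None
    click_events.put(code)  # does not block on an unbounded queue
    return 302, long_url


def analytics_worker():
    """Consume click events off the hot path; a None event stops the worker."""
    while True:
        code = click_events.get()
        if code is None:
            break
        click_counts[code] += 1
```

In use, the worker runs on its own thread (or, in production, as a separate consumer service), so a slow analytics pipeline cannot delay redirects.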

Q: What if your key generation strategy must change later?

Answer: “Design for migration from day one:

Version short codes (for example v1, v2) in metadata.
Keep resolver backward-compatible across versions.
Roll out new generator for new links only.
Migrate old records lazily when accessed, if needed.

This avoids large risky rewrites and allows incremental evolution.”

Conclusion

Designing a URL Shortener tests your ability to handle:

Atomicity: Generating unique keys without duplicates.
Scale: A 100:1 read/write ratio that requires heavy caching.
Simplicity: Choosing NoSQL over SQL for simple KV data.

References

YouTube Videos

“Design a URL Shortener - System Design Interview” - Gaurav Sen [https://www.youtube.com/watch?v=JQDHz72OA3c]
“System Design Interview: A Framework for Beginners and Seniors” - ByteByteGo [https://www.youtube.com/watch?v=bUHFg8CZFws]