System Design Interview: Design a URL Shortener

By Pratik Bhuite | 13 min read

Series: System Design Interview Series

Last verified: Feb 12, 2026

Part 4 of 8 in the System Design Interview Series

Designing a URL shortener like TinyURL or Bitly is one of the most common system design interview questions. It seems simple at first—mapping a long string to a short one—but quickly becomes complex when you consider scale, concurrency, and hash collisions.

Need a quick revision before interviews? Read the companion cheat sheet: System Design Interview: Url Shortener System Design CheatSheet.

Interview Framework: How to Approach This Problem

In a system design interview, when asked to design a URL Shortener, follow this structure:

  1. Clarify requirements (5 minutes) - Functionality and volumes.
  2. Back-of-envelope Math (3 minutes) - This is crucial here. Storage runs out fast.
  3. High-level design (10 minutes) - API and Data Flow.
  4. Deep dive (20 minutes) - Focus on the shortening algorithm (Base62 vs Hashing).
  5. Scale (10 minutes) - Caching and cleanup.

Key mindset: Show that you understand the trade-offs between collision handling and write throughput.

Step 1: Clarifying Requirements

Questions to Ask the Interviewer

Q: Can users choose their own alias (custom URL)?

  • Answer: Yes, but optional. Default is random.

Q: Do the links expire?

  • Answer: Standard links last 5 years.

Q: What is the scale?

  • Answer: 100M new URLs per month. 10 Billion reads per month (100:1 read/write ratio).

Functional Requirements

  1. Shorten: Given a long URL, return a unique short URL.
  2. Redirect: Given a short URL, redirect to the original long URL.
  3. Custom Alias: Allow specific custom aliases (tinyurl.com/MyResume).
  4. Analytics: Track click counts (optional but good to mention).

Non-Functional Requirements

  1. Low Latency: Redirection must be extremely fast (< 20ms).
  2. High Availability: If the service is down, all redirects fail. Aim for 99.99%.
  3. Unpredictable: Ideally, the next short URL shouldn’t be guessable (security).

Step 2: Core Assumptions and Constraints

Traffic Estimates

  • Writes: 100 Million / month ≈ 40 writes/sec.
  • Reads: 10 Billion / month ≈ 4,000 reads/sec.
  • Ratio: Read-heavy (100:1).

Storage Estimates

  • Duration: 5 Years.
  • Total Objects: 100M/month × 12 × 5 = 6 Billion URLs.
  • Object Size: 500 bytes (ID, LongURL, UserID, Timestamp).
  • Total Capacity: 6B × 500 bytes ≈ 3TB.
    • Conclusion: We can easily fit the metadata in a NoSQL cluster or sharded SQL.

Bandwidth Estimates

  • Reads: 4,000 req/sec × 500 bytes ≈ 2 MB/sec. (Trivial)
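The estimates above can be reproduced with a few lines of arithmetic. This is a minimal sketch that assumes a 30-day month for rounding; the variable names are illustrative and not part of any real system.

```python
# Back-of-envelope estimates. A 30-day month is an assumption made for rounding.
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # 2,592,000 seconds

writes_per_month = 100_000_000
reads_per_month = 10_000_000_000
record_bytes = 500
retention_years = 5

writes_per_sec = writes_per_month / SECONDS_PER_MONTH
reads_per_sec = reads_per_month / SECONDS_PER_MONTH
total_urls = writes_per_month * 12 * retention_years
total_storage_tb = total_urls * record_bytes / 1e12
# Uses the rounded 4,000 reads/sec figure from the traffic estimate.
read_bandwidth_mb_per_sec = 4_000 * record_bytes / 1e6

print(f"writes/sec: {writes_per_sec:.1f}")
print(f"reads/sec:  {reads_per_sec:.1f}")
print(f"total URLs: {total_urls:,}")
print(f"storage:    {total_storage_tb} TB")
print(f"bandwidth:  {read_bandwidth_mb_per_sec} MB/sec")
```

The exact per-second figures come out slightly below 40 and 4,000, which is why the article rounds them up.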

Step 3: High-Level Architecture

System Flow Diagram

flowchart TD
    Client["Client Browser"] --> LB["Load Balancer"]
    LB --> WebSvc["Shortener Service<br/>Web Server"]

    WebSvc --> Cache[("Cache<br/>Redis")]
    WebSvc --> DB[("Database<br/>NoSQL/SQL")]

    subgraph "Write Path"
    WebSvc -.-> KGS["Key Generation Service<br/>(Optional)"]
    KGS -.-> KGS_DB[("Exhausted Keys DB")]
    end

    classDef service fill:#e1f5fe,stroke:#01579b,stroke-width:2px,color:#000000
    classDef storage fill:#fff3e0,stroke:#e65100,stroke-width:2px,color:#000000
    classDef client fill:#f3e5f5,stroke:#4a148c,stroke-width:2px,color:#000000
    classDef infrastructure fill:#f5f5f5,stroke:#616161,stroke-width:2px,color:#000000

    class Client client
    class WebSvc,KGS service
    class Cache,DB,KGS_DB storage
    class LB infrastructure

API Design (REST)

1. Create Short URL: POST /api/v1/data/shorten

  • Input: { longUrl: "https://...", customAlias: "optional" }
  • Output: { shortUrl: "https://tiny.url/xyz123" }

2. Redirect: GET /api/v1/{shortUrl}

  • Output: HTTP 301/302 Redirect to Long URL.

Step 4: The Hardest Problem - Generating Unique IDs

We need a short string like http://tt.com/u7yX9. How do we generate the u7yX9 part?

Constraints: Length

Using Base62 encoding (A-Z, a-z, 0-9):

  • Length 6: 62⁶ ≈ 56.8 Billion combinations.
  • Length 7: 62⁷ ≈ 3.5 Trillion combinations.
  • Length 6 already covers the 6 Billion URLs from Step 2; Length 7 adds ample headroom and a sparser, harder-to-guess keyspace, so we use Length 7.
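The keyspace sizes can be checked directly, and the same alphabet can be used to turn a numeric ID into a short string. The sketch below is illustrative: the alphabet ordering (digits, then lowercase, then uppercase) and the function names are assumptions, since the article does not fix them.

```python
import string

# 62 symbols. The ordering is an arbitrary choice for this sketch.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def keyspace(length: int) -> int:
    """Number of distinct Base62 strings of exactly `length` characters."""
    return 62 ** length


def base62_encode(n: int) -> str:
    """Encode a non-negative integer as a Base62 string."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n > 0:
        n, rem = divmod(n, 62)
        chars.append(ALPHABET[rem])
    return "".join(reversed(chars))


print(keyspace(6))  # 56800235584
print(keyspace(7))  # 3521614606208
print(base62_encode(62))  # 10
```

Encoding the 6-billionth ID yields a 6-character string, which matches the keyspace numbers above.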

Approach 1: Hashing (MD5/SHA)

Hash the long URL: MD5(long_url).

  • Problem: MD5 produces a 128-bit digest (32 hex characters), which is too long for a short URL.
  • Truncation: Encode the digest and take the first 7 characters.
  • Critical Flaw: Collisions. Different long URLs might produce the same first 7 characters.
  • Resolution: Check DB -> if exists, append salt -> re-hash. (Each collision costs an extra DB round trip, which slows down writes significantly.)
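The hash, truncate, and retry loop can be sketched as follows. This is an illustration under assumptions: a plain dict stands in for the database, the salt is simply the attempt number, and the digest is Base62-encoded before truncation. A real datastore would need an atomic conditional insert instead of the check-then-write shown here.

```python
import hashlib
import string

ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def short_code(long_url: str, salt: str = "", length: int = 7) -> str:
    """Base62-encode MD5(long_url + salt) and keep the first `length` characters."""
    digest = hashlib.md5((long_url + salt).encode("utf-8")).digest()
    n = int.from_bytes(digest, "big")
    chars = []
    while n > 0:
        n, rem = divmod(n, 62)
        chars.append(ALPHABET[rem])
    encoded = "".join(reversed(chars)).rjust(length, ALPHABET[0])
    return encoded[:length]


def shorten(long_url: str, db: dict, max_attempts: int = 5) -> str:
    """Store the mapping, re-hashing with a new salt whenever the code is taken."""
    for attempt in range(max_attempts):
        salt = "" if attempt == 0 else str(attempt)
        code = short_code(long_url, salt)
        existing = db.get(code)  # the extra DB read that slows down writes
        if existing is None:
            db[code] = long_url  # not atomic here; a real DB needs a conditional write
            return code
        if existing == long_url:
            return code  # this URL was already shortened
    raise RuntimeError("too many collisions")
```

Note that hashing makes the same long URL map to the same code, which deduplicates storage but also means every write pays for at least one lookup.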

Approach 2: Key Generation Service (KGS)

We pre-generate unique 7-character strings offline and store them in a “Key DB”.

  • Key DB: Has two tables: UsedKeys and UnusedKeys.
  • Process:
    1. KGS loads 1000 unused keys into memory.
    2. Web Server requests a key.
    3. KGS hands one out and marks it used.
  • Pros: Zero collisions. Extremely fast (no hashing on the fly).
  • Cons: Need to handle concurrency so KGS doesn’t give the same key twice (use SELECT ... FOR UPDATE or strict locking).
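The hand-out process above can be sketched with an in-memory stand-in. Everything here is hypothetical: the class name, the deque and set that play the role of the UnusedKeys and UsedKeys tables, and the use of a process-local lock in place of SELECT ... FOR UPDATE.

```python
import threading
from collections import deque


class KeyGenerationService:
    """In-memory sketch of a KGS; not a real database-backed implementation."""

    def __init__(self, unused_keys, batch_size=1000):
        self._unused = deque(unused_keys)  # stand-in for the UnusedKeys table
        self._used = set()                 # stand-in for the UsedKeys table
        self._buffer = deque()             # keys currently loaded into memory
        self._batch_size = batch_size
        self._lock = threading.Lock()

    def _refill(self):
        # Load up to batch_size unused keys into memory.
        while self._unused and len(self._buffer) < self._batch_size:
            self._buffer.append(self._unused.popleft())

    def next_key(self) -> str:
        # The lock ensures two callers never receive the same key.
        with self._lock:
            if not self._buffer:
                self._refill()
            if not self._buffer:
                raise RuntimeError("key pool exhausted")
            key = self._buffer.popleft()
            self._used.add(key)
            return key
```

A web server would call `next_key()` once per shorten request. If a KGS instance crashes, the keys in its buffer are lost, which is usually acceptable given the size of the keyspace.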

Step 5: Database Design and Storage

Schema:

{
  "hash": "u7yX9", // Primary Key
  "originalUrl": "https://google.com/search?q=...",
  "creationDate": "2026-02-12T...",
  "expirationDate": "2031-02-12T...",
  "userId": "UserID_OPTIONAL"
}

Which DB Type?

  • NoSQL (DynamoDB / Cassandra):
    • ✅ Highly scalable for billions of rows.
    • ✅ Fast Key-Value lookups.
    • ✅ Easier to shard.
  • SQL (PostgreSQL):
    • ✅ Transactional support (good if tracking complex user analytics).
    • ❌ Harder to scale horizontally to 100TB+.

Verdict: NoSQL (for example DynamoDB) is a common choice for this read-heavy, simple key-value structure.

Step 6: Key Technical Decision - 301 vs 302 Redirects

301 (Permanent Redirect)

  • The browser caches the mapping. The next time the user opens tiny.url/xyz, the browser goes directly to the long URL without hitting your server.
  • Pro: Reduced server load.
  • Con: You lose analytics. You don’t know if the user clicked it again.

302 (Temporary Redirect)

  • Browser always hits your server first for the new location.
  • Pro: Accurate analytics (click tracking).
  • Con: Higher server load.

Decision: Use 302 if Analytics is a business requirement (usually yes). Use 301 if reducing server cost is the priority.
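The decision can be expressed as a small helper that picks the status code. This is a sketch only: `build_redirect` and its `analytics_enabled` flag are hypothetical names, and the tuple it returns is a simplified stand-in for a real HTTP response.

```python
from http import HTTPStatus


def build_redirect(long_url: str, analytics_enabled: bool):
    """Return (status, headers) for a redirect to long_url."""
    if analytics_enabled:
        # 302: each click reaches the server, so it can be counted.
        status = HTTPStatus.FOUND
    else:
        # 301: browsers may cache the mapping and skip the server next time.
        status = HTTPStatus.MOVED_PERMANENTLY
    return status, {"Location": long_url}


status, headers = build_redirect("https://example.com/page", analytics_enabled=True)
print(int(status), headers["Location"])  # 302 https://example.com/page
```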

Step 7: Scaling the System

1. Caching (Critical)

Since the 80/20 rule typically applies (roughly 20% of URLs generate 80% of traffic), we should cache popular mappings.

  • Technology: Redis / Memcached.
  • Eviction: LRU (Least Recently Used).
  • Flow: GET request -> Check Redis -> If Miss, Check DB -> Write to Redis -> Return.
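The read flow above is the cache-aside pattern with LRU eviction. The sketch below uses an in-process `OrderedDict` as a stand-in for Redis and a plain dict as a stand-in for the database; both are assumptions made to keep the example self-contained.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny LRU cache; a stand-in for Redis with an LRU eviction policy."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used entry


def resolve(code: str, cache: LRUCache, db: dict):
    """Check the cache, fall back to the DB on a miss, then populate the cache."""
    long_url = cache.get(code)
    if long_url is not None:
        return long_url
    long_url = db.get(code)
    if long_url is not None:
        cache.put(code, long_url)
    return long_url
```

Unknown codes are not cached in this sketch, so repeated lookups for missing codes still reach the database.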

2. Database Sharding

How to store 3TB of data?

  • Hash-Based Sharding: hash(short_url) % n.
  • Determine which shard holds the data based on the short URL ID.
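A minimal sketch of the shard lookup is shown below. It assumes a SHA-256 based hash instead of Python's built-in `hash()`, because the built-in is randomized per process for strings and would route the same key to different shards on different servers. The function name is illustrative.

```python
import hashlib


def shard_for(short_code: str, num_shards: int) -> int:
    """Map a short code to a shard index using a hash that is stable across processes."""
    digest = hashlib.sha256(short_code.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


print(shard_for("u7yX9", 16))
```

With plain modulo sharding, changing the number of shards remaps most keys. Consistent hashing is a common alternative when the shard count is expected to change.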

Step 8: Security and Permissions

  1. Prediction Attack: If using auto-increment IDs (tiny.url/1, tiny.url/2), competitors can scrape all your data by iterating numbers.
    • Fix: Use KGS with random distribution, or simple Base62 + large random offset.
  2. Malicious Links: Users shortening phishing links.
    • Fix: Integrate with Google Safe Browsing API to scan long URLs before shortening.

Step 9: Handling Edge Cases

  1. KGS Failure: If the Key Gen Service dies, writes stop.
    • Solution: Run redundant KGS instances with standby keys in memory.
  2. Redirect Loops: User shortens tiny.url/abc which points to tiny.url/abc.
    • Fix: Detect domains that match your own service and reject them.
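The self-reference check can be sketched with the standard library URL parser. The host set below is a hypothetical configuration value based on the tiny.url example domain used in this article.

```python
from urllib.parse import urlsplit

# Hypothetical set of hostnames served by the shortener itself.
OWN_HOSTS = {"tiny.url", "www.tiny.url"}


def points_to_self(long_url: str) -> bool:
    """Return True if the destination URL is on the shortener's own domain."""
    if "://" not in long_url:
        # Treat scheme-less input such as "tiny.url/abc" as a network location.
        long_url = "//" + long_url
    host = urlsplit(long_url).hostname  # hostname is lowercased by urlsplit
    return host is not None and host in OWN_HOSTS


print(points_to_self("https://tiny.url/abc"))         # True
print(points_to_self("https://example.com/tiny.url"))  # False
```

This only catches direct self-references. A chain that passes through another shortener and back would need the destination to be followed, which this sketch does not attempt.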

Step 10: Performance Optimizations

  1. Geo-Distribution: Run read replicas of the DB (or Edge Caching via Cloudflare) in multiple regions to reduce latency for global users.
  2. Cleanup: A “Lazy Cleanup” process. Don’t constantly scan for expired links. Only check expiration when a user clicks a link. If expired, delete it and return error. Run a background sweeper monthly for the rest.
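The lazy cleanup idea can be sketched as a read-time check plus an occasional sweep. The record fields reuse the names from the schema in Step 5; the dict-based store and the function names are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone


def resolve_with_expiry(code: str, db: dict, now=None):
    """Return the long URL, or None if the code is unknown or expired.

    Expired rows are deleted when they are accessed (lazy cleanup).
    """
    now = now or datetime.now(timezone.utc)
    record = db.get(code)
    if record is None:
        return None
    if record["expirationDate"] <= now:
        del db[code]
        return None
    return record["originalUrl"]


def sweep_expired(db: dict, now=None) -> int:
    """Background sweeper: remove expired rows that nobody clicked. Returns the count."""
    now = now or datetime.now(timezone.utc)
    expired = [code for code, rec in db.items() if rec["expirationDate"] <= now]
    for code in expired:
        del db[code]
    return len(expired)
```

In a real system the sweep would run as a scheduled job against the datastore rather than iterating over an in-memory dict.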

Real-World Implementations

TinyURL

  • Originally used simple base62 conversion of database IDs.
  • Suffered from predictable URLs (enumerability).

Bitly

  • Started as a utility, pivoted to Enterprise Analytics.
  • Uses heavy caching and 302 redirects to track every interaction for marketing insights.

Common Interview Follow-Up Questions

Q: How would you support custom aliases (for example, tiny.url/sale-2026)?

Answer: “I’d treat custom alias creation as a transactional write:

  1. Validate alias format and reserved keywords.
  2. Check uniqueness with conditional write (put-if-absent) in the alias table.
  3. Enforce namespace rules for enterprise accounts.
  4. Add abuse checks to block impersonation and trademark misuse.

Trade-off: Alias flexibility improves product value, but requires stricter moderation and conflict handling.”
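The validation and put-if-absent steps from this answer can be sketched as follows. The reserved words, the format rule, and the `AliasTable` class are all hypothetical; a lock-guarded dict stands in for a datastore's conditional write.

```python
import re
import threading

RESERVED = {"api", "admin", "login"}                 # hypothetical reserved keywords
ALIAS_PATTERN = re.compile(r"[A-Za-z0-9_-]{3,30}")  # hypothetical format rule


class AliasTable:
    """In-memory stand-in for an alias table with conditional writes."""

    def __init__(self):
        self._rows = {}
        self._lock = threading.Lock()

    def put_if_absent(self, alias: str, long_url: str) -> bool:
        # Check and write happen under one lock, so two requests cannot both win.
        with self._lock:
            if alias in self._rows:
                return False
            self._rows[alias] = long_url
            return True


def create_alias(table: AliasTable, alias: str, long_url: str) -> str:
    """Return 'created', 'taken', 'reserved', or 'invalid_format'."""
    if not ALIAS_PATTERN.fullmatch(alias):
        return "invalid_format"
    if alias.lower() in RESERVED:
        return "reserved"
    if not table.put_if_absent(alias, long_url):
        return "taken"
    return "created"
```

Namespace rules and abuse checks from the answer are left out here, since they depend on product policy rather than on the storage pattern.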

Q: How would you handle link expiration?

Answer: “Expiration should be evaluated at read time with lightweight metadata:

  1. Store expiresAt with each mapping.
  2. Keep it in cache payload so redirect path can reject expired links quickly.
  3. Return a dedicated expired-link page and analytics event.
  4. Run async cleanup jobs to remove old rows from storage.

This keeps the hot redirect path fast while still honoring expiration contracts.”

Q: How would you detect and block malicious destination URLs?

Answer: “I would combine pre-check and post-check controls:

  1. Scan long URLs with threat-intel APIs before creation.
  2. Re-scan high-traffic links periodically since reputation can change.
  3. Maintain internal deny lists and domain risk scores.
  4. Add click-time interstitial warnings for suspicious but not fully blocked links.

Trade-off: Strict blocking reduces abuse, but false positives require fast appeal workflows.”

Q: How do you provide detailed analytics without increasing redirect latency?

Answer: “Separate redirect serving from analytics ingestion:

  1. Redirect service returns immediately after lookup.
  2. Emit click events asynchronously to Kafka/PubSub.
  3. Aggregate metrics in batch/stream jobs for dashboards.
  4. Use sampled logs plus exact counters for high-value enterprise links.

This protects p95 latency while still delivering rich reporting.”
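The separation between redirect serving and analytics ingestion can be sketched with a queue and a worker thread. The in-process `queue.Queue` is only a stand-in for Kafka/PubSub, and the function names and the `None` shutdown signal are assumptions for this example.

```python
import queue
import threading
from collections import Counter

click_events = queue.Queue()  # stand-in for a Kafka/PubSub topic
click_counts = Counter()      # stand-in for the aggregated metrics store


def redirect(code: str, db: dict):
    """Look up the long URL and enqueue a click event without waiting for analytics."""
    long_url = db.get(code)
    if long_url is not None:
        click_events.put(code)  # returns immediately; counting happens elsewhere
    return long_url


def analytics_worker():
    """Consume click events and aggregate counts until a None sentinel arrives."""
    while True:
        code = click_events.get()
        try:
            if code is None:
                break
            click_counts[code] += 1
        finally:
            click_events.task_done()
```

The redirect path only pays for an enqueue. If the worker falls behind, dashboards lag but redirects stay fast.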

Q: What if your key generation strategy must change later?

Answer: “Design for migration from day one:

  1. Version short codes (for example v1, v2) in metadata.
  2. Keep resolver backward-compatible across versions.
  3. Roll out new generator for new links only.
  4. Migrate old records lazily when accessed, if needed.

This avoids large risky rewrites and allows incremental evolution.”

Conclusion

Designing a URL Shortener tests your ability to handle:

  1. Atomicity: Generating unique keys without duplicates.
  2. Scale: A 100:1 read/write ratio that requires heavy caching.
  3. Simplicity: Choosing NoSQL over SQL for simple KV data.
