Purpose: to solve the RIGHT problem!
Interview Question: Design URL shortener. (https://tinyurl.com/app)
Q: Can you tell me more about the specific features we need to support?
A: Given a URL, our service should generate a shorter and unique alias.
Q: How short the shortened link should be?
A: It is up to you.
Q: Does the shortened link have an expiration time?
A: Users can specify it.
Q: Can user pick their shortened URL?
A: Nope.
Q: Does URL need to be random? A: Yes, it should not be guessable Q: What are the non-functional requirements regarding availability, latency, and throughput? A: It should be real-time (low latency) and highly available. You can assume 500M new URLs per month. Q: What is the read: write ratio? A: read:write = 100:1
Purpose: estimate the scale of our system
We don’t need to get the correct answer when estimating the system capacity, just a rough estimation.
Example: URL shortener
500M new URL / month => 200 url/s
Read/Write = 100:1
POST api/v1/shortenrequest param:{ longURL: string}response: {shortURL: string}GET api/v1/{shortURL}status code: 301 (redirect)response: longURL
if cache hit, return longURL
else query from DB
if the URL is expired, return error.
else write back to cache and return.
How to choose Hash Function?
> shortURL contains [a-z, A-Z, 0-9] ~ 62 characters.> We need to store 60 billions URL for 10 years.> => shortURL length should be 7 characters. Because 62^6 < 60B < 62^7> We have few options here: (again, explain the tradeoff to interviewer)> - Pick the first 7 characters of the hash => maybe leads to collision.> - Pick the first 7 characters and query the database to check collision. => high latency.> - Randomly pick 7 character from the hash
Other approaches:
> Use encoding instead of hashing> Generate an unique id integer for each longURL> Convert to base62> - Pros: no collision because the id is unique> - Cons: need to build a distributed ID generator, which is hard.
Make it scalable and reliable.
Quick Links
Legal Stuff
Social Media