Daily, we come across a number of short URLs while surfing web. A short URL redirects to another URL (usually a long URL). For example, the URL http://manjeetdahiya.com/2010/10/21/introduction-to-app-inventor-for-android/ can be shortened to http://bit.ly/fmUP2Q using bit.ly. The short URL will just redirect to the long one.
There are many URL Shortening services available, e.g., bit.ly, tinyurl and the latest one goo.gl by Google. Have you ever wondered how the URL shortening works? This article will take you through the concepts behind URL shortening. I will touch upon only the high level details and will not discuss any of the implementation stuff.
Understanding Short URL
If you look at a short URL, it consists of a domain and a key:
Domain = bit.ly
Key = fmUP2Q
The key is different for each long URL. Based on the key the server redirects to the respective long URL.
How does it really works?
Let us take example.com as the domain of the service provider for explaining purpose. Here example.com is equivalent of tinyurl.com, bit.ly, goo.gl etc.
The service provider, example.com, maintains a map of keys and URLs in the database. Where key is a positive integer and URL is just a string. The map looks like:
- [0, url0]
- [1, url1]
- [2, url2]
Where url0, url1, url2… are the long URLs. The integer is unique and always incremented when a new short URL is created. It can be made sure before creating a new short URL, if a long URL already exists in the database then don’t add it again and use the same short key.
On opening the short URL=http://example.com/key in the browser, the request goes to the server example.com with the key. The server checks the key in the map and redirects to respective long URL. For example:
- example.com/0 -> url0
- example.com/1 -> url1
- example.com/2 -> url2
This is one of the techniques which can be used to implement a simple URL Shortening service.
I don’t see keys as alpha-numeric
Usually, we don’t see integers as the keys. All the services use alpha-numeric string as the key. The reason behind this is that on can accommodate more keys without increasing the length of key.
For a key length of 6: base10 can have 10^6 keys and base 62 can have 62^6 keys.
Using base62 in the above mentioned technique is not difficult to implement. Keep the same database. Store integers itself in the database. The server will get requests as key in base62 and you will have to convert base62 key to base10 before finding the corresponding long URL in the database and then redirect to the long URL.