About short URLs and their implementation

  Peter        2012-07-02 07:15:09

Introduction

URL shortening is a technique for converting a long URL into a short one. Many companies now provide this kind of service; here we take Google's URL shortener, http://goo.gl/, as an example.

First we navigate to http://goo.gl/ and enter a URL into the text field. Here we use http://www.url-to-be-shortened.com as the input, and the service returns a shortened URL: http://goo.gl/ZSVMM.

URL Parsing

When we type http://goo.gl/ZSVMM in the browser address bar, the browser first asks a DNS server to resolve the host name goo.gl to an IP address (for example, 74.125.225.72). The browser then sends an HTTP GET request to the server at that IP address, and the server looks up ZSVMM after receiving the request. Once it finds the long URL that ZSVMM maps to, http://www.url-to-be-shortened.com, it answers with an HTTP 301 response that redirects the browser to the long URL. The browser then requests the long URL.
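The lookup-and-redirect step on the server can be sketched roughly as follows. This is only an illustration and not how goo.gl is actually implemented: the class name RedirectSketch, the in-memory Map standing in for the server's storage, and the bare-bones response text are all assumptions made for the example.

```java
import java.util.Map;

// Hypothetical sketch of the server side of a short-URL lookup.
final class RedirectSketch {

    // Builds the raw HTTP response for a request path such as "/ZSVMM".
    // 'table' is an assumed in-memory stand-in for the real storage.
    static String respond(String path, Map<String, String> table) {
        // Drop the leading '/' to get the short code.
        String code = path.startsWith("/") ? path.substring(1) : path;
        String longUrl = table.get(code);
        if (longUrl == null) {
            // Unknown short code: nothing to redirect to.
            return "HTTP/1.1 404 Not Found\r\n\r\n";
        }
        // 301 tells the browser to request the URL given in Location.
        return "HTTP/1.1 301 Moved Permanently\r\n"
                + "Location: " + longUrl + "\r\n\r\n";
    }
}
```

The browser reads the Location header of the 301 response and issues a second request, this time to the long URL.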

What's inside a short URL?

To get a short URL, we need a mapping function f: X -> Y, and this mapping function must have two features:
  1. If x1 != x2, then f(x1) != f(x2)
  2. For every y, there is one and only one x which makes f(x)=y

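Together, the two features say that f is a one-to-one correspondence, so every short URL leads back to exactly one long URL. Any linear function with a non-zero slope, such as f(x)=2x over the real numbers, fulfills these two requirements.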

How is the URL shortening technique implemented?

Usually a short URL is 6 characters long, and each character is one of [a-zA-Z0-9], 62 characters in total. With a length of 6 there are 62^6 ~= 56.8 billion combinations, which is enough for most cases. In Google's URL shortener service the short URL is 5 characters long, which gives 62^5, or around 916 million, combinations.
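The counts above are easy to check with a few lines of code. The helper below is a small sketch; the class name CodeSpace and the method name combinations are made up for this example.

```java
// Hypothetical helper: how many distinct codes fit in a given length.
final class CodeSpace {

    static final int ALPHABET_SIZE = 62; // a-z, A-Z, 0-9

    // Returns 62^length using long arithmetic, because 62^6 does not
    // fit in an int. multiplyExact throws if the long overflows too.
    static long combinations(int length) {
        long total = 1;
        for (int i = 0; i < length; i++) {
            total = Math.multiplyExact(total, ALPHABET_SIZE);
        }
        return total;
    }
}
```

CodeSpace.combinations(5) gives 916,132,832 and CodeSpace.combinations(6) gives 56,800,235,584.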

Assume we use a database to store the mapping between long and short URLs. We can then have a table LongToShortURL with three fields:

  • ID : int auto_increment
  • LURL : varchar //Long URL
  • SURL : varchar //Short URL

What remains is to derive a unique short URL for each long URL.

We can have a map of the 62 characters mentioned above:

0 -> a
1 -> b
...
25 -> z
26 -> A
...
51 -> Z
52 -> 0
...
61 -> 9
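One simple way to hold this map in code is a 62-character string in which the position of a character is its number. The sketch below assumes that layout, with A-Z filling positions 26 to 51; the class and method names are chosen only for this example.

```java
// Hypothetical helper that stores the character map as a lookup string.
final class Base62Alphabet {

    // Position in the string is the numeric value: 'a' = 0 ... '9' = 61.
    static final String CHARS =
            "abcdefghijklmnopqrstuvwxyz"
          + "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
          + "0123456789";

    // Number (0..61) to character.
    static char toChar(int value) {
        return CHARS.charAt(value);
    }

    // Character to number; returns -1 for a character outside the map.
    static int toValue(char c) {
        return CHARS.indexOf(c);
    }
}
```

A string lookup keeps both directions of the map in one place, so the encoding and decoding steps cannot drift apart.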

So for every long URL, we can use its ID to get a six-character string, which is the short URL. A detailed implementation is:

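import java.util.ArrayList;

// Returns the base-62 digits of id, most significant digit first
public static ArrayList<Integer> base62(int id) {

    ArrayList<Integer> value = new ArrayList<Integer>();
    while (id > 0) {
        int remainder = id % 62;
        // Insert at the front so the most significant digit ends up first
        value.add(0, remainder);
        id = id / 62;
    }

    return value;
}

For example, passing ID=138 to base62() gives value=[2,14], since 138 = 2*62 + 14. Padding with leading zeros to six digits gives [0,0,0,0,2,14], and according to the map above the short URL is aaaaco.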
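The digit list and the character map can also be combined into a single step that produces the six-character string directly. The following is a minimal sketch under the assumptions of this article (62-character map starting at a, fixed length 6); the class name ShortCodeEncoder is made up for the example.

```java
// Hypothetical sketch: ID -> fixed-length short code.
final class ShortCodeEncoder {

    static final String CHARS =
            "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
    static final int LENGTH = 6;

    static String encode(long id) {
        if (id < 0) {
            throw new IllegalArgumentException("id must be non-negative");
        }
        char[] code = new char[LENGTH];
        // Fill from the right: the last character is the least
        // significant digit. Unused leading positions become 'a' (zero).
        for (int i = LENGTH - 1; i >= 0; i--) {
            code[i] = CHARS.charAt((int) (id % 62));
            id /= 62;
        }
        if (id != 0) {
            throw new IllegalArgumentException("id needs more than 6 characters");
        }
        return new String(code);
    }
}
```

With this sketch, ShortCodeEncoder.encode(0) returns aaaaaa and ShortCodeEncoder.encode(62) returns aaaaba.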

To get back the long URL, we first convert the digits of the short URL back to the ID with the following function:

public static int base10(ArrayList<Integer> base62) {
    // Pad with leading zeros until there are 6 digits
    while (base62.size() < 6) {
        base62.add(0, 0);
    }

    // Accumulate the digits, most significant first, with integer
    // arithmetic only. Note that int holds IDs up to 2^31 - 1, which is
    // less than 62^6; use long if the full six-character range is needed.
    int id = 0;
    for (int value : base62) {
        id = id * 62 + value;
    }

    return id;
}

For example, for the short URL aaae9a, value=[0,0,0,4,61,0], so the ID of the long URL is 0*62^5+0*62^4+0*62^3+4*62^2+61*62^1+0*62^0 = 19158. After getting the ID, the long URL can be retrieved by searching the LongToShortURL table.
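The two steps of the reverse direction, decoding the string and looking up the row, can be sketched together as below. The Map keyed by ID is only an assumed in-memory stand-in for the LongToShortURL table, and the names ShortCodeDecoder, decode and expand are hypothetical.

```java
import java.util.Map;

// Hypothetical sketch: short code -> ID -> long URL.
final class ShortCodeDecoder {

    static final String CHARS =
            "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";

    // Reads the code left to right, most significant character first.
    static long decode(String code) {
        long id = 0;
        for (int i = 0; i < code.length(); i++) {
            int value = CHARS.indexOf(code.charAt(i));
            if (value < 0) {
                throw new IllegalArgumentException(
                        "character not in map: " + code.charAt(i));
            }
            id = id * 62 + value;
        }
        return id;
    }

    // 'rows' is an assumed in-memory copy of the table, keyed by ID.
    // Returns null when no row has the decoded ID.
    static String expand(String code, Map<Long, String> rows) {
        return rows.get(decode(code));
    }
}
```

Calling decode on aaae9a evaluates the same positional sum as in the example above, and expand then plays the role of the query against the table.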

Short URLs are useful in many places, such as Twitter: since a tweet allows only 140 characters, a very long URL has to be shortened before it can be included.

Reference :  http://blog.csdn.net/beiyeqingteng/article/details/7706010

       
