Using C for a specialized data store

  pixenomics.tumblr.com        2012-03-07 05:09:38

Pixenomics stores 1.2 million pixels and transports them from the server to the client. During development we tried various methods of storing and processing them. Our ultimate goal was to send the entire board in under 1 second.

During prototyping we used a MySQL database without thinking too much about performance. With a mere 2,000 pixels we quickly realised this wasn’t even usable as a demo. Switching to the MEMORY storage engine was much better but still obviously unusable.

The problem wasn’t storing the pixels but retrieving all 1.2 million of them quickly and processing them. The game runs an algorithm every 3 hours to determine who wins or loses pixels based on the colors of the surrounding pixels, which means roughly 9.6 million (1.2 million * 8) iterations.
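To make the 1.2 million * 8 figure concrete, here is a minimal sketch of such a pass in C. The actual win/lose rule is not described here, so the code only counts matching neighbors. The function names, the flat row-major layout, and the one-byte-per-pixel color are all assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: counts how many of the (up to) eight neighbors of
 * (col, row) have the same color as the pixel itself. The board is assumed
 * to be a flat, row-major array with one color byte per pixel. */
int same_color_neighbors(const uint8_t *board, int cols, int rows,
                         int col, int row)
{
    uint8_t own = board[(size_t)row * cols + col];
    int count = 0;

    for (int dy = -1; dy <= 1; ++dy) {
        for (int dx = -1; dx <= 1; ++dx) {
            int x = col + dx;
            int y = row + dy;

            if (dx == 0 && dy == 0)
                continue;   /* skip the pixel itself */
            if (x < 0 || x >= cols || y < 0 || y >= rows)
                continue;   /* neighbor would be off the board */
            if (board[(size_t)y * cols + x] == own)
                ++count;
        }
    }
    return count;
}

/* Visits every pixel once and sums the matching-neighbor counts. A real
 * rule would decide wins and losses here instead of just adding up. */
long scan_board(const uint8_t *board, int cols, int rows)
{
    long matches = 0;

    for (int row = 0; row < rows; ++row)
        for (int col = 0; col < cols; ++col)
            matches += same_color_neighbors(board, cols, rows, col, row);
    return matches;
}
```

Pixels on the edge of the board have fewer than eight neighbors, so the real number of checks is slightly under 9.6 million.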

We were reluctant to use a NoSQL solution as this would require retrieving the pixels through a socket, storing them in memory and then processing them. It makes more sense to process the data where it’s stored.

This led to the idea of using Node.js. Node is the hottest new tech and it was very simple to get a working prototype going. Initially our demo took around 7 seconds to grab an empty board (around 1.2 MB), and an empty board is the smallest payload possible. We were storing the pixels in an object where the key was a string of the pixel coordinates. It looked something like this:

for (var row = 0; row < ROWS; ++row) {
    for (var col = 0; col < COLS; ++col) {
        // every lookup first builds a "col,row" string key
        output(board[col + "," + row]);
    }
}

Turning the board into a multidimensional array shaved off 3 seconds:

for (var row = 0; row < ROWS; ++row) {
    for (var col = 0; col < COLS; ++col) {
        // plain integer indexing, no string key to build
        output(board[col][row]);
    }
}

So in the end we could retrieve the entire board in about 4 seconds. This was still much too slow. We decided the only realistic option left was to go deeper. We had to use C.

A daemon stores the pixels in a multidimensional array, where they can be processed and modified, and it interfaces with PHP through sockets just like any other data store. For persistent storage we write the board to a file and scp it to another server every hour or so. This presents another problem: writing becomes slow because the entire board has to be backed up each time. To solve this we write the file to a tmpfs (a file system held in memory).
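The snapshot step could be sketched roughly as follows. This is not the daemon's actual code: save_board and load_board are hypothetical names, and writing to a temporary file before renaming it is an added precaution that assumes POSIX rename semantics. The path would point into a tmpfs mount such as /dev/shm on many Linux systems, which is also an assumption.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: writes len bytes of board data to path. The data
 * goes to "<path>.tmp" first and is then renamed, so a reader (or the scp
 * job) never sees a half-written file. Returns 0 on success, -1 on error. */
int save_board(const char *path, const uint8_t *board, size_t len)
{
    char tmp[4096];
    int n = snprintf(tmp, sizeof tmp, "%s.tmp", path);

    if (n < 0 || (size_t)n >= sizeof tmp)
        return -1;              /* path too long for the buffer */

    FILE *f = fopen(tmp, "wb");
    if (f == NULL)
        return -1;

    size_t written = fwrite(board, 1, len, f);
    if (fclose(f) != 0 || written != len) {
        remove(tmp);            /* discard the incomplete snapshot */
        return -1;
    }
    if (rename(tmp, path) != 0) {
        remove(tmp);
        return -1;
    }
    return 0;
}

/* Hypothetical counterpart: reads exactly len bytes back into board,
 * for example when the daemon restarts. Returns 0 on success. */
int load_board(const char *path, uint8_t *board, size_t len)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    size_t got = fread(board, 1, len, f);
    fclose(f);
    return got == len ? 0 : -1;
}
```

Because a tmpfs lives in RAM, its contents are lost on reboot, which is why the copy to another server is still needed.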

The result of grabbing 1.2 million pixels using C?

…

0.03 seconds.

This was a result we didn’t even think would be achievable, and it shows that C is still very relevant in web applications today and a good choice for processing and retrieving large amounts of data.
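One plausible reason for the speed is that a C array is a single contiguous block of memory, so the whole board can be handed to the socket in a few calls with no per-pixel work. The sketch below illustrates that idea and is not the daemon's real code: the 1200 x 1000 dimensions, the one-byte-per-pixel layout, and the names send_all and send_board are assumptions.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Assumed dimensions: 1200 * 1000 = 1.2 million pixels. */
enum { COLS = 1200, ROWS = 1000 };

/* Assumed layout: one byte per pixel in a single contiguous block. */
uint8_t board[ROWS][COLS];

/* Writes all len bytes to fd, retrying on partial writes and EINTR.
 * Returns 0 on success, -1 on error. */
int send_all(int fd, const void *buf, size_t len)
{
    const uint8_t *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal, try again */
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Sends the whole board as one raw block to a connected socket. */
int send_board(int fd)
{
    return send_all(fd, board, sizeof board);
}
```

A caller would pass the descriptor of an accepted client connection to send_board; how the client decodes the raw bytes is left out here.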

Note: This article is a very simplistic view of what you should use to store data. There are many other concerns that may be more important such as concurrency, security, backups and portability. We were looking for the fastest way to get a lot of data processed and to the client and this was it.

Source: http://pixenomics.tumblr.com/post/18892378997/using-c-for-a-specialized-data-store
