Building cheap cloud storage – the Backblaze way!

29/04/2010

I recently read this article by cloud backup provider Backblaze on how to build a cheap cloud storage device – 67 TB of storage for under 8000 USD, to be exact.

Backblaze provides unlimited backups to individuals for a mere 5 USD/month, and it is really interesting to read about how they are coping with the demand.

Here is a brief summary of the most interesting parts of their article.

Use of cheap RAID controllers
It’s interesting that they don’t use expensive hardware RAID controllers. Their “Syba SD-SA2PEX-2IR” controllers cost only 35 dollars apiece and leave the CPU to handle the RAID functions. This is ingenious for cloud storage, because a good CPU costs far less than four or more full-featured hardware RAID controllers. (They use an Intel Core 2 in their rig, but since the post is a few months old, a Core i5/i7 would probably be a better choice today.)

RAID structure
With 45 drives in each server, Backblaze divides the drives into three RAID6 volumes of 15 drives each. Every volume can survive two disk failures, so a server can survive a total of six drive failures in the best-case scenario (two failing drives per volume). The interesting part here is the “threshold” for the maximum number of disks in a RAID6 volume, as estimated by Backblaze. With every disk added to a volume, the likelihood of more drives failing in quick succession than the volume can tolerate increases.
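To make the trade-off concrete, here is a rough sketch of the arithmetic in Python. The drive counts and the 67 TB figure come from the article; the per-drive failure probability `p` and the assumption that drives fail independently are purely illustrative and not taken from Backblaze. File system overhead is ignored.

```python
from math import comb

DRIVES_PER_SERVER = 45   # from the article
VOLUMES = 3              # three RAID6 volumes per server
PARITY_PER_VOLUME = 2    # RAID6 tolerates two failed drives per volume
RAW_TB = 67              # raw capacity quoted in the article


def usable_fraction(drives_per_volume, parity=PARITY_PER_VOLUME):
    """Share of a volume's raw capacity that holds data rather than parity."""
    return (drives_per_volume - parity) / drives_per_volume


def prob_volume_loss(drives, p, parity=PARITY_PER_VOLUME):
    """Probability that more than `parity` of `drives` fail in one window.

    Assumes independent failures with probability `p` per drive
    (a hypothetical, illustrative number, not a measured one).
    """
    survivable = sum(
        comb(drives, k) * p**k * (1 - p) ** (drives - k)
        for k in range(parity + 1)
    )
    return 1 - survivable


drives_per_volume = DRIVES_PER_SERVER // VOLUMES           # 15
max_tolerated_failures = VOLUMES * PARITY_PER_VOLUME       # 6, best case
usable_tb = RAW_TB * usable_fraction(drives_per_volume)    # 67 * 13/15

print(f"{drives_per_volume} drives per volume, about {usable_tb:.1f} TB usable")
for n in (15, 45):
    print(f"{n} drives in one volume: loss probability {prob_volume_loss(n, 0.01):.6f}")
```

One single 45-drive volume would waste less space on parity, but under these assumptions it is noticeably more likely to lose more than two drives at once than a 15-drive volume is.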

Tomcat backbone for communication
I mostly put this in to further dispel the idea that Java is “slow”. This is a great choice of platform, which must have lowered development costs considerably compared with an alternative implementation, such as modifying Apache or writing a custom daemon to handle the communication.

Encrypted communication
I am still a bit puzzled by this. I guess the encryption is supposed to protect against snooping on their internal network, but the data is already encrypted on the end users’ personal computers before upload, so this measure seems a bit unnecessary, especially given the added CPU usage. I’d be glad to hear some other ideas on this, so if you know, leave a comment!

JFS File system
JFS is a stable file system with low CPU utilization and great performance when looking for files. Read this in-depth file system benchmark for more information.

And that’s all. If you have any questions or other ideas, don’t hesitate to leave a comment!

“I’m a twat”, or “The case of the MD5 crypt”

12/10/2009

Proud member of the “fail whale” fan club!

No wait, I mean I’m on Twitter! That’s right – yours truly is now posting snarky 140-character remarks every now and then on my very own Twitter channel. (But not exclusively, there are some ironic/stupid ones too!) With the excellent Twitter gadget I can even read and write posts from inside iGoogle. Eau de humanity, such brilliant technology.

In other news…
I was reading up on encryption algorithms for a project I’m currently doing and cringed when I noticed how many tutorial sites called MD5 an encryption algorithm. MD5 is used for hashing, which is mainly a way of verifying the integrity of a file or other input data. All hashes are fixed-size, meaning they don’t grow no matter how much data you give MD5, be it the text “Hello” or an entire music track.
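A quick way to see the fixed-size property for yourself is Python’s standard `hashlib` module. This is only a small sketch; the 5 MB of filler bytes is a made-up stand-in for a music track.

```python
import hashlib

# Hash a five-byte message and a roughly 5 MB blob of filler data.
short_digest = hashlib.md5(b"Hello").hexdigest()
long_digest = hashlib.md5(b"x" * 5_000_000).hexdigest()

# Both digests are 32 hex characters (128 bits), whatever the input size.
print(len(short_digest), len(long_digest))  # prints: 32 32
```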

Encryption, on the other hand, is used to conceal data from someone (a third party). In its simplest form, known as symmetric-key encryption, it uses a single key to encrypt the data (transform it into so-called ciphertext) and to decrypt it (return it to its original form, also known as plaintext). The output of an encryption algorithm grows with the amount of data you feed it and is approximately 1:1 in size to the input. (It can be bigger due to overhead, or smaller if the tool compresses the data before encrypting it.)

There is no simple way of turning a hash back into the original message or file for which it was calculated, but to turn encrypted data (the ciphertext) back into the original (the plaintext), all you need is the key.
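The difference can be illustrated with a toy symmetric cipher. The sketch below uses a repeating-key XOR, which is trivially breakable and chosen only because it fits in a few lines; the function name `xor_cipher` and the key are invented for this example, and real code should rely on a vetted encryption library instead.

```python
from itertools import cycle


def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Toy symmetric cipher: XOR each byte with a repeating key.

    Applying it twice with the same key returns the original data.
    Illustration only, NOT secure.
    """
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


key = b"secret"
plaintext = b"Hello, this is the plaintext."

ciphertext = xor_cipher(plaintext, key)   # encrypt
recovered = xor_cipher(ciphertext, key)   # decrypt with the same key

print(len(plaintext), len(ciphertext))    # same length: output grows with input
print(recovered == plaintext)             # the key is all it takes to get back
```

Unlike a hash, the ciphertext here is exactly as long as the input, and anyone holding the key can reverse it.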

Some hash algorithms are MD5 and SHA-1.
Some encryption algorithms are ROT13, DES and RSA. (Although not all of these are symmetric-key.)

I sense this “In other news” is turning into the actual post, so I’m going to stop here and promise myself and any interested readers a full-sized follow-up article, although the web sources below are pretty

Until then, read more on hashing and encryption!