New era of storage

Published: 9 Jan 2020 17:06, author: Damjan Ziberna
With the amount of data being produced every minute these days, many companies are becoming aware that the classic way of storing data no longer meets their needs.
Today you cannot afford to take a performance hit just because a disk failed. Nor can you afford downtime if your RAID array fails completely.
So what do you do? One way forward is to be stubborn and buy a second enterprise storage rack for your secondary data center, along with the license and the dark fibre connection needed to mirror it with your primary one. That is, if you have the money for all that.
Another way is to put your data into the cloud and have it mirrored to at least two geographically separate sites. This costs a bit extra, but you want to make sure your data is available at all times, do you not? By doing so, you place the burden of making storage space scalable on your cloud service provider.
But wait. What if you do not want to place your data into the cloud just yet, or you do not need that data to be globally accessible? You just want to scale up your own data center and perhaps keep it in sync with your secondary site for backup and/or disaster recovery purposes. How would you go about that without spending a fortune?
The answer is in the title of this article. It is nothing new for most cloud providers, as they must be using something affordable and free of the threat of vendor lock-in: a solution that works on off-the-shelf servers, something easily attainable. Something like software-defined storage.
There are several solutions to choose from. They are not the same in essence and should be evaluated against your requirements.
Let's name some: Ceph, GlusterFS, DRBD, Lustre, MooseFS. These are some of the open source projects available to you that go beyond classical storage. These solutions feature distributed data storage, with single or multiple copies of your data written to different disks across different storage units (servers). As you have probably figured out by now, you need multiple servers to run such storage systems. But if you are at the stage of considering abandoning the old ways, you are probably already running a bunch of servers anyway.
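To make the idea of "multiple copies across different servers" a bit more concrete, here is a minimal Python sketch of how a storage layer might decide which servers hold the copies of an object. It uses rendezvous (highest random weight) hashing, chosen here purely for illustration; it is not how any of the projects named above actually place data, and the server names, the object name and the `place_replicas` function are all made up for this example.

```python
import hashlib


def place_replicas(object_name, servers, copies=2):
    """Pick `copies` distinct servers for an object (illustrative only)."""
    if copies > len(servers):
        raise ValueError("not enough servers for the requested number of copies")

    def score(server):
        # Hash the (server, object) pair; the result is stable across runs.
        digest = hashlib.sha256(f"{server}:{object_name}".encode()).hexdigest()
        return int(digest, 16)

    # The servers with the highest scores hold the copies.
    return sorted(servers, key=score, reverse=True)[:copies]


# Hypothetical cluster of four off-the-shelf servers.
servers = ["node-a", "node-b", "node-c", "node-d"]
placement = place_replicas("invoice-2020-01.pdf", servers, copies=2)
print(placement)
```

The useful property of this scheme is that every client can compute the same placement without a central lookup table, and removing a server that does not hold a given object leaves that object's placement unchanged.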
There is no one-size-fits-all solution among these. You will have to dig into the features and requirements of each of them, and possibly others, and carefully examine your future data requirements. You might even consider going hybrid by keeping one copy of your data in your data center and another one (or more) at one of the cloud providers.
So forget about your classic SAN, NAS or DAS boxes and join the evolution.