Small and medium-sized businesses (SMBs) are recognizing the importance of the integrity of their data. These businesses need to manage a growing amount of data effectively.
Because of the wide-ranging benefits that small and medium size businesses can gain from Big Data in today’s competitive world, many are implementing a local Big Data strategy. A Big Data implementation presents a need for even more focus on the ability to recover from a catastrophic event quickly. With the implementation of Big Data comes the need for sophisticated Data Backup and Recovery procedures.
SMBs must ensure that critical data is available when needed, even as environments become more complex. Data must also be resilient in the face of regulations, hackers and natural disasters. If disaster does strike, recovery must be fast so there is a minimal impact to the bottom line.
“There is a growing interest among SMBs to harness technology, but they are worried about costs and having the right skills. They want to take incremental steps to build on existing capacities that also hold potential for consequent bold moves. Storage and infrastructure will play a critical role as enabling technology.” (IDC, as quoted on IBM.com)
Although the SMB is nimble, the SMB also has the challenge of a limited IT staff. The old method of passing the responsibility of Backup and Recovery to the newest and most junior IT staff person is no longer an option when mission-critical data is now in a combination of structured and unstructured complex formats. The need for a comprehensive data backup strategy becomes more critical to the business, while the implementation of that strategy becomes more complex.
Many SMBs will look to the cloud for an answer to their backup needs. Salvus Data Consultants is a Data Backup/Recovery Managed Service Provider that does not require the data to leave the customer’s network unless the SMB wants the data to be stored offsite. The data can remain within the customer’s network or be deployed offsite at the customer’s chosen location, while the processes are managed in the cloud.
IBM says that 90% of the data in the world today has been created in the last two years alone. IBM also says that 80% of data captured today is unstructured. Sources of unstructured data include posts to social media sites, digital pictures and videos, and point-of-sale systems, among others. All of this unstructured data can be termed Big Data.
To help businesses of all sizes manage Big Data, there is Hadoop. The Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using a simple programming model. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage.
The Hadoop project has various elements. Below are a few of the more pertinent ones:
- Hadoop Common – libraries and utilities
- Hadoop Distributed File System (HDFS) – a distributed file system that stores data on commodity servers while providing high aggregate bandwidth across the cluster.
- Hadoop MapReduce – a programming model with two steps. The “Map” step takes the input, divides it into smaller sub-problems, and distributes them to worker nodes. Each worker node processes its sub-problem and passes the answer back to its master node. The “Reduce” step then collects the answers to all the sub-problems and combines them to form the output.
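To make the Map and Reduce steps more concrete, here is a minimal sketch in plain Python that counts words across a few lines of text. It does not use Hadoop or any Hadoop API. The function names (`split_input`, `map_worker`, `reduce_step`, `word_count`) and the sample data are hypothetical, and the "worker nodes" are simulated by ordinary function calls on one machine.

```python
from collections import defaultdict


def split_input(lines, num_workers):
    """Map step, part 1: divide the input into one sub-problem per worker."""
    return [lines[i::num_workers] for i in range(num_workers)]


def map_worker(chunk):
    """Map step, part 2: a (simulated) worker turns its chunk into (word, 1) pairs."""
    pairs = []
    for line in chunk:
        for word in line.lower().split():
            pairs.append((word, 1))
    return pairs


def reduce_step(partial_answers):
    """Reduce step: collect every worker's answer and combine into one output."""
    totals = defaultdict(int)
    for pairs in partial_answers:
        for word, count in pairs:
            totals[word] += count
    return dict(totals)


def word_count(lines, num_workers=3):
    """Run the whole pipeline sequentially; a real cluster would run workers in parallel."""
    chunks = split_input(lines, num_workers)
    partial_answers = [map_worker(chunk) for chunk in chunks]
    return reduce_step(partial_answers)


if __name__ == "__main__":
    # Hypothetical sample input, for illustration only.
    sample = [
        "big data needs backup",
        "backup big data locally",
        "data recovery",
    ]
    print(word_count(sample))
```

In an actual Hadoop cluster, the chunks would be blocks of files stored in HDFS and the workers would run on separate machines, but the division of labor between the two steps would follow the same pattern.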
As Nathan Coutinho states in his CDW Blog article “5 Ways to Future-Proof Your Data Center for Big Data”: “The whole point of Hadoop is to keep the data local on commodity servers and economical local storage…”
Small and medium-sized businesses find Hadoop attractive because of its ability to provide high availability of data on local commodity servers.
A data strategy is never complete without a Data Backup and Recovery strategy. A Big Data implementation using Hadoop demands even more focus on the ability to recover quickly from a catastrophic event. However, the SMB is often not staffed or tooled to design and execute a backup strategy of this complexity. In addition, because the appeal of Hadoop lies in using local servers, the SMB needs a data backup and recovery strategy that can be managed remotely without requiring the live data to be transferred to, or run in, a cloud environment.
There are Data Backup/Recovery Managed Service Providers (DB/R MSPs) that provide remote management of the backup process, along with professional Disaster Backup and Recovery consultation. Contracting a DB/R MSP that follows the remote management model allows the SMB to maintain its data locally without the need to hire new staff or train existing staff in sophisticated data backup and recovery processes. Additionally, the SMB can have a comprehensive Data Backup and Recovery strategy while housing its Big Data locally.