Our last data center dictionary entry covered Disaster Recovery. We discussed what disaster recovery is, how to reduce various threats, and how to begin your own disaster recovery plan. Next, we move to Colocation.
What is colocation?
Simply stated, colocation is the practice of locating a business's servers and IT equipment in an offsite data center. These facilities are often designed to provide rich connectivity options that would otherwise be unavailable to a business or organization. Unlike dedicated server hosting, colocation allows businesses to own and manage their own servers securely, in an environment designed to support and enhance server activity.
Why do businesses practice colocation?
Colocation provides businesses with several advantages, including:
- Improved facility and network security
- High uptime and availability
- Increased connectivity options
- Cooling, electrical and networking redundancy
- Scalability for future growth
- Cost-effective bandwidth
- Outage protection
Who should consider colocation?
While colocation can be a great resource for businesses of any size, medium and large organizations in particular should consider it. Industries that regularly handle highly sensitive information, such as financial services and healthcare, benefit from colocation because data centers have exceptional security measures in place.
Why should a financial service company consider colocation?
Today’s financial environment has given the advantage to the quick, connected, and agile. Colocation allows companies the speed, availability, and compliance adherence necessary for success. The boom in electronic trading allows companies to make transactions almost instantly but has also created an environment in which speed directly affects success. The most successful companies in this industry obtain and analyze market information to make quick and accurate decisions, and each second matters.
Colocation also protects companies from losses caused by latency. With 100% availability and uptime, a financial organization can be certain it will not miss an opportunity that might lead to a costly loss. Finally, because these companies handle sensitive data regularly, they must adhere to stringent compliance regulations. For more information about compliance, financial services, and colocation, we recommend reading our white paper, A Guide to Financial Services Regulations.
Healthcare and Colocation
In today’s healthcare environment, the IT infrastructure may be as important as the care itself. A new study published in the January/February Annals of Family Medicine estimates that 70% of family physicians are using Electronic Health Records (EHRs), and that by the end of the year over 80% will. Healthcare providers at all levels—from hospitals to family care practices—are relying heavily upon EHRs and other technology. Today, technology in medicine is no longer just for operational efficiency but also for effective patient care. Because technology has evolved into a critical component of any healthcare organization, these organizations should consider colocation: it supports effective operation and excellent patient care, as well as HIPAA and HITECH compliance.
Sunday marked one of the most important days of the year (for us, anyways). March 31, 2013 was World Backup Day 2013. This campaign was recently founded to remind computer users around the globe about the importance of backing up data. What would you do if you lost everything on your computer tomorrow? What would your business do if it were to suffer a natural disaster or power failure?
Did you know…
- More than 60 million computers will fail worldwide in 2013.
- Companies that aren’t able to resume their operations within 10 days after a disaster are not likely to survive.
- 90% of small companies spend less than 8 hours planning/managing their continuity plans.
- Between 60% and 70% of problems that hurt businesses are due to internal hardware or software malfunctions.
- 80% of businesses that suffer a major disaster go out of business within one year.
- Over 50% of businesses have experienced an unforeseen interruption, and the majority of those interruptions caused the business to be closed for one or more days.
- Only 1 in 4 people back up their information regularly.
- 113 cell phones are lost or stolen every minute in the U.S. alone.
Companies can choose from several approaches when evaluating backup options. One option is to use comprehensive offsite backup services. These services are designed to run continuously in the background of your computer or server and provide your company with real-time data replication to a secure server within the data center. Another option to consider is colocation, which houses your IT infrastructure at a data center to maximize reliability and uptime. Colocation is maintained at Data Cave, our fully redundant Indiana data center, with on-site technicians who can manage any of your unforeseen crises.
In the spirit of World Backup Day 2013, we have put together some questions for you to consider while examining your own backup routine.
- Are you backing up every database that is important to you?
- Do you double-check that your backups are working? Check your backed-up data periodically to ensure each backup is complete and successful.
- Do you have multiple copies of your data? If you back up your data (photos, files, etc.) and then remove it from your primary computer, you may want to consider redundant backups.
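If you want to automate the second check above, a small script can compare checksums between a source directory and its backup copy. This is a minimal sketch; the function names and directory layout are our own illustration, not part of any particular backup product:

```python
import hashlib
from pathlib import Path

def file_checksum(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_backup(source_dir: Path, backup_dir: Path) -> list:
    """Return source files that are missing from, or differ in, the backup."""
    problems = []
    for src in source_dir.rglob("*"):
        if not src.is_file():
            continue
        dst = backup_dir / src.relative_to(source_dir)
        if not dst.is_file() or file_checksum(src) != file_checksum(dst):
            problems.append(src)
    return problems
```

Running a check like this on a schedule turns "I think my backups work" into something you can actually verify.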
We challenge you to take the pledge to back up your files in celebration of World Backup Day.
As with any industry, it is easy to fall into the myopic trap of jargon. In fact, one of my favorite HBR podcasts discusses the burden and challenges that jargon presents (listen to or read Dan Pallotta’s interview with Sarah Green, “Business Jargon is Not a Value-Add,” here). While we try our best not to over-jargon our customers and friends, it is easy to be ensnared by its occasional usefulness. But sometimes it can be valuable to step back, think about what we want to express, and explain it in plain terms. For that purpose, we are going to break down data center jargon. We want to explain what it is that we do, so everyone can understand what is actually being communicated.
So without further ado, here is the first installment of our data center dictionary series. Please let us know in the comments section what words you’d like to see explained, and we would be happy to oblige.
What is Disaster Recovery?
Disaster recovery is a plan put in place to make sure your business is adequately prepared to function after any type of disaster. It typically specifies the technology measures that will take effect should a disaster occur.
What kinds of disasters should DR include?
There are two types of disasters: natural and man-made. Natural disasters include tornadoes, wildfires, floods, hurricanes, and earthquakes. Man-made disasters include infrastructure failure, human error, and hazardous material incidents such as Three Mile Island and Chernobyl.
By the Numbers:
A study from Gartner, Inc. found that 90% of companies that experience data loss go out of business within two years. Research by IBM (Varcoe, 1993) showed that 80% of organizations without relevant contingency plans that suffered a computer disaster went bankrupt.
How to Reduce Various Threats:
Take preventative measures to avoid disasters. Start by creating a disaster recovery plan, and be sure to enforce its policies. Make frequent backups of your critical data and records, and store them in a secure, remote location. The typical rule of thumb is to locate your disaster recovery site at least 50 miles away from your business or primary colocation site (see our upcoming data center dictionary installment on Colocation).
Where Should You Start?
Some good questions to ask yourself when preparing for a disaster include:
- How is your business run?
- What is required to keep your business going?
- What are the most critical aspects of day-to-day business?
- What is a reasonable length of time for your business to be up and running from your disaster recovery site if your primary servers and hardware went down? Minutes? Hours? Days?
When creating any kind of business plan, a business should begin with an assessment of current status. In this case, an organization should conduct a basic risk assessment. While different industries have different risks, business continuity plans should reflect the respective risks for a given organization and industry. This often requires brainstorming sessions around worst case scenarios.
Next, perform a business impact analysis for the scenarios your team created. By itemizing the potential risks, your organization will be able to prioritize the scenarios that would impact the business the most. Plot these scenarios on a matrix, with likelihood on one axis and impact on the other.
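The prioritization step can be sketched in a few lines of code. The scenario names and the 1–5 likelihood and impact scores below are purely illustrative assumptions, not real assessments:

```python
# Hypothetical risk scenarios, each scored 1-5 for likelihood and impact.
scenarios = [
    ("Tornado damages primary facility", 2, 5),
    ("Extended power outage",            3, 4),
    ("Accidental data deletion",         4, 3),
    ("Vendor bankruptcy",                1, 2),
]

# Rank by likelihood x impact, highest-risk scenario first.
ranked = sorted(scenarios, key=lambda s: s[1] * s[2], reverse=True)
for name, likelihood, impact in ranked:
    print(f"{likelihood * impact:>2}  {name}")
```

The product of the two scores gives a simple risk ranking; in practice, the matrix is usually reviewed visually so that high-impact, low-likelihood events are not dismissed purely on their score.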
After performing a business impact analysis, we recommend formally documenting your contingency plans for these situations. Business continuity plans should cover the entirety of an organization and may include:
- Incident response plan
- Incident management plan
- Business recovery plan
- Emergency and evacuation plan
- Contingency plan
Those plans should be accompanied by their respective procedures and policies.
After documenting a business continuity plan, an organization should put these items into action. Adequate preparation goes beyond the paper documentation. For example, if the engineers on the Mars Rover had only documented their backup plans, they might not have been able to protect their asset when one of those circumstances occurred.
At this point in the process, your organization has created its business continuity plan and taken the appropriate preparations. The plans should be measured and reviewed on a regular basis. There is no magic number for the frequency, but the policies, procedures, and plans should be monitored to identify the successful aspects and the potential weaknesses within these processes.
Finally, once the weaknesses have been identified, adjust accordingly. Prioritize the actions, as done when performing the business impact analysis, and then tweak the plans.
Unfortunately, preparing for disasters must be an ongoing process. This cycle is crucial to protecting an organization, its information, and its operations. In the Midwest, where tornadoes, floods, and lightning are commonplace, it is especially vital for a business not only to have a business continuity plan but also to frequently measure and monitor that plan. Check out our whitepaper on disaster recovery planning for tips to help you along the way.
Data Cave is a fully redundant, robust data center located in Indiana. We welcome you to share in its advantages, especially when it comes to protecting your valuable data assets. Contact us or call 866-514-2283 to schedule a tour and see for yourself.
Did you know that McDonald’s feeds more than 46 million people every day? That’s more than the population of Spain! Additionally, McDonald’s represents 43% of the United States fast food market. One would think that a company like McDonald’s would practice appropriate server maintenance. We were horrified when a friend of Data Cave sent us this picture they snapped through the window of a local McDonald’s drive through.
So let’s play a game. What’s wrong with this picture?
1. Kitchens and Technology are a Recipe for Disaster
This McDonald’s chose to locate its servers near the kitchen. It doesn’t take a data center expert to see that this is not an effective strategy. Consider your personal cell phone, for example: SquareTrade research found that 21% of all iPhone accidents occur in the kitchen. An iPhone is a critical device for many, but most of its vital information is backed up via iCloud. Replacing an iPhone isn’t cheap, but the price is nowhere near as prohibitive as purchasing and implementing a new server. Proximity to food and drink can only end in technology tragedy.
2. Exposure to the Elements
Not only did this McDonald’s place its servers near the kitchen, it exposed them to the elements by putting them in the drive-thru room. It is estimated that an average McDonald’s serves 1,584 customers daily. If half of those customers come through the drive-thru and the window is open for an average of 10 seconds per customer, those servers are exposed to outside conditions for two hours and twelve minutes each day. This takes the idea of an uncontrolled environment to the extreme.
3. Crossed Wires
Messy wires aren’t just aesthetically displeasing; they are also dangerous. Tangled wires pose a fire threat (and we are willing to bet that McDonald’s didn’t install a fire suppression system exclusively for its servers). Because of this cabling, it doesn’t even appear that they can shut the door (see #4). In fact, the picture below details the challenges of having messy wires.
4. An Open Door Policy
Open door policies are great for dealing with employees, but they are less than optimal when it comes to technology. Leaving the door to their servers open poses many security risks. Damage could be done, both intentionally and unintentionally. McDonald’s has employed one in every eight American workers, which is indicative of high employee turnover. A disgruntled employee could easily wreak havoc on McDonald’s because the technology is so readily accessible. Additionally, accidents happen, and an open door increases the chance of them.
5. The Data Closet
Finally, it goes without saying that we encourage all organizations to protect their valuable technology (especially offsite). McDonald’s has its main data center in Dallas, but its restaurants obviously still need local equipment. There are many risks that come with housing an internal data center, especially one in a closet with no ventilation or cooling. If you want cost savings and increased protection, it only makes sense to outsource your data center.
McDonald’s, we urge you to clean up your technology act! It is inevitable that something will happen, and you will suffer!
The past few blog posts have told stories about bad backups, Amazon’s cloud outage, the impending death of RIM, and extended power outages. For the sake of argument (and this article), we will continue down this path to emphasize an important concept: redundancy (pun intended).
So, what is redundancy? As a data center, we live and breathe this term, but for those who don’t live in the Data Cave, it might be foreign. Some define redundancy as “superfluous repetition or overlapping, or superfluity.” We beg to differ. Data has become the currency of today, and redundancy ensures its viability; that is anything but a superfluous task. We prefer another definition: “the additional, predictable information so included, and the degree of predictability thereby created.” What organization doesn’t prefer a predictable work environment?
Any organization craves predictability because there is so much in life we cannot control. Developing redundancy (and sometimes secondary redundancy) provides the peace of mind and control that we so desperately need.
In late June, a massive windstorm blew through Northern Virginia and took down Fairfax County’s 911 system, which operated through Verizon Wireless. 2.3 million residents were left without 911 service for several days. Not only did the storm interrupt the 911 service’s power supply, but one of its generators also failed to activate, despite having been routinely tested three days prior. Fortunately, no one died as a result of this failure, but the effects were still grave. Harry Mitchell, Verizon’s Director of Public Relations, acknowledged the gravity of the situation and the company’s desire to remedy it.
“Once we complete our restoral efforts, we will investigate fully the causes of the problems and provide a root-cause analysis to the appropriate officials. The powerful storm appears to have caused problems on multiple layers of facilities, from the commercial power failure to damage to our backup power supply, to downed and damaged lines. The combination of those factors led to issues with various aspects of the 911 system.”
This case highlights the importance of multiple levels of redundancy. Verizon had the appropriate generators. They had a backup plan. What they didn’t have was a backup-backup plan. Redundancy is a valuable risk management tool, and organizations should use it to the furthest degree possible. Colocating at a data center is a great first step toward ensuring your data is secure. The second step is to maintain an additional site for disaster recovery to ensure your equipment is redundant. When choosing a data center, for colocation or disaster recovery, look at its redundancies. Does it have a backup-backup (or backup-backup-backup) plan? At Data Cave, we obviously know the importance of being redundant. We have redundant equipment for all four quadrants of our building, plus redundancy at the 1,300-square-foot data suite level. Our design allows for more redundancy than your typical wide-open-floor data center. We know that being prepared for disaster is half the battle. So back up your backup plans and be prepared; those redundancies will save you time, effort, and money in case of disaster.
ThePlanet has a blog post from a couple of years ago, written by their DC Manager, highlighting some of the daily work that goes into keeping their data centers up and running. I’ll highlight some similar bits of information about Data Cave in the next couple of posts.
Each data suite in our center has multiple computer room air conditioning (CRAC) units. We monitor these units remotely from our Network Operations Center (NOC), keeping an eye on the temperature and humidity levels in the rooms, and we keep historical data for trend analysis. We also spot check the units daily, verify they are working properly, and ensure the screen readouts agree with our remote monitoring.
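As a rough illustration of the kind of threshold check this monitoring performs, here is a minimal sketch. The metric names, bounds, and readings are illustrative assumptions, not our actual alarm settings:

```python
# Allowed ranges per metric (illustrative values, not real alarm limits).
LIMITS = {"temp_f": (64.0, 81.0), "humidity_pct": (20.0, 80.0)}

def out_of_bounds(reading: dict) -> list:
    """Return the names of any metrics outside their allowed range."""
    alerts = []
    for metric, (lo, hi) in LIMITS.items():
        if not lo <= reading[metric] <= hi:
            alerts.append(metric)
    return alerts

# A hypothetical CRAC readout: temperature is above its upper bound.
reading = {"temp_f": 85.2, "humidity_pct": 45.0}
print(out_of_bounds(reading))
```

Real monitoring systems add trending and alert escalation on top of simple range checks like this, so that a slow drift is caught before it ever trips an alarm.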
The heart of our chilled water system is a set of centrifugal chillers that create the chilled water. These units run mostly autonomously, but we still spot check them daily for things like oil level and refrigerant level. We also monitor them remotely from our NOC to ensure that no faults have occurred and that water temperatures and flows stay within bounds.
The chillers make cold water by rejecting heat into a separate water system, known as our process water. This separate loop is also computer controlled: a system of pumps and cooling tower fans carries the water outside, where some of it is evaporated to reduce the loop’s temperature again. Makeup water is brought in through wells located around the building and purified by reverse osmosis and softening systems. Again, the whole system is computer controlled and remotely monitored, and we spot check it daily.
Because the process water is warm, it is a breeding ground for bacteria. We therefore periodically add chemicals to keep bacteria from forming and to prevent the water from rusting or deteriorating the steel and copper tubing throughout the cooling system. This treatment is done in house and monitored weekly.
Stay tuned for part #2, where I’ll talk about our electrical systems.
A few weeks ago I posted an opinion that tape backups are dead, and that generated some feedback telling me I was plain wrong. For better or worse, I’m sticking with another “is dead” mantra: RAID (particularly, RAID 5).
Now, in all reality, RAID 5 isn’t dead. But you shouldn’t be using it. The meat of the argument is in an old piece at ZDNet. Let’s dive into why.
The main concept behind RAID 5 is that in a disk set, one disk’s worth of capacity is used for storing parity information. The parity is actually striped across all of the disks, not kept on a single disk. The idea is that any one disk in the set can fail and the set can continue on. Once the failure is noticed, a spare disk can be brought into the set (usually automatically by modern SAN devices) and the missing data rebuilt from the parity information.
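To make the parity idea concrete, here is a minimal sketch using byte-wise XOR, which is the operation RAID 5 parity is based on. The stripe contents and the 5-disk layout are assumptions for illustration:

```python
from functools import reduce

# Four data stripes, one per data disk in a hypothetical 5-disk RAID 5 set.
stripes = [b"\x01\x02", b"\x10\x20", b"\x0a\x0b", b"\xff\x00"]

def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

# The parity stripe is the XOR of all data stripes.
parity = xor_blocks(stripes)

# Simulate losing disk 2: XORing the survivors with the parity
# reconstructs the missing stripe, because x ^ x = 0.
survivors = [s for i, s in enumerate(stripes) if i != 2]
recovered = xor_blocks(survivors + [parity])
assert recovered == stripes[2]
```

This is why any single disk can fail without data loss: the missing stripe is always the XOR of everything that survived.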
The problem is that this rebuild takes time. A lot of time, with today’s large disks. And disk failure rates are fairly high to begin with. So, statistically, there is a real likelihood of a second disk failure during the rebuild of the first, and if that happens, you are in for a really bad day. Wikipedia says it best:
As the number of disks in a RAID 5 group increases, the mean time between failures (MTBF, the reciprocal of the failure rate) can become lower than that of a single disk. This happens when the likelihood of a second disk’s failing out of N − 1 dependent disks, within the time it takes to detect, replace and recreate a first failed disk, becomes larger than the likelihood of a single disk’s failing.
Basically, the more disks there are in a RAID 5 set, the more likely it becomes that a second disk fails before the rebuild of the first completes.
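A back-of-the-envelope model makes this concrete. The annual failure rate and rebuild window below are illustrative assumptions, not vendor figures, and the model treats failures as independent with a constant rate:

```python
import math

def second_failure_prob(n_disks, annual_failure_rate=0.03, rebuild_hours=24):
    """P(at least one of the n_disks - 1 surviving disks fails
    during the rebuild window), under an exponential failure model."""
    hourly_rate = annual_failure_rate / (365 * 24)
    p_one_survives = math.exp(-hourly_rate * rebuild_hours)
    return 1 - p_one_survives ** (n_disks - 1)

for n in (4, 8, 16, 32):
    print(f"{n:>2} disks: {second_failure_prob(n):.4%}")
```

Even with these modest numbers, the probability grows roughly linearly with the disk count, and real-world rebuilds of large drives can take days rather than hours, which makes the window far worse.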
Of course, RAID 6 is an alternative to RAID 5, with an additional disk’s worth of parity, so that a two-disk failure can be handled. But the same limitation exists as with RAID 5: at some point it just won’t be reliable anymore. ZDNet even followed up on this. The problem remains that with larger disk sets and more parity striping, any failure takes a really long time to rebuild from, and that’s when things are most vulnerable.
What are the solutions? For one, you could store the data on multiple RAID sets, perhaps in completely different SAN units. This requires significantly more storage but makes reliability much higher. You could just back everything up to tape (kidding!). Or start using a more resilient data store on top of the drives, like ZFS.
There are a lot of options. What are you doing to mitigate data loss?
DATA CAVE ENTERS INTO AGREEMENT WITH KAR Auction Services
Columbus, IN – (February 2011)
Data Cave, Inc., a state-of-the-art data center facility based in Columbus, IN, announced today that it recently entered into a long-term agreement to house strategic IT infrastructure for KAR Auction Services, Inc.
Caleb Tennis, President of Data Cave, commented: “We are very excited about the addition of KAR Auction Services to our growing list of clients. We are confident our highly dependable infrastructure will exceed KAR Auction Services’ expectations.”
Data Cave, Inc.
Data Cave provides customers a private and secure environment for data center services, including colocation and disaster recovery solutions, from its newly built, hardened data center facility in the Midwest. The 80,000+ sq. ft. facility was designed and constructed to withstand the most extreme natural disasters and is conveniently located near Indianapolis, Louisville, and Cincinnati. Data Cave is a privately held, woman- and minority-owned organization. For more information, visit www.thedatacave.com.