Good SLAs make good neighbors.
When selecting a colocation data center, important considerations are visibility and the management tools available to the customer. Monitoring energy consumption, setting up a DR-ready environment and building a true partnership with the provider are essential points to weigh when choosing the right data center. This is the fourth article in a series on colocation selection and best practices. Below is an abridged version of Section 4 of the Data Center Knowledge Guide to Colocation Selection.
During the planning phases, contracts, expectations and management tools need to be defined so that everyone is on the same page. When working with a colocation provider, there are important planning points and ongoing considerations behind a successful data center deployment.
Work with a service level agreement
When selecting the right colocation provider, it is crucial to create a good SLA and establish clear lines of responsibility. An SLA can often be developed from the needs of the organization and what is hosted in the data center infrastructure. That means identifying key workloads, applications, servers and more. From there, an organization can develop base service agreements for availability, troubleshooting, response time and so on. Creating a good SLA document can take some time, but it is important to do so with care because it can govern the performance of your environment. Some very high availability environments build credits into their SLA. In these situations, for example, a colocation provider could issue credits in the event of a power outage. Creating an SLA is a partnership between the data center provider and the customer. Expectations should be clearly defined to ensure that all performance, recovery and other requirements are met. Surprises or unknowns in a busy production environment can result in lost productivity, time and money.
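To make the idea of SLA credits concrete, the arithmetic can be sketched as below. This is only an illustration: the tier thresholds, credit percentages and function names are hypothetical and are not taken from the guide or from any provider's contract. Real values come from the negotiated SLA.

```python
from dataclasses import dataclass

MINUTES_PER_DAY = 24 * 60


@dataclass(frozen=True)
class CreditTier:
    """Hypothetical tier: at or above this availability, this credit applies."""
    min_availability_pct: float
    credit_pct: float


# Hypothetical tiers for illustration only, ordered from the highest
# availability threshold to the lowest.
EXAMPLE_TIERS = [
    CreditTier(min_availability_pct=99.99, credit_pct=0.0),
    CreditTier(min_availability_pct=99.9, credit_pct=5.0),
    CreditTier(min_availability_pct=99.0, credit_pct=10.0),
]
# Hypothetical credit when availability falls below every tier.
FLOOR_CREDIT_PCT = 25.0


def availability_pct(downtime_minutes: float, days_in_month: int = 30) -> float:
    """Share of the month, in percent, during which the service was available."""
    total_minutes = days_in_month * MINUTES_PER_DAY
    if not 0 <= downtime_minutes <= total_minutes:
        raise ValueError("downtime must be between 0 and the length of the month")
    return 100.0 * (total_minutes - downtime_minutes) / total_minutes


def credit_pct(availability: float, tiers=EXAMPLE_TIERS) -> float:
    """Return the credit for the first (highest) tier the availability meets."""
    for tier in tiers:
        if availability >= tier.min_availability_pct:
            return tier.credit_pct
    return FLOOR_CREDIT_PCT


def monthly_credit(monthly_fee: float, downtime_minutes: float,
                   days_in_month: int = 30) -> float:
    """Credit owed for the month, in the same currency as the fee."""
    availability = availability_pct(downtime_minutes, days_in_month)
    return monthly_fee * credit_pct(availability) / 100.0
```

Under these assumed tiers, one hour of downtime in a 30-day month works out to roughly 99.86 percent availability, which falls in the 10 percent credit tier. Writing the calculation down this way can help both parties agree on how downtime is measured before an incident occurs.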
Maintenance and testing
Remember, when you buy colocation data center space, you are buying a slice of critical infrastructure and ongoing maintenance. Without a robust maintenance program, the technology will fail. Look for documented MOPs (methods of procedure) and SOPs (standard operating procedures) that are used consistently and improved over time. Make sure your SLA does not exclude maintenance windows or emergency maintenance. Your colo provider should be able to show you its monthly, quarterly and annual maintenance schedules for all critical parts of the mechanical and electrical systems, including chillers, air handling units, generators, batteries and inverters. You should be able to observe and even participate in maintenance exercises. How are you informed of maintenance windows and procedures? Finally, ask the ultimate question: “Do you plan and test for a complete utility outage?”
Systems should be designed with sufficient redundancy to allow proper maintenance. Colocation providers are reluctant to maintain systems if doing so could cause an outage. The best practice in the industry is to be able to “fix one and break one” during a utility outage, that is, to have one component out for maintenance and lose another while still carrying the load.
Have a disaster recovery contract ready
For some organizations, the move to a colocation data center is the result of a disaster recovery plan. In these situations, it may well be possible to build a DR contract into the SLA or to write it as a stand-alone agreement. In it, the organization and the colocation provider establish which internal systems must remain active and create a policy for keeping those systems running. When designing a contract around a disaster recovery initiative, consider some of the following:
- Use your BIA. As mentioned earlier, a business impact analysis will describe the key components within a business environment that need to be kept active or recovered quickly.
- Communicate clearly. Good communication between the colocation partner and the organization is vital in any disaster recovery plan. If a system or component that the organization considers critical was never disclosed to the provider, its failure will become a serious problem.
- On-site and off-site supplies. In the event of a disaster, you need key sources of supply on-site and off-site. Are there on-site supplies of diesel fuel for the generators and water for the cooling systems? Are there established services for the delivery of water and diesel fuel if on-site supplies run out? Does the colo provider run disaster recovery scenarios with key vendors?
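The on-site supply question above comes down to simple arithmetic that is worth asking a provider to walk through. The sketch below is a rough illustration only: the function names, the litre-based units and the idea of a fixed burn rate are assumptions made for the example, and real generator consumption varies with load.

```python
def generator_runtime_hours(usable_fuel_liters: float,
                            burn_rate_liters_per_hour: float) -> float:
    """Hours of generator runtime available from on-site fuel at a fixed burn rate."""
    if usable_fuel_liters < 0:
        raise ValueError("fuel volume cannot be negative")
    if burn_rate_liters_per_hour <= 0:
        raise ValueError("burn rate must be positive")
    return usable_fuel_liters / burn_rate_liters_per_hour


def fuel_covers_delivery(runtime_hours: float,
                         delivery_lead_time_hours: float,
                         safety_margin_hours: float = 0.0) -> bool:
    """True when on-site fuel lasts until a delivery can arrive, plus a margin."""
    return runtime_hours >= delivery_lead_time_hours + safety_margin_hours
```

For example, with an assumed 10,000 liters of usable fuel and an assumed burn rate of 250 liters per hour, the runtime is 40 hours. That would cover a 24-hour delivery lead time with an 8-hour margin, but not a 48-hour lead time.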
Use management tools
One of the most important management aspects in any environment is the ability to have clear visibility. This means using not only native tools, but also those provided by the data center partner. Working with workload management and monitoring tools is very important. It is also important to have a good view of the physical infrastructure of the data center environment. Data and reports from these monitoring tools should be made available via a secure portal.
- Power monitoring. Always monitor the energy consumption rates of your environment. The idea here is not just to know how much energy is being used, but to make the environment more efficient.
- Cooling monitoring. Just like power, it’s also important to keep an eye on cooling. This can be specified as part of an SLA, or an organization can monitor cooling itself.
- Rack conditions and environments. Keeping track of environment variables will help create a more efficient rack design. Some servers will generate more heat while others will need more power. By seeing which system is using which resources, administrators can better position their environment for optimal use.
- Availability and status reports. Regularly check the availability reports of individual systems and keep an eye on the status of different systems.
- Logs. A log monitoring platform is always very important. One recommendation is to have a log aggregation tool that collects various server, system and security logs for analysis.
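As a small illustration of the power, cooling and rack monitoring points above, the sketch below checks per-rack readings against limits. Everything in it is assumed for the example: the `RackReading` fields, the threshold parameters and the alert wording are hypothetical, and a real deployment would pull readings from the provider's portal or from its own monitoring tools.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RackReading:
    """One hypothetical sample for a rack: power draw and inlet air temperature."""
    rack_id: str
    power_kw: float
    inlet_temp_c: float


def find_alerts(readings, max_power_kw: float, max_inlet_temp_c: float):
    """Return a human-readable alert for every reading that exceeds a limit."""
    alerts = []
    for reading in readings:
        if reading.power_kw > max_power_kw:
            alerts.append(
                f"{reading.rack_id}: power {reading.power_kw:.1f} kW "
                f"exceeds limit of {max_power_kw:.1f} kW"
            )
        if reading.inlet_temp_c > max_inlet_temp_c:
            alerts.append(
                f"{reading.rack_id}: inlet temperature {reading.inlet_temp_c:.1f} C "
                f"exceeds limit of {max_inlet_temp_c:.1f} C"
            )
    return alerts
```

The limits are passed in rather than hard-coded because they would normally come from the contracted power allocation per rack and the cooling terms agreed in the SLA.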
Problem solving and communication
A big part of having an effective environment will be problem-solving practices and communication between partner and client. Although a lot of this can be described in the SLA, specific problem-solving issues should be discussed.
When planning the problem-solving conversation, it is important to identify the major components of the data center and then communicate this to the colocation provider. For example, a failed hard drive within a particular system may take precedence over another problem if the two events occur at the same time. In this scenario, the SLA and BIA are used to create a clear plan to resolve issues quickly and in the correct order. There will be instances where one specific issue takes precedence over others due to the nature of the event. Without good communication, the colocation provider may not know which issue to resolve first and may assign a resource to a less important ticket. Share your BIA findings and make it clear which data center components need to be addressed first.
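The ordering described above can be sketched as a simple sort driven by BIA results. This is a minimal illustration under stated assumptions: the system names, the numeric priority scale and the `Ticket` structure are hypothetical, and they do not represent any particular ticketing system.

```python
from dataclasses import dataclass

# Hypothetical BIA output shared with the provider: lower number = more critical.
BIA_PRIORITY = {
    "payments-db": 1,
    "web-frontend": 2,
    "dev-sandbox": 3,
}
# Systems missing from the BIA map sort after every listed system.
DEFAULT_PRIORITY = 99


@dataclass(frozen=True)
class Ticket:
    ticket_id: str
    system: str
    opened_minute: int  # minutes since the start of the incident window


def triage(tickets, bia_priority=BIA_PRIORITY):
    """Order tickets by BIA criticality, then by how long they have been open."""
    return sorted(
        tickets,
        key=lambda t: (bia_priority.get(t.system, DEFAULT_PRIORITY), t.opened_minute),
    )
```

Note that a system absent from the shared map falls to the back of the queue, however critical it really is. That is the same risk the article raises about undisclosed critical systems.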
The process of selecting the right colocation provider should include planning for the creation of the contract and setting up the right management tools. A colocation data center is an important extension of any organization and therefore needs to be properly managed. Good data center providers often offer tools that give direct visibility into the infrastructure. This means that engineers will have a clear understanding of how their racks are cooled, powered and monitored. These types of decisions make an infrastructure more efficient and much easier to manage in the long run.
For more details on how to structure your colocation SLA and other colocation selection best practices, download the complete Data Center Knowledge Guide to Colocation Selection.