Data Centre Demystification (Part 1)

Understanding The Data Centre

 

Imagine walking into a large warehouse-type room full of racks emitting an eerie glow. The sound emanating from the room is a uniform roar. If you ever find yourself in this sort of environment, you are most likely in a data centre. The racks emitting the eerie glow are filled with state-of-the-art computing, storage and network equipment, and the sound you hear is that of data being received, processed, accessed and stored at breakneck speed. All of this is in place so that you can access something as simple as your email or Facebook account.

If I had to give a technical definition of a data centre, the best would be the following: “A data centre (also called a computer centre or datacentre) is a facility used to house computer systems and associated components, such as telecommunications and storage systems. It generally includes redundant or backup power supplies, redundant data communications connections, and environmental controls.”

When you consider the amount of data that gets sent between systems these days, it is not difficult to imagine the amount of planning and work that goes into a data centre. But what are some of the challenges IT professionals face when working in such an environment, and how do they plan for the worst?


Some of the challenges seen in such large computing fabrics are the following:
  • Design and style 
  • Power requirements 
  • Environmental requirements and constraints 
  • Redundancy and failover 
  • Scalability and maintenance 
  • Cost 
Based purely on design, data centres can be divided into different categories or tiers. The tier at which a data centre operates determines how big an influence the other challenges become. The higher the tier, the more complex the design, which in turn exposes the operation to more of the challenges listed above.


The different tiers and their requirements are outlined below:

Tier 1
  • Single non-redundant distribution path serving the IT equipment
  • Non-redundant capacity components
  • Basic site infrastructure with expected availability of 99.671%

Tier 2
  • Meets or exceeds all Tier 1 requirements
  • Redundant site infrastructure capacity components with expected availability of 99.741%

Tier 3
  • Meets or exceeds all Tier 1 and Tier 2 requirements
  • Multiple independent distribution paths serving the IT equipment
  • All IT equipment must be dual-powered and fully compatible with the topology of the site's architecture
  • Concurrently maintainable site infrastructure with expected availability of 99.982%

Tier 4
  • Meets or exceeds all Tier 1, Tier 2 and Tier 3 requirements
  • All cooling equipment is independently dual-powered, including chillers and heating, ventilating and air-conditioning (HVAC) systems
  • Fault-tolerant site infrastructure with electrical power storage and distribution facilities with expected availability of 99.995%

Just by referencing the table, we can see that the first tier provides a single path to all the resources hosted in the centre, with little in the way of redundant hardware for failover. At this level an uptime of around 99.671% is considered the standard. This would be typical of a site-based server room or a small data centre servicing a single site.

At the second tier, all the requirements of a tier 1 centre are met, but redundancy is introduced at the hardware level. This provides some failover in the case of hardware failure. It also increases the cost of the data centre significantly, especially in terms of the telecommunications equipment. In this environment we can expect an uptime of around 99.741%. This would service a regional operation but would still eventually patch into a larger solution.

At the third tier, things start to get interesting. Here all tier 1 and 2 requirements are met, which seems simple enough; however, in this environment we start introducing redundant paths in and out of the centre. This means that if a route or a section of the operation goes down, data is still accessible via an alternate path. This becomes a real challenge in terms of IP address management, route management and equipment selection. It also means that redundant breakouts to the internet will be needed, which increases the cost of the operation dramatically. To put this in perspective, take a look at your average internet bill and add two zeros to it (before the decimal point, not after!).

Another interesting point to mention is that at the third tier, concurrent maintenance becomes a requirement. This means that we are able to maintain devices running critical resources without those resources becoming unavailable, which lets us address the challenges of scalability and maintenance. We also see that a tier 3 data centre should give us around 99.982% uptime. This would be typical of a large-scale regional centre or the NOC (Network Operations Centre) of a large organization or ISP.

Moving on to the fourth tier, we see all the requirements of tiers 1 through 3, plus an environmental control aspect that needs to be addressed. In this scenario, we will not only have independently powered HVAC systems, but they will also be more dynamic in nature, meaning they adjust to the current workload seen within the data centre.

We also start introducing a redundant site design. This means redundant power systems and supplies as well as backups. In some cases this takes the form of multiple sources of incoming power, or a backup system such as battery banks or generators, which can carry the load should the primary power fail. This requires some serious planning, and it also means that network and server engineers will not be the only type of engineer on the payroll, which further increases the cost of the operation.

At the fourth tier we should get an uptime of 99.995%, but that figure doesn't mean much on its own when compared to the third tier. Let me break it down so that we can see the real differences between the uptimes throughout the tiers.

If you take the difference between tiers 3 and 4, the uptimes differ by a mere 0.013%. If we consider the number of minutes in a year, we get a total of 525 600 minutes. This means that the total downtime that can be expected in a tier 3 operation would be around 95 minutes per year (rounded up), while the total downtime seen in a tier 4 data centre would be around 26 minutes. In other words, a tier 4 data centre will have roughly 68 minutes more uptime per year than a tier 3 data centre. Similarly, a tier 3 data centre will have around 21 hours more uptime per year than a tier 2 data centre!
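
If you'd like to check those numbers yourself, here is a minimal Python sketch of the calculation, assuming a 365-day year and using only the availability figures quoted in the tier table above:

    MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year

    # Expected availability per tier, taken from the table earlier in this article
    availability = {1: 99.671, 2: 99.741, 3: 99.982, 4: 99.995}

    for tier, percent in availability.items():
        downtime = MINUTES_PER_YEAR * (100 - percent) / 100
        print(f"Tier {tier}: roughly {downtime:.0f} minutes of downtime per year")

Running this gives roughly 1729 minutes of downtime per year for tier 1, 1361 minutes (about 22.7 hours) for tier 2, 95 minutes for tier 3 and 26 minutes for tier 4, which is where the figures above come from.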

This amount of time may seem insignificant, but imagine if a hospital running life-support machinery was reliant on that data centre. If we were to lose those 68 minutes, the price would be paid in lives. And the chaos that erupts if the stock market goes down for an hour is insane. Imagine it going down for a day! I'm sure the powers that be would not be impressed.

With this in mind, we can now start to appreciate the sheer scale of a data centre operation, as well as the reliance we have on these installations.
Interested in a career in an operation like this? Stick around for the next article, where I will go into what is required of an IT professional wanting to work in a data centre, and we will then explore the options when it comes to equipping yourself with the skills and knowledge needed!

Written by Brendon de Meyer

CTU Training Solutions

