The world today is truly interconnected, and when it comes to the Internet, the number of connections only multiplies. This means that mishaps by one party can have real business consequences for many others. Consider what happened when Amazon Web Services' S3 storage service suffered a major outage. The BBC covered its effects, confirming that the outage impacted high-profile sites such as Netflix, Airbnb, and Spotify.
Unfortunately, the public cloud is neither more nor less dependable than a private cloud. The major difference, of course, is that you are no longer in charge of keeping it in operation. While that might sound like an unqualified benefit, the same convenience creates a problem: if and when AWS S3 fails, there is nothing you can do to keep your app running; you cannot bring the S3 service or your EC2 instances back up yourself.
When we compared the expense of building your own private cloud against going with a vendor distribution or a managed cloud service, the cost of downtime was included in the calculation for DIY and vendor-distribution-based private clouds. With high-impact failures like this one at Amazon, it is easy to see that not even the most reputable, experienced clouds operate at 100% uptime; occasionally there will be outages.
What can be done about it?
The best defense against cloud failures is being able to track the health of your cloud at all times. You need to know the error rate for storage, for example, so you understand what is going on inside your cloud and can act to prevent future failures. If your error rate is trending upward, that is a sign there are problems that need to be identified and fixed before your data center fails.
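As a concrete illustration, here is a minimal sketch of that kind of trend watching. It assumes you can already collect per-interval request and error counts from your storage layer; the class name, window size, and 5% threshold are illustrative placeholders, not values any particular product prescribes.

```python
from collections import deque

# Hypothetical helper: track the storage error rate over a sliding window
# and flag an upward trend before it becomes an outage. Window size and
# threshold are illustrative placeholders, not recommended values.
class ErrorRateTracker:
    def __init__(self, window: int = 12, alert_threshold: float = 0.05):
        self.samples = deque(maxlen=window)   # most recent error-rate samples
        self.alert_threshold = alert_threshold

    def record(self, errors: int, requests: int) -> None:
        """Record one interval's worth of storage requests and errors."""
        rate = errors / requests if requests else 0.0
        self.samples.append(rate)

    def trending_up(self) -> bool:
        """True if the recent half of the window is worse than the older half."""
        if len(self.samples) < self.samples.maxlen:
            return False
        half = len(self.samples) // 2
        older = sum(list(self.samples)[:half]) / half
        recent = sum(list(self.samples)[half:]) / half
        return recent > older and recent > self.alert_threshold


tracker = ErrorRateTracker()
tracker.record(errors=3, requests=1000)
if tracker.trending_up():
    print("Storage error rate is climbing -- investigate before it fails.")
```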
To be a true cloud, your operation must be able to withstand a storage failure or a dying hypervisor; these are exactly the kinds of faults clouds were designed to tolerate. It is important that you can build your cloud and applications to ride out data center problems. All of this is possible through Pure Play Open Cloud.
Exactly what is Pure Play Open Cloud?
Pure Play Open Cloud is a term you have probably heard mentioned frequently if you travel in tech circles. It is a cloud architecture that is agnostic to the hardware and the underlying data center, so it can run anywhere. It is rooted in open source software such as OpenStack, Kubernetes, Ceph, and OpenContrail (networking software with no vendor lock-in). It can be moved from a hosted environment to your own, and it uses CI/CD pipelines to ensure both reliability and scale.
In the ideal situation you would:
- Have a timely reaction to technical issues
- Remain independent of any single vendor or cloud
- Have visibility into the underlying cloud
- Have the support and capability to solve problems before they escalate
These practices are not as simple as they seem on the surface, so let's put them under the microscope, so to speak.
Remaining independent from single vendors or clouds
OpenStack initially attracted many customers because, while Amazon Web Services enabled an entirely new way of operating, being utterly dependent on AWS left them vulnerable. The problems were both technological and financial. AWS keeps cutting prices, but the larger you grow, the more those incremental costs add up. And at that point, if you decide you want to switch, you will find yourself stuck, because your entire infrastructure is built around AWS products and APIs.
You would be considerably better off building your application and infrastructure in an arrangement that is agnostic to both the hardware and the underlying infrastructure. If your app does not care whether it runs on AWS or OpenStack, you might, for example, build an OpenStack infrastructure that serves as your app's base and use external resources such as AWS or GCE for burst scaling or damage control during emergencies, as in the sketch below.
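To make the idea concrete, here is a minimal sketch of what that provider-agnostic arrangement could look like. The `Provider` interface, the `OpenStackProvider` and `AWSProvider` classes, and the capacity numbers are hypothetical placeholders for illustration only, not real SDK calls.

```python
from abc import ABC, abstractmethod

# Hypothetical provider abstraction: the application talks only to this
# interface, so the underlying cloud (OpenStack, AWS, GCE) can be swapped
# without touching application code.
class Provider(ABC):
    @abstractmethod
    def launch_instances(self, count: int) -> None: ...

    @abstractmethod
    def available_capacity(self) -> int: ...

class OpenStackProvider(Provider):
    """Primary, self-hosted OpenStack cloud (placeholder implementation)."""
    def launch_instances(self, count: int) -> None:
        print(f"OpenStack: launching {count} instances")
    def available_capacity(self) -> int:
        return 10

class AWSProvider(Provider):
    """External public cloud used only for overflow (placeholder)."""
    def launch_instances(self, count: int) -> None:
        print(f"AWS: launching {count} burst instances")
    def available_capacity(self) -> int:
        return 1000

def scale_out(needed: int, primary: Provider, overflow: Provider) -> None:
    """Fill demand from the primary cloud first, then burst to the overflow cloud."""
    from_primary = min(needed, primary.available_capacity())
    primary.launch_instances(from_primary)
    if needed > from_primary:
        overflow.launch_instances(needed - from_primary)


scale_out(25, OpenStackProvider(), AWSProvider())
```

The point of the abstraction is that switching the overflow cloud, or the base cloud, becomes a one-line change rather than a rewrite.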
Reacting in a timely manner to technical issues
In an ideal world, the Amazon incident would not have had the impact it did; the outage was confined to AWS S3's us-east-1 region, and its effects could have been prevented had the affected applications been designed with a strong presence in multiple regions. Regions exist for exactly this purpose, yet they are rarely used this way.
Your applications should be built with a robust presence in several different regions, ideally spread out so that when there is a problem in one place, the application keeps running elsewhere. The downside is that this can get pricey. The alternative is to switch to a standby in case of an incident. At the very least, you should be able to manually switch to another region as soon as a problem has been detected, and ideally the problem is identified before it reaches critical levels.
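Here is a minimal sketch of that switchover logic: check each region's endpoint in priority order and route traffic to the first healthy one. The region names, health-check URLs, and the idea of an HTTP 200 health endpoint are assumptions about your setup, not anything prescribed by AWS or OpenStack.

```python
import urllib.request

# Hypothetical health-check based failover. The region names and URLs are
# placeholders for illustration only.
REGION_ENDPOINTS = {
    "us-east-1": "https://app.us-east-1.example.com/healthz",
    "us-west-2": "https://app.us-west-2.example.com/healthz",
    "eu-west-1": "https://app.eu-west-1.example.com/healthz",
}

def healthy(url: str, timeout: float = 2.0) -> bool:
    """Return True if the endpoint answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except Exception:
        return False

def pick_active_region() -> str:
    """Choose the first healthy region; fail over when the primary is down."""
    for region, url in REGION_ENDPOINTS.items():
        if healthy(url):
            return region
    raise RuntimeError("No healthy region available")


print("Routing traffic to:", pick_active_region())
```

Whether the switch is triggered automatically or by an operator, the decision logic is the same; what matters is that it exists before the outage does.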
Having visibility into the underlying cloud
Visibility into the underlying cloud is an area where a managed or private cloud has an advantage over the public cloud. After all, one of the major tenets of a cloud is that you do not need to worry about the specifics of the hardware running your application, which is fine unless you are the one in charge of keeping things on the up and up. If you are, tools such as StackLight (for OpenStack) or Prometheus (paired with Kubernetes) can give you valuable information about the inner workings of your cloud. They let you see whether trouble is forming, troubleshoot issues, and determine exactly where a problem lies, so you can repair it right away.
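As one small example of that visibility, here is a sketch that pulls a single metric through Prometheus's HTTP query API. The Prometheus address and the PromQL expression (a 5xx error rate on the Kubernetes API server) are assumptions about your monitoring setup and should be adjusted to the metrics your cloud actually exposes; StackLight surfaces similar data through its own dashboards.

```python
import json
import urllib.parse
import urllib.request

# Assumed Prometheus endpoint and query; both are placeholders to adapt.
PROMETHEUS = "http://prometheus.example.com:9090"
QUERY = 'sum(rate(apiserver_request_total{code=~"5.."}[5m]))'  # 5xx errors/sec

def query_prometheus(expr: str) -> list:
    """Run an instant query against the Prometheus HTTP API and return the result vector."""
    url = f"{PROMETHEUS}/api/v1/query?" + urllib.parse.urlencode({"query": expr})
    with urllib.request.urlopen(url, timeout=5) as resp:
        payload = json.load(resp)
    if payload.get("status") != "success":
        raise RuntimeError(f"Prometheus query failed: {payload}")
    return payload["data"]["result"]


for sample in query_prometheus(QUERY):
    timestamp, value = sample["value"]
    print(f"API server 5xx rate: {value} errors/sec")
```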
Support with fixing issues before they become problems
Having the tools to prevent and fix problems is, for many people, where the real know-how begins. There is a shortage of cloud experts today, and many companies are nervous about trusting their internal staff to handle their cloud. Luckily, this does not have to be the norm. While the do-it-yourself route into the cloud looks like the least expensive one, that is not always true in the long term. The more common solution is to use a vendor distribution and purchase support.
Another alternative that is growing in popularity is the "managed cloud". Here, your cloud may or may not be on your premises, but the experts who oversee it ensure that it holds up to a defined SLA, all while you remain in charge.
Here's an example: Mirantis Managed OpenStack is a service that monitors clouds 24 hours a day, 7 days a week and can repair issues before they worsen. It provides remote monitoring, KPI reporting, a CI/CD infrastructure, and operational support when needed.
Mirantis Managed OpenStack is designed around the notion of Build-Operate-Transfer, with everything built on open standards, so you are not locked in. When you feel ready, you can transition to a lower level of support, or even assume 100% of the responsibility yourself. In the end, you need assistance that saves you time and keeps you running without pinning you down.
Taking complete charge of your cloud destiny
If you take anything away from this, remember that although it might seem ideal to entrust your cloud entirely to a vendor, that is not generally your best bet. Take charge, and get what you want from your cloud: access to your options and stellar applications.