High-Availability and Disaster-Recovery in and with the Cloud
High availability and disaster recovery in and with the cloud support different levels of integration. It is important for enterprises to ensure that their data is secure and available when needed. Cloud solutions such as IaaS, PaaS and SaaS, backup-to-the-cloud and cloud-to-cloud backup support these enterprises and offer a suitable alternative to on-premises solutions. But which solution is right for the enterprise in question? In this article, we examine the pros and cons of each solution.
The availability, and thus the usability, of IT applications and their underlying IT services depends on various factors, for example the underlying hardware components, the installed software modules or the configuration of the network connections involved. When it comes to the availability of IT applications or IT services, a distinction is made between two main states: the normal state and the fault state, in which some of the components involved fail.
An IT service is described as available if it is capable of performing the tasks for which it is intended. However, availability also refers to the probability that a system will be functional (available) within a specified period of time. Availability is measured as the ratio of a system's uptime to the total period observed (uptime plus downtime). Building on this, high availability is defined as follows:
- A system is considered to be highly available if an application continues to be available even in the event of a failure and can continue to be used without direct human intervention.
- High availability (abbreviated HA) refers to the ability of a system to ensure unrestricted operation in the event of failure of one of its components. For this purpose, systems or parts of them are designed redundantly.
Depending on the importance of the services (e.g., the relevance of the application for the company's core processes) and the risks to be covered, the redundancy used can range from individual components through whole systems to complete application landscapes.
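As a rough illustration of the availability ratio, here is a minimal Python sketch. It assumes availability is computed as uptime divided by the total observed period; the function names and the 8,760-hour year are our own choices for the example and are not taken from any provider's service level agreement.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years (assumption)


def availability(uptime_hours: float, downtime_hours: float) -> float:
    """Return the fraction of the observed period in which the system was usable."""
    total = uptime_hours + downtime_hours
    if total <= 0:
        raise ValueError("the observed period must be longer than zero hours")
    return uptime_hours / total


def allowed_downtime_hours(target: float, period_hours: float = HOURS_PER_YEAR) -> float:
    """Return the downtime budget that a given availability target leaves per period."""
    if not 0.0 <= target <= 1.0:
        raise ValueError("the availability target must lie between 0 and 1")
    return (1.0 - target) * period_hours


if __name__ == "__main__":
    # Illustrative figures only: 8.76 hours of downtime in one year.
    print(f"availability: {availability(8751.24, 8.76):.4%}")
    print(f"downtime budget at 99.9%: {allowed_downtime_hours(0.999):.2f} h/year")
```

Read the other way round, the second function shows why each additional "nine" in an availability target is costly: the tolerated downtime per year shrinks by a factor of ten.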
Generally speaking, the greater the scope of possible events, the greater the effort required to make a system fail-safe.
Traditionally, business-critical systems are operated across at least two geographical locations so that services can continue to operate even if one location fails. For protection against large-scale events, a third site can also be operated as a so-called "disaster recovery" (DR) site, which is also activated in the event of cyberattacks, ransomware attacks or security breaches. See also the Eraneos Group study "Cyber resilience: ensuring the continued existence of companies".
However, the associated costs for installation and operation, as well as the increased complexity, represent a significant hurdle.
Take-Away: What is High-Availability?
Take-Away: What Does Disaster Recovery Mean?
Various cloud providers offer options for running systems in the cloud with high availability, for example Infrastructure as a Service (IaaS), Platform as a Service (PaaS) or Software as a Service (SaaS). IaaS, PaaS and SaaS solutions are mostly based on a dynamic billing or subscription model. There are no high acquisition costs, and maintenance and updates are carried out by the cloud provider.
Many of these cloud services are already built on redundant hardware as standard and are therefore protected against local hardware failures. However, the fault tolerance of these systems can be significantly increased by replication to other locations or regions. For example, virtualized systems or storage can be configured as site- or geo-redundant instances.
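The effect of replication on fault tolerance can be estimated with a simple model. The sketch below assumes that the replicas fail independently of one another and that a single surviving replica is enough to keep the service running; both are simplifications, since real sites can share failure causes and failover itself can fail. The function name is hypothetical.

```python
def replicated_availability(single: float, replicas: int) -> float:
    """Availability of a service that stays up as long as one replica survives.

    Assumes independent failures: the service is down only when every
    replica is down at the same time.
    """
    if not 0.0 <= single <= 1.0:
        raise ValueError("the availability of a replica must lie between 0 and 1")
    if replicas < 1:
        raise ValueError("at least one replica is required")
    return 1.0 - (1.0 - single) ** replicas


if __name__ == "__main__":
    # Illustrative value: each instance is assumed to be available 99% of the time.
    for count in (1, 2, 3):
        print(count, f"{replicated_availability(0.99, count):.6f}")
```

Under these assumptions, every additional site or region multiplies the remaining probability of a total outage by the failure probability of one replica, which is why geo-redundancy raises fault tolerance so sharply.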
Highly available PaaS services, such as database instances or instances of the widespread container platform Kubernetes, are also operated on the basis of a highly available cloud infrastructure.
Likewise, data stored in the cloud can be automatically distributed across multiple locations, which prevents data loss in the event of a cloud data center failure. Applications obtained from the cloud (SaaS) are already provided by the corresponding provider with high availability.
In this context, however, it is important to emphasize that the high availability provided by a cloud provider for virtualized systems (IaaS) only works up to the defined system limit. In the case of a virtualized server, this limit is the upper edge of the operating system: everything up to and including the operating system is covered, everything above it is not. The installed business application must therefore bring its own suitable replication mechanisms and ensure availability itself in the event of a fault.
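Why the application layer matters can be shown with a short calculation. The sketch below treats the layers of a service as a chain in which every layer must work, and assumes that the layers fail independently, so that the overall availability is the product of the individual values. The layer names and percentages are purely illustrative and are not figures from any provider.

```python
from math import prod
from typing import Mapping


def chain_availability(layers: Mapping[str, float]) -> float:
    """Availability of a stack in which every layer is required (independent failures)."""
    if not layers:
        raise ValueError("at least one layer is required")
    for name, value in layers.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"availability of layer {name!r} must lie between 0 and 1")
    return prod(layers.values())


if __name__ == "__main__":
    stack = {
        "cloud infrastructure up to the operating system": 0.9999,  # assumed value
        "business application without its own replication": 0.99,   # assumed value
    }
    print(f"{chain_availability(stack):.6f}")
```

In this model the overall result can never be better than the weakest layer, so a highly available infrastructure does not compensate for an application that cannot fail over on its own.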
Take-Away: What is the Cloud?
The cloud or cloud computing means that data and programs are stored and accessed via the Internet and not on the hard disk of a computer.
Another option for using cloud services for data protection and failover security is "backup-to-the-cloud" or "cloud-to-cloud backup". In the case of the former, a corresponding service at a cloud provider is used instead of or as a supplement to a local backup infrastructure. This solution can be particularly useful for environments that do not have a second IT location, as the backup data is physically available at a different or additional location. Cloud-to-cloud backup allows data hosted in one cloud to be backed up to another cloud provider. From a business continuity management (BCM) perspective, this protects the company that owns the data against the insolvency of the primary cloud provider, among other things.
However, this solution is primarily of interest to companies and organizations that no longer have their own IT infrastructure. In the case of a hybrid cloud solution, replication of cloud data to the local environment is also an option, for example for Microsoft Office 365 and other SaaS solutions.
In addition to potential compliance, cyber security and privacy issues, a backup or archiving solution in the cloud also requires a look at the future costs of a possible restore. These can be disproportionately high, especially for archive solutions.
Comparison of backup and archive solutions
The table opposite compares backup and archive solutions operated on-premises and in a cloud with the various operating models (on-premises, IaaS, PaaS and SaaS) and documents the motivation for the corresponding solution.
Take-away: Cloud as backup and/or archiving location
The use of a cloud as a DR or backup site is particularly interesting for companies that require a second or third site for reasons of redundancy and availability.
For companies that have decided on a "hybrid cloud" strategy, a "cloud-first" approach is often chosen for life-cycle replacements and new services, where the primary focus is on the cloud. This involves moving on-premises services to the cloud in the long term while reducing the in-house infrastructure, which may then serve as a secondary DR site and backup location.
Various cloud providers also offer cost calculators, which can be used to estimate the expected operating costs in advance. While compute power is available relatively cheaply, managed storage services and data access in the HA/DR area are often the primary cost drivers. Another cost driver is the desired redundancy of data, which can be replicated across different sites depending on the service level.
In any case, it is recommended that future-proof HA, DR, BCM, backup and archiving approaches be anchored from the outset both at the strategic level (IT and cloud strategy) and in the corresponding architectures and solution concepts. The selection of one or more cloud providers as strategic partners depends not only on technological aspects but also on legal, regulatory and operational factors; these must be taken into account and the corresponding guard rails defined. With its experience and expertise, the Eraneos Group is a competent partner on your "Journey-to-the-Cloud". Eraneos Group's experts will be happy to support your company in asking the right questions and finding solutions.
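How these cost drivers interact can be sketched in a few lines of Python. The model below is deliberately simple and only considers stored volume, number of replicated copies and restored volume; the class, the function and all rates are hypothetical placeholders and would have to be replaced with figures from the respective provider's own calculator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageRates:
    """Hypothetical price points per gigabyte; not taken from any real price list."""
    storage_per_gb_month: float
    retrieval_per_gb: float


def monthly_cost(data_gb: float, copies: int, restored_gb: float, rates: StorageRates) -> float:
    """Estimate one month of backup cost from storage, redundancy and restores."""
    if copies < 1:
        raise ValueError("at least one copy of the data is required")
    storage = data_gb * copies * rates.storage_per_gb_month  # grows with redundancy
    retrieval = restored_gb * rates.retrieval_per_gb          # only due when data is read back
    return storage + retrieval


if __name__ == "__main__":
    rates = StorageRates(storage_per_gb_month=0.02, retrieval_per_gb=0.05)  # made-up rates
    print(monthly_cost(data_gb=1000, copies=3, restored_gb=200, rates=rates))
```

Even in this simplified form, the storage share scales linearly with the number of copies, and a large restore adds a one-off retrieval charge on top, which is the effect described for archive solutions.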
We will be happy to report in more detail on the need as well as possible scenarios for data backup from cloud services to on-premises backup solutions in another article.