Infallible, fully resilient high availability is one of the critical points of any information system architecture, and something everyone dreams of.
Let’s be honest: it doesn’t truly exist.
But several methods still allow us to sleep soundly by reducing the risk of failure.
To reasonably guarantee high availability, let’s review some of the approaches we use at Uh!ive.
Physical redundancy
“Don’t put all your eggs in one basket.”
Managing physical redundancy is one of the most essential steps. Consolidating all services onto a single server or a single geographical location may seem cost-effective, but it is completely incompatible with redundancy.
Our information system is distributed across at least two distinct datacenters more than 100 km apart, while still providing the low latency required for service continuity.
We work with OVH Cloud, a trusted partner for several years, to maintain our availability level.
This setup allows us to operate in active-active mode, and even continue running from a single datacenter in case of an inter-DC link outage or network incident.
OVH Cloud’s vRack technology provides a secure, private interconnection between these networks.
For services such as Elasticsearch, Redis, and RabbitMQ, deploying a third datacenter became necessary: these clustered services rely on quorum, so a majority of nodes must survive the loss of any single site.
We are progressively migrating our services to better distribute load and meet client needs.
Service redundancy
Service redundancy is inherently tied to physical redundancy, as all our services are duplicated across each datacenter.
Choosing virtualization
Except for specific cases requiring high CPU, RAM, or storage capacity, our infrastructure is hosted on virtualization platforms such as VMware and Proxmox.
Automated rules enable seamless migration of virtual machines if a host fails or during software updates.
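On Proxmox, for example, such a rule can be declared through the cluster's HA manager. Below is a minimal sketch using the third-party proxmoxer Python client; the endpoint, credentials, and VM ID are placeholders rather than our actual configuration.

```python
# Sketch: registering a VM as an HA resource on a Proxmox cluster,
# so it is restarted on another host if its current host fails.
from proxmoxer import ProxmoxAPI

proxmox = ProxmoxAPI(
    "pve.example.internal",   # hypothetical cluster endpoint
    user="root@pam",
    password="secret",        # placeholder; prefer API tokens in practice
    verify_ssl=True,
)

# Declare VM 100 as a highly available resource: the cluster's HA manager
# keeps it in the "started" state and relocates it after a host failure.
proxmox.cluster.ha.resources.post(sid="vm:100", state="started")
```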
Doubling proxy servers
Each upstream proxy is duplicated, and each pair shares a virtual IP address (VIP).
This enables us to update servers and services while limiting production downtime.
Each proxy pair is hosted on different datastores and virtualization hosts, providing both physical and software redundancy.
Tools such as keepalived are used to manage virtual IP addressing and failover.
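keepalived elects the VIP holder over the VRRP protocol, and its track_script mechanism can demote a node whose service is unhealthy. Here is a minimal sketch of such a check script, where a non-zero exit code signals failure and lets the VIP fail over; the local health endpoint is hypothetical.

```python
#!/usr/bin/env python3
# Minimal health check for a local reverse proxy.
# keepalived can run this via track_script: exit code 0 means healthy,
# any other code lowers the node's priority so the VIP can fail over.
import sys
import urllib.request

# Hypothetical local health endpoint exposed by the proxy.
HEALTH_URL = "http://127.0.0.1:8080/health"

try:
    with urllib.request.urlopen(HEALTH_URL, timeout=2) as resp:
        sys.exit(0 if resp.status == 200 else 1)
except OSError:
    sys.exit(1)  # proxy unreachable: report failure to keepalived
```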
Data replication
We use multiple database management systems: MariaDB and PostgreSQL. All databases are natively replicated across both datacenters.
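Replication is only useful if it keeps up, so measuring lag matters. The sketch below, assuming PostgreSQL streaming replication and the psycopg2 driver, reports how far a replica has fallen behind; the connection string and the 30-second threshold are illustrative.

```python
# Sketch: measuring replication lag on a PostgreSQL streaming replica.
# The DSN is a placeholder; the threshold is illustrative.
import psycopg2

conn = psycopg2.connect("host=replica.example.internal dbname=app user=monitor")
with conn, conn.cursor() as cur:
    # Seconds elapsed since the last transaction replayed on this replica.
    cur.execute(
        "SELECT EXTRACT(EPOCH FROM now() - pg_last_xact_replay_timestamp())"
    )
    lag_seconds = cur.fetchone()[0]

# pg_last_xact_replay_timestamp() is NULL on a primary, hence the None check.
if lag_seconds is not None and lag_seconds > 30:
    print(f"WARNING: replica is {lag_seconds:.0f}s behind the primary")
```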
Deployment simplification
The ability to deploy servers and applications rapidly is a key enabler of high availability: it allows us to react quickly to sudden load spikes and makes scalability testing easier.
- Deployment of servers using predefined templates
- Deployment of tools using Puppet and Ansible
- Deployment of services using Ansible and Docker
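Deployments driven by these tools can themselves be scripted and wired into CI. As an illustration, here is a minimal sketch that runs an Ansible playbook from Python; the inventory path, playbook name, and host group are hypothetical.

```python
# Sketch: triggering an Ansible deployment from a script.
# Inventory, playbook, and the "web" host group are hypothetical names.
import subprocess
import sys

result = subprocess.run(
    [
        "ansible-playbook",
        "-i", "inventories/production",   # hypothetical inventory path
        "--limit", "web",                 # only target the web proxy group
        "playbooks/deploy_service.yml",   # hypothetical playbook
    ],
    check=False,
)
sys.exit(result.returncode)  # propagate failure to CI or the caller
```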
System backups
Backups are not strictly part of redundancy, but they ensure fast and reliable recovery.
All elements required to rebuild the information system are backed up daily, with higher frequency for some components:
- Databases
- Router configurations
- Configuration files
Testing backups is essential. We chose to test them every six months.
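Part of such a restore test can be automated. The sketch below, assuming a PostgreSQL custom-format dump and the standard client tools, restores it into a throwaway database and runs one sanity query; the database name, dump path, and query are placeholders.

```python
# Sketch: a basic automated restore test for a PostgreSQL dump.
# Database name, dump path, and the sanity query are placeholders.
import subprocess

SCRATCH_DB = "restore_test"
DUMP_FILE = "/backups/app_latest.dump"  # assumed custom-format dump

# Recreate a throwaway database and restore the latest dump into it.
subprocess.run(["dropdb", "--if-exists", SCRATCH_DB], check=True)
subprocess.run(["createdb", SCRATCH_DB], check=True)
subprocess.run(["pg_restore", "-d", SCRATCH_DB, DUMP_FILE], check=True)

# Minimal sanity check: the restored schema actually contains tables.
out = subprocess.run(
    ["psql", "-d", SCRATCH_DB, "-tAc",
     "SELECT count(*) FROM information_schema.tables "
     "WHERE table_schema = 'public'"],
    check=True, capture_output=True, text=True,
)
assert int(out.stdout.strip()) > 0, "restore produced an empty schema"
print("restore test passed")
```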
And the human factor?
Electronic components and network cables may be impressive, but behind every architecture there are humans.
Human-side high availability is equally important in any organization.
To ensure the right level of knowledge, we apply the following practices:
- Organizing work in pairs
- Regular presentations of technical progress
- Writing clear, precise documentation
- Sharing insights from ongoing technology monitoring
- Organizing internal hackathons
Monitoring
Monitoring is not directly related to redundancy, but every system must inform us of any change in state.
Redundancy is not always automatic; some situations require human intervention, especially network traffic redirection.
We also rely on:
- Log timestamping for better traceability
- Log centralization for easier searching
- Multichannel alerts: SMS, internal chat, emails, etc.
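To illustrate what multichannel alerting can look like, here is a sketch that fans one alert out to email and an internal chat webhook using only the Python standard library; the SMTP relay, addresses, and webhook URL are placeholders, and an SMS gateway would be plugged in the same way.

```python
# Sketch: fanning one alert out to several channels.
# SMTP relay, addresses, and the webhook URL are placeholders.
import json
import smtplib
import urllib.request
from email.message import EmailMessage

def alert(subject: str, body: str) -> None:
    # Channel 1: email through an internal relay.
    msg = EmailMessage()
    msg["From"] = "alerts@example.internal"
    msg["To"] = "oncall@example.internal"
    msg["Subject"] = subject
    msg.set_content(body)
    with smtplib.SMTP("smtp.example.internal") as smtp:
        smtp.send_message(msg)

    # Channel 2: internal chat via a generic JSON webhook.
    req = urllib.request.Request(
        "https://chat.example.internal/hooks/alerts",
        data=json.dumps({"text": f"{subject}\n{body}"}).encode(),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req, timeout=5)

alert("VIP failover", "keepalived moved the VIP to the secondary proxy")
```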
Conclusion
There is no perfect solution to ensure flawless high availability, and each architecture comes with its own rules.
A wide range of tools and approaches exist to build an effective methodology. Hardware matters, but the human aspect is essential.
Cross-team collaboration, documentation, procedures, and regular testing are often what truly save an infrastructure.




