Building a resilient architecture comes with challenges. As engineers, we want a system whose user experience does not change when part of it is impacted. This is crucial: a slow-loading form, for example, has a higher bounce rate.
What’s the difference between a high-availability and a fault-tolerant architecture?
In a highly available architecture, if a server goes down, the user might notice a poorer experience but can still complete the journey. Applications keep running in the background but might take longer to complete a job.
Example: you have four servers running and deploy with CodeDeploy one server at a time. The first deployment breaks server 1 and the deployment stops, so there are now 3 active servers instead of 4. Users will notice the impact.
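To make the capacity drop concrete, here is a minimal sketch of a one-at-a-time, in-place rollout that halts on the first failure. It is a toy simulation, not the CodeDeploy API: `rolling_deploy`, `fails_on_server_1`, and the server names are hypothetical, made up for illustration.

```python
from typing import Callable, List


def rolling_deploy(servers: List[str], deploy: Callable[[str], bool]) -> List[str]:
    """Deploy to one server at a time, in place; halt at the first failure.

    Returns the servers that are still healthy when the rollout ends.
    """
    healthy = list(servers)
    for server in servers:
        healthy.remove(server)   # taken out of rotation while it is updated
        if not deploy(server):
            break                # rollout halts; this server stays out of service
        healthy.append(server)   # updated server rejoins the pool
    return healthy


def fails_on_server_1(server: str) -> bool:
    # Hypothetical deploy step: the new build breaks server-1 only.
    return server != "server-1"


fleet = ["server-1", "server-2", "server-3", "server-4"]
remaining = rolling_deploy(fleet, fails_on_server_1)
print(len(remaining), "of", len(fleet), "servers still serving traffic")
```

The service stays up, which is the high-availability part, but it runs on three servers until someone fixes or rolls back server 1.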
In a fault-tolerant architecture, if a server becomes faulty, the user experience is not impacted. This architecture is best suited to banks, health care, or any business where a slight drop in performance has a financial impact.
Example: instead of deploying one server at a time, we spin up another server, deploy the code to it, and only then bring down the previous server. If things go wrong, we still have 4 servers running.
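The replace-style rollout can be sketched the same way. Again this is a toy simulation under assumptions: `replace_deploy`, `broken_build`, and the `-v2` naming are hypothetical, and a real setup would also need health checks and load balancer registration.

```python
from typing import Callable, List, Tuple


def replace_deploy(
    servers: List[str], launch_with_new_code: Callable[[str], bool]
) -> Tuple[List[str], int]:
    """Launch a replacement first; retire the old server only once it is healthy.

    Returns the final fleet and the lowest server count seen during the rollout.
    """
    fleet = list(servers)
    lowest = len(fleet)
    for old in servers:
        new = old + "-v2"
        if not launch_with_new_code(new):
            break                  # bad replacement is discarded; old server keeps serving
        fleet.append(new)          # capacity briefly rises to five
        fleet.remove(old)          # then returns to four
        lowest = min(lowest, len(fleet))
    return fleet, lowest


def broken_build(server: str) -> bool:
    # Hypothetical launch step: the new code fails its health check everywhere.
    return False


fleet_after, lowest = replace_deploy(
    ["server-1", "server-2", "server-3", "server-4"], broken_build
)
print(len(fleet_after), "servers running, minimum during rollout:", lowest)
```

Even when the build is broken, the fleet never drops below four servers. The cost is paying for an extra server while the rollout is in progress.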
Should you always aim for a fault-tolerant architecture?
It depends on many factors. In my opinion, aim for high availability in the first phase and fault tolerance after that.
Let’s say you’re migrating to the cloud. In the first phase, focus on availability; later on, look into fault tolerance.