It’s really hard. And really expensive. I used to work in five nine environments, life or death type use cases, and my rule of thumb was that you double your cost for every extra nine you add.
When we got to five nines it was multiple hot standbys with a custom control and orchestration plane - literally custom hardware we had to build. This was for local installations, so not modern cloud environments (it was over a decade ago), but many of the challenges are similar, like session handling, transmission replay and caching, locking, clashing, routing, jitter, latency etc.
Looks like it is very stressful to work when you need this amount of availability, even more with the pressure that a little error can cause giant consequences.