Mail Server Outage
On 12 August 2022, my mail server suffered a service disruption caused by human error. Here is the timeline.
Time | Log |
---|---|
09:57 | Started routine maintenance work |
10:19 | Took a backup before starting updates |
10:27 | Started updates |
10:54 | Initial service disruption |
11:11 | Rolled back to the previous version |
11:29 | Service is live again |
The disruption lasted 35 minutes (10:54 to 11:29), and service was restored with zero data loss.
Now I want to learn about zero-downtime deployments. If you know of any good learning resources on the topic, please share them with me on Twitter or by email at [email protected]
Cause and fix
After the upgrade finished, I noticed some odd behavior: I was unable to send or receive emails. I dug into the logs and found that I had broken the Postfix configuration while upgrading packages on the server. Restoring the previous version’s configuration fixed the problem.
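The backup, upgrade, and rollback steps could be automated so that a broken configuration is restored without manual digging. Below is a minimal sketch of that idea, not the procedure I actually ran. The function names `upgrade_with_rollback` and `postfix_config_ok` are hypothetical, and so is the choice of `postfix check` as the validation step: it catches some problems (file ownership, permissions, missing directories) but is not a full test that mail flows.

```python
import shutil
import subprocess
from pathlib import Path
from typing import Callable


def postfix_config_ok() -> bool:
    """Hypothetical validator: True if `postfix check` exits with status 0.

    Assumes the `postfix` command is on PATH; returns False if it is missing.
    """
    try:
        result = subprocess.run(
            ["postfix", "check"], capture_output=True, text=True
        )
    except FileNotFoundError:
        return False
    return result.returncode == 0


def upgrade_with_rollback(
    config_dir: Path,
    backup_dir: Path,
    upgrade: Callable[[], None],
    validate: Callable[[], bool],
) -> bool:
    """Back up config_dir, run upgrade, and restore the backup if validation fails.

    Returns True if the upgraded configuration was kept, False if it was rolled back.
    """
    # Take a fresh copy of the configuration before touching anything.
    if backup_dir.exists():
        shutil.rmtree(backup_dir)
    shutil.copytree(config_dir, backup_dir)

    upgrade()

    if validate():
        return True

    # Validation failed: put the previous configuration back.
    shutil.rmtree(config_dir)
    shutil.copytree(backup_dir, config_dir)
    return False
```

In practice `upgrade` would wrap the package upgrade, `validate` could be `postfix_config_ok` or a stricter test such as sending a probe message, and Postfix would still need a reload after a rollback.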
What I learned from this
From this incident, I learned how important it is to have a good backup and monitoring plan. Fortunately, I had been maintaining a regular backup and maintenance schedule well before anything went wrong. That one decision, made at the start of the mail server project, is what saved me.
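On the monitoring side, even a very small probe can shorten the time between a failure and noticing it. The sketch below is one assumed approach, not a description of my actual setup: a hypothetical `smtp_is_up` function that opens a TCP connection and checks for the `220` greeting an SMTP server sends when it is ready. The default port 25 and the timeout are assumptions too.

```python
import socket


def smtp_is_up(host: str, port: int = 25, timeout: float = 5.0) -> bool:
    """Return True if the server answers with an SMTP 220 greeting."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.settimeout(timeout)
            banner = conn.recv(512)
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False
    return banner.startswith(b"220")
```

A check like this only shows that the server accepts connections. It would not have caught every kind of broken configuration, so it works best alongside an end-to-end test that actually sends and receives a message.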
* This post is licensed under CC BY-SA 4.0