this post was submitted on 23 Jun 2023
Experienced Devs
Answering my own question: My systems do zero-downtime deployment. Some of my services are managed using ECS and some using custom deployment scripts.
It's interesting that people mostly focus on the mechanics of launching the new code. To me, the interesting thing about zero-downtime deployment is what happens while the release is in progress, when there will be a mix of the old and new code versions accessing the same resources (databases, microservices, etc.) at the same time.
For example, you can't just drop a previously mandatory column from a SQL database. Even if your new release no longer references the column, the new code will break if you deploy it before updating the database (its inserts omit a column that is still required), and the old code will break if you update the database before deploying (it still references the dropped column). There are ways to handle this, typically by rolling out the change in small backward-compatible steps, but they're extra work and easy to get wrong even if you're using ECS to launch the code. Whereas if you're allowed to take downtime, you can do it all in one step without worrying about mixed-version environments.
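To make the column example concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The `users` table, the `legacy_code` column, and the three schema steps are hypothetical names chosen for illustration, not anything from my actual systems. It checks which code version can still insert a row at each step of the rollout.

```python
import sqlite3

# Hypothetical schema states during the rollout (names are illustrative only).
SCHEMAS = {
    # Starting point: legacy_code is mandatory.
    "original": "CREATE TABLE users (id INTEGER PRIMARY KEY, "
                "name TEXT NOT NULL, legacy_code TEXT NOT NULL)",
    # Intermediate step: the column still exists but is now optional.
    "relaxed": "CREATE TABLE users (id INTEGER PRIMARY KEY, "
               "name TEXT NOT NULL, legacy_code TEXT)",
    # Final step: the column is gone.
    "dropped": "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)",
}

# What each code version writes (again, hypothetical statements).
OLD_INSERT = "INSERT INTO users (name, legacy_code) VALUES ('alice', 'A1')"
NEW_INSERT = "INSERT INTO users (name) VALUES ('alice')"


def works(schema_sql, statement):
    """Return True if the statement succeeds against a fresh database with this schema."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(schema_sql)
        conn.execute(statement)
        return True
    except sqlite3.Error:
        # Covers both the NOT NULL violation and the missing-column error.
        return False
    finally:
        conn.close()


# For each schema step: (old code works, new code works).
matrix = {
    step: (works(sql, OLD_INSERT), works(sql, NEW_INSERT))
    for step, sql in SCHEMAS.items()
}

for step, (old_ok, new_ok) in matrix.items():
    print(f"{step:9} old code ok: {old_ok}  new code ok: {new_ok}")
```

Only the intermediate "relaxed" schema works for both versions, which suggests the safe order: relax the constraint, deploy the new code everywhere, and only then drop the column.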
With downtime you don't need to worry about mixed-version environments, but you do need to worry about whether you can roll back your changes without losing data. It's not as hard, but it tends to get overlooked when there haven't been any bad deployments lately.
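As a rough illustration of a rollback that doesn't lose data, here is a sketch with Python's `sqlite3`. The `users` table, the `legacy_code` column, the backup table, and both function names are made up for the example. It assumes one possible approach: copy the column's data aside before the destructive change so the rollback can restore it.

```python
import sqlite3


def migrate_drop_column(conn):
    """Forward migration: remove legacy_code, keeping a backup for rollback."""
    conn.execute(
        "CREATE TABLE users_legacy_backup AS SELECT id, legacy_code FROM users"
    )
    # Rebuild the table without the column instead of relying on DROP COLUMN.
    conn.execute("CREATE TABLE users_new (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    conn.execute("INSERT INTO users_new SELECT id, name FROM users")
    conn.execute("DROP TABLE users")
    conn.execute("ALTER TABLE users_new RENAME TO users")


def rollback_drop_column(conn):
    """Rollback: re-add the column and restore the saved values."""
    conn.execute("ALTER TABLE users ADD COLUMN legacy_code TEXT")
    conn.execute(
        "UPDATE users SET legacy_code = ("
        "SELECT b.legacy_code FROM users_legacy_backup b WHERE b.id = users.id)"
    )
    conn.execute("DROP TABLE users_legacy_backup")


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, "
    "name TEXT NOT NULL, legacy_code TEXT NOT NULL)"
)
conn.execute("INSERT INTO users (name, legacy_code) VALUES ('alice', 'A1')")

migrate_drop_column(conn)
rollback_drop_column(conn)

restored = conn.execute("SELECT name, legacy_code FROM users").fetchall()
print(restored)
```

Note the limits of this sketch: rows written while the column was gone come back with a NULL `legacy_code`, and the restored column is no longer NOT NULL, so the old code has to tolerate that or the rollback needs a backfill step.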
On the flip side, if something goes wrong and your service is backward compatible, you can roll back without causing further issues. If you allow downtime and backward-incompatible changes, a rollback can cause even more problems, resulting in far longer outages and lots of very stressed programmers.
You should always be able to roll back code changes. And zero-downtime deployments are not that hard to do if you are already enforcing that.