In this section, you will accomplish the following steps:
1. Build the checkers v2 software.
2. Build the checkers v1.1 software.
3. Build the checkers v1 software.
4. Build Cosmovisor and set it up for two consecutive known upgrades on all nodes.
5. Launch everything.
6. Create the first upgrade governance proposal, v1tov1_1.
7. Have the first proposal pass.
8. Repeat with the second upgrade governance proposal, v1_1tov2.
9. Observe the first migration take place.
10. Have the second proposal pass.
11. Observe the second migration take place.
12. Stop everything and start it again safely in v2.
In a real production situation, node operators would wait for a named upgrade governance proposal to be on the ballot (if not wait for it to be approved) before they went to the trouble of setting up Cosmovisor; therefore point 6 would happen some time before points 2 and 4. However, because we know that the proposal will go through, we will use the above order in the interest of time.
Also, both proposals will be created before the first upgrade, and the second proposal will be passed after the first upgrade. This ordering serves the exercise: it lets you see the second proposal cross the first upgrade while still pending.
Your genesis elements should still be in v1 as they were created in the run-prod branch. You can confirm this by verifying that there are no player infos and no leaderboards in the checkers genesis store.
As you did in the migration section, you need to reduce the voting period from 2 days to 10 minutes to make the exercise bearable:
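A minimal sketch of that change, assuming jq is available and that the genesis file stores the period under app_state.gov.voting_params.voting_period, as Cosmos SDK v0.45 does. The function name set_voting_period and the path in the usage comment are hypothetical, so adapt them to where your genesis file actually lives:

```shell
# Sketch: set the governance voting period in a genesis file.
# set_voting_period is a hypothetical helper, not part of checkersd.
set_voting_period() {
    genesis="$1"
    period="$2"
    tmp="$(mktemp)"
    # Rewrite only the voting_period field and leave everything else untouched.
    jq --arg p "$period" \
        '.app_state.gov.voting_params.voting_period = $p' \
        "$genesis" > "$tmp" && mv "$tmp" "$genesis"
}

# Usage, with a hypothetical path; 10 minutes is 600 seconds:
# set_voting_period path/to/genesis.json "600s"
```

The change would have to be applied to the genesis used by every node, since they must all agree on it.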
The names of the upgrade proposals will be v1tov1_1 and v1_1tov2. The names are important, as they are defined in the code, and Cosmovisor uses them to determine which executable to run.
Because the project in its current state uses Cosmos SDK v0.45.4, to avoid any surprise you will prepare Cosmovisor at the v0.45.4 version too.
You can describe the steps in a new Dockerfile prod-sim/Dockerfile-cosmovisor-alpine, described here logically (before a recap lower down):
You need to build Cosmovisor from its code:
FROM --platform=linux golang:1.18.7-alpine AS builder
ENV COSMOS_VERSION=v0.45.4
RUN apk update
RUN apk add make git
WORKDIR /root
RUN git clone --depth 1 --branch ${COSMOS_VERSION} https://github.com/cosmos/cosmos-sdk.git
WORKDIR /root/cosmos-sdk/cosmovisor
RUN make cosmovisor
FROM --platform=linux alpine
ENV LOCAL=/usr/local
COPY --from=builder /root/cosmos-sdk/cosmovisor/cosmovisor ${LOCAL}/bin/cosmovisor
Cosmovisor is instructed via environment variables. In the eventual containers, the /root/.checkers folder comes from a volume mount, so to avoid any conflict it is better not to put the cosmovisor folder directly inside it. Instead pick /root/.checkers-upgrade:
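The folder layout this implies can be sketched in shell, as an illustration rather than the Dockerfile's actual instructions. DAEMON_NAME and DAEMON_HOME are Cosmovisor's own variables; ROOT is a hypothetical prefix introduced only so that the sketch can run outside a container:

```shell
# Sketch of the Cosmovisor layout for two known upgrades.
# ROOT is a hypothetical scratch prefix; in the image it would be empty.
ROOT="$(mktemp -d)"
export DAEMON_NAME=checkersd
export DAEMON_HOME="$ROOT/root/.checkers-upgrade"

# The version started first, then one folder per upgrade name.
mkdir -p "$DAEMON_HOME/cosmovisor/genesis/bin"
for name in v1tov1_1 v1_1tov2; do
    mkdir -p "$DAEMON_HOME/cosmovisor/upgrades/$name/bin"
done

# Each bin folder is expected to hold an executable named $DAEMON_NAME:
#   genesis/bin/checkersd            <- v1
#   upgrades/v1tov1_1/bin/checkersd  <- v1.1
#   upgrades/v1_1tov2/bin/checkersd  <- v2
find "$DAEMON_HOME/cosmovisor" -type d | sort
```

The folder names under upgrades must match the upgrade names exactly, because that is how Cosmovisor picks the executable to switch to.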
With the executables and the blockchain elements ready, you can now define the production setup. You already defined one in the previous run checkers in prod section. In this new setup, the only things that change are the Docker images you call: cosmovisor_i instead of checkersd_i. Even the start command does not need to change.
To avoid rewriting everything, you can declare a Docker Compose extension in a new file prod-sim/docker-compose-cosmovisor.yml. Each checkersd type of service is extended, and in the end is replaced, with a new image:
docker-compose-cosmovisor.yml's val-alice extends docker-compose.yml's val-alice, while keeping the same name. In effect this overwrites val-alice, instead of starting another validator working on the same shared prod-sim/val-alice folder.
The first upgrade is scheduled to run 15 minutes from now (i.e. 180 blocks, at about 5 seconds per block).
The second upgrade is scheduled to run 25 minutes from now (i.e. 300 blocks).
Remember that both proposals will have a voting period of 10 minutes, with the second one straddling the first upgrade:
At t=0, the first proposal is in its voting period, for an upgrade 15 minutes later (t=+15 min).
At t=+10 min, the first proposal should pass, and at about the same time, you create the second proposal (it does not matter if it is a bit before or a bit after) for an upgrade 15 minutes later (t=+25 min).
At t=+15 min, the first upgrade happens automatically thanks to Cosmovisor. You will now be running v1.1.
At t=+20 min, the second proposal should pass.
At t=+25 min, the second upgrade happens. You will now be running v2.
Find the current block height with:
$ docker run --rm -it \
--network checkers-prod_net-public \
checkersd_i:v1-alpine status \
--node "tcp://node-carol:26657" \
| jq -r ".SyncInfo.latest_block_height"
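From that height, both upgrade heights follow by simple arithmetic. The sketch below assumes a block time of 5 seconds, which is what 15 minutes for 180 blocks implies, and counts both offsets from the height at t=0. CURRENT_HEIGHT is a hypothetical value; replace it with the one returned by the status command:

```shell
# Sketch: derive both upgrade heights from the current height.
CURRENT_HEIGHT=1000   # hypothetical, take it from the status command
BLOCK_SECONDS=5       # assumption implied by 15 minutes = 180 blocks

# Number of blocks produced in the given number of minutes.
blocks_in_minutes() {
    echo $(( $1 * 60 / BLOCK_SECONDS ))
}

FIRST_UPGRADE_HEIGHT=$(( CURRENT_HEIGHT + $(blocks_in_minutes 15) ))
SECOND_UPGRADE_HEIGHT=$(( CURRENT_HEIGHT + $(blocks_in_minutes 25) ))

echo "v1tov1_1 at height $FIRST_UPGRADE_HEIGHT"
echo "v1_1tov2 at height $SECOND_UPGRADE_HEIGHT"
```

These are the heights you would pass as the upgrade height of each proposal. Actual block times vary, so the upgrades may land slightly before or after the 15 and 25 minute marks.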
Observe that it changes to PASSED after the first upgrade. After that, if you are scanning the logs of one of the containers (for instance from Docker's GUI) you should see something like:
ERR UPGRADE "v1_1tov2" NEEDED at height: 1300:
INF starting node with ABCI Tendermint in-process
That was v1.1's last message, followed by v2's first message. You can confirm that the leaderboard has been populated:
That is, it will return this only until the containers are stopped and deleted.
Remember that the containers are loaded from a Docker image configured with Cosmovisor. In the current configuration, Cosmovisor starts with what it finds at genesis/bin/checkersd, i.e. v1.
All this is to say that you should not expect it to work if you stop and start your Cosmovisor Compose setup as is.
If you were using real production servers, Cosmovisor's symbolic link would not reset itself on restart, so you would be safe in this regard. You would have time to revisit your server's configuration so as to launch checkersd v2 natively.
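To make the difference concrete, here is a small simulation of that link. It does not run Cosmovisor. It only mimics, in a scratch folder, the current link that Cosmovisor keeps inside its cosmovisor folder to track the active version. COSMOVISOR_DIR is a hypothetical location:

```shell
# Simulation only: mimics the link Cosmovisor maintains, without running it.
COSMOVISOR_DIR="$(mktemp -d)/cosmovisor"
mkdir -p "$COSMOVISOR_DIR/genesis/bin" "$COSMOVISOR_DIR/upgrades/v1_1tov2/bin"

# A container recreated from the image starts again from genesis, i.e. v1.
ln -sfn "$COSMOVISOR_DIR/genesis" "$COSMOVISOR_DIR/current"
echo "fresh image:   $(readlink "$COSMOVISOR_DIR/current")"

# After the last upgrade, the link points at the v2 folder.
ln -sfn "$COSMOVISOR_DIR/upgrades/v1_1tov2" "$COSMOVISOR_DIR/current"
echo "after upgrade: $(readlink "$COSMOVISOR_DIR/current")"
```

On a real server the second state persists across restarts, whereas a container recreated from the image goes back to the first.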
In this example you can prepare yet another Compose file, this time specifically for v2:
If you want to test other migration configurations, for instance where Carol forgot to put Cosmovisor on her node, you can revert all your blockchain files to v1 with:
$ ./prod-sim/unsafe-reset-state.sh
Another exercise you can attempt is to create a v1tov2 upgrade that does both upgrades in one go. You would have to add this v1tov2 name into the Go code, and make sure it is handled correctly.
To summarize, this section has explored:
How to prepare multi-stage Docker images for different executable versions.
How to prepare Cosmovisor for a simulated production migration.
How to upgrade a blockchain in production, by live migrating from v1 of the blockchain to v1.1 and then v2.
How to launch all that with the help of Docker Compose.
A complete procedure for how to conduct the update via the CLI.