We have tested the upgrade process against v0.8 and later. It can take hours or more, depending on the number of nodes in the cluster and the network speed. During the upgrade, running jobs will fail. They are retried automatically after the upgrade is done.
Table of Contents
- Prepare
- Stop Services and Backup Data
- Destroy Kubernetes Cluster
- Install Kubernetes Cluster
- Run Migration Scripts And Start Services
- It's Done
All the commands in this document are executed in the dev-box. You need to prepare a v0.10 dev-box first. Run the following commands to create one and work in it:
# create dev-box
sudo docker run -itd \
-e COLUMNS=$COLUMNS -e LINES=$LINES -e TERM=$TERM \
-v /var/run/docker.sock:/var/run/docker.sock \
-v /pathConfiguration:/cluster-configuration \
--pid=host \
--privileged=true \
--net=host \
--name=dev-box \
docker.io/openpai/dev-box:v0.10.1
# Working in your dev-box
sudo docker exec -it dev-box /bin/bash
cd /pai
You can check the cluster version in the cluster configuration file services-configuration.yaml.
It looks like:
cluster:
  ...
  docker-registry:
    namespace: openpai
    domain: docker.io
    tag: v0.8.3 # It's your cluster version
  ...
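The tag can also be read with standard text tools instead of opening the file. The following is a minimal sketch, assuming the `tag:` key is written as in the excerpt above (unquoted, one per file); the function name `pai_cluster_version` is hypothetical and not part of PAI.

```shell
# Hypothetical helper (not part of PAI): print the value of the first
# "tag:" key found in a services-configuration.yaml file.
# Assumes the line looks like the excerpt: "tag: v0.8.3 # comment".
pai_cluster_version() {
    # Keep only lines whose key is exactly "tag", capture the value up to
    # the first whitespace or "#", and print the first match.
    sed -n 's/^[[:space:]]*tag:[[:space:]]*\([^#[:space:]]*\).*$/\1/p' "$1" | head -n 1
}
```

Call it with the path to your own configuration file, for example `pai_cluster_version path_to_backup_your_config/services-configuration.yaml`.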
If your cluster is v0.9 or later, you can fetch the configuration from the cluster via paictl:
./paictl.py config pull -o path_to_backup_your_config
There should be four files under path_to_backup_your_config:
- layout.yaml (or cluster-configuration.yaml in version 0.8/0.9)
- k8s-role-definition.yaml
- kubernetes-configuration.yaml
- services-configuration.yaml
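Before converting the configuration, it may help to confirm that all four files are present. This is a minimal sketch built only from the file names listed above; the function name `check_pai_config_dir` is hypothetical and not part of PAI.

```shell
# Hypothetical helper (not part of PAI): report which of the expected
# configuration files are missing from a directory.
# Prints one "missing: <name>" line per absent file and returns
# non-zero if anything is missing.
check_pai_config_dir() {
    _dir="$1"
    _status=0
    # layout.yaml is named cluster-configuration.yaml in v0.8/v0.9.
    if [ ! -f "$_dir/layout.yaml" ] && [ ! -f "$_dir/cluster-configuration.yaml" ]; then
        echo "missing: layout.yaml (or cluster-configuration.yaml)"
        _status=1
    fi
    for _name in k8s-role-definition.yaml kubernetes-configuration.yaml services-configuration.yaml; do
        if [ ! -f "$_dir/$_name" ]; then
            echo "missing: $_name"
            _status=1
        fi
    done
    return "$_status"
}
```

For example, `check_pai_config_dir path_to_backup_your_config` prints nothing and returns 0 when the directory is complete.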
PAI provides a script to convert an old-style configuration to the v0.10 format.
Usage:
./deployment/tools/configMigration.py path_to_backup_your_config path_to_output_new_style_config
Then you can customize the generated configuration under the directory path_to_output_new_style_config as needed.
PAI provides a check command for validating the configuration. Usage:
./paictl.py check -p path_to_output_new_style_config
Note: the configuration pushed to the cluster won't take effect until the PAI services are restarted. Push it with the following command:
./paictl.py config push -p path_to_output_new_style_config
Stop all PAI services:
./paictl.py service stop
PAI is now down, and you won't be able to access the PAI dashboard.
Data won't be lost during the upgrade, so the backup is optional but recommended.
Now log in to the master node and back up the data for etcd, ZooKeeper, and so on. Back up the following directories:
- PAI common data path: check services-configuration.yaml for the config cluster.common.data-path. Please don't change it unless you know exactly what you are doing.
- Etcd data path: check kubernetes-configuration.yaml for the config kubernetes.etcd-data-path.
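One possible way to back up each of the directories above is to archive it with `tar`. This is only a sketch: the function name `backup_dir` and the archive naming scheme are hypothetical, and you must substitute the real data paths from your own configuration files. Reading these directories may require root privileges.

```shell
# Hypothetical helper (not part of PAI): archive one data directory
# into a timestamped .tar.gz under a backup directory and print the
# path of the archive that was written.
backup_dir() {
    _src="$1"     # directory to back up, e.g. the etcd data path
    _dest="$2"    # directory that receives the archive
    mkdir -p "$_dest" || return 1
    _archive="$_dest/$(basename "$_src")-$(date +%Y%m%d%H%M%S).tar.gz"
    # -C makes the paths stored in the archive relative to the parent of $_src.
    tar -czf "$_archive" -C "$(dirname "$_src")" "$(basename "$_src")" || return 1
    echo "$_archive"
}
```

For example, `backup_dir <your etcd data path> <your backup directory>` writes one archive per call. Run it once for each directory in the list.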
The Kubernetes cluster will be reinstalled with the new configuration, so destroy it first:
./paictl.py cluster k8s-clean -p path_to_output_new_style_config
Now the Kubernetes cluster is down.
Install the Kubernetes cluster:
./paictl.py cluster k8s-bootup -p path_to_output_new_style_config
Now the Kubernetes cluster is up, and you can check the Kubernetes dashboard.
While the services start up, the migration scripts are called automatically:
./paictl.py service start
PAI is now up, and you can visit the PAI dashboard.
You now have release v0.10 installed. Please check the release notes for new features.