In this post I'll show you how to implement a backup solution for a database running as a container in OpenShift. I'll use AWS S3 as the storage for my backups and a PostgreSQL database as an example.
First, I've created the following cron job in my OpenShift project, which is executed daily at 2 AM:
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: landmarks-db-backup
spec:
  schedule: "0 2 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          activeDeadlineSeconds: 1800
          restartPolicy: OnFailure
          containers:
          - name: backup-maker
            image: quay.io/jstakun/pg-backup:0.2
            command:
            - /bin/bash
            - /opt/backup/do_backup.sh
            envFrom:
            - configMapRef:
                name: pg-backup-conf
            env:
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: landmarksdb
                  key: database-password
            - name: POSTGRES_USER
              valueFrom:
                secretKeyRef:
                  name: landmarksdb
                  key: database-user
            - name: AWS_ACCESS_KEY_ID
              valueFrom:
                secretKeyRef:
                  name: aws-creds
                  key: AWS_ACCESS_KEY_ID
            - name: AWS_SECRET_ACCESS_KEY
              valueFrom:
                secretKeyRef:
                  name: aws-creds
                  key: AWS_SECRET_ACCESS_KEY
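You can create the cron job with oc apply and, on recent OpenShift versions, trigger a one-off run to test it without waiting for the schedule. The file and job names below are just examples:
$ oc apply -f landmarks-db-backup-cronjob.yaml
$ oc create job --from=cronjob/landmarks-db-backup manual-backup-test
$ oc logs -f job/manual-backup-test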
This cron job references the aws-creds secret, where I store my AWS S3 access keys, the landmarksdb secret, where I store my PostgreSQL database credentials, and the pg-backup-conf config map, where I store the other required environment variables:
apiVersion: v1
kind: Secret
metadata:
  name: aws-creds
type: Opaque
data:
  AWS_ACCESS_KEY_ID: ...
  AWS_SECRET_ACCESS_KEY: ...
---
apiVersion: v1
kind: Secret
metadata:
  name: landmarksdb
type: Opaque
data:
  database-password: ...
  database-user: ...
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: pg-backup-conf
data:
  AWS_DEFAULT_REGION: us-east-1
  DB_BACKUP_PASSWORD: ...
  POSTGRES_DB: landmarksdb
  POSTGRES_HOST: landmarksdb
  S3_BACKUP_PATH: s3://my-backups/landmarksdb
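Note that the values under data in a Secret must be base64 encoded. If you prefer, you can create both secrets from literals and let OpenShift handle the encoding:
$ oc create secret generic aws-creds \
    --from-literal=AWS_ACCESS_KEY_ID=... \
    --from-literal=AWS_SECRET_ACCESS_KEY=...
$ oc create secret generic landmarksdb \
    --from-literal=database-user=... \
    --from-literal=database-password=...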
My cron job uses the container image quay.io/jstakun/pg-backup:0.2, which contains all the tools I need to execute the backup and store it in an AWS S3 bucket: pg_dump, bzip2, mcrypt and the AWS CLI. Here is the pg-backup image definition:
FROM registry.access.redhat.com/rhscl/postgresql-96-rhel7

ENV BACKUP_HOME=/opt/backup

WORKDIR $BACKUP_HOME

USER root

RUN yum install -y https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm && \
    yum install -y mcrypt bzip2 && \
    curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip" && \
    unzip awscliv2.zip && \
    ./aws/install && \
    rm awscliv2.zip && \
    rm -rf ./aws && \
    yum clean all && \
    rm -rf /var/cache/yum && \
    mkdir -p $BACKUP_HOME && \
    adduser -g 0 -c "Backup user" -p backup backup

COPY ./do_backup.sh $BACKUP_HOME

RUN chown -R backup:root $BACKUP_HOME && \
    chmod +x $BACKUP_HOME/do_backup.sh && \
    chmod g+rw $BACKUP_HOME

USER backup
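If you'd like to build your own version of this image, a standard build and push should work (replace the repository with your own; podman can be swapped for docker):
$ podman build -t quay.io/jstakun/pg-backup:0.2 .
$ podman push quay.io/jstakun/pg-backup:0.2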
Here you can find the latest pg-backup container image definition. This container image definition references the following do_backup.sh bash script:
#!/bin/bash
cd /opt/backup
# Name the dump file with a timestamp so consecutive backups don't overwrite each other
export DUMP_FILE=pg-backup_$(date +%Y%m%d_%H%M%S).sql
# pg_dump reads the password from the PGPASSWORD environment variable
export PGPASSWORD=$POSTGRES_PASSWORD
echo '--- running pg_dump ---'
pg_dump -v -d $POSTGRES_DB -U $POSTGRES_USER -h $POSTGRES_HOST --encoding=UTF-8 > $DUMP_FILE
echo '--- running bzip2 ---'
# Compress the dump in place, producing $DUMP_FILE.bz2
bzip2 $DUMP_FILE
echo '--- running mcrypt ---'
# Encrypt the compressed dump, producing $DUMP_FILE.bz2.nc
mcrypt ${DUMP_FILE}.bz2 -k $DB_BACKUP_PASSWORD
echo '--- running aws s3 ---'
# Upload the encrypted archive to the S3 bucket
aws s3 cp ${DUMP_FILE}.bz2.nc $S3_BACKUP_PATH/$DUMP_FILE.bz2.nc
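To test the script without waiting for the schedule, you can run the image directly with the required environment variables set, for example with podman from a machine that can reach the database (the host name below is hypothetical and the values are placeholders):
$ podman run --rm \
    -e POSTGRES_HOST=landmarksdb.example.com \
    -e POSTGRES_DB=landmarksdb \
    -e POSTGRES_USER=... \
    -e POSTGRES_PASSWORD=... \
    -e DB_BACKUP_PASSWORD=... \
    -e AWS_ACCESS_KEY_ID=... \
    -e AWS_SECRET_ACCESS_KEY=... \
    -e AWS_DEFAULT_REGION=us-east-1 \
    -e S3_BACKUP_PATH=s3://my-backups/landmarksdb \
    quay.io/jstakun/pg-backup:0.2 /bin/bash /opt/backup/do_backup.sh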
Here you can find the latest backup script definition, including AWS SES integration for sending email notifications.
In order to copy the backup file to an AWS S3 bucket, you'll need to create an AWS IAM user and attach an appropriate policy to it:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:AbortMultipartUpload",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::my-backups",
        "arn:aws:s3:::my-backups/*"
      ]
    }
  ]
}
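Assuming you save this policy to a local JSON file, you can create a dedicated IAM user, attach the policy inline and generate access keys with the AWS CLI (the user and file names here are just examples):
$ aws iam create-user --user-name pg-backup
$ aws iam put-user-policy --user-name pg-backup \
    --policy-name s3-backup --policy-document file://s3-backup-policy.json
$ aws iam create-access-key --user-name pg-backup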
In addition, the optional AWS SES integration for sending email notifications requires the following policy:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": [
        "ses:SendEmail",
        "ses:SendRawEmail"
      ],
      "Resource": "arn:aws:ses:us-east-1:096851829459:identity/admin@mydomain.com"
    }
  ]
}
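You can verify the SES permissions with a quick test email from the CLI (the sender identity must be verified in SES first; the addresses are placeholders):
$ aws ses send-email \
    --from admin@mydomain.com \
    --destination ToAddresses=admin@mydomain.com \
    --message 'Subject={Data=SES test},Body={Text={Data=It works}}'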
That's it! Now you can verify that backups are stored in your AWS S3 bucket daily.
In order to restore the database, you can follow this procedure from your workstation:
1. Download the backup file
$ mkdir /data/install/landmarks && cd /data/install/landmarks
$ aws configure --profile=s3
$ S3_BACKUP_PATH=s3://my-backups/landmarksdb
$ DUMP_FILE=pg-backup_20201203_141724.sql.bz2.nc
$ aws s3 cp --profile=s3 $S3_BACKUP_PATH/$DUMP_FILE $DUMP_FILE
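If you don't remember the exact backup file name, you can list the bucket contents first:
$ aws s3 ls --profile=s3 $S3_BACKUP_PATH/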
2. Install mcrypt and bzip2. You can check my container image definition above for the commands to install these tools from the EPEL repository.
3. Decrypt and uncompress the backup file
$ mcrypt -d $DUMP_FILE
#passphrase: ...
$ bzip2 -vd pg-backup_20201203_141724.sql.bz2
4. Copy the backup file to the destination database container
$ oc project landmarks
$ DB_POD=$(oc get pods | grep landmarksdb | awk '{print $1}') && echo $DB_POD
$ oc rsync /data/install/landmarks $DB_POD:/var/lib/pgsql/data
5. Import the backup into the database
$ oc rsh $DB_POD
sh-4.2$ psql -d landmarksdb < /var/lib/pgsql/data/landmarks/pg-backup_20201203_141724.sql
sh-4.2$ rm -rf /var/lib/pgsql/data/landmarks/
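Optionally, you can run a quick sanity check inside the pod to confirm the import worked (the table name below is hypothetical):
sh-4.2$ psql -d landmarksdb -c '\dt'
sh-4.2$ psql -d landmarksdb -c 'SELECT COUNT(*) FROM landmarks;'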
In order to back up and restore data on another database management system, you can follow much the same procedure, simply replacing pg_dump with a backup and restore tool suitable for your DBMS.
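For example, for MySQL or MariaDB the pg_dump line in do_backup.sh could be swapped for mysqldump, assuming the equivalent MySQL environment variables are provided to the container (a sketch, not the original script):
# hypothetical MySQL variant of the dump step in do_backup.sh
mysqldump -h $MYSQL_HOST -u $MYSQL_USER -p$MYSQL_PASSWORD $MYSQL_DATABASE > $DUMP_FILE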