
GitLab/Backup and Restore

From Wikitech-static
imported>Jelto

Latest revision as of 09:14, 19 August 2021

This section describes the backup configuration and the restore procedure for the GitLab instance.

Backups

Application data is backed up using GitLab's built-in backup functionality. Application data backups are created by calling the /usr/bin/gitlab-backup create command. Configuration backups are created by calling /usr/bin/gitlab-ctl backup-etc. Both commands are run once a day by cron jobs created with Ansible, and each run creates a full backup. To configure the backups, refer to the backup-related variables in Ansible.

GitLab therefore creates two new .tar archives every day:

  • full data backup in {{gitlab_backup_path}}
  • full config backup in /etc/gitlab/config_backup

Partial backups are currently disabled. During the initialization phase, daily full backups are used. In the future we may implement partial and incremental backups.
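
As a rough illustration of the daily job described above, the following sketch wraps the two backup commands in a dry-run helper. The helper names `run` and `daily_backup` and the `DRY_RUN` variable are assumptions made for this example; the actual cron jobs are generated by Ansible and may pass additional options that are not shown here.

```shell
#!/bin/sh
# Hypothetical wrapper around the two backup commands named in this section.
# DRY_RUN defaults to 1 here, so the commands are only printed.

run() {
    # Print the command in dry-run mode, execute it otherwise.
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

daily_backup() {
    run /usr/bin/gitlab-backup create    # full application data backup
    run /usr/bin/gitlab-ctl backup-etc   # full configuration backup
}

daily_backup
```

With the default dry-run setting the script prints the two commands without touching the GitLab installation. Setting DRY_RUN=0 would execute them and needs root privileges.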

Backup retention

Data backups and config backups are deleted after three days on the production instance (see T274463#7147179). Release Engineering wanted three days of local retention for fast troubleshooting and restores. Deletion of the data backups is handled by GitLab (using the gitlab_backup_keep_time variable). Deletion of the config backups is implemented in the backup cronjob (using the gitlab_backup_config_keep_num variable).
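
The count-based retention used for the config backups can be sketched as follows. This is not the code of the actual cronjob, which is not shown on this page; the function name `prune_old_archives` is hypothetical, and the sketch assumes that archive names sort chronologically (for example because they contain a fixed-width epoch timestamp) and contain no whitespace.

```shell
#!/bin/sh
# Sketch: keep only the newest KEEP .tar archives in a directory.
# Assumption: a reverse lexical sort of the file names is newest-first.

prune_old_archives() (
    dir=$1
    keep=$2
    cd "$dir" || exit 1
    # Newest first; every entry after the first $keep entries is deleted.
    # Only names ending in .tar are considered, so a "latest" subdirectory
    # is left alone.
    ls -1 | grep '\.tar$' | sort -r | tail -n +"$((keep + 1))" |
    while read -r name; do
        rm -f -- "$name"
    done
)
```

Calling `prune_old_archives /some/backup/dir 3` would leave the three newest archives in place. The function body runs in a subshell, so the `cd` does not affect the caller.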

Storing backups in Bacula

For enhanced reliability backups are also stored in Bacula. Bacula is the standard for secure, encrypted backup storage in the WMF.

For the initialization phase we decided to back up only the most recent .tar file with the data backup and the most recent .tar file with the configuration backup. These .tar files are shipped to Bacula once a day as a full backup (see backup strategy daily). This backup strategy is not the default used by most services. The following concerns and advantages came up in our discussion when comparing daily full backups with weekly full backups plus daily incremental backups (see T274463 and comments in /puppet/+/697850):

  • Incremental backups of GitLab's self-contained full backups would introduce an artificial technical dependency between revisions without having an actual dependency. To restore a backup Bacula would have to merge and diff all recent incremental backups and combine them with the last full backup. However, the latest backup should be enough to restore GitLab to the previous state.
  • The default backup policy would conflict with Release Engineering's requirement of three days of local backup retention on the GitLab host. This conflict could cause up to three times the disk usage in Bacula compared with a non-default backup policy.
  • Incremental-only backups would solve the problem of additional disk usage but, according to Data Persistence, can't be used long term because of technical limitations of Bacula. A restore from many incremental revisions would take a long time and consume computing resources. It would also introduce a dependency between revisions (see above).

For the reasons above we decided against the default strategy and use Daily Full Backups instead. Implementing this decision required two changes:

"Latest" backup

To implement the strategy of daily full backups, Bacula needs a dedicated folder structure. Bacula must not save all three backups available on the GitLab host; it must only back up the directory with the most recent files. For this purpose we created an additional ./latest directory inside each of the backup directories (using Ansible). Since our goal is to eventually replace the Ansible code with Puppet, we also ensured that the "latest" backup directories exist using Puppet. We did this in two places: the profile class currently used in production (gerrit:700622) and the backup class from the gitlab module currently used only in cloud (gerrit:700595). Ideally, both production and cloud machines will eventually be set up automatically by the same Puppet role, both using the module. The backup scripts on the GitLab machine update the latest.tar file.

/srv/gitlab-backup/
├── 1624752267_2021_06_27_13.11.5_gitlab_backup.tar
├── 1624838667_2021_06_28_13.11.5_gitlab_backup.tar
├── 1624925067_2021_06_29_13.11.5_gitlab_backup.tar
└── latest
    └── latest.tar

Bacula is then configured to use only the latest folder and save the most recent backup. Here is the fileset used in Bacula:

    bacula::director::fileset { 'gitlab':
        includes => [ '/srv/gitlab-backup/latest', '/etc/gitlab/config_backup/latest' ]
    }
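
How a backup script might refresh latest.tar can be sketched as follows. The actual scripts on the GitLab machine are not shown on this page, so this is only an illustration: the function name `update_latest` and the temporary file name are hypothetical, and the sketch assumes that the archive names start with a fixed-width epoch timestamp, as in the directory listing above.

```shell
#!/bin/sh
# Sketch: copy the newest data backup to latest/latest.tar.
# Assumption: a reverse lexical sort of the archive names is newest-first.

update_latest() (
    dir=$1
    newest=$(ls -1 "$dir" | grep '_gitlab_backup\.tar$' | sort -r | head -n 1)
    [ -n "$newest" ] || { echo "no backup archive in $dir" >&2; exit 1; }
    mkdir -p "$dir/latest"
    # Copy to a temporary name outside latest/ first, then rename into
    # place, so latest/ never contains a partially written archive
    # (assuming both paths are on the same filesystem).
    cp -- "$dir/$newest" "$dir/.latest.tar.tmp" &&
    mv -- "$dir/.latest.tar.tmp" "$dir/latest/latest.tar"
)
```

Copying rather than moving keeps the timestamped archive available for local restores, at the cost of storing the newest backup twice.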

Restore

The restore procedure depends on the host and the age of the backup to be restored. Backups for the last three days are present on the production GitLab instance in /srv/gitlab-backup/ and /etc/gitlab/config_backup/. Older backups have to be fetched from Bacula first.

Fetch backups from Bacula

A backup can be restored from Bacula using the Bacula CLI and the guide to restore a backup of the same client. Note: only production GitLab is configured to use Bacula.

These steps follow the guide to restore a backup.

  • SSH to the backup host (currently backup1001.eqiad.wmnet)
  • Run the Bacula command-line tool: sudo bconsole
backup1001:~$ sudo bconsole
Connecting to Director backup1001.eqiad.wmnet:9101
1000 OK: 103 backup1001.eqiad.wmnet Version: 9.4.2 (04 February 2019)
Enter a period to cancel a command.
  • Choose the restore option
  • Choose option 5 (5: Select the most recent backup for a client)
  • Select the server (currently 96: gitlab1001.wikimedia.org-fd)
  • Choose the FileSet to be restored
  • Use the new prompt to browse the bvfs (Bacula virtual filesystem), provided the file metadata has not expired from the database. The standard ls and cd commands apply. If you specified a date old enough that the metadata has expired, you will not be able to browse and will have to restore the entire fileset
  • Use the mark command to mark the files and directories you want restored. Wildcards work, and there is also unmark
You are now entering file selection mode where you add (mark) and
remove (unmark) files to be restored. No files are initially added, unless
you used the "all" keyword on the command line.
Enter "done" to leave this mode.

cwd is: /
$ ls
etc/
srv/
$ mark srv/ 
2 files marked.
  • Enter done
  • Modify the job if needed (for example, change the destination directory)
  • Wait :-) (you can use the messages command to see the status of the restore job)
  • Check the backup on the GitLab host:
gitlab1001:~$ ls -l /var/tmp/bacula-restores/srv/gitlab-backup/latest/
total 17512
-rw------- 1 root root 17930240 Aug 11 00:04 latest.tar
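
Beyond listing the directory, a quick sanity check of the restored archive can catch a truncated or empty file before the restore starts. The following is a minimal sketch; the function name `verify_archive` is hypothetical, and the check only confirms that tar can read the archive index, not that the contents are a valid GitLab backup.

```shell
#!/bin/sh
# Sketch: check that a restored archive exists, is not empty, and can be
# listed by tar. The archive is only read, never extracted.

verify_archive() {
    archive=$1
    if [ ! -s "$archive" ]; then
        echo "missing or empty: $archive" >&2
        return 1
    fi
    if ! tar -tf "$archive" >/dev/null 2>&1; then
        echo "not a readable tar archive: $archive" >&2
        return 1
    fi
    echo "ok: $archive"
}
```

Because the restored file in the listing above is readable by root only, the check would have to run as root, for example against /var/tmp/bacula-restores/srv/gitlab-backup/latest/latest.tar.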

Proceed with restore of the backup to GitLab.

Restore backup to GitLab

Before restoring, the config backup and the data backup from the same day must both be present on the GitLab machine, either as local backups or as a Bacula backup restored to a temporary folder. Move the backups to the default backup paths /srv/gitlab-backup/ and /etc/gitlab/config_backup/.

  • Familiarize yourself with the Restore Prerequisites and the official omnibus restore guide.
  • Select the backup archives (data and configuration) to restore and copy them to /srv/gitlab-backup/ and /etc/gitlab/config_backup/ on the target host
  • Make sure the backup archives are owned by git:git and have rw------- (0600) permissions.
sudo chown git:git /srv/gitlab-backup/1628121868_2021_08_05_13.12.9_gitlab_backup.tar
sudo chmod 0600 /srv/gitlab-backup/1628121868_2021_08_05_13.12.9_gitlab_backup.tar
  • Confirm that there is enough free space on the GitLab installation mountpoint on the target host
jelto@gitlab2001:~$ df -h
  • Make sure the GitLab package (gitlab-ce) is properly installed on the target host and that the installed version is the same as the one used to create the data and configuration archives; use the version in the name of the data archive to verify this
jelto@gitlab2001:~$ dpkg -l | grep gitlab
ii  gitlab-ce                            13.12.9-ce.0                 amd64        GitLab Community Edition (including NGINX, Postgres, Redis)
jelto@gitlab2001:~$ sudo ls /srv/gitlab-backup/ | grep gitlab_backup | cut -d "_" -f 5
13.12.9
  • Restore GitLab configuration file into /etc/gitlab/gitlab.rb from the configuration backup archive and restore GitLab secrets file into /etc/gitlab/gitlab-secrets.json from the configuration backup archive
sudo tar -xvf /etc/gitlab/config_backup/latest/latest.tar --strip-components=2 -C /etc/gitlab/
  • When restoring a replica, overwrite the /etc/gitlab/gitlab.rb file with the replica's own local copy
  • Run sudo gitlab-ctl reconfigure to make sure the GitLab installation is set up and the PostgreSQL database is initialized; make sure the reconfiguration completed successfully
  • Make sure GitLab is running with sudo gitlab-ctl status; if not, start it with sudo gitlab-ctl start
  • Make sure the GitLab backup path, configured in the gitlab_rails['backup_path'] setting in /etc/gitlab/gitlab.rb, exists, is owned by git:root, and has rwx------ (0700) permissions
  • Before restoring, block user access to GitLab. This can be skipped when restoring a replica.
  • If GitLab Runners are connected to the running GitLab server, pause all runners and wait until all jobs have finished before starting the restore. This can be skipped when restoring a replica.
  • Stop GitLab's dedicated ssh server: sudo systemctl stop ssh-gitlab; check with sudo systemctl status ssh-gitlab
  • Stop database-connected GitLab processes: sudo gitlab-ctl stop puma and sudo gitlab-ctl stop sidekiq; check with sudo gitlab-ctl status. DO NOT stop other GitLab processes; they are required for the restore
  • Restore GitLab data by running sudo gitlab-backup restore BACKUP=timestamp_of_backup, where timestamp_of_backup is the part of the data backup file name before _gitlab_backup.tar, for example 1628121868_2021_08_05_13.12.9
  • Reconfigure GitLab: sudo gitlab-ctl reconfigure
  • Restart GitLab services: sudo gitlab-ctl restart; check with sudo gitlab-ctl status
  • Run the GitLab check rake task: sudo gitlab-rake gitlab:check SANITIZE=true; make sure all checks pass
  • Check that GitLab can decrypt secrets: sudo gitlab-rake gitlab:doctor:secrets; if the check fails, verify that /etc/gitlab/gitlab-secrets.json was restored correctly
  • Restart GitLab's dedicated ssh server: sudo systemctl restart ssh-gitlab
  • Run basic smoke tests (make sure the web UI works, authentication works, and ssh cloning works)
  • Re-enable paused runners (if required)
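
The command-line part of the steps above can be summarized in a small helper that derives the BACKUP value and the GitLab version from the data backup file name and prints the commands in order. This is a sketch for review purposes only: the function names are hypothetical, nothing is executed, and the manual steps (extracting the configuration archive, blocking user access, pausing and re-enabling runners, smoke tests) are intentionally left out.

```shell
#!/bin/sh
# Sketch: print the restore commands for a given data backup archive.
# The output is meant to be reviewed and run by hand, step by step.

# 1628121868_2021_08_05_13.12.9_gitlab_backup.tar -> 1628121868_2021_08_05_13.12.9
backup_id() {
    basename "$1" _gitlab_backup.tar
}

# Fifth underscore-separated field of the file name, for example 13.12.9
backup_version() {
    basename "$1" | cut -d _ -f 5
}

# $1: installed package version (for example 13.12.9-ce.0)
# $2: version taken from the archive name (for example 13.12.9)
versions_match() {
    [ "${1%%-*}" = "$2" ]
}

restore_plan() {
    id=$(backup_id "$1")
    cat <<EOF
sudo systemctl stop ssh-gitlab
sudo gitlab-ctl stop puma
sudo gitlab-ctl stop sidekiq
sudo gitlab-backup restore BACKUP=$id
sudo gitlab-ctl reconfigure
sudo gitlab-ctl restart
sudo gitlab-rake gitlab:check SANITIZE=true
sudo gitlab-rake gitlab:doctor:secrets
sudo systemctl restart ssh-gitlab
EOF
}

archive=/srv/gitlab-backup/1628121868_2021_08_05_13.12.9_gitlab_backup.tar
echo "GitLab version in archive name: $(backup_version "$archive")"
restore_plan "$archive"
```

The `versions_match` helper compares the version from the archive name with the package version shown by dpkg, ignoring the package suffix after the first hyphen. A mismatch means the matching gitlab-ce version has to be installed before the restore.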