GitLab: Difference between revisions

imported>Jelto
Revision as of 13:43, 30 July 2021

This page covers SRE-related topics for GitLab. For GitLab application-specific information, please see https://www.mediawiki.org/wiki/GitLab (under Implementation).

GitLab is reachable at https://gitlab.wikimedia.org/. We run multiple instances of GitLab:

  • gitlab1001 runs production GitLab serving https://gitlab.wikimedia.org/
  • gitlab2001 runs a passive GitLab instance serving https://gitlab-replica.wikimedia.org/ (WIP)
  • gitlab-ansible-test in the WMCS gitlab-test project
  • gitlab in the WMCS gitlab-test project

gitlab1001 and gitlab2001 are set up using Puppet and Ansible. The base configuration currently lives in profile::gitlab. Additional GitLab-specific configuration comes from gitlab-ansible. It is planned to migrate all of the logic to Puppet and drop Ansible (see T283076).

Backup and restore

This section describes the backup configuration and the restore procedure for the GitLab instances.

Backups

Application data is backed up using GitLab's built-in backup functionality. Application data backups are created by calling the /usr/bin/gitlab-backup create command. Configuration backups are created by calling /usr/bin/gitlab-ctl backup-etc. Both commands are executed once a day by cronjobs created with Ansible, and both create full backups. To configure the backups, refer to the backup-related variables in Ansible.

As a result, GitLab creates two new .tar archives every day:
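The daily run described above can be sketched in Python as follows. This is only an illustration: the production setup uses cronjobs created with Ansible, not a Python wrapper. The two command lines are the ones named in the text; the function name `run_daily_backup` and the injectable `runner` parameter are hypothetical and exist only to make the sketch testable without a GitLab installation.

```python
import subprocess
from typing import Callable, List, Sequence

# The two commands come from the text above; everything else is illustrative.
DATA_BACKUP_CMD = ["/usr/bin/gitlab-backup", "create"]
CONFIG_BACKUP_CMD = ["/usr/bin/gitlab-ctl", "backup-etc"]


def _default_runner(cmd: Sequence[str]) -> None:
    # check=True raises CalledProcessError if the command exits non-zero.
    subprocess.run(list(cmd), check=True)


def run_daily_backup(
    runner: Callable[[Sequence[str]], object] = _default_runner,
) -> List[List[str]]:
    """Run the data backup, then the config backup; return the commands run."""
    executed: List[List[str]] = []
    for cmd in (DATA_BACKUP_CMD, CONFIG_BACKUP_CMD):
        runner(cmd)
        executed.append(list(cmd))
    return executed
```

Passing a fake `runner` (for example `calls.append`) records the commands instead of executing them, which is useful for a dry run.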

  • full data backup in {{gitlab_backup_path}}
  • full config backup in /etc/gitlab/config_backup

Partial backups are currently disabled. During the initialization phase only daily full backups are used. In the future we may implement partial and incremental backups.

Backup retention

Data backups and config backups will be deleted after three days on the production instance (see T274463#7147179). Release Engineering wanted to have three days of local retention for fast troubleshooting and restores. Deletion of the data backups is handled by GitLab (using the gitlab_backup_keep_time variable). Deletion of the config backup is implemented in the backup cronjob (using the gitlab_backup_config_keep_num variable).
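The two retention rules (age-based for data backups, count-based for config backups) can be illustrated with a minimal Python sketch. It is not the code used by GitLab or by the cronjob. It assumes data backup filenames start with a Unix timestamp, as GitLab's backup archives do, and that config backups are described by hypothetical `(name, mtime)` pairs. The function names are invented for this example, and the parameters `keep_time` (seconds) and `keep_num` merely mirror the Ansible variables mentioned above.

```python
from typing import List, Tuple


def backup_timestamp(filename: str) -> int:
    """Return the Unix timestamp encoded as the prefix of a data backup name."""
    return int(filename.split("_", 1)[0])


def expired_data_backups(filenames: List[str], now: int, keep_time: int) -> List[str]:
    """Age-based rule: backups older than keep_time seconds are expired."""
    return sorted(f for f in filenames if now - backup_timestamp(f) > keep_time)


def expired_config_backups(entries: List[Tuple[str, float]], keep_num: int) -> List[str]:
    """Count-based rule: everything except the newest keep_num entries is expired.

    entries are (name, mtime) pairs; this representation is an assumption.
    """
    newest_first = sorted(entries, key=lambda e: e[1], reverse=True)
    return sorted(name for name, _ in newest_first[keep_num:])
```

With a `keep_time` of three days (259200 seconds), a data backup is reported as expired only once its timestamp is more than three days older than `now`.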

Storing backups in Bacula

For enhanced reliability, backups are also stored in Bacula. Bacula is the standard for secure, encrypted backup storage at the WMF.

For the initialization phase we decided to back up only the most recent .tar file of the data backup and the most recent .tar file of the configuration backup. These .tar files are shipped to Bacula once a day as a full backup (see backup strategy daily). This backup strategy is not the default used by most services. The following concerns and advantages came up in our discussion when comparing daily full backups with weekly full backups plus daily incremental backups (see T274463 and comments in /puppet/+/697850):

  • Incremental backups of GitLab's self-contained full backups would introduce an artificial technical dependency between revisions where no actual dependency exists. To restore a backup, Bacula would have to merge and diff all recent incremental backups and combine them with the last full backup, even though the latest backup alone is enough to restore GitLab to its previous state.
  • The default backup policy would conflict with Release Engineering's requirement to keep three days of local backups on the GitLab host. This conflict would cause up to three times the disk usage in Bacula compared with a non-default backup policy.
  • Incremental-only backups would solve the problem of additional disk usage but, according to Data Persistence, cannot be used long term due to technical limitations of Bacula. Restoring from a large number of incremental revisions would take a long time and significant computing resources. It would also introduce a dependency between revisions (see above).

For the reasons above we decided against the default strategy and instead use daily full backups. For this decision it was necessary to implement two changes:

"Latest" backup

To implement the strategy of daily full backups, a dedicated folder structure is needed for Bacula. We have to make sure that Bacula does not save all three backups available on the GitLab host; it must only back up the directory containing the most recent files. For this purpose we created an additional ./latest directory inside each of the backup directories (using Ansible). Since our goal is to eventually replace the Ansible code with Puppet, we also ensured that the "latest" backup directories exist using Puppet. We did this in two places: the profile class currently used in production (gerrit:700622) and the backup class from the gitlab module, currently used only in cloud (gerrit:700595). Ideally we want to reach a situation where both production and cloud machines are set up automatically by the same Puppet role, both using the module. The backup scripts on the GitLab machine update the latest.tar file.

/srv/gitlab-backup/
├── 1624752267_2021_06_27_13.11.5_gitlab_backup.tar
├── 1624838667_2021_06_28_13.11.5_gitlab_backup.tar
├── 1624925067_2021_06_29_13.11.5_gitlab_backup.tar
└── latest
    └── latest.tar
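How latest.tar might be refreshed from the layout above can be sketched in Python. This is an assumption-laden illustration, not the actual backup script, whose implementation is not shown on this page (it may copy, move, or link the file). The function name `update_latest` and the temporary file name are hypothetical. The sketch picks the newest archive by the Unix timestamp prefix of its filename and copies it into the latest directory.

```python
import shutil
from pathlib import Path
from typing import Optional


def update_latest(backup_dir: Path) -> Optional[Path]:
    """Copy the newest *_gitlab_backup.tar in backup_dir to latest/latest.tar."""
    backups = sorted(
        backup_dir.glob("*_gitlab_backup.tar"),
        key=lambda p: int(p.name.split("_", 1)[0]),  # Unix timestamp prefix
    )
    if not backups:
        return None  # nothing to publish yet

    latest_dir = backup_dir / "latest"
    latest_dir.mkdir(exist_ok=True)
    target = latest_dir / "latest.tar"

    # Copy to a temporary name first, then rename, so a reader of latest.tar
    # never sees a half-written archive (rename within one filesystem).
    tmp = latest_dir / "latest.tar.tmp"
    shutil.copy2(backups[-1], tmp)
    tmp.replace(target)
    return target
```

Calling `update_latest(Path("/srv/gitlab-backup"))` after each backup run would leave exactly one file in the latest directory.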

Bacula is then configured to use only the latest directories and save the most recent backup. Here is the fileset used in Bacula:

    bacula::director::fileset { 'gitlab':
        includes => [ '/srv/gitlab-backup/latest', '/etc/gitlab/config_backup/latest' ]
    }

Restore

WIP

Failover

WIP

Monitoring

WIP