Preliminary version. The functionality may change, but the basic features will be preserved. Compatibility with future versions is ensured, but additional migration actions may be required.
Toolbox
The Toolbox Pod is used to execute periodic housekeeping tasks within the application.
It contains useful GitLab tools such as the Rails console, Rake tasks, etc. These commands let you check the status of database migrations, execute Rake tasks for administrative work, and interact with the Rails console:
# Locate the Toolbox pod
kubectl -n d8-code get pods -lapp.kubernetes.io/component=toolbox
# Launch a shell inside the pod
kubectl exec -it <Toolbox pod name> -- bash
# open Rails console
gitlab-rails console -e production
# execute a Rake task
gitlab-rake gitlab:env:info
Use cases:
- Profile the Rails application
- Check the status of background migrations
- Run Rake tasks (see the example below)
- Create backups
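For instance, checking the status of database migrations via a Rake task in one shot might look like this (a minimal sketch; using deploy/toolbox as the exec target is borrowed from the Manual backups section below):
# Show the status of all database migrations without opening a shell first
kubectl -n d8-code exec deploy/toolbox -it -- gitlab-rake db:migrate:status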
Rails console tips
The Rails console is one of the Toolbox components.
A lot of useful and sometimes emergency tasks can be done through the Rails console. Such access grants you administrative permissions by default, so it is highly recommended to use the console with caution and only when really necessary.
How to disable pipelines for every project
Project.all.find_each { |project| project.update!(builds_enabled: false) }
How to enable regular password authentication
Gitlab::CurrentSettings.update!(password_authentication_enabled_for_web: true)
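To double-check that a change made in the console took effect, you can read the setting back non-interactively from inside the Toolbox pod (a small sanity-check sketch using gitlab-rails runner):
# Print the current value of the setting changed above
gitlab-rails runner 'puts Gitlab::CurrentSettings.password_authentication_enabled_for_web'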
Registry
Garbage collection
To run garbage collection, we first need to put the registry into read-only (maintenance) mode. Add the following to the CodeInstance:
...
spec:
  feature:
    registry:
      maintenance:
        readOnly: true
...
The registry is now in read-only mode, so let's get the name of one of the registry Pods:
kubectl get pods -n d8-code -l app.kubernetes.io/component=registry,app.kubernetes.io/name=code -o jsonpath='{.items[0].metadata.name}'
Run the actual garbage collection. Check the registry's manual to decide whether you really want the -m (delete untagged manifests) parameter:
kubectl exec -n d8-code <registry-pod> -- /bin/registry garbage-collect -m /etc/docker/registry/config.yml
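Once garbage collection completes, remember to take the registry out of read-only mode by reverting readOnly to false. A sketch using kubectl patch, assuming the CodeInstance is addressable as the codeinstance resource kind and <name> is your instance (both are assumptions; editing the resource directly works just as well):
# Switch the registry back to read-write mode
kubectl patch codeinstance <name> --type merge \
  -p '{"spec":{"feature":{"registry":{"maintenance":{"readOnly":false}}}}}'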
Backups
Automatic Backup Creation Before Module Updates
When a new version of the Code module is received, only the controller is updated, while all other components remain untouched and continue to function until the backup is created. The updated operator waits for the successful completion of the backup creation job.
- If an error occurs during the launch of the backup creation job or within the backup creation process itself, the operator will periodically check the status of existing jobs. Upon finding a successfully completed job, the operator will begin deploying the remaining components.
If a module-level update is initiated during a period when the previous update with the backup.backupBeforeUpdate option failed, and the operator's version differs from its components' versions, the module-level update will fail with an error.
Please note that for this feature to work properly, Deckhouse Kubernetes Platform version 1.69 or later is required.
Enabling Automatic Backup Creation
To enable automatic backup creation, set the backup.enabled and backup.backupBeforeUpdate parameters to true in the CodeInstance specification. Additionally, ensure that backup.restoreFromBackupMode is set to false. Example configuration:
backup:
  backupBeforeUpdate: true
  backupStorageGb: 3
  cronSchedule: 0 0 1 * *
  enabled: true
  restoreFromBackupMode: false
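After applying the configuration, you can quickly verify that the backup CronJob exists and see its schedule (the full-backup name is taken from the troubleshooting example below and may differ in your installation):
# Check that the backup CronJob was created and inspect its schedule
kubectl -n d8-code get cronjob full-backup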
Logic of Automatic Backup Creation
- The operator stores its version in the pod/deployment environment variable GITLAB_VERSION.
- The version of GitLab components is checked via an API call.
- The values of backup.enabled and backup.backupBeforeUpdate are checked. If either of them is not true, the operator skips backup creation and begins updating all its components.
- The current versions of the operator and GitLab components are retrieved. If the versions do not differ, backup creation is skipped.
- A search is performed for Kubernetes Jobs related to backup creation within the last hour:
  - If no job is found, the operator creates a new job and monitors its status, waiting for successful completion.
  - If at least one job with a Failure status is found within the last hour, the operator will not update the components. An alert will be sent to Prometheus, signaling the update error. The alert will remain active until the backup.backupBeforeUpdate option is disabled or a job with a Success status is found.
  - If a job with a Success status is found, the operator proceeds to update all other components.
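To see roughly what the operator sees at the job-search step, you can list recent Jobs in the namespace yourself (a sketch; the operator may additionally filter by label selectors not shown here):
# List Jobs sorted by creation time; the COMPLETIONS column shows which succeeded
kubectl -n d8-code get jobs --sort-by=.metadata.creationTimestamp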
Troubleshooting Backup Creation Errors
- It is necessary to read the logs of the job that ended with a Failure status to identify the error (possible issues include insufficient space in the S3 bucket for backups or network problems). Example command to read logs:
kubectl -n d8-code logs backup-before-update-1745879676-zdzz7
This command allows you to view the logs of the pod created to execute the job.
- Since the operator creates a backup job only once, after resolving the issues with backup creation, it is necessary to manually recreate the job to restore the operator's correct functionality. The operator will automatically detect the successfully completed job and resume its operation. Example command:
kubectl -n d8-code create job --from=cronjob/full-backup backup-before-update-manual-created
Where backup-before-update-manual-created is the name of the new job that will be launched.
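You can then wait for the manually created job to finish; once it completes successfully, the operator should pick it up on its next check (the job name is reused from the command above):
# Block until the new backup job reports completion
kubectl -n d8-code wait --for=condition=complete --timeout=2h job/backup-before-update-manual-created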
Backups and restore
A backup script creates a backup archive file to store your data.
To create the archive file, the backup script:
- Extracts the previous backup archive file, when you’re doing an incremental backup.
- Updates or generates the backup archive file.
- Runs all backup sub-tasks to:
- Back up the database.
- Back up Git repositories.
- Back up files (including S3 buckets).
- Archives the backup staging area into a tar file.
- Uploads the new backup archive to the object storage.
- Cleans up the archived backup staging directory files.
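If you ever need to check what ended up in an archive, the tar file can be listed without unpacking it (assuming you have downloaded it from the bucket first; the naming format is described under Manual backups below):
# List the contents of a backup archive without extracting it
tar -tf <timestamp>_gitlab_backup.tar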
Regular backups
Make sure backups are enabled and configured in the CodeInstance.
Backups are implemented as a standalone CronJob (the cron schedule can also be configured). It uses the default GitLab backup-utility, and the process is clearly described in the official docs. The only meaningful specific is that it runs with the --repositories-server-side flag. Read more about it in the official GitLab documentation.
Configuring underlying storage
The required size can be calculated as Gitaly node size + sum of all buckets' sizes + database size.
- Ensure that the Persistent Volume size will be enough to store the whole archive.
- Another option is to set backup.persistentVolumeClaim.enabled: false and make sure you have enough free space on the Kubernetes node your workloads are hosted on.
To enable regular backups, add the following to the CodeInstance:
backup:
  cronSchedule: 0 0 */7 * *
  enabled: true
  s3:
    bucketName: d8-code-test-backups
    external:
      provider: YCloud
      accessKey: __ACCESS_KEY__
      secretKey: __SECRET_KEY__
    mode: External
  persistentVolumeClaim:
    enabled: true # whether to use PersistentVolumeClaims for the restore and backup processes or not
    storageClass: localpath
Examples here
So, after setting up the CodeInstance properly, you don't actually need to do anything else other than wait for backups to be created on schedule.
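To confirm that archives actually appear in the bucket, list it with any S3-compatible client. For example, with s3cmd configured for the same credentials (the tool choice here is an assumption; any S3 client will do):
# List backup archives in the configured bucket
s3cmd ls s3://d8-code-test-backups/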
Manual backups
- Make sure the backup.s3 section exists in the CodeInstance and has proper values.
- Make sure the needed component is running and ready:
kubectl -n d8-code get pods -lapp.kubernetes.io/component=toolbox
- Run the backup utility:
kubectl exec -n d8-code deploy/toolbox -it -- backup-utility
- The backup will be stored in the bucket defined in backup.s3.bucketName. It will be named in <timestamp>_gitlab_backup.tar format.
Restore from backups
To restore from backups, you need to perform the following steps:
- Use the Toolbox pod to start the restore process:
kubectl -n d8-code exec <Toolbox pod name> -it -- backup-utility --restore -t <timestamp|URL>
Here, timestamp comes from the name of the tarball stored in the bucket dedicated to backups, and URL is a URL pointing to your backup in file:///<path> format.
- Agree to everything proposed during the restore process.
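For example, if the bucket contains an archive named 1745879676_gitlab_backup.tar (the timestamp is purely illustrative, reused from the troubleshooting example above), the restore command would look like this:
# Restore from the backup identified by its timestamp
kubectl -n d8-code exec deploy/toolbox -it -- backup-utility --restore -t 1745879676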
Additionally, you can verify the integrity of the uploaded files within the same Toolbox pod by following the guides from the official docs.