Preliminary version. The functionality may change, but the basic features will be preserved. Compatibility with future versions is ensured, but may require additional migration actions.

Toolbox

The Toolbox Pod is used to execute periodic housekeeping tasks within the application.

It contains useful GitLab tools such as the Rails console and Rake tasks. With these you can check the status of database migrations, run administrative Rake tasks, and interact with the Rails console:

# Locate the Toolbox pod
kubectl -n d8-code get pods -lapp.kubernetes.io/component=toolbox

# Launch a shell inside the pod
kubectl -n d8-code exec -it <Toolbox pod name> -- bash

# Open the Rails console (from the pod shell)
gitlab-rails console -e production

# Execute a Rake task (from the pod shell)
gitlab-rake gitlab:env:info

Use cases:

  • Profile the Rails application
  • Check the status of background migrations (see the sketch below)
  • Run Rake tasks
  • Create backups
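
For example, to check the status of regular database migrations from the pod shell, you can use the stock Rails migration status task (a minimal sketch; batched background migrations are inspected via the Rails console instead):

# From the pod shell: show which database migrations are up or down
gitlab-rake db:migrate:status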

Rails console tips

The Rails console is one of the Toolbox components.

Many useful and sometimes urgent maintenance tasks can be performed through the Rails console. Such access grants administrative permissions by default, so it is highly recommended to use the console with caution and only when really necessary.
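
If you only need to inspect data, Rails sandbox mode is a safer option: all database changes are rolled back when the console exits. A sketch, assuming the gitlab-rails wrapper passes the flag through to the underlying rails console, as standard GitLab distributions do:

# Start the console in sandbox mode; DB changes are rolled back on exit
gitlab-rails console -e production --sandbox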


How to disable pipelines for every project

# Iterate over all projects in batches and disable CI/CD pipelines (builds) for each
Project.all.find_each { |project| project.update!(builds_enabled: false) }

How to enable regular password authentication

# Re-enable sign-in with username and password for the web interface
Gitlab::CurrentSettings.update!(password_authentication_enabled_for_web: true)

Registry

Garbage collection

To run garbage collection, you first need to put the registry into read-only maintenance mode. Add the following to the CodeInstance:

...
spec:
  feature:
    registry:
      maintenance:
        readOnly: true
...

The registry is now in read-only mode, so get the name of one of the registry Pods:

kubectl get pods -n d8-code -l app.kubernetes.io/component=registry,app.kubernetes.io/name=code -o jsonpath='{.items[0].metadata.name}'

Run the actual garbage collection. Check the registry’s manual before deciding whether you really want the -m parameter (in the upstream registry it is shorthand for --delete-untagged and also removes manifests not referenced by any tag).

kubectl exec -n d8-code <registry-pod> -- /bin/registry garbage-collect -m /etc/docker/registry/config.yml
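
Once garbage collection finishes, take the registry out of read-only mode by reverting the setting in the CodeInstance:

...
spec:
  feature:
    registry:
      maintenance:
        readOnly: false
...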

Backups


Automatic Backup Creation Before Module Updates

When a new version of the Code module is received, only the controller is updated, while all other components remain untouched and continue to function until the backup is created. The updated operator waits for the successful completion of the backup creation job.

  • If an error occurs during the launch of the backup creation job or within the backup creation process itself, the operator will periodically check the status of existing jobs. Upon finding a successfully completed job, the operator will begin deploying the remaining components.

If a module-level update is initiated during a period when the previous update with the backup.backupBeforeUpdate option failed, and the operator’s version differs from its components’ versions, the module-level update will fail with an error.

Please note: for this feature to work properly, Deckhouse Kubernetes Platform version 1.69 or later is required.

Enabling Automatic Backup Creation

To enable automatic backup creation, set the parameters backup.enabled and backup.backupBeforeUpdate to true in the CodeInstance specification. Additionally, ensure that backup.restoreFromBackupMode is set to false. Example configuration:

backup:
  backupBeforeUpdate: true
  backupStorageGb: 3
  cronSchedule: 0 0 1 * *
  enabled: true
  restoreFromBackupMode: false

Logic of Automatic Backup Creation

  • The operator stores its version in the GITLAB_VERSION environment variable on its pod/deployment.
  • The version of the GitLab components is checked via an API call.
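
To see which version the operator advertises, you can read the GITLAB_VERSION variable from its Deployment (a sketch; <operator-deployment> is a placeholder for the actual operator Deployment name):

kubectl -n d8-code get deploy <operator-deployment> -o jsonpath='{.spec.template.spec.containers[0].env[?(@.name=="GITLAB_VERSION")].value}'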
  1. The values of backup.enabled and backup.backupBeforeUpdate are checked. If either of them is not true, the operator skips backup creation and begins updating all its components.
  2. The current version of the operator and GitLab components is retrieved. If the versions do not differ, backup creation is skipped.
  3. A search is performed for Kubernetes Jobs related to backup creation within the last hour:
    • If no job is found, the operator creates a new job and monitors its status, waiting for successful completion.
    • If at least one job with a Failure status is found within the last hour, the operator will not update the components. An alert will be sent to Prometheus, signaling the update error. The alert will remain active until the backup.backupBeforeUpdate option is disabled or a job with a Success status is found.
    • If a job with a Success status is found, the operator proceeds to update all other components.
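
To inspect the Jobs the operator evaluates, you can list recent Jobs in the namespace sorted by creation time (a generic kubectl sketch; backup jobs are named backup-before-update-<timestamp>-..., as in the troubleshooting example below):

kubectl -n d8-code get jobs --sort-by=.metadata.creationTimestamp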

Troubleshooting Backup Creation Errors

  1. It is necessary to read the logs of the job that ended with a Failure status to identify the error (possible issues include insufficient space in the S3 backup bucket or network problems). Example command to read logs:

    kubectl -n d8-code logs backup-before-update-1745879676-zdzz7
    

    This command allows you to view the logs of the pod created to execute the job.

  2. Since the operator creates a backup job only once, after resolving the issues with backup creation, it is necessary to manually recreate the job to restore the operator’s correct functionality. The operator will automatically detect the successfully completed job and resume its operation. Example command:

    kubectl -n d8-code create job --from=cronjob/full-backup backup-before-update-manual-created
    

    Where:

    • backup-before-update-manual-created — the name of the new job that will be launched.
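
    You can then wait for the new job to complete before checking the operator again (a sketch; adjust the timeout to your data size):

    kubectl -n d8-code wait --for=condition=complete job/backup-before-update-manual-created --timeout=30m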

Backups and restore

A backup script creates a backup archive file to store your data.

To create the archive file, the backup script:

  • Extracts the previous backup archive file, when you’re doing an incremental backup.
  • Updates or generates the backup archive file.
  • Runs all backup sub-tasks to:
    • Back up the database.
    • Back up Git repositories.
    • Back up files (including S3 buckets).
  • Archives the backup staging area into a tar file.
  • Uploads the new backup archive to the object storage.
  • Cleans up the archived backup staging directory files.

Regular backups

Make sure backups are enabled and configured in the CodeInstance.

Backups are implemented as a standalone CronJob (the cron schedule is configurable). It uses the default GitLab backup-utility, and the process is described in detail in the official docs. The only meaningful specific is that it runs with the --repositories-server-side flag; read more about it in the official GitLab documentation.

Configuring underlying storage

The required storage size can be estimated as the Gitaly node size + the total size of all buckets + the database size (see the sketch after the list below).

  • Ensure that the Persistent Volume size is enough to store the whole backup archive
  • Another option is to set backup.persistentVolumeClaim.enabled: false and make sure there is enough free space on the Kubernetes Node where your workloads are hosted
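
The current PVC sizes in the namespace are a good starting point for the estimate (a generic kubectl sketch):

# List PersistentVolumeClaims with their requested sizes
kubectl -n d8-code get pvc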

To enable regular backups, add the following to the CodeInstance:

backup:
  cronSchedule: 0 0 */7 * *
  enabled: true
  s3:
    bucketName: d8-code-test-backups
    external:
      provider: YCloud
      accessKey: __ACCESS_KEY__
      secretKey: __SECRET_KEY__
    mode: External
  persistentVolumeClaim:
    enabled: true # whether to use PersistentVolumeClaims for the restore and backup processes
    storageClass: localpath
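
After applying the configuration, you can check that the backup CronJob has been created (full-backup is the CronJob name referenced in the troubleshooting section above):

kubectl -n d8-code get cronjob full-backup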


After setting up the CodeInstance properly, nothing else is needed: backups run on the configured schedule.

Manual backups

  1. Make sure the backup.s3 section exists in the CodeInstance and has proper values
  2. Make sure the needed component is running and ready:
kubectl -n d8-code get pods -lapp.kubernetes.io/component=toolbox
  3. Run the backup utility:
kubectl exec -n d8-code deploy/toolbox -it -- backup-utility
  4. The backup will be stored in the bucket defined in backup.s3.bucketName. It will be named in the <timestamp>_gitlab_backup.tar format.
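
Alternatively, instead of exec’ing into the Toolbox, you can trigger a one-off run of the backup CronJob (assuming the full-backup CronJob referenced in the troubleshooting section above):

kubectl -n d8-code create job --from=cronjob/full-backup manual-backup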

Restore from backups

To restore from a backup, perform the following steps:

  1. Use the Toolbox pod to start the restore process: kubectl -n d8-code exec <Toolbox pod name> -it -- backup-utility --restore -t <timestamp|URL>
    • timestamp is taken from the name of the tarball stored in the dedicated backups bucket (see the sketch after this list for how to find it).
    • URL is a public URL to your backup in the file:///<path> format
  2. Agree to everything proposed during the restore process
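
To find the timestamp, you can list the backup tarballs in the bucket from the Toolbox pod (a sketch, assuming the Toolbox image ships s3cmd preconfigured for the backups bucket, as in upstream GitLab charts; the bucket name is taken from the example above):

kubectl -n d8-code exec deploy/toolbox -it -- s3cmd ls s3://d8-code-test-backups/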

Additionally, you can verify the integrity of the uploaded files within the same Toolbox pod by following the guides from the official docs.