GitLab Geo configuration

Note: This is the documentation for the Omnibus GitLab packages. For installations from source, follow the "GitLab Geo nodes configuration for installations from source" guide.

Configuring a new secondary node

Note: This is the final step in setting up a secondary Geo node. Stages of the setup process must be completed in the documented order. Before attempting the steps in this stage, complete all prior stages.

The basic steps of configuring a secondary node are:

  1. replicate required configurations between the primary and the secondaries;
  2. configure a second, tracking database on each secondary;
  3. start GitLab on the secondary node's machine.

You are encouraged to first read through all the steps before executing them in your testing/production environment.

Notes:

  • Do not set up any custom authentication on the secondary nodes; this is handled by the primary node.
  • Do not add anything in the secondary's Geo nodes admin area (Admin Area ➔ Geo Nodes). This is handled solely by the primary node.

Step 1. Copying the database encryption key

GitLab stores a unique encryption key on disk that is used to encrypt sensitive data stored in the database. All secondary nodes must have the exact same value for db_key_base as defined on the primary node.

  1. SSH into the primary node, and execute the command below to display the current encryption key:

    sudo gitlab-rake geo:db:show_encryption_key

     Copy the encryption key so that you can add it to the secondary node in the following steps.

  2. SSH into the secondary node and log in as root:

    sudo -i
  3. Add the following to /etc/gitlab/gitlab.rb, replacing encryption-key with the key displayed on the primary in step 1:

    gitlab_rails['db_key_base'] = 'encryption-key'
  4. Reconfigure the secondary node for the change to take effect:

    gitlab-ctl reconfigure

Once reconfigured, the secondary will automatically start replicating missing data from the primary in a process known as backfill. Meanwhile, the primary node will start to notify the secondary of any changes, so that the secondary can act on those notifications immediately.

Make sure the secondary instance is running and accessible. You can log in to the secondary node with the same credentials you use on the primary.
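
One way to confirm that both nodes ended up with the same db_key_base, without pasting the raw secret between terminals, is to compare a fingerprint of the key on each node. The following is a minimal sketch and not part of GitLab: key_fingerprint is a hypothetical helper name, and it assumes sha256sum (GNU coreutils) is installed and that the Rake task prints only the key.

```shell
# Hypothetical helper (not shipped with GitLab): read a key on stdin,
# strip all whitespace (including the trailing newline), and print the
# SHA-256 digest of what remains as 64 hex characters.
key_fingerprint() {
  tr -d '[:space:]' | sha256sum | cut -d' ' -f1
}

# Possible usage on the primary (assumes the task prints only the key):
#   sudo gitlab-rake geo:db:show_encryption_key | key_fingerprint
# On the secondary, pipe the value you placed in gitlab.rb through the
# same function and compare the two fingerprints by eye.
```

If the two fingerprints differ, the key was most likely altered while being copied, and the secondary will not be able to decrypt data replicated from the primary.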

Step 2. (Optional) Enabling hashed storage (from GitLab 10.0)

Warning: Hashed storage is in Alpha. It is considered experimental and not production-ready. See Hashed Storage for more detail; for the latest updates, check infrastructure issue #2821.

Using hashed storage significantly improves Geo replication: project and group renames no longer require synchronization between nodes.

  1. Visit the primary node's Admin Area ➔ Settings (/admin/application_settings) in your browser.
  2. In the Repository Storages section, check Create new projects using hashed storage paths.

Step 3. (Optional) Configuring the secondary to trust the primary

You can safely skip this step if your primary uses a CA-issued HTTPS certificate.

If your primary is using a self-signed certificate for HTTPS support, you will need to add that certificate to the secondary's trust store. Retrieve the certificate from the primary and follow these instructions on the secondary.
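
If you are unsure whether this step applies, you can inspect the primary's certificate. The sketch below is a heuristic only: is_self_signed is a hypothetical helper name, primary.example.com is a placeholder host, and comparing subject with issuer detects self-issued certificates without verifying the signature itself. It assumes the openssl command-line tool is installed.

```shell
# Hypothetical helper: succeed if the PEM certificate in file $1 names
# itself as its own issuer (the usual sign of a self-signed certificate).
is_self_signed() {
  subject=$(openssl x509 -in "$1" -noout -subject 2>/dev/null | sed 's/^subject= *//')
  issuer=$(openssl x509 -in "$1" -noout -issuer 2>/dev/null | sed 's/^issuer= *//')
  # An empty subject means the file could not be parsed; treat that as "no".
  [ -n "$subject" ] && [ "$subject" = "$issuer" ]
}

# Possible usage (placeholder host name):
#   openssl s_client -connect primary.example.com:443 \
#       -servername primary.example.com </dev/null 2>/dev/null \
#     | openssl x509 -outform PEM > primary.crt
#   is_self_signed primary.crt && echo "self-signed: step 3 applies"
```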

Step 4. Enable Git access over HTTP/HTTPS

GitLab Geo synchronizes repositories over HTTP/HTTPS, and so requires this clone method to be enabled. Navigate to Admin Area ➔ Settings (/admin/application_settings) on the primary node, and set Enabled Git access protocols to Both SSH and HTTP(S) or Only HTTP(S).
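
After changing the setting, a quick way to check that HTTP(S) access works is to list a repository's refs over HTTP(S) from another machine. This is a sketch under assumptions: check_http_clone is a hypothetical helper name, the URL in the usage comment is a placeholder, and the project is assumed to be public or reachable with credentials Git already has.

```shell
# Hypothetical helper: return 2 if $1 is not an HTTP(S) URL, otherwise
# return the exit status of listing the remote's refs over that URL.
check_http_clone() {
  case "$1" in
    http://*|https://*) ;;
    *) echo "not an HTTP(S) URL: $1" >&2; return 2 ;;
  esac
  # GIT_TERMINAL_PROMPT=0 makes Git fail instead of prompting for a password.
  GIT_TERMINAL_PROMPT=0 git ls-remote "$1" >/dev/null 2>&1
}

# Possible usage (placeholder URL):
#   check_http_clone https://primary.example.com/group/project.git \
#     && echo "HTTP(S) access works"
```

A failure here does not by itself prove that the protocol is disabled; authentication or certificate problems produce the same non-zero status.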

Verify proper functioning of the secondary node

Your nodes should now be ready to use. You can log in to the secondary node with the same credentials you use on the primary. Visit the secondary node's Admin Area ➔ Geo Nodes (/admin/geo_nodes) in your browser to check that it is correctly identified as a secondary Geo node and that Geo is enabled.

If your installation isn't working properly, check the troubleshooting document.

Point your users to the "Using a Geo Server" guide.

You can monitor the status of the syncing process on a secondary node by visiting the primary node's Admin Area ➔ Geo Nodes (/admin/geo_nodes) in your browser.

Note that if git_data_dirs is customized on the primary for multiple repository shards, you must duplicate the same configuration on the secondary.

GitLab Geo dashboard

Disabling a secondary node stops the syncing process.

The two most obvious replication issues you may see here are:

  1. Database replication is not working correctly.
  2. Instance-to-instance notification is not working. In that case, the cause may be one of the following:
    • You are using a custom certificate or custom CA (see the troubleshooting document)
    • The instance is firewalled (check your firewall rules)
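
To tell these two causes apart from the command line, one option is to probe the other node with curl and interpret its exit code. This is a rough sketch: diagnose_curl_exit and probe_node are hypothetical helper names, the URL in the usage comment is a placeholder, and curl consults the operating system's trust store, which may differ from the one GitLab itself uses.

```shell
# Hypothetical helper: translate a curl exit code into a likely cause.
diagnose_curl_exit() {
  case "$1" in
    0)    echo "reachable" ;;
    6)    echo "host name could not be resolved" ;;
    7|28) echo "connection failed or timed out (check firewall rules)" ;;
    60)   echo "certificate not trusted (custom certificate or CA)" ;;
    *)    echo "other curl error: $1" ;;
  esac
}

# Hypothetical helper: request the URL in $1 and print the diagnosis.
probe_node() {
  rc=0
  curl --silent --output /dev/null --max-time 10 "$1" || rc=$?
  diagnose_curl_exit "$rc"
}

# Possible usage from the secondary (placeholder URL):
#   probe_node https://primary.example.com/
```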

Currently, this is what is synced:

  • Git repositories
  • Wikis
  • LFS objects
  • Issues, merge requests, snippets, and comment attachments
  • Users, groups, and project avatars

Selective replication

GitLab Geo supports selective replication, which allows admins to choose which groups should be replicated by secondary nodes.

It is important to note that selective replication:

  1. Does not restrict permissions from secondary nodes.
  2. Does not hide project metadata from secondary nodes. Since Geo currently relies on PostgreSQL replication, all project metadata gets replicated to secondary nodes, but repositories that have not been selected will be empty.
  3. Prevents secondary nodes from pulling repositories that do not belong to the groups selected for replication.

Upgrading Geo

See the updating the Geo nodes document.

Troubleshooting

See the troubleshooting document.