Troubleshooting
Reference for diagnosing a GitlabInstance that is stuck, failing, or not behaving as expected. Start with State inspection, then match your symptom in the sections below.
State inspection (start here)
NS=my-gitlab
NAME=my-gitlab
# 1. Phase and print columns
kubectl get gli $NAME -n $NS
# 2. Full status including conditions
kubectl describe gli $NAME -n $NS
# 3. Conditions as JSON (often the most direct answer)
kubectl get gli $NAME -n $NS \
-o jsonpath='{.status.conditions}' | jq
# 4. Flux HelmRelease status
kubectl get helmrelease $NAME -n $NS
kubectl describe helmrelease $NAME -n $NS
# 5. GitLab pods
kubectl get pods -n $NS
kubectl describe pod -n $NS <stuck-pod-name>
# 6. Managed Postgres
kubectl get perconapgcluster -n $NS
kubectl describe perconapgcluster -n $NS
# 7. Managed Redis
kubectl get redis,redisreplication,redissentinel -n $NS
# 8. Managed Elasticsearch (EE)
kubectl get elasticsearch -n $NS
Where are the logs?
| Component | Command |
|---|---|
| Operator | kubectl logs -n bnerd-gitlab-operator deploy/bnerd-gitlab-operator --tail=200 |
| Flux helm-controller | kubectl logs -n flux-system deploy/helm-controller --tail=200 |
| Flux source-controller | kubectl logs -n flux-system deploy/source-controller --tail=200 |
| Percona PG operator | kubectl logs -n pgo deploy/percona-postgresql-operator --tail=200 |
| OT redis-operator | kubectl logs -n ot-operators deploy/redis-operator --tail=200 |
| ECK operator | kubectl logs -n elastic-system statefulset/elastic-operator --tail=200 |
Phase: Pending
Symptom: kubectl get gli shows Pending for more than a few minutes.
Causes:
- A referenced Secret does not exist yet: the operator is waiting for a credentialsSecret referenced in the spec. Check which Secret is missing and create it; the operator progresses automatically on the next reconcile (30-second requeue).
- GitlabVersionMap/default not found: the cluster-scoped version map has not been applied yet.
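The two checks above can be sketched as small shell helpers. This is a minimal sketch under two assumptions: that the operator names the missing Secret in a condition message, and that the version map is reachable under the resource name `gitlabversionmap` (confirm with `kubectl api-resources`). The function names are hypothetical.

```shell
# Print each condition as "type<TAB>reason<TAB>message".
# Assumption: a missing Secret is named in one of the messages.
gli_conditions() {
  kubectl get gli "$1" -n "$2" \
    -o jsonpath='{range .status.conditions[*]}{.type}{"\t"}{.reason}{"\t"}{.message}{"\n"}{end}'
}

# Succeeds only if the cluster-scoped version map exists.
# Assumption: "gitlabversionmap" is the resource name of the CRD.
version_map_exists() {
  kubectl get gitlabversionmap default >/dev/null 2>&1
}

# Usage:
#   gli_conditions "$NAME" "$NS"
#   version_map_exists || echo "GitlabVersionMap/default not found"
```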
Phase: Provisioning
Symptom: Stuck in Provisioning for more than 10–15 minutes.
Causes:
Postgres not becoming ready
Common causes:
- No default StorageClass — PVCs will be Pending. Set a default StorageClass.
- Percona PG Operator not installed or not running. Check: kubectl get pods -n pgo.
- Insufficient cluster resources (CPU/memory). Check pod events.
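The storage and resource causes above can be checked with standard kubectl commands. The sketch below wraps them in a hypothetical helper, `pg_storage_check`, which takes the instance namespace as its only argument.

```shell
# Inspect storage and scheduling for the managed Postgres pods.
pg_storage_check() {
  # A default StorageClass is marked "(default)" in this listing.
  kubectl get storageclass
  # PVCs that stay Pending point at a storage provisioning problem.
  kubectl get pvc -n "$1"
  # Namespace events sorted by time; look for FailedScheduling
  # or volume provisioning errors.
  kubectl get events -n "$1" --sort-by=.lastTimestamp
}

# Usage:
#   pg_storage_check "$NS"
```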
Redis not becoming ready
Common causes:
- OT-Container-Kit redis-operator not installed. Check: kubectl get pods -n ot-operators.
- StatefulSet PVCs pending due to missing StorageClass.
Note: the OT-Container-Kit v0.25.0 readiness workaround (StatefulSet fallback) is built into the operator — no manual intervention needed.
Elasticsearch not becoming ready (EE)
Common causes:
- ECK not installed. Check: kubectl get pods -n elastic-system.
- ECK cluster needs 3 nodes and significant memory (at least 4Gi per node). Check node capacity.
- Takes 3–5 minutes normally — wait before escalating.
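To check node capacity against the 3-node, 4Gi-per-node requirement, one option is to list allocatable CPU and memory per node. This is a sketch with a hypothetical helper name. Allocatable is the total a node offers to pods, not what is currently free, so also review what is already scheduled on each node.

```shell
# Show allocatable CPU and memory for every node in the cluster.
node_capacity() {
  kubectl get nodes \
    -o custom-columns='NAME:.metadata.name,CPU:.status.allocatable.cpu,MEMORY:.status.allocatable.memory'
}

# Usage:
#   node_capacity
```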
Phase: Failed
Symptom: phase: Failed with a condition reason.
ValidationFailed
The spec failed basic validation. Check the condition message for the specific reason.
Common reasons: missing spec.domains.gitlab, invalid spec.edition, or edition: ee without spec.licenseSecret.
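One way to read the condition message directly is a jsonpath filter on the condition reason. This sketch assumes the condition's `reason` field is literally `ValidationFailed`; the helper name is hypothetical.

```shell
# Print the message of any condition whose reason is ValidationFailed.
validation_message() {
  kubectl get gli "$1" -n "$2" \
    -o jsonpath='{.status.conditions[?(@.reason=="ValidationFailed")].message}'
}

# Usage:
#   validation_message "$NAME" "$NS"
```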
PostgresCRDMissing / RedisCRDMissing / ElasticsearchCRDMissing
managed: true was set but the required backend CRD is not present in the cluster:
# Postgres
kubectl get crd perconapgclusters.pgv2.percona.com
# Redis
kubectl get crd redis.redis.redis.opstreelabs.in
# Elasticsearch
kubectl get crd elasticsearches.elasticsearch.k8s.elastic.co
Install the missing operator (see Installation) and then update the GitlabInstance to trigger a reconcile:
kubectl annotate gli $NAME -n $NS \
kubectl.kubernetes.io/last-applied-configuration- --overwrite
# or change any spec field to trigger reconcile
VersionResolutionFailed
The requested spec.version is not in GitlabVersionMap/default. Either the version does not exist in the map or the alias is not defined.
Update the version map to add the missing version, or correct spec.version to a known value.
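Comparing the requested version with the map contents can be sketched as below. The resource name `gitlabversionmap` is an assumption (confirm with `kubectl api-resources`), and both helper names are hypothetical.

```shell
# Print the version requested by the instance.
requested_version() {
  kubectl get gli "$1" -n "$2" -o jsonpath='{.spec.version}'
}

# Dump the cluster-scoped version map for comparison.
# Assumption: "gitlabversionmap" is the resource name of the CRD.
show_version_map() {
  kubectl get gitlabversionmap default -o yaml
}

# Usage:
#   requested_version "$NAME" "$NS"
#   show_version_map
```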
S3BucketsIncomplete
One or more required bucket-class keys are missing from the S3 credentials Secret.
Add the missing bucket.<class> keys. See Object Storage for the full list.
S3SecretInvalid
The S3 credentials Secret is missing a required top-level key (accessKey, secretKey, endpoint, or region).
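For both S3 failures, it helps to list the keys actually present in the Secret and compare them against the required ones. The sketch below uses two hypothetical helpers; the Secret name is a placeholder, and the bucket.<class> keys to check must be taken from the Object Storage page.

```shell
# List the key names stored in a Secret, one per line.
secret_keys() {
  kubectl get secret "$1" -n "$2" \
    -o go-template='{{range $k, $v := .data}}{{$k}}{{"\n"}}{{end}}'
}

# Print every required key (arguments) that is absent from the
# key list read on stdin.
missing_keys() {
  present=$(cat)
  for key in "$@"; do
    printf '%s\n' "$present" | grep -qxF -- "$key" || printf '%s\n' "$key"
  done
}

# Usage (the Secret name is a placeholder):
#   secret_keys <s3-secret-name> "$NS" \
#     | missing_keys accessKey secretKey endpoint region
```

An empty result means all listed keys are present.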
HelmRelease not reconciling
Symptom: The GitlabInstance shows Deploying but pods are not starting or the HelmRelease shows an error.
kubectl describe helmrelease $NAME -n $NS
kubectl logs -n flux-system deploy/helm-controller --tail=200 | grep $NAME
Common causes:
- Flux controllers not running. Check the pods in the flux-system namespace.
- Wrong HelmRelease API version: the operator targets helm.toolkit.fluxcd.io/v2beta1. Check whether your Flux version still serves this API.
- Chart version not found: the resolved chart version doesn't exist in the chart repo. Check the Flux source-controller logs.
- Values schema error: a spec.helm.values override is incompatible with the chart version. Check the helm-controller logs for validation errors.
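The first three causes can be checked with the commands sketched below. The helper names are hypothetical, and the log filter assumes that source-controller log lines mention the instance name.

```shell
# Are the Flux controllers running?
flux_pods() {
  kubectl get pods -n flux-system
}

# Which helm.toolkit.fluxcd.io versions does the API server serve?
# If v2beta1 is absent from the output, the API targeted by the
# operator is not available.
helm_api_versions() {
  kubectl api-versions | grep '^helm\.toolkit\.fluxcd\.io/'
}

# source-controller log lines that mention the given instance name.
source_logs_for() {
  kubectl logs -n flux-system deploy/source-controller --tail=200 \
    | grep -- "$1"
}

# Usage:
#   flux_pods
#   helm_api_versions
#   source_logs_for "$NAME"
```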
Getting more detail
Operator logs for a specific instance
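A minimal sketch, reusing the operator log command from the logs table and filtering by instance name. It assumes operator log lines include the instance name; the helper name is hypothetical.

```shell
# Operator log lines that mention the given instance name.
operator_logs_for() {
  kubectl logs -n bnerd-gitlab-operator deploy/bnerd-gitlab-operator --tail=200 \
    | grep -- "$1"
}

# Usage:
#   operator_logs_for "$NAME"
```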
Full condition dump
kubectl get gli $NAME -n $NS -o json \
| jq '.status | {phase, host, observedVersion, chartVersion, conditions}'