When something goes wrong with your validator, a structured approach saves time. This page outlines a repeatable process for diagnosing issues and knowing when to escalate.

Step 1: Check Logs

Logs are your primary diagnostic tool. Start here before anything else. Capture logs from a Docker Compose deployment:
make capture-logs
This creates a timestamped archive of all container logs in the logs/ directory. For Kubernetes deployments, pull logs from each pod:
kubectl logs -n validator deployment/validator-app --tail=1000 > validator-app.log
kubectl logs -n validator deployment/participant --tail=1000 > participant.log
Analyze logs with lnav (a terminal-based log viewer with filtering and search):
lnav logs/*.log
Inside lnav, press / to search, then filter by severity:
  • :filter-in error — show only error-level entries
  • :filter-in WARN — show warnings too
  • Press TAB to switch between log files
Look for error codes like PARTICIPANT_*, SEQUENCER_*, or MEDIATOR_*. These codes directly point to the category of failure.
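To get a quick overview of which error codes appear in a capture, a grep over the archive can help. This is a convenience sketch; the `logs/` path follows the capture-logs example above, so adjust it to wherever your logs actually live:

```shell
# Count occurrences of Canton error codes across all captured logs.
# The most frequent codes are usually the best starting point.
grep -RhoE '(PARTICIPANT|SEQUENCER|MEDIATOR)_[A-Z_]+' logs/ \
  | sort | uniq -c | sort -rn
```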

Step 2: Check Health Endpoints

Every validator component exposes a health endpoint. Confirm that each is responding:
# Validator app
curl -s http://localhost/api/validator/readyz

# Participant
curl -s http://localhost:5003/health

# Scan (if running an SV node)
curl -s http://localhost/api/scan/readyz
A healthy response returns HTTP 200. If a component returns 503 or fails to respond, focus your investigation there.
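The three checks can be rolled into one loop that prints the HTTP status for each endpoint. Ports and paths follow the examples above; `--max-time` guards against a component that hangs instead of failing:

```shell
# Print the HTTP status code for each component's health endpoint.
# 200 = healthy; 503 or 000 (no response) = investigate that component.
for url in \
  http://localhost/api/validator/readyz \
  http://localhost:5003/health \
  http://localhost/api/scan/readyz
do
  code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$url")
  echo "$code $url"
done
```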

Step 3: Check Connectivity

Verify that your validator can reach the synchronizer sequencer and that no network-level issue is blocking traffic.
# DNS resolution
dig sequencer.sync.global

# TCP connectivity to the sequencer
nc -zv sequencer.sync.global 443

# If on DevNet with VPN, confirm VPN is connected
ping -c 3 sequencer.dev.sync.global
TLS issues often surface as connection hangs or handshake failures. See Connectivity Issues for details.
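To distinguish a TLS problem from plain TCP trouble, you can inspect the handshake directly (hostname as in the DNS example above; substitute the endpoint you actually connect to):

```shell
# Attempt a TLS handshake and print the server certificate's subject
# and validity window; a hang or empty output points at TLS issues.
openssl s_client -connect sequencer.sync.global:443 \
  -servername sequencer.sync.global </dev/null 2>/dev/null \
  | openssl x509 -noout -subject -dates
```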

Step 4: Check the Database

If health endpoints respond but transactions fail or performance degrades, the database is the next suspect.
# Check database connectivity
psql -h $DB_HOST -U $DB_USER -d participant -c "SELECT 1;"

# Check table sizes (large tables may need pruning)
psql -h $DB_HOST -U $DB_USER -d participant -c "
  SELECT relname, pg_size_pretty(pg_total_relation_size(relid))
  FROM pg_catalog.pg_statio_user_tables
  ORDER BY pg_total_relation_size(relid) DESC
  LIMIT 10;
"
If tables are large (tens of GB), enable pruning. See Performance Issues.

Step 5: Check Canton Console

If the validator is running, the Canton console provides direct insight into internal state:
// Overall health
health.status

// Connected synchronizers
participant.synchronizers.list_connected()

// Party registrations
participant.parties.list()

Debugging Web UI Issues

If you hit a connectivity problem in one of the web UIs and are using a Chrome- or Firefox-based browser:
  1. Open the browser menu to the right of the address bar and choose More tools -> Developer tools (or press Ctrl + Shift + I). The developer tools window opens.
  2. Click the Network tab.
  3. Enable the Preserve log (or Persist Logs) check box.
  4. With the developer tools still open, reproduce the issue you want to report.
  5. When you are done, right-click the request list and select Save all as HAR.
  6. Name the file after the issue and save it.
Then repeat steps 2-6 with the Console tab selected instead of the Network tab.

Collecting Configuration for Diagnosis

Collecting all configurations is often helpful for diagnosing an issue:
  • the application configuration files, when running locally,
  • the Helm values, when using the Helm charts (helm get values -n <namespace> <chartname>),
  • the environment variables, when using the Docker containers without the Helm charts.
Also note the version you are running and the network (DevNet, TestNet, or MainNet) you are running against.
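A small script can gather these into one archive for a support ticket. This is a sketch: the namespace, release name, container name, and config paths below are placeholders for your own deployment, and you should redact secrets before sharing the result:

```shell
# Collect configuration for diagnosis into a single archive.
# "validator", "validator-app", and *.conf are placeholders; adjust them.
mkdir -p diagnosis
helm get values -n validator validator > diagnosis/helm-values.yaml 2>/dev/null || true
docker inspect --format '{{range .Config.Env}}{{println .}}{{end}}' \
  validator-app > diagnosis/env.txt 2>/dev/null || true
cp -v *.conf diagnosis/ 2>/dev/null || true   # local config files, if any
tar czf "diagnosis-$(date +%Y%m%d).tar.gz" diagnosis
```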

Common Error Messages

Traffic balance below reserved amount

A log of the form shown below indicates that your validator app has not been able to purchase any traffic. Once the purchased traffic balance falls below a reserved amount, the validator blocks all transactions except those required to purchase more traffic; this avoids the validator locking itself out by not having enough traffic left to complete a traffic purchase. Check the logs for TopupMemberTrafficTrigger to find possible causes. If you only want to rely on free traffic and do not want to purchase any extra traffic, remove the validator top-up config.
ABORTED: Traffic balance below reserved traffic amount (0 < 200000)
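To follow the advice above, a quick filter over the captured app logs surfaces the trigger's recent activity (the `logs/*.log` path assumes the capture examples from Step 1; adjust it to your setup):

```shell
# Show the most recent TopupMemberTrafficTrigger entries, which usually
# explain why the traffic purchase did not go through.
grep -h 'TopupMemberTrafficTrigger' logs/*.log | tail -n 20
```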

Insufficient funds to buy configured traffic amount

A log of the form shown below indicates that your validator app attempted to purchase traffic but does not have enough CC in the wallet of the validator operator party. This is common on TestNet and MainNet for new nodes, as they start out with a balance of 0 and only slowly accrue CC through validator liveness rewards, so often this just requires waiting until enough CC has accrued. Alternatively, an existing node with a CC balance can transfer CC to you to increase your balance. If you only want to rely on free traffic and do not want to purchase any extra traffic, remove the validator top-up config.
Insufficient funds to buy configured traffic amount. Please ensure that the validator’s wallet has enough amulets to purchase 1.9998 MB of traffic to continue healthy operation.

Gave up getting app version

A log of the form below often indicates that you used a scan URL in a place where an SV URL was expected, or the other way around. Note the mismatch between the prefix https://scan. and the path /api/sv at the end. In a docker-compose setup, verify the URL passed to -s, which should be an SV URL; for a Helm deployment, verify svSponsorAddress.
2025-02-11T10:16:13.098Z [⋮] ERROR - o.l.s.v.ValidatorSvConnection:validator=validator_backend (7427be2620676fce8a464eee769eb1d8-app_version-2d71c55f5ecd731b-793d382fa2d6ce14) - Gave up getting 'app version of https://scan.sv-2.dev.global.canton.network.digitalasset.com/api/sv' org.apache.pekko.http.scaladsl.unmarshalling.Unmarshaller$UnsupportedContentTypeException: Unsupported Content-Type [Some(text/html)], supported: application/json
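One way to confirm such a URL mix-up is to look at the Content-Type the endpoint returns: an API endpoint should answer with application/json, while a UI answers with text/html (as in the exception above). `$SV_URL` below stands for whatever URL you configured:

```shell
# Fetch only the response headers and show the Content-Type.
# text/html where you expected an API strongly suggests a swapped URL.
curl -sI "$SV_URL" | grep -i '^content-type'
```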

UNAUTHENTICATED errors in validator, sv and scan app

A log of the form below in the SV, scan, or validator app logs indicates an authentication error on the connection to the participant. Check the participant logs, which will contain more details on why the request was rejected.
2025-02-14T11:32:00.304Z [⋮] INFO - o.l.s.v.ValidatorApp:validator=validator_backend (50836441bf579035d64a56f776566cbf) - The operation 'Get user 7D95xiEUxju4IUXFQgyUrwHMMuZO0g2F@clients' failed with a retryable error (full stack trace omitted): UNAUTHENTICATED: An error occurred. Please contact the operator and inquire about the request efd009557dec03da74dd29b723949cd6 with tid efd009557dec03da74dd29b723949cd6
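The message includes a trace id (tid) that also appears in the participant's logs, so grepping for it links the two sides. The tid below is the one from the example; substitute your own, and adjust the log file path to your deployment:

```shell
# Find the participant-side log lines for the failed request by trace id.
TID=efd009557dec03da74dd29b723949cd6
grep "$TID" participant.log
```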

Node has identity X, but identifier Y was expected

A log of the form below in the validator or SV app indicates that you tried to change the identifier used for your participant (or, for SVs, the sequencer or mediator) after it was already initialized. Note that for validators the node identifier defaults to your validatorPartyHint, so changing that also produces this error. For SVs it defaults to the SV name. If this is a new node, the easiest option is to reset it by dropping the databases of the participant and validator (for SVs: sequencer, participant, mediator, validator, SV, and scan app), then bringing the node up again.
This deletes all data on your node and it cannot be recovered. Only do this on fresh nodes that never successfully initialized.
If this is not a new node, change the values back to what you had before.
Caused by: io.grpc.StatusRuntimeException: INTERNAL: Node has identity a-b-c-1::122098ffcd99..., but identifier a-b-1 was expected.

MemberDisabled error when connecting to sequencer

A log of the form below in your participant indicates that it has been down longer than the 30-day sequencer pruning window, so the SVs have disabled it. Any attempt to connect will fail with the same error. You can recover your CC balance by spinning up a new node via validator_reonboard.
2025-04-16T08:18:06.451Z [⋮] DEBUG - c.d.c.s.c.t.GrpcSequencerSubscription:participant=participant/domainId=global-domain::12206d339948/sequencerAlias=Some-Alias (---) - Completed subscription with Success(GrpcSubscriptionError(Request failed for sequencer.
  GrpcRequestRefusedByServer: FAILED_PRECONDITION/MemberDisabled(PAR::validator1::12203d9ed85f...)
  Request: subscription

Error Escalation

When you encounter an error, use this decision path:
  1. Check the category. Categories 1 and 2 are transient — retry with backoff. If retries fail for more than a few minutes, move to step 2.
  2. Check connectivity. Verify network connectivity, VPN status (for DevNet), DNS resolution, and that the target sequencer endpoint is reachable.
  3. Check resources. Look for DB_STORAGE_DEGRADATION, SERVER_OVERLOADED, or SEQUENCER_BACKPRESSURE in logs. These point to resource constraints.
  4. Category 4 or 5 errors require investigation. Gather full logs, the error code, correlation ID, and node version. Do not repeatedly restart a node showing category 4 errors without understanding the cause.
  5. Contact support with the error code, full correlation ID, timestamps, node version, and relevant log excerpts. The support channels are listed under Escalation Path below.
When sharing logs with support, redact private keys, passwords, and JWT tokens, but preserve error codes, correlation IDs, and timestamps.
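A simple sed pass can strip JWTs (which always start with eyJ) from a log before sharing it. This is a rough sketch with illustrative file names; review the output manually for passwords and keys the pattern cannot recognize:

```shell
# Redact JWT-shaped tokens from a log file before sharing it.
# JWTs are three base64url segments joined by dots, starting with "eyJ".
sed -E 's/eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+/<REDACTED_JWT>/g' \
  validator-app.log > validator-app.redacted.log
```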

Escalation Path

If self-diagnosis does not resolve the issue, escalate in this order:
  • Self-service documentation — search this troubleshooting guide and the cheat sheet
  • Community Slack channels — post in #validator-operations or #gsf-global-synchronizer-appdev with your error message, logs (redacted), and environment details
  • Email support — contact da-support@digitalasset.com for best-effort discretionary support
  • Paid support with SLAs — contact support@digitalasset.com, which opens a tracked Jira ticket
When escalating, always include:
  • Your validator ID and network (DevNet, TestNet, or MainNet)
  • The Splice / SDK version you are running
  • Relevant log excerpts (redact private keys, passwords, and JWT tokens)
  • A timeline of when the issue started and any recent changes