Validator Operations
Overview
This guide covers day-to-day operations for Ashen validator nodes: key management, monitoring, backup/restore, and common operational procedures.
For initial setup, see Running a Node. For PoA allowlist management, see PoA Policy.
Key Management
Ed25519 Consensus Keys
Each validator holds an ed25519 keypair used for signing blocks and consensus messages. Generate one with:

    just ed25519-keygen > validator.key.json

    # Deterministic (for reproducible testnet setups)
    just ed25519-keygen --seed 12345

The key file contains `secret_key`, `public_key`, and `address`. Store the secret key securely, because it controls the validator’s identity.
See Keystore & Signatures for key storage details.
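
Because the secret key controls the validator’s identity, one reasonable precaution is to make the key file readable only by its owner. The sketch below is general Unix hygiene rather than a documented Ashen requirement, and `lock_down_key` is a hypothetical helper name.

```shell
# Hypothetical helper: restrict a key file to owner read/write only.
lock_down_key() {
  key_file="$1"
  if [ ! -f "$key_file" ]; then
    echo "no such key file: $key_file" >&2
    return 1
  fi
  chmod 600 "$key_file"
}

# Usage (assumes validator.key.json was produced by `just ed25519-keygen`):
#   umask 077                                  # new files are owner-only
#   just ed25519-keygen > validator.key.json
#   lock_down_key validator.key.json
```

Setting `umask 077` before generating the key avoids a short window in which the file exists with wider permissions.
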
BLS Threshold Keys (DKG)
Validators participate in Distributed Key Generation (DKG) each epoch to produce a shared BLS threshold key used for:
- Finality signatures (threshold BLS aggregation)
- Threshold encryption keys for the private mempool
DKG runs automatically. Monitor its status via:

    curl -s http://localhost:3030/v2/rpc \
      -d '{"jsonrpc":"2.0","id":1,"method":"validator_status","params":{}}' | jq .result.dkg

Response:

    {
      "phase": "complete",
      "epoch": 42,
      "rounds_started": 42,
      "rounds_completed": 41,
      "rounds_failed": 1,
      "fallbacks": 0
    }

DKG phases: idle → commit → share → complaint → finalize → complete (or failed → fallback to deterministic keygen).
If rounds_failed increases, check peer connectivity and logs for DKG-related
errors.
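
One way to notice a rising `rounds_failed` counter is to sample it twice and compare. The following is a minimal sketch that reuses the `validator_status` call shown earlier; the helper names and the 60-second interval are illustrative assumptions, not part of the Ashen tooling.

```shell
# Read a validator_status JSON-RPC response on stdin and print dkg.rounds_failed.
dkg_rounds_failed() {
  jq -r '.result.dkg.rounds_failed'
}

# Succeed (exit 0) when the counter grew between two samples.
dkg_failures_increased() {
  previous="$1"
  current="$2"
  [ "$current" -gt "$previous" ]
}

# Usage: sample twice, some interval apart, and compare.
#   rpc='{"jsonrpc":"2.0","id":1,"method":"validator_status","params":{}}'
#   before=$(curl -s http://localhost:3030/v2/rpc -d "$rpc" | dkg_rounds_failed)
#   sleep 60   # arbitrary interval chosen for illustration
#   after=$(curl -s http://localhost:3030/v2/rpc -d "$rpc" | dkg_rounds_failed)
#   dkg_failures_increased "$before" "$after" && echo "DKG failures rising" >&2
```
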
Monitoring
HTTP Endpoints
The node exposes three HTTP endpoints for monitoring:

| Endpoint | Purpose | Response |
|---|---|---|
| `GET /metrics` | Prometheus metrics | text/plain (Prometheus format) |
| `GET /health` | Liveness probe | JSON with health status |
| `GET /ready` | Readiness probe | JSON with sync/readiness status |
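
The probe endpoints can also be checked by hand with `curl`. The sketch below applies the same rule a Kubernetes `httpGet` probe uses (any status from 200 up to but not including 400 passes). It assumes the node signals an unhealthy or unready state through the HTTP status code, which this guide does not confirm; the helper names are hypothetical.

```shell
# Print the HTTP status code returned by a URL (000 if the connection fails).
http_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$1"
}

# Mirror the Kubernetes httpGet rule: 200 <= code < 400 counts as success.
probe_passes() {
  [ "$1" -ge 200 ] && [ "$1" -lt 400 ]
}

# Usage:
#   for path in /health /ready; do
#     code=$(http_status "http://localhost:3030$path")
#     probe_passes "$code" && echo "$path ok ($code)" || echo "$path FAILED ($code)"
#   done
```
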
These are Kubernetes-compatible. Configure probes:

    livenessProbe:
      httpGet:
        path: /health
        port: 3030
    readinessProbe:
      httpGet:
        path: /ready
        port: 3030

Key Metrics

| Metric | Type | Description |
|---|---|---|
| `ashen_block_height` | Gauge | Current tip height |
| `ashen_finalized_height` | Gauge | Latest finalized height |
| `ashen_epoch` | Gauge | Current consensus epoch |
| `ashen_peer_count` | Gauge | Connected P2P peers |
| `ashen_txpool_size` | Gauge | Pending transactions |
| `ashen_dkg_current_phase` | Gauge | DKG phase (0=idle..6=failed) |
| `ashen_dkg_rounds_started` | Counter | Total DKG rounds started |
| `ashen_dkg_rounds_completed` | Counter | Successful DKG rounds |
| `ashen_poa_validators_allowed` | Gauge | Validators in PoA allowlist |
| `ashen_poa_missed_proposals_total` | Counter | Missed block proposals |
| `ashen_consensus_verify_total` | Counter | Verification outcomes (success/failed) |
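
As a quick check without a full Prometheus setup, the gauges above can be read straight from `/metrics`. This sketch derives a finalization lag as `ashen_block_height` minus `ashen_finalized_height`. It assumes both gauges are exposed without labels; the helper names are illustrative.

```shell
# Print the value of an unlabelled metric from Prometheus text on stdin.
metric_value() {
  awk -v name="$1" '$1 == name { print $2; exit }'
}

# Print tip height minus finalized height from Prometheus text on stdin.
finalization_lag() {
  awk '
    $1 == "ashen_block_height"     { tip = $2 }
    $1 == "ashen_finalized_height" { fin = $2 }
    END { printf "%d\n", tip - fin }
  '
}

# Usage:
#   curl -s http://localhost:3030/metrics | finalization_lag
#   curl -s http://localhost:3030/metrics | metric_value ashen_peer_count
```

Comment lines (`# HELP`, `# TYPE`) are skipped automatically because their first field is `#`, not a metric name.
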
Validator Status RPC
The `validator_status` method returns a comprehensive operational snapshot:

    curl -s http://localhost:3030/v2/rpc \
      -d '{"jsonrpc":"2.0","id":1,"method":"validator_status","params":{}}' | jq .result

Fields returned:

| Field | Description |
|---|---|
| `validator_pubkey` | This node’s public key (null if not a validator) |
| `role` | `leader`, `validator`, `follower`, or `observer` |
| `dkg` | DKG phase, epoch, rounds started/completed/failed, fallbacks |
| `sync` | synced, tip/finalized heights, finalization lag, time since last block |
| `peers` | Connected/inbound/outbound peer counts, banned peers |
| `consensus` | Epoch, view, leader changes, blocks finalized, verifications |
| `epoch_key_available` | Whether the threshold encryption key is ready |
| `uptime_ms` | Node uptime |
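
For scripted checks, two of the top-level fields are enough to tell whether a node is acting as a validator with a usable epoch key. The sketch below relies only on `validator_pubkey` and `epoch_key_available` as described in the table; `validator_ready` is a hypothetical helper name, and treating these two fields as a readiness signal is an assumption.

```shell
# Read a validator_status response on stdin; exit 0 only if the node reports a
# validator public key and the threshold encryption key is available.
validator_ready() {
  jq -e '.result | (.validator_pubkey != null) and (.epoch_key_available == true)' \
    >/dev/null
}

# Usage:
#   curl -s http://localhost:3030/v2/rpc \
#     -d '{"jsonrpc":"2.0","id":1,"method":"validator_status","params":{}}' \
#     | validator_ready || echo "node is not a ready validator" >&2
```

`jq -e` sets the exit status from the last output value, so a `false` result from the filter becomes a non-zero exit code that shell conditionals can use directly.
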
Health Check Response

    {
      "healthy": true,
      "block_height": 12345,
      "finalized_height": 12340,
      "epoch": 42,
      "last_block_time_ms": 1706900000000,
      "uptime_seconds": 86400,
      "peer_count": 5,
      "memory_rss_bytes": 524288000,
      "dkg_current_epoch": 42,
      "dkg_current_phase": 5,
      "status": "healthy",
      "issues": []
    }

Backup & Restore
Data Directory Structure

    node-data/
    ├── genesis.json     # Genesis configuration
    ├── chain/           # Block and transaction data
    ├── state/           # State database (accounts, storage)
    ├── consensus/       # Consensus state, DKG keys
    └── checkpoints/     # Periodic state snapshots

What to Back Up
- `genesis.json`: immutable after init, but essential for recovery
- Validator key file: the ed25519 secret key, the most critical item
- `consensus/`: DKG shares and epoch keys; loss requires re-DKG
- `checkpoints/`: enables fast recovery without full replay
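
The items above can be bundled into a single archive. This is a minimal sketch with a hypothetical `backup_node` helper; it assumes the data directory layout shown earlier. The guide does not say whether the node must be stopped for a consistent copy, so stopping it first is the conservative choice.

```shell
# Hypothetical helper: archive genesis.json, consensus/, checkpoints/ and the
# validator key file into one tarball.
# $1 = node data directory, $2 = validator key file, $3 = output .tar.gz path
backup_node() {
  data_dir="$1"
  key_file="$2"
  out="$3"
  if [ ! -f "$data_dir/genesis.json" ]; then
    echo "missing $data_dir/genesis.json" >&2
    return 1
  fi
  if [ ! -f "$key_file" ]; then
    echo "missing key file: $key_file" >&2
    return 1
  fi
  # Resolve the key's directory to an absolute path, because each -C is
  # interpreted relative to the directory selected by the previous one.
  key_dir=$(cd "$(dirname "$key_file")" && pwd) || return 1
  tar -czf "$out" \
    -C "$data_dir" genesis.json consensus checkpoints \
    -C "$key_dir" "$(basename "$key_file")" \
    && chmod 600 "$out"   # the archive contains the secret key
}

# Usage:
#   backup_node ./node-data ./validator.key.json /secure/backups/node-backup.tar.gz
```
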
Checkpoints
Configure automatic checkpoint creation:

    node run --checkpoint-interval 200  # snapshot every 200 blocks

List available checkpoints:

    curl -s http://localhost:3030/v2/rpc \
      -d '{"jsonrpc":"2.0","id":1,"method":"list_checkpoints","params":{}}' | jq .result

Restore from Checkpoint
- Stop the node.
- Copy `genesis.json` and the validator key to a fresh data directory.
- Start the node with the checkpoint:

      node run --data-dir ./restored-data --sync-from-checkpoint <height>

- The node will sync forward from the checkpoint.
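
The copy step of the restore can be scripted so that an existing directory is never overwritten by accident. This is a sketch only: `prepare_restore_dir` is a hypothetical helper, and placing the key file at the top level of the new data directory is an assumption based on the step above.

```shell
# Hypothetical helper: seed a fresh data directory for a checkpoint restore.
# $1 = directory holding genesis.json, $2 = validator key file, $3 = new data dir
prepare_restore_dir() {
  src="$1"
  key_file="$2"
  dest="$3"
  if [ -e "$dest" ]; then
    echo "refusing to reuse existing path: $dest" >&2
    return 1
  fi
  mkdir -p "$dest" || return 1
  cp "$src/genesis.json" "$dest/" || return 1
  cp "$key_file" "$dest/" || return 1
}

# Usage (after stopping the node):
#   prepare_restore_dir ./node-data ./validator.key.json ./restored-data
#   node run --data-dir ./restored-data --sync-from-checkpoint <height>
```
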
Storage Modes
| Mode | Flag | Description |
|---|---|---|
| Pruned (default) | --prune-keep-epochs 10 | Keeps last N epochs |
| Archive | --archive-mode | Retains all historical state |
Archive mode is required for:
- Historical state queries
- Analytics and indexing
- Serving `state_proof` for old blocks
For disk growth projections, see Disk Growth.
Operational Procedures
Adding a Validator (PoA)
- Generate the validator’s ed25519 keypair.
- Add the public key to the allowlist file.
- Restart nodes or wait for hot-reload.
- Verify: check that `validator_status` shows the new validator.
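
The allowlist edit can be made idempotent so that re-running the procedure does not duplicate an entry. The sketch below ASSUMES the allowlist is a plain text file with one hex public key per line; this guide does not specify the format (see PoA Policy), and `allowlist_add` is a hypothetical helper.

```shell
# Hypothetical helper: append a public key to the allowlist unless it is
# already present.
# ASSUMPTION: the allowlist holds one hex public key per line.
allowlist_add() {
  file="$1"
  pubkey="$2"
  touch "$file" || return 1
  # -x matches whole lines, -F treats the key as a fixed string, not a regex.
  grep -qxF -- "$pubkey" "$file" || printf '%s\n' "$pubkey" >> "$file"
}

# Usage (assumes the public_key field of the key file is the value the
# allowlist expects):
#   allowlist_add /path/to/allowlist.txt "$(jq -r .public_key validator.key.json)"
```
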
Removing a Validator (PoA)
- Remove the public key from the allowlist file.
- Restart nodes.
- Confirm consensus continues (need 2/3+1 remaining).
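
Before removing a validator, it can help to work out whether the remaining set still reaches quorum. The sketch below interprets "2/3+1" as floor(2n/3) + 1 of the n validators left on the allowlist. That reading is a common BFT convention and an assumption here, not something this guide states.

```shell
# Quorum size for n validators, ASSUMING quorum = floor(2n/3) + 1.
quorum_size() {
  echo $(( 2 * $1 / 3 + 1 ))
}

# Succeed when `online` validators are enough for a set of `total`.
# $1 = validators online, $2 = validators on the allowlist after removal
has_quorum() {
  [ "$1" -ge "$(quorum_size "$2")" ]
}

# Usage: with 4 validators remaining, 3 must be online.
#   quorum_size 4
#   has_quorum 3 4 && echo "consensus can continue"
```
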
Emergency Validator Exclusion

    node --poa-allowlist /path/to/allowlist.txt --poa-exclude <pubkey_hex>

Consensus Stall Recovery
If consensus stops progressing:
- Check `validator_status`: is finalization lag growing?
- Check peer connectivity: are enough peers connected?
- Check DKG status: did a round fail?
- Check logs for `leader_timeout` or `notarization_timeout`.
- If needed, restart with adjusted timeouts:

      node run --leader-timeout-ms 4000 --notarization-timeout-ms 8000
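
The log check in the list above can be reduced to a count. This sketch assumes the node's output has been captured to a file and that the two timeout names appear literally in log lines; `count_timeouts` and the log path are hypothetical.

```shell
# Count log lines mentioning either timeout. $1 = log file path.
# grep exits 1 when nothing matches (after printing 0), so that case is
# treated as success; real errors such as a missing file still fail.
count_timeouts() {
  grep -c -E 'leader_timeout|notarization_timeout' "$1" || [ "$?" -eq 1 ]
}

# Usage (node.log is an assumed capture of the node's output):
#   count_timeouts ./node.log
```

A count that keeps growing between runs, together with a growing finalization lag, suggests the timeouts are worth adjusting as described in the last step.
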
Log Configuration

    # Trace output for debugging
    ASHEN_TRACE_OUTPUT=1 node run --data-dir ./node-data

    # Trace to directory
    ASHEN_TRACE_OUTPUT_DIR=target/feedback node run --data-dir ./node-data

    # Rust log level
    RUST_LOG=info,ashen=debug node run --data-dir ./node-data

Related
- Running a Node: initial node setup
- PoA Policy: allowlist management
- Disk Growth: storage projections
- Local Devnet: local testing environment
- RPC API: full method reference
- Configuration: all configuration options