Drain & Graceful Exit

Taking a Lightchain AI worker offline is not the same as stopping its container. The worker has obligations on-chain: every job it has already accepted carries a dispute window during which the disputer can still raise a challenge, and the on-chain registry refuses to release your staked LCAI while any of those obligations remain. Calling deregister before they have all settled reverts with ActiveJobsExist and leaves the stake locked.

This page is the operator runbook for shutting a worker down cleanly: stop new sessions from being routed to it (drain), wait out the dispute window while the worker's internal release scheduler decrements activeJobsCount, then deregister and stop the container. It applies to both testnet and mainnet; substitute the env values for your network.

Why drain before exit

WorkerRegistry.deregister() reverts with ActiveJobsExist whenever getActiveJobCount(worker) > 0. After every completed job, the worker stays "active" for the on-chain dispute window (a governance parameter on AIConfig). During that window the disputer can still raise a challenge, so the contract holds the slot open. Once the dispute window passes, the worker's release scheduler automatically calls JobRegistry.releaseJobs(...) on-chain, which is what actually decrements activeJobsCount. The scheduler runs inside a running worker, probes every 5 minutes (RELEASE_PROBE_INTERVAL), and fires a full cycle within RELEASE_INTERVAL (default 8 h) of each job's expiry. In the normal case, you don't run release manually — just keep the worker running and the count will tick down on its own. If you stop the container before the scheduler has done its work, the count freezes and your stake stays locked until you bring the worker back.

Prerequisites

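The examples below assume the same environment variables are set as in the standard Run a Worker guide (Testnet or Mainnet). The commands on this page reference $RPC_URL, $WORKER_ADDR, $WORKER_REGISTRY_ADDRESS, and $AI_CONFIG_ADDRESS, and a container named lightchain-worker.
cast (from Foundry) is used for read-only chain queries throughout.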

Default exit sequence

The recommended path: drain, wait, optionally withdraw, deregister, stop. Steps 1-2 are mandatory; step 3 is optional housekeeping; steps 4-5 retire the worker.

1. Drain

Tell the dispatcher to stop routing new sessions to your worker. The drain marker is held in Redis on the worker-gateway side for the full dispute window plus slack — it survives restarts.
CodeBASH
Expected output:
CodeHTML

2. Wait, keeping the container running

In-flight jobs continue to settle. New ones no longer arrive. The release scheduler ticks automatically and decrements activeJobsCount over time. Poll until it reaches 0:
    cast call $WORKER_REGISTRY_ADDRESS "getActiveJobCount(address)(uint256)" $WORKER_ADDR --rpc-url $RPC_URL
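If you would rather not re-run the query by hand, it can be wrapped in a loop. The following is a minimal sketch, not part of the worker tooling: the function names active_job_count and wait_for_zero and the 300-second default interval are assumptions chosen for illustration; only the cast query itself comes from this page.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; only the cast query is taken from this runbook.

# Print the worker's current on-chain active job count.
active_job_count() {
  cast call "$WORKER_REGISTRY_ADDRESS" \
    "getActiveJobCount(address)(uint256)" "$WORKER_ADDR" \
    --rpc-url "$RPC_URL" | awk '{print $1}'   # keep only the leading number
}

# wait_for_zero COUNT_CMD [INTERVAL_SECONDS]
# Re-runs COUNT_CMD until it prints 0, sleeping between attempts.
# Returns 1 if COUNT_CMD fails or prints something that is not a number.
wait_for_zero() {
  local count_cmd=$1 interval=${2:-300} count
  while true; do
    count=$("$count_cmd") || return 1
    case $count in
      ''|*[!0-9]*) echo "unexpected count output: '$count'" >&2; return 1 ;;
    esac
    echo "activeJobsCount=$count" >&2
    [ "$count" = "0" ] && return 0
    sleep "$interval"
  done
}

# Usage (blocks until the count reaches 0):
#   wait_for_zero active_job_count 300
```

The counter is passed in as a command name so the loop can be exercised against a stub before pointing it at a live RPC endpoint.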
Worst-case wait is one dispute window plus one release interval after the last completed job — so the dispute window (AIConfig.getDisputeWindow()) plus up to RELEASE_INTERVAL (default 8 h). Most operators see the count drop closer to one dispute window. Verify the dispute window for your network:
    cast call $AI_CONFIG_ADDRESS "getDisputeWindow()(uint256)" --rpc-url $RPC_URL
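The worst-case bound is simple addition: the dispute window plus one release interval. A small sketch of that arithmetic follows; the helper names are hypothetical, and the 86400-second dispute window in the usage comment is a made-up value, so read the real one from AIConfig for your network.

```shell
#!/usr/bin/env bash
# Hypothetical helpers illustrating the worst-case wait arithmetic.

# worst_case_wait_seconds DISPUTE_WINDOW_SECONDS [RELEASE_INTERVAL_SECONDS]
# RELEASE_INTERVAL defaults to 8 h (28800 s), the default quoted on this page.
worst_case_wait_seconds() {
  local dispute_window=$1 release_interval=${2:-28800}
  echo $(( dispute_window + release_interval ))
}

# format_hm SECONDS -> "<hours>h <minutes>m"
format_hm() {
  echo "$(( $1 / 3600 ))h $(( ($1 % 3600) / 60 ))m"
}

# Usage, with an illustrative 86400 s dispute window:
#   format_hm "$(worst_case_wait_seconds 86400)"    # prints "32h 0m"
# With the live value:
#   window=$(cast call "$AI_CONFIG_ADDRESS" "getDisputeWindow()(uint256)" \
#              --rpc-url "$RPC_URL" | awk '{print $1}')
#   format_hm "$(worst_case_wait_seconds "$window")"
```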
Keep the container running through this wait. The release scheduler only runs while the worker process is alive, so stopping the container freezes activeJobsCount. docker stop is recoverable if you intend to restart, but docker rm deletes release_state.json from the container's writable layer and leaves your claimable earnings stranded on-chain until you bring up a new container with the same keystore (see Recovery scenarios). A drained worker receives no new jobs, so the running cost during the wait is just an idle process.

3. (Optional) Withdraw earnings

Earnings accumulate as a per-worker workerBalance on the registry. Check what's claimable, then transfer it to the worker EOA:
CodeBASH
balance reports the on-chain workerBalance accumulated from settled jobs. withdraw drains it to the worker address. From there you can sweep it to your designated payout wallet as described in Rewards and fund handling in the install guides (Testnet, Mainnet).

4. Deregister

Once activeJobsCount is 0, deregister unlocks your staked LCAI and removes the worker from the registry:
CodeBASH
If this reverts with ActiveJobsExist, the scheduler hasn't fully settled yet — wait longer (or use the fast-exit below) and retry. A failed deregister doesn't change on-chain state; you can fix the condition and re-run.
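Although a reverted deregister is harmless, you can avoid sending the transaction at all by checking the count first. The wrapper below is a sketch under that assumption; run_if_zero is a hypothetical name, and the command to run is passed in as arguments so that nothing about the deregister invocation is guessed here.

```shell
#!/usr/bin/env bash
# Hypothetical guard: run a command only when the active job count is 0.

# run_if_zero COUNT COMMAND [ARGS...]
# Returns 2 without running COMMAND when COUNT is anything other than "0".
run_if_zero() {
  local count=$1
  shift
  if [ "$count" != "0" ]; then
    echo "activeJobsCount is '$count'; not running: $*" >&2
    return 2
  fi
  "$@"
}

# Usage:
#   count=$(cast call "$WORKER_REGISTRY_ADDRESS" \
#             "getActiveJobCount(address)(uint256)" "$WORKER_ADDR" \
#             --rpc-url "$RPC_URL" | awk '{print $1}')
#   run_if_zero "$count" <the deregister command from step 4>
```

An empty count (for example, when the RPC query fails) is treated the same as a non-zero one, so the guarded command is skipped rather than attempted blindly.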

5. Stop the container

Only safe once deregister has succeeded:
    docker stop lightchain-worker

Fast-exit

If you don't want to wait up to a full RELEASE_INTERVAL for the next scheduled release cycle, you can manually fire one cycle yourself after the dispute window has passed. This is purely an optimization — there is no functional difference vs. waiting for the scheduler.
    docker exec lightchain-worker /bin/lightchain-worker release
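A manual release only helps once the dispute window for your last job has elapsed. The sketch below estimates how long that still is. It assumes you know when your last job completed (for example, from your own worker logs); the function name and its arguments are hypothetical and not part of the worker CLI.

```shell
#!/usr/bin/env bash
# Hypothetical helper: seconds left before a manual release can take effect.

# seconds_until_releasable LAST_COMPLETED_EPOCH DISPUTE_WINDOW_SECONDS [NOW_EPOCH]
# Prints 0 once the dispute window for the last job has fully elapsed.
seconds_until_releasable() {
  local last_done=$1 window=$2 now=${3:-$(date +%s)} remaining
  remaining=$(( last_done + window - now ))
  [ "$remaining" -lt 0 ] && remaining=0
  echo "$remaining"
}

# Usage:
#   window=$(cast call "$AI_CONFIG_ADDRESS" "getDisputeWindow()(uint256)" \
#              --rpc-url "$RPC_URL" | awk '{print $1}')
#   seconds_until_releasable "$last_job_completed_epoch" "$window"
```

This uses local wall-clock time, while the contract compares against block timestamps, so leave a small margin before firing the release.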
Choose between the default flow and fast-exit based on whether you'd rather be done sooner or be hands-off.

Quick reference

| Concern | Command |
| --- | --- |
| Check activeJobsCount | cast call $WORKER_REGISTRY_ADDRESS "getActiveJobCount(address)(uint256)" $WORKER_ADDR --rpc-url $RPC_URL |
| Check dispute window length (seconds) | cast call $AI_CONFIG_ADDRESS "getDisputeWindow()(uint256)" --rpc-url $RPC_URL |
| Check claimable earnings | docker exec lightchain-worker /bin/lightchain-worker balance |
| Manually fire one release cycle | docker exec lightchain-worker /bin/lightchain-worker release |
| Roll back drain (keep serving) | docker exec lightchain-worker /bin/lightchain-worker undrain --yes |

Recovery scenarios

"I drained by mistake and want to keep serving."
    docker exec lightchain-worker /bin/lightchain-worker undrain --yes
The marker is removed from the gateway's Redis store. The dispatcher's next selector call sees your worker as eligible again. No process restart needed.

"I ran deregister and got ActiveJobsExist."
Your worker is still registered; the failed tx didn't burn anything. Either wait for the scheduler to finish (poll activeJobsCount until 0), or run lightchain-worker release manually, then retry deregister.

"I stopped the container during the wait."
The scheduler can't run while the container is stopped, so activeJobsCount froze wherever it was. Restart the container (docker start lightchain-worker); the scheduler resumes ticking and the count continues decrementing.

"I removed the container (docker rm) before deregistering."
Your claimable funds and registration still exist on-chain, but the local release_state.json is gone. To recover:
  1. Restart a container with the same keystore (docker run with the same env vars you used to bring the worker up in the first place).
  2. The worker reconstructs state via the chain reconciler on startup — watch logs for reconciler: pass complete.
  3. Continue with steps 2-5 of the default flow above.
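Watching for the reconciler message in step 2 can be scripted instead of done by eye. This is a minimal sketch: wait_for_log_line is a hypothetical helper, and it assumes the log line contains the literal text quoted above and that the container is named lightchain-worker as elsewhere on this page.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed as soon as a log stream contains a fixed string.

# wait_for_log_line TEXT
# Reads log lines on stdin; exits 0 at the first line containing TEXT,
# or non-zero if stdin ends without a match.
wait_for_log_line() {
  grep -q -F -- "$1"
}

# Usage (snapshot of the logs so far):
#   docker logs lightchain-worker 2>&1 | wait_for_log_line "reconciler: pass complete"
# Usage (follow the logs until the line appears):
#   docker logs -f lightchain-worker 2>&1 | wait_for_log_line "reconciler: pass complete"
```

In follow mode the pipeline only returns once docker logs writes its next line after the match, and it waits indefinitely if the line never appears, so interrupt it manually if startup looks stuck.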

Important behaviors

  • The release scheduler needs the worker running. Stopping the container pauses settlement; activeJobsCount won't drop while the worker is down. Always leave the container running between drain and deregister.
  • drain is selection-only. The CLI sets the gateway-side marker but does not stop the worker sidecar from finishing jobs that have already been queued for it. To halt the process as well, send SIGTERM — the sidecar writes the drain marker as part of shutdown.
  • deregister is retryable. A failed deregister (e.g. ActiveJobsExist) doesn't change on-chain state. Fix the underlying condition and re-run; nothing is consumed by the revert.
  • Drain marker outlives the dispute window. The gateway holds it for the full dispute window plus slack, so once set it persists across worker restarts until you explicitly undrain or it expires.