LocalEndpoint
Metadata-onlyNo public command dispatch Browser-local validationNo upload intake Private runtime stays localNo localhost probing Invited distributionChecksum-backed artifacts

Phase 3.397 / v1.5.187

Local Model Lifecycle

Local model lifecycle documentation for reviewed downloaded models, immutable snapshots, sidecars, hardware-fit evidence, safetensors conversion boundaries, and worker allocation gates.

Local model lifecycle

Downloaded model files become usable only after Desktop review.

LocalEndpoint is designed for local downloaded models, but a file on disk is not enough. The desktop path records review evidence before a model can become a selected runtime candidate.

License sidecar Inventory sidecar SHA-256 Byte count Hardware fit Selected model identity
Review

Intake is evidence-backed.

Desktop verifies model format, license sidecar, inventory sidecar, hash, byte count, revision, and hardware fit before activation.

  • Model candidates must pass format, license sidecar, inventory sidecar, SHA-256, byte-count, revision, and hardware-fit checks before intake.
  • Accepted snapshots become immutable content-addressed local artifacts; unchanged snapshots record AlreadyStored evidence.
  • Safetensors conversion is evidence-only until converted GGUF or ONNX output re-enters local intake as a new reviewed snapshot.
  • Catalog download, intake activation, selected-model use, worker allocation, prompt tokenization, inference, and token streaming all stay behind local desktop gates.
  • LocalChatViabilityEvidence blocks non-viable runtime, model, offload, local API, memory authority, UAIX authority, or public-site boundary state before a worker envelope exists.
  • Associated GGUF package-folder enumeration failures after scan block local model intake with VerificationBlocked evidence before content-addressed storage write, intake cleanup, or runtime execution can occur.
  • Actual GPU execution claims require runtime adapter backend and device IDs to match the selected offload plan backend and device IDs before LocalEndpoint accepts `GpuObserved` evidence.
Runtime

Execution waits for identity gates.

Worker allocation, prompt tokenization, inference, and token streaming remain blocked until LocalChatViabilityEvidence, the active registry row, requested model entry, session identity, turn identity, provider request fingerprint, and runtime context agree.

  • RequestedModelEntryId prevents stale selected-model use.
  • RequestHandoffSha256 blocks start-to-stream request drift before worker handoff.
  • LocalChatViabilityEvidence records a blocked no-op before worker envelope creation when runtime, model, offload, local API, memory authority, UAIX authority, or public boundary state does not fit.
  • Registry and audit records keep prompt text and generated text out of persisted evidence.
  • Converted outputs must re-enter local intake before runtime use.
Model lifecycle boundary No model upload No hosted inference No provider API relay No public-site prompt intake No public-site model execution

Operating boundary

Public clarity, local authority.

This public site is static metadata and does not dispatch desktop commands, probe localhost, upload files, collect telemetry, request credentials, or claim runtime safety certification.