tare/upgrade
tare upgrade
Upgrade an existing TARS dataplane release. By default, uses the chart
embedded in this tare binary. Pass --chart-version <v> to fetch a specific
chart version from the OCI registry instead, which lets operators upgrade
without downloading a new tare binary.
Reads the live release's Helm values so operator-supplied config
(registry, namespaces, customer, OTel endpoint) is carried forward
without re-prompting. Pass flags to override individual values.
Refuses to fresh-install (use 'tare install' for that). Refuses to
upgrade single-replica installs without --allow-downtime, because
rolling a single data-plane Envoy pod resets (TCP RST) any in-flight
LLM streams.
Defaults to helm upgrade --atomic so a failed rollout auto-rolls back
to the previous revision. Pass --no-atomic to leave a stuck release
in place for debugging.
The CRD apply step is run explicitly because Helm's crds/ directory is
install-only: new CRD fields introduced by a chart upgrade would
otherwise be silently ignored.
Image sync is NOT performed by this command. Sync new images to your
registry first with 'tare install --image-sync <REG> --sync-only',
then run 'tare upgrade'.
Examples:
# Standard upgrade after replacing the tare binary
tare upgrade identity.json
# Force-upgrade a single-replica install (drops in-flight streams)
tare upgrade identity.json --allow-downtime
# Override the carried-forward image registry
tare upgrade identity.json --image-registry acme.registry.com
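The sync-then-upgrade order described above can be sketched as a small wrapper script. This is an illustration, not part of tare: the run and upgrade_with_sync helpers, the DRY_RUN switch, and the placeholder values for REGISTRY, IDENTITY and CHART_VERSION are all assumptions. The sync command is copied as written above; whether it needs further arguments is not stated here. With DRY_RUN=1 (the default in this sketch) the commands are only printed.

```shell
#!/bin/sh
# Sketch only: placeholder values, override them from the environment.
REGISTRY="${REGISTRY:-acme.registry.com}"
IDENTITY="${IDENTITY:-identity.json}"
CHART_VERSION="${CHART_VERSION:-}"   # empty = use the chart embedded in the binary
DRY_RUN="${DRY_RUN:-1}"              # 1 = print commands, anything else = run them

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Hypothetical helper: sync images first, then upgrade.
upgrade_with_sync() {
  # Step 1: 'tare upgrade' does not sync images, so do it beforehand.
  run tare install --image-sync "$REGISTRY" --sync-only || return 1

  # Step 2: upgrade, optionally pinning a chart version from the OCI registry.
  if [ -n "$CHART_VERSION" ]; then
    run tare upgrade "$IDENTITY" --chart-version "$CHART_VERSION"
  else
    run tare upgrade "$IDENTITY"
  fi
}

upgrade_with_sync
```

Stopping on a failed sync (the `|| return 1`) is a design choice in this sketch: it avoids starting a rollout whose images may not yet be present in the registry.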
Usage:
tare upgrade <identity-file> [flags]
Flags:
Main:
--allow-downtime Proceed even if the data-plane is single-replica (drops in-flight requests). Use only on lab installs you intend to migrate later.
--drain-timeout-seconds int EnvoyProxy.spec.shutdown.drainTimeout (seconds). Maximum time Envoy waits for in-flight requests (long LLM streams) to finish before SIGKILL. (default 300)
--enable-semantic-router Override semantic-router enable state (default: carry forward from existing release)
--ha HA-safe defaults for the data-plane Envoy proxy (HPA min 2, PDB min 1). Pass --ha=false to keep single-replica. (default true)
--no-atomic Disable helm --atomic (no auto-rollback on failure). Leaves stuck releases in place for debugging.
--timeout string Helm upgrade --wait timeout. Should exceed drainTimeout × replicas to allow serial drain. (default "10m")
Telemetry:
--enable-otel-collector Override OTel collector enable state (default: carry forward from existing release)
--otel-collector-endpoint string Override OTel OTLP endpoint (default: carry forward from existing release)
--otel-exporter-auth-headers string Override OTel Authorization header value
Other:
--argocd-namespace string Namespace where ArgoCD Applications live; the pre-check uses this to detect a mixed deployment (default: argocd). Set to an empty string to disable the check.
--enable-metrics-server Override metrics-server enable state (default: carry forward from existing release)
--forward-proxy-address string Override ai-gateway.controller.forwardProxyAddress (envoy's LLM forward-proxy host:port). Default: carry forward from existing release; if the address is absent from the release while --http-proxy is set, that is treated as an earlier explicit disable and preserved. Pass --forward-proxy-address="" to disable the LLM tunnel while keeping --http-proxy on for the controller.
--forward-proxy-no-proxy strings Override the per-host opt-out list for envoy's LLM forward-proxy egress (default: carry forward). Comma-separated host-name suffixes (e.g. .openai.azure.com); leading dot tolerated.
--http-proxy string Override HTTP_PROXY env on controller/worker/envoy pods and, by default, the ai-gateway-controller's EGRESS_FORWARD_PROXY_ADDRESS (derived from the proxy's host:port), which drives envoy's HTTP CONNECT egress for LLM upstreams (default: carry forward from existing release; pass --http-proxy="" to clear). Use --forward-proxy-address to override or disable the envoy-side derivation without touching the in-pod HTTP_PROXY.
--https-proxy string Override HTTPS_PROXY env on controller/worker/envoy pods (default: carry forward; pass --https-proxy="" to clear)
--ignore-argocd Proceed even when ArgoCD manages the system namespace (mixed-deployment override; see ADR 046 §12.13).
--no-proxy string Override NO_PROXY env on controller/worker/envoy pods (default: carry forward; pass --no-proxy="" to clear)
--toleration stringArray Pod toleration applied to every data-plane component that schedules on tainted nodes: egress envoy, redis, ratelimit, the label-namespace Job, tareDoctor CronJob, and the configMonitor CronJob (when enabled). Repeatable. Format: key[=value]:effect[:tolerationSeconds]. Examples: --toleration nodepool:NoSchedule (Exists), --toleration nodepool=workload:NoSchedule (Equal). effect ∈ NoSchedule|PreferNoSchedule|NoExecute. Per-component overrides are dashboard-only: use the "Build install values" form to give a single component different tolerations from the rest.
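The --timeout guidance above (it should exceed drainTimeout × replicas) can be checked with a little arithmetic. With the documented defaults, a 300-second drain timeout and an HA minimum of 2 replicas give 300 × 2 = 600 seconds, which equals the default "10m" and does not exceed it, so a larger value may be worth passing. The helper below is a hypothetical sketch, not part of tare, and its 60-second safety margin is an assumption.

```shell
#!/bin/sh
# min_helm_timeout DRAIN_SECONDS REPLICAS [MARGIN_SECONDS]
# Prints drain × replicas + margin as a seconds duration (e.g. "660s").
# The 60-second default margin is an assumption, not a tare default.
min_helm_timeout() {
  mht_drain="$1"
  mht_replicas="$2"
  mht_margin="${3:-60}"
  echo "$(( mht_drain * mht_replicas + mht_margin ))s"
}

# Documented defaults: 300 s drain timeout, HA minimum of 2 replicas.
min_helm_timeout 300 2
```

Assuming --timeout accepts a seconds value such as 660s in the same way it accepts 10m, the result could be passed directly: tare upgrade identity.json --timeout "$(min_helm_timeout 300 2)".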