005 · Sizing calculator

Size your deployment.

Configure your workload and see the recommended infrastructure in real time. Choose your target environment, add your models, and adjust options to get a production-ready sizing estimate.

Target environment

Workload

Concurrent conversations

Active conversations running simultaneously. Each holds state in Redis. This is the primary driver of memory scaling.

Conversations per day

Total daily volume across all agents. Drives storage and Kafka throughput sizing.

Data retention

How long to retain conversation history and audit logs.
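As a rough illustration of how daily volume and retention could feed into throughput and storage sizing, here is a minimal sketch. The per-conversation figures (`events_per_conversation`, `bytes_per_conversation`) and the sample workload are hypothetical placeholders, not values used by the calculator.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 86_400


@dataclass
class Workload:
    conversations_per_day: int
    retention_days: int


def average_events_per_second(w: Workload, events_per_conversation: int) -> float:
    """Average event rate implied by the daily volume.

    events_per_conversation is a hypothetical input, not a calculator setting.
    """
    return w.conversations_per_day * events_per_conversation / SECONDS_PER_DAY


def retained_bytes(w: Workload, bytes_per_conversation: int) -> int:
    """Bytes of history held once the retention window is full.

    bytes_per_conversation is a hypothetical input, not a calculator setting.
    """
    return w.conversations_per_day * w.retention_days * bytes_per_conversation


# Sample workload chosen only to keep the arithmetic easy to follow.
workload = Workload(conversations_per_day=86_400, retention_days=30)
print(average_events_per_second(workload, events_per_conversation=20))  # 20.0
print(retained_bytes(workload, bytes_per_conversation=10_000))  # 25920000000
```

This gives an average only. Peak traffic is usually higher, so a real estimate would add headroom on top.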

Self-hosted models

Add models to size GPU nodes. Leave empty to use models as a service (MaaS).

Options

Self-hosted observability

Deploy Prometheus, Loki, Tempo, and Grafana on your infrastructure.

High availability (multi-AZ)

Ensures a minimum of 3 nodes per group, distributed across Availability Zones.
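The high-availability rule can be sketched as follows. This is an illustrative assumption about how such a rule might be applied, using round-robin placement; the function name and zone names are hypothetical and do not describe the calculator's actual logic.

```python
MIN_HA_NODES = 3  # minimum nodes per group when high availability is enabled


def ha_node_placement(required_nodes: int, zones: list[str]) -> dict[str, int]:
    """Raise the node count to the HA minimum and spread nodes across zones."""
    count = max(required_nodes, MIN_HA_NODES)
    placement = {zone: 0 for zone in zones}
    for i in range(count):
        placement[zones[i % len(zones)]] += 1  # round-robin across zones
    return placement


zones = ["zone-a", "zone-b", "zone-c"]  # hypothetical zone names
print(ha_node_placement(1, zones))  # {'zone-a': 1, 'zone-b': 1, 'zone-c': 1}
print(ha_node_placement(4, zones))  # {'zone-a': 2, 'zone-b': 1, 'zone-c': 1}
```

A group that needs only one node for capacity still gets three under this rule, one per zone.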

vCPU total

80 GiB memory

GPU

~52 GB storage per node

nodes
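The 80 GiB memory total can be cross-checked against the node group cards that follow. The sketch below infers each group's node count from its per-node and total vCPU figures and assumes the Control Plane's 16 GiB is a per-node value. Both are assumptions made for illustration, not figures stated by the calculator.

```python
from dataclasses import dataclass


@dataclass
class NodeGroup:
    name: str
    vcpu_per_node: int
    vcpu_total: int
    memory_gib_per_node: int

    @property
    def nodes(self) -> int:
        # Node count inferred from the vCPU figures (assumption)
        return self.vcpu_total // self.vcpu_per_node


groups = [
    NodeGroup("Control Plane", 4, 12, 16),  # 16 GiB assumed to be per node
    NodeGroup("Core · App · Data", 8, 8, 32),
]

total_nodes = sum(g.nodes for g in groups)
total_vcpu = sum(g.vcpu_total for g in groups)
total_memory_gib = sum(g.nodes * g.memory_gib_per_node for g in groups)

print(total_nodes, total_vcpu, total_memory_gib)  # 4 20 80
```

Under these assumptions the groups add up to 80 GiB, which matches the memory figure in the summary.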

Control Plane

m7i.xlarge

Cluster management and API server

NODES

vCPU: 4 per node · 12 total

MEMORY: 16 GiB

Managed by Red Hat in ROSA HCP configurations

Core · App · Data

m7i.2xlarge

Platform runtime, Postgres, Redis, Kafka, MinIO, Vault, ORAS, Studio

NODES

vCPU: 8 per node · 8 total

MEMORY: 32 GiB per node

REDIS: ~1.4 GiB (6 MiB active + 1.4 GiB retained)

STORAGE: ~52 GB per node

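Redis holds both active conversation state and retained history. In this estimate, retained history (about 1.4 GiB) far outweighs active state (about 6 MiB), so the retention period is the primary driver of Redis memory.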
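The Redis split can be checked with a few lines of arithmetic. The two inputs are the figures shown on the card; the variable names are illustrative.

```python
MIB_PER_GIB = 1024

active_mib = 6.0                  # active conversation state, from the card
retained_mib = 1.4 * MIB_PER_GIB  # retained history, from the card

total_gib = (active_mib + retained_mib) / MIB_PER_GIB
retained_share = retained_mib / (active_mib + retained_mib)

print(f"~{total_gib:.1f} GiB total, {retained_share:.1%} retained")
# ~1.4 GiB total, 99.6% retained
```

Active state is small enough that it disappears in rounding, which is why the total and the retained figure both read as ~1.4 GiB.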

01 No self-hosted models selected. GPU nodes are not required when using models as a service (MaaS).
02 This is an estimated minimum. Contact us for production validation.