Size your deployment.
Configure your workload and see the recommended infrastructure in real time. Choose your target environment, add your models, and adjust options to get a production-ready sizing estimate.
Target environment
Workload
Active conversations running simultaneously. Each holds state in Redis. This is the primary driver of memory scaling.
Total daily volume across all agents. Drives storage and Kafka throughput sizing.
How long to retain conversation history and audit logs.
Self-hosted models
Add models to size GPU nodes. Leave empty to use models as a service (MaaS).
Options
Deploy Prometheus, Loki, Tempo, and Grafana on your infrastructure.
Ensure a minimum of 3 nodes per group, distributed across Availability Zones.
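As a rough illustration of the high-availability option, the sketch below applies a floor of 3 nodes to a node group and spreads the result evenly across zones. The function names, the zone names, and the round-robin spreading rule are assumptions made for this example; the calculator's actual placement logic is not described here.

```python
MIN_NODES_PER_GROUP = 3  # floor stated by the high-availability option


def ha_node_count(required_nodes: int, high_availability: bool) -> int:
    """Return the node count for one group, applying the HA floor when enabled."""
    if required_nodes < 0:
        raise ValueError("required_nodes must be non-negative")
    if high_availability:
        return max(required_nodes, MIN_NODES_PER_GROUP)
    return required_nodes


def spread_across_zones(node_count: int, zones: list[str]) -> dict[str, int]:
    """Distribute nodes so that per-zone counts differ by at most one.

    Assumption: an even spread is the goal; earlier zones get the remainder.
    """
    if not zones:
        raise ValueError("at least one zone is required")
    base, extra = divmod(node_count, len(zones))
    return {zone: base + (1 if i < extra else 0) for i, zone in enumerate(zones)}


if __name__ == "__main__":
    # Hypothetical zone names, for illustration only.
    zones = ["zone-a", "zone-b", "zone-c"]
    nodes = ha_node_count(required_nodes=1, high_availability=True)
    print(nodes, spread_across_zones(nodes, zones))
```

With the option enabled, a group that would otherwise need a single node is raised to three, one per zone in this example.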
Cluster management and API server
Managed by Red Hat in ROSA HCP configurations
Platform runtime, Postgres, Redis, Kafka, MinIO, Vault, ORAS, Studio
Redis holds both active conversation state and retained history. Retention period is the primary memory driver.
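To see how these inputs could combine, here is a minimal sketch of a Redis memory estimate that adds active conversation state to retained history. The linear model, the function name, and both per-item byte sizes are assumptions for illustration; they are passed in as parameters because no real values are given here, and this is not the calculator's own formula.

```python
def estimate_redis_memory_bytes(
    concurrent_conversations: int,
    daily_volume: int,
    retention_days: int,
    bytes_per_active_state: int,    # hypothetical: measure on your own workload
    bytes_per_retained_item: int,   # hypothetical: measure on your own workload
) -> int:
    """Estimate Redis memory as active state plus retained history.

    Assumes memory grows linearly with each input and ignores Redis
    overhead, replication, and fragmentation.
    """
    inputs = (
        concurrent_conversations,
        daily_volume,
        retention_days,
        bytes_per_active_state,
        bytes_per_retained_item,
    )
    if any(value < 0 for value in inputs):
        raise ValueError("all inputs must be non-negative")

    active = concurrent_conversations * bytes_per_active_state
    retained = daily_volume * retention_days * bytes_per_retained_item
    return active + retained
```

Under this model the retained term is multiplied by the retention period, so lengthening retention scales that term directly while the active-state term stays fixed.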
- 01 No self-hosted models selected. GPU nodes are not required when using models as a service (MaaS).
- 02 This is an estimated minimum. Contact us for production validation.