An agent platform is the operational layer underneath AI agents in production. This page is the architectural reference — what the platform does, how it composes, where it runs, and what it makes inspectable.
The first AI agent is not the hard one. The fourth is. The shape of the problem changes the moment an organization runs more than one — and that is what Agentic Platform is built for. The architecture is composable, the deployment is sovereign, and every decision an agent makes is traceable from prompt to outcome.
The agent layer is what Alquimia owns. The runtime layer is what the platform integrates with. The foundation is your infrastructure. The six components inside the agent layer share state through the runtime layer and persist artifacts through the foundation. They are the platform.
Production-grade inference runtime — GPU scheduling, model serving, explainability.
No-code agent design. Prompt configuration, tool selection, lifecycle management. The place where an agent is defined once and the definition is the source of truth; a sketch of such a definition follows this list.
Production execution. Agent-to-agent orchestration. Event-driven inference. Execution lives at the platform layer, available to every agent uniformly.
OCI-backed publish and pull. Every agent gets a name, a version, and a namespace. Promotion and revocation happen at the platform layer, not by editing code in a repository.
OpenTelemetry traces. Behavioral metrics. Token analytics. Every inference is inspectable in a single surface across the fleet.
Role-based access control. Single sign-on. Secrets management. Multi-tenant agent spaces. Enterprise primitives at the platform level, applied uniformly to every agent.
For engineering teams that extend the platform, integrate custom tools, or wire agents into existing systems. Extensibility without giving up the platform's properties.
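To make the definition-as-source-of-truth idea concrete, here is a minimal sketch of what an agent definition could look like. The AgentDefinition structure and its field names are hypothetical illustrations, not Agentic Platform's actual schema; only the concepts (a prompt, tools, guardrails, and an OCI-style namespace/name/version reference) come from the descriptions above.

```python
from dataclasses import dataclass, field

# Hypothetical sketch: structure and field names are illustrative,
# not Agentic Platform's actual schema.

@dataclass
class AgentDefinition:
    namespace: str          # e.g. "support"
    name: str               # e.g. "triage-agent"
    version: str            # e.g. "1.4.0"
    prompt: str             # the behavior, captured in plain language
    tools: list[str] = field(default_factory=list)       # tools the agent may call
    guardrails: list[str] = field(default_factory=list)  # constraints applied at runtime

    @property
    def reference(self) -> str:
        """OCI-style pull reference, analogous to a container image tag."""
        return f"{self.namespace}/{self.name}:{self.version}"

triage = AgentDefinition(
    namespace="support",
    name="triage-agent",
    version="1.4.0",
    prompt="Classify each incident against the category taxonomy and route it.",
    tools=["ticketing.classify", "ticketing.route"],
    guardrails=["no-pii-in-logs"],
)

print(triage.reference)  # support/triage-agent:1.4.0
```

Because the name, version, and namespace travel together as one reference, promotion and revocation can operate on that reference at the platform layer, exactly as the registry description above states.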
Agentic Platform plugs into infrastructure your organization already operates. The pattern across every surface is the same: a standard interface, not a vendor-specific binding. The day your organization decides to change one layer, the rest of the stack keeps working. The sketch after this list shows the idea in code.
Standard SSO protocols (OIDC, SAML) — your identity provider.
Standard secrets-management APIs — your secret store.
Kubernetes API — any conformant Kubernetes distribution.
Pub/sub messaging — any compatible broker.
S3-compatible object store — any compliant storage product.
OpenTelemetry — any OTel-compatible observability stack.
Standard inference APIs + custom adapters — any LLM, whether open weights you self-host or hosted endpoints.
Connector framework — any messaging, chat, or email channel.
Standard model serving primitives — any production-grade inference runtime.
Standard explainability surfaces — any open-source explainability framework.
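A minimal sketch of what "standard interface, not vendor binding" means in practice. The endpoints and credentials below are hypothetical placeholders; the client libraries (boto3, an OpenAI-compatible client, the OTLP exporter) are standard, and swapping a layer means changing an endpoint, not rewriting the stack.

```python
# Minimal sketch: endpoints and credentials are hypothetical placeholders.
import boto3
from openai import OpenAI
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

# S3-compatible object store: MinIO, Ceph, or a cloud bucket -- same client.
objects = boto3.client(
    "s3",
    endpoint_url="https://objects.internal.example",  # hypothetical endpoint
    aws_access_key_id="...",
    aws_secret_access_key="...",
)

# OTel-compatible observability: any backend that speaks OTLP.
exporter = OTLPSpanExporter(endpoint="https://otel.internal.example:4317")  # hypothetical

# Standard inference API: self-hosted open weights or a hosted endpoint.
llm = OpenAI(
    base_url="https://inference.internal.example/v1",  # hypothetical endpoint
    api_key="...",
)
```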
Agentic Platform deploys to your account, your perimeter, your governance. Four deployment topologies are in production use today.
On-premises
Your hardware, your datacenter.
Kubernetes on your own hardware in your own datacenter. Maximum sovereignty — data, models, and compute stay within physical infrastructure your organization owns.
Private cloud
Your account, your perimeter.
Kubernetes on private cloud infrastructure. Sovereignty preserved at the data level — your account, your perimeter, your governance.
Public cloud
Your cloud account.
Kubernetes on any of the major public cloud providers. The platform deploys to your cloud account. There is no data path through Alquimia infrastructure.
Hybrid
Any combination.
Any combination — control plane on-prem, inference in a regional cloud, archive in another region. Composition is part of the design.
Every layer in the deployment stack uses standard interfaces, so the choice of model, runtime distribution, or observability stack remains the organization's — not Alquimia's.
Any LLM — open weights you self-host, hosted endpoints, managed AI services. The platform does not require any specific model or vendor.
Any production-grade inference runtime with GPU scheduling and model serving primitives. Open-source explainability frameworks integrate at this layer.
Any conformant Kubernetes — on-prem, private cloud, or major public cloud. Lightweight distributions also supported.
Data, models, and decisions stay inside your perimeter. No data path through Alquimia or any third party unless your organization chooses one.
On-premises: Red Hat OpenShift with OpenShift AI, running self-hosted open-weight models plus any LLM provider you integrate.
AWS: Amazon EKS (or another Kubernetes distribution), with Amazon Bedrock plus any LLM provider.
Google Cloud: Google Kubernetes Engine (or another Kubernetes distribution), with Vertex AI plus any LLM provider.
Azure: Azure Kubernetes Service (or another Kubernetes distribution), with Azure OpenAI plus any LLM provider.
Hybrid deployments combine any of the above — for example, the control plane on OpenShift on-premises with inference in a regional cloud through that cloud's managed AI service. The architectural commitment is the same in all cases: the platform deploys to your account, your perimeter, your governance.
The combinations above are the most common deployment patterns. The specific architectural decisions our customers have made in production — including the reasoning behind their runtime-layer and explainability choices — are shared in our Insights track and in the use cases.
The platform captures six properties of every call. Capturing them is part of the runtime, available to every agent uniformly, with no per-agent instrumentation effort.
Who or what triggered the agent.
The exact prompt the agent received, including any contextual data retrieved at runtime.
The tools the agent was allowed to call and the responses they returned.
The exact model and version that produced the output, with inference parameters.
The output the agent produced, classified by type.
The downstream action the decision triggered.
The six properties flow through the observability layer (OpenTelemetry traces, behavioral metrics, token analytics) and through the governance layer (RBAC, audit trails, retention policies aligned with the organization's regulatory posture).
The audit committee's question — who handled this interaction, with what authority, and what did the model produce? — has a single answer from one observability surface.
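As a sketch of how those six properties might surface in a trace: the span and attribute names below are hypothetical, not the platform's actual schema, but the OpenTelemetry API calls are standard.

```python
from opentelemetry import trace

# Without an SDK TracerProvider and exporter configured, the API is a no-op;
# in production, spans flow to the observability layer via OTLP.
tracer = trace.get_tracer("agent-runtime")  # instrumentation name is illustrative

# Hypothetical attribute names, one per captured property.
with tracer.start_as_current_span("agent.inference") as span:
    span.set_attribute("agent.identity", "customer:4821")           # who/what triggered it
    span.set_attribute("agent.prompt", "<exact prompt + retrieved context>")
    span.set_attribute("agent.tools", ["crm.lookup"])               # tools allowed and called
    span.set_attribute("agent.model", "llama-3.1-70b@v2;temp=0.2")  # model, version, parameters
    span.set_attribute("agent.decision", "response:natural_language")
    span.set_attribute("agent.outcome", "ticket.routed:tier2")      # downstream action
```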
For reproducible behavioral metrics across agents, models, and versions, the platform integrates with Gaussia — the open evaluation suite crafted by Alquimia.
// Hypothetical illustration
Take a typical production deployment: two AI agents collaborating on a customer support pipeline.
Agent A: Triggered by a customer message. Retrieves the customer's account context through a platform-registered tool. Responds in natural language. Logs the full conversation as a structured event.
Agent B: Triggered by every new incident, regardless of channel. Classifies the ticket against the organization's category taxonomy. Prioritizes it based on context — severity, customer tier, recurrence. Routes it to the right technical team. Logs the classification, the model that produced it, and the timestamp.
Both agents are designed in Studio. Each one is defined by three pieces — the prompt that captures its behavior in plain language, the tools it can call, and the guardrails that constrain it. The configuration is a property of the platform layer, available to both agents through Studio, the same way every other agent in the fleet is configured.
When a customer interaction happens, the runtime executes Agent A on the incoming message. Agent A produces the response and emits a structured event. Agent B subscribes to the same event stream and runs its classification. Both agents' traces — identity, prompt, tools, model, decision, outcome — flow into the observability layer. The audit committee, asked retrospectively "who handled this interaction, and how?", gets the answer from one place.
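Here is a minimal, self-contained sketch of that event flow. The in-process bus, the agent functions, and the event fields are hypothetical stand-ins (in production, the platform's pub/sub broker carries the events), but the pattern is the one described above: Agent A emits a structured event and Agent B subscribes to it.

```python
from collections import defaultdict

# In-process stand-in for the platform's pub/sub broker (hypothetical).
subscribers = defaultdict(list)

def subscribe(topic, handler):
    subscribers[topic].append(handler)

def publish(topic, event):
    for handler in subscribers[topic]:
        handler(event)

def agent_a(message):
    """Support assistant: responds, then logs the conversation as a structured event."""
    response = f"Thanks for reaching out about: {message['text']}"  # model call elided
    publish("incidents", {
        "channel": message["channel"],
        "text": message["text"],
        "response": response,
    })
    return response

def agent_b(event):
    """Triage: classifies, prioritizes, and routes every new incident."""
    category = "billing" if "invoice" in event["text"] else "technical"  # model call elided
    print(f"classified={category} routed=tier2 channel={event['channel']}")

subscribe("incidents", agent_b)  # Agent B runs on every event, regardless of channel
agent_a({"channel": "chat", "text": "My invoice looks wrong"})
# -> classified=billing routed=tier2 channel=chat
```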
This walk-through is the pattern. Concrete deployments — including ones in production today — are documented in our use cases and in the Insights track.
We work with organizations building AI agents on their own infrastructure — corporate enterprises, state-owned operators, and regulated mid-market teams. A short call is enough to see if Agentic Platform is the right fit for your case.
Get in touch