001 · Vision

Turn your cameras into AI agents your team can govern — frame by frame.

Alquimia Vision is the sovereign platform for real-time computer vision at enterprise scale — configurable in plain language, composable across cameras, and governable from prompt to event. No retraining. No data science team.

002 · Value props
01Pillar

Real-time at frame speed

Detect, track, and identify across cameras as it happens. Multi-camera identity stays persistent end to end, so the same person or vehicle is the same entity in your event stream.

02Pillar

Configurable by prompt

Define new use cases in plain language. No retraining, no labeling pipeline, no data science team required. The day your protocol changes, the prompt changes — the pipeline does not.

03Pillar

Sovereign, composable, governable

Run on your infrastructure — on-prem, private cloud, or hybrid. Replace any model. Audit every decision from prompt to event.

003 · How Vision works

How Vision works.

Vision is composable by design. Specialized models do what they are best at — fast, precise detection and tracking. A vision-language model (VLM) does what only it can do — zero-shot semantic reasoning, configured by prompt. The two layers work together so the platform stays both fast and flexible.
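The split between the fast layer and the reasoning layer can be sketched in a few lines. Everything below is illustrative: `Detection`, `Trigger`, and the frame loop are hypothetical shapes, not the platform's actual API. The point is the control flow: the specialized detector runs on every frame, and the VLM is called only when a trigger matches.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Detection:
    """One tracked object in a frame (hypothetical shape)."""
    entity_id: str
    cls: str  # detector class label, e.g. "person" or "vehicle"

@dataclass
class Trigger:
    """A plain-language condition paired with the prompt to run when it fires."""
    when: Callable[[Detection], bool]
    prompt: str

def process_frame(detections: list[Detection],
                  triggers: list[Trigger],
                  call_vlm: Callable[[str, Detection], str]) -> list[tuple[str, str]]:
    """The fast layer already produced `detections` at frame speed;
    the expensive VLM call happens only for detections a trigger selects."""
    out = []
    for det in detections:
        for trig in triggers:
            if trig.when(det):
                out.append((det.entity_id, call_vlm(trig.prompt, det)))
    return out
```

Because the VLM sits behind triggers rather than in the per-frame path, adding a use case means adding a `Trigger`, not retraining a model.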

01Layer

Workflows

No-code configuration: when a condition is met, where in your camera grid, analyze with this prompt. Triggers and prompts are written in plain language.
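A Workflow in this scheme is data, not code: four plain-language fields. A minimal sketch with a hypothetical validator (the field names are illustrative, not the platform's actual schema):

```python
def validate_workflow(wf: dict) -> dict:
    """Check that a Workflow carries the four plain-language fields:
    when (trigger), where (camera/region), analyze (prompt), result (event)."""
    missing = {"when", "where", "analyze", "result"} - wf.keys()
    if missing:
        raise ValueError(f"workflow missing fields: {sorted(missing)}")
    return wf
```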

02Layer

Real-time pipeline

Detector, tracker, and cross-camera embeddings running at frame speed. Persistent identity within and across cameras, so an entity is the same entity from the moment it appears to the moment it leaves your scene.

03Layer

VLM reasoning, three levels

Visual reasoning invoked only when the rest of the system needs it: Level A on a single object crop (attribute classification), Level B on a full frame (spatial relations), Level C on a temporal sequence (what happened over time).
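The three levels differ in what visual input they consume, so a request builder can enforce that contract before anything reaches the VLM. A hedged sketch; `Level` and `build_vlm_request` are hypothetical names, not the platform's API:

```python
from enum import Enum

class Level(Enum):
    A = "object_crop"        # attribute classification on a single crop
    B = "full_frame"         # spatial relations across one frame
    C = "temporal_sequence"  # reasoning over crops sampled over time

def build_vlm_request(level: Level, prompt: str, images: list[bytes]) -> dict:
    """Package a reasoning request, validating the input each level expects."""
    if level is Level.A and len(images) != 1:
        raise ValueError("Level A operates on a single object crop")
    if level is Level.C and len(images) < 2:
        raise ValueError("Level C needs a temporal sequence of samples")
    return {"level": level.value, "prompt": prompt, "images": images}
```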

04Layer

Plugins

Opt-in vertical capabilities for domain-specific tasks: license plate OCR, PPE detection, human pose, face identity, person re-identification. Activate only the ones your case needs.
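Opt-in activation can be as simple as validating a requested set against the available catalog, so nothing runs that a case did not ask for. A minimal sketch with hypothetical plugin identifiers:

```python
# Hypothetical plugin catalog; identifiers are illustrative only.
AVAILABLE_PLUGINS = {"plate_ocr", "ppe_detection", "pose", "face_id", "reid"}

def activate_plugins(requested: set[str]) -> set[str]:
    """Enable only the requested plugins, rejecting unknown names."""
    unknown = requested - AVAILABLE_PLUGINS
    if unknown:
        raise ValueError(f"unknown plugins: {sorted(unknown)}")
    return requested
```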

05Layer

Event stream + governance

Structured events flow through a NATS broker to your existing systems — SOC, observability stack, custom integrations — with OpenTelemetry traces and full audit history. Every decision is inspectable end to end.
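One way to picture a structured event is a small typed record serialized to JSON for the broker. The field names and the `vision.events.*` subject convention below are assumptions for illustration, not the platform's actual schema:

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class VisionEvent:
    """Hypothetical event shape published to the broker."""
    workflow: str
    entity_id: str
    answer: str
    timestamp: str
    trace_id: str  # correlates the event with its OpenTelemetry trace

def to_broker_message(event: VisionEvent) -> tuple[str, bytes]:
    """Build a (subject, payload) pair; subject naming is an assumption."""
    subject = f"vision.events.{event.workflow}"
    return subject, json.dumps(asdict(event)).encode()
```

Carrying the trace ID inside the payload is what lets a downstream SOC or observability stack walk back from an event to every decision that produced it.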

If your reference for vision is closed CCTV analytics or pure VLM-on-every-frame, this is a different architecture.

004 · Demo

How a Workflow looks in production.

Take a security checkpoint use case. Guards in orange vests review visitors who arrive without vests. The goal is to detect, for every visitor, whether they were actually reviewed by a guard. Two Workflows configured in plain language are enough.

Workflow 01
Classify role on entry
WHEN
A new entity is detected (class: person)
WHERE
Camera 04 — Main Entry, anywhere in frame
ANALYZE
Use VLM Level A on a crop of the person. Prompt: "Is this person a guard wearing an orange vest, or a visitor without a vest?"
RESULT
Each person is tagged with their role, once, the moment they appear.
Workflow 02
Verify review on exit
WHEN
A tracked entity leaves the scene (class: person, role: visitor, duration > 60 seconds)
WHERE
Anywhere in the camera grid
ANALYZE
Use VLM Level C on five crops sampled across the entity's time on scene. Prompt: "Was this person reviewed by a guard? At what moment, and by whom?"
RESULT
A structured event is published with the visitor's identity, the answer, the timestamp, and a reference to the guard who reviewed them.

No retraining. No new code. Two prompts and two trigger configurations. If the protocol changes tomorrow (guards now wear green vests, or the use case shifts entirely), only the prompts change. The pipeline stays the same.
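The two Workflows above can be pictured as declarative configuration. Every field name here is hypothetical; the point is that a change of protocol touches only the two prompt strings:

```python
# Illustrative config for the checkpoint demo; not the platform's schema.
workflows = [
    {
        "name": "classify-role-on-entry",
        "when": {"event": "entity_detected", "class": "person"},
        "where": {"camera": "04-main-entry", "region": "anywhere"},
        "analyze": {
            "vlm_level": "A",
            "input": "object_crop",
            "prompt": "Is this person a guard wearing an orange vest, "
                      "or a visitor without a vest?",
        },
    },
    {
        "name": "verify-review-on-exit",
        "when": {"event": "entity_left_scene", "class": "person",
                 "role": "visitor", "min_duration_s": 60},
        "where": {"camera": "any"},
        "analyze": {
            "vlm_level": "C",
            "input": "temporal_samples",
            "samples": 5,
            "prompt": "Was this person reviewed by a guard? "
                      "At what moment, and by whom?",
        },
    },
]
```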

006 · Industries

Industries we serve.

Chips filter the Solutions hub by industry. No deep pages in v1; an industry is promoted to its own page when a customer case is authorized.

01 Security operations
02 Government
03 Manufacturing QC
04 Public safety
05 Compliance
007 · Open source

Open source by design.

Open code, replaceable components, no vendor lock-in. We craft Gaussia, our open evaluation suite, for the community — so every behavioral metric we publish is reproducible in your environment.

Crafted by Alquimia
008 · Insights

From the team.

#vision #architecture
Forthcoming

Composable vision: where the model ends and the prompt begins.

Why the right architecture for real-time vision is not a single big model, but a pipeline that calls a VLM only when it needs to.

Apr 2026
#vision #tracking
Forthcoming

Persistent identity across a camera grid, explained.

What it takes to make sure the same person is the same entity from the moment they appear to the moment they leave the scene.

Mar 2026
#vision #workflows
Forthcoming

Workflows: the no-code config that survives a protocol change.

How When / Where / Analyze / Result keeps the pipeline stable while the prompt absorbs the change.

Mar 2026
009 · Get in touch

Bring your camera feed. We'll walk you through it.

We work with enterprise teams running real-time vision on their own infrastructure. A short call is enough to see if Alquimia Vision is the right fit for your case.

Get in touch