Same ARK runtime as Cloud, deployed on your hardware and licensed per GPU. Full platform with ARK Tailored, core runtime only with ARK Core. Data never leaves your infrastructure. Every price below is a framework — the number on your contract is scoped to your environment.
Both ARK Tailored and ARK Core use the same pricing mechanics. A per-GPU license covers the software and its update cycle. You add modality modules and an extended model catalogue only if you need them. Support tier scales automatically with the GPU count — no separate purchase, same software, same patches across every tier.
We don't publish a fixed per-GPU number because enterprise deployments genuinely differ. Modality mix, model catalogue size, deployment depth, and regional context all matter. Every quote comes out of a 30-minute discovery call — and every line item on the quote maps to one of the components in the formula on the right.
Same per-GPU pricing mechanics on both. The only difference is what's in the box — and whether your team already runs the surrounding services.
Everything in ARK Core plus the managed platform services — identity, logging, telemetry, monitoring, Portal — deployed and configured by ARK on your infrastructure, then handed over to your team.
The ARK inference runtime and nothing else. Designed to drop into a mature platform stack that already runs IAM, logging, monitoring, and an API gateway — so your team takes the engine and runs with it.
A 30-minute discovery call turns the framework above into a real number — tailored to your GPU fleet, modality mix, and operating model.
The base license gives you text generation and 10 models. Modalities and extended catalogue are modular. Same rates across Tailored and Core.
Text generation is included by default. Image generation, vision, embeddings, and speech modules are priced per additional modality per GPU per month. Enable only the ones your workload actually uses.
Every deployment includes up to 10 models from the ARK-curated set. Unlock an extended 10–20 model catalogue for broader coverage — fine-tunes, specialised models, regional language models.
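Put together, the mechanics above (per-GPU base license, plus optional modality modules, plus the optional extended catalogue) reduce to simple arithmetic. A minimal sketch follows; every rate in it is a hypothetical placeholder, not ARK pricing, and the per-GPU treatment of the catalogue add-on is an assumption — the real numbers come out of the discovery call:

```python
# All rates below are hypothetical placeholders for illustration only --
# they are NOT ARK's actual pricing.
BASE_PER_GPU = 1_000               # per-GPU license: text generation + 10 curated models
MODALITY_PER_GPU = 200             # per additional modality, per GPU, per month
EXTENDED_CATALOGUE_PER_GPU = 150   # assumed per-GPU add-on for the 10-20 model catalogue


def monthly_estimate(gpus: int,
                     extra_modalities: list[str],
                     extended_catalogue: bool = False) -> int:
    """Rough monthly license figure under the placeholder rates above.

    `extra_modalities` lists modules beyond the included text generation,
    e.g. ["vision", "speech"]. Rates apply identically to Tailored and Core.
    """
    per_gpu = BASE_PER_GPU
    per_gpu += MODALITY_PER_GPU * len(extra_modalities)
    if extended_catalogue:
        per_gpu += EXTENDED_CATALOGUE_PER_GPU
    return gpus * per_gpu


# e.g. 8 GPUs running vision + speech with the extended catalogue:
# monthly_estimate(8, ["vision", "speech"], extended_catalogue=True)
```

The shape, not the numbers, is the point: every line item on a quote maps to one of these three components.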
A one-time package covering architecture workshops, reference configurations, and light hand-holding during your team's deployment. Included at no charge above a certain GPU volume — ask during discovery.
ARK is a technology supplier, not a consultancy. These are scoped engagements, not hourly bill-backs — so you know what you're getting and what it costs before the work starts.
For platform teams who prefer to install, configure, and operate ARK Core themselves — end-to-end.
For teams who want a sanity check and a structured rollout without ARK engineers doing the work.
ARK engineers install, configure, and hand over. You operate after go-live.
ARK runs the platform on your infrastructure ongoing. You focus on the workloads.
No feature gating between tiers. You get the same runtime, the same update cycle, and the same patches regardless of which support tier your GPU count lands in.
| Tier | GPU Licenses | Channel | Availability | Uptime | Extras |
|---|---|---|---|---|---|
| Standard | Under 25 | | Business hours | 99.0% | Documentation & knowledge base |
| Premium | 25 – 75 | Email + live chat | Extended hours | 99.5% | Named support contact · priority patch delivery |
| Enterprise | 75 or more | Dedicated TAM | 24/7 for critical | 99.9% | Priority hotfix · proactive health checks · architecture reviews |
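Because tier assignment is automatic, it amounts to a threshold lookup on GPU count. A minimal sketch, using the thresholds from the table above (the function name is illustrative; the table lists 75 under both Premium and Enterprise, and we read "75 or more" as Enterprise):

```python
def support_tier(gpu_licenses: int) -> str:
    """Map a deployment's GPU license count to its support tier.

    Thresholds follow the tier table: under 25 GPUs is Standard,
    25-74 is Premium, and 75 or more is Enterprise. No purchase step:
    the tier follows the GPU count on the license.
    """
    if gpu_licenses >= 75:
        return "Enterprise"
    if gpu_licenses >= 25:
        return "Premium"
    return "Standard"
```

So a 24-GPU deployment lands in Standard, while adding one more license moves it to Premium with no change to the software itself.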
Llama-3.1-70B · 4k context · stateful sessions · identical request mix
On Tailored and Core you run ARK on infrastructure you control. Here's exactly where our responsibility ends and yours begins.
ARK runtime components, API Gateway, Supervisor, Compute Nodes, Model Storage. Software defect response under the SLA. Update cycle and patch delivery. Documentation. Architecture reviews at the Enterprise support tier.
Hardware uptime. Networking. Operating system. Storage. Third-party integrations. Monitoring the underlying infrastructure. Patching the OS. Access control policies inside your organisation. Ongoing operations after handover — unless you take Full Managed Deployment on Tailored.
Tell us your GPU count, modality mix, model catalogue, and integration constraints. You'll leave the call with a scoped quote and a deployment path — not a brochure.