Per-GPU license.
Your infrastructure. Your rules.

Same ARK runtime as Cloud, deployed on your hardware and licensed per GPU. Full platform with ARK Tailored, core runtime only with ARK Core. Data never leaves your infrastructure. Every price below is a framework — the number on your contract is scoped to your environment.

One license, one set of levers.

Both ARK Tailored and ARK Core use the same pricing mechanics. A per-GPU license covers the software and its update cycle. You add modality modules and an extended model catalogue only if you need them. Support tier scales automatically with the GPU count — no separate purchase, same software, same patches across every tier.

We don't publish a fixed per-GPU number because enterprise deployments genuinely differ. Modality mix, model catalogue size, deployment depth, and regional context all matter. Every quote comes out of a 30-minute discovery call — and every line item on the quote maps to one of the components in the formula on the right.

YOUR QUOTE · BILLED MONTHLY OR ANNUAL GPU license × GPUs under management ALWAYS + Modality add-ons · +€10 / GPU / month OPTIONAL + Extended model catalogue · +€25 / GPU / month OPTIONAL + Software Support SLA · auto by GPU count INCLUDED Professional-services packages are separate — only pay for the hand-holding you actually need.
Pick the scope that fits your stack.

Same per-GPU pricing mechanics on both. The only difference is what's in the box — and whether your team already runs the surrounding services.

FULL PLATFORM

ARK Tailored

Everything in ARK plus the managed platform services — identity, logging, telemetry, monitoring, Portal — deployed and configured by ARK on your infrastructure, then handed over to your team.

WHAT'S IN THE BOX
ARK Runtime · Compute Nodes · Supervisor
API Gateway (edge proxy + service)
Model Storage
Identity Service (Keycloak) — or BYO
Logging + Telemetry + Monitoring
Portal Service (optional)
Included Optional
Pricing mechanics
  • Per-GPU license — billed monthly or annual
  • Modality add-ons (text included; others +€10/GPU/mo)
  • Extended model catalogue (10–20 models) +€25/GPU/mo
  • Support tier auto-scales with GPU count
  • Optional managed-services & professional packages
Fits when: you want the full inference platform configured to your environment, with ARK doing the deployment work and your team (or an infrastructure partner) operating after handover.
Book a discovery call
RUNTIME ONLY · SELF-DEPLOY

ARK Core

The ARK inference runtime and nothing else. Designed to drop into a mature platform stack that already runs IAM, logging, monitoring, and an API gateway — so your team takes the engine and runs with it.

🚢
Self-service installer — on the roadmap. A signed executable that provisions ARK Core, registers a license key, and activates against a per-GPU count — no ARK engineers in the loop. Until it ships, deploy with documentation and an optional guidance package.
WHAT'S IN THE BOX
ARK Runtime · Compute Nodes · Supervisor
API Gateway
Model Storage (swappable for your own)
Identity · BYO (e.g. your existing IdP)
Logging · BYO (ELK, Loki, etc.)
Telemetry · BYO (Prometheus, Datadog)
Included Bring your own
Pricing mechanics
  • Per-GPU license — billed monthly or annual, same rate as Tailored
  • Modality add-ons (text included; others +€10/GPU/mo)
  • Extended model catalogue (10–20 models) +€25/GPU/mo
  • Support tier auto-scales with GPU count
  • Installation guidance strictly optional
Fits when: you already standardised on Keycloak / Prometheus / ELK (or your own alternatives) and don't want a second stack. Maximum control, minimum vendor footprint, self-deploy from documentation.
Book a discovery call
Same pricing on both tiers. The per-GPU license rate is identical for ARK Tailored and ARK Core. The choice between them is a scope & operating-model decision — not a price decision.

Ready to scope this to your environment?

A 30-minute discovery call turns the framework above into a real number — tailored to your GPU fleet, modality mix, and operating model.

Only pay for what you turn on.

The base license gives you text generation and 10 models. Modalities and extended catalogue are modular. Same rates across Tailored and Core.

Additional modalities

+€10 / GPU / month

Text generation is included by default. Image generation, vision, embeddings, and speech modules are priced per additional modality per GPU per month. Enable only the ones your workload actually uses.

Extended model catalogue

+€25 / GPU / month

Every deployment includes up to 10 models from the ARK-curated set. Unlock an extended 10–20 model catalogue for broader coverage — fine-tunes, specialised models, regional language models.

Installation guidance

ONE-TIME FEE

A one-time package covering architecture workshops, reference configurations, and light hand-holding during your team's deployment. Included at no charge above a certain GPU volume — ask during discovery.

Self-deploy, or we deploy. Your call.

ARK is a technology supplier, not a consultancy. These are scoped engagements, not hourly bill-backs — so you know what you're getting and what it costs before the work starts.

ARK CORE Self-deploy paths
DOCS-ONLY COMING SOON

Self-deploy with documentation

For platform teams who prefer to install, configure, and operate ARK Core themselves — end-to-end.

  • Full ARK Core documentation & reference architectures
  • Installation manuals for supported OS / GPU combinations
  • Configuration examples for common IAM / logging / telemetry stacks
  • Access to the Software Support SLA from day one
  • Version update notes and upgrade playbooks
Target time to first inference call: hours to days
Will be bundled with every license — currently in preparation
Book a discovery call
LIGHT TOUCH · OPTIONAL

Install Guidance package

For teams who want a sanity check and a structured rollout without ARK engineers doing the work.

  • Architecture workshop with your infrastructure team
  • Reference deployment topology for your environment
  • Infrastructure sizing & GPU capacity plan
  • Installation runbook tailored to your stack
  • Two structured Q&A sessions during rollout
  • Dedicated support channel for the duration
Typical duration: 2–4 weeks
Quote on request
Book a discovery call
ARK TAILORED ARK-delivered paths
DELIVERED

Standard Deployment

ARK engineers install, configure, and hand over. You operate after go-live.

  • Everything in Install Guidance
  • ARK engineers install the platform on your infrastructure
  • Configuration scoped to your models, modalities, identity provider
  • Integration with your existing gateway / IAM / logging / telemetry
  • Load-testing against your workload profile
  • Operational handover & runbook walkthrough
  • 30 days of co-pilot support post go-live
Typical duration: 4–8 weeks
Quote on request
Book a discovery call
FULLY OPERATED

Full Managed Deployment

ARK runs the platform on your infrastructure ongoing. You focus on the workloads.

  • Everything in Standard Deployment
  • ARK operates the platform on your infrastructure ongoing
  • Patch management, upgrades, monitoring, on-call
  • Quarterly architecture reviews & capacity planning
  • Enterprise Support SLA included
  • Named Technical Account Manager
  • Available in combination with ARK Tailored only
Ongoing engagement: annual contract
Quote on request
Book a discovery call
Same software, same patches.
Uptime scales with GPU count.

No feature gating between tiers. You get the same runtime, the same update cycle, and the same patches regardless of which support tier your GPU count lands in.

Tier GPU Licenses Channel Availability Uptime Extras
Standard AUTO Under 25 Email Business hours 99.0% Documentation & knowledge base
Premium AUTO 25 – 75 Email + live chat Extended hours 99.5% Named support contact · priority patch delivery
Enterprise AUTO 75 or more Dedicated TAM 24/7 for critical 99.9% Priority hotfix · proactive health checks · architecture reviews
Scope: The Software Support SLA covers ARK components — software uptime and software defect resolution. Infrastructure availability, networking, OS, storage, and third-party integrations remain the customer's (or their infrastructure partner's) responsibility. ARK warrants the software performs as documented when deployed per the ARK specifications delivered with your engagement.
Every GPU you own works 5–8× harder.
The per-GPU license looks expensive in isolation. But on identical hardware, the ARK runtime packs 5–8× the concurrent sessions compared with vLLM or TensorRT-LLM. Fewer GPUs. Fewer racks. Fewer kilowatts. Same workload.

Concurrent sessions per 8× A100 node

Llama-3.1-70B · 4k context · stateful sessions · identical request mix

Identical hardware 8× NVIDIA A100 80GB · standard Ethernet
ARKARK Runtime
640 sessions
5.8×
vs vLLM
vLLMStateless batching
110 sessions
Baseline
TensorRT-LLMStatic engine
95 sessions
-14%
vs vLLM
A clear responsibility boundary.

On Tailored and Core you run ARK on infrastructure you control. Here's exactly where our responsibility ends and yours begins.

ARK

Software & support

ARK runtime components, API Gateway, Supervisor, Compute Nodes, Model Storage. Software defect response under the SLA. Update cycle and patch delivery. Documentation. Architecture reviews at the Enterprise support tier.

Customer (or infrastructure partner)

Infrastructure & operations

Hardware uptime. Networking. Operating system. Storage. Third-party integrations. Monitoring the underlying infrastructure. Patching the OS. Access control policies inside your organisation. Ongoing operations after handover — unless you take Full Managed Deployment on Tailored.

Ready for a real number?
A 30-minute call is all it takes.

Tell us your GPU count, modality mix, model catalogue, and integration constraints. You'll leave the call with a scoped quote and a deployment path — not a brochure.