Two ways to run ARK.
Pick how you buy, not how it runs.

ARK Cloud is self-serve, token-based, and public. Sign up with an email, get free credits, ship today. ARK Tailored and ARK Core are per-GPU licenses deployed on your own infrastructure — consultative, scoped to your stack, quote-based. Same runtime underneath. Different commercial surface.

ARK RUNTIME ARK CLOUD 01 REST API · EU-hosted POST /v1/chat Bearer sk-eu-… ↳ 200 · 42ms REGION FRA · WAW MODELS Llama · Mistral · Qwen PAY PER TOKEN $1 · 1M credits ARK TAILORED · CORE 02 Private · your infra H100 A100 MI300 TOPO Multi-node · Ethernet FIT on-prem · private cloud PER-GPU LICENSE Quote on request
☁ ARK CLOUD · SERVERLESS

Building or experimenting.

Fastest way to run frontier open-source models behind an OpenAI-compatible API. Hosted in the EU. Sign up with an email, get free credits, ship today.

Pricing model
Pay per token
Commit
None — credits never expire
Starting at
Free credits, no card
Sales contact
Not required
  • OpenAI v1 / Anthropic-compatible API
  • 10+ frontier open models (text, vision, code, reasoning)
  • EU-only data residency, zero retention by default
  • Credits never expire, no auto-renew, no hidden fees
  • Best-effort 99% availability on shared endpoints
⚙ ARK TAILORED & ARK CORE · YOUR INFRASTRUCTURE

Deploying on your own GPUs.

The same ARK runtime, licensed per GPU and deployed into your environment — cloud, on-prem, or hybrid. Full platform (Tailored) or core runtime only (Core), your call. Data never leaves your infrastructure.

Pricing model
Per-GPU license + add-ons
Billing
Monthly or annual · multi-year available
Starting at
Quote on request
Sales contact
Required today
  • Per-GPU license — Tailored (full platform) or Core (runtime only)
  • Any GPU mix, any generation — heterogeneous by design
  • Standard Ethernet — no NVLink or InfiniBand required
  • Add-ons for modalities, extended model catalogue, installation
  • Software Support SLA, tier scales with GPU license count
A 60-second decision helper.

Three quick questions. The honest answer points to the right tier.

Pricing — the honest answers.
Can I start on Cloud and move to our own GPUs later?
Yes. It's the same runtime. Same OpenAI v1-compatible API. The model catalogue overlaps heavily. Most of our enterprise customers started with a Cloud trial, validated performance, then moved the production workload to their own infrastructure under ARK Tailored or ARK Core.
Why don't you publish a per-GPU price for Tailored and Core?
Because it's not a transaction. Deployments differ by modality mix, model catalogue size, support tier, integration work, and regional context. Publishing a single number would either underquote one customer or overquote another. We'd rather have a 30-minute discovery call and give you a real number.
What counts as "a GPU" in the per-GPU license?
Any physical accelerator attached to a Compute Node and used to serve inference under ARK. We don't meter per vGPU, per MIG partition, or per tensor core — just the physical units you put under ARK's control. Mixed generations (3060 to H100) count as one GPU each.
Do the per-GPU numbers include modalities beyond text?
Text generation is always included. Additional modalities (image, vision, embeddings, speech) are modular add-ons at a fixed per-GPU per-month rate. See the Enterprise pricing page for the framework.
Is installation and configuration included in the license?
The license covers the software. ARK Core can be self-deployed from documentation at no additional fee, with an optional Install Guidance package if you want a light touch. ARK Tailored is delivered by ARK engineers under a Standard or Full Managed Deployment package. You only pay for the level of hand-holding you actually need.
Will ARK Core ever be fully self-service?
Yes — that's the roadmap. ARK Core is moving toward a signed installer that provisions the runtime, registers a license key, and activates against a per-GPU count with no ARK engineers in the loop. Until it ships, Core is available today with documentation-only self-deploy or an optional guidance package.
What does the SLA cover?
On Cloud, ARK operates the entire stack and is accountable for availability and latency on shared endpoints. On Tailored and Core, ARK's SLA covers software defect response times and resolution on ARK components. Infrastructure uptime is the customer's (or their infrastructure partner's) responsibility. Support tier is determined by GPU license count and is the same software, same update cycle — no feature gating.