The ARK runtime, licensed for your stack.

Just the core — Compute Nodes, Supervisor, API Gateway, and Model Storage — deployed on your own infrastructure and wired into the identity, logging, and monitoring you already run. No platform services to rip out, no operational handover, no ongoing vendor dependency. Annual runtime license with a Software Support SLA on ARK components.

Mature platform teams, sovereign deployments, infrastructure integrators.

ARK Core is the streamlined tier for organisations that already run their own platform stack and want to adopt the ARK inference engine without pulling in the full managed platform. You operate it. You integrate it. We license the software and support the ARK components.

MATURE PLATFORM TEAMS

You already run the platform

Keycloak, ELK, and Prometheus/Grafana are production-hardened. You need the inference engine — not another platform to manage.

SOVEREIGN & GOVERNMENT

Maximum control, minimum dependency

Public-sector, defence, and regulated deployments where the vendor footprint must be minimal. Air-gapped operation is supported. The source of truth for identity, logging, and telemetry stays with you.

INFRASTRUCTURE INTEGRATORS

Embed ARK in your platform

Sovereign cloud providers, MSPs, and SaaS platforms integrating ARK as the inference layer inside an existing platform product — not bolting on a second control plane.

Just the core runtime. Attach the rest of your platform.

ARK Core ships the proprietary inference engine and the components that make it run — nothing more. You bring your own identity provider, log aggregator, and metrics stack; the ARK runtime plugs into them through standard interfaces. If you ever need the managed platform services, the upgrade path to ARK Tailored is a configuration change, not a migration.
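For illustration, a minimal sketch of the identity attachment, assuming a Keycloak realm issuing tokens via the standard OIDC client-credentials flow; the hostnames, realm name, client credentials, and gateway endpoint below are placeholders, and how the gateway validates tokens depends on your configuration:

    import requests

    # Obtain a service token from your own IdP (Keycloak shown here; any
    # OIDC-compatible provider works). All names below are placeholders.
    token = requests.post(
        "https://keycloak.internal.example/realms/ark/protocol/openid-connect/token",
        data={
            "grant_type": "client_credentials",
            "client_id": "ark-app",
            "client_secret": "YOUR_CLIENT_SECRET",
        },
        timeout=5,
    ).json()["access_token"]

    # Call the in-perimeter ARK API Gateway with the token your IdP issued.
    models = requests.get(
        "https://ark-gateway.internal.example/v1/models",
        headers={"Authorization": f"Bearer {token}"},
        timeout=5,
    )
    print(models.json())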

Core runtime (included)

[✓ Included] ARK Runtime — inference engine with heterogeneous GPU support, elastic hot-scaling, and session-level KV isolation

[✓ Included] Compute Nodes — execution runtimes bound to specific devices; any mix of GPU generations and vendors

[✓ Included] Supervisor Node — orchestration, shard routing, master-master replication

[✓ Included] API Gateway — edge proxy + OpenAI v1 / Anthropic-compatible API service

[Swappable] Model Storage — Hugging Face-format shared repository (use ARK's or point at your own backend)
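As an illustration of the Hugging Face-format repository, a minimal sketch of mirroring a model into your own storage backend with the huggingface_hub library; the repository id and target path are placeholders, and gated models may also require an access token:

    from huggingface_hub import snapshot_download

    # Mirror a model repository into the shared storage backend the Compute
    # Nodes read from. Repo id and path are placeholders for your deployment.
    snapshot_download(
        repo_id="meta-llama/Llama-3.1-8B-Instruct",
        local_dir="/mnt/ark-models/llama-3.1-8b-instruct",
    )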

Platform services (you attach)

[Client] Identity — bring your own Keycloak, Okta, Entra, or OIDC-compatible IdP for tenancy and API-key management

[Client] Logging — stream structured logs into your existing ELK, Splunk, Datadog, or SIEM

[Client] Telemetry — Prometheus-compatible metrics (including nvidia-smi) consumed by your Grafana, Datadog, or observability stack (see the scrape sketch after this list)

[Client] Monitoring — wire alerts into your own on-call tooling (PagerDuty, Opsgenie, Splunk On-Call, Datadog)

[Tailored] Portal — not included. Add via ARK Tailored if you need a chat & embeddings UI for non-technical teams
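For illustration, a minimal scrape check against a Prometheus-format metrics endpoint, assuming one is exposed on the Supervisor node; the hostname, port, and metric names are placeholders, not ARK defaults:

    import requests
    from prometheus_client.parser import text_string_to_metric_families

    # Hypothetical metrics endpoint; the real host, port, and metric names
    # depend on your deployment and the ARK reference architecture.
    METRICS_URL = "http://ark-supervisor.internal.example:9100/metrics"

    body = requests.get(METRICS_URL, timeout=5).text

    # Walk the Prometheus exposition format and print GPU-related series,
    # exactly as your existing Prometheus or Datadog agent would scrape them.
    for family in text_string_to_metric_families(body):
        if "gpu" in family.name:
            for sample in family.samples:
                print(sample.name, sample.labels, sample.value)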

Same software, minimal footprint. ARK Core ships the same runtime as ARK Tailored and ARK Cloud — identical update cycle, identical patch delivery, no feature gating between tiers. The difference is scope, not capability.

The same workload. Two very different data paths.

Standard integration is the same as you'd build with any LLM provider — ARK just changes where the inference happens, and what crosses your perimeter to get there.

Default pattern (without ARK)
Public LLM provider: prompts cross your perimeter to a third-party cloud. Your application sends prompt + context to a public LLM API in the provider's cloud, and the response returns the same way. Data crosses two boundaries. Your prompts, business context, and the model output transit a third-party network, and logs and audit trails sit with the provider.

ARK pattern (with ARK Core)
Same OpenAI v1 API: inference happens inside your perimeter on your hardware. Your application sends prompt + context to the ARK Runtime on your GPUs, and the response returns to your application without leaving your network. Zero perimeter crossings. Prompts, context, model weights, and the response all stay on your hardware. Logs sit in your SIEM. Audit trails are yours.
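For illustration, a minimal client call against the in-perimeter gateway using the standard OpenAI Python client; the base URL, API key handling, and model name below are placeholders rather than ARK defaults:

    from openai import OpenAI

    # Point the standard OpenAI v1 client at the ARK API Gateway inside your
    # perimeter. Base URL, API key, and model name are illustrative placeholders.
    client = OpenAI(
        base_url="https://ark-gateway.internal.example/v1",
        api_key="YOUR_ARK_API_KEY",
    )

    response = client.chat.completions.create(
        model="llama-3.1-70b-instruct",
        messages=[{"role": "user", "content": "Summarise the attached clause."}],
    )

    # Prompt, context, and response never left your network.
    print(response.choices[0].message.content)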

Your team installs. Your team operates. We support the software.

ARK Core is designed for teams that want minimal vendor involvement in day-to-day operations. Install using ARK-supplied artefacts and manuals — or bring us in for an installation guidance package. Either way, once it is running, your team owns the operational loop; ARK supports the software.

1. License & scope

Annual runtime license sized by GPUs under license. Pick the modalities and model catalogue that match your workload; upgrade at any time.

2. Install

Your team installs using ARK's reference architecture, container images, and manuals. Optional installation guidance package available if you want ARK engineers alongside.

3. Integrate

Wire ARK into your existing IdP, logging pipeline, metrics stack, and API gateway. Standard interfaces; no bespoke glue required.

4. Operate & get support

Your team runs the platform. ARK provides a Software Support SLA covering the ARK runtime components — response times and escalation paths scale with GPU license volume.

SLA scope: ARK Core includes a Software Support SLA covering response times and software defect resolution for ARK runtime components. Infrastructure uptime, networking, hardware, and the platform services you attach remain your team's responsibility. Support tier scales with GPU license volume.

Annual runtime license + support.

ARK Core is sold as an annual runtime license, sized by GPUs under license. No managed-services fee, no platform-services surcharge — you only pay for the runtime. Optional add-ons cover extended model catalogues, additional modalities, and one-time installation guidance.

RUNTIME LICENSE
Annual

Priced per GPU under license. The base allocation includes 10 models and the text modality.

SUPPORT TIER
Scales with GPU volume

Software Support SLA on ARK runtime components. Standard / Premium / Enterprise, auto-selected by license size.

EXTENDED MODELS
+€25 / GPU / month

Upgrade from the 10-model base allocation to a 10–20 model catalogue.

ADD-ON MODALITIES
+€10 / GPU / month

Per additional modality (image, vision, embeddings). Text is always included.
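For illustration, the add-on arithmetic for a hypothetical 16-GPU deployment that takes the extended catalogue and one extra modality (the base runtime license is quoted separately and excluded here):

    # Hypothetical 16-GPU deployment: extended model catalogue plus one add-on
    # modality, at the per-GPU monthly prices listed above. Base license excluded.
    gpus = 16
    extended_models_eur = 25  # EUR per GPU per month
    extra_modality_eur = 10   # EUR per GPU per month

    monthly = gpus * (extended_models_eur + extra_modality_eur)
    print(f"Add-ons: EUR {monthly}/month, EUR {monthly * 12}/year")
    # -> Add-ons: EUR 560/month, EUR 6720/year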

One-time installation guidance package available on request. Professional services for integration or custom configuration priced separately.

The same architectural moat, stripped to the engine.

Everything that makes ARK defensible against vLLM, TensorRT-LLM, and Ollama is in the runtime itself — not the surrounding platform. ARK Core gives you that engine, on your infrastructure, without the rest.

HETEROGENEOUS GPUs

Any mix. Any generation.

Run H100, H200, B200, A100, MI300X, MI325X, and Gaudi 3 in the same cluster. No identical-hardware requirement. No re-provisioning when a generation rolls over.

STANDARD NETWORKING

Ethernet, not InfiniBand.

Multi-host inference runs at around 5 Mbit/s per session. No NVLink or InfiniBand fabric required — ARK runs on the network you already own.
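For scale, at roughly 5 Mbit/s per session, a hundred concurrent multi-host sessions come to about 500 Mbit/s, comfortably within an ordinary 10 GbE link.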

HOT-SCALING

Scale without a restart.

Add or remove compute at runtime. No reload, no session drops, no maintenance window. The cluster keeps operating even after 90–99% of GPUs fail.

SESSION ISOLATION

Multi-tenant by design.

Session-level KV isolation lets multiple tenants or workloads share the same GPU fleet without leaking context — the audit story regulators expect.

ARK Core vs ARK Tailored vs ARK Cloud.

Choose ARK Core when you have a mature platform team and only need the runtime. Choose ARK Tailored when you want the full managed platform on your infrastructure. Choose ARK Cloud for zero-ops serverless inference.

License the ARK runtime for your stack.

Tell us about your GPU footprint, existing platform stack, and integration requirements. We will come back with a license quote, reference architecture, and installation plan.