● Live · 312 units online · TKO · SFO · FRA

Run Llama 405B locally.
$1,499 a month.
Not $18,000.

Rent a bare-metal Mac Studio M3 Ultra with 512 GB unified memory — the only consumer-class machine that holds Llama 3.1 405B, DeepSeek-V3, and Kimi-K2 1T MoE entirely in memory. SSH-ready in 90 seconds. Yours alone.

Get started, from $499/wk · How it works

312 machines online · 4 regions · 24/7 · SOC 2 · ISO 27001

satoshi@kizai-tko-04 — ssh — 120×34

connected · 11 ms

My Machines

kizai-tko-04 · 512G

kizai-fra-12 · 512G

Usage · kizai-tko-04

MEM 287 / 512 GB

GPU 72%

SSD 2.1 / 8 TB

# Connect over Tailscale (no public IP)
satoshi@local ~ ssh kizai-tko-04
✓ Welcome to macOS 15.4 · M3 Ultra · 512 GB
satoshi@kizai-tko-04 ~ mlx_lm.generate \
    --model meta-llama/Llama-3.1-405B-Instruct \
    --prompt "Explain MoE routing in one paragraph" \
    --max-tokens 512
Loading weights … 230.4 GB into UMA … done (8.2 s)
Mixture-of-Experts routing assigns each token to a small subset of
specialized sub-networks via a learned gating function, so total
parameter count grows without proportionally raising compute

In production at AI teams from research labs to YC startups

YAMABIKO LABS

stanza/ai

DRIFT.RESEARCH

Forerunner

naoroku

ATLAS·M

How we compare

Same frontier models.
About 1/12 the monthly cost.

For memory-bound inference (MoE, long context, 70B+ dense), Apple Silicon wins on $/token and $/GB — and you get a dedicated machine, not a slot in someone's pool.

	AI KIZAI · M3 Ultra 512 GB	1× H100 80 GB (cloud)	8× H100 cluster
Usable memory for weights	512 GB UMA	80 GB HBM3	640 GB (sharded)
Fits Llama 3.1 405B (4-bit)	✓ comfortably	✗	✓ (with vLLM TP-8)
Fits Kimi-K2 1T MoE	✓ (Q3)	✗	tight
List price	$1,499 / month	~$2.49 / hr ≈ $1,800/mo	~$20–32 / hr ≈ $18k/mo
Dedicated hardware	always — bare metal	shared instance	multi-tenant pool
Provision time	90 seconds	~3 min (when available)	quota wait, days
Local dev parity	identical to your MacBook	Linux + CUDA	Linux + CUDA
Idle cost	nothing extra (flat rate)	full hourly rate	full hourly rate
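
The monthly figures in the table can be reproduced with a few lines of arithmetic. The sketch below assumes a 730-hour month (365 × 24 / 12) and an instance left running the whole time; the function and constant names are illustrative and not part of any AI KIZAI tooling.

```python
HOURS_PER_MONTH = 730  # assumption: 365 days * 24 h / 12 months

def monthly_cost(hourly_usd: float, hours: float = HOURS_PER_MONTH) -> float:
    """Cost of keeping an hourly-billed instance running for a full month."""
    return hourly_usd * hours

FLAT_MONTHLY_USD = 1499  # flat rate quoted in the table

single_h100 = monthly_cost(2.49)
cluster_low, cluster_high = monthly_cost(20), monthly_cost(32)

print(f"1x H100: ${single_h100:,.0f}/mo")
print(f"8x H100: ${cluster_low:,.0f} to ${cluster_high:,.0f}/mo")
print(f"$18k vs flat rate: {18_000 / FLAT_MONTHLY_USD:.1f}x")
```

Under these assumptions the $18k cluster figure sits inside the $20–32/hr range, and the ratio to the flat rate comes out at about 12.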

What you can run

Frontier open-weights
that actually fit.

Measured throughput on a 512 GB M3 Ultra running MLX with KV cache and a 4k context window. All weights stay resident in unified memory — no sharding, no offload.

M3 Ultra · 512 GB UMA · 800 GB/s · $1,499 / month · 8 TB SSD · macOS 15

Llama 3.1 405B · MLX 4-bit · 230 GB · 12 t/s · fits

DeepSeek-V3 671B · MoE 4-bit · 380 GB · 8 t/s · fits

Kimi-K2 1T MoE · Q3 · 480 GB · 6 t/s · tight

Llama 4 Behemoth · MLX 4-bit · 460 GB · 5 t/s · tight

Qwen 2.5 72B × 4 · parallel agents · fits
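
As a rough cross-check of the sizes listed above, weight footprint is approximately parameter count times bits per weight. The sketch assumes about 4.5 effective bits per weight for 4-bit quantization (4-bit values plus per-group scale and offset). That value is an assumption rather than a figure from this page, and the estimate ignores KV cache and OS overhead.

```python
UNIFIED_MEMORY_GB = 512  # from the machine spec on this page

def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight size in GB (1 GB = 10^9 bytes)."""
    return params_billion * bits_per_weight / 8

# Assumption: ~4.5 effective bits/weight for 4-bit group quantization.
EFFECTIVE_4BIT = 4.5

for name, params_b in [("Llama 3.1 405B", 405), ("DeepSeek-V3 671B", 671)]:
    size = weight_gb(params_b, EFFECTIVE_4BIT)
    headroom = UNIFIED_MEMORY_GB - size
    print(f"{name}: ~{size:.0f} GB weights, ~{headroom:.0f} GB headroom")
```

With these inputs the estimates land close to the 230 GB and 380 GB listed for the two 4-bit models.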

Pricing

One machine. One price.

Every plan includes the same hardware. Pick the billing cycle that works for your team.

Mac Studio M3 Ultra · 512 GB

M3 Ultra · 32-core CPU · 80-core GPU · 512 GB UMA · 8 TB SSD · macOS 15

Billing cycle

Included in every plan

Dedicated Mac Studio M3 Ultra

512 GB unified memory

8 TB NVMe SSD

Tailscale private mesh

Full root / sudo access

macOS 15 Sequoia

24/7 monitoring & alerts

Snapshot backups on exit

Total

$1,499/month

Cancel before next cycle — no penalty

Continue to checkout

SOC 2 Type II · Stripe-secured billing · 4 regions · 99.9% SLA

The machine

Bare-metal Apple Silicon.
No virtualization tax.

512 GB UMA

Holds 405B models in one piece.

No sharding, no offload — every weight stays resident in unified memory.

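800 GB/s

Full-bandwidth access to every weight.

All 512 GB sits behind one 800 GB/s unified-memory bus, so a model too large for a single 80 GB H100 runs without offloading over PCIe or sharding across GPUs.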

80 GPU cores

Fine-tune locally with MLX + LoRA.

An 80-core GPU driven through Metal, the backend Apple's MLX framework targets.

90 seconds

SSH-ready from card swipe.

Fully automated provisioning. No tickets, no waitlist, no quota.

How it works

SSH into your machine
in under two minutes.

Four steps from card swipe to running a 405B model. No tickets, no infra team, no cold start.

$ kizai rent --m3-ultra-512

Pick a billing cycle

Weekly, monthly, or 3 months. Same hardware. Same SLA. Longer commitments unlock a small discount.

✓ paid · stripe

Pay with Stripe

USD, EUR, JPY, GBP. Cards, Apple Pay, Google Pay, ACH, bank transfer. Receipts via email.

provisioning… 47 s

Auto-provision

We boot a fresh Mac Studio, install Tailscale, mount your home disk, and send you keys.

$ ssh kizai-tko-04

Connect and build

SSH, VNC, or macOS Screen Sharing over a private mesh. Full root. Run whatever you want.

Network

Four regions live.
Two more this quarter.

Pick a region at checkout and swap regions mid-cycle at no charge. All units sit on your private Tailnet: no public IP, no port forwarding, no inbound exposure.

TKO · Tokyo · online · 142 units · AS · 11 ms RTT

OSA · Osaka · online · 48 units · AS · 18 ms RTT

SFO · San Francisco · online · 94 units · NA · 88 ms RTT

FRA · Frankfurt · online · 28 units · EU · 232 ms RTT

SIN · Singapore · Q3 2026

IAD · Virginia · Q3 2026

Use cases

One machine, every workload
that needs a lot of memory.

Run frontier open-weights locally

Llama 3.1 405B, DeepSeek-V3, Kimi-K2 1T MoE — all fit in UMA without sharding.

INFERENCE

Private RAG over sensitive data

Index terabytes of internal docs on a machine in your Tailnet. Documents and weights stay on your dedicated machine.

ENTERPRISE

MLX fine-tuning + LoRA

Fine-tune 7B–70B models with native Metal. Apple's MLX framework + your own LoRA recipes.

RESEARCH

Agent harnesses & long contexts

Keep 200k-token KV caches in memory and serve agents 24/7. No cold starts, no eviction.

AGENTS

ComfyUI & image/video pipelines

SDXL, Flux, AnimateDiff, HunyuanVideo on Metal. Render farms over Tailscale.

GENERATIVE

Xcode CI for App Store builds

Dedicated build runner with Apple notarization access. Faster than any cloud Mac.

iOS / macOS

Questions

Frequently asked.

A dedicated, bare-metal Mac Studio M3 Ultra with 512 GB unified memory, 8 TB SSD, and macOS 15. It is physically yours for the duration of your subscription — no other tenants, no virtualization, no noisy neighbors.

Ready when you are

Your private supercomputer.
Ninety seconds away.

Spin up a Mac Studio M3 Ultra 512 GB in Tokyo, Osaka, San Francisco, or Frankfurt. Cancel any time. No GPU quotas, no waitlists.

Get a machine · View customer dashboard →
