Query live and historical AI inference prices across 15+ providers. One endpoint. Structured JSON. Works in any language in under 60 seconds.
No signup required for the free tier. Just call the endpoint.
```bash
# Get live prices for Llama 3.1 70B across all providers
curl "https://coaxiom.io/api/v1/prices?model=llama-3.1-70b"
```
```json
{
  "model": "llama-3.1-70b",
  "providers": [
    {
      "provider": "Groq",
      "provider_slug": "groq",
      "input_per_1m": 0.59,
      "output_per_1m": 0.79,
      "blended_per_1m": 0.65,
      "buy_url": "https://console.groq.com/?utm_source=coaxiom"
    },
    {
      "provider": "DeepInfra",
      "provider_slug": "deepinfra",
      "input_per_1m": 0.30,
      "output_per_1m": 0.40,
      "blended_per_1m": 0.33,
      "buy_url": "https://deepinfra.com/?utm_source=coaxiom"
    }
    // ... 13 more providers
  ],
  "cheapest": "DeepInfra",
  "last_updated": "2026-05-12T14:22:10Z"
}
```
```python
import requests

# Free tier: no auth required
r = requests.get(
    "https://coaxiom.io/api/v1/prices",
    params={"model": "llama-3.1-70b"},
)
data = r.json()

# Find cheapest provider
cheapest = min(data["providers"], key=lambda p: p["blended_per_1m"])
print(f"Cheapest: {cheapest['provider']} at ${cheapest['blended_per_1m']:.4f}/1M")

# With API key (Developer+ tier): historical data
r2 = requests.get(
    "https://coaxiom.io/api/v1/history",
    params={"model": "llama-3.1-70b", "provider": "groq", "days": 30},
    headers={"Authorization": "Bearer cxm_your_key_here"},
)
history = r2.json()["snapshots"]  # list of OHLC candles
```
```javascript
// Free tier: no auth
const r = await fetch("https://coaxiom.io/api/v1/prices?model=llama-3.1-70b");
const { providers } = await r.json();

// Sort by blended cost
const sorted = [...providers].sort((a, b) => a.blended_per_1m - b.blended_per_1m);
console.log(`Cheapest: ${sorted[0].provider} @ $${sorted[0].blended_per_1m}/1M`);

// Developer+: historical prices
const hist = await fetch(
  "https://coaxiom.io/api/v1/history?model=llama-3.1-70b&provider=groq&days=30",
  { headers: { Authorization: "Bearer cxm_your_key_here" } }
);
const { snapshots } = await hist.json(); // OHLC candles
```
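
The response shape shown above can be loaded into typed records before use. The following is a minimal sketch, assuming the fields in the sample response (`provider`, `provider_slug`, `input_per_1m`, `output_per_1m`, `blended_per_1m`) are always present. `ProviderPrice` and `parse_prices` are hypothetical names for illustration, not part of any official SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProviderPrice:
    """One entry of the "providers" array (field names taken from the sample response)."""
    provider: str
    provider_slug: str
    input_per_1m: float
    output_per_1m: float
    blended_per_1m: float


def parse_prices(payload):
    """Convert a /prices payload into ProviderPrice records, cheapest blended rate first."""
    rows = [
        ProviderPrice(
            provider=p["provider"],
            provider_slug=p["provider_slug"],
            input_per_1m=float(p["input_per_1m"]),
            output_per_1m=float(p["output_per_1m"]),
            blended_per_1m=float(p["blended_per_1m"]),
        )
        for p in payload["providers"]
    ]
    return sorted(rows, key=lambda row: row.blended_per_1m)


# Sample payload trimmed from the response shown on this page.
sample = {
    "model": "llama-3.1-70b",
    "providers": [
        {"provider": "Groq", "provider_slug": "groq",
         "input_per_1m": 0.59, "output_per_1m": 0.79, "blended_per_1m": 0.65},
        {"provider": "DeepInfra", "provider_slug": "deepinfra",
         "input_per_1m": 0.30, "output_per_1m": 0.40, "blended_per_1m": 0.33},
    ],
    "cheapest": "DeepInfra",
}

prices = parse_prices(sample)
```

With live data, pass `r.json()` instead of `sample`. Comparing `prices[0].provider` against the payload's `cheapest` field is a cheap sanity check on the parsed result.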
Base URL: `https://coaxiom.io/api/v1/`

Auth header: `Authorization: Bearer cxm_...`
| Method | Endpoint | Description | Auth |
|---|---|---|---|
| GET | /prices | Live prices across providers. ?model=llama-3.1-70b | Free |
| GET | /providers | All tracked providers + metadata | Free |
| GET | /models | All tracked models | Free |
| GET | /history | OHLC price history. ?model=&provider=&days=30 | API Key |
| GET | /compare | Cross-model comparison. ?models=llama-3.1-70b,gpt-4o | API Key |
| GET | /usage | Your API key quota + usage stats | API Key |
| GET | /news | AI infrastructure news + market signals | Free |
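
The endpoint table can be folded into a small request builder that refuses to call key-only endpoints without a key. This is a sketch under stated assumptions: `build_request` and `KEYED_ENDPOINTS` are hypothetical helpers, and the set of keyed endpoints is copied from the Auth column above.

```python
from urllib.parse import urlencode

BASE_URL = "https://coaxiom.io/api/v1"

# Endpoints that the table above marks as requiring an API key.
KEYED_ENDPOINTS = {"/history", "/compare", "/usage"}


def build_request(endpoint, api_key=None, **params):
    """Return (url, headers) for a GET request against the given endpoint."""
    if endpoint in KEYED_ENDPOINTS and not api_key:
        raise ValueError(f"{endpoint} requires an API key")
    query = urlencode(params)  # percent-encodes values, keeps keyword order
    url = f"{BASE_URL}{endpoint}" + (f"?{query}" if query else "")
    headers = {"Authorization": f"Bearer {api_key}"} if api_key else {}
    return url, headers


url, headers = build_request("/prices", model="llama-3.1-70b")
hist_url, hist_headers = build_request(
    "/history", api_key="cxm_your_key_here",
    model="llama-3.1-70b", provider="groq", days=30,
)
```

The returned pair can be handed to `requests.get(url, headers=headers)`. Failing early on a missing key avoids spending a request on a call that the table indicates would be rejected.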
Start free. Upgrade when you need history, webhooks, or higher volume.
All limits are rolling monthly windows. Rate-limit headers are included in every response.
| Feature | Free | Developer | Team | Enterprise |
|---|---|---|---|---|
| Request limit | 100/day | 10K/mo | 100K/mo | Unlimited |
| Live prices /prices | ✓ | ✓ | ✓ | ✓ |
| Historical data /history | — | 30 days | 90 days | Unlimited |
| Cross-model compare | — | ✓ | ✓ | ✓ |
| Webhooks (price alerts) | — | 1 | 5 | Unlimited |
| Kafka streaming feed | — | — | — | ✓ |
| Snowflake / BigQuery | — | — | — | ✓ |
| Response headers (all tiers) | X-RateLimit-Limit · X-RateLimit-Remaining · X-RateLimit-Reset | | | |
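
The rate-limit headers listed above can drive a simple client-side backoff. The sketch below is illustrative only. It assumes `X-RateLimit-Reset` is a Unix timestamp in seconds, which this page does not specify, so check a real response before relying on it. `seconds_until_reset` is a hypothetical helper name.

```python
import time


def seconds_until_reset(headers, now=None):
    """Return how long to wait before the next request, or 0.0 if quota remains.

    ASSUMPTION: X-RateLimit-Reset is a Unix timestamp in seconds. The format is
    not documented on this page, so verify it against a real response.
    """
    remaining = int(headers.get("X-RateLimit-Remaining", 1))
    if remaining > 0:
        return 0.0
    now = time.time() if now is None else now
    reset_at = float(headers.get("X-RateLimit-Reset", now))
    return max(0.0, reset_at - now)


# Quota exhausted, window resets 100 seconds after the injected "now".
wait = seconds_until_reset(
    {"X-RateLimit-Remaining": "0", "X-RateLimit-Reset": "1000"}, now=900
)
```

In a polling loop, `time.sleep(seconds_until_reset(r.headers))` before the next call keeps the client inside its quota. The `now` parameter exists only to make the function testable without waiting.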
… coming_soon status.

Blended price: blended_per_1m = (input_per_1m × 0.4) + (output_per_1m × 0.6). Most real-world workloads generate more output tokens than input tokens, so this weighting reflects typical usage. You can always use raw input/output rates for exact cost modeling.

SDKs: pip install coaxiom and npm install @coaxiom/sdk. The API is simple enough that a direct fetch/requests call works today; the SDKs add retry logic, rate limit handling, and typed responses.

Free tier requires no API key. Sign up to unlock history, compare, and alerts.
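
The 0.4/0.6 weighting and the raw-rate alternative mentioned above can be written out as two small functions. This is a sketch, not the service's implementation: `blended_per_1m` and `workload_cost` are hypothetical helpers, and the token counts are made up for illustration. Note that the sample response on this page lists blended values (0.65 and 0.33) that this weighting does not reproduce for the same rates (it gives 0.71 and 0.36), so it is worth checking live responses before relying on the weights.

```python
def blended_per_1m(input_per_1m, output_per_1m, input_weight=0.4, output_weight=0.6):
    """Weighted price per 1M tokens, using the 0.4/0.6 split stated above by default."""
    return input_per_1m * input_weight + output_per_1m * output_weight


def workload_cost(input_tokens, output_tokens, input_per_1m, output_per_1m):
    """Exact cost in dollars for a known token mix, using the raw per-1M rates."""
    return (input_tokens * input_per_1m + output_tokens * output_per_1m) / 1_000_000


# Rates taken from the DeepInfra entry in the sample response.
blend = blended_per_1m(0.30, 0.40)

# Hypothetical workload: 2M input tokens and 500K output tokens.
cost = workload_cost(2_000_000, 500_000, 0.30, 0.40)
```

When the real input/output ratio is known, `workload_cost` is the better comparison basis across providers, since a single blended figure only matches workloads that share its assumed token mix.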