REST API · JSON · No SDK Required

Inference pricing.
In your code.

Query live and historical AI inference prices across 15+ providers. One endpoint. Structured JSON. Works in any language in under 60 seconds.

Get free API key → See endpoints
Quick Start

60 seconds to your first response

No signup required for the free tier. Just call the endpoint.

curl
# Get live prices for Llama 3.1 70B across all providers
curl "https://coaxiom.io/api/v1/prices?model=llama-3.1-70b"
JSON response
{
  "model": "llama-3.1-70b",
  "providers": [
    {
      "provider":         "Groq",
      "provider_slug":   "groq",
      "input_per_1m":    0.59,
      "output_per_1m":   0.79,
      "blended_per_1m":  0.71,
      "buy_url":         "https://console.groq.com/?utm_source=coaxiom"
    },
    {
      "provider":         "DeepInfra",
      "provider_slug":   "deepinfra",
      "input_per_1m":    0.30,
      "output_per_1m":   0.40,
      "blended_per_1m":  0.36,
      "buy_url":         "https://deepinfra.com/?utm_source=coaxiom"
    }
    // ... 13 more providers
  ],
  "cheapest":  "DeepInfra",
  "last_updated": "2026-05-12T14:22:10Z"
}
python
import requests

# Free tier — no auth required
r = requests.get(
    "https://coaxiom.io/api/v1/prices",
    params={"model": "llama-3.1-70b"}
)
data = r.json()

# Find cheapest provider
cheapest = min(data["providers"], key=lambda p: p["blended_per_1m"])
print(f"Cheapest: {cheapest['provider']} at ${cheapest['blended_per_1m']:.4f}/1M")

# With API key (Developer+ tier) — historical data
r2 = requests.get(
    "https://coaxiom.io/api/v1/history",
    params={"model": "llama-3.1-70b", "provider": "groq", "days": 30},
    headers={"Authorization": "Bearer cxm_your_key_here"}
)
history = r2.json()["snapshots"]  # list of OHLC candles
javascript / node
// Free tier — no auth
const r = await fetch("https://coaxiom.io/api/v1/prices?model=llama-3.1-70b");
const { providers } = await r.json();

// Sort by blended cost
const sorted = [...providers].sort((a, b) => a.blended_per_1m - b.blended_per_1m);
console.log(`Cheapest: ${sorted[0].provider} @ $${sorted[0].blended_per_1m}/1M`);

// Developer+ — historical prices
const hist = await fetch("https://coaxiom.io/api/v1/history?model=llama-3.1-70b&provider=groq&days=30", {
  headers: { Authorization: "Bearer cxm_your_key_here" }
});
const { snapshots } = await hist.json(); // OHLC candles
API Reference

Endpoints

Base URL: https://coaxiom.io/api/v1/
Auth header: Authorization: Bearer cxm_...

Method  Endpoint    Auth     Description
GET     /prices     Free     Live prices across providers. Query: ?model=llama-3.1-70b
GET     /providers  Free     All tracked providers + metadata
GET     /models     Free     All tracked models
GET     /history    API Key  OHLC price history. Query: ?model=&provider=&days=30
GET     /compare    API Key  Cross-model comparison. Query: ?models=llama-3.1-70b,gpt-4o
GET     /usage      API Key  Your API key quota + usage stats
GET     /news       Free     AI infrastructure news + market signals
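
The endpoint list maps onto a small request builder. What follows is a minimal sketch, not an official client: the helper name `build_request` is hypothetical, and only the paths, query parameters, and Bearer header format come from this page. The response shapes for /compare and /usage are not shown here, so the sketch stops at building the request.

```python
from urllib.parse import urlencode

BASE_URL = "https://coaxiom.io/api/v1"

# Endpoints marked "API Key" in the endpoint list; the others are free.
KEYED_ENDPOINTS = {"history", "compare", "usage"}


def build_request(endpoint, params=None, api_key=None):
    """Return (url, headers) for a GET call. Hypothetical helper, not an SDK function."""
    endpoint = endpoint.strip("/")
    if endpoint in KEYED_ENDPOINTS and not api_key:
        raise ValueError(f"/{endpoint} requires an API key")
    url = f"{BASE_URL}/{endpoint}"
    if params:
        # safe="," keeps comma-separated lists such as models=a,b unescaped
        url += "?" + urlencode(params, safe=",")
    headers = {"Authorization": f"Bearer {api_key}"} if api_key else {}
    return url, headers


url, headers = build_request(
    "compare",
    params={"models": "llama-3.1-70b,gpt-4o"},
    api_key="cxm_your_key_here",
)
print(url)
```

The returned pair can be passed straight to `requests.get(url, headers=headers)`.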
Pricing

Simple, usage-based tiers

Start free. Upgrade when you need history, webhooks, or higher volume.

Free
$0
forever
100 requests / day
  • Live prices endpoint
  • All providers + models
  • JSON responses
  • No API key required
  • No history endpoint
  • No compare endpoint
Start for free →
Team
$79
/ month
100,000 requests / month
  • Everything in Developer
  • Full price history (90 days)
  • 5 webhook endpoints
  • Team API key management
  • Provider click tracking
  • Priority support
Get API key →
Enterprise
Custom
contact us
Unlimited requests
  • Everything in Team
  • Full history (unlimited)
  • Kafka streaming feed
  • Snowflake / BigQuery connector
  • Unlimited webhooks
  • SLA + dedicated support
Contact us →
Rate Limits

Limits by tier

Paid-tier limits are rolling monthly windows; the free tier is limited per day. Rate-limit headers are included in every response.

Feature                     Free     Developer  Team     Enterprise
Request limit               100/day  10K/mo     100K/mo  Unlimited
Live prices (/prices)       Yes      Yes        Yes      Yes
Historical data (/history)  No       30 days    90 days  Unlimited
Cross-model compare         No       Yes        Yes      Yes
Webhooks (price alerts)     No       1          5        Unlimited
Kafka streaming feed        No       No         No       Yes
Snowflake / BigQuery        No       No         No       Yes

Response headers: X-RateLimit-Limit · X-RateLimit-Remaining · X-RateLimit-Reset
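
The three rate-limit headers can be read on the client to decide when to slow down. This is a rough sketch under assumptions: the helper names are hypothetical, and the format of X-RateLimit-Reset is not documented on this page, so it is kept as a raw string rather than parsed.

```python
def parse_rate_limit(headers):
    """Extract the X-RateLimit-* headers into a dict.

    Limit and Remaining become integers when they parse cleanly; Reset stays
    a raw string because its format is not specified on this page.
    """
    def to_int(value):
        try:
            return int(value)
        except (TypeError, ValueError):
            return None  # missing header or a non-numeric value

    return {
        "limit": to_int(headers.get("X-RateLimit-Limit")),
        "remaining": to_int(headers.get("X-RateLimit-Remaining")),
        "reset": headers.get("X-RateLimit-Reset"),
    }


def should_back_off(info, reserve=5):
    """True when fewer than `reserve` requests remain in the current window."""
    return info["remaining"] is not None and info["remaining"] < reserve


# Hand-written placeholder values; with requests, pass r.headers instead.
sample = {
    "X-RateLimit-Limit": "100",
    "X-RateLimit-Remaining": "3",
    "X-RateLimit-Reset": "placeholder",
}
info = parse_rate_limit(sample)
print(info, should_back_off(info))
```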
FAQ

Common questions

Where does the pricing data come from?
Price data is coming soon. Endpoints currently return a coming_soon status.
How fresh is the data?
The live prices endpoint is refreshed from upstream every 5 minutes. Historical OHLC snapshots are taken every hour. If a provider changes its prices, you'll see the change within 5 minutes on the live endpoint and within 1 hour in the time series.
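
Since live prices change at most every 5 minutes, polling more often than that returns the same data. One way to avoid wasted requests is a small client-side cache with a 300-second lifetime. The sketch below assumes nothing beyond the documented /prices endpoint; `PriceCache` and `fetch_live` are hypothetical names.

```python
import time


class PriceCache:
    """Cache responses per model for `ttl` seconds (300 s matches the 5-minute refresh)."""

    def __init__(self, fetch, ttl=300.0, clock=time.monotonic):
        self._fetch = fetch    # callable: model name -> parsed JSON
        self._ttl = ttl
        self._clock = clock    # injectable so the cache can be tested without waiting
        self._entries = {}     # model name -> (fetched_at, data)

    def get(self, model):
        now = self._clock()
        entry = self._entries.get(model)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]    # still fresh, skip the network call
        data = self._fetch(model)
        self._entries[model] = (now, data)
        return data


def fetch_live(model):
    """Fetch live prices for one model from the /prices endpoint."""
    import requests  # imported here so the cache itself has no dependencies

    r = requests.get(
        "https://coaxiom.io/api/v1/prices",
        params={"model": model},
        timeout=10,
    )
    r.raise_for_status()
    return r.json()


cache = PriceCache(fetch_live)
# cache.get("llama-3.1-70b")  # first call hits the API; repeats within 5 minutes do not
```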
What's the "blended" rate?
Blended rate is a weighted average of input and output pricing: (input_per_1m × 0.4) + (output_per_1m × 0.6). Most real-world workloads generate more output tokens than input tokens, so this weighting reflects typical usage. You can always use raw input/output rates for exact cost modeling.
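
The formula above, and the exact-cost alternative it mentions, can be written out as a short sketch. The function names are illustrative, the rates are example values of $0.59 input and $0.79 output per 1M tokens, and the token counts are a made-up workload.

```python
def blended_per_1m(input_per_1m, output_per_1m):
    """Blended rate with a 40% input weight and a 60% output weight."""
    return input_per_1m * 0.4 + output_per_1m * 0.6


def workload_cost(input_tokens, output_tokens, input_per_1m, output_per_1m):
    """Exact cost in USD for a known token mix, using the raw per-1M rates."""
    return (input_tokens * input_per_1m + output_tokens * output_per_1m) / 1_000_000


# (0.59 * 0.4) + (0.79 * 0.6) = 0.236 + 0.474
print(round(blended_per_1m(0.59, 0.79), 4))

# Hypothetical workload: 2M input tokens and 500K output tokens
print(round(workload_cost(2_000_000, 500_000, 0.59, 0.79), 4))
```

The blended figure is convenient for ranking providers; the second function is the better fit when the real input/output split of a workload is known.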
Do you support webhooks for price alerts?
Webhooks are not available yet. They are scheduled for Sprint 8 on the Developer, Team, and Enterprise plans. You'll be able to register an endpoint and receive a POST when a specific model/provider price changes by more than a threshold you set. Every delivery will be signed with HMAC-SHA256.
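
Since webhooks have not shipped, the signing details are unknown. The sketch below shows a common way to verify an HMAC-SHA256 signature and makes several assumptions: the signature is the hex digest of the raw request body, it is keyed with a shared secret, and it arrives in a header whose name is not documented here. The function names, secret, and payload are all placeholders.

```python
import hashlib
import hmac


def sign_payload(secret: bytes, body: bytes) -> str:
    """Hex HMAC-SHA256 of the raw body (assumed signing scheme)."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_signature(secret: bytes, body: bytes, received_signature: str) -> bool:
    """Check a received hex signature against one computed locally."""
    expected = sign_payload(secret, body)
    # compare_digest avoids leaking information through comparison timing
    return hmac.compare_digest(expected.encode(), received_signature.encode())


secret = b"example_shared_secret"                          # placeholder
body = b'{"model": "llama-3.1-70b", "provider": "groq"}'   # illustrative payload
signature = sign_payload(secret, body)

print(verify_signature(secret, body, signature))           # True
print(verify_signature(secret, body + b" ", signature))    # False
```

Verification should run against the raw bytes of the request body, before any JSON parsing, because re-serializing the payload can change its byte representation.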
Can I use this in production?
Yes. The free tier is intentionally useful: 100 requests/day is enough for a cost monitor that checks hourly. For production systems doing continuous monitoring or powering dashboards, Developer or Team is the right fit. We don't throttle burst requests. Paid-tier limits are rolling monthly totals, and the free tier is capped per day.
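
The arithmetic behind the hourly-monitor claim is easy to check. This sketch assumes one request per model per poll, since /prices takes a single model parameter; the function names are illustrative.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60


def requests_per_day(poll_interval_s, models=1):
    """Requests used per day when polling every `poll_interval_s` seconds."""
    return math.ceil(SECONDS_PER_DAY / poll_interval_s) * models


def min_poll_interval_s(daily_limit, models=1):
    """Shortest polling interval in seconds that stays within a daily limit."""
    return math.ceil(SECONDS_PER_DAY * models / daily_limit)


print(requests_per_day(3600))     # hourly checks: 24 requests/day
print(requests_per_day(300))      # every 5 minutes: 288 requests/day
print(min_poll_interval_s(100))   # 864 seconds between polls at 100 requests/day
```

An hourly check of one model uses 24 of the 100 daily requests. Polling every 5 minutes to match the live refresh interval needs 288 requests per day, which is more than the free tier allows.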
Is there a Python or Node SDK?
Coming in Sprint 8: pip install coaxiom and npm install @coaxiom/sdk. The API is simple enough that a direct fetch/requests call works today — the SDKs add retry logic, rate limit handling, and typed responses.
What is Llama 3.1 70B and why is it the benchmark?
Llama 3.1 70B is Coaxiom's benchmark model — every major provider hosts it, making it the only model with truly comparable pricing across all providers. It's the "benchmark barrel" of the inference market, like WTI crude for oil pricing. When we say a provider is "cheapest," we mean cheapest for this model specifically.

Start in 60 seconds

Free tier requires no API key. Sign up to unlock history, compare, and alerts.

Get free API key → Enterprise inquiry