Lago turns usage into invoices. You send events describing what your customers consumed (API calls, compute hours, transactions, tokens), and Lago matches each event to the right subscription and pricing plan, deduplicates it, and aggregates everything into accurate, auditable invoices. This page is your complete integration guide: how to design your events for billing, send them through any delivery method, and handle edge cases like retries and late arrivals.

Designing your events

A well-designed event captures what happened, who it happened for, and the dimensions you need for pricing. Getting this right upfront saves you from reworking your integration later. The guiding question: what do you charge your customers for? That’s your event.

Event schema

{
  "transaction_id": "txn_20240314_cust8832_inference_00142",
  "external_subscription_id": "sub_8832",
  "code": "ai_inference",
  "timestamp": 1710421740,
  "properties": {
    "model": "gpt-4",
    "tokens_in": 820,
    "tokens_out": 1500,
    "region": "us-east-1"
  }
}
transaction_id: A unique ID you generate. Lago uses it for deduplication: if the same event arrives twice, only the first is billed.
Good practice: Don’t use random UUIDs. Embed the customer, event type, and timestamp so you can trace a billed amount back to its source event without guessing:
{type}_{date}_{customer}_{category}_{sequence}

# Examples
inf_20240314_cust42_gpt4_00831
compute_20240314_org7_a100_useast_001
pay_20240314_acct3391_card_00092
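A deterministic ID can be derived from the source record instead of generated randomly. A minimal sketch of the pattern above (the function name and zero-padded sequence width are illustrative, not part of Lago's API):

```python
def make_transaction_id(event_type: str, date: str, customer: str,
                        category: str, seq: int) -> str:
    """Build a deterministic transaction_id: {type}_{date}_{customer}_{category}_{sequence}."""
    return f"{event_type}_{date}_{customer}_{category}_{seq:05d}"

make_transaction_id("inf", "20240314", "cust42", "gpt4", 831)
# → "inf_20240314_cust42_gpt4_00831"
```

Because the ID is derived from the source data, retrying the same request always produces the same ID, so retries deduplicate automatically.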
If your Lago organization uses the ClickHouse-based event pipeline (designed for high-volume processing), event uniqueness is determined by the combination of the transaction_id and timestamp fields:
  • If both are new to Lago, the event is ingested
  • If both have already been received, the new event replaces the previous one
external_subscription_id: Ties the event to a customer subscription. This is how Lago knows which pricing plan to apply. The external_subscription_id must match an active subscription in Lago.
code: Maps to a billable metric you’ve defined in Lago (e.g., ai_inference, api_calls, storage_gb). Treat the code as a stable API contract between your application and your billing configuration.
timestamp: When the event happened, as a Unix timestamp. Lago uses this to assign the event to the correct billing period. We typically log events using timestamps in seconds, such as 1741219251. When higher precision is required, you can use millisecond accuracy in the format 1741219251.590, where the dot acts as the decimal separator.
If you do not specify a timestamp, Lago automatically uses the reception date of the event.
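In Python, both accepted precisions can be produced from a timezone-aware datetime. A sketch (the helper name and the millis flag are illustrative):

```python
from datetime import datetime, timezone

def to_lago_timestamp(dt: datetime, millis: bool = False):
    """Unix timestamp in seconds; optionally keep millisecond precision (dot-separated)."""
    ts = dt.replace(tzinfo=dt.tzinfo or timezone.utc).timestamp()
    return round(ts, 3) if millis else int(ts)

dt = datetime(2024, 3, 14, 13, 9, 0, tzinfo=timezone.utc)
to_lago_timestamp(dt)  # → 1710421740
```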
properties: Key-value pairs carrying the data your pricing needs: token counts, regions, instance types, anything you price on. Properties can be strings, integers, floats, UUIDs, or timestamps. Lago ignores properties that don’t match a billable metric, so include dimensions you might price on in the future: extra fields cost nothing but give you pricing flexibility later.
For UNIQUE COUNT aggregation on a recurring metric, the operation_type property is required. Send add to add a value or remove to remove one.
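For example, a recurring UNIQUE COUNT metric tracking active seats might add a value when a user is provisioned and remove it when the seat is released (the metric code active_seats and the user_id property are illustrative):

```python
# Seat provisioned: counts toward the unique count
seat_added = {
    "transaction_id": "seat_20240314_cust42_add_00001",
    "external_subscription_id": "sub_42",
    "code": "active_seats",
    "properties": {"user_id": "user_99", "operation_type": "add"},
}

# Same value removed later in the period
seat_removed = {
    "transaction_id": "seat_20240401_cust42_remove_00001",
    "external_subscription_id": "sub_42",
    "code": "active_seats",
    "properties": {"user_id": "user_99", "operation_type": "remove"},
}
```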
precise_total_amount_cents: Skip Lago’s aggregation and set the amount directly. This value must be a string to avoid floating-point rounding issues. If not specified, Lago sets it to 0 and the event is not included in charge aggregation for dynamic charge models.

Use cases in detail

Billing model: Per-token pricing, rates vary by model and sometimes by input vs. output tokens.
Event design: One event per inference request. Properties carry model, tokens_in, tokens_out, and any other dimensions you price on.
{
  "transaction_id": "inf_20240314150900_cust42_gpt4_00831",
  "external_subscription_id": "sub_42",
  "code": "llm_tokens",
  "timestamp": 1710421740,
  "properties": {
    "model": "gpt-4",
    "tokens_in": 820,
    "tokens_out": 1500,
    "latency_ms": 1230,
    "stream": true
  }
}
Billable metric setup: Create a metric llm_tokens with aggregation type SUM and a custom expression like properties.tokens_in + properties.tokens_out. Use model as a charge filter to apply different per-token rates for GPT-4, GPT-3.5, Claude, etc.
Include latency_ms and stream in properties even if you don’t price on them today. Lago ignores unused properties, but having the data means you can add latency-based tiers or streaming surcharges later without re-instrumenting.
Volume consideration: A busy AI platform might generate thousands of inference events per second. At low volume, send every event via REST API. At high volume, stream through Kafka or pre-aggregate to one event per customer per model per hour.
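Pre-aggregation can be sketched as a rollup of raw inference records into one event per customer per model per hour. A sketch assuming flat input records with the fields shown (the record shape is illustrative, not a Lago format):

```python
from collections import defaultdict

def preaggregate(records):
    """Roll raw inference records up to one event per (subscription, model, hour)."""
    totals = defaultdict(lambda: {"tokens_in": 0, "tokens_out": 0})
    for r in records:
        hour = r["timestamp"] - r["timestamp"] % 3600  # truncate to the hour
        key = (r["external_subscription_id"], r["model"], hour)
        totals[key]["tokens_in"] += r["tokens_in"]
        totals[key]["tokens_out"] += r["tokens_out"]
    return [
        {
            # deterministic ID, so re-running the rollup dedupes instead of double-billing
            "transaction_id": f"agg_{sub}_{model}_{hour}",
            "external_subscription_id": sub,
            "code": "llm_tokens",
            "timestamp": hour,
            "properties": {"model": model, **sums},
        }
        for (sub, model, hour), sums in totals.items()
    ]
```

Note the transaction_id is derived from the aggregation key, so a re-run of the same hour replaces rather than duplicates the event.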

Event design best practices

Send more data than you need today

Properties are flexible: include dimensions you might price on in the future. Lago ignores properties that don’t match a billable metric, so extra fields cost nothing but give you pricing flexibility later.

Make transaction_id meaningful

Don’t use random UUIDs. Embed the customer, event type, and timestamp so you can trace a billed amount back to its source event without guessing.

One event per billable action

If a customer makes 3 API calls, send 3 events, not 1 event with a count of 3. This gives you maximum flexibility. The exception is pre-aggregation at very high volumes.

Use timestamp accurately

Lago assigns events to billing periods based on timestamp, not arrival time. Late-arriving events are placed in the correct historical period. This matters for reconciliation and billing accuracy.

Treat code as a contract

The code ties your event to a billable metric. If you rename a metric, you need to update the code in your events. Treat it as a stable API contract.

Test with a single event first

Send one event via the REST API and verify it appears in GET /events/{transaction_id} before scaling to batch or streaming delivery.

Delivery methods

Lago accepts events through multiple channels. All methods use the same event schema and produce identical billing results.
Method              When to use
REST API            Most integrations. Simplest setup. Start here.
REST API (batch)    Same as REST but batches up to 100 events per request for higher throughput.
Kafka / Redpanda    Real-time, high-volume production pipelines.
Amazon Kinesis      AWS-native streaming architectures.
Amazon S3           Historical backfills, migrations, periodic batch loads.
Start with the REST API. You can migrate to streaming later without changing your event schema, billable metrics, or pricing configuration. Many customers start on REST and switch to Kafka only when they outgrow it.
For the full list of available ingestion sources, see Metering ingestion sources.

REST API

Replace __LAGO_API_URL__ with api.getlago.com for Lago Cloud, or your own instance URL for self-hosted deployments.
LAGO_URL="https://api.getlago.com"
API_KEY="__YOUR_API_KEY__"

curl -X POST "$LAGO_URL/api/v1/events" \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "event": {
      "transaction_id": "txn_20240314_cust42_api_00001",
      "external_subscription_id": "sub_42",
      "code": "api_calls",
      "timestamp": 1710421740,
      "properties": {"tokens": 1500}
    }
  }'

Client libraries

from lago_python_client import Client

client = Client(api_key='__YOUR_API_KEY__')

client.events.create({
    "transaction_id": "txn_20240314_cust42_api_00001",
    "external_subscription_id": "sub_42",
    "code": "api_calls",
    "timestamp": 1710421740,
    "properties": {"tokens": 1500}
})
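For higher throughput, the batch endpoint (POST /events/batch) wraps up to 100 events in a single request. A minimal payload builder, assuming the body is an events array mirroring the single-event schema (verify the exact wrapper against the API reference):

```python
LAGO_URL = "https://api.getlago.com"
API_KEY = "__YOUR_API_KEY__"

def build_batch_payload(events):
    """Wrap up to 100 events for POST /api/v1/events/batch."""
    assert len(events) <= 100, "the batch endpoint accepts at most 100 events per request"
    return {"events": events}

# Send with your HTTP client of choice, e.g.:
# requests.post(f"{LAGO_URL}/api/v1/events/batch",
#               headers={"Authorization": f"Bearer {API_KEY}"},
#               json=build_batch_payload(events))
```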

Kafka / Redpanda configuration

Kafka ingestion requires the ClickHouse event store (see Architecture section above).
Lago Cloud includes a managed Kafka endpoint. Contact the team for your connection details (broker address, credentials, topic name). No infrastructure to deploy.
Lago can consume events directly from your Kafka topic (which requires opening access to your Kafka broker and topic). Alternatively, you can send your events to a dedicated topic on our infrastructure (via AWS VPC Peering or PrivateLink). In this second case, these settings have the biggest impact on throughput:
Setting               Recommended                 Why
linger.ms             20-100ms                    Batches messages for fewer, larger network calls
batch.num.messages    500-5000                    Larger batches = higher throughput
compression.type      lz4                         60-80% bandwidth reduction at 1M events/sec
enable.idempotence    true                        Exactly-once delivery to broker
Message key           external_subscription_id    Preserves per-customer ordering
Partition count: Aim for partition count >= number of event processor instances. More partitions = more consumer parallelism.
Consumer lag monitoring: This is your #1 operational metric. Alert when lag exceeds your latency budget (e.g., > 100K messages).
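As an illustration, the recommended settings map onto a confluent-kafka-python producer configuration roughly like this. The broker address and topic name are placeholders, and the values are mid-range picks from the table, so treat this as a sketch rather than verified production config:

```python
import json

producer_conf = {
    "bootstrap.servers": "<broker-address>",
    "linger.ms": 50,              # batch messages for 20-100 ms before sending
    "batch.num.messages": 2000,   # larger batches = higher throughput
    "compression.type": "lz4",    # 60-80% bandwidth reduction at high volume
    "enable.idempotence": True,   # exactly-once delivery to the broker
}

# from confluent_kafka import Producer
# producer = Producer(producer_conf)
# producer.produce(
#     topic="<events-topic>",
#     key=event["external_subscription_id"],  # preserves per-customer ordering
#     value=json.dumps(event),
# )
```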
Minimal production deployment:
apiVersion: cluster.redpanda.com/v1alpha1
kind: Redpanda
metadata:
  name: redpanda
spec:
  chartRef: {}
  clusterSpec:
    statefulset:
      replicas: 3
    auth:
      sasl:
        enabled: true
        users:
          - name: lago
            password: <your-password>
    resources:
      memory: { container: { max: 4Gi } }
      cpu: { cores: 2 }
    storage:
      persistentVolume: { size: 100Gi }
See Redpanda Operator docs for TLS, rack awareness, and tiered storage.
ClickHouse on Kubernetes documentation is coming soon.

Amazon Kinesis configuration

We can read events directly from your Kinesis stream. We’ll need:
  • Kinesis Stream ARN
  • Credentials:
    • Role ARN to assume, or
    • IAM Access Keys

Amazon S3 configuration

Files must be newline-delimited JSON (.jsonl or .jsonl.gz). Target 100MB-1GB per compressed file.
Idempotent: Re-running an import after failure is always safe (dedup by transaction_id).
We generally provide an S3 bucket on our infrastructure where you can deliver your files, along with credentials to upload new files. If you prefer to provide your own S3 bucket, an SQS queue is required to capture new file uploads (configurable in the S3 bucket’s properties/notifications). We would need:
  • S3 bucket
  • S3 Region
  • SQS URL
  • Credentials:
    • Role ARN to assume, or
    • IAM Access Keys
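Producing a compliant import file is straightforward: one JSON event per line, gzip-compressed. A sketch (the file name and the commented upload step are illustrative; use whichever credentials and bucket you were provided):

```python
import gzip
import json

def write_events_file(events, path):
    """Write events as gzip-compressed newline-delimited JSON (.jsonl.gz)."""
    with gzip.open(path, "wt", encoding="utf-8") as f:
        for event in events:
            f.write(json.dumps(event) + "\n")

# Upload with boto3 (bucket and key are placeholders):
# boto3.client("s3").upload_file("events.jsonl.gz", "<bucket>", "imports/events.jsonl.gz")
```

Because imports are deduplicated by transaction_id, re-uploading the same file after a failed run is safe.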

Idempotency and deduplication

Lago guarantees exactly-once processing through the transaction_id:
  • Same transaction_id + external_subscription_id arrives twice → only the first is billed.
  • Works across delivery methods: a REST event won’t be billed again if the same ID arrives via Kafka.
  • Retries are always safe. REST timeout? Resend. Kafka consumer restart? Reprocess. S3 import interrupted? Re-trigger.
Design your IDs to be deterministic (derived from source data), not random. This makes retries automatic and debugging straightforward.

Handling edge cases

Events that arrive after a billing period has closed are assigned to the correct historical period based on timestamp. If the invoice has already been finalized, the event will be included in the next billing cycle as a retroactive adjustment.
To amend a previously sent event, send a new event with the same transaction_id and updated properties. The new event replaces the original.
Events with a timestamp before the subscription’s started_at date are ignored. Validate your timestamps if you’re backfilling historical data.
Events for subscriptions that have been terminated are ignored. This is a safety mechanism — you don’t need to stop your event pipeline the instant a subscription ends.

Verifying your integration

Test with a single event first. Send one event via the REST API, then retrieve it with GET /events/{transaction_id} to confirm Lago received and parsed it correctly.
Reconciliation pattern: Compare your source-of-truth event count against Lago. Use GET /events filtered by external_subscription_id, timestamp_from, and timestamp_to, then compare against your internal logs. Any delta means dropped events or events that failed validation.
What to watch on your side:
  • REST API response codes: 200 = accepted, 422 = validation error (bad schema, unknown subscription), 429 = rate limited (retry safely), 5xx = server-side error (investigate before retrying)
  • At invoice time: compare line item quantities against your own event counts per subscription per billing period. If they diverge, check for late arrivals outside the billing window.
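The reconciliation pattern reduces to a set comparison of transaction IDs per subscription and window. A sketch (the fetch step is elided; lago_ids would come from paginating GET /events with the filters above):

```python
def find_missing(internal_ids, lago_ids):
    """Transaction IDs present in your logs but absent from Lago.

    Any result here means dropped events or events that failed validation.
    """
    return sorted(set(internal_ids) - set(lago_ids))

find_missing(["txn_a", "txn_b", "txn_c"], ["txn_a", "txn_c"])  # → ["txn_b"]
```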

Rate limits

Lago enforces rate limits on the REST API per organization to protect platform stability. Default limits are shown below — these can be adjusted to match your needs. Contact the Lago team if you need higher throughput.
Endpoint category                                    Default limit
Event ingestion (POST /events, POST /events/batch)   500 requests/sec
Current usage                                        200 requests/sec
All other endpoints                                  50 requests/sec
When rate limited: You receive a 429 Too Many Requests response. Every response includes headers to help you manage your rate:
  • x-ratelimit-limit — max requests per window
  • x-ratelimit-remaining — remaining requests
  • x-ratelimit-reset — seconds until the window resets
Retries on 429 are always safe thanks to transaction_id deduplication. Back off until x-ratelimit-reset reaches zero.
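A minimal retry loop honoring the reset header might look like this (a sketch: send_event is a placeholder for your POST call, and it expects a response object exposing status_code and headers):

```python
import time

def send_with_backoff(send_event, event, max_attempts=5):
    """Retry on 429, sleeping until the rate-limit window resets.

    Safe to retry because transaction_id deduplication makes resends idempotent.
    """
    for attempt in range(max_attempts):
        resp = send_event(event)
        if resp.status_code != 429:
            return resp
        reset = int(resp.headers.get("x-ratelimit-reset", 1))
        time.sleep(max(reset, 1))  # wait at least 1s before retrying
    raise RuntimeError(f"still rate limited after {max_attempts} attempts")
```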
Kafka, Kinesis, and S3 connectors are not subject to REST API rate limits. Their throughput is bounded by your broker or infrastructure capacity.