Observability
Telemetry spans, error tracking, event bus, and fraud detection powered by ClickHouse.
Overview
WhaleTools Observability is built on ClickHouse for high-throughput analytics. The system ingests telemetry spans, error events, and operational metrics from all services. It currently holds more than 88K spans and 14K errors, and queries over that data return in under a second.
All observability data flows through the Gateway API. Ingest spans via POST /v1/native/telemetry and query traces via POST /v1/native/clickhouse/query. Both endpoints require JWT authentication.
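For callers that talk to the Gateway API directly rather than through the SDK, the request might be assembled roughly as follows. Only the path and the `spans` body shape come from this page. The base URL, the `Bearer` header scheme, and the `buildIngestRequest` helper are assumptions made for illustration.

```typescript
// Minimal span shape, limited to fields that appear on this page.
interface Span {
  trace_id: string;
  span_id: string;
  operation_name: string;
  service_name: string;
  duration_ms: number;
  status: "ok" | "error" | "timeout";
}

interface IngestRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: builds the request description without sending it.
// Assumption: the JWT is presented as a Bearer token.
function buildIngestRequest(baseUrl: string, jwt: string, spans: Span[]): IngestRequest {
  return {
    url: new URL("/v1/native/telemetry", baseUrl).toString(),
    method: "POST",
    headers: {
      Authorization: `Bearer ${jwt}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ spans }),
  };
}
```

The result can be handed to any HTTP client, for example `fetch(req.url, { method: req.method, headers: req.headers, body: req.body })`.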
Ingest Spans
Send telemetry spans to track operations across your system. Each span records an operation name, duration, and optional metadata like model name and token counts for LLM calls.
| Field | Type | Description |
|---|---|---|
| operation_name | string | Name of the operation (e.g., "llm.chat", "tool.execute", "api.request"). |
| duration_ms | number | How long the operation took in milliseconds. |
| service_name | string | Service that produced the span (e.g., "whale-gateway", "whale-agent"). |
| model_name | string | LLM model used, if applicable (e.g., "claude-opus-4-20250514"). |
| input_tokens | number | Number of input tokens consumed by the LLM call. |
| output_tokens | number | Number of output tokens generated by the LLM call. |
| status | string | Span status: "ok", "error", or "timeout". |
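| trace_id | string | Groups related spans into a single trace. |
| span_id | string | Identifier of this span within its trace. |
| parent_span_id | string | Links child spans to their parent for tree visualization. |
| attributes | object | Optional custom key-value metadata (e.g., agent_id, conversation_id). |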
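The way `parent_span_id` links spans into a tree can be sketched as below. This is an illustration of the idea, not the platform's implementation. `buildSpanTree` and the `SpanNode` type are hypothetical names, and `span_id` is taken from the ingest example on this page.

```typescript
interface SpanRecord {
  span_id: string;
  parent_span_id?: string;
  operation_name: string;
}

interface SpanNode extends SpanRecord {
  children: SpanNode[];
}

// Turns a flat list of spans into a forest. Spans with no parent, or whose
// parent is not in the list, become roots.
function buildSpanTree(spans: SpanRecord[]): SpanNode[] {
  const nodes = new Map<string, SpanNode>();
  spans.forEach((span) => {
    nodes.set(span.span_id, { ...span, children: [] });
  });

  const roots: SpanNode[] = [];
  nodes.forEach((node) => {
    const parent = node.parent_span_id ? nodes.get(node.parent_span_id) : undefined;
    if (parent) {
      parent.children.push(node);
    } else {
      roots.push(node);
    }
  });
  return roots;
}
```

Treating spans with an unknown parent as roots keeps partially ingested traces viewable instead of dropping their orphaned branches.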
const whale = new WhaleClient({ apiKey: 'wk_live_...' })
// Ingest a span
await whale.telemetry.ingest({
  spans: [
    {
      trace_id: 'trace_abc123',
      span_id: 'span_001',
      operation_name: 'llm.chat',
      service_name: 'product-assistant',
      duration_ms: 1842,
      model_name: 'claude-sonnet-4-20250514',
      input_tokens: 1250,
      output_tokens: 340,
      status: 'ok',
      attributes: {
        agent_id: 'agent_abc123',
        conversation_id: 'conv_xyz789'
      }
    }
  ]
})
// Response
{ "accepted": 1, "rejected": 0 }
Query Traces
Query stored traces using ClickHouse SQL. Filter by service, operation, duration, time range, and custom attributes. The query endpoint only allows SELECT statements — write operations are rejected.
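Because the endpoint rejects anything other than SELECT, a client may want to fail fast before making the round trip. The sketch below is a deliberately naive pre-check under that assumption. `looksLikeSelect` is a hypothetical helper, it only inspects the first keyword, and it does not reproduce whatever validation the server performs.

```typescript
// Returns true when the first keyword, after leading comments and
// whitespace are removed, is SELECT. Not a SQL parser.
function looksLikeSelect(sql: string): boolean {
  const stripped = sql
    .replace(/\/\*[\s\S]*?\*\//g, " ") // drop /* block comments */
    .replace(/--[^\n]*/g, " ") // drop -- line comments
    .trim();
  return /^select\b/i.test(stripped);
}
```

A check like this only saves a request for obviously invalid input. The server remains the authority on what is accepted.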
const whale = new WhaleClient({ apiKey: 'wk_live_...' })
// Find slow LLM calls in the last hour
const traces = await whale.telemetry.query({
  sql: `
    SELECT
      operation_name,
      duration_ms,
      model_name,
      input_tokens + output_tokens AS total_tokens
    FROM spans
    WHERE service_name = 'product-assistant'
      AND duration_ms > 3000
      AND timestamp > now() - INTERVAL 1 HOUR
    ORDER BY duration_ms DESC
    LIMIT 20
  `
})
// Token usage by model over the last 7 days
const usage = await whale.telemetry.query({
  sql: `
    SELECT
      model_name,
      sum(input_tokens) AS total_input,
      sum(output_tokens) AS total_output,
      count() AS call_count,
      avg(duration_ms) AS avg_latency_ms
    FROM spans
    WHERE model_name != ''
      AND timestamp > now() - INTERVAL 7 DAY
    GROUP BY model_name
    ORDER BY total_input + total_output DESC
  `
})
Error Events
Errors are deduplicated using SHA-256 fingerprinting — identical errors are grouped automatically. Each error includes a severity level, stack trace, and contextual metadata.
critical
System-breaking errors that require immediate attention. Pages operators.
error
Request failures, unhandled exceptions, and integration errors.
warning
Degraded performance, approaching limits, recoverable issues.
info
Notable events that are not errors — config changes, deploys, scaling events.
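Fingerprint-based grouping can be illustrated with Node's built-in `crypto` module. Which fields actually feed the hash is not stated on this page. The sketch assumes service name, message, and the top stack frame, and `fingerprint` is a hypothetical function name.

```typescript
import { createHash } from "node:crypto";

interface ErrorEvent {
  message: string;
  service_name: string;
  stack_trace?: string;
}

// Assumption: two errors are "identical" when service, message, and the
// first stack frame match. The real fingerprint inputs may differ.
function fingerprint(event: ErrorEvent): string {
  const topFrame =
    (event.stack_trace ?? "")
      .split("\n")
      .map((line) => line.trim())
      .find((line) => line.startsWith("at ")) ?? "";
  const input = [event.service_name, event.message, topFrame].join("\n");
  return createHash("sha256").update(input).digest("hex");
}
```

Events that produce the same hex digest fall into the same group, which is what makes a `GROUP BY fingerprint` query meaningful.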
const whale = new WhaleClient({ apiKey: 'wk_live_...' })
// Report an error
await whale.telemetry.error({
  message: 'Payment gateway timeout after 30s',
  severity: 'error',
  service_name: 'checkout',
  stack_trace: 'TimeoutError: Request timed out\n at PaymentClient.charge (payment.ts:42)',
  attributes: {
    order_id: 'ord_abc123',
    gateway: 'stripe',
    amount: 4999
  }
})
// Query error trends
const errors = await whale.telemetry.query({
  sql: `
    SELECT
      fingerprint,
      message,
      severity,
      count() AS occurrences,
      max(timestamp) AS last_seen
    FROM errors
    WHERE timestamp > now() - INTERVAL 24 HOUR
    GROUP BY fingerprint, message, severity
    ORDER BY occurrences DESC
    LIMIT 10
  `
})
Event Bus
Publish events with guaranteed delivery. Events support idempotency keys to prevent duplicate processing, configurable retry with exponential backoff, and a dead letter queue for events that exceed max_attempts.
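The interaction between exponential backoff and `max_attempts` can be sketched as a small schedule function. The base delay, the delay cap, and the `nextDelayMs` name are hypothetical. The sketch also assumes that `max_attempts` counts total delivery attempts, including the first one.

```typescript
interface RetryPolicy {
  max_attempts: number; // same meaning as the publish option (assumed: total attempts)
  base_delay_ms: number; // hypothetical: delay after the first failure
  max_delay_ms: number; // hypothetical: upper bound on any single delay
}

// Given the 1-based number of the attempt that just failed, returns the wait
// before the next attempt, or null when no attempts remain and the event
// would move to the dead letter queue.
function nextDelayMs(failedAttempt: number, policy: RetryPolicy): number | null {
  if (failedAttempt >= policy.max_attempts) {
    return null;
  }
  const delay = policy.base_delay_ms * Math.pow(2, failedAttempt - 1);
  return Math.min(delay, policy.max_delay_ms);
}
```

With `max_attempts: 5` and a 1-second base delay, the waits would be 1 s, 2 s, 4 s, and 8 s, after which the fifth failure dead-letters the event.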
const whale = new WhaleClient({ apiKey: 'wk_live_...' })
// Publish an event
await whale.events.publish({
  event_type: 'order.completed',
  idempotency_key: 'ord_abc123_completed',
  payload: {
    order_id: 'ord_abc123',
    customer_id: 'cust_xyz789',
    total: 4999,
    items: 3
  },
  max_attempts: 5 // retry up to 5 times on handler failure
})
// Subscribe to events (webhook)
await whale.events.subscribe({
  event_type: 'order.completed',
  webhook_url: 'https://your-app.com/webhooks/order-completed',
  secret: 'whsec_your_signing_secret'
})
// List dead letter events (exceeded max_attempts)
const deadLetters = await whale.events.deadLetter.list({
  event_type: 'order.completed',
  since: '2026-03-01T00:00:00Z'
})
// Retry a dead letter event
await whale.events.deadLetter.retry({ event_id: 'evt_failed_001' })
Fraud Detection
Real-time fraud scoring on orders and transactions. Each order receives a risk score from 0 (no risk) to 100 (confirmed fraud). A score at or above your configured review threshold places the order on hold for manual review, and a score at or above the reject threshold rejects it automatically. The system analyzes multiple signals:
velocity_check
Too many orders from the same IP, device, or payment method in a short window.
address_mismatch
Billing and shipping addresses are in different countries or distant regions.
statistical_outlier
Order amount deviates significantly from the customer's historical average.
new_account_high_value
High-value order from an account created within the last 24 hours.
proxy_vpn_detection
Order placed through a known proxy, VPN, or Tor exit node.
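As one example of how such a signal could be computed, the sketch below treats `statistical_outlier` as a z-score test against the customer's past order amounts. The actual model is not described on this page. The function names and the default cutoff of 3 standard deviations are assumptions.

```typescript
// Number of sample standard deviations between `amount` and the mean of
// `history`. Returns 0 when there is too little history to judge.
function zScore(amount: number, history: number[]): number {
  if (history.length < 2) {
    return 0;
  }
  const mean = history.reduce((sum, x) => sum + x, 0) / history.length;
  const variance =
    history.reduce((sum, x) => sum + Math.pow(x - mean, 2), 0) / (history.length - 1);
  const std = Math.sqrt(variance);
  if (std === 0) {
    // All past orders were identical: any different amount is maximally unusual.
    return amount === mean ? 0 : Infinity;
  }
  return (amount - mean) / std;
}

// Hypothetical cutoff: flag amounts more than `threshold` deviations from the mean.
function isStatisticalOutlier(amount: number, history: number[], threshold = 3): boolean {
  return Math.abs(zScore(amount, history)) > threshold;
}
```

A production signal would likely need more care with short histories and skewed spending patterns. The z-score is only the simplest reading of "deviates significantly from the historical average".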
const whale = new WhaleClient({ apiKey: 'wk_live_...' })
// Score an order
const score = await whale.fraud.score({
  order_id: 'ord_abc123',
  customer_id: 'cust_xyz789',
  amount: 29900,
  ip_address: '203.0.113.42',
  billing_country: 'US',
  shipping_country: 'US',
  payment_method: 'card_ending_4242',
  email: 'customer@example.com'
})
// Response
{
  "order_id": "ord_abc123",
  "risk_score": 23,
  "decision": "approve", // "approve" | "review" | "reject"
  "signals": [
    { "type": "velocity_check", "score": 0, "detail": "1 order in 24h - normal" },
    { "type": "address_mismatch", "score": 0, "detail": "Same country" },
    { "type": "new_account_high_value", "score": 23, "detail": "Account 3 days old, order is 2x avg" }
  ]
}
// Configure thresholds
await whale.fraud.configure({
  review_threshold: 40, // score >= 40 triggers manual review
  reject_threshold: 75 // score >= 75 auto-rejects
})
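The relationship between `risk_score`, the two thresholds, and the `decision` field can be written out as a small function. It follows the `>=` comparisons given in the configuration comments above. The `decide` name and the `Thresholds` type are hypothetical.

```typescript
type Decision = "approve" | "review" | "reject";

interface Thresholds {
  review_threshold: number;
  reject_threshold: number;
}

// Maps a 0-100 risk score to a decision. The reject check runs first so that
// a score meeting both thresholds is rejected rather than sent to review.
function decide(riskScore: number, thresholds: Thresholds): Decision {
  if (riskScore >= thresholds.reject_threshold) {
    return "reject";
  }
  if (riskScore >= thresholds.review_threshold) {
    return "review";
  }
  return "approve";
}
```

With the thresholds shown above (40 and 75), the sample score of 23 maps to `"approve"`, which matches the example response.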