AI Platform
Embeddings, prompt templates, agent memory, evaluation framework, and safety rules.
Overview
The AI Platform is the infrastructure layer that powers every AI agent in your store. It provides vector embeddings for semantic search, versioned prompt templates, persistent agent memory, an evaluation framework for testing agent quality, and safety rules to keep responses on-brand and compliant.
All AI Platform endpoints are scoped to your store and authenticated with your API key. Agents automatically use these systems — you can also access them directly for custom integrations.
Embeddings & Semantic Search
Create vector embeddings from text and search by semantic similarity. Use this to power product discovery, FAQ matching, and knowledge base retrieval. Embeddings are stored per-store and can be scoped to a namespace.
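Search results carry a similarity score between 0 and 1, and `threshold` discards anything below it. As an illustration of how cosine-style scoring and threshold filtering behave (a sketch only; the platform computes scores server-side over embedding vectors):

```javascript
// Cosine similarity between two equal-length numeric vectors.
// Illustrative only -- not the service's actual scoring code.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}

// Keep only results at or above the search threshold.
function filterByThreshold(results, threshold) {
  return results.filter((r) => r.score >= threshold)
}
```

Identical vectors score 1.0, orthogonal vectors score 0, which is why a threshold like 0.75 keeps close paraphrases while dropping loosely related text.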
const whale = new WhaleClient({ apiKey: 'wk_live_...' })

// Create an embedding
const embedding = await whale.ai.embeddings.create({
  input: 'Organic cotton crew neck t-shirt in navy blue',
  namespace: 'products',
  metadata: {
    product_id: 'prod_abc123',
    category: 'apparel'
  }
})

// Search by similarity
const results = await whale.ai.embeddings.search({
  query: 'comfortable blue shirt',
  namespace: 'products',
  limit: 10,
  threshold: 0.75
})

// Response
{
  "results": [
    {
      "id": "emb_550e8400",
      "score": 0.92,
      "metadata": { "product_id": "prod_abc123", "category": "apparel" },
      "text": "Organic cotton crew neck t-shirt in navy blue"
    }
  ]
}

// Index all products in bulk
await whale.ai.embeddings.index({
  source: 'products',
  fields: ['name', 'description', 'tags'],
  namespace: 'products'
})

Prompt Templates
Versioned prompt templates with change tracking. Templates support variable interpolation and can be pinned to a specific version or set to always use the latest. Every edit creates a new version — roll back instantly if a change degrades quality.
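The `{{variable}}` and `{{#section}}...{{/section}}` syntax follows Mustache conventions. A minimal sketch of how that interpolation could work, assuming Mustache-like semantics (the platform renders templates server-side; this is illustrative, not the SDK's renderer):

```javascript
// Minimal Mustache-style renderer: {{var}} substitution and
// {{#list}}...{{/list}} iteration over arrays. Illustrative only.
function renderTemplate(template, data) {
  // Expand {{#key}}...{{/key}} sections by repeating the body per array item.
  const withSections = template.replace(
    /\{\{#(\w+)\}\}([\s\S]*?)\{\{\/\1\}\}/g,
    (_, key, body) => (data[key] || []).map((item) => renderTemplate(body, item)).join('')
  )
  // Substitute remaining {{key}} variables.
  return withSections.replace(/\{\{(\w+)\}\}/g, (_, key) =>
    data[key] !== undefined ? String(data[key]) : ''
  )
}
```

Section bodies are rendered recursively with each array item as the data scope, which is how `{{name}}` and `{{price}}` inside `{{#products}}` resolve per product.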
const whale = new WhaleClient({ apiKey: 'wk_live_...' })

// Create a prompt template
const template = await whale.ai.templates.create({
  name: 'product_recommendation',
  content: `You are a shopping assistant for {{store_name}}.
The customer is looking for: {{query}}
Available products:
{{#products}}
- {{name}} ({{price}}): {{description}}
{{/products}}
Recommend the best match and explain why.`,
  variables: ['store_name', 'query', 'products'],
  description: 'Product recommendation prompt with inventory context'
})

// Use a specific version
const v2 = await whale.ai.templates.get('product_recommendation', {
  version: 2
})

// List all versions with change history
const history = await whale.ai.templates.versions('product_recommendation')
// Returns: [{ version: 3, created_at, diff_summary }, ...]

Agent Memory
Agents store and retrieve memory across three scopes. Memory is automatically managed during conversations, but you can also read and write memory directly for custom workflows.
short_term
Conversation-scoped memory that expires after the session ends. Default TTL: 1 hour.
long_term
Persistent memory stored across sessions. Used for customer preferences and history.
entity
Structured memory about specific entities — customers, products, orders. Auto-linked by ID.
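The TTL behavior above can be modeled roughly as follows. This toy store (a hypothetical `MemoryStore` class, not the SDK's internals) takes `ttl` in seconds, matching the API, with `null` meaning no expiration:

```javascript
// Toy in-memory store modeling TTL expiry. Illustrative only.
class MemoryStore {
  constructor(now = () => Date.now()) {
    this.now = now // injectable clock (milliseconds) for testing
    this.entries = new Map()
  }
  store(key, value, ttl = null) {
    // ttl is in seconds, as in the API; null means never expire.
    const expiresAt = ttl === null ? null : this.now() + ttl * 1000
    this.entries.set(key, { value, expiresAt })
  }
  recall(key) {
    const entry = this.entries.get(key)
    if (!entry) return undefined
    if (entry.expiresAt !== null && this.now() >= entry.expiresAt) {
      this.entries.delete(key) // lazy expiry on read
      return undefined
    }
    return entry.value
  }
}
```

A short-term memory stored with `ttl: 3600` stops being recallable an hour later, while a long-term memory stored with `ttl: null` persists indefinitely.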
const whale = new WhaleClient({ apiKey: 'wk_live_...' })

// Store a long-term memory
await whale.ai.memory.store({
  agent_id: 'agent_abc123',
  type: 'long_term',
  key: 'customer_preference',
  value: {
    preferred_size: 'M',
    favorite_colors: ['navy', 'black'],
    allergies: ['latex']
  },
  customer_id: 'cust_xyz789',
  ttl: null // null = no expiration
})

// Store a short-term memory with TTL
await whale.ai.memory.store({
  agent_id: 'agent_abc123',
  type: 'short_term',
  key: 'current_cart_context',
  value: { items: 3, total: 89.50 },
  ttl: 3600 // expires in 1 hour
})

// Retrieve memory for a customer
const memories = await whale.ai.memory.recall({
  agent_id: 'agent_abc123',
  customer_id: 'cust_xyz789',
  types: ['long_term', 'entity']
})

Evaluation Framework
Test agent quality systematically with datasets, test cases, and automated scoring. Run evaluations after prompt changes to catch regressions before they reach customers. Supports exact match, semantic similarity, and LLM judge scoring.
| Field | Type | Description |
|---|---|---|
| dataset_id | string | Reference to the evaluation dataset containing test cases. |
| test_cases | array | Array of input/expected_output pairs for scoring. |
| scoring_method | string | Scoring strategy: exact_match, semantic_similarity, or llm_judge. |
| judge_model | string | Model used for LLM judge scoring (e.g., claude-sonnet-4-20250514). |
| pass_threshold | number | Minimum score (0-1) for a test case to pass. |
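To make the scoring methods concrete, here is a sketch of the two simpler strategies. Word-set overlap stands in for real semantic similarity, which compares embedding vectors rather than tokens; these are illustrative, not the platform's scorers:

```javascript
// exact_match: 1 if outputs are identical after trimming, else 0.
function exactMatch(output, expected) {
  return output.trim() === expected.trim() ? 1 : 0
}

// Toy stand-in for semantic_similarity: Jaccard overlap of word sets.
// The real method compares embedding vectors, not tokens.
function tokenOverlap(output, expected) {
  const a = new Set(output.toLowerCase().split(/\W+/).filter(Boolean))
  const b = new Set(expected.toLowerCase().split(/\W+/).filter(Boolean))
  const intersection = [...a].filter((w) => b.has(w)).length
  const union = new Set([...a, ...b]).size
  return union === 0 ? 0 : intersection / union
}

// A test case passes when its score meets pass_threshold.
function passes(score, passThreshold) {
  return score >= passThreshold
}
```

`llm_judge` replaces the scoring function with a model call that grades the output against the expectation, which is why it also needs a `judge_model`.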
const whale = new WhaleClient({ apiKey: 'wk_live_...' })

// Create a dataset
const dataset = await whale.ai.eval.datasets.create({
  name: 'product_qa_v2',
  test_cases: [
    {
      input: 'Do you have this in red?',
      expected_output: 'Let me check our available colors for you.',
      tags: ['color_query']
    },
    {
      input: 'What is your return policy?',
      expected_output: 'You can return any item within 30 days.',
      tags: ['policy']
    }
  ]
})

// Run an evaluation
const run = await whale.ai.eval.run({
  agent_id: 'agent_abc123',
  dataset_id: dataset.id,
  scoring_method: 'llm_judge',
  judge_model: 'claude-sonnet-4-20250514',
  pass_threshold: 0.8
})

// Response
{
  "id": "eval_run_001",
  "status": "completed",
  "summary": {
    "total": 2,
    "passed": 2,
    "failed": 0,
    "avg_score": 0.91
  },
  "results": [
    { "test_case_id": 0, "score": 0.88, "passed": true, "judge_feedback": "..." },
    { "test_case_id": 1, "score": 0.94, "passed": true, "judge_feedback": "..." }
  ]
}

Safety Rules
Safety rules run on every agent response before it reaches the customer. Blocking rules prevent the response from being sent and return a fallback message. Warning rules flag the response for review but still deliver it.
| Rule | Mode | Description |
|---|---|---|
| content_filter | blocking | Blocks responses containing prohibited content categories. |
| pii_detection | warning | Flags responses that may contain personally identifiable information. |
| topic_restriction | blocking | Prevents the agent from discussing off-topic subjects. |
| tone_check | warning | Warns when response tone deviates from brand guidelines. |
| hallucination_guard | blocking | Blocks responses that reference products or policies not in the knowledge base. |
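Conceptually, the two modes combine like this: the first blocking violation replaces the response with its fallback, while warning violations accumulate but let the response through. A sketch under that assumption (rule evaluation actually happens inside the platform; the rule shape here is hypothetical):

```javascript
// Run rules over a draft response. Each rule is { name, mode, check, fallback },
// where check(text) returns true when the rule is violated. Illustrative only.
function applySafetyRules(response, rules) {
  const warnings = []
  for (const rule of rules) {
    if (!rule.check(response)) continue
    if (rule.mode === 'blocking') {
      // First blocking violation wins: the fallback replaces the response.
      return { text: rule.fallback, blocked: true, warnings }
    }
    warnings.push(rule.name) // warning rules flag but still deliver
  }
  return { text: response, blocked: false, warnings }
}
```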
const whale = new WhaleClient({ apiKey: 'wk_live_...' })

// Add a safety rule
await whale.ai.safety.create({
  agent_id: 'agent_abc123',
  rule_type: 'topic_restriction',
  mode: 'blocking',
  config: {
    blocked_topics: ['competitors', 'politics', 'religion'],
    fallback_message: "I can only help with questions about our products and services."
  }
})

// List active rules for an agent
const rules = await whale.ai.safety.list({ agent_id: 'agent_abc123' })

Cost Budgets
Set spending limits on AI usage to prevent runaway costs. Budgets can be configured at the store level or per-agent. When a budget is exceeded, the agent returns a graceful fallback message instead of making additional LLM calls.
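The enforcement logic is roughly: before each LLM call, compare accumulated spend against every configured limit, blocking once any limit is reached and alerting once usage crosses the threshold fraction. A sketch of that check (illustrative, not the billing system's code):

```javascript
// Decide whether a call may proceed given spend so far. Illustrative only.
// limits: e.g. { daily: 25.0, monthly: 500.0 } in dollars; spent has the same shape.
function checkBudget(spent, limits, alertThreshold = 0.8) {
  for (const period of Object.keys(limits)) {
    if (spent[period] >= limits[period]) {
      // Over budget: the agent serves the fallback message instead of calling the LLM.
      return { allowed: false, exceeded: period, alert: true }
    }
  }
  // Under budget, but fire an alert once any period crosses the threshold.
  const alert = Object.keys(limits).some(
    (p) => spent[p] / limits[p] >= alertThreshold
  )
  return { allowed: true, exceeded: null, alert }
}
```

With `alert_threshold: 0.8` and a $25 daily limit, the alert fires at $20 of daily spend and calls stop at $25.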
const whale = new WhaleClient({ apiKey: 'wk_live_...' })

// Set a store-wide budget
await whale.ai.budgets.set({
  scope: 'store',
  limits: {
    daily: 25.00,   // $25/day
    weekly: 150.00, // $150/week
    monthly: 500.00 // $500/month
  },
  alert_threshold: 0.8, // alert at 80% usage
  fallback_message: "Our AI assistant is temporarily unavailable. Please contact support."
})

// Set a per-agent budget
await whale.ai.budgets.set({
  scope: 'agent',
  agent_id: 'agent_abc123',
  limits: {
    daily: 10.00,
    monthly: 200.00
  }
})

// Check current usage
const usage = await whale.ai.budgets.usage()
// { daily_spent: 8.42, daily_limit: 25.00, daily_remaining: 16.58, ... }