Advanced Usage
This document covers advanced usage patterns for AI-Suite.
Configuration Options
Temperature Control
Control the randomness of responses by adjusting the temperature:
const response = await aiSuite.createChatCompletion(
'openai/gpt-4o',
[{ role: 'user', content: 'Write a creative story' }],
{
responseFormat: 'text',
temperature: 0.2 // Lower temperature for more deterministic responses
}
);
Max Output Tokens
Limit the maximum number of tokens in the response:
const response = await aiSuite.createChatCompletion(
'anthropic/claude-3-5-sonnet-20241022',
[{ role: 'user', content: 'Write a short story.' }],
{
responseFormat: 'text',
maxOutputTokens: 500 // Limit response to 500 tokens
}
);
Structured Output
JSON Object Mode
Get any valid JSON response:
const response = await aiSuite.createChatCompletion(
'openai/gpt-4o',
[{ role: 'user', content: 'Generate a user profile with name, age, and email' }],
{
responseFormat: 'json_object'
}
);
if (response.success) {
console.log(response.content_object); // Parsed JSON object
}
JSON Schema Mode (Strongly Typed)
Use Zod schemas for type-safe, validated JSON responses:
import { z } from 'zod';
const UserSchema = z.object({
name: z.string(),
age: z.number().int().positive(),
email: z.string().email(),
interests: z.array(z.string())
});
const response = await aiSuite.createChatCompletion(
'openai/gpt-4o',
[{ role: 'user', content: 'Generate a sample user' }],
{
responseFormat: 'json_schema',
zodSchema: UserSchema
}
);
if (response.success) {
// response.content_object is typed according to your schema
const user = response.content_object;
console.log(`${user.name} is ${user.age} years old`);
}
This works with all providers that support JSON mode (OpenAI, Gemini, etc.).
Tool/Function Calling
Enable models to call functions:
const tools = [{
type: 'function' as const,
function: {
name: 'get_weather',
description: 'Get the current weather for a location',
parameters: {
type: 'object' as const,
properties: {
location: {
type: 'string' as const,
description: 'The city name'
},
unit: {
type: 'string' as const,
description: 'Temperature unit (celsius or fahrenheit)'
}
},
required: ['location'],
additionalProperties: false
},
strict: true
}
}];
const response = await aiSuite.createChatCompletion(
'openai/gpt-4o',
[{ role: 'user', content: 'What is the weather in Paris?' }],
{
responseFormat: 'text',
tools
}
);
if (response.success && response.tools) {
for (const tool of response.tools) {
console.log(`Calling ${tool.name} with:`, tool.content);
// Execute your function and send result back
}
}
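To complete the loop, the function's result goes back to the model as a tool-role message (see the Message Roles section). The exact message shape below is an assumption for illustration, not AI-Suite's confirmed API:

```typescript
// Hypothetical sketch: after running the requested function locally,
// append its result as a tool-role message and send a follow-up request.
interface ToolCall {
  name: string;
  content: string; // JSON-encoded arguments returned by the model
}

function toolResultMessage(call: ToolCall, result: unknown) {
  return {
    role: 'tool' as const,
    content: JSON.stringify({ tool: call.name, result })
  };
}

// Example: answer the get_weather call with a local lookup.
const call: ToolCall = { name: 'get_weather', content: '{"location":"Paris"}' };
const followUp = [
  { role: 'user' as const, content: 'What is the weather in Paris?' },
  toolResultMessage(call, { tempC: 18 })
];
```

The followUp array would then be passed to createChatCompletion again so the model can produce its final answer from the tool output.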
Retry Logic
Built-in retry mechanism with exponential backoff:
const response = await aiSuite.createChatCompletion(
'openai/gpt-4o',
[{ role: 'user', content: 'Hello!' }],
{
responseFormat: 'text',
retry: {
attempts: 5,
delay: (attempt) => {
// Exponential backoff: 100ms, 200ms, 400ms, 800ms, 1600ms
return Math.pow(2, attempt) * 100;
}
}
}
);
Default retry configuration:
- Attempts: 1 (no retry)
- Delay: Exponential backoff starting at 100ms
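The delay callback from the example above is a pure function, so its schedule can be checked in isolation. A standalone sketch of the same formula:

```typescript
// Same backoff formula as in the retry example: doubles each attempt,
// starting from 100ms at attempt 0.
const backoff = (attempt: number): number => Math.pow(2, attempt) * 100;

const schedule = [0, 1, 2, 3, 4].map(backoff);
// schedule is [100, 200, 400, 800, 1600]
```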
Langfuse Integration
AI-Suite provides built-in integration with Langfuse for tracking and monitoring AI interactions.
Setup
import { Langfuse } from 'langfuse';
const langfuse = new Langfuse({
publicKey: process.env.LANGFUSE_PUBLIC_KEY,
secretKey: process.env.LANGFUSE_SECRET_KEY,
});
const aiSuite = new AISuite(
{
openaiKey: process.env.OPENAI_API_KEY,
anthropicKey: process.env.ANTHROPIC_API_KEY
},
{
langFuse: langfuse
}
);
Adding Metadata
Track additional context with your requests:
const response = await aiSuite.createChatCompletion(
'openai/gpt-4o',
[{ role: 'user', content: 'Hello!' }],
{
responseFormat: 'text',
metadata: {
langFuse: {
userId: 'user-123',
sessionId: 'session-456',
environment: 'production',
name: 'greeting-interaction',
tags: ['customer-support', 'greeting']
},
// Custom metadata
customField: 'value'
}
}
);
What Gets Tracked
When Langfuse is integrated, AI-Suite automatically tracks:
- Model used for each request
- Input messages
- Output responses
- Token usage (input, output, cached, reasoning, thinking)
- Execution time
- Success/failure status
- Custom metadata
Comparing Multiple Providers
AI-Suite makes it easy to compare responses from different providers:
const responses = await aiSuite.createChatCompletionMultiResult(
[
'openai/gpt-4o',
'anthropic/claude-3-5-sonnet-20241022',
'gemini/gemini-2.5-flash'
],
[{ role: 'user', content: 'Explain quantum computing in simple terms.' }],
{
responseFormat: 'text',
temperature: 0.7
}
);
// Responses is an array in the same order as providers
const [openaiResponse, claudeResponse, geminiResponse] = responses;
if (openaiResponse.success) {
console.log('OpenAI:', openaiResponse.content);
console.log('Time:', openaiResponse.execution_time + 'ms');
}
if (claudeResponse.success) {
console.log('Claude:', claudeResponse.content);
console.log('Time:', claudeResponse.execution_time + 'ms');
}
if (geminiResponse.success) {
console.log('Gemini:', geminiResponse.content);
console.log('Time:', geminiResponse.execution_time + 'ms');
}
Reasoning Models (OpenAI o1/o3, Grok)
OpenAI’s reasoning models (o1, o3) and Grok support extended reasoning:
const response = await aiSuite.createChatCompletion(
'openai/o1',
[{ role: 'user', content: 'Solve this complex problem: ...' }],
{
responseFormat: 'text',
reasoning: {
effort: 'high' // 'low' | 'medium' | 'high'
}
}
);
if (response.success) {
console.log('Response:', response.content);
console.log('Reasoning tokens used:', response.usage?.reasoning_tokens);
console.log('Total tokens:', response.usage?.total_tokens);
}
The reasoning.effort parameter controls how much computational effort the model uses for reasoning:
- low: Faster, less thorough reasoning
- medium: Balanced reasoning
- high: Slower, more thorough reasoning
Thinking Mode (Gemini 2.5)
Gemini 2.5 models support thinking budget for extended reasoning:
const response = await aiSuite.createChatCompletion(
'gemini/gemini-2.5-pro',
[{ role: 'user', content: 'Analyze this complex problem deeply...' }],
{
responseFormat: 'text',
thinking: {
budget: 1024, // Token budget for thinking (0-16384)
output: true // Include thinking process in output
}
}
);
if (response.success) {
console.log('Response:', response.content);
console.log('Thinking tokens used:', response.usage?.thoughts_tokens);
}
Notes:
- budget: Number of tokens allocated for thinking (0-16384). Higher values allow more thorough analysis.
- output: Whether to include the thinking process in the response.
- Currently only works with gemini-2.5-pro.
Hooks System
Intercept and process requests/responses:
const aiSuite = new AISuite(
{
openaiKey: process.env.OPENAI_API_KEY,
},
{
hooks: {
handleRequest: async (req) => {
// Log or modify request before sending
console.log('Sending request:', JSON.stringify(req, null, 2));
// You can throw an error to abort the request
// throw new Error('Request aborted');
},
handleResponse: async (req, res, metadata) => {
// Process response
console.log('Received response:', res);
console.log('Metadata:', metadata);
// Log to your own tracking system
await myTrackingSystem.log({
request: req,
response: res,
metadata
});
},
failOnError: true // If false, hook errors won't abort the request
}
}
);
Use cases for hooks:
- Custom logging
- Request/response transformation
- Additional validation
- Integration with custom tracking systems
- A/B testing
- Request filtering/blocking
Error Handling
AI-Suite provides consistent error handling across all providers:
const response = await aiSuite.createChatCompletion(
'openai/gpt-4o',
[{ role: 'user', content: 'Hello, world!' }],
{
responseFormat: 'text'
}
);
if (response.success) {
console.log('Success:', response.content);
console.log('Tokens used:', response.usage?.total_tokens);
} else {
// Error handling
console.error('Error tag:', response.tag);
console.error('Error message:', response.error);
console.error('Raw error:', response.raw);
// Handle specific error types
switch (response.tag) {
case 'InvalidAuth':
console.error('Invalid API key');
break;
case 'RateLimitExceeded':
console.error('Rate limit hit, retry later');
break;
case 'InvalidRequest':
console.error('Invalid request parameters');
break;
case 'ServerError':
console.error('Provider server error');
break;
default:
console.error('Unknown error');
}
}
Error tags:
- InvalidAuth: Authentication/API key issues
- InvalidRequest: Malformed request
- InvalidModel: Model not found or not available
- RateLimitExceeded: Rate limit hit
- ServerError: Provider server error (5xx)
- ServerOverloaded: Server overloaded/capacity issues
- Unknown: Other errors
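As an illustration only, the tags roughly correspond to HTTP status classes returned by providers. AI-Suite performs this mapping internally; the exact rules sketched here are an assumption, not its actual implementation:

```typescript
// Hypothetical illustration of how provider HTTP statuses might map to the
// documented error tags. Not AI-Suite's actual implementation.
type ErrorTag =
  | 'InvalidAuth' | 'InvalidRequest' | 'InvalidModel'
  | 'RateLimitExceeded' | 'ServerError' | 'ServerOverloaded' | 'Unknown';

function tagForStatus(status: number): ErrorTag {
  if (status === 401 || status === 403) return 'InvalidAuth';
  if (status === 404) return 'InvalidModel';
  if (status === 429) return 'RateLimitExceeded';
  if (status === 400) return 'InvalidRequest';
  if (status === 503) return 'ServerOverloaded';
  if (status >= 500) return 'ServerError';
  return 'Unknown';
}
```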
Custom LLM Provider
Use any OpenAI-compatible API:
// Example: Using Ollama
const aiSuite = new AISuite({
customURL: 'http://localhost:11434/v1',
customLLMKey: 'not-needed' // Ollama doesn't require auth
});
const response = await aiSuite.createChatCompletion(
'custom-llm/llama3.2',
[{ role: 'user', content: 'Hello!' }],
{
responseFormat: 'text',
temperature: 0.7
}
);
// Example: Using vLLM
const vllmSuite = new AISuite({
customURL: 'http://your-vllm-server:8000/v1',
customLLMKey: 'optional-key'
});
// Example: Using LM Studio
const lmStudioSuite = new AISuite({
customURL: 'http://localhost:1234/v1',
});
This works with any server implementing the OpenAI Chat Completions API format.
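Concretely, a server counts as OpenAI-compatible if it accepts the Chat Completions request body at POST {baseURL}/chat/completions. A minimal sketch of that payload (the model name is a placeholder):

```typescript
// Minimal Chat Completions request body as defined by the OpenAI API.
// Compatible servers (Ollama, vLLM, LM Studio) accept this shape.
const body = {
  model: 'llama3.2',
  messages: [{ role: 'user', content: 'Hello!' }],
  temperature: 0.7
};

const request = {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    Authorization: 'Bearer not-needed' // some local servers ignore this header
  },
  body: JSON.stringify(body)
};
```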
Message Roles
AI-Suite supports different message roles:
const messages = [
{
role: 'developer', // System/developer instructions
content: 'You are a helpful assistant specialized in TypeScript'
},
{
role: 'user',
content: 'How do I define an interface?'
},
{
role: 'assistant',
content: 'You can define an interface like this: interface MyInterface { ... }'
},
{
role: 'user',
content: 'Can you show me an example?'
}
];
const response = await aiSuite.createChatCompletion(
'openai/gpt-4o',
messages,
{ responseFormat: 'text' }
);
Role mapping per provider:
- developer: Mapped to the appropriate system instruction for each provider
- user: User message
- assistant: Assistant response
- tool: Tool/function call result
Image and File Support
AI-Suite supports sending images and files as part of your messages, enabling multimodal AI interactions. This feature is available for compatible providers (OpenAI, Anthropic, Google Gemini).
Supported Content Types
AI-Suite supports three content types in messages:
- Text: Plain text or structured text objects
- Images: Image data as Buffer or base64 string
- Files: Documents with specified media type (PDF, PNG, JPG, JPEG, GIF, WEBP)
Sending Images
Send images by using the InputContentImage format:
import { readFileSync } from 'fs';
const img = readFileSync('./path/to/image.jpg');
const response = await aiSuite.createChatCompletion(
'openai/gpt-4o',
[
{
role: 'user',
content: {
type: 'image',
image: img // readFileSync returns a Buffer, use it directly
}
},
{
role: 'user',
content: 'What do you see in this image?'
}
],
{ responseFormat: 'text' }
);
You can also send base64-encoded images:
const base64Image = 'iVBORw0KGgoAAAANSUhEUgAA...'; // Your base64 string
const response = await aiSuite.createChatCompletion(
'openai/gpt-4o',
[
{
role: 'user',
content: {
type: 'image',
image: base64Image
}
},
{
role: 'user',
content: 'Describe this image'
}
],
{ responseFormat: 'text' }
);
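If you start from a Buffer but need the base64 form, Node's Buffer API converts directly (shown here with inline stand-in bytes instead of a real file):

```typescript
// Stand-in bytes; in practice this would be readFileSync('./image.jpg').
const bytes = Buffer.from('hello');
const base64Image = bytes.toString('base64');
// base64Image is 'aGVsbG8='

// And back again, e.g. to verify a round trip:
const roundTrip = Buffer.from(base64Image, 'base64').toString('utf8');
// roundTrip is 'hello'
```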
Sending Files
Send documents using the InputContentFile format:
import { readFileSync } from 'fs';
const pdf = readFileSync('./document.pdf');
const response = await aiSuite.createChatCompletion(
'anthropic/claude-3-5-sonnet-20241022',
[
{
role: 'user',
content: {
type: 'file',
mediaType: 'application/pdf',
file: pdf, // readFileSync returns a Buffer, use it directly
fileName: 'document.pdf'
}
},
{
role: 'user',
content: 'Summarize the contents of this PDF'
}
],
{ responseFormat: 'text' }
);
Supported media types:
- application/pdf: PDF documents
- image/png: PNG images
- image/jpg: JPG images
- image/jpeg: JPEG images
- image/gif: GIF images
- image/webp: WebP images
Mixing Multiple Content Types
You can send multiple content items (text, images, files) in a single message:
import { readFileSync } from 'fs';
const img1 = readFileSync('./image1.jpg');
const img2 = readFileSync('./image2.jpg');
const response = await aiSuite.createChatCompletion(
'openai/gpt-4o',
[
{
role: 'user',
content: [
{
type: 'text',
text: 'Please analyze these images:'
},
{
type: 'image',
image: img1
},
{
type: 'image',
image: img2
},
{
type: 'text',
text: 'What are the differences between them?'
}
]
}
],
{ responseFormat: 'text' }
);
Important Notes
- Role Restrictions: Images and files can only be sent in user and developer role messages. Assistant and tool messages support only text content.
- Provider Support: Not all providers support all content types. Check your provider's documentation for specific capabilities.
- File Size Limits: Different providers have different file size limits. Consult provider documentation for specifics.
- Structured Output: Image and file inputs work with all response formats, including json_schema and json_object.