VectorDB: Scalable Vector Search for AI Agents
December 17, 2025
Modern AI agents need more than conversation—they need memory. VectorDB is OpenKBS's high-performance vector storage service that gives every agent its own semantic search capability. Store millions of embeddings, query them in milliseconds, and build RAG applications that actually understand context.
Traditional databases search by exact matches. VectorDB searches by meaning. When a user asks "What are our shipping policies?", VectorDB doesn't look for exact keyword matches—it finds documents semantically related to shipping, delivery, returns, and logistics. This is the foundation of every serious RAG implementation.
How It Works: Architecture Overview
Every OpenKBS agent can have its own isolated vector index. When you enable VectorDB for an agent, the platform automatically:
• Creates a dedicated vector index for that agent
• Generates embeddings using OpenAI's text-embedding-3 models (large: 3072 dimensions, small: 1536 dimensions)
• Stores vectors with full metadata including encrypted item data
• Performs KNN similarity search at query time (see the conceptual sketch after this list)
This isolation means your customer service agent's knowledge base never mixes with your invoice processing agent's data. Each agent maintains its own semantic memory.
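To make the search step concrete, here is a minimal, illustrative sketch of the scoring a KNN similarity search performs. This is a conceptual example, not OpenKBS internals: it ranks stored vectors by cosine similarity to a query embedding and keeps the top matches.
// Conceptual sketch of KNN scoring (not OpenKBS's implementation):
// rank stored vectors by cosine similarity to the query embedding.
function cosine(a, b) {
    let dot = 0, normA = 0, normB = 0;
    for (let i = 0; i < a.length; i++) {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

function knnSearch(queryEmbedding, items, topK) {
    return items
        .map(item => ({ ...item, score: cosine(queryEmbedding, item.embedding) }))
        .sort((a, b) => b.score - a.score)
        .slice(0, topK);
}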
Enabling VectorDB for Your Agent
Activating VectorDB takes three steps:
1. Configure the Search Engine
In your agent settings, set the search engine to "VectorDB" instead of the default IndexedDB.
2. Choose Your Embedding Model
Select either:
• text-embedding-3-large (3072 dimensions) — Higher accuracy, best for complex domains
• text-embedding-3-small (1536 dimensions) — Faster, cost-effective for general use
3. Set Embedding Dimension
Choose from: 256, 512, 768, 1024, 1536, or 3072 dimensions. Higher dimensions capture more semantic nuance but require more storage. A configuration sketch follows below.
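In code terms, the resulting agent configuration might look like the sketch below. The searchEngine key appears in the chat-api snippet later in this post; the embeddingModel and embeddingDimension field names are assumptions based on the createItem example and may differ from the actual settings schema.
// Hypothetical agent settings sketch; embeddingModel/embeddingDimension
// field names are assumptions, searchEngine appears in the chat-api code.
const kbData = {
    searchEngine: 'VectorDB',                  // instead of the default 'IndexedDB'
    embeddingModel: 'text-embedding-3-large',  // or 'text-embedding-3-small'
    embeddingDimension: 3072                   // 256, 512, 768, 1024, 1536, or 3072
};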
Search Parameters
Fine-tune your semantic search with these options:
• topK — Number of results to return (default: 30)
• maxTokens — Maximum total tokens in results (default: 10,000)
• minScore — Minimum similarity threshold (0-100%)
These parameters let you balance comprehensive retrieval against focused, relevant results: a customer support agent might use a high topK with a strict minScore, while a research agent might prioritize maxTokens for deeper context. The sketch below shows where these options live.
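The option keys in this sketch are the ones read by the chat-api handler shown in the next section; the values here are purely illustrative.
// Illustrative tuning for a support agent: broad retrieval with a
// strict relevance cutoff. Keys match the chat-api snippet below.
const kbData = {
    searchEngine: 'VectorDB',
    options: {
        vectorDBTopK: 50,        // return up to 50 matches
        vectorDBMaxTokens: 8000, // cap total tokens of retrieved context
        vectorDBMinScore: 75     // drop matches below 75% similarity
    }
};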
For Developers: Integration Patterns
VectorDB integrates seamlessly with OpenKBS handlers. Here's how the system processes queries:
When users send messages, the system automatically:
1. Converts the conversation to an embedding
2. Queries VectorDB for semantically similar items
3. Injects relevant context into the system prompt
4. Sends the enriched prompt to the LLM
// Automatic RAG flow in chat-api
if (kbData?.searchEngine === 'VectorDB') {
    const { knowledgeBase } = await getKnowledgeBase({
        query: JSON.stringify(payload.messages),
        kbId,
        kbData,
        topK: kbData?.options?.vectorDBTopK || 30,
        maxTokens: kbData?.options?.vectorDBMaxTokens || 10000,
        minScore: kbData?.options?.vectorDBMinScore || null
    });
    // knowledgeBase is injected into the system prompt
}
Adding Items to VectorDB
Items are automatically indexed when you use the NoSQL API with embeddings:
// Create item with embeddings
await createItem({
    kbId: 'your-agent-id',
    itemId: 'doc-001',
    totalTokens: 500,
    embeddings: [/* 3072 float values */],
    embeddingModel: 'text-embedding-3-large',
    embeddingDimension: 3072,
    attributes: [
        { attrType: 'body', attrName: 'content' },
        { attrType: 'keyword1', attrName: 'category' }
    ],
    item: {
        content: 'Your document content here...',
        category: 'policies'
    }
});
The system stores both the vector (for similarity search) and the full item data (for retrieval), all encrypted with your agent's key.
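The embeddings array itself comes from your embedding provider. As a sketch, generating it with OpenAI's official Node SDK (an assumption; the post does not prescribe a particular client) might look like this, matching the model and dimension passed to createItem above:
// Sketch: producing the embeddings array with the official `openai`
// Node SDK before calling createItem. The SDK choice is an assumption.
import OpenAI from 'openai';

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

const response = await openai.embeddings.create({
    model: 'text-embedding-3-large',
    input: 'Your document content here...',
    dimensions: 3072
});

const embeddings = response.data[0].embedding; // array of 3072 floats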
Scalability: Built for Enterprise
VectorDB is designed to scale without infrastructure management:
• Serverless architecture — No servers to provision, patch, or scale
• Per-agent isolation — Each agent's vectors are completely separated
• Automatic index management — Indexes are created and optimized automatically
• Global availability — Deployed across multiple regions for low latency
• High durability — Built on enterprise-grade cloud infrastructure
Whether you have 1,000 vectors or 100 million, VectorDB handles scaling transparently. You focus on building intelligent agents; we handle the infrastructure.
Pricing: Start at €2.50/month
VectorDB pricing is simple and predictable:
Upfront Storage:
• €2.50/month for 1GB of vector storage (prepaid)
Usage-Based Pricing:
• Queries: 10 credits per 1,000 queries (€0.10 per 1K queries)
• Writes: 1 credit per 1,000 operations (€0.01 per 1K writes)
Example Costs:
• Small RAG app (1GB + 10K queries/month): €2.50 + €1.00 = €3.50/month
• Medium RAG app (2GB + 100K queries/month): €5.00 + €10.00 = €15.00/month
• Enterprise RAG (10GB + 1M queries/month): €25.00 + €100.00 = €125.00/month
Compare this to standalone vector database services that often charge $50-500/month for similar capabilities.
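The example costs follow directly from the rates above. Here is a small sketch that reproduces the arithmetic so you can estimate your own bill; the rates are hard-coded from this post, so verify them against current pricing.
// Estimate monthly VectorDB cost from the rates published above.
// Rates are hard-coded from this post; verify against current pricing.
const STORAGE_EUR_PER_GB = 2.50; // prepaid storage, per GB per month
const QUERY_EUR_PER_1K   = 0.10; // 10 credits per 1,000 queries
const WRITE_EUR_PER_1K   = 0.01; // 1 credit per 1,000 writes

function monthlyCost({ storageGB, queries, writes = 0 }) {
    return storageGB * STORAGE_EUR_PER_GB
        + (queries / 1000) * QUERY_EUR_PER_1K
        + (writes / 1000) * WRITE_EUR_PER_1K;
}

console.log(monthlyCost({ storageGB: 2, queries: 100000 })); // 15 (€/month)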
Use Cases
Customer Support Agents
Index your entire knowledge base—FAQs, documentation, policies—and let agents find relevant answers instantly. Customers get accurate, contextual responses without waiting for human agents.
Document Intelligence
Process thousands of contracts, invoices, or reports. Agents retrieve relevant documents based on semantic queries: "Show me all contracts with penalty clauses" finds results even if they use different terminology.
Personalized Recommendations
Store product descriptions as vectors. When users describe what they want, find semantically similar products—even if they don't use exact keywords.
Code Assistants
Index your codebase documentation, API references, and past solutions. Developers ask questions in natural language and get contextually relevant code examples.
Getting Started
1. Create an OpenKBS account if you don't have one
2. Create a new agent or select an existing one
3. Go to Agent Settings → Search Engine → Select "VectorDB"
4. Choose your embedding model and dimension
5. Start adding items with embeddings via the API or UI
Your agent now has semantic memory. Every query benefits from contextual understanding, and every response is grounded in your actual data.
For API documentation and code examples, visit our developer tutorials.
