Overview

context-window provides extensive configuration options to customize AI models, vector storage, text chunking, and retrieval behavior.

Configuration Interface

interface CreateContextWindowOptions {
  namespace: string;
  data: string | string[];
  ai?: AIConfig;
  vectorStore?: VectorStoreConfig;
  chunk?: ChunkConfig;
  limits?: LimitsConfig;
}
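
For orientation, here is a minimal call using only the two required options (the import path is an assumption; the defaults documented below apply to everything omitted):

import { createCtxWindow } from "context-window";

// Only the required options; ai, vectorStore, chunk, and limits
// fall back to the defaults documented below.
await createCtxWindow({
  namespace: "product-catalog",
  data: ["./docs"]
});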

Required Options

namespace

Type: string (required)

Unique identifier for the context window. Also used as the Pinecone namespace.

Rules:
  • Must be unique across your application
  • Use descriptive names (e.g., "user-documentation", not "docs1")
  • Alphanumeric characters, hyphens, and underscores only

Example:
namespace: "product-catalog"

data

Type: string | string[] (required)

File path(s) or directory path(s) to ingest.

Supported formats: .txt, .md, .pdf

Behavior:
  • Directories are processed recursively
  • Hidden files (starting with .) are ignored
  • Unsupported files are skipped

Examples:
// Single file
data: "./document.pdf"

// Multiple files
data: ["./doc1.pdf", "./doc2.md"]

// Directory
data: ["./documentation"]

// Mixed
data: ["./docs", "./extra/file.pdf"]

Optional Configurations

ai

Type: AIConfig (optional)

AI provider and model configuration.

Default:
{
  provider: "openai",
  model: "gpt-4o-mini"
}
Examples:
// Use default (gpt-4o-mini)
ai: { provider: "openai" }

// Use GPT-4 for better accuracy
ai: {
  provider: "openai",
  model: "gpt-4o"
}

// Use GPT-3.5 for faster, cheaper responses
ai: {
  provider: "openai",
  model: "gpt-3.5-turbo"
}

vectorStore

Type: VectorStoreConfig (optional)

Vector store provider and configuration.

Default:
{
  provider: "pinecone",
  namespace: namespace  // Defaults to the top-level namespace option
}
Examples:
// Use default namespace (same as namespace)
vectorStore: { provider: "pinecone" }

// Custom namespace
vectorStore: {
  provider: "pinecone",
  namespace: "v2-documentation"
}

// Environment-based namespace
vectorStore: {
  provider: "pinecone",
  namespace: `docs-${process.env.NODE_ENV}`
}

chunk

Type: ChunkConfig (optional)

Text chunking configuration.

Default:
{
  size: 1000,
  overlap: 150
}
Examples:
// Use defaults
chunk: { size: 1000, overlap: 150 }

// Precise answers (smaller chunks)
chunk: {
  size: 500,
  overlap: 75
}

// Comprehensive answers (larger chunks)
chunk: {
  size: 2000,
  overlap: 300
}

// Legal documents (large chunks, high overlap)
chunk: {
  size: 1500,
  overlap: 250
}
Choosing Chunk Size:
  • Small (500-800): best for FAQ documents, simple Q&A, definition lookups, and quick facts. Trade-offs: ✅ precise matches, ✅ less noise, ❌ may miss context, ❌ more chunks = more cost.
  • Medium (1000-1500): the balanced default for general documentation.
  • Large (1500-2000): best for comprehensive answers and documents that need surrounding context, such as legal text.
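
To make size and overlap concrete, here is an illustrative sliding-window sketch. This is not the library's actual chunker; it assumes character-based counting:

// Illustrative only: split text into windows of `size` characters,
// where consecutive windows share `overlap` characters.
function chunkText(text: string, size = 1000, overlap = 150): string[] {
  const chunks: string[] = [];
  const step = size - overlap; // advance by size minus overlap each pass
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // last window reached the end
  }
  return chunks;
}

// With size 1000 and overlap 150, a 2000-character document
// yields chunks starting at offsets 0, 850, and 1700.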

limits

Type: LimitsConfig (optional)

Query and retrieval limits.

Default:
{
  topK: 8,
  maxContextChars: 8000,
  scoreThreshold: 0
}
Examples:
// Use defaults (balanced)
limits: {
  topK: 8,
  maxContextChars: 8000,
  scoreThreshold: 0
}

// High precision (strict matching)
limits: {
  topK: 5,
  maxContextChars: 6000,
  scoreThreshold: 0.75
}

// Comprehensive coverage
limits: {
  topK: 12,
  maxContextChars: 12000,
  scoreThreshold: 0
}

// Cost-optimized
limits: {
  topK: 5,
  maxContextChars: 5000,
  scoreThreshold: 0.6
}
Tuning Guidelines (ranges below indicate recommended spans, not literal values):

For Accuracy

limits: {
  topK: 10-15,
  maxContextChars: 10000-12000,
  scoreThreshold: 0-0.5
}
More context = better answers

For Speed

limits: {
  topK: 3-5,
  maxContextChars: 5000-6000,
  scoreThreshold: 0.7
}
Less context = faster responses

For Cost

limits: {
  topK: 5,
  maxContextChars: 5000,
  scoreThreshold: 0.6
}
Fewer tokens = lower costs

For Precision

limits: {
  topK: 5,
  maxContextChars: 6000,
  scoreThreshold: 0.75-0.85
}
High threshold = focused answers
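
For intuition, the three limits compose at query time roughly as follows. This is an illustrative sketch, not the library's internals; the Match shape and field names are assumptions:

interface Match {
  text: string;
  score: number; // similarity score from the vector store
}

// Illustrative only: how topK, scoreThreshold, and maxContextChars
// typically combine to build the context passed to the model.
function buildContext(
  matches: Match[],
  { topK = 8, maxContextChars = 8000, scoreThreshold = 0 } = {}
): string {
  const selected = matches
    .filter((m) => m.score >= scoreThreshold) // drop low-confidence matches
    .sort((a, b) => b.score - a.score)        // best matches first
    .slice(0, topK);                          // keep at most topK chunks

  let context = "";
  for (const m of selected) {
    if (context.length + m.text.length > maxContextChars) break; // cap total size
    context += m.text + "\n\n";
  }
  return context;
}

Raising scoreThreshold shrinks the candidate pool before topK applies, which is why the high-precision presets pair a high threshold with a low topK.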

Complete Examples

General Purpose Configuration

await createCtxWindow({
  namespace: "general-docs",
  data: ["./documentation"],
  ai: {
    provider: "openai",
    model: "gpt-4o-mini"
  },
  vectorStore: {
    provider: "pinecone"
  },
  chunk: {
    size: 1000,
    overlap: 150
  },
  limits: {
    topK: 8,
    maxContextChars: 8000,
    scoreThreshold: 0
  }
});

High-Accuracy Configuration

await createCtxWindow({
  namespace: "research-papers",
  data: ["./papers"],
  ai: {
    provider: "openai",
    model: "gpt-4o"  // Best model
  },
  chunk: {
    size: 1500,      // Larger chunks for context
    overlap: 250
  },
  limits: {
    topK: 12,        // More chunks
    maxContextChars: 12000,
    scoreThreshold: 0  // No filtering
  },
  vectorStore: { provider: "pinecone" }
});

Cost-Optimized Configuration

await createCtxWindow({
  namespace: "faq",
  data: ["./faq.md"],
  ai: {
    provider: "openai",
    model: "gpt-4o-mini"  // Cheapest model
  },
  chunk: {
    size: 2000,           // Fewer chunks
    overlap: 100
  },
  limits: {
    topK: 5,              // Fewer retrievals
    maxContextChars: 5000, // Less context
    scoreThreshold: 0.6   // Filter low-quality matches
  },
  vectorStore: { provider: "pinecone" }
});

Legal/Compliance Configuration

await createCtxWindow({
  namespace: "legal-docs",
  data: ["./contracts", "./policies"],
  ai: {
    provider: "openai",
    model: "gpt-4o"  // Highest accuracy
  },
  chunk: {
    size: 1500,      // Large chunks for full context
    overlap: 300     // High overlap for continuity
  },
  limits: {
    topK: 5,         // Fewer, more relevant chunks
    maxContextChars: 8000,
    scoreThreshold: 0.75  // High confidence only
  },
  vectorStore: { provider: "pinecone" }
});

Environment Variables

Required environment configuration:
# Required
OPENAI_API_KEY=sk-...
PINECONE_API_KEY=...

# Optional (with defaults)
PINECONE_INDEX=context-window
PINECONE_ENVIRONMENT=us-east-1
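
A simple startup check can catch missing keys early (a sketch; the variable names and fallback values mirror the list above):

// Fail fast if required keys are missing; Node reads these from process.env.
for (const key of ["OPENAI_API_KEY", "PINECONE_API_KEY"]) {
  if (!process.env[key]) {
    throw new Error(`Missing required environment variable: ${key}`);
  }
}

// Optional variables fall back to the documented defaults.
const pineconeIndex = process.env.PINECONE_INDEX ?? "context-window";
const pineconeEnvironment = process.env.PINECONE_ENVIRONMENT ?? "us-east-1";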

Type Definitions

Full TypeScript types for reference:
interface CreateContextWindowOptions {
  namespace: string;
  data: string | string[];
  ai?: AIConfig;
  vectorStore?: VectorStoreConfig;
  chunk?: ChunkConfig;
  limits?: LimitsConfig;
}

interface AIConfig {
  provider: "openai";
  model?: string;
}

interface VectorStoreConfig {
  provider: "pinecone";
  namespace?: string;
}

interface ChunkConfig {
  size?: number;
  overlap?: number;
}

interface LimitsConfig {
  topK?: number;
  maxContextChars?: number;
  scoreThreshold?: number;
}