Overview

context-window provides extensive configuration options to customize AI models, vector storage, text chunking, and retrieval behavior.

Configuration Interface

interface CreateContextWindowOptions {
  namespace: string;
  data: string | string[];
  ai?: AIConfig;
  vectorStore?: VectorStoreConfig;
  chunk?: ChunkConfig;
  limits?: LimitsConfig;
}
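
For orientation, here is a minimal call using only the two required options (the import path is an assumption; the defaults documented below apply to everything omitted):

import { createCtxWindow } from "context-window";

// Only the required options; ai, vectorStore, chunk, and limits
// fall back to the defaults documented below.
await createCtxWindow({
  namespace: "product-catalog",
  data: ["./docs"]
});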

Required Options

namespace

Type: string (required)

Unique identifier for the context window. Also used as the Pinecone namespace.

Rules:
  • Must be unique across your application
  • Use descriptive names (e.g., "user-documentation", not "docs1")
  • Alphanumeric characters, hyphens, and underscores only

Example:
namespace: "product-catalog"

data

Type: string | string[] (required)

File path(s) or directory path(s) to ingest.

Supported formats: .txt, .md, .pdf

Behavior:
  • Directories are processed recursively
  • Hidden files (starting with .) are ignored
  • Unsupported files are skipped

Examples:
// Single file
data: "./document.pdf"

// Multiple files
data: ["./doc1.pdf", "./doc2.md"]

// Directory
data: ["./documentation"]

// Mixed
data: ["./docs", "./extra/file.pdf"]

Optional Configurations

ai

Type: AIConfig (optional)

AI provider and model configuration.

Default:
{
  provider: "openai",
  model: "gpt-4o-mini"
}
Examples:
// Use default (gpt-4o-mini)
ai: { provider: "openai" }

// Use GPT-4 for better accuracy
ai: {
  provider: "openai",
  model: "gpt-4o"
}

// Use GPT-3.5 for faster, cheaper responses
ai: {
  provider: "openai",
  model: "gpt-3.5-turbo"
}

vectorStore

Type: VectorStoreConfig (optional)

Vector store provider and configuration.

Default:
{
  provider: "pinecone",
  namespace: namespace  // Defaults to the top-level namespace option
}
Examples:
// Use default namespace (same as namespace)
vectorStore: { provider: "pinecone" }

// Custom namespace
vectorStore: {
  provider: "pinecone",
  namespace: "v2-documentation"
}

// Environment-based namespace
vectorStore: {
  provider: "pinecone",
  namespace: `docs-${process.env.NODE_ENV}`
}

chunk

Type: ChunkConfig (optional)

Text chunking configuration.

Default:
{
  size: 1000,
  overlap: 150
}
Examples:
// Use defaults
chunk: { size: 1000, overlap: 150 }

// Precise answers (smaller chunks)
chunk: {
  size: 500,
  overlap: 75
}

// Comprehensive answers (larger chunks)
chunk: {
  size: 2000,
  overlap: 300
}

// Legal documents (large chunks, high overlap)
chunk: {
  size: 1500,
  overlap: 250
}
Choosing Chunk Size:
  • Small (500-800): best for FAQ documents, simple Q&A, definition lookups, and quick facts. Trade-offs: ✅ precise matches, ✅ less noise, ❌ may miss context, ❌ more chunks = more cost.
  • Medium (1000-1500): the balanced default for general documentation.
  • Large (1500-2000): best for comprehensive answers and documents that need surrounding context, such as legal text.
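
To make size and overlap concrete, here is an illustrative sliding-window sketch. This is not the library's actual chunker; it assumes character-based counting:

// Illustrative only: split text into windows of `size` characters,
// where consecutive windows share `overlap` characters.
function chunkText(text: string, size = 1000, overlap = 150): string[] {
  const chunks: string[] = [];
  const step = size - overlap; // advance by size minus overlap each pass
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // last window reached the end
  }
  return chunks;
}

// With size 1000 and overlap 150, a 2000-character document
// yields chunks starting at offsets 0, 850, and 1700.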

limits

Type: LimitsConfig (optional)

Query and retrieval limits.

Default:
{
  topK: 8,
  maxContextChars: 8000,
  scoreThreshold: 0
}
Examples:
// Use defaults (balanced)
limits: {
  topK: 8,
  maxContextChars: 8000,
  scoreThreshold: 0
}

// High precision (strict matching)
limits: {
  topK: 5,
  maxContextChars: 6000,
  scoreThreshold: 0.75
}

// Comprehensive coverage
limits: {
  topK: 12,
  maxContextChars: 12000,
  scoreThreshold: 0
}

// Cost-optimized
limits: {
  topK: 5,
  maxContextChars: 5000,
  scoreThreshold: 0.6
}
Tuning Guidelines (ranges below indicate recommended spans, not literal values):

For Accuracy

limits: {
  topK: 10-15,
  maxContextChars: 10000-12000,
  scoreThreshold: 0-0.5
}
More context = better answers

For Speed

limits: {
  topK: 3-5,
  maxContextChars: 5000-6000,
  scoreThreshold: 0.7
}
Less context = faster responses

For Cost

limits: {
  topK: 5,
  maxContextChars: 5000,
  scoreThreshold: 0.6
}
Fewer tokens = lower costs

For Precision

limits: {
  topK: 5,
  maxContextChars: 6000,
  scoreThreshold: 0.75-0.85
}
High threshold = focused answers
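
For intuition, the three limits compose at query time roughly as follows. This is an illustrative sketch, not the library's internals; the Match shape and field names are assumptions:

interface Match {
  text: string;
  score: number; // similarity score from the vector store
}

// Illustrative only: how topK, scoreThreshold, and maxContextChars
// typically combine to build the context passed to the model.
function buildContext(
  matches: Match[],
  { topK = 8, maxContextChars = 8000, scoreThreshold = 0 } = {}
): string {
  const selected = matches
    .filter((m) => m.score >= scoreThreshold) // drop low-confidence matches
    .sort((a, b) => b.score - a.score)        // best matches first
    .slice(0, topK);                          // keep at most topK chunks

  let context = "";
  for (const m of selected) {
    if (context.length + m.text.length > maxContextChars) break; // cap total size
    context += m.text + "\n\n";
  }
  return context;
}

Raising scoreThreshold shrinks the candidate pool before topK applies, which is why the high-precision presets pair a high threshold with a low topK.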

Complete Examples

General Purpose Configuration

await createCtxWindow({
  namespace: "general-docs",
  data: ["./documentation"],
  ai: {
    provider: "openai",
    model: "gpt-4o-mini"
  },
  vectorStore: {
    provider: "pinecone"
  },
  chunk: {
    size: 1000,
    overlap: 150
  },
  limits: {
    topK: 8,
    maxContextChars: 8000,
    scoreThreshold: 0
  }
});

High-Accuracy Configuration

await createCtxWindow({
  namespace: "research-papers",
  data: ["./papers"],
  ai: {
    provider: "openai",
    model: "gpt-4o"  // Best model
  },
  chunk: {
    size: 1500,      // Larger chunks for context
    overlap: 250
  },
  limits: {
    topK: 12,        // More chunks
    maxContextChars: 12000,
    scoreThreshold: 0  // No filtering
  },
  vectorStore: { provider: "pinecone" }
});

Cost-Optimized Configuration

await createCtxWindow({
  namespace: "faq",
  data: ["./faq.md"],
  ai: {
    provider: "openai",
    model: "gpt-4o-mini"  // Cheapest model
  },
  chunk: {
    size: 2000,           // Fewer chunks
    overlap: 100
  },
  limits: {
    topK: 5,              // Fewer retrievals
    maxContextChars: 5000, // Less context
    scoreThreshold: 0.6   // Filter low-quality matches
  },
  vectorStore: { provider: "pinecone" }
});

Legal/Compliance Configuration

await createCtxWindow({
  namespace: "legal-docs",
  data: ["./contracts", "./policies"],
  ai: {
    provider: "openai",
    model: "gpt-4o"  // Highest accuracy
  },
  chunk: {
    size: 1500,      // Large chunks for full context
    overlap: 300     // High overlap for continuity
  },
  limits: {
    topK: 5,         // Fewer, more relevant chunks
    maxContextChars: 8000,
    scoreThreshold: 0.75  // High confidence only
  },
  vectorStore: { provider: "pinecone" }
});

Environment Variables

Required environment configuration:
# Required
OPENAI_API_KEY=sk-...
PINECONE_API_KEY=...

# Optional (with defaults)
PINECONE_INDEX=context-window
PINECONE_ENVIRONMENT=us-east-1
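
A simple startup check can catch missing keys early (a sketch; the variable names and fallback values mirror the list above):

// Fail fast if required keys are missing; Node reads these from process.env.
for (const key of ["OPENAI_API_KEY", "PINECONE_API_KEY"]) {
  if (!process.env[key]) {
    throw new Error(`Missing required environment variable: ${key}`);
  }
}

// Optional variables fall back to the documented defaults.
const pineconeIndex = process.env.PINECONE_INDEX ?? "context-window";
const pineconeEnvironment = process.env.PINECONE_ENVIRONMENT ?? "us-east-1";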

Type Definitions

Full TypeScript types for reference:
interface CreateContextWindowOptions {
  namespace: string;
  data: string | string[];
  ai?: AIConfig;
  vectorStore?: VectorStoreConfig;
  chunk?: ChunkConfig;
  limits?: LimitsConfig;
}

interface AIConfig {
  provider: "openai";
  model?: string;
}

interface VectorStoreConfig {
  provider: "pinecone";
  namespace?: string;
}

interface ChunkConfig {
  size?: number;
  overlap?: number;
}

interface LimitsConfig {
  topK?: number;
  maxContextChars?: number;
  scoreThreshold?: number;
}