OpenAI-Compatible Interface (Primary Method)
Basic Setup
Available RAG Models
Use these special model names to control RAG behavior:Model Name | Description | Use Case |
---|---|---|
vedaya-naive | Basic keyword search | Simple fact retrieval |
vedaya-local | Entity-focused retrieval | Finding specific entities |
vedaya-global | Relationship-focused | Understanding connections |
vedaya-hybrid | Combined approach (default) | General queries |
Working Example with Fallback
Multi-Turn Conversations
Maintain context across multiple queries:Direct HTTP Request (No SDK)
Embeddings Endpoint
Generate embeddings for similarity search:List Available Models
Important Compatibility Notes
What Works
✅ OpenAI chat completions endpoint at/v1/chat/completions
✅ Special
vedaya-*
models for RAG control✅ Authentication optional (works with dummy keys)
✅ Multi-turn conversations with context
✅ Embeddings generation
✅ Model listing
What Doesn’t Work
❌ Streaming responses (returns 404)❌ Real-time token streaming
❌ Ollama endpoints (may not be implemented)
Authentication
- Optional: API works without authentication
- If you have a key, add it to headers:
"Authorization": f"Bearer {API_KEY}"
- Dummy keys like “sk-dummy” work for testing