Primary Method: OpenAI-Compatible Interface
The recommended way to query your knowledge base is through the OpenAI-compatible interface.

Available RAG Models
Use special model names to control RAG behavior:

| Model Name | Description | Best For |
|---|---|---|
| vedaya-hybrid | Combined entity + relationship retrieval (default) | General queries |
| vedaya-naive | Basic keyword search | Simple fact retrieval |
| vedaya-local | Entity-focused retrieval | Finding specific entities |
| vedaya-global | Relationship-focused retrieval | Understanding connections |
| vedaya-bypass | Direct LLM without retrieval | No RAG needed |
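A minimal query through this interface can be sketched with the official OpenAI Python SDK; the base URL, port, and API key below are placeholders for your deployment's values:

```python
def build_messages(question: str) -> list:
    """Wrap a single question in the chat-completions message format."""
    return [{"role": "user", "content": question}]


def ask_vedaya(question: str, model: str = "vedaya-hybrid",
               base_url: str = "http://localhost:9621/v1",  # placeholder: your server's URL
               api_key: str = "sk-placeholder") -> str:     # placeholder API key
    # The SDK is imported lazily so this helper can be defined even where
    # the openai package is not installed.
    from openai import OpenAI
    client = OpenAI(base_url=base_url, api_key=api_key)
    resp = client.chat.completions.create(model=model,
                                          messages=build_messages(question))
    return resp.choices[0].message.content
```

Switching retrieval behavior is then just a matter of passing a different model name, e.g. `model="vedaya-naive"` for simple fact retrieval.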
Multi-Turn Conversations
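One way to carry context, shown here as a hedged sketch: keep the conversation history on the client and resend the prior turns with each request, which is the standard chat-completions pattern. The questions and answer text are hypothetical:

```python
def add_turn(history: list, role: str, content: str) -> list:
    """Append one turn to the running conversation history."""
    history.append({"role": role, "content": content})
    return history


# A follow-up question that only makes sense given the earlier turns:
history: list = []
add_turn(history, "user", "What products does Acme sell?")             # hypothetical question
add_turn(history, "assistant", "Acme sells sensors and controllers.")  # earlier model answer
add_turn(history, "user", "Which of those launched most recently?")    # resolved against history

# Pass the whole list as the `messages` array of a chat.completions request.
```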
The API maintains context across messages.

HTTP Fallback Method
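A direct-HTTP sketch using only the Python standard library; the endpoint path follows the OpenAI convention, and the base URL and key are placeholders:

```python
import json
import urllib.request


def chat_payload(question: str, model: str = "vedaya-hybrid") -> dict:
    """Build the JSON body for an OpenAI-style chat completions request."""
    return {"model": model, "messages": [{"role": "user", "content": question}]}


def http_chat(question: str, model: str = "vedaya-hybrid",
              base_url: str = "http://localhost:9621",  # placeholder: your server
              api_key: str = "sk-placeholder") -> str:  # placeholder key
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(chat_payload(question, model)).encode("utf-8"),
        headers={"Content-Type": "application/json",
                 "Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```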
If the OpenAI SDK fails, use direct HTTP requests.

Alternative: Native Query Endpoint
You can also use the native /query endpoint for more control.
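A sketch against the native endpoint; the request field names below ("query", "mode") are assumptions about the schema and should be checked against your server's API documentation:

```python
import json
import urllib.request


def native_payload(question: str, mode: str = "hybrid") -> dict:
    # "query" and "mode" are assumed field names; verify against the /query schema.
    return {"query": question, "mode": mode}


def native_query(question: str, mode: str = "hybrid",
                 base_url: str = "http://localhost:9621",  # placeholder: your server
                 api_key: str = "sk-placeholder") -> dict:  # placeholder key
    req = urllib.request.Request(
        f"{base_url}/query",
        data=json.dumps(native_payload(question, mode)).encode("utf-8"),
        headers={"Content-Type": "application/json",
                 "Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

The mode values mirror the vedaya-* model suffixes (naive, local, global, hybrid).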
Advanced Query Options
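The original examples for this section were not preserved; as an illustrative sketch, extra knobs can be added to the native payload. Every field besides "query" below is an assumption to verify against your deployment:

```python
def advanced_payload(question: str) -> dict:
    """Illustrative advanced options; field names other than "query" are assumptions."""
    return {
        "query": question,
        "mode": "global",                  # mirrors the vedaya-global retrieval mode
        "response_type": "Bullet Points",  # output format (see Response Types)
        "top_k": 10,                       # assumed: how many retrieved items to use
    }
```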
Complete Working Example
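A sketch of such a function, trying the SDK first and falling back to raw HTTP; the URL and key are placeholders, and the error handling is deliberately broad:

```python
import json
import urllib.request

BASE_URL = "http://localhost:9621"  # placeholder: your server
API_KEY = "sk-placeholder"          # placeholder key


def query_kb(question: str, model: str = "vedaya-hybrid") -> str:
    """Query the knowledge base: OpenAI SDK first, raw HTTP as the fallback."""
    messages = [{"role": "user", "content": question}]
    try:
        from openai import OpenAI
        client = OpenAI(base_url=f"{BASE_URL}/v1", api_key=API_KEY)
        resp = client.chat.completions.create(model=model, messages=messages)
        return resp.choices[0].message.content
    except Exception:
        # SDK missing or request failed: hit the same endpoint with stdlib HTTP.
        req = urllib.request.Request(
            f"{BASE_URL}/v1/chat/completions",
            data=json.dumps({"model": model, "messages": messages}).encode("utf-8"),
            headers={"Content-Type": "application/json",
                     "Authorization": f"Bearer {API_KEY}"},
        )
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)["choices"][0]["message"]["content"]
```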
A practical integration tries the OpenAI SDK first and falls back to direct HTTP if it fails.

Important Notes
- Streaming is not available - The streaming endpoint returns 404. Use regular requests instead.
- Processing is fast - Documents typically process in seconds, not minutes
- Model names matter - Use vedaya-* prefixed models for RAG modes
Response Types
Control output format with response_type:
- “Multiple Paragraphs”
- “Single Paragraph”
- “Bullet Points”
- “Numbered List”
- “Brief Summary”
- “Technical Analysis”
- “Detailed Explanation”
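Because these are literal strings, a small helper can guard against typos before a request is sent; attaching response_type to a native-endpoint payload, as done here, is an assumption:

```python
RESPONSE_TYPES = {
    "Multiple Paragraphs", "Single Paragraph", "Bullet Points",
    "Numbered List", "Brief Summary", "Technical Analysis",
    "Detailed Explanation",
}


def with_response_type(payload: dict, response_type: str) -> dict:
    """Return a copy of payload with a validated response_type field."""
    if response_type not in RESPONSE_TYPES:
        raise ValueError(f"unknown response_type: {response_type!r}")
    return {**payload, "response_type": response_type}
```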