Help:Ephemera Agent/LLM Providers
Supported Providers
| Provider | Available Models | API Key Location |
|---|---|---|
| Claude (Anthropic) | claude-sonnet-4-20250514, claude-opus-4-20250514, claude-haiku-4-5-20251001 | console.anthropic.com |
| GPT (OpenAI) | gpt-4o, gpt-4o-mini, gpt-4-turbo | platform.openai.com/api-keys |
| Gemini (Google) | gemini-2.5-pro, gemini-2.5-flash, gemini-2.5-flash-lite, gemini-3.1-pro-preview, gemini-3-flash-preview, gemini-3.1-flash-lite-preview | aistudio.google.com/apikey |
| Custom endpoint | Any model name | Varies by provider |
Custom / OpenAI-Compatible Endpoints
Select Custom (OpenAI-compatible) from the provider dropdown, then enter:
- API Endpoint URL — the full URL of the `/chat/completions` endpoint (e.g. `https://api.groq.com/openai/v1/chat/completions`)
- Model name — the exact model string the endpoint expects (e.g. `llama-3.3-70b-versatile`)
- API key — the provider's API key
Compatible with: Groq, Mistral, Together AI, Fireworks, Perplexity, and others. Also compatible with locally-hosted models via Ollama or LM Studio — expose them with a tunnel (e.g. ngrok) to make them reachable from the server.
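The three fields above map directly onto a standard OpenAI-style request. A minimal sketch of what such a call looks like, assuming the Groq endpoint and model from the examples — the names `buildChatRequest` and `callEndpoint` are illustrative, not Ephemera's actual client code:

```typescript
// One chat-completion request in the OpenAI-compatible wire format.
interface ChatRequest {
  model: string;
  messages: { role: "system" | "user" | "assistant"; content: string }[];
}

function buildChatRequest(model: string, prompt: string): ChatRequest {
  return { model, messages: [{ role: "user", content: prompt }] };
}

// Endpoint URL, model name, and API key are exactly the three settings
// entered in the Custom provider form.
async function callEndpoint(
  url: string,
  apiKey: string,
  req: ChatRequest
): Promise<string> {
  const res = await fetch(url, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`, // standard OpenAI-style auth header
    },
    body: JSON.stringify(req),
  });
  if (!res.ok) throw new Error(`HTTP ${res.status}`);
  const data = await res.json();
  return data.choices[0].message.content;
}
```

Any endpoint that accepts this request shape and auth header should work with the Custom provider option.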
Tiered Model Routing
The system uses two separate LLM calls per generative task:
- Planner tier — handles task classification and entity extraction. Should be a fast, cheap model: the input is small and the output is structured JSON, so no creativity is needed. Recommended: Haiku, Flash-Lite, GPT-4o-mini.
- Generator tier — handles the actual content creation. Receives the full assembled context, so use the best model available for the output quality you need.
Configure each tier independently in the SETTINGS tab. Settings persist via localStorage.
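The routing described above can be sketched as a simple lookup: structured planning steps go to the cheap tier, content creation goes to the strong tier. The type names, `pickTier` helper, and the specific models chosen here are illustrative assumptions, not the app's actual configuration API:

```typescript
type Tier = "planner" | "generator";

interface TierConfig {
  provider: string;
  model: string;
}

// Example per-tier configuration: a fast, cheap model for planning,
// a stronger model for generation (models from the table above).
const tiers: Record<Tier, TierConfig> = {
  planner: { provider: "anthropic", model: "claude-haiku-4-5-20251001" },
  generator: { provider: "anthropic", model: "claude-opus-4-20250514" },
};

// Classification and extraction are planner work; only the final
// content-creation step uses the generator tier.
function pickTier(step: "classify" | "extract" | "generate"): TierConfig {
  return step === "generate" ? tiers.generator : tiers.planner;
}
```

Because the two tiers are independent, they can even point at different providers — e.g. a local model for planning and a hosted one for generation.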
Recommendations
- Use a fast, low-cost model for the Planner tier.
- Use the strongest available model for the Generator tier.
- If tool calling behaves unexpectedly, try switching providers before assuming the prompts or backend are at fault.
Key Storage
API keys entered manually are stored in browser memory only for the duration of the session. Loading a keys file provides the same memory-only storage with one-click convenience.
Keys are never written to disk, stored in cookies, or sent to any server other than the relevant AI provider's own API.
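The split between memory-only keys and persisted settings can be sketched as follows — a hypothetical structure, assuming the separation described above (the names `sessionKeys` and `serializeSettings` are illustrative):

```typescript
interface TierSettings {
  provider: string;
  model: string;
}

// API keys live only in an in-memory object; it vanishes on page reload
// and is never serialized.
const sessionKeys: Record<string, string> = {};

function setApiKey(provider: string, key: string): void {
  sessionKeys[provider] = key; // memory only — never written anywhere
}

// Only non-secret fields (provider and model names) are serialized for
// persistence via localStorage; the key is deliberately excluded.
function serializeSettings(s: TierSettings): string {
  return JSON.stringify({ provider: s.provider, model: s.model });
}
```

The key never appears in the serialized settings, so nothing secret reaches localStorage.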