Help:Ephemera Agent/LLM Providers

From Encyclopedia Ephemera

Supported Providers

Provider Available Models API Key Location
Claude (Anthropic) claude-sonnet-4-20250514, claude-opus-4-20250514, claude-haiku-4-5-20251001 console.anthropic.com
GPT (OpenAI) gpt-4o, gpt-4o-mini, gpt-4-turbo platform.openai.com/api-keys
Gemini (Google) gemini-2.5-pro, gemini-2.5-flash, gemini-2.5-flash-lite, gemini-3.1-pro-preview, gemini-3-flash-preview, gemini-3.1-flash-lite-preview aistudio.google.com/apikey
Custom endpoint Any model name Varies by provider

Custom / OpenAI-Compatible Endpoints

Select Custom (OpenAI-compatible) from the provider dropdown, then enter:

  • API Endpoint URL — the full URL of the /chat/completions endpoint (e.g. https://api.groq.com/openai/v1/chat/completions)
  • Model name — exact model string the endpoint expects (e.g. llama-3.3-70b-versatile)
  • API key — the provider's API key

Compatible with: Groq, Mistral, Together AI, Fireworks, Perplexity, and others. Also compatible with locally-hosted models via Ollama or LM Studio — expose them with a tunnel (e.g. ngrok) to make them reachable from the server.

Tiered Model Routing

The system uses two separate LLM calls per generative task:

Planner tier
Handles task classification and entity extraction. Should be a fast, cheap model. The input is small and the output is structured JSON — no creativity needed. Recommended: Haiku, Flash-Lite, GPT-4o-mini.
Generator tier
Handles actual content creation. Receives the full assembled context. Use the best model available for the quality of output you need.

Configure each tier independently in the SETTINGS tab. Settings persist via localStorage.


Recommendations

  • Use a fast, low-cost model for the Planner tier.
  • Use the strongest available model for the Generator tier.
  • If tool calling behaves unexpectedly, try switching providers before assuming the prompts or backend are at fault.

Key Storage

API keys entered manually are stored in browser memory only for the duration of the session. Loading a keys file provides the same memory-only storage with one-click convenience.

Keys are never written to disk, stored in cookies, or sent to any server other than the relevant AI provider's own API.