When Using OpenAI-Compatible APIs in Flue, Check the Model Specifier First
A note on hitting an 'Unknown model specifier' error in Flue 1.0 Beta after confusing the actual model ID with the provider-id/model-id format.
As a preparatory step before delegating support triage to Hermes Agent, I built three evaluation scenarios from synthetic data, fixing the decision criteria and safety constraints in advance without relying on real customer data.
A hands-on report instrumenting Sakana Fugu's OpenAI-compatible API with Langfuse, measuring how latency, token consumption, and TTFT change across Level 1–3 tasks.
A hands-on log comparing how the same weather tool call looks in Vercel's Eve agent framework when observed from the TUI versus the HTTP API, separating the developer-friendly display from the integration-friendly event stream.
I subscribed to Sakana Fugu to understand its nature as an OpenAI-compatible API and to plan how to observe its black-box cooperative reasoning from the outside.
An experiment log where I redact Flue 1.0 Beta observe events before sending them to Langfuse, tracking the issue triage workflow's runId, model, and results.
A hands-on comparison of two ways to send Eve tool-calling executions to Langfuse as trace/span/generation data.
A follow-up to my first look: adding tools and evals to Eve, configuring models via the Vercel AI Gateway, invoking tools from the TUI, and exploring the info and eval commands.
A verification log of dry-running a GitHub Issue triage workflow built with Flue 1.0 Beta, triggered from GitHub Actions' issues.opened event instead of a persistent webhook server.
An experimental log of building a triage agent with Flue 1.0 Beta's Agent, Skill, and Workflow features that returns structured severity, reproducibility, and label suggestions for GitHub issues.
A quick validation log of running Vercel's open-source agent framework Eve locally through init, dev startup, and the first session.
A rough summary of how Flue thinks about harnesses, agents, workflows, skills, tools, sandboxes, and persistence, written before actually running anything.