Creating an Agent Provider
Guide for implementing a new agent provider in AWF. Covers the domain contract, infrastructure base layer, hooks, options, display events, session management, and registration.
Architecture
Agent providers live in the infrastructure layer and implement the ports.AgentProvider interface defined in the domain layer. The base infrastructure handles execution orchestration, token counting, state cloning, and stream filtering. Each provider only implements the provider-specific parts via hooks.
Domain Layer (ports)              Infrastructure Layer (agents)
┌───────────────────────┐         ┌──────────────────────────────────┐
│  AgentProvider        │◄────────│  baseCLIProvider                 │
│  CLIExecutor          │         │  ├── execute()                   │
│  Tokenizer            │         │  ├── executeConversation()       │
│  Logger               │         │  └── cliProviderHooks{...}       │
└───────────────────────┘         │                                  │
                                  │  YourProvider                    │
                                  │  ├── newBase() → hooks wiring    │
                                  │  ├── buildExecuteArgs()          │
                                  │  ├── buildConversationArgs()     │
                                  │  ├── extractSessionID()          │
                                  │  ├── parseDisplayEvents()        │
                                  │  └── validateOptions()           │
                                  └──────────────────────────────────┘
Domain Contract
AgentProvider Interface
File: internal/domain/ports/agent_provider.go
type AgentProvider interface {
Execute(ctx context.Context, prompt string, options map[string]any,
stdout, stderr io.Writer) (*workflow.AgentResult, error)
ExecuteConversation(ctx context.Context, state *workflow.ConversationState,
prompt string, options map[string]any,
stdout, stderr io.Writer) (*workflow.ConversationResult, error)
Name() string
Validate() error
}
| Method | Purpose |
|---|---|
| Execute | Single-turn prompt execution. Returns AgentResult with output, tokens, timing. |
| ExecuteConversation | Multi-turn execution with conversation state. Returns ConversationResult with updated state. |
| Name | Unique provider identifier used in workflow YAML (provider: your_name). |
| Validate | Pre-flight check (binary in PATH, API key set, etc.). Called before first execution. |
AgentResult
File: internal/domain/workflow/agent_config.go
type AgentResult struct {
Provider string
Output string // extracted text output
DisplayOutput string // filtered output for terminal display
Response map[string]any // parsed JSON response (optional)
Tokens int
TokensEstimated bool
Error error
StartedAt time.Time
CompletedAt time.Time
}
ConversationResult
File: internal/domain/workflow/conversation.go
type ConversationResult struct {
Provider string
State *ConversationState // updated state with new turns
Output string // last assistant response
DisplayOutput string
Response map[string]any
TokensInput int
TokensOutput int
TokensTotal int
TokensEstimated bool
Error error
StartedAt time.Time
CompletedAt time.Time
}
ConversationState
type ConversationState struct {
SessionID string
Turns []Turn
TotalTurns int
TotalTokens int
StoppedBy StopReason
}
type Turn struct {
Role TurnRole // "system", "user", "assistant"
Content string
Tokens int
}
Base Layer: baseCLIProvider
All CLI-based providers delegate to baseCLIProvider, which handles:
- Prompt validation
- CLI binary execution via CLIExecutor
- Stream filtering and display event rendering
- Token counting via injected Tokenizer
- Conversation state cloning and turn management
- Timing (StartedAt / CompletedAt)
Hooks
Provider-specific behavior is injected via cliProviderHooks:
type cliProviderHooks struct {
buildExecuteArgs func(prompt string, options map[string]any) ([]string, error)
buildConversationArgs func(state *workflow.ConversationState, prompt string, options map[string]any) ([]string, error)
extractSessionID func(output string) (string, error)
extractTextContent func(output string) string // optional
validateOptions func(options map[string]any) error // optional
parseDisplayEvents DisplayEventParser // optional
}
| Hook | Required | Purpose |
|---|---|---|
| buildExecuteArgs | yes | Construct CLI argv for single-turn execution. |
| buildConversationArgs | yes | Construct CLI argv for multi-turn execution (session resume). |
| extractSessionID | yes | Parse session/thread ID from CLI output for conversation resume. |
| extractTextContent | no | Extract human-readable text from structured output (e.g., JSON wrapper). Falls back to raw output if nil. |
| validateOptions | no | Validate provider-specific options before execution. Return error to reject. |
| parseDisplayEvents | no | Parse a single NDJSON line into []DisplayEvent for real-time terminal display. |
What baseCLIProvider Does For You
In execute():
- Rejects empty prompts
- Calls validateOptions hook (if set)
- Calls buildExecuteArgs hook to get CLI arguments
- Runs binary via CLIExecutor.Run()
- Filters output through StreamFilterWriter (if parseDisplayEvents set)
- Counts output tokens: b.tokenizer.CountTokens(output)
- Builds and returns AgentResult
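The stream-filtering step above can be pictured with a small sketch. This is not the actual StreamFilterWriter: `filterWriter`, `displayEvent`, `lineParser`, and `parseText` are hypothetical stand-ins, and the event shape is assumed. It buffers partial lines, passes each complete line to a parser, and forwards only the parsed text.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
)

// displayEvent and lineParser are simplified, hypothetical stand-ins for
// DisplayEvent and DisplayEventParser; the real types carry more fields.
type displayEvent struct {
	Text string
}

type lineParser func(line []byte) []displayEvent

// filterWriter forwards only the text of parsed events to dst and drops
// every line the parser does not recognize.
type filterWriter struct {
	dst   io.Writer
	parse lineParser
	buf   []byte // trailing partial line kept between Write calls
}

func (w *filterWriter) Write(p []byte) (int, error) {
	w.buf = append(w.buf, p...)
	for {
		i := bytes.IndexByte(w.buf, '\n')
		if i < 0 {
			break // incomplete line: wait for more data
		}
		line := w.buf[:i]
		w.buf = w.buf[i+1:]
		for _, ev := range w.parse(line) {
			if _, err := io.WriteString(w.dst, ev.Text); err != nil {
				return 0, err
			}
		}
	}
	return len(p), nil
}

// parseText recognizes an assumed {"type":"text","content":"..."} event.
func parseText(line []byte) []displayEvent {
	var evt struct {
		Type    string `json:"type"`
		Content string `json:"content"`
	}
	if err := json.Unmarshal(line, &evt); err != nil || evt.Type != "text" {
		return nil
	}
	return []displayEvent{{Text: evt.Content}}
}

// filterStream writes the chunks through a filterWriter and returns what
// reached the destination.
func filterStream(chunks ...string) string {
	var out bytes.Buffer
	w := &filterWriter{dst: &out, parse: parseText}
	for _, c := range chunks {
		_, _ = io.WriteString(w, c)
	}
	return out.String()
}

func main() {
	fmt.Println(filterStream(
		`{"type":"text","con`, `tent":"hello"}`+"\n",
		`{"type":"tool_call","tool_name":"read_file"}`+"\n",
	))
}
```

Buffering matters because subprocess output arrives in arbitrary chunks, so a single NDJSON event may be split across several Write calls.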
In executeConversation():
- Clones conversation state (caller’s original is never mutated)
- Appends user turn to cloned state
- Calls validateOptions and buildConversationArgs hooks
- Runs binary
- Calls extractSessionID hook, updates state
- Appends assistant turn to state
- Counts input tokens (CountTurnsTokens) and output tokens (CountTokens)
- Builds and returns ConversationResult
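The clone-then-append sequence can be sketched as follows. The types are reduced local stand-ins for ConversationState and Turn, and `cloneState` and `nextTurn` here are illustrative functions, not the base layer's code; the `run` callback stands in for running the binary and extracting the session ID.

```go
package main

import "fmt"

// turn and conversationState are reduced stand-ins for the domain types.
type turn struct {
	Role    string
	Content string
}

type conversationState struct {
	SessionID string
	Turns     []turn
}

// cloneState returns a deep copy so later appends never touch the
// caller's slice.
func cloneState(s *conversationState) *conversationState {
	if s == nil {
		return &conversationState{}
	}
	c := *s
	c.Turns = make([]turn, len(s.Turns), len(s.Turns)+2)
	copy(c.Turns, s.Turns)
	return &c
}

// nextTurn mirrors the listed steps: clone, append the user turn, run,
// record the session ID when one was found, append the assistant turn.
func nextTurn(s *conversationState, prompt string, run func(prompt string) (string, string)) *conversationState {
	next := cloneState(s)
	next.Turns = append(next.Turns, turn{Role: "user", Content: prompt})
	reply, sessionID := run(prompt)
	if sessionID != "" {
		next.SessionID = sessionID
	}
	next.Turns = append(next.Turns, turn{Role: "assistant", Content: reply})
	return next
}

func main() {
	orig := &conversationState{Turns: []turn{{Role: "user", Content: "hi"}}}
	next := nextTurn(orig, "and then?", func(string) (string, string) {
		return "a reply", "sess-1"
	})
	fmt.Println(len(orig.Turns), len(next.Turns), next.SessionID)
}
```

Copying the Turns slice, rather than only the struct, is what keeps the caller's state untouched: a shallow copy would still share the backing array.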
Step-by-Step Implementation
1. Create the provider file
File: internal/infrastructure/agents/myprovider_provider.go
package agents
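import (
"context"
"encoding/json" // used by parseMyProviderDisplayEvents
"errors" // used by extractSessionID
"fmt"
"io"
"os/exec"
"strings" // used by validateMyProviderOptions
"github.com/awf-project/cli/internal/domain/ports"
"github.com/awf-project/cli/internal/domain/workflow"
"github.com/awf-project/cli/internal/infrastructure/logger"
)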
type MyProviderProvider struct {
base *baseCLIProvider
logger ports.Logger
executor ports.CLIExecutor
tokenizer ports.Tokenizer
}
2. Add constructors
Two constructors are required: a zero-config default and a functional-options variant.
func NewMyProviderProvider() *MyProviderProvider {
p := &MyProviderProvider{
logger: logger.NopLogger{},
executor: NewExecCLIExecutor(),
}
p.base = p.newBase()
return p
}
func NewMyProviderProviderWithOptions(opts ...MyProviderProviderOption) *MyProviderProvider {
p := &MyProviderProvider{
logger: logger.NopLogger{},
executor: NewExecCLIExecutor(),
}
for _, opt := range opts {
opt(p)
}
p.base = p.newBase()
return p
}
Important: p.newBase() must be called after applying options, since options may set the executor, logger, or tokenizer that newBase forwards.
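A reduced sketch of why the ordering matters. All names here (`provider`, `base`, `withTokenizer`, `fixedTokenizer`) are hypothetical miniatures of the real types, and the `baseFirst` flag exists only to demonstrate the wrong order.

```go
package main

import "fmt"

type tokenizer interface{ CountTokens(string) int }

// fixedTokenizer always reports the same count, which makes it easy to
// see which tokenizer the base ended up with.
type fixedTokenizer int

func (f fixedTokenizer) CountTokens(string) int { return int(f) }

type base struct{ tokenizer tokenizer }

type provider struct {
	tokenizer tokenizer
	base      *base
}

type option func(*provider)

func withTokenizer(t tokenizer) option {
	return func(p *provider) { p.tokenizer = t }
}

// newBase snapshots the provider's current dependencies.
func (p *provider) newBase() *base { return &base{tokenizer: p.tokenizer} }

// newProvider builds the base either before (wrong) or after (right)
// the options are applied.
func newProvider(baseFirst bool, opts ...option) *provider {
	p := &provider{tokenizer: fixedTokenizer(1)}
	if baseFirst {
		p.base = p.newBase() // too early: options have not run yet
	}
	for _, opt := range opts {
		opt(p)
	}
	if !baseFirst {
		p.base = p.newBase()
	}
	return p
}

func main() {
	custom := withTokenizer(fixedTokenizer(99))
	fmt.Println(newProvider(false, custom).base.tokenizer.CountTokens("x")) // 99
	fmt.Println(newProvider(true, custom).base.tokenizer.CountTokens("x"))  // 1
}
```

Because the base captures its dependencies by value at construction time, an option applied afterwards changes the provider's field but never reaches the base.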
3. Wire the hooks via newBase()
func (p *MyProviderProvider) newBase() *baseCLIProvider {
b := newBaseCLIProvider("myprovider", "myprovider-cli", p.executor, p.logger, cliProviderHooks{
buildExecuteArgs: p.buildExecuteArgs,
buildConversationArgs: p.buildConversationArgs,
extractSessionID: p.extractSessionID,
validateOptions: validateMyProviderOptions,
parseDisplayEvents: p.parseMyProviderDisplayEvents,
})
if p.tokenizer != nil {
b.tokenizer = p.tokenizer
}
return b
}
Parameters to newBaseCLIProvider:
| Parameter | Value |
|---|---|
| name | Provider identifier returned by Name(). Used in AgentResult.Provider. Must match the value users write in provider: YAML field. |
| binary | CLI binary name looked up in $PATH. |
| executor | The CLIExecutor to run the binary. Always forward p.executor. |
| log | Logger. Nil-defaults to NopLogger. |
| hooks | Provider-specific hooks (see table above). |
4. Implement the required hooks
buildExecuteArgs
Construct the CLI arguments for a single-turn call.
func (p *MyProviderProvider) buildExecuteArgs(prompt string, options map[string]any) ([]string, error) {
args := []string{"run", "--prompt", prompt, "--format", "json"}
if model, ok := getStringOption(options, "model"); ok {
args = append(args, "--model", model)
}
if skip, ok := getBoolOption(options, "dangerously_skip_permissions"); ok && skip {
args = append(args, "--yes")
}
return args, nil
}
Available helpers: getStringOption(options, key) and getBoolOption(options, key), which provide type-safe extraction from map[string]any.
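The source of these helpers is not shown here. A plausible minimal shape, assuming they return the value plus an ok flag as the call sites suggest, is sketched below under the hypothetical names `stringOption` and `boolOption`.

```go
package main

import "fmt"

// stringOption is a hypothetical equivalent of getStringOption: it
// returns the value only when the key exists and holds a string.
func stringOption(options map[string]any, key string) (string, bool) {
	v, ok := options[key].(string) // reading a nil map is safe in Go
	return v, ok
}

// boolOption is the same idea for boolean options.
func boolOption(options map[string]any, key string) (bool, bool) {
	v, ok := options[key].(bool)
	return v, ok
}

func main() {
	opts := map[string]any{
		"model":                        "myprovider-large",
		"dangerously_skip_permissions": true,
		"timeout":                      30, // wrong type for a string lookup
	}
	model, ok := stringOption(opts, "model")
	fmt.Println(model, ok)
	_, ok = stringOption(opts, "timeout")
	fmt.Println(ok)
	skip, ok := boolOption(opts, "dangerously_skip_permissions")
	fmt.Println(skip, ok)
}
```

The comma-ok type assertion treats a missing key, a nil map, and a value of the wrong type the same way, so arg builders need no separate nil or type checks.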
buildConversationArgs
Construct CLI arguments for multi-turn. Must handle session resume vs first turn.
func (p *MyProviderProvider) buildConversationArgs(
state *workflow.ConversationState, prompt string, options map[string]any,
) ([]string, error) {
var args []string
if state.SessionID != "" {
args = []string{"resume", state.SessionID, "--prompt", prompt, "--format", "json"}
} else {
effectivePrompt := buildFirstTurnPrompt(prompt, options)
args = []string{"run", "--prompt", effectivePrompt, "--format", "json"}
}
if model, ok := getStringOption(options, "model"); ok {
args = append(args, "--model", model)
}
// Cross-provider options must be applied in both arg builders.
if skip, ok := getBoolOption(options, "dangerously_skip_permissions"); ok && skip {
args = append(args, "--yes")
}
return args, nil
}
Key patterns:
- Use state.SessionID to detect resume vs new conversation.
- Use buildFirstTurnPrompt(prompt, options) to inline system_prompt into the first message when the CLI has no native --system-prompt flag.
- Always force a structured output format (JSON/NDJSON) for reliable parsing.
extractSessionID
Parse the session identifier from CLI output so subsequent turns can resume.
func (p *MyProviderProvider) extractSessionID(output string) (string, error) {
if output == "" {
return "", errors.New("empty output")
}
evt := findFirstNDJSONEvent(output, "session_start")
if evt == nil {
return "", errors.New("session_start event not found")
}
id, ok := evt["session_id"].(string)
if !ok || id == "" {
return "", errors.New("session_id missing or empty")
}
return id, nil
}
Available helper: findFirstNDJSONEvent(output, eventType), which scans NDJSON output line-by-line for the first {"type": eventType, ...} event and returns it as map[string]any.
Session ID extraction errors are non-fatal. The base layer logs the error and continues in stateless mode. The conversation still works; it just cannot resume on the next turn.
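To make the scanning behavior concrete, here is a sketch of how such a helper could work. `firstEventOfType` is a hypothetical name, and skipping non-JSON lines is an assumption about how banner or log lines should be treated, not a statement about the real helper.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

// firstEventOfType scans NDJSON output line by line and returns the first
// object whose "type" field equals eventType, or nil when none matches.
func firstEventOfType(output, eventType string) map[string]any {
	sc := bufio.NewScanner(strings.NewReader(output))
	// Raise the scanner's line limit: event lines can exceed the 64 KiB default.
	sc.Buffer(make([]byte, 0, 64*1024), 1024*1024)
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" {
			continue
		}
		var evt map[string]any
		if err := json.Unmarshal([]byte(line), &evt); err != nil {
			continue // assumed behavior: ignore lines that are not JSON objects
		}
		if t, _ := evt["type"].(string); t == eventType {
			return evt
		}
	}
	return nil
}

func main() {
	out := "starting up\n" +
		`{"type":"text","content":"hi"}` + "\n" +
		`{"type":"session_start","session_id":"abc-123"}` + "\n"
	evt := firstEventOfType(out, "session_start")
	fmt.Println(evt["session_id"])
}
```

Returning nil rather than an error keeps the caller's logic simple: extractSessionID decides how to report a missing event, and the base layer treats that report as non-fatal.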
5. Implement the optional hooks
validateOptions
Reject invalid option combinations before execution.
func validateMyProviderOptions(options map[string]any) error {
if options == nil {
return nil
}
if model, ok := getStringOption(options, "model"); ok {
if !strings.HasPrefix(model, "myprovider-") {
return fmt.Errorf("invalid model: %s (must start with 'myprovider-')", model)
}
}
return nil
}
parseDisplayEvents
Parse a single NDJSON line into display events for real-time terminal rendering.
func (p *MyProviderProvider) parseMyProviderDisplayEvents(line []byte) []DisplayEvent {
var evt struct {
Type string `json:"type"`
Content string `json:"content"`
Tool string `json:"tool_name"`
}
if err := json.Unmarshal(line, &evt); err != nil {
return nil
}
switch evt.Type {
case "text":
return []DisplayEvent{{Kind: EventText, Text: evt.Content}}
case "tool_call":
return []DisplayEvent{{Kind: EventToolUse, Name: evt.Tool}}
}
return nil
}
Display event kinds:
| Constant | Purpose |
|---|---|
| EventText | Text content from the assistant. Aggregated for DisplayOutput. |
| EventToolUse | Tool invocation. Rendered as tool name + argument preview. |
DisplayEvent fields:
| Field | Required | Purpose |
|---|---|---|
| Kind | yes | EventText or EventToolUse |
| Text | for text | The text content |
| Name | for tools | Tool name |
| Arg | no | Truncated argument preview. Use extractArgPreviewFromMap(args) or extractArgPreview(jsonStr). |
| ID | no | Tool call ID (empty if provider doesn’t emit one) |
| Delta | no | true for streaming deltas (partial text chunks) |
| Type | no | Raw event type from provider output (for debugging) |
6. Implement the AgentProvider interface methods
Execute
Delegate to p.base.execute(), then apply provider-specific post-processing.
func (p *MyProviderProvider) Execute(
ctx context.Context, prompt string, options map[string]any, stdout, stderr io.Writer,
) (*workflow.AgentResult, error) {
result, rawOutput, err := p.base.execute(ctx, prompt, options, stdout, stderr)
if err != nil {
return nil, err
}
// Post-processing: extract text from structured output
if extracted := extractDisplayTextFromEvents(rawOutput, p.parseMyProviderDisplayEvents); extracted != "" {
result.Output = extracted
tokens, _ := p.base.tokenizer.CountTokens(extracted) //nolint:errcheck // ApproximationTokenizer never errors with a valid ratio
result.Tokens = tokens
}
// Optional: parse JSON response
userFormat, _ := getStringOption(options, "output_format")
if userFormat == "json" || userFormat == "stream-json" {
if jsonResp := tryParseJSONResponse(rawOutput); jsonResp != nil {
result.Response = jsonResp
}
}
return result, nil
}
Why post-process? When the CLI outputs NDJSON (events), the raw output is not human-readable. Post-processing extracts the actual assistant text and re-counts tokens on the extracted content.
ExecuteConversation
Most providers simply delegate without post-processing:
func (p *MyProviderProvider) ExecuteConversation(
ctx context.Context, state *workflow.ConversationState, prompt string,
options map[string]any, stdout, stderr io.Writer,
) (*workflow.ConversationResult, error) {
result, _, err := p.base.executeConversation(ctx, state, prompt, options, stdout, stderr)
if err != nil {
return nil, err
}
return result, nil
}
Name and Validate
func (p *MyProviderProvider) Name() string {
return "myprovider"
}
func (p *MyProviderProvider) Validate() error {
_, err := exec.LookPath("myprovider-cli")
if err != nil {
return fmt.Errorf("myprovider-cli not found in PATH: %w", err)
}
return nil
}
7. Add functional options
File: internal/infrastructure/agents/options.go
type MyProviderProviderOption func(*MyProviderProvider)
func WithMyProviderExecutor(executor ports.CLIExecutor) MyProviderProviderOption {
return func(p *MyProviderProvider) {
p.executor = executor
}
}
func WithMyProviderTokenizer(tok ports.Tokenizer) MyProviderProviderOption {
return func(p *MyProviderProvider) {
p.tokenizer = tok
}
}
func WithMyProviderLogger(l ports.Logger) MyProviderProviderOption {
return func(p *MyProviderProvider) {
p.logger = l
}
}
8. Register in the registry
File: internal/infrastructure/agents/registry.go
Add to RegisterDefaults():
func (r *AgentRegistry) RegisterDefaults() error {
defaults := []ports.AgentProvider{
NewClaudeProvider(),
NewCodexProvider(),
NewGeminiProvider(),
NewOpenAICompatibleProvider(),
NewOpenCodeProvider(),
NewCopilotProvider(),
NewMyProviderProvider(), // <-- add here
}
// ...
}
Testing
Option tests
File: internal/infrastructure/agents/provider_options_test.go
func TestWithMyProviderTokenizer(t *testing.T) {
tok := &mockTokenizer{countTokensResult: 99}
provider := NewMyProviderProviderWithOptions(
WithMyProviderExecutor(mocks.NewMockCLIExecutor()),
WithMyProviderTokenizer(tok),
)
assert.Equal(t, tok, provider.base.tokenizer)
}
Argument construction tests
Test that buildExecuteArgs and buildConversationArgs produce correct CLI arguments for all option combinations.
func TestMyProvider_BuildExecuteArgs(t *testing.T) {
tests := []struct {
name string
prompt string
options map[string]any
wantArgs []string
wantErr bool
}{
{
name: "basic prompt",
prompt: "hello",
options: nil,
wantArgs: []string{"run", "--prompt", "hello", "--format", "json"},
},
{
name: "with model",
prompt: "hello",
options: map[string]any{"model": "myprovider-large"},
wantArgs: []string{"run", "--prompt", "hello", "--format", "json", "--model", "myprovider-large"},
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
p := NewMyProviderProvider()
args, err := p.buildExecuteArgs(tt.prompt, tt.options)
if tt.wantErr {
require.Error(t, err)
return
}
require.NoError(t, err)
assert.Equal(t, tt.wantArgs, args)
})
}
}
Session ID extraction tests
func TestMyProvider_ExtractSessionID(t *testing.T) {
tests := []struct {
name string
output string
wantID string
wantErr bool
}{
{
name: "valid session",
output: `{"type":"session_start","session_id":"abc-123"}`,
wantID: "abc-123",
},
{
name: "missing event",
output: `{"type":"text","content":"hello"}`,
wantErr: true,
},
{
name: "empty output",
output: "",
wantErr: true,
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
p := NewMyProviderProvider()
id, err := p.extractSessionID(tt.output)
if tt.wantErr {
require.Error(t, err)
return
}
require.NoError(t, err)
assert.Equal(t, tt.wantID, id)
})
}
}
Display event parser tests
func TestMyProvider_ParseDisplayEvents(t *testing.T) {
p := NewMyProviderProvider()
t.Run("text event", func(t *testing.T) {
events := p.parseMyProviderDisplayEvents([]byte(`{"type":"text","content":"hello"}`))
require.Len(t, events, 1)
assert.Equal(t, EventText, events[0].Kind)
assert.Equal(t, "hello", events[0].Text)
})
t.Run("tool event", func(t *testing.T) {
events := p.parseMyProviderDisplayEvents([]byte(`{"type":"tool_call","tool_name":"read_file"}`))
require.Len(t, events, 1)
assert.Equal(t, EventToolUse, events[0].Kind)
assert.Equal(t, "read_file", events[0].Name)
})
t.Run("unknown event returns nil", func(t *testing.T) {
events := p.parseMyProviderDisplayEvents([]byte(`{"type":"unknown"}`))
assert.Nil(t, events)
})
t.Run("invalid JSON returns nil", func(t *testing.T) {
events := p.parseMyProviderDisplayEvents([]byte(`not json`))
assert.Nil(t, events)
})
}
Option validation tests
func TestMyProvider_ValidateOptions(t *testing.T) {
tests := []struct {
name string
options map[string]any
wantErr bool
}{
{name: "nil options", options: nil},
{name: "valid model", options: map[string]any{"model": "myprovider-large"}},
{name: "invalid model", options: map[string]any{"model": "gpt-4"}, wantErr: true},
{name: "unknown option ignored", options: map[string]any{"unknown": "value"}},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
err := validateMyProviderOptions(tt.options)
if tt.wantErr {
require.Error(t, err)
} else {
require.NoError(t, err)
}
})
}
}
Mandatory Cross-Provider Conventions
Every provider must handle these patterns. Omitting any of them creates inconsistency for users who switch between providers in their workflows.
Force structured output format
All CLI providers force NDJSON/JSON output at the CLI level, regardless of what the user requests. This ensures consistent session ID extraction, display event filtering, and text extraction.
// The user's output_format preference controls post-processing (display vs raw),
// but the wire format is always NDJSON.
func (p *MyProviderProvider) buildExecuteArgs(prompt string, options map[string]any) ([]string, error) {
args := []string{"run", "--prompt", prompt}
args = append(args, "--format", "json") // always force structured output
// ...
}
How each provider does it:
| Provider | Forced flag |
|---|---|
| Claude | --output-format stream-json --verbose |
| Gemini | --output-format stream-json |
| Codex | exec --json |
| Copilot | --output-format=json --silent |
| OpenCode | --format json |
Handle dangerously_skip_permissions
This option is cross-provider — users expect it to work in any workflow regardless of provider. Each CLI maps it to its own flag:
// In buildExecuteArgs and buildConversationArgs:
if skip, ok := getBoolOption(options, "dangerously_skip_permissions"); ok && skip {
args = append(args, "--your-cli-equivalent-flag")
}
| Provider | CLI flag |
|---|---|
| Claude | --dangerously-skip-permissions |
| Gemini | --approval-mode=yolo |
| Codex | --dangerously-bypass-approvals-and-sandbox |
| Copilot | --allow-all |
| OpenCode | Not supported (logged at debug level, silently ignored) |
If your CLI has no equivalent, log a debug message and ignore:
if skip, ok := getBoolOption(options, "dangerously_skip_permissions"); ok && skip {
p.logger.Debug("dangerously_skip_permissions is not supported by myprovider and will be ignored")
}
Handle system_prompt
Only Claude has a native --system-prompt flag. All other providers inline it into the first turn’s message using the shared helper:
// In buildConversationArgs, for the first turn (no session ID):
effectivePrompt := buildFirstTurnPrompt(prompt, options)
// Returns: "system prompt content\n\nuserPrompt" or just "userPrompt" if no system_prompt
If your CLI has a native system prompt flag, use it directly instead:
if sysPrompt, ok := getStringOption(options, "system_prompt"); ok && sysPrompt != "" {
args = append(args, "--system-prompt", sysPrompt)
}
System prompt must only be applied on the first turn. On subsequent turns (when state.SessionID != ""), the provider’s session already retains the system context.
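The first-turn rule can be sketched as below. `firstTurnPrompt` is a hypothetical reimplementation based only on the return shape described above, not the shared buildFirstTurnPrompt helper itself, and `conversationPrompt` is an illustrative wrapper.

```go
package main

import "fmt"

// firstTurnPrompt prepends system_prompt to the user prompt, separated by
// a blank line, and returns the prompt unchanged when none is set.
func firstTurnPrompt(prompt string, options map[string]any) string {
	sys, ok := options["system_prompt"].(string)
	if !ok || sys == "" {
		return prompt
	}
	return sys + "\n\n" + prompt
}

// conversationPrompt inlines the system prompt only when no session
// exists yet; a resumed session already carries the system context.
func conversationPrompt(sessionID, prompt string, options map[string]any) string {
	if sessionID != "" {
		return prompt
	}
	return firstTurnPrompt(prompt, options)
}

func main() {
	opts := map[string]any{"system_prompt": "You are terse."}
	fmt.Printf("%q\n", conversationPrompt("", "List the files.", opts))
	fmt.Printf("%q\n", conversationPrompt("abc-123", "Now sort them.", opts))
}
```

Inlining the system prompt on every turn would repeat it in the session history, which costs tokens and can change model behavior.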
Handle model
Every provider must support the model option. Validate the model name in validateOptions to reject models incompatible with your CLI:
func validateMyProviderOptions(options map[string]any) error {
if options == nil {
return nil
}
if model, ok := getStringOption(options, "model"); ok {
if !strings.HasPrefix(model, "myprovider-") {
return fmt.Errorf("invalid model: %s (must start with 'myprovider-')", model)
}
}
return nil
}
Handle output_format for response parsing
The output_format option controls what the user sees. When the user requests json or stream-json, expose the parsed JSON response in result.Response:
// In Execute(), after text extraction:
userFormat, _ := getStringOption(options, "output_format")
if userFormat == "json" || userFormat == "stream-json" {
if jsonResp := tryParseJSONResponse(rawOutput); jsonResp != nil {
result.Response = jsonResp
}
}
Ignore unknown options silently
Because options is a map[string]any and each provider only looks up the keys it understands, unsupported keys are never read. Never iterate over options to reject unknown keys: this lets cross-provider workflows pass provider-specific options that only apply to certain providers.
Token counting pattern
Every CountTokens call in provider code must use the //nolint:errcheck directive with an explanatory comment. This is enforced by golangci-lint with check-blank: true:
tokens, _ := p.base.tokenizer.CountTokens(extracted) //nolint:errcheck // ApproximationTokenizer never errors with a valid ratio
result.Tokens = tokens
NUL byte sanitization in display event parsers
CLI tools may output NUL bytes (0x00) that break json.Unmarshal. Sanitize before parsing:
func (p *MyProviderProvider) parseMyProviderDisplayEvents(line []byte) []DisplayEvent {
// Escape NUL bytes to valid JSON unicode sequences (the six characters \u0000)
sanitized := bytes.ReplaceAll(line, []byte{0x00}, []byte(`\u0000`))
var evt struct { /* ... */ }
if err := json.Unmarshal(sanitized, &evt); err != nil {
return nil
}
// ...
}
Codex and OpenCode use this escape pattern. Claude replaces NUL with spaces instead.
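A small self-contained check of the escape approach, using a hypothetical `textEvent` shape. Go's encoding/json rejects raw control characters inside string literals, so a raw NUL fails to parse, while the escaped form decodes back to the NUL character.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// textEvent is an assumed event shape used only for this demonstration.
type textEvent struct {
	Type    string `json:"type"`
	Content string `json:"content"`
}

// parseRaw unmarshals the line as-is.
func parseRaw(line []byte) (textEvent, error) {
	var evt textEvent
	err := json.Unmarshal(line, &evt)
	return evt, err
}

// parseSanitized replaces each NUL byte with the six-character JSON
// escape \u0000 before unmarshaling.
func parseSanitized(line []byte) (textEvent, error) {
	// Raw string literal: backslash, 'u', and four zeros, not a NUL byte.
	sanitized := bytes.ReplaceAll(line, []byte{0x00}, []byte(`\u0000`))
	return parseRaw(sanitized)
}

func main() {
	line := []byte("{\"type\":\"text\",\"content\":\"a\x00b\"}")

	_, err := parseRaw(line)
	fmt.Println("raw parse failed:", err != nil)

	evt, err := parseSanitized(line)
	fmt.Printf("%q %v\n", evt.Content, err)
}
```

Note that the escape only helps when the NUL sits inside a JSON string; a NUL between tokens would still produce invalid JSON after replacement, in which case the parser returns nil as it does for any malformed line.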
Error handling conventions
| Scenario | Handling |
|---|---|
| Validate(): binary not found | Return fmt.Errorf("binary not found in PATH: %w", err) |
| extractSessionID fails | Non-fatal. Base layer logs at debug and continues stateless. |
| JSON parsing fails in Execute() | Non-fatal. result.Response stays nil. |
| validateOptions returns error | Fatal. Execution is aborted before running the CLI. |
| Empty output from CLI | Base layer substitutes " " (single space) to prevent zero-length issues. |
Apply dangerously_skip_permissions in both arg builders
The buildExecuteArgs and buildConversationArgs hooks must both handle dangerously_skip_permissions (and model, etc.). Users don’t know which execution path their workflow triggers — missing the option in one path creates hard-to-debug inconsistencies.
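One way to keep the two paths in sync is a shared helper that both builders call. This is a design sketch, not how the existing providers are written: `appendCommonArgs` and `joined` are hypothetical, and the flags are those of the illustrative myprovider-cli used throughout this guide.

```go
package main

import (
	"fmt"
	"strings"
)

// appendCommonArgs applies the cross-provider options in one place so the
// two arg builders cannot drift apart.
func appendCommonArgs(args []string, options map[string]any) []string {
	if model, ok := options["model"].(string); ok && model != "" {
		args = append(args, "--model", model)
	}
	if skip, ok := options["dangerously_skip_permissions"].(bool); ok && skip {
		args = append(args, "--yes")
	}
	return args
}

func buildExecuteArgs(prompt string, options map[string]any) []string {
	return appendCommonArgs([]string{"run", "--prompt", prompt, "--format", "json"}, options)
}

func buildConversationArgs(sessionID, prompt string, options map[string]any) []string {
	if sessionID != "" {
		return appendCommonArgs([]string{"resume", sessionID, "--prompt", prompt, "--format", "json"}, options)
	}
	return appendCommonArgs([]string{"run", "--prompt", prompt, "--format", "json"}, options)
}

// joined renders argv as one string for easy comparison in examples.
func joined(args []string) string { return strings.Join(args, " ") }

func main() {
	opts := map[string]any{"model": "myprovider-large", "dangerously_skip_permissions": true}
	fmt.Println(joined(buildExecuteArgs("hi", opts)))
	fmt.Println(joined(buildConversationArgs("abc-123", "hi", opts)))
}
```

With this layout, supporting a new cross-provider option means touching a single function, and the table-driven tests for both builders pick it up automatically.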
extractTextContent vs extractDisplayTextFromEvents
Two mechanisms exist for extracting human-readable text from structured output:
| Mechanism | When to use |
|---|---|
| extractTextContent hook | Your CLI wraps the final answer in a specific JSON envelope (e.g., Claude’s result event, Copilot’s assistant.message event). Set this hook to extract from that envelope. |
| extractDisplayTextFromEvents() | Your CLI outputs NDJSON events where text is spread across multiple EventText events. This helper aggregates all text events via your parseDisplayEvents hook. |
Most providers use extractDisplayTextFromEvents in their Execute() post-processing. Only set extractTextContent if your provider needs a different extraction strategy for executeConversation.
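The aggregation approach can be illustrated with a reduced model. `collectText`, `displayEvent`, and `parseLine` are hypothetical stand-ins written for this sketch; the real extractDisplayTextFromEvents may differ in details such as delta handling or separators.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// displayEvent is a reduced stand-in for DisplayEvent.
type displayEvent struct {
	Kind string // "text" or "tool_use" in this sketch
	Text string
}

// parseLine handles the assumed event shapes used in this guide.
func parseLine(line []byte) []displayEvent {
	var evt struct {
		Type    string `json:"type"`
		Content string `json:"content"`
	}
	if err := json.Unmarshal(line, &evt); err != nil {
		return nil
	}
	switch evt.Type {
	case "text":
		return []displayEvent{{Kind: "text", Text: evt.Content}}
	case "tool_call":
		return []displayEvent{{Kind: "tool_use"}}
	}
	return nil
}

// collectText runs every output line through the parser and concatenates
// the text events, ignoring tool events and unparseable lines.
func collectText(output string, parse func([]byte) []displayEvent) string {
	var sb strings.Builder
	for _, line := range strings.Split(output, "\n") {
		for _, ev := range parse([]byte(line)) {
			if ev.Kind == "text" {
				sb.WriteString(ev.Text)
			}
		}
	}
	return sb.String()
}

func main() {
	out := `{"type":"text","content":"Hello, "}` + "\n" +
		`{"type":"tool_call","tool_name":"read_file"}` + "\n" +
		`{"type":"text","content":"world"}` + "\n"
	fmt.Println(collectText(out, parseLine))
}
```

An empty result signals that no text events were found, which is why Execute() only overwrites result.Output when the extracted string is non-empty.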
Existing Providers Reference
| Provider | Binary | Name | Session Event | Session Field | Resume Flag | System Prompt |
|---|---|---|---|---|---|---|
| Claude | claude | claude | result | session_id | -r ID | --system-prompt (native) |
| Gemini | gemini | gemini | init | session_id | --resume ID | Inlined in first turn |
| Codex | codex | codex | thread.started | thread_id | resume ID (subcommand) | Inlined in first turn |
| Copilot | copilot | github_copilot | result | sessionId (camelCase) | --resume=ID | Inlined in first turn |
| OpenCode | opencode | opencode | step_start | sessionID | -s ID / -c (fallback) | Inlined in first turn |
| OpenAI-Compatible | HTTP API | openai_compatible | API response | N/A | Messages array | system role message |
Non-CLI Provider (HTTP API)
OpenAICompatibleProvider follows a completely different path from CLI-based providers. It implements AgentProvider directly without using baseCLIProvider, hooks, or any of the CLI infrastructure.
What changes vs CLI providers
| Aspect | CLI providers | HTTP provider (OpenAI-Compatible) |
|---|---|---|
| Execution | CLIExecutor.Run() → binary subprocess | httpx.Client → HTTP POST to /chat/completions |
| Token counting | ports.Tokenizer → estimation (len/4), TokensEstimated: true | API response usage field → exact counts, TokensEstimated: false |
| Session management | Extract session ID from NDJSON, resume via CLI flag | No session ID — full messages array sent each turn |
| System prompt | Inlined in first turn or native CLI flag | system role message in messages array |
| Display events | NDJSON stream filtering via DisplayEventParser | Direct write to stdout, no parsing needed |
| State cloning | Done by baseCLIProvider.executeConversation() | Must call cloneState() manually |
| Base struct | base *baseCLIProvider field | No base — flat struct with httpClient *httpx.Client |
Token counting: the key difference
CLI providers estimate tokens because CLI tools don’t report token usage:
// CLI provider pattern — estimation
tokens, _ := p.base.tokenizer.CountTokens(extracted) //nolint:errcheck
result.Tokens = tokens
result.TokensEstimated = true // set by tokenizer.IsEstimate()
The HTTP provider gets exact counts from the API response:
// HTTP provider pattern — exact counts from API
result.Tokens = resp.Usage.TotalTokens
result.TokensEstimated = false
// In ExecuteConversation, input/output are separated:
result.TokensInput = resp.Usage.PromptTokens
result.TokensOutput = resp.Usage.CompletionTokens
result.TokensTotal = resp.Usage.TotalTokens
No Tokenizer port is used. No //nolint:errcheck is needed.
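The two counting strategies side by side, as a sketch. `approxTokenizer` is a guess at what a length-based estimator looks like (the table above mentions len/4); it is not the project's ApproximationTokenizer, and `usage` and `countTokens` are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// usage mirrors the token fields an OpenAI-style API response reports.
type usage struct {
	PromptTokens     int
	CompletionTokens int
	TotalTokens      int
}

// approxTokenizer estimates tokens from text length.
type approxTokenizer struct{ charsPerToken int }

// CountTokens only fails when the ratio is invalid, which is why callers
// holding a valid tokenizer can safely discard the error.
func (t approxTokenizer) CountTokens(text string) (int, error) {
	if t.charsPerToken <= 0 {
		return 0, errors.New("charsPerToken must be positive")
	}
	return len(text) / t.charsPerToken, nil
}

// countTokens prefers exact API usage and falls back to estimation.
func countTokens(output string, u *usage, tok approxTokenizer) (tokens int, estimated bool, err error) {
	if u != nil {
		return u.TotalTokens, false, nil
	}
	n, err := tok.CountTokens(output)
	return n, true, err
}

func main() {
	tok := approxTokenizer{charsPerToken: 4}

	n, est, _ := countTokens("abcdefghijkl", nil, tok) // CLI path
	fmt.Println(n, est)

	n, est, _ = countTokens("ignored", &usage{TotalTokens: 42}, tok) // HTTP path
	fmt.Println(n, est)
}
```

Keeping the estimated flag next to the count lets downstream reporting distinguish approximate CLI figures from exact API figures.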
Conversation: messages array vs session resume
CLI providers maintain a session ID and pass it as a CLI flag to resume:
// CLI: resume with session ID
args = []string{"--resume", state.SessionID, "-p", prompt}
The HTTP provider reconstructs the full messages array from conversation state on every turn:
// HTTP: rebuild messages from turns
messages := make([]chatMessage, 0, len(state.Turns)+2)
if opts.systemPrompt != "" {
messages = append(messages, chatMessage{Role: "system", Content: opts.systemPrompt})
}
for _, turn := range state.Turns {
messages = append(messages, chatMessage{Role: string(turn.Role), Content: turn.Content})
}
messages = append(messages, chatMessage{Role: "user", Content: prompt})
Struct and constructor
type OpenAICompatibleProvider struct {
httpClient *httpx.Client // no base, no logger, no executor, no tokenizer
}
func NewOpenAICompatibleProvider(opts ...OpenAICompatibleProviderOption) *OpenAICompatibleProvider {
p := &OpenAICompatibleProvider{
httpClient: httpx.NewClient(),
}
for _, opt := range opts {
opt(p)
}
return p
}
Option handling
Options are parsed into a dedicated parsedOptions struct with env var fallbacks:
type parsedOptions struct {
baseURL string // required — env: OPENAI_BASE_URL
model string // required — env: OPENAI_MODEL
apiKey string // optional — env: OPENAI_API_KEY
systemPrompt string
temperature *float64 // 0.0–2.0
maxCompletionTokens *int
topP *float64 // 0.0–1.0
}
When to use this pattern
Use the HTTP provider pattern (not baseCLIProvider) when:
- Your provider is an HTTP API, not a CLI binary
- The API returns exact token counts in its response
- Conversation is managed via a messages array, not session IDs
- There is no NDJSON stream to parse
Use OpenAICompatibleProvider as your reference implementation.
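For a new HTTP provider, option resolution with environment fallbacks might look like the sketch below. The option keys (`base_url`, `model`, `api_key`), the precedence of explicit options over environment variables, and the names `resolved`, `lookup`, and `resolveOptions` are all assumptions; only the environment variable names come from the parsedOptions struct above.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// resolved holds the subset of options covered by this sketch.
type resolved struct {
	baseURL string
	model   string
	apiKey  string
}

// lookup prefers an explicit option and falls back to the environment.
// getenv is injected so the logic can be exercised without real env vars.
func lookup(options map[string]any, key, envKey string, getenv func(string) string) string {
	if v, ok := options[key].(string); ok && v != "" {
		return v
	}
	return getenv(envKey)
}

// resolveOptions fills required and optional fields and rejects a
// configuration that lacks a base URL or a model.
func resolveOptions(options map[string]any, getenv func(string) string) (resolved, error) {
	r := resolved{
		baseURL: lookup(options, "base_url", "OPENAI_BASE_URL", getenv),
		model:   lookup(options, "model", "OPENAI_MODEL", getenv),
		apiKey:  lookup(options, "api_key", "OPENAI_API_KEY", getenv),
	}
	if r.baseURL == "" {
		return resolved{}, errors.New("base_url is required (option or OPENAI_BASE_URL)")
	}
	if r.model == "" {
		return resolved{}, errors.New("model is required (option or OPENAI_MODEL)")
	}
	return r, nil
}

func main() {
	opts := map[string]any{"base_url": "http://localhost:8080/v1", "model": "local-model"}
	r, err := resolveOptions(opts, os.Getenv)
	fmt.Println(r.baseURL, r.model, err)
}
```

Resolving everything up front into a typed struct keeps the request-building code free of map lookups and makes missing required values fail before any HTTP call is made.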
Checklist
Structure
- Provider struct with base, logger, executor, tokenizer fields
- NewXxxProvider() zero-config constructor
- NewXxxProviderWithOptions() functional-options constructor
- newBase() called after options, wires all hooks, forwards tokenizer with nil-check
- Option types added to options.go (WithXxxExecutor, WithXxxTokenizer, WithXxxLogger)
- Provider registered in registry.go RegisterDefaults()
Hooks (required)
- buildExecuteArgs forces structured output format (JSON/NDJSON)
- buildConversationArgs handles first turn vs session resume
- extractSessionID parses session ID from provider-specific event
Cross-provider options (mandatory)
- model handled in both buildExecuteArgs and buildConversationArgs
- model validated in validateOptions (prefix check or allowlist)
- dangerously_skip_permissions mapped to CLI-specific flag (or logged + ignored if unsupported)
- system_prompt handled via native flag or buildFirstTurnPrompt() on first turn only
- output_format checked in Execute() to conditionally expose result.Response
- Unknown options silently ignored (never iterate to reject)
- All options applied in both buildExecuteArgs and buildConversationArgs
Execute post-processing
- Text extracted from NDJSON via extractDisplayTextFromEvents or extractTextContent
- Tokens re-counted on extracted text (not raw output)
- //nolint:errcheck with explanatory comment on every CountTokens call
- JSON response parsed when output_format is json or stream-json
Interface methods
- Execute delegates to p.base.execute() with post-processing
- ExecuteConversation delegates to p.base.executeConversation()
- Name() returns unique provider identifier (matches provider: YAML field)
- Validate() checks binary via exec.LookPath with %w error wrapping
Display events
- parseDisplayEvents handles text events (EventText) and tool events (EventToolUse)
- NUL bytes sanitized before json.Unmarshal
- Unknown/malformed events return nil (never error)
Tests
- Option injection tests (TestWithXxxTokenizer, TestWithXxxExecutor)
- buildExecuteArgs table-driven tests (basic, with model, with permissions)
- buildConversationArgs tests (first turn with system_prompt, resume with session ID)
- extractSessionID tests (valid, missing event, empty output)
- parseDisplayEvents tests (text, tool, unknown, invalid JSON)
- validateOptions tests (nil, valid, invalid model, unknown option)
Final verification
- make build passes
- make lint passes with zero violations
- make test passes
- grep -rn "dangerously_skip_permissions" your_provider.go returns at least one match