ztick uses hexagonal architecture (also called “ports and adapters”) to maintain a clean separation of concerns and testability.

Layer Structure

The codebase is organized into 4 layers, with a strict dependency direction (inward only):

┌─────────────────────────────────────────────────────┐
│ Interfaces (CLI, Config)                            │
├─────────────────────────────────────────────────────┤
│ Infrastructure (Adapters: TCP, Shell, Persistence,  │
│                 Telemetry)                          │
├─────────────────────────────────────────────────────┤
│ Application (Scheduler, Storage, Query Handler)     │
├─────────────────────────────────────────────────────┤
│ Domain (Job, Rule, Runner, Execution)               │
└─────────────────────────────────────────────────────┘
     ↓ Dependencies flow inward only (toward Domain)

Layer 1: Domain (src/domain/)

Purpose: Pure data types and business logic with zero external dependencies.

Exports:

  • Job — Scheduled action with status
  • JobStatus — planned, triggered, executed, failed
  • Rule — Pattern-based job-to-runner mapping
  • Runner — Tagged union of shell, direct, http, awf, amqp, redis
  • Instruction — set, remove, query operations
  • Request/Response — Query protocol

Key property: No imports from outer layers. Tests can run standalone.

Example: Define a job type (fields only; lifecycle methods omitted)

pub const Job = struct {
    identifier: []const u8,
    execution: i64,  // nanosecond timestamp
    status: JobStatus,
};
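
The other domain exports can be sketched in the same dependency-free style. The following is a minimal illustration, not ztick's actual source: the four status names come from the export list above, while the transition rules, the `can_transition_to` helper, and the `Runner` payload fields are assumptions made for the example (only two of the six runner variants are shown).

```zig
const std = @import("std");

// The four states are taken from the export list; the allowed transitions
// below are an assumption for illustration.
pub const JobStatus = enum {
    planned,
    triggered,
    executed,
    failed,

    /// Returns true when moving from `self` to `next` is allowed.
    pub fn can_transition_to(self: JobStatus, next: JobStatus) bool {
        return switch (self) {
            .planned => next == .triggered,
            .triggered => next == .executed or next == .failed,
            .executed, .failed => false, // terminal states
        };
    }
};

// Hypothetical payloads: the field names are invented for this sketch.
pub const Runner = union(enum) {
    shell: struct { command: []const u8 },
    http: struct { url: []const u8 },

    /// Returns the string that identifies what this runner would invoke.
    pub fn target(self: Runner) []const u8 {
        return switch (self) {
            .shell => |s| s.command,
            .http => |h| h.url,
        };
    }
};
```

Because nothing here touches I/O or another layer, such types can be exercised with `zig test` on the file alone.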

Layer 2: Application (src/application/)

Purpose: Core scheduler logic—storage, pattern matching, query handling.

Exports:

  • Scheduler — Main orchestrator
  • JobStorage — In-memory HashMap + priority queue for efficient job insertion (O(log n))
  • RuleStorage — Rule persistence and pattern matching
  • QueryHandler — Instruction → response conversion
  • ExecutionClient — Tracks pending job executions

Key property: Depends only on Domain. No I/O or side effects.

Performance note: JobStorage uses std.PriorityQueue ordered by execution time, giving O(log n) insertion and retrieval in execution order (F021). Scheduling throughput therefore scales to thousands of jobs without linear scans.

Example: Scheduler tick loop

pub fn tick(self: *Scheduler, now: i64) !void {
    const to_execute = try self.job_storage.get_to_execute(now);
    for (to_execute) |job| {
        const rule = self.rule_storage.find_rule_for(job.identifier);
        try self.execution_client.trigger(job, rule);
    }
}

Layer 3: Infrastructure (src/infrastructure/)

Purpose: Adapters that connect the application to external systems.

Exports:

  • TcpServer — Listens for TCP protocol connections (thread per connection)
  • HttpServer — RESTful JSON API server (thread per connection, mirrors TCP pattern)
  • ShellRunner — Executes shell commands via std.process
  • Encoder/Logfile — Binary persistence (read/write jobs and rules)
  • Parser — Line protocol parsing
  • Telemetry — OpenTelemetry SDK initialization and OTLP export (ADR-0004)
  • Channel — Thread-safe bounded FIFO with drain() for single-lock batch consumption and optional wake notification (F021)
  • Clock — Event-driven tick scheduling with condition-variable wakeup and framerate enforcement (F021)

Key property: Depends on Domain and Application. Handles all I/O.

Performance note: The database thread uses Channel.drain() to consume incoming requests in a single lock/unlock cycle, reducing contention at high concurrency. The event-driven Clock wakes immediately on incoming requests rather than sleeping for fixed intervals, bringing single-worker p50 latency below 1 ms. Both the TCP and HTTP servers spawn a detached thread per connection and track it with an atomic counter, so a slow connection does not block the accept loop.

Example: TCP adapter accepts connections and routes commands

pub const TcpServer = struct {
    pub fn handle_connection(
        self: *TcpServer,
        scheduler: *application.Scheduler,
        socket: std.net.Stream,
    ) !void {
        var parser = Parser.init(socket);
        while (try parser.next_instruction()) |instr| {
            const response = try scheduler.handle_query(instr);
            // writeAll loops until the whole response is sent; write() may send only part of it.
            try socket.writeAll(response);
        }
    }
};

Concurrency Pattern (F022): Both TcpServer and HttpServer use an identical detached-thread pattern for handling concurrent connections:

  1. Accept loop increments an atomic counter and spawns a detached thread per accepted connection
  2. Worker thread processes the request and decrements the counter on exit (via defer)
  3. Graceful shutdown via join_all() polls the counter until it reaches zero or a 5-second timeout elapses

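Connection tracking in this pattern is lock-free: each client gets its own thread, and the atomic counter coordinates shutdown across threads without a mutex. Shared state such as ResponseRouter and Channel is still mutex-guarded, so throughput scales close to linearly rather than perfectly.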
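
The three steps above can be sketched as follows. This is a minimal illustration assuming Zig 0.14's standard library; `ConnectionTracker`, `spawn_worker`, and the `connection_id` parameter are hypothetical names, and the real servers pass a socket rather than an integer.

```zig
const std = @import("std");

// Hypothetical tracker; the names mirror the prose above, not ztick's source.
const ConnectionTracker = struct {
    active_connections: std.atomic.Value(u32) = std.atomic.Value(u32).init(0),

    /// Step 1: count the connection, then hand it to a detached worker.
    pub fn spawn_worker(self: *ConnectionTracker, connection_id: u32) !void {
        _ = self.active_connections.fetchAdd(1, .acq_rel);
        const thread = std.Thread.spawn(.{}, worker, .{ self, connection_id }) catch |err| {
            // Undo the increment if the thread could not be created.
            _ = self.active_connections.fetchSub(1, .acq_rel);
            return err;
        };
        thread.detach();
    }

    /// Step 2: the worker decrements the counter on every exit path.
    fn worker(self: *ConnectionTracker, connection_id: u32) void {
        defer _ = self.active_connections.fetchSub(1, .acq_rel);
        _ = connection_id; // request handling would go here
    }

    /// Step 3: poll until all workers have finished or the timeout elapses.
    /// Returns true if the counter reached zero in time.
    pub fn join_all(self: *ConnectionTracker, timeout_ms: i64) bool {
        const deadline = std.time.milliTimestamp() + timeout_ms;
        while (self.active_connections.load(.acquire) != 0) {
            if (std.time.milliTimestamp() >= deadline) return false;
            std.Thread.sleep(std.time.ns_per_ms);
        }
        return true;
    }
};
```

Returning a bool from `join_all` is a design choice of this sketch: it lets the caller log or escalate when workers are still running after the timeout.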

Key benefits:

  • No head-of-line blocking — slow clients don’t block fast clients
  • Simple reasoning — one thread per connection makes debugging straightforward
  • Proven pattern — TCP server established this pattern; HTTP replicates it exactly
  • Graceful shutdown — join_all() waits up to 5 seconds for in-flight requests to drain before exit

Layer 4: Interfaces (src/interfaces/)

Purpose: Entry point—command-line parsing, configuration loading, component wiring.

Exports:

  • main() — Parses args, loads config, spawns threads
  • Config — TOML configuration
  • Cli — Argument parsing

Key property: Depends on all layers. Orchestrates the entire system.

Example: Main function wires up all components

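pub fn main() !void {
    // `args` and `allocator` are assumed to be set up earlier (elided here).
    const config = try load_config(args.config_path);
    var scheduler = try application.Scheduler.init(allocator);
    var tcp_server = try infrastructure.TcpServer.bind(config.controller.listen);

    // Pass the scheduler by pointer so every connection shares one instance.
    try tcp_server.listen(&scheduler);
}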

Dependency Inversion

The hexagonal pattern uses dependency inversion to keep dependencies flowing inward:

Without Hexagonal (Tightly Coupled)

main.zig
  ├─ TcpServer
  │  ├─ Scheduler
  │  │  ├─ Job
  │  │  └─ Rule
  │  └─ Encoder (I/O)
  └─ Encoder (I/O)

Problem: Application depends on I/O; hard to test

With Hexagonal (Inverted)

main.zig (orchestrates)
  ├─ TcpServer (adapter) → calls
  │  └─ Scheduler (application) → uses
  │     └─ Job, Rule (domain)

TcpServer is separate; Scheduler is testable without I/O
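
One way to express such a port in Zig is a small struct holding a type-erased context pointer and a function pointer. The sketch below is an assumption about how this could look, not a description of ztick's code: `ExecutionPort`, `RecordingAdapter`, and `trigger_all` are hypothetical names. It shows application-side logic calling an adapter through the port, with a recording test double standing in for a real adapter.

```zig
const std = @import("std");

// Hypothetical port: the application layer owns this type, adapters fill it in.
const ExecutionPort = struct {
    context: *anyopaque,
    trigger_fn: *const fn (context: *anyopaque, identifier: []const u8) void,

    fn trigger(self: ExecutionPort, identifier: []const u8) void {
        self.trigger_fn(self.context, identifier);
    }
};

// Application-side logic: knows only the port, never a concrete adapter.
fn trigger_all(execution: ExecutionPort, identifiers: []const []const u8) void {
    for (identifiers) |identifier| execution.trigger(identifier);
}

// Test double standing in for a real adapter such as a shell runner.
const RecordingAdapter = struct {
    triggered: usize = 0,

    fn port(self: *RecordingAdapter) ExecutionPort {
        return .{ .context = self, .trigger_fn = record };
    }

    fn record(context: *anyopaque, identifier: []const u8) void {
        // Recover the concrete adapter from the type-erased pointer.
        const self: *RecordingAdapter = @ptrCast(@alignCast(context));
        _ = identifier;
        self.triggered += 1;
    }
};
```

With this shape, a test can pass `RecordingAdapter.port()` to `trigger_all` and assert on `triggered` without opening a socket or spawning a process.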

Testing Strategy

Each layer is tested independently:

Domain Tests

  • Pure data structures
  • Status transitions
  • Pattern matching logic
  • No I/O or allocation tracking needed

Example: src/domain/job.zig includes inline tests

test "job lifecycle" {
    var job = Job{ .identifier = "test", .execution = 0, .status = .planned };
    job.status = .triggered;
    try std.testing.expectEqual(JobStatus.triggered, job.status);
}

Application Tests

  • Scheduler behavior
  • Storage operations
  • Rule resolution
  • No actual TCP or file I/O

Example: src/application/scheduler.zig tests

test "scheduler triggers matching jobs" {
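    var scheduler = try Scheduler.init(allocator);
    // handle_query returns a response; discard it explicitly.
    _ = try scheduler.handle_query(Request{ .instruction = .{ .set = ... } });
    try scheduler.tick(1000);
    // `job` stands for the stored job fetched back from the scheduler (lookup elided).
    try std.testing.expectEqual(JobStatus.triggered, job.status);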
}

Infrastructure Tests

  • Parsing (protocol, TOML)
  • Encoding/decoding (persistence)
  • Channel correctness
  • Mock I/O where possible; real I/O in integration tests

Example: src/infrastructure/protocol/parser.zig tests

test "parse protocol line" {
    var parser = Parser.init("SET job.1 1234567890");
    // next_instruction() yields an optional; unwrap it before inspecting the result.
    const instr = (try parser.next_instruction()).?;
    try std.testing.expectEqual(InstructionType.set, instr.type);
}
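
To make the parsing step concrete, here is a small self-contained sketch of a line parser. Only the `SET <identifier> <timestamp>` shape is taken from the test above; the `REMOVE` and `QUERY` keywords, the `Instruction` payload fields, the error names, and the `parse_line` function are assumptions for illustration and may not match ztick's protocol.

```zig
const std = @import("std");

// Hypothetical instruction shape; the real domain type may carry more fields.
const Instruction = union(enum) {
    set: struct { identifier: []const u8, execution: i64 },
    remove: struct { identifier: []const u8 },
    query: void,
};

const ParseError = error{ UnknownCommand, MissingArgument, InvalidTimestamp };

/// Parses one space-separated line such as "SET job.1 1234567890".
/// Returned slices point into `line`, so `line` must outlive the result.
fn parse_line(line: []const u8) ParseError!Instruction {
    var tokens = std.mem.tokenizeScalar(u8, line, ' ');
    const command = tokens.next() orelse return error.UnknownCommand;

    if (std.mem.eql(u8, command, "SET")) {
        const identifier = tokens.next() orelse return error.MissingArgument;
        const raw = tokens.next() orelse return error.MissingArgument;
        const execution = std.fmt.parseInt(i64, raw, 10) catch return error.InvalidTimestamp;
        return Instruction{ .set = .{ .identifier = identifier, .execution = execution } };
    }
    if (std.mem.eql(u8, command, "REMOVE")) {
        const identifier = tokens.next() orelse return error.MissingArgument;
        return Instruction{ .remove = .{ .identifier = identifier } };
    }
    if (std.mem.eql(u8, command, "QUERY")) return Instruction{ .query = {} };
    return error.UnknownCommand;
}
```

Parsing from a byte slice rather than a socket keeps this kind of test free of real I/O.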

Functional Tests

  • End-to-end behavior
  • Component interaction
  • Full tick cycle

Example: src/functional_tests.zig

test "scheduler processes job from query to executed" {
    var scheduler = try Scheduler.init(allocator);
    // handle_query returns a response; discard it explicitly.
    _ = try scheduler.handle_query(Request{ .instruction = .{ .set = ... } });
    try scheduler.tick(1000);
    // Verify job is executed
}

Adding a New Feature

  1. Define domain types in src/domain/new_concept.zig
    • No external dependencies
    • Include unit tests
  2. Implement application logic in src/application/handler.zig
    • Uses domain types
    • Tested without I/O
  3. Add infrastructure adapter in src/infrastructure/adapter.zig
    • Implements the interface
    • Handles side effects
  4. Wire into the interfaces layer in src/main.zig
    • Compose the feature into the system
    • Update CLI/config as needed
  5. Add functional test in src/functional_tests.zig
    • Verify end-to-end behavior

Performance Optimizations

Database Thread Throughput (F021)

The database thread processes incoming job requests and triggers scheduled jobs. Three optimizations keep it from becoming the throughput bottleneck at scale:

1. Priority Queue Storage (O(log n) insertion)

  • JobStorage.to_execute uses std.PriorityQueue ordered by execution time instead of sorted array
  • Insertion: O(log n) vs O(n) linear scans + array shifts
  • Supports thousands of scheduled jobs without throughput degradation
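
As a rough illustration of this storage choice, the sketch below orders jobs by execution time with std.PriorityQueue and pops the ones that are due. It assumes the `init(allocator, context)` queue API found in Zig 0.12 to 0.14; the trimmed `Job`, `earlier_first`, `JobQueue`, and `pop_due` are hypothetical names, not ztick's JobStorage.

```zig
const std = @import("std");

// Hypothetical, trimmed-down job: only the fields the queue needs.
const Job = struct {
    identifier: []const u8,
    execution: i64, // nanosecond timestamp
};

// Min-heap ordering: the job with the earliest execution time comes out first.
fn earlier_first(_: void, a: Job, b: Job) std.math.Order {
    return std.math.order(a.execution, b.execution);
}

const JobQueue = std.PriorityQueue(Job, void, earlier_first);

/// Moves every job due at or before `now` into `out`, earliest first,
/// stopping when `out` is full. Returns the number of jobs written.
fn pop_due(queue: *JobQueue, now: i64, out: []Job) usize {
    var n: usize = 0;
    while (n < out.len) : (n += 1) {
        const next = queue.peek() orelse break; // queue is empty
        if (next.execution > now) break; // everything left is in the future
        out[n] = queue.remove();
    }
    return n;
}
```

Each `add` and `remove` costs O(log n), and `peek` is O(1), so a tick that finds nothing due returns after a single comparison.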

2. Batch Request Drain (single lock/unlock)

  • Channel.drain() copies all buffered requests in one critical section
  • Reduces lock acquisitions from 500+ per second (individual try_receive calls) to ~1 per tick
  • Multi-worker throughput scales closer to linearly
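
The batch-drain idea can be sketched with a fixed-capacity buffer guarded by a single mutex. This is a simplified stand-in, not ztick's Channel: the generic `Channel(T, capacity)` shape, the `send` method, and the caller-provided output buffer are assumptions, and the wake notification is left out.

```zig
const std = @import("std");

// Hypothetical bounded channel; ztick's real Channel may differ.
fn Channel(comptime T: type, comptime capacity: usize) type {
    return struct {
        const Self = @This();

        mutex: std.Thread.Mutex = .{},
        items: [capacity]T = undefined,
        len: usize = 0,

        /// Appends one item; returns false when the buffer is full.
        pub fn send(self: *Self, item: T) bool {
            self.mutex.lock();
            defer self.mutex.unlock();
            if (self.len == capacity) return false;
            self.items[self.len] = item;
            self.len += 1;
            return true;
        }

        /// Copies every buffered item into `out` in FIFO order under a single
        /// lock/unlock cycle and empties the channel.
        /// `out` must have room for `capacity` items.
        pub fn drain(self: *Self, out: []T) usize {
            std.debug.assert(out.len >= capacity);
            self.mutex.lock();
            defer self.mutex.unlock();
            const n = self.len;
            var i: usize = 0;
            while (i < n) : (i += 1) out[i] = self.items[i];
            self.len = 0;
            return n;
        }
    };
}
```

A consumer that calls `drain` once per tick takes the lock once regardless of how many requests arrived, whereas a per-item receive would take it once per request.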

3. Event-Driven Tick Scheduling (sub-millisecond latency)

  • Clock uses condition variable timedWait() instead of unconditional Thread.sleep()
  • Wakes immediately when requests arrive; sleeps only when idle
  • Single-worker p50 latency drops from 2.24ms to <1ms
  • Framerate acts as a tick cap to prevent busy-spinning under sustained load
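
A condition-variable wait of this kind can be sketched as below, assuming std.Thread.Condition.timedWait as found in recent Zig releases. The `Clock` fields and the `notify` and `wait_for_tick` names are hypothetical, and the framerate cap is omitted to keep the example short.

```zig
const std = @import("std");

// Hypothetical clock: sleeps until a request arrives or the tick interval ends.
const Clock = struct {
    mutex: std.Thread.Mutex = .{},
    wake: std.Thread.Condition = .{},
    pending: bool = false,

    /// Called by producer threads after they enqueue a request.
    pub fn notify(self: *Clock) void {
        self.mutex.lock();
        defer self.mutex.unlock();
        self.pending = true;
        self.wake.signal();
    }

    /// Blocks until notified or until `max_wait_ns` elapses.
    /// Returns true when a request was pending, false on timeout.
    /// A spurious wakeup also returns false, which only makes a tick run early.
    pub fn wait_for_tick(self: *Clock, max_wait_ns: u64) bool {
        self.mutex.lock();
        defer self.mutex.unlock();
        if (!self.pending) {
            // A timeout is an expected outcome here, so the error is ignored.
            self.wake.timedWait(&self.mutex, max_wait_ns) catch {};
        }
        const woken = self.pending;
        self.pending = false;
        return woken;
    }
};
```

Checking `pending` before waiting matters: a notification that arrives just before the consumer starts waiting is not lost, because the flag is already set under the same mutex.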

Benchmark targets (8 TCP workers):

  • Throughput: >3000 msg/s (6x improvement)
  • p50 latency: <5ms (68% reduction)
  • p99 latency: <15ms (53% reduction)

See Building the Project for benchmark instructions.

HTTP Server Concurrency (F022)

The HTTP server spawns a dedicated thread per accepted connection, mirroring the TCP server’s detached-thread pattern:

  • Thread per connection — Each accepted HTTP connection is handled in its own thread, eliminating head-of-line blocking in the accept loop
  • Atomic counter tracking — An active_connections atomic counter tracks live worker threads for graceful shutdown
  • Graceful shutdown — join_all() polls the counter with a 5-second timeout, giving in-flight requests time to complete before process exit
  • Shared state safety — ResponseRouter and Channel are mutex-guarded; next_client_id uses atomic fetchAdd

Benchmark targets (8 HTTP workers):

  • Throughput: >2000 msg/s (4x improvement over sequential baseline)
  • p50 latency: <5ms (vs ~56ms under 16 concurrent workers previously)

Key Principles

  1. Domain is pure — no I/O, no dependencies
  2. Application is testable — depends only on domain
  3. Infrastructure is flexible — adapters are swappable
  4. Interfaces are thin — just wiring and config
  5. Tests are co-located — each file includes its own unit tests; end-to-end tests live in src/functional_tests.zig

See Also

  • Building — How to compile and test the project
  • Contributing — Code style and submission guidelines