Persistence Format
Specification of ztick’s binary persistence format used for logfiles.
Overview
ztick persists jobs and rules using a binary encoding format. The logfile backend writes entries to an append-only file for durability across restarts. The memory backend stores the same encoded bytes in an in-memory list for ephemeral operation (see Configuration for backend selection). The encoding format is designed for:
- Performance: Efficient parsing and writing
- Durability: Each entry has a length prefix for robustness
- Simplicity: No external serialization library needed
File Structure
A logfile is a sequence of entries, each prefixed with its length:
[Entry 1]
[Entry 2]
...
[Entry N]
No file header or magic bytes: the logfile is a raw sequence of length-prefixed entries.
Entry Format
[4 bytes: entry length (big-endian u32)]
[1 byte: entry type discriminant]
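[N bytes: entry-specific data]
The entry length counts the type byte and the entry-specific data, but not the 4-byte length prefix itself. The entry type discriminant determines how to interpret the remaining bytes.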
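The framing can be sketched in Zig as follows. This is a minimal illustration, not ztick's actual code: the helper name `frameEntry` and the `NoSpaceLeft` error are assumptions made for the example. It takes an already-encoded body (type byte plus data) and prepends the big-endian length.

```zig
const std = @import("std");

/// Hypothetical helper (not part of ztick): copies `body` (type byte plus
/// entry data) into `out`, preceded by its length as a big-endian u32.
/// Returns the framed slice.
fn frameEntry(out: []u8, body: []const u8) error{ Overflow, NoSpaceLeft }![]u8 {
    if (body.len > std.math.maxInt(u32)) return error.Overflow;
    const total = 4 + body.len;
    if (out.len < total) return error.NoSpaceLeft;
    // The stored length covers the body only, not the prefix itself.
    const len: u32 = @intCast(body.len);
    std.mem.writeInt(u32, out[0..4], len, .big);
    @memcpy(out[4..total], body);
    return out[0..total];
}
```

Writing into a caller-provided buffer keeps the sketch free of allocator details; a real writer would likely append straight to the file.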
Entry Types
Type 0: Job Entry
Stores a single job record.
[1 byte: type = 0]
[2 bytes: identifier length (big-endian u16)]
[L bytes: identifier string (UTF-8)]
[8 bytes: execution timestamp (big-endian i64, nanoseconds)]
[1 byte: status (0=planned, 1=triggered, 2=executed, 3=failed)]
Example (hex dump for job “toto”, timestamp 2020-11-15T16:30:00Z, status planned):
00000000: 00 00 00 10 00 00 04 74 6f 74 6f 16 47 bb 5c ee .......toto.G.\.
00000010: e1 50 00 00 .P..
Breakdown:
- 00000010 = length 16 bytes
- 00 = type 0 (Job)
- 0004 = identifier length 4
- 746f746f = “toto” (UTF-8)
- 1647bb5ceee15000 = timestamp 1605457800000000000 ns
- 00 = status planned
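To make the layout concrete, here is a hedged Zig sketch of an encoder that reproduces the hex dump above. The names `Status` and `encodeJob` and the `NoSpaceLeft` error are hypothetical and chosen for the example. Only the byte layout comes from this specification.

```zig
const std = @import("std");

// Assumed enum; the values follow the status byte defined in this document.
const Status = enum(u8) { planned = 0, triggered = 1, executed = 2, failed = 3 };

/// Hypothetical encoder: writes a framed Job entry into `out` and returns
/// the slice that was written.
fn encodeJob(out: []u8, id: []const u8, execution_ns: i64, status: Status) ![]u8 {
    if (id.len > std.math.maxInt(u16)) return error.Overflow;
    // type (1) + identifier length (2) + identifier + timestamp (8) + status (1)
    const body_len = 1 + 2 + id.len + 8 + 1;
    const total = 4 + body_len;
    if (out.len < total) return error.NoSpaceLeft;

    const body_len_u32: u32 = @intCast(body_len);
    const id_len: u16 = @intCast(id.len);
    std.mem.writeInt(u32, out[0..4], body_len_u32, .big);
    out[4] = 0; // type 0 = Job
    std.mem.writeInt(u16, out[5..7], id_len, .big);
    @memcpy(out[7 .. 7 + id.len], id);
    const ts = 7 + id.len;
    std.mem.writeInt(i64, out[ts..][0..8], execution_ns, .big);
    out[ts + 8] = @intFromEnum(status);
    return out[0..total];
}
```

Calling `encodeJob(&buf, "toto", 1605457800000000000, .planned)` with a sufficiently large `buf` yields the 20 bytes shown in the hex dump.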
Type 1: Rule Entry
Stores a single rule record with its runner.
[1 byte: type = 1]
[2 bytes: identifier length (big-endian u16)]
[L bytes: identifier string (UTF-8)]
[2 bytes: pattern length (big-endian u16)]
[L bytes: pattern string (UTF-8)]
[1 byte: runner type (0=shell, 1=amqp, 2=direct, 3=awf, 4=http)]
├─ if runner_type == 0 (shell):
│ [2 bytes: command length (big-endian u16)]
│ [L bytes: command string (UTF-8)]
├─ if runner_type == 1 (amqp):
│ [2 bytes: dsn length (big-endian u16)]
│ [L bytes: dsn string (UTF-8)]
│ [2 bytes: exchange length (big-endian u16)]
│ [L bytes: exchange string (UTF-8)]
│ [2 bytes: routing_key length (big-endian u16)]
│ [L bytes: routing_key string (UTF-8)]
├─ if runner_type == 2 (direct):
│ [2 bytes: executable length (big-endian u16)]
│ [L bytes: executable string (UTF-8)]
│ [2 bytes: args count (big-endian u16)]
│ for each arg:
│ [2 bytes: arg length (big-endian u16)]
│ [L bytes: arg string (UTF-8)]
├─ if runner_type == 3 (awf):
│ [2 bytes: workflow length (big-endian u16)]
│ [L bytes: workflow string (UTF-8)]
│ [2 bytes: inputs count (big-endian u16)]
│ for each input:
│ [2 bytes: input length (big-endian u16)]
│ [L bytes: input string (UTF-8, key=value)]
└─ if runner_type == 4 (http):
[2 bytes: method length (big-endian u16)]
[L bytes: method string (UTF-8, GET|POST|PUT|DELETE)]
[2 bytes: url length (big-endian u16)]
[L bytes: url string (UTF-8)]
Example (shell runner for rule “t” matching pattern “toto” with command “titi”):
00000000: 00 00 00 11 01 00 01 74 00 04 74 6f 74 6f 00 00 .......t..toto..
00000010: 04 74 69 74 69 .titi
Breakdown:
- 00000011 = length 17 bytes
- 01 = type 1 (Rule)
- 0001 = identifier length 1
- 74 = “t” (UTF-8)
- 0004 = pattern length 4
- 746f746f = “toto” (UTF-8)
- 00 = runner type 0 (shell)
- 0004 = command length 4
- 74697469 = “titi” (UTF-8)
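The shell-runner example can be reproduced with a small cursor that appends bytes and length-prefixed strings. This is only a sketch: `Cursor`, `encodeShellRule`, and the `NoSpaceLeft` error are names invented for the example, and the other runner types are left out.

```zig
const std = @import("std");

/// Hypothetical write cursor over a caller-provided buffer.
const Cursor = struct {
    buf: []u8,
    pos: usize = 0,

    fn putByte(self: *Cursor, b: u8) !void {
        if (self.buf.len - self.pos < 1) return error.NoSpaceLeft;
        self.buf[self.pos] = b;
        self.pos += 1;
    }

    /// Writes a big-endian u16 length followed by the string bytes.
    fn putString(self: *Cursor, s: []const u8) !void {
        if (s.len > std.math.maxInt(u16)) return error.Overflow;
        if (self.buf.len - self.pos < 2 + s.len) return error.NoSpaceLeft;
        const len: u16 = @intCast(s.len);
        std.mem.writeInt(u16, self.buf[self.pos..][0..2], len, .big);
        @memcpy(self.buf[self.pos + 2 ..][0..s.len], s);
        self.pos += 2 + s.len;
    }
};

/// Hypothetical encoder for a Rule entry with a shell runner.
fn encodeShellRule(out: []u8, id: []const u8, pattern: []const u8, command: []const u8) ![]u8 {
    if (out.len < 4) return error.NoSpaceLeft;
    var c = Cursor{ .buf = out, .pos = 4 }; // leave room for the length prefix
    try c.putByte(1); // type 1 = Rule
    try c.putString(id);
    try c.putString(pattern);
    try c.putByte(0); // runner type 0 = shell
    try c.putString(command);
    const body_len: u32 = @intCast(c.pos - 4);
    std.mem.writeInt(u32, out[0..4], body_len, .big);
    return out[0..c.pos];
}
```

With the inputs "t", "toto", and "titi" this produces the 21 bytes of the hex dump above. The amqp, direct, awf, and http branches would follow the same pattern with their own fields.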
Type 2: Job Removal Entry
Marks a job as removed. Contains only the identifier — no timestamp or status.
[1 byte: type = 2]
[2 bytes: identifier length (big-endian u16)]
[L bytes: identifier string (UTF-8)]
During logfile replay, a job removal entry causes the corresponding job to be deleted from JobStorage. During background compression, if the last entry for an ID is a removal, the ID is excluded entirely from the compressed output.
Type 3: Rule Removal Entry
Marks a rule as removed. Contains only the identifier.
[1 byte: type = 3]
[2 bytes: identifier length (big-endian u16)]
[L bytes: identifier string (UTF-8)]
Behaves identically to job removal: during replay the rule is deleted from RuleStorage, and during compression the ID is excluded from output.
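Because both removal entries share one layout, a single decoder can handle them. The sketch below is an assumption-laden illustration: `Removal`, `Kind`, and `decodeRemoval` are hypothetical names, and using `InvalidData` for a body too short to hold the type byte and length is a guess at how such input might be classified.

```zig
const std = @import("std");

const Kind = enum { job, rule };

/// Hypothetical decoded form; `id` borrows from the input buffer.
const Removal = struct { kind: Kind, id: []const u8 };

/// Decodes an unframed removal entry (type byte + length-prefixed identifier).
fn decodeRemoval(body: []const u8) !Removal {
    if (body.len < 3) return error.InvalidData; // type byte + u16 length
    const kind: Kind = switch (body[0]) {
        2 => .job,
        3 => .rule,
        else => return error.UnknownEntryType,
    };
    const id_len: usize = std.mem.readInt(u16, body[1..3], .big);
    if (id_len > body.len - 3) return error.StringOverflow;
    return Removal{ .kind = kind, .id = body[3 .. 3 + id_len] };
}
```

A replay loop would then delete `id` from JobStorage or RuleStorage depending on `kind`.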
Encoding Details
String Encoding
All strings are UTF-8 encoded with a 2-byte big-endian length prefix:
[2 bytes: string length (big-endian u16)]
[L bytes: UTF-8 string data]
Maximum string length: 65535 bytes (2^16 - 1). The encoder returns error.Overflow if a string exceeds this limit.
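On the decoding side, the same prefix tells the reader how many bytes to consume. A minimal sketch follows, assuming a hypothetical `readString` helper that tracks its position through a caller-owned index and expects `pos.*` to be at most `buf.len`.

```zig
const std = @import("std");

/// Hypothetical reader: returns the length-prefixed string starting at
/// `pos.*` and advances `pos.*` past it. The result borrows from `buf`.
fn readString(buf: []const u8, pos: *usize) error{StringOverflow}![]const u8 {
    const remaining = buf.len - pos.*;
    if (remaining < 2) return error.StringOverflow;
    const len: usize = std.mem.readInt(u16, buf[pos.*..][0..2], .big);
    // The declared length must fit in what is left after the prefix.
    if (len > remaining - 2) return error.StringOverflow;
    const start = pos.* + 2;
    pos.* = start + len;
    return buf[start .. start + len];
}
```

Calling it repeatedly walks consecutive strings, which is how the identifier, pattern, and command fields of a rule could be read in turn.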
Timestamp Encoding
Execution timestamps are stored as big-endian i64 in nanoseconds since Unix epoch:
1711612800000000000 ns = 2024-03-28 08:00:00 UTC
To convert from seconds:
const seconds = 1711612800;
const nanoseconds = seconds * 1_000_000_000; // 1711612800000000000
Status Encoding
Job status is stored as a single byte:
| Value | Status |
|---|---|
| 0 | planned |
| 1 | triggered |
| 2 | executed |
| 3 | failed |
Runner Type Encoding
Runner type is stored as a single byte:
| Value | Type |
|---|---|
| 0 | shell |
| 1 | amqp |
| 2 | direct |
| 3 | awf |
| 4 | http |
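Both single-byte encodings map naturally onto `enum(u8)` types. The following sketch validates a byte before converting it. The function names are hypothetical, and returning `InvalidData` for an out-of-range value is an assumption based on the error named in the Recovery section.

```zig
const std = @import("std");

// Values mirror the status and runner type tables in this document.
const Status = enum(u8) { planned = 0, triggered = 1, executed = 2, failed = 3 };
const RunnerType = enum(u8) { shell = 0, amqp = 1, direct = 2, awf = 3, http = 4 };

/// Hypothetical decoder: rejects bytes outside the status table.
fn statusFromByte(b: u8) error{InvalidData}!Status {
    if (b > @intFromEnum(Status.failed)) return error.InvalidData;
    const status: Status = @enumFromInt(b);
    return status;
}

/// Hypothetical decoder: rejects bytes outside the runner type table.
fn runnerTypeFromByte(b: u8) error{InvalidData}!RunnerType {
    if (b > @intFromEnum(RunnerType.http)) return error.InvalidData;
    const runner: RunnerType = @enumFromInt(b);
    return runner;
}
```

The range check matters because `@enumFromInt` on a value with no matching tag is illegal behavior in Zig rather than a recoverable error.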
Writing Entries
When persisting a job or rule:
- Serialize the entry to a buffer
- Calculate the entry length (without the 4-byte length prefix)
- Write the 4-byte length prefix (big-endian)
- Write the entry data
- Optionally fsync to ensure durability
Example (writing a job):
const id_bytes = job.identifier;
// Identifiers carry a u16 length prefix, so reject anything longer
if (id_bytes.len > std.math.maxInt(u16)) return error.Overflow;
// Size the buffer exactly: length (4) + type (1) + id length (2) + id + timestamp (8) + status (1)
const buffer = try allocator.alloc(u8, 4 + 1 + 2 + id_bytes.len + 8 + 1);
defer allocator.free(buffer);
// Skip length field (will fill later)
var offset: usize = 4;
// Write type
buffer[offset] = 0; // Job
offset += 1;
// Write identifier
std.mem.writeInt(u16, buffer[offset..][0..2], @intCast(id_bytes.len), .big);
offset += 2;
@memcpy(buffer[offset .. offset + id_bytes.len], id_bytes);
offset += id_bytes.len;
// Write execution timestamp
std.mem.writeInt(i64, buffer[offset..][0..8], job.execution, .big);
offset += 8;
// Write status
buffer[offset] = @intFromEnum(job.status);
offset += 1;
// Fill in the length field
const entry_length = offset - 4;
std.mem.writeInt(u32, buffer[0..4], @intCast(entry_length), .big);
// Write to file
try file.writeAll(buffer[0..offset]);
Reading Entries
When reading a logfile:
- Read 4-byte length prefix
- Allocate buffer of that size
- Read the entry data
- Parse based on type byte
- Return the deserialized entry
Example (reading entries):
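while (true) {
    var length_bytes: [4]u8 = undefined;
    // readAll keeps reading until the buffer is full or EOF is reached
    const read = try file.readAll(&length_bytes);
    if (read == 0) break; // Clean EOF
    if (read < length_bytes.len) return error.IncompleteEntry; // Truncated length prefix
    const entry_length = std.mem.readInt(u32, &length_bytes, .big);
    if (entry_length == 0) return error.InvalidData; // No room for the type byte
    const entry_data = try allocator.alloc(u8, entry_length);
    // Freed at the end of each iteration: copy anything that must outlive it
    defer allocator.free(entry_data);
    if ((try file.readAll(entry_data)) != entry_length) return error.IncompleteEntry;
    const entry_type = entry_data[0];
    switch (entry_type) {
        0 => {
            // Parse Job
        },
        1 => {
            // Parse Rule
        },
        2 => {
            // Parse Job Removal (identifier only)
        },
        3 => {
            // Parse Rule Removal (identifier only)
        },
        else => return error.UnknownEntryType,
    }
}
Error Handling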
Incomplete Entry
If the file ends mid-entry:
return error.IncompleteEntry
Example: 4-byte length prefix present but not enough data for the entry.
Invalid Type
If the type byte is unknown:
return error.UnknownEntryType
Malformed String
If a string length exceeds the remaining buffer:
return error.StringOverflow
Recovery
On startup, ztick reads the entire logfile:
- Parse the raw bytes into length-prefixed frames
- Decode each frame into a Job, Rule, Job Removal, or Rule Removal entry
- Load Job entries into JobStorage and Rule entries into RuleStorage; removal entries delete the corresponding entry from storage
If a frame is incomplete (e.g., truncated write from a crash), the parser stops and returns the remaining unparsed bytes — it does not skip ahead. If a complete frame contains invalid data, decoding returns InvalidData and loading stops. This means corruption at any point truncates the log at that position; entries after the corruption are lost.
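The stop-at-truncation behaviour can be sketched with a small frame iterator. `FrameIterator` is a hypothetical name, and the sketch works on an in-memory slice rather than a file. After iteration ends, whatever is left in `bytes` is the unparsed remainder.

```zig
const std = @import("std");

/// Hypothetical iterator over length-prefixed frames in a byte slice.
const FrameIterator = struct {
    bytes: []const u8,

    /// Returns the next complete frame body, or null at end of input or when
    /// the remaining bytes do not form a complete frame. It never skips ahead.
    fn next(self: *FrameIterator) ?[]const u8 {
        if (self.bytes.len < 4) return null;
        const len: usize = std.mem.readInt(u32, self.bytes[0..4], .big);
        if (self.bytes.len - 4 < len) return null; // truncated frame: stop here
        const body = self.bytes[4 .. 4 + len];
        self.bytes = self.bytes[4 + len ..];
        return body;
    }
};
```

A loader built on this would decode each returned body and treat a non-empty `bytes` at the end as the truncated tail of the log.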
Background Compression
ztick automatically compresses logfiles on a periodic interval to reduce disk usage on long-lived deployments. Compression runs in a background thread and deduplicates repeated mutations on the same job or rule IDs, keeping only the final state.
Compression Scheduling
- Interval: Configured via compression_interval in the [database] section (default: 3600 seconds / 1 hour)
- Trigger: Compression starts after the configured interval has elapsed since the last compression
- Skipping: If a compression is already in progress, the next interval is skipped to prevent overlapping compressions
- Disabling: Set compression_interval = 0 to disable compression entirely
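The four scheduling rules above reduce to a single predicate. The sketch below is illustrative only: the function name and parameters are hypothetical, and how ztick actually tracks time or the in-progress state is not specified here.

```zig
const std = @import("std");

/// Hypothetical scheduling check; all times are in seconds.
fn shouldStartCompression(now_s: i64, last_s: i64, interval_s: u32, in_progress: bool) bool {
    if (interval_s == 0) return false; // compression_interval = 0 disables compression
    if (in_progress) return false; // skip this interval rather than overlap
    return now_s - last_s >= @as(i64, interval_s);
}
```

With the default interval of 3600, a check made 3599 seconds after the last compression returns false and one made at 3600 seconds returns true.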
Compression Behavior
Logfile backend:
- Active logfile is atomically renamed to .to_compress
- Fresh logfile is created for new writes (no gaps in append operations)
- Background compression deduplicates the .to_compress file into logfile.compressed
- On restart, ztick loads from either the compressed file or the active logfile (whichever is newer)
Memory backend:
- Compression scheduling is completely inactive — no threads spawned, no file operations
- This ensures zero overhead for ephemeral deployments
Deduplication Rules
During compression:
- Only the final state of each job or rule ID is kept
- If a job was SET, REMOVED, then SET again, the compressed file contains one entry with the final SET state
- If a job’s final state is REMOVED, it is excluded entirely from the compressed file
- This reduces logfile size from O(n mutations) to O(n IDs)
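These rules amount to keeping the last mutation per ID and dropping IDs whose last mutation is a removal. The sketch below shows that logic on simplified records. `Mutation`, `Kind`, and `compact` are hypothetical names, jobs and rules are assumed to be compacted separately, and the quadratic scan is chosen for brevity where a real implementation would more likely use a hash map.

```zig
const std = @import("std");

const Kind = enum { set, removed };

/// Hypothetical simplified log record: an ID and what happened to it.
const Mutation = struct { id: []const u8, kind: Kind };

/// Copies into `out` the final mutation of every ID in `log`, skipping IDs
/// whose final state is a removal. `out` must be at least as long as `log`.
fn compact(log: []const Mutation, out: []Mutation) []Mutation {
    std.debug.assert(out.len >= log.len);
    var n: usize = 0;
    for (log, 0..) |m, i| {
        // A mutation is final if no later entry mentions the same ID.
        var superseded = false;
        for (log[i + 1 ..]) |later| {
            if (std.mem.eql(u8, later.id, m.id)) {
                superseded = true;
                break;
            }
        }
        if (superseded or m.kind == .removed) continue;
        out[n] = m;
        n += 1;
    }
    return out[0..n];
}
```

For a log of SET a, SET b, REMOVE a, SET a, REMOVE b, the result contains a single entry: the final SET of a.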
Recovery
If compression is interrupted (e.g., by shutdown):
- The active logfile remains intact and accessible
- Any partial .to_compress files are compressed at the next startup before the periodic timer begins
- Data integrity is guaranteed by the append-only design
Example: 24-Hour Disk Usage
With default compression:
- 10,000 mutations on 100 job IDs per day
- Without compression: ~300 KB per day → ~109 MB per year
- With compression (1-hour interval): ~10 KB per day after deduplication → ~3.6 MB per year
Configuration
Default (once per hour):
[database]
persistence = "logfile"
compression_interval = 3600
Aggressive (for high-mutation workloads):
[database]
persistence = "logfile"
compression_interval = 300 # 5 minutes
Disabled:
[database]
persistence = "logfile"
compression_interval = 0
Performance Characteristics
- Write latency: ~1-10 us per entry (buffered)
- Read latency: ~1-10 us per entry (sequential scan)
- Durability: With fsync enabled, guaranteed to disk after each write
- Compression latency: Background process does not block tick loop; compression latency depends on logfile size but is typically 10-100 ms
Logfile Size
Typical entry sizes:
| Entry Type | Typical Size |
|---|---|
| Job (short id) | 30-50 bytes |
| Job (long id) | 50-100 bytes |
| Rule (short pattern, short command) | 40-60 bytes |
| Rule (long pattern, long command) | 100-200 bytes |
| Job/Rule Removal | 10-30 bytes |
With 10,000 jobs and 100 rules, expect ~300 KB logfile without compression.
Compressed logfile sizes depend on your workload:
- Stable jobs (few mutations): ~10-20 KB (about 93-97% smaller than the ~300 KB uncompressed example)
- Active jobs (many mutations): ~50-100 KB (about 67-83% smaller than the ~300 KB uncompressed example)
- See Background Compression above for disk usage examples
See Also
- Data Types — Structure of Job and Rule types
- Configuration — fsync_on_persist and framerate settings
- Reference — Overview of all reference docs