
Keep

A command-line utility for storing and retrieving temporary data with automatic compression, metadata extraction, and querying. Pipe any output into keep for organized storage — no more losing data in /tmp files with cryptic names.

# Instead of this:
curl -s https://api.example.com/data > /tmp/api-data.json

# Do this:
curl -s https://api.example.com/data | keep --save api-data
keep --get api-data

Features

  • Store and retrieve — Save content with tags, retrieve by ID or tag
  • Automatic compression — LZ4, GZip, BZip2, XZ, ZStd support
  • Metadata plugins — Auto-extract file type, digests, hostname, user info, and more
  • Filters — Apply transformations (head, tail, grep, strip ANSI) on retrieval
  • Querying — List, search, diff items with flexible formatting
  • Client/server architecture — Optional HTTP server with streaming support
  • Modular design — Extensible plugin system for compression, metadata, and filtering

Installation

From Source

Requires Rust and Cargo.

cargo build --release

Install via Cargo

cargo install --path .

Static Binary (Linux)

./build-static.bash
# Binary at bin/keep

Environment Module

A TCL modulefile is provided at modulefile. To use it, copy or symlink the project directory into your modules path:

# Symlink into an existing module path (e.g., /usr/local/modules)
ln -s /path/to/keep /usr/local/modules/keep

# Load the module
module load keep

# Verify
keep --status

# Source the shell profile (optional, for shell integration)
source $KEEP_BASH_PROFILE    # bash
source $KEEP_ZSH_PROFILE     # zsh
. "$KEEP_SH_PROFILE"         # sh/dash/ksh (POSIX sh has no `source` builtin)
source $KEEP_CSH_PROFILE     # csh/tcsh

The modulefile prepends keep/bin to PATH and sets shell-specific profile variables:

| Variable | Profile | Shell |
|---|---|---|
| KEEP_BASH_PROFILE | profile.bash | bash |
| KEEP_ZSH_PROFILE | profile.zsh | zsh |
| KEEP_SH_PROFILE | profile.sh | sh, dash, ksh93, pdksh, mksh |
| KEEP_CSH_PROFILE | profile.csh | csh, tcsh |

Shell Completion

Tab completion is available for bash, zsh, fish, elvish, and powershell. Completions for @ (save) and @@ (get) are available for bash and zsh only.

Bash — add to ~/.bashrc:

. <(keep --generate-completion bash)

Zsh — add to ~/.zshrc:

. <(keep --generate-completion zsh)

With profile.bash or profile.zsh: Completions for keep, @ (save), and @@ (get) are loaded automatically when sourcing the profile.

Build with Server/Client Features

# Server only
cargo build --release --features server

# Client only (for connecting to a remote keep server)
cargo build --release --features client

# Server + client + Swagger UI
cargo build --release --features server,client,swagger

Quick Start

# Save content with a tag (--save is optional when piping)
echo "Hello, world!" | keep greeting

# Retrieve by ID (--get is optional for numeric IDs)
keep 1

# Retrieve by tag (--get is required for tags)
keep --get greeting

# List all stored items
keep --list

# Get item details
keep --info greeting

# Delete by ID
keep --delete 1

Real-World Examples

# Save API response
curl -s https://api.github.com/repos/user/repo | keep --save repo-info

# Save test output with metadata
npm test 2>&1 | keep --save test-results --meta project=myapp --meta env=staging

# Chain commands: process and store
cat data.csv | sort | uniq | keep --save cleaned-data

# Diff two versions
keep --diff 1 5

# Get first 20 lines of an item
keep --get 1 --filters "head_lines(20)"

# List items from a specific project
keep --list --meta project=myapp

Usage

Save Mode

Save stdin content with tags and metadata. The --save flag is optional when piping content.

# Save (auto-assigned ID, no tag)
echo "data" | keep --save

# Save with a tag (--save is optional when piping)
echo "data" | keep --save my-tag
echo "data" | keep my-tag

# Save with a tag and multiple metadata entries
cat report.pdf | keep --save report --meta project=alpha --meta env=prod

# Specify compression
echo "data" | keep --save my-tag --compression gzip

Tags and metadata make items easy to find later. Tags are simple identifiers; metadata is key-value pairs.

Get Mode

Retrieve items by ID, tag, or metadata. Get is the default mode when numeric IDs are provided.

# Get by ID (no --get needed for numeric IDs)
keep --get 1
keep 1

# Get by tag (requires --get flag)
keep --get my-tag

# Get with filters applied
keep --get 1 --filters "head_lines(10)"

# Get by metadata filter
keep --get --meta project=alpha

# Force binary output to TTY (override safety check)
keep --get 1 --force

List Mode

List stored items with filtering and formatting.

# List all items
keep --list

# List by tag
keep --list my-tag

# Filter by metadata
keep --list --meta env=prod

# Custom column format
keep --list --list-format "id,time,size,tags"

# JSON output for scripting
keep --list --output-format json

# Human-readable file sizes
keep --list --human-readable

Info Mode

Show detailed information about an item.

keep --info 1
keep --info my-tag
keep --info --meta key=value

Update Mode

Update an item's tags and metadata, or re-run meta plugins on its stored content.

# Replace tags
keep --update 1 new-tag

# Update metadata
keep --update 1 --meta key=newvalue

# Remove a metadata key
keep --update 1 --meta key

# Re-run meta plugins on stored content
keep --update 1 --meta-plugin digest --meta-plugin text
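
The metadata arguments above follow a simple convention: `key=value` sets a key and a bare `key` removes it. A minimal Python sketch of that convention, assuming nothing about keep's internals (the function name `apply_meta_args` is hypothetical):

```python
def apply_meta_args(current, meta_args):
    """Return a new metadata dict after applying --meta style arguments.

    "key=value" sets or overwrites a key; a bare "key" removes it.
    This mirrors the documented CLI behavior, not keep's actual code.
    """
    updated = dict(current)
    for arg in meta_args:
        key, separator, value = arg.partition("=")
        if separator:
            updated[key] = value      # key=value: set or overwrite
        else:
            updated.pop(key, None)    # bare key: remove if present
    return updated


before = {"project": "alpha", "env": "staging"}
after = apply_meta_args(before, ["env=prod", "project"])
print(after)  # {'env': 'prod'}
```

Only the first `=` splits the argument, so a value may itself contain `=`.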

Delete Mode

Delete items by ID.

keep --delete 1
keep --delete 1 2 3

Diff Mode

Show differences between two items.

keep --diff 1 2

Status Mode

Show system status and supported features.

keep --status
keep --status-plugins
keep --status --verbose

Filters

Apply transformations to item content during retrieval. Filters are chained with |.

# First 10 lines
keep --get 1 --filters "head_lines(10)"

# Skip first 5 lines, then grep for errors
keep --get 1 --filters "skip_lines(5)|grep(pattern=error)"

# Strip ANSI escape codes
keep --get 1 --filters "strip_ansi"

# Last 100 bytes
keep --get 1 --filters "tail_bytes(100)"

# Complex chain
keep --get 1 --filters "skip_lines(10)|grep(pattern=TODO)|head_lines(5)"

Available Filters

| Filter | Description | Parameters |
|---|---|---|
| head_bytes(n) | First n bytes | count |
| head_lines(n) | First n lines | count |
| tail_bytes(n) | Last n bytes | count |
| tail_lines(n) | Last n lines | count |
| skip_bytes(n) | Skip first n bytes | count |
| skip_lines(n) | Skip first n lines | count |
| grep(pattern) | Filter matching lines | pattern (regex) |
| strip_ansi | Remove ANSI escape codes | none |

Set KEEP_FILTERS to apply a default filter chain to all retrievals.
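
To make the chain syntax concrete, here is a rough Python sketch of how a string such as `skip_lines(10)|grep(pattern=TODO)|head_lines(5)` could be parsed and applied. It is an illustration only: the function names are hypothetical, it buffers the whole item in memory (keep itself is described as streaming), and it splits naively on `|`, so a grep pattern containing `|` would not survive.

```python
import re

ANSI_RE = re.compile(rb"\x1b\[[0-9;?]*[ -/]*[@-~]")   # CSI escape sequences
STEP_RE = re.compile(r"\s*(\w+)(?:\((.*)\))?\s*")      # name or name(arg)


def parse_chain(spec):
    """Split 'a(1)|b(x=y)|c' into [(name, arg_or_None), ...]."""
    steps = []
    for part in spec.split("|"):
        match = STEP_RE.fullmatch(part)
        if match is None:
            raise ValueError(f"unrecognized filter step: {part!r}")
        steps.append((match.group(1), match.group(2)))
    return steps


def apply_step(name, arg, data):
    """Apply one filter to a bytes object and return the result."""
    if name == "strip_ansi":
        return ANSI_RE.sub(b"", data)
    if name == "grep":
        pattern = arg.split("=", 1)[1] if arg.startswith("pattern=") else arg
        regex = re.compile(pattern.encode("utf-8"))
        lines = data.splitlines(keepends=True)
        return b"".join(line for line in lines if regex.search(line))
    count = int(arg)
    if name == "head_bytes":
        return data[:count]
    if name == "tail_bytes":
        return data[max(len(data) - count, 0):]
    if name == "skip_bytes":
        return data[count:]
    lines = data.splitlines(keepends=True)
    if name == "head_lines":
        return b"".join(lines[:count])
    if name == "tail_lines":
        return b"".join(lines[max(len(lines) - count, 0):])
    if name == "skip_lines":
        return b"".join(lines[count:])
    raise ValueError(f"unknown filter: {name}")


def apply_chain(spec, data):
    """Run every step of the chain in order, left to right."""
    for name, arg in parse_chain(spec):
        data = apply_step(name, arg, data)
    return data
```

Each step receives the output of the previous one, so order matters: `skip_lines(1)|head_lines(2)` and `head_lines(2)|skip_lines(1)` return different lines.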

Compression

Items are compressed automatically on save. Default: LZ4.

| Algorithm | Type | Speed | Ratio |
|---|---|---|---|
| lz4 | Internal | Fastest | Lower |
| gzip | Internal | Fast | Good |
| bzip2 | External | Slow | Better |
| xz | External | Slowest | Best |
| zstd | External | Fast | Good |
| none | Internal | N/A | N/A |

# Specify compression per item
echo "data" | keep --save my-tag --compression zstd

# Set default via environment
export KEEP_COMPRESSION=gzip

External compression programs (bzip2, xz, zstd) must be installed on the system.
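
To get a feel for the ratio column on your own data, the following Python sketch compresses a payload with the three algorithms available in the standard library and reports the resulting sizes. It does not use keep at all, and LZ4 and ZStd are omitted because they are not in every Python standard library; actual ratios depend heavily on the input.

```python
import bz2
import gzip
import lzma

# name -> (compress, decompress); lzma's default container format is .xz
CODECS = {
    "gzip": (gzip.compress, gzip.decompress),
    "bzip2": (bz2.compress, bz2.decompress),
    "xz": (lzma.compress, lzma.decompress),
}


def compressed_sizes(data):
    """Return {algorithm: compressed size in bytes} for the given payload."""
    sizes = {}
    for name, (compress, decompress) in CODECS.items():
        blob = compress(data)
        if decompress(blob) != data:   # sanity check: round trip is lossless
            raise RuntimeError(f"{name} round trip failed")
        sizes[name] = len(blob)
    return sizes


sample = b"keep stores compressed data\n" * 2000
for name, size in compressed_sizes(sample).items():
    print(f"{name}: {len(sample)} -> {size} bytes")
```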

Meta Plugins

Metadata is automatically extracted when saving items.

| Plugin | Key | Description |
|---|---|---|
| env | * | Capture KEEP_META_* environment variables |
| magic_file | file_type | File type detection (requires magic feature) |
| text | text_line_count, text_word_count | Line and word counts |
| user | user_uid, user_name, user_gid, user_group | Current user info |
| shell | shell | Current shell path |
| shell_pid | shell_pid | Shell process ID |
| keep_pid | keep_pid | Keep process ID |
| digest | digest_sha256, digest_md5 | Content digests |
| read_time | read_time | Time to read content |
| read_rate | read_rate | Data read rate |
| hostname | hostname, hostname_short | System hostname |
| exec | Custom | Run external commands for metadata |
| cwd | cwd | Current working directory |

# Use specific plugins (repeatable)
echo "data" | keep --save tag --meta-plugin digest --meta-plugin text --meta-plugin user

# Pass options to a plugin via JSON
echo "data" | keep --save tag --meta-plugin 'tokens:{"options":{"min_length":"2"}}'

# Capture custom metadata via environment
echo "data" | KEEP_META_project=alpha keep --save tag

# Combine environment and CLI metadata
echo "data" | KEEP_META_build=1234 keep --save tag --meta env=staging
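
As a rough mental model of what plugins such as digest, text, and env produce, the Python sketch below feeds content to plugins chunk by chunk and lets each one report its keys through a callback. Everything here is an assumption made for illustration: the class names, the `save_meta` callback signature, counting lines as newline bytes, splitting words on ASCII whitespace, and stripping the `KEEP_META_` prefix from environment variable names. Only `digest_sha256` is computed; `digest_md5` is left out for brevity.

```python
import hashlib

WHITESPACE = b" \t\r\n\x0b\x0c"


class DigestPlugin:
    """Computes digest_sha256 incrementally."""

    def __init__(self):
        self._sha256 = hashlib.sha256()

    def update(self, chunk):
        self._sha256.update(chunk)

    def finish(self, save_meta):
        save_meta("digest_sha256", self._sha256.hexdigest())


class TextPlugin:
    """Counts lines and words without holding the content in memory."""

    def __init__(self):
        self._lines = 0
        self._words = 0
        self._in_word = False   # carries word state across chunk boundaries

    def update(self, chunk):
        self._lines += chunk.count(b"\n")
        for byte in chunk:
            is_space = byte in WHITESPACE
            if not is_space and not self._in_word:
                self._words += 1
            self._in_word = not is_space

    def finish(self, save_meta):
        save_meta("text_line_count", self._lines)
        save_meta("text_word_count", self._words)


def env_meta(environ, prefix="KEEP_META_"):
    """Collect prefixed environment variables (assumed key naming)."""
    return {k[len(prefix):]: v for k, v in environ.items() if k.startswith(prefix)}


def run_plugins(chunks, plugins, save_meta):
    """Feed every chunk to every plugin, then let each report its keys."""
    for chunk in chunks:
        for plugin in plugins:
            plugin.update(chunk)
    for plugin in plugins:
        plugin.finish(save_meta)
```

Passing a callback rather than a storage handle means the same plugin can write into a dict, a list, or a database, for example `run_plugins(chunks, plugins, meta.__setitem__)` with `meta = {}`.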

Configuration

Environment Variables

| Variable | Description | Default |
|---|---|---|
| KEEP_DIR | Storage directory | ~/.keep |
| KEEP_CONFIG | Config file path | ~/.config/keep/config.yml |
| KEEP_COMPRESSION | Compression algorithm | lz4 |
| KEEP_META_PLUGINS | Meta plugins to use (JSON format: name[:{json}], comma-separated) | env |
| KEEP_FILTERS | Default filter chain | none |
| KEEP_LIST_FORMAT | List column format | built-in defaults |
| KEEP_SERVER_ADDRESS | Server bind address | 127.0.0.1 |
| KEEP_SERVER_PORT | Server port | 21080 |
| KEEP_SERVER_USERNAME | Server Basic auth username | keep |
| KEEP_SERVER_PASSWORD | Server password | none |
| KEEP_SERVER_PASSWORD_HASH | Server password hash | none |
| KEEP_SERVER_JWT_SECRET | JWT secret for token auth | none |
| KEEP_SERVER_JWT_SECRET_FILE | Path to JWT secret file | none |
| KEEP_SERVER_MAX_BODY_SIZE | Maximum POST body size in bytes (0=unlimited) | unlimited |
| KEEP_SERVER_CERT | TLS certificate file path (PEM) | none |
| KEEP_SERVER_KEY | TLS private key file path (PEM) | none |
| KEEP_CLIENT_URL | Remote keep server URL | none |
| KEEP_CLIENT_USERNAME | Remote server username | keep |
| KEEP_CLIENT_PASSWORD | Remote server password | none |
| KEEP_CLIENT_JWT | JWT token for remote server | none |

Any config setting can be overridden with KEEP__<SETTING> environment variables (double underscore separator).

Configuration File

Default location: ~/.config/keep/config.yml

Generate a default configuration:

keep --generate-config > ~/.config/keep/config.yml

# Storage directory
dir: ~/.keep

# List view columns
list_format:
  - name: id
    label: "Item"
    align: right
  - name: time
    label: "Time"
    align: right
  - name: size
    label: "Size"
    align: right
  - name: tags
    label: "Tags"
    align: left

# Table styling
table_config:
  style: utf8_full
  content_arrangement: dynamic

# Default compression
compression_plugin:
  name: gzip

# Default meta plugins
meta_plugins:
  - name: env
  - name: digest
    options:
      algorithm: sha256

# Server settings
server:
  address: "127.0.0.1"
  port: 21080
  username: "keep"
  password: "secret"
  # Maximum POST body size in bytes (0 = unlimited)
  # max_body_size: 52428800  # 50 MB
  # JWT authentication (takes priority over password)
  # jwt_secret: "my-secret-key"
  # jwt_secret_file: /path/to/jwt_secret
  # TLS (requires tls feature)
  # cert_file: /path/to/cert.pem
  # key_file: /path/to/key.pem

# Client settings
client:
  url: "http://localhost:21080"
  username: "keep"
  password: "secret"
  # Or use JWT token
  # jwt: "eyJhbGciOiJIUzI1NiIs..."

human_readable: true
quiet: false
force: false

Client/Server Mode

Keep supports a client/server architecture where one machine runs a keep server and other machines connect as clients. This is useful for:

  • Centralizing stored data across multiple machines
  • Sharing items between team members
  • Offloading storage to a dedicated server
  • Piping data from long-running processes without local storage

Server Mode

Start an HTTP REST API server:

# Default: 127.0.0.1:21080
keep --server

# Custom address and port
keep --server --server-address 0.0.0.0 --server-port 8080

# With password authentication
keep --server --server-password mypassword

# With custom username
keep --server --server-username admin --server-password mypassword

# With JWT authentication
keep --server --server-jwt-secret my-secret-key

JWT Authentication

JWT (JSON Web Token) authentication provides permission-based access control. When a JWT secret is configured, the server validates tokens and checks permission claims for each request.

Configuration:

# Via CLI flag
keep --server --server-jwt-secret my-secret-key

# Via environment variable
export KEEP_SERVER_JWT_SECRET=my-secret-key
keep --server

# Via config file (config.yml)
server:
  jwt_secret: "my-secret-key"

# Via secret file (for Docker/secrets management)
keep --server --server-jwt-secret-file /path/to/secret

Token format:

JWTs must use the HS256 algorithm and carry the following claims:

| Claim | Type | Required | Description |
|---|---|---|---|
| sub | string | Yes | Subject (client identifier) |
| exp | number | Yes | Expiration time (Unix timestamp) |
| read | boolean | No | Permission for GET requests (default: false) |
| write | boolean | No | Permission for POST/PUT/PATCH requests (default: false) |
| delete | boolean | No | Permission for DELETE requests (default: false) |

Permission mapping:

| HTTP Method | Required Permission |
|---|---|
| GET | read |
| POST, PUT, PATCH | write |
| DELETE | delete |

Example token payload:

{
  "sub": "ci-pipeline",
  "exp": 1735689600,
  "read": true,
  "write": true,
  "delete": false
}

Generating tokens:

The server does not generate tokens — use any JWT library or tool:

# Using jwt-cli (https://github.com/mike-engel/jwt-cli)
jwt encode --secret my-secret-key \
  --exp=$(date -d '+24 hours' +%s) \
  '{"sub":"my-client","read":true,"write":true,"delete":false}'

# Using Python
python3 -c "
import jwt, time
token = jwt.encode({
    'sub': 'my-client',
    'exp': int(time.time()) + 86400,
    'read': True, 'write': True, 'delete': False
}, 'my-secret-key', algorithm='HS256')
print(token)
"

Using tokens:

# With curl
curl -H "Authorization: Bearer <jwt-token>" http://localhost:21080/api/item/

# The keep client uses --client-jwt for JWT tokens
keep --client-url http://server:21080 --client-jwt <jwt-token> --save my-tag

Response codes:

| Code | Meaning |
|---|---|
| 200 | Authorized |
| 401 | Missing, invalid, or expired token |
| 403 | Valid token but insufficient permissions |
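
The token format, permission mapping, and response codes above can be tied together in a short Python sketch that uses only the standard library. This is not the server's implementation: the function names are hypothetical, and a maintained JWT library is the better choice outside of experiments.

```python
import base64
import hashlib
import hmac
import json
import time

PERMISSION_FOR_METHOD = {
    "GET": "read", "POST": "write", "PUT": "write",
    "PATCH": "write", "DELETE": "delete",
}


def _b64url(raw):
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def _b64url_decode(text):
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input, secret):
    return hmac.new(secret.encode("utf-8"), signing_input.encode("utf-8"),
                    hashlib.sha256).digest()


def encode_token(claims, secret):
    """Build an HS256 JWT: base64url(header).base64url(payload).signature."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode("utf-8"))
    payload = _b64url(json.dumps(claims).encode("utf-8"))
    signing_input = f"{header}.{payload}"
    return f"{signing_input}.{_b64url(_sign(signing_input, secret))}"


def authorize(token, secret, method, now=None):
    """Return 200, 401, or 403 following the tables above."""
    now = time.time() if now is None else now
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        claims = json.loads(_b64url_decode(payload_b64))
        signature = _b64url_decode(signature_b64)
    except ValueError:
        return 401                                   # malformed token
    if not isinstance(header, dict) or not isinstance(claims, dict):
        return 401
    expected = _sign(f"{header_b64}.{payload_b64}", secret)
    if header.get("alg") != "HS256" or not hmac.compare_digest(signature, expected):
        return 401                                   # wrong algorithm or signature
    exp = claims.get("exp")
    if "sub" not in claims or not isinstance(exp, (int, float)) or exp <= now:
        return 401                                   # missing claims or expired
    permission = PERMISSION_FOR_METHOD.get(method.upper())
    if permission is None or claims.get(permission) is not True:
        return 403                                   # valid token, no permission
    return 200
```

A token carrying `"read": true, "write": true, "delete": false` would therefore pass GET and POST requests and receive 403 on DELETE.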

Notes:

  • JWT and password authentication are mutually exclusive: when jwt_secret is set, password authentication is disabled even if a password is also configured, and every request must present a valid JWT Bearer token
  • Permission fields default to false if omitted — tokens must explicitly grant permissions
  • JWT authentication requires the server feature (jsonwebtoken is included automatically)

HTTPS / TLS

Build with the tls feature to enable HTTPS:

cargo build --release --features server,tls

Provide a TLS certificate and private key (both PEM format):

# Via CLI flags
keep --server \
  --server-cert /path/to/cert.pem \
  --server-key /path/to/key.pem

# Via environment variables
export KEEP_SERVER_CERT=/path/to/cert.pem
export KEEP_SERVER_KEY=/path/to/key.pem
keep --server

# Via config file (config.yml)
server:
  cert_file: /path/to/cert.pem
  key_file: /path/to/key.pem

When cert and key are provided, the server listens with HTTPS. Without them, it falls back to plain HTTP. The port is controlled by --server-port (default: 21080).

Self-signed certificates (for development):

# Generate a self-signed cert
openssl req -x509 -newkey rsa:4096 -keyout key.pem -out cert.pem \
  -days 365 -nodes -subj "/CN=localhost"

# Start server with self-signed cert
keep --server --server-cert cert.pem --server-key key.pem

# Connect client with HTTPS
keep --client-url https://localhost:21080 --save my-tag

The server accepts data from both dumb clients (raw HTTP/curl) and smart clients (the keep CLI).

Server Streaming

The server streams all data through fixed-size buffers (8192 bytes). At no point is the entire file content held in memory.

  • POST: Body streams through the compression and storage pipeline in chunks. When max_body_size is exceeded, the server returns 413 PAYLOAD_TOO_LARGE while keeping the partial item already saved through the pipeline.
  • GET: Content streams from disk through decompression to the client using the same fixed-size buffers.
  • Diff: Individual items are capped at 10 MB for the diff endpoint to prevent unbounded memory use.

Max Body Size

Control the maximum accepted body size with:

# Via CLI flag (bytes)
keep --server --server-max-body-size 52428800

# Via environment variable
export KEEP_SERVER_MAX_BODY_SIZE=52428800
keep --server

# Via config file (config.yml)
server:
  max_body_size: 52428800  # 50 MB

When set to 0 or omitted, no limit is enforced.
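
The combination of fixed-size buffers and a body size limit can be sketched in a few lines of Python. This is a simplified model, not the server's code: `stream_copy` and `PayloadTooLarge` are names chosen for the example, and the exception stands in for the 413 response.

```python
import io

BUFFER_SIZE = 8192  # matches the buffer size described above


class PayloadTooLarge(Exception):
    """Stand-in for an HTTP 413 response."""


def stream_copy(src, dst, max_body_size=0):
    """Copy src to dst in fixed-size chunks; 0 means no size limit.

    Returns the number of bytes copied. Chunks written before the limit
    is exceeded stay in dst, mirroring the partial-item behavior.
    """
    total = 0
    while True:
        chunk = src.read(BUFFER_SIZE)
        if not chunk:
            return total
        total += len(chunk)
        if max_body_size and total > max_body_size:
            raise PayloadTooLarge(f"body exceeds {max_body_size} bytes")
        dst.write(chunk)


source = io.BytesIO(b"x" * 20000)
target = io.BytesIO()
print(stream_copy(source, target))  # 20000
```

At most one buffer of data is held at a time, so memory use does not grow with the size of the body.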

Server Query Parameters

The server supports query parameters that control processing:

| Parameter | Default | Description |
|---|---|---|
| tags | none | Comma-separated tags |
| metadata | none | JSON-encoded metadata |
| compress | true | false = client already compressed, store as-is |
| meta | true | false = client handles metadata, skip server-side plugins |
| decompress | true | false = return raw compressed bytes on GET |

The POST /api/item/{id}/update endpoint accepts additional parameters:

| Parameter | Default | Description |
|---|---|---|
| plugins | none | Comma-separated plugin names to re-run on stored content |
| metadata | none | JSON-encoded metadata overrides to apply |
| tags | none | Comma-separated tags to add (idempotent) |

When using a smart client, these are set automatically. For curl, the server handles everything by default.
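
For scripted clients, the parameters in the first table can be assembled with standard URL encoding. The Python sketch below builds a create-item URL and decodes it again with the documented defaults; both helper names are hypothetical, and the exact parsing rules on the server side are an assumption.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit


def item_url(base_url, tags=(), metadata=None, compress=True, meta=True):
    """Build the POST /api/item/ URL; defaults are left out of the query."""
    params = {}
    if tags:
        params["tags"] = ",".join(tags)
    if metadata:
        params["metadata"] = json.dumps(metadata)
    if not compress:
        params["compress"] = "false"
    if not meta:
        params["meta"] = "false"
    url = base_url.rstrip("/") + "/api/item/"
    return f"{url}?{urlencode(params)}" if params else url


def decode_item_query(url):
    """Decode the query string, applying the defaults from the table."""
    query = parse_qs(urlsplit(url).query)

    def first(name, default):
        return query.get(name, [default])[0]

    return {
        "tags": [tag for tag in first("tags", "").split(",") if tag],
        "metadata": json.loads(first("metadata", "{}")),
        "compress": first("compress", "true") != "false",
        "meta": first("meta", "true") != "false",
    }


print(item_url("http://localhost:21080", tags=["my-tag"], compress=False))
# http://localhost:21080/api/item/?tags=my-tag&compress=false
```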

Example: Curl as a Dumb Client

# Save (server handles compression and metadata)
curl -X POST -d "my data" "http://localhost:21080/api/item/?tags=my-tag"

# Retrieve (server decompresses)
curl http://localhost:21080/api/item/1/content

# Save compressed (client handles compression, server skips)
gzip -c data.txt | curl -X POST --data-binary @- "http://localhost:21080/api/item/?compress=false&tags=my-tag"

Client Mode

The keep CLI can connect to a remote server as a smart client. Build with the client feature:

cargo build --release --features client

# Set server URL via flag or environment
keep --client-url http://server:21080 --save my-tag
export KEEP_CLIENT_URL=http://server:21080

# With password authentication
keep --client-url http://server:21080 --client-password mypassword --save my-tag
export KEEP_CLIENT_PASSWORD=mypassword

# With custom username
keep --client-url http://server:21080 --client-username admin --client-password mypassword --save my-tag

# With JWT authentication
keep --client-url http://server:21080 --client-jwt <jwt-token> --save my-tag
export KEEP_CLIENT_JWT=<jwt-token>

How Client Mode Works

Client mode uses local plugins and remote storage:

  1. Save: Local compression and meta plugins run on the client; compressed data streams to the server. Smart clients set meta=false so the server skips its own plugins.
  2. Get: Server sends raw compressed data; client decompresses locally and applies filters
  3. Update: Meta plugins run on the server to avoid downloading compressed data for re-processing
  4. Other operations (list, info, delete, diff): Delegated directly to the server

This means client behavior is consistent with local mode — the same compression settings and filters apply.

Streaming Architecture

Client save uses a 3-thread streaming pipeline for constant memory usage regardless of data size:

┌───────────────────┐     OS pipe      ┌────────────────┐
│ Reader thread     ├──────────────────┤ Streamer thread│
│                   │  (compressed     │                │
│ stdin → tee       │   bytes)         │ pipe → POST    │
│    → hash         │                  │   (chunked)    │
│    → compress     │                  │                │
│    → meta plugins │                  │                │
└───────────────────┘                  └────────────────┘
        │                                  │
        ▼                                  ▼
    stdout +                       Server stores blob
    computed metadata

  • Reader thread: Reads stdin, tees output to stdout, computes SHA-256 via digest plugin, compresses data, runs meta plugins (hostname, text, etc.), writes to OS pipe
  • Streamer thread: Reads compressed bytes from pipe, streams to server via chunked HTTP POST
  • Main thread: After streaming completes, sends plugin-collected metadata to server

Memory usage is O(PIPESIZE) — typically 8 KB — regardless of how much data is being stored.
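
The shape of this pipeline can be reproduced in a small Python sketch: one thread hashes and compresses into an OS pipe while another drains the pipe. Several things are stand-ins and not keep's behavior: zlib replaces the real compression plugin, a list replaces the chunked HTTP POST (and, unlike a real POST, keeps every chunk in memory), and the tee to stdout is omitted.

```python
import hashlib
import io
import os
import threading
import zlib

CHUNK = 8192


def reader(source, write_fd, meta):
    """Read, hash, and compress the source, writing the result to the pipe."""
    digest = hashlib.sha256()
    compressor = zlib.compressobj()
    with os.fdopen(write_fd, "wb") as pipe:
        while True:
            chunk = source.read(CHUNK)
            if not chunk:
                break
            digest.update(chunk)
            pipe.write(compressor.compress(chunk))
        pipe.write(compressor.flush())
    # Closing the write end signals end-of-stream to the streamer.
    meta["digest_sha256"] = digest.hexdigest()


def streamer(read_fd, sent_chunks):
    """Drain the pipe; each append stands in for one HTTP chunk."""
    with os.fdopen(read_fd, "rb") as pipe:
        while True:
            chunk = pipe.read(CHUNK)
            if not chunk:
                break
            sent_chunks.append(chunk)


def save(source):
    """Run both threads and return (compressed bytes, collected metadata)."""
    read_fd, write_fd = os.pipe()
    meta, sent_chunks = {}, []
    threads = [
        threading.Thread(target=reader, args=(source, write_fd, meta)),
        threading.Thread(target=streamer, args=(read_fd, sent_chunks)),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    # Main thread: the metadata is only complete after streaming finishes.
    return b"".join(sent_chunks), meta


def demo():
    blob, meta = save(io.BytesIO(b"log line\n" * 1000))
    return zlib.decompress(blob), meta
```

The pipe provides back-pressure: when the streamer is slow, the reader blocks on write instead of buffering more data.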

Example: Remote Pipeline

# On a build server, pipe logs to a central keep server
make build 2>&1 | keep --client-url http://logserver:21080 \
  --save build-logs \
  --meta project=myapp \
  --meta branch=$(git branch --show-current)

# Retrieve from any machine
keep --client-url http://logserver:21080 --get build-logs

# List recent builds from a specific project
keep --client-url http://logserver:21080 --list --meta project=myapp

API Endpoints

| Method | Path | Description |
|---|---|---|
| GET | /api/status | System status |
| GET | /api/plugins/status | Plugin status |
| GET | /api/item/ | List items (tags, order, start, count params) |
| POST | /api/item/ | Create item (body: raw content, params: tags, metadata, compress, meta) |
| GET | /api/item/latest/content | Latest item content |
| GET | /api/item/latest/meta | Latest item metadata |
| GET | /api/item/{id} | Item info by ID |
| GET | /api/item/{id}/content | Item content by ID |
| GET | /api/item/{id}/meta | Item metadata by ID |
| GET | /api/item/{id}/info | Item info by ID |
| POST | /api/item/{id}/meta | Add metadata to existing item (body: JSON object) |
| POST | /api/item/{id}/update | Re-run meta plugins on stored content (params: plugins, metadata, tags) |
| DELETE | /api/item/{id} | Delete item by ID |
| GET | /api/diff | Diff two items (id_a, id_b params) |

Authentication

The server supports three authentication modes:

1. Password (HTTP Basic auth):

# Default username is "keep"
curl -u keep:mypassword http://localhost:21080/api/status

# Custom username
curl -u admin:mypassword http://localhost:21080/api/status

2. JWT (permission-based):

# Valid JWT with read permission allows GET requests
curl -H "Authorization: Bearer <jwt-token>" http://localhost:21080/api/item/

See JWT Authentication for token format and configuration.

3. No authentication:

When neither password nor JWT secret is configured, authentication is disabled.
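
For clients written without curl, the Basic and Bearer modes differ only in the Authorization header. A small Python sketch using the standard library, with hypothetical helper names; the request is built but not sent, so no server is needed to try it:

```python
import base64
import urllib.request


def basic_auth_header(username, password):
    """HTTP Basic: base64 of 'username:password'."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def bearer_auth_header(jwt_token):
    """JWT mode: the token is passed through unchanged."""
    return f"Bearer {jwt_token}"


def status_request(base_url, authorization=None):
    """Build (but do not send) a GET /api/status request."""
    request = urllib.request.Request(base_url.rstrip("/") + "/api/status")
    if authorization is not None:
        request.add_header("Authorization", authorization)
    return request


request = status_request("http://localhost:21080",
                         basic_auth_header("keep", "mypassword"))
print(request.full_url)  # http://localhost:21080/api/status
```

Passing the request to `urllib.request.urlopen` would send it. Basic credentials are only encoded, not encrypted, so they should travel over HTTPS.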

Swagger UI

Build with the swagger feature to enable OpenAPI documentation:

cargo build --features server,swagger

The Swagger UI is available at /swagger and the OpenAPI spec at /openapi.json.

Security

The server applies the following security measures:

  • Input validation: Item IDs are validated as positive integers; tags and metadata have length limits (256 and 128 characters respectively).
  • XSS protection: All user-controlled data rendered into HTML pages is escaped.
  • Security headers: Responses include X-Content-Type-Options: nosniff, X-Frame-Options: DENY, and Referrer-Policy: strict-origin-when-cross-origin.
  • CORS: Explicit allowed headers (Content-Type, Authorization, Accept); no wildcard headers.
  • Path traversal: Item IDs are validated to prevent directory traversal attacks.
  • Internal errors: Internal error details are never exposed in HTML responses — only generic messages are shown.
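
The input validation and escaping rules above can be illustrated with a short Python sketch. It is an approximation: the function names are hypothetical, and applying the 128-character metadata limit to both key and value is an assumption, since the text does not say which part it covers.

```python
import html

MAX_TAG_LENGTH = 256
MAX_META_LENGTH = 128


def parse_item_id(raw):
    """Accept only positive decimal integers, which also rules out paths."""
    if not (raw.isascii() and raw.isdigit()):
        raise ValueError(f"invalid item id: {raw!r}")
    item_id = int(raw)
    if item_id <= 0:
        raise ValueError("item id must be positive")
    return item_id


def validate_tag(tag):
    if not tag or len(tag) > MAX_TAG_LENGTH:
        raise ValueError("tag must be 1 to 256 characters")
    return tag


def validate_meta(key, value):
    # Assumption: the limit applies to key and value alike.
    if not key or len(key) > MAX_META_LENGTH or len(value) > MAX_META_LENGTH:
        raise ValueError("metadata key and value must fit in 128 characters")
    return key, value


def render_tag_html(tag):
    """Escape user-controlled text before placing it in an HTML page."""
    return f"<span class=\"tag\">{html.escape(validate_tag(tag))}</span>"


print(parse_item_id("42"))           # 42
print(render_tag_html("<script>"))   # <span class="tag">&lt;script&gt;</span>
```

Because an item ID must consist only of digits, a value such as `../etc/passwd` is rejected before it can reach any file path.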

Shell Integration

Profile scripts are provided for several shells. Source the appropriate one to enable shell integration:

| Profile | Shells | Features |
|---|---|---|
| profile.bash | bash | Preexec hook, wrapper function, @/@@ aliases, tab completions |
| profile.zsh | zsh | Preexec hook, wrapper function, @/@@ aliases, tab completions |
| profile.sh | sh, dash, ksh93, pdksh, mksh | Wrapper function, @/@@ aliases |
| profile.csh | csh, tcsh | Alias-based keep wrapper, @/@@ aliases |

# bash
source /path/to/keep/profile.bash

# zsh
source /path/to/keep/profile.zsh

# sh, dash, ksh
. /path/to/keep/profile.sh

# csh/tcsh
source /path/to/keep/profile.csh

All profiles provide:

  • @ alias: Shorthand for keep --save
  • @@ alias: Shorthand for keep --get

Bash and zsh profiles additionally provide:

  • keep function: Captures the current command in metadata automatically
  • Tab completion: For keep, @, and @@

# Save with automatic command capture (bash/zsh)
curl -s api.example.com | @ api-response

# Quick retrieve
@@ api-response

Feature Flags

| Feature | Default | Description |
|---|---|---|
| magic | Yes | File type detection via libmagic |
| lz4 | Yes | LZ4 compression (internal) |
| gzip | Yes | GZip compression (internal) |
| server | No | HTTP REST API server |
| tls | No | HTTPS/TLS server support (requires server) |
| client | No | HTTP client for remote server |
| swagger | No | Swagger UI for API docs |
| bzip2 | No | BZip2 compression (external program) |
| xz | No | XZ compression (external program) |
| zstd | No | ZStd compression (external program) |

# Server with Swagger UI
cargo build --features server,swagger

# Server with HTTPS
cargo build --features server,tls

# Client only
cargo build --features client

# Server, TLS, client, Swagger UI, and file type detection
cargo build --features server,tls,client,swagger,magic

License

MIT License - see LICENSE for details.

Contact

Andrew Phillips - andrew@gt0.ca
