asp/keep

Andrew Phillips 17be6abaab refactor: streaming, security hardening, and MCP removal

Major overhaul of server architecture and security posture:

- Streaming: Unified all I/O through PIPESIZE (8192-byte) buffers.
  POST bodies stream via MpscReader through the save pipeline. GET
  content streams from disk via decompression to client. Removed
  save_item_with_reader, get_item_content_info, ChannelReader.
  413 responses keep partial items (nonfatal by design).

- Security: XSS protection in all HTML pages via html_escape crate.
  Security headers middleware (nosniff, frame deny, referrer policy).
  CORS tightened to explicit headers. Input validation for tags
  (256 chars), metadata (128/4096), pagination (10k cap). Config
  file reads use from_utf8_lossy. Generic error messages in HTML.
  Diff endpoint has 10 MB per-item cap. max_body_size config option.

- Panics eliminated: Path unwraps → proper error propagation.
  Mutex unwraps → map_err (registries) / expect with message (local).

- MCP removed: Deleted all MCP code, rmcp dependency, mcp feature.

- Docs: Updated README, DESIGN, AGENTS to reflect all changes.

2026-03-14 00:03:42 -03:00

src

refactor: streaming, security hardening, and MCP removal

2026-03-14 00:03:42 -03:00

.dockerignore

fix: harden security, eliminate panics, remove dead code, add Dockerfile

2026-03-13 07:57:36 -03:00

.gitignore

fix: resolve doctest failures, database bugs, and remove dead code

2026-03-12 11:58:44 -03:00

AGENTS.md

refactor: streaming, security hardening, and MCP removal

2026-03-14 00:03:42 -03:00

build-static.bash

Put the binary in bin

2024-02-26 13:39:34 -04:00

Cargo.lock

refactor: streaming, security hardening, and MCP removal

2026-03-14 00:03:42 -03:00

Cargo.toml

refactor: streaming, security hardening, and MCP removal

2026-03-14 00:03:42 -03:00

DESIGN.md

refactor: streaming, security hardening, and MCP removal

2026-03-14 00:03:42 -03:00

docker-compose.yml

feat: add JWT auth, configurable username, switch password auth to Basic

2026-03-13 13:56:35 -03:00

Dockerfile

refactor: streaming, security hardening, and MCP removal

2026-03-14 00:03:42 -03:00

LICENSE

docs: rewrite README, add LICENSE, remove outdated files

2026-03-12 18:01:23 -03:00

modulefile

Ugh

2026-02-19 13:57:39 -04:00

profile.bash

feat: add ringbuf and crossbeam-utils dependencies

2025-08-29 13:35:55 -03:00

README.md

refactor: streaming, security hardening, and MCP removal

2026-03-14 00:03:42 -03:00

README.md

Keep

A command-line utility for storing and retrieving temporary data with automatic compression, metadata extraction, and querying. Pipe any output into keep for organized storage — no more losing data in /tmp files with cryptic names.

# Instead of this:
curl -s https://api.example.com/data > /tmp/api-data.json

# Do this:
curl -s https://api.example.com/data | keep --save api-data
keep --get api-data

Features
Installation
Quick Start
Usage
- Save Mode
- Get Mode
- List Mode
- Info Mode
- Update Mode
- Delete Mode
- Diff Mode
- Status Mode
Filters
Compression
Meta Plugins
Configuration
Client/Server Mode
Shell Integration
Feature Flags
License

Features

Store and retrieve — Save content with tags, retrieve by ID or tag
Automatic compression — LZ4, GZip, BZip2, XZ, ZStd support
Metadata plugins — Auto-extract file type, digests, hostname, user info, and more
Filters — Apply transformations (head, tail, grep, strip ANSI) on retrieval
Querying — List, search, diff items with flexible formatting
Client/server architecture — Optional HTTP server with streaming support
Modular design — Extensible plugin system for compression, metadata, and filtering

Installation

From Source

Requires Rust and Cargo.

cargo build --release

Install via Cargo

cargo install --path .

Static Binary (Linux)

./build-static.bash
# Binary at bin/keep

Build with Server/Client Features

# Server only
cargo build --release --features server

# Client only (for connecting to a remote keep server)
cargo build --release --features client

# Server + client + Swagger UI
cargo build --release --features server,client,swagger

Quick Start

# Save content with a tag
echo "Hello, world!" | keep --save greeting

# Retrieve by tag
keep --get greeting

# List all stored items
keep --list

# Get item details
keep --info greeting

# Delete by tag
keep --delete greeting

Real-World Examples

# Save API response
curl -s https://api.github.com/repos/user/repo | keep --save repo-info

# Save test output with metadata
npm test 2>&1 | keep --save test-results --meta project=myapp --meta env=staging

# Chain commands: process and store
cat data.csv | sort | uniq | keep --save cleaned-data

# Diff two versions
keep --diff 1 5

# Get first 20 lines of an item
keep --get 1 --filters "head_lines(20)"

# List items from a specific project
keep --list --meta project=myapp

Usage

Save Mode

Save stdin content with tags and metadata.

# Save (auto-assigned ID, no tag)
echo "data" | keep --save

# Save with a tag
echo "data" | keep --save my-tag

# Save with a tag and metadata
cat report.pdf | keep --save report --meta project=alpha --meta env=prod

# Specify compression and digest algorithm
echo "data" | keep --save my-tag --compression gzip --digest sha256

Tags and metadata make items easy to find later. Tags are simple identifiers; metadata is key-value pairs.

Get Mode

Retrieve items by ID or tag. This is the default mode when IDs or tags are given without a mode flag.

# Get by ID
keep --get 1
keep 1

# Get by tag
keep --get my-tag
keep my-tag

# Get with filters applied
keep --get 1 --filters "head_lines(10)"

# Get by metadata filter
keep --get --meta project=alpha

# Force binary output to TTY (override safety check)
keep --get 1 --force

List Mode

List stored items with filtering and formatting.

# List all items
keep --list

# List by tag
keep --list my-tag

# Filter by metadata
keep --list --meta env=prod

# Custom column format
keep --list --list-format "id,time,size,tags"

# JSON output for scripting
keep --list --output-format json

# Human-readable file sizes
keep --list --human-readable

Info Mode

Show detailed information about an item.

keep --info 1
keep --info my-tag
keep --info --meta key=value

Update Mode

Update an item's tags and metadata.

# Replace tags
keep --update 1 new-tag

# Update metadata
keep --update 1 --meta key=newvalue

# Remove a metadata key
keep --update 1 --meta key

Delete Mode

Delete items by ID.

keep --delete 1
keep --delete 1 2 3

Diff Mode

Show differences between two items.

keep --diff 1 2

Status Mode

Show system status and supported features.

keep --status
keep --status-plugins
keep --status --verbose

Filters

Apply transformations to item content during retrieval. Filters are chained with |.

# First 10 lines
keep --get 1 --filters "head_lines(10)"

# Skip first 5 lines, then grep for errors
keep --get 1 --filters "skip_lines(5)|grep(pattern=error)"

# Strip ANSI escape codes
keep --get 1 --filters "strip_ansi"

# Last 100 bytes
keep --get 1 --filters "tail_bytes(100)"

# Complex chain
keep --get 1 --filters "skip_lines(10)|grep(pattern=TODO)|head_lines(5)"

Available Filters

Filter	Description	Parameters
`head_bytes(n)`	First n bytes	`count`
`head_lines(n)`	First n lines	`count`
`tail_bytes(n)`	Last n bytes	`count`
`tail_lines(n)`	Last n lines	`count`
`skip_bytes(n)`	Skip first n bytes	`count`
`skip_lines(n)`	Skip first n lines	`count`
`grep(pattern)`	Filter matching lines	`pattern` (regex)
`strip_ansi`	Remove ANSI escape codes	none

Set KEEP_FILTERS to apply a default filter chain to all retrievals.
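
To make the chaining semantics concrete, here is a minimal Python sketch of how a chain such as "skip_lines(5)|grep(pattern=error)" can be read left to right, with each filter consuming the output of the previous one. It is not keep's implementation: `parse_filter` and `apply_chain` are hypothetical names, only four of the filters are modelled, and the ANSI pattern is an approximation.

```python
import re
from itertools import islice

# Approximate pattern for common ANSI CSI sequences such as "\x1b[31m".
ANSI_CSI = re.compile(r"\x1b\[[0-9;?]*[A-Za-z]")


def parse_filter(spec):
    """Turn one spec, e.g. 'head_lines(10)' or 'grep(pattern=error)', into a function."""
    match = re.fullmatch(r"(\w+)(?:\((.*)\))?", spec.strip())
    if match is None:
        raise ValueError(f"malformed filter: {spec!r}")
    name, arg = match.group(1), match.group(2)
    if name == "strip_ansi":
        return lambda lines: (ANSI_CSI.sub("", line) for line in lines)
    if arg is None:
        raise ValueError(f"{name} needs an argument")
    if name == "head_lines":
        count = int(arg)
        return lambda lines: islice(lines, count)
    if name == "skip_lines":
        count = int(arg)
        return lambda lines: islice(lines, count, None)
    if name == "grep":
        # Accept both grep(error) and grep(pattern=error).
        text = arg[len("pattern="):] if arg.startswith("pattern=") else arg
        regex = re.compile(text)
        return lambda lines: (line for line in lines if regex.search(line))
    raise ValueError(f"unsupported filter in this sketch: {name}")


def apply_chain(chain, lines):
    """Apply filters left to right; each stage is lazy until the final list()."""
    stream = iter(lines)
    # Naive split: a regex that itself contains "|" would need real parsing.
    for spec in chain.split("|"):
        stream = parse_filter(spec)(stream)
    return list(stream)
```

Because every stage is a lazy iterator, a chain like this touches only as many lines as the final `head_lines` needs, which is the property that makes filter chains cheap on large items.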

Compression

Items are compressed automatically on save. Default: LZ4.

Algorithm	Type	Speed	Ratio
`lz4`	Internal	Fastest	Lower
`gzip`	Internal	Fast	Good
`bzip2`	External	Slow	Better
`xz`	External	Slowest	Best
`zstd`	External	Fast	Good
`none`	Internal	N/A	N/A

# Specify compression per item
echo "data" | keep --save my-tag --compression zstd

# Set default via environment
export KEEP_COMPRESSION=gzip

External compression programs (bzip2, xz, zstd) must be installed on the system.
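
The speed and ratio columns above are qualitative. A rough way to see the trade-off on your own data is to time a few codecs directly. The sketch below uses only Python's standard library, so it covers gzip, bzip2 and xz but not LZ4 or ZStd, and it says nothing about how keep itself invokes these algorithms. `compare` is a hypothetical helper.

```python
import bz2
import gzip
import lzma
import time

# name -> (compress, decompress); lzma.compress defaults to the .xz format.
CODECS = {
    "gzip": (gzip.compress, gzip.decompress),
    "bzip2": (bz2.compress, bz2.decompress),
    "xz": (lzma.compress, lzma.decompress),
}


def compare(data):
    """Return compressed/original ratio and wall-clock seconds per codec."""
    results = {}
    for name, (compress, decompress) in CODECS.items():
        start = time.perf_counter()
        packed = compress(data)
        elapsed = time.perf_counter() - start
        if decompress(packed) != data:
            raise RuntimeError(f"{name} round trip failed")
        results[name] = {"ratio": len(packed) / len(data), "seconds": elapsed}
    return results


if __name__ == "__main__":
    sample = b"2026-03-14 build step finished without errors\n" * 5000
    for name, stats in compare(sample).items():
        print(f"{name:6} ratio={stats['ratio']:.4f} seconds={stats['seconds']:.4f}")
```

Results depend heavily on the input; highly repetitive logs compress far better than already-compressed or random data.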

Meta Plugins

Metadata is automatically extracted when saving items.

Plugin	Key	Description
`env`	`*`	Capture `KEEP_META_*` environment variables
`magic_file`	`file_type`	File type detection (requires `magic` feature)
`text`	`text_line_count`, `text_word_count`	Line and word counts
`user`	`uid`, `user`, `gid`, `group`	Current user info
`shell`	`shell`	Current shell path
`shell_pid`	`shell_pid`	Shell process ID
`keep_pid`	`keep_pid`	Keep process ID
`digest`	`digest_sha256`, `digest_md5`	Content digests
`read_time`	`read_time`	Time to read content
`read_rate`	`read_rate`	Data read rate
`hostname`	`hostname`, `hostname_short`	System hostname
`exec`	Custom	Run external commands for metadata
`cwd`	`cwd`	Current working directory

# Use specific plugins
echo "data" | keep --save tag --meta-plugins "digest,text,user"

# Capture custom metadata via environment (set the variable on keep, not on echo)
echo "data" | KEEP_META_project=alpha keep --save tag

# Combine environment and CLI metadata
echo "data" | KEEP_META_build=1234 keep --save tag --meta env=staging
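
As an illustration of what the `env` plugin row describes, the following Python sketch collects `KEEP_META_*` variables into a metadata dictionary and merges them with `--meta key=value` pairs. It is a sketch under assumptions, not keep's code: the function names are hypothetical, and letting CLI values override environment values on a key collision is an assumption the README does not confirm.

```python
import os

PREFIX = "KEEP_META_"


def collect_env_meta(environ=None):
    """Map KEEP_META_<key>=<value> variables to {key: value}."""
    environ = os.environ if environ is None else environ
    return {
        name[len(PREFIX):]: value
        for name, value in environ.items()
        if name.startswith(PREFIX) and len(name) > len(PREFIX)
    }


def merge_meta(env_meta, cli_pairs):
    """Merge 'key=value' strings from the CLI over environment metadata.

    Assumption: CLI pairs win when the same key appears in both places.
    """
    merged = dict(env_meta)
    for pair in cli_pairs:
        key, _, value = pair.partition("=")
        merged[key] = value
    return merged
```

Note that the variable has to be in the environment of the keep process itself, which is why it is set on the right-hand side of the pipe rather than on the command that produces the data.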

Configuration

Environment Variables

Variable	Description	Default
`KEEP_DIR`	Storage directory	`~/.keep`
`KEEP_CONFIG`	Config file path	`~/.config/keep/config.yml`
`KEEP_COMPRESSION`	Compression algorithm	`lz4`
`KEEP_META_PLUGINS`	Meta plugins to use	`env`
`KEEP_FILTERS`	Default filter chain	none
`KEEP_LIST_FORMAT`	List column format	built-in defaults
`KEEP_SERVER_ADDRESS`	Server bind address	`127.0.0.1`
`KEEP_SERVER_PORT`	Server port	`21080`
`KEEP_SERVER_USERNAME`	Server Basic auth username	`keep`
`KEEP_SERVER_PASSWORD`	Server password	none
`KEEP_SERVER_PASSWORD_HASH`	Server password hash	none
`KEEP_SERVER_JWT_SECRET`	JWT secret for token auth	none
`KEEP_SERVER_JWT_SECRET_FILE`	Path to JWT secret file	none
`KEEP_SERVER_MAX_BODY_SIZE`	Maximum POST body size in bytes (0=unlimited)	unlimited
`KEEP_SERVER_CERT`	TLS certificate file path (PEM)	none
`KEEP_SERVER_KEY`	TLS private key file path (PEM)	none
`KEEP_CLIENT_URL`	Remote keep server URL	none
`KEEP_CLIENT_USERNAME`	Remote server username	`keep`
`KEEP_CLIENT_PASSWORD`	Remote server password	none
`KEEP_CLIENT_JWT`	JWT token for remote server	none

Any config setting can be overridden with KEEP__<SETTING> environment variables (double underscore separator).

Configuration File

Default location: ~/.config/keep/config.yml

Generate a default configuration:

keep --generate-config > ~/.config/keep/config.yml

# Storage directory
dir: ~/.keep

# List view columns
list_format:
  - name: id
    label: "Item"
    align: right
  - name: time
    label: "Time"
    align: right
  - name: size
    label: "Size"
    align: right
  - name: tags
    label: "Tags"
    align: left

# Table styling
table_config:
  style: utf8_full
  content_arrangement: dynamic

# Default compression
compression_plugin:
  name: gzip

# Default meta plugins
meta_plugins:
  - name: env
  - name: digest
    options:
      algorithm: sha256

# Server settings
server:
  address: "127.0.0.1"
  port: 21080
  username: "keep"
  password: "secret"
  # Maximum POST body size in bytes (0 = unlimited)
  # max_body_size: 52428800  # 50 MB
  # JWT authentication (takes priority over password)
  # jwt_secret: "my-secret-key"
  # jwt_secret_file: /path/to/jwt_secret
  # TLS (requires tls feature)
  # cert_file: /path/to/cert.pem
  # key_file: /path/to/key.pem

# Client settings
client:
  url: "http://localhost:21080"
  username: "keep"
  password: "secret"
  # Or use JWT token
  # jwt: "eyJhbGciOiJIUzI1NiIs..."

human_readable: true
quiet: false
force: false

Client/Server Mode

Keep supports a client/server architecture where one machine runs a keep server and other machines connect as clients. This is useful for:

Centralizing stored data across multiple machines
Sharing items between team members
Offloading storage to a dedicated server
Piping data from long-running processes without local storage

Server Mode

Start an HTTP REST API server:

# Default: 127.0.0.1:21080
keep --server

# Custom address and port
keep --server --server-address 0.0.0.0 --server-port 8080

# With password authentication
keep --server --server-password mypassword

# With custom username
keep --server --server-username admin --server-password mypassword

# With JWT authentication
keep --server --server-jwt-secret my-secret-key

JWT Authentication

JWT (JSON Web Token) authentication provides permission-based access control. When a JWT secret is configured, the server validates tokens and checks permission claims for each request.

Configuration:

# Via CLI flag
keep --server --server-jwt-secret my-secret-key

# Via environment variable
export KEEP_SERVER_JWT_SECRET=my-secret-key
keep --server

# Via config file (config.yml)
server:
  jwt_secret: "my-secret-key"

# Via secret file (for Docker/secrets management)
keep --server --server-jwt-secret-file /path/to/secret

Token format:

JWTs must use HS256 algorithm with the following claims:

Claim	Type	Required	Description
`sub`	string	Yes	Subject (client identifier)
`exp`	number	Yes	Expiration time (Unix timestamp)
`read`	boolean	No	Permission for GET requests (default: false)
`write`	boolean	No	Permission for POST/PUT/PATCH requests (default: false)
`delete`	boolean	No	Permission for DELETE requests (default: false)

Permission mapping:

HTTP Method	Required Permission
`GET`	`read`
`POST`, `PUT`, `PATCH`	`write`
`DELETE`	`delete`

Example token payload:

{
  "sub": "ci-pipeline",
  "exp": 1735689600,
  "read": true,
  "write": true,
  "delete": false
}

Generating tokens:

The server does not generate tokens — use any JWT library or tool:

# Using jwt-cli (https://github.com/mike-engel/jwt-cli)
jwt encode --secret my-secret-key \
  --exp=$(date -d '+24 hours' +%s) \
  '{"sub":"my-client","read":true,"write":true,"delete":false}'

# Using Python
python3 -c "
import jwt, time
token = jwt.encode({
    'sub': 'my-client',
    'exp': int(time.time()) + 86400,
    'read': True, 'write': True, 'delete': False
}, 'my-secret-key', algorithm='HS256')
print(token)
"

Using tokens:

# With curl
curl -H "Authorization: Bearer <jwt-token>" http://localhost:21080/api/item/

# The keep client uses --client-jwt for JWT tokens
keep --client-url http://server:21080 --client-jwt <jwt-token> --save my-tag

Response codes:

Code	Meaning
`200`	Authorized
`401`	Missing, invalid, or expired token
`403`	Valid token but insufficient permissions

Notes:

When jwt_secret is set, password authentication is disabled: every request must present a valid JWT Bearer token, even if a password is also configured
Permission fields default to false if omitted — tokens must explicitly grant permissions
JWT authentication requires the server feature (jsonwebtoken is included automatically)
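
The token format, permission mapping and response codes above fit together roughly as follows. This is a standard-library Python sketch of HS256 signing and a permission check, written to mirror the tables in this section. It is not the server's code (the server is described as using jsonwebtoken), `encode_hs256` and `authorize` are hypothetical names, and returning 403 for methods missing from the mapping is an assumption.

```python
import base64
import hashlib
import hmac
import json
import time

# HTTP method -> claim that must be true, per the permission mapping table.
REQUIRED = {"GET": "read", "POST": "write", "PUT": "write",
            "PATCH": "write", "DELETE": "delete"}


def _b64url(data):
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text):
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(secret, signing_input):
    return hmac.new(secret.encode(), signing_input.encode(), hashlib.sha256).digest()


def encode_hs256(claims, secret):
    """Build header.payload.signature with an HMAC-SHA256 signature."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url(json.dumps(claims).encode())
    signature = _sign(secret, f"{header}.{payload}")
    return f"{header}.{payload}.{_b64url(signature)}"


def authorize(token, secret, method, now=None):
    """Return 200, 401 or 403 following the response code table."""
    now = time.time() if now is None else now
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        claims = json.loads(_b64url_decode(payload_b64))
        signature = _b64url_decode(sig_b64)
    except (ValueError, AttributeError):
        return 401  # missing or malformed token
    if not isinstance(header, dict) or header.get("alg") != "HS256":
        return 401
    if not hmac.compare_digest(signature, _sign(secret, f"{header_b64}.{payload_b64}")):
        return 401  # wrong secret or tampered token
    if not isinstance(claims, dict) or not isinstance(claims.get("sub"), str):
        return 401
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or exp <= now:
        return 401  # expired or missing expiry
    needed = REQUIRED.get(method.upper())
    # Permission claims default to false, so only an explicit true grants access.
    if needed is None or claims.get(needed) is not True:
        return 403
    return 200
```

A token carrying only `"read": true` therefore passes a GET but gets 403 on POST or DELETE, which is the behaviour the notes above describe for omitted permission fields.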

HTTPS / TLS

Build with the tls feature to enable HTTPS:

cargo build --release --features server,tls

Provide a TLS certificate and private key (both PEM format):

# Via CLI flags
keep --server \
  --server-cert /path/to/cert.pem \
  --server-key /path/to/key.pem

# Via environment variables
export KEEP_SERVER_CERT=/path/to/cert.pem
export KEEP_SERVER_KEY=/path/to/key.pem
keep --server

# Via config file (config.yml)
server:
  cert_file: /path/to/cert.pem
  key_file: /path/to/key.pem

When cert and key are provided, the server listens with HTTPS. Without them, it falls back to plain HTTP. The port is controlled by --server-port (default: 21080).

Self-signed certificates (for development):

# Generate a self-signed cert
openssl req -x509 -newkey rsa:4096 -keyout key.pem -out cert.pem \
  -days 365 -nodes -subj "/CN=localhost"

# Start server with self-signed cert
keep --server --server-cert cert.pem --server-key key.pem

# Connect client with HTTPS
keep --client-url https://localhost:21080 --save my-tag

The server accepts data from both dumb clients (raw HTTP/curl) and smart clients (the keep CLI).

Server Streaming

The server streams all data through fixed-size buffers (8192 bytes). At no point is the entire file content held in memory.

POST: Body streams through the compression and storage pipeline in chunks. When max_body_size is exceeded, the server returns 413 PAYLOAD_TOO_LARGE while keeping the partial item already saved through the pipeline.
GET: Content streams from disk through decompression to the client using the same fixed-size buffers.
Diff: Individual items are capped at 10 MB for the diff endpoint to prevent unbounded memory use.
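
The POST behaviour described above can be sketched in a few lines of Python: read the body in fixed-size chunks, push each chunk through a compressor into storage, and stop with 413 once the limit is crossed while leaving what was already written in place. This is an illustration only. `save_stream` is a hypothetical function, zlib stands in for whichever compression plugin is configured, the 200 success code is an assumption, and cutting off at a chunk boundary is an assumption about where the partial item ends.

```python
import io
import zlib

PIPESIZE = 8192  # buffer size named in the text


def save_stream(body, sink, max_body_size=0):
    """Stream body -> compressor -> sink; return (status, bytes_accepted).

    max_body_size == 0 means unlimited. On overflow the data already
    written to sink is kept and the compressed stream is still finalized.
    """
    compressor = zlib.compressobj()
    accepted = 0
    status = 200
    while True:
        chunk = body.read(PIPESIZE)
        if not chunk:
            break
        if max_body_size and accepted + len(chunk) > max_body_size:
            status = 413  # payload too large; the partial item stays stored
            break
        accepted += len(chunk)
        sink.write(compressor.compress(chunk))
    sink.write(compressor.flush())
    return status, accepted


if __name__ == "__main__":
    stored = io.BytesIO()
    print(save_stream(io.BytesIO(b"x" * 20000), stored, max_body_size=10000))
```

Only one chunk and the compressor's internal state are held at any time, so memory use does not grow with the size of the upload.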

Max Body Size

Control the maximum accepted body size with:

# Via CLI flag (bytes)
keep --server --server-max-body-size 52428800

# Via environment variable
export KEEP_SERVER_MAX_BODY_SIZE=52428800
keep --server

# Via config file (config.yml)
server:
  max_body_size: 52428800  # 50 MB

When set to 0 or omitted, no limit is enforced.

Server Query Parameters

The server supports query parameters that control processing:

Parameter	Default	Description
`tags`	none	Comma-separated tags
`metadata`	none	JSON-encoded metadata
`compress`	`true`	`false` = client already compressed, store as-is
`meta`	`true`	`false` = client handles metadata, skip server-side plugins
`decompress`	`true`	`false` = return raw compressed bytes on GET

When using a smart client, these are set automatically. For curl, the server handles everything by default.
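
For clients other than curl or the keep CLI, the parameters in the table need to be URL-encoded, and `metadata` needs to be JSON first. The Python sketch below builds such a URL with the standard library. `item_url` is a hypothetical helper; the parameter names and the `/api/item/` path come from this README, and leaving out parameters that are at their default is a choice made here.

```python
import json
from urllib.parse import urlencode


def item_url(base, tags=(), metadata=None, compress=True, meta=True):
    """Build the POST URL for creating an item."""
    params = {}
    if tags:
        params["tags"] = ",".join(tags)  # comma-separated, per the table
    if metadata:
        params["metadata"] = json.dumps(metadata)  # JSON-encoded, per the table
    if not compress:
        params["compress"] = "false"  # body is already compressed
    if not meta:
        params["meta"] = "false"  # skip server-side meta plugins
    query = urlencode(params)
    url = f"{base.rstrip('/')}/api/item/"
    return f"{url}?{query}" if query else url


if __name__ == "__main__":
    print(item_url("http://localhost:21080", tags=["build", "ci"],
                   metadata={"project": "myapp"}))
```

Percent-encoding matters here because the JSON value contains characters such as quotes and braces that are not valid in a raw query string.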

Example: Curl as a Dumb Client

# Save (server handles compression and metadata)
curl -X POST -d "my data" "http://localhost:21080/api/item/?tags=my-tag"

# Retrieve (server decompresses)
curl http://localhost:21080/api/item/1/content

# Save compressed (client handles compression, server skips); --data-binary sends the gzip bytes unmodified
gzip -c data.txt | curl -X POST --data-binary @- "http://localhost:21080/api/item/?compress=false&tags=my-tag"

Client Mode

The keep CLI can connect to a remote server as a smart client. Build with the client feature:

cargo build --release --features client

# Set server URL via flag or environment
keep --client-url http://server:21080 --save my-tag
export KEEP_CLIENT_URL=http://server:21080

# With password authentication
keep --client-url http://server:21080 --client-password mypassword --save my-tag
export KEEP_CLIENT_PASSWORD=mypassword

# With custom username
keep --client-url http://server:21080 --client-username admin --client-password mypassword --save my-tag

# With JWT authentication
keep --client-url http://server:21080 --client-jwt <jwt-token> --save my-tag
export KEEP_CLIENT_JWT=<jwt-token>

How Client Mode Works

Client mode uses local plugins and remote storage:

Save: Local compression and metadata plugins run on the client; compressed data streams to the server
Get: Server sends raw compressed data; client decompresses locally and applies filters
Other operations (list, info, delete, diff): Delegated directly to the server

This means client behavior is consistent with local mode — the same compression settings and filters apply.

Streaming Architecture

Client save uses a 3-thread streaming pipeline for constant memory usage regardless of data size:

┌──────────────┐     OS pipe      ┌────────────────┐
│ Reader thread ├──────────────────┤ Streamer thread│
│              │  (compressed     │                │
│ stdin → tee  │   bytes)         │ pipe → POST    │
│    → hash    │                  │   (chunked)    │
│    → compress│                  │                │
└──────────────┘                  └────────────────┘
        │                                │
        ▼                                ▼
    stdout +                    Server stores blob
    SHA-256 digest

Reader thread: Reads stdin, tees output to stdout, computes SHA-256, compresses data, writes to OS pipe
Streamer thread: Reads compressed bytes from pipe, streams to server via chunked HTTP POST
Main thread: After streaming completes, sends computed metadata (digest, hostname, size) to server

Memory usage is O(PIPESIZE) — typically 8 KB — regardless of how much data is being stored.
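
The reader/streamer split can be sketched with two Python threads joined by an OS pipe. The sketch follows the roles listed above (tee, hash, compress on one side; upload on the other), but it is an analogy rather than keep's Rust code: `stream_save` is a hypothetical function, zlib stands in for the configured compressor, and `upload` is any callable that receives compressed chunks in place of a chunked HTTP POST.

```python
import hashlib
import io
import os
import threading
import zlib

PIPESIZE = 8192


def stream_save(source, tee, upload):
    """Run the reader and streamer concurrently; return digest and size."""
    read_fd, write_fd = os.pipe()
    result = {}

    def reader():
        digest = hashlib.sha256()
        compressor = zlib.compressobj()
        size = 0
        # Closing the write end on exit signals EOF to the streamer.
        with os.fdopen(write_fd, "wb") as pipe_out:
            while True:
                chunk = source.read(PIPESIZE)
                if not chunk:
                    break
                tee.write(chunk)        # pass the data through unchanged
                digest.update(chunk)    # hash the uncompressed bytes
                size += len(chunk)
                pipe_out.write(compressor.compress(chunk))
            pipe_out.write(compressor.flush())
        result["sha256"] = digest.hexdigest()
        result["size"] = size

    def streamer():
        with os.fdopen(read_fd, "rb") as pipe_in:
            while True:
                chunk = pipe_in.read(PIPESIZE)
                if not chunk:
                    break
                upload(chunk)           # stand-in for one chunk of the POST body

    threads = [threading.Thread(target=reader), threading.Thread(target=streamer)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    # The "main thread" step: metadata is available only after streaming ends.
    return result


if __name__ == "__main__":
    sent = []
    info = stream_save(io.BytesIO(b"log line\n" * 10000), io.BytesIO(), sent.append)
    print(info["size"], info["sha256"][:16], sum(len(c) for c in sent))
```

The pipe provides back-pressure: when the uploader is slow, the reader blocks on its write instead of buffering more data, which is what keeps memory bounded.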

Example: Remote Pipeline

# On a build server, pipe logs to a central keep server
make build 2>&1 | keep --client-url http://logserver:21080 \
  --save build-logs \
  --meta project=myapp \
  --meta branch=$(git branch --show-current)

# Retrieve from any machine
keep --client-url http://logserver:21080 --get build-logs

# List recent builds from a specific project
keep --client-url http://logserver:21080 --list --meta project=myapp

API Endpoints

Method	Path	Description
`GET`	`/api/status`	System status
`GET`	`/api/plugins/status`	Plugin status
`GET`	`/api/item/`	List items (`tags`, `order`, `start`, `count` params)
`POST`	`/api/item/`	Create item (body: raw content, params: `tags`, `metadata`, `compress`, `meta`)
`GET`	`/api/item/latest/content`	Latest item content
`GET`	`/api/item/latest/meta`	Latest item metadata
`GET`	`/api/item/{id}`	Item info by ID
`GET`	`/api/item/{id}/content`	Item content by ID
`GET`	`/api/item/{id}/meta`	Item metadata by ID
`GET`	`/api/item/{id}/info`	Item info by ID
`POST`	`/api/item/{id}/meta`	Add metadata to existing item (body: JSON object)
`DELETE`	`/api/item/{id}`	Delete item by ID
`GET`	`/api/diff`	Diff two items (`id_a`, `id_b` params)

Authentication

The server supports three authentication modes:

1. Password (HTTP Basic auth):

# Default username is "keep"
curl -u keep:mypassword http://localhost:21080/api/status

# Custom username
curl -u admin:mypassword http://localhost:21080/api/status

2. JWT (permission-based):

# Valid JWT with read permission allows GET requests
curl -H "Authorization: Bearer <jwt-token>" http://localhost:21080/api/item/

See JWT Authentication for token format and configuration.

3. No authentication:

When neither password nor JWT secret is configured, authentication is disabled.
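
For scripted clients, the two authenticated modes differ only in the `Authorization` header. The Python sketch below builds both header forms and attaches one to a request for `/api/status` without sending it. The helper names are hypothetical; the Basic scheme (base64 of `username:password`) and the Bearer scheme are standard HTTP.

```python
import base64
import urllib.request


def basic_auth_header(username, password):
    """HTTP Basic: base64("username:password")."""
    raw = f"{username}:{password}".encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}


def bearer_auth_header(jwt_token):
    """JWT mode: the token is passed as a Bearer credential."""
    return {"Authorization": f"Bearer {jwt_token}"}


def status_request(base_url, auth_header=None):
    """Build (but do not send) a GET request for the status endpoint."""
    url = f"{base_url.rstrip('/')}/api/status"
    return urllib.request.Request(url, headers=auth_header or {})
```

Passing the result of `status_request` to `urllib.request.urlopen` would perform the call. Basic credentials are only encoded, not encrypted, so they should travel over HTTPS on untrusted networks.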

Swagger UI

Build with the swagger feature to enable OpenAPI documentation:

cargo build --features server,swagger

Swagger UI is available at /swagger, and the OpenAPI spec at /openapi.json.

Security

The server applies the following security measures:

Input validation: Item IDs are validated as positive integers; tags and metadata have length limits (256 and 128 characters respectively).
XSS protection: All user-controlled data rendered into HTML pages is escaped.
Security headers: Responses include X-Content-Type-Options: nosniff, X-Frame-Options: DENY, and Referrer-Policy: strict-origin-when-cross-origin.
CORS: Explicit allowed headers (Content-Type, Authorization, Accept); no wildcard headers.
Path traversal: Item IDs are validated to prevent directory traversal attacks.
Internal errors: Internal error details are never exposed in HTML responses — only generic messages are shown.
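
Three of these measures are easy to show in miniature: strict ID parsing, escaping before HTML output, and a fixed set of response headers. The Python sketch below is illustrative only. The function names are hypothetical, the header values and the 256-character tag limit are taken from the list above, and the real server does this in Rust (the commit notes mention the html_escape crate).

```python
import html

# Header values as listed in the Security section.
SECURITY_HEADERS = {
    "X-Content-Type-Options": "nosniff",
    "X-Frame-Options": "DENY",
    "Referrer-Policy": "strict-origin-when-cross-origin",
}
MAX_TAG_LEN = 256


def parse_item_id(raw):
    """Accept only positive decimal integers, so '../x' can never reach a path."""
    if not (raw.isascii() and raw.isdigit()):
        raise ValueError(f"invalid item id: {raw!r}")
    value = int(raw)
    if value <= 0:
        raise ValueError("item id must be positive")
    return value


def validate_tag(tag):
    """Reject empty or over-long tags."""
    if not tag or len(tag) > MAX_TAG_LEN:
        raise ValueError("tag must be 1 to 256 characters")
    return tag


def render_tag(tag):
    """Escape user-controlled text before placing it in HTML."""
    return f'<span class="tag">{html.escape(tag)}</span>'


def with_security_headers(headers):
    """Return a copy of the response headers with the security headers added."""
    merged = dict(headers)
    merged.update(SECURITY_HEADERS)
    return merged
```

Validating the ID as an integer before it is used anywhere is what makes the path traversal item above follow from the input validation item.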

Shell Integration

Source profile.bash to enable shell integration:

source /path/to/keep/profile.bash

This provides:

keep function — Captures the current command in metadata automatically
@ alias — Shorthand for keep --save
@@ alias — Shorthand for keep --get

# Save with automatic command capture
curl -s api.example.com | @ api-response

# Quick retrieve
@@ api-response

Feature Flags

Feature	Default	Description
`magic`	Yes	File type detection via libmagic
`lz4`	Yes	LZ4 compression (internal)
`gzip`	Yes	GZip compression (internal)
`server`	No	HTTP REST API server
`tls`	No	HTTPS/TLS server support (requires `server`)
`client`	No	HTTP client for remote server
`swagger`	No	Swagger UI for API docs
`bzip2`	No	BZip2 compression (external program)
`xz`	No	XZ compression (external program)
`zstd`	No	ZStd compression (external program)

# Server with Swagger UI
cargo build --features server,swagger

# Server with HTTPS
cargo build --features server,tls

# Client only
cargo build --features client

# All server and client features (external compressors not included)
cargo build --features server,tls,client,swagger,magic

License

MIT License - see LICENSE for details.

Contact

Andrew Phillips - andrew@gt0.ca
