- Plugin features now use type_ prefix (meta_magic, filter_grep, etc.)
- Added meta_all_musl and filter_all_musl for MUSL-compatible builds
- grep filter plugin made optional via filter_grep feature flag
- Removed regex crate from grep-related code, uses strip_prefix instead
- Updated CHANGELOG.md with breaking change documentation
Fixed:
- CLI help typo: "metatdata" -> "metadata"
- Filter buffer OOM: check size before loading into memory
Changed:
- #[inline] on HTML escape helpers for hot path performance
- Replaced once_cell and lazy_static with std::sync::LazyLock
- Removed unused once_cell and lazy_static crate dependencies
Refactored:
- Added module-level doc to services/ module
Documentation:
- README.md: zstd is native not external, "none" -> "raw"
- DESIGN.md: current schema and meta plugins section
- CHANGELOG.md: Unreleased section populated
- Add infer crate as meta plugin for MIME type detection
- Add tree_magic_mini crate as alternative meta plugin for MIME type detection
- Add zstd, infer, tree_magic_mini to default features
- Fix static build script to use musl target instead of glibc+crt-static
- Remove hardcoded shell list from --generate-completion help text
- Fix update() in both new plugins to emit MIME metadata when buffer fills
- Add SaveMetaFn callback pattern: meta plugins receive a closure instead of
&Connection, enabling the same plugin code to work in local, client, and
server contexts (collect-to-Vec, collect-to-HashMap, or direct DB write)
- Client save now runs meta plugins locally during streaming (smart client
sets meta=false, server skips its own plugins)
- Add POST /api/item/{id}/update endpoint for re-running plugins on stored
content without downloading compressed data
- Add client update mode (--update with --meta-plugin flags)
- Extract shared utilities: stream_copy, print_serialized, build_path_table,
ensure_default_tag to reduce duplication across modes
- Add upsert_tag for idempotent tag addition (INSERT OR IGNORE)
- Add warn logging on save_meta lock failure in BaseMetaPlugin and MetaService
Major overhaul of server architecture and security posture:
- Streaming: Unified all I/O through PIPESIZE (8192-byte) buffers.
POST bodies stream via MpscReader through the save pipeline. GET
content streams from disk via decompression to client. Removed
save_item_with_reader, get_item_content_info, ChannelReader.
413 responses keep partial items (nonfatal by design).
- Security: XSS protection in all HTML pages via html_escape crate.
Security headers middleware (nosniff, frame deny, referrer policy).
CORS tightened to explicit headers. Input validation for tags
(256 chars), metadata (128/4096), pagination (10k cap). Config
file reads use from_utf8_lossy. Generic error messages in HTML.
Diff endpoint has 10 MB per-item cap. max_body_size config option.
- Panics eliminated: Path unwraps → proper error propagation.
Mutex unwraps → map_err (registries) / expect with message (local).
- MCP removed: Deleted all MCP code, rmcp dependency, mcp feature.
- Docs: Updated README, DESIGN, AGENTS to reflect all changes.
- Add plugin schema types and runtime discovery for meta/filter plugins
- Rewrite --generate-config to use schema system instead of hardcoded types
- Add Settings::validate_config() for startup validation
- Cache tokenizer instances via static Lazy to avoid repeated BPE loading
- Add split_by_token_iter() and count_bounded() to Tokenizer
- Fix double-counting bug in TokensMetaPlugin when buffer < max_buffer_size
- Eliminate unnecessary allocations in token count methods
- Refactor token filters: remove Option<Tokenizer>, use iterator API
- Fix TailTokensFilter correctness: unbounded buffer instead of ring buffer
- Add encoding option to all token filters
- Add description() to MetaPlugin and FilterPlugin traits
- Fix unused_mut warning in compression engine (feature-gated code)
Co-Authored-By: code-review-bot <noreply@anthropic.com>
Add tiktoken-based token counting via new 'tokens' feature flag.
New components:
- Shared tokenizer module wrapping tiktoken CoreBPE (cl100k_base, o200k_base)
- TokensMetaPlugin: streaming token counter, tokenizes each chunk independently
- head_tokens(N): stream first N tokens, split at exact boundary when mid-chunk
- skip_tokens(N): skip first N tokens, stream the rest
- tail_tokens(N): bounded ring buffer (~16KB), outputs last N tokens at finalize
All filters are fully streaming — no full-stream buffering.
Meta plugin accuracy: exact for normal text, ±1-2 tokens if long whitespace
sequence spans a chunk boundary.
Also: add 'client' and 'tokens' to default features, add curl to Dockerfile builder stage.
Security:
- Use constant-time password comparison (subtle crate) to prevent timing attacks
- Replace permissive CORS with configurable origin-restricted CORS
- Add TLS warning when password auth is used without HTTPS
Bug fixes:
- Convert MetaPlugin panics to anyhow::Result (get_meta_plugin, outputs_mut, options_mut)
- Replace item.id.unwrap() with proper error handling across 15 call sites
- Fix panic on unknown column type in list mode
- Fix conflicting PIPESIZE constant (was 8192 vs 65536, now unified to 8192)
- Add 256MB filter chain buffer limit to prevent OOM
- Gracefully skip unregistered plugins instead of panicking
Dead code removal:
- Delete unused filter parser files (filter_parser.rs, filter.pest, parser/ module)
- ~260 lines of dead PEG parser code removed
Code consolidation:
- Add is_content_binary_from_metadata() helper (was duplicated in 4 places)
- Simplify save_item_raw() to delegate to save_item_raw_streaming() (~90 lines removed)
Incomplete features:
- Populate filter_plugins in status output from global registry
- Add FallbackMagicFileMetaPlugin (was referenced but never implemented)
- Document init_plugins() as intentional no-op
Infrastructure:
- Add Dockerfile (static musl binary on scratch, 4.8MB)
- Add .dockerignore
- Add cors_origin to ServerConfig and config.rs