Compare commits


43 Commits

SHA1 Message Date
8379ae2136 refactor: rename plugin features with type prefix for consistency
- Plugin features now use type_ prefix (meta_magic, filter_grep, etc.)
- Added meta_all_musl and filter_all_musl for MUSL-compatible builds
- grep filter plugin made optional via filter_grep feature flag
- Removed regex use outside the grep plugin (strip_prefix instead), so the regex crate is pulled in only by filter_grep
- Updated CHANGELOG.md with breaking change documentation
2026-03-21 17:36:29 -03:00
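A minimal sketch of the std-only replacement: `split_once` and `strip_prefix` cover the kind of option parsing that previously used regex. The strings are illustrative, not keep's actual options.

```rust
fn main() {
    // Parse a "key=value" plugin option without the regex crate.
    let opt = "encoding=utf-8";
    match opt.split_once('=') {
        Some((key, value)) => println!("{key} -> {value}"),
        None => println!("bare key: {opt}"),
    }

    // Prefix matching likewise needs only std.
    assert_eq!("filter_grep".strip_prefix("filter_"), Some("grep"));
}
```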
12de215527 feat: feature-gate CLI args by server/client features
- CLI now shows only relevant options: --server and --server-* args
  hidden when built without 'server' feature; --client-* args hidden
  without 'client' feature. Run --help only displays applicable options.
- Removed verbose 'conflicts_with_all' from all mode args — clap's
  implicit group("mode") already enforces mutual exclusivity.
- 'server' feature now includes TLS/HTTPS by default (axum-server);
  'tls' feature removed. rustls already available via client/ureq.
- Gated KeepModes::Server, server mode detection, and server-password
  validation in main.rs.
- Gated server arg reads in config.rs.
- Removed redundant #[cfg(feature = "tls")] guards from server/mod.rs.
- Gated resolve_item_id/resolve_item_ids helpers in common.rs.
- All 4 feature combinations (server+client, server-only, client-only,
  neither) compile and pass tests.
2026-03-21 16:26:27 -03:00
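A minimal sketch of the gating technique, assuming a `server` Cargo feature on the consuming crate; the fields are illustrative, not keep's actual arguments. `#[cfg]` is resolved before the derive expands, so gated fields vanish from both the struct and `--help`.

```rust
use clap::Parser;

#[derive(Parser, Debug)]
struct Cli {
    /// Always present.
    #[arg(long)]
    verbose: bool,

    /// Only compiled in (and shown in --help) with --features server.
    #[cfg(feature = "server")]
    #[arg(long)]
    server: bool,

    #[cfg(feature = "server")]
    #[arg(long)]
    server_port: Option<u16>,
}

fn main() {
    println!("{:?}", Cli::parse());
}
```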
e2cb36d2a8 feat(server): add file_size to API ItemInfo response 2026-03-21 14:03:58 -03:00
0004324301 perf: pre-allocate status info collections with known capacities 2026-03-21 13:54:37 -03:00
b3edfe7de6 chore: code review cleanup — fixes, deps, docs
Fixed:
- CLI help typo: "metatdata" -> "metadata"
- Filter buffer OOM: check size before loading into memory

Changed:
- #[inline] on HTML escape helpers for hot path performance
- Replaced once_cell and lazy_static with std::sync::LazyLock
- Removed unused once_cell and lazy_static crate dependencies

Refactored:
- Added module-level doc to services/ module

Documentation:
- README.md: zstd is native not external, "none" -> "raw"
- DESIGN.md: current schema and meta plugins section
- CHANGELOG.md: Unreleased section populated
2026-03-21 11:44:37 -03:00
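The `LazyLock` swap in sketch form; the regex in the comment is a stand-in for whatever static each crate previously initialized.

```rust
use std::sync::LazyLock;

// Before: lazy_static! { static ref RE: Regex = ...; }
//     or: static RE: once_cell::sync::Lazy<Regex> = Lazy::new(|| ...);
// After, with no external crate (in std since Rust 1.80):
static GREETING: LazyLock<String> = LazyLock::new(|| {
    // One-time initialization, run on first dereference.
    format!("hello from pid {}", std::process::id())
});

fn main() {
    println!("{}", *GREETING); // initializes here
    println!("{}", *GREETING); // reuses the cached value
}
```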
ab2fb07505 docs: add changelog update instructions to AGENTS.md 2026-03-21 10:56:43 -03:00
547f0b5d11 docs: add CHANGELOG.md following Keep a Changelog format 2026-03-21 10:55:16 -03:00
30d7836bcf refactor: deduplicate ItemInfo, improve error handling, fix pre-existing bugs
- Move ItemInfo to services/types.rs for sharing between client and server
- Replace .expect() in compression_service with proper error handling
- Add CoreError::PayloadTooLarge variant for semantic error handling
- Export CoreError from lib.rs for library users
- Unify get_item_meta_name/value to take &str instead of String
- Extract item_path() helper in ItemService to reduce duplication
- Add warning logs for silent errors in list.rs
- Fix pre-existing borrow errors: tx moved in export handler,
  item_with_meta partial move in TryFrom implementation
- Fix unused data_dir variables in server code
2026-03-21 10:43:26 -03:00
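A hypothetical shape for the new variant, sketched with thiserror (already a dependency); the real `CoreError` has more variants and likely different fields.

```rust
use thiserror::Error;

#[derive(Debug, Error)]
enum CoreError {
    #[error("payload too large: {size} bytes exceeds limit of {limit}")]
    PayloadTooLarge { size: u64, limit: u64 },
    #[error(transparent)]
    Io(#[from] std::io::Error),
}

fn check_body(size: u64, limit: u64) -> Result<(), CoreError> {
    if limit > 0 && size > limit {
        return Err(CoreError::PayloadTooLarge { size, limit });
    }
    Ok(())
}

fn main() {
    // Callers can match on the semantic variant instead of a string.
    let err = check_body(10_000, 4_096).unwrap_err();
    println!("{err}");
}
```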
2cfee5075e fix: panic guards, dedup, and unsafe documentation
- diff.rs: graceful error instead of expect() on item ID in spawned thread
- common.rs: lazy_static regex, avoid unwrap on regex captures
- db.rs: ok_or_else guard on item.id in delete_item
- list/get/info/export/client/list: use settings.meta_filter() helper
- item_service.rs: expect() on meta lock instead of silent swallow
- filter_plugin/mod.rs: extract parse_encoding_option() helper
- main.rs: document unsafe libc::umask block with safety rationale
2026-03-20 17:17:58 -03:00
52e9787edb refactor: deduplicate filter plugins, extract helpers across codebase
Bug fixes:
- client: add error field to ApiResponse to avoid swallowing server errors
- args/config: fix list_format default mismatch (5 vs 7 columns)
- client: url-encode size param in set_item_size

Dedup - filter plugins:
- Extract count_option() and pattern_option() helpers, replace 7 identical options()
- Add #[derive(Clone)] to all filter structs; remove verbose clone_box() impls
- Simplify FilterChain clone() and impl Clone for Box<dyn FilterPlugin>
- Add filter_clone_box! macro for future use
- Fix doctest example missing clone_box

Dedup - server API:
- Extract spawn_body_reader() with LimitBehavior enum for body streaming
- Extract check_binary_content() helper
- Extract stream_with_offset_and_length() helper
- Extract generate_status() helper in status.rs
- Extract append_query_params() helper in client.rs

Dedup - other:
- Extract yaml_value_to_string() in meta_plugin/mod.rs
- Extract item_from_row() in db.rs
- Delete unused DisplayListItem struct

Misc:
- Remove duplicate doc comment in compression_service.rs
2026-03-20 15:54:33 -03:00
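The clone-box pattern in sketch form: once each concrete filter derives `Clone`, one blanket impl supplies `Clone` for the boxed trait object. Trait and struct names are illustrative.

```rust
trait FilterPlugin: FilterPluginClone {
    fn name(&self) -> &str;
}

trait FilterPluginClone {
    fn clone_box(&self) -> Box<dyn FilterPlugin>;
}

// Blanket impl: every Clone filter gets clone_box for free.
impl<T: FilterPlugin + Clone + 'static> FilterPluginClone for T {
    fn clone_box(&self) -> Box<dyn FilterPlugin> {
        Box::new(self.clone())
    }
}

impl Clone for Box<dyn FilterPlugin> {
    fn clone(&self) -> Self {
        self.clone_box()
    }
}

#[derive(Clone)]
struct StripAnsi;

impl FilterPlugin for StripAnsi {
    fn name(&self) -> &str { "strip_ansi" }
}

fn main() {
    let f: Box<dyn FilterPlugin> = Box::new(StripAnsi);
    let g = f.clone(); // via the blanket clone_box
    assert_eq!(g.name(), "strip_ansi");
}
```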
00be72f3d0 refactor: rename size to uncompressed_size, add compressed_size and closed columns
Schema changes:
- Rename items.size to items.uncompressed_size for clarity
- Add compressed_size (INTEGER NULL) - tracks compressed file size on disk
- Add closed (BOOLEAN NOT NULL DEFAULT 1) - tracks whether item is fully written
- Existing items default to closed=true via migration

Lifecycle:
- Items created with closed=false, set to true on successful save/import
- Compressed size captured via fs::metadata() after compression writer closes
- Truncated uploads (413) get compressed_size set, closed=true, uncompressed_size=None
- Update command now backfills both uncompressed_size and compressed_size

Also includes bug fixes and dedup from prior review:
- Fix stream_raw_content_response using uncompressed_size for raw byte Content-Length
- ApiResponse::ok()/empty() constructors, TryFrom<ItemWithMeta> for ItemInfo
- tag_names() method on ItemWithMeta, meta_filter() on Settings
- Fix .unwrap() panics in compression engine Read/Write impls
- Fix TOCTOU race in stream_raw_content_response (now uses compressed_size)
- Fix swallowed write errors in meta plugins (digest, magic_file, exec)
- Fix term::stderr().unwrap() panic in item_service
- Deduplicate ItemService::new() calls across 20 API handlers
- ImportMeta supports #[serde(alias = "size")] for backward compat

All 75 tests, 67 doc tests pass. Clippy clean.
2026-03-18 10:58:26 -03:00
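A hypothetical rusqlite_migration step mirroring the schema change; the base table is simplified to the columns the commit names.

```rust
use rusqlite::Connection;
use rusqlite_migration::{Migrations, M};

fn main() -> anyhow::Result<()> {
    let migrations = Migrations::new(vec![
        M::up("CREATE TABLE items (id INTEGER PRIMARY KEY, ts TEXT, \
               size INTEGER, compression TEXT);"),
        // Rename plus the two new columns; DEFAULT 1 makes existing
        // rows closed=true, as described above.
        M::up(
            "ALTER TABLE items RENAME COLUMN size TO uncompressed_size;
             ALTER TABLE items ADD COLUMN compressed_size INTEGER NULL;
             ALTER TABLE items ADD COLUMN closed BOOLEAN NOT NULL DEFAULT 1;",
        ),
    ]);
    let mut conn = Connection::open_in_memory()?;
    migrations.to_latest(&mut conn)?;
    Ok(())
}
```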
49793a0f94 feat: add streaming tar export/import and rename "none" to "raw"
- Add streaming tar-based export (--export produces .keep.tar)
- Add streaming tar import (--import reads .keep.tar archives)
- Add server endpoints GET /api/export and POST /api/import
- Rename CompressionType::None to CompressionType::Raw with "none" as alias
- Add DB migration to update existing "none" compression values to "raw"
- Fix export endpoint to propagate errors to client instead of swallowing
- Fix import endpoint to return 413 on max_body_size instead of truncating

Export streams items as tar archives without loading entire files into memory.
Import extracts items with new IDs, preserving original order. Both work
locally and via client/server mode.

Co-Authored-By: opencode <noreply@opencode.ai>
2026-03-17 21:24:39 -03:00
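Tar suits streaming because each entry needs only its size up front; the body can then be copied from any `Read` in fixed-size chunks. A sketch with the tar crate (entry name and mode are illustrative):

```rust
use std::io::{self, Read, Write};
use tar::{Builder, Header};

fn append_item<W: Write, R: Read>(
    builder: &mut Builder<W>,
    name: &str,
    size: u64,
    content: R, // streamed, never fully buffered
) -> io::Result<()> {
    let mut header = Header::new_gnu();
    header.set_size(size);
    header.set_mode(0o644);
    header.set_cksum();
    builder.append_data(&mut header, name, content)
}

fn main() -> io::Result<()> {
    let mut builder = Builder::new(io::stdout().lock());
    let body = b"item content";
    append_item(&mut builder, "1.data", body.len() as u64, &body[..])?;
    builder.finish()
}
```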
074ba64805 feat: allow --list to accept item IDs for filtering
- Local and client/server modes now support ID-based filtering
- keep -l 1 2 3 lists specific items by ID
- keep -l --ids-only 1 2 3 outputs just those IDs
- Server API adds optional 'ids' query parameter to GET /api/item/
- KeepClient.list_items gains ids parameter
2026-03-17 17:56:35 -03:00
02f0c8d453 fix: use XDG config directory for default config file location
Changes from manual HOME/.config/keep/config.yml construction to
dirs::config_dir(), which respects XDG_CONFIG_HOME.
2026-03-17 16:07:13 -03:00
c29e37c03e fix: use XDG data directory as default storage location
Changes the default from ~/.keep to the XDG data directory
(e.g. ~/.local/share/keep on Linux). Uses dirs::data_dir(), which
respects the XDG_DATA_HOME environment variable.
2026-03-17 15:37:25 -03:00
28c3deaeca fix: expand tilde (~) in config file paths to home directory
Applies to dir, import_data_file, and all server certificate/secret file
paths. Uses existing dirs crate for home directory resolution.
2026-03-17 15:32:30 -03:00
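A hypothetical helper along these lines, using the dirs crate the commit mentions:

```rust
use std::path::PathBuf;

fn expand_tilde(path: &str) -> PathBuf {
    if path == "~" {
        return dirs::home_dir().unwrap_or_else(|| PathBuf::from("~"));
    }
    if let Some(rest) = path.strip_prefix("~/") {
        if let Some(home) = dirs::home_dir() {
            return home.join(rest);
        }
    }
    PathBuf::from(path) // no tilde, or no resolvable home: pass through
}

fn main() {
    println!("{}", expand_tilde("~/.config/keep/config.yml").display());
}
```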
cb56a398fa feat: add --ids-only flag to --list mode for scripting
Outputs one ID per line with no header. Errors if used with any mode
other than --list. Works with both local and client (remote) list.
2026-03-17 15:04:10 -03:00
2452da52ef chore: add license, repository, keywords, and rust-version to Cargo.toml 2026-03-17 14:50:45 -03:00
6347427536 chore: remove bin/keep binary from tracking, add bin/ to gitignore 2026-03-17 14:47:57 -03:00
a8759c4b83 feat: add infer and tree_magic_mini meta plugins, make zstd internal by default
- Add infer crate as meta plugin for MIME type detection
- Add tree_magic_mini crate as alternative meta plugin for MIME type detection
- Add zstd, infer, tree_magic_mini to default features
- Fix static build script to use musl target instead of glibc+crt-static
- Remove hardcoded shell list from --generate-completion help text
- Fix update() in both new plugins to emit MIME metadata when buffer fills
2026-03-17 14:46:51 -03:00
a90c19efc1 feat: add native zstd compression plugin and deduplicate shared compression/meta utilities
- Add zstd crate (v0.13) with native Rust compression engine (level 3)
- Gate behind 'zstd' feature flag, fall back to program-based when disabled
- Extract CompressionService::decompressing_reader/compressing_writer with zstd support
- Extract MetaService::with_collector() to eliminate Arc<Mutex<Vec>> boilerplate
- Extract read_with_bounds() helper for skip+read pattern
- Add input validation for mutually exclusive --id and --tags flags
- Add zstd round-trip tests
2026-03-16 20:03:30 -03:00
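A round-trip sketch with the zstd crate's streaming API at the level the commit names (3):

```rust
use std::io::{Read, Write};

fn main() -> std::io::Result<()> {
    let mut compressed = Vec::new();
    {
        let mut enc = zstd::stream::Encoder::new(&mut compressed, 3)?;
        enc.write_all(b"some item content")?;
        enc.finish()?; // writes the final frame; don't rely on Drop
    }

    let mut out = Vec::new();
    zstd::stream::Decoder::new(&compressed[..])?.read_to_end(&mut out)?;
    assert_eq!(out, b"some item content");
    Ok(())
}
```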
35ee71c3cf feat: add export/import modes, unify service layer, fix binary detection
Export/import:
- Add --export and --import modes for both local and client paths
- Use strfmt crate for --export-filename-format templates ({id}, {tags}, {ts}, {compression})
- Import preserves original timestamps via server ?ts= param
- --import-data-file for file-based import; stdin fallback streams with PIPESIZE buffers

Service unification:
- Merge SyncDataService unique methods into ItemService (delete_item now returns Result<Item>)
- Delete AsyncDataService, AsyncItemService, DataService trait (dead code / async-blocking anti-pattern)
- All server handlers use spawn_blocking + ItemService directly
- Extract shared types (ExportMeta, ImportMeta) and helpers (resolve_item_id(s), check_binary_tty)

Binary detection fix:
- Replace broken metadata.get("map") + is_binary(&[]) with actual content sampling
- Both as_meta and allow_binary paths read PIPESIZE sample before deciding
- Never load entire item into memory for binary check

Other fixes:
- Fix lock consistency: all handlers use blocking_lock() in spawn_blocking (no mixed lock().await)
- Use ISO 8601 format for {ts} in export filenames
- Fix resolve_item_ids returning only 1 item for tag lookups
- Fix client get.rs triple-buffering and export.rs whole-file buffering
- Add KeepClient::get_item_content_stream() for streaming reads
- Pass all clippy --features server lints (Path vs PathBuf, &mut conn, etc.)
2026-03-16 08:43:26 -03:00
0a3d61a875 fix: client save with --compression none stored lz4 instead of none
- server_compress was true when compression_type=None, telling server to
  recompress with its default (lz4) instead of storing raw
- compression_type query param was only sent when !server_compress,
  so 'none' was never sent to server
- Fix: server_compress always false in client mode (client handles all
  compression), compression_type always sent to server

Tested: save/get/list/info/filters/delete for lz4, none, gzip on both
local and client/server modes. All operations produce matching results.
2026-03-15 12:46:29 -03:00
eca17b36ee fix: client save logs item ID early, stores compression via proper field and size via update endpoint
- Client save now logs 'New item: {id}' immediately after server response
- Compression type sent as query param, stored in DB compression field (not _client_compression metadata)
- Client set_item_size() sends uncompressed size via POST /api/item/{id}/update?size=N
- Server raw content GET uses actual file size for Content-Length (not uncompressed item.size)
- Removed _client_compression metadata hack from client save and get
- Fixed server handle_update_item to support size-only updates
- Fixed clippy: collapsible_if, too_many_arguments, unnecessary mut refs
- Fixed ListItemsQuery doctest missing meta field
2026-03-15 10:14:55 -03:00
5bad7ac7a6 refactor: decouple meta plugins from DB via SaveMetaFn callback, extract shared utilities
- Add SaveMetaFn callback pattern: meta plugins receive a closure instead of
  &Connection, enabling the same plugin code to work in local, client, and
  server contexts (collect-to-Vec, collect-to-HashMap, or direct DB write)
- Client save now runs meta plugins locally during streaming (smart client
  sets meta=false, server skips its own plugins)
- Add POST /api/item/{id}/update endpoint for re-running plugins on stored
  content without downloading compressed data
- Add client update mode (--update with --meta-plugin flags)
- Extract shared utilities: stream_copy, print_serialized, build_path_table,
  ensure_default_tag to reduce duplication across modes
- Add upsert_tag for idempotent tag addition (INSERT OR IGNORE)
- Add warn logging on save_meta lock failure in BaseMetaPlugin and MetaService
2026-03-14 22:36:59 -03:00
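A compact sketch of the pattern, using the `SaveMetaFn` type quoted in the DESIGN.md diff further down (`Arc<Mutex<dyn FnMut(&str, &str) + Send>>`); the plugin and collectors are illustrative.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

type SaveMetaFn = Arc<Mutex<dyn FnMut(&str, &str) + Send>>;

// A plugin only knows the callback, never the database.
fn run_plugin(save_meta: &SaveMetaFn) {
    let mut f = save_meta.lock().expect("meta lock");
    (*f)("digest_sha256", "abc123");
}

fn main() {
    // Local/server mode: collect into a Vec, write to DB afterwards.
    let collected: Arc<Mutex<Vec<(String, String)>>> = Arc::default();
    let sink = Arc::clone(&collected);
    let cb: SaveMetaFn = Arc::new(Mutex::new(move |k: &str, v: &str| {
        sink.lock().unwrap().push((k.to_string(), v.to_string()));
    }));
    run_plugin(&cb);

    // Client mode would collect into a HashMap and POST it instead.
    let as_map: HashMap<_, _> = collected.lock().unwrap().iter().cloned().collect();
    assert_eq!(as_map["digest_sha256"], "abc123");
}
```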
fdc5f1d744 fix: client --list uses list_format from config like local mode
Move apply_color/apply_table_attribute to common.rs for sharing.
Add render_list_table_with_format() that takes ColumnConfig slice
and pre-computed row values. Client list now renders columns based
on settings.list_format, showing empty for columns where server
data is unavailable (e.g. text_line_count, token_count).
2026-03-14 20:01:58 -03:00
f5bae46620 fix: all tables respect table_config from settings
Extract shared render_item_info_table() and render_list_table() in
modes/common.rs. Update client/info, client/list, client/status,
info, status, and status_plugins to use create_table_with_config
with settings.table_config instead of hardcoded presets.

Previously only local --list used table_config; all other tables
(client modes, status, status-plugins) ignored it.
2026-03-14 19:49:31 -03:00
0bc8d9c909 fix: surface server error in get_status and trim table output
- Include error field in get_status() ApiResponse so server error
  messages are surfaced instead of generic 'No status data returned'
- Use trim_lines_end() on table output to match local mode formatting
2026-03-14 19:32:39 -03:00
1a942b4d23 fix: format client --status output as tables instead of raw JSON
Change client get_status() to return StatusInfo struct instead of
serde_json::Value, then render paths, meta plugins, and compression
tables matching the local mode's output style.
2026-03-14 19:25:53 -03:00
886ac98b21 fix: URL-encode query params in client and pass --meta to server on save
- URL-encode all query parameter keys and values in get_json_with_query
  and post_stream. Previously raw JSON like {"project":"alpha"} was
  sent unencoded, causing 'invalid uri character' errors.
- Pass settings.meta (key=value pairs) from client save to server as
  metadata. Previously always passed empty HashMap, so --meta was
  silently ignored in client save mode.
2026-03-14 19:16:39 -03:00
0658d8378f fix: group all server options under Server Options help heading
The --server-password, --server-password-hash, --server-username,
--server-jwt-secret, --server-jwt-secret-file, and --server-max-body-size
options were appearing in the generic Options section instead of the
Server Options section.
2026-03-14 18:56:32 -03:00
ffe71440d9 fix: use explicit snake_case serialization for CompressionType
Per project convention, enum string representations should use
snake_case. Use explicit strum serialize attributes instead of
serialize_all to avoid incorrect splitting of acronyms like
GZip → g_zip and ZStd → z_std.
2026-03-14 18:26:58 -03:00
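The convention in sketch form, assuming strum's derive macros as used in the project; the variant set is illustrative.

```rust
use strum::{Display, EnumString};

#[derive(Debug, Display, EnumString)]
enum CompressionType {
    // Explicit per-variant strings: serialize_all = "snake_case"
    // would split the acronyms into "g_zip" / "z_std".
    #[strum(serialize = "gzip")]
    GZip,
    #[strum(serialize = "zstd")]
    ZStd,
    #[strum(serialize = "lz4")]
    Lz4,
}

fn main() {
    assert_eq!(CompressionType::GZip.to_string(), "gzip");
    let parsed: CompressionType = "zstd".parse().unwrap();
    assert_eq!(parsed.to_string(), "zstd");
}
```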
8acbd34150 fix: add --meta filtering support to client/server list mode
Plumb metadata filter from client CLI through the HTTP API to the
server's data_service.list_items(). The server accepts a JSON-encoded
meta query parameter where null values mean 'key exists' and string
values mean 'exact match'.

Also fix LZ4 compression round-trip for client mode:
- Explicit flush FrameEncoder before drop to avoid sending only the
  frame header when compress=false
- Send _client_compression metadata so client knows actual compression
  on retrieval (server records compression=None when compress=false)
- Use FrameDecoder (frame format) instead of decompress_size_prepended
  (size-prepended format) to match server storage format
2026-03-14 18:22:07 -03:00
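A round-trip sketch of the frame-format fix with lz4_flex: `finish()` the encoder explicitly instead of relying on Drop, and decode with `FrameDecoder`.

```rust
use lz4_flex::frame::{FrameDecoder, FrameEncoder};
use std::io::{Read, Write};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut enc = FrameEncoder::new(Vec::new());
    enc.write_all(b"payload")?;
    // Without finish(), only the frame header may reach the writer.
    let compressed = enc.finish()?;

    let mut out = Vec::new();
    FrameDecoder::new(&compressed[..]).read_to_end(&mut out)?;
    assert_eq!(out, b"payload");
    Ok(())
}
```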
f2d93a2812 fix: skip_lines/skip_bytes filters producing empty output on large files
FilteringReader::read() returned Ok(0) (EOF) when a filter consumed a
chunk without producing output. Filters like skip_lines need to see
multiple chunks before outputting anything — returning 0 prematurely
truncated the stream. Loop until the filter produces output or the
underlying reader is truly exhausted.
2026-03-14 16:20:30 -03:00
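A simplified sketch of the fixed loop; the real FilteringReader and its filters are more involved, and the closure-based filter here is illustrative.

```rust
use std::io::{self, Read};

struct FilteringReader<R, F> {
    inner: R,
    filter: F, // may buffer input and emit nothing for several chunks
    pending: Vec<u8>,
}

impl<R: Read, F: FnMut(&[u8]) -> Vec<u8>> Read for FilteringReader<R, F> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        while self.pending.is_empty() {
            let mut chunk = [0u8; 8192]; // PIPESIZE-style fixed buffer
            let n = self.inner.read(&mut chunk)?;
            if n == 0 {
                return Ok(0); // truly exhausted, not just "no output yet"
            }
            self.pending = (self.filter)(&chunk[..n]);
        }
        let n = self.pending.len().min(buf.len());
        buf[..n].copy_from_slice(&self.pending[..n]);
        self.pending.drain(..n);
        Ok(n)
    }
}

fn main() {
    // A skip-like filter: drop the first 5 bytes seen, pass the rest.
    let mut skipped = 0usize;
    let mut r = FilteringReader {
        inner: &b"0123456789"[..],
        filter: move |chunk: &[u8]| {
            let drop = (5 - skipped).min(chunk.len());
            skipped += drop;
            chunk[drop..].to_vec()
        },
        pending: Vec::new(),
    };
    let mut out = String::new();
    r.read_to_string(&mut out).unwrap();
    assert_eq!(out, "56789");
}
```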
0af74000d2 fix: eliminate unsafe code via nix, command-fds, and thread-local cookie
Replace 4 unsafe sites with safe wrappers:

- libc::pipe2 → nix::unistd::pipe2 (safe OwnedFd return)
- File::from_raw_fd → File::from(OwnedFd) (safe ownership transfer)
- unsafe impl Send for SendCookie → thread_local! lazy Cookie
  (each thread gets its own independent Cookie, no Send needed)
- pre_exec + libc::fcntl → command-fds crate fd_mappings()
  (handles CLOEXEC clearing safely, also fixes potential fd leak
  on spawn failure via OwnedFd RAII)

Only libc::umask remains as a single unavoidable unsafe site
(no safe Rust wrapper exists for the umask syscall).

Also updates AGENTS.md to remove stale SendCookie exception.
2026-03-14 16:01:54 -03:00
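The first two replacements in sketch form, assuming nix 0.28+ where `pipe2` returns a pair of `OwnedFd`:

```rust
use nix::fcntl::OFlag;
use nix::unistd::pipe2;
use std::fs::File;
use std::io::{Read, Write};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // OwnedFd closes automatically on drop; no raw fd handling.
    let (read_fd, write_fd) = pipe2(OFlag::O_CLOEXEC)?;

    // Safe ownership transfer replaces unsafe File::from_raw_fd.
    let mut writer = File::from(write_fd);
    let mut reader = File::from(read_fd);

    writer.write_all(b"through the pipe")?;
    drop(writer); // close the write end so the reader sees EOF

    let mut out = String::new();
    reader.read_to_string(&mut out)?;
    assert_eq!(out, "through the pipe");
    Ok(())
}
```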
9a1e23e85f fix: use tempdir for db doctests instead of project root
All 27 doctests in db.rs wrote keep.db to the project root via
PathBuf::from("keep.db"). Now use tempfile::tempdir() so the
database is created in a temp directory and cleaned up automatically.
2026-03-14 15:10:47 -03:00
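The pattern in sketch form; the `fs::write` stands in for opening the real database.

```rust
fn main() -> std::io::Result<()> {
    let dir = tempfile::tempdir()?; // removed when `dir` drops
    let db_path = dir.path().join("keep.db");
    std::fs::write(&db_path, b"")?; // a doctest would open the DB here
    assert!(db_path.exists());
    Ok(())
}
```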
b3ca673b52 feat: add --update mode, --meta/--meta-plugin flags, streaming diff
- Add --update mode to modify tags and metadata for existing items by ID
- Add --meta key=value flag to set metadata during save/update
- Add --meta key (bare) to delete metadata keys or filter by existence
- Add --meta-plugin/-M name:{json} flag for plugin options via CLI
- Env meta plugin now uses options from --meta-plugin instead of only env vars
- Stream decompressed content to diff via /dev/fd pipes (no temp files)
- Wire --list-format CLI arg to settings (was parsed but ignored)
- Allow --info to accept tags (was restricted to numeric IDs only)
- Change DB meta filtering to HashMap<String, Option<String>> for exact match + key existence
- Fix fcntl error checking in diff pre_exec
- Fix README inaccuracies (delete by tag, nonexistent --digest flag, meta plugin key names)
2026-03-14 15:02:16 -03:00
4b51825917 docs: document default mode shortcuts for save and get
- Quick Start: show bare keep <tag> (save) and keep <#> (get) shortcuts
- Save Mode: note that --save is optional when piping content
- Get Mode: clarify that only numeric IDs default to Get mode;
  fix incorrect keep <tag> example that would actually save
2026-03-14 11:48:37 -03:00
2ffa2a977a feat: add shell profiles for zsh, sh, csh/tcsh
- profile.bash: simplified preexec_init (early return), extracted
  ___keep_complete helper for @/@@ completion wrappers
- profile.zsh: add-zsh-hook preexec, wrapper function, @/@@ aliases,
  completions via compdef
- profile.sh: POSIX-compatible for sh/dash/ksh. Wrapper function,
  @/@@ aliases. No preexec or completions.
- profile.csh: alias-based keep wrapper, @/@@ aliases. No preexec
  or completions.
- modulefile: adds KEEP_SH_PROFILE, KEEP_ZSH_PROFILE, KEEP_CSH_PROFILE
- README: updated Shell Integration table and Shell Completion section
2026-03-14 11:36:29 -03:00
1a8ed56b68 feat: add --generate-completion for shell tab completion
- Add clap_complete dependency for bash/zsh/fish/elvish/powershell
- Add --generate-completion <shell> flag that prints completion script to stdout
- profile.bash sources completions via command keep --generate-completion bash
- @ and @@ aliases get completions via wrapper functions that delegate to _keep
- README updated with Shell Completion section
2026-03-14 11:02:38 -03:00
158bf50864 docs: add environment modulefile instructions to README 2026-03-14 10:36:57 -03:00
17be6abaab refactor: streaming, security hardening, and MCP removal
Major overhaul of server architecture and security posture:

- Streaming: Unified all I/O through PIPESIZE (8192-byte) buffers.
  POST bodies stream via MpscReader through the save pipeline. GET
  content streams from disk via decompression to client. Removed
  save_item_with_reader, get_item_content_info, ChannelReader.
  413 responses keep partial items (nonfatal by design).

- Security: XSS protection in all HTML pages via html_escape crate.
  Security headers middleware (nosniff, frame deny, referrer policy).
  CORS tightened to explicit headers. Input validation for tags
  (256 chars), metadata (128/4096), pagination (10k cap). Config
  file reads use from_utf8_lossy. Generic error messages in HTML.
  Diff endpoint has 10 MB per-item cap. max_body_size config option.

- Panics eliminated: Path unwraps → proper error propagation.
  Mutex unwraps → map_err (registries) / expect with message (local).

- MCP removed: Deleted all MCP code, rmcp dependency, mcp feature.

- Docs: Updated README, DESIGN, AGENTS to reflect all changes.
2026-03-14 00:03:42 -03:00
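A minimal sketch of the constant-memory copy this architecture relies on; the project's actual `stream_copy` (in `src/common/mod.rs`, per the DESIGN.md diff below) may differ.

```rust
use std::io::{Read, Write};

const PIPESIZE: usize = 8192;

fn stream_copy<R: Read, W: Write>(mut from: R, mut to: W) -> std::io::Result<u64> {
    let mut buf = [0u8; PIPESIZE]; // the only buffer, regardless of item size
    let mut total = 0u64;
    loop {
        let n = from.read(&mut buf)?;
        if n == 0 {
            return Ok(total);
        }
        to.write_all(&buf[..n])?;
        total += n as u64;
    }
}

fn main() -> std::io::Result<()> {
    let copied = stream_copy(&b"never all in memory at once"[..], std::io::sink())?;
    assert_eq!(copied, 27);
    Ok(())
}
```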
560ba6e20c fix: count_bounded error counting, clippy if-let, auth test dedup, doc tests
- count_bounded: break on iterator error instead of counting errors as tokens
- collapse nested if-let chains with let-chains in auth middleware
- document JWT/Basic Auth as mutually exclusive
- TailTokensFilter::clone uses empty buffer (always pre-filter)
- fix 9 broken doc examples in server/common.rs
- remove 7 duplicate auth tests from auth.rs (covered by auth_tests.rs)
2026-03-13 22:04:38 -03:00
115 changed files with 7200 additions and 4388 deletions

.gitignore

@@ -2,3 +2,4 @@
.aider*
.crush
keep.db
bin/

AGENTS.md

@@ -30,10 +30,36 @@ TERM=dumb cargo build --features server # With server feature
- Meta plugins extend `BaseMetaPlugin` for boilerplate reduction
- Enum string representations: `#[strum(serialize_all = "snake_case")]`
- Lint rules: `deny(clippy::all)`, `deny(unsafe_code)` (except `libc::umask` in main.rs)
- Feature flags: `default = ["magic", "lz4", "gzip"]`; optional: `server`, `mcp`, `swagger`
- Feature flags: `default = ["magic", "lz4", "gzip"]`; optional: `server`, `swagger`
## Testing
- Tests in `src/tests/` mirroring `src/` structure; shared helpers in `src/tests/common/test_helpers.rs`
- Key helpers: `create_temp_dir()`, `create_temp_db()`, `test_compression_engine()`
- Test naming: `test_<feature>_<scenario>`
## Streaming Constraint
**At no point should the whole file be in memory at once.** All I/O must use fixed-size buffers:
- `PIPESIZE` = 8192 bytes (`src/common/mod.rs:10`)
- Server POST body streams through `save_item_raw_streaming` via `MpscReader`
- Server GET content streams via streaming reader (not `read_to_end`)
- When `max_body_size` is exceeded, return `413` but keep the partial item (nonfatal by design)
- Filter/meta plugins use `PIPESIZE`-sized buffers
## HTML Rendering
- Use `html_escape` crate for all user-controlled data in HTML pages
- `esc()` for text content, `esc_attr()` for HTML attributes
- Security headers middleware: `X-Content-Type-Options: nosniff`, `X-Frame-Options: DENY`, `Referrer-Policy: strict-origin-when-cross-origin`
## Changelog
The project uses [Keep a Changelog](https://keepachangelog.com/). The changelog lives at `CHANGELOG.md` in the project root.
- **Always update `CHANGELOG.md`** when making changes that affect users (new features, breaking changes, bug fixes, etc.)
- Add entries under the `[Unreleased]` section using these categories: `Added`, `Changed`, `Deprecated`, `Removed`, `Fixed`, `Security`
- Keep descriptions concise and user-focused — what changed from the user's perspective, not implementation details
- Commit changelog updates in the same commit as the feature/fix they document
- Before releasing a new version, move `[Unreleased]` entries to a versioned section (e.g., `[0.2.0] - YYYY-MM-DD`) and add a new empty `[Unreleased]` above it

CHANGELOG.md (new file)

@@ -0,0 +1,107 @@
# Changelog
All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
## [Unreleased]
### Added
- New `filter_grep` feature to optionally include the grep filter plugin (regex-based line filtering). Disabling this feature removes the `regex` crate and its ~800 KiB dependency stack from the binary.
- New `meta_all_musl` feature for all MUSL-compatible meta plugins (excludes `meta_magic` which requires libmagic)
- New `filter_all_musl` feature for all MUSL-compatible filter plugins
- Database index on `items(ts)` column for faster ORDER BY sorting
- Server API `ItemInfo` now includes `file_size` — actual filesystem-reported size of the item data file
### Changed
- CLI args now feature-gated: `--server` and related options hidden when built without `server` feature; `--client-*` options hidden when built without `client` feature. Run `--help` only shows relevant options.
- `server` Cargo feature now includes TLS support by default (`axum-server`); `tls` feature removed
- Clap `conflicts_with_all` removed from all mode args — exclusivity now handled by implicit `group("mode")`
- Filter plugins check size before loading content into memory (prevents OOM on large inputs)
- Status page pre-allocates collections with known capacities (meta plugins, compression info)
- `#[inline]` on HTML escape helper functions (`esc`, `esc_attr`) for hot path performance
- Removed `once_cell` crate (replaced with `std::sync::LazyLock` from Rust 1.80)
- Removed `lazy_static` crate (replaced with `std::sync::LazyLock`)
### Breaking
- Plugin feature flags renamed with type prefix for consistency:
  - `magic` → `meta_magic`
  - `infer` → `meta_infer`
  - `tree_magic_mini` → `meta_tree_magic_mini`
  - `tokens` → `meta_tokens`
  - `grep` → `filter_grep`
  - `all-meta-plugins` → `meta_all`
  - `all-filter-plugins` → `filter_all`
### Fixed
- CLI help text typo: "metatdata" → "metadata" in `--get` and `--info` descriptions
### Refactored
- Added module-level documentation to `services/` module
### Documentation
- README.md: Fixed compression table — zstd is native (not external), "none" renamed to "raw"
- DESIGN.md: Updated schema to reflect current `items` table columns and meta plugin inventory
## [0.1.0] - 2026-03-21
### Added
- Streaming tar-based export (`--export`) producing `.keep.tar` archives without loading entire files into memory
- Streaming tar-based import (`--import`) extracting `.keep.tar` archives with new IDs
- Server endpoints `GET /api/export` and `POST /api/import`
- ID-based filtering for `--list` (`keep -l 1 2 3` lists specific items by ID)
- Server API accepts optional `ids` query parameter on `GET /api/item/`
- `--ids-only` flag for `--list` mode for scripting
- `infer` and `tree_magic_mini` meta plugins for MIME type detection
- Native `zstd` compression plugin as default
- Configurable compression via `--compression` flag
- Export/import modes with format detection (JSON, YAML, binary)
- `XDG_CONFIG_HOME` support for default config file location
- `XDG_DATA_HOME` support for default storage location
- Tilde (`~`) expansion in config file paths
### Changed
- `CompressionType::None` renamed to `CompressionType::Raw` (with `"none"` as alias for backward compatibility)
- `items.size` column renamed to `items.uncompressed_size`
- Added `items.compressed_size` column tracking compressed file size on disk
- Added `items.closed` column tracking whether an item is fully written
- Default `list_format` in config now matches CLI default (7 vs 5 columns)
- All filter plugins share deduplicated option implementations
### Refactored
- Extracted `spawn_body_reader()` and `check_binary_content()` helpers for streaming uploads
- Extracted `yaml_value_to_string()` helper for meta plugins
- Extracted `item_path()` helper in `ItemService` to reduce path duplication
- Unified `get_item_meta_name`/`value` to take `&str` instead of `String`
- Shared `ItemInfo` struct between client and server
- Compression service now returns `Result` types instead of panicking via `.expect()`
- `ApiResponse::ok()` and `ApiResponse::empty()` constructors
- `meta_filter()` helper on `Settings` for consistent filtering
- Added `tag_names()` method on `ItemWithMeta`
- `filter_clone_box!` macro for filter plugin cloning
### Fixed
- Panic guards in diff, compression engine, and spawned threads
- Pre-existing borrow errors in export handler and `TryFrom` implementation
- TOCTOU race in `stream_raw_content_response`
- Swallowed write errors in meta plugins (digest, magic_file, exec)
- Truncated uploads (413) now properly store compressed data
- `term::stderr().unwrap()` panic in `item_service`
- `.unwrap()` panics in compression engine `Read`/`Write` impls
- Client API errors now propagate to user instead of being swallowed
- Import endpoint returns 413 on `max_body_size` instead of truncating
- `keep --list` uses `list_format` from config in all modes
- All tables respect `table_config` from settings
- `DisplayListItem` struct removed (was unused)
- `#[serde(alias = "size")]` on `ImportMeta` for backward compatibility

Cargo.lock (generated)

@@ -378,6 +378,17 @@ dependencies = [
"shlex",
]
[[package]]
name = "cfb"
version = "0.7.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d38f2da7a0a2c4ccf0065be06397cc26a81f4e528be095826eee9d4adbb8c60f"
dependencies = [
"byteorder",
"fnv",
"uuid",
]
[[package]]
name = "cfg-if"
version = "1.0.4"
@@ -435,6 +446,15 @@ dependencies = [
"strsim",
]
[[package]]
name = "clap_complete"
version = "4.6.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "19c9f1dde76b736e3681f28cec9d5a61299cbaae0fce80a68e43724ad56031eb"
dependencies = [
"clap",
]
[[package]]
name = "clap_derive"
version = "4.6.0"
@@ -479,6 +499,16 @@ dependencies = [
"unicode-width",
]
[[package]]
name = "command-fds"
version = "0.3.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f849b92c694fe237ecd8fafd1ba0df7ae0d45c1df6daeb7f68ed4220d51640bd"
dependencies = [
"nix",
"thiserror 2.0.18",
]
[[package]]
name = "config"
version = "0.15.21"
@@ -860,12 +890,6 @@ version = "1.0.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "92773504d58c093f6de2459af4af33faa518c13451eb8f2b5698ed3d36e7c813"
[[package]]
name = "dyn-clone"
version = "1.0.20"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d0881ea181b1df73ff77ffaaf9c7544ecc11e82fba9b5f27b262a3c73a332555"
[[package]]
name = "either"
version = "1.15.0"
@@ -1001,12 +1025,29 @@ version = "2.3.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "37909eebbb50d72f9059c3b6d82c0463f2ff062c9e95845c43a6c9c0355411be"
[[package]]
name = "filetime"
version = "0.2.27"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f98844151eee8917efc50bd9e8318cb963ae8b297431495d3f758616ea5c57db"
dependencies = [
"cfg-if",
"libc",
"libredox",
]
[[package]]
name = "find-msvc-tools"
version = "0.1.9"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5baebc0774151f905a1a2cc41989300b1e6fbb29aff0ceffa1064fdd3088d582"
[[package]]
name = "fixedbitset"
version = "0.5.7"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "1d674e81391d1e1ab681a28d99df07927c6d4aa5b027d7da16ba32d1d21ecd99"
[[package]]
name = "flate2"
version = "1.1.9"
@@ -1283,6 +1324,15 @@ dependencies = [
"digest 0.9.0",
]
[[package]]
name = "html-escape"
version = "0.2.13"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "6d1ad449764d627e22bfd7cd5e8868264fc9236e07c752972b4080cd351cb476"
dependencies = [
"utf8-width",
]
[[package]]
name = "http"
version = "1.4.0"
@@ -1531,6 +1581,15 @@ dependencies = [
"serde_core",
]
[[package]]
name = "infer"
version = "0.19.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a588916bfdfd92e71cacef98a63d9b1f0d74d6599980d11894290e7ddefffcf7"
dependencies = [
"cfb",
]
[[package]]
name = "inventory"
version = "0.3.22"
@@ -1646,7 +1705,9 @@ dependencies = [
"base64",
"chrono",
"clap",
"clap_complete",
"comfy-table",
"command-fds",
"config",
"ctor",
"derive_more",
@@ -1658,13 +1719,14 @@ dependencies = [
"flate2",
"futures",
"gethostname",
"html-escape",
"http-body-util",
"humansize",
"hyper",
"infer",
"inventory",
"is-terminal",
"jsonwebtoken",
"lazy_static",
"libc",
"local-ip-address",
"log",
@@ -1672,7 +1734,6 @@ dependencies = [
"magic",
"md5",
"nix",
"once_cell",
"os_pipe",
"pest",
"pest_derive",
@@ -1680,7 +1741,6 @@ dependencies = [
"rand 0.9.2",
"regex",
"ringbuf",
"rmcp",
"rusqlite",
"rusqlite_migration",
"serde",
@@ -1689,9 +1749,11 @@ dependencies = [
"sha2 0.10.9",
"similar",
"smart-default",
"strfmt",
"strip-ansi-escapes",
"strum",
"subtle",
"tar",
"tempfile",
"term",
"thiserror 2.0.18",
@@ -1701,12 +1763,14 @@ dependencies = [
"tokio-util",
"tower",
"tower-http",
"tree_magic_mini",
"ureq",
"utoipa",
"utoipa-swagger-ui",
"uzers",
"which",
"xdg",
"zstd",
]
[[package]]
@@ -1739,7 +1803,10 @@ version = "0.1.14"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "1744e39d1d6a9948f4f388969627434e31128196de472883b39f148769bfe30a"
dependencies = [
"bitflags 2.11.0",
"libc",
"plain",
"redox_syscall 0.7.3",
]
[[package]]
@@ -1949,6 +2016,15 @@ dependencies = [
"libc",
]
[[package]]
name = "nom"
version = "8.0.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "df9761775871bdef83bee530e60050f7e54b1105350d6884eb0fb4f46c2f9405"
dependencies = [
"memchr",
]
[[package]]
name = "num-bigint"
version = "0.4.6"
@@ -2045,17 +2121,11 @@ checksum = "2621685985a2ebf1c516881c026032ac7deafcda1a2c9b7850dc81e3dfcb64c1"
dependencies = [
"cfg-if",
"libc",
"redox_syscall",
"redox_syscall 0.5.18",
"smallvec",
"windows-link",
]
[[package]]
name = "paste"
version = "1.0.15"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "57c0d7b74b563b49d38dae00a0c37d4d6de9b432382b2892f0574ddcae73fd0a"
[[package]]
name = "pathdiff"
version = "0.2.3"
@@ -2121,6 +2191,17 @@ dependencies = [
"sha2 0.10.9",
]
[[package]]
name = "petgraph"
version = "0.8.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8701b58ea97060d5e5b155d383a69952a60943f0e6dfe30b04c287beb0b27455"
dependencies = [
"fixedbitset",
"hashbrown 0.15.5",
"indexmap",
]
[[package]]
name = "pin-project-lite"
version = "0.2.17"
@@ -2139,6 +2220,12 @@ version = "0.3.32"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7edddbd0b52d732b21ad9a5fab5c704c14cd949e5e9a1ec5929a24fded1b904c"
[[package]]
name = "plain"
version = "0.2.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b4596b6d070b27117e987119b4dac604f3c58cfb0b191112e24771b2faeac1a6"
[[package]]
name = "portable-atomic"
version = "1.13.1"
@@ -2323,6 +2410,15 @@ dependencies = [
"bitflags 2.11.0",
]
[[package]]
name = "redox_syscall"
version = "0.7.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "6ce70a74e890531977d37e532c34d45e9055d2409ed08ddba14529471ed0be16"
dependencies = [
"bitflags 2.11.0",
]
[[package]]
name = "redox_users"
version = "0.5.2"
@@ -2388,40 +2484,6 @@ dependencies = [
"portable-atomic-util",
]
[[package]]
name = "rmcp"
version = "0.2.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "37f2048a81a7ff7e8ef6bc5abced70c3d9114c8f03d85d7aaaafd9fd04f12e9e"
dependencies = [
"base64",
"chrono",
"futures",
"paste",
"pin-project-lite",
"rmcp-macros",
"schemars",
"serde",
"serde_json",
"thiserror 2.0.18",
"tokio",
"tokio-util",
"tracing",
]
[[package]]
name = "rmcp-macros"
version = "0.2.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "72398e694b9f6dbb5de960cf158c8699e6a1854cb5bbaac7de0646b2005763c4"
dependencies = [
"darling",
"proc-macro2",
"quote",
"serde_json",
"syn",
]
[[package]]
name = "ron"
version = "0.12.0"
@@ -2591,31 +2653,6 @@ dependencies = [
"winapi-util",
]
[[package]]
name = "schemars"
version = "0.8.22"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "3fbf2ae1b8bc8e02df939598064d22402220cd5bbcca1c76f7d6a310974d5615"
dependencies = [
"chrono",
"dyn-clone",
"schemars_derive",
"serde",
"serde_json",
]
[[package]]
name = "schemars_derive"
version = "0.8.22"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "32e265784ad618884abaea0600a9adf15393368d840e0222d101a072f3f7534d"
dependencies = [
"proc-macro2",
"quote",
"serde_derive_internals",
"syn",
]
[[package]]
name = "scopeguard"
version = "1.2.0"
@@ -2670,17 +2707,6 @@ dependencies = [
"syn",
]
[[package]]
name = "serde_derive_internals"
version = "0.29.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "18d26a20a969b9e3fdf2fc2d9f21eda6c40e2de84c9408bb5d3b05d499aae711"
dependencies = [
"proc-macro2",
"quote",
"syn",
]
[[package]]
name = "serde_json"
version = "1.0.149"
@@ -2864,6 +2890,12 @@ version = "1.2.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "6ce2be8dc25455e1f91df71bfa12ad37d7af1092ae736f3a6cd0e37bc7810596"
[[package]]
name = "strfmt"
version = "0.2.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "29fdc163db75f7b5ffa3daf0c5a7136fb0d4b2f35523cd1769da05e034159feb"
[[package]]
name = "strip-ansi-escapes"
version = "0.2.1"
@@ -2934,6 +2966,17 @@ dependencies = [
"syn",
]
[[package]]
name = "tar"
version = "0.4.44"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "1d863878d212c87a19c1a610eb53bb01fe12951c0501cf5a0d65f724914a667a"
dependencies = [
"filetime",
"libc",
"xattr",
]
[[package]]
name = "tempfile"
version = "3.27.0"
@@ -3216,21 +3259,9 @@ checksum = "63e71662fa4b2a2c3a26f570f037eb95bb1f85397f3cd8076caed2f026a6d100"
dependencies = [
"log",
"pin-project-lite",
"tracing-attributes",
"tracing-core",
]
[[package]]
name = "tracing-attributes"
version = "0.1.31"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7490cfa5ec963746568740651ac6781f701c9c5ea257c58e057f3ba8cf69e8da"
dependencies = [
"proc-macro2",
"quote",
"syn",
]
[[package]]
name = "tracing-core"
version = "0.1.36"
@@ -3240,6 +3271,17 @@ dependencies = [
"once_cell",
]
[[package]]
name = "tree_magic_mini"
version = "3.2.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b8765b90061cba6c22b5831f675da109ae5561588290f9fa2317adab2714d5a6"
dependencies = [
"memchr",
"nom",
"petgraph",
]
[[package]]
name = "try-lock"
version = "0.2.5"
@@ -3368,6 +3410,12 @@ version = "0.7.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "09cc8ee72d2a9becf2f2febe0205bbed8fc6615b7cb429ad062dc7b7ddd036a9"
[[package]]
name = "utf8-width"
version = "0.1.8"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "1292c0d970b54115d14f2492fe0170adf21d68a1de108eebc51c1df4f346a091"
[[package]]
name = "utf8_iter"
version = "1.0.4"
@@ -3422,6 +3470,16 @@ dependencies = [
"zip",
]
[[package]]
name = "uuid"
version = "1.22.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a68d3c8f01c0cfa54a75291d83601161799e4a89a39e0929f4b0354d88757a37"
dependencies = [
"js-sys",
"wasm-bindgen",
]
[[package]]
name = "uzers"
version = "0.12.2"
@@ -3942,6 +4000,16 @@ version = "0.6.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9edde0db4769d2dc68579893f2306b26c6ecfbe0ef499b013d731b7b9247e0b9"
[[package]]
name = "xattr"
version = "1.6.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "32e45ad4206f6d2479085147f02bc2ef834ac85886624a23575ae137c8aa8156"
dependencies = [
"libc",
"rustix",
]
[[package]]
name = "xdg"
version = "2.5.2"
@@ -4099,3 +4167,31 @@ dependencies = [
"log",
"simd-adler32",
]
[[package]]
name = "zstd"
version = "0.13.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e91ee311a569c327171651566e07972200e76fcfe2242a4fa446149a3881c08a"
dependencies = [
"zstd-safe",
]
[[package]]
name = "zstd-safe"
version = "7.2.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8f49c4d5f0abb602a93fb8736af2a4f4dd9512e36f7f570d66e65ff867ed3b9d"
dependencies = [
"zstd-sys",
]
[[package]]
name = "zstd-sys"
version = "2.0.16+zstd.1.5.7"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "91e19ebc2adc8f83e43039e79776e3fda8ca919132d68a1fed6a5faca2683748"
dependencies = [
"cc",
"pkg-config",
]

Cargo.toml

@@ -2,13 +2,14 @@
name = "keep"
version = "0.1.0"
edition = "2024"
rust-version = "1.85"
description = "Keep and manage temporary files with automatic compression and metadata generation"
readme = "README.md"
license = "MIT"
repository = "https://gitea.gt0.ca/asp/keep"
keywords = ["cli", "files", "compression", "metadata"]
categories = ["command-line-utilities"]
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
anyhow = "1.0"
axum = { version = "0.8", optional = true }
@@ -18,6 +19,8 @@ thiserror = "2.0"
base64 = "0.22"
chrono = { version = "0.4", features = ["serde"] }
clap = { version = "4.6", features = ["derive", "env"] }
clap_complete = "4"
command-fds = "0.3"
config = "0.15"
ctor = "0.2"
directories = "6.0"
@@ -32,19 +35,19 @@ hyper = { version = "1.0", features = ["full"] }
http-body-util = "0.1"
inventory = "0.3"
is-terminal = "0.4"
lazy_static = "1.5"
libc = "0.2"
local-ip-address = "0.6"
log = "0.4"
lz4_flex = { version = "0.12", optional = true }
zstd = { version = "0.13", optional = true }
magic = { version = "0.13", optional = true }
nix = "0.30"
once_cell = "1.21"
infer = { version = "0.19", optional = true }
tree_magic_mini = { version = "3.2", optional = true }
nix = { version = "0.30", features = ["fs", "process"] }
comfy-table = "7.2"
pwhash = "1.0"
regex = "1.10"
regex = { version = "1.10", optional = true }
ringbuf = "0.4"
rmcp = { version = "0.2", features = ["server"], optional = true }
rusqlite = { version = "0.37", features = ["bundled", "array", "chrono"] }
rusqlite_migration = "2.3"
serde = { version = "1.0", features = ["derive"] }
@@ -54,6 +57,7 @@ sha2 = "0.10"
md5 = "0.7"
subtle = "2.6"
env_logger = "0.11"
strfmt = "0.2"
strum = { version = "0.27", features = ["derive"] }
term = "1.2"
tokio = { version = "1.0", features = ["full"] }
@@ -67,43 +71,54 @@ uzers = "0.12"
which = "8.0"
xdg = "2.5"
strip-ansi-escapes = "0.2"
tar = "0.4"
pest = "2.8"
pest_derive = "2.8"
dirs = "6.0"
similar = { version = "2.7", default-features = false, features = ["text"] }
html-escape = "0.2"
ureq = { version = "3", features = ["json"], optional = true }
os_pipe = { version = "1", optional = true }
axum-server = { version = "0.8", features = ["tls-rustls"], optional = true }
jsonwebtoken = { version = "10", optional = true, features = ["aws_lc_rs"] }
tiktoken-rs = { version = "0.9", optional = true }
tempfile = "3.3"
[features]
# Default features include core compression engines and swagger UI
default = ["magic", "lz4", "gzip", "client", "tokens"]
# Default features include core compression engines and plugins that support MUSL
default = [
"client",
"gzip",
"filter_grep",
"meta_infer",
"lz4",
"meta_tokens",
"meta_tree_magic_mini",
"zstd"
]
# Full
#default = ["server", "magic", "lz4", "swagger"]
# Server feature (includes axum and related dependencies)
server = ["dep:axum", "dep:tower", "dep:tower-http", "dep:utoipa", "dep:jsonwebtoken"]
# Server feature (includes axum and TLS/HTTPS via axum-server; rustls already available via client/ureq)
server = ["dep:axum", "dep:tower", "dep:tower-http", "dep:utoipa", "dep:jsonwebtoken", "dep:axum-server"]
# Compression features
gzip = ["flate2"]
lz4 = ["lz4_flex"]
bzip2 = []
xz = []
zstd = []
zstd = ["dep:zstd"]
# Plugin features (meta and filter)
all-meta-plugins = ["dep:magic"]
all-filter-plugins = []
# Meta plugin features
meta_magic = ["dep:magic"]
meta_infer = ["dep:infer"]
meta_tree_magic_mini = ["dep:tree_magic_mini"]
meta_tokens = ["dep:tiktoken-rs"]
meta_all = ["meta_magic", "meta_infer", "meta_tree_magic_mini", "meta_tokens"]
meta_all_musl = ["meta_infer", "meta_tree_magic_mini", "meta_tokens"]
# Individual plugin features
magic = ["dep:magic"]
# MCP feature (Model Context Protocol support)
mcp = ["dep:rmcp"]
# Filter plugin features
filter_grep = ["dep:regex"]
filter_all = ["filter_grep"]
filter_all_musl = ["filter_grep"]
# Swagger UI feature
swagger = ["dep:utoipa-swagger-ui"]
@@ -111,12 +126,5 @@ swagger = ["dep:utoipa-swagger-ui"]
# Client feature (HTTP client for remote server)
client = ["dep:ureq", "dep:os_pipe"]
# TLS feature (HTTPS server support)
tls = ["dep:axum-server"]
# Token counting feature (LLM token support via tiktoken)
tokens = ["dep:tiktoken-rs"]
[dev-dependencies]
tempfile = "3.3"
rand = "0.9"

DESIGN.md

@@ -33,7 +33,7 @@
- `modes/status.rs` - Show system status and capabilities
- `modes/server.rs` - REST HTTP/HTTPS server mode with OpenAPI documentation
- `modes/client.rs` - Client mode for remote server (streaming save, local decompression)
- `modes/common.rs` - Shared utilities for all modes
- `modes/common.rs` - Shared utilities for all modes (OutputFormat, table creation, `print_serialized`, `build_path_table`, `ensure_default_tag`, `render_item_info_table`, `render_list_table_with_format`)
### Database Module
- `db.rs` - SQLite database operations
@@ -49,24 +49,31 @@
- `compression_engine/program.rs` - External program wrapper
### Meta Plugin Module
- `meta_plugin.rs` - Trait and type definitions
- `meta_plugin.rs` - Trait and type definitions, `SaveMetaFn` callback type
- `meta_plugin/program.rs` - External program wrapper
- `meta_plugin/digest.rs` - Internal digest implementations
- `meta_plugin/system.rs` - System information metadata plugins
**SaveMetaFn Architecture**: Meta plugins are decoupled from direct DB access via a `SaveMetaFn` callback (`Arc<Mutex<dyn FnMut(&str, &str) + Send>>`). The callback is injected at `MetaService` construction and propagated to all plugins via `BaseMetaPlugin`. This enables:
- **Local mode**: Callback collects metadata into a `Vec`, written to DB after plugins finish
- **Client mode**: Callback collects into a `HashMap`, sent to server after streaming completes
- **Server mode**: Callback collects into a `Vec`, written to DB after plugins finish (same as local)
### Common Modules
- `common/is_binary.rs` - Binary file detection utilities
- `common/status.rs` - Status information generation
- `common/mod.rs` - `PIPESIZE` constant (8192), `stream_copy()` streaming utility
### Client Module
- `client.rs` - HTTP client wrapper (ureq-based, supports streaming POST)
- `modes/client/save.rs` - 3-thread streaming save (stdin → tee → compress → pipe → HTTP POST)
- `modes/client/save.rs` - 3-thread streaming save with local meta plugins (stdin → tee → compress → meta plugins → pipe → HTTP POST)
- `modes/client/get.rs` - Get with server-side raw fetch + local decompression
- `modes/client/list.rs` - List delegation to server
- `modes/client/info.rs` - Info delegation to server
- `modes/client/delete.rs` - Delete delegation to server
- `modes/client/diff.rs` - Diff delegation to server
- `modes/client/status.rs` - Status delegation to server
- `modes/client/update.rs` - Update delegation to server (sends plugin names/metadata/tags)
### Utility Modules
- `plugins.rs` - Shared plugin utilities
@@ -110,7 +117,7 @@
## Data Storage
### Database Schema
- `items` table: id (primary key), ts (timestamp), size (optional), compression
- `items` table: id (primary key), ts (timestamp), uncompressed_size (optional), compressed_size (optional), closed (boolean), compression
- `tags` table: id (foreign key to items), name (tag name)
- `metas` table: id (foreign key to items), name (meta key), value (meta value)
- Indexes on tag names and meta names for faster queries
@@ -128,16 +135,20 @@
### Item Operations
- `GET /api/item/` - Get a list of items as JSON. Optional params: `order=newest|oldest`, `start=0`, `count=100`, `tags=tag1,tag2`
- `POST /api/item/` - Add a new item (body: raw content). Query params: `tags`, `metadata` (JSON), `compress=true|false`, `meta=true|false`
- `POST /api/item/` - Add a new item (body: raw content, **streamed** through fixed-size 8192-byte buffers). Query params: `tags`, `metadata` (JSON), `compress=true|false`, `meta=true|false`
- `POST /api/item/<#>/meta` - Add metadata to an existing item (body: JSON object)
- `POST /api/item/<#>/update` - Re-run meta plugins on stored content. Query params: `plugins` (comma-separated), `metadata` (JSON), `tags` (comma-separated, idempotent)
- `DELETE /api/item/<#>` - Delete an item
- `GET /api/item/latest` - Return the latest item as JSON. Optional params: `tags=tag1,tag2`, `allow_binary=true|false`
- `GET /api/item/latest/meta` - Return the latest item metadata as JSON. Optional params: `tags=tag1,tag2`
- `GET /api/item/latest/content` - Return the raw content of the latest item. Optional params: `tags=tag1,tag2`, `decompress=true|false`
- `GET /api/item/latest/content` - Return the raw content of the latest item (**streamed**). Optional params: `tags=tag1,tag2`, `decompress=true|false`
- `GET /api/item/<#>` - Return the item as JSON. Optional params: `allow_binary=true|false`
- `GET /api/item/<#>/meta` - Return the item metadata as JSON
- `GET /api/item/<#>/content` - Return the raw content of the item. Optional params: `decompress=true|false`
- `GET /api/diff` - Diff two items. Params: `id_a`, `id_b`
- `GET /api/item/<#>/content` - Return the raw content of the item (**streamed**). Optional params: `decompress=true|false`
- `GET /api/diff` - Diff two items. Params: `id_a`, `id_b` (individual items capped at 10 MB)
### Server Configuration
- `max_body_size` - Maximum POST body size in bytes (default: unlimited). When exceeded, server returns `413 PAYLOAD_TOO_LARGE` while keeping the partial item already saved through the streaming pipeline. Set to `0` for unlimited.
### Server Modes
- **Plain HTTP** (default): `tokio::net::TcpListener` + `axum::serve()`
@@ -145,10 +156,13 @@
- Conditional selection at startup: cert+key present → HTTPS, otherwise → HTTP
### Client/Server Protocol
- Smart clients (keep CLI) set `compress=false` and `meta=false` on POST, handling compression/metadata locally
- Smart clients (keep CLI) set `compress=false` and `meta=false` on POST, handling compression and meta plugins locally
- Dumb clients (curl) use defaults (`compress=true`, `meta=true`), server handles everything
- Smart client update: sends `plugins` param to server, server runs plugins on stored content (avoids downloading compressed data)
- GET responses include `X-Keep-Compression` header when `decompress=false`
- Streaming save uses chunked transfer encoding for constant memory usage
- **Universal streaming**: All server paths (POST, GET, diff) use `PIPESIZE` (8192) byte buffers
- **413 partial item**: When `max_body_size` is exceeded, the server returns `413` but keeps the partial item already saved through the pipeline (nonfatal design — pipes continue normally)
### Authentication
- Bearer token authentication: `Authorization: Bearer <password>`
@@ -164,26 +178,25 @@
- None (no compression)
## Supported Meta Plugins
- FileMagic - File type detection using file command
- FileMime - MIME type detection using file command
- FileEncoding - File encoding detection using file command
- LineCount - Line count using wc command
- WordCount - Word count using wc command
- Cwd - Current working directory
- Binary - Binary file detection
- Uid - Current user ID
- User - Current username
- Gid - Current group ID
- Group - Current group name
- Shell - Shell path from SHELL environment variable
- ShellPid - Shell process ID from PPID environment variable
- KeepPid - Keep process ID
- DigestSha256 - SHA-256 digest
- DigestMd5 - MD5 digest using md5sum command
- ReadTime - Time taken to read data
- ReadRate - Rate of data reading
- Hostname - System hostname
- FullHostname - Fully qualified domain name
Meta plugins collect metadata during item save. Each plugin produces one or more key-value pairs:
- `magic_file` - File type detection using libmagic (when `magic` feature enabled)
- `infer` - MIME type detection using infer crate (when `infer` feature enabled)
- `tree_magic_mini` - MIME type detection using tree_magic_mini (when `tree_magic_mini` feature enabled)
- `tokens` - LLM token counting using tiktoken (when `tokens` feature enabled)
- `text` - Text analysis: line count, word count, char count, line average length
- `digest` - SHA-256 and MD5 checksums
- `hostname` - System hostname (full and short)
- `cwd` - Current working directory
- `user` - Current username and UID
- `shell` - Shell path from SHELL environment variable
- `shell_pid` - Shell process ID from PPID
- `keep_pid` - Keep process ID
- `env` - Arbitrary environment variables (via `KEEP_META_ENV_*` prefix)
- `exec` - Execute external commands for custom metadata
- `read_time` - Time taken to read content
- `read_rate` - Content read rate (bytes/second)
## Testing Strategy
- Unit tests for each module in `src/tests/`
@@ -207,12 +220,19 @@
- TLS/HTTPS support via rustls when certificate and key are provided
- Proper resource cleanup using RAII patterns
- Safe handling of external processes with proper stdin/stdout management
- **Streaming architecture**: All server I/O uses fixed-size 8192-byte buffers; no full file contents held in memory
- **XSS protection**: All user-controlled data in HTML pages is escaped via `html-escape`
- **Security headers**: `X-Content-Type-Options: nosniff`, `X-Frame-Options: DENY`, `Referrer-Policy: strict-origin-when-cross-origin`
- **CORS**: Explicit allowed headers only (`Content-Type`, `Authorization`, `Accept`); no wildcard headers
- **Input limits**: Tags (256 chars), metadata keys (128 chars), metadata values (4096 chars), pagination (10,000 max)
- **Config file size**: 4 KB cap with `from_utf8_lossy` for safe UTF-8 handling
- **Error sanitization**: Internal errors never exposed in HTML responses
- **No `unsafe_code`**: Enforced via `#![deny(unsafe_code)]` (exceptions: `libc::umask` in main.rs, `unsafe impl Send` for `SendCookie` in magic_file.rs)
## Feature Flags
- `server` - HTTP REST API server (axum-based)
- `tls` - HTTPS/TLS support for server (axum-server + rustls)
- `client` - HTTP client for remote server (ureq-based, includes streaming save)
- `mcp` - Model Context Protocol for AI assistant integration
- `swagger` - OpenAPI/Swagger UI documentation
- `magic` - File type detection via libmagic
- `lz4` - LZ4 compression (internal)

Dockerfile

@@ -23,7 +23,7 @@ RUN cargo fetch --target x86_64-unknown-linux-musl
# magic feature excluded (requires shared libmagic; fallback uses `file` command)
COPY src/ src/
RUN cargo build --release --target x86_64-unknown-linux-musl \
--no-default-features --features lz4,gzip,server,mcp,swagger,client,tls \
--no-default-features --features lz4,gzip,server,swagger,client,tls \
&& strip target/x86_64-unknown-linux-musl/release/keep
# Runtime stage - scratch since binary is fully static
@@ -64,4 +64,4 @@ ENV KEEP_SERVER_PORT=21080
# ENV KEEP_CLIENT_PASSWORD=""
# ENV KEEP_CLIENT_JWT=""
ENTRYPOINT ["/keep"]
ENTRYPOINT ["/keep", "--server"]

README.md

@@ -33,7 +33,6 @@ keep --get api-data
- [Server Mode](#server-mode)
- [Client Mode](#client-mode)
- [API Endpoints](#api-endpoints)
- [MCP (Model Context Protocol)](#mcp-model-context-protocol)
- [Shell Integration](#shell-integration)
- [Feature Flags](#feature-flags)
- [License](#license)
@@ -46,7 +45,6 @@ keep --get api-data
- **Filters** — Apply transformations (head, tail, grep, strip ANSI) on retrieval
- **Querying** — List, search, diff items with flexible formatting
- **Client/server architecture** — Optional HTTP server with streaming support
- **MCP support** — Model Context Protocol integration for AI assistants
- **Modular design** — Extensible plugin system for compression, metadata, and filtering
## Installation
@@ -72,6 +70,54 @@ cargo install --path .
# Binary at bin/keep
```
### Environment Module
A TCL modulefile is provided at `modulefile`. To use it, copy or symlink the project directory into your modules path:
```sh
# Symlink into an existing module path (e.g., /usr/local/modules)
ln -s /path/to/keep /usr/local/modules/keep
# Load the module
module load keep
# Verify
keep --status
# Source the shell profile (optional, for shell integration)
source $KEEP_BASH_PROFILE # bash
source $KEEP_ZSH_PROFILE # zsh
source $KEEP_SH_PROFILE # sh/dash/ksh
source $KEEP_CSH_PROFILE # csh/tcsh
```
The modulefile prepends `keep/bin` to `PATH` and sets shell-specific profile variables:
| Variable | Profile | Shell |
|----------|---------|-------|
| `KEEP_BASH_PROFILE` | `profile.bash` | bash |
| `KEEP_ZSH_PROFILE` | `profile.zsh` | zsh |
| `KEEP_SH_PROFILE` | `profile.sh` | sh, dash, ksh93, pdksh, mksh |
| `KEEP_CSH_PROFILE` | `profile.csh` | csh, tcsh |
### Shell Completion
Tab completion is available for `bash`, `zsh`, `fish`, `elvish`, and `powershell`. Completions for `@` (save) and `@@` (get) are available for `bash` and `zsh` only.
**Bash** — add to `~/.bashrc`:
```sh
. <(keep --generate-completion bash)
```
**Zsh** — add to `~/.zshrc`:
```sh
. <(keep --generate-completion zsh)
```
**With `profile.bash` or `profile.zsh`**: Completions for `keep`, `@` (save), and `@@` (get) are loaded automatically when sourcing the profile.
### Build with Server/Client Features
```sh
@@ -82,16 +128,19 @@ cargo build --release --features server
cargo build --release --features client
# Server + client + all optional features
cargo build --release --features server,tls,client,swagger,mcp
cargo build --release --features server,client,swagger
```
## Quick Start
```sh
# Save content with a tag
echo "Hello, world!" | keep --save greeting
# Save content with a tag (--save is optional when piping)
echo "Hello, world!" | keep greeting
# Retrieve by tag
# Retrieve by ID (--get is optional for numeric IDs)
keep 1
# Retrieve by tag (--get is required for tags)
keep --get greeting
# List all stored items
@@ -100,8 +149,8 @@ keep --list
# Get item details
keep --info greeting
# Delete by tag
keep --delete greeting
# Delete by ID
keep --delete 1
```
### Real-World Examples
@@ -130,36 +179,36 @@ keep --list --meta project=myapp
### Save Mode
Save stdin content with tags and metadata.
Save stdin content with tags and metadata. The `--save` flag is optional when piping content.
```sh
# Save (auto-assigned ID, no tag)
echo "data" | keep --save
# Save with a tag
# Save with a tag (--save is optional when piping)
echo "data" | keep --save my-tag
echo "data" | keep my-tag
# Save with multiple tags and metadata
cat report.pdf | keep --save report --meta project=alpha --meta env=prod
# Specify compression and digest algorithm
echo "data" | keep --save my-tag --compression gzip --digest sha256
# Specify compression
echo "data" | keep --save my-tag --compression gzip
```
Tags and metadata make items easy to find later. Tags are simple identifiers; metadata is key-value pairs.
### Get Mode
Retrieve items by ID or tags. This is the default mode when IDs are provided.
Retrieve items by ID. This is the default mode when numeric IDs are provided.
```sh
# Get by ID
# Get by ID (no --get needed for numeric IDs)
keep --get 1
keep 1
# Get by tag
# Get by tag (requires --get flag)
keep --get my-tag
keep my-tag
# Get with filters applied
keep --get 1 --filters "head_lines(10)"
@@ -207,7 +256,7 @@ keep --info --meta key=value
### Update Mode
Update an item's tags and metadata.
Update an item's tags, metadata, and re-run meta plugins.
```sh
# Replace tags
@@ -218,6 +267,9 @@ keep --update 1 --meta key=newvalue
# Remove a metadata key
keep --update 1 --meta key
# Re-run meta plugins on stored content
keep --update 1 --meta-plugin digest --meta-plugin text
```
### Delete Mode
@@ -293,8 +345,8 @@ Items are compressed automatically on save. Default: LZ4.
| `gzip` | Internal | Fast | Good |
| `bzip2` | External | Slow | Better |
| `xz` | External | Slowest | Best |
| `zstd` | External | Fast | Good |
| `none` | Internal | N/A | N/A |
| `zstd` | Internal | Fast | Good |
| `raw` | Internal | N/A | N/A |
```sh
# Specify compression per item
@@ -315,7 +367,7 @@ Metadata is automatically extracted when saving items.
| `env` | `*` | Capture `KEEP_META_*` environment variables |
| `magic_file` | `file_type` | File type detection (requires `magic` feature) |
| `text` | `text_line_count`, `text_word_count` | Line and word counts |
| `user` | `uid`, `user`, `gid`, `group` | Current user info |
| `user` | `user_uid`, `user_name`, `user_gid`, `user_group` | Current user info |
| `shell` | `shell` | Current shell path |
| `shell_pid` | `shell_pid` | Shell process ID |
| `keep_pid` | `keep_pid` | Keep process ID |
@@ -327,8 +379,11 @@ Metadata is automatically extracted when saving items.
| `cwd` | `cwd` | Current working directory |
```sh
# Use specific plugins
echo "data" | keep --save tag --meta-plugins "digest,text,user"
# Use specific plugins (repeatable)
echo "data" | keep --save tag --meta-plugin digest --meta-plugin text --meta-plugin user
# Pass options to a plugin via JSON
echo "data" | keep --save tag --meta-plugin 'tokens:{"options":{"min_length":"2"}}'
# Capture custom metadata via environment
KEEP_META_project=alpha echo "data" | keep --save tag
@@ -346,7 +401,7 @@ KEEP_META_build=1234 echo "data" | keep --save tag --meta env=staging
| `KEEP_DIR` | Storage directory | `~/.keep` |
| `KEEP_CONFIG` | Config file path | `~/.config/keep/config.yml` |
| `KEEP_COMPRESSION` | Compression algorithm | `lz4` |
| `KEEP_META_PLUGINS` | Meta plugins to use | `env` |
| `KEEP_META_PLUGINS` | Meta plugins to use (comma-separated `name[:{json}]` entries) | `env` |
| `KEEP_FILTERS` | Default filter chain | none |
| `KEEP_LIST_FORMAT` | List column format | built-in defaults |
| `KEEP_SERVER_ADDRESS` | Server bind address | `127.0.0.1` |
@@ -356,6 +411,7 @@ KEEP_META_build=1234 echo "data" | keep --save tag --meta env=staging
| `KEEP_SERVER_PASSWORD_HASH` | Server password hash | none |
| `KEEP_SERVER_JWT_SECRET` | JWT secret for token auth | none |
| `KEEP_SERVER_JWT_SECRET_FILE` | Path to JWT secret file | none |
| `KEEP_SERVER_MAX_BODY_SIZE` | Maximum POST body size in bytes (0=unlimited) | unlimited |
| `KEEP_SERVER_CERT` | TLS certificate file path (PEM) | none |
| `KEEP_SERVER_KEY` | TLS private key file path (PEM) | none |
| `KEEP_CLIENT_URL` | Remote keep server URL | none |
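For example, the `name[:{json}]` entry format accepted by `KEEP_META_PLUGINS` (a sketch; the JSON payload follows the same shape as the `--meta-plugin` examples above):
```sh
# Plain plugin names, comma-separated
export KEEP_META_PLUGINS=digest,text
# Attach JSON options to an entry using the name:{json} form
export KEEP_META_PLUGINS='digest,tokens:{"options":{"min_length":"2"}}'
```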
@@ -416,6 +472,8 @@ server:
port: 21080
username: "keep"
password: "secret"
# Maximum POST body size in bytes (0 = unlimited)
# max_body_size: 52428800 # 50 MB
# JWT authentication (takes priority over password)
# jwt_secret: "my-secret-key"
# jwt_secret_file: /path/to/jwt_secret
@@ -612,6 +670,33 @@ keep --client-url https://localhost:21080 --save my-tag
The server accepts data from both dumb clients (raw HTTP/curl) and smart clients (the keep CLI).
#### Server Streaming
The server streams all data through fixed-size buffers (8192 bytes). At no point is the entire file content held in memory.
- **POST**: Body streams through the compression and storage pipeline in chunks. If `max_body_size` is exceeded mid-stream, the server returns `413 PAYLOAD_TOO_LARGE`; the partial item already written through the pipeline remains stored.
- **GET**: Content streams from disk through decompression to the client using the same fixed-size buffers.
- **Diff**: Individual items are capped at 10 MB for the diff endpoint to prevent unbounded memory use.
##### Max Body Size
Control the maximum accepted body size with:
```sh
# Via CLI flag (bytes)
keep --server --server-max-body-size 52428800
# Via environment variable
export KEEP_SERVER_MAX_BODY_SIZE=52428800
keep --server
# Via config file (config.yml)
server:
max_body_size: 52428800 # 50 MB
```
When set to `0` or omitted, no limit is enforced.
#### Server Query Parameters
The server supports query parameters that control processing:
@@ -624,6 +709,14 @@ The server supports query parameters that control processing:
| `meta` | `true` | `false` = client handles metadata, skip server-side plugins |
| `decompress` | `true` | `false` = return raw compressed bytes on GET |
The `POST /api/item/{id}/update` endpoint accepts additional parameters:
| Parameter | Default | Description |
|-----------|---------|-------------|
| `plugins` | none | Comma-separated plugin names to re-run on stored content |
| `metadata` | none | JSON-encoded metadata overrides to apply |
| `tags` | none | Comma-separated tags to add (idempotent) |
When using a smart client, these are set automatically. For curl, the server handles everything by default.
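For instance, re-running meta plugins through the update endpoint above with curl (a minimal sketch assuming a local server on port 21080 with authentication disabled):
```sh
# Re-run the digest and text plugins on item 1
curl -X POST "http://localhost:21080/api/item/1/update?plugins=digest,text"
```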
#### Example: Curl as a Dumb Client
@@ -668,9 +761,10 @@ export KEEP_CLIENT_JWT=<jwt-token>
Client mode uses **local plugins** and **remote storage**:
1. **Save**: Local compression and metadata plugins run on the client; compressed data streams to the server
1. **Save**: Local compression and meta plugins run on the client; compressed data streams to the server. Smart clients set `meta=false` so the server skips its own plugins.
2. **Get**: Server sends raw compressed data; client decompresses locally and applies filters
3. **Other operations** (list, info, delete, diff): Delegated directly to the server
3. **Update**: Meta plugins run on the server to avoid downloading compressed data for re-processing
4. **Other operations** (list, info, delete, diff): Delegated directly to the server
This means client behavior is consistent with local mode — the same compression settings and filters apply.
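For example, the same compression choice applies whether storage is local or remote (a sketch assuming a reachable server):
```sh
# Local mode: gzip compression, local storage
echo "data" | keep --save my-tag --compression gzip
# Client mode: the same gzip compression runs locally, storage is remote
echo "data" | keep --client-url https://localhost:21080 --save my-tag --compression gzip
```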
@@ -679,24 +773,25 @@ This means client behavior is consistent with local mode — the same compressio
Client save uses a 3-thread streaming pipeline for constant memory usage regardless of data size:
```
┌──────────────┐ OS pipe ┌────────────────┐
┌───────────────────┐ OS pipe ┌────────────────┐
│ Reader thread ├──────────────────┤ Streamer thread│
│ │ (compressed │ │
│ stdin → tee │ bytes) │ pipe → POST │
│ → hash │ │ (chunked) │
│ → compress │ │ │
└──────────────┘ └────────────────┘
│ → meta plugins │
└───────────────────┘ └────────────────┘
│ │
▼ ▼
stdout + Server stores blob
SHA-256 digest
computed metadata
```
- **Reader thread**: Reads stdin, tees output to stdout, computes SHA-256, compresses data, writes to OS pipe
- **Reader thread**: Reads stdin, tees output to stdout, computes SHA-256 via digest plugin, compresses data, runs meta plugins (hostname, text, etc.), writes to OS pipe
- **Streamer thread**: Reads compressed bytes from pipe, streams to server via chunked HTTP POST
- **Main thread**: After streaming completes, sends computed metadata (digest, hostname, size) to server
- **Main thread**: After streaming completes, sends plugin-collected metadata to server
Memory usage is O(PIPESIZE) — typically 64KB — regardless of how much data is being stored.
Memory usage is O(PIPESIZE) — typically 8 KB — regardless of how much data is being stored.
#### Example: Remote Pipeline
@@ -729,6 +824,7 @@ keep --client-url http://logserver:21080 --list --meta project=myapp
| `GET` | `/api/item/{id}/meta` | Item metadata by ID |
| `GET` | `/api/item/{id}/info` | Item info by ID |
| `POST` | `/api/item/{id}/meta` | Add metadata to existing item (body: JSON object) |
| `POST` | `/api/item/{id}/update` | Re-run meta plugins on stored content (params: `plugins`, `metadata`, `tags`) |
| `DELETE` | `/api/item/{id}` | Delete item by ID |
| `GET` | `/api/diff` | Diff two items (`id_a`, `id_b` params) |
@@ -769,42 +865,54 @@ cargo build --features server,swagger
Swagger UI available at `/swagger`, OpenAPI spec at `/openapi.json`.
## MCP (Model Context Protocol)
#### Security
AI assistant integration via the Model Context Protocol. Enable with the `mcp` feature.
The server applies the following security measures:
```sh
cargo build --features server,mcp
```
MCP endpoint available at `/mcp/sse` when the server is running.
### Available Tools
| Tool | Description | Parameters |
|------|-------------|------------|
| `save_item` | Save new content | `content`, `tags[]`, `metadata{}` |
| `get_item` | Get item by ID | `id` |
| `get_latest_item` | Get latest item | `tags[]` |
| `list_items` | List items | `tags[]`, `limit`, `offset` |
| `search_items` | Search items | `tags[]`, `metadata{}` |
- **Input validation**: Item IDs are validated as positive integers; tags and metadata have length limits (256 and 128 characters respectively).
- **XSS protection**: All user-controlled data rendered into HTML pages is escaped.
- **Security headers**: Responses include `X-Content-Type-Options: nosniff`, `X-Frame-Options: DENY`, and `Referrer-Policy: strict-origin-when-cross-origin`.
- **CORS**: Explicit allowed headers (`Content-Type`, `Authorization`, `Accept`); no wildcard headers.
- **Path traversal**: Item IDs are validated to prevent directory traversal attacks.
- **Internal errors**: Internal error details are never exposed in HTML responses — only generic messages are shown.
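A quick way to check the headers from a shell (a hedged sketch; assumes the server is reachable on localhost:21080 with no authentication configured):
```sh
# Inspect response headers on any endpoint
curl -sI http://localhost:21080/api/status | \
  grep -iE 'x-content-type-options|x-frame-options|referrer-policy'
```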
## Shell Integration
Source `profile.bash` to enable shell integration:
Profile scripts are provided for several shells. Source the appropriate one to enable shell integration:
| Profile | Shells | Features |
|---------|--------|----------|
| `profile.bash` | bash | Preexec hook, wrapper function, `@`/`@@` aliases, tab completions |
| `profile.zsh` | zsh | Preexec hook, wrapper function, `@`/`@@` aliases, tab completions |
| `profile.sh` | sh, dash, ksh93, pdksh, mksh | Wrapper function, `@`/`@@` aliases |
| `profile.csh` | csh, tcsh | Alias-based `keep` wrapper, `@`/`@@` aliases |
```sh
# bash
source /path/to/keep/profile.bash
# zsh
source /path/to/keep/profile.zsh
# sh, dash, ksh
source /path/to/keep/profile.sh
# csh/tcsh
source /path/to/keep/profile.csh
```
This provides:
All profiles provide:
- **`keep` wrapper** — Wraps the `keep` binary and exports `KEEP_META_tty` to tag items with the current terminal
- **`@` alias** — Shorthand for `keep --save`
- **`@@` alias** — Shorthand for `keep --get`
Bash and zsh profiles additionally provide:
- **`keep` function** — Captures the current command in metadata automatically
- **Tab completion** — For `keep`, `@`, and `@@`
```sh
# Save with automatic command capture
# Save with automatic command capture (bash/zsh)
curl -s api.example.com | @ api-response
# Quick retrieve
@@ -821,7 +929,6 @@ curl -s api.example.com | @ api-response
| `server` | No | HTTP REST API server |
| `tls` | No | HTTPS/TLS server support (requires `server`) |
| `client` | No | HTTP client for remote server |
| `mcp` | No | Model Context Protocol support |
| `swagger` | No | Swagger UI for API docs |
| `bzip2` | No | BZip2 compression (external program) |
| `xz` | No | XZ compression (external program) |
@@ -838,7 +945,7 @@ cargo build --features server,tls
cargo build --features client
# Everything
cargo build --features server,tls,client,mcp,swagger,magic
cargo build --features server,tls,client,swagger,magic
```
## License

View File

@@ -2,7 +2,6 @@
set -ex
export RUSTFLAGS='-C target-feature=+crt-static'
cargo build --release --target x86_64-unknown-linux-gnu
cargo build --release --target x86_64-unknown-linux-musl
mkdir -p bin
cp target/x86_64-unknown-linux-gnu/release/keep ./bin/
cp target/x86_64-unknown-linux-musl/release/keep ./bin/

View File

@@ -15,3 +15,6 @@ module-whatis Keep
prepend-path PATH $mydir/bin
setenv KEEP_BASH_PROFILE ${mydir}/profile.bash
setenv KEEP_ZSH_PROFILE ${mydir}/profile.zsh
setenv KEEP_SH_PROFILE ${mydir}/profile.sh
setenv KEEP_CSH_PROFILE ${mydir}/profile.csh

View File

@@ -6,18 +6,10 @@ function __keep_preexec {
}
function __keep_preexec_init {
local found=false
local f
for f in "${preexec_functions[@]}"; do
if [[ $f = __keep_preexec ]]; then
found=true
break
fi
[[ $f = __keep_preexec ]] && return
done
if [[ $found = false ]]; then
preexec_functions+=(__keep_preexec)
fi
}
function keep {
@@ -40,4 +32,20 @@ function @@ {
keep --get "$@"
}
# Shell completions
. <(command keep --generate-completion bash)
___keep_complete() {
local mode="$1"
COMP_WORDS=(keep "$mode" "${COMP_WORDS[@]:1}")
COMP_CWORD=$((COMP_CWORD + 1))
_keep
}
___keep_save_completion() { ___keep_complete --save; }
___keep_get_completion() { ___keep_complete --get; }
complete -F ___keep_save_completion @
complete -F ___keep_get_completion @@
__keep_preexec_init

profile.csh Normal file
View File

@@ -0,0 +1,11 @@
#!/bin/csh
# Profile for csh and tcsh.
# Preexec hooks are not available; KEEP_META_command is not set.
if ( ! $?KEEP_META_tty ) then
setenv KEEP_META_tty `tty`
endif
alias keep 'env KEEP_META_tty=${KEEP_META_tty} command keep \!*'
alias @ 'keep --save \!*'
alias @@ 'keep --get \!*'

profile.sh Normal file
View File

@@ -0,0 +1,13 @@
#!/bin/sh
# POSIX-compatible profile for sh, dash, ksh93, pdksh, mksh, and other POSIX shells.
# Preexec hooks are not available in these shells; KEEP_META_command is not set.
KEEP_META_tty=${KEEP_META_tty:-$(tty)}
keep() {
export KEEP_META_tty
command keep "$@"
}
alias @='keep --save'
alias @@='keep --get'

profile.zsh Normal file
View File

@@ -0,0 +1,38 @@
#!/bin/zsh
autoload -U add-zsh-hook
__keep_preexec() {
KEEP_META_command="$1"
KEEP_META_tty=${KEEP_META_tty:-$(tty)}
}
add-zsh-hook preexec __keep_preexec
keep() {
if [[ $ZSH_SUBSHELL -le 2 ]]; then
export KEEP_META_command
fi
export KEEP_META_tty
command keep "$@"
}
alias @='keep --save'
alias @@='keep --get'
# Shell completions
. <(command keep --generate-completion zsh)
___keep_complete() {
local mode="$1"
local -a words
words=(keep "$mode" "${words[@]:1}")
((CURRENT++))
_keep
}
___keep_save_completion() { ___keep_complete --save; }
___keep_get_completion() { ___keep_complete --get; }
compdef ___keep_save_completion @
compdef ___keep_get_completion @@

View File

@@ -2,6 +2,7 @@ use std::path::PathBuf;
use std::str::FromStr;
use clap::*;
use clap_complete::Shell;
/// Main struct for command-line arguments, parsed via Clap.
#[derive(Parser, Debug, Clone)]
@@ -23,70 +24,157 @@ pub struct Args {
/// Struct for mode-specific arguments, defining CLI flags for different operations.
#[derive(Parser, Debug, Clone)]
pub struct ModeArgs {
#[arg(group("mode"), help_heading("Mode Options"), short, long, conflicts_with_all(["get", "diff", "list", "delete", "info", "status"]))]
#[arg(group("mode"), help_heading("Mode Options"), short, long)]
#[arg(help("Save an item using any tags or metadata provided"))]
pub save: bool,
#[arg(group("mode"), help_heading("Mode Options"), short, long, conflicts_with_all(["save", "diff", "list", "delete", "info", "status"]))]
#[arg(help(
"Get an item either by it's ID or by a combination of matching tags and metatdata"
))]
#[arg(group("mode"), help_heading("Mode Options"), short, long)]
#[arg(help("Get an item either by its ID or by a combination of matching tags and metadata"))]
pub get: bool,
#[arg(group("mode"), help_heading("Mode Options"), long, conflicts_with_all(["save", "get", "list", "delete", "info", "status"]))]
#[arg(group("mode"), help_heading("Mode Options"), long)]
#[arg(help("Show a diff between two items by ID"))]
pub diff: bool,
#[arg(group("mode"), help_heading("Mode Options"), short, long, conflicts_with_all(["save", "get", "diff", "delete", "info", "status"]))]
#[arg(group("mode"), help_heading("Mode Options"), short, long)]
#[arg(help("List items, filtering on tags or metadata if given"))]
pub list: bool,
#[arg(group("mode"), help_heading("Mode Options"), short, long, conflicts_with_all(["save", "get", "diff", "list", "info", "status"]))]
#[arg(group("mode"), help_heading("Mode Options"), short, long)]
#[arg(help("Delete items either by ID or by matching tags"))]
#[arg(requires = "ids_or_tags")]
pub delete: bool,
#[arg(group("mode"), help_heading("Mode Options"), short, long, conflicts_with_all(["save", "get", "diff", "list", "delete", "status"]))]
#[arg(help(
"Get an item either by it's ID or by a combination of matching tags and metatdata"
))]
#[arg(group("mode"), help_heading("Mode Options"), short, long)]
#[arg(help("Get an item either by its ID or by a combination of matching tags and metadata"))]
pub info: bool,
#[arg(group("mode"), help_heading("Mode Options"), short('S'), long, conflicts_with_all(["save", "get", "diff", "list", "delete", "info", "server", "status_plugins"]))]
#[arg(group("mode"), help_heading("Mode Options"), short('u'), long)]
#[arg(help("Update an item's tags and metadata by ID"))]
pub update: bool,
#[arg(group("mode"), help_heading("Mode Options"), short('S'), long)]
#[arg(help("Show status of directories and supported compression algorithms"))]
pub status: bool,
#[arg(group("mode"), help_heading("Mode Options"), long, conflicts_with_all(["save", "get", "diff", "list", "delete", "info", "status", "server"]))]
#[arg(group("mode"), help_heading("Mode Options"), long)]
#[arg(help("Show available plugins and their configurations"))]
pub status_plugins: bool,
#[arg(group("mode"), help_heading("Mode Options"), long, conflicts_with_all(["save", "get", "diff", "list", "delete", "info", "status"]))]
#[arg(group("mode"), help_heading("Mode Options"), long)]
#[arg(help("Export items to a .keep.tar archive (requires IDs or tags)"))]
pub export: bool,
#[arg(group("mode"), help_heading("Mode Options"), long, value_name("FILE"))]
#[arg(help("Import items from a .keep.tar archive or legacy .meta.yml file"))]
pub import: Option<String>,
#[cfg(feature = "server")]
#[arg(group("mode"), help_heading("Mode Options"), long)]
#[arg(help("Start REST HTTP server"))]
pub server: bool,
#[arg(group("mode"), help_heading("Mode Options"), long, conflicts_with_all(["save", "get", "diff", "list", "delete", "info", "status", "server"]))]
#[arg(group("mode"), help_heading("Mode Options"), long)]
#[arg(help("Generate default configuration and output to stdout"))]
pub generate_config: bool,
#[arg(help_heading("Mode Options"), long)]
#[arg(help("Generate shell completion script"))]
pub generate_completion: Option<Shell>,
#[cfg(feature = "server")]
#[arg(help_heading("Server Options"), long, env("KEEP_SERVER_ADDRESS"))]
#[arg(help("Server address to bind to"))]
pub server_address: Option<String>,
#[cfg(feature = "server")]
#[arg(help_heading("Server Options"), long, env("KEEP_SERVER_PORT"))]
#[arg(help("Server port to bind to"))]
pub server_port: Option<u16>,
#[cfg(feature = "tls")]
#[cfg(feature = "server")]
#[arg(help_heading("Server Options"), long, env("KEEP_SERVER_CERT"))]
#[arg(help("Path to TLS certificate file (PEM) for HTTPS"))]
pub server_cert: Option<PathBuf>,
#[cfg(feature = "tls")]
#[cfg(feature = "server")]
#[arg(help_heading("Server Options"), long, env("KEEP_SERVER_KEY"))]
#[arg(help("Path to TLS private key file (PEM) for HTTPS"))]
pub server_key: Option<PathBuf>,
}
/// Represents a meta plugin argument with optional JSON config.
///
/// Parsed from `name` or `name:{"options":{...},"outputs":{...}}` syntax.
#[derive(Debug, Clone)]
pub struct MetaPluginArg {
pub name: String,
pub options: Option<serde_json::Value>,
}
impl FromStr for MetaPluginArg {
type Err = anyhow::Error;
fn from_str(s: &str) -> Result<Self, Self::Err> {
if let Some((name, json_str)) = s.split_once(':') {
let value: serde_json::Value = serde_json::from_str(json_str)
.map_err(|e| anyhow::anyhow!("Invalid JSON for meta plugin '{}': {}", name, e))?;
Ok(MetaPluginArg {
name: name.to_string(),
options: Some(value),
})
} else {
Ok(MetaPluginArg {
name: s.to_string(),
options: None,
})
}
}
}
/// Represents a metadata key-value argument.
///
/// Parsed from `key=value` (set) or `key` (delete/filter by existence).
#[derive(Debug, Clone)]
pub enum MetaArg {
/// Set metadata with a value.
Set { key: String, value: String },
/// Bare key without a value (delete in update mode, filter by existence otherwise).
Key(String),
}
impl MetaArg {
/// Returns the key.
pub fn key(&self) -> &str {
match self {
MetaArg::Set { key, .. } | MetaArg::Key(key) => key,
}
}
/// Returns the value if this is a Set variant.
pub fn value(&self) -> Option<&str> {
match self {
MetaArg::Set { value, .. } => Some(value),
MetaArg::Key(_) => None,
}
}
}
impl FromStr for MetaArg {
type Err = anyhow::Error;
fn from_str(s: &str) -> Result<Self, Self::Err> {
if let Some((key, value)) = s.split_once('=') {
Ok(MetaArg::Set {
key: key.to_string(),
value: value.to_string(),
})
} else {
Ok(MetaArg::Key(s.to_string()))
}
}
}
/// Struct for item-specific arguments, such as compression and plugins.
#[derive(Parser, Debug, Clone)]
pub struct ItemArgs {
@@ -97,15 +185,32 @@ pub struct ItemArgs {
#[arg(
help_heading("Item Options"),
short('M'),
long,
long = "meta-plugin",
value_parser = clap::value_parser!(MetaPluginArg),
env("KEEP_META_PLUGINS")
)]
#[arg(help("Meta plugins to use when saving items"))]
pub meta_plugins: Vec<String>,
#[arg(help("Meta plugin to use (repeatable): name or name:{json}"))]
pub meta_plugins: Vec<MetaPluginArg>,
#[arg(help_heading("Item Options"), long)]
#[arg(help("Metadata key=value to set (or key to delete in --update)"))]
pub meta: Vec<String>,
#[arg(help_heading("Item Options"), long, env("KEEP_FILTERS"))]
#[arg(help("Filter string to apply to content when getting items"))]
pub filters: Option<String>,
#[arg(help_heading("Export Options"), long, default_value = "{name}_{ts}")]
#[arg(help("Template for export tar filename (appends .keep.tar). Variables: {name} {ts}"))]
pub export_filename_format: String,
#[arg(help_heading("Export Options"), long, value_name("NAME"))]
#[arg(help("Export name used for {name} variable (default: export_<common-tags>)"))]
pub export_name: Option<String>,
#[arg(help_heading("Import Options"), long, value_name("DATA_FILE"))]
#[arg(help("Data file for import (reads from stdin if omitted)"))]
pub import_data_file: Option<PathBuf>,
}
/// Struct for general options, including verbosity, paths, and output settings.
@@ -122,7 +227,7 @@ pub struct OptionsArgs {
#[arg(
long,
env("KEEP_LIST_FORMAT"),
default_value("id,time,size,tags,meta:hostname")
default_value("id,time,size,meta:text_line_count,tags,meta:hostname_short,meta:command")
)]
#[arg(help("A comma separated list of columns to display with --list"))]
pub list_format: String,
@@ -131,6 +236,10 @@ pub struct OptionsArgs {
#[arg(help("Display file sizes with units"))]
pub human_readable: bool,
#[arg(long)]
#[arg(help("Only output item IDs (for scripting)"))]
pub ids_only: bool,
#[arg(short, long, action = clap::ArgAction::Count, conflicts_with("quiet"))]
#[arg(help("Increase message verbosity, can be given more than once"))]
pub verbose: u8,
@@ -143,28 +252,42 @@ pub struct OptionsArgs {
#[arg(help("Output format (only works with --info, --status, --list)"))]
pub output_format: Option<String>,
#[arg(long, env("KEEP_SERVER_PASSWORD"))]
#[cfg(feature = "server")]
#[arg(help_heading("Server Options"), long, env("KEEP_SERVER_PASSWORD"))]
#[arg(help("Password for server authentication (requires --server)"))]
pub server_password: Option<String>,
#[arg(long, env("KEEP_SERVER_PASSWORD_HASH"))]
#[cfg(feature = "server")]
#[arg(help_heading("Server Options"), long, env("KEEP_SERVER_PASSWORD_HASH"))]
#[arg(help("Password hash for server authentication (requires --server)"))]
pub server_password_hash: Option<String>,
#[arg(long, env("KEEP_SERVER_USERNAME"))]
#[cfg(feature = "server")]
#[arg(help_heading("Server Options"), long, env("KEEP_SERVER_USERNAME"))]
#[arg(help(
"Username for server Basic authentication (requires --server, defaults to 'keep')"
))]
pub server_username: Option<String>,
#[arg(long, env("KEEP_SERVER_JWT_SECRET"))]
#[cfg(feature = "server")]
#[arg(help_heading("Server Options"), long, env("KEEP_SERVER_JWT_SECRET"))]
#[arg(help("JWT secret for token-based authentication (requires --server)"))]
pub server_jwt_secret: Option<String>,
#[arg(long, env("KEEP_SERVER_JWT_SECRET_FILE"))]
#[cfg(feature = "server")]
#[arg(
help_heading("Server Options"),
long,
env("KEEP_SERVER_JWT_SECRET_FILE")
)]
#[arg(help("Path to file containing JWT secret (requires --server)"))]
pub server_jwt_secret_file: Option<PathBuf>,
#[cfg(feature = "server")]
#[arg(help_heading("Server Options"), long, env("KEEP_SERVER_MAX_BODY_SIZE"))]
#[arg(help("Maximum request body size in bytes (requires --server, default: unlimited)"))]
pub server_max_body_size: Option<u64>,
#[cfg(feature = "client")]
#[arg(long, env("KEEP_CLIENT_URL"), help_heading("Client Options"))]
#[arg(help("Remote keep server URL for client mode"))]

View File

@@ -1,18 +1,37 @@
use crate::services::error::CoreError;
use crate::services::{ItemInfo, error::CoreError};
use base64::Engine;
use serde::de::DeserializeOwned;
use std::collections::HashMap;
use std::io::Read;
/// Item information returned from the server API.
#[derive(Debug, Clone, serde::Deserialize, serde::Serialize)]
pub struct ItemInfo {
pub id: i64,
pub ts: String,
pub size: Option<i64>,
pub compression: String,
pub tags: Vec<String>,
pub metadata: HashMap<String, String>,
}
/// Percent-encode a value for use in a URL query string.
fn url_encode(s: &str) -> String {
let mut result = String::with_capacity(s.len() * 3);
for byte in s.bytes() {
match byte {
b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'_' | b'.' | b'~' => {
result.push(byte as char);
}
_ => {
result.push('%');
result.push(char::from_digit((byte >> 4) as u32, 16).unwrap());
result.push(char::from_digit((byte & 0xF) as u32, 16).unwrap());
}
}
}
result
}
fn append_query_params(url: &mut String, params: &[(&str, &str)]) {
if !params.is_empty() {
url.push('?');
for (i, (key, value)) in params.iter().enumerate() {
if i > 0 {
url.push('&');
}
url.push_str(&format!("{}={}", url_encode(key), url_encode(value)));
}
}
}
pub struct KeepClient {
@@ -107,15 +126,7 @@ impl KeepClient {
params: &[(&str, &str)],
) -> Result<T, CoreError> {
let mut url = self.url(path);
if !params.is_empty() {
url.push('?');
for (i, (key, value)) in params.iter().enumerate() {
if i > 0 {
url.push('&');
}
url.push_str(&format!("{key}={value}"));
}
}
append_query_params(&mut url, params);
let mut req = self.agent.get(&url);
if let Some(ref auth) = self.auth_header() {
req = req.header("Authorization", auth);
@@ -160,15 +171,7 @@ impl KeepClient {
params: &[(&str, &str)],
) -> Result<ItemInfo, CoreError> {
let mut url = self.url(path);
if !params.is_empty() {
url.push('?');
for (i, (key, value)) in params.iter().enumerate() {
if i > 0 {
url.push('&');
}
url.push_str(&format!("{key}={value}"));
}
}
append_query_params(&mut url, params);
let mut req = self.agent.post(&url);
if let Some(ref auth) = self.auth_header() {
@@ -205,40 +208,77 @@ impl KeepClient {
Ok(())
}
pub fn get_status(&self) -> Result<serde_json::Value, CoreError> {
self.get_json("/api/status")
pub fn get_status(&self) -> Result<crate::common::status::StatusInfo, CoreError> {
#[derive(serde::Deserialize)]
struct ApiResponse {
data: Option<crate::common::status::StatusInfo>,
error: Option<String>,
}
let response: ApiResponse = self.get_json("/api/status")?;
response.data.ok_or_else(|| {
CoreError::Other(anyhow::anyhow!(
"{}",
response
.error
.unwrap_or_else(|| "No status data returned".to_string())
))
})
}
pub fn get_item_info(&self, id: i64) -> Result<ItemInfo, CoreError> {
#[derive(serde::Deserialize)]
struct ApiResponse {
data: Option<ItemInfo>,
error: Option<String>,
}
let response: ApiResponse = self.get_json(&format!("/api/item/{id}/info"))?;
response.data.ok_or_else(|| {
CoreError::Other(anyhow::anyhow!(
"{}",
response
.data
.ok_or_else(|| CoreError::Other(anyhow::anyhow!("Item not found")))
.error
.unwrap_or_else(|| "Item not found".to_string())
))
})
}
pub fn list_items(
&self,
ids: &[i64],
tags: &[String],
order: &str,
start: u64,
count: u64,
meta: &HashMap<String, Option<String>>,
) -> Result<Vec<ItemInfo>, CoreError> {
#[derive(serde::Deserialize)]
struct ApiResponse {
data: Option<Vec<ItemInfo>>,
error: Option<String>,
}
let mut params: Vec<(String, String)> = Vec::new();
params.push(("order".to_string(), order.to_string()));
params.push(("start".to_string(), start.to_string()));
params.push(("count".to_string(), count.to_string()));
if !ids.is_empty() {
params.push((
"ids".to_string(),
ids.iter()
.map(|i| i.to_string())
.collect::<Vec<_>>()
.join(","),
));
}
if !tags.is_empty() {
params.push(("tags".to_string(), tags.join(",")));
}
if !meta.is_empty() {
let meta_json = serde_json::to_string(meta).map_err(|e| {
CoreError::Other(anyhow::anyhow!("Failed to serialize meta filter: {}", e))
})?;
params.push(("meta".to_string(), meta_json));
}
let param_refs: Vec<(&str, &str)> = params
.iter()
@@ -246,7 +286,13 @@ impl KeepClient {
.collect();
let response: ApiResponse = self.get_json_with_query("/api/item/", &param_refs)?;
Ok(response.data.unwrap_or_default())
if let Some(data) = response.data {
return Ok(data);
}
if let Some(err) = response.error {
return Err(CoreError::Other(anyhow::anyhow!("Server error: {err}")));
}
Ok(Vec::new())
}
pub fn save_item(
@@ -303,7 +349,36 @@ impl KeepClient {
Ok(())
}
/// Set the uncompressed size for an item.
pub fn set_item_size(&self, id: i64, size: u64) -> Result<(), CoreError> {
let url = format!(
"{}?uncompressed_size={}",
self.url(&format!("/api/item/{id}/update")),
url_encode(&size.to_string())
);
let mut req = self.agent.post(&url);
if let Some(ref auth) = self.auth_header() {
req = req.header("Authorization", auth);
}
self.handle_error(req.send(ureq::SendBody::from_reader(&mut std::io::empty())))?;
Ok(())
}
pub fn get_item_content_raw(&self, id: i64) -> Result<(Vec<u8>, String), CoreError> {
let (mut reader, compression) = self.get_item_content_stream(id)?;
let mut bytes = Vec::new();
reader
.read_to_end(&mut bytes)
.map_err(|e| CoreError::Other(anyhow::anyhow!("{}", e)))?;
Ok((bytes, compression))
}
/// Get a streaming reader for item content without decompression.
///
/// Returns a reader over the HTTP response body and the compression type
/// from the X-Keep-Compression header. The caller can stream through
/// decompression readers without buffering the entire file in memory.
pub fn get_item_content_stream(&self, id: i64) -> Result<(Box<dyn Read>, String), CoreError> {
let url = format!(
"{}?decompress=false",
self.url(&format!("/api/item/{id}/content"))
@@ -320,15 +395,11 @@ impl KeepClient {
.headers()
.get("X-Keep-Compression")
.and_then(|v| v.to_str().ok())
.unwrap_or("none")
.unwrap_or("raw")
.to_string();
let mut body = response.into_body();
let bytes = body
.read_to_vec()
.map_err(|e| CoreError::Other(anyhow::anyhow!("{}", e)))?;
Ok((bytes, compression))
let reader = response.into_body().into_reader();
Ok((Box::new(reader), compression))
}
pub fn diff_items(&self, id_a: i64, id_b: i64) -> Result<Vec<String>, CoreError> {
@@ -343,4 +414,101 @@ impl KeepClient {
let response: ApiResponse = self.get_json_with_query("/api/diff", &param_refs)?;
Ok(response.data.unwrap_or_default())
}
/// Export items to a tar archive, streaming the response to a file.
///
/// # Arguments
///
/// * `ids` - Item IDs to export (mutually exclusive with tags).
/// * `tags` - Tags to search for items (mutually exclusive with ids).
/// * `dest` - Destination file path.
pub fn export_items_to_file(
&self,
ids: &[i64],
tags: &[String],
dest: &std::path::Path,
) -> Result<(), CoreError> {
let mut params: Vec<(String, String)> = Vec::new();
if !ids.is_empty() {
let id_strs: Vec<String> = ids.iter().map(|id| id.to_string()).collect();
params.push(("ids".to_string(), id_strs.join(",")));
}
if !tags.is_empty() {
params.push(("tags".to_string(), tags.join(",")));
}
let param_refs: Vec<(&str, &str)> = params
.iter()
.map(|(k, v)| (k.as_str(), v.as_str()))
.collect();
let mut url = self.url("/api/export");
append_query_params(&mut url, &param_refs);
let mut req = self.agent.get(&url);
if let Some(ref auth) = self.auth_header() {
req = req.header("Authorization", auth);
}
let response = self.handle_error(req.call())?;
let mut reader = response.into_body().into_reader();
let mut file = std::fs::File::create(dest).map_err(CoreError::Io)?;
let mut buf = [0u8; crate::common::PIPESIZE];
loop {
let n = reader.read(&mut buf).map_err(CoreError::Io)?;
if n == 0 {
break;
}
std::io::Write::write_all(&mut file, &buf[..n]).map_err(CoreError::Io)?;
}
Ok(())
}
/// Import items from a tar archive, streaming the file to the server.
///
/// # Arguments
///
/// * `tar_path` - Path to the `.keep.tar` file.
///
/// # Returns
///
/// A list of newly assigned item IDs.
pub fn import_tar_file(&self, tar_path: &std::path::Path) -> Result<Vec<i64>, CoreError> {
#[derive(serde::Deserialize)]
struct ApiResponse {
data: Option<ImportResponse>,
error: Option<String>,
}
#[derive(serde::Deserialize)]
struct ImportResponse {
ids: Vec<i64>,
}
let mut file = std::fs::File::open(tar_path).map_err(CoreError::Io)?;
let url = self.url("/api/import");
let mut req = self.agent.post(&url);
if let Some(ref auth) = self.auth_header() {
req = req.header("Authorization", auth);
}
req = req.header("Content-Type", "application/x-tar");
let response = self.handle_error(req.send(ureq::SendBody::from_reader(&mut file)))?;
let body = response
.into_body()
.read_to_string()
.map_err(|e| CoreError::InvalidInput(format!("Cannot read response: {e}")))?;
let api_response: ApiResponse = serde_json::from_str(&body)
.map_err(|e| CoreError::InvalidInput(format!("Cannot parse response: {e}")))?;
if let Some(error) = api_response.error {
return Err(CoreError::InvalidInput(error));
}
Ok(api_response.data.map(|d| d.ids).unwrap_or_default())
}
}

View File

@@ -149,7 +149,7 @@ fn has_binary_signature(data: &[u8]) -> bool {
/// Check if data looks like UTF-16 without BOM
fn looks_like_utf16(data: &[u8]) -> bool {
if data.len() < 4 || !data.len().is_multiple_of(2) {
if data.len() < 4 || data.len() % 2 != 0 {
return false;
}

View File

@@ -8,3 +8,84 @@ pub mod schema;
/// Standard buffer size for I/O operations (8KB)
pub const PIPESIZE: usize = 8192;
/// Reads chunks from `reader` until EOF, passing each chunk to `f`.
///
/// Uses a fixed PIPESIZE buffer to ensure bounded memory usage.
pub fn stream_copy<R: std::io::Read + ?Sized>(
reader: &mut R,
mut f: impl FnMut(&[u8]) -> std::io::Result<()>,
) -> std::io::Result<()> {
let mut buffer = [0u8; PIPESIZE];
loop {
let n = reader.read(&mut buffer)?;
if n == 0 {
break;
}
f(&buffer[..n])?;
}
Ok(())
}
/// Reads content from a reader with offset and length bounds.
///
/// Skips `offset` bytes from the reader, then reads up to `length` bytes
/// (or all remaining if `length` is 0). Uses PIPESIZE buffers throughout.
///
/// # Arguments
///
/// * `reader` - The source reader positioned at the start.
/// * `offset` - Number of bytes to skip before reading.
/// * `length` - Maximum bytes to read (0 = read all remaining).
/// * `content_len` - Total content size (used to cap skip/read amounts).
///
/// # Returns
///
/// A `Vec<u8>` containing the requested byte range.
pub fn read_with_bounds<R: std::io::Read>(
reader: &mut R,
offset: u64,
length: u64,
content_len: u64,
) -> std::io::Result<Vec<u8>> {
// Skip offset bytes
let skip = std::cmp::min(offset, content_len);
let mut remaining = skip;
let mut buf = [0u8; PIPESIZE];
while remaining > 0 {
let to_read = std::cmp::min(remaining, buf.len() as u64) as usize;
match reader.read(&mut buf[..to_read]) {
Ok(0) => break,
Ok(n) => remaining -= n as u64,
Err(e) => return Err(e),
}
}
// Read bounded content
let max_bytes = if length > 0 {
std::cmp::min(length, content_len.saturating_sub(offset))
} else {
content_len.saturating_sub(offset)
};
let mut result = Vec::with_capacity(std::cmp::min(max_bytes, 64 * 1024) as usize);
let mut bytes_read = 0u64;
while bytes_read < max_bytes {
let to_read = std::cmp::min(max_bytes - bytes_read, buf.len() as u64) as usize;
match reader.read(&mut buf[..to_read]) {
Ok(0) => break,
Ok(n) => {
result.extend_from_slice(&buf[..n]);
bytes_read += n as u64;
}
Err(e) => return Err(e),
}
}
Ok(result)
}
/// Sanitize a timestamp string for use in filenames.
///
/// Replaces colons with hyphens (e.g., `2026-03-17T12:00:00Z` → `2026-03-17T12-00-00Z`).
pub fn sanitize_ts_string(ts: &str) -> String {
ts.replace(':', "-")
}

View File

@@ -125,7 +125,7 @@ pub fn gather_meta_plugin_schemas() -> Vec<PluginSchema> {
pub fn gather_filter_plugin_schemas() -> Vec<PluginSchema> {
use crate::services::filter_service::get_available_filter_plugins;
let plugins = get_available_filter_plugins();
let plugins = get_available_filter_plugins().unwrap_or_default();
let mut schemas: Vec<PluginSchema> = plugins
.into_iter()
.map(|(name, creator)| {

View File

@@ -27,6 +27,22 @@ pub struct StatusInfo {
pub configured_meta_plugins: Option<Vec<crate::config::MetaPluginConfig>>,
}
impl Default for StatusInfo {
fn default() -> Self {
Self {
paths: PathInfo {
data: String::new(),
database: String::new(),
},
compression: Vec::new(),
meta_plugins: std::collections::HashMap::new(),
enabled_meta_plugins: Vec::new(),
filter_plugins: Vec::new(),
configured_meta_plugins: None,
}
}
}
#[derive(serde::Serialize, serde::Deserialize)]
#[cfg_attr(feature = "server", derive(ToSchema))]
pub struct PathInfo {
@@ -59,21 +75,21 @@ pub fn generate_status_info(
db_path: PathBuf,
enabled_meta_plugins: &[MetaPluginType],
enabled_compression_type: Option<CompressionType>,
) -> StatusInfo {
) -> anyhow::Result<StatusInfo> {
log::debug!("STATUS: Starting status info generation");
let path_info = PathInfo {
data: data_path
.into_os_string()
.into_string()
.expect("Unable to convert data path to string"),
.map_err(|_| anyhow::anyhow!("Unable to convert data path to string"))?,
database: db_path
.into_os_string()
.into_string()
.expect("Unable to convert DB path to string"),
.map_err(|_| anyhow::anyhow!("Unable to convert DB path to string"))?,
};
let _default_type = crate::compression_engine::default_compression_type();
let mut compression_info = Vec::new();
let mut compression_info = Vec::with_capacity(CompressionType::iter().count());
// Sort compression types by their string representation
let mut sorted_compression_types: Vec<CompressionType> = CompressionType::iter().collect();
@@ -125,7 +141,8 @@ pub fn generate_status_info(
});
}
let mut meta_plugins_map = std::collections::HashMap::new();
let mut meta_plugins_map =
std::collections::HashMap::with_capacity(MetaPluginType::iter().count());
let mut enabled_meta_plugins_vec = Vec::new();
// Sort meta plugin types by their string representation to avoid creating plugins just for sorting
@@ -183,7 +200,7 @@ pub fn generate_status_info(
}
// Populate filter plugin info from the global registry
let filter_plugins_map = crate::services::filter_service::get_available_filter_plugins();
let filter_plugins_map = crate::services::filter_service::get_available_filter_plugins()?;
let filter_plugins_info: Vec<FilterPluginInfo> = filter_plugins_map
.into_iter()
.map(|(name, creator)| {
@@ -196,12 +213,12 @@ pub fn generate_status_info(
})
.collect();
StatusInfo {
Ok(StatusInfo {
paths: path_info,
compression: compression_info,
meta_plugins: meta_plugins_map,
enabled_meta_plugins: enabled_meta_plugins_vec,
filter_plugins: filter_plugins_info,
configured_meta_plugins: None,
}
})
}

View File

@@ -93,10 +93,22 @@ impl<W: Write> Drop for AutoFinishGzEncoder<W> {
#[cfg(feature = "gzip")]
impl<W: Write> Write for AutoFinishGzEncoder<W> {
fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
self.encoder.as_mut().unwrap().write(buf)
match self.encoder.as_mut() {
Some(encoder) => encoder.write(buf),
None => Err(io::Error::new(
io::ErrorKind::BrokenPipe,
"encoder already finished",
)),
}
}
fn flush(&mut self) -> io::Result<()> {
self.encoder.as_mut().unwrap().flush()
match self.encoder.as_mut() {
Some(encoder) => encoder.flush(),
None => Err(io::Error::new(
io::ErrorKind::BrokenPipe,
"encoder already finished",
)),
}
}
}

View File

@@ -7,16 +7,15 @@ use strum::{Display, EnumIter, EnumString};
use log::*;
use lazy_static::lazy_static;
extern crate enum_map;
use enum_map::enum_map;
use enum_map::{Enum, EnumMap};
pub mod gzip;
pub mod lz4;
pub mod none;
pub mod program;
pub mod raw;
pub mod zstd;
use crate::compression_engine::program::CompressionEngineProgram;
@@ -34,12 +33,18 @@ use crate::compression_engine::program::CompressionEngineProgram;
#[derive(Debug, Eq, PartialEq, Clone, EnumIter, Display, EnumString, enum_map::Enum)]
#[strum(ascii_case_insensitive)]
pub enum CompressionType {
#[strum(serialize = "lz4")]
LZ4,
#[strum(serialize = "gzip")]
GZip,
#[strum(serialize = "bzip2")]
BZip2,
#[strum(serialize = "xz")]
XZ,
#[strum(serialize = "zstd")]
ZStd,
None,
#[strum(to_string = "raw", serialize = "raw", serialize = "none")]
Raw,
}
/// Trait defining the interface for compression engines.
@@ -173,10 +178,9 @@ impl Clone for Box<dyn CompressionEngine> {
}
}
lazy_static! {
static ref COMPRESSION_ENGINES: EnumMap<CompressionType, Box<dyn CompressionEngine>> = {
#[allow(unused_mut)] // mut needed when gzip/lz4 features are enabled
let mut em = enum_map! {
fn init_compression_engines() -> EnumMap<CompressionType, Box<dyn CompressionEngine>> {
#[allow(unused_mut)]
let mut em: EnumMap<CompressionType, Box<dyn CompressionEngine>> = enum_map! {
CompressionType::LZ4 => Box::new(crate::compression_engine::program::CompressionEngineProgram::new(
"lz4",
vec!["-c"],
@@ -202,7 +206,7 @@ lazy_static! {
vec!["-c"],
vec!["-d", "-c"]
)) as Box<dyn CompressionEngine>,
CompressionType::None => Box::new(crate::compression_engine::none::CompressionEngineNone::new()) as Box<dyn CompressionEngine>
CompressionType::Raw => Box::new(crate::compression_engine::raw::CompressionEngineRaw::new()) as Box<dyn CompressionEngine>
};
#[cfg(feature = "gzip")]
@@ -219,10 +223,20 @@ lazy_static! {
as Box<dyn CompressionEngine>;
}
em
};
#[cfg(feature = "zstd")]
{
em[CompressionType::ZStd] =
Box::new(crate::compression_engine::zstd::CompressionEngineZstd::new())
as Box<dyn CompressionEngine>;
}
em
}
static COMPRESSION_ENGINES: std::sync::LazyLock<
EnumMap<CompressionType, Box<dyn CompressionEngine>>,
> = std::sync::LazyLock::new(init_compression_engines);
pub fn default_compression_type() -> CompressionType {
CompressionType::LZ4
}

View File

@@ -15,7 +15,13 @@ pub struct ProgramReader {
impl Read for ProgramReader {
fn read(&mut self, buf: &mut [u8]) -> std::io::Result<usize> {
self.stdout.as_mut().unwrap().read(buf)
match self.stdout.as_mut() {
Some(stdout) => stdout.read(buf),
None => Err(std::io::Error::new(
std::io::ErrorKind::BrokenPipe,
"stdout already taken",
)),
}
}
}
@@ -33,11 +39,23 @@ pub struct ProgramWriter {
impl Write for ProgramWriter {
fn write(&mut self, buf: &[u8]) -> std::io::Result<usize> {
self.stdin.as_mut().unwrap().write(buf)
match self.stdin.as_mut() {
Some(stdin) => stdin.write(buf),
None => Err(std::io::Error::new(
std::io::ErrorKind::BrokenPipe,
"stdin already taken",
)),
}
}
fn flush(&mut self) -> std::io::Result<()> {
self.stdin.as_mut().unwrap().flush()
match self.stdin.as_mut() {
Some(stdin) => stdin.flush(),
None => Err(std::io::Error::new(
std::io::ErrorKind::BrokenPipe,
"stdin already taken",
)),
}
}
}

View File

@@ -7,15 +7,15 @@ use std::path::PathBuf;
use crate::compression_engine::CompressionEngine;
#[derive(Debug, Eq, PartialEq, Clone, Default)]
pub struct CompressionEngineNone {}
pub struct CompressionEngineRaw {}
impl CompressionEngineNone {
pub fn new() -> CompressionEngineNone {
CompressionEngineNone {}
impl CompressionEngineRaw {
pub fn new() -> CompressionEngineRaw {
CompressionEngineRaw {}
}
}
impl CompressionEngine for CompressionEngineNone {
impl CompressionEngine for CompressionEngineRaw {
fn is_supported(&self) -> bool {
true
}

View File

@@ -0,0 +1,54 @@
#[cfg(feature = "zstd")]
use anyhow::Result;
#[cfg(feature = "zstd")]
use log::*;
#[cfg(feature = "zstd")]
use std::io::Write;
#[cfg(feature = "zstd")]
use std::fs::File;
#[cfg(feature = "zstd")]
use std::io::Read;
#[cfg(feature = "zstd")]
use std::path::PathBuf;
#[cfg(feature = "zstd")]
use zstd::stream::read::Decoder;
#[cfg(feature = "zstd")]
use zstd::stream::write::Encoder;
#[cfg(feature = "zstd")]
use crate::compression_engine::CompressionEngine;
#[cfg(feature = "zstd")]
#[derive(Debug, Eq, PartialEq, Clone, Default)]
pub struct CompressionEngineZstd {}
#[cfg(feature = "zstd")]
impl CompressionEngineZstd {
pub fn new() -> CompressionEngineZstd {
CompressionEngineZstd {}
}
}
#[cfg(feature = "zstd")]
impl CompressionEngine for CompressionEngineZstd {
fn open(&self, file_path: PathBuf) -> Result<Box<dyn Read + Send>> {
debug!("COMPRESSION: Opening {:?} using {:?}", file_path, *self);
let file = File::open(file_path)?;
Ok(Box::new(Decoder::new(file)?))
}
fn create(&self, file_path: PathBuf) -> Result<Box<dyn Write>> {
debug!("COMPRESSION: Writing to {:?} using {:?}", file_path, *self);
let file = File::create(file_path)?;
let zstd_write = Encoder::new(file, 3)?.auto_finish();
Ok(Box::new(zstd_write))
}
fn clone_box(&self) -> Box<dyn CompressionEngine> {
Box::new(self.clone())
}
}

View File

@@ -4,7 +4,7 @@ use dirs;
use log::{debug, error};
use serde::{Deserialize, Serialize};
use std::fs;
use std::path::PathBuf;
use std::path::{Path, PathBuf};
#[derive(Debug, Clone, Serialize, Deserialize, Default)]
#[serde(rename_all = "lowercase")]
@@ -152,6 +152,7 @@ pub struct ServerConfig {
pub cert_file: Option<PathBuf>,
pub key_file: Option<PathBuf>,
pub cors_origin: Option<String>,
pub max_body_size: Option<u64>,
}
#[derive(Debug, Clone, Deserialize, Serialize)]
@@ -190,6 +191,8 @@ pub struct Settings {
pub table_config: TableConfig,
#[serde(default)]
pub human_readable: bool,
#[serde(default)]
pub ids_only: bool,
pub output_format: Option<String>,
#[serde(default)]
pub quiet: bool,
@@ -208,6 +211,18 @@ pub struct Settings {
pub client_password: Option<String>,
#[serde(skip)]
pub client_jwt: Option<String>,
// Metadata key-value pairs from --meta CLI flag
#[serde(skip)]
pub meta: Vec<(String, Option<String>)>,
// Export filename format template (--export-filename-format)
#[serde(skip)]
pub export_filename_format: String,
// Export name for {name} variable (--export-name)
#[serde(skip)]
pub export_name: Option<String>,
// Import data file path (--import-data-file)
#[serde(skip)]
pub import_data_file: Option<std::path::PathBuf>,
}
impl Settings {
@@ -220,15 +235,13 @@ impl Settings {
} else if let Ok(env_config) = std::env::var("KEEP_CONFIG") {
PathBuf::from(env_config)
} else {
let default_path = if let Ok(home_dir) = std::env::var("HOME") {
let mut path = PathBuf::from(home_dir);
path.push(".config");
path.push("keep");
path.push("config.yml");
path
} else {
PathBuf::from("~/.config/keep/config.yml")
};
let default_path = dirs::config_dir()
.map(|mut p| {
p.push("keep");
p.push("config.yml");
p
})
.unwrap_or_else(|| PathBuf::from("~/.config/keep/config.yml"));
debug!("CONFIG: Using default config path: {default_path:?}");
default_path
};
@@ -256,13 +269,21 @@ impl Settings {
// Override with CLI args
if let Some(dir) = &args.options.dir {
debug!("CONFIG: Overriding dir with CLI arg: {dir:?}");
config_builder = config_builder.set_override("dir", dir.to_str().unwrap())?;
config_builder = config_builder.set_override(
"dir",
dir.to_str()
.ok_or_else(|| anyhow::anyhow!("non-UTF-8 directory path"))?,
)?;
}
if args.options.human_readable {
config_builder = config_builder.set_override("human_readable", true)?;
}
if args.options.ids_only {
config_builder = config_builder.set_override("ids_only", true)?;
}
if let Some(output_format) = &args.options.output_format {
config_builder =
config_builder.set_override("output_format", output_format.as_str())?;
@@ -280,60 +301,59 @@ impl Settings {
config_builder = config_builder.set_override("force", true)?;
}
#[cfg(feature = "server")]
if let Some(server_password) = &args.options.server_password {
config_builder =
config_builder.set_override("server.password", server_password.as_str())?;
}
#[cfg(feature = "server")]
if let Some(server_password_hash) = &args.options.server_password_hash {
config_builder = config_builder
.set_override("server.password_hash", server_password_hash.as_str())?;
}
#[cfg(feature = "server")]
if let Some(server_username) = &args.options.server_username {
config_builder =
config_builder.set_override("server.username", server_username.as_str())?;
}
#[cfg(feature = "server")]
if let Some(server_address) = &args.mode.server_address {
config_builder =
config_builder.set_override("server.address", server_address.as_str())?;
}
#[cfg(feature = "server")]
if let Some(server_port) = args.mode.server_port {
config_builder = config_builder.set_override("server.port", server_port)?;
}
#[cfg(feature = "tls")]
#[cfg(feature = "server")]
if let Some(server_cert) = &args.mode.server_cert {
config_builder = config_builder
.set_override("server.cert_file", server_cert.to_string_lossy().as_ref())?;
}
#[cfg(feature = "tls")]
#[cfg(feature = "server")]
if let Some(server_key) = &args.mode.server_key {
config_builder = config_builder
.set_override("server.key_file", server_key.to_string_lossy().as_ref())?;
}
#[cfg(feature = "server")]
if let Some(max_body_size) = args.options.server_max_body_size {
config_builder = config_builder.set_override("server.max_body_size", max_body_size)?;
}
if let Some(compression) = &args.item.compression {
config_builder =
config_builder.set_override("compression_plugin.name", compression.as_str())?;
}
if !args.item.meta_plugins.is_empty() {
let meta_plugins: Vec<std::collections::HashMap<String, String>> = args
.item
.meta_plugins
.iter()
.map(|name| {
let mut map = std::collections::HashMap::new();
map.insert("name".to_string(), name.clone());
map
})
.collect();
config_builder = config_builder.set_override("meta_plugins", meta_plugins)?;
}
// Build MetaPluginConfig entries from --meta-plugin args (name[:json])
// These are handled after config deserialization (see below).
let config = config_builder.build()?;
debug!("CONFIG: Built config, attempting to deserialize");
@@ -429,6 +449,59 @@ impl Settings {
}]);
}
// Override meta_plugins from --meta-plugin CLI args
if !args.item.meta_plugins.is_empty() {
debug!("CONFIG: Overriding meta_plugins from --meta-plugin CLI args");
let cli_plugins: Vec<MetaPluginConfig> = args
.item
.meta_plugins
.iter()
.map(|arg| {
let mut options = std::collections::HashMap::new();
let mut outputs = std::collections::HashMap::new();
if let Some(serde_json::Value::Object(obj)) = &arg.options {
// Extract options and outputs from JSON value
if let Some(serde_json::Value::Object(opts_obj)) =
obj.get("options")
{
for (k, v) in opts_obj {
let yaml_str = serde_json::to_string(v).unwrap_or_default();
let yaml_val: serde_yaml::Value =
serde_yaml::from_str(&yaml_str)
.unwrap_or(serde_yaml::Value::Null);
options.insert(k.clone(), yaml_val);
}
}
if let Some(serde_json::Value::Object(outs_obj)) =
obj.get("outputs")
{
for (k, v) in outs_obj {
let val_str = match v {
serde_json::Value::String(s) => s.clone(),
_ => v.to_string(),
};
outputs.insert(k.clone(), val_str);
}
}
}
MetaPluginConfig {
name: arg.name.clone(),
options,
outputs,
}
})
.collect();
settings.meta_plugins = Some(cli_plugins);
}
// Override list_format from --list-format CLI arg
if args.options.list_format
!= "id,time,size,meta:text_line_count,tags,meta:hostname_short,meta:command"
{
debug!("CONFIG: Overriding list_format from --list-format CLI arg");
settings.list_format = Settings::parse_list_format(&args.options.list_format);
}
// Set dir to default if not provided or is empty
if settings.dir == PathBuf::new() {
debug!("CONFIG: Setting default dir: {default_dir:?}");
@@ -460,6 +533,44 @@ impl Settings {
.or_else(|| settings.client.as_ref().and_then(|c| c.jwt.clone()));
}
// Parse --meta key=value and bare key arguments
settings.meta = args
.item
.meta
.iter()
.map(|s| {
if let Some((key, value)) = s.split_once('=') {
(key.to_string(), Some(value.to_string()))
} else {
(s.to_string(), None)
}
})
.collect();
// Set export filename format from CLI args
settings.export_filename_format = args.item.export_filename_format.clone();
settings.export_name = args.item.export_name.clone();
settings.import_data_file = args.item.import_data_file.clone();
// Expand ~ in all path fields
settings.dir = Settings::expand_tilde(&settings.dir);
settings.import_data_file = settings
.import_data_file
.as_ref()
.map(|p| Settings::expand_tilde(p));
if let Some(ref mut server) = settings.server {
server.password_file = server
.password_file
.as_ref()
.map(|p| Settings::expand_tilde(p));
server.jwt_secret_file = server
.jwt_secret_file
.as_ref()
.map(|p| Settings::expand_tilde(p));
server.cert_file = server.cert_file.as_ref().map(|p| Settings::expand_tilde(p));
server.key_file = server.key_file.as_ref().map(|p| Settings::expand_tilde(p));
}
debug!("CONFIG: Final settings: {settings:?}");
Ok(settings)
}
@@ -472,24 +583,42 @@ impl Settings {
pub fn default_dir() -> anyhow::Result<PathBuf> {
let mut path =
dirs::home_dir().ok_or_else(|| anyhow::anyhow!("No home directory found"))?;
path.push(".keep");
dirs::data_dir().ok_or_else(|| anyhow::anyhow!("No data directory found"))?;
path.push("keep");
if !path.exists() {
std::fs::create_dir_all(&path)?;
}
Ok(path)
}
/// Expand a leading `~` in a path to the user's home directory.
///
/// Returns the path unchanged if it doesn't start with `~` or if the
/// home directory cannot be determined.
fn expand_tilde(path: &Path) -> PathBuf {
let path_str = path.to_string_lossy();
if let Some(rest) = path_str.strip_prefix("~/") {
if let Some(home) = dirs::home_dir() {
return home.join(rest);
}
} else if path_str == "~" {
if let Some(home) = dirs::home_dir() {
return home;
}
}
path.to_path_buf()
}
/// Get server password from password_file or directly from config if configured
pub fn get_server_password(&self) -> Result<Option<String>> {
if let Some(server) = &self.server {
// First check for password_file
if let Some(password_file) = &server.password_file {
debug!("CONFIG: Reading password from file: {password_file:?}");
let password = fs::read_to_string(password_file)
.with_context(|| format!("Failed to read password file: {password_file:?}"))?
.trim()
.to_string();
let password = fs::read(password_file)
.with_context(|| format!("Failed to read password file: {password_file:?}"))?;
let end = password.len().min(4096);
let password = String::from_utf8_lossy(&password[..end]).trim().to_string();
return Ok(Some(password));
}
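A minimal standalone sketch of the bounded read used here, with a hypothetical helper name: raw bytes are capped at 4 KiB, decoded lossily, and trimmed, so a stray newline or an invalid UTF-8 byte cannot poison the secret:
```rust
use std::fs;
use std::io;
use std::path::Path;

// Hypothetical helper mirroring the pattern above.
fn read_secret_capped(path: &Path) -> io::Result<String> {
    let bytes = fs::read(path)?;
    let end = bytes.len().min(4096);
    Ok(String::from_utf8_lossy(&bytes[..end]).trim().to_string())
}
```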
@@ -521,12 +650,11 @@ impl Settings {
// First check for jwt_secret_file
if let Some(jwt_secret_file) = &server.jwt_secret_file {
debug!("CONFIG: Reading JWT secret from file: {jwt_secret_file:?}");
let secret = fs::read_to_string(jwt_secret_file)
.with_context(|| {
let secret = fs::read(jwt_secret_file).with_context(|| {
format!("Failed to read JWT secret file: {jwt_secret_file:?}")
})?
.trim()
.to_string();
})?;
let end = secret.len().min(4096);
let secret = String::from_utf8_lossy(&secret[..end]).trim().to_string();
return Ok(Some(secret));
}
@@ -574,6 +702,14 @@ impl Settings {
.unwrap_or_default()
}
/// Returns the metadata filter as a HashMap.
///
/// Converts the `meta` field (list of key-value pairs from CLI --meta flags)
/// into a `HashMap<String, Option<String>>` suitable for filtering.
pub fn meta_filter(&self) -> std::collections::HashMap<String, Option<String>> {
self.meta.iter().cloned().collect()
}
/// Validates the configuration against plugin schemas.
///
/// Checks that:
@@ -634,4 +770,73 @@ impl Settings {
warnings
}
/// Parse a comma-separated column list string into Vec<ColumnConfig>.
///
/// Maps known column names to their default labels and alignment.
/// For unknown names (including meta:* columns), uses the name as its own label.
fn parse_list_format(input: &str) -> Vec<ColumnConfig> {
input
.split(',')
.map(|s| s.trim())
.filter(|s| !s.is_empty())
.map(|name| {
let (label, align) = match name {
"id" => ("Item", ColumnAlignment::Right),
"time" => ("Time", ColumnAlignment::Right),
"size" => ("Size", ColumnAlignment::Right),
"meta:text_line_count" => ("Lines", ColumnAlignment::Right),
"meta:token_count" => ("Tokens", ColumnAlignment::Right),
"tags" => ("Tags", ColumnAlignment::Left),
"meta:hostname_short" => ("Host", ColumnAlignment::Left),
"meta:hostname" => ("Host", ColumnAlignment::Left),
"meta:command" => ("Command", ColumnAlignment::Left),
"compression" => ("Compression", ColumnAlignment::Left),
other if other.starts_with("meta:") => {
let sub = other.strip_prefix("meta:").unwrap_or(other);
(sub, ColumnAlignment::Left)
}
other => (other, ColumnAlignment::Left),
};
ColumnConfig {
name: name.to_string(),
label: label.to_string(),
align,
..Default::default()
}
})
.collect()
}
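A hedged sketch of what the parser yields, callable from inside the module (e.g. a unit test); the column names are illustrative:
```rust
let cols = Settings::parse_list_format("id, size, meta:custom");
assert_eq!(cols.len(), 3);
assert_eq!(cols[0].label, "Item");   // known column -> builtin label
assert_eq!(cols[2].label, "custom"); // unknown meta:* -> suffix as label
```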
}
#[cfg(test)]
mod tests {
use super::*;
use std::path::Path;
#[test]
fn test_expand_tilde_with_slash() {
let home = dirs::home_dir().unwrap();
let result = Settings::expand_tilde(Path::new("~/foo/bar"));
assert_eq!(result, home.join("foo/bar"));
}
#[test]
fn test_expand_tilde_bare() {
let home = dirs::home_dir().unwrap();
let result = Settings::expand_tilde(Path::new("~"));
assert_eq!(result, home);
}
#[test]
fn test_expand_tilde_absolute() {
let result = Settings::expand_tilde(Path::new("/etc/keep"));
assert_eq!(result, PathBuf::from("/etc/keep"));
}
#[test]
fn test_expand_tilde_relative() {
let result = Settings::expand_tilde(Path::new("foo/bar"));
assert_eq!(result, PathBuf::from("foo/bar"));
}
}

308
src/db.rs

@@ -1,8 +1,7 @@
use anyhow::{Context, Error, Result, anyhow};
use chrono::prelude::*;
use lazy_static::lazy_static;
use log::*;
use rusqlite::{Connection, OpenFlags, params};
use rusqlite::{Connection, OpenFlags, Row, params};
use rusqlite_migration::{M, Migrations};
use serde::{Deserialize, Serialize};
use std::collections::HashMap;
@@ -19,7 +18,7 @@ and query utilities for efficient data access.
# Schema
The database uses three main tables:
- `items`: Core item information (ID, timestamp, size, compression).
- `items`: Core item information (ID, timestamp, uncompressed_size, compressed_size, closed, compression).
- `tags`: Item-tag associations (many-to-many).
- `metas`: Item-metadata associations (many-to-many).
@@ -42,30 +41,26 @@ let conn = db::open(PathBuf::from("keep.db"))?;
```
Insert an item:
```ignore
let item = db::Item { id: None, ts: Utc::now(), size: None, compression: "lz4".to_string() };
let item = db::Item { id: None, ts: Utc::now(), uncompressed_size: None, compressed_size: None, closed: false, compression: "lz4".to_string() };
let id = db::insert_item(&conn, item)?;
```
*/
lazy_static! {
// Database schema migrations for the Keep application.
//
// Defines the sequence of migrations to create and update the schema.
// Applied automatically when opening a database connection.
static ref MIGRATIONS: Migrations<'static> = Migrations::new(vec![
static MIGRATIONS: std::sync::LazyLock<Migrations<'static>> = std::sync::LazyLock::new(|| {
Migrations::new(vec![
M::up(
"CREATE TABLE items(
id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
ts TEXT NOT NULL,
size INTEGER NULL,
compression TEXT NOT NULL)"
compression TEXT NOT NULL)",
),
M::up(
"CREATE TABLE tags (
id INTEGER NOT NULL,
name TEXT NOT NULL,
FOREIGN KEY(id) REFERENCES items(id) ON DELETE CASCADE,
PRIMARY KEY(id, name));"
PRIMARY KEY(id, name));",
),
M::up(
"CREATE TABLE metas (
@@ -73,12 +68,17 @@ lazy_static! {
name TEXT NOT NULL,
value TEXT NOT NULL,
FOREIGN KEY(id) REFERENCES items(id) ON DELETE CASCADE,
PRIMARY KEY(id, name));"
PRIMARY KEY(id, name));",
),
M::up("CREATE INDEX idx_tags_name ON tags(name)"),
M::up("CREATE INDEX idx_metas_name ON metas(name)"),
]);
}
M::up("CREATE INDEX idx_items_ts ON items(ts)"),
M::up("UPDATE items SET compression = 'raw' WHERE compression = 'none'"),
M::up("ALTER TABLE items RENAME COLUMN size TO uncompressed_size"),
M::up("ALTER TABLE items ADD COLUMN compressed_size INTEGER NULL"),
M::up("ALTER TABLE items ADD COLUMN closed BOOLEAN NOT NULL DEFAULT 1"),
])
});
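The same `LazyLock` shape in isolation, with a hypothetical static; the closure runs once, on first dereference, and needs no external crate (std since Rust 1.80):
```rust
use std::sync::LazyLock;

static DEFAULT_COLUMNS: LazyLock<Vec<String>> =
    LazyLock::new(|| "id,time,size".split(',').map(String::from).collect());

fn main() {
    assert_eq!(DEFAULT_COLUMNS.len(), 3); // initializer ran exactly once
}
```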
/// Represents an item stored in the database.
///
@@ -88,7 +88,9 @@ lazy_static! {
///
/// * `id` - Unique identifier, `None` for new items.
/// * `ts` - Creation timestamp in UTC.
/// * `size` - Content size in bytes, `None` if not set.
/// * `uncompressed_size` - Uncompressed content size in bytes, `None` if not set.
/// * `compressed_size` - Compressed file size on disk, `None` if not set.
/// * `closed` - Whether the item has been fully written and closed.
/// * `compression` - Compression algorithm used.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct Item {
@@ -96,12 +98,27 @@ pub struct Item {
pub id: Option<i64>,
/// Timestamp when the item was created.
pub ts: DateTime<Utc>,
/// Size of the item content in bytes, None if not set.
pub size: Option<i64>,
/// Uncompressed size of the item content in bytes, None if not set.
pub uncompressed_size: Option<i64>,
/// Compressed file size on disk in bytes, None if not set.
pub compressed_size: Option<i64>,
/// Whether the item has been fully written and closed.
pub closed: bool,
/// Compression algorithm used for the item content.
pub compression: String,
}
fn item_from_row(row: &Row) -> Result<Item> {
Ok(Item {
id: row.get(0)?,
ts: row.get(1)?,
uncompressed_size: row.get(2)?,
compressed_size: row.get(3)?,
closed: row.get(4)?,
compression: row.get(5)?,
})
}
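Note that `item_from_row` maps columns by position, so it relies on every caller's SELECT listing `id, ts, uncompressed_size, compressed_size, closed, compression` in exactly that order, as the queries below do.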
/// Represents a tag associated with an item.
///
/// Defines the relationship between items and tags in a many-to-many structure.
@@ -162,8 +179,10 @@ pub struct Meta {
/// # use keep::db;
/// # use keep::db::*;
/// # use std::path::PathBuf;
/// # use tempfile;
/// # fn main() -> anyhow::Result<()> {
/// let db_path = PathBuf::from("keep.db");
/// let _tmp = tempfile::tempdir()?;
/// let db_path = _tmp.path().join("keep.db");
/// let conn = db::open(db_path)?;
/// # Ok(())
/// # }
@@ -213,13 +232,17 @@ pub fn open(path: PathBuf) -> Result<Connection, Error> {
/// # use keep::db::*;
/// # use chrono::Utc;
/// # use std::path::PathBuf;
/// # use tempfile;
/// # fn main() -> anyhow::Result<()> {
/// let db_path = PathBuf::from("keep.db");
/// let _tmp = tempfile::tempdir()?;
/// let db_path = _tmp.path().join("keep.db");
/// let conn = db::open(db_path)?;
/// let item = Item {
/// id: None,
/// ts: Utc::now(),
/// size: None,
/// uncompressed_size: None,
/// compressed_size: None,
/// closed: false,
/// compression: "lz4".to_string(),
/// };
/// let id = db::insert_item(&conn, item)?;
@@ -230,8 +253,8 @@ pub fn open(path: PathBuf) -> Result<Connection, Error> {
pub fn insert_item(conn: &Connection, item: Item) -> Result<i64> {
debug!("DB: Inserting item: {item:?}");
conn.execute(
"INSERT INTO items (ts, size, compression) VALUES (?1, ?2, ?3)",
params![item.ts, item.size, item.compression],
"INSERT INTO items (ts, uncompressed_size, compressed_size, closed, compression) VALUES (?1, ?2, ?3, ?4, ?5)",
params![item.ts, item.uncompressed_size, item.compressed_size, item.closed, item.compression],
)?;
Ok(conn.last_insert_rowid())
}
@@ -260,8 +283,10 @@ pub fn insert_item(conn: &Connection, item: Item) -> Result<i64> {
/// # use keep::db::*;
/// # use keep::compression_engine::CompressionType;
/// # use std::path::PathBuf;
/// # use tempfile;
/// # fn main() -> anyhow::Result<()> {
/// let db_path = PathBuf::from("keep.db");
/// let _tmp = tempfile::tempdir()?;
/// let db_path = _tmp.path().join("keep.db");
/// let conn = db::open(db_path)?;
/// let compression = CompressionType::LZ4;
/// let item = db::create_item(&conn, compression)?;
@@ -276,7 +301,9 @@ pub fn create_item(
let item = Item {
id: None,
ts: chrono::Utc::now(),
size: None,
uncompressed_size: None,
compressed_size: None,
closed: false,
compression: compression_type.to_string(),
};
let item_id = insert_item(conn, item.clone())?;
@@ -286,6 +313,37 @@ pub fn create_item(
})
}
/// Creates a new item with a specific timestamp (for import).
///
/// # Arguments
///
/// * `conn` - Database connection.
/// * `ts` - Timestamp to use for the item.
/// * `compression` - Compression type string (e.g., "lz4", "gzip", "raw").
///
/// # Returns
///
/// * `Result<Item>` - The created item with its ID set.
pub fn insert_item_with_ts(
conn: &Connection,
ts: chrono::DateTime<chrono::Utc>,
compression: &str,
) -> Result<Item> {
let item = Item {
id: None,
ts,
uncompressed_size: None,
compressed_size: None,
closed: false,
compression: compression.to_string(),
};
let item_id = insert_item(conn, item.clone())?;
Ok(Item {
id: Some(item_id),
..item
})
}
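A hedged usage sketch for import paths, assuming an open `conn` from `db::open` and an `anyhow::Result` context:
```rust
let ts = chrono::DateTime::parse_from_rfc3339("2026-03-21T12:00:00Z")?
    .with_timezone(&chrono::Utc);
let imported = insert_item_with_ts(&conn, ts, "lz4")?; // keeps the original timestamp
assert!(imported.id.is_some());
```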
/// Adds a tag to an item.
///
/// Inserts a new tag association in the `tags` table.
@@ -312,10 +370,12 @@ pub fn create_item(
/// # use keep::db::*;
/// # use chrono::Utc;
/// # use std::path::PathBuf;
/// # use tempfile;
/// # fn main() -> anyhow::Result<()> {
/// let db_path = PathBuf::from("keep.db");
/// let _tmp = tempfile::tempdir()?;
/// let db_path = _tmp.path().join("keep.db");
/// let conn = db::open(db_path)?;
/// let item = Item { id: None, ts: Utc::now(), size: None, compression: "lz4".to_string() };
/// let item = Item { id: None, ts: Utc::now(), uncompressed_size: None, compressed_size: None, closed: false, compression: "lz4".to_string() };
/// let item_id = db::insert_item(&conn, item)?;
/// db::add_tag(&conn, item_id, "important")?;
/// # Ok(())
@@ -329,6 +389,18 @@ pub fn add_tag(conn: &Connection, item_id: i64, tag_name: &str) -> Result<()> {
insert_tag(conn, tag)
}
/// Adds a tag to an item, doing nothing if the tag already exists.
///
/// Uses `INSERT OR IGNORE` to make the operation idempotent.
pub fn upsert_tag(conn: &Connection, item_id: i64, tag_name: &str) -> Result<()> {
debug!("DB: Upserting tag: item={item_id}, tag={tag_name}");
conn.execute(
"INSERT OR IGNORE INTO tags (id, name) VALUES (?1, ?2)",
params![item_id, tag_name],
)?;
Ok(())
}
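Idempotency in practice, assuming a `conn` and an existing `item_id`:
```rust
upsert_tag(&conn, item_id, "work")?;
upsert_tag(&conn, item_id, "work")?; // second call: no error, no duplicate row
```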
/// Adds metadata to an item.
///
/// Inserts a new metadata entry in the `metas` table.
@@ -356,10 +428,12 @@ pub fn add_tag(conn: &Connection, item_id: i64, tag_name: &str) -> Result<()> {
/// # use keep::db::*;
/// # use chrono::Utc;
/// # use std::path::PathBuf;
/// # use tempfile;
/// # fn main() -> anyhow::Result<()> {
/// let db_path = PathBuf::from("keep.db");
/// let _tmp = tempfile::tempdir()?;
/// let db_path = _tmp.path().join("keep.db");
/// let conn = db::open(db_path)?;
/// let item = Item { id: None, ts: Utc::now(), size: None, compression: "lz4".to_string() };
/// let item = Item { id: None, ts: Utc::now(), uncompressed_size: None, compressed_size: None, closed: false, compression: "lz4".to_string() };
/// let item_id = db::insert_item(&conn, item)?;
/// db::add_meta(&conn, item_id, "mime_type", "text/plain")?;
/// # Ok(())
@@ -399,10 +473,12 @@ pub fn add_meta(conn: &Connection, item_id: i64, name: &str, value: &str) -> Res
/// # use keep::db::*;
/// # use chrono::Utc;
/// # use std::path::PathBuf;
/// # use tempfile;
/// # fn main() -> anyhow::Result<()> {
/// let db_path = PathBuf::from("keep.db");
/// let _tmp = tempfile::tempdir()?;
/// let db_path = _tmp.path().join("keep.db");
/// let conn = db::open(db_path)?;
/// let item = Item { id: Some(1), size: Some(1024), compression: "lz4".to_string(), ts: Utc::now() };
/// let item = Item { id: Some(1), ts: Utc::now(), uncompressed_size: Some(1024), compressed_size: Some(512), closed: true, compression: "lz4".to_string() };
/// db::update_item(&conn, item)?;
/// # Ok(())
/// # }
@@ -410,8 +486,8 @@ pub fn add_meta(conn: &Connection, item_id: i64, name: &str, value: &str) -> Res
pub fn update_item(conn: &Connection, item: Item) -> Result<()> {
debug!("DB: Updating item: {item:?}");
conn.execute(
"UPDATE items SET size=?2, compression=?3 WHERE id=?1",
params![item.id, item.size, item.compression,],
"UPDATE items SET uncompressed_size=?2, compressed_size=?3, closed=?4, compression=?5 WHERE id=?1",
params![item.id, item.uncompressed_size, item.compressed_size, item.closed, item.compression,],
)?;
Ok(())
}
@@ -441,17 +517,22 @@ pub fn update_item(conn: &Connection, item: Item) -> Result<()> {
/// # use keep::db::*;
/// # use chrono::Utc;
/// # use std::path::PathBuf;
/// # use tempfile;
/// # fn main() -> anyhow::Result<()> {
/// let db_path = PathBuf::from("keep.db");
/// let _tmp = tempfile::tempdir()?;
/// let db_path = _tmp.path().join("keep.db");
/// let conn = db::open(db_path)?;
/// let item = Item { id: Some(1), ts: Utc::now(), size: None, compression: "lz4".to_string() };
/// let item = Item { id: Some(1), ts: Utc::now(), uncompressed_size: None, compressed_size: None, closed: false, compression: "lz4".to_string() };
/// db::delete_item(&conn, item)?;
/// # Ok(())
/// # }
/// ```
pub fn delete_item(conn: &Connection, item: Item) -> Result<()> {
debug!("DB: Deleting item: {item:?}");
conn.execute("DELETE FROM items WHERE id=?1", params![item.id])?;
let id = item
.id
.ok_or_else(|| anyhow::anyhow!("Cannot delete item: ID is None"))?;
conn.execute("DELETE FROM items WHERE id=?1", params![id])?;
Ok(())
}
@@ -479,8 +560,10 @@ pub fn delete_item(conn: &Connection, item: Item) -> Result<()> {
/// # use keep::db;
/// # use keep::db::*;
/// # use std::path::PathBuf;
/// # use tempfile;
/// # fn main() -> anyhow::Result<()> {
/// let db_path = PathBuf::from("keep.db");
/// let _tmp = tempfile::tempdir()?;
/// let db_path = _tmp.path().join("keep.db");
/// let conn = db::open(db_path)?;
/// let meta = Meta { id: 1, name: "temp".to_string(), value: "".to_string() };
/// db::query_delete_meta(&conn, meta)?;
@@ -521,10 +604,12 @@ pub fn query_delete_meta(conn: &Connection, meta: Meta) -> Result<()> {
/// # use keep::db::*;
/// # use chrono::Utc;
/// # use std::path::PathBuf;
/// # use tempfile;
/// # fn main() -> anyhow::Result<()> {
/// let db_path = PathBuf::from("keep.db");
/// let _tmp = tempfile::tempdir()?;
/// let db_path = _tmp.path().join("keep.db");
/// let conn = db::open(db_path)?;
/// let item = Item { id: None, ts: Utc::now(), size: None, compression: "lz4".to_string() };
/// let item = Item { id: None, ts: Utc::now(), uncompressed_size: None, compressed_size: None, closed: false, compression: "lz4".to_string() };
/// let item_id = db::insert_item(&conn, item)?;
/// let meta = Meta { id: item_id, name: "mime_type".to_string(), value: "text/plain".to_string() };
/// db::query_upsert_meta(&conn, meta)?;
@@ -565,10 +650,12 @@ pub fn query_upsert_meta(conn: &Connection, meta: Meta) -> Result<()> {
/// # use keep::db::*;
/// # use chrono::Utc;
/// # use std::path::PathBuf;
/// # use tempfile;
/// # fn main() -> anyhow::Result<()> {
/// let db_path = PathBuf::from("keep.db");
/// let _tmp = tempfile::tempdir()?;
/// let db_path = _tmp.path().join("keep.db");
/// let conn = db::open(db_path)?;
/// let item = Item { id: None, ts: Utc::now(), size: None, compression: "lz4".to_string() };
/// let item = Item { id: None, ts: Utc::now(), uncompressed_size: None, compressed_size: None, closed: false, compression: "lz4".to_string() };
/// let item_id = db::insert_item(&conn, item)?;
/// // Insert new metadata
/// let meta = Meta { id: item_id, name: "source".to_string(), value: "cli".to_string() };
@@ -614,10 +701,12 @@ pub fn store_meta(conn: &Connection, meta: Meta) -> Result<()> {
/// # use keep::db::*;
/// # use chrono::Utc;
/// # use std::path::PathBuf;
/// # use tempfile;
/// # fn main() -> anyhow::Result<()> {
/// let db_path = PathBuf::from("keep.db");
/// let _tmp = tempfile::tempdir()?;
/// let db_path = _tmp.path().join("keep.db");
/// let conn = db::open(db_path)?;
/// let item = Item { id: None, ts: Utc::now(), size: None, compression: "lz4".to_string() };
/// let item = Item { id: None, ts: Utc::now(), uncompressed_size: None, compressed_size: None, closed: false, compression: "lz4".to_string() };
/// let item_id = db::insert_item(&conn, item)?;
/// let tag = Tag { id: item_id, name: "work".to_string() };
/// db::insert_tag(&conn, tag)?;
@@ -657,10 +746,12 @@ pub fn insert_tag(conn: &Connection, tag: Tag) -> Result<()> {
/// # use keep::db::*;
/// # use chrono::Utc;
/// # use std::path::PathBuf;
/// # use tempfile;
/// # fn main() -> anyhow::Result<()> {
/// let db_path = PathBuf::from("keep.db");
/// let _tmp = tempfile::tempdir()?;
/// let db_path = _tmp.path().join("keep.db");
/// let conn = db::open(db_path)?;
/// let item = Item { id: Some(1), ts: Utc::now(), size: None, compression: "lz4".to_string() };
/// let item = Item { id: Some(1), ts: Utc::now(), uncompressed_size: None, compressed_size: None, closed: false, compression: "lz4".to_string() };
/// db::delete_item_tags(&conn, item)?;
/// # Ok(())
/// # }
@@ -697,12 +788,14 @@ pub fn delete_item_tags(conn: &Connection, item: Item) -> Result<()> {
/// # use keep::db::*;
/// # use chrono::Utc;
/// # use std::path::PathBuf;
/// # use tempfile;
/// # fn main() -> anyhow::Result<()> {
/// let db_path = PathBuf::from("keep.db");
/// let _tmp = tempfile::tempdir()?;
/// let db_path = _tmp.path().join("keep.db");
/// let conn = db::open(db_path)?;
/// let item = Item { id: None, ts: Utc::now(), size: None, compression: "lz4".to_string() };
/// let item = Item { id: None, ts: Utc::now(), uncompressed_size: None, compressed_size: None, closed: false, compression: "lz4".to_string() };
/// let item_id = db::insert_item(&conn, item)?;
/// let item = Item { id: Some(item_id), ts: Utc::now(), size: None, compression: "lz4".to_string() };
/// let item = Item { id: Some(item_id), ts: Utc::now(), uncompressed_size: None, compressed_size: None, closed: false, compression: "lz4".to_string() };
/// let tags = vec!["project_a".to_string(), "urgent".to_string()];
/// db::set_item_tags(&conn, item, &tags)?;
/// # Ok(())
@@ -750,8 +843,10 @@ pub fn set_item_tags(conn: &Connection, item: Item, tags: &Vec<String>) -> Resul
/// # use keep::db;
/// # use keep::db::*;
/// # use std::path::PathBuf;
/// # use tempfile;
/// # fn main() -> anyhow::Result<()> {
/// let db_path = PathBuf::from("keep.db");
/// let _tmp = tempfile::tempdir()?;
/// let db_path = _tmp.path().join("keep.db");
/// let conn = db::open(db_path)?;
/// let all_items = db::query_all_items(&conn)?;
/// assert!(all_items.len() >= 0);
@@ -761,19 +856,13 @@ pub fn set_item_tags(conn: &Connection, item: Item, tags: &Vec<String>) -> Resul
pub fn query_all_items(conn: &Connection) -> Result<Vec<Item>> {
debug!("DB: Querying all items");
let mut statement = conn
.prepare("SELECT id, ts, size, compression FROM items ORDER BY id ASC")
.prepare("SELECT id, ts, uncompressed_size, compressed_size, closed, compression FROM items ORDER BY id ASC")
.context("Problem preparing SQL statement")?;
let mut rows = statement.query(params![])?;
let mut items = Vec::new();
while let Some(row) = rows.next()? {
let item = Item {
id: row.get(0)?,
ts: row.get(1)?,
size: row.get(2)?,
compression: row.get(3)?,
};
items.push(item);
items.push(item_from_row(row)?);
}
Ok(items)
@@ -802,8 +891,10 @@ pub fn query_all_items(conn: &Connection) -> Result<Vec<Item>> {
/// # use keep::db;
/// # use keep::db::*;
/// # use std::path::PathBuf;
/// # use tempfile;
/// # fn main() -> anyhow::Result<()> {
/// let db_path = PathBuf::from("keep.db");
/// let _tmp = tempfile::tempdir()?;
/// let db_path = _tmp.path().join("keep.db");
/// let conn = db::open(db_path)?;
/// let tags = vec!["work".to_string(), "urgent".to_string()];
/// let tagged_items = db::query_tagged_items(&conn, &tags)?;
@@ -817,7 +908,9 @@ pub fn query_tagged_items<'a>(conn: &'a Connection, tags: &'a Vec<String>) -> Re
"
SELECT items.id,
items.ts,
items.size,
items.uncompressed_size,
items.compressed_size,
items.closed,
items.compression,
count(tags_match.id) as tags_score
FROM items,
@@ -840,13 +933,7 @@ pub fn query_tagged_items<'a>(conn: &'a Connection, tags: &'a Vec<String>) -> Re
let mut items = Vec::new();
while let Some(row) = rows.next()? {
let item = Item {
id: row.get(0)?,
ts: row.get(1)?,
size: row.get(2)?,
compression: row.get(3)?,
};
items.push(item);
items.push(item_from_row(row)?);
}
Ok(items)
@@ -870,8 +957,10 @@ pub fn query_tagged_items<'a>(conn: &'a Connection, tags: &'a Vec<String>) -> Re
/// # use keep::db;
/// # use keep::db::*;
/// # use std::path::PathBuf;
/// # use tempfile;
/// # fn main() -> anyhow::Result<()> {
/// let db_path = PathBuf::from("keep.db");
/// let _tmp = tempfile::tempdir()?;
/// let db_path = _tmp.path().join("keep.db");
/// let conn = db::open(db_path)?;
/// let items = db::get_items(&conn)?;
/// # Ok(())
@@ -908,11 +997,13 @@ pub fn get_items(conn: &Connection) -> Result<Vec<Item>> {
/// # use keep::db::*;
/// # use std::collections::HashMap;
/// # use std::path::PathBuf;
/// # use tempfile;
/// # fn main() -> anyhow::Result<()> {
/// let db_path = PathBuf::from("keep.db");
/// let _tmp = tempfile::tempdir()?;
/// let db_path = _tmp.path().join("keep.db");
/// let conn = db::open(db_path)?;
/// let tags = vec!["project".to_string()];
/// let meta = HashMap::from([("status".to_string(), "active".to_string())]);
/// let meta = HashMap::from([("status".to_string(), Some("active".to_string()))]);
/// let matching = db::get_items_matching(&conn, &tags, &meta)?;
/// # Ok(())
/// # }
@@ -920,7 +1011,7 @@ pub fn get_items(conn: &Connection) -> Result<Vec<Item>> {
pub fn get_items_matching(
conn: &Connection,
tags: &Vec<String>,
meta: &HashMap<String, String>,
meta: &HashMap<String, Option<String>>,
) -> Result<Vec<Item>> {
debug!("DB: Getting items matching: tags={tags:?} meta={meta:?}");
@@ -947,7 +1038,10 @@ pub fn get_items_matching(
Some(m) => m,
None => return false,
};
meta.iter().all(|(k, v)| item_meta.get(k) == Some(v))
meta.iter().all(|(k, v)| match v {
Some(val) => item_meta.get(k) == Some(val),
None => item_meta.contains_key(k),
})
})
.collect();
Ok(filtered_items)
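A sketch of the two match modes, with hypothetical keys:
```rust
use std::collections::HashMap;

let mut meta: HashMap<String, Option<String>> = HashMap::new();
meta.insert("status".into(), Some("active".into())); // value must equal "active"
meta.insert("mime_type".into(), None);               // key must merely exist
```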
@@ -979,8 +1073,10 @@ pub fn get_items_matching(
/// # use keep::db::*;
/// # use std::collections::HashMap;
/// # use std::path::PathBuf;
/// # use tempfile;
/// # fn main() -> anyhow::Result<()> {
/// let db_path = PathBuf::from("keep.db");
/// let _tmp = tempfile::tempdir()?;
/// let db_path = _tmp.path().join("keep.db");
/// let conn = db::open(db_path)?;
/// let tags = vec!["latest".to_string()];
/// let item = db::get_item_matching(&conn, &tags, &HashMap::new())?;
@@ -990,7 +1086,7 @@ pub fn get_items_matching(
pub fn get_item_matching(
conn: &Connection,
tags: &Vec<String>,
meta: &HashMap<String, String>,
meta: &HashMap<String, Option<String>>,
) -> Result<Option<Item>> {
debug!("DB: Get item matching tags: {tags:?}, meta: {meta:?}");
let items = get_items_matching(conn, tags, meta)?;
@@ -1021,10 +1117,12 @@ pub fn get_item_matching(
/// # use keep::db::*;
/// # use chrono::Utc;
/// # use std::path::PathBuf;
/// # use tempfile;
/// # fn main() -> anyhow::Result<()> {
/// let db_path = PathBuf::from("keep.db");
/// let _tmp = tempfile::tempdir()?;
/// let db_path = _tmp.path().join("keep.db");
/// let conn = db::open(db_path)?;
/// let item = Item { id: None, ts: Utc::now(), size: None, compression: "lz4".to_string() };
/// let item = Item { id: None, ts: Utc::now(), uncompressed_size: None, compressed_size: None, closed: false, compression: "lz4".to_string() };
/// let item_id = db::insert_item(&conn, item)?;
/// let item = db::get_item(&conn, item_id)?;
/// assert!(item.is_some());
@@ -1036,7 +1134,7 @@ pub fn get_item(conn: &Connection, item_id: i64) -> Result<Option<Item>> {
let mut statement = conn
.prepare_cached(
"
SELECT id, ts, size, compression
SELECT id, ts, uncompressed_size, compressed_size, closed, compression
FROM items
WHERE items.id = ?1",
)
@@ -1048,8 +1146,10 @@ pub fn get_item(conn: &Connection, item_id: i64) -> Result<Option<Item>> {
Some(row) => Ok(Some(Item {
id: row.get(0)?,
ts: row.get(1)?,
size: row.get(2)?,
compression: row.get(3)?,
uncompressed_size: row.get(2)?,
compressed_size: row.get(3)?,
closed: row.get(4)?,
compression: row.get(5)?,
})),
None => Ok(None),
}
@@ -1077,8 +1177,10 @@ pub fn get_item(conn: &Connection, item_id: i64) -> Result<Option<Item>> {
/// # use keep::db;
/// # use keep::db::*;
/// # use std::path::PathBuf;
/// # use tempfile;
/// # fn main() -> anyhow::Result<()> {
/// let db_path = PathBuf::from("keep.db");
/// let _tmp = tempfile::tempdir()?;
/// let db_path = _tmp.path().join("keep.db");
/// let conn = db::open(db_path)?;
/// let latest = db::get_item_last(&conn)?;
/// # Ok(())
@@ -1089,7 +1191,7 @@ pub fn get_item_last(conn: &Connection) -> Result<Option<Item>> {
let mut statement = conn
.prepare_cached(
"
SELECT id, ts, size, compression
SELECT id, ts, uncompressed_size, compressed_size, closed, compression
FROM items
ORDER BY id DESC
LIMIT 1",
@@ -1102,8 +1204,10 @@ pub fn get_item_last(conn: &Connection) -> Result<Option<Item>> {
Some(row) => Ok(Some(Item {
id: row.get(0)?,
ts: row.get(1)?,
size: row.get(2)?,
compression: row.get(3)?,
uncompressed_size: row.get(2)?,
compressed_size: row.get(3)?,
closed: row.get(4)?,
compression: row.get(5)?,
})),
None => Ok(None),
}
@@ -1133,10 +1237,12 @@ pub fn get_item_last(conn: &Connection) -> Result<Option<Item>> {
/// # use keep::db::*;
/// # use chrono::Utc;
/// # use std::path::PathBuf;
/// # use tempfile;
/// # fn main() -> anyhow::Result<()> {
/// let db_path = PathBuf::from("keep.db");
/// let _tmp = tempfile::tempdir()?;
/// let db_path = _tmp.path().join("keep.db");
/// let conn = db::open(db_path)?;
/// let item = Item { id: Some(1), ts: Utc::now(), size: None, compression: "lz4".to_string() };
/// let item = Item { id: Some(1), ts: Utc::now(), uncompressed_size: None, compressed_size: None, closed: false, compression: "lz4".to_string() };
/// let tags = db::get_item_tags(&conn, &item)?;
/// # Ok(())
/// # }
@@ -1184,10 +1290,12 @@ pub fn get_item_tags(conn: &Connection, item: &Item) -> Result<Vec<Tag>> {
/// # use keep::db::*;
/// # use chrono::Utc;
/// # use std::path::PathBuf;
/// # use tempfile;
/// # fn main() -> anyhow::Result<()> {
/// let db_path = PathBuf::from("keep.db");
/// let _tmp = tempfile::tempdir()?;
/// let db_path = _tmp.path().join("keep.db");
/// let conn = db::open(db_path)?;
/// let item = Item { id: Some(1), ts: Utc::now(), size: None, compression: "lz4".to_string() };
/// let item = Item { id: Some(1), ts: Utc::now(), uncompressed_size: None, compressed_size: None, closed: false, compression: "lz4".to_string() };
/// let meta = db::get_item_meta(&conn, &item)?;
/// # Ok(())
/// # }
@@ -1237,15 +1345,17 @@ pub fn get_item_meta(conn: &Connection, item: &Item) -> Result<Vec<Meta>> {
/// # use keep::db::*;
/// # use chrono::Utc;
/// # use std::path::PathBuf;
/// # use tempfile;
/// # fn main() -> anyhow::Result<()> {
/// let db_path = PathBuf::from("keep.db");
/// let _tmp = tempfile::tempdir()?;
/// let db_path = _tmp.path().join("keep.db");
/// let conn = db::open(db_path)?;
/// let item = Item { id: Some(1), ts: Utc::now(), size: None, compression: "lz4".to_string() };
/// let meta = db::get_item_meta_name(&conn, &item, "mime_type".to_string())?;
/// let item = Item { id: Some(1), ts: Utc::now(), uncompressed_size: None, compressed_size: None, closed: false, compression: "lz4".to_string() };
/// let meta = db::get_item_meta_name(&conn, &item, "mime_type")?;
/// # Ok(())
/// # }
/// ```
pub fn get_item_meta_name(conn: &Connection, item: &Item, name: String) -> Result<Option<Meta>> {
pub fn get_item_meta_name(conn: &Connection, item: &Item, name: &str) -> Result<Option<Meta>> {
debug!("DB: Getting item meta name: {item:?} {name:?}");
let mut statement = conn
.prepare_cached("SELECT id, name, value FROM metas WHERE id=?1 AND name=?2")
@@ -1287,15 +1397,17 @@ pub fn get_item_meta_name(conn: &Connection, item: &Item, name: String) -> Resul
/// # use keep::db::*;
/// # use chrono::Utc;
/// # use std::path::PathBuf;
/// # use tempfile;
/// # fn main() -> anyhow::Result<()> {
/// let db_path = PathBuf::from("keep.db");
/// let _tmp = tempfile::tempdir()?;
/// let db_path = _tmp.path().join("keep.db");
/// let conn = db::open(db_path)?;
/// let item = Item { id: Some(1), ts: Utc::now(), size: None, compression: "lz4".to_string() };
/// let value = db::get_item_meta_value(&conn, &item, "source".to_string())?;
/// let item = Item { id: Some(1), ts: Utc::now(), uncompressed_size: None, compressed_size: None, closed: false, compression: "lz4".to_string() };
/// let value = db::get_item_meta_value(&conn, &item, "source")?;
/// # Ok(())
/// # }
/// ```
pub fn get_item_meta_value(conn: &Connection, item: &Item, name: String) -> Result<Option<String>> {
pub fn get_item_meta_value(conn: &Connection, item: &Item, name: &str) -> Result<Option<String>> {
debug!("DB: Getting item meta value: {item:?} {name:?}");
let mut statement = conn
.prepare_cached("SELECT value FROM metas WHERE id=?1 AND name=?2")
@@ -1331,8 +1443,10 @@ pub fn get_item_meta_value(conn: &Connection, item: &Item, name: String) -> Resu
/// # use keep::db;
/// # use keep::db::*;
/// # use std::path::PathBuf;
/// # use tempfile;
/// # fn main() -> anyhow::Result<()> {
/// let db_path = PathBuf::from("keep.db");
/// let _tmp = tempfile::tempdir()?;
/// let db_path = _tmp.path().join("keep.db");
/// let conn = db::open(db_path)?;
/// let ids = vec![1, 2, 3];
/// let tags_map = db::get_tags_for_items(&conn, &ids)?;
@@ -1398,8 +1512,10 @@ pub fn get_tags_for_items(
/// # use keep::db;
/// # use keep::db::*;
/// # use std::path::PathBuf;
/// # use tempfile;
/// # fn main() -> anyhow::Result<()> {
/// let db_path = PathBuf::from("keep.db");
/// let _tmp = tempfile::tempdir()?;
/// let db_path = _tmp.path().join("keep.db");
/// let conn = db::open(db_path)?;
/// let ids = vec![1, 2, 3];
/// let meta_map = db::get_meta_for_items(&conn, &ids)?;

167
src/export_tar.rs Normal file

@@ -0,0 +1,167 @@
use anyhow::{Context, Result, anyhow};
use log::debug;
use std::collections::HashSet;
use std::fs;
use std::io::{Read, Seek, Write};
use std::path::Path;
use tar::{Builder, Header};
use crate::filter_plugin::FilterChain;
use crate::modes::common::ExportMeta;
use crate::services::item_service::ItemService;
use crate::services::types::ItemWithMeta;
/// Compute the intersection of all items' tag sets.
///
/// Returns sorted tags that are present on ALL items.
pub fn common_tags(items: &[ItemWithMeta]) -> Vec<String> {
if items.is_empty() {
return Vec::new();
}
let mut common: HashSet<String> = items[0].tag_names().into_iter().collect();
for item in items.iter().skip(1) {
let item_tags: HashSet<String> = item.tag_names().into_iter().collect();
common = common.intersection(&item_tags).cloned().collect();
}
let mut result: Vec<String> = common.into_iter().collect();
result.sort();
result
}
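The same fold expressed over plain string sets, as a self-contained sketch of the intersection logic (names hypothetical):
```rust
use std::collections::HashSet;

fn intersect_all(sets: &[Vec<&str>]) -> Vec<String> {
    let mut iter = sets.iter();
    // Seed with the first set; an empty slice has no common tags.
    let mut common: HashSet<String> = match iter.next() {
        Some(first) => first.iter().map(|s| s.to_string()).collect(),
        None => return Vec::new(),
    };
    for tags in iter {
        let set: HashSet<String> = tags.iter().map(|s| s.to_string()).collect();
        common = common.intersection(&set).cloned().collect();
    }
    let mut out: Vec<String> = common.into_iter().collect();
    out.sort();
    out
}
```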
/// Resolve the export name from the CLI arg or compute default from common tags.
///
/// If `arg` is Some, uses that value directly.
/// Otherwise, computes `export_<common-tags>` or just `export` if no common tags.
pub fn export_name(arg: &Option<String>, items: &[ItemWithMeta]) -> String {
if let Some(name) = arg {
return name.clone();
}
let tags = common_tags(items);
if tags.is_empty() {
"export".to_string()
} else {
format!("export_{}", tags.join("_"))
}
}
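For example, items whose only shared tags are `build` and `ci` default to `export_build_ci` (the tags come back sorted from `common_tags`), while items with no common tag fall back to plain `export`.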
/// Write items to a tar archive, streaming data without loading files into memory.
///
/// The archive contains `<dir_name>/<id>.data.<compression>` and
/// `<dir_name>/<id>.meta.yml` for each item.
///
/// # Arguments
///
/// * `writer` - The output writer (e.g., a File).
/// * `dir_name` - Top-level directory name inside the tar.
/// * `items` - Items to export.
/// * `data_path` - Path to the data storage directory.
/// * `filter_chain` - Optional filter chain for transforming content on export.
/// * `item_service` - Item service for streaming content.
/// * `conn` - Database connection for filter chain operations.
pub fn write_export_tar<W: Write>(
writer: W,
dir_name: &str,
items: &[ItemWithMeta],
data_path: &Path,
filter_chain: Option<&FilterChain>,
item_service: &ItemService,
conn: &rusqlite::Connection,
) -> Result<()> {
let mut builder = Builder::new(writer);
for item_with_meta in items {
let item_id = item_with_meta.item.id.context("Item missing ID")?;
let compression = &item_with_meta.item.compression;
let item_tags = item_with_meta.tag_names();
let meta_map = item_with_meta.meta_as_map();
let data_path_entry = format!("{dir_name}/{item_id}.data.{compression}");
let meta_path_entry = format!("{dir_name}/{item_id}.meta.yml");
// Meta entry (small, in-memory is fine)
let export_meta = ExportMeta {
ts: item_with_meta.item.ts,
compression: compression.clone(),
uncompressed_size: item_with_meta.item.uncompressed_size,
tags: item_tags,
metadata: meta_map,
};
let meta_yaml = serde_yaml::to_string(&export_meta)?;
let meta_bytes = meta_yaml.into_bytes();
let meta_len = meta_bytes.len() as u64;
let mut meta_header = Header::new_gnu();
meta_header.set_size(meta_len);
meta_header.set_mode(0o644);
meta_header.set_path(&meta_path_entry)?;
meta_header.set_cksum();
builder
.append(&meta_header, meta_bytes.as_slice())
.with_context(|| format!("Cannot write meta entry for item {item_id}"))?;
debug!("EXPORT_TAR: Wrote meta entry {meta_path_entry}");
// Data entry
let mut item_file_path = data_path.to_path_buf();
item_file_path.push(item_id.to_string());
if let Some(chain) = filter_chain {
// Filtered export: spool through filter chain to a temp file,
// then stream the temp file into the tar with known size.
let (mut reader, _, _) = item_service.get_item_content_info_streaming_with_chain(
conn,
item_id,
Some(chain),
)?;
let mut tmp = tempfile::NamedTempFile::new()
.context("Cannot create temp file for filtered export")?;
let mut buf = [0u8; crate::common::PIPESIZE];
loop {
let n = reader.read(&mut buf)?;
if n == 0 {
break;
}
tmp.write_all(&buf[..n])?;
}
tmp.flush()?;
let total_size = tmp.as_file().metadata()?.len();
tmp.rewind()?;
let mut data_header = Header::new_gnu();
data_header.set_size(total_size);
data_header.set_mode(0o644);
data_header.set_path(&data_path_entry)?;
data_header.set_cksum();
builder
.append(&data_header, &mut tmp)
.with_context(|| format!("Cannot write data entry for item {item_id}"))?;
debug!("EXPORT_TAR: Wrote filtered data entry {data_path_entry} ({total_size} bytes)");
} else {
// Unfiltered export: stream raw compressed file
let file = fs::File::open(&item_file_path)
.with_context(|| format!("Cannot open data file: {}", item_file_path.display()))?;
let file_size = file.metadata()?.len();
let mut data_header = Header::new_gnu();
data_header.set_size(file_size);
data_header.set_mode(0o644);
data_header.set_path(&data_path_entry)?;
data_header.set_cksum();
builder
.append(&data_header, file)
.with_context(|| format!("Cannot write data entry for item {item_id}"))?;
debug!("EXPORT_TAR: Wrote data entry {data_path_entry} ({file_size} bytes)");
}
}
builder.finish().context("Cannot finalize tar archive")?;
debug!("EXPORT_TAR: Archive finalized");
Ok(())
}
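A hedged call-site sketch; the service, connection, and item values are assumptions, only the signature comes from this file:
```rust
let out = std::fs::File::create("export_build_ci.tar")?;
write_export_tar(
    out,
    "export_build_ci", // top-level directory inside the tar
    &items,            // &[ItemWithMeta] resolved by the caller
    &data_path,        // storage directory holding per-id data files
    None,              // no filter chain: raw compressed files are streamed
    &item_service,
    &conn,
)?;
```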

src/filter_plugin/exec.rs

@@ -164,13 +164,6 @@ impl FilterPlugin for ExecFilter {
Ok(())
}
/// Clones this filter into a new boxed instance.
///
/// Creates a new instance without active process handles.
///
/// # Returns
///
/// A new `Box<dyn FilterPlugin>` representing a clone of this filter.
fn clone_box(&self) -> Box<dyn FilterPlugin> {
Box::new(ExecFilter {
program: self.program.clone(),
@@ -224,5 +217,6 @@ fn register_exec_filter() {
stdin_writer: None,
stdout_reader: None,
})
});
})
.expect("Failed to register exec filter");
}

src/filter_plugin/grep.rs

@@ -87,21 +87,6 @@ impl FilterPlugin for GrepFilter {
Ok(())
}
/// Clones this filter into a new boxed instance.
///
/// Creates a new GrepFilter with the same regex pattern.
///
/// # Returns
///
/// A new `Box<dyn FilterPlugin>` representing a clone of this filter.
///
/// # Examples
///
/// ```
/// # use keep::filter_plugin::{FilterPlugin, GrepFilter};
/// let filter = GrepFilter::new("test".to_string()).unwrap();
/// let cloned = filter.clone_box();
/// ```
fn clone_box(&self) -> Box<dyn FilterPlugin> {
Box::new(Self {
regex: self.regex.clone(),
@@ -126,11 +111,7 @@ impl FilterPlugin for GrepFilter {
/// assert!(opts[0].required);
/// ```
fn options(&self) -> Vec<FilterOption> {
vec![FilterOption {
name: "pattern".to_string(),
default: None,
required: true,
}]
crate::filter_plugin::pattern_option()
}
fn description(&self) -> &str {

src/filter_plugin/head.rs

@@ -3,14 +3,7 @@ use crate::common::PIPESIZE;
use crate::services::filter_service::register_filter_plugin;
use std::io::{BufRead, Read, Result, Write};
/// A filter that reads the first N bytes from the input stream.
///
/// Limits the output to the initial bytes specified in the configuration.
/// Useful for previewing file contents without reading everything.
///
/// # Fields
///
/// * `remaining` - Number of bytes left to read before stopping.
#[derive(Clone)]
pub struct HeadBytesFilter {
remaining: usize,
}
@@ -94,21 +87,6 @@ impl FilterPlugin for HeadBytesFilter {
Ok(())
}
/// Clones this filter into a new boxed instance.
///
/// Creates an independent copy with the same configuration.
///
/// # Returns
///
/// A new `Box<dyn FilterPlugin>` clone.
///
/// # Examples
///
/// ```
/// # use keep::filter_plugin::{FilterPlugin, HeadBytesFilter};
/// let filter = HeadBytesFilter::new(100);
/// let cloned = filter.clone_box();
/// ```
fn clone_box(&self) -> Box<dyn FilterPlugin> {
Box::new(Self {
remaining: self.remaining,
@@ -134,11 +112,7 @@ impl FilterPlugin for HeadBytesFilter {
/// assert!(opts[0].required);
/// ```
fn options(&self) -> Vec<FilterOption> {
vec![FilterOption {
name: "count".to_string(),
default: None,
required: true,
}]
crate::filter_plugin::count_option()
}
fn description(&self) -> &str {
@@ -146,7 +120,7 @@ impl FilterPlugin for HeadBytesFilter {
}
}
/// A filter that reads the first N lines from the input stream.
#[derive(Clone)]
pub struct HeadLinesFilter {
remaining: usize,
}
@@ -228,21 +202,6 @@ impl FilterPlugin for HeadLinesFilter {
Ok(())
}
/// Clones this filter into a new boxed instance.
///
/// Creates an independent copy with the same configuration.
///
/// # Returns
///
/// A new `Box<dyn FilterPlugin>` clone.
///
/// # Examples
///
/// ```
/// # use keep::filter_plugin::{FilterPlugin, HeadLinesFilter};
/// let filter = HeadLinesFilter::new(5);
/// let cloned = filter.clone_box();
/// ```
fn clone_box(&self) -> Box<dyn FilterPlugin> {
Box::new(Self {
remaining: self.remaining,
@@ -250,29 +209,8 @@ impl FilterPlugin for HeadLinesFilter {
}
/// Returns the configuration options for this filter.
///
/// Defines the "count" parameter as required with no default.
///
/// # Returns
///
/// Vector of `FilterOption` describing parameters.
///
/// # Examples
///
/// ```
/// # use keep::filter_plugin::{FilterPlugin, HeadLinesFilter};
/// let filter = HeadLinesFilter::new(5);
/// let opts = filter.options();
/// assert_eq!(opts.len(), 1);
/// assert_eq!(opts[0].name, "count");
/// assert!(opts[0].required);
/// ```
fn options(&self) -> Vec<FilterOption> {
vec![FilterOption {
name: "count".to_string(),
default: None,
required: true,
}]
crate::filter_plugin::count_option()
}
fn description(&self) -> &str {
@@ -283,6 +221,8 @@ impl FilterPlugin for HeadLinesFilter {
// Register the plugin at module initialization time
#[ctor::ctor]
fn register_head_filters() {
register_filter_plugin("head_bytes", || Box::new(HeadBytesFilter::new(0)));
register_filter_plugin("head_lines", || Box::new(HeadLinesFilter::new(0)));
register_filter_plugin("head_bytes", || Box::new(HeadBytesFilter::new(0)))
.expect("Failed to register head_bytes filter");
register_filter_plugin("head_lines", || Box::new(HeadLinesFilter::new(0)))
.expect("Failed to register head_lines filter");
}

src/filter_plugin/mod.rs

@@ -2,6 +2,7 @@ use std::io::{Read, Result, Write};
use std::str::FromStr;
use strum::EnumString;
#[cfg(feature = "filter_grep")]
pub mod grep;
/// Filter plugin module for processing input streams.
///
@@ -16,7 +17,7 @@ pub mod grep;
/// ```
/// # use std::io::{Read, Write};
/// # use keep::filter_plugin::parse_filter_string;
/// let mut chain = parse_filter_string("head_lines(10)|grep(pattern=error)")?;
/// let mut chain = parse_filter_string("head_lines(10)|tail_lines(5)")?;
/// # let mut reader: &mut dyn Read = &mut std::io::empty();
/// # let mut writer: Vec<u8> = Vec::new();
/// # chain.filter(&mut reader, &mut writer)?;
@@ -26,12 +27,13 @@ pub mod head;
pub mod skip;
pub mod strip_ansi;
pub mod tail;
#[cfg(feature = "tokens")]
#[cfg(feature = "meta_tokens")]
pub mod tokens;
pub mod utils;
use std::collections::HashMap;
#[cfg(feature = "filter_grep")]
pub use grep::GrepFilter;
pub use head::{HeadBytesFilter, HeadLinesFilter};
pub use skip::{SkipBytesFilter, SkipLinesFilter};
@@ -108,18 +110,16 @@ pub trait FilterPlugin: Send {
/// struct MyFilter;
/// impl FilterPlugin for MyFilter {
/// fn filter(&mut self, reader: &mut dyn Read, writer: &mut dyn Write) -> Result<()> {
/// // Read and filter data
/// let mut buf = [0; 1024];
/// loop {
/// let n = reader.read(&mut buf)?;
/// if n == 0 { break; }
/// // Apply filter logic to buf[0..n]
/// writer.write_all(&buf[0..n])?;
/// }
/// Ok(())
/// }
/// fn clone_box(&self) -> Box<dyn FilterPlugin> {
/// Box::new(MyFilter)
/// Box::new(Self)
/// }
/// fn options(&self) -> Vec<FilterOption> {
/// vec![]
@@ -131,22 +131,6 @@ pub trait FilterPlugin: Send {
Ok(())
}
/// Clones this plugin into a new boxed instance.
///
/// This method is required for dynamic dispatch and cloning in filter chains.
///
/// # Returns
///
/// A new `Box<dyn FilterPlugin>` clone of the current plugin.
///
/// # Examples
///
/// ```
/// # use keep::filter_plugin::FilterPlugin;
/// fn example_clone_box(filter: &dyn FilterPlugin) -> Box<dyn FilterPlugin> {
/// filter.clone_box()
/// }
/// ```
fn clone_box(&self) -> Box<dyn FilterPlugin>;
/// Returns the configuration options for this plugin.
@@ -183,6 +167,22 @@ pub trait FilterPlugin: Send {
}
}
pub fn count_option() -> Vec<FilterOption> {
vec![FilterOption {
name: "count".to_string(),
default: None,
required: true,
}]
}
pub fn pattern_option() -> Vec<FilterOption> {
vec![FilterOption {
name: "pattern".to_string(),
default: None,
required: true,
}]
}
/// Enum representing the different types of filters.
///
/// Used for parsing and instantiating specific filter plugins.
@@ -201,13 +201,14 @@ pub enum FilterType {
TailLines,
SkipBytes,
SkipLines,
#[cfg(feature = "filter_grep")]
Grep,
StripAnsi,
#[cfg(feature = "tokens")]
#[cfg(feature = "meta_tokens")]
HeadTokens,
#[cfg(feature = "tokens")]
#[cfg(feature = "meta_tokens")]
SkipTokens,
#[cfg(feature = "tokens")]
#[cfg(feature = "meta_tokens")]
TailTokens,
}
@@ -215,6 +216,44 @@ pub enum FilterType {
/// Prevents OOM on large files by rejecting inputs that exceed this limit.
const MAX_FILTER_BUFFER_SIZE: usize = 256 * 1024 * 1024;
struct BoundedVecWriter {
data: Vec<u8>,
limit: usize,
}
impl BoundedVecWriter {
fn new(limit: usize) -> Self {
Self {
data: Vec::new(),
limit,
}
}
fn into_inner(self) -> Vec<u8> {
self.data
}
}
impl std::io::Write for BoundedVecWriter {
fn write(&mut self, buf: &[u8]) -> std::io::Result<usize> {
if self.data.len() + buf.len() > self.limit {
return Err(std::io::Error::new(
std::io::ErrorKind::InvalidData,
format!(
"Input size exceeds maximum filter buffer size ({} bytes)",
MAX_FILTER_BUFFER_SIZE
),
));
}
self.data.write_all(buf)?;
Ok(buf.len())
}
fn flush(&mut self) -> std::io::Result<()> {
Ok(())
}
}
/// A chain of filter plugins applied sequentially.
///
/// Chains multiple filters, applying them in order to the input stream.
@@ -262,16 +301,27 @@ impl Clone for FilterChain {
}
impl Clone for Box<dyn FilterPlugin> {
/// Clones the boxed filter plugin.
///
/// # Returns
///
/// A new boxed clone of the filter plugin.
fn clone(&self) -> Self {
self.clone_box()
}
}
#[macro_export]
macro_rules! filter_clone_box {
($self:expr) => {
Box::new($self.clone())
};
($self:expr, $field:ident) => {
Box::new(Self { $field: $self.$field.clone() })
};
($self:expr, $field:ident, $($rest:ident),+) => {
Box::new(Self {
$field: $self.$field.clone(),
$($rest: $self.$rest.clone()),+
})
};
}
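How a filter might use the macro inside `clone_box`, sketched with a hypothetical filter:
```rust
fn clone_box(&self) -> Box<dyn FilterPlugin> {
    // For a #[derive(Clone)] filter this expands to Box::new(self.clone()):
    filter_clone_box!(self)
    // Filters that cannot derive Clone list their fields instead, e.g.:
    // filter_clone_box!(self, remaining, encoding)
}
```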
impl Default for FilterChain {
fn default() -> Self {
Self::new()
@@ -309,9 +359,8 @@ impl FilterChain {
/// # Examples
///
/// ```
/// # use keep::filter_plugin::{FilterChain, GrepFilter};
/// # use keep::filter_plugin::FilterChain;
/// let mut chain = FilterChain::new();
/// chain.add_plugin(Box::new(GrepFilter::new("error".to_string()).unwrap()));
/// ```
pub fn add_plugin(&mut self, plugin: Box<dyn FilterPlugin>) {
self.plugins.push(plugin);
@@ -351,21 +400,10 @@ impl FilterChain {
}
// For multiple plugins, we need to chain them together
// We'll use a temporary buffer to hold intermediate results
let mut current_data = Vec::new();
std::io::copy(reader, &mut current_data)?;
if current_data.len() > MAX_FILTER_BUFFER_SIZE {
return Err(std::io::Error::new(
std::io::ErrorKind::InvalidData,
format!(
"Input size ({} bytes) exceeds maximum filter buffer size ({} bytes). \
Consider using fewer filter plugins or smaller inputs.",
current_data.len(),
MAX_FILTER_BUFFER_SIZE
),
));
}
// We'll use a bounded buffer to hold intermediate results
let mut bounded_writer = BoundedVecWriter::new(MAX_FILTER_BUFFER_SIZE);
std::io::copy(reader, &mut bounded_writer)?;
let mut current_data = bounded_writer.into_inner();
// Store the plugins length to avoid borrowing issues
let plugins_len = self.plugins.len();
@@ -499,6 +537,7 @@ fn create_filter_with_options(
// Get the default options for this filter type by creating a temporary instance
// To do this, we need to create a default instance of the appropriate filter
let option_defs = match filter_type {
#[cfg(feature = "filter_grep")]
FilterType::Grep => grep::GrepFilter::new("".to_string())?.options(),
FilterType::HeadBytes => head::HeadBytesFilter::new(0).options(),
FilterType::HeadLines => head::HeadLinesFilter::new(0).options(),
@@ -507,11 +546,11 @@ fn create_filter_with_options(
FilterType::SkipBytes => skip::SkipBytesFilter::new(0).options(),
FilterType::SkipLines => skip::SkipLinesFilter::new(0).options(),
FilterType::StripAnsi => strip_ansi::StripAnsiFilter::new().options(),
#[cfg(feature = "tokens")]
#[cfg(feature = "meta_tokens")]
FilterType::HeadTokens => tokens::HeadTokensFilter::new(0).options(),
#[cfg(feature = "tokens")]
#[cfg(feature = "meta_tokens")]
FilterType::SkipTokens => tokens::SkipTokensFilter::new(0).options(),
#[cfg(feature = "tokens")]
#[cfg(feature = "meta_tokens")]
FilterType::TailTokens => tokens::TailTokensFilter::new(0).options(),
};
@@ -581,6 +620,7 @@ fn create_specific_filter(
options: &HashMap<String, serde_json::Value>,
) -> Result<Box<dyn FilterPlugin>> {
match filter_type {
#[cfg(feature = "filter_grep")]
FilterType::Grep => {
let pattern = options
.get("pattern")
@@ -681,7 +721,7 @@ fn create_specific_filter(
}
Ok(Box::new(strip_ansi::StripAnsiFilter::new()))
}
#[cfg(feature = "tokens")]
#[cfg(feature = "meta_tokens")]
FilterType::HeadTokens => {
let count = options
.get("count")
@@ -693,17 +733,13 @@ fn create_specific_filter(
"head_tokens filter requires 'count' parameter",
)
})?;
let encoding = options
.get("encoding")
.and_then(|v| v.as_str())
.and_then(|s| s.parse::<crate::tokenizer::TokenEncoding>().ok())
.unwrap_or_default();
let (encoding, tokenizer) = parse_encoding_option(options);
let mut f = tokens::HeadTokensFilter::new(count);
f.tokenizer = crate::tokenizer::get_tokenizer(encoding).clone();
f.tokenizer = tokenizer;
f.encoding = encoding;
Ok(Box::new(f))
}
#[cfg(feature = "tokens")]
#[cfg(feature = "meta_tokens")]
FilterType::SkipTokens => {
let count = options
.get("count")
@@ -715,17 +751,13 @@ fn create_specific_filter(
"skip_tokens filter requires 'count' parameter",
)
})?;
let encoding = options
.get("encoding")
.and_then(|v| v.as_str())
.and_then(|s| s.parse::<crate::tokenizer::TokenEncoding>().ok())
.unwrap_or_default();
let (encoding, tokenizer) = parse_encoding_option(options);
let mut f = tokens::SkipTokensFilter::new(count);
f.tokenizer = crate::tokenizer::get_tokenizer(encoding).clone();
f.tokenizer = tokenizer;
f.encoding = encoding;
Ok(Box::new(f))
}
#[cfg(feature = "tokens")]
#[cfg(feature = "meta_tokens")]
FilterType::TailTokens => {
let count = options
.get("count")
@@ -737,17 +769,26 @@ fn create_specific_filter(
"tail_tokens filter requires 'count' parameter",
)
})?;
let (encoding, tokenizer) = parse_encoding_option(options);
let mut f = tokens::TailTokensFilter::new(count);
f.tokenizer = tokenizer;
f.encoding = encoding;
Ok(Box::new(f))
}
}
}
#[cfg(feature = "meta_tokens")]
fn parse_encoding_option(
options: &std::collections::HashMap<String, serde_json::Value>,
) -> (crate::tokenizer::TokenEncoding, crate::tokenizer::Tokenizer) {
let encoding = options
.get("encoding")
.and_then(|v| v.as_str())
.and_then(|s| s.parse::<crate::tokenizer::TokenEncoding>().ok())
.unwrap_or_default();
let mut f = tokens::TailTokensFilter::new(count);
f.tokenizer = crate::tokenizer::get_tokenizer(encoding).clone();
f.encoding = encoding;
Ok(Box::new(f))
}
}
let tokenizer = crate::tokenizer::get_tokenizer(encoding).clone();
(encoding, tokenizer)
}
/// Parses an option value from a string into a JSON value.

src/filter_plugin/skip.rs

@@ -4,6 +4,7 @@ use crate::services::filter_service::register_filter_plugin;
use std::io::{BufRead, Read, Result, Write};
/// A filter that skips the first N bytes from the input stream.
#[derive(Clone)]
pub struct SkipBytesFilter {
remaining: usize,
}
@@ -49,11 +50,6 @@ impl FilterPlugin for SkipBytesFilter {
Ok(())
}
/// Clones this filter into a new boxed instance.
///
/// # Returns
///
/// A new `Box<dyn FilterPlugin>` representing a clone of this filter.
fn clone_box(&self) -> Box<dyn FilterPlugin> {
Box::new(Self {
remaining: self.remaining,
@@ -61,16 +57,8 @@ impl FilterPlugin for SkipBytesFilter {
}
/// Returns the configuration options for this filter.
///
/// # Returns
///
/// A vector of `FilterOption` describing the filter's configurable parameters.
fn options(&self) -> Vec<FilterOption> {
vec![FilterOption {
name: "count".to_string(),
default: None,
required: true,
}]
crate::filter_plugin::count_option()
}
fn description(&self) -> &str {
@@ -79,6 +67,7 @@ impl FilterPlugin for SkipBytesFilter {
}
/// A filter that skips the first N lines from the input stream.
#[derive(Clone)]
pub struct SkipLinesFilter {
remaining: usize,
}
@@ -118,11 +107,6 @@ impl FilterPlugin for SkipLinesFilter {
Ok(())
}
/// Clones this filter into a new boxed instance.
///
/// # Returns
///
/// A new `Box<dyn FilterPlugin>` representing a clone of this filter.
fn clone_box(&self) -> Box<dyn FilterPlugin> {
Box::new(Self {
remaining: self.remaining,
@@ -130,16 +114,8 @@ impl FilterPlugin for SkipLinesFilter {
}
/// Returns the configuration options for this filter.
///
/// # Returns
///
/// A vector of `FilterOption` describing the filter's configurable parameters.
fn options(&self) -> Vec<FilterOption> {
vec![FilterOption {
name: "count".to_string(),
default: None,
required: true,
}]
crate::filter_plugin::count_option()
}
fn description(&self) -> &str {
@@ -150,6 +126,8 @@ impl FilterPlugin for SkipLinesFilter {
// Register the plugin at module initialization time
#[ctor::ctor]
fn register_skip_filters() {
register_filter_plugin("skip_bytes", || Box::new(SkipBytesFilter::new(0)));
register_filter_plugin("skip_lines", || Box::new(SkipLinesFilter::new(0)));
register_filter_plugin("skip_bytes", || Box::new(SkipBytesFilter::new(0)))
.expect("Failed to register skip_bytes filter");
register_filter_plugin("skip_lines", || Box::new(SkipLinesFilter::new(0)))
.expect("Failed to register skip_lines filter");
}

src/filter_plugin/strip_ansi.rs

@@ -7,7 +7,7 @@ use strip_ansi_escapes::Writer;
/// # Fields
///
/// None, stateless filter.
#[derive(Default)]
#[derive(Default, Clone)]
pub struct StripAnsiFilter;
impl StripAnsiFilter {
@@ -39,22 +39,12 @@ impl FilterPlugin for StripAnsiFilter {
Ok(())
}
/// Clones this filter into a new boxed instance.
///
/// # Returns
///
/// A new `Box<dyn FilterPlugin>` representing a clone of this filter.
fn clone_box(&self) -> Box<dyn FilterPlugin> {
Box::new(Self)
}
/// Returns the configuration options for this filter (none required).
///
/// # Returns
///
/// An empty vector since this filter has no configurable options.
fn options(&self) -> Vec<FilterOption> {
Vec::new() // strip_ansi doesn't take any options
Vec::new()
}
fn description(&self) -> &str {

src/filter_plugin/tail.rs

@@ -4,7 +4,7 @@ use crate::services::filter_service::register_filter_plugin;
use std::collections::VecDeque;
use std::io::{BufRead, Read, Result, Write};
/// A filter that reads the last N bytes from the input stream.
#[derive(Clone)]
pub struct TailBytesFilter {
buffer: VecDeque<u8>,
count: usize,
@@ -58,11 +58,6 @@ impl FilterPlugin for TailBytesFilter {
Ok(())
}
/// Clones this filter into a new boxed instance.
///
/// # Returns
///
/// A new `Box<dyn FilterPlugin>` representing a clone of this filter.
fn clone_box(&self) -> Box<dyn FilterPlugin> {
Box::new(Self {
buffer: self.buffer.clone(),
@@ -71,16 +66,8 @@ impl FilterPlugin for TailBytesFilter {
}
/// Returns the configuration options for this filter.
///
/// # Returns
///
/// A vector of `FilterOption` describing the filter's configurable parameters.
fn options(&self) -> Vec<FilterOption> {
vec![FilterOption {
name: "count".to_string(),
default: None,
required: true,
}]
crate::filter_plugin::count_option()
}
fn description(&self) -> &str {
@@ -89,6 +76,7 @@ impl FilterPlugin for TailBytesFilter {
}
/// A filter that reads the last N lines from the input stream.
#[derive(Clone)]
pub struct TailLinesFilter {
lines: VecDeque<String>,
count: usize,
@@ -136,11 +124,6 @@ impl FilterPlugin for TailLinesFilter {
Ok(())
}
/// Clones this filter into a new boxed instance.
///
/// # Returns
///
/// A new `Box<dyn FilterPlugin>` representing a clone of this filter.
fn clone_box(&self) -> Box<dyn FilterPlugin> {
Box::new(Self {
lines: self.lines.clone(),
@@ -149,16 +132,8 @@ impl FilterPlugin for TailLinesFilter {
}
/// Returns the configuration options for this filter.
///
/// # Returns
///
/// A vector of `FilterOption` describing the filter's configurable parameters.
fn options(&self) -> Vec<FilterOption> {
vec![FilterOption {
name: "count".to_string(),
default: None,
required: true,
}]
crate::filter_plugin::count_option()
}
fn description(&self) -> &str {
@@ -169,6 +144,8 @@ impl FilterPlugin for TailLinesFilter {
// Register the plugin at module initialization time
#[ctor::ctor]
fn register_tail_filters() {
register_filter_plugin("tail_bytes", || Box::new(TailBytesFilter::new(0)));
register_filter_plugin("tail_lines", || Box::new(TailLinesFilter::new(0)));
register_filter_plugin("tail_bytes", || Box::new(TailBytesFilter::new(0)))
.expect("Failed to register tail_bytes filter");
register_filter_plugin("tail_lines", || Box::new(TailLinesFilter::new(0)))
.expect("Failed to register tail_lines filter");
}

View File

@@ -8,11 +8,7 @@ use std::io::{Read, Result, Write};
// head_tokens
// ---------------------------------------------------------------------------
/// A filter that outputs only the first N tokens of the input stream.
///
/// Streams bytes directly until the token limit is reached. When the limit
/// falls mid-chunk, uses `split_by_token_iter` to find the exact byte boundary
/// without allocating token strings beyond what is needed.
#[derive(Clone)]
pub struct HeadTokensFilter {
pub remaining: usize,
pub tokenizer: Tokenizer,
@@ -78,7 +74,7 @@ impl FilterPlugin for HeadTokensFilter {
fn clone_box(&self) -> Box<dyn FilterPlugin> {
Box::new(Self {
remaining: self.remaining,
tokenizer: get_tokenizer(self.encoding).clone(),
tokenizer: self.tokenizer.clone(),
encoding: self.encoding,
})
}
@@ -107,7 +103,7 @@ impl FilterPlugin for HeadTokensFilter {
// skip_tokens
// ---------------------------------------------------------------------------
/// A filter that skips the first N tokens of the input stream and outputs the rest.
#[derive(Clone)]
pub struct SkipTokensFilter {
pub remaining: usize,
pub tokenizer: Tokenizer,
@@ -180,7 +176,7 @@ impl FilterPlugin for SkipTokensFilter {
fn clone_box(&self) -> Box<dyn FilterPlugin> {
Box::new(Self {
remaining: self.remaining,
tokenizer: get_tokenizer(self.encoding).clone(),
tokenizer: self.tokenizer.clone(),
encoding: self.encoding,
})
}
@@ -211,8 +207,7 @@ impl FilterPlugin for SkipTokensFilter {
/// A filter that outputs only the last N tokens of the input stream.
///
/// Buffers all bytes from the stream, then at finalize tokenizes the
/// content and writes only the last N tokens.
#[derive(Clone)]
pub struct TailTokensFilter {
pub count: usize,
/// Buffer holding all bytes from the stream.
@@ -275,8 +270,8 @@ impl FilterPlugin for TailTokensFilter {
fn clone_box(&self) -> Box<dyn FilterPlugin> {
Box::new(Self {
count: self.count,
buffer: self.buffer.clone(),
tokenizer: get_tokenizer(self.encoding).clone(),
buffer: Vec::new(),
tokenizer: self.tokenizer.clone(),
encoding: self.encoding,
})
}
@@ -377,9 +372,12 @@ fn map_lossy_pos_to_bytes(original: &[u8], lossy: &str, lossy_pos: usize) -> usi
#[ctor::ctor]
fn register_token_filters() {
register_filter_plugin("head_tokens", || Box::new(HeadTokensFilter::new(0)));
register_filter_plugin("skip_tokens", || Box::new(SkipTokensFilter::new(0)));
register_filter_plugin("tail_tokens", || Box::new(TailTokensFilter::new(0)));
register_filter_plugin("head_tokens", || Box::new(HeadTokensFilter::new(0)))
.expect("Failed to register head_tokens filter");
register_filter_plugin("skip_tokens", || Box::new(SkipTokensFilter::new(0)))
.expect("Failed to register skip_tokens filter");
register_filter_plugin("tail_tokens", || Box::new(TailTokensFilter::new(0)))
.expect("Failed to register tail_tokens filter");
}
#[cfg(test)]
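
Note on the registration changes across these filter files: the new `.expect()` calls imply that `register_filter_plugin` (from `services/filter_service.rs`) now returns a `Result`, matching the `register_meta_plugin` change later in this diff. A minimal sketch of that shape, assuming a `LazyLock`-guarded registry (the real registry is not shown here):

use std::collections::HashMap;
use std::sync::{LazyLock, Mutex};

// Hypothetical registry; constructors are plain fn pointers, so the
// non-capturing closures at the call sites coerce automatically.
type FilterConstructor = fn() -> Box<dyn FilterPlugin>;

static FILTER_REGISTRY: LazyLock<Mutex<HashMap<String, FilterConstructor>>> =
    LazyLock::new(|| Mutex::new(HashMap::new()));

pub fn register_filter_plugin(name: &str, ctor: FilterConstructor) -> anyhow::Result<()> {
    FILTER_REGISTRY
        .lock()
        .map_err(|e| anyhow::anyhow!("filter registry poisoned: {e}"))?
        .insert(name.to_string(), ctor);
    Ok(())
}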

src/import_tar.rs (new file, 225 lines)
View File

@@ -0,0 +1,225 @@
use anyhow::{Context, Result, anyhow};
use log::debug;
use std::collections::HashMap;
use std::fs;
use std::io::{Read, Write};
use std::path::Path;
use std::str::FromStr;
use tempfile::TempDir;
use tar::Archive;
use crate::common::PIPESIZE;
use crate::compression_engine::CompressionType;
use crate::db;
use crate::modes::common::ImportMeta;
/// Represents a parsed tar entry from an export archive.
struct TarEntry {
/// Path to the extracted data file in the temp directory.
data_path: Option<std::path::PathBuf>,
/// Path to the extracted meta file in the temp directory.
meta_path: Option<std::path::PathBuf>,
}
/// Import all items from a `.keep.tar` archive.
///
/// Items are imported in ascending order of their original IDs,
/// ensuring chronological ordering is preserved. Each imported item
/// receives a new auto-incremented ID from the target database.
///
/// # Arguments
///
/// * `tar_path` - Path to the `.keep.tar` file.
/// * `conn` - Mutable database connection.
/// * `data_path` - Path to the data storage directory.
///
/// # Returns
///
/// A list of newly assigned item IDs.
pub fn import_from_tar(
tar_path: &Path,
conn: &mut rusqlite::Connection,
data_path: &Path,
) -> Result<Vec<i64>> {
let file = fs::File::open(tar_path)
.with_context(|| format!("Cannot open tar file: {}", tar_path.display()))?;
let mut archive = Archive::new(file);
let tmp_dir = TempDir::new().context("Cannot create temporary directory for import")?;
let tmp_path = tmp_dir.path();
// Extract entries to temp dir
let mut entries_map: HashMap<i64, TarEntry> = HashMap::new();
for entry_result in archive.entries().context("Cannot read tar entries")? {
let mut entry = entry_result.context("Cannot read tar entry")?;
let entry_path = entry.path().context("Cannot get entry path")?.to_path_buf();
let path_str = entry_path.to_string_lossy().replace('\\', "/");
// Reject path traversal attempts
if path_str.starts_with('/') || path_str.starts_with("..") || path_str.contains("/../") {
return Err(anyhow!("Rejected path traversal entry: {path_str}"));
}
// Skip directory entries
if entry.header().entry_type().is_dir() {
debug!("IMPORT_TAR: Skipping directory entry: {path_str}");
continue;
}
// Parse: <dir>/<id>.data.<compression> or <dir>/<id>.meta.yml
let filename = entry_path
.file_name()
.ok_or_else(|| anyhow!("Invalid entry path: {path_str}"))?
.to_string_lossy();
let (orig_id, is_data) = if let Some(id_str) = filename.strip_suffix(".meta.yml") {
let id: i64 = id_str
.parse()
.with_context(|| format!("Invalid ID in entry: {path_str}"))?;
(id, false)
} else if let Some(dot_pos) = filename.find(".data.") {
let id_str = &filename[..dot_pos];
let id: i64 = id_str
.parse()
.with_context(|| format!("Invalid ID in entry: {path_str}"))?;
(id, true)
} else {
debug!("IMPORT_TAR: Skipping unrecognized entry: {path_str}");
continue;
};
let entry_ref = entries_map.entry(orig_id).or_insert_with(|| TarEntry {
data_path: None,
meta_path: None,
});
if is_data {
let dest = tmp_path.join(format!("{orig_id}.data"));
let mut dest_file = fs::File::create(&dest).context("Cannot create temp data file")?;
let mut buf = [0u8; PIPESIZE];
loop {
let n = entry.read(&mut buf)?;
if n == 0 {
break;
}
dest_file.write_all(&buf[..n])?;
}
entry_ref.data_path = Some(dest);
debug!("IMPORT_TAR: Extracted data for original ID {orig_id}");
} else {
let dest = tmp_path.join(format!("{orig_id}.meta.yml"));
let mut dest_file = fs::File::create(&dest).context("Cannot create temp meta file")?;
let mut buf = [0u8; PIPESIZE];
loop {
let n = entry.read(&mut buf)?;
if n == 0 {
break;
}
dest_file.write_all(&buf[..n])?;
}
entry_ref.meta_path = Some(dest);
debug!("IMPORT_TAR: Extracted meta for original ID {orig_id}");
}
}
if entries_map.is_empty() {
return Err(anyhow!("No items found in archive"));
}
// Sort by original ID ascending
let mut sorted_ids: Vec<i64> = entries_map.keys().copied().collect();
sorted_ids.sort_unstable();
let mut imported_ids = Vec::new();
for orig_id in sorted_ids {
let entry = entries_map.get(&orig_id).expect("ID should exist in map");
let meta_path = entry
.meta_path
.as_ref()
.ok_or_else(|| anyhow!("Item {orig_id} missing .meta.yml entry"))?;
let data_path_entry = entry
.data_path
.as_ref()
.ok_or_else(|| anyhow!("Item {orig_id} missing .data entry"))?;
// Parse metadata
let meta_yaml = fs::read_to_string(meta_path)
.with_context(|| format!("Cannot read meta file for item {orig_id}"))?;
let import_meta: ImportMeta = serde_yaml::from_str(&meta_yaml)
.with_context(|| format!("Cannot parse meta file for item {orig_id}"))?;
// Validate compression type
CompressionType::from_str(&import_meta.compression).map_err(|_| {
anyhow!(
"Invalid compression type '{}' for item {}",
import_meta.compression,
orig_id
)
})?;
// Create item with original timestamp
let item = db::insert_item_with_ts(conn, import_meta.ts, &import_meta.compression)?;
let new_id = item.id.context("New item missing ID")?;
// Set tags
let tags = if !import_meta.tags.is_empty() {
db::set_item_tags(conn, item.clone(), &import_meta.tags)?;
import_meta.tags.clone()
} else {
Vec::new()
};
// Stream data to storage
let mut storage_path = data_path.to_path_buf();
storage_path.push(new_id.to_string());
let mut reader = fs::File::open(data_path_entry)
.with_context(|| format!("Cannot read data file for item {orig_id}"))?;
let mut writer = fs::File::create(&storage_path)
.with_context(|| format!("Cannot create storage file for item {new_id}"))?;
let mut buf = [0u8; PIPESIZE];
let mut total = 0i64;
loop {
let n = reader.read(&mut buf)?;
if n == 0 {
break;
}
writer.write_all(&buf[..n])?;
total += n as i64;
}
if total == 0 {
return Err(anyhow!("Item {orig_id} has empty data file"));
}
// Set metadata
for (key, value) in &import_meta.metadata {
db::query_upsert_meta(
conn,
db::Meta {
id: new_id,
name: key.clone(),
value: value.clone(),
},
)?;
}
// Update item sizes
let size_to_record = import_meta.uncompressed_size.unwrap_or(total);
let mut updated_item = item;
updated_item.uncompressed_size = Some(size_to_record);
updated_item.compressed_size = Some(std::fs::metadata(&storage_path)?.len() as i64);
updated_item.closed = true;
db::update_item(conn, updated_item)?;
log::info!("KEEP: Imported item {new_id} (was {orig_id}) tags: {tags:?}");
imported_ids.push(new_id);
}
Ok(imported_ids)
}
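
A hedged usage sketch for the function above; the paths and database setup are illustrative, not taken from this diff:

use std::path::Path;

fn run_import() -> anyhow::Result<()> {
    // Illustrative locations; real callers get these from Settings/config.
    let mut conn = rusqlite::Connection::open("keep.db")?;
    let new_ids = import_from_tar(
        Path::new("export_2026-03-21.keep.tar"),
        &mut conn,
        Path::new("/var/lib/keep/data"),
    )?;
    println!("imported item IDs: {new_ids:?}");
    Ok(())
}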

View File

@@ -35,7 +35,9 @@ pub mod common;
pub mod compression_engine;
pub mod config;
pub mod db;
pub mod export_tar;
pub mod filter_plugin;
pub mod import_tar;
pub mod meta_plugin;
pub mod modes;
pub mod services;
@@ -43,19 +45,23 @@ pub mod services;
#[cfg(feature = "client")]
pub mod client;
#[cfg(feature = "tokens")]
#[cfg(feature = "meta_tokens")]
pub mod tokenizer;
// Re-export Args struct for library usage
pub use args::Args;
// Re-export PIPESIZE constant
pub use common::PIPESIZE;
pub use services::CoreError;
// Import all filter plugins to ensure they register themselves
#[allow(unused_imports)]
use filter_plugin::{grep, head, skip, strip_ansi, tail};
#[cfg(feature = "filter_grep")]
use filter_plugin::grep;
#[allow(unused_imports)]
use filter_plugin::{head, skip, strip_ansi, tail};
#[cfg(feature = "tokens")]
#[cfg(feature = "meta_tokens")]
#[allow(unused_imports)]
use filter_plugin::tokens as token_filters;
@@ -63,14 +69,22 @@ use crate::meta_plugin::{
cwd, digest, env, exec, hostname, keep_pid, read_rate, read_time, shell, shell_pid, user,
};
#[cfg(feature = "magic")]
#[cfg(feature = "meta_magic")]
#[allow(unused_imports)]
use crate::meta_plugin::magic_file;
#[cfg(feature = "tokens")]
#[cfg(feature = "meta_tokens")]
#[allow(unused_imports)]
use crate::meta_plugin::tokens;
#[cfg(feature = "meta_infer")]
#[allow(unused_imports)]
use crate::meta_plugin::infer_plugin;
#[cfg(feature = "meta_tree_magic_mini")]
#[allow(unused_imports)]
use crate::meta_plugin::tree_magic_mini;
/// Initializes plugins at library load time.
///
/// Plugin registration happens automatically via `#[ctor]` constructors

View File

@@ -28,6 +28,12 @@ fn main() -> Result<(), Error> {
cmd.error(ErrorKind::ValueValidation, e).exit();
}
// Handle --generate-completion early (prints to stdout and exits)
if let Some(shell) = args.mode.generate_completion {
clap_complete::generate(shell, &mut Args::command(), "keep", &mut std::io::stdout());
std::process::exit(0);
}
let start = Instant::now();
let mut builder = env_logger::Builder::new();
let show_module = args.options.verbose >= 2;
@@ -75,7 +81,7 @@ fn main() -> Result<(), Error> {
let ids = &mut Vec::new();
let tags = &mut Vec::new();
// For --info and --get modes, treat numeric strings as IDs
// For --info, --get, --export, and --list modes, treat numeric strings as IDs
for v in args.ids_or_tags.iter() {
debug!("MAIN: Parsed value: {v:?}");
match v.clone() {
@@ -84,22 +90,15 @@ fn main() -> Result<(), Error> {
ids.push(num)
}
NumberOrString::Str(str) => {
// For --info and --get, try to parse strings as numbers to treat them as IDs
if args.mode.info || args.mode.get {
if let Ok(num) = str.parse::<i64>() {
// For --info, --get, --export, and --list, try to parse strings as numbers to treat them as IDs
if (args.mode.info || args.mode.get || args.mode.export || args.mode.list)
&& let Ok(num) = str.parse::<i64>()
{
debug!("MAIN: Adding parsed string to ids: {num}");
ids.push(num);
continue;
} else if args.mode.info {
// --info only accepts numeric IDs
cmd.error(
ErrorKind::InvalidValue,
format!("--info requires numeric IDs, found: '{str}'"),
)
.exit();
}
}
// If not a number, or not using --info/--get, treat as tag
// If not a number, or not using --info/--get/--export/--list, treat as tag
debug!("MAIN: Adding to tags: {str}");
tags.push(str)
}
@@ -118,8 +117,12 @@ fn main() -> Result<(), Error> {
List,
Delete,
Info,
Update,
Export,
Import,
Status,
StatusPlugins,
#[cfg(feature = "server")]
Server,
GenerateConfig,
}
@@ -138,13 +141,24 @@ fn main() -> Result<(), Error> {
mode = KeepModes::Delete;
} else if args.mode.info {
mode = KeepModes::Info;
} else if args.mode.update {
mode = KeepModes::Update;
} else if args.mode.export {
mode = KeepModes::Export;
} else if args.mode.import.is_some() {
mode = KeepModes::Import;
} else if args.mode.status {
mode = KeepModes::Status;
} else if args.mode.status_plugins {
mode = KeepModes::StatusPlugins;
} else if args.mode.server {
}
#[cfg(feature = "server")]
{
if args.mode.server {
mode = KeepModes::Server;
} else if args.mode.generate_config {
}
}
if args.mode.generate_config {
mode = KeepModes::GenerateConfig;
}
@@ -180,6 +194,7 @@ fn main() -> Result<(), Error> {
}
// Validate server password usage
#[cfg(feature = "server")]
if settings.server_password().is_some() && mode != KeepModes::Server {
cmd.error(
ErrorKind::InvalidValue,
@@ -188,6 +203,15 @@ fn main() -> Result<(), Error> {
.exit();
}
// Validate ids-only usage
if settings.ids_only && mode != KeepModes::List {
cmd.error(
ErrorKind::InvalidValue,
"--ids-only can only be used with --list mode",
)
.exit();
}
debug!("MAIN: args: {args:?}");
debug!("MAIN: ids: {ids:?}");
debug!("MAIN: tags: {tags:?}");
@@ -223,7 +247,11 @@ fn main() -> Result<(), Error> {
return match mode {
KeepModes::Save => {
let metadata = std::collections::HashMap::new();
let metadata: std::collections::HashMap<String, String> = settings
.meta
.iter()
.filter_map(|(k, v)| v.as_ref().map(|val| (k.clone(), val.clone())))
.collect();
keep::modes::client::save::mode(&client, &mut cmd, &settings, tags, metadata)
}
KeepModes::Get => keep::modes::client::get::mode(
@@ -235,7 +263,7 @@ fn main() -> Result<(), Error> {
filter_chain,
),
KeepModes::List => {
keep::modes::client::list::mode(&client, &mut cmd, &settings, tags)
keep::modes::client::list::mode(&client, &mut cmd, &settings, ids, tags)
}
KeepModes::Delete => {
keep::modes::client::delete::mode(&client, &mut cmd, &settings, ids)
@@ -249,6 +277,16 @@ fn main() -> Result<(), Error> {
KeepModes::Status => {
keep::modes::client::status::mode(&client, &mut cmd, &settings)
}
KeepModes::Update => {
keep::modes::client::update::mode(&client, &mut cmd, &settings, ids, tags)
}
KeepModes::Export => {
keep::modes::client::export::mode(&client, &mut cmd, &settings, ids, tags)
}
KeepModes::Import => {
let meta_file = args.mode.import.as_ref().unwrap();
keep::modes::client::import::mode(&client, &mut cmd, &settings, meta_file)
}
_ => {
cmd.error(
ErrorKind::InvalidValue,
@@ -260,6 +298,9 @@ fn main() -> Result<(), Error> {
}
}
// SAFETY: umask is thread-safe by POSIX spec, and we invoke it exactly once
// before any file operations to set a secure default mask. No other threads
// exist yet at this point in main(), so there is no data race.
unsafe {
libc::umask(0o077);
}
@@ -301,23 +342,28 @@ fn main() -> Result<(), Error> {
KeepModes::Info => {
modes::info::mode_info(&mut cmd, &settings, ids, tags, &mut conn, data_path)
}
KeepModes::Update => {
modes::update::mode_update(&mut cmd, &settings, ids, tags, &mut conn, data_path)
}
KeepModes::Export => modes::export::mode_export(
&mut cmd,
&settings,
ids,
tags,
&mut conn,
data_path,
filter_chain,
),
KeepModes::Import => {
let meta_file = args.mode.import.as_ref().unwrap();
modes::import::mode_import(&mut cmd, &settings, meta_file, &mut conn, data_path)
}
KeepModes::Status => modes::status::mode_status(&mut cmd, &settings, data_path, db_path),
KeepModes::StatusPlugins => {
modes::status_plugins::mode_status_plugins(&mut cmd, &settings, data_path, db_path)
}
KeepModes::Server => {
#[cfg(feature = "server")]
{
modes::server::mode_server(&mut cmd, &settings, &mut conn, data_path)
}
#[cfg(not(feature = "server"))]
{
cmd.error(
ErrorKind::MissingRequiredArgument,
"This binary was not compiled with server support. Recompile with --features server"
).exit();
}
}
KeepModes::Server => modes::server::mode_server(&mut cmd, &settings, &mut conn, data_path),
KeepModes::GenerateConfig => {
modes::generate_config::mode_generate_config(&mut cmd, &settings)
}

View File

@@ -49,6 +49,14 @@ impl MetaPlugin for CwdMetaPlugin {
self.is_finalized = finalized;
}
fn set_save_meta(&mut self, save_meta: crate::meta_plugin::SaveMetaFn) {
self.base.set_save_meta(save_meta);
}
fn save_meta(&self, name: &str, value: &str) {
self.base.save_meta(name, value);
}
fn finalize(&mut self) -> crate::meta_plugin::MetaPluginResponse {
// If already finalized, don't process again
if self.is_finalized {
@@ -128,5 +136,6 @@ use crate::meta_plugin::register_meta_plugin;
fn register_cwd_plugin() {
register_meta_plugin(MetaPluginType::Cwd, |options, outputs| {
Box::new(CwdMetaPlugin::new(options, outputs))
});
})
.expect("Failed to register CwdMetaPlugin");
}

View File

@@ -32,7 +32,7 @@ impl Hasher {
match self {
Hasher::Sha256(hasher) => hasher.update(data),
Hasher::Md5(hasher) => {
let _ = hasher.write(data);
hasher.consume(data);
}
Hasher::Sha512(hasher) => hasher.update(data),
}
@@ -159,6 +159,14 @@ impl MetaPlugin for DigestMetaPlugin {
self.is_finalized = finalized;
}
fn set_save_meta(&mut self, save_meta: crate::meta_plugin::SaveMetaFn) {
self.base.set_save_meta(save_meta);
}
fn save_meta(&self, name: &str, value: &str) {
self.base.save_meta(name, value);
}
fn initialize(&mut self) -> crate::meta_plugin::MetaPluginResponse {
crate::meta_plugin::MetaPluginResponse {
metadata: Vec::new(),
@@ -271,5 +279,6 @@ use crate::meta_plugin::register_meta_plugin;
fn register_digest_plugin() {
register_meta_plugin(MetaPluginType::Digest, |options, outputs| {
Box::new(DigestMetaPlugin::new(options, outputs))
});
})
.expect("Failed to register DigestMetaPlugin");
}

View File

@@ -22,24 +22,40 @@ impl EnvMetaPlugin {
///
/// A new instance of `EnvMetaPlugin`.
pub fn new(
_options: Option<std::collections::HashMap<String, serde_yaml::Value>>,
options: Option<std::collections::HashMap<String, serde_yaml::Value>>,
outputs: Option<std::collections::HashMap<String, serde_yaml::Value>>,
) -> Self {
// Collect environment variables starting with KEEP_META_
let mut env_vars = Vec::new();
let mut outputs_map = std::collections::HashMap::new();
// Use options from --meta-plugin JSON if provided and non-empty,
// otherwise fall back to KEEP_META_* environment variables.
let use_options = options.as_ref().map(|o| !o.is_empty()).unwrap_or(false);
if use_options {
let opts = options.as_ref().unwrap();
for (key, value) in opts {
let value_str = match value {
serde_yaml::Value::String(s) => s.clone(),
serde_yaml::Value::Number(n) => n.to_string(),
serde_yaml::Value::Bool(b) => b.to_string(),
_ => serde_yaml::to_string(value).unwrap_or_default(),
};
env_vars.push((key.clone(), value_str));
outputs_map.insert(key.clone(), serde_yaml::Value::String(key.clone()));
}
} else {
// Fall back to KEEP_META_* environment variables
for (key, value) in std::env::vars() {
if let Some(stripped_key) = key.strip_prefix("KEEP_META_") {
// Add to env_vars to process later
env_vars.push((stripped_key.to_string(), value));
// Add to outputs with default mapping to the stripped name
outputs_map.insert(
stripped_key.to_string(),
serde_yaml::Value::String(stripped_key.to_string()),
);
}
}
}
// Override with provided outputs
if let Some(provided_outputs) = outputs {
@@ -87,6 +103,14 @@ impl MetaPlugin for EnvMetaPlugin {
self.is_finalized = finalized;
}
fn set_save_meta(&mut self, save_meta: crate::meta_plugin::SaveMetaFn) {
self.base.set_save_meta(save_meta);
}
fn save_meta(&self, name: &str, value: &str) {
self.base.save_meta(name, value);
}
/// Initializes the plugin, processing environment variables.
///
/// Processes all KEEP_META_* variables and generates metadata using output mappings.
@@ -227,5 +251,6 @@ use crate::meta_plugin::register_meta_plugin;
fn register_env_plugin() {
register_meta_plugin(MetaPluginType::Env, |options, outputs| {
Box::new(EnvMetaPlugin::new(options, outputs))
});
})
.expect("Failed to register EnvMetaPlugin");
}
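
To illustrate the two configuration paths in `EnvMetaPlugin::new` above — explicit `--meta-plugin` options versus the `KEEP_META_*` environment fallback — a short hedged driver:

fn demo_env_plugin() {
    use std::collections::HashMap;
    // Explicit options path: each key/value pair becomes a metadata entry.
    let mut opts = HashMap::new();
    opts.insert(
        "project".to_string(),
        serde_yaml::Value::String("keep".to_string()),
    );
    let _from_options = EnvMetaPlugin::new(Some(opts), None);
    // Fallback path: with no (or empty) options the constructor scans the
    // process environment, so KEEP_META_TICKET=ABC-123 yields a "TICKET" entry.
    let _from_env = EnvMetaPlugin::new(None, None);
}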

View File

@@ -131,7 +131,19 @@ impl MetaPluginExec {
match cmd.spawn() {
Ok(mut child) => {
let stdin = child.stdin.take().unwrap();
let stdin = match child.stdin.take() {
Some(s) => s,
None => {
error!(
"META: Exec plugin: failed to capture stdin for '{}'",
self.program
);
return MetaPluginResponse {
metadata: Vec::new(),
is_finalized: true,
};
}
};
self.writer = Some(Box::new(stdin));
self.process = Some(child);
debug!("META: Exec plugin: started process for '{}'", self.program);
@@ -167,6 +179,14 @@ impl MetaPlugin for MetaPluginExec {
false
}
fn set_save_meta(&mut self, save_meta: crate::meta_plugin::SaveMetaFn) {
self.base.set_save_meta(save_meta);
}
fn save_meta(&self, name: &str, value: &str) {
self.base.save_meta(name, value);
}
fn initialize(&mut self) -> MetaPluginResponse {
self.start_process()
}
@@ -311,5 +331,6 @@ fn register_exec_plugin() {
options,
outputs,
))
});
})
.expect("Failed to register ExecMetaPlugin");
}

View File

@@ -211,6 +211,14 @@ impl MetaPlugin for HostnameMetaPlugin {
self.is_finalized = finalized;
}
fn set_save_meta(&mut self, save_meta: crate::meta_plugin::SaveMetaFn) {
self.base.set_save_meta(save_meta);
}
fn save_meta(&self, name: &str, value: &str) {
self.base.save_meta(name, value);
}
fn finalize(&mut self) -> crate::meta_plugin::MetaPluginResponse {
// If already finalized, don't process again
if self.is_finalized {
@@ -406,5 +414,6 @@ use crate::meta_plugin::register_meta_plugin;
fn register_hostname_plugin() {
register_meta_plugin(MetaPluginType::Hostname, |options, outputs| {
Box::new(HostnameMetaPlugin::new(options, outputs))
});
})
.expect("Failed to register HostnameMetaPlugin");
}

View File

@@ -0,0 +1,177 @@
use crate::common::PIPESIZE;
use crate::meta_plugin::{
BaseMetaPlugin, MetaPlugin, MetaPluginResponse, MetaPluginType, process_metadata_outputs,
register_meta_plugin,
};
#[derive(Debug, Default)]
pub struct InferMetaPlugin {
buffer: Vec<u8>,
max_buffer_size: usize,
is_finalized: bool,
base: BaseMetaPlugin,
}
impl InferMetaPlugin {
pub fn new(
options: Option<std::collections::HashMap<String, serde_yaml::Value>>,
outputs: Option<std::collections::HashMap<String, serde_yaml::Value>>,
) -> InferMetaPlugin {
let mut base = BaseMetaPlugin::new();
if let Some(opts) = options {
for (key, value) in opts {
base.options.insert(key, value);
}
}
let max_buffer_size = base
.options
.get("max_buffer_size")
.and_then(|v| v.as_u64())
.unwrap_or(PIPESIZE as u64) as usize;
base.outputs.insert(
"infer_mime_type".to_string(),
serde_yaml::Value::String("infer_mime_type".to_string()),
);
if let Some(outs) = outputs {
for (key, value) in outs {
base.outputs.insert(key, value);
}
}
InferMetaPlugin {
buffer: Vec::new(),
max_buffer_size,
is_finalized: false,
base,
}
}
}
impl MetaPlugin for InferMetaPlugin {
fn meta_type(&self) -> MetaPluginType {
MetaPluginType::Infer
}
fn is_finalized(&self) -> bool {
self.is_finalized
}
fn set_finalized(&mut self, finalized: bool) {
self.is_finalized = finalized;
}
fn set_save_meta(&mut self, save_meta: crate::meta_plugin::SaveMetaFn) {
self.base.set_save_meta(save_meta);
}
fn save_meta(&self, name: &str, value: &str) {
self.base.save_meta(name, value);
}
fn update(&mut self, data: &[u8]) -> MetaPluginResponse {
if self.is_finalized {
return MetaPluginResponse {
metadata: Vec::new(),
is_finalized: true,
};
}
let remaining = self.max_buffer_size.saturating_sub(self.buffer.len());
let to_add = &data[..data.len().min(remaining)];
self.buffer.extend_from_slice(to_add);
if self.buffer.len() >= self.max_buffer_size {
let mime_type = infer::get(&self.buffer)
.map(|kind| kind.mime_type().to_string())
.unwrap_or_else(|| "application/octet-stream".to_string());
self.is_finalized = true;
let metadata = process_metadata_outputs(
"infer_mime_type",
serde_yaml::Value::String(mime_type),
self.base.outputs(),
)
.map(|m| vec![m])
.unwrap_or_default();
return MetaPluginResponse {
metadata,
is_finalized: true,
};
}
MetaPluginResponse {
metadata: Vec::new(),
is_finalized: false,
}
}
fn finalize(&mut self) -> MetaPluginResponse {
if self.is_finalized {
return MetaPluginResponse {
metadata: Vec::new(),
is_finalized: true,
};
}
let mime_type = infer::get(&self.buffer)
.map(|kind| kind.mime_type().to_string())
.unwrap_or_else(|| "application/octet-stream".to_string());
self.is_finalized = true;
let metadata = process_metadata_outputs(
"infer_mime_type",
serde_yaml::Value::String(mime_type),
self.base.outputs(),
)
.map(|m| vec![m])
.unwrap_or_default();
MetaPluginResponse {
metadata,
is_finalized: true,
}
}
fn outputs(&self) -> &std::collections::HashMap<String, serde_yaml::Value> {
self.base.outputs()
}
fn outputs_mut(
&mut self,
) -> anyhow::Result<&mut std::collections::HashMap<String, serde_yaml::Value>> {
Ok(self.base.outputs_mut())
}
fn default_outputs(&self) -> Vec<String> {
vec!["infer_mime_type".to_string()]
}
fn options(&self) -> &std::collections::HashMap<String, serde_yaml::Value> {
self.base.options()
}
fn options_mut(
&mut self,
) -> anyhow::Result<&mut std::collections::HashMap<String, serde_yaml::Value>> {
Ok(self.base.options_mut())
}
fn parallel_safe(&self) -> bool {
true
}
}
#[ctor::ctor]
fn register_infer_plugin() {
register_meta_plugin(MetaPluginType::Infer, |options, outputs| {
Box::new(InferMetaPlugin::new(options, outputs))
})
.expect("Failed to register InferMetaPlugin");
}
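
A hedged driver for the new plugin, assuming the `MetaPlugin` trait surface shown above; the eight bytes are the standard PNG signature that the `infer` crate matches on:

fn demo_infer_plugin() {
    let mut plugin = InferMetaPlugin::new(None, None);
    let png_magic = [0x89u8, b'P', b'N', b'G', 0x0D, 0x0A, 0x1A, 0x0A];
    let _ = plugin.update(&png_magic);
    let resp = plugin.finalize();
    // resp.metadata should carry infer_mime_type = "image/png";
    // unrecognized content falls back to "application/octet-stream".
    assert!(resp.is_finalized);
}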

View File

@@ -54,6 +54,14 @@ impl MetaPlugin for KeepPidMetaPlugin {
self.is_finalized = finalized;
}
fn set_save_meta(&mut self, save_meta: crate::meta_plugin::SaveMetaFn) {
self.base.set_save_meta(save_meta);
}
fn save_meta(&self, name: &str, value: &str) {
self.base.save_meta(name, value);
}
/// Finalizes the plugin, processing any remaining data if needed.
///
/// # Returns
@@ -204,5 +212,6 @@ use crate::meta_plugin::register_meta_plugin;
fn register_keep_pid_plugin() {
register_meta_plugin(MetaPluginType::KeepPid, |options, outputs| {
Box::new(KeepPidMetaPlugin::new(options, outputs))
});
})
.expect("Failed to register KeepPidMetaPlugin");
}

View File

@@ -1,9 +1,8 @@
#[cfg(feature = "magic")]
#[cfg(feature = "meta_magic")]
use magic::{Cookie, CookieFlags};
#[cfg(not(feature = "magic"))]
#[cfg(not(feature = "meta_magic"))]
use std::process::{Command, Stdio};
use log::debug;
use std::io::{self, Write};
use std::path::Path;
@@ -12,40 +11,26 @@ use crate::meta_plugin::{
process_metadata_outputs,
};
/// Wrapper around `magic::Cookie` that is Send.
///
/// Libmagic cookies are thread-safe per-instance (separate cookies have
/// independent state). The raw pointer `*mut magic_sys::magic_set` does not
/// auto-derive Send, but concurrent access to distinct cookies is safe per
/// the libmagic documentation.
#[cfg(feature = "magic")]
struct SendCookie(Cookie);
#[cfg(feature = "magic")]
// SAFETY: Each SendCookie owns a distinct libmagic instance. Libmagic
// documents that separate cookies can be used from different threads
// concurrently without synchronization.
#[allow(unsafe_code)]
unsafe impl Send for SendCookie {}
#[cfg(feature = "magic")]
impl std::fmt::Debug for SendCookie {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
f.debug_struct("SendCookie").finish()
}
// Thread-local libmagic cookie, lazily initialized on first access per thread.
// Each thread gets its own independent Cookie instance. Libmagic documents that
// separate cookies can be used from different threads concurrently without
// synchronization. Using thread_local! avoids unsafe impl Send since the
// storage is inherently !Send.
#[cfg(feature = "meta_magic")]
thread_local! {
static MAGIC_COOKIE: std::cell::RefCell<Option<Cookie>> = const { std::cell::RefCell::new(None) };
}
#[cfg(feature = "magic")]
#[cfg(feature = "meta_magic")]
#[derive(Debug)]
pub struct MagicFileMetaPluginImpl {
buffer: Vec<u8>,
max_buffer_size: usize,
is_finalized: bool,
cookie: Option<SendCookie>,
base: BaseMetaPlugin,
}
#[cfg(feature = "magic")]
#[cfg(feature = "meta_magic")]
impl MagicFileMetaPluginImpl {
pub fn new(
options: Option<std::collections::HashMap<String, serde_yaml::Value>>,
@@ -68,14 +53,28 @@ impl MagicFileMetaPluginImpl {
buffer: Vec::new(),
max_buffer_size,
is_finalized: false,
cookie: None,
base,
}
}
fn get_magic_result(&self, flags: CookieFlags) -> io::Result<String> {
if let Some(send_cookie) = &self.cookie {
let cookie = &send_cookie.0;
MAGIC_COOKIE.with(|cell| {
// Lazy init: create cookie on first access per thread
{
let mut opt = cell.borrow_mut();
if opt.is_none() {
let cookie = Cookie::open(CookieFlags::default())
.map_err(|e| io::Error::other(format!("Failed to open magic: {e}")))?;
cookie.load(&[] as &[&Path]).map_err(|e| {
io::Error::other(format!("Failed to load magic database: {e}"))
})?;
*opt = Some(cookie);
}
}
let cookie_ref = cell.borrow();
let cookie = cookie_ref.as_ref().expect("cookie initialized above");
cookie
.set_flags(flags)
.map_err(|e| io::Error::other(format!("Failed to set magic flags: {e}")))?;
@@ -84,13 +83,8 @@ impl MagicFileMetaPluginImpl {
.buffer(&self.buffer)
.map_err(|e| io::Error::other(format!("Failed to analyze buffer: {e}")))?;
// Clean up the result - remove extra whitespace
let trimmed = result.trim().to_string();
Ok(trimmed)
} else {
Err(io::Error::other("Magic cookie not initialized"))
}
Ok(result.trim().to_string())
})
}
fn process_magic_types(&self) -> Vec<MetaData> {
@@ -119,7 +113,7 @@ impl MagicFileMetaPluginImpl {
}
}
#[cfg(feature = "magic")]
#[cfg(feature = "meta_magic")]
impl MetaPlugin for MagicFileMetaPluginImpl {
fn is_finalized(&self) -> bool {
self.is_finalized
@@ -129,28 +123,16 @@ impl MetaPlugin for MagicFileMetaPluginImpl {
self.is_finalized = finalized;
}
fn set_save_meta(&mut self, save_meta: crate::meta_plugin::SaveMetaFn) {
self.base.set_save_meta(save_meta);
}
fn save_meta(&self, name: &str, value: &str) {
self.base.save_meta(name, value);
}
fn initialize(&mut self) -> MetaPluginResponse {
let cookie = match Cookie::open(CookieFlags::default()) {
Ok(cookie) => cookie,
Err(e) => {
debug!("META: MagicFile plugin: failed to create cookie: {e}");
return MetaPluginResponse {
metadata: Vec::new(),
is_finalized: true,
};
}
};
if let Err(e) = cookie.load(&[] as &[&Path]) {
debug!("META: MagicFile plugin: failed to load magic database: {e}");
return MetaPluginResponse {
metadata: Vec::new(),
is_finalized: true,
};
}
self.cookie = Some(SendCookie(cookie));
// Cookie is lazily initialized in the thread-local on first use.
MetaPluginResponse {
metadata: Vec::new(),
is_finalized: false,
@@ -240,10 +222,10 @@ impl MetaPlugin for MagicFileMetaPluginImpl {
}
}
#[cfg(feature = "magic")]
#[cfg(feature = "meta_magic")]
pub use MagicFileMetaPluginImpl as MagicFileMetaPlugin;
#[cfg(not(feature = "magic"))]
#[cfg(not(feature = "meta_magic"))]
#[derive(Debug)]
pub struct FallbackMagicFileMetaPlugin {
buffer: Vec<u8>,
@@ -252,7 +234,7 @@ pub struct FallbackMagicFileMetaPlugin {
base: BaseMetaPlugin,
}
#[cfg(not(feature = "magic"))]
#[cfg(not(feature = "meta_magic"))]
impl FallbackMagicFileMetaPlugin {
pub fn new(
options: Option<std::collections::HashMap<String, serde_yaml::Value>>,
@@ -285,7 +267,10 @@ impl FallbackMagicFileMetaPlugin {
.spawn()
.and_then(|mut child| {
if let Some(mut stdin) = child.stdin.take() {
let _ = stdin.write_all(&self.buffer);
if stdin.write_all(&self.buffer).is_err() {
// Ignore write error; child will see EOF and likely fail
// the file detection, returning no output.
}
}
child.wait_with_output()
});
@@ -351,7 +336,7 @@ impl FallbackMagicFileMetaPlugin {
}
}
#[cfg(not(feature = "magic"))]
#[cfg(not(feature = "meta_magic"))]
impl MetaPlugin for FallbackMagicFileMetaPlugin {
fn is_finalized(&self) -> bool {
self.is_finalized
@@ -361,6 +346,14 @@ impl MetaPlugin for FallbackMagicFileMetaPlugin {
self.is_finalized = finalized;
}
fn set_save_meta(&mut self, save_meta: crate::meta_plugin::SaveMetaFn) {
self.base.set_save_meta(save_meta);
}
fn save_meta(&self, name: &str, value: &str) {
self.base.save_meta(name, value);
}
fn initialize(&mut self) -> MetaPluginResponse {
MetaPluginResponse {
metadata: Vec::new(),
@@ -448,7 +441,7 @@ impl MetaPlugin for FallbackMagicFileMetaPlugin {
}
}
#[cfg(not(feature = "magic"))]
#[cfg(not(feature = "meta_magic"))]
pub use FallbackMagicFileMetaPlugin as MagicFileMetaPlugin;
use crate::meta_plugin::register_meta_plugin;
@@ -457,5 +450,6 @@ use crate::meta_plugin::register_meta_plugin;
fn register_magic_file_plugin() {
register_meta_plugin(MetaPluginType::MagicFile, |options, outputs| {
Box::new(MagicFileMetaPlugin::new(options, outputs))
});
})
.expect("Failed to register MagicFileMetaPlugin");
}
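
The thread-local rewrite above is an instance of a general pattern: per-thread lazy initialization of a `!Send` resource through `thread_local!` plus `RefCell<Option<T>>`. A self-contained sketch of the pattern, without the libmagic dependency:

use std::cell::RefCell;

// Stand-in for a !Send handle such as a libmagic cookie.
struct Handle(u32);

thread_local! {
    static HANDLE: RefCell<Option<Handle>> = const { RefCell::new(None) };
}

fn with_handle<R>(f: impl FnOnce(&Handle) -> R) -> R {
    HANDLE.with(|cell| {
        if cell.borrow().is_none() {
            // Lazily create the handle on first access in this thread.
            *cell.borrow_mut() = Some(Handle(42));
        }
        let guard = cell.borrow();
        f(guard.as_ref().expect("initialized above"))
    })
}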

View File

@@ -1,14 +1,15 @@
use log::debug;
use once_cell::sync::Lazy;
use log::{debug, warn};
use serde::{Deserialize, Serialize};
use std::collections::HashMap;
use std::sync::Mutex;
use std::sync::{Arc, Mutex};
pub mod cwd;
pub mod digest;
pub mod env;
pub mod exec;
pub mod hostname;
#[cfg(feature = "meta_infer")]
pub mod infer_plugin;
pub mod keep_pid;
pub mod magic_file;
pub mod read_rate;
@@ -16,26 +17,32 @@ pub mod read_time;
pub mod shell;
pub mod shell_pid;
pub mod text;
#[cfg(feature = "tokens")]
#[cfg(feature = "meta_tokens")]
pub mod tokens;
#[cfg(feature = "meta_tree_magic_mini")]
pub mod tree_magic_mini;
pub mod user;
pub use digest::DigestMetaPlugin;
pub use exec::MetaPluginExec;
#[cfg(feature = "magic")]
#[cfg(feature = "meta_magic")]
pub use magic_file::MagicFileMetaPlugin;
// pub use text::TextMetaPlugin; // Removed duplicate
pub use cwd::CwdMetaPlugin;
pub use env::EnvMetaPlugin;
pub use hostname::HostnameMetaPlugin;
#[cfg(feature = "meta_infer")]
pub use infer_plugin::InferMetaPlugin;
pub use keep_pid::KeepPidMetaPlugin;
pub use read_rate::ReadRateMetaPlugin;
pub use read_time::ReadTimeMetaPlugin;
pub use shell::ShellMetaPlugin;
pub use shell_pid::ShellPidMetaPlugin;
#[cfg(feature = "meta_tree_magic_mini")]
pub use tree_magic_mini::TreeMagicMiniMetaPlugin;
pub use user::UserMetaPlugin;
#[cfg(not(feature = "magic"))]
#[cfg(not(feature = "meta_magic"))]
pub use magic_file::FallbackMagicFileMetaPlugin as MagicFileMetaPlugin;
type PluginConstructor = fn(
@@ -61,8 +68,16 @@ pub struct MetaPluginResponse {
pub is_finalized: bool,
}
/// Type alias for the save_meta callback shared by all plugins.
pub type SaveMetaFn = Arc<Mutex<dyn FnMut(&str, &str) + Send>>;
/// Creates a no-op save_meta for plugins not wired through MetaService.
pub fn noop_save_meta() -> SaveMetaFn {
Arc::new(Mutex::new(|_: &str, _: &str| {}))
}
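
A minimal wiring sketch for the new callback type; here the sink is an in-memory vector, where the real service would persist to the database:

use std::sync::{Arc, Mutex};

fn demo_save_meta_wiring() {
    let collected: Arc<Mutex<Vec<(String, String)>>> = Arc::new(Mutex::new(Vec::new()));
    let sink = collected.clone();
    let save_meta: SaveMetaFn = Arc::new(Mutex::new(move |name: &str, value: &str| {
        sink.lock().unwrap().push((name.to_string(), value.to_string()));
    }));
    // Any plugin handed this callback can persist metadata directly:
    let mut base = BaseMetaPlugin::new();
    base.set_save_meta(save_meta);
    base.save_meta("digest_sha256", "deadbeef");
    assert_eq!(collected.lock().unwrap().len(), 1);
}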
/// Base implementation for meta plugins to reduce boilerplate.
#[derive(Debug, Clone, Default)]
#[derive(Clone)]
pub struct BaseMetaPlugin {
/// Output mappings for metadata.
pub outputs: std::collections::HashMap<String, serde_yaml::Value>,
@@ -70,6 +85,29 @@ pub struct BaseMetaPlugin {
pub options: std::collections::HashMap<String, serde_yaml::Value>,
/// Whether the plugin is finalized.
pub is_finalized: bool,
/// Callback to store metadata. Called directly by plugins.
pub save_meta: SaveMetaFn,
}
impl std::fmt::Debug for BaseMetaPlugin {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
f.debug_struct("BaseMetaPlugin")
.field("outputs", &self.outputs)
.field("options", &self.options)
.field("is_finalized", &self.is_finalized)
.finish_non_exhaustive()
}
}
impl Default for BaseMetaPlugin {
fn default() -> Self {
Self {
outputs: HashMap::new(),
options: HashMap::new(),
is_finalized: false,
save_meta: noop_save_meta(),
}
}
}
impl BaseMetaPlugin {
@@ -83,41 +121,39 @@ impl BaseMetaPlugin {
}
/// Returns a reference to the outputs mapping.
///
/// # Returns
///
/// A reference to the `HashMap` of outputs.
pub fn outputs(&self) -> &std::collections::HashMap<String, serde_yaml::Value> {
&self.outputs
}
/// Returns a mutable reference to the outputs mapping.
///
/// # Returns
///
/// A mutable reference to the `HashMap` of outputs.
pub fn outputs_mut(&mut self) -> &mut std::collections::HashMap<String, serde_yaml::Value> {
&mut self.outputs
}
/// Returns a reference to the options mapping.
///
/// # Returns
///
/// A reference to the `HashMap` of options.
pub fn options(&self) -> &std::collections::HashMap<String, serde_yaml::Value> {
&self.options
}
/// Returns a mutable reference to the options mapping.
///
/// # Returns
///
/// A mutable reference to the `HashMap` of options.
pub fn options_mut(&mut self) -> &mut std::collections::HashMap<String, serde_yaml::Value> {
&mut self.options
}
/// Sets the save_meta callback on the base plugin.
pub fn set_save_meta(&mut self, save_meta: SaveMetaFn) {
self.save_meta = save_meta;
}
/// Saves a metadata entry via the save_meta callback.
pub fn save_meta(&self, name: &str, value: &str) {
if let Ok(mut f) = self.save_meta.lock() {
f(name, value);
} else {
warn!("META_PLUGIN: save_meta lock poisoned, dropping metadata: {name}={value}");
}
}
/// Helper function to initialize plugin options and outputs.
///
/// # Arguments
@@ -234,6 +270,8 @@ pub enum MetaPluginType {
Exec,
Env,
Tokens,
TreeMagicMini,
Infer,
}
/// Central function to handle metadata output with name mapping.
@@ -267,22 +305,7 @@ pub fn process_metadata_outputs(
return None;
}
if let Some(custom_name) = mapping.as_str() {
// Convert the value to a string representation
let value_str = match &value {
serde_yaml::Value::Null => "null".to_string(),
serde_yaml::Value::Bool(b) => b.to_string(),
serde_yaml::Value::Number(n) => n.to_string(),
serde_yaml::Value::String(s) => s.clone(),
serde_yaml::Value::Sequence(_) => {
serde_yaml::to_string(&value).unwrap_or_else(|_| "".to_string())
}
serde_yaml::Value::Mapping(_) => {
serde_yaml::to_string(&value).unwrap_or_else(|_| "".to_string())
}
serde_yaml::Value::Tagged(_) => {
serde_yaml::to_string(&value).unwrap_or_else(|_| "".to_string())
}
};
let value_str = yaml_value_to_string(&value);
debug!(
"META: Processing metadata: internal_name={internal_name}, custom_name={custom_name}, value={value_str}"
);
@@ -293,22 +316,7 @@ pub fn process_metadata_outputs(
}
}
// Convert the value to a string representation
let value_str = match &value {
serde_yaml::Value::Null => "null".to_string(),
serde_yaml::Value::Bool(b) => b.to_string(),
serde_yaml::Value::Number(n) => n.to_string(),
serde_yaml::Value::String(s) => s.clone(),
serde_yaml::Value::Sequence(_) => {
serde_yaml::to_string(&value).unwrap_or_else(|_| "".to_string())
}
serde_yaml::Value::Mapping(_) => {
serde_yaml::to_string(&value).unwrap_or_else(|_| "".to_string())
}
serde_yaml::Value::Tagged(_) => {
serde_yaml::to_string(&value).unwrap_or_else(|_| "".to_string())
}
};
let value_str = yaml_value_to_string(&value);
// Default: use internal name as output name
debug!("META: Processing metadata: name={internal_name}, value={value_str}");
@@ -318,6 +326,20 @@ pub fn process_metadata_outputs(
})
}
fn yaml_value_to_string(value: &serde_yaml::Value) -> String {
match value {
serde_yaml::Value::Null => "null".to_string(),
serde_yaml::Value::Bool(b) => b.to_string(),
serde_yaml::Value::Number(n) => n.to_string(),
serde_yaml::Value::String(s) => s.clone(),
serde_yaml::Value::Sequence(_)
| serde_yaml::Value::Mapping(_)
| serde_yaml::Value::Tagged(_) => {
serde_yaml::to_string(value).unwrap_or_else(|_| "".to_string())
}
}
}
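
To make the mapping semantics concrete, a hedged example of an `outputs` entry renaming a metadata key (the `MetaData` field names are assumed; only the `Option` return shape is visible in this diff):

fn demo_output_mapping() {
    use std::collections::HashMap;
    let mut outputs = HashMap::new();
    // Rename digest_sha256 -> sha in the stored metadata.
    outputs.insert(
        "digest_sha256".to_string(),
        serde_yaml::Value::String("sha".to_string()),
    );
    let meta = process_metadata_outputs(
        "digest_sha256",
        serde_yaml::Value::String("deadbeef".to_string()),
        &outputs,
    );
    // Expected: Some(MetaData) named "sha" carrying value "deadbeef".
    assert!(meta.is_some());
}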
pub trait MetaPlugin: Send
where
Self: 'static,
@@ -421,9 +443,9 @@ where
///
/// An empty `HashMap` (default implementation).
fn outputs(&self) -> &std::collections::HashMap<String, serde_yaml::Value> {
use once_cell::sync::Lazy;
static EMPTY: Lazy<std::collections::HashMap<String, serde_yaml::Value>> =
Lazy::new(std::collections::HashMap::new);
use std::sync::LazyLock;
static EMPTY: LazyLock<std::collections::HashMap<String, serde_yaml::Value>> =
LazyLock::new(std::collections::HashMap::new);
&EMPTY
}
@@ -448,9 +470,9 @@ where
///
/// An empty `HashMap` (default implementation).
fn options(&self) -> &std::collections::HashMap<String, serde_yaml::Value> {
use once_cell::sync::Lazy;
static EMPTY: Lazy<std::collections::HashMap<String, serde_yaml::Value>> =
Lazy::new(std::collections::HashMap::new);
use std::sync::LazyLock;
static EMPTY: LazyLock<std::collections::HashMap<String, serde_yaml::Value>> =
LazyLock::new(std::collections::HashMap::new);
&EMPTY
}
@@ -566,11 +588,22 @@ where
{
self
}
/// Sets the save_meta callback for this plugin.
///
/// Called by MetaService to wire the plugin to the metadata storage.
fn set_save_meta(&mut self, _save_meta: SaveMetaFn) {}
/// Saves a metadata entry via the save_meta callback.
///
/// Plugins call this during initialize/update/finalize to persist metadata.
fn save_meta(&self, _name: &str, _value: &str) {}
}
/// Global registry for meta plugins.
static META_PLUGIN_REGISTRY: Lazy<Mutex<HashMap<MetaPluginType, PluginConstructor>>> =
Lazy::new(|| Mutex::new(HashMap::new()));
static META_PLUGIN_REGISTRY: std::sync::LazyLock<
Mutex<HashMap<MetaPluginType, PluginConstructor>>,
> = std::sync::LazyLock::new(|| Mutex::new(HashMap::new()));
/// Register a meta plugin with the global registry.
///
@@ -578,11 +611,15 @@ static META_PLUGIN_REGISTRY: Lazy<Mutex<HashMap<MetaPluginType, PluginConstructo
///
/// * `meta_plugin_type` - The type of the meta plugin to register.
/// * `constructor` - The constructor function for creating plugin instances.
pub fn register_meta_plugin(meta_plugin_type: MetaPluginType, constructor: PluginConstructor) {
pub fn register_meta_plugin(
meta_plugin_type: MetaPluginType,
constructor: PluginConstructor,
) -> anyhow::Result<()> {
META_PLUGIN_REGISTRY
.lock()
.unwrap()
.map_err(|e| anyhow::anyhow!("plugin registry poisoned: {e}"))?
.insert(meta_plugin_type, constructor);
Ok(())
}
pub fn get_meta_plugin(
@@ -590,9 +627,28 @@ pub fn get_meta_plugin(
options: Option<std::collections::HashMap<String, serde_yaml::Value>>,
outputs: Option<std::collections::HashMap<String, serde_yaml::Value>>,
) -> anyhow::Result<Box<dyn MetaPlugin>> {
let registry = META_PLUGIN_REGISTRY.lock().unwrap();
get_meta_plugin_with_save(meta_plugin_type, options, outputs, None)
}
/// Creates a meta plugin instance with an optional save_meta callback.
///
/// If `save_meta` is provided, it is wired to the plugin so it can
/// store metadata directly during initialize/update/finalize.
pub fn get_meta_plugin_with_save(
meta_plugin_type: MetaPluginType,
options: Option<std::collections::HashMap<String, serde_yaml::Value>>,
outputs: Option<std::collections::HashMap<String, serde_yaml::Value>>,
save_meta: Option<SaveMetaFn>,
) -> anyhow::Result<Box<dyn MetaPlugin>> {
let registry = META_PLUGIN_REGISTRY
.lock()
.map_err(|e| anyhow::anyhow!("plugin registry poisoned: {e}"))?;
if let Some(constructor) = registry.get(&meta_plugin_type) {
return Ok(constructor(options, outputs));
let mut plugin = constructor(options, outputs);
if let Some(sm) = save_meta {
plugin.set_save_meta(sm);
}
return Ok(plugin);
}
anyhow::bail!("Meta plugin {meta_plugin_type:?} not registered")

View File

@@ -84,6 +84,14 @@ impl MetaPlugin for ReadRateMetaPlugin {
self.is_finalized = finalized;
}
fn set_save_meta(&mut self, save_meta: crate::meta_plugin::SaveMetaFn) {
self.base.set_save_meta(save_meta);
}
fn save_meta(&self, name: &str, value: &str) {
self.base.save_meta(name, value);
}
/// Finalizes the plugin, calculating the read rate.
///
/// Computes KB/s from bytes read and elapsed time. Outputs via mappings.
@@ -237,5 +245,6 @@ use crate::meta_plugin::register_meta_plugin;
fn register_read_rate_plugin() {
register_meta_plugin(MetaPluginType::ReadRate, |options, outputs| {
Box::new(ReadRateMetaPlugin::new(options, outputs))
});
})
.expect("Failed to register ReadRateMetaPlugin");
}

View File

@@ -37,6 +37,14 @@ impl MetaPlugin for ReadTimeMetaPlugin {
self.is_finalized = finalized;
}
fn set_save_meta(&mut self, save_meta: crate::meta_plugin::SaveMetaFn) {
self.base.set_save_meta(save_meta);
}
fn save_meta(&self, name: &str, value: &str) {
self.base.save_meta(name, value);
}
fn finalize(&mut self) -> crate::meta_plugin::MetaPluginResponse {
// If already finalized, don't process again
if self.is_finalized {
@@ -124,5 +132,6 @@ use crate::meta_plugin::register_meta_plugin;
fn register_read_time_plugin() {
register_meta_plugin(MetaPluginType::ReadTime, |options, outputs| {
Box::new(ReadTimeMetaPlugin::new(options, outputs))
});
})
.expect("Failed to register ReadTimeMetaPlugin");
}

View File

@@ -70,6 +70,14 @@ impl MetaPlugin for ShellMetaPlugin {
self.is_finalized = finalized;
}
fn set_save_meta(&mut self, save_meta: crate::meta_plugin::SaveMetaFn) {
self.base.set_save_meta(save_meta);
}
fn save_meta(&self, name: &str, value: &str) {
self.base.save_meta(name, value);
}
/// Finalizes the plugin without processing data.
///
/// For this plugin, finalization is handled in `initialize`, so this returns empty metadata.
@@ -240,5 +248,6 @@ use crate::meta_plugin::register_meta_plugin;
fn register_shell_plugin() {
register_meta_plugin(MetaPluginType::Shell, |options, outputs| {
Box::new(ShellMetaPlugin::new(options, outputs))
});
})
.expect("Failed to register ShellMetaPlugin");
}

View File

@@ -35,6 +35,14 @@ impl MetaPlugin for ShellPidMetaPlugin {
self.is_finalized = finalized;
}
fn set_save_meta(&mut self, save_meta: crate::meta_plugin::SaveMetaFn) {
self.base.set_save_meta(save_meta);
}
fn save_meta(&self, name: &str, value: &str) {
self.base.save_meta(name, value);
}
fn finalize(&mut self) -> crate::meta_plugin::MetaPluginResponse {
// If already finalized, don't process again
if self.is_finalized {
@@ -132,5 +140,6 @@ use crate::meta_plugin::register_meta_plugin;
fn register_shell_pid_plugin() {
register_meta_plugin(MetaPluginType::ShellPid, |options, outputs| {
Box::new(ShellPidMetaPlugin::new(options, outputs))
});
})
.expect("Failed to register ShellPidMetaPlugin");
}

View File

@@ -510,6 +510,14 @@ impl MetaPlugin for TextMetaPlugin {
self.is_finalized = finalized;
}
fn set_save_meta(&mut self, save_meta: crate::meta_plugin::SaveMetaFn) {
self.base.set_save_meta(save_meta);
}
fn save_meta(&self, name: &str, value: &str) {
self.base.save_meta(name, value);
}
/// Updates the plugin with new data chunk.
///
/// Accumulates data for binary detection (if pending) or text statistics.
@@ -818,5 +826,6 @@ use crate::meta_plugin::register_meta_plugin;
fn register_text_plugin() {
register_meta_plugin(MetaPluginType::Text, |options, outputs| {
Box::new(TextMetaPlugin::new(options, outputs))
});
})
.expect("Failed to register TextMetaPlugin");
}

View File

@@ -148,6 +148,14 @@ impl MetaPlugin for TokensMetaPlugin {
self.is_finalized = finalized;
}
fn set_save_meta(&mut self, save_meta: crate::meta_plugin::SaveMetaFn) {
self.base.set_save_meta(save_meta);
}
fn save_meta(&self, name: &str, value: &str) {
self.base.save_meta(name, value);
}
fn update(&mut self, data: &[u8]) -> MetaPluginResponse {
if self.is_finalized {
return MetaPluginResponse {
@@ -312,5 +320,6 @@ use crate::meta_plugin::register_meta_plugin;
fn register_tokens_plugin() {
register_meta_plugin(MetaPluginType::Tokens, |options, outputs| {
Box::new(TokensMetaPlugin::new(options, outputs))
});
})
.expect("Failed to register TokensMetaPlugin");
}

View File

@@ -0,0 +1,173 @@
use crate::common::PIPESIZE;
use crate::meta_plugin::{
BaseMetaPlugin, MetaPlugin, MetaPluginResponse, MetaPluginType, process_metadata_outputs,
register_meta_plugin,
};
#[derive(Debug, Default)]
pub struct TreeMagicMiniMetaPlugin {
buffer: Vec<u8>,
max_buffer_size: usize,
is_finalized: bool,
base: BaseMetaPlugin,
}
impl TreeMagicMiniMetaPlugin {
pub fn new(
options: Option<std::collections::HashMap<String, serde_yaml::Value>>,
outputs: Option<std::collections::HashMap<String, serde_yaml::Value>>,
) -> TreeMagicMiniMetaPlugin {
let mut base = BaseMetaPlugin::new();
if let Some(opts) = options {
for (key, value) in opts {
base.options.insert(key, value);
}
}
let max_buffer_size = base
.options
.get("max_buffer_size")
.and_then(|v| v.as_u64())
.unwrap_or(PIPESIZE as u64) as usize;
base.outputs.insert(
"tree_magic_mime_type".to_string(),
serde_yaml::Value::String("tree_magic_mime_type".to_string()),
);
if let Some(outs) = outputs {
for (key, value) in outs {
base.outputs.insert(key, value);
}
}
TreeMagicMiniMetaPlugin {
buffer: Vec::new(),
max_buffer_size,
is_finalized: false,
base,
}
}
}
impl MetaPlugin for TreeMagicMiniMetaPlugin {
fn meta_type(&self) -> MetaPluginType {
MetaPluginType::TreeMagicMini
}
fn is_finalized(&self) -> bool {
self.is_finalized
}
fn set_finalized(&mut self, finalized: bool) {
self.is_finalized = finalized;
}
fn set_save_meta(&mut self, save_meta: crate::meta_plugin::SaveMetaFn) {
self.base.set_save_meta(save_meta);
}
fn save_meta(&self, name: &str, value: &str) {
self.base.save_meta(name, value);
}
fn update(&mut self, data: &[u8]) -> MetaPluginResponse {
if self.is_finalized {
return MetaPluginResponse {
metadata: Vec::new(),
is_finalized: true,
};
}
let remaining = self.max_buffer_size.saturating_sub(self.buffer.len());
let to_add = &data[..data.len().min(remaining)];
self.buffer.extend_from_slice(to_add);
if self.buffer.len() >= self.max_buffer_size {
let mime_type = tree_magic_mini::from_u8(&self.buffer);
self.is_finalized = true;
let metadata = process_metadata_outputs(
"tree_magic_mime_type",
serde_yaml::Value::String(mime_type.to_string()),
self.base.outputs(),
)
.map(|m| vec![m])
.unwrap_or_default();
return MetaPluginResponse {
metadata,
is_finalized: true,
};
}
MetaPluginResponse {
metadata: Vec::new(),
is_finalized: false,
}
}
fn finalize(&mut self) -> MetaPluginResponse {
if self.is_finalized {
return MetaPluginResponse {
metadata: Vec::new(),
is_finalized: true,
};
}
let mime_type = tree_magic_mini::from_u8(&self.buffer);
self.is_finalized = true;
let metadata = process_metadata_outputs(
"tree_magic_mime_type",
serde_yaml::Value::String(mime_type.to_string()),
self.base.outputs(),
)
.map(|m| vec![m])
.unwrap_or_default();
MetaPluginResponse {
metadata,
is_finalized: true,
}
}
fn outputs(&self) -> &std::collections::HashMap<String, serde_yaml::Value> {
self.base.outputs()
}
fn outputs_mut(
&mut self,
) -> anyhow::Result<&mut std::collections::HashMap<String, serde_yaml::Value>> {
Ok(self.base.outputs_mut())
}
fn default_outputs(&self) -> Vec<String> {
vec!["tree_magic_mime_type".to_string()]
}
fn options(&self) -> &std::collections::HashMap<String, serde_yaml::Value> {
self.base.options()
}
fn options_mut(
&mut self,
) -> anyhow::Result<&mut std::collections::HashMap<String, serde_yaml::Value>> {
Ok(self.base.options_mut())
}
fn parallel_safe(&self) -> bool {
true
}
}
#[ctor::ctor]
fn register_tree_magic_mini_plugin() {
register_meta_plugin(MetaPluginType::TreeMagicMini, |options, outputs| {
Box::new(TreeMagicMiniMetaPlugin::new(options, outputs))
})
.expect("Failed to register TreeMagicMiniMetaPlugin");
}

View File

@@ -105,6 +105,14 @@ impl MetaPlugin for UserMetaPlugin {
MetaPluginType::User
}
fn set_save_meta(&mut self, save_meta: crate::meta_plugin::SaveMetaFn) {
self.base.set_save_meta(save_meta);
}
fn save_meta(&self, name: &str, value: &str) {
self.base.save_meta(name, value);
}
/// Returns a reference to the outputs mapping.
///
/// # Returns
@@ -166,5 +174,6 @@ use crate::meta_plugin::register_meta_plugin;
fn register_user_plugin() {
register_meta_plugin(MetaPluginType::User, |options, outputs| {
Box::new(UserMetaPlugin::new(options, outputs))
});
})
.expect("Failed to register UserMetaPlugin");
}

View File

@@ -0,0 +1,77 @@
use anyhow::{Context, Result, anyhow};
use chrono::Utc;
use clap::Command;
use log::debug;
use std::collections::HashMap;
use std::fs;
use crate::client::KeepClient;
use crate::common::sanitize_ts_string;
use crate::config;
/// Export items to a `.keep.tar` archive via client.
///
/// Sends a request to the server's `/api/export` endpoint and
/// streams the response to a local tar file.
pub fn mode(
client: &KeepClient,
cmd: &mut Command,
settings: &config::Settings,
ids: &[i64],
tags: &[String],
) -> Result<()> {
// Validate: IDs XOR tags
if !ids.is_empty() && !tags.is_empty() {
cmd.error(
clap::error::ErrorKind::InvalidValue,
"Cannot use both IDs and tags with --export",
)
.exit();
}
if ids.is_empty() && tags.is_empty() {
cmd.error(
clap::error::ErrorKind::InvalidValue,
"Must provide either IDs or tags with --export",
)
.exit();
}
// Build the tar filename locally from the {name} and {ts} template
// variables: name comes from --export-name, falling back to "export".
let dir_name = if let Some(ref name) = settings.export_name {
name.clone()
} else {
"export".to_string()
};
let now = Utc::now();
let ts_str = sanitize_ts_string(&now.format("%Y-%m-%dT%H:%M:%SZ").to_string());
let mut vars = HashMap::new();
vars.insert("name".to_string(), dir_name);
vars.insert("ts".to_string(), ts_str);
let basename = strfmt::strfmt(&settings.export_filename_format, &vars).map_err(|e| {
anyhow!(
"Invalid export filename format '{}': {}",
settings.export_filename_format,
e
)
})?;
let tar_filename = format!("{basename}.keep.tar");
client
.export_items_to_file(ids, tags, std::path::Path::new(&tar_filename))
.map_err(|e| anyhow!("Export failed: {e}"))?;
if !settings.quiet {
eprintln!("{tar_filename}");
}
debug!("CLIENT_EXPORT: Wrote items to {tar_filename}");
Ok(())
}
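
The filename templating above leans on `strfmt`. Assuming a default `export_filename_format` of "{name}_{ts}" (the default itself is not visible in this diff), the expansion works like this:

fn demo_export_filename() -> Result<(), strfmt::FmtError> {
    use std::collections::HashMap;
    let mut vars = HashMap::new();
    vars.insert("name".to_string(), "export".to_string());
    vars.insert("ts".to_string(), "2026-03-21T12_00_00Z".to_string());
    let basename = strfmt::strfmt("{name}_{ts}", &vars)?;
    assert_eq!(basename, "export_2026-03-21T12_00_00Z");
    // The mode then appends ".keep.tar" to form the archive name.
    Ok(())
}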

View File

@@ -1,16 +1,17 @@
use crate::client::KeepClient;
use crate::compression_engine::CompressionType;
use crate::filter_plugin::FilterChain;
use crate::modes::common::{check_binary_tty, resolve_item_id};
use crate::services::compression_service::CompressionService;
use anyhow::Result;
use clap::Command;
use is_terminal::IsTerminal;
use log::debug;
use std::io::{Read, Write};
use std::str::FromStr;
pub fn mode(
client: &KeepClient,
_cmd: &mut Command,
cmd: &mut Command,
settings: &crate::config::Settings,
ids: &[i64],
tags: &[String],
@@ -18,78 +19,57 @@ pub fn mode(
) -> Result<(), anyhow::Error> {
debug!("CLIENT_GET: Getting item via remote server");
// Find the item ID
let item_id = if !ids.is_empty() {
ids[0]
} else if !tags.is_empty() {
// Find item by tags
let items = client.list_items(tags, "newest", 0, 1)?;
if items.is_empty() {
return Err(anyhow::anyhow!("No items found matching tags: {:?}", tags));
if !ids.is_empty() && !tags.is_empty() {
cmd.error(
clap::error::ErrorKind::InvalidValue,
"Both ID and tags given, you must supply either IDs or tags when using --get",
)
.exit();
}
items[0].id
} else {
// Get latest item
let items = client.list_items(&[], "newest", 0, 1)?;
if items.is_empty() {
return Err(anyhow::anyhow!("No items found"));
}
items[0].id
};
// Get item info to determine compression type
let item_id = resolve_item_id(client, ids, tags)?;
// Get item info for metadata
let item_info = client.get_item_info(item_id)?;
let metadata = &item_info.metadata;
// Get raw content from server
let (raw_bytes, compression) = client.get_item_content_raw(item_id)?;
// Get streaming reader for raw content
let (reader, compression) = client.get_item_content_stream(item_id)?;
let compression_type = CompressionType::from_str(&compression).unwrap_or(CompressionType::Raw);
// Check if binary content would be sent to TTY
let is_text = item_info
.metadata
.get("text")
.map(|v| v == "true")
.unwrap_or(false);
// Decompress through streaming readers
let mut decompressed_reader: Box<dyn Read> =
CompressionService::decompressing_reader(reader, &compression_type)?;
if std::io::stdout().is_terminal() && !is_text && !settings.force {
// Check if content is binary
let sample_len = std::cmp::min(raw_bytes.len(), 8192);
if crate::common::is_binary::is_binary(&raw_bytes[..sample_len]) {
return Err(anyhow::anyhow!(
"Refusing to output binary data to a terminal. Use --force to override."
));
}
}
// Binary detection: sample first chunk
let mut sample_buf = [0u8; crate::common::PIPESIZE];
let sample_len = decompressed_reader.read(&mut sample_buf)?;
check_binary_tty(metadata, &sample_buf[..sample_len], settings.force)?;
// Decompress locally
let compression_type = CompressionType::from_str(&compression).unwrap_or(CompressionType::None);
let decompressed = match compression_type {
CompressionType::GZip => {
use flate2::read::GzDecoder;
let mut decoder = GzDecoder::new(&raw_bytes[..]);
let mut content = Vec::new();
decoder.read_to_end(&mut content)?;
content
}
CompressionType::LZ4 => lz4_flex::decompress_size_prepended(&raw_bytes)
.map_err(|e| anyhow::anyhow!("LZ4 decompression failed: {}", e))?,
_ => raw_bytes,
};
// Apply filters if present
let output = if let Some(mut chain) = filter_chain {
let mut filtered = Vec::new();
chain.filter(&mut &decompressed[..], &mut filtered)?;
filtered
} else {
decompressed
};
// Stream to stdout
// If filters present, buffer through filter chain; otherwise stream directly
if let Some(mut chain) = filter_chain {
// Apply filter to sample first, then remaining
let mut output = Vec::new();
chain.filter(&mut &sample_buf[..sample_len], &mut output)?;
crate::common::stream_copy(&mut decompressed_reader, |chunk| {
chain.filter(&mut std::io::Cursor::new(chunk), &mut output)?;
Ok(())
})?;
let stdout = std::io::stdout();
let mut stdout = stdout.lock();
stdout.write_all(&output)?;
stdout.flush()?;
} else {
// Stream decompressed content to stdout
let stdout = std::io::stdout();
let mut stdout = stdout.lock();
stdout.write_all(&sample_buf[..sample_len])?;
crate::common::stream_copy(&mut decompressed_reader, |chunk| {
stdout.write_all(chunk)?;
Ok(())
})?;
stdout.flush()?;
}
Ok(())
}
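
The sample-then-stream pattern above, in isolation (std only; the 65536-byte buffer and the NUL-byte sniff are stand-ins for crate::common::PIPESIZE and the real binary check):

use std::io::{self, Read, Write};

const PIPESIZE: usize = 65536; // stand-in for crate::common::PIPESIZE

fn stream_with_sample(mut reader: impl Read, mut out: impl Write, force: bool) -> io::Result<()> {
    // One read may return fewer bytes than the buffer; for a binary sniff that is fine.
    let mut sample = vec![0u8; PIPESIZE];
    let n = reader.read(&mut sample)?;
    if !force && sample[..n].contains(&0) {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "refusing to output binary data to TTY; use --force to override",
        ));
    }
    out.write_all(&sample[..n])?;
    // Stream the remainder in fixed-size chunks, never buffering the whole item.
    let mut buf = vec![0u8; PIPESIZE];
    loop {
        let n = reader.read(&mut buf)?;
        if n == 0 {
            break;
        }
        out.write_all(&buf[..n])?;
    }
    out.flush()
}

fn main() -> io::Result<()> {
    let data: &[u8] = b"plain text flows straight through";
    stream_with_sample(data, io::stdout().lock(), false)
}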

160
src/modes/client/import.rs Normal file

@@ -0,0 +1,160 @@
use anyhow::{Context, Result, anyhow};
use clap::Command;
use log::debug;
use std::collections::HashMap;
use std::fs;
use std::io::Read;
use std::path::Path;
use crate::client::KeepClient;
use crate::compression_engine::CompressionType;
use crate::config;
use crate::modes::common::ImportMeta;
use std::str::FromStr;
/// Import items from a `.keep.tar` archive or legacy `.meta.yml` file via client.
///
/// For `.keep.tar` files, streams the archive to the server's `/api/import` endpoint.
/// For `.meta.yml` files, uses the legacy single-item import path.
pub fn mode(
client: &KeepClient,
cmd: &mut Command,
settings: &config::Settings,
import_path: &str,
) -> Result<()> {
if import_path.ends_with(".keep.tar") {
import_tar(client, cmd, settings, import_path)
} else if import_path.ends_with(".meta.yml") {
import_legacy(client, cmd, settings, import_path)
} else {
cmd.error(
clap::error::ErrorKind::InvalidValue,
format!("Unsupported import format: {}", import_path),
)
.exit();
}
}
/// Import from a `.keep.tar` archive via the server API.
fn import_tar(
client: &KeepClient,
_cmd: &mut Command,
settings: &config::Settings,
tar_path: &str,
) -> Result<()> {
let path = Path::new(tar_path);
let imported_ids = client
.import_tar_file(path)
.map_err(|e| anyhow!("Import failed: {e}"))?;
if !settings.quiet {
println!(
"KEEP: Imported {} item(s): {:?}",
imported_ids.len(),
imported_ids
);
}
debug!(
"CLIENT_IMPORT: Imported {} items from {}",
imported_ids.len(),
tar_path
);
Ok(())
}
/// Legacy single-item import from a `.meta.yml` file.
fn import_legacy(
client: &KeepClient,
cmd: &mut Command,
settings: &config::Settings,
meta_file: &str,
) -> Result<()> {
// Read and parse metadata
let meta_yaml = fs::read_to_string(meta_file)
.with_context(|| format!("Cannot read metadata file: {meta_file}"))?;
let import_meta: ImportMeta = serde_yaml::from_str(&meta_yaml)
.with_context(|| format!("Cannot parse metadata file: {meta_file}"))?;
// Validate compression type
CompressionType::from_str(&import_meta.compression).map_err(|_| {
anyhow!(
"Invalid compression type '{}' in metadata file",
import_meta.compression
)
})?;
debug!(
"CLIENT_IMPORT: Parsed meta: ts={}, compression={}, tags={:?}",
import_meta.ts, import_meta.compression, import_meta.tags
);
// Build query parameters
let ts_str = import_meta.ts.to_rfc3339();
let params = [
("compress".to_string(), "false".to_string()),
("meta".to_string(), "false".to_string()),
("tags".to_string(), import_meta.tags.join(",")),
(
"compression_type".to_string(),
import_meta.compression.clone(),
),
("ts".to_string(), ts_str),
];
let param_refs: Vec<(&str, &str)> = params
.iter()
.map(|(k, v)| (k.as_str(), v.as_str()))
.collect();
// Stream data to server without buffering entire file
let item_info = if let Some(ref data_file) = settings.import_data_file {
let mut reader = fs::File::open(data_file)
.with_context(|| format!("Cannot read data file: {}", data_file.display()))?;
client.post_stream("/api/item/", &mut reader, &param_refs)?
} else {
// For stdin, we need to buffer since stdin can't be seeked
// and post_stream may need to retry.
let mut buf = Vec::new();
std::io::stdin()
.read_to_end(&mut buf)
.context("Cannot read data from stdin")?;
if buf.is_empty() {
cmd.error(
clap::error::ErrorKind::InvalidValue,
"No data provided (empty stdin)",
)
.exit();
}
let mut cursor = std::io::Cursor::new(&buf);
client.post_stream("/api/item/", &mut cursor, &param_refs)?
};
let item_id = item_info.id;
debug!("CLIENT_IMPORT: Created item {} via server", item_id);
// Set uncompressed size if known from metadata
if let Some(size) = import_meta.uncompressed_size {
client.set_item_size(item_id, size as u64)?;
debug!("CLIENT_IMPORT: Set size to {}", size);
}
// Post metadata
if !import_meta.metadata.is_empty() {
client.post_metadata(item_id, &import_meta.metadata)?;
debug!(
"CLIENT_IMPORT: Set {} metadata entries",
import_meta.metadata.len()
);
}
if !settings.quiet {
println!(
"KEEP: Imported item {} tags: {:?}",
item_id, import_meta.tags
);
}
Ok(())
}
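
For the legacy path, the `.meta.yml` contents deserialize into ImportMeta (defined later in common.rs). A sketch with a local copy of the struct — assumes serde_yaml and chrono's serde feature; the sample values are invented:

use serde::Deserialize;
use std::collections::HashMap;

#[derive(Debug, Deserialize)]
struct ImportMeta {
    ts: chrono::DateTime<chrono::Utc>,
    compression: String,
    #[serde(default, alias = "size")]
    uncompressed_size: Option<i64>,
    #[serde(default)]
    tags: Vec<String>,
    #[serde(default)]
    metadata: HashMap<String, String>,
}

fn main() -> anyhow::Result<()> {
    let yaml = r#"
ts: 2026-03-21T10:00:00Z
compression: gzip
size: 1234
tags: [logs, cron]
metadata:
  hostname: example
"#;
    let meta: ImportMeta = serde_yaml::from_str(yaml)?;
    assert_eq!(meta.uncompressed_size, Some(1234)); // via the `size` alias
    println!("{} tags, compression={}", meta.tags.len(), meta.compression);
    Ok(())
}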

src/modes/client/info.rs

@@ -1,5 +1,8 @@
use crate::client::KeepClient;
use crate::modes::common::{OutputFormat, format_size, settings_output_format};
use crate::modes::common::{
DisplayItemInfo, OutputFormat, format_size, render_item_info_table, resolve_item_ids,
settings_output_format,
};
use clap::Command;
use log::debug;
@@ -13,50 +16,34 @@ pub fn mode(
debug!("CLIENT_INFO: Getting item info via remote server");
let output_format = settings_output_format(settings);
// If tags provided, find matching item first
let item_ids: Vec<i64> = if !tags.is_empty() {
let items = client.list_items(tags, "newest", 0, 1)?;
if items.is_empty() {
return Err(anyhow::anyhow!("No items found matching tags: {:?}", tags));
}
items.into_iter().map(|i| i.id).collect()
} else {
ids.to_vec()
};
let item_ids = resolve_item_ids(client, ids, tags)?;
for &id in &item_ids {
let item = client.get_item_info(id)?;
match output_format {
OutputFormat::Json => {
println!("{}", serde_json::to_string_pretty(&item)?);
}
OutputFormat::Yaml => {
println!("{}", serde_yaml::to_string(&item)?);
OutputFormat::Json | OutputFormat::Yaml => {
crate::modes::common::print_serialized(&item, &output_format)?;
}
OutputFormat::Table => {
use comfy_table::{Table, presets::UTF8_FULL};
let mut table = Table::new();
table.load_preset(UTF8_FULL);
let size_str = item
.size
let display = DisplayItemInfo {
id: item.id,
timestamp: item.ts.clone(),
path: String::new(),
stream_size: item
.uncompressed_size
.map(|s| format_size(s as u64, settings.human_readable))
.unwrap_or_else(|| "N/A".to_string());
table.add_row(vec!["ID".to_string(), item.id.to_string()]);
table.add_row(vec!["Time".to_string(), item.ts.clone()]);
table.add_row(vec!["Size".to_string(), size_str]);
table.add_row(vec!["Compression".to_string(), item.compression.clone()]);
table.add_row(vec!["Tags".to_string(), item.tags.join(", ")]);
for (key, value) in &item.metadata {
table.add_row(vec![format!("Meta: {}", key), value.clone()]);
}
println!("{table}");
.unwrap_or_else(|| "N/A".to_string()),
compression: item.compression.clone(),
file_size: String::new(),
tags: item.tags.clone(),
metadata: item
.metadata
.iter()
.map(|(k, v)| (k.clone(), v.clone()))
.collect(),
};
render_item_info_table(&display, &settings.table_config);
}
}
}

src/modes/client/list.rs

@@ -1,53 +1,69 @@
use crate::client::KeepClient;
use crate::modes::common::{OutputFormat, format_size, settings_output_format};
use crate::modes::common::{
ColumnType, OutputFormat, format_size, render_list_table_with_format, settings_output_format,
};
use clap::Command;
use log::debug;
use std::str::FromStr;
pub fn mode(
client: &KeepClient,
_cmd: &mut Command,
settings: &crate::config::Settings,
ids: &[i64],
tags: &[String],
) -> Result<(), anyhow::Error> {
debug!("CLIENT_LIST: Listing items via remote server");
let items = client.list_items(tags, "newest", 0, 100)?;
let items = client.list_items(ids, tags, "newest", 0, 100, &settings.meta_filter())?;
if settings.ids_only {
for item in &items {
println!("{}", item.id);
}
return Ok(());
}
let output_format = settings_output_format(settings);
match output_format {
OutputFormat::Json => {
println!("{}", serde_json::to_string_pretty(&items)?);
}
OutputFormat::Yaml => {
println!("{}", serde_yaml::to_string(&items)?);
OutputFormat::Json | OutputFormat::Yaml => {
crate::modes::common::print_serialized(&items, &output_format)?;
}
OutputFormat::Table => {
use comfy_table::{Table, presets::UTF8_FULL};
let mut table = Table::new();
table.load_preset(UTF8_FULL);
// Header
let headers = ["ID", "Time", "Size", "Compression", "Tags"];
table.set_header(headers.iter().map(|h| h.to_string()).collect::<Vec<_>>());
for item in &items {
let size_str = item
.size
let rows: Vec<Vec<String>> = items
.iter()
.map(|item| {
let mut row = Vec::new();
for column in &settings.list_format {
let col_type = ColumnType::from_str(&column.name).ok();
let cell = match col_type {
Some(ColumnType::Id) => item.id.to_string(),
Some(ColumnType::Time) => item.ts.clone(),
Some(ColumnType::Size) => item
.uncompressed_size
.map(|s| format_size(s as u64, settings.human_readable))
.unwrap_or_default();
table.add_row(vec![
item.id.to_string(),
item.ts.clone(),
size_str,
item.compression.clone(),
item.tags.join(", "),
]);
.unwrap_or_default(),
Some(ColumnType::Compression) => item.compression.clone(),
Some(ColumnType::Tags) => item.tags.join(" "),
Some(ColumnType::Meta) => {
let meta_key = column.name.strip_prefix("meta:");
match meta_key {
Some(key) => {
item.metadata.get(key).cloned().unwrap_or_default()
}
None => String::new(),
}
}
_ => String::new(),
};
row.push(cell);
}
row
})
.collect();
println!("{table}");
render_list_table_with_format(&settings.list_format, &rows, &settings.table_config);
}
}
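
Cells are chosen per configured column name; meta:<key> columns read item metadata through strip_prefix (the regex removal noted in the commit messages). The lookup in isolation:

use std::collections::HashMap;

fn meta_cell(column_name: &str, metadata: &HashMap<String, String>) -> String {
    match column_name.strip_prefix("meta:") {
        Some(key) => metadata.get(key).cloned().unwrap_or_default(),
        None => String::new(),
    }
}

fn main() {
    let mut metadata = HashMap::new();
    metadata.insert("hostname".to_string(), "example".to_string());
    assert_eq!(meta_cell("meta:hostname", &metadata), "example");
    assert_eq!(meta_cell("meta:missing", &metadata), ""); // absent key -> empty cell
    assert_eq!(meta_cell("tags", &metadata), ""); // not a meta column
}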

src/modes/client/mod.rs

@@ -1,7 +1,10 @@
pub mod delete;
pub mod diff;
pub mod export;
pub mod get;
pub mod import;
pub mod info;
pub mod list;
pub mod save;
pub mod status;
pub mod update;

src/modes/client/save.rs

@@ -1,12 +1,15 @@
use crate::client::{ItemInfo, KeepClient};
use crate::client::KeepClient;
use crate::compression_engine::CompressionType;
use crate::config::Settings;
use crate::meta_plugin::SaveMetaFn;
use crate::modes::common::settings_compression_type;
use crate::services::ItemInfo;
use crate::services::compression_service::CompressionService;
use crate::services::meta_service::MetaService;
use anyhow::Result;
use clap::Command;
use is_terminal::IsTerminal;
use log::debug;
use sha2::{Digest, Sha256};
use std::collections::HashMap;
use std::io::{Read, Write};
use std::sync::{Arc, Mutex};
@@ -14,11 +17,14 @@ use std::sync::{Arc, Mutex};
/// Streaming save mode for client.
///
/// Uses three threads for true streaming with constant memory:
/// - Reader thread: reads stdin, tees to stdout, computes SHA-256,
/// - Reader thread: reads stdin, tees to stdout, runs meta plugins,
/// compresses data, writes to OS pipe
/// - Pipe: zero-copy transfer of compressed bytes between threads
/// - Streamer thread: reads from pipe, streams to server via chunked HTTP
///
/// Meta plugins run on the client side during streaming. Collected metadata
/// is sent to the server via a separate POST after streaming completes.
///
/// Memory usage is O(PIPESIZE) regardless of data size.
pub fn mode(
client: &KeepClient,
@@ -29,43 +35,48 @@ pub fn mode(
) -> Result<(), anyhow::Error> {
debug!("CLIENT_SAVE: Saving item via remote server (streaming)");
if tags.is_empty() {
tags.push("none".to_string());
}
crate::modes::common::ensure_default_tag(tags);
// Determine compression type from settings
let compression_type = settings_compression_type(cmd, settings);
let server_compress = matches!(compression_type, CompressionType::None);
let compression_type_str = compression_type.to_string();
// In client mode, the client always handles compression (even "raw").
// The server should never re-compress client data.
let server_compress = false;
// Shared metadata collection: plugins write here via save_meta closure
let collected_meta: Arc<Mutex<HashMap<String, String>>> = Arc::new(Mutex::new(HashMap::new()));
let meta_collector = collected_meta.clone();
let save_meta: SaveMetaFn = Arc::new(Mutex::new(move |name: &str, value: &str| {
if let Ok(mut map) = meta_collector.lock() {
map.insert(name.to_string(), value.to_string());
}
}));
// Create MetaService and get plugins (must happen before spawning reader thread)
let meta_service = MetaService::new(save_meta);
let mut plugins = meta_service.get_plugins(cmd, settings);
// Create OS pipe for streaming compressed bytes between threads
let (pipe_reader, pipe_writer) = os_pipe::pipe()?;
// Shared state for reader thread results
let shared = Arc::new(Mutex::new((0u64, String::new())));
let shared_reader = Arc::clone(&shared);
// Reader thread: stdin → tee(stdout) → hash → compress → pipe
// Reader thread: stdin → tee(stdout) → meta plugins → compress → pipe
let compression_type_clone = compression_type.clone();
let reader_handle = std::thread::spawn(move || -> Result<(u64, String)> {
let reader_handle = std::thread::spawn(move || -> Result<u64> {
let stdin = std::io::stdin();
let stdout = std::io::stdout();
let mut stdin_lock = stdin.lock();
let mut stdout_lock = stdout.lock();
let mut hasher = Sha256::new();
let mut total_bytes = 0u64;
let mut buffer = [0u8; 8192];
// Initialize meta plugins
meta_service.initialize_plugins(&mut plugins);
// Wrap pipe writer with appropriate compression
let mut compressor: Box<dyn Write> = match compression_type_clone {
CompressionType::GZip => {
use flate2::Compression;
use flate2::write::GzEncoder;
Box::new(GzEncoder::new(pipe_writer, Compression::default()))
}
CompressionType::LZ4 => Box::new(lz4_flex::frame::FrameEncoder::new(pipe_writer)),
_ => Box::new(pipe_writer),
};
let mut compressor: Box<dyn Write> =
CompressionService::compressing_writer(Box::new(pipe_writer), &compression_type_clone)?;
loop {
let n = stdin_lock.read(&mut buffer)?;
@@ -76,26 +87,23 @@ pub fn mode(
// Tee to stdout
stdout_lock.write_all(&buffer[..n])?;
// Update hash
hasher.update(&buffer[..n]);
// Feed chunk to meta plugins
meta_service.process_chunk(&mut plugins, &buffer[..n]);
total_bytes += n as u64;
// Compress and write to pipe
compressor.write_all(&buffer[..n])?;
}
// Finalize compression (flushes any buffered compressed data)
// Finalize meta plugins (digest, text, tokens produce final output here)
meta_service.finalize_plugins(&mut plugins);
// Explicitly flush and finalize compression before dropping.
compressor.flush()?;
drop(compressor);
// Pipe writer is now dropped (inside compressor), signaling EOF to streamer
let digest = format!("{:x}", hasher.finalize());
// Set shared state for main thread
let mut shared = shared_reader.lock().unwrap();
*shared = (total_bytes, digest.clone());
Ok((total_bytes, digest))
Ok(total_bytes)
});
// Streamer thread: reads compressed bytes from pipe → POST to server
@@ -104,6 +112,7 @@ pub fn mode(
let client_password = client.password().cloned();
let client_jwt = client.jwt().cloned();
let tags_clone = tags.clone();
let compression_type_str_clone = compression_type_str.clone();
let streamer_handle = std::thread::spawn(move || -> Result<ItemInfo> {
let streaming_client =
@@ -112,7 +121,12 @@ pub fn mode(
("compress".to_string(), server_compress.to_string()),
("meta".to_string(), "false".to_string()),
("tags".to_string(), tags_clone.join(",")),
// Always send compression_type when compress=false (client handled compression)
("compression_type".to_string(), compression_type_str_clone),
];
// Filter out empty params
let params: Vec<(String, String)> =
params.into_iter().filter(|(_, v)| !v.is_empty()).collect();
let param_refs: Vec<(&str, &str)> = params
.iter()
.map(|(k, v)| (k.as_str(), v.as_str()))
@@ -129,42 +143,34 @@ pub fn mode(
.map_err(|e| anyhow::anyhow!("Streamer thread panicked: {:?}", e))??;
// Wait for reader thread (should complete quickly after pipe is drained)
reader_handle
let uncompressed_size = reader_handle
.join()
.map_err(|e| anyhow::anyhow!("Reader thread panicked: {:?}", e))??;
// Read results from shared state
let (uncompressed_size, digest) = {
let shared = shared.lock().unwrap();
shared.clone()
};
// Build local metadata and send to server
// Merge plugin-collected metadata with CLI metadata
let mut local_metadata = metadata;
local_metadata.insert("digest_sha256".to_string(), digest);
local_metadata.insert(
"uncompressed_size".to_string(),
uncompressed_size.to_string(),
);
// Add hostname
if let Ok(hostname) = gethostname::gethostname().into_string() {
local_metadata.insert("hostname".to_string(), hostname.clone());
let short = hostname.split('.').next().unwrap_or(&hostname).to_string();
local_metadata.insert("hostname_short".to_string(), short);
// Add plugin-collected metadata (digest, hostname, text stats, etc.)
if let Ok(plugin_meta) = collected_meta.lock() {
for (k, v) in plugin_meta.iter() {
local_metadata.entry(k.clone()).or_insert_with(|| v.clone());
}
}
// Send uncompressed size to server (proper field, not metadata)
client.set_item_size(item_info.id, uncompressed_size)?;
// Send metadata to server
if !local_metadata.is_empty() {
client.post_metadata(item_info.id, &local_metadata)?;
}
// Print status to stderr
// Print status to stderr (item ID is known immediately from server response)
if !settings.quiet {
if std::io::stderr().is_terminal() {
eprintln!("KEEP: New item (streaming) tags: {}", tags.join(" "));
eprintln!("KEEP: New item: {} tags: {}", item_info.id, tags.join(" "));
} else {
eprintln!("KEEP: New item (streaming) tags: {tags:?}");
eprintln!("KEEP: New item: {} tags: {tags:?}", item_info.id);
}
}
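
The pipe-between-threads skeleton underneath this save mode, reduced to std plus the os_pipe crate (a byte sink stands in for the chunked-HTTP streamer; dropping the write end is what signals EOF):

use std::io::{Read, Write};

fn main() -> anyhow::Result<()> {
    let (mut pipe_reader, pipe_writer) = os_pipe::pipe()?;

    // "Reader" thread: produces data and writes it into the pipe.
    let producer = std::thread::spawn(move || -> anyhow::Result<u64> {
        let mut writer = pipe_writer; // dropped at end of scope => EOF downstream
        let mut total = 0u64;
        for chunk in [b"hello ".as_slice(), b"world".as_slice()] {
            writer.write_all(chunk)?;
            total += chunk.len() as u64;
        }
        Ok(total)
    });

    // "Streamer" thread stand-in: drains the pipe as a plain Read.
    let consumer = std::thread::spawn(move || -> anyhow::Result<Vec<u8>> {
        let mut buf = Vec::new();
        pipe_reader.read_to_end(&mut buf)?; // returns once the writer is dropped
        Ok(buf)
    });

    let total = producer
        .join()
        .map_err(|e| anyhow::anyhow!("producer panicked: {e:?}"))??;
    let body = consumer
        .join()
        .map_err(|e| anyhow::anyhow!("consumer panicked: {e:?}"))??;
    assert_eq!(total as usize, body.len());
    println!("streamed {total} bytes");
    Ok(())
}

In the real mode the producer also tees to stdout, feeds meta plugins, and wraps the pipe writer in a compressing writer; the consumer POSTs the pipe bytes as a chunked HTTP body.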

src/modes/client/status.rs

@@ -2,6 +2,7 @@ use crate::client::KeepClient;
use crate::modes::common::OutputFormat;
use crate::modes::common::settings_output_format;
use clap::Command;
use comfy_table::{Attribute, Cell, Table};
use log::debug;
pub fn mode(
@@ -11,21 +12,78 @@ pub fn mode(
) -> Result<(), anyhow::Error> {
debug!("CLIENT_STATUS: Getting status from remote server");
let status = client.get_status()?;
let status_info = client.get_status()?;
let output_format = settings_output_format(settings);
match output_format {
OutputFormat::Json => {
println!("{}", serde_json::to_string_pretty(&status)?);
}
OutputFormat::Yaml => {
println!("{}", serde_yaml::to_string(&status)?);
OutputFormat::Json | OutputFormat::Yaml => {
crate::modes::common::print_serialized(&status_info, &output_format)?;
}
OutputFormat::Table => {
println!("Remote Server Status");
println!("====================");
println!("{}", serde_json::to_string_pretty(&status)?);
// Paths
let path_table =
crate::modes::common::build_path_table(&status_info.paths, &settings.table_config);
println!("PATHS:");
println!(
"{}",
crate::modes::common::trim_lines_end(&path_table.trim_fmt())
);
println!();
// Configured meta plugins
if let Some(ref configured) = status_info.configured_meta_plugins
&& !configured.is_empty()
{
let mut sorted = configured.clone();
sorted.sort_by(|a, b| a.name.cmp(&b.name));
let mut table =
crate::modes::common::create_table_with_config(&settings.table_config);
table.set_header(vec![
Cell::new("Plugin Name").add_attribute(Attribute::Bold),
Cell::new("Enabled").add_attribute(Attribute::Bold),
]);
for plugin in &sorted {
let enabled = status_info.enabled_meta_plugins.contains(&plugin.name);
table.add_row(vec![
plugin.name.clone(),
if enabled { "Yes" } else { "No" }.to_string(),
]);
}
println!("META PLUGINS:");
println!(
"{}",
crate::modes::common::trim_lines_end(&table.trim_fmt())
);
println!();
}
// Compression
if !status_info.compression.is_empty() {
let mut table =
crate::modes::common::create_table_with_config(&settings.table_config);
table.set_header(vec![
Cell::new("Type").add_attribute(Attribute::Bold),
Cell::new("Found").add_attribute(Attribute::Bold),
Cell::new("Default").add_attribute(Attribute::Bold),
Cell::new("Binary").add_attribute(Attribute::Bold),
]);
for comp in &status_info.compression {
table.add_row(vec![
comp.compression_type.clone(),
if comp.found { "Yes" } else { "No" }.to_string(),
if comp.default { "Yes" } else { "No" }.to_string(),
comp.binary.clone(),
]);
}
println!("COMPRESSION:");
println!(
"{}",
crate::modes::common::trim_lines_end(&table.trim_fmt())
);
println!();
}
}
}

102
src/modes/client/update.rs Normal file

@@ -0,0 +1,102 @@
use crate::client::KeepClient;
use crate::config::Settings;
use anyhow::Result;
use clap::Command;
use log::debug;
use std::collections::HashMap;
/// Client update mode: runs meta plugins on the server for an existing item.
///
/// Sends the list of plugin names (from --meta-plugin config) and any direct
/// metadata (--meta key=value) to the server. The server reads the stored file,
/// runs the specified plugins, and stores the results.
pub fn mode(
client: &KeepClient,
cmd: &mut Command,
settings: &Settings,
ids: &mut [i64],
tags: &mut [String],
) -> Result<(), anyhow::Error> {
debug!("CLIENT_UPDATE: Updating item via remote server");
if ids.len() != 1 {
cmd.error(
clap::error::ErrorKind::InvalidValue,
"--update requires exactly one numeric ID",
)
.exit();
}
let item_id = ids[0];
// Collect plugin names from settings (--meta-plugin config)
let plugin_names: Vec<String> = settings
.meta_plugins_names()
.into_iter()
.flat_map(|s| {
s.split(',')
.map(|p| p.trim().to_string())
.collect::<Vec<_>>()
})
.filter(|p| !p.is_empty())
.collect();
// Collect direct metadata from --meta flags
let metadata: HashMap<String, String> = settings
.meta
.iter()
.filter_map(|(k, v)| v.as_ref().map(|val| (k.clone(), val.clone())))
.collect();
// Build query params
let mut params: Vec<(String, String)> = Vec::new();
if !plugin_names.is_empty() {
params.push(("plugins".to_string(), plugin_names.join(",")));
}
if !metadata.is_empty() {
let meta_json = serde_json::to_string(&metadata)?;
params.push(("metadata".to_string(), meta_json));
}
if !tags.is_empty() {
params.push(("tags".to_string(), tags.join(",")));
}
// Nothing to update
if params.is_empty() {
if !settings.quiet {
eprintln!("KEEP: No changes specified for item {item_id}");
}
return Ok(());
}
let param_refs: Vec<(&str, &str)> = params
.iter()
.map(|(k, v)| (k.as_str(), v.as_str()))
.collect();
let url_path = format!("/api/item/{item_id}/update");
// POST to update endpoint
let _item_info = client.post_bytes(&url_path, &[], &param_refs)?;
if !settings.quiet {
let mut parts = Vec::new();
if !plugin_names.is_empty() {
parts.push(format!("plugins: {}", plugin_names.join(", ")));
}
if !metadata.is_empty() {
parts.push(format!("{} metadata", metadata.len()));
}
if !tags.is_empty() {
parts.push(format!("tags: {}", tags.join(" ")));
}
let action = parts.join(", ");
eprintln!("KEEP: Updated item {item_id} ({action})");
}
Ok(())
}
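
The plugin-name flattening above accepts both repeated --meta-plugin flags and comma-separated lists; in isolation:

fn split_plugin_names(raw: &[String]) -> Vec<String> {
    raw.iter()
        .flat_map(|s| s.split(','))
        .map(|p| p.trim().to_string())
        .filter(|p| !p.is_empty())
        .collect()
}

fn main() {
    let raw = vec![
        "magic, digest".to_string(), // one flag, comma-separated
        "text".to_string(),          // separate flag
        "".to_string(),              // empty entries are dropped
    ];
    assert_eq!(split_plugin_names(&raw), vec!["magic", "digest", "text"]);
}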

src/modes/common.rs

@@ -1,3 +1,4 @@
use crate::common::status::PathInfo;
use crate::compression_engine::CompressionType;
/// Common utilities shared across different modes in the Keep application.
///
@@ -15,11 +16,13 @@ use crate::compression_engine::CompressionType;
/// ```
use crate::config;
use crate::meta_plugin::MetaPluginType;
use anyhow::{Result, anyhow};
use chrono::{DateTime, Utc};
use clap::Command;
use clap::error::ErrorKind;
use comfy_table::{ContentArrangement, Table};
use comfy_table::{Attribute, Cell, ContentArrangement, Table};
use log::debug;
use regex::Regex;
use serde::{Deserialize, Serialize};
use std::collections::HashMap;
use std::env;
use std::io::IsTerminal;
@@ -52,38 +55,18 @@ pub enum OutputFormat {
Yaml,
}
/// Extracts metadata from KEEP_META_* environment variables.
///
/// Scans environment for variables prefixed with KEEP_META_ and extracts
/// key-value pairs for initial item metadata. Ignores KEEP_META_PLUGINS.
///
/// # Returns
///
/// `HashMap<String, String>` - Metadata from environment variables, with keys in uppercase without prefix.
///
/// # Errors
///
/// None; silently ignores non-matching vars and PLUGINS.
///
/// # Examples
///
/// ```ignore
/// use std::env;
/// env::set_var("KEEP_META_COMMAND", "ls -la");
/// let meta = keep::modes::common::get_meta_from_env();
/// assert_eq!(meta.get("COMMAND"), Some(&"ls -la".to_string()));
/// ```
pub const IMPORT_FORMAT_ERROR: &str =
"Unsupported import format: {} (expected .keep.tar or .meta.yml)";
pub fn get_meta_from_env() -> HashMap<String, String> {
debug!("COMMON: Getting meta from KEEP_META_*");
let re = Regex::new(r"^KEEP_META_(.+)$").unwrap();
let mut meta_env: HashMap<String, String> = HashMap::new();
const PREFIX: &str = "KEEP_META_";
for (key, value) in env::vars() {
if let Some(meta_name_caps) = re.captures(key.as_str()) {
let name = String::from(meta_name_caps.get(1).unwrap().as_str());
// Ignore KEEP_META_PLUGINS
if name != "PLUGINS" {
debug!("COMMON: Found meta: {}={}", name.clone(), value.clone());
meta_env.insert(name, value.clone());
if let Some(name) = key.strip_prefix(PREFIX) {
if !name.is_empty() && name != "PLUGINS" {
debug!("COMMON: Found meta: {}={}", name, value);
meta_env.insert(name.to_string(), value);
}
}
}
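
The strip_prefix rewrite needs no regex crate; the same scan over arbitrary key/value pairs:

use std::collections::HashMap;

fn meta_from_pairs(vars: impl Iterator<Item = (String, String)>) -> HashMap<String, String> {
    const PREFIX: &str = "KEEP_META_";
    let mut meta = HashMap::new();
    for (key, value) in vars {
        if let Some(name) = key.strip_prefix(PREFIX) {
            // Skip the reserved PLUGINS variable and a bare "KEEP_META_" key.
            if !name.is_empty() && name != "PLUGINS" {
                meta.insert(name.to_string(), value);
            }
        }
    }
    meta
}

fn main() {
    let vars = vec![
        ("KEEP_META_COMMAND".to_string(), "ls -la".to_string()),
        ("KEEP_META_PLUGINS".to_string(), "ignored".to_string()),
        ("PATH".to_string(), "/usr/bin".to_string()),
    ];
    let meta = meta_from_pairs(vars.into_iter());
    assert_eq!(meta.get("COMMAND").map(String::as_str), Some("ls -la"));
    assert!(!meta.contains_key("PLUGINS"));
}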
@@ -337,26 +320,8 @@ pub fn trim_lines_end(s: &str) -> String {
/// let mut table = create_table(true);
/// table.add_row(vec!["Header1", "Header2"]);
/// ```
pub fn create_table(use_styling: bool) -> Table {
let mut table = Table::new();
table.set_content_arrangement(ContentArrangement::Dynamic);
if use_styling {
if std::io::stdout().is_terminal() {
table
.load_preset(comfy_table::presets::UTF8_FULL)
.apply_modifier(comfy_table::modifiers::UTF8_SOLID_INNER_BORDERS);
} else {
table.load_preset(comfy_table::presets::ASCII_FULL);
}
} else {
table.load_preset(comfy_table::presets::NOTHING);
}
if !std::io::stdout().is_terminal() {
table.force_no_tty();
}
table
pub fn create_table(_use_styling: bool) -> Table {
create_table_with_config(&crate::config::TableConfig::default())
}
/// Creates a table configured from application table settings.
@@ -447,3 +412,292 @@ pub fn create_table_with_config(table_config: &crate::config::TableConfig) -> Ta
table
}
/// Display data for a single item's detail view (used by --info).
pub struct DisplayItemInfo {
pub id: i64,
pub timestamp: String,
pub path: String,
pub stream_size: String,
pub compression: String,
pub file_size: String,
pub tags: Vec<String>,
pub metadata: Vec<(String, String)>,
}
/// Renders item detail table. Shared by local and client info modes.
pub fn render_item_info_table(info: &DisplayItemInfo, table_config: &config::TableConfig) {
use comfy_table::{Attribute, Cell};
let mut table = create_table_with_config(table_config);
table.add_row(vec![
Cell::new("ID").add_attribute(Attribute::Bold),
Cell::new(info.id.to_string()),
]);
table.add_row(vec![
Cell::new("Time").add_attribute(Attribute::Bold),
Cell::new(&info.timestamp),
]);
table.add_row(vec![
Cell::new("Size").add_attribute(Attribute::Bold),
Cell::new(&info.stream_size),
]);
table.add_row(vec![
Cell::new("Compression").add_attribute(Attribute::Bold),
Cell::new(&info.compression),
]);
table.add_row(vec![
Cell::new("Tags").add_attribute(Attribute::Bold),
Cell::new(info.tags.join(" ")),
]);
for (key, value) in &info.metadata {
table.add_row(vec![
Cell::new(format!("Meta: {key}")).add_attribute(Attribute::Bold),
Cell::new(value),
]);
}
println!("{}", trim_lines_end(&table.trim_fmt()));
}
/// Renders list table with column format from config. Shared by local and client list modes.
pub fn render_list_table_with_format(
columns: &[config::ColumnConfig],
rows: &[Vec<String>],
table_config: &config::TableConfig,
) {
let mut table = create_table_with_config(table_config);
let header_cells: Vec<Cell> = columns
.iter()
.map(|col| Cell::new(&col.label).add_attribute(Attribute::Bold))
.collect();
table.set_header(header_cells);
for row in rows {
let cells: Vec<Cell> = row
.iter()
.enumerate()
.map(|(i, val)| {
let mut cell = Cell::new(val);
if let Some(col) = columns.get(i) {
if let Some(ref fg) = col.fg_color {
cell = apply_color(cell, fg, true);
}
if let Some(ref bg) = col.bg_color {
cell = apply_color(cell, bg, false);
}
for attr in &col.attributes {
cell = apply_table_attribute(cell, attr);
}
}
cell
})
.collect();
table.add_row(cells);
}
println!("{}", trim_lines_end(&table.trim_fmt()));
}
/// Applies config TableColor to a comfy-table Cell.
pub fn apply_color(mut cell: Cell, color: &config::TableColor, is_foreground: bool) -> Cell {
use comfy_table::Color;
let comfy_color = match color {
config::TableColor::Black => Color::Black,
config::TableColor::Red => Color::Red,
config::TableColor::Green => Color::Green,
config::TableColor::Yellow => Color::Yellow,
config::TableColor::Blue => Color::Blue,
config::TableColor::Magenta => Color::Magenta,
config::TableColor::Cyan => Color::Cyan,
config::TableColor::White => Color::White,
config::TableColor::Gray => Color::Grey,
config::TableColor::DarkRed => Color::DarkRed,
config::TableColor::DarkGreen => Color::DarkGreen,
config::TableColor::DarkYellow => Color::DarkYellow,
config::TableColor::DarkBlue => Color::DarkBlue,
config::TableColor::DarkMagenta => Color::DarkMagenta,
config::TableColor::DarkCyan => Color::DarkCyan,
config::TableColor::Rgb(r, g, b) => Color::Rgb {
r: *r,
g: *g,
b: *b,
},
};
if is_foreground {
cell = cell.fg(comfy_color);
} else {
cell = cell.bg(comfy_color);
}
cell
}
/// Ensures tags has at least one entry, adding "none" if empty.
pub fn ensure_default_tag(tags: &mut Vec<String>) {
if tags.is_empty() {
tags.push("none".to_string());
}
}
/// Prints a serializable value in JSON or YAML format based on output format.
///
/// Only handles Json and Yaml variants; Table should be handled separately.
pub fn print_serialized<T: serde::Serialize>(
value: &T,
format: &OutputFormat,
) -> anyhow::Result<()> {
match format {
OutputFormat::Json => println!("{}", serde_json::to_string_pretty(value)?),
OutputFormat::Yaml => println!("{}", serde_yaml::to_string(value)?),
OutputFormat::Table => unreachable!(),
}
Ok(())
}
/// Applies config TableAttribute to a comfy-table Cell.
pub fn apply_table_attribute(mut cell: Cell, attribute: &config::TableAttribute) -> Cell {
match attribute {
config::TableAttribute::Bold => cell = cell.add_attribute(Attribute::Bold),
config::TableAttribute::Dim => cell = cell.add_attribute(Attribute::Dim),
config::TableAttribute::Italic => cell = cell.add_attribute(Attribute::Italic),
config::TableAttribute::Underlined => cell = cell.add_attribute(Attribute::Underlined),
config::TableAttribute::SlowBlink => cell = cell.add_attribute(Attribute::SlowBlink),
config::TableAttribute::RapidBlink => cell = cell.add_attribute(Attribute::RapidBlink),
config::TableAttribute::Reverse => cell = cell.add_attribute(Attribute::Reverse),
config::TableAttribute::Hidden => cell = cell.add_attribute(Attribute::Hidden),
config::TableAttribute::CrossedOut => cell = cell.add_attribute(Attribute::CrossedOut),
}
cell
}
/// Builds a table showing data and database path information.
pub fn build_path_table(path_info: &PathInfo, table_config: &config::TableConfig) -> Table {
let mut path_table = create_table_with_config(table_config);
path_table.set_header(vec![
Cell::new("Type").add_attribute(Attribute::Bold),
Cell::new("Path").add_attribute(Attribute::Bold),
]);
path_table.add_row(vec!["Data", &path_info.data]);
path_table.add_row(vec!["Database", &path_info.database]);
path_table
}
/// Sanitize tags for use in filenames.
///
/// Replaces non-alphanumeric characters with underscores and joins with `_`.
/// Empty tags are filtered out to avoid double underscores.
pub fn sanitize_tags(tags: &[String]) -> String {
tags.iter()
.filter(|t| !t.is_empty())
.map(|t| {
t.chars()
.map(|c| if c.is_alphanumeric() { c } else { '_' })
.collect::<String>()
})
.collect::<Vec<_>>()
.join("_")
}
/// Metadata structure for export to YAML. Shared by local and client export modes.
#[derive(Debug, Serialize)]
pub struct ExportMeta {
pub ts: DateTime<Utc>,
pub compression: String,
pub uncompressed_size: Option<i64>,
pub tags: Vec<String>,
pub metadata: HashMap<String, String>,
}
/// Metadata structure for import from YAML. Shared by local and client import modes.
#[derive(Debug, Deserialize)]
pub struct ImportMeta {
pub ts: DateTime<Utc>,
pub compression: String,
#[serde(default, alias = "size")]
pub uncompressed_size: Option<i64>,
#[serde(default)]
pub tags: Vec<String>,
#[serde(default)]
pub metadata: HashMap<String, String>,
}
/// Resolve a single item ID from explicit IDs, tags, or latest item.
///
/// Returns the first ID if provided, the newest item matching tags,
/// or the newest item overall if neither is specified.
#[cfg(feature = "client")]
pub fn resolve_item_id(
client: &crate::client::KeepClient,
ids: &[i64],
tags: &[String],
) -> Result<i64> {
if !ids.is_empty() {
Ok(ids[0])
} else if !tags.is_empty() {
let items = client.list_items(&[], tags, "newest", 0, 1, &HashMap::new())?;
if items.is_empty() {
return Err(anyhow!("No items found matching tags: {:?}", tags));
}
Ok(items[0].id)
} else {
let items = client.list_items(&[], &[], "newest", 0, 1, &HashMap::new())?;
if items.is_empty() {
return Err(anyhow!("No items found"));
}
Ok(items[0].id)
}
}
/// Resolve item IDs from explicit IDs or tags (multi-item variant).
#[cfg(feature = "client")]
pub fn resolve_item_ids(
client: &crate::client::KeepClient,
ids: &[i64],
tags: &[String],
) -> Result<Vec<i64>> {
if !ids.is_empty() {
Ok(ids.to_vec())
} else if !tags.is_empty() {
let items = client.list_items(&[], tags, "newest", 0, 0, &HashMap::new())?;
if items.is_empty() {
return Err(anyhow!("No items found matching tags: {:?}", tags));
}
Ok(items.into_iter().map(|i| i.id).collect())
} else {
let items = client.list_items(&[], &[], "newest", 0, 1, &HashMap::new())?;
if items.is_empty() {
return Err(anyhow!("No items found"));
}
Ok(vec![items[0].id])
}
}
/// Check if binary content should be blocked from TTY output.
///
/// Uses metadata `text` field as fast path, then falls back to byte sampling.
/// Returns Err if content is binary and should not be displayed.
pub fn check_binary_tty(
metadata: &HashMap<String, String>,
data_sample: &[u8],
force: bool,
) -> Result<()> {
if force || !std::io::stdout().is_terminal() {
return Ok(());
}
if crate::common::is_binary::is_content_binary_from_metadata(metadata, data_sample) {
return Err(anyhow!(
"Refusing to output binary data to TTY, use --force to override"
));
}
Ok(())
}
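
A sketch of the two-stage decision: the `text` metadata fast path follows the doc comment above, while the NUL-byte sampling is an assumed stand-in for whatever heuristic is_content_binary_from_metadata actually applies:

use std::collections::HashMap;

fn is_content_binary(metadata: &HashMap<String, String>, sample: &[u8]) -> bool {
    // Fast path: trust the stored `text` metadata when present.
    match metadata.get("text").map(String::as_str) {
        Some("true") => return false,
        Some("false") => return true,
        _ => {}
    }
    // Fallback: sample-based heuristic (assumed: any NUL byte => binary).
    sample.contains(&0)
}

fn main() {
    let mut meta = HashMap::new();
    assert!(is_content_binary(&meta, b"\x00\x01\x02"));
    meta.insert("text".to_string(), "true".to_string());
    assert!(!is_content_binary(&meta, b"\x00\x01\x02")); // metadata wins over sampling
}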

src/modes/diff.rs

@@ -1,12 +1,19 @@
use crate::config;
use crate::services::item_service::ItemService;
/// Diff mode implementation.
///
/// This module provides functionality for comparing two items and displaying their
/// differences using external diff tools.
use anyhow::{Context, Result};
/// differences using external diff tools. Decompressed content is streamed to diff
/// via pipes and /dev/fd file descriptors — no temporary files are created.
use crate::config;
use crate::services::compression_service::CompressionService;
use crate::services::item_service::ItemService;
use anyhow::{Context, Result, anyhow};
use clap::Command;
use command_fds::{CommandFdExt, FdMapping};
use log::debug;
use nix::fcntl::OFlag;
use nix::unistd::pipe2;
use std::io::Read;
use std::os::unix::io::{AsRawFd, OwnedFd};
fn validate_diff_args(_cmd: &mut Command, ids: &[i64], tags: &[String]) -> anyhow::Result<()> {
if !tags.is_empty() {
@@ -23,19 +30,6 @@ fn validate_diff_args(_cmd: &mut Command, ids: &[i64], tags: &[String]) -> anyho
}
/// Fetches and validates items from the database for diff operation.
///
/// This function retrieves two items by their IDs from the database using the
/// item service, which handles validation, and returns them as a tuple.
///
/// # Arguments
///
/// * `conn` - Mutable reference to the database connection.
/// * `ids` - Vector of item IDs to fetch.
/// * `item_service` - Reference to the item service for validation.
///
/// # Returns
///
/// * `Result<(ItemWithMeta, ItemWithMeta)>` - Tuple of items with metadata or error.
fn fetch_and_validate_items(
conn: &mut rusqlite::Connection,
ids: &[i64],
@@ -44,7 +38,6 @@ fn fetch_and_validate_items(
crate::services::types::ItemWithMeta,
crate::services::types::ItemWithMeta,
)> {
// Fetch items using the service, which handles validation
let item_a = item_service
.get_item(conn, ids[0])
.with_context(|| format!("Unable to find first item (ID: {}) in database", ids[0]))?;
@@ -52,48 +45,12 @@ fn fetch_and_validate_items(
.get_item(conn, ids[1])
.with_context(|| format!("Unable to find second item (ID: {}) in database", ids[1]))?;
debug!("MAIN: Found item A {:?}", item_a.item);
debug!("MAIN: Found item B {:?}", item_b.item);
debug!("DIFF: Found item A {:?}", item_a.item);
debug!("DIFF: Found item B {:?}", item_b.item);
Ok((item_a, item_b))
}
/// Sets up file paths and compression for diff operation.
///
/// This function constructs the file paths for the two items and prepares the
/// compression engines needed for reading their contents.
///
/// # Arguments
///
/// * `item_service` - Reference to the item service.
/// * `item_a` - First item with metadata.
/// * `item_b` - Second item with metadata.
///
/// # Returns
///
/// * `Result<(PathBuf, PathBuf)>` - Tuple of item file paths or error.
fn setup_diff_paths_and_compression(
item_service: &ItemService,
item_a: &crate::services::types::ItemWithMeta,
item_b: &crate::services::types::ItemWithMeta,
) -> Result<(std::path::PathBuf, std::path::PathBuf)> {
let item_a_id = item_a
.item
.id
.ok_or_else(|| anyhow::anyhow!("Item A missing ID"))?;
let item_b_id = item_b
.item
.id
.ok_or_else(|| anyhow::anyhow!("Item B missing ID"))?;
// Use the service's data path to construct proper file paths
let data_path = item_service.get_data_path();
let item_a_path = data_path.join(item_a_id.to_string());
let item_b_path = data_path.join(item_b_id.to_string());
Ok((item_a_path, item_b_path))
}
pub fn mode_diff(
cmd: &mut Command,
args: &crate::args::Args,
@@ -125,51 +82,119 @@ pub fn mode_diff(
validate_diff_args(cmd, &ids, &tags)?;
let settings = crate::config::Settings::new(args, crate::config::Settings::default_dir()?)?;
let item_service = crate::services::item_service::ItemService::new(settings.dir.clone());
let settings = config::Settings::new(args, config::Settings::default_dir()?)?;
let item_service = ItemService::new(settings.dir.clone());
let (item_a, item_b) = fetch_and_validate_items(conn, &ids, &item_service)?;
let (path_a, path_b) = setup_diff_paths_and_compression(&item_service, &item_a, &item_b)?;
run_external_diff(&path_a, &path_b)?;
Ok(())
run_external_diff(&item_service, &item_a, &item_b)
}
/// Runs external diff command to compare two files.
/// Creates a pipe with CLOEXEC set atomically, returns (read_fd, write_fd).
fn create_pipe() -> Result<(OwnedFd, OwnedFd)> {
pipe2(OFlag::O_CLOEXEC).context("Failed to create pipe")
}
/// Streams decompressed item content through a pipe fd.
///
/// Uses the system's `diff` command to generate a unified diff output.
/// Returns an error if the diff command is not found.
/// Returns a JoinHandle for the writer thread. The thread writes decompressed
/// data to write_fd and closes it when done (causing EOF for the reader).
fn spawn_writer_thread(
item_service: &ItemService,
item: &crate::services::types::ItemWithMeta,
write_fd: OwnedFd,
) -> std::thread::JoinHandle<Result<()>> {
let data_path = item_service.get_data_path().clone();
let id = match item.item.id {
Some(id) => id,
None => return std::thread::spawn(|| Err(anyhow!("item missing ID"))),
};
let compression = item.item.compression.clone();
let mut item_path = data_path;
item_path.push(id.to_string());
std::thread::spawn(move || -> Result<()> {
let compression_service = CompressionService::new();
let mut reader = compression_service
.stream_item_content(item_path, &compression)
.map_err(|e| anyhow::anyhow!("Failed to stream item {id}: {e}"))?;
// Convert OwnedFd to File — safe, takes ownership, closes on drop
let mut writer = std::fs::File::from(write_fd);
crate::common::stream_copy(&mut reader, |chunk| {
use std::io::Write;
writer.write_all(chunk)
})
.map_err(|e| anyhow::anyhow!("Error reading item {id}: {e}"))?;
// writer dropped here, closing write_fd → diff sees EOF
Ok(())
})
}
/// Runs external diff command, streaming decompressed content via /dev/fd pipes.
///
/// # Arguments
///
/// * `path_a` - Path to the first file.
/// * `path_b` - Path to the second file.
///
/// # Returns
///
/// * `Result<()>` - Success or error.
fn run_external_diff(path_a: &std::path::Path, path_b: &std::path::Path) -> anyhow::Result<()> {
/// Creates two pipes, spawns writer threads to decompress each item into its pipe,
/// and runs `diff -u /dev/fd/N /dev/fd/M` where N and M are the pipe read fds.
/// The `command-fds` crate handles CLOEXEC clearing safely — no unsafe needed.
fn run_external_diff(
item_service: &ItemService,
item_a: &crate::services::types::ItemWithMeta,
item_b: &crate::services::types::ItemWithMeta,
) -> Result<()> {
if which::which_global("diff").is_err() {
return Err(anyhow::anyhow!(
"diff command not found. Please install diffutils."
));
}
let mut child = std::process::Command::new("diff")
let (read_fd_a, write_fd_a) = create_pipe()?;
let (read_fd_b, write_fd_b) = create_pipe()?;
// Spawn writer threads — they take ownership of write fds and close them on exit
let writer_a = spawn_writer_thread(item_service, item_a, write_fd_a);
let writer_b = spawn_writer_thread(item_service, item_b, write_fd_b);
// Get fd numbers for /dev/fd paths (borrows, does not consume)
let raw_read_a = read_fd_a.as_raw_fd();
let raw_read_b = read_fd_b.as_raw_fd();
debug!("DIFF: pipe fds: a(r={raw_read_a}) b(r={raw_read_b})");
// Spawn diff with /dev/fd/N paths. command-fds handles CLOEXEC clearing
// and fd inheritance safely — the fds are released from OwnedFd to the
// child process. If spawn fails, the OwnedFd values in FdMapping are
// dropped and the fds are properly closed.
let mut command = std::process::Command::new("diff");
command
.arg("-u")
.arg(path_a)
.arg(path_b)
.arg(format!("/dev/fd/{raw_read_a}"))
.arg(format!("/dev/fd/{raw_read_b}"))
.stdout(std::process::Stdio::inherit())
.stderr(std::process::Stdio::inherit())
.spawn()
.context("Failed to spawn diff command")?;
.stdin(std::process::Stdio::null())
.fd_mappings(vec![
FdMapping {
parent_fd: read_fd_a,
child_fd: raw_read_a,
},
FdMapping {
parent_fd: read_fd_b,
child_fd: raw_read_b,
},
])
.map_err(|e| anyhow::anyhow!("FD mapping collision: {e}"))?;
let mut child = command.spawn().context("Failed to spawn diff command")?;
let status = child.wait().context("Failed to wait for diff command")?;
// diff returns 0 if files are identical, 1 if different, 2 on error
// Join writer threads and propagate errors
writer_a
.join()
.map_err(|e| anyhow::anyhow!("Writer A panicked: {e:?}"))??;
writer_b
.join()
.map_err(|e| anyhow::anyhow!("Writer B panicked: {e:?}"))??;
// diff returns 0 if identical, 1 if different, 2 on error
if status.code() == Some(2) {
Err(anyhow::anyhow!("diff command failed with an error"))
} else {
    Ok(())
}
}
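
The fd-passing arrangement in miniature — Unix-only, using nix and command-fds as the diff above does, but feeding literal strings instead of decompressed items:

use command_fds::{CommandFdExt, FdMapping};
use nix::fcntl::OFlag;
use nix::unistd::pipe2;
use std::io::Write;
use std::os::unix::io::AsRawFd;

fn main() -> anyhow::Result<()> {
    let (read_a, write_a) = pipe2(OFlag::O_CLOEXEC)?;
    let (read_b, write_b) = pipe2(OFlag::O_CLOEXEC)?;

    // Writer threads own the write ends and close them on exit (EOF for diff).
    let wa = std::thread::spawn(move || std::fs::File::from(write_a).write_all(b"alpha\n"));
    let wb = std::thread::spawn(move || std::fs::File::from(write_b).write_all(b"beta\n"));

    let (fd_a, fd_b) = (read_a.as_raw_fd(), read_b.as_raw_fd());
    let mut cmd = std::process::Command::new("diff");
    cmd.arg("-u")
        .arg(format!("/dev/fd/{fd_a}"))
        .arg(format!("/dev/fd/{fd_b}"))
        .fd_mappings(vec![
            FdMapping { parent_fd: read_a, child_fd: fd_a },
            FdMapping { parent_fd: read_b, child_fd: fd_b },
        ])
        .map_err(|e| anyhow::anyhow!("fd mapping collision: {e}"))?;
    let status = cmd.spawn()?.wait()?;
    wa.join().unwrap()?;
    wb.join().unwrap()?;
    // diff exit codes: 0 = identical, 1 = different, 2 = error
    println!("diff exited with {:?}", status.code());
    Ok(())
}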

145
src/modes/export.rs Normal file

@@ -0,0 +1,145 @@
use anyhow::{Context, Result, anyhow};
use chrono::Utc;
use clap::Command;
use log::debug;
use std::collections::HashMap;
use std::fs;
use std::path::PathBuf;
use crate::common::sanitize_ts_string;
use crate::config;
use crate::export_tar;
use crate::filter_plugin::FilterChain;
use crate::modes::common::sanitize_tags;
use crate::services::item_service::ItemService;
use crate::services::types::ItemWithMeta;
/// Export items to a `.keep.tar` archive.
///
/// Requires either IDs or tags (mutually exclusive). If IDs are given,
/// ALL must exist. Archives contain per-item data and metadata files.
pub fn mode_export(
cmd: &mut Command,
settings: &config::Settings,
ids: &[i64],
tags: &[String],
conn: &mut rusqlite::Connection,
data_path: PathBuf,
filter_chain: Option<FilterChain>,
) -> Result<()> {
// Validate: IDs XOR tags
if !ids.is_empty() && !tags.is_empty() {
cmd.error(
clap::error::ErrorKind::InvalidValue,
"Cannot use both IDs and tags with --export",
)
.exit();
}
if ids.is_empty() && tags.is_empty() {
cmd.error(
clap::error::ErrorKind::InvalidValue,
"Must provide either IDs or tags with --export",
)
.exit();
}
let item_service = ItemService::new(data_path.clone());
let meta_filter = settings.meta_filter();
// Resolve items
let items: Vec<ItemWithMeta> = if !ids.is_empty() {
// Fetch each ID individually; ALL must exist
let mut result = Vec::new();
for &id in ids {
match item_service.get_item(conn, id) {
Ok(item) => result.push(item),
Err(_) => {
cmd.error(
clap::error::ErrorKind::InvalidValue,
format!("Item {id} not found"),
)
.exit();
}
}
}
result
} else {
// Search by tags
item_service
.list_items(conn, tags, &meta_filter)
.map_err(|e| anyhow!("Unable to find matching items: {}", e))?
};
if items.is_empty() {
cmd.error(
clap::error::ErrorKind::InvalidValue,
"No items found matching the given criteria",
)
.exit();
}
// Validate: --export-filename-format doesn't use per-item vars with multiple items
if items.len() > 1 {
let fmt = &settings.export_filename_format;
if fmt.contains("{id}") || fmt.contains("{tags}") || fmt.contains("{compression}") {
cmd.error(
clap::error::ErrorKind::InvalidValue,
"Cannot use {id}, {tags}, or {compression} in --export-filename-format when exporting multiple items",
)
.exit();
}
}
// Compute export name
let dir_name = export_tar::export_name(&settings.export_name, &items);
// Compute tar filename from format template
let now = Utc::now();
let ts_str = sanitize_ts_string(&now.format("%Y-%m-%dT%H:%M:%SZ").to_string());
let mut vars = HashMap::new();
vars.insert("name".to_string(), dir_name.clone());
vars.insert("ts".to_string(), ts_str.clone());
// For single-item exports, also provide per-item vars
if items.len() == 1 {
let item = &items[0];
let item_id = item.item.id.context("Item missing ID")?;
let item_tags = item.tag_names();
vars.insert("id".to_string(), item_id.to_string());
vars.insert("tags".to_string(), sanitize_tags(&item_tags));
vars.insert("compression".to_string(), item.item.compression.clone());
}
let basename = strfmt::strfmt(&settings.export_filename_format, &vars).map_err(|e| {
anyhow!(
"Invalid export filename format '{}': {}",
settings.export_filename_format,
e
)
})?;
let tar_filename = format!("{basename}.keep.tar");
// Write the tar archive
let tar_file = fs::File::create(&tar_filename)
.with_context(|| format!("Cannot create tar file: {tar_filename}"))?;
export_tar::write_export_tar(
tar_file,
&dir_name,
&items,
&data_path,
filter_chain.as_ref(),
&item_service,
conn,
)?;
if !settings.quiet {
eprintln!("{tar_filename}");
}
debug!("EXPORT: Wrote {} items to {tar_filename}", items.len());
Ok(())
}


@@ -258,7 +258,7 @@ fn compression_description(name: &str) -> &str {
"bzip2" => "High compression (requires bzip2 binary)",
"xz" => "Very high compression (requires xz binary)",
"zstd" => "Modern fast compression (requires zstd binary)",
"none" => "No compression",
"raw" => "No compression (alias: none)",
_ => "",
}
}

src/modes/get.rs

@@ -52,7 +52,7 @@ pub fn mode_get(
let item_service = ItemService::new(data_path.clone());
let item_with_meta = item_service
.find_item(conn, ids, tags, &std::collections::HashMap::new())
.find_item(conn, ids, tags, &settings.meta_filter())
.map_err(|e| anyhow!("Unable to find matching item in database: {}", e))?;
let item_id = item_with_meta.item.id.context("Item missing ID")?;
@@ -103,13 +103,9 @@ pub fn mode_get(
fn stream_to_stdout(mut reader: Box<dyn Read + Send>) -> Result<()> {
let mut stdout = std::io::stdout();
let mut buffer = [0; PIPESIZE];
loop {
let bytes_read = reader.read(&mut buffer)?;
if bytes_read == 0 {
break;
}
stdout.write_all(&buffer[..bytes_read])?;
}
crate::common::stream_copy(&mut reader, |chunk| {
stdout.write_all(chunk)?;
Ok(())
})?;
Ok(())
}

192
src/modes/import.rs Normal file

@@ -0,0 +1,192 @@
use anyhow::{Context, Result, anyhow};
use chrono::{DateTime, Utc};
use clap::Command;
use log::debug;
use std::collections::HashMap;
use std::fs;
use std::io::{Read, Write};
use std::path::PathBuf;
use std::str::FromStr;
use crate::common::PIPESIZE;
use crate::compression_engine::CompressionType;
use crate::config;
use crate::db;
use crate::import_tar;
use crate::modes::common::ImportMeta;
/// Import items from a `.keep.tar` archive or legacy `.meta.yml` file.
///
/// For `.keep.tar` files, all items are imported in their original ID order,
/// each receiving a new auto-incremented ID from the database.
/// For `.meta.yml` files, the legacy single-item import is used.
pub fn mode_import(
cmd: &mut Command,
settings: &config::Settings,
import_path: &str,
conn: &mut rusqlite::Connection,
data_path: PathBuf,
) -> Result<()> {
let path = PathBuf::from(import_path);
if import_path.ends_with(".keep.tar") {
// New tar-based import
let imported_ids = import_tar::import_from_tar(&path, conn, &data_path)?;
if !settings.quiet {
println!(
"KEEP: Imported {} item(s): {:?}",
imported_ids.len(),
imported_ids
);
}
debug!(
"IMPORT: Imported {} items from {}",
imported_ids.len(),
import_path
);
} else if import_path.ends_with(".meta.yml") {
// Legacy single-item import
import_legacy(cmd, settings, import_path, conn, data_path)?;
} else {
cmd.error(
clap::error::ErrorKind::InvalidValue,
format!("Unsupported import format: {}", import_path),
)
.exit();
}
Ok(())
}
/// Legacy single-item import from a `.meta.yml` file.
fn import_legacy(
cmd: &mut Command,
settings: &config::Settings,
meta_file: &str,
conn: &mut rusqlite::Connection,
data_path: PathBuf,
) -> Result<()> {
// Read metadata
let meta_yaml = fs::read_to_string(meta_file)
.with_context(|| format!("Cannot read metadata file: {meta_file}"))?;
let import_meta: ImportMeta = serde_yaml::from_str(&meta_yaml)
.with_context(|| format!("Cannot parse metadata file: {meta_file}"))?;
// Validate compression type
CompressionType::from_str(&import_meta.compression).map_err(|_| {
anyhow!(
"Invalid compression type '{}' in metadata file",
import_meta.compression
)
})?;
debug!(
"IMPORT: Parsed meta: ts={}, compression={}, tags={:?}",
import_meta.ts, import_meta.compression, import_meta.tags
);
// Create item with original timestamp
let item = db::insert_item_with_ts(conn, import_meta.ts, &import_meta.compression)?;
let item_id = item.id.context("New item missing ID")?;
debug!(
"IMPORT: Created item {} with compression {}",
item_id, import_meta.compression
);
// Set tags
if !import_meta.tags.is_empty() {
db::set_item_tags(conn, item.clone(), &import_meta.tags)?;
debug!("IMPORT: Set {} tags", import_meta.tags.len());
}
// Write data to storage using streaming copy
let mut item_path = data_path;
item_path.push(item_id.to_string());
let data_size: i64 = if let Some(ref data_file) = settings.import_data_file {
// Stream from file to storage using fixed-size buffers
let mut reader = fs::File::open(data_file)
.with_context(|| format!("Cannot read data file: {}", data_file.display()))?;
let mut writer = fs::File::create(&item_path)
.with_context(|| format!("Cannot create item file: {}", item_path.display()))?;
let mut buf = [0u8; PIPESIZE];
let mut total = 0i64;
loop {
let n = reader.read(&mut buf)?;
if n == 0 {
break;
}
writer.write_all(&buf[..n])?;
total += n as i64;
}
total
} else {
// Stream from stdin to storage
let mut writer = fs::File::create(&item_path)
.with_context(|| format!("Cannot create item file: {}", item_path.display()))?;
let mut stdin = std::io::stdin().lock();
let mut buf = [0u8; PIPESIZE];
let mut total = 0i64;
loop {
let n = stdin.read(&mut buf)?;
if n == 0 {
break;
}
writer.write_all(&buf[..n])?;
total += n as i64;
}
total
};
if data_size == 0 {
cmd.error(
clap::error::ErrorKind::InvalidValue,
"No data provided (empty file or stdin)",
)
.exit();
}
debug!(
"IMPORT: Wrote {} bytes to {}",
data_size,
item_path.display()
);
// Set metadata
for (key, value) in &import_meta.metadata {
db::query_upsert_meta(
conn,
db::Meta {
id: item_id,
name: key.clone(),
value: value.clone(),
},
)?;
}
if !import_meta.metadata.is_empty() {
debug!(
"IMPORT: Set {} metadata entries",
import_meta.metadata.len()
);
}
// Update item sizes (use imported size if available, otherwise data length)
let size_to_record = import_meta.uncompressed_size.unwrap_or(data_size);
let mut updated_item = item;
updated_item.uncompressed_size = Some(size_to_record);
updated_item.compressed_size = Some(std::fs::metadata(&item_path)?.len() as i64);
updated_item.closed = true;
db::update_item(conn, updated_item)?;
if !settings.quiet {
println!(
"KEEP: Imported item {} tags: {:?}",
item_id, import_meta.tags
);
}
Ok(())
}
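
Both copy loops above follow the fixed-buffer pattern that crate::common::stream_copy wraps at other call sites in this changeset. A plausible shape for that helper, with the signature inferred from the call sites rather than copied from the source:

use std::io::Read;

// Inferred helper shape: read in fixed-size chunks, hand each to a callback.
fn stream_copy<R: Read>(
    reader: &mut R,
    mut sink: impl FnMut(&[u8]) -> std::io::Result<()>,
) -> std::io::Result<u64> {
    const PIPESIZE: usize = 65536; // assumed; the real constant lives in crate::common
    let mut buf = [0u8; PIPESIZE];
    let mut total = 0u64;
    loop {
        let n = reader.read(&mut buf)?;
        if n == 0 {
            return Ok(total);
        }
        sink(&buf[..n])?;
        total += n as u64;
    }
}

fn main() -> std::io::Result<()> {
    let mut src: &[u8] = b"streamed in fixed-size chunks";
    let mut out = Vec::new();
    let copied = stream_copy(&mut src, |chunk| {
        out.extend_from_slice(chunk);
        Ok(())
    })?;
    assert_eq!(copied as usize, out.len());
    Ok(())
}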

src/modes/info.rs

@@ -1,5 +1,5 @@
use crate::config;
use crate::modes::common::{OutputFormat, format_size};
use crate::modes::common::{DisplayItemInfo, OutputFormat, format_size, render_item_info_table};
use crate::services::types::ItemWithMeta;
use anyhow::{Context, Result, anyhow};
use clap::Command;
@@ -9,7 +9,6 @@ use std::path::PathBuf;
use crate::services::item_service::ItemService;
use chrono::prelude::*;
use comfy_table::{Attribute, Cell};
/// Displays detailed information about an item or the last item if no ID/tags specified.
///
@@ -65,9 +64,8 @@ pub fn mode_info(
// If both are empty, find_item will find the last item
let item_service = ItemService::new(data_path.clone());
// Use empty metadata HashMap
let item_with_meta = item_service
.find_item(conn, ids, tags, &std::collections::HashMap::new())
.find_item(conn, ids, tags, &settings.meta_filter())
.map_err(|e| anyhow!("Unable to find matching item in database: {}", e))?;
show_item(item_with_meta, settings, data_path)
@@ -140,77 +138,44 @@ fn show_item(
return show_item_structured(item_with_meta, settings, data_path, output_format);
}
let item_tags = item_with_meta.tag_names();
let item = item_with_meta.item;
let item_id = item.id.context("Item missing ID")?;
let item_tags: Vec<String> = item_with_meta.tags.iter().map(|t| t.name.clone()).collect();
let mut table = crate::modes::common::create_table(false);
// Add all the rows
table.add_row(vec![
Cell::new("ID").add_attribute(Attribute::Bold),
Cell::new(item_id.to_string()),
]);
let timestamp_str = item.ts.with_timezone(&Local).format("%F %T %Z").to_string();
table.add_row(vec![
Cell::new("Timestamp").add_attribute(Attribute::Bold),
Cell::new(&timestamp_str),
]);
let mut item_path_buf = data_path.clone();
item_path_buf.push(item_id.to_string());
let path_str = item_path_buf
.to_str()
.expect("Unable to get item path")
.to_string();
table.add_row(vec![
Cell::new("Path").add_attribute(Attribute::Bold),
Cell::new(&path_str),
]);
let size_str = match item.size {
let size_str = match item.uncompressed_size {
Some(size) => format_size(size as u64, settings.human_readable),
None => "Missing".to_string(),
};
table.add_row(vec![
Cell::new("Stream Size").add_attribute(Attribute::Bold),
Cell::new(&size_str),
]);
table.add_row(vec![
Cell::new("Compression").add_attribute(Attribute::Bold),
Cell::new(&item.compression),
]);
let file_size_str = match item_path_buf.metadata() {
Ok(metadata) => format_size(metadata.len(), settings.human_readable),
Err(_) => "Missing".to_string(),
};
table.add_row(vec![
Cell::new("File Size").add_attribute(Attribute::Bold),
Cell::new(&file_size_str),
]);
let tags_str = item_tags.join(" ");
table.add_row(vec![
Cell::new("Tags").add_attribute(Attribute::Bold),
Cell::new(&tags_str),
]);
let metadata: Vec<(String, String)> = item_with_meta
.meta
.iter()
.map(|m| (m.name.clone(), m.value.clone()))
.collect();
// Add meta rows
for meta in item_with_meta.meta {
let meta_name = format!("Meta: {}", &meta.name);
table.add_row(vec![
Cell::new(&meta_name).add_attribute(Attribute::Bold),
Cell::new(&meta.value),
]);
}
let display = DisplayItemInfo {
id: item_id,
timestamp: item.ts.with_timezone(&Local).format("%F %T %Z").to_string(),
path: item_path_buf
.to_str()
.ok_or_else(|| anyhow::anyhow!("non-UTF-8 item path"))?
.to_string(),
stream_size: size_str,
compression: item.compression.clone(),
file_size: file_size_str,
tags: item_tags,
metadata,
};
println!(
"{}",
crate::modes::common::trim_lines_end(&table.trim_fmt())
);
render_item_info_table(&display, &settings.table_config);
Ok(())
}
@@ -246,7 +211,7 @@ fn show_item_structured(
data_path: PathBuf,
output_format: OutputFormat,
) -> Result<()> {
let item_tags: Vec<String> = item_with_meta.tags.iter().map(|t| t.name.clone()).collect();
let item_tags = item_with_meta.tag_names();
let meta_map = item_with_meta.meta_as_map();
let item = item_with_meta.item;
let item_id = item.id.context("Item missing ID")?;
@@ -260,7 +225,7 @@ fn show_item_structured(
None => "Missing".to_string(),
};
let stream_size_formatted = match item.size {
let stream_size_formatted = match item.uncompressed_size {
Some(size) => format_size(size as u64, settings.human_readable),
None => "Missing".to_string(),
};
@@ -273,7 +238,7 @@ fn show_item_structured(
.format("%F %T %Z")
.to_string(),
path: item_path_buf.to_str().unwrap_or("").to_string(),
stream_size: item.size.map(|s| s as u64),
stream_size: item.uncompressed_size.map(|s| s as u64),
stream_size_formatted,
compression: item.compression,
file_size,
@@ -282,15 +247,7 @@ fn show_item_structured(
meta: meta_map,
};
match output_format {
OutputFormat::Json => {
println!("{}", serde_json::to_string_pretty(&item_info)?);
}
OutputFormat::Yaml => {
println!("{}", serde_yaml::to_string(&item_info)?);
}
OutputFormat::Table => unreachable!(),
}
crate::modes::common::print_serialized(&item_info, &output_format)?;
Ok(())
}

View File

@@ -5,7 +5,7 @@
/// including table, JSON, and YAML.
use crate::config;
use crate::modes::common::ColumnType;
use crate::modes::common::{OutputFormat, format_size};
use crate::modes::common::{OutputFormat, apply_color, apply_table_attribute, format_size};
use crate::services::item_service::ItemService;
use crate::services::types::ItemWithMeta;
use anyhow::{Context, Result};
@@ -63,88 +63,6 @@ struct ListItem {
meta: std::collections::HashMap<String, String>,
}
/// Helper function to apply color to a cell.
///
/// This function converts the configuration color to a comfy-table Color and
/// applies it to the cell as foreground or background color.
///
/// # Arguments
///
/// * `cell` - The cell to modify.
/// * `color` - The color from configuration to apply.
/// * `is_foreground` - True for foreground color, false for background.
///
/// # Returns
///
/// The modified cell with color applied.
fn apply_color(mut cell: Cell, color: &crate::config::TableColor, is_foreground: bool) -> Cell {
use crate::config::TableColor::*;
use comfy_table::Color;
let comfy_color = match color {
Black => Color::Black,
Red => Color::Red,
Green => Color::Green,
Yellow => Color::Yellow,
Blue => Color::Blue,
Magenta => Color::Magenta,
Cyan => Color::Cyan,
White => Color::White,
Gray => Color::Grey,
DarkRed => Color::DarkRed,
DarkGreen => Color::DarkGreen,
DarkYellow => Color::DarkYellow,
DarkBlue => Color::DarkBlue,
DarkMagenta => Color::DarkMagenta,
DarkCyan => Color::DarkCyan,
Rgb(r, g, b) => Color::Rgb {
r: *r,
g: *g,
b: *b,
},
};
if is_foreground {
cell = cell.fg(comfy_color);
} else {
cell = cell.bg(comfy_color);
}
cell
}
/// Helper function to apply attribute to a cell.
///
/// This function applies a single table attribute to the cell based on the
/// configuration attribute type.
///
/// # Arguments
///
/// * `cell` - The cell to modify.
/// * `attribute` - The attribute from configuration to apply.
///
/// # Returns
///
/// The modified cell with attribute applied.
fn apply_attribute(mut cell: Cell, attribute: &crate::config::TableAttribute) -> Cell {
use crate::config::TableAttribute::*;
use comfy_table::Attribute;
match attribute {
Bold => cell = cell.add_attribute(Attribute::Bold),
Dim => cell = cell.add_attribute(Attribute::Dim),
Italic => cell = cell.add_attribute(Attribute::Italic),
Underlined => cell = cell.add_attribute(Attribute::Underlined),
SlowBlink => cell = cell.add_attribute(Attribute::SlowBlink),
RapidBlink => cell = cell.add_attribute(Attribute::RapidBlink),
Reverse => cell = cell.add_attribute(Attribute::Reverse),
Hidden => cell = cell.add_attribute(Attribute::Hidden),
CrossedOut => cell = cell.add_attribute(Attribute::CrossedOut),
}
cell
}
/// Main list mode function.
///
/// This function handles the listing of items based on tags, applying formatting
@@ -163,23 +81,24 @@ fn apply_attribute(mut cell: Cell, attribute: &crate::config::TableAttribute) ->
///
/// * `Result<()>` - Success or error if listing fails.
pub fn mode_list(
cmd: &mut clap::Command,
_cmd: &mut clap::Command,
settings: &config::Settings,
ids: &mut [i64],
tags: &[String],
conn: &mut rusqlite::Connection,
data_path: std::path::PathBuf,
) -> Result<()> {
if !ids.is_empty() {
cmd.error(
clap::error::ErrorKind::InvalidValue,
"ID given, you can only supply tags when using --list",
)
.exit();
}
let item_service = ItemService::new(data_path.clone());
let items_with_meta = item_service.list_items(conn, tags, &std::collections::HashMap::new())?;
let items_with_meta = item_service.get_items(conn, ids, tags, &settings.meta_filter())?;
if settings.ids_only {
for item_with_meta in &items_with_meta {
if let Some(id) = item_with_meta.item.id {
println!("{id}");
}
}
return Ok(());
}
let output_format = crate::modes::common::settings_output_format(settings);
@@ -197,7 +116,7 @@ pub fn mode_list(
table.set_header(header_cells);
for item_with_meta in items_with_meta {
let tags: Vec<String> = item_with_meta.tags.iter().map(|t| t.name.clone()).collect();
let tags = item_with_meta.tag_names();
let meta = item_with_meta.meta_as_map();
let item = item_with_meta.item;
@@ -228,19 +147,29 @@ pub fn mode_list(
.with_timezone(&chrono::Local)
.format("%F %T")
.to_string(),
ColumnType::Size => match item.size {
ColumnType::Size => match item.uncompressed_size {
Some(size) => format_size(size as u64, settings.human_readable),
None => match item_path.metadata() {
Ok(_) => "Unknown".to_string(),
Err(_) => "Missing".to_string(),
Err(e) => {
log::warn!("File missing or inaccessible: {}", e);
"Missing".to_string()
}
},
},
ColumnType::Compression => item.compression.to_string(),
ColumnType::FileSize => match item_path.metadata() {
Ok(metadata) => format_size(metadata.len(), settings.human_readable),
Err(_) => "Missing".to_string(),
Err(e) => {
log::warn!("File missing or inaccessible: {}", e);
"Missing".to_string()
}
},
ColumnType::FilePath => item_path.clone().into_os_string().into_string().unwrap(),
ColumnType::FilePath => item_path
.clone()
.into_os_string()
.into_string()
.unwrap_or_else(|os| os.to_string_lossy().into_owned()),
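// Non-UTF-8 paths degrade to a lossy string here instead of panicking.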
ColumnType::Tags => tags.join(" "),
ColumnType::Meta => match meta_name {
Some(meta_name) => match meta.get(meta_name) {
@@ -278,7 +207,7 @@ pub fn mode_list(
}
for attribute in &column.attributes {
cell = apply_attribute(cell, attribute);
cell = apply_table_attribute(cell, attribute);
}
// Apply padding if specified
@@ -290,7 +219,7 @@ pub fn mode_list(
// Apply styling for specific cases
match column_type {
ColumnType::Size => {
if item.size.is_none() {
if item.uncompressed_size.is_none() {
if item_path.metadata().is_ok() {
cell = cell
.fg(comfy_table::Color::Yellow)
@@ -340,7 +269,7 @@ fn show_list_structured(
let mut list_items = Vec::new();
for item_with_meta in items_with_meta {
let tags: Vec<String> = item_with_meta.tags.iter().map(|t| t.name.clone()).collect();
let tags = item_with_meta.tag_names();
let meta = item_with_meta.meta_as_map();
let item = item_with_meta.item;
let item_id = item.id.context("Item missing ID")?;
@@ -354,7 +283,7 @@ fn show_list_structured(
None => "Missing".to_string(),
};
let size_formatted = match item.size {
let size_formatted = match item.uncompressed_size {
Some(size) => crate::modes::common::format_size(size as u64, settings.human_readable),
None => "Unknown".to_string(),
};
@@ -366,7 +295,7 @@ fn show_list_structured(
.with_timezone(&chrono::Local)
.format("%F %T")
.to_string(),
size: item.size.map(|s| s as u64),
size: item.uncompressed_size.map(|s| s as u64),
size_formatted,
compression: item.compression,
file_size,
@@ -379,15 +308,7 @@ fn show_list_structured(
list_items.push(list_item);
}
match output_format {
OutputFormat::Json => {
println!("{}", serde_json::to_string_pretty(&list_items)?);
}
OutputFormat::Yaml => {
println!("{}", serde_yaml::to_string(&list_items)?);
}
OutputFormat::Table => unreachable!(),
}
crate::modes::common::print_serialized(&list_items, &output_format)?;
Ok(())
}

View File

@@ -9,13 +9,16 @@ pub mod common;
pub mod delete;
pub mod diff;
pub mod export;
pub mod generate_config;
pub mod get;
pub mod import;
pub mod info;
pub mod list;
pub mod save;
pub mod status;
pub mod status_plugins;
pub mod update;
/// Column types, output formats, and formatting utilities shared across modes.
pub use common::{ColumnType, OutputFormat, format_size, settings_output_format};
@@ -26,12 +29,18 @@ pub use delete::mode_delete;
/// Compares two items and shows differences.
pub use diff::mode_diff;
/// Exports an item to data and metadata files.
pub use export::mode_export;
/// Generates a default configuration file.
pub use generate_config::mode_generate_config;
/// Retrieves and outputs item content.
pub use get::mode_get;
/// Imports an item from metadata and data files.
pub use import::mode_import;
/// Displays detailed information about items.
pub use info::mode_info;
@@ -50,3 +59,6 @@ pub use status::mode_status;
/// Lists available plugins and their configurations.
pub use status_plugins::mode_status_plugins;
/// Updates an item's tags and metadata by ID.
pub use update::mode_update;

File diff suppressed because it is too large

View File

@@ -1,72 +0,0 @@
use axum::{
extract::State,
http::StatusCode,
response::sse::{Event, KeepAlive, Sse},
};
use futures::stream::{self, Stream};
use log::{debug, info};
use std::convert::Infallible;
use std::time::Duration;
use crate::modes::server::common::AppState;
use crate::modes::server::mcp::KeepMcpServer;
#[utoipa::path(
get,
path = "/mcp/sse",
operation_id = "mcp_sse",
summary = "MCP SSE endpoint",
description = "Server-Sent Events for Model Context Protocol. Enables AI tools to interact with Keep's storage and retrieval functions.",
responses(
(status = 200, description = "SSE stream established"),
(status = 401, description = "Unauthorized"),
(status = 500, description = "Internal server error")
),
security(
("bearerAuth" = [])
),
tag = "mcp"
)]
pub async fn handle_mcp_sse(
State(state): State<AppState>,
) -> Result<Sse<impl Stream<Item = Result<Event, Infallible>>>, StatusCode> {
debug!("MCP: Starting SSE endpoint");
let _mcp_server = KeepMcpServer::new(state);
// Create a simple message channel for SSE communication
let (tx, rx) = tokio::sync::mpsc::unbounded_channel::<String>();
// Send initial connection message
let _ = tx.send("data: {\"type\":\"connection\",\"status\":\"connected\"}\n\n".to_string());
// For now, create a simple stream that sends periodic keep-alive messages
// In a full implementation, this would integrate with the rmcp transport layer
let stream = stream::unfold((rx, tx), |(mut rx, tx)| async move {
tokio::select! {
msg = rx.recv() => {
match msg {
Some(data) => {
let event = Event::default().data(data);
Some((Ok(event), (rx, tx)))
}
None => None,
}
}
_ = tokio::time::sleep(Duration::from_secs(30)) => {
let event = Event::default()
.event("keep-alive")
.data("ping");
Some((Ok(event), (rx, tx)))
}
}
});
info!("MCP: SSE endpoint established");
Ok(Sse::new(stream).keep_alive(
KeepAlive::new()
.interval(Duration::from_secs(30))
.text("keep-alive"),
))
}

View File

@@ -1,12 +1,10 @@
pub mod common;
pub mod item;
#[cfg(feature = "mcp")]
pub mod mcp;
pub mod status;
use axum::{
Router,
routing::{delete, get},
routing::{delete, get, post},
};
use crate::modes::server::common::AppState;
@@ -60,8 +58,7 @@ use utoipa_swagger_ui::SwaggerUi;
struct ApiDoc;
pub fn add_routes(router: Router<AppState>) -> Router<AppState> {
#[cfg_attr(not(feature = "mcp"), allow(unused_mut))]
let mut router = router
router
// Status endpoints
.route("/api/status", get(status::handle_status))
.route("/api/plugins/status", get(status::handle_plugins_status))
@@ -88,14 +85,10 @@ pub fn add_routes(router: Router<AppState>) -> Router<AppState> {
)
.route("/api/item/{item_id}", delete(item::handle_delete_item))
.route("/api/item/{item_id}/info", get(item::handle_get_item_info))
.route("/api/diff", get(item::handle_diff_items));
#[cfg(feature = "mcp")]
{
router = router.route("/mcp/sse", get(mcp::handle_mcp_sse));
}
router
.route("/api/item/{item_id}/update", post(item::handle_update_item))
.route("/api/diff", get(item::handle_diff_items))
.route("/api/export", get(item::handle_export_items))
.route("/api/import", post(item::handle_import_items))
}
#[cfg(feature = "swagger")]

View File

@@ -2,6 +2,32 @@ use axum::{extract::State, http::StatusCode, response::Json};
use crate::modes::server::common::{ApiResponse, AppState, StatusInfoResponse};
async fn generate_status(
state: &AppState,
) -> Result<crate::common::status::StatusInfo, StatusCode> {
let db_path = state
.db
.lock()
.await
.path()
.unwrap_or("unknown")
.to_string();
let status_service = crate::services::status_service::StatusService::new();
let mut cmd = state.cmd.lock().await;
status_service
.generate_status(
&mut cmd,
&state.settings,
state.data_dir.clone(),
db_path.into(),
)
.map_err(|e| {
log::warn!("Failed to generate status: {e}");
StatusCode::INTERNAL_SERVER_ERROR
})
}
#[utoipa::path(
get,
path = "/api/status",
@@ -39,7 +65,7 @@ use crate::modes::server::common::{ApiResponse, AppState, StatusInfoResponse};
///
/// # Examples
///
/// ```
/// ```ignore
/// // In an Axum app:
/// async fn app() -> Result<Json<StatusInfoResponse>, StatusCode> {
/// handle_status(State(app_state)).await
@@ -48,24 +74,7 @@ use crate::modes::server::common::{ApiResponse, AppState, StatusInfoResponse};
pub async fn handle_status(
State(state): State<AppState>,
) -> Result<Json<StatusInfoResponse>, StatusCode> {
// Get database path
let db_path = state
.db
.lock()
.await
.path()
.unwrap_or("unknown")
.to_string();
// Use the status service to generate status info showing configured plugins
let status_service = crate::services::status_service::StatusService::new();
let mut cmd = state.cmd.lock().await;
let status_info = status_service.generate_status(
&mut cmd,
&state.settings,
state.data_dir.clone(),
db_path.into(),
);
let status_info = generate_status(&state).await?;
let response = StatusInfoResponse {
success: true,
@@ -102,22 +111,7 @@ pub struct PluginsStatusResponse {
pub async fn handle_plugins_status(
State(state): State<AppState>,
) -> Result<Json<crate::modes::server::common::ApiResponse<PluginsStatusResponse>>, StatusCode> {
let db_path = state
.db
.lock()
.await
.path()
.unwrap_or("unknown")
.to_string();
let status_service = crate::services::status_service::StatusService::new();
let mut cmd = state.cmd.lock().await;
let status_info = status_service.generate_status(
&mut cmd,
&state.settings,
state.data_dir.clone(),
db_path.into(),
);
let status_info = generate_status(&state).await?;
let response_data = PluginsStatusResponse {
meta_plugins: status_info.meta_plugins,

View File

@@ -116,116 +116,3 @@ pub fn validate_jwt(token: &str, secret: &str) -> Result<Claims, String> {
Ok(token_data.claims)
}
#[cfg(test)]
mod tests {
use super::*;
use jsonwebtoken::{EncodingKey, Header, encode};
fn make_token(claims: &serde_json::Value, secret: &str) -> String {
let header = Header::new(jsonwebtoken::Algorithm::HS256);
encode(
&header,
claims,
&EncodingKey::from_secret(secret.as_bytes()),
)
.unwrap()
}
#[test]
fn test_validate_jwt_valid_token() {
let secret = "test-secret";
let claims = serde_json::json!({
"sub": "test-client",
"exp": 9999999999usize,
"read": true,
"write": true,
"delete": false
});
let token = make_token(&claims, secret);
let result = validate_jwt(&token, secret);
assert!(result.is_ok());
let claims = result.unwrap();
assert_eq!(claims.sub, "test-client");
assert!(claims.read);
assert!(claims.write);
assert!(!claims.delete);
}
#[test]
fn test_validate_jwt_expired_token() {
let secret = "test-secret";
let claims = serde_json::json!({
"sub": "test-client",
"exp": 1000000000usize,
"read": true
});
let token = make_token(&claims, secret);
let result = validate_jwt(&token, secret);
assert!(result.is_err());
assert_eq!(result.unwrap_err(), "Token expired");
}
#[test]
fn test_validate_jwt_wrong_secret() {
let claims = serde_json::json!({
"sub": "test-client",
"exp": 9999999999usize,
"read": true
});
let token = make_token(&claims, "correct-secret");
let result = validate_jwt(&token, "wrong-secret");
assert!(result.is_err());
}
#[test]
fn test_validate_jwt_malformed_token() {
let result = validate_jwt("not.a.jwt", "secret");
assert!(result.is_err());
}
#[test]
fn test_required_permission() {
assert_eq!(required_permission(&Method::GET), "read");
assert_eq!(required_permission(&Method::HEAD), "read");
assert_eq!(required_permission(&Method::POST), "write");
assert_eq!(required_permission(&Method::PUT), "write");
assert_eq!(required_permission(&Method::PATCH), "write");
assert_eq!(required_permission(&Method::DELETE), "delete");
}
#[test]
fn test_check_permission() {
let claims = Claims {
sub: "test".to_string(),
exp: 9999999999,
read: true,
write: false,
delete: true,
};
assert!(check_permission(&claims, "read"));
assert!(!check_permission(&claims, "write"));
assert!(check_permission(&claims, "delete"));
assert!(!check_permission(&claims, "unknown"));
}
#[test]
fn test_check_permission_default_false() {
// When fields are missing from JSON, serde(default) makes them false
let secret = "test-secret";
let claims = serde_json::json!({
"sub": "test-client",
"exp": 9999999999usize
});
let token = make_token(&claims, secret);
let claims = validate_jwt(&token, secret).unwrap();
assert!(!claims.read);
assert!(!claims.write);
assert!(!claims.delete);
}
}

View File

@@ -1,4 +1,5 @@
use crate::services::item_service::ItemService;
use crate::services::types::ItemWithMeta;
/// Common utilities and types for the server module.
///
/// This module provides shared structures, functions, and middleware used across
@@ -7,10 +8,10 @@ use crate::services::item_service::ItemService;
///
/// # Usage
///
/// ```rust
/// ```rust,ignore
/// // Illustrative — requires runtime values (db connection, settings).
/// use keep::modes::server::common::{ServerConfig, AppState};
/// let config = ServerConfig { address: "127.0.0.1".to_string(), ..Default::default() };
/// let state = AppState { /* ... */ };
/// let config = ServerConfig { address: "127.0.0.1".to_string(), port: Some(8080), /* ... */ };
/// ```
use anyhow::Result;
use axum::{
@@ -38,7 +39,8 @@ use utoipa::ToSchema;
///
/// # Examples
///
/// ```
/// ```rust
/// use keep::modes::server::common::ServerConfig;
/// let config = ServerConfig {
/// address: "127.0.0.1".to_string(),
/// port: Some(8080),
@@ -105,7 +107,8 @@ pub struct ServerConfig {
///
/// # Examples
///
/// ```rust
/// ```rust,ignore
/// // AppState requires runtime values (db connection, settings) not available in doctests.
/// use keep::modes::server::common::AppState;
/// use std::sync::Arc;
/// use tokio::sync::Mutex;
@@ -155,9 +158,9 @@ pub struct AppState {
///
/// ```rust
/// use keep::modes::server::common::ApiResponse;
/// let response: ApiResponse<Vec<ItemInfo>> = ApiResponse {
/// let response: ApiResponse<String> = ApiResponse {
/// success: true,
/// data: Some(items),
/// data: Some("items".to_string()),
/// error: None,
/// };
/// ```
@@ -180,6 +183,26 @@ pub struct ApiResponse<T> {
pub error: Option<String>,
}
impl<T> ApiResponse<T> {
/// Creates a successful API response with the given data.
pub fn ok(data: T) -> Self {
Self {
success: true,
data: Some(data),
error: None,
}
}
/// Creates a successful API response with no data.
pub fn empty() -> Self {
Self {
success: true,
data: None,
error: None,
}
}
}
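// Usage sketch for the helpers above (hypothetical handler tails):
//     Ok(Json(ApiResponse::ok(response_data)))   // success with payload
//     Ok(Json(ApiResponse::<()>::empty()))       // success, no payload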
/// Response type for list of item information.
///
/// Specialized response for endpoints that return multiple items.
@@ -190,7 +213,7 @@ pub struct ApiResponse<T> {
/// use keep::modes::server::common::ItemInfoListResponse;
/// let response = ItemInfoListResponse {
/// success: true,
/// data: Some(vec![item_info]),
/// data: Some(vec![]),
/// error: None,
/// };
/// ```
@@ -220,7 +243,7 @@ pub struct ItemInfoListResponse {
/// use keep::modes::server::common::ItemInfoResponse;
/// let response = ItemInfoResponse {
/// success: true,
/// data: Some(item_info),
/// data: None,
/// error: None,
/// };
/// ```
@@ -250,7 +273,7 @@ pub struct ItemInfoResponse {
/// use keep::modes::server::common::ItemContentInfoResponse;
/// let response = ItemContentInfoResponse {
/// success: true,
/// data: Some(content_info),
/// data: None,
/// error: None,
/// };
/// ```
@@ -280,7 +303,7 @@ pub struct ItemContentInfoResponse {
/// use keep::modes::server::common::MetadataResponse;
/// let response = MetadataResponse {
/// success: true,
/// data: Some(meta_map),
/// data: None,
/// error: None,
/// };
/// ```
@@ -310,7 +333,7 @@ pub struct MetadataResponse {
/// use keep::modes::server::common::StatusInfoResponse;
/// let response = StatusInfoResponse {
/// success: true,
/// data: Some(status_info),
/// data: None,
/// error: None,
/// };
/// ```
@@ -343,10 +366,13 @@ pub struct StatusInfoResponse {
/// let item_info = ItemInfo {
/// id: 42,
/// ts: "2023-12-01T15:30:45Z".to_string(),
/// size: Some(1024),
/// uncompressed_size: Some(1024),
/// compressed_size: Some(512),
/// closed: true,
/// compression: "gzip".to_string(),
/// tags: vec!["important".to_string()],
/// metadata: HashMap::from([("mime_type".to_string(), "text/plain".to_string())]),
/// file_size: Some(512),
/// };
/// ```
#[derive(Serialize, Deserialize, ToSchema)]
@@ -362,11 +388,19 @@ pub struct ItemInfo {
/// The creation timestamp of the item in ISO 8601 format.
#[schema(example = "2023-12-01T15:30:45Z")]
pub ts: String,
/// Size in bytes.
/// Uncompressed size in bytes.
///
/// The size of the item's content in bytes, may be None if not set.
/// The uncompressed size of the item's content in bytes, may be None if not set.
#[schema(example = 1024)]
pub size: Option<i64>,
pub uncompressed_size: Option<i64>,
/// Compressed size in bytes.
///
/// The compressed file size on disk in bytes, may be None if not set.
#[schema(example = 512)]
pub compressed_size: Option<i64>,
/// Whether the item has been fully written and closed.
#[schema(example = true)]
pub closed: bool,
/// Compression type.
///
/// The compression algorithm used for the item's content.
@@ -382,6 +416,56 @@ pub struct ItemInfo {
/// Key-value pairs containing additional metadata about the item.
#[schema(example = json!({"mime_type": "text/plain", "mime_encoding": "utf-8", "line_count": "42"}))]
pub metadata: HashMap<String, String>,
/// Actual file size in bytes.
///
/// The filesystem-reported size of the item's data file. This may differ from
/// `compressed_size` if the file has been written but the database row hasn't
/// been updated yet.
/// None if the file cannot be read (e.g., file not found, permission denied).
#[schema(example = 512)]
pub file_size: Option<i64>,
}
impl ItemInfo {
/// Enriches this `ItemInfo` with the actual filesystem-reported size.
///
/// Reads the size of the item's data file from disk and sets `file_size`.
/// If the file cannot be read, `file_size` is left as None.
///
/// # Arguments
///
/// * `data_dir` - The data directory path containing item files.
///
/// # Returns
///
/// A new `ItemInfo` with `file_size` populated from the filesystem.
pub fn with_file_size(mut self, data_dir: &std::path::Path) -> Self {
let item_path = data_dir.join(self.id.to_string());
self.file_size = std::fs::metadata(&item_path).map(|m| m.len() as i64).ok();
self
}
}
impl TryFrom<ItemWithMeta> for ItemInfo {
type Error = anyhow::Error;
fn try_from(item_with_meta: ItemWithMeta) -> Result<Self, Self::Error> {
let tags = item_with_meta.tag_names();
let metadata = item_with_meta.meta_as_map();
Ok(ItemInfo {
id: item_with_meta
.item
.id
.ok_or_else(|| anyhow::anyhow!("Item missing ID"))?,
ts: item_with_meta.item.ts.to_rfc3339(),
uncompressed_size: item_with_meta.item.uncompressed_size,
compressed_size: item_with_meta.item.compressed_size,
closed: item_with_meta.item.closed,
compression: item_with_meta.item.compression,
tags,
metadata,
file_size: None,
})
}
}
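// Typical flow implied by the two impls above (sketch):
//     let info = ItemInfo::try_from(item_with_meta)?.with_file_size(&data_dir);
// TryFrom leaves file_size as None; with_file_size then fills it from disk.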
/// Item information including content and metadata, with binary detection.
@@ -448,14 +532,20 @@ pub struct TagsQuery {
/// ```rust
/// use keep::modes::server::common::ListItemsQuery;
/// let query = ListItemsQuery {
/// ids: None,
/// tags: Some("important".to_string()),
/// order: Some("newest".to_string()),
/// start: Some(0),
/// count: Some(10),
/// meta: None,
/// };
/// ```
#[derive(Debug, Deserialize)]
pub struct ListItemsQuery {
/// Optional comma-separated item IDs for filtering.
///
/// String containing numeric IDs to filter the item list.
pub ids: Option<String>,
/// Optional comma-separated tags for filtering.
///
/// String containing tags to filter the item list.
@@ -472,6 +562,11 @@ pub struct ListItemsQuery {
///
/// Unsigned integer limiting the number of items returned.
pub count: Option<u32>,
/// Optional metadata filter as JSON string.
///
/// JSON object where keys are metadata keys and values are either
/// `null` (filter by key existence) or a string (filter by exact value match).
pub meta: Option<String>,
}
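// Sketch: decoding the `meta` JSON string above into a key -> optional-value
// filter (null = key must merely exist, string = exact value match). This is
// a hypothetical helper for illustration; the server's actual parsing code is
// not shown in this hunk.
fn parse_meta_filter(
    raw: &str,
) -> anyhow::Result<std::collections::HashMap<String, Option<String>>> {
    let value: serde_json::Value = serde_json::from_str(raw)?;
    let obj = value
        .as_object()
        .ok_or_else(|| anyhow::anyhow!("meta must be a JSON object"))?;
    let mut filter = std::collections::HashMap::new();
    for (key, v) in obj {
        match v {
            serde_json::Value::Null => filter.insert(key.clone(), None),
            serde_json::Value::String(s) => filter.insert(key.clone(), Some(s.clone())),
            _ => anyhow::bail!("meta values must be strings or null"),
        };
    }
    Ok(filter)
}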
/// Query parameters for item retrieval.
@@ -488,6 +583,7 @@ pub struct ListItemsQuery {
/// length: 1024,
/// stream: false,
/// as_meta: false,
/// decompress: true,
/// };
/// ```
#[derive(Debug, Deserialize, utoipa::ToSchema)]
@@ -538,6 +634,7 @@ pub struct ItemQuery {
/// length: 1024,
/// stream: false,
/// as_meta: false,
/// decompress: true,
/// };
/// ```
#[derive(Debug, Deserialize, utoipa::ToSchema)]
@@ -630,6 +727,31 @@ pub struct CreateItemQuery {
/// Set to false when the client has already collected metadata.
#[serde(default = "default_true")]
pub meta: bool,
/// Compression type used by the client (e.g. "lz4", "gzip").
/// Only used when compress=false — tells the server what compression
/// the client applied so the correct type is recorded in the database.
pub compression_type: Option<String>,
/// Optional timestamp for the item (RFC 3339 format).
/// Used during import to preserve the original item's timestamp.
/// If not provided, the server uses the current time.
pub ts: Option<String>,
}
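// Illustrative (hypothetical values): an import client that pre-compressed
// its payload with lz4 and wants the original timestamp preserved might send
//     compress=false&compression_type=lz4&ts=2023-12-01T15:30:45Z
// as query parameters; the exact create endpoint is not shown in this hunk.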
/// Query parameters for updating item metadata via POST.
///
/// Query parameters for POST /api/item/{item_id}/update.
/// Re-runs specified meta plugins on the stored content and/or
/// applies direct metadata key-value overrides.
#[derive(Debug, Deserialize)]
pub struct UpdateItemQuery {
/// Optional comma-separated list of plugin names to re-run.
pub plugins: Option<String>,
/// Optional metadata overrides as JSON string.
pub metadata: Option<String>,
/// Optional comma-separated tags to add.
pub tags: Option<String>,
/// Optional uncompressed size to set on the item.
pub uncompressed_size: Option<i64>,
}
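// Illustrative request matching the fields above (hypothetical item ID and
// values; metadata is URL-encoded JSON in practice):
//     POST /api/item/42/update?plugins=meta_magic&metadata={"reviewed":"yes"}&tags=archive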
/// Request body for creating a new item.
@@ -678,8 +800,9 @@ pub fn check_auth(
let effective_username = username.as_deref().unwrap_or("keep");
if let Some(auth_header) = headers.get("authorization") {
if let Ok(auth_str) = auth_header.to_str() {
if let Some(auth_header) = headers.get("authorization")
&& let Ok(auth_str) = auth_header.to_str()
{
return check_basic_auth(
auth_str,
effective_username,
@@ -687,7 +810,6 @@ pub fn check_auth(
password_hash,
);
}
}
false
}
@@ -722,9 +844,10 @@ fn check_basic_auth(
}
let encoded = &auth_str[6..];
if let Ok(decoded_bytes) = base64::engine::general_purpose::STANDARD.decode(encoded) {
if let Ok(decoded_str) = String::from_utf8(decoded_bytes) {
if let Some(colon_pos) = decoded_str.find(':') {
if let Ok(decoded_bytes) = base64::engine::general_purpose::STANDARD.decode(encoded)
&& let Ok(decoded_str) = String::from_utf8(decoded_bytes)
&& let Some(colon_pos) = decoded_str.find(':')
{
let provided_username = &decoded_str[..colon_pos];
let provided_password = &decoded_str[colon_pos + 1..];
@@ -749,8 +872,6 @@ fn check_basic_auth(
.ct_eq(expected_password.as_bytes()),
);
}
}
}
false
}
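// (The flattened conditions above use let-chains — `if let … && let …` —
// which need a recent Rust toolchain; behavior is identical to the nested
// `if let` form they replace.)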
@@ -817,9 +938,13 @@ pub async fn logging_middleware(
/// Creates authentication middleware for the application.
///
/// This function returns a middleware that enforces authentication on protected routes.
/// When `jwt_secret` is set, it validates JWT tokens and checks permission claims
/// (read, write, delete) based on the HTTP method. Otherwise, it falls back to
/// Basic Auth password authentication.
///
/// **JWT and Basic Auth are mutually exclusive.** When `jwt_secret` is set, the
/// middleware validates JWT (HS256) tokens and checks permission claims (read, write,
/// delete) based on the HTTP method. Requests without a valid Bearer token are
/// rejected with 401 — Basic Auth is **not** consulted as a fallback.
///
/// When `jwt_secret` is not set, Basic Auth password authentication is used instead.
///
/// # Arguments
///
@@ -831,13 +956,6 @@ pub async fn logging_middleware(
/// # Returns
///
/// A clonable async middleware function for Axum.
///
/// # Examples
///
/// ```
/// let auth_middleware = create_auth_middleware(None, Some("pass".to_string()), None, None);
/// router.layer(auth_middleware);
/// ```
#[allow(clippy::type_complexity)]
pub fn create_auth_middleware(
username: Option<String>,
@@ -868,10 +986,11 @@ pub fn create_auth_middleware(
}
// JWT authentication takes priority when secret is configured
if let Some(ref secret) = jwt_secret {
if let Some(auth_header) = headers.get("authorization") {
if let Ok(auth_str) = auth_header.to_str() {
if let Some(token) = auth_str.strip_prefix("Bearer ") {
if let Some(ref secret) = jwt_secret
&& let Some(auth_header) = headers.get("authorization")
&& let Ok(auth_str) = auth_header.to_str()
&& let Some(token) = auth_str.strip_prefix("Bearer ")
{
match super::auth::validate_jwt(token, secret) {
Ok(claims) => {
let required = super::auth::required_permission(&method);
@@ -881,8 +1000,7 @@ pub fn create_auth_middleware(
(sub={}, missing permission: {required})",
claims.sub
);
let mut response =
Response::new(axum::body::Body::from("Forbidden"));
let mut response = Response::new(axum::body::Body::from("Forbidden"));
*response.status_mut() = StatusCode::FORBIDDEN;
return Ok(response);
}
@@ -892,16 +1010,15 @@ pub fn create_auth_middleware(
}
Err(e) => {
warn!("JWT validation failed for {uri} from {addr}: {e}");
let mut response =
Response::new(axum::body::Body::from("Unauthorized"));
let mut response = Response::new(axum::body::Body::from("Unauthorized"));
*response.status_mut() = StatusCode::UNAUTHORIZED;
return Ok(response);
}
}
}
}
}
// JWT secret configured but no valid Bearer token provided
if jwt_secret.is_some() {
warn!("Missing JWT token for {uri} from {addr}");
let mut response = Response::new(axum::body::Body::from("Unauthorized"));
*response.status_mut() = StatusCode::UNAUTHORIZED;

View File

@@ -1,83 +0,0 @@
pub mod server;
pub mod tools;
pub use server::KeepMcpServer;
/// Module for handling MCP (Model Context Protocol) requests in the server.
///
/// Provides handlers for JSON-RPC style requests to interact with Keep's storage
/// via the API.
use axum::{Json, extract::State, http::StatusCode, response::IntoResponse};
use serde::Deserialize;
use serde_json::Value;
use crate::modes::server::common::ApiResponse;
use crate::modes::server::common::AppState;
/// Request structure for MCP JSON-RPC calls.
///
/// # Fields
///
/// * `method` - The MCP method name (e.g., "save_item").
/// * `params` - Optional JSON parameters for the method.
#[derive(Deserialize)]
pub struct McpRequest {
pub method: String,
pub params: Option<Value>,
}
/// Handles an MCP request via the Axum framework.
///
/// Parses the JSON request, delegates to `KeepMcpServer`, and returns an API response.
/// Attempts to parse the result as JSON; falls back to string if invalid.
///
/// # Arguments
///
/// * `State(state)` - The application state.
/// * `Json(request)` - The deserialized MCP request.
///
/// # Returns
///
/// An `IntoResponse` with status code and JSON API response.
///
/// # Errors
///
/// Returns 400 Bad Request on handler errors.
pub async fn handle_mcp_request(
State(state): State<AppState>,
Json(request): Json<McpRequest>,
) -> impl IntoResponse {
let mcp_server = KeepMcpServer::new(state);
match mcp_server
.handle_request(&request.method, request.params)
.await
{
Ok(result) => match serde_json::from_str(&result) {
Ok(parsed_result) => {
let response = ApiResponse {
success: true,
data: Some(parsed_result),
error: None,
};
(StatusCode::OK, Json(response))
}
Err(_) => {
let response = ApiResponse {
success: true,
data: Some(serde_json::Value::String(result)),
error: None,
};
(StatusCode::OK, Json(response))
}
},
Err(e) => {
let response = ApiResponse {
success: false,
data: None,
error: Some(e.to_string()),
};
(StatusCode::BAD_REQUEST, Json(response))
}
}
}

View File

@@ -1,83 +0,0 @@
use log::debug;
use serde_json::Value;
use super::tools::{KeepTools, ToolError};
use crate::modes::server::common::AppState;
/// Server handler for MCP (Model Context Protocol) requests.
///
/// Routes requests to appropriate tools and handles responses. Clones AppState for tool usage.
///
/// # Fields
///
/// * `state` - The shared application state (DB, config, etc.).
#[derive(Clone)]
pub struct KeepMcpServer {
state: AppState,
}
/// Creates a new `KeepMcpServer` instance.
///
/// # Arguments
///
/// * `state` - The application state containing DB, config, and services.
///
/// # Returns
///
/// A new `KeepMcpServer` instance.
///
/// # Examples
///
/// ```
/// let server = KeepMcpServer::new(app_state);
/// ```
impl KeepMcpServer {
pub fn new(state: AppState) -> Self {
Self { state }
}
/// Handles an MCP request by routing to the appropriate tool.
///
/// Supports methods like "save_item", "get_item", "list_items". Logs the request and delegates to KeepTools.
///
/// # Arguments
///
/// * `method` - The MCP method name (string).
/// * `params` - Optional JSON parameters as serde_json::Value.
///
/// # Returns
///
/// `Ok(String)` with JSON-serialized response on success, or `Err(ToolError)` on failure.
///
/// # Errors
///
/// * ToolError::UnknownTool if method unsupported.
/// * Propagates tool-specific errors (e.g., invalid args, DB failures).
///
/// # Examples
///
/// ```
/// let result = server.handle_request("save_item", Some(params)).await?;
/// ```
pub async fn handle_request(
&self,
method: &str,
params: Option<Value>,
) -> Result<String, ToolError> {
debug!(
"MCP: Handling request '{}' with params: {:?}",
method, params
);
let tools = KeepTools::new(self.state.clone());
match method {
"save_item" => tools.save_item(params).await,
"get_item" => tools.get_item(params).await,
"get_latest_item" => tools.get_latest_item(params).await,
"list_items" => tools.list_items(params).await,
"search_items" => tools.search_items(params).await,
_ => Err(ToolError::UnknownTool(method.to_string())),
}
}
}

View File

@@ -1,344 +0,0 @@
use anyhow::{Result, anyhow};
use log::debug;
use serde_json::Value;
use std::collections::HashMap;
use crate::modes::server::common::AppState;
use crate::services::async_item_service::AsyncItemService;
use crate::services::error::CoreError;
#[derive(Debug, thiserror::Error)]
pub enum ToolError {
#[error("Unknown tool: {0}")]
UnknownTool(String),
#[error("Invalid arguments: {0}")]
InvalidArguments(String),
#[error("Database error: {0}")]
Database(#[from] rusqlite::Error),
#[error("IO error: {0}")]
Io(#[from] std::io::Error),
#[error("JSON error: {0}")]
Json(#[from] serde_json::Error),
#[error("Parse error: {0}")]
Parse(#[from] strum::ParseError),
#[error("Other error: {0}")]
Other(#[from] anyhow::Error),
}
pub struct KeepTools {
state: AppState,
}
impl KeepTools {
pub fn new(state: AppState) -> Self {
Self { state }
}
pub async fn save_item(&self, args: Option<Value>) -> Result<String, ToolError> {
let args =
args.ok_or_else(|| ToolError::InvalidArguments("Missing arguments".to_string()))?;
let content = args
.get("content")
.and_then(|v| v.as_str())
.ok_or_else(|| ToolError::InvalidArguments("Missing 'content' field".to_string()))?;
let tags: Vec<String> = args
.get("tags")
.and_then(|v| v.as_array())
.map(|arr| {
arr.iter()
.filter_map(|v| v.as_str().map(|s| s.to_string()))
.collect()
})
.unwrap_or_default();
let metadata: HashMap<String, String> = args
.get("metadata")
.and_then(|v| v.as_object())
.map(|obj| {
obj.iter()
.filter_map(|(k, v)| v.as_str().map(|s| (k.clone(), s.to_string())))
.collect()
})
.unwrap_or_default();
debug!(
"MCP: Saving item with {} bytes, {} tags, {} metadata entries",
content.len(),
tags.len(),
metadata.len()
);
let service = AsyncItemService::new(
self.state.data_dir.clone(),
self.state.db.clone(),
self.state.item_service.clone(),
self.state.cmd.clone(),
self.state.settings.clone(),
);
let item_with_meta = service
.save_item_from_mcp(content.as_bytes().to_vec(), tags, metadata)
.await
.map_err(|e| ToolError::Other(anyhow::Error::from(e)))?;
let item_id = item_with_meta
.item
.id
.ok_or_else(|| anyhow!("Failed to get item ID"))?;
Ok(format!("Successfully saved item with ID: {}", item_id))
}
pub async fn get_item(&self, args: Option<Value>) -> Result<String, ToolError> {
let args =
args.ok_or_else(|| ToolError::InvalidArguments("Missing arguments".to_string()))?;
let item_id = args.get("id").and_then(|v| v.as_i64()).ok_or_else(|| {
ToolError::InvalidArguments("Missing or invalid 'id' field".to_string())
})?;
let service = AsyncItemService::new(
self.state.data_dir.clone(),
self.state.db.clone(),
self.state.item_service.clone(),
self.state.cmd.clone(),
self.state.settings.clone(),
);
let item_with_content = match service.get_item_content(item_id).await {
Ok(iwc) => iwc,
Err(CoreError::ItemNotFound(_)) => {
return Err(ToolError::InvalidArguments(format!(
"Item {} not found",
item_id
)));
}
Err(e) => return Err(ToolError::Other(anyhow::Error::from(e))),
};
let content = String::from_utf8_lossy(&item_with_content.content).to_string();
let tags: Vec<String> = item_with_content
.item_with_meta
.tags
.iter()
.map(|t| t.name.clone())
.collect();
let metadata = item_with_content.item_with_meta.meta_as_map();
let item = item_with_content.item_with_meta.item;
let response = serde_json::json!({
"id": item_id,
"content": content,
"timestamp": item.ts.to_rfc3339(),
"size": item.size,
"compression": item.compression,
"tags": tags,
"metadata": metadata,
});
Ok(serde_json::to_string_pretty(&response)?)
}
pub async fn get_latest_item(&self, args: Option<Value>) -> Result<String, ToolError> {
let tags: Vec<String> = args
.as_ref()
.and_then(|v| v.get("tags"))
.and_then(|v| v.as_array())
.map(|arr| {
arr.iter()
.filter_map(|v| v.as_str().map(|s| s.to_string()))
.collect()
})
.unwrap_or_default();
let service = AsyncItemService::new(
self.state.data_dir.clone(),
self.state.db.clone(),
self.state.item_service.clone(),
self.state.cmd.clone(),
self.state.settings.clone(),
);
let item_with_meta = match service.find_item(vec![], tags, HashMap::new()).await {
Ok(iwm) => iwm,
Err(CoreError::ItemNotFoundGeneric) => {
return Err(ToolError::InvalidArguments("No items found".to_string()));
}
Err(e) => return Err(ToolError::Other(anyhow::Error::from(e))),
};
let item_id = item_with_meta
.item
.id
.ok_or_else(|| anyhow!("Item missing ID after find"))?;
let item_with_content = service
.get_item_content(item_id)
.await
.map_err(|e| ToolError::Other(anyhow::Error::from(e)))?;
let content = String::from_utf8_lossy(&item_with_content.content).to_string();
let tags: Vec<String> = item_with_content
.item_with_meta
.tags
.iter()
.map(|t| t.name.clone())
.collect();
let metadata = item_with_content.item_with_meta.meta_as_map();
let item = item_with_content.item_with_meta.item;
let response = serde_json::json!({
"id": item_id,
"content": content,
"timestamp": item.ts.to_rfc3339(),
"size": item.size,
"compression": item.compression,
"tags": tags,
"metadata": metadata,
});
Ok(serde_json::to_string_pretty(&response)?)
}
pub async fn list_items(&self, args: Option<Value>) -> Result<String, ToolError> {
let args_ref = args.as_ref();
let tags: Vec<String> = args_ref
.and_then(|v| v.get("tags"))
.and_then(|v| v.as_array())
.map(|arr| {
arr.iter()
.filter_map(|v| v.as_str().map(|s| s.to_string()))
.collect()
})
.unwrap_or_default();
let limit = args_ref
.and_then(|v| v.get("limit"))
.and_then(|v| v.as_u64())
.unwrap_or(10) as usize;
let offset = args_ref
.and_then(|v| v.get("offset"))
.and_then(|v| v.as_u64())
.unwrap_or(0) as usize;
let service = AsyncItemService::new(
self.state.data_dir.clone(),
self.state.db.clone(),
self.state.item_service.clone(),
self.state.cmd.clone(),
self.state.settings.clone(),
);
let mut items_with_meta = service
.list_items(tags, HashMap::new())
.await
.map_err(|e| ToolError::Other(anyhow::Error::from(e)))?;
// Sort by timestamp (newest first) and apply pagination
items_with_meta.sort_by(|a, b| b.item.ts.cmp(&a.item.ts));
let items_with_meta: Vec<_> = items_with_meta
.into_iter()
.skip(offset)
.take(limit)
.collect();
let items_info: Vec<_> = items_with_meta
.into_iter()
.map(|item_with_meta| {
let item_tags: Vec<String> =
item_with_meta.tags.iter().map(|t| t.name.clone()).collect();
let item_meta = item_with_meta.meta_as_map();
let item = item_with_meta.item;
let item_id = item.id.unwrap_or(0);
serde_json::json!({
"id": item_id,
"timestamp": item.ts.to_rfc3339(),
"size": item.size,
"compression": item.compression,
"tags": item_tags,
"metadata": item_meta
})
})
.collect();
let response = serde_json::json!({
"items": items_info,
"count": items_info.len(),
"offset": offset,
"limit": limit
});
Ok(serde_json::to_string_pretty(&response)?)
}
pub async fn search_items(&self, args: Option<Value>) -> Result<String, ToolError> {
let tags: Vec<String> = args
.as_ref()
.and_then(|v| v.get("tags"))
.and_then(|v| v.as_array())
.map(|arr| {
arr.iter()
.filter_map(|v| v.as_str().map(|s| s.to_string()))
.collect()
})
.unwrap_or_default();
let metadata: HashMap<String, String> = args
.as_ref()
.and_then(|v| v.get("metadata"))
.and_then(|v| v.as_object())
.map(|obj| {
obj.iter()
.filter_map(|(k, v)| v.as_str().map(|s| (k.clone(), s.to_string())))
.collect()
})
.unwrap_or_default();
let service = AsyncItemService::new(
self.state.data_dir.clone(),
self.state.db.clone(),
self.state.item_service.clone(),
self.state.cmd.clone(),
self.state.settings.clone(),
);
let mut items_with_meta = service
.list_items(tags.clone(), metadata.clone())
.await
.map_err(|e| ToolError::Other(anyhow::Error::from(e)))?;
// Sort by timestamp (newest first)
items_with_meta.sort_by(|a, b| b.item.ts.cmp(&a.item.ts));
let items_info: Vec<_> = items_with_meta
.into_iter()
.map(|item_with_meta| {
let item_tags: Vec<String> =
item_with_meta.tags.iter().map(|t| t.name.clone()).collect();
let item_meta = item_with_meta.meta_as_map();
let item = item_with_meta.item;
let item_id = item.id.unwrap_or(0);
serde_json::json!({
"id": item_id,
"timestamp": item.ts.to_rfc3339(),
"size": item.size,
"compression": item.compression,
"tags": item_tags,
"metadata": item_meta
})
})
.collect();
let response = serde_json::json!({
"items": items_info,
"count": items_info.len(),
"search_criteria": {
"tags": tags,
"metadata": metadata
}
});
Ok(serde_json::to_string_pretty(&response)?)
}
}

View File

@@ -1,7 +1,10 @@
use crate::config;
use crate::services::item_service::ItemService;
use anyhow::Result;
use axum::{Router, routing::post};
use axum::Router;
use axum::http::{HeaderValue, header};
use axum::middleware::Next;
use axum::response::Response;
use clap::Command;
use log::{debug, info};
use std::net::SocketAddr;
@@ -15,12 +18,26 @@ use tower_http::trace::TraceLayer;
mod api;
pub mod auth;
pub mod common;
#[cfg(feature = "mcp")]
mod mcp;
mod pages;
pub use common::{AppState, create_auth_middleware, logging_middleware};
/// Adds security headers to all responses.
async fn security_headers(req: axum::extract::Request, next: Next) -> Response {
let mut response = next.run(req).await;
let headers = response.headers_mut();
headers.insert(
header::X_CONTENT_TYPE_OPTIONS,
HeaderValue::from_static("nosniff"),
);
headers.insert(header::X_FRAME_OPTIONS, HeaderValue::from_static("DENY"));
headers.insert(
header::REFERRER_POLICY,
HeaderValue::from_static("strict-origin-when-cross-origin"),
);
response
}
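// X-Content-Type-Options: nosniff disables MIME sniffing, X-Frame-Options:
// DENY blocks framing (clickjacking), and the Referrer-Policy limits how much
// of the URL leaks to other origins.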
pub fn mode_server(
cmd: &mut Command,
settings: &config::Settings,
@@ -107,23 +124,10 @@ async fn run_server(
settings: Arc::new(settings.clone()),
};
#[cfg(feature = "mcp")]
let mcp_router = Router::new()
.route("/mcp", post(mcp::handle_mcp_request))
.with_state(state.clone());
#[cfg_attr(not(feature = "mcp"), allow(unused_mut))]
let mut protected_router = Router::new()
let protected_router = Router::new()
.merge(api::add_routes(Router::new()))
.merge(pages::add_routes(Router::new()));
#[cfg(feature = "mcp")]
{
protected_router = protected_router.merge(mcp_router);
}
let protected_router =
protected_router.layer(axum::middleware::from_fn(create_auth_middleware(
.merge(pages::add_routes(Router::new()))
.layer(axum::middleware::from_fn(create_auth_middleware(
config.username.clone(),
config.password.clone(),
config.password_hash.clone(),
@@ -152,18 +156,19 @@ async fn run_server(
axum::http::Method::PUT,
axum::http::Method::DELETE,
])
.allow_headers(tower_http::cors::Any)
.allow_headers([header::CONTENT_TYPE, header::AUTHORIZATION, header::ACCEPT])
};
// Create the app with documentation routes open and others protected
let app = Router::new()
// Add documentation routes without authentication
.merge(api::add_docs_routes(Router::new()))
// Add API, pages, and MCP routes with authentication
// Add API and pages routes with authentication
.merge(protected_router)
// Apply state to all routes
.with_state(state)
// Add other middleware layers to all routes
.layer(axum::middleware::from_fn(security_headers))
.layer(axum::middleware::from_fn(logging_middleware))
.layer(
ServiceBuilder::new()
@@ -174,24 +179,18 @@ async fn run_server(
let addr: SocketAddr = bind_address.parse()?;
// Warn if authentication is enabled without TLS
if config.password.is_some() || config.password_hash.is_some() || config.jwt_secret.is_some() {
#[cfg(not(feature = "tls"))]
log::warn!(
"SECURITY: Authentication enabled but TLS support is not compiled in. Credentials will be transmitted in plain text!"
);
#[cfg(feature = "tls")]
if config.cert_file.is_none() || config.key_file.is_none() {
if (config.password.is_some() || config.password_hash.is_some() || config.jwt_secret.is_some())
&& (config.cert_file.is_none() || config.key_file.is_none())
{
log::warn!(
"SECURITY: Authentication enabled but TLS is not configured. Credentials will be transmitted in plain text!"
);
}
}
// Build the app into a service
let service = app.into_make_service_with_connect_info::<SocketAddr>();
// Use TLS if both cert and key files are provided
#[cfg(feature = "tls")]
if let (Some(cert_file), Some(key_file)) = (&config.cert_file, &config.key_file) {
info!("SERVER: HTTPS server listening on {addr}");

View File

@@ -6,11 +6,24 @@ use axum::{
extract::{Path, Query, State},
response::{Html, Response},
};
use html_escape::{encode_double_quoted_attribute, encode_text};
use log::debug;
use rusqlite::Connection;
use serde::Deserialize;
use std::collections::HashMap;
/// Escape text content for safe HTML insertion.
#[inline]
fn esc(s: &str) -> String {
encode_text(s).to_string()
}
/// Escape attribute values for safe HTML attribute insertion.
#[inline]
fn esc_attr(s: &str) -> String {
encode_double_quoted_attribute(s).to_string()
}
#[derive(Deserialize)]
/// Query parameters for the item list endpoint.
///
@@ -62,7 +75,7 @@ fn default_count() -> usize {
///
/// # Examples
///
/// ```
/// ```ignore
/// let app = pages::add_routes(axum::Router::new());
/// ```
pub fn add_routes(app: axum::Router<AppState>) -> axum::Router<AppState> {
@@ -90,7 +103,9 @@ async fn list_items(
.map_err(|_| Html("<html><body>Internal Server Error</body></html>".to_string()))?;
Ok(response)
}
Err(e) => Err(Html(format!("<html><body>Error: {e}</body></html>"))),
Err(_e) => Err(Html(
"<html><body>An internal error occurred</body></html>".to_string(),
)),
}
}
@@ -121,7 +136,8 @@ fn build_item_list(
// Apply pagination
let start = params.start;
let end = std::cmp::min(start + params.count, sorted_items.len());
let count = params.count.min(10000);
let end = std::cmp::min(start + count, sorted_items.len());
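// Clamping count bounds the page size so an oversized ?count= value
// can't force a huge render.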
let page_items = if start < sorted_items.len() {
sorted_items[start..std::cmp::min(end, sorted_items.len())].to_vec()
} else {
@@ -153,14 +169,14 @@ fn build_item_list(
// Collect all tags from all items, keeping track of their timestamps
let mut all_tags_with_time: Vec<(String, chrono::DateTime<chrono::Utc>)> = Vec::new();
for item in &sorted_items {
if let Some(item_id) = item.id {
if let Some(tags) = tags_map.get(&item_id) {
if let Some(item_id) = item.id
&& let Some(tags) = tags_map.get(&item_id)
{
for tag in tags {
all_tags_with_time.push((tag.name.clone(), item.ts));
}
}
}
}
// Sort by timestamp descending (most recent first)
all_tags_with_time.sort_by(|a, b| b.1.cmp(&a.1));
@@ -184,7 +200,9 @@ fn build_item_list(
html.push_str("<p>");
for tag in recent_tags {
html.push_str(&format!(
"<a href=\"/?tags={tag}\" style=\"margin-right: 8px;\">{tag}</a>"
"<a href=\"/?tags={}\" style=\"margin-right: 8px;\">{}</a>",
esc_attr(&tag),
esc(&tag)
));
}
html.push_str("</p>");
@@ -196,7 +214,7 @@ fn build_item_list(
// Table headers
html.push_str("<tr>");
for column in columns {
html.push_str(&format!("<th>{}</th>", column.label));
html.push_str(&format!("<th>{}</th>", esc(&column.label)));
}
html.push_str("<th>Actions</th>");
html.push_str("</tr>");
@@ -224,12 +242,21 @@ fn build_item_list(
format!("<a href=\"/item/{item_id}\">{id_value}</a>")
}
"time" => item.ts.format("%Y-%m-%d %H:%M:%S").to_string(),
"size" => item.size.map(|s| s.to_string()).unwrap_or_default(),
"size" => item
.uncompressed_size
.map(|s| s.to_string())
.unwrap_or_default(),
"tags" => {
// Make sure we're using all tags for the item
let tag_links: Vec<String> = tags
.iter()
.map(|t| format!("<a href=\"/?tags={}\">{}</a>", t.name, t.name))
.map(|t| {
format!(
"<a href=\"/?tags={}\">{}</a>",
esc_attr(&t.name),
esc(&t.name)
)
})
.collect();
tag_links.join(", ")
}
@@ -268,7 +295,15 @@ fn build_item_list(
crate::config::ColumnAlignment::Center => "text-align: center;",
};
html.push_str(&format!("<td style=\"{align_style}\">{display_value}</td>"));
let rendered_value = if column.name == "tags" {
display_value // Already contains escaped HTML links
} else {
esc(&display_value)
};
html.push_str(&format!(
"<td style=\"{align_style}\">{rendered_value}</td>"
));
}
// Actions column
@@ -361,7 +396,9 @@ async fn show_item(
.map_err(|_| Html("<html><body>Internal Server Error</body></html>".to_string()))?;
Ok(response)
}
Err(e) => Err(Html(format!("<html><body>Error: {e}</body></html>"))),
Err(_e) => Err(Html(
"<html><body>An internal error occurred</body></html>".to_string(),
)),
}
}
@@ -392,11 +429,11 @@ fn build_item_details(conn: &Connection, id: i64) -> Result<String> {
));
html.push_str(&format!(
"<tr><th>Size</th><td>{}</td></tr>",
item.size.unwrap_or(0)
item.uncompressed_size.unwrap_or(0)
));
html.push_str(&format!(
"<tr><th>Compression</th><td>{}</td></tr>",
item.compression
esc(&item.compression)
));
// Tags row
@@ -406,7 +443,13 @@ fn build_item_details(conn: &Connection, id: i64) -> Result<String> {
} else {
let tag_links: Vec<String> = tags
.iter()
.map(|t| format!("<a href=\"/?tags={}\">{}</a>", t.name, t.name))
.map(|t| {
format!(
"<a href=\"/?tags={}\">{}</a>",
esc_attr(&t.name),
esc(&t.name)
)
})
.collect();
html.push_str(&tag_links.join(", "));
}
@@ -419,7 +462,8 @@ fn build_item_details(conn: &Connection, id: i64) -> Result<String> {
for meta in metas {
html.push_str(&format!(
"<tr><th>{}</th><td>{}</td></tr>",
meta.name, meta.value
esc(&meta.name),
esc(&meta.value)
));
}
}

View File

@@ -10,26 +10,11 @@ use comfy_table::{Attribute, Cell, Table};
use serde_json;
use serde_yaml;
use crate::common::status::PathInfo;
use crate::meta_plugin::MetaPluginType;
use crate::meta_plugin::get_meta_plugin;
fn build_path_table(path_info: &PathInfo) -> Table {
let mut path_table = crate::modes::common::create_table(true);
path_table.set_header(vec![
Cell::new("Type").add_attribute(Attribute::Bold),
Cell::new("Path").add_attribute(Attribute::Bold),
]);
path_table.add_row(vec!["Data", &path_info.data]);
path_table.add_row(vec!["Database", &path_info.database]);
path_table
}
fn build_config_table(settings: &config::Settings) -> Table {
let mut config_table = crate::modes::common::create_table(true);
let mut config_table = crate::modes::common::create_table_with_config(&settings.table_config);
config_table.set_header(vec![
Cell::new("Setting").add_attribute(Attribute::Bold),
@@ -52,7 +37,10 @@ fn build_config_table(settings: &config::Settings) -> Table {
config_table
}
fn build_meta_plugins_configured_table(status_info: &StatusInfo) -> Option<Table> {
fn build_meta_plugins_configured_table(
status_info: &StatusInfo,
table_config: &config::TableConfig,
) -> Option<Table> {
let meta_plugins = status_info.configured_meta_plugins.as_ref()?;
if meta_plugins.is_empty() {
return None;
@@ -62,7 +50,7 @@ fn build_meta_plugins_configured_table(status_info: &StatusInfo) -> Option<Table
let mut sorted_meta_plugins = meta_plugins.clone();
sorted_meta_plugins.sort_by(|a, b| a.name.cmp(&b.name));
let mut table = crate::modes::common::create_table(true);
let mut table = crate::modes::common::create_table_with_config(table_config);
table.set_header(vec![
Cell::new("Plugin Name").add_attribute(Attribute::Bold),
@@ -198,7 +186,7 @@ pub fn mode_status(
let status_service = crate::services::status_service::StatusService::new();
let output_format = crate::modes::common::settings_output_format(settings);
debug!("STATUS: About to generate status info");
let status_info = status_service.generate_status(cmd, settings, data_path, db_path);
let status_info = status_service.generate_status(cmd, settings, data_path, db_path)?;
debug!("STATUS: Status info generated successfully");
match output_format {
@@ -212,7 +200,8 @@ pub fn mode_status(
println!();
println!("PATHS:");
let path_table = build_path_table(&status_info.paths);
let path_table =
crate::modes::common::build_path_table(&status_info.paths, &settings.table_config);
println!(
"{}",
crate::modes::common::trim_lines_end(&path_table.trim_fmt())
@@ -220,7 +209,9 @@ pub fn mode_status(
println!();
// Always try to print META PLUGINS CONFIGURED section using status_info
if let Some(meta_plugins_table) = build_meta_plugins_configured_table(&status_info) {
if let Some(meta_plugins_table) =
build_meta_plugins_configured_table(&status_info, &settings.table_config)
{
println!("META PLUGINS CONFIGURED:");
println!(
"{}",
@@ -235,12 +226,11 @@ pub fn mode_status(
Ok(())
}
OutputFormat::Json => {
// Create a subset for status info that includes everything
println!("{}", serde_json::to_string_pretty(&status_info)?);
crate::modes::common::print_serialized(&status_info, &output_format)?;
Ok(())
}
OutputFormat::Yaml => {
println!("{}", serde_yaml::to_string(&status_info)?);
crate::modes::common::print_serialized(&status_info, &output_format)?;
Ok(())
}
}

View File

@@ -60,6 +60,7 @@ use crate::meta_plugin::{MetaPluginType, get_meta_plugin};
fn build_meta_plugin_table(
meta_plugin_info: &std::collections::HashMap<String, MetaPluginInfo>,
table_config: &crate::config::TableConfig,
) -> Table {
// Builds a formatted table displaying meta plugin information.
//
@@ -72,7 +73,7 @@ fn build_meta_plugin_table(
// # Returns
//
// A formatted `comfy_table::Table`.
let mut meta_plugin_table = crate::modes::common::create_table(true);
let mut meta_plugin_table = crate::modes::common::create_table_with_config(table_config);
meta_plugin_table.set_header(vec![
Cell::new("Plugin Name").add_attribute(Attribute::Bold),
@@ -126,7 +127,10 @@ fn build_meta_plugin_table(
meta_plugin_table
}
fn build_compression_table(compression_info: &Vec<CompressionInfo>) -> Table {
fn build_compression_table(
compression_info: &Vec<CompressionInfo>,
table_config: &crate::config::TableConfig,
) -> Table {
// Builds a formatted table displaying compression plugin information.
//
// # Arguments
@@ -136,7 +140,7 @@ fn build_compression_table(compression_info: &Vec<CompressionInfo>) -> Table {
// # Returns
//
// A formatted `comfy_table::Table`.
let mut compression_table = crate::modes::common::create_table(true);
let mut compression_table = crate::modes::common::create_table_with_config(table_config);
compression_table.set_header(vec![
Cell::new("Type").add_attribute(Attribute::Bold),
@@ -167,7 +171,10 @@ fn build_compression_table(compression_info: &Vec<CompressionInfo>) -> Table {
compression_table
}
fn build_filter_plugin_table(filter_plugins: &[crate::common::status::FilterPluginInfo]) -> Table {
fn build_filter_plugin_table(
filter_plugins: &[crate::common::status::FilterPluginInfo],
table_config: &crate::config::TableConfig,
) -> Table {
// Builds a formatted table displaying filter plugin information.
//
// Sorts plugins by name and formats options as YAML sequence.
@@ -179,7 +186,7 @@ fn build_filter_plugin_table(filter_plugins: &[crate::common::status::FilterPlug
// # Returns
//
// A formatted `comfy_table::Table`.
let mut filter_plugin_table = crate::modes::common::create_table(true);
let mut filter_plugin_table = crate::modes::common::create_table_with_config(table_config);
filter_plugin_table.set_header(vec![
Cell::new("Plugin Name").add_attribute(Attribute::Bold),
@@ -298,13 +305,14 @@ pub fn mode_status_plugins(
let status_service = crate::services::status_service::StatusService::new();
let output_format = crate::modes::common::settings_output_format(settings);
debug!("STATUS_PLUGINS: About to generate status info");
let status_info = status_service.generate_status(cmd, settings, data_path, db_path);
let status_info = status_service.generate_status(cmd, settings, data_path, db_path)?;
debug!("STATUS_PLUGINS: Status info generated successfully");
match output_format {
OutputFormat::Table => {
println!("META PLUGINS:");
let meta_table = build_meta_plugin_table(&status_info.meta_plugins);
let meta_table =
build_meta_plugin_table(&status_info.meta_plugins, &settings.table_config);
println!(
"{}",
crate::modes::common::trim_lines_end(&meta_table.trim_fmt())
@@ -312,7 +320,8 @@ pub fn mode_status_plugins(
println!();
println!("COMPRESSION PLUGINS:");
let compression_table = build_compression_table(&status_info.compression);
let compression_table =
build_compression_table(&status_info.compression, &settings.table_config);
println!(
"{}",
crate::modes::common::trim_lines_end(&compression_table.trim_fmt())
@@ -320,7 +329,8 @@ pub fn mode_status_plugins(
println!();
println!("FILTER PLUGINS:");
let filter_table = build_filter_plugin_table(&status_info.filter_plugins);
let filter_table =
build_filter_plugin_table(&status_info.filter_plugins, &settings.table_config);
println!(
"{}",
crate::modes::common::trim_lines_end(&filter_table.trim_fmt())

src/modes/update.rs (new file, 235 lines)
View File

@@ -0,0 +1,235 @@
use anyhow::{Context, Result};
use std::io::Read;
use std::path::{Path, PathBuf};
use crate::common::PIPESIZE;
use crate::config;
use crate::db;
use crate::services::compression_service::CompressionService;
use crate::services::meta_service::MetaService;
use clap::Command;
use log::debug;
use rusqlite::Connection;
/// Handles the update mode: modifies tags and metadata for an existing item by ID.
///
/// This function processes a single item ID, updating its metadata based on `--meta`
/// arguments and optionally replacing its tags with positional arguments.
/// If the item's size is not set, it backfills it by streaming through the content file.
///
/// # Arguments
///
/// * `cmd` - Clap command for error handling.
/// * `settings` - Global settings containing metadata and meta plugin config.
/// * `ids` - List containing exactly one item ID.
/// * `conn` - Database connection.
/// * `data_path` - Path to data directory.
///
/// # Returns
///
/// `Result<()>` on success, or an error if the update fails.
pub fn mode_update(
cmd: &mut Command,
settings: &config::Settings,
ids: &mut [i64],
tags: &mut Vec<String>,
conn: &mut Connection,
data_path: PathBuf,
) -> Result<()> {
if ids.len() != 1 {
cmd.error(
clap::error::ErrorKind::InvalidValue,
"--update requires exactly one numeric ID",
)
.exit();
}
let item_id = ids[0];
// Look up the item
let item =
db::get_item(conn, item_id)?.ok_or_else(|| anyhow::anyhow!("Item {item_id} not found"))?;
debug!("UPDATE: Found item {item_id}: {item:?}");
// Parse --meta arguments into set and delete lists
let mut set_meta: Vec<(String, String)> = Vec::new();
let mut delete_keys: Vec<String> = Vec::new();
for (key, value) in &settings.meta {
match value {
Some(v) => set_meta.push((key.clone(), v.clone())),
None => delete_keys.push(key.clone()),
}
}
// Apply metadata changes
for (key, value) in &set_meta {
debug!("UPDATE: Setting meta {key}={value}");
db::store_meta(
conn,
db::Meta {
id: item_id,
name: key.clone(),
value: value.clone(),
},
)?;
}
for key in &delete_keys {
debug!("UPDATE: Deleting meta {key}");
db::query_delete_meta(
conn,
db::Meta {
id: item_id,
name: key.clone(),
value: String::new(),
},
)?;
}
// Replace tags if provided
if !tags.is_empty() {
debug!("UPDATE: Replacing tags with {:?}", tags);
db::set_item_tags(conn, item.clone(), tags)?;
}
// Run meta plugins if --meta-plugin flags are provided
let plugin_names = settings.meta_plugins_names();
if !plugin_names.is_empty() {
debug!("UPDATE: Running meta plugins: {:?}", plugin_names);
run_meta_plugins_on_item(conn, cmd, settings, &data_path, &item, item_id)?;
}
// Backfill size if not set
let mut updated_item = item.clone();
if item.uncompressed_size.is_none() {
debug!("UPDATE: Size not set, backfilling from content file");
if let Some(size) = compute_item_size(&data_path, &item) {
debug!("UPDATE: Computed size: {size}");
updated_item.uncompressed_size = Some(size);
db::update_item(conn, updated_item.clone())?;
}
}
// Backfill compressed_size if not set
if item.compressed_size.is_none() {
let item_path = data_path.join(item_id.to_string());
if let Ok(meta) = std::fs::metadata(&item_path) {
updated_item.compressed_size = Some(meta.len() as i64);
db::update_item(conn, updated_item.clone())?;
}
}
// Print confirmation
if !settings.quiet {
let mut parts = Vec::new();
if !set_meta.is_empty() {
parts.push(format!("set {} metadata", set_meta.len()));
}
if !delete_keys.is_empty() {
parts.push(format!("deleted {} metadata", delete_keys.len()));
}
if !tags.is_empty() {
parts.push(format!("tags: {}", tags.join(" ")));
}
let action = if parts.is_empty() {
"no changes".to_string()
} else {
parts.join(", ")
};
eprintln!("KEEP: Updated item {item_id} ({action})");
}
Ok(())
}
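// Sketch (illustrative, not part of the change set): the --meta split above in
// isolation. An entry parsed with a value sets metadata; an entry parsed with
// no value (None) deletes the key. The &[(String, Option<String>)] shape is an
// assumption standing in for whatever container settings.meta actually uses.
fn partition_meta(
    meta: &[(String, Option<String>)],
) -> (Vec<(String, String)>, Vec<String>) {
    let mut set_meta = Vec::new();
    let mut delete_keys = Vec::new();
    for (key, value) in meta {
        match value {
            Some(v) => set_meta.push((key.clone(), v.clone())),
            None => delete_keys.push(key.clone()),
        }
    }
    (set_meta, delete_keys)
}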
/// Computes the decompressed size of an item by streaming through its content file.
///
/// Reads the compressed file in PIPESIZE chunks and counts total decompressed bytes.
/// Returns None if the file doesn't exist or decompression fails.
fn compute_item_size(data_path: &Path, item: &db::Item) -> Option<i64> {
let item_id = item.id?;
let mut item_path = data_path.to_path_buf();
item_path.push(item_id.to_string());
if !item_path.exists() {
debug!("UPDATE: Content file not found: {item_path:?}");
return None;
}
let compression_service = CompressionService::new();
let mut reader = match compression_service.stream_item_content(item_path, &item.compression) {
Ok(r) => r,
Err(e) => {
debug!("UPDATE: Failed to open content stream: {e}");
return None;
}
};
let mut buffer = [0u8; PIPESIZE];
let mut total_bytes: i64 = 0;
loop {
match reader.read(&mut buffer) {
Ok(0) => break,
Ok(n) => {
total_bytes += n as i64;
}
Err(e) => {
debug!("UPDATE: Error reading content: {e}");
return None;
}
}
}
Some(total_bytes)
}
/// Runs meta plugins on an existing item's content and stores the results.
fn run_meta_plugins_on_item(
conn: &mut Connection,
cmd: &mut Command,
settings: &config::Settings,
data_path: &Path,
item: &db::Item,
item_id: i64,
) -> Result<()> {
let mut item_path = data_path.to_path_buf();
item_path.push(item_id.to_string());
if !item_path.exists() {
debug!("UPDATE: Content file not found: {item_path:?}");
return Ok(());
}
// Collect metadata in memory
let (meta_service, collected_meta) = MetaService::with_collector();
let mut plugins = meta_service.get_plugins(cmd, settings);
if plugins.is_empty() {
return Ok(());
}
let compression_service = CompressionService::new();
let mut reader = compression_service.stream_item_content(item_path, &item.compression)?;
meta_service.initialize_plugins(&mut plugins);
crate::common::stream_copy(&mut reader, |chunk| {
meta_service.process_chunk(&mut plugins, chunk);
Ok(())
})?;
meta_service.finalize_plugins(&mut plugins);
// Write collected plugin metadata to DB
if let Ok(entries) = collected_meta.lock() {
for (name, value) in entries.iter() {
db::add_meta(conn, item_id, name, value)?;
}
}
Ok(())
}

View File

@@ -1,300 +0,0 @@
use crate::common::status::StatusInfo;
use crate::config::Settings;
use crate::db::Item;
use crate::db::Meta;
use crate::services::data_service::DataService;
use crate::services::error::CoreError;
use crate::services::types::{ItemWithContent, ItemWithMeta};
use clap::Command;
use futures::Stream;
use rusqlite::Connection;
use std::collections::HashMap;
use std::io::Read;
use std::path::{Path, PathBuf};
use std::pin::Pin;
use std::sync::Arc;
use tokio::sync::Mutex;
pub struct AsyncDataService {
data_path: PathBuf,
settings: Arc<Settings>,
db: Arc<Mutex<Connection>>,
sync_service: crate::services::SyncDataService,
}
impl AsyncDataService {
pub fn new(data_path: PathBuf, settings: Arc<Settings>, db: Arc<Mutex<Connection>>) -> Self {
let sync_service =
crate::services::SyncDataService::new(data_path.clone(), settings.as_ref().clone());
Self {
data_path,
settings,
db,
sync_service,
}
}
pub fn data_path(&self) -> &PathBuf {
&self.data_path
}
pub fn settings(&self) -> Arc<Settings> {
self.settings.clone()
}
pub fn db(&self) -> Arc<Mutex<Connection>> {
self.db.clone()
}
pub async fn get_item(&self, id: i64) -> Result<ItemWithMeta, CoreError> {
let mut conn = self.db.lock().await;
self.get(&mut conn, id)
}
pub async fn add_item_meta(
&self,
item_id: i64,
name: &str,
value: &str,
) -> Result<(), CoreError> {
let conn = self.db.lock().await;
crate::db::add_meta(&conn, item_id, name, value)?;
Ok(())
}
pub async fn list_items(
&self,
tags: Vec<String>,
meta: HashMap<String, String>,
) -> Result<Vec<ItemWithMeta>, CoreError> {
let mut conn = self.db.lock().await;
self.list(&mut conn, tags, meta)
}
pub async fn find_item(
&self,
ids: Vec<i64>,
tags: Vec<String>,
meta: HashMap<String, String>,
) -> Result<ItemWithMeta, CoreError> {
let mut conn = self.db.lock().await;
DataService::find_item(self, &mut conn, ids, tags, meta)
}
pub async fn get_item_content_info(
&self,
id: i64,
_filter: Option<String>,
) -> Result<(Vec<u8>, ItemWithMeta, bool), CoreError> {
let mut conn = self.db.lock().await;
let (mut reader, item_with_meta) = self.get_content(&mut conn, id)?;
let mut content = Vec::new();
reader.read_to_end(&mut content)?;
let is_binary = item_with_meta
.meta
.iter()
.find(|m| m.name == "text")
.map(|m| m.value == "false")
.unwrap_or(false);
Ok((content, item_with_meta, is_binary))
}
pub async fn get_item_content_info_streaming(
&self,
id: i64,
_filter: Option<String>,
) -> Result<
(
Pin<Box<dyn Stream<Item = Result<Vec<u8>, CoreError>> + Send>>,
ItemWithMeta,
bool,
),
CoreError,
> {
let mut conn = self.db.lock().await;
let (reader, item_with_meta) = self.get_content(&mut conn, id)?;
let is_binary = item_with_meta
.meta
.iter()
.find(|m| m.name == "text")
.map(|m| m.value == "false")
.unwrap_or(false);
// Convert reader to stream with optimized buffer reuse
let stream = async_stream::stream! {
let mut reader = reader;
let mut buf = [0u8; 8192];
loop {
match reader.read(&mut buf) {
Ok(0) => break,
Ok(n) => yield Ok(buf[..n].to_vec()),
Err(e) => yield Err(CoreError::from(e)),
}
}
};
Ok((Box::pin(stream), item_with_meta, is_binary))
}
pub async fn stream_item_content_by_id_with_metadata(
&self,
id: i64,
_metadata: &HashMap<String, String>,
_force_text: bool,
offset: u64,
length: u64,
_filter: Option<String>,
) -> Result<
(
Pin<Box<dyn Stream<Item = Result<Vec<u8>, std::io::Error>> + Send>>,
u64,
),
CoreError,
> {
let mut conn = self.db.lock().await;
let (mut reader, _item_with_meta) = self.get_content(&mut conn, id)?;
// Skip bytes for offset
if offset > 0 {
let mut skip_buf = [0u8; 8192];
let mut remaining = offset;
while remaining > 0 {
let to_read = std::cmp::min(8192, remaining as usize);
let n = reader.read(&mut skip_buf[..to_read])?;
if n == 0 {
break;
}
remaining -= n as u64;
}
}
let content_length = if length > 0 { length } else { u64::MAX };
// Optimized stream that reuses a single buffer for reading
let stream = async_stream::stream! {
let mut reader = reader;
let mut remaining = content_length;
let mut buf = [0u8; 8192];
while remaining > 0 {
let to_read = std::cmp::min(8192, remaining as usize);
match reader.read(&mut buf[..to_read]) {
Ok(0) => break,
Ok(n) => {
remaining -= n as u64;
yield Ok(buf[..n].to_vec());
}
Err(e) => {
yield Err(e);
break;
}
}
}
};
Ok((Box::pin(stream), content_length))
}
/// Get raw item content without decompression.
///
/// Reads the stored file bytes directly from disk, bypassing decompression.
/// Used when the client requests raw bytes with `decompress=false`.
pub async fn get_raw_item_content(&self, id: i64) -> Result<Vec<u8>, CoreError> {
let data_path = self.data_path.clone();
tokio::task::spawn_blocking(move || {
let mut item_path = data_path;
item_path.push(id.to_string());
let mut file = std::fs::File::open(&item_path).map_err(|e| {
CoreError::Io(std::io::Error::new(
std::io::ErrorKind::NotFound,
format!("Item file not found: {item_path:?}: {e}"),
))
})?;
let mut content = Vec::new();
file.read_to_end(&mut content)?;
Ok(content)
})
.await
.map_err(|e| CoreError::Other(anyhow::anyhow!("Task join error: {}", e)))?
}
}
impl DataService for AsyncDataService {
type Error = CoreError;
fn save<R: Read>(
&self,
content: R,
cmd: &mut Command,
settings: &Settings,
tags: Vec<String>,
conn: &mut Connection,
) -> Result<Item, Self::Error> {
self.sync_service.save(content, cmd, settings, tags, conn)
}
fn get(&self, conn: &mut Connection, id: i64) -> Result<ItemWithMeta, Self::Error> {
self.sync_service.get(conn, id)
}
fn get_content(
&self,
conn: &mut Connection,
id: i64,
) -> Result<(Box<dyn Read + Send>, ItemWithMeta), Self::Error> {
self.sync_service.get_content(conn, id)
}
fn list(
&self,
conn: &mut Connection,
tags: Vec<String>,
meta: HashMap<String, String>,
) -> Result<Vec<ItemWithMeta>, Self::Error> {
self.sync_service.list(conn, tags, meta)
}
fn delete(&self, conn: &mut Connection, id: i64) -> Result<Item, Self::Error> {
self.sync_service.delete(conn, id)
}
fn find_item(
&self,
conn: &mut Connection,
ids: Vec<i64>,
tags: Vec<String>,
meta: HashMap<String, String>,
) -> Result<ItemWithMeta, Self::Error> {
self.sync_service.find_item(conn, ids, tags, meta)
}
fn get_items(
&self,
conn: &mut Connection,
ids: &[i64],
tags: &[String],
meta: &HashMap<String, String>,
) -> Result<Vec<ItemWithMeta>, Self::Error> {
self.sync_service.get_items(conn, ids, tags, meta)
}
fn generate_status(
&self,
settings: &Settings,
data_path: &Path,
db_path: &Path,
) -> Result<StatusInfo, Self::Error> {
let mut cmd = Command::new("keep");
let status_service = crate::services::StatusService::new();
Ok(status_service.generate_status(
&mut cmd,
settings,
data_path.to_path_buf(),
db_path.to_path_buf(),
))
}
}

View File

@@ -1,401 +0,0 @@
/// Asynchronous service wrapper for `ItemService`.
///
/// Uses `tokio::task::spawn_blocking` to offload synchronous operations (DB/FS)
/// to a blocking thread pool, allowing non-blocking async usage in servers.
use crate::common::PIPESIZE;
use crate::config::Settings;
use crate::services::error::CoreError;
use crate::services::item_service::ItemService;
use crate::services::types::{ItemWithContent, ItemWithMeta};
use clap::Command;
use rusqlite::Connection;
use std::collections::HashMap;
use std::io::Read;
use std::path::PathBuf;
use std::sync::Arc;
use tokio::sync::Mutex;
/// An asynchronous wrapper around the `ItemService` for use in async contexts like the web server.
/// It uses `tokio::task::spawn_blocking` to run synchronous database and filesystem operations
/// on a dedicated thread pool, preventing them from blocking the async runtime.
#[allow(dead_code)]
/// Async wrapper for ItemService operations.
pub struct AsyncItemService {
pub data_dir: PathBuf,
db: Arc<Mutex<Connection>>,
item_service: Arc<ItemService>,
cmd: Arc<Mutex<Command>>,
settings: Arc<Settings>,
}
#[allow(dead_code)]
impl AsyncItemService {
/// Creates a new `AsyncItemService`.
///
/// # Arguments
///
/// * `data_dir` - Path to data directory.
/// * `db` - Arc-wrapped mutex for DB connection.
/// * `item_service` - Arc-wrapped ItemService.
/// * `cmd` - Arc-wrapped mutex for Clap command.
/// * `settings` - Arc-wrapped settings.
///
/// # Returns
///
/// A new `AsyncItemService`.
pub fn new(
data_dir: PathBuf,
db: Arc<Mutex<Connection>>,
item_service: Arc<ItemService>,
cmd: Arc<Mutex<Command>>,
settings: Arc<Settings>,
) -> Self {
Self {
data_dir,
db,
item_service,
cmd,
settings,
}
}
/// Internal helper to execute synchronous operations in a blocking task.
///
/// Spawns a blocking task with the DB connection and ItemService.
///
/// # Type Parameters
///
/// * `F` - Closure type.
/// * `T` - Return type.
///
/// # Arguments
///
/// * `f` - The synchronous closure to execute.
///
/// # Returns
///
/// Result of the closure, or CoreError on task failure.
async fn execute_blocking<F, T>(&self, f: F) -> Result<T, CoreError>
where
F: FnOnce(&Connection, &ItemService) -> Result<T, CoreError> + Send + 'static,
T: Send + 'static,
{
let db = self.db.clone();
let item_service = self.item_service.clone();
tokio::task::spawn_blocking(move || {
let conn = db.blocking_lock();
f(&conn, &item_service)
})
.await
.map_err(|e| CoreError::Other(anyhow::anyhow!("Blocking task failed: {}", e)))?
}
pub async fn get_item(&self, id: i64) -> Result<ItemWithMeta, CoreError> {
self.execute_blocking(move |conn, item_service| item_service.get_item(conn, id))
.await
}
pub async fn get_item_content(&self, id: i64) -> Result<ItemWithContent, CoreError> {
self.execute_blocking(move |conn, item_service| item_service.get_item_content(conn, id))
.await
}
pub async fn get_item_content_info(
&self,
id: i64,
filter: Option<String>,
) -> Result<(Vec<u8>, String, bool), CoreError> {
self.execute_blocking(move |conn, item_service| {
item_service.get_item_content_info(conn, id, filter)
})
.await
}
pub async fn stream_item_content_by_id(
&self,
item_id: i64,
allow_binary: bool,
offset: u64,
length: u64,
) -> Result<
(
std::pin::Pin<
Box<
dyn tokio_stream::Stream<
Item = Result<tokio_util::bytes::Bytes, std::io::Error>,
> + Send,
>,
>,
String,
),
CoreError,
> {
let content = self
.execute_blocking(move |conn, item_service| {
let item_with_content = item_service.get_item_content(conn, item_id)?;
Ok::<_, CoreError>(item_with_content.content)
})
.await?;
// Clone content for use in the binary check closure
let content_clone = content.clone();
// Get metadata to determine MIME type and binary status
let (mime_type, is_binary) = {
let db = self.db.clone();
let item_service = self.item_service.clone();
tokio::task::spawn_blocking(move || {
let conn = db.blocking_lock();
let item_with_meta = item_service.get_item(&conn, item_id)?;
let metadata = item_with_meta.meta_as_map();
let mime_type = metadata
.get("mime_type")
.map(|s| s.to_string())
.unwrap_or_else(|| "application/octet-stream".to_string());
let is_binary = crate::common::is_binary::is_content_binary_from_metadata(
&metadata,
&content_clone,
);
Ok::<_, CoreError>((mime_type, is_binary))
})
.await
.unwrap()?
};
// Check if content is binary when allow_binary is false
if !allow_binary && is_binary {
return Err(CoreError::InvalidInput(
"Binary content not allowed".to_string(),
));
}
// Create a stream that reads only the requested portion
let content_len = content.len() as u64;
// Apply offset and length constraints
let start = std::cmp::min(offset, content_len);
let end = if length > 0 {
std::cmp::min(start + length, content_len)
} else {
content_len
};
let stream = if start < content_len {
let chunk =
tokio_util::bytes::Bytes::from(content[start as usize..end as usize].to_vec());
Box::pin(tokio_stream::iter(vec![Ok(chunk)]))
} else {
Box::pin(tokio_stream::iter(vec![]))
};
Ok((stream, mime_type))
}
pub async fn stream_item_content_by_id_with_metadata(
&self,
item_id: i64,
metadata: &HashMap<String, String>,
allow_binary: bool,
offset: u64,
length: u64,
filter: Option<String>,
) -> Result<
(
std::pin::Pin<
Box<
dyn tokio_stream::Stream<
Item = Result<tokio_util::bytes::Bytes, std::io::Error>,
> + Send,
>,
>,
String,
),
CoreError,
> {
// Use provided metadata to determine MIME type and binary status
let mime_type = metadata
.get("mime_type")
.map(|s| s.to_string())
.unwrap_or_else(|| "application/octet-stream".to_string());
// Check if content is binary when allow_binary is false
if !allow_binary {
let is_binary = if let Some(text_val) = metadata.get("text") {
text_val == "false"
} else {
// Get binary status using streaming approach
let (_, _, is_binary) = self.get_item_content_info_streaming(item_id, None).await?;
is_binary
};
if is_binary {
return Err(CoreError::InvalidInput(
"Binary content not allowed".to_string(),
));
}
}
// Get a streaming reader for the content with filtering applied
let reader = {
let db = self.db.clone();
let item_service = self.item_service.clone();
let filter = filter.clone();
tokio::task::spawn_blocking(move || {
let conn = db.blocking_lock();
item_service
.get_item_content_info_streaming(&conn, item_id, filter)
.map(|(reader, _, _)| reader)
})
.await
.map_err(|e| CoreError::Other(anyhow::anyhow!("Blocking task failed: {}", e)))?
};
// Convert the reader into an async stream manually
use tokio_util::bytes::Bytes;
// Create a channel to stream data between the blocking thread and async runtime
let (tx, rx) = tokio::sync::mpsc::channel(1);
// Spawn a blocking task to read from the reader and send chunks
tokio::task::spawn_blocking(move || {
let mut reader = reader;
// Apply offset by reading and discarding bytes
if offset > 0 {
let mut remaining = offset;
let mut buf = [0; PIPESIZE];
while remaining > 0 {
let to_read = std::cmp::min(remaining, buf.len() as u64);
match reader.as_mut().unwrap().read(&mut buf[..to_read as usize]) {
Ok(0) => break, // EOF reached before offset
Ok(n) => remaining -= n as u64,
Err(e) => {
let _ = tx.blocking_send(Err(e));
return;
}
}
}
}
// Read and send data up to the specified length
let mut remaining_length = length;
let mut buffer = [0; PIPESIZE];
loop {
// Determine how much to read in this iteration
let to_read = if length > 0 {
// If length is specified, don't read more than remaining_length
std::cmp::min(remaining_length, buffer.len() as u64) as usize
} else {
buffer.len()
};
if to_read == 0 {
break; // We've read the requested length
}
match reader.as_mut().unwrap().read(&mut buffer[..to_read]) {
Ok(0) => break, // EOF
Ok(n) => {
let chunk = Bytes::copy_from_slice(&buffer[..n]);
// Block on sending to the channel
if tx.blocking_send(Ok(chunk)).is_err() {
break; // Receiver dropped
}
if length > 0 {
remaining_length -= n as u64;
if remaining_length == 0 {
break; // Reached the requested length
}
}
}
Err(e) => {
let _ = tx.blocking_send(Err(e));
break;
}
}
}
});
// Convert the receiver into a stream
let stream = tokio_stream::wrappers::ReceiverStream::new(rx);
Ok((Box::pin(stream), mime_type))
}
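// The bridge pattern used above, in minimal generic form (a sketch, not part
// of this diff): a blocking Read is pumped on tokio's blocking pool and
// surfaces as an async Stream through an mpsc channel.
fn reader_to_stream(
    mut reader: impl std::io::Read + Send + 'static,
) -> tokio_stream::wrappers::ReceiverStream<Result<tokio_util::bytes::Bytes, std::io::Error>> {
    let (tx, rx) = tokio::sync::mpsc::channel(1);
    tokio::task::spawn_blocking(move || {
        let mut buf = [0u8; 8192];
        loop {
            match reader.read(&mut buf) {
                Ok(0) => break, // EOF
                Ok(n) => {
                    let chunk = tokio_util::bytes::Bytes::copy_from_slice(&buf[..n]);
                    if tx.blocking_send(Ok(chunk)).is_err() {
                        break; // receiver dropped
                    }
                }
                Err(e) => {
                    let _ = tx.blocking_send(Err(e));
                    break;
                }
            }
        }
    });
    tokio_stream::wrappers::ReceiverStream::new(rx)
}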
pub async fn get_item_content_info_streaming(
&self,
item_id: i64,
filter: Option<String>,
) -> Result<(Box<dyn Read + Send>, String, bool), CoreError> {
self.execute_blocking(move |conn, item_service| {
item_service.get_item_content_info_streaming(conn, item_id, filter)
})
.await
}
pub async fn find_item(
&self,
ids: Vec<i64>,
tags: Vec<String>,
meta: HashMap<String, String>,
) -> Result<ItemWithMeta, CoreError> {
let ids_clone = ids.clone();
let tags_clone = tags.clone();
let meta_clone = meta.clone();
self.execute_blocking(move |conn, item_service| {
item_service.find_item(conn, &ids_clone, &tags_clone, &meta_clone)
})
.await
}
pub async fn list_items(
&self,
tags: Vec<String>,
meta: HashMap<String, String>,
) -> Result<Vec<ItemWithMeta>, CoreError> {
let tags_clone = tags.clone();
let meta_clone = meta.clone();
self.execute_blocking(move |conn, item_service| {
item_service.list_items(conn, &tags_clone, &meta_clone)
})
.await
}
pub async fn delete_item(&self, id: i64) -> Result<(), CoreError> {
let db = self.db.clone();
let item_service = self.item_service.clone();
tokio::task::spawn_blocking(move || {
let mut conn = db.blocking_lock();
item_service.delete_item(&mut conn, id)
})
.await
.unwrap()
}
pub async fn save_item_from_mcp(
&self,
content: Vec<u8>,
tags: Vec<String>,
metadata: HashMap<String, String>,
) -> Result<ItemWithMeta, CoreError> {
let db = self.db.clone();
let item_service = self.item_service.clone();
let cmd = self.cmd.clone();
let settings = self.settings.clone();
tokio::task::spawn_blocking(move || {
let mut conn = db.blocking_lock();
let mut cmd = cmd.blocking_lock();
let settings = settings.as_ref();
item_service
.save_item_from_mcp(&content, &tags, &metadata, &mut cmd, settings, &mut conn)
})
.await
.unwrap()
}
}

View File

@@ -1,33 +1,12 @@
use crate::compression_engine::{CompressionType, get_compression_engine};
use crate::services::error::CoreError;
use anyhow::anyhow;
use std::io::Read;
use std::io::{Read, Write};
use std::path::PathBuf;
use std::str::FromStr;
pub struct CompressionService;
/// Service for handling compression and decompression of item content.
///
/// Provides methods to read compressed item files either fully into memory
/// or as streaming readers. Supports various compression types via engines.
/// This service abstracts the underlying compression engines for consistent access.
///
/// # Examples
///
/// ```ignore
/// let service = CompressionService::new();
/// let content = service.get_item_content(path, "gzip")?;
/// ```
/// Provides methods to read compressed item files either fully into memory
/// or as streaming readers. Supports various compression types via engines.
///
/// # Examples
///
/// ```ignore
/// let service = CompressionService::new();
/// let content = service.get_item_content(path, "gzip")?;
/// ```
impl CompressionService {
/// Creates a new CompressionService instance.
///
@@ -132,6 +111,67 @@ impl CompressionService {
})?;
Ok(reader)
}
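/// Creates a decompressing reader wrapping the given reader.
///
/// Returns a boxed reader that decompresses on the fly based on the
/// compression type. Unknown/none compression types pass through unchanged.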
pub fn decompressing_reader(
reader: Box<dyn Read>,
compression: &CompressionType,
) -> Result<Box<dyn Read>, CoreError> {
match compression {
CompressionType::GZip => {
use flate2::read::GzDecoder;
Ok(Box::new(GzDecoder::new(reader)))
}
CompressionType::LZ4 => {
use lz4_flex::frame::FrameDecoder;
Ok(Box::new(FrameDecoder::new(reader)))
}
#[cfg(feature = "zstd")]
CompressionType::ZStd => {
use zstd::stream::read::Decoder;
Ok(Box::new(Decoder::new(reader).map_err(|e| {
CoreError::Compression(format!("zstd decoder error: {}", e))
})?))
}
_ => Ok(reader),
}
}
/// Creates a compressing writer wrapping the given writer.
///
/// Returns a boxed writer that compresses on the fly based on the compression type.
/// Useful for compressing data to network streams or pipes.
///
/// # Arguments
///
/// * `writer` - The underlying destination writer.
/// * `compression` - Compression type string (e.g., "gzip", "lz4").
///
/// # Returns
///
/// A boxed compressing writer. Unknown/none types pass through unchanged.
pub fn compressing_writer(
writer: Box<dyn Write>,
compression: &CompressionType,
) -> Result<Box<dyn Write>, CoreError> {
match compression {
CompressionType::GZip => {
use flate2::Compression;
use flate2::write::GzEncoder;
Ok(Box::new(GzEncoder::new(writer, Compression::default())))
}
CompressionType::LZ4 => Ok(Box::new(lz4_flex::frame::FrameEncoder::new(writer))),
#[cfg(feature = "zstd")]
CompressionType::ZStd => {
use zstd::stream::write::Encoder;
Ok(Box::new(
Encoder::new(writer, 3)
.map_err(|e| CoreError::Compression(format!("zstd encoder error: {}", e)))?
.auto_finish(),
))
}
_ => Ok(writer),
}
}
}
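// Illustrative round-trip through the two associated functions above (a sketch,
// not project code). SharedBuf is a hypothetical helper, needed only because
// compressing_writer takes Box<dyn Write> with an implicit 'static bound, so a
// plain &mut Vec<u8> cannot be passed directly.
struct SharedBuf(std::sync::Arc<std::sync::Mutex<Vec<u8>>>);

impl Write for SharedBuf {
    fn write(&mut self, buf: &[u8]) -> std::io::Result<usize> {
        self.0.lock().unwrap().extend_from_slice(buf);
        Ok(buf.len())
    }
    fn flush(&mut self) -> std::io::Result<()> {
        Ok(())
    }
}

fn gzip_roundtrip(data: &[u8]) -> Result<Vec<u8>, CoreError> {
    let sink = std::sync::Arc::new(std::sync::Mutex::new(Vec::new()));
    let mut w = CompressionService::compressing_writer(
        Box::new(SharedBuf(sink.clone())),
        &CompressionType::GZip,
    )?;
    w.write_all(data)?;
    // Dropping the writer finalizes the stream for the encoders above
    // (flate2's gzip encoder finishes on drop; zstd uses auto_finish).
    drop(w);
    let compressed = sink.lock().unwrap().clone();
    let mut r = CompressionService::decompressing_reader(
        Box::new(std::io::Cursor::new(compressed)),
        &CompressionType::GZip,
    )?;
    let mut out = Vec::new();
    r.read_to_end(&mut out)?;
    Ok(out)
}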
impl Default for CompressionService {

View File

@@ -1,63 +0,0 @@
use crate::common::status::StatusInfo;
use crate::config::Settings;
use crate::db::Item;
use crate::services::error::CoreError;
use crate::services::types::{ItemWithContent, ItemWithMeta};
use clap::Command;
use rusqlite::Connection;
use std::collections::HashMap;
use std::io::Read;
use std::path::Path;
pub trait DataService {
type Error;
fn save<R: Read>(
&self,
content: R,
cmd: &mut Command,
settings: &Settings,
tags: Vec<String>,
conn: &mut Connection,
) -> Result<Item, Self::Error>;
fn get(&self, conn: &mut Connection, id: i64) -> Result<ItemWithMeta, Self::Error>;
fn get_content(
&self,
conn: &mut Connection,
id: i64,
) -> Result<(Box<dyn Read + Send>, ItemWithMeta), Self::Error>;
fn list(
&self,
conn: &mut Connection,
tags: Vec<String>,
meta: HashMap<String, String>,
) -> Result<Vec<ItemWithMeta>, Self::Error>;
fn delete(&self, conn: &mut Connection, id: i64) -> Result<Item, Self::Error>;
fn find_item(
&self,
conn: &mut Connection,
ids: Vec<i64>,
tags: Vec<String>,
meta: HashMap<String, String>,
) -> Result<ItemWithMeta, Self::Error>;
fn get_items(
&self,
conn: &mut Connection,
ids: &[i64],
tags: &[String],
meta: &HashMap<String, String>,
) -> Result<Vec<ItemWithMeta>, Self::Error>;
fn generate_status(
&self,
settings: &Settings,
data_path: &Path,
db_path: &Path,
) -> Result<StatusInfo, Self::Error>;
}

View File

@@ -13,32 +13,27 @@ use thiserror::Error;
/// * `ItemNotFoundGeneric` - Generic item not found (no ID specified).
/// * `InvalidInput(String)` - User or config input validation failure with message.
/// * `Compression(String)` - Compression/decompression errors with details.
/// * `PayloadTooLarge` - Request body exceeded maximum allowed size.
/// * `Other(anyhow::Error)` - Catch-all for other anyhow-wrapped errors.
/// * `Migration(rusqlite_migration::Error)` - Database migration failures.
#[derive(Error, Debug)]
pub enum CoreError {
#[error("Database error: {0}")]
/// Database operation failed.
Database(#[from] rusqlite::Error),
#[error("I/O error: {0}")]
/// File or stream I/O operation failed.
Io(#[from] std::io::Error),
#[error("Item not found with id {0}")]
/// Item with the specified ID does not exist in the database.
ItemNotFound(i64),
#[error("Item not found")]
/// Item does not exist (no specific ID).
ItemNotFoundGeneric,
#[error("Invalid input: {0}")]
/// Input validation failed.
InvalidInput(String),
#[error("Compression error: {0}")]
/// Compression or decompression operation failed.
Compression(String),
#[error("Payload too large")]
PayloadTooLarge,
#[error(transparent)]
/// Other unexpected error.
Other(#[from] anyhow::Error),
#[error("Migration error: {0}")]
/// Database schema migration failed.
Migration(#[from] rusqlite_migration::Error),
}
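// Illustrative only: one way a server layer could map these variants onto HTTP
// status codes. The actual mapping used by this codebase is not shown here.
fn http_status(err: &CoreError) -> u16 {
    match err {
        CoreError::ItemNotFound(_) | CoreError::ItemNotFoundGeneric => 404,
        CoreError::InvalidInput(_) => 400,
        CoreError::PayloadTooLarge => 413,
        CoreError::Database(_)
        | CoreError::Io(_)
        | CoreError::Compression(_)
        | CoreError::Migration(_)
        | CoreError::Other(_) => 500,
    }
}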

View File

@@ -1,5 +1,4 @@
use crate::filter_plugin::{FilterChain, parse_filter_string};
use once_cell::sync::Lazy;
use std::collections::HashMap;
use std::io::{Read, Result, Write};
use std::sync::Mutex;
@@ -166,8 +165,8 @@ impl FilterService {
/// # Panics
///
/// Lock acquisition failures (rare) cause panics in accessors.
static FILTER_PLUGIN_REGISTRY: Lazy<Mutex<HashMap<String, FilterConstructor>>> =
Lazy::new(|| Mutex::new(HashMap::new()));
static FILTER_PLUGIN_REGISTRY: std::sync::LazyLock<Mutex<HashMap<String, FilterConstructor>>> =
std::sync::LazyLock::new(|| Mutex::new(HashMap::new()));
/// Registers a filter plugin in the global registry.
///
@@ -188,11 +187,12 @@ static FILTER_PLUGIN_REGISTRY: Lazy<Mutex<HashMap<String, FilterConstructor>>> =
/// ```ignore
/// register_filter_plugin("custom_filter", || Box::new(CustomFilter::default()));
/// ```
pub fn register_filter_plugin(name: &str, constructor: FilterConstructor) {
pub fn register_filter_plugin(name: &str, constructor: FilterConstructor) -> anyhow::Result<()> {
FILTER_PLUGIN_REGISTRY
.lock()
.unwrap()
.map_err(|e| anyhow::anyhow!("plugin registry poisoned: {e}"))?
.insert(name.to_string(), constructor);
Ok(())
}
/// Retrieves a snapshot of all registered filter plugins.
@@ -214,6 +214,9 @@ pub fn register_filter_plugin(name: &str, constructor: FilterConstructor) {
/// let plugins = get_available_filter_plugins();
/// // Plugins are registered at startup via ctors; specific names may vary by configuration.
/// ```
pub fn get_available_filter_plugins() -> HashMap<String, FilterConstructor> {
FILTER_PLUGIN_REGISTRY.lock().unwrap().clone()
pub fn get_available_filter_plugins() -> anyhow::Result<HashMap<String, FilterConstructor>> {
FILTER_PLUGIN_REGISTRY
.lock()
.map_err(|e| anyhow::anyhow!("plugin registry poisoned: {e}"))
.map(|guard| guard.clone())
}
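// Usage sketch for the now-fallible registry API. MyFilter is a hypothetical
// plugin type; the constructor shape follows the doc example above and assumes
// FilterConstructor accepts a non-capturing boxed-plugin factory.
fn register_example() -> anyhow::Result<()> {
    register_filter_plugin("my_filter", || Box::new(MyFilter::default()))?;
    let plugins = get_available_filter_plugins()?;
    assert!(plugins.contains_key("my_filter"));
    Ok(())
}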

View File

@@ -1,4 +1,3 @@
use crate::common::PIPESIZE;
use crate::compression_engine::{CompressionType, get_compression_engine};
use crate::config::Settings;
use crate::db::{self, Item, Meta};
@@ -9,12 +8,14 @@ use crate::services::error::CoreError;
use crate::services::filter_service::FilterService;
use crate::services::meta_service::MetaService;
use crate::services::types::{ItemWithContent, ItemWithMeta};
use chrono::DateTime;
use chrono::Utc;
use clap::Command;
use log::debug;
use rusqlite::Connection;
use std::collections::HashMap;
use std::fs;
use std::io::{IsTerminal, Read, Write};
use std::io::{Cursor, IsTerminal, Read, Write};
use std::path::PathBuf;
/// Service for managing items in the Keep application.
@@ -28,8 +29,6 @@ pub struct ItemService {
data_path: PathBuf,
/// Service for handling compression and decompression.
compression_service: CompressionService,
/// Service for managing metadata plugins.
meta_service: MetaService,
/// Service for applying content filters.
filter_service: FilterService,
}
@@ -59,11 +58,16 @@ impl ItemService {
Self {
data_path,
compression_service: CompressionService::new(),
meta_service: MetaService::new(),
filter_service: FilterService::new(),
}
}
fn item_path(&self, item_id: i64) -> PathBuf {
let mut path = self.data_path.clone();
path.push(item_id.to_string());
path
}
/// Retrieves an item with its associated metadata and tags.
///
/// Fetches the item from the database by ID and loads its tags and metadata.
@@ -106,6 +110,8 @@ impl ItemService {
/// Retrieves an item with its content, metadata, and tags.
///
/// Loads the item, its metadata/tags, and decompresses the full content.
/// This method is intended for CLI use only and has a size guard (100MB).
/// For larger items or server use, use `get_item_content_info_streaming`.
///
/// # Arguments
///
@@ -120,6 +126,7 @@ impl ItemService {
///
/// * `CoreError::ItemNotFound(id)` - If the item does not exist.
/// * `CoreError::Io(...)` - If file read or decompression fails.
/// * `CoreError::InvalidInput(...)` - If item exceeds 100MB size limit.
///
/// # Examples
///
@@ -132,6 +139,9 @@ impl ItemService {
conn: &Connection,
id: i64,
) -> Result<ItemWithContent, CoreError> {
// Size limit for loading entire content into memory (100MB)
const MAX_CONTENT_SIZE: i64 = 100 * 1024 * 1024;
debug!("ITEM_SERVICE: Getting item content for id: {id}");
let item_with_meta = self.get_item(conn, id)?;
let item_id = item_with_meta
@@ -145,8 +155,17 @@ impl ItemService {
)));
}
let mut item_path = self.data_path.clone();
item_path.push(item_id.to_string());
// Check size guard before loading content
if let Some(size) = item_with_meta.item.uncompressed_size
&& size > MAX_CONTENT_SIZE
{
return Err(CoreError::InvalidInput(format!(
"Item {} exceeds size limit ({} > {}). Use streaming API for large items.",
item_id, size, MAX_CONTENT_SIZE
)));
}
let item_path = self.item_path(item_id);
debug!("ITEM_SERVICE: Reading content from path: {item_path:?}");
let content = self
@@ -164,47 +183,6 @@ impl ItemService {
})
}
/// Retrieves item content with binary detection and optional filtering.
///
/// Loads content, applies filters if specified, and determines MIME type and binary status.
///
/// # Arguments
///
/// * `conn` - Database connection.
/// * `id` - Item ID.
/// * `filter` - Optional filter string to apply to content.
///
/// # Returns
///
/// * `Result<(Vec<u8>, String, bool), CoreError>` - (content, MIME type, is_binary).
///
/// # Errors
///
/// * `CoreError::ItemNotFound(id)` - If item not found.
/// * Filter or compression errors.
///
/// # Examples
///
/// ```ignore
/// let (content, mime, is_binary) = item_service.get_item_content_info(&conn, 1, Some("head_lines(10)"))?;
/// ```
pub fn get_item_content_info(
&self,
conn: &Connection,
id: i64,
filter: Option<String>,
) -> Result<(Vec<u8>, String, bool), CoreError> {
// Use streaming approach to handle all filtering options consistently
let (mut reader, mime_type, is_binary) =
self.get_item_content_info_streaming(conn, id, filter)?;
// Read all the filtered content into a buffer
let mut content = Vec::new();
reader.read_to_end(&mut content)?;
Ok((content, mime_type, is_binary))
}
/// Determines if item content is binary based on metadata or sampling.
///
/// Checks existing "text" metadata first; if absent, samples the first 8192 bytes.
@@ -331,8 +309,7 @@ impl ItemService {
)));
}
let mut item_path = self.data_path.clone();
item_path.push(item_id.to_string());
let item_path = self.item_path(item_id);
let reader = self
.compression_service
@@ -372,8 +349,7 @@ impl ItemService {
)));
}
let mut item_path = self.data_path.clone();
item_path.push(item_id.to_string());
let item_path = self.item_path(item_id);
let reader = self
.compression_service
@@ -423,7 +399,7 @@ impl ItemService {
conn: &Connection,
ids: &[i64],
tags: &[String],
meta: &HashMap<String, String>,
meta: &HashMap<String, Option<String>>,
) -> Result<ItemWithMeta, CoreError> {
debug!("ITEM_SERVICE: Finding item with ids: {ids:?}, tags: {tags:?}, meta: {meta:?}");
let item_maybe = match (ids.is_empty(), tags.is_empty() && meta.is_empty()) {
@@ -491,7 +467,7 @@ impl ItemService {
&self,
conn: &Connection,
tags: &[String],
meta: &HashMap<String, String>,
meta: &HashMap<String, Option<String>>,
) -> Result<Vec<ItemWithMeta>, CoreError> {
debug!("ITEM_SERVICE: Listing items with tags: {tags:?}, meta: {meta:?}");
let items = db::get_items_matching(conn, &tags.to_vec(), meta)?;
@@ -559,7 +535,7 @@ impl ItemService {
/// ```ignore
/// item_service.delete_item(&mut conn, 1)?;
/// ```
pub fn delete_item(&self, conn: &mut Connection, id: i64) -> Result<(), CoreError> {
pub fn delete_item(&self, conn: &mut Connection, id: i64) -> Result<Item, CoreError> {
debug!("ITEM_SERVICE: Deleting item with id: {id}");
if id <= 0 {
return Err(CoreError::InvalidInput(format!("Invalid item ID: {id}")));
@@ -567,10 +543,10 @@ impl ItemService {
let item = db::get_item(conn, id)?.ok_or(CoreError::ItemNotFound(id))?;
debug!("ITEM_SERVICE: Found item to delete: {item:?}");
let mut item_path = self.data_path.clone();
item_path.push(id.to_string());
let item_path = self.item_path(id);
debug!("ITEM_SERVICE: Deleting file at path: {item_path:?}");
let deleted_item = item.clone();
db::delete_item(conn, item)?;
fs::remove_file(&item_path).or_else(|e| {
if e.kind() == std::io::ErrorKind::NotFound {
@@ -581,7 +557,7 @@ impl ItemService {
})?;
debug!("ITEM_SERVICE: Successfully deleted item {id}");
Ok(())
Ok(deleted_item)
}
/// Saves content from a reader to a new item.
@@ -621,10 +597,8 @@ impl ItemService {
conn: &mut Connection,
) -> Result<Item, CoreError> {
debug!("ITEM_SERVICE: Starting save_item with tags: {tags:?}");
if tags.is_empty() {
tags.push("none".to_string());
debug!("ITEM_SERVICE: No tags provided, using default 'none' tag");
}
crate::modes::common::ensure_default_tag(tags);
debug!("ITEM_SERVICE: Tags after ensure_default: {tags:?}");
let compression_type = settings_compression_type(cmd, settings);
debug!("ITEM_SERVICE: Using compression type: {compression_type:?}");
@@ -640,7 +614,7 @@ impl ItemService {
debug!("ITEM_SERVICE: Created new item with id: {item_id}");
db::set_item_tags(conn, item.clone(), tags)?;
debug!("ITEM_SERVICE: Set tags for item {item_id}");
let item_meta = self.meta_service.collect_initial_meta();
let item_meta = MetaService::collect_initial_meta_static();
debug!(
"ITEM_SERVICE: Collected {} initial meta entries",
item_meta.len()
@@ -648,12 +622,19 @@ impl ItemService {
for (k, v) in item_meta.iter() {
db::add_meta(conn, item_id, k, v)?;
}
// Store user-specified metadata from --meta CLI flags
for (key, value) in &settings.meta {
if let Some(v) = value {
debug!("ITEM_SERVICE: Setting user meta {key}={v}");
db::add_meta(conn, item_id, key, v)?;
}
}
}
// Print the "KEEP: New item" message before starting to read input
if !settings.quiet {
if std::io::stderr().is_terminal() {
let mut t = term::stderr().unwrap();
if let Some(mut t) = term::stderr() {
let _ = t.reset();
let _ = t.attr(term::Attr::Bold);
let _ = write!(t, "KEEP:");
@@ -668,48 +649,54 @@ impl ItemService {
let _ = t.reset();
let _ = writeln!(t);
let _ = std::io::stderr().flush();
}
} else {
let mut t = std::io::stderr();
let _ = writeln!(t, "KEEP: New item: {item_id} tags: {tags:?}");
}
}
let mut plugins = self.meta_service.get_plugins(cmd, settings);
// Collect metadata from plugins into a Vec, then write to DB after plugins finish.
// This avoids capturing &Connection in the save_meta closure (which would need unsafe
// and wouldn't be Send for parallel plugins).
let (meta_service, collected_meta) = MetaService::with_collector();
let mut plugins = meta_service.get_plugins(cmd, settings);
debug!("ITEM_SERVICE: Got {} meta plugins", plugins.len());
self.meta_service
.initialize_plugins(&mut plugins, conn, item_id);
meta_service.initialize_plugins(&mut plugins);
let mut item_path = self.data_path.clone();
item_path.push(item_id.to_string());
let item_path = self.item_path(item_id);
debug!("ITEM_SERVICE: Writing item to path: {item_path:?}");
let mut item_out = compression_engine.create(item_path.clone())?;
let mut buffer = [0; PIPESIZE];
let mut total_bytes = 0;
let mut total_bytes: i64 = 0;
debug!("ITEM_SERVICE: Starting to read and process input data");
loop {
let n = input.read(&mut buffer)?;
if n == 0 {
break;
}
total_bytes += n as i64;
item_out.write_all(&buffer[..n])?;
self.meta_service
.process_chunk(&mut plugins, &buffer[..n], conn, item_id);
}
crate::common::stream_copy(&mut input, |chunk| {
total_bytes += chunk.len() as i64;
item_out.write_all(chunk)?;
meta_service.process_chunk(&mut plugins, chunk);
Ok(())
})?;
debug!("ITEM_SERVICE: Processed {total_bytes} bytes total");
item_out.flush()?;
drop(item_out);
debug!("ITEM_SERVICE: Finalizing meta plugins");
self.meta_service
.finalize_plugins(&mut plugins, conn, item_id);
let compressed_size = std::fs::metadata(&item_path)?.len() as i64;
item.size = Some(total_bytes);
debug!("ITEM_SERVICE: Finalizing meta plugins");
meta_service.finalize_plugins(&mut plugins);
// Write collected plugin metadata to DB
let entries = collected_meta.lock().expect("meta lock poisoned");
for (name, value) in entries.iter() {
db::add_meta(conn, item_id, name, value)?;
}
item.uncompressed_size = Some(total_bytes);
item.compressed_size = Some(compressed_size);
item.closed = true;
db::update_item(conn, item.clone())?;
debug!("ITEM_SERVICE: Save completed successfully");
@@ -717,110 +704,6 @@ impl ItemService {
Ok(item)
}
/// Saves pre-loaded content as a new item, typically from MCP (Model Context Protocol) sources.
///
/// Bypasses streaming read, directly writes content and applies metadata/plugins.
///
/// # Arguments
///
/// * `content` - Byte slice of content to save.
/// * `tags` - Tags to associate.
/// * `metadata` - Initial metadata key-value pairs.
/// * `cmd` - Mutable command.
/// * `settings` - Settings.
/// * `conn` - Mutable database connection.
///
/// # Returns
///
/// * `Result<ItemWithMeta, CoreError>` - The saved item with full details.
///
/// # Errors
///
/// * `CoreError::Database(...)` - If DB insert fails.
/// * `CoreError::Io(...)` - If file write fails.
///
/// # Examples
///
/// ```ignore
/// let content = b"Hello, world!";
/// let tags = vec!["mcp".to_string()];
/// let meta = HashMap::from([("source".to_string(), "api".to_string())]);
/// let item = service.save_item_from_mcp(content, &tags, &meta, &mut cmd, &settings, &mut conn)?;
/// ```
pub fn save_item_from_mcp(
&self,
content: &[u8],
tags: &Vec<String>,
metadata: &HashMap<String, String>,
cmd: &mut Command,
settings: &Settings,
conn: &mut Connection,
) -> Result<ItemWithMeta, CoreError> {
debug!(
"ITEM_SERVICE: Starting save_item_from_mcp with {} bytes, {} tags, {} metadata entries",
content.len(),
tags.len(),
metadata.len()
);
let compression_type = CompressionType::LZ4;
let compression_engine = get_compression_engine(compression_type.clone())?;
let item_id;
let mut item;
{
item = db::create_item(conn, compression_type.clone())?;
item_id = item
.id
.ok_or_else(|| CoreError::InvalidInput("Item missing ID".to_string()))?;
debug!("ITEM_SERVICE: Created MCP item with id: {item_id}");
// Add tags
for tag in tags {
db::add_tag(conn, item_id, tag)?;
}
debug!("ITEM_SERVICE: Added {} tags to MCP item", tags.len());
// Add custom metadata
for (key, value) in metadata {
db::add_meta(conn, item_id, key, value)?;
}
debug!(
"ITEM_SERVICE: Added {} custom metadata entries to MCP item",
metadata.len()
);
}
let mut item_path = self.data_path.clone();
item_path.push(item_id.to_string());
debug!("ITEM_SERVICE: Writing MCP item to path: {item_path:?}");
let mut writer = compression_engine.create(item_path.clone())?;
writer.write_all(content)?;
drop(writer);
let mut plugins = self.meta_service.get_plugins(cmd, settings);
debug!(
"ITEM_SERVICE: Got {} configured meta plugins for MCP item",
plugins.len()
);
self.meta_service
.initialize_plugins(&mut plugins, conn, item_id);
self.meta_service
.process_chunk(&mut plugins, content, conn, item_id);
self.meta_service
.finalize_plugins(&mut plugins, conn, item_id);
debug!("ITEM_SERVICE: Processed MCP item through configured meta plugins");
item.size = Some(content.len() as i64);
db::update_item(conn, item.clone())?;
debug!("ITEM_SERVICE: MCP item saved successfully");
self.get_item(conn, item_id)
}
/// Returns a reference to the internal compression service.
///
/// # Returns
@@ -838,6 +721,255 @@ impl ItemService {
pub fn get_data_path(&self) -> &PathBuf {
&self.data_path
}
/// Returns a streaming reader and item metadata for the given item.
pub fn get_item_content_streaming(
&self,
conn: &Connection,
id: i64,
) -> Result<(Box<dyn Read + Send>, ItemWithMeta), CoreError> {
let (reader, _mime, _is_binary) = self.get_item_content_info_streaming(conn, id, None)?;
let item_with_meta = self.get_item(conn, id)?;
Ok((reader, item_with_meta))
}
/// Fetches multiple items by ID, silently skipping not-found items.
/// Falls back to `list_items` if the ID list is empty.
pub fn get_items(
&self,
conn: &Connection,
ids: &[i64],
tags: &[String],
meta: &HashMap<String, Option<String>>,
) -> Result<Vec<ItemWithMeta>, CoreError> {
if ids.is_empty() {
return self.list_items(conn, tags, meta);
}
let mut results = Vec::new();
for id in ids {
match self.get_item(conn, *id) {
Ok(item) => results.push(item),
Err(CoreError::ItemNotFound(_)) => continue,
Err(e) => return Err(e),
}
}
Ok(results)
}
/// Save an item with granular control over compression and meta plugins.
///
/// This method allows callers to control whether compression and meta plugins
/// run server-side or were already handled by the client.
///
/// # Arguments
///
/// * `conn` - Database connection.
/// * `content` - Raw content bytes.
/// * `tags` - Tags to associate with the item.
/// * `metadata` - Client-provided metadata.
/// * `compress` - Whether the server should compress the content.
/// * `run_meta` - Whether the server should run meta plugins.
/// * `settings` - Application settings.
///
/// # Returns
///
/// * `Result<ItemWithMeta, CoreError>` - The saved item with full details.
#[allow(clippy::too_many_arguments)]
pub fn save_item_raw(
&self,
conn: &mut Connection,
content: &[u8],
tags: Vec<String>,
metadata: HashMap<String, String>,
compress: bool,
run_meta: bool,
settings: &Settings,
) -> Result<ItemWithMeta, CoreError> {
let mut cursor = Cursor::new(content);
self.save_item_raw_streaming(
conn,
&mut cursor,
tags,
metadata,
compress,
run_meta,
None,
None,
settings,
true,
)
}
/// Save an item from a streaming reader with granular control over compression.
///
/// Unlike `save_item_raw` which takes a pre-buffered `&[u8]`, this method
/// reads from the reader in chunks and writes directly to the compression
/// engine, avoiding buffering the entire content in memory.
#[allow(clippy::too_many_arguments)]
pub fn save_item_raw_streaming(
&self,
conn: &mut Connection,
reader: &mut dyn Read,
tags: Vec<String>,
metadata: HashMap<String, String>,
compress: bool,
run_meta: bool,
client_compression_type: Option<CompressionType>,
import_ts: Option<DateTime<Utc>>,
settings: &Settings,
set_size: bool,
) -> Result<ItemWithMeta, CoreError> {
let mut cmd = Command::new("keep");
let mut tags = tags;
crate::modes::common::ensure_default_tag(&mut tags);
let (compression_type_for_db, compression_engine) = if compress {
let ct = settings_compression_type(&mut cmd, settings);
let engine = get_compression_engine(ct.clone())?;
(ct, engine)
} else {
let ct = client_compression_type.unwrap_or(CompressionType::Raw);
let engine = get_compression_engine(CompressionType::Raw)?;
(ct, engine)
};
let item_id;
let mut item;
{
item = if let Some(ts) = import_ts {
db::insert_item_with_ts(conn, ts, &compression_type_for_db.to_string())?
} else {
db::create_item(conn, compression_type_for_db.clone())?
};
item_id = item
.id
.ok_or_else(|| CoreError::InvalidInput("Item missing ID".to_string()))?;
db::set_item_tags(conn, item.clone(), &tags)?;
}
let (meta_service, collected_meta) = MetaService::with_collector();
let mut plugins = if run_meta {
meta_service.get_plugins(&mut cmd, settings)
} else {
Vec::new()
};
if run_meta {
meta_service.initialize_plugins(&mut plugins);
}
let item_path = self.item_path(item_id);
let mut item_out = compression_engine.create(item_path.clone())?;
let mut total_bytes = 0i64;
crate::common::stream_copy(reader, |chunk| {
item_out.write_all(chunk)?;
total_bytes += chunk.len() as i64;
if run_meta {
meta_service.process_chunk(&mut plugins, chunk);
}
Ok(())
})?;
item_out.flush()?;
drop(item_out);
let compressed_size = std::fs::metadata(&item_path)?.len() as i64;
if run_meta {
meta_service.finalize_plugins(&mut plugins);
}
if run_meta {
let entries = collected_meta.lock().expect("meta lock poisoned");
for (name, value) in entries.iter() {
db::add_meta(conn, item_id, name, value)?;
}
}
for (key, value) in &metadata {
if key != "uncompressed_size" {
db::add_meta(conn, item_id, key, value)?;
}
}
item.uncompressed_size = if set_size { Some(total_bytes) } else { None };
item.compressed_size = Some(compressed_size);
item.closed = true;
db::update_item(conn, item)?;
self.get_item(conn, item_id)
}
/// Runs specified meta plugins on an existing item's content and stores the results.
pub fn update_item_plugins(
&self,
conn: &mut Connection,
item_id: i64,
plugin_names: &[String],
metadata: HashMap<String, String>,
tags: &[String],
settings: &Settings,
) -> Result<ItemWithMeta, CoreError> {
let item = db::get_item(conn, item_id)?.ok_or_else(|| CoreError::ItemNotFound(item_id))?;
let (meta_service, collected_meta) = MetaService::with_collector();
let mut cmd = Command::new("keep");
let all_plugins = meta_service.get_plugins(&mut cmd, settings);
let mut plugins: Vec<Box<dyn crate::meta_plugin::MetaPlugin>> = all_plugins
.into_iter()
.filter(|p| {
let plugin_name = p.meta_type().to_string();
plugin_names.iter().any(|n| n == &plugin_name)
})
.collect();
if plugins.is_empty() && metadata.is_empty() {
return self.get_item(conn, item_id);
}
let item_path = self.item_path(item_id);
if !item_path.exists() {
return Err(CoreError::ItemNotFound(item_id));
}
if !plugins.is_empty() {
let compression_service = CompressionService::new();
let mut reader =
compression_service.stream_item_content(item_path, &item.compression)?;
meta_service.initialize_plugins(&mut plugins);
crate::common::stream_copy(&mut reader, |chunk| {
meta_service.process_chunk(&mut plugins, chunk);
Ok(())
})?;
meta_service.finalize_plugins(&mut plugins);
if let Ok(entries) = collected_meta.lock() {
for (name, value) in entries.iter() {
db::add_meta(conn, item_id, name, value)?;
}
}
}
for (key, value) in &metadata {
db::add_meta(conn, item_id, key, value)?;
}
for tag in tags {
db::upsert_tag(conn, item_id, tag)?;
}
self.get_item(conn, item_id)
}
}
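// Call sketch for save_item_raw_streaming (argument values are illustrative):
// stream a large payload to disk, compressing server-side and running meta
// plugins, without buffering the whole payload in memory.
fn save_upload(
    item_service: &ItemService,
    conn: &mut Connection,
    settings: &Settings,
    reader: &mut dyn Read,
) -> Result<ItemWithMeta, CoreError> {
    item_service.save_item_raw_streaming(
        conn,
        reader,
        vec!["upload".to_string()],
        HashMap::new(),
        true,  // compress: server compresses the content
        true,  // run_meta: server runs meta plugins
        None,  // client_compression_type: content arrives raw
        None,  // import_ts: use the current timestamp
        settings,
        true,  // set_size: record the uncompressed size
    )
}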
/// A reader that applies a filter chain to the data as it's read.
@@ -934,16 +1066,19 @@ impl<R: Read> Read for FilteringReader<R> {
return self.reader.read(buf);
}
// Read from the original reader into the reusable temp buffer
// Read chunks and process through the filter chain.
// Loop because filters like skip_lines may consume entire chunks
// without producing output — that is not EOF, we must keep reading.
let chain = self.filter_chain.as_mut().unwrap();
loop {
let to_read = std::cmp::min(buf.len(), self.temp_buf.len());
let bytes_read = self.reader.read(&mut self.temp_buf[..to_read])?;
if bytes_read == 0 {
// True EOF from the underlying reader
return Ok(0);
}
// Process through the filter chain
if let Some(ref mut chain) = self.filter_chain {
let mut input_cursor = std::io::Cursor::new(&self.temp_buf[..bytes_read]);
chain.filter(&mut input_cursor, &mut self.buffer)?;
@@ -951,13 +1086,9 @@ impl<R: Read> Read for FilteringReader<R> {
let bytes_to_copy = std::cmp::min(buf.len(), self.buffer.len());
buf[..bytes_to_copy].copy_from_slice(&self.buffer[..bytes_to_copy]);
self.buffer_pos = bytes_to_copy;
Ok(bytes_to_copy)
} else {
// No data produced by filter, signal to read more
Ok(0)
return Ok(bytes_to_copy);
}
} else {
unreachable!()
// Filter produced no output for this chunk — read another
}
}
}
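// The retry pattern above in generic form (a sketch, with `transform` standing
// in for the filter chain): keep pulling input until the transform yields bytes
// or the source reaches true EOF. Returning 0 early would be read as EOF by
// callers, which is exactly what the loop avoids.
fn read_filtered(
    src: &mut impl Read,
    transform: &mut impl FnMut(&[u8]) -> Vec<u8>,
    out: &mut [u8],
) -> std::io::Result<usize> {
    let mut tmp = [0u8; 8192];
    loop {
        let n = src.read(&mut tmp)?;
        if n == 0 {
            return Ok(0); // true EOF from the underlying reader
        }
        let produced = transform(&tmp[..n]);
        if !produced.is_empty() {
            let k = produced.len().min(out.len());
            out[..k].copy_from_slice(&produced[..k]);
            return Ok(k); // a real impl must also buffer the unread tail
        }
        // transform consumed the chunk without output (e.g. skip_lines): read again
    }
}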

View File

@@ -1,12 +1,17 @@
use crate::config::Settings;
use crate::meta_plugin::{MetaPlugin, MetaPluginResponse, MetaPluginType};
use crate::meta_plugin::{MetaPlugin, MetaPluginResponse, MetaPluginType, SaveMetaFn};
use crate::modes::common::settings_meta_plugin_types;
use clap::Command;
use log::{debug, error};
use rusqlite::Connection;
use log::{debug, error, warn};
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
pub struct MetaService;
/// Shared collector for metadata entries from plugins.
pub type MetaCollector = Arc<Mutex<Vec<(String, String)>>>;
pub struct MetaService {
save_meta: SaveMetaFn,
}
/// Sentinel plugin used as a placeholder when extracting plugins for parallel
/// execution. The original plugin is written back immediately after the threads
@@ -22,9 +27,49 @@ fn replace_plugin(plugins: &mut [Box<dyn MetaPlugin>], i: usize) -> Box<dyn Meta
std::mem::replace(&mut plugins[i], Box::new(NullMetaPlugin))
}
/// Stores metadata entries from a plugin response via the save_meta callback.
fn store_plugin_response(response: &MetaPluginResponse, save_meta: &SaveMetaFn) {
if let Ok(mut f) = save_meta.lock() {
for meta_data in &response.metadata {
f(&meta_data.name, &meta_data.value);
}
} else {
warn!(
"META_SERVICE: save_meta lock poisoned, dropping {} metadata entries",
response.metadata.len()
);
}
}
impl MetaService {
pub fn new() -> Self {
Self
/// Creates a new MetaService with the given save_meta callback.
///
/// All plugins created by this service will share this callback for
/// persisting metadata. The callback is wrapped in Arc<Mutex<>> so it
/// can be cloned into parallel-safe plugin threads.
pub fn new(save_meta: SaveMetaFn) -> Self {
Self { save_meta }
}
/// Creates a MetaService with a built-in Vec collector.
///
/// Returns both the service and the shared collector. Plugins write
/// metadata entries into the collector via the internal save_meta callback.
/// This eliminates the boilerplate of creating the Arc<Mutex<Vec<...>>> manually.
///
/// # Returns
///
/// A tuple of (MetaService, Arc<Mutex<Vec<(String, String)>>>) where the
/// collector accumulates (name, value) pairs from plugin responses.
pub fn with_collector() -> (Self, MetaCollector) {
let collected: MetaCollector = Arc::new(Mutex::new(Vec::new()));
let collector = collected.clone();
let save_meta: SaveMetaFn = Arc::new(Mutex::new(move |name: &str, value: &str| {
if let Ok(mut v) = collector.lock() {
v.push((name.to_string(), value.to_string()));
}
}));
(Self::new(save_meta), collected)
}
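// Usage sketch mirroring the call sites introduced elsewhere in this change
// set: drive the plugins over a chunk of bytes, then drain the collector.
fn collect_meta_example(
    cmd: &mut Command,
    settings: &Settings,
) -> Vec<(String, String)> {
    let (meta_service, collected) = MetaService::with_collector();
    let mut plugins = meta_service.get_plugins(cmd, settings);
    meta_service.initialize_plugins(&mut plugins);
    meta_service.process_chunk(&mut plugins, b"example bytes");
    meta_service.finalize_plugins(&mut plugins);
    collected.lock().expect("meta lock poisoned").clone()
}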
pub fn get_plugins(&self, cmd: &mut Command, settings: &Settings) -> Vec<Box<dyn MetaPlugin>> {
@@ -32,7 +77,7 @@ impl MetaService {
let meta_plugin_types: Vec<MetaPluginType> = settings_meta_plugin_types(cmd, settings);
debug!("META_SERVICE: Meta plugin types from settings: {meta_plugin_types:?}");
// Create plugins with their configuration
// Create plugins with their configuration and wire save_meta
let meta_plugins: Vec<Box<dyn MetaPlugin>> = meta_plugin_types
.iter()
.filter_map(|meta_plugin_type| {
@@ -66,7 +111,12 @@ impl MetaService {
(None, None)
};
match crate::meta_plugin::get_meta_plugin(meta_plugin_type.clone(), options, outputs) {
match crate::meta_plugin::get_meta_plugin_with_save(
meta_plugin_type.clone(),
options,
outputs,
Some(self.save_meta.clone()),
) {
Ok(plugin) => Some(plugin),
Err(e) => {
log::warn!("META_SERVICE: Failed to create plugin {meta_plugin_type:?}: {e}, skipping");
@@ -79,12 +129,7 @@ impl MetaService {
meta_plugins
}
pub fn initialize_plugins(
&self,
plugins: &mut [Box<dyn MetaPlugin>],
conn: &Connection,
item_id: i64,
) {
pub fn initialize_plugins(&self, plugins: &mut [Box<dyn MetaPlugin>]) {
// Check for duplicate output names before initializing plugins
let mut output_names: std::collections::HashMap<String, Vec<String>> =
std::collections::HashMap::new();
@@ -135,7 +180,6 @@ impl MetaService {
parallel_plugins.push(replace_plugin(plugins, i));
}
// Write results back to original slots sequentially (DB writes are serial)
let (results, panicked): (Vec<(usize, MetaPluginResponse)>, Vec<usize>) =
std::thread::scope(|s| {
let handles: Vec<_> = parallel_plugins
@@ -157,15 +201,13 @@ impl MetaService {
});
for (j, response) in results {
store_plugin_response(&response, &self.save_meta);
let mut plugin = replace_plugin(&mut parallel_plugins, j);
if response.is_finalized {
plugin.set_finalized(true);
}
plugins[parallel_idx[j]] = plugin;
}
// Panicked plugins: restore the NullMetaPlugin sentinel and
// mark it finalized so future phases skip it cleanly.
for j in panicked {
let mut plugin = replace_plugin(&mut parallel_plugins, j);
plugin.set_finalized(true);
@@ -176,20 +218,14 @@ impl MetaService {
// Run sequential plugins
for &i in &sequential_idx {
let response = plugins[i].initialize();
store_plugin_response(&response, &self.save_meta);
if response.is_finalized {
plugins[i].set_finalized(true);
}
}
}
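The mem::replace/sentinel dance above is the standard trick for moving boxed trait objects into scoped threads and back. A self-contained sketch of the same pattern, with stand-in Worker/Null types rather than the crate's plugin types:

```rust
// Minimal sketch of the sentinel-swap pattern: temporarily replace a boxed
// trait object with a no-op so it can be moved into a scoped thread, then
// write the result back into its original slot.
trait Worker: Send {
    fn run(&mut self) -> u32;
}

struct Null;
impl Worker for Null {
    fn run(&mut self) -> u32 { 0 }
}

struct Adder(u32);
impl Worker for Adder {
    fn run(&mut self) -> u32 { self.0 + 1 }
}

fn main() {
    let mut workers: Vec<Box<dyn Worker>> = vec![Box::new(Adder(1)), Box::new(Adder(2))];
    // Swap each worker out of its slot, leaving a no-op sentinel behind.
    let mut extracted: Vec<Box<dyn Worker>> = (0..workers.len())
        .map(|i| std::mem::replace(&mut workers[i], Box::new(Null)))
        .collect();
    // Scoped threads may borrow `extracted` because the scope joins before it ends.
    let results: Vec<u32> = std::thread::scope(|s| {
        let handles: Vec<_> = extracted
            .iter_mut()
            .map(|w| s.spawn(move || w.run()))
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    });
    // Write every worker back to its original slot.
    for (i, w) in extracted.into_iter().enumerate() {
        workers[i] = w;
    }
    println!("{results:?}");
}
```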
pub fn process_chunk(&self, plugins: &mut [Box<dyn MetaPlugin>], chunk: &[u8]) {
// Partition non-finalized plugins by parallel_safe
let (parallel_idx, sequential_idx): (Vec<usize>, Vec<usize>) = plugins
.iter()
@@ -200,7 +236,6 @@ impl MetaService {
// Run parallel-safe plugins concurrently on this chunk
if !parallel_idx.is_empty() {
// Extract the parallel-safe plugins into a flat Vec, indexed by position
let mut parallel_plugins: Vec<Box<dyn MetaPlugin>> =
Vec::with_capacity(parallel_idx.len());
for &i in &parallel_idx {
@@ -228,7 +263,7 @@ impl MetaService {
});
for (j, response) in results {
store_plugin_response(&response, &self.save_meta);
let mut plugin = replace_plugin(&mut parallel_plugins, j);
if response.is_finalized {
plugin.set_finalized(true);
@@ -245,26 +280,21 @@ impl MetaService {
// Run sequential plugins
for &i in &sequential_idx {
let response = plugins[i].update(chunk);
store_plugin_response(&response, &self.save_meta);
if response.is_finalized {
plugins[i].set_finalized(true);
}
}
}
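initialize_plugins, process_chunk, and finalize_plugins together form the streaming lifecycle. A hedged sketch of one full pass, assuming the caller already has a clap Command and Settings in hand (the module paths are assumptions):

```rust
use keep::services::MetaService;

// Illustrative driver for the three-phase lifecycle; `keep::config::Settings`
// and the clap Command are assumed to come from the caller's existing setup.
fn run_meta_pass(cmd: &mut clap::Command, settings: &keep::config::Settings, chunks: &[&[u8]]) {
    let (service, collected) = MetaService::with_collector();
    let mut plugins = service.get_plugins(cmd, settings);
    service.initialize_plugins(&mut plugins);
    for chunk in chunks {
        // Parallel-safe plugins fan out per chunk; sequential ones run in place.
        service.process_chunk(&mut plugins, chunk);
    }
    service.finalize_plugins(&mut plugins);
    if let Ok(entries) = collected.lock() {
        println!("collected {} metadata entries", entries.len());
    }
}
```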
pub fn finalize_plugins(&self, plugins: &mut [Box<dyn MetaPlugin>]) {
for meta_plugin in plugins.iter_mut() {
if meta_plugin.is_finalized() {
continue;
}
let response = meta_plugin.finalize();
store_plugin_response(&response, &self.save_meta);
if response.is_finalized {
meta_plugin.set_finalized(true);
@@ -273,22 +303,12 @@ impl MetaService {
}
/// Collects initial metadata from environment variables and hostname.
///
/// Gathers metadata from `KEEP_META_*` environment variables and adds hostname
/// if not already present.
///
/// # Returns
///
/// A `HashMap` of initial metadata key-value pairs.
///
/// # Examples
///
/// ```
/// # use keep::services::MetaService;
/// let initial_meta = MetaService::collect_initial_meta_static();
/// ```
pub fn collect_initial_meta(&self) -> HashMap<String, String> {
Self::collect_initial_meta_static()
}
/// Static version of collect_initial_meta for use without a MetaService instance.
pub fn collect_initial_meta_static() -> HashMap<String, String> {
let mut item_meta: HashMap<String, String> = crate::modes::common::get_meta_from_env();
if let Ok(hostname) = gethostname::gethostname().into_string()
@@ -299,34 +319,3 @@ impl MetaService {
item_meta
}
}
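Running the binary with KEEP_META_* variables set is enough to exercise this; a small sketch (exact key normalization is handled by get_meta_from_env, so the printed names are whatever it produces):

```rust
use keep::services::MetaService;

fn main() {
    // Run as e.g. `KEEP_META_PROJECT=demo ./keep ...` to see env-derived
    // entries; hostname is added when not already present.
    let meta = MetaService::collect_initial_meta_static();
    for (name, value) in &meta {
        println!("{name} = {value}");
    }
}
```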
/// Stores metadata entries from a plugin response into the database.
///
/// # Arguments
///
/// * `conn` - Database connection.
/// * `item_id` - Item ID to associate with the metadata.
/// * `response` - The plugin response containing metadata.
fn store_plugin_metadata(conn: &Connection, item_id: i64, response: &MetaPluginResponse) {
for meta_data in &response.metadata {
let db_meta = crate::db::Meta {
id: item_id,
name: meta_data.name.clone(),
value: meta_data.value.clone(),
};
if let Err(e) = crate::db::store_meta(conn, db_meta) {
log::warn!("META_SERVICE: Failed to store metadata: {e}");
}
}
}
impl Default for MetaService {
/// Provides a default `MetaService` instance.
///
/// # Returns
///
/// A new `MetaService` via `new()`.
fn default() -> Self {
Self::new()
}
}


@@ -1,25 +1,22 @@
//! Business logic services for the Keep application.
//!
//! This module provides the core service layer that orchestrates item storage,
//! compression, metadata collection, and filtering. Services are used by both
//! local CLI modes and the HTTP server.
pub mod async_data_service;
pub mod async_item_service;
pub mod compression_service;
pub mod data_service;
pub mod error;
pub mod filter_service;
pub mod item_service;
pub mod meta_service;
pub mod status_service;
pub mod sync_data_service;
pub mod types;
pub mod utils;
pub use async_data_service::AsyncDataService;
pub use async_item_service::AsyncItemService;
pub use compression_service::CompressionService;
pub use data_service::DataService;
pub use error::CoreError;
pub use filter_service::{FilterService, register_filter_plugin};
pub use item_service::ItemService;
pub use meta_service::MetaService;
pub use status_service::StatusService;
pub use sync_data_service::SyncDataService;
pub use types::{ItemInfo, ItemWithContent, ItemWithMeta};
pub use utils::{calc_byte_range, extract_tags, parse_comma_tags};
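With the flat re-exports, call sites can import everything from keep::services; one representative sketch using the tag_names() helper added in types.rs below:

```rust
use keep::services::ItemWithMeta;

// tag_names() is the helper added on ItemWithMeta in types.rs below.
fn tags_line(item: &ItemWithMeta) -> String {
    item.tag_names().join(", ")
}
```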


@@ -74,7 +74,7 @@ impl StatusService {
settings: &Settings,
data_path: PathBuf,
db_path: PathBuf,
) -> anyhow::Result<StatusInfo> {
// Get meta plugins directly from config
let meta_plugin_types: Vec<MetaPluginType> =
crate::modes::common::settings_meta_plugin_types(cmd, settings);
@@ -91,10 +91,10 @@ impl StatusService {
db_path,
&meta_plugin_types,
enabled_compression_type,
)?;
// Add detailed filter plugins information
let filter_plugins_map = get_available_filter_plugins()?;
let mut filter_plugins_info = Vec::new();
for (name, creator) in filter_plugins_map {
@@ -114,7 +114,7 @@ impl StatusService {
// Add configured meta plugins information
status_info.configured_meta_plugins = settings.meta_plugins.clone();
Ok(status_info)
}
}
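With the fallible signature, call sites propagate instead of panicking; a sketch of a hypothetical caller (paths and types assumed from the surrounding code):

```rust
use std::path::PathBuf;
use keep::services::StatusService;

// Hypothetical caller; with the new signature, failures bubble up via `?`
// instead of panicking inside generate_status.
fn print_status(
    cmd: &mut clap::Command,
    settings: &keep::config::Settings,
    data_path: PathBuf,
    db_path: PathBuf,
) -> anyhow::Result<()> {
    let status = StatusService::new().generate_status(cmd, settings, data_path, db_path)?;
    // Render `status` however the caller needs; fields mirror StatusInfo.
    let _ = status;
    Ok(())
}
```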


@@ -1,364 +0,0 @@
use crate::common::status::StatusInfo;
use crate::compression_engine::{CompressionType, get_compression_engine};
use crate::config::Settings;
use crate::db::Item;
use crate::db::Meta;
use crate::modes::common::settings_compression_type;
use crate::services::data_service::DataService;
use crate::services::error::CoreError;
use crate::services::item_service::ItemService;
use crate::services::meta_service::MetaService;
use crate::services::status_service::StatusService;
use crate::services::types::{ItemWithContent, ItemWithMeta};
use clap::Command;
use rusqlite::Connection;
use std::collections::HashMap;
use std::io::{Cursor, Read, Write};
use std::path::{Path, PathBuf};
pub struct SyncDataService {
item_service: ItemService,
settings: Settings,
}
impl SyncDataService {
pub fn new(data_path: PathBuf, settings: Settings) -> Self {
Self {
item_service: ItemService::new(data_path),
settings,
}
}
pub fn with_connection(data_path: PathBuf, settings: Settings, _conn: &Connection) -> Self {
Self::new(data_path, settings)
}
pub fn item_service(&self) -> &ItemService {
&self.item_service
}
pub fn settings(&self) -> &Settings {
&self.settings
}
pub fn get_data_path(&self) -> &PathBuf {
self.item_service.get_data_path()
}
pub fn save_item<R: Read>(
&self,
content: R,
cmd: &mut Command,
settings: &Settings,
tags: &mut Vec<String>,
conn: &mut Connection,
) -> Result<Item, CoreError> {
self.item_service
.save_item(content, cmd, settings, tags, conn)
}
pub fn save_item_with_reader<R: Read>(
&self,
conn: &mut Connection,
reader: &mut R,
tags: Vec<String>,
metadata: HashMap<String, String>,
) -> Result<ItemWithMeta, CoreError> {
let mut cmd = Command::new("keep");
let settings = &self.settings;
let mut tags = tags;
// Read content from reader
let mut content = Vec::new();
reader.read_to_end(&mut content)?;
let item = self.save_item(&*content, &mut cmd, settings, &mut tags, conn)?;
let item_id = item
.id
.ok_or_else(|| CoreError::InvalidInput("Item missing ID".to_string()))?;
// Set metadata
for (key, value) in metadata {
crate::db::add_meta(conn, item_id, &key, &value)?;
}
self.get_item(conn, item_id)
}
/// Save an item with granular control over compression and meta plugins.
///
/// This method allows clients to control whether compression and meta plugins
/// run server-side or were already handled by the client.
///
/// # Arguments
///
/// * `conn` - Database connection.
/// * `content` - Raw content bytes.
/// * `tags` - Tags to associate with the item.
/// * `metadata` - Client-provided metadata.
/// * `compress` - Whether the server should compress the content.
/// * `run_meta` - Whether the server should run meta plugins.
///
/// # Returns
///
/// * `Result<ItemWithMeta, CoreError>` - The saved item with full details.
pub fn save_item_raw(
&self,
conn: &mut Connection,
content: &[u8],
tags: Vec<String>,
metadata: HashMap<String, String>,
compress: bool,
run_meta: bool,
) -> Result<ItemWithMeta, CoreError> {
let mut cursor = Cursor::new(content);
self.save_item_raw_streaming(conn, &mut cursor, tags, metadata, compress, run_meta)
}
/// Save an item from a streaming reader with granular control over compression.
///
/// Unlike `save_item_raw` which takes a pre-buffered `&[u8]`, this method
/// reads from the reader in chunks and writes directly to the compression
/// engine, avoiding buffering the entire content in memory.
pub fn save_item_raw_streaming(
&self,
conn: &mut Connection,
reader: &mut dyn Read,
tags: Vec<String>,
metadata: HashMap<String, String>,
compress: bool,
run_meta: bool,
) -> Result<ItemWithMeta, CoreError> {
let mut cmd = Command::new("keep");
let settings = &self.settings;
let mut tags = tags;
if tags.is_empty() {
tags.push("none".to_string());
}
let compression_type = if compress {
settings_compression_type(&mut cmd, settings)
} else {
CompressionType::None
};
let compression_engine = get_compression_engine(compression_type.clone())?;
let item_id;
let mut item;
{
item = crate::db::create_item(conn, compression_type.clone())?;
item_id = item
.id
.ok_or_else(|| CoreError::InvalidInput("Item missing ID".to_string()))?;
crate::db::set_item_tags(conn, item.clone(), &tags)?;
}
// Initialize meta plugins if requested
let meta_service = MetaService::new();
let mut plugins = if run_meta {
meta_service.get_plugins(&mut cmd, settings)
} else {
Vec::new()
};
if run_meta {
meta_service.initialize_plugins(&mut plugins, conn, item_id);
}
// Write content to file via streaming
let mut item_path = self.item_service.get_data_path().clone();
item_path.push(item_id.to_string());
let mut item_out = compression_engine.create(item_path)?;
let mut buffer = [0u8; crate::common::PIPESIZE];
let mut total_bytes = 0i64;
loop {
let n = reader.read(&mut buffer)?;
if n == 0 {
break;
}
item_out.write_all(&buffer[..n])?;
total_bytes += n as i64;
if run_meta {
meta_service.process_chunk(&mut plugins, &buffer[..n], conn, item_id);
}
}
item_out.flush()?;
drop(item_out);
// Finalize meta plugins
if run_meta {
meta_service.finalize_plugins(&mut plugins, conn, item_id);
}
// Add client-provided metadata
for (key, value) in &metadata {
crate::db::add_meta(conn, item_id, key, value)?;
}
item.size = Some(total_bytes);
crate::db::update_item(conn, item)?;
self.get_item(conn, item_id)
}
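For reference, the streaming save was driven roughly like this before the file's removal; a sketch assuming an open rusqlite Connection and a SyncDataService (tag, metadata, and flag values are illustrative):

```rust
use std::collections::HashMap;
use std::fs::File;
use rusqlite::Connection;
use keep::services::{CoreError, SyncDataService};

// Sketch of driving the (since-removed) streaming save path.
fn save_file_streaming(
    service: &SyncDataService,
    conn: &mut Connection,
    path: &str,
) -> Result<(), CoreError> {
    let mut reader = File::open(path)?;
    let mut metadata = HashMap::new();
    metadata.insert("source".to_string(), path.to_string());
    // compress = true: let the server compress; run_meta = false: assume
    // the client already ran meta plugins.
    let saved = service.save_item_raw_streaming(
        conn,
        &mut reader,
        vec!["upload".to_string()],
        metadata,
        true,
        false,
    )?;
    println!("saved item {:?}", saved.item.id);
    Ok(())
}
```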
pub fn get_item(&self, conn: &mut Connection, id: i64) -> Result<ItemWithMeta, CoreError> {
self.item_service.get_item(conn, id)
}
pub fn get_item_content(
&self,
conn: &Connection,
id: i64,
) -> Result<ItemWithContent, CoreError> {
self.item_service.get_item_content(conn, id)
}
pub fn get_item_content_streaming(
&self,
conn: &Connection,
id: i64,
) -> Result<(Box<dyn Read + Send>, ItemWithMeta), CoreError> {
let (reader, _mime, _is_binary) = self
.item_service
.get_item_content_info_streaming(conn, id, None)?;
let item_with_meta = self.item_service.get_item(conn, id)?;
Ok((reader, item_with_meta))
}
pub fn list_items(
&self,
conn: &mut Connection,
tags: Vec<String>,
meta: HashMap<String, String>,
) -> Result<Vec<ItemWithMeta>, CoreError> {
self.item_service.list_items(conn, &tags, &meta)
}
pub fn delete_item(&self, conn: &mut Connection, id: i64) -> Result<Item, CoreError> {
let item_with_meta = self.item_service.get_item(conn, id)?;
let item = item_with_meta.item.clone();
self.item_service.delete_item(conn, id)?;
Ok(item)
}
pub fn find_item(
&self,
conn: &mut Connection,
ids: Vec<i64>,
tags: Vec<String>,
meta: HashMap<String, String>,
) -> Result<ItemWithMeta, CoreError> {
self.item_service.find_item(conn, &ids, &tags, &meta)
}
pub fn generate_status(
&self,
cmd: &mut Command,
settings: &Settings,
data_path: PathBuf,
db_path: PathBuf,
) -> StatusInfo {
let status_service = StatusService::new();
status_service.generate_status(cmd, settings, data_path, db_path)
}
}
impl DataService for SyncDataService {
type Error = CoreError;
fn save<R: Read>(
&self,
content: R,
cmd: &mut Command,
settings: &Settings,
mut tags: Vec<String>,
conn: &mut Connection,
) -> Result<Item, Self::Error> {
if tags.is_empty() {
tags.push("none".to_string());
}
self.item_service
.save_item(content, cmd, settings, &mut tags, conn)
}
fn get(&self, conn: &mut Connection, id: i64) -> Result<ItemWithMeta, Self::Error> {
self.get_item(conn, id)
}
fn get_content(
&self,
conn: &mut Connection,
id: i64,
) -> Result<(Box<dyn Read + Send>, ItemWithMeta), Self::Error> {
self.get_item_content_streaming(conn, id)
}
fn list(
&self,
conn: &mut Connection,
tags: Vec<String>,
meta: HashMap<String, String>,
) -> Result<Vec<ItemWithMeta>, Self::Error> {
self.list_items(conn, tags, meta)
}
fn delete(&self, conn: &mut Connection, id: i64) -> Result<Item, Self::Error> {
self.delete_item(conn, id)
}
fn find_item(
&self,
conn: &mut Connection,
ids: Vec<i64>,
tags: Vec<String>,
meta: HashMap<String, String>,
) -> Result<ItemWithMeta, Self::Error> {
self.find_item(conn, ids, tags, meta)
}
fn get_items(
&self,
conn: &mut Connection,
ids: &[i64],
tags: &[String],
meta: &HashMap<String, String>,
) -> Result<Vec<ItemWithMeta>, Self::Error> {
if ids.is_empty() {
return self.list_items(conn, tags.to_vec(), meta.clone());
}
let mut results = Vec::new();
for id in ids {
match self.get_item(conn, *id) {
Ok(item) => results.push(item),
Err(CoreError::ItemNotFound(_)) => continue,
Err(e) => return Err(e),
}
}
Ok(results)
}
fn generate_status(
&self,
settings: &Settings,
data_path: &Path,
db_path: &Path,
) -> Result<StatusInfo, Self::Error> {
let status_service = StatusService::new();
let mut cmd = Command::new("keep");
Ok(status_service.generate_status(
&mut cmd,
settings,
data_path.to_path_buf(),
db_path.to_path_buf(),
))
}
}


@@ -40,6 +40,15 @@ impl ItemWithMeta {
.map(|m| (m.name, m.value))
.collect()
}
/// Returns a list of tag names for this item.
///
/// # Returns
///
/// `Vec<String>` - Tag names extracted from the tags list.
pub fn tag_names(&self) -> Vec<String> {
self.tags.iter().map(|t| t.name.clone()).collect()
}
}
#[derive(Debug, Clone, Serialize, Deserialize)]
@@ -53,3 +62,15 @@ pub struct ItemWithContent {
/// The content bytes.
pub content: Vec<u8>,
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct ItemInfo {
pub id: i64,
pub ts: String,
pub uncompressed_size: Option<i64>,
pub compressed_size: Option<i64>,
pub closed: bool,
pub compression: String,
pub tags: Vec<String>,
pub metadata: HashMap<String, String>,
}
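Since ItemInfo derives Serialize and Deserialize, the API response shape is mechanical; a sketch with illustrative values (serde_json assumed available):

```rust
use std::collections::HashMap;
use keep::services::ItemInfo;

fn main() {
    // Illustrative values only; in the server these come from the database.
    let info = ItemInfo {
        id: 42,
        ts: "2026-03-21T14:03:58Z".to_string(),
        uncompressed_size: Some(1024),
        compressed_size: Some(310),
        closed: true,
        compression: "zstd".to_string(),
        tags: vec!["upload".to_string()],
        metadata: HashMap::new(),
    };
    // The derived Serialize impl is what the API handler relies on.
    println!("{}", serde_json::to_string_pretty(&info).unwrap());
}
```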

Some files were not shown because too many files have changed in this diff.