Chromy/plans/11-improve-file-handling.md
2026-04-22 15:47:46 +02:00

11. Improve File Handling

Summary

Make file ingestion boundaries clearer by using pathlib.Path for paths, decoding text explicitly as UTF-8, and validating the path before reading.

Implementation Steps

  • Change internal file ingestion APIs to accept pathlib.Path instead of raw str.
  • Convert CLI string paths to Path in the command adapter or handler.
  • Validate that the path exists and is a regular file before reading.
  • Read text with encoding="utf-8".
  • Raise a clear app-level file error for missing paths, directories, and decoding failures.
  • Leave PDF and future file loaders out of scope for now.
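
The steps above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the names FileIngestionError and read_text_file are hypothetical, and the message wording is only one possibility.

```python
from pathlib import Path


class FileIngestionError(Exception):
    """Hypothetical app-level error for file ingestion failures."""


def read_text_file(path: Path) -> str:
    """Validate the path, then read it as UTF-8 text.

    Missing paths, non-regular files (such as directories), and decoding
    failures are all reported as FileIngestionError.
    """
    if not path.exists():
        raise FileIngestionError(f"File not found: {path}")
    if not path.is_file():
        raise FileIngestionError(f"Not a regular file: {path}")
    try:
        return path.read_text(encoding="utf-8")
    except UnicodeDecodeError as exc:
        raise FileIngestionError(f"File is not valid UTF-8: {path}") from exc
    except OSError as exc:
        # Covers permission problems and files removed after validation.
        raise FileIngestionError(f"Could not read file: {path}") from exc
```

Chaining with `from exc` keeps the original exception available for debugging while callers only need to handle one error type.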

Public Interface Changes

  • CLI argument remains a file path string.
  • Error messages for missing or invalid files state the path and the reason (missing path, directory, or decoding failure).
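
One way the boundary could look, assuming the CLI is built with argparse (the plan does not say which CLI library is used): the parser keeps accepting a plain string, and the conversion to Path happens once in the adapter. The program name "chromy", the argument name file_path, and the helper names are hypothetical.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Build a parser whose add-data command takes a path as a plain string."""
    parser = argparse.ArgumentParser(prog="chromy")  # hypothetical program name
    sub = parser.add_subparsers(dest="command", required=True)
    add_data = sub.add_parser("add-data")
    add_data.add_argument("file_path", help="path to a UTF-8 text file")
    return parser


def path_from_args(args: argparse.Namespace) -> Path:
    """Convert the CLI string to a Path at the adapter boundary."""
    return Path(args.file_path)
```

For example, `build_parser().parse_args(["add-data", "notes.txt"])` yields a namespace whose file_path is still a str, and only path_from_args hands a Path to the internal APIs.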

Test Plan

  • Test successful text-file loading.
  • Test missing file, directory path, and invalid UTF-8 handling.
  • Smoke test add-data with a valid UTF-8 file.
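
The unit-level cases above might be written along these lines. The sketch uses the standard-library unittest module so it runs without extra dependencies; the project's real test framework is not stated in the plan. FileIngestionError and read_text_file are hypothetical stand-ins, included only so the tests are self-contained.

```python
import tempfile
import unittest
from pathlib import Path


class FileIngestionError(Exception):
    """Hypothetical app-level error; stands in for the project's real one."""


def read_text_file(path: Path) -> str:
    """Stand-in loader so the tests below run on their own."""
    if not path.is_file():
        raise FileIngestionError(f"Not a readable file: {path}")
    try:
        return path.read_text(encoding="utf-8")
    except UnicodeDecodeError as exc:
        raise FileIngestionError(f"File is not valid UTF-8: {path}") from exc


class ReadTextFileTests(unittest.TestCase):
    def setUp(self) -> None:
        # Each test gets a fresh temporary directory that is removed afterwards.
        self._tmp = tempfile.TemporaryDirectory()
        self.addCleanup(self._tmp.cleanup)
        self.tmp_path = Path(self._tmp.name)

    def test_loads_utf8_text(self) -> None:
        target = self.tmp_path / "notes.txt"
        target.write_text("naïve café", encoding="utf-8")
        self.assertEqual(read_text_file(target), "naïve café")

    def test_missing_file(self) -> None:
        with self.assertRaises(FileIngestionError):
            read_text_file(self.tmp_path / "missing.txt")

    def test_directory_path(self) -> None:
        with self.assertRaises(FileIngestionError):
            read_text_file(self.tmp_path)

    def test_invalid_utf8(self) -> None:
        target = self.tmp_path / "binary.txt"
        target.write_bytes(b"\xff\xfe\x00")  # 0xff can never start a UTF-8 sequence
        with self.assertRaises(FileIngestionError):
            read_text_file(target)
```

Saved as a test module, these can be run with `python -m unittest`.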

Assumptions

  • Only plain text ingestion is supported in this plan.
  • Existing metadata can continue storing the original path string as file_name unless a later plan changes metadata shape.
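
A small sketch of how the second assumption might be honoured. Converting a string to Path and back can change it (for example, a leading "./" is dropped), so carrying the original string next to the Path keeps file_name unchanged. IngestionRequest, make_request, and build_metadata are hypothetical names, and the metadata is assumed to be a plain dict.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class IngestionRequest:
    """Hypothetical carrier for a CLI path argument."""

    original: str  # exactly what the user typed; stored as file_name
    path: Path     # used for validation and reading


def make_request(cli_arg: str) -> IngestionRequest:
    """Keep the raw CLI string alongside its Path form."""
    return IngestionRequest(original=cli_arg, path=Path(cli_arg))


def build_metadata(request: IngestionRequest) -> dict[str, str]:
    # Use the original string rather than str(request.path),
    # because pathlib may normalise the text of the path.
    return {"file_name": request.original}
```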