chroma
A small command-line utility for working with a local Chroma database. It lets you create collections, ingest file contents as chunked embeddings, and run similarity queries against stored documents.
What it does
- manages local Chroma collections
- chunks files with semchunk
- generates embeddings with Chroma's default embedding function
- stores chunk text plus source file metadata
- queries collections and prints readable results
Requirements
- Python 3.12+
- a local environment able to install the project dependencies in pyproject.toml
Installation
Using uv:
uv sync
Or with pip:
python -m venv .venv
source .venv/bin/activate
pip install -e .
Running the CLI
The project entrypoint is main.py.
uv run python main.py --help
Commands
list-collections | lc
create-collection | cc <collection>
delete-collection | dc <collection>
count | co <collection>
add-data | ad <collection> <file>
query | q <collection> <query_text>
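The command table above maps naturally onto subcommands with short aliases. The sketch below shows one way to declare them with the standard library's argparse. It is an illustration only: the README does not say which argument parser main.py uses, and build_parser is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the commands and short aliases listed above."""
    parser = argparse.ArgumentParser(prog="main.py")
    sub = parser.add_subparsers(dest="command", required=True)

    # list-collections takes no positional arguments.
    sub.add_parser("list-collections", aliases=["lc"])

    # These three commands each take a single collection name.
    for name, alias in [
        ("create-collection", "cc"),
        ("delete-collection", "dc"),
        ("count", "co"),
    ]:
        cmd = sub.add_parser(name, aliases=[alias])
        cmd.add_argument("collection")

    add = sub.add_parser("add-data", aliases=["ad"])
    add.add_argument("collection")
    add.add_argument("file")

    query = sub.add_parser("query", aliases=["q"])
    query.add_argument("collection")
    query.add_argument("query_text")
    return parser


args = build_parser().parse_args(["q", "notes", "How do I configure this project?"])
print(args.command, args.collection, args.query_text)
```

With argparse, args.command holds the name exactly as typed (here the alias q), so any dispatch code would need to accept both the long and the short form.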
Examples
Create a collection:
uv run python main.py create-collection notes
Add a file:
uv run python main.py add-data notes ./docs/example.txt
Count stored records:
uv run python main.py count notes
Search the collection:
uv run python main.py query notes "How do I configure this project?"
List collections:
uv run python main.py list-collections
Delete a collection:
uv run python main.py delete-collection notes
How ingestion works
When you run add-data, the file is:
- read from disk
- split into chunks
- embedded
- inserted into the target collection with the original file path stored as metadata
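The steps above can be sketched as a function that turns one file's text into the ids, documents, and metadatas lists that Chroma's collection.add accepts. This is a rough illustration, not the project's code: chunk_words is a fixed-size word splitter standing in for semchunk, and the id scheme and the "file" metadata key are assumptions.

```python
from pathlib import Path


def chunk_words(text: str, max_words: int = 50) -> list[str]:
    """Stand-in chunker: fixed-size word windows (the real tool uses semchunk)."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def build_records(path: Path, text: str, max_words: int = 50) -> dict:
    """Return ids, documents, and metadatas for one file's chunks."""
    chunks = chunk_words(text, max_words)
    return {
        # Hypothetical id scheme: file name plus chunk index.
        "ids": [f"{path.name}-{i}" for i in range(len(chunks))],
        "documents": chunks,
        # Hypothetical metadata key; the README only says the path is stored.
        "metadatas": [{"file": str(path)} for _ in chunks],
    }


# In the real flow the text would come from path.read_text().
records = build_records(Path("docs/example.txt"), "one two three four five", max_words=2)
print(records["ids"])
print(records["documents"])
```

With a real collection, passing these lists as collection.add(**records) would leave the embedding step to the collection's embedding function, since no precomputed embeddings are supplied.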
Query results include the stored document chunk, its id, distance, and file name when available.
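As a rough sketch of that output, the function below formats a result dictionary shaped like the one Chroma's collection.query returns, where each field holds one inner list per query text. The "file" metadata key, the output layout, and the sample values are made up for illustration.

```python
def format_results(result: dict) -> list[str]:
    """Turn the hits for the first query text into readable blocks."""
    ids = result["ids"][0]
    docs = result["documents"][0]
    distances = result["distances"][0]
    # Metadata may be absent for some or all records.
    metadatas = (result.get("metadatas") or [[None] * len(ids)])[0]

    blocks = []
    for doc_id, doc, dist, meta in zip(ids, docs, distances, metadatas):
        file_name = (meta or {}).get("file", "unknown file")
        blocks.append(f"[{doc_id}] distance={dist:.4f} file={file_name}\n{doc}")
    return blocks


# Illustrative sample only, not real query output.
sample = {
    "ids": [["example.txt-0"]],
    "documents": [["Run uv sync to install dependencies."]],
    "distances": [[0.1234]],
    "metadatas": [[{"file": "docs/example.txt"}]],
}
print("\n\n".join(format_results(sample)))
```

Falling back to a placeholder when metadata is missing matches the "when available" wording above.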
Notes
- collections are stored in a local persistent Chroma database
- add-data requires the target collection to already exist
- the CLI prints friendly messages for common errors such as missing collections or missing files