README.md

# chroma

A small command-line utility for working with a local Chroma database. It lets you create collections, ingest file contents as chunked embeddings, and run similarity queries against stored documents.

## What it does

- manages local Chroma collections
- chunks files with `semchunk`
- generates embeddings with Chroma's default embedding function
- stores chunk text plus source file metadata
- queries collections and prints readable results

## Requirements

- Python 3.12+
- a local environment able to install the project dependencies in `pyproject.toml`

## Installation

Using `uv`:

```bash
uv sync
```

Or with pip:

```bash
python -m venv .venv
source .venv/bin/activate
pip install -e .
```

## Running the CLI

The project entrypoint is `main.py`.

```bash
uv run python main.py --help
```

## Commands

```text
list-collections | lc
create-collection | cc <collection>
delete-collection | dc <collection>
count | co <collection>
add-data | ad <collection> <file>
query | q <collection> <query_text>
```
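Each command has a long name and a short alias, which maps naturally onto `argparse` subcommands. The following is a minimal sketch of how such a parser could be wired up, not a copy of `main.py`; the function name `build_parser` and the `dest` name `command` are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the command table (hypothetical layout)."""
    parser = argparse.ArgumentParser(prog="main.py")
    sub = parser.add_subparsers(dest="command", required=True)

    # Each subcommand registers its short alias next to the long name.
    sub.add_parser("list-collections", aliases=["lc"])

    # Commands that take only a collection name.
    for name, alias in [
        ("create-collection", "cc"),
        ("delete-collection", "dc"),
        ("count", "co"),
    ]:
        p = sub.add_parser(name, aliases=[alias])
        p.add_argument("collection")

    add = sub.add_parser("add-data", aliases=["ad"])
    add.add_argument("collection")
    add.add_argument("file")

    query = sub.add_parser("query", aliases=["q"])
    query.add_argument("collection")
    query.add_argument("query_text")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

With this layout, `q notes "some text"` and `query notes "some text"` parse to the same positional arguments.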

### Examples

Create a collection:

```bash
uv run python main.py create-collection notes
```

Add a file:

```bash
uv run python main.py add-data notes ./docs/example.txt
```

Count stored records:

```bash
uv run python main.py count notes
```

Search the collection:

```bash
uv run python main.py query notes "How do I configure this project?"
```

List collections:

```bash
uv run python main.py list-collections
```

Delete a collection:

```bash
uv run python main.py delete-collection notes
```

## How ingestion works

When you run `add-data`, the file is:

1. read from disk
2. split into chunks
3. embedded
4. inserted into the target collection with the original file path stored as metadata
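
The steps above can be sketched as a function that prepares the ids, documents, and metadata for one file. This is an illustration under stated assumptions, not the project's code: `naive_chunk` is a simple paragraph packer standing in for `semchunk`, and the id scheme and the `file_path` metadata key are hypothetical. The embedding step is left out; when documents are added to a Chroma collection without precomputed embeddings, Chroma computes them with the collection's embedding function.

```python
from pathlib import Path


def naive_chunk(text: str, max_chars: int = 500) -> list[str]:
    """Stand-in for semchunk: pack paragraphs into chunks of about max_chars.

    A single paragraph longer than max_chars is kept whole.
    """
    chunks: list[str] = []
    current = ""
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars:
            current = candidate
        else:
            if current:
                chunks.append(current)
            current = para
    if current:
        chunks.append(current)
    return chunks


def prepare_records(path: Path, max_chars: int = 500) -> dict[str, list]:
    """Read a file, chunk it, and build the arguments for an insert."""
    text = path.read_text(encoding="utf-8")  # 1. read from disk
    chunks = naive_chunk(text, max_chars)    # 2. split into chunks
    return {
        # Hypothetical id scheme: file name plus chunk index.
        "ids": [f"{path.name}-{i}" for i in range(len(chunks))],
        "documents": chunks,
        # Hypothetical metadata key holding the original file path.
        "metadatas": [{"file_path": str(path)} for _ in chunks],
    }
```

With a real collection object, the result could be passed on as `collection.add(**records)`, which covers steps 3 and 4.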

Each query result shows the stored chunk text, its id, its distance from the query, and the source file name when available.
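
Chroma returns query results as parallel nested lists (`ids`, `documents`, `metadatas`, `distances`), with one inner list per query text. The sketch below shows one way such a result could be flattened into readable lines. The function name `format_results`, the `file_path` metadata key, and the sample values are made up for illustration, and the output layout is not necessarily what this CLI prints.

```python
def format_results(result: dict, query_index: int = 0) -> list[str]:
    """Turn one query's nested result lists into printable lines."""
    ids = result["ids"][query_index]
    docs = result["documents"][query_index]
    dists = result["distances"][query_index]
    metas = result["metadatas"][query_index]

    lines = []
    for rank, (doc_id, doc, dist, meta) in enumerate(
        zip(ids, docs, dists, metas), start=1
    ):
        # "file_path" is a hypothetical key; fall back when metadata is absent.
        source = (meta or {}).get("file_path", "unknown file")
        lines.append(f"{rank}. [{doc_id}] distance={dist:.4f} source={source}")
        lines.append(f"   {doc}")
    return lines


# Made-up sample shaped like a Chroma query result for a single query text.
sample = {
    "ids": [["example.txt-0", "example.txt-1"]],
    "documents": [["Set the database path first.", "Then create a collection."]],
    "distances": [[0.2134, 0.5]],
    "metadatas": [[{"file_path": "docs/example.txt"}, None]],
}

if __name__ == "__main__":
    print("\n".join(format_results(sample)))
```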

## Notes

- collections are stored in a local persistent Chroma database
- `add-data` requires the target collection to already exist
- the CLI prints friendly messages for common errors such as missing collections or missing files
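
The last two notes can be illustrated with a small sketch of how expected failures might be turned into short messages instead of tracebacks. Everything here is hypothetical: `CollectionNotFound`, `add_data`, `run`, and the message wording are illustrative names, and a plain `set` of names stands in for the Chroma database.

```python
import sys
from pathlib import Path


class CollectionNotFound(Exception):
    """Hypothetical error raised when a named collection does not exist."""


def add_data(collections: set[str], name: str, file: Path) -> str:
    """Validate inputs the way `add-data` is described; storage is omitted."""
    if name not in collections:
        raise CollectionNotFound(name)
    if not file.is_file():
        raise FileNotFoundError(str(file))
    return f"Added {file} to '{name}'."


def run(collections: set[str], name: str, file: Path) -> int:
    """Translate expected failures into short messages and an exit code."""
    try:
        print(add_data(collections, name, file))
    except CollectionNotFound as exc:
        print(
            f"Collection '{exc}' does not exist. "
            "Create it first with create-collection.",
            file=sys.stderr,
        )
        return 1
    except FileNotFoundError as exc:
        print(f"File not found: {exc}", file=sys.stderr)
        return 1
    return 0
```

Returning a non-zero exit code on failure keeps the command usable in shell scripts.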

add readme 2026-04-21 20:13:28 +02:00