add readme

# chroma

A small command-line utility for working with a local Chroma database. It lets you create collections, ingest file contents as chunked embeddings, and run similarity queries against stored documents.

## What it does

- manages local Chroma collections
- chunks files with `semchunk`
- generates embeddings with Chroma's default embedding function
- stores chunk text plus source file metadata
- queries collections and prints readable results

## Requirements

- Python 3.12+
- a local environment that can install the project dependencies listed in `pyproject.toml`

## Installation

Using `uv`:

```bash
uv sync
```

Or with `pip`:

```bash
python -m venv .venv
source .venv/bin/activate
pip install -e .
```

## Running the CLI

The project entrypoint is `main.py`.

```bash
uv run python main.py --help
```

## Commands

Each command has a short alias, shown after the `|`.

```text
list-collections | lc
create-collection | cc <collection>
delete-collection | dc <collection>
count | co <collection>
add-data | ad <collection> <file>
query | q <collection> <query_text>
```

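
One way to wire up commands with short aliases like these is `argparse` subparsers. The following is a minimal sketch, not the project's actual `main.py`: the `COMMANDS` table, `build_parser`, and the `canonical` attribute are hypothetical names introduced for illustration, and only the command names, aliases, and arguments come from the list above.

```python
import argparse

# Command name, alias, and positional arguments, copied from the command list.
COMMANDS = [
    ("list-collections", "lc", []),
    ("create-collection", "cc", ["collection"]),
    ("delete-collection", "dc", ["collection"]),
    ("count", "co", ["collection"]),
    ("add-data", "ad", ["collection", "file"]),
    ("query", "q", ["collection", "query_text"]),
]


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with one subcommand (plus alias) per entry in COMMANDS."""
    parser = argparse.ArgumentParser(prog="main.py")
    subparsers = parser.add_subparsers(dest="command", required=True)
    for name, alias, positionals in COMMANDS:
        sub = subparsers.add_parser(name, aliases=[alias])
        for arg in positionals:
            sub.add_argument(arg)
        # Record the full command name so an alias dispatches the same way.
        sub.set_defaults(canonical=name)
    return parser


# The alias "ad" resolves to the same subcommand as "add-data".
args = build_parser().parse_args(["ad", "notes", "./docs/example.txt"])
print(args.canonical, args.collection, args.file)
```

Storing the full name with `set_defaults` means dispatch code only has to compare against one spelling per command, whichever form the user typed.
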
### Examples

Create a collection:

```bash
uv run python main.py create-collection notes
```

Add a file:

```bash
uv run python main.py add-data notes ./docs/example.txt
```

Count stored records:

```bash
uv run python main.py count notes
```

Search the collection:

```bash
uv run python main.py query notes "How do I configure this project?"
```

List collections:

```bash
uv run python main.py list-collections
```

Delete a collection:

```bash
uv run python main.py delete-collection notes
```

## How ingestion works

When you run `add-data`, the file is:

1. read from disk
2. split into chunks
3. embedded
4. inserted into the target collection with the original file path stored as metadata

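
The four steps above can be sketched roughly as follows. This is an illustration under stated assumptions rather than the project's code: `split_into_chunks` is a dependency-free stand-in for the `semchunk` step, and the ID scheme, the `file` metadata key, and the function names are all hypothetical. Only `chromadb.PersistentClient`, `get_collection`, and `Collection.add` are real Chroma calls.

```python
from pathlib import Path


def split_into_chunks(text: str, max_chars: int = 200) -> list:
    """Stand-in chunker: greedily packs paragraphs into chunks of about max_chars.

    The project uses `semchunk` for this step. An oversized paragraph is kept whole.
    """
    chunks, current = [], ""
    for paragraph in text.split("\n\n"):
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            chunks.append(current)
            current = paragraph
    if current:
        chunks.append(current)
    return chunks


def build_records(file_path: str, text: str, max_chars: int = 200) -> dict:
    """Prepare ids, documents, and metadatas for one file (assumed naming scheme)."""
    chunks = split_into_chunks(text, max_chars)
    return {
        "ids": [f"{Path(file_path).name}-{i}" for i in range(len(chunks))],
        "documents": chunks,
        "metadatas": [{"file": file_path} for _ in chunks],
    }


def ingest(db_path: str, collection_name: str, file_path: str) -> int:
    """Steps 1-4: read, chunk, embed, insert. Requires `chromadb` to be installed."""
    import chromadb  # imported lazily so the helpers above run without it

    text = Path(file_path).read_text(encoding="utf-8")  # 1. read from disk
    records = build_records(file_path, text)            # 2. split into chunks
    if not records["ids"]:
        return 0  # nothing to insert for an empty file
    client = chromadb.PersistentClient(path=db_path)
    collection = client.get_collection(collection_name)
    # 3 and 4. With no explicit embeddings, Chroma embeds the documents
    # using the collection's embedding function while inserting them.
    collection.add(**records)
    return len(records["ids"])


records = build_records(
    "docs/example.txt", "First paragraph.\n\nSecond paragraph.", max_chars=20
)
print(records["ids"])
```

Note that IDs built from the bare file name would collide for two files with the same name in different directories, so a real implementation would likely need a more unique scheme.
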
Query results include the stored document chunk, its ID, its distance from the query, and the source file name when available.

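
As a rough illustration of how such results might be printed, the sketch below formats a dictionary shaped like the return value of Chroma's `Collection.query` (parallel lists of lists under `ids`, `documents`, `distances`, and `metadatas`). The `format_results` helper, the `file` metadata key, the output layout, and the sample values are assumptions made up for this example.

```python
def format_results(result: dict) -> str:
    """Render the hits for the first query in a Chroma-style result dict."""
    ids = result["ids"][0]
    documents = result["documents"][0]
    distances = result["distances"][0]
    # Metadata may be absent entirely, or None for individual records.
    metadatas = (result.get("metadatas") or [[None] * len(ids)])[0]

    lines = []
    rows = zip(ids, documents, distances, metadatas)
    for rank, (doc_id, doc, dist, meta) in enumerate(rows, start=1):
        file_name = (meta or {}).get("file", "unknown")
        lines.append(f"{rank}. [{doc_id}] distance={dist:.4f} file={file_name}")
        lines.append(f"   {doc}")
    return "\n".join(lines)


# Made-up sample data in the shape returned by Collection.query.
sample = {
    "ids": [["example.txt-0", "example.txt-1"]],
    "documents": [["Set the database path first.", "Then create a collection."]],
    "distances": [[0.25, 0.5]],
    "metadatas": [[{"file": "docs/example.txt"}, None]],
}
print(format_results(sample))
```

A smaller distance means a closer match, so results are usually shown in ascending order of distance, which is the order Chroma returns them in.
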
## Notes

- collections are stored in a local persistent Chroma database
- `add-data` requires the target collection to already exist
- the CLI prints friendly messages for common errors such as missing collections or missing files

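
The checks described in these notes could look something like the sketch below. It is hypothetical: `preflight_add_data` and the message wording are invented for illustration, and `existing` stands for the names of the collections already in the database, however the CLI obtains them.

```python
from pathlib import Path
from typing import List, Optional


def preflight_add_data(
    collection: str, existing: List[str], file_path: str
) -> Optional[str]:
    """Return a friendly error message, or None when `add-data` can proceed."""
    if collection not in existing:
        return (
            f"Collection '{collection}' does not exist. "
            f"Create it first with: create-collection {collection}"
        )
    if not Path(file_path).is_file():
        return f"File not found: {file_path}"
    return None


# "notes" is not among the existing collections, so a message is returned.
print(preflight_add_data("notes", ["drafts"], "./docs/example.txt"))
```

Returning a message instead of raising keeps the check easy to test and lets the caller decide how to print it and which exit code to use.
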