davide
7d95872a8e
step-8: add ingest.py, align README
...
- ingest.py: embed chunks via Ollama nomic-embed-text, index in
ChromaDB (cosine space); --stem / --force / batch-100 / ETA display
- README: fix step-8 input path (step-5 → step-6), script path
(scripts/ → step-8/), add --force explanation and real timings
2026-04-14 10:59:40 +02:00
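The batch-100 / ETA behaviour described in this commit could be sketched roughly as follows. This is a minimal illustration, not the code in ingest.py: `batched`, `eta_seconds`, `index_all` and the `index_batch` callback are hypothetical names, and the actual embedding (Ollama) and indexing (ChromaDB) calls are left to the caller.

```python
import time
from typing import Callable, Iterator, Sequence

BATCH_SIZE = 100  # mirrors the "batch-100" in the commit message


def batched(items: Sequence[str], size: int = BATCH_SIZE) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def eta_seconds(done: int, total: int, elapsed: float) -> float:
    """Linear estimate of remaining time from the average time per item so far."""
    if done == 0:
        return float("inf")
    return elapsed / done * (total - done)


def index_all(chunks: Sequence[str], index_batch: Callable[[Sequence[str]], None]) -> int:
    """Feed chunks to `index_batch` (hypothetical: embed + store) and print progress."""
    started = time.monotonic()
    done = 0
    for batch in batched(chunks):
        index_batch(batch)
        done += len(batch)
        eta = eta_seconds(done, len(chunks), time.monotonic() - started)
        print(f"{done}/{len(chunks)} chunks, ETA {eta:.0f}s")
    return done
```

Keeping the embedding call behind a callback means the batching and ETA logic can be exercised without a running Ollama or ChromaDB instance.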
davide
a5f8b8d119
step-7: add check_env.py, README, update requirements
...
- check_env.py: verifies ollama, embedding model, LLM model, chromadb
- Detects any installed embedding/LLM model (no fixed list)
- step-7/README.md: install/uninstall guide for Ollama, models, chromadb
- requirements.txt: adds chromadb for step-8
2026-04-14 07:54:04 +02:00
davide
e70a9a41f0
step-6: add fix_chunks.py, make step-6 self-contained
...
- verify_chunks.py now reads from step-6/<stem>/chunks.json and
auto-copies from step-5 on first run (input and output both in step-6)
- fix_chunks.py: new script that applies fixes directly on chunks.json
(merge too-short/incomplete, split too-long, remove empty, add prefix);
supports --dry-run to preview changes before applying
- step6-fix.md skill updated to use fix_chunks.py workflow:
dry-run → user approval → apply → re-verify
2026-04-13 23:56:50 +02:00
davide
5126e0d971
step-5: add adaptive chunker
...
chunker.py splits any revised Markdown (step-4) into RAG-ready chunks.
Supports 4 strategies driven by structure_profile.json: h3_aware,
h2_paragraph_split, paragraph, sliding_window. Respects MIN/MAX_CHARS
and sentence-level overlap. Updates .gitignore and README paths.
2026-04-13 13:48:51 +02:00
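As an illustration of the sliding_window strategy with sentence-level overlap mentioned in this commit, here is a simplified sketch. It is not the code in chunker.py: the function names, the regex-based sentence splitter, and the `MAX_CHARS` / `OVERLAP_SENTENCES` values are assumptions, and MIN_CHARS handling is omitted.

```python
import re
from typing import List

MAX_CHARS = 200          # hypothetical limit; the real value lives in chunker.py
OVERLAP_SENTENCES = 1    # hypothetical: sentences repeated at the start of the next chunk


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def sliding_window(text: str, max_chars: int = MAX_CHARS,
                   overlap: int = OVERLAP_SENTENCES) -> List[str]:
    """Pack sentences into chunks, carrying the last `overlap` sentences forward."""
    chunks: List[str] = []
    current: List[str] = []
    for sentence in split_sentences(text):
        candidate = " ".join(current + [sentence])
        if current and len(candidate) > max_chars:
            # Close the current chunk and seed the next one with its tail.
            chunks.append(" ".join(current))
            current = current[-overlap:] if overlap > 0 else []
        current.append(sentence)
    if current:
        chunks.append(" ".join(current))
    return chunks
```

In this simplified form the carried-over sentences are not counted against `max_chars` before the next sentence is appended, so a chunk can slightly exceed the limit; a stricter version would need to re-check the length after seeding the overlap.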
davide
1631dff80d
step-4: add revise.py, step4-review skill, README update
...
- revise.py: automatic pre-processing (ALL-CAPS→##, numbered sections→###,
TOC removal, broken paragraph merging, whitespace normalization);
supports N. and Na. numbering patterns; universal heuristics
- .claude/commands/step4-review.md: Claude Code skill for qualitative
review of clean.md (🔴/🟡/🟢 report + interactive fixes)
- README: document step-4 workflow with revise.py and /step4-review
- .gitignore: exclude step-4/*/ and step-4/revision_log.md
2026-04-13 12:21:30 +02:00
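The two heading heuristics named in this commit (ALL-CAPS→## and numbered sections→###) might look something like the sketch below. It is an assumption-laden illustration, not revise.py itself: `promote_headings` and the `NUMBERED` pattern are hypothetical, and "N. and Na." is read here as numbers optionally followed by one lowercase letter (for example `3.` or `3a.`).

```python
import re
from typing import Iterable, List

# Assumed reading of the "N." and "Na." patterns: "3. Title" or "3a. Title".
NUMBERED = re.compile(r"^\d+[a-z]?\.\s+\S")


def promote_headings(lines: Iterable[str]) -> List[str]:
    """Turn numbered section lines into ### and ALL-CAPS lines into ## headings."""
    out: List[str] = []
    for line in lines:
        stripped = line.strip()
        letter_count = sum(1 for c in stripped if c.isalpha())
        if stripped.startswith("#"):
            out.append(line)                  # already a Markdown heading
        elif NUMBERED.match(stripped):
            out.append("### " + stripped)     # numbered section
        elif letter_count >= 3 and stripped == stripped.upper():
            out.append("## " + stripped)      # ALL-CAPS title
        else:
            out.append(line)
    return out
```

A pattern this simple would also promote ordinary numbered list items such as `1. first point`, so a real implementation presumably needs extra context checks to stay safe on arbitrary documents.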
davide
638ba17629
Initial commit
2026-04-12 23:53:13 +02:00