davide
|
5126e0d971
|
step-5: add adaptive chunker
chunker.py splits any revised Markdown (step-4) into RAG-ready chunks.
Supports 4 strategies driven by structure_profile.json: h3_aware,
h2_paragraph_split, paragraph, sliding_window. Respects MIN/MAX_CHARS
and sentence-level overlap. Updates .gitignore and README paths.
|
2026-04-13 13:48:51 +02:00 |
|
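As an illustration of the step-5 commit message above, the `sliding_window` strategy with sentence-level overlap might look roughly like the sketch below. This is not the code in chunker.py: the function names, the regex-based sentence splitter, the `overlap_sentences` parameter, and the `MAX_CHARS` value are all assumptions made for the example, and MIN_CHARS handling is left out.

```python
import re

MAX_CHARS = 1000  # assumed value; the real limit lives in chunker.py


def split_sentences(text):
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def sliding_window_chunks(text, max_chars=MAX_CHARS, overlap_sentences=1):
    """Greedily pack sentences into chunks of at most max_chars characters,
    repeating the last `overlap_sentences` sentences at the start of the
    next chunk when they still fit."""
    chunks = []
    current = []
    for sentence in split_sentences(text):
        candidate = " ".join(current + [sentence])
        if current and len(candidate) > max_chars:
            # Close the current chunk and carry the overlap forward.
            chunks.append(" ".join(current))
            current = current[-overlap_sentences:] if overlap_sentences > 0 else []
            # Drop the overlap if it would push the new chunk over the limit.
            if len(" ".join(current + [sentence])) > max_chars:
                current = []
        current.append(sentence)
    if current:
        chunks.append(" ".join(current))
    return chunks
```

A single sentence longer than `max_chars` still ends up as one oversized chunk in this sketch; a fuller implementation would need a fallback split for that case.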
davide
|
1631dff80d
|
step-4: add revise.py, step4-review skill, README update
- revise.py: automatic pre-processing (ALL-CAPS→##, numbered sections→###,
TOC removal, broken paragraph merging, whitespace normalization);
supports N. and Na. numbering patterns; universal heuristics
- .claude/commands/step4-review.md: Claude Code skill for qualitative
review of clean.md (🔴/🟡/🟢 report + interactive fixes)
- README: document step-4 workflow with revise.py and /step4-review
- .gitignore: exclude step-4/*/ and step-4/revision_log.md
|
2026-04-13 12:21:30 +02:00 |
|
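The heading rules named in the step-4 commit message above (ALL-CAPS lines become `##`, numbered sections become `###`) could be sketched as follows. This is a hypothetical reconstruction, not the contents of revise.py: the function names and the regex are assumptions, and "Na." is read here as a number followed by one lowercase letter (for example `2a.`).

```python
import re

# Assumed reading of the "N." and "Na." patterns: "3. Title" or "3a. Title".
NUMBERED = re.compile(r"^(\d+[a-z]?)\.\s+(\S.*)$")


def is_all_caps(line):
    # True when the line has at least three letters and all of them are uppercase.
    letters = [c for c in line if c.isalpha()]
    return len(letters) >= 3 and all(c.isupper() for c in letters)


def promote_headings(markdown):
    """Turn numbered section lines into ### headings and ALL-CAPS lines
    into ## headings; leave every other line unchanged."""
    out = []
    for line in markdown.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            out.append(line)  # already a Markdown heading
        elif NUMBERED.match(stripped):
            out.append("### " + stripped)
        elif is_all_caps(stripped):
            out.append("## " + stripped)
        else:
            out.append(line)
    return "\n".join(out)
```

This naive version would also promote ordinary ordered-list items such as `1. do this`, so a real pre-processor would need extra checks (line length, surrounding context) to tell section titles from list entries.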