Commit 642349c
refactor(pipeline): replace heading-level chunking with greedy token-based chunker
- Drop --chunk-level and auto-detect logic; chunk by flatten+greedy merge
- Improve verbose chunk preview with token count + head/tail snippet
- Interactive mode uses minimum heading level
- Add DocumentTree.get_chunks_at_level() and tests

1 parent cf42bda commit 642349c
5 files changed
Lines changed: 260 additions & 84 deletions
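
The diff body is not reproduced here, so the following is only a minimal sketch of what "flatten + greedy merge" could look like, not the code from this commit. It assumes the document has already been flattened into an ordered list of sections and that chunks are filled up to a token budget. `Section`, `count_tokens`, `greedy_chunks`, and `max_tokens` are hypothetical names chosen for illustration, and the whitespace tokenizer is a stand-in for whatever tokenizer the pipeline actually uses.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Section:
    """Hypothetical flattened section: one heading plus its body text."""
    heading: str
    text: str


def count_tokens(text: str) -> int:
    # Stand-in tokenizer (assumption): counts whitespace-separated words.
    return len(text.split())


def greedy_chunks(sections: List[Section], max_tokens: int) -> List[List[Section]]:
    """Merge consecutive sections into chunks without exceeding max_tokens.

    A section larger than the budget is not split; it becomes its own chunk.
    """
    chunks: List[List[Section]] = []
    current: List[Section] = []
    current_tokens = 0

    for section in sections:
        n = count_tokens(section.text)
        # Close the current chunk if adding this section would overflow it.
        if current and current_tokens + n > max_tokens:
            chunks.append(current)
            current, current_tokens = [], 0
        current.append(section)
        current_tokens += n

    # Flush whatever is left after the last section.
    if current:
        chunks.append(current)
    return chunks
```

With a budget of 6 tokens, sections of 3, 3, 5, and 1 tokens would be grouped as two chunks: the first two sections together, then the last two. Unlike heading-level chunking, the boundaries depend only on size and order, which is presumably why `--chunk-level` is no longer needed.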