[fix](scan) Fix adaptive load batch sizing #63245

Open
mrhhsg wants to merge 1 commit into apache:branch-4.1 from mrhhsg:codex/fix-load-adaptive-batch-size-4.1-v2


mrhhsg (Member) commented May 14, 2026

Summary

Related PR: #63005

  • update the file scan load path to feed converted blocks into adaptive batch size prediction before and after varchar truncation

Root Cause

The load path initialized adaptive batch sizing but did not update the predictor after converting loaded source blocks. As a result, the reader could keep using the initial small probe size.
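To illustrate why a missing update keeps the reader at the probe size, here is a minimal sketch of a row-count predictor. It is not the Doris implementation: the class name `BatchSizePredictorSketch`, its constructor parameters, and the average-bytes-per-row heuristic are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical predictor for illustration only; not the Doris class.
class BatchSizePredictorSketch {
public:
    BatchSizePredictorSketch(std::size_t target_bytes, std::size_t probe_rows,
                             std::size_t max_rows)
            : _target_bytes(target_bytes), _probe_rows(probe_rows), _max_rows(max_rows) {
        assert(probe_rows > 0 && probe_rows <= max_rows);
    }

    // Feed the row count and byte size of one materialized block.
    void update(std::size_t rows, std::size_t bytes) {
        if (rows == 0 || bytes == 0) {
            return; // an empty block carries no sizing information
        }
        _seen_rows += rows;
        _seen_bytes += bytes;
    }

    // Rows to request for the next batch.
    std::size_t predict_rows() const {
        if (_seen_rows == 0) {
            return _probe_rows; // never updated: stuck at the initial probe size
        }
        // target_bytes / (seen_bytes / seen_rows), kept in integer arithmetic.
        // The sketch ignores overflow of the intermediate product.
        std::size_t rows = _target_bytes * _seen_rows / _seen_bytes;
        return std::max<std::size_t>(1, std::min(rows, _max_rows));
    }

private:
    std::size_t _target_bytes;
    std::size_t _probe_rows;
    std::size_t _max_rows;
    std::size_t _seen_rows = 0;
    std::size_t _seen_bytes = 0;
};
```

In this sketch, a caller that never invokes `update()` gets `probe_rows` back from every `predict_rows()` call, which mirrors the behavior described above. Once a converted block is fed in, the prediction moves toward the byte target.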

Validation

  • git diff --check
  • git diff --cached --check
  • ./run-regression-test.sh --run -d unique_with_mow_c_p0 -s test_compact_with_seq2

hello-stephen (Contributor) commented

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

mrhhsg force-pushed the codex/fix-load-adaptive-batch-size-4.1-v2 branch from 6e85a65 to 9b2cada on May 14, 2026 07:43
mrhhsg (Member, Author) commented May 14, 2026

run buildall

mrhhsg (Member, Author) commented May 14, 2026

/review

github-actions (Bot, Contributor) left a comment

Automated review summary for PR 63245 at 9b2cada8649f99e29ec94850348ed4efdef50116:

  • Scope reviewed: full PR diff against base 97d04001d7ef49a7e10589d242fd16ce8fa0dc6b; the PR changes only be/src/exec/scan/file_scanner.cpp by adding adaptive batch-size updates before and after varchar/char truncation.
  • Existing review context: no existing inline review comments or replies were present, so no duplicate concerns were found.
  • Correctness checkpoint: no blocking correctness issue found. The predictor is updated after conversion and before truncation, which matches the stated goal of learning from the pre-truncation materialized block; the post-truncation call only records observability counters and does not affect prediction.
  • Safety/regression checkpoint: no new error path or ownership/lifetime issue found. The helper methods already guard _should_run_adaptive_batch_size() and tolerate empty blocks before calling update().
  • Compatibility checkpoint: no external API, serialization, or persisted data compatibility impact found.
  • Testing/validation checkpoint: git diff --check 97d04001d7ef49a7e10589d242fd16ce8fa0dc6b 9b2cada8649f99e29ec94850348ed4efdef50116 passed locally in the runner. I did not run the broader Doris regression suite due to scope/runtime.
  • User focus: no additional user-provided review focus was specified.
  • Repository review instructions: no AGENTS.md or repository code-review skill file was present in this checkout, so I applied the explicit review prompt requirements.
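The call ordering described in the correctness and safety checkpoints can be sketched roughly as follows. This is not the code in be/src/exec/scan/file_scanner.cpp: `BlockSketch`, `LoadPathSketch`, and every member name are hypothetical stand-ins, and only the ordering (predictor update before truncation, counter recording after it, both guarded) is taken from the summary above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// All types here are hypothetical stand-ins, not Doris classes.
struct BlockSketch {
    std::vector<std::string> values; // a single string column
    std::size_t rows() const { return values.size(); }
    std::size_t bytes() const {
        std::size_t total = 0;
        for (const auto& v : values) total += v.size();
        return total;
    }
};

struct LoadPathSketch {
    explicit LoadPathSketch(bool enabled) : adaptive_enabled(enabled) {}

    bool should_run_adaptive_batch_size() const { return adaptive_enabled; }

    // After conversion, before truncation: the input the predictor learns from.
    void update_predictor(const BlockSketch& block) {
        if (!should_run_adaptive_batch_size() || block.rows() == 0) return;
        ++predictor_updates;
        predictor_bytes += block.bytes();
    }

    // After truncation: observability counter only, no effect on prediction.
    void record_truncated(const BlockSketch& block) {
        if (!should_run_adaptive_batch_size() || block.rows() == 0) return;
        truncated_bytes_counter += block.bytes();
    }

    void process(BlockSketch& block, std::size_t max_len) {
        update_predictor(block);
        for (auto& v : block.values) {
            if (v.size() > max_len) v.resize(max_len); // varchar/char truncation
        }
        assert(block.bytes() <= block.rows() * max_len);
        record_truncated(block);
    }

    bool adaptive_enabled;
    std::size_t predictor_updates = 0;
    std::size_t predictor_bytes = 0;
    std::size_t truncated_bytes_counter = 0;
};
```

In this sketch, the predictor sees the pre-truncation size while the counter sees the post-truncation size, so the two values can differ for the same block. Empty blocks and a disabled feature leave both untouched.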

Opinion: no critical blocking issues found.

mrhhsg marked this pull request as ready for review May 14, 2026 13:37
mrhhsg requested a review from yiguolei as a code owner May 14, 2026 13:37