CLDSRV-908: CopyObject handle checksums #6176

Open
leif-scality wants to merge 8 commits into development/9.4 from improvement/CLDSRV-908-copy-object-handle-checksums

@leif-scality
Contributor

@leif-scality leif-scality commented May 27, 2026

  • Forward the src object's checksum to the dest object
  • Recompute the checksum when required (x-amz-checksum-algorithm header set, src object is an MPU with a COMPOSITE checksum, ...). For external backends, if recomputing would require a new GET, the checksum is not recalculated and the dest object gets no checksum. For local backends we always recalculate, even if that requires a GET.
  • Compute a CRC64NVME checksum for the dest object if the src object has no checksum (this counts as a recompute, so the same conditions as above apply)
  • PFS and TLP are both treated as external backends
  ┌─────┬─────────────────────┬────────────────────────────────────┬──────────────────────────────────────────────────────────────────┐
  │  #  │   Source checksum   │ x-amz-checksum-algorithm requested │                            Recompute?                            │
  ├─────┼─────────────────────┼────────────────────────────────────┼──────────────────────────────────────────────────────────────────┤
  │ 1   │ none                │ none                               │ Yes (Compute default CRC64NVME)                                  │
  ├─────┼─────────────────────┼────────────────────────────────────┼──────────────────────────────────────────────────────────────────┤
  │ 2   │ FULL_OBJECT, algo X │ none                               │ No — propagate as-is                                             │
  ├─────┼─────────────────────┼────────────────────────────────────┼──────────────────────────────────────────────────────────────────┤
  │ 3   │ FULL_OBJECT, algo X │ X (same algo)                      │ No — propagate as-is                                             │
  ├─────┼─────────────────────┼────────────────────────────────────┼──────────────────────────────────────────────────────────────────┤
  │ 4   │ FULL_OBJECT, algo X │ Y (different algo)                 │ Yes                                                              │
  ├─────┼─────────────────────┼────────────────────────────────────┼──────────────────────────────────────────────────────────────────┤
  │ 5   │ COMPOSITE, algo X   │ none                               │ Yes (can't propagate a MPU format digest to a single-object dest)│
  ├─────┼─────────────────────┼────────────────────────────────────┼──────────────────────────────────────────────────────────────────┤
  │ 6   │ COMPOSITE, algo X   │ X (same algo)                      │ Yes                                                              │
  ├─────┼─────────────────────┼────────────────────────────────────┼──────────────────────────────────────────────────────────────────┤
  │ 7   │ COMPOSITE, algo X   │ Y (different algo)                 │ Yes                                                              │
  ├─────┼─────────────────────┼────────────────────────────────────┼──────────────────────────────────────────────────────────────────┤
  │ 8   │ none                │ any algo                           │ Yes (source has no digest, must compute)                         │
  ├─────┼─────────────────────┼────────────────────────────────────┼──────────────────────────────────────────────────────────────────┤
  │ 9   │ any                 │ source is 0-byte                   │ Yes (special-cased — empty-bytes digest, no streaming)           │
  └─────┴─────────────────────┴────────────────────────────────────┴──────────────────────────────────────────────────────────────────┘
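
The nine rows above can be condensed into a single predicate. The following is a minimal sketch only: the function name, the parameter shape, and the use of a byte size to detect row 9 are assumptions made for illustration, not the PR's actual `_shouldRecomputeChecksum`.

```javascript
// Hypothetical condensation of the decision table; names and shapes are
// illustrative assumptions, not the code in lib/api/objectCopy.js.
function shouldRecomputeChecksum({ sourceChecksum, requestedAlgorithm, sourceSize }) {
    // Row 9: a 0-byte source is special-cased (empty-bytes digest).
    if (sourceSize === 0) {
        return true;
    }
    // Rows 1 and 8: the source carries no digest, so one must be computed.
    if (!sourceChecksum) {
        return true;
    }
    // Rows 5 to 7: an MPU-format digest cannot be propagated to a
    // single-object destination.
    if (sourceChecksum.type === 'COMPOSITE') {
        return true;
    }
    // Row 4: the caller asked for a different algorithm.
    if (requestedAlgorithm && requestedAlgorithm !== sourceChecksum.algorithm) {
        return true;
    }
    // Rows 2 and 3: FULL_OBJECT digest, no change requested: propagate as-is.
    return false;
}
```

Ordering the checks from "no usable digest" to "digest usable as-is" keeps the propagate case as the single fall-through.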
| Backend / scenario | Copy method | Dest object gets checksum |
|---|---|---|
| file | GET + PUT + DELETE | Yes |
| mem | GET + PUT + DELETE | Yes |
| sproxyd / scality | GET + PUT + DELETE | Yes |
| aws_s3 (same location) | Native (CopyObject) | Only if propagated (none when recompute needed) |
| azure (same location) | Native (beginCopyFromURL/CopyBlob) | Only if propagated (none when recompute needed) |
| gcp (same location) | Native (copyObject) | Only if propagated (none when recompute needed) |
| external, cross-location, same type, propagate (e.g. aws us-east-1 → us-east-2, FULL_OBJECT/matching algo) | Native (CopyObject/…, server-side cross-region) | Yes (copied) |
| external, cross-location, same type, recompute | Native (CopyObject/…), recompute skipped (was GET + PUT + DELETE before the fix) | No |
| external, cross-type or external ↔ local (e.g. aws → azure, aws → file) | GET + PUT + DELETE | Yes |
| external, any + SSE | GET + PUT + DELETE | Yes |
| copy-to-self, local backend (same location, no SSE, unversioned) | metadata-only, reuse existing location (GET only, to hash) | Yes |
| copy-to-self, external backend (same location, no SSE, unversioned) | metadata-only, reuse existing location | Only if propagated (none when recompute needed) |
| copy-to-self → different slot, same external type (e.g. versioned new version, or cross-location same type) | Native (CopyObject/…) → new slot | Only if propagated (none when recompute needed) |
| copy-to-self → different slot, local / cross-type / SSE (storage-class / cross-backend rewrite) | GET + PUT + DELETE (incl. data.batchDelete of the old slot) | Yes |
| 0-byte source (same location / local) | metadata-only, no data | Yes |
| 0-byte source → different external backend | PUT (data.put(null)) | Yes |

@bert-e
Contributor

bert-e commented May 27, 2026

Hello leif-scality,

My role is to assist you with the merge of this
pull request. Please type @bert-e help to get information
on this process, or consult the user documentation.

Available options
name description privileged authored
/after_pull_request Wait for the given pull request id to be merged before continuing with the current one.
/bypass_author_approval Bypass the pull request author's approval
/bypass_build_status Bypass the build and test status
/bypass_commit_size Bypass the check on the size of the changeset TBA
/bypass_incompatible_branch Bypass the check on the source branch prefix
/bypass_jira_check Bypass the Jira issue check
/bypass_peer_approval Bypass the pull request peers' approval
/bypass_leader_approval Bypass the pull request leaders' approval
/approve Instruct Bert-E that the author has approved the pull request. ✍️
/create_pull_requests Allow the creation of integration pull requests.
/create_integration_branches Allow the creation of integration branches.
/no_octopus Prevent Wall-E from doing any octopus merge and use multiple consecutive merge instead
/unanimity Change review acceptance criteria from one reviewer at least to all reviewers
/wait Instruct Bert-E not to run until further notice.
Available commands
name description privileged
/help Print Bert-E's manual in the pull request.
/status Print Bert-E's current status in the pull request TBA
/clear Remove all comments from Bert-E from the history TBA
/retry Re-start a fresh build TBA
/build Re-start a fresh build TBA
/force_reset Delete integration branches & pull requests, and restart merge process from the beginning.
/reset Try to remove integration branches unless there are commits on them which do not appear on the source branch.

Status report is not available.

@bert-e
Contributor

bert-e commented May 27, 2026

Incorrect fix version

The Fix Version/s in issue CLDSRV-908 contains:

  • None

Considering where you are trying to merge, I ignored possible hotfix versions and I expected to find:

  • 9.4.0

Please check the Fix Version/s of CLDSRV-908, or the target
branch of this pull request.

Comment thread lib/api/objectCopy.js Dismissed
Comment thread tests/functional/aws-node-sdk/test/object/objectCopy.js Outdated
@claude

claude Bot commented May 27, 2026

LGTM — clean implementation of checksum propagation and recompute on CopyObject. Stream handling (jsutil.once guards, Azure per-part passthrough, error propagation) and the _shouldRecomputeChecksum decision logic are solid. One minor test style issue flagged inline.

- require() inside describe/withV4 callback in functional test — move to top of file

Review by Claude Code

@codecov

codecov Bot commented May 27, 2026

Codecov Report

❌ Patch coverage is 84.50704% with 22 lines in your changes missing coverage. Please review.
✅ Project coverage is 85.32%. Comparing base (580d648) to head (22f8acf).
⚠️ Report is 3 commits behind head on development/9.4.
✅ All tests successful. No failed tests found.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| lib/api/objectCopy.js | 83.45% | 22 Missing ⚠️ |
Additional details and impacted files

| Files with missing lines | Coverage Δ |
|---|---|
| lib/api/apiUtils/integrity/validateChecksums.js | 99.08% <100.00%> (+0.03%) ⬆️ |
| lib/api/objectCopy.js | 89.91% <83.45%> (-1.58%) ⬇️ |

... and 2 files with indirect coverage changes

@@                 Coverage Diff                 @@
##           development/9.4    #6176      +/-   ##
===================================================
+ Coverage            85.25%   85.32%   +0.06%     
===================================================
  Files                  208      208              
  Lines                13919    14048     +129     
===================================================
+ Hits                 11867    11986     +119     
- Misses                2052     2062      +10     
| Flag | Coverage Δ |
|---|---|
| file-ft-tests | 68.84% <64.78%> (+0.03%) ⬆️ |
| kmip-ft-tests | 28.11% <27.46%> (-0.01%) ⬇️ |
| mongo-v0-ft-tests | 69.98% <64.78%> (-0.01%) ⬇️ |
| mongo-v1-ft-tests | 69.94% <64.78%> (-0.13%) ⬇️ |
| multiple-backend | 36.32% <5.63%> (-0.29%) ⬇️ |
| sur-tests | 35.24% <9.15%> (-1.13%) ⬇️ |
| sur-tests-inflights | 37.11% <9.15%> (-0.23%) ⬇️ |
| unit | 72.30% <84.50%> (+0.27%) ⬆️ |
| utapi-v2-tests | 34.53% <27.46%> (-0.25%) ⬇️ |


@leif-scality leif-scality force-pushed the improvement/CLDSRV-908-copy-object-handle-checksums branch from 5ca489a to 57d9894 Compare May 28, 2026 09:41
@claude

claude Bot commented May 28, 2026

LGTM

Review by Claude Code

@leif-scality leif-scality force-pushed the improvement/CLDSRV-908-copy-object-handle-checksums branch from 57d9894 to 5bd2262 Compare May 28, 2026 13:29
Comment thread lib/api/objectCopy.js Outdated
Comment thread lib/api/objectCopy.js Outdated
@claude

claude Bot commented May 28, 2026

  • Stream leak on source read error: when sourceStream errors in the recompute path, checksumStream is not destroyed, so data.put (reading from it) may hang indefinitely
    - Destroy checksumStream in the sourceStream error handler
  • Data orphan on copy-to-self with recompute: the recompute path writes new data via data.put, but deleteExistingData skips cleanup when sourceIsDestination is true, orphaning the old data locations
    - Track whether new data was written in the recompute path and allow cleanup in deleteExistingData accordingly

Review by Claude Code

@leif-scality leif-scality force-pushed the improvement/CLDSRV-908-copy-object-handle-checksums branch from 5bd2262 to 8b2b196 Compare May 28, 2026 15:13
@claude

claude Bot commented May 28, 2026

LGTM

Review by Claude Code

@leif-scality leif-scality force-pushed the improvement/CLDSRV-908-copy-object-handle-checksums branch from 8b2b196 to 720f686 Compare May 28, 2026 16:28
@claude

claude Bot commented May 28, 2026

LGTM

Review by Claude Code

Comment thread lib/api/apiUtils/integrity/validateChecksums.js Outdated
Comment thread lib/api/objectCopy.js Outdated
Comment thread lib/api/objectCopy.js Outdated
Comment on lines +67 to +68
if (err) {
done(err);
Contributor

When this error happens, it could be any error, and the perPart is still piped.

So it could theoretically keep receiving data from Azure and continue streaming for a short time window, until the final callback calls passthrough.destroy and unpipes the source perPart.

Maybe the perPart should be destroyed here to stop the streaming immediately.

Contributor Author

This would make everything more complex for little gain; let's just let the error propagate naturally.

Comment thread lib/api/objectCopy.js Outdated
// into the master passthrough and use its 'end' as the completion
// signal — same pattern arsenal's data.copyObject uses.
const perPart = new PassThrough();
perPart.once('error', done);
Contributor

Should we consider wrapping the error to include some part details, for better troubleshooting if the error is ever logged somewhere else in the streaming path?

Contributor Author

Added a commit improving the errors.
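
For readers following the thread, one way to attach part details to a stream error is sketched here. The helper name and the fields on `part` are hypothetical; the commit referenced above may do this differently.

```javascript
// Hypothetical helper: wrap a per-part stream error with enough context to
// identify the failing part wherever the error ends up being logged.
function wrapPartError(err, part) {
    const wrapped = new Error(
        `copy failed on part ${part.partNumber} (key ${part.key}): ${err.message}`,
        { cause: err }, // keeps the original error and its stack reachable
    );
    wrapped.partNumber = part.partNumber;
    return wrapped;
}

// Usage inside the per-part loop (perPart and done as in the snippet above):
// perPart.once('error', err => done(wrapPartError(err, { partNumber, key })));
```

Keeping the original error as `cause` means nothing is lost when the wrapper is logged further down the streaming path.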

Comment thread lib/api/objectCopy.js Outdated
  // and masterKeyId stored properly in metadata
  if (sourceIsDestination && storeMetadataParams.locationMatch
- && !isVersionedObj && !needsEncryption) {
+ && !isVersionedObj && !needsEncryption && !shouldRecomputeChecksum) {
Contributor

If you don't skip, you'll trigger a data GET + PUT + DELETE (for the previous location).

This is only for checksum recomputation. Should you instead define another path where, if only the checksum needs recomputing, you let only the GET stream into the ChecksumTransform stream, discard the data stream's output, and avoid the data PUT + DELETE?

Contributor Author

Fixed: I now GET and recompute but don't PUT and DELETE. I also removed recompute for external backend cases, so if the data does not go through cloudserver we don't GET, and the dest object receives no checksum if a recompute was required.
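
The "GET and recompute, no PUT" idea can be illustrated with plain Node streams. This is a sketch under assumptions: `hashOnly` is a hypothetical name, and Node's built-in SHA-256 stands in for the PR's ChecksumTransform, since `crypto` does not provide CRC64NVME.

```javascript
const crypto = require('crypto');
const { Readable, Writable, pipeline } = require('stream');

// Hypothetical helper: read the source once, feed every chunk to a hash,
// and drop the bytes instead of writing them to a new data location.
function hashOnly(source, algorithm, callback) {
    const hash = crypto.createHash(algorithm);
    const sink = new Writable({
        write(chunk, _encoding, next) {
            hash.update(chunk); // consume the chunk; nothing is stored
            next();
        },
    });
    // pipeline destroys both streams if either one fails.
    pipeline(source, sink, err => {
        if (err) {
            return callback(err);
        }
        return callback(null, hash.digest('base64'));
    });
}

// Usage: the Readable stands in for the stream returned by a data GET.
hashOnly(Readable.from([Buffer.from('hello '), Buffer.from('world')]), 'sha256',
    (err, digest) => {
        if (err) {
            throw err;
        }
        console.log('digest:', digest);
    });
```

Because the existing data location is reused, only the metadata needs updating once the digest is available.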

@leif-scality leif-scality force-pushed the improvement/CLDSRV-908-copy-object-handle-checksums branch from 720f686 to 6385c3e Compare June 3, 2026 10:56
Comment thread lib/api/objectCopy.js Dismissed
Comment thread lib/api/objectCopy.js
@claude

claude Bot commented Jun 3, 2026

Well-structured PR with thorough test coverage across all checksum scenarios (propagation, recompute, 0-byte, COMPOSITE, multi-part, Azure, copy-to-self, external backends).

- Stream leak on error in _recomputeChecksumAndStore: when one of sourceStream/checksumStream errors, the other is not destroyed, potentially leaking file descriptors from data.get. Add explicit destroy() calls on both streams in the error branch of the finish/done callbacks (applies to both the copy-to-self and stream-through-and-put paths).

Review by Claude Code
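
The finding above amounts to: whichever stream fails first, tear down the other one too, and report the failure once. A minimal sketch follows, with a local `once` guard standing in for jsutil.once and a hypothetical helper name; it is not the code in `_recomputeChecksumAndStore`.

```javascript
const { PassThrough } = require('stream');

// Local stand-in for a call-once guard (the review mentions jsutil.once).
function once(fn) {
    let called = false;
    return (...args) => {
        if (called) {
            return undefined;
        }
        called = true;
        return fn(...args);
    };
}

// Hypothetical helper: if either stream errors, destroy both so neither
// keeps a file descriptor or socket open, then report the first error only.
function destroyBothOnError(sourceStream, checksumStream, callback) {
    const fail = once(err => {
        sourceStream.destroy();
        checksumStream.destroy();
        callback(err);
    });
    sourceStream.once('error', fail);
    checksumStream.once('error', fail);
}

// Usage: install the guard before piping.
const exampleSource = new PassThrough();
const exampleChecksum = new PassThrough();
destroyBothOnError(exampleSource, exampleChecksum,
    err => console.error('copy failed:', err.message));
exampleSource.pipe(exampleChecksum);
```

Calling `destroy()` without an argument avoids emitting a second 'error' event on the stream that did not fail.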

@leif-scality leif-scality force-pushed the improvement/CLDSRV-908-copy-object-handle-checksums branch from 6385c3e to 8ede498 Compare June 3, 2026 11:50
Comment thread lib/api/objectCopy.js Outdated
Comment thread tests/unit/api/objectCopy.js
@claude

claude Bot commented Jun 3, 2026

  • Potential data loss on copy-to-self with external backend (skipExternalBackendRecompute path): deleteExistingData at line 858 of objectCopy.js will delete the existing data locations even though no new data was written, because _shouldRecomputeChecksum() returns true but the recompute was skipped. The _dataReused flag is only set in _recomputeChecksumAndStore, not in the goGetData early-return path. Fix: set storeMetadataParams._dataReused = true before the early return at line 687.
    - The unit test at line 1152 (should early-return on copy-to-self) should also spy on data.batchDelete to catch this.

Review by Claude Code

@leif-scality leif-scality force-pushed the improvement/CLDSRV-908-copy-object-handle-checksums branch from 8ede498 to df2187c Compare June 3, 2026 14:36
Comment thread lib/api/objectCopy.js
@claude

claude Bot commented Jun 3, 2026

LGTM — solid implementation of all 9 checksum cases from the PR matrix. The propagation/recompute/default logic is clean, the _orphanedDataLocations helper correctly fixes orphan cleanup for copy-to-self rewrites, and tests cover all key paths (multi-part source, Azure backend, SSE, copy-to-self metadata-only, versioned, 0-byte, external backend skip).

One finding: streams are not destroyed on error in _recomputeChecksumAndStore — see inline comment.

Review by Claude Code

@leif-scality leif-scality force-pushed the improvement/CLDSRV-908-copy-object-handle-checksums branch from df2187c to 8da10f6 Compare June 3, 2026 15:25
@claude

claude Bot commented Jun 3, 2026

LGTM

Thorough implementation. The truth table in the PR description maps cleanly to the _shouldRecomputeChecksum logic. Stream error handling is solid throughout (jsutil.once guards, per-stream error listeners, proper destroy on failure). The _orphanedDataLocations fix correctly replaces the blanket !sourceIsDestination guard with identity-based filtering, which closes the orphan leak on cross-backend copy-to-self. Test coverage is comprehensive across all nine cases, including Azure, SSE, multi-part, and 0-byte edge cases.
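
To make "identity-based filtering" concrete, here is a rough sketch of how orphaned locations could be derived by comparing the old and new location lists. The location shape (`key`, `dataStoreName`, or a legacy string) and the function names are assumptions for illustration, not the PR's `_orphanedDataLocations`.

```javascript
// Assumed shape: a location is either a legacy string key or an object
// carrying `key` and `dataStoreName`.
function locationId(location) {
    return typeof location === 'string'
        ? `legacy:${location}`
        : `${location.dataStoreName}:${location.key}`;
}

// Hypothetical helper: return the previous locations that the new object
// metadata no longer references; only those are safe to delete.
function orphanedDataLocations(previousLocations, currentLocations) {
    const stillReferenced = new Set((currentLocations || []).map(locationId));
    return (previousLocations || [])
        .filter(location => !stillReferenced.has(locationId(location)));
}
```

With this approach a metadata-only copy-to-self yields an empty list (nothing is deleted), while a copy-to-self that rewrites the data yields the old locations for cleanup.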

Review by Claude Code

@leif-scality leif-scality force-pushed the improvement/CLDSRV-908-copy-object-handle-checksums branch from 8da10f6 to 20b6ba3 Compare June 3, 2026 15:49
@claude

claude Bot commented Jun 3, 2026

LGTM

Thorough checksum handling for CopyObject with clean stream management, proper error handling via jsutil.once, and well-structured orphan detection via _orphanedDataLocations. The nine cases from the PR description are all covered by the implementation and test suite. No correctness issues found.

Review by Claude Code

@leif-scality leif-scality force-pushed the improvement/CLDSRV-908-copy-object-handle-checksums branch from 20b6ba3 to 4c36bc2 Compare June 3, 2026 17:00
@claude

claude Bot commented Jun 3, 2026

LGTM

Review by Claude Code

Copy-to-self reuses the existing data location, so the bytes never move.
On an external backend a checksum recompute would mean a GET purely to
hash. Skip it for any external backend (constants.externalBackends, incl.
pfs/tlp which have no native server-side copy); local backends still
recompute in place. The byte-moving path keeps using externalBackendCopy.
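
The rule in this commit message can be sketched as a small decision function. The backend map below is an assumed stand-in for constants.externalBackends (the real list lives in cloudserver), and the function name and return values are hypothetical.

```javascript
// Assumed stand-in for constants.externalBackends; entries are taken from
// the backends named in this PR, not from the real constant.
const assumedExternalBackends = {
    aws_s3: true,
    azure: true,
    gcp: true,
    pfs: true,
    tlp: true,
};

// Hypothetical helper: what happens to the checksum on a copy-to-self that
// reuses the existing data location.
function copyToSelfChecksumPlan(backendType, recomputeNeeded) {
    if (!recomputeNeeded) {
        return 'propagate'; // the source digest is reused as-is
    }
    if (assumedExternalBackends[backendType]) {
        return 'skip'; // no GET purely to hash: dest gets no checksum
    }
    return 'recompute-in-place'; // local backend: GET, hash, keep the data
}
```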
@leif-scality leif-scality force-pushed the improvement/CLDSRV-908-copy-object-handle-checksums branch from 4c36bc2 to 22f8acf Compare June 3, 2026 19:47
Comment thread lib/api/objectCopy.js
@claude

claude Bot commented Jun 3, 2026

LGTM — well-structured implementation with thorough test coverage across all the checksum scenarios (propagation, recompute, 0-byte, multi-part, Azure, copy-to-self, external backends, orphan cleanup, legacy string locations).

One note posted inline:

- Pre-existing needsEncryption bug (.algo vs .algorithm at line 692) makes the SSE guard in the new _recomputeChecksumAndStore copy-to-self path ineffective — not a regression from this PR, but worth tracking as a follow-up fix

Review by Claude Code

Comment on lines +969 to +983
it('should lowercase mixed-case input', () => {
    const result = getCopyObjectChecksumAlgorithm({
        'x-amz-checksum-algorithm': 'Sha256',
    });
    assert.strictEqual(result.error, null);
    assert.strictEqual(result.algorithm, 'sha256');
});

it('should accept algorithms already in lowercase', () => {
    const result = getCopyObjectChecksumAlgorithm({
        'x-amz-checksum-algorithm': 'crc32',
    });
    assert.strictEqual(result.error, null);
    assert.strictEqual(result.algorithm, 'crc32');
});
Contributor

You could include these 2 in the validAlgorithms array above so they run in the loop; the test body is the same.
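
A table-driven version of the suggestion might look like the sketch below. `parseChecksumAlgorithm` is a hypothetical stand-in written only so the loop can run on its own; it is not the real getCopyObjectChecksumAlgorithm, and the case list is an assumption about what validAlgorithms contains.

```javascript
const { strictEqual } = require('assert');

// Hypothetical stand-in for the function under test: lowercases the header
// value and accepts only the S3 checksum algorithm names.
const supportedAlgorithms = new Set(['crc32', 'crc32c', 'crc64nvme', 'sha1', 'sha256']);
function parseChecksumAlgorithm(headers) {
    const raw = headers['x-amz-checksum-algorithm'];
    if (raw === undefined) {
        return { error: null, algorithm: null }; // assumed behavior when absent
    }
    const algorithm = raw.toLowerCase();
    if (!supportedAlgorithms.has(algorithm)) {
        return { error: 'unsupported', algorithm: null };
    }
    return { error: null, algorithm };
}

// The mixed-case and already-lowercase inputs are just two more rows.
const validAlgorithmCases = [
    { input: 'SHA256', expected: 'sha256' },
    { input: 'Sha256', expected: 'sha256' },
    { input: 'crc32', expected: 'crc32' },
];

// In the mocha suite each row would become its own it(); a plain loop is
// used here so the sketch runs without a test runner.
for (const { input, expected } of validAlgorithmCases) {
    const result = parseChecksumAlgorithm({ 'x-amz-checksum-algorithm': input });
    strictEqual(result.error, null);
    strictEqual(result.algorithm, expected);
}
```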

Comment on lines +1010 to +1018
it('should produce an arsenal InvalidRequest error via the standard mapping', () => {
    const result = getCopyObjectChecksumAlgorithm({
        'x-amz-checksum-algorithm': 'GARBAGE',
    });
    const err = arsenalErrorFromChecksumError(result.error);
    assert.strictEqual(err.is.InvalidRequest, true);
    assert(err.description.includes('Checksum algorithm provided is unsupported'),
        'expected AWS-shaped error description');
});
Contributor

This doesn't seem to be related to getCopyObjectChecksumAlgorithm but rather to arsenalErrorFromChecksumError. Maybe you should have a separate describe for that function.

5 participants