[python] Fix null blob write #7901
Open
XiaoHongbo-Hope wants to merge 5 commits into
Adds test_blob_write_read_end_to_end_with_null_values which writes a batch containing None values in an inline blob column through the standard write_arrow -> commit -> read path and asserts the NULLs round-trip. The test currently fails: BlobFileWriter._to_blob raises ValueError on None. PR apache#7847 added null support to BlobFormatWriter.add_element / write_value and FormatBlobReader, plus the FileIO.write_blob direct path, but the DataBlobWriter -> BlobWriter -> BlobFileWriter chain used by TableWrite.write_arrow still rejects None. This test is added as a no-fix reproduction; the writer fix lands in a follow-up commit.
BlobFileWriter._to_blob now returns None for None input instead of raising. The downstream BlobFormatWriter.add_element already encodes None as a -1 length marker, and FormatBlobReader returns None on read, so values now round-trip end-to-end through write_arrow. Also renames the e2e reproduction test added in the previous commit from test_blob_write_read_end_to_end_with_null_values to test_null_blob.
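The round trip described in the two commit messages above can be sketched as follows. This is a minimal, self-contained illustration and not the pypaimon implementation: the function names `to_blob`, `write_values`, and `read_values`, the 8-byte little-endian length prefix, and the `NULL_LENGTH` constant are all assumptions made for the example. Only the idea of passing None through the conversion step and encoding it as a -1 length marker comes from the PR.

```python
import io
import struct
from typing import BinaryIO, Iterable, Iterator, Optional

# Assumed sentinel, mirroring the "-1 length marker" mentioned in the commit message.
NULL_LENGTH = -1


def to_blob(value: Optional[bytes]) -> Optional[bytes]:
    """Hypothetical stand-in for a _to_blob-style conversion step."""
    if value is None:
        # Pass NULL through instead of raising ValueError.
        return None
    if isinstance(value, (bytes, bytearray, memoryview)):
        return bytes(value)
    raise ValueError(f"unsupported blob value type: {type(value).__name__}")


def write_values(values: Iterable[Optional[bytes]], out: BinaryIO) -> None:
    """Write each value as a signed 8-byte length followed by the payload."""
    for value in values:
        blob = to_blob(value)
        if blob is None:
            # NULL is a length of -1 with no payload bytes.
            out.write(struct.pack("<q", NULL_LENGTH))
        else:
            out.write(struct.pack("<q", len(blob)))
            out.write(blob)


def read_values(src: BinaryIO) -> Iterator[Optional[bytes]]:
    """Read values back, mapping the -1 length marker to None."""
    while True:
        header = src.read(8)
        if not header:
            return
        (length,) = struct.unpack("<q", header)
        yield None if length == NULL_LENGTH else src.read(length)


buf = io.BytesIO()
write_values([b"abc", None, b""], buf)
buf.seek(0)
round_tripped = list(read_values(buf))
```

Using a negative length as the marker keeps NULL distinct from an empty blob, which is written with length 0: in this sketch `round_tripped` holds `b"abc"`, `None`, and `b""` as three separate values.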
727405c to e50b290
Contributor
+1
Purpose
PR #7847 added null support to BlobFormatWriter.add_element / write_value and FormatBlobReader, but the public write_arrow path (DataBlobWriter → BlobWriter → BlobFileWriter._to_blob) still rejects None and raises ValueError, so writing a batch with a NULL inline blob fails end-to-end.
This PR makes BlobFileWriter._to_blob return None for None input.
Tests
DataBlobWriterTest.test_null_blob