Skip to content

[Medium] parseFileAndRemoveComments throws std::out_of_range on a comment before the first newline #133

@zhaozhiwen

Description

@zhaozhiwen

A comment marker that appears before any newline (e.g. a comment on line 1, or no trailing newline) makes erase receive npos as its start position and throw std::out_of_range.

Affected files

  • gemc/guts/gutilities.cc:388-393

Details
The comment-stripping loop computes the surrounding newlines and erases between them:

size_t nFPos;
while ((nFPos = parsedString.find(commentChars)) != string::npos) {
    size_t firstNL  = parsedString.rfind('\n', nFPos);
    size_t secondNL = parsedString.find('\n', nFPos);
    parsedString.erase(firstNL, secondNL - firstNL);
}

rfind('\n', nFPos) returns string::npos when no newline precedes the comment marker — i.e. the comment is on the first line. parsedString.erase(npos, ...) throws std::out_of_range("basic_string::erase").

There is also a latent length issue: when secondNL == npos (comment on the last line with no trailing newline), secondNL - firstNL underflows; erase clamps an over-long count so it does not throw, but the intent (erase to end of buffer) should be expressed explicitly.

Impact
This is a public utility (declared in gutilities.h). It has no in-tree caller on this commit, so it is currently latent, but any caller passing a file whose first line is a comment will crash.

Proposed fix
Treat firstNL == npos as "erase from the start of the buffer" (position 0) and secondNL == npos as "erase to end".

 	size_t nFPos;
 	while ((nFPos = parsedString.find(commentChars)) != string::npos) {
 		size_t firstNL  = parsedString.rfind('\n', nFPos);
 		size_t secondNL = parsedString.find('\n', nFPos);
-		parsedString.erase(firstNL, secondNL - firstNL);
+		size_t eraseStart = (firstNL == string::npos) ? 0 : firstNL;
+		size_t eraseLen   = (secondNL == string::npos)
+		                      ? string::npos
+		                      : secondNL - eraseStart;
+		parsedString.erase(eraseStart, eraseLen);
 	}

With firstNL == npos the erase now starts at 0; with secondNL == npos it erases to the end of the string (erase accepts npos as count).


Split out from #102 (one issue per bug). Found via an AI-assisted code review with manual verification against commit 5f8ce875.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    Status
    Planned

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions