The `\u` escape range in Section 6.4 #131

6.4 Escape Sequences
https://www.w3.org/TR/2026/WD-rdf12-turtle-20260320/#sec-escapes

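Section 6.4 says `\u` represents a Unicode code point in the ranges `U+0000` to `U+D7FF` and `U+E000` to `U+D7FF`. The second range looks like a typo, since its upper bound is below its lower bound. For a four-hex-digit escape, the non-surrogate part of the BMP above the surrogate block should be `U+E000` to `U+FFFF`.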
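To make the difference concrete, here is a minimal Python sketch that compares the ranges as written in the draft with the ranges I think were intended. The helper names (`escape_code_point`, `in_range_as_written`, `in_intended_range`) are made up for illustration and are not taken from the spec or from any Turtle parser.

```python
import re

# Matches exactly one four-hex-digit escape such as \u00E9.
_U_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})")


def escape_code_point(escape: str) -> int:
    """Return the code point named by a four-hex-digit \\u escape."""
    match = _U_ESCAPE.fullmatch(escape)
    if match is None:
        raise ValueError(f"not a four-digit \\u escape: {escape!r}")
    return int(match.group(1), 16)


def in_range_as_written(cp: int) -> bool:
    """Ranges as quoted from the draft: the second range is empty."""
    return 0x0000 <= cp <= 0xD7FF or 0xE000 <= cp <= 0xD7FF


def in_intended_range(cp: int) -> bool:
    """Ranges as (presumably) intended: the BMP minus the surrogate block."""
    return 0x0000 <= cp <= 0xD7FF or 0xE000 <= cp <= 0xFFFF
```

With the text as written, an escape such as `\uFFFD` falls outside both ranges, although it is an ordinary BMP code point; with the upper bound `U+FFFF` it is accepted.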

There is also a bigger problem: the draft does not allow Unicode surrogates in `\u` escapes. Allowing surrogates can lead to incomplete characters (for example, an unpaired surrogate), but we (i18n WG) believe this should be resolved by a higher-level protocol rather than at the rdf-turtle level.
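As a rough illustration of what "incomplete characters" means here, the following Python sketch shows how a high/low surrogate pair maps to one supplementary code point, and how a lone surrogate fails strict UTF-8 encoding. The function names are hypothetical, and the behaviour shown is Python's, not something any Turtle draft mandates.

```python
def combine_surrogates(high: int, low: int) -> int:
    """Combine a UTF-16 high/low surrogate pair into one code point."""
    if not 0xD800 <= high <= 0xDBFF:
        raise ValueError(f"not a high surrogate: U+{high:04X}")
    if not 0xDC00 <= low <= 0xDFFF:
        raise ValueError(f"not a low surrogate: U+{low:04X}")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


def is_utf8_encodable(text: str) -> bool:
    """True if `text` can be encoded with Python's strict UTF-8 codec."""
    try:
        text.encode("utf-8")
    except UnicodeEncodeError:
        # Python's strict codec rejects lone surrogates.
        return False
    return True
```

For example, `combine_surrogates(0xD83D, 0xDE00)` gives `0x1F600`, while `is_utf8_encodable(chr(0xD83D))` is false because the surrogate has no partner.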

rdf-turtle shouldn't prohibit surrogates, and it seems there's no such restriction in RDF 1.1 Turtle:

https://www.w3.org/TR/2014/REC-turtle-20140225/#sec-escapes

So this restriction is also a breaking change relative to RDF 1.1 Turtle.
