
Ambiguity around whitespace inside netdoc arguments

From arti!2480 (comment 3086706)

In particular, the netdoc meta-format currently has:

KeywordLine ::= Keyword (WS Argument)*NL
...
WS = (SP | TAB)+
...
Argument := ArgumentChar+
ArgumentChar ::= any graphical printing ASCII character.

This leaves a parsing (lexing?) ambiguity: when a whitespace character is encountered while reading the characters of an Argument, the current spec allows it to be read either as part of the Argument (since "any graphical printing ASCII character" does not explicitly exclude it) or as a separator between arguments. Similarly, the spec does not unambiguously exclude NL inside an Argument.

I think ArgumentChar should explicitly exclude SP, TAB, and NL.
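
To make the proposed reading concrete, here is a minimal sketch of a lexer for one KeywordLine, assuming ArgumentChar means the ASCII range 0x21 to 0x7E (which is what Rust's `char::is_ascii_graphic` tests, so SP, TAB, and NL are excluded). The names `is_ws`, `is_argument_char`, and `split_keyword_line` are hypothetical and are not taken from arti; Keyword validation is left out because the grammar excerpt above elides it.

```rust
/// WS = (SP | TAB)+ in the grammar; this tests a single WS character.
fn is_ws(c: char) -> bool {
    c == ' ' || c == '\t'
}

/// Assumed reading of ArgumentChar: graphical printing ASCII, 0x21..=0x7E.
/// Under this reading SP, TAB, and NL can never be part of an Argument.
fn is_argument_char(c: char) -> bool {
    c.is_ascii_graphic()
}

/// Split one keyword line (passed WITHOUT its terminating NL) into the
/// keyword and its syntactic Arguments. Returns None if the line does not
/// fit `Keyword (WS Argument)*`. The keyword itself is not validated here.
fn split_keyword_line(line: &str) -> Option<(&str, Vec<&str>)> {
    // Reject NL and anything else that is neither WS nor an ArgumentChar.
    if !line.chars().all(|c| is_ws(c) || is_argument_char(c)) {
        return None;
    }
    // Every WS run must be followed by an Argument, so no leading/trailing WS.
    if line.starts_with(is_ws) || line.ends_with(is_ws) {
        return None;
    }
    // A run of several SP/TAB characters is one separator: drop empty pieces.
    let mut tokens = line.split(is_ws).filter(|t| !t.is_empty());
    let keyword = tokens.next()?; // an empty line has no keyword
    Some((keyword, tokens.collect()))
}
```

With this reading, whitespace is always a separator, and a stray NL makes the line invalid rather than becoming part of an Argument.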

When updating this, @nickm also suggests adding a note along the lines of:

Note that this grammar doesn't preclude treating semantic arguments differently from the syntax described here. For example, although the line "published 2024-06-24 23:00:00" contains two syntactic Arguments, they are semantically treated as a single timestamp.
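
The split between syntax and semantics in that note could look roughly like the following sketch, in which a semantic layer consumes two already-lexed Arguments and rejoins them into one timestamp value. The helpers `matches_shape` and `timestamp_from_args` are hypothetical names for illustration, and the shape check only looks at digit and punctuation positions; it does not validate calendar ranges.

```rust
/// Check `s` against a simple shape string: 'd' stands for any ASCII digit,
/// every other byte in `shape` must match literally.
fn matches_shape(s: &str, shape: &str) -> bool {
    s.len() == shape.len()
        && s.bytes().zip(shape.bytes()).all(|(c, p)| {
            if p == b'd' {
                c.is_ascii_digit()
            } else {
                c == p
            }
        })
}

/// Treat the first two syntactic Arguments as one semantic timestamp.
/// Returns the rejoined "YYYY-MM-DD HH:MM:SS" string, or None if there are
/// fewer than two Arguments or they do not have the expected shape.
fn timestamp_from_args(args: &[&str]) -> Option<String> {
    match args {
        [date, time, ..]
            if matches_shape(date, "dddd-dd-dd") && matches_shape(time, "dd:dd:dd") =>
        {
            Some(format!("{} {}", date, time))
        }
        _ => None,
    }
}
```

The lexer still sees two Arguments; only this later step decides that they form a single value.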
