Ambiguity around whitespace inside netdoc arguments
From arti!2480 (comment 3086706)
In particular, the netdoc meta-format currently has:
KeywordLine ::= Keyword (WS Argument)*NL
...
WS = (SP | TAB)+
...
Argument := ArgumentChar+
ArgumentChar ::= any graphical printing ASCII character.
This leaves a parsing (lexing?) ambiguity: when a whitespace character is encountered while processing the characters of an Argument, the current spec allows it to be read either as part of the Argument (since "any graphical printing ASCII character" is permitted) or as a separator between arguments. Similarly, this doesn't unambiguously exclude NL inside an Argument.
I think ArgumentChar should explicitly exclude SP, TAB, and NL.
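To make the proposed reading concrete, here is a minimal sketch of splitting a KeywordLine when ArgumentChar excludes SP, TAB, and NL. It is not how arti lexes netdocs. The names is_argument_char and split_keyword_line are hypothetical, "graphical printing ASCII" is assumed to mean 0x21 through 0x7E, and the Keyword character set (elided above) is not validated.

```python
import re
from typing import List, Tuple


def is_argument_char(ch: str) -> bool:
    # Assumption: "graphical printing ASCII" means 0x21..0x7E,
    # a range that contains neither SP (0x20), TAB (0x09) nor NL (0x0A).
    return 0x21 <= ord(ch) <= 0x7E


def split_keyword_line(line: str) -> Tuple[str, List[str]]:
    """Split one NL-terminated KeywordLine into (keyword, arguments)."""
    if not line.endswith("\n"):
        raise ValueError("KeywordLine must end with NL")
    body = line[:-1]
    # WS = (SP | TAB)+ : a run of spaces and tabs is a single separator.
    tokens = re.split(r"[ \t]+", body)
    keyword, args = tokens[0], tokens[1:]
    if not keyword:
        raise ValueError("missing keyword")
    for arg in args:
        # An empty token means trailing whitespace, which the grammar
        # (WS Argument)* does not allow. Any other failure is a character
        # outside the assumed ArgumentChar range, including an embedded NL.
        if not arg or not all(is_argument_char(c) for c in arg):
            raise ValueError(f"invalid argument: {arg!r}")
    return keyword, args
```

With whitespace excluded from ArgumentChar, every SP or TAB is a separator, so each line has exactly one tokenization.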
When updating this, @nickm also suggests adding a note along the lines of:
Note that this grammar doesn't preclude treating semantic arguments differently from the syntax described here. For example, although the line
published 2024-06-24 23:00:00
contains two syntactic Arguments, they are semantically treated as a single timestamp.
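As an illustration of that distinction, the following sketch takes the two syntactic Arguments of the published line and combines them into one semantic value. The function name parse_published is hypothetical, and the timestamp format string is inferred from the example line only, not taken from the spec.

```python
from datetime import datetime
from typing import List


def parse_published(args: List[str]) -> datetime:
    """Combine the two syntactic Arguments into one semantic timestamp."""
    if len(args) != 2:
        raise ValueError("expected a date Argument and a time Argument")
    date_arg, time_arg = args
    # Assumption: the format is YYYY-MM-DD HH:MM:SS, as in the example line.
    # The result is a naive datetime; no time zone is attached here.
    return datetime.strptime(f"{date_arg} {time_arg}", "%Y-%m-%d %H:%M:%S")


print(parse_published(["2024-06-24", "23:00:00"]))
```

The lexer still sees two Arguments; only the code that interprets the published keyword joins them.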