Diffstat (limited to 'test/spec.txt')
-rw-r--r-- test/spec.txt | 22
1 files changed, 12 insertions, 10 deletions
diff --git a/test/spec.txt b/test/spec.txt
index 9b2b977..6c660bb 100644
--- a/test/spec.txt
+++ b/test/spec.txt
@@ -212,12 +212,8 @@ to a certain encoding.
A [line](@line) is a sequence of zero or more [character]s
followed by a [line ending] or by the end of file.
-A [line ending](@line-ending) is, depending on the platform, a
-newline (`U+000A`), carriage return (`U+000D`), or
-carriage return + newline.
-
-For security reasons, a conforming parser must strip or replace the
-Unicode character `U+0000`.
+A [line ending](@line-ending) is a newline (`U+000A`), carriage return
+(`U+000D`), or carriage return + newline.
A line containing no characters, or a line containing only spaces
(`U+0020`) or tabs (`U+0009`), is called a [blank line](@blank-line).
@@ -270,6 +266,11 @@ Tabs in lines are expanded to spaces, with a tab stop of 4 characters:
</code></pre>
.
+## Insecure characters
+
+For security reasons, the Unicode character `U+0000` must be replaced
+with the replacement character (`U+FFFD`).
+
# Blocks and inlines
We can think of a document as a sequence of
@@ -4284,13 +4285,14 @@ corresponding codepoints.
[Decimal entities](@decimal-entities)
consist of `&#` + a string of 1--8 arabic digits + `;`. Again, these
entities need to be recognised and transformed into their corresponding
-unicode codepoints. Invalid unicode codepoints will be written as the
-"unknown codepoint" character (`0xFFFD`)
+unicode codepoints. Invalid unicode codepoints will be replaced by
+the "unknown codepoint" character (`U+FFFD`). For security reasons,
+the codepoint `U+0000` will also be replaced by `U+FFFD`.
.
-&#35; &#1234; &#992; &#98765432;
+&#35; &#1234; &#992; &#98765432; &#0;
.
-<p># Ӓ Ϡ �</p>
+<p># Ӓ Ϡ � �</p>
.
[Hexadecimal entities](@hexadecimal-entities)
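
The behaviour described by the changed hunks can be illustrated with a minimal Python sketch. It is not taken from any implementation in this repository. The function names (`replace_insecure`, `decode_decimal_entities`) are hypothetical, and treating surrogates and values above `U+10FFFF` as "invalid" is an assumption, since the hunk does not define which codepoints are invalid.

```python
import re

REPLACEMENT = "\ufffd"  # U+FFFD, the replacement character named in the spec text

# `&#` + 1 to 8 decimal digits + `;`, as in the decimal-entities paragraph.
_DECIMAL_ENTITY = re.compile(r"&#([0-9]{1,8});")


def replace_insecure(text: str) -> str:
    """Replace every literal U+0000 in the input with U+FFFD."""
    return text.replace("\x00", REPLACEMENT)


def _decode(match: "re.Match[str]") -> str:
    codepoint = int(match.group(1))
    # U+0000 is replaced for security reasons (per the diff).
    # Assumption: "invalid" means a surrogate or a value past U+10FFFF.
    if codepoint == 0 or codepoint > 0x10FFFF or 0xD800 <= codepoint <= 0xDFFF:
        return REPLACEMENT
    return chr(codepoint)


def decode_decimal_entities(text: str) -> str:
    """Turn decimal entities into their codepoints, or U+FFFD when unusable."""
    return _DECIMAL_ENTITY.sub(_decode, text)
```

Under these assumptions, `decode_decimal_entities("&#35; &#1234; &#992; &#98765432; &#0;")` yields the text of the updated expected output, with both the out-of-range entity and `&#0;` mapped to `U+FFFD`.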