Age | Commit message | Author |
|
This reverts commit 4fbe344df43ed7f60a3d3a53981088334cb709fc.
|
|
We need to store the length of the original delimiter run,
instead of using the length of the remaining delimiters
after some have been subtracted.
Test case:
a***b* c*
Thanks to Raph Levien for reporting.
|
|
* Improve strbuf guarantees
Introduce BUFSIZE_MAX macro and make sure that the strbuf implementation
can handle strings up to this size.
* Abort early if document size exceeds internal limit
* Change types for source map offsets
Switch to size_t for the public API, making the public headers
C89-compatible again.
Switch to bufsize_t internally, reducing memory usage and improving
performance on 32-bit platforms.
* Make parser return NULL on internal index overflow
Make S_parser_feed set an error and ignore subsequent chunks if the
total input document size exceeds an internal limit. Make
cmark_parser_finish return NULL if an error was encountered. Add
public API functions to retrieve error code and error message.
strbuf overflow in renderers and OOM in parser or renderers still
cause an abort.
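A minimal sketch of the "set an error and ignore subsequent chunks" behaviour described above. The names `feed_state`, `feed_chunk` and `DOC_SIZE_MAX` are hypothetical and the limit is an arbitrary illustration value; this is not the actual `S_parser_feed` code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical limit, chosen only for illustration. */
#define DOC_SIZE_MAX ((size_t)1 << 20)

typedef struct {
  size_t total;  /* bytes accepted so far */
  bool overflow; /* set once the limit has been exceeded */
} feed_state;

/* Accounts for one input chunk. Returns true if the chunk was accepted.
   Once the limit is exceeded, the error sticks and every later chunk
   is ignored. */
bool feed_chunk(feed_state *st, size_t chunk_len) {
  assert(st != NULL);
  if (st->overflow)
    return false;
  /* Compare against the remaining room to avoid overflowing `total`. */
  if (chunk_len > DOC_SIZE_MAX - st->total) {
    st->overflow = true;
    return false;
  }
  st->total += chunk_len;
  return true;
}
```

A finishing step would then check `overflow` and return NULL instead of a document.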
|
|
* open_new_blocks: always create child before advancing offset
* Source map
* Extent's typology
* In-depth python bindings
|
|
|
|
|
|
|
|
- Removed recursion in scan_to_closing_backticks
- Added an array of pointers to potential backtick closers
to subject
- This array is used to avoid traversing the subject again
when we've already seen all the potential backtick closers.
- Added a max bound of 1000 for backtick code span delimiters.
- This helps with pathological cases like:
x
x `
x ``
x ```
x ````
...
Thanks to Martin Mitáš for identifying the problem and for
discussion of solutions.
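The idea above can be sketched roughly as follows: an iterative scan that records where each run length was last seen, so that a later search for a closer can give up without rescanning. The type `scanner` and the names `scan_to_closer`, `last_run` and `scanned_to_end` are hypothetical; only the bound of 1000 comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_BACKTICKS 1000 /* bound mentioned in the commit message */

typedef struct {
  const char *text;
  size_t len;
  size_t pos;                         /* position just after the opening run */
  size_t last_run[MAX_BACKTICKS + 1]; /* start offset of the last run seen, per length */
  bool scanned_to_end;                /* true once the whole text has been scanned */
} scanner;

void scanner_init(scanner *s, const char *text) {
  assert(text != NULL);
  memset(s, 0, sizeof *s);
  s->text = text;
  s->len = strlen(text);
}

/* Looks for a run of exactly n backticks at or after s->pos.
   Returns the offset just past that run, or 0 if there is none. */
size_t scan_to_closer(scanner *s, size_t n) {
  if (n == 0 || n > MAX_BACKTICKS)
    return 0;
  /* Everything was scanned already and no run of length n lies ahead. */
  if (s->scanned_to_end && s->last_run[n] <= s->pos)
    return 0;
  size_t i = s->pos;
  while (i < s->len) {
    while (i < s->len && s->text[i] != '`')
      i++; /* skip non-backticks */
    if (i >= s->len)
      break;
    size_t start = i;
    while (i < s->len && s->text[i] == '`')
      i++; /* measure the run */
    size_t run = i - start;
    if (run <= MAX_BACKTICKS)
      s->last_run[run] = start;
    if (run == n)
      return i;
  }
  s->scanned_to_end = true;
  return 0;
}
```

With input like the pathological case, the first failed search scans to the end once; later searches for unseen run lengths return immediately.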
|
|
|
|
|
|
See jgm/CommonMark#427
|
|
|
|
|
|
This will need corresponding spec changes.
The change is this: when considering matches between an interior
delimiter run (one that can open and can close) and another delimiter
run, we require that the sum of the lengths of the two delimiter
runs mod 3 is not 0.
Thus, for example, in
*a**b*
1 23 4
delimiter 1 cannot match 2, since the sum of the lengths of
the first delimiter run (length 1) and the second (length 2) is 3,
which is 0 mod 3.
Thus we get `<em>a**b</em>` instead of `<em>a</em><em>b</em>`.
This gives better behavior on things like
*a**b**c*
which previously got parsed as
<em>a</em><em>b</em><em>c</em>
and now would be parsed as
<em>a<strong>b</strong>c</em>
With this change we get four spec test failures, but in each
case the output seems more "intuitive":
```
Example 386 (lines 6490-6494) Emphasis and strong emphasis
*foo**bar**baz*
--- expected HTML
+++ actual HTML
@@ -1 +1 @@
-<p><em>foo</em><em>bar</em><em>baz</em></p>
+<p><em>foo<strong>bar</strong>baz</em></p>
Example 389 (lines 6518-6522) Emphasis and strong emphasis
*foo**bar***
--- expected HTML
+++ actual HTML
@@ -1 +1 @@
-<p><em>foo</em><em>bar</em>**</p>
+<p><em>foo<strong>bar</strong></em></p>
Example 401 (lines 6620-6624) Emphasis and strong emphasis
**foo*bar*baz**
--- expected HTML
+++ actual HTML
@@ -1 +1 @@
-<p><em><em>foo</em>bar</em>baz**</p>
+<p><strong>foo<em>bar</em>baz</strong></p>
Example 442 (lines 6944-6948) Emphasis and strong emphasis
**foo*bar**
--- expected HTML
+++ actual HTML
@@ -1 +1 @@
-<p><em><em>foo</em>bar</em>*</p>
+<p><strong>foo*bar</strong></p>
```
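The matching rule described in this commit could be written as a predicate along these lines. This is a sketch only: `delim_run` and `runs_may_match` are hypothetical names, and `length` is assumed to hold the original length of the run, before any delimiters have been consumed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified view of a delimiter run. */
typedef struct {
  int length;     /* original length of the run */
  bool can_open;
  bool can_close;
} delim_run;

/* Returns true if `opener` may be matched with `closer`. If either run
   is interior (can both open and close), the sum of the two lengths
   must not be a multiple of 3. */
bool runs_may_match(const delim_run *opener, const delim_run *closer) {
  assert(opener->length > 0 && closer->length > 0);
  if (!opener->can_open || !closer->can_close)
    return false;
  bool opener_interior = opener->can_open && opener->can_close;
  bool closer_interior = closer->can_open && closer->can_close;
  if ((opener_interior || closer_interior) &&
      (opener->length + closer->length) % 3 == 0)
    return false;
  return true;
}
```

For `*a**b*`, the first run (length 1, opener only) is rejected against the interior `**` (1 + 2 = 3) but accepted against the final `*`, which is not interior.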
|
|
It is no longer needed; only the brackets struct needs it.
Thanks to @robinst.
|
|
See https://github.com/jgm/commonmark.js/pull/101
This uses a separate stack for brackets, instead of
putting them on the delimiter stack. This avoids the
need for looking through the delimiter stack for the next
bracket.
It also avoids a shortcut reference lookup when the reference
text contains brackets.
The change dramatically improved performance on the nested links
pathological test for commonmark.js. It has a smaller but measurable
effect here.
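A rough sketch of what a separate bracket stack might look like. The struct layout and the names `push_bracket` and `pop_bracket` are assumptions for illustration, not the code from this commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical bracket stack entry, kept apart from the delimiter stack. */
typedef struct bracket {
  struct bracket *previous; /* next entry down the stack */
  size_t position;          /* offset of the '[' in the input */
  bool image;               /* true if the opener was "![" */
} bracket;

/* Pushes an opener. Returns false if allocation fails. */
bool push_bracket(bracket **top, size_t position, bool image) {
  assert(top != NULL);
  bracket *b = malloc(sizeof *b);
  if (b == NULL)
    return false;
  b->previous = *top;
  b->position = position;
  b->image = image;
  *top = b;
  return true;
}

/* Pops the most recent opener, if any. */
void pop_bracket(bracket **top) {
  assert(top != NULL);
  if (*top == NULL)
    return;
  bracket *b = *top;
  *top = b->previous;
  free(b);
}
```

When a `]` is seen, the candidate opener is simply `*top`, so no walk through emphasis delimiters is needed to find it.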
|
|
This reverts commit c069cb55bcadfd0f45890d846ff412b3c892eb87.
|
|
We reuse the parser for reference labels, instead
of just assuming that a slice of the link text
will be a valid reference label. (It might contain
interior brackets, for example.)
|
|
|
|
|
|
|
|
|
|
Previously we did this manually, which introduced many
places where errors could creep in.
|
|
Newer MSVC versions support enough of C99 to be able to compile cmark
in plain C mode. Only the "inline" keyword is still unsupported.
We have to use "__inline" instead.
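One common way to handle this is a small wrapper macro. The macro name `MY_INLINE` and the helper `peek_byte` are hypothetical and used only for illustration; `_MSC_VER` is the macro predefined by MSVC.

```c
#include <assert.h>
#include <stddef.h>

/* MSVC in C mode accepts "__inline" but not "inline". */
#if defined(_MSC_VER) && !defined(__cplusplus)
#define MY_INLINE __inline
#else
#define MY_INLINE inline
#endif

/* Example helper: returns the byte at `pos`, or 0 past the end. */
static MY_INLINE unsigned char peek_byte(const char *s, size_t len, size_t pos) {
  assert(s != NULL);
  return pos < len ? (unsigned char)s[pos] : 0;
}
```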
|
|
in a reference link. (Spec change.)
|
|
API change. Sorry, but this is the time to break things,
before 1.0 is released. This matches the recent changes to
CommonMark.dtd.
|
|
|
|
|
|
Closes #68.
|
|
|
|
* Reformatted all source files.
* Added 'format' target to Makefile.
* Removed 'astyle' target.
* Updated .editorconfig.
|
|
Ensures that title is a chunk with an empty string rather than NULL,
as with other links.
Avoids "potential memory leak" warning from clang static analyzer
(though I couldn't measure one with valgrind).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Closes #59.
|
|
Note that our current procedure for removing nulls is not
working properly.
|
|
Use S_is_line_end_char.
|
|
|
|
This gives bad results in parsing reference links,
where we might have trailing blanks.
(finalize in blocks.c removes the bytes parsed as
a reference definition; before this change, some
blank bytes might remain on the line.)
|
|
```
[ref]: url
"title" ok
```
Here we should parse the first line as a reference.
|
|
|
|
|
|
Now we have an array of pointers (`potential_openers`),
keyed to the delim char.
When we've failed to match a potential opener prior to point X
in the delimiter stack, we reset `potential_openers` for that opener
type to X, and thus avoid having to look again through all the openers
we've already rejected.
See jgm/commonmark#43.
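The bookkeeping described above might look roughly like this. It is a simplified sketch: the struct fields and the names `opener_bounds` and `find_opener` are assumptions, and it ignores delimiters being removed from the stack.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical delimiter stack entry. */
typedef struct delim {
  struct delim *previous; /* next entry down the stack */
  unsigned char ch;       /* delimiter character, e.g. '*' or '_' */
  bool can_open;
} delim;

/* For each delimiter character, the entry at or below which every
   potential opener has already been rejected (NULL if none yet). */
typedef struct {
  delim *bottom[256];
} opener_bounds;

/* Searches downward from `closer` for an opener of the same character,
   stopping at the recorded bound. On failure the bound is raised so that
   later closers of this type skip the entries rejected here. */
delim *find_opener(opener_bounds *bounds, delim *closer) {
  assert(bounds != NULL && closer != NULL);
  delim *stop = bounds->bottom[closer->ch];
  delim *d = closer->previous;
  while (d != NULL && d != stop) {
    if (d->can_open && d->ch == closer->ch)
      return d;
    d = d->previous;
  }
  bounds->bottom[closer->ch] = closer->previous;
  return NULL;
}
```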
|
|
|
|
|
|
When they have no matching openers and cannot be openers themselves,
we can safely remove them.
This helps with a performance case: "a_ " * 20000.
See jgm/commonmark.js#43.
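Removing such a closer amounts to unlinking it from the delimiter stack. A minimal sketch, assuming a doubly linked stack; the struct and the name `remove_delim` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical doubly linked delimiter stack entry. */
typedef struct delim {
  struct delim *previous; /* toward the bottom of the stack */
  struct delim *next;     /* toward the top of the stack */
  bool can_open;
  bool can_close;
} delim;

/* Unlinks `d` from the stack whose top is `*top`. A caller would do this
   for a closer that found no matching opener and has can_open == false,
   so that later closers never have to walk past it. */
void remove_delim(delim **top, delim *d) {
  assert(top != NULL && d != NULL);
  if (d->next == NULL)
    *top = d->previous; /* d was the top of the stack */
  else
    d->next->previous = d->previous;
  if (d->previous != NULL)
    d->previous->next = d->next;
  d->previous = NULL;
  d->next = NULL;
}
```

In the `"a_ " * 20000` case every `_` can only close, so each one is dropped as soon as it fails to match and the stack stays short.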
|