path: root/src/blocks.c
Age | Commit message | Author
2015-06-16 | Added `CMARK_OPT_VALIDATE_UTF8` option. | John MacFarlane
Also command line option `--validate-utf8`. This option causes cmark to check for valid UTF-8, replacing invalid sequences with the replacement character, U+FFFD. Reinstated api tests for utf8.
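The replacement behavior described above can be illustrated with a minimal sketch. This is not cmark's code: `utf8_valid_len` and `utf8_sanitize` are hypothetical names, and replacing each offending byte with one U+FFFD is an assumption about granularity that cmark may handle differently.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: length (1-4) of the well-formed UTF-8 sequence that
   starts at s, or 0 if the bytes there are not well-formed. */
static size_t utf8_valid_len(const unsigned char *s, size_t n) {
  size_t len, i;
  unsigned long cp, min;

  if (n == 0) return 0;
  if (s[0] < 0x80) return 1; /* ASCII */
  if ((s[0] & 0xE0) == 0xC0) { len = 2; cp = s[0] & 0x1F; min = 0x80; }
  else if ((s[0] & 0xF0) == 0xE0) { len = 3; cp = s[0] & 0x0F; min = 0x800; }
  else if ((s[0] & 0xF8) == 0xF0) { len = 4; cp = s[0] & 0x07; min = 0x10000; }
  else return 0; /* stray continuation byte or invalid lead byte */
  if (n < len) return 0; /* sequence truncated by end of input */
  for (i = 1; i < len; i++) {
    if ((s[i] & 0xC0) != 0x80) return 0; /* not a continuation byte */
    cp = (cp << 6) | (s[i] & 0x3F);
  }
  /* Reject overlong encodings, surrogates, and values above U+10FFFF. */
  if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) return 0;
  return len;
}

/* Hypothetical sanitizer: copies `in` to `out`, writing U+FFFD (EF BF BD)
   in place of every byte that does not start a well-formed sequence.
   `out` must have room for 3 * n bytes. Returns the number of bytes written. */
size_t utf8_sanitize(const unsigned char *in, size_t n, unsigned char *out) {
  static const unsigned char repl[3] = {0xEF, 0xBF, 0xBD};
  size_t i = 0, w = 0;

  assert(in != NULL && out != NULL);
  while (i < n) {
    size_t len = utf8_valid_len(in + i, n - i);
    if (len == 0) {
      memcpy(out + w, repl, 3); /* assumption: one U+FFFD per bad byte */
      w += 3;
      i += 1;
    } else {
      memcpy(out + w, in + i, len);
      w += len;
      i += len;
    }
  }
  return w;
}
```

With the real library, the same effect is requested by passing `CMARK_OPT_VALIDATE_UTF8` in the options argument, or `--validate-utf8` on the command line.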
2015-06-16 | is_blank: recognize tab as a blank character. | John MacFarlane
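A blank-line test that accepts tabs might look like the sketch below. The signature is an assumption for illustration and may not match the `is_blank` in blocks.c.

```c
#include <assert.h>
#include <stddef.h>

/* Sketch: returns 1 if everything from `offset` up to the line ending (or
   the end of the buffer) is a space or a tab, 0 otherwise. */
int is_blank(const char *s, size_t len, size_t offset) {
  assert(s != NULL);
  while (offset < len) {
    switch (s[offset]) {
    case '\r':
    case '\n':
      return 1; /* reached the line ending: the line is blank */
    case ' ':
    case '\t': /* tab now counts as blank, like a space */
      offset++;
      break;
    default:
      return 0; /* any other byte makes the line non-blank */
    }
  }
  return 1; /* ran off the end without seeing non-blank content */
}
```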
2015-06-16 | Removed utf8 validation. | John MacFarlane
We now replace null characters in the line splitting code.
2015-06-16 | Renamed utf8proc_detab as utf8proc_check, removed detabbing function. | John MacFarlane
Now it just replaces bad UTF-8 sequences and NULLs. This restores benchmarks to near their previous levels.
2015-06-16 | Preliminary changes for new tab handling. | John MacFarlane
We no longer preprocess tabs to spaces before parsing. Instead, we keep track of both the byte offset and the (virtual) column as we parse block starts. This allows us to handle tabs without converting to spaces first. Tabs are left as tabs in the output. Added `column` and `first_nonspace_column` fields to `parser`. Added utility function to advance the offset, computing the virtual column too. Note that we don't need to deal with UTF-8 here at all. Only ASCII occurs in block starts. Significant performance improvement due to the fact that we're not doing UTF-8 validation -- though we might want to add that back in.
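The offset and virtual-column bookkeeping can be sketched as follows. `line_cursor` and `advance_offset` are hypothetical stand-ins for the new parser fields and the utility function mentioned in the commit. The tab stop of 4 follows the CommonMark spec.

```c
#include <assert.h>
#include <stddef.h>

#define TAB_STOP 4

/* Hypothetical cursor holding the two positions the commit describes. */
typedef struct {
  size_t offset; /* byte offset into the line */
  size_t column; /* virtual column, as if tabs had been expanded */
} line_cursor;

/* Advance over up to `count` bytes of `line`, updating both positions.
   A tab moves the column to the next multiple of TAB_STOP; any other byte
   moves it by one. Only ASCII matters here, so no UTF-8 decoding is done. */
void advance_offset(line_cursor *cur, const char *line, size_t len,
                    size_t count) {
  assert(cur != NULL && line != NULL);
  while (count > 0 && cur->offset < len) {
    if (line[cur->offset] == '\t') {
      cur->column += TAB_STOP - (cur->column % TAB_STOP);
    } else {
      cur->column += 1;
    }
    cur->offset += 1;
    count -= 1;
  }
}
```

For the line `" \tx"`, advancing two bytes leaves `offset` at 2 and `column` at 4. The tab itself is never rewritten, so it can be passed through to the output unchanged.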
2015-06-16 | astyle formatting changes. | John MacFarlane
2015-06-11 | Removed "add newline if line doesn't have one." | John MacFarlane
This isn't actually needed.
2015-06-07 | Check for overflow in S_parser_feed | Nick Wellnhofer
Guard against too large chunks passed via the API.
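A guard of this kind could be written as below. This is a sketch only: `chunk_fits` is a hypothetical name, and it assumes the buffer size type is a plain `int`, which may not match the actual `bufsize_t` definition.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Sketch: returns 1 if `len` more bytes can be added to a buffer that
   already holds `used` bytes without overflowing an int-sized length,
   0 otherwise. The subtraction form avoids overflowing while checking. */
int chunk_fits(int used, size_t len) {
  assert(used >= 0);
  return len <= (size_t)(INT_MAX - used);
}
```

A caller would test this before appending and reject or truncate the chunk when it returns 0.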
2015-06-07 | Convert code base to strbuf_t | Nick Wellnhofer
There are probably a couple of places I missed. But this will only be a problem if we use a 64-bit bufsize_t at some point. Then, we'll get warnings from -Wshorten-64-to-32.
2015-06-06 | Rename `is_line_end_char` to `S_is_line_end_char`. | John MacFarlane
2015-06-06 | Factored out `S_find_first_nonspace` in `S_process_line`. | John MacFarlane
Added fields `offset`, `first_nonspace`, `indent`, and `blank` to `cmark_parser` struct. This just removes some repetition in the code.
2015-06-06 | astyle formatting changes. | John MacFarlane
2015-06-06 | Allow new list item container indented > 4 spaces. | John MacFarlane
This fixes cases like: ``` 1. a 2. b 3. c ```
2015-06-04 | Cleaned up with is_line_end_char function. | John MacFarlane
2015-06-03 | Revised "add newline to end if missing" for performance. | John MacFarlane
From btrask's alternate code in the comment on https://github.com/jgm/cmark/pull/18. Note: this gives a 1-2% performance boost in our benchmark, probably enough to make it worthwhile.
2015-06-03 | Merge branch 'master' of https://github.com/btrask/cmark into btrask-master | John MacFarlane
Conflicts: src/blocks.c
2015-06-03 | Fixed bug in list item parsing when items indented >= 4 spaces. | John MacFarlane
Closes #52.
2015-04-07 | Check length before reading. | Ben Trask
2015-04-07 | Try to match existing style better. | Ben Trask
2015-04-07 | Bug fixes for CRLF support. | Ben Trask
2015-04-07 | Fix regression in remove_trailing_blank_lines(). | Ben Trask
2015-04-07 | Support for CRLF and CR line endings. | Ben Trask
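Recognizing all three line endings when splitting input can be sketched like this. `next_line_length` is a hypothetical helper, not a function from blocks.c, and it ignores the streaming case where a CR is the last byte of one chunk and its LF arrives in the next.

```c
#include <assert.h>
#include <stddef.h>

/* Nonzero if c is one of the two line-ending bytes. */
int is_line_end_char(char c) {
  return c == '\n' || c == '\r';
}

/* Hypothetical helper: length of the first line in `buf`, including its
   terminator (LF, CR, or CRLF). Returns `len` if no terminator is found. */
size_t next_line_length(const char *buf, size_t len) {
  size_t i = 0;

  assert(buf != NULL);
  while (i < len && !is_line_end_char(buf[i])) {
    i++;
  }
  if (i == len) {
    return len; /* unterminated final line */
  }
  if (buf[i] == '\r' && i + 1 < len && buf[i + 1] == '\n') {
    return i + 2; /* CRLF counts as a single line ending */
  }
  return i + 1; /* lone LF or lone CR */
}
```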
2015-03-27 | Removed an unnecessary check. | John MacFarlane
By the time we check for a list start, we've already checked for an HRULE, so we don't need to repeat that check here. Thanks to Robin Stocker for pointing out a similar redundancy in commonmark.js.
2015-02-20 | Cleaned up some comments. | John MacFarlane
2015-02-19 | Fixed use-after-free error. | John MacFarlane
Closes #9, confirmed with ASAN. Avoid using `parser->current` in the loop that creates new blocks, since `finalize` in `add_child` may have removed the current parser (if it contains only reference definitions). This isn't a great solution; in the long run we need to rewrite to make the logic clearer and to make it harder to make mistakes like this one.
2015-02-19 | Fixed use-after-free bug. | John MacFarlane
This arose when a paragraph containing only reference links and blank space was finalized. Finalization would remove the node. `finalize` returns the parent node, but the problem arose because we had both `cur` and `parser->current`, and only one was being updated. Solution: remove `cur`, which is a holdover from before we had `parser->current`. I believe this will close #9 -- @JordanMilne can you test and confirm?
2015-02-16 | Made 'options' an int rather than a long. | John MacFarlane
For consistency with the API.
2015-02-16 | Move normalization step from main to cmark_parser_finish. | John MacFarlane
2015-02-15 | Added options parameter to cmark_parse_document, cmark_parse_file. | John MacFarlane
Also to some non-exported functions in blocks and inlines.
2015-02-14 | astyle changes (code formatting only). | John MacFarlane
2015-01-18 | Readjust parser->current after closing fenced block. | John MacFarlane
Added assertion to raise an error if finalize is called on a closed block (as was happening undetected because of the fallback behavior).
2015-01-17 | Removed some unneeded tests (code clarity). | John MacFarlane
2015-01-17 | Small code clarification. | John MacFarlane
2015-01-17 | Put check for fence close with the other checks for end-of-block. | John MacFarlane
This is a more logical arrangement and follows recent changes to the JS implementation.
2015-01-16 | Fixed #285 in cmark. | John MacFarlane
2015-01-16 | Nonrecursive rewrite of ends_with_blank_line. | John MacFarlane
Closes #286.
2015-01-16 | Renamed parameters cmark_node -> node. | John MacFarlane
Minor code reformatting: This corrects an overzealous global replace from earlier.
2015-01-05 | Reformatted code consistently with astyle. | John MacFarlane
2014-12-29 | Added cmark_ prefix to functions in cmark_ctype. | John MacFarlane
2014-12-29 | Added cmark_ctype.h with locale-independent isspace, ispunct, etc. | John MacFarlane
Otherwise cmark's behavior varies unpredictably with the locale. `is_punctuation` in utf8.h has also been adjusted so that all ASCII symbol characters count as punctuation, even though some are not in P* character classes.
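Locale-independent classification comes down to testing fixed ASCII ranges instead of calling the `<ctype.h>` functions, whose results depend on `setlocale()`. The sketch below uses hypothetical names (`ascii_isspace`, `ascii_ispunct`) and is not the contents of cmark_ctype.h.

```c
#include <assert.h>

/* ASCII-only whitespace test: space, \t, \n, \v, \f, \r.
   Callers are expected to pass an unsigned char value (0-255). */
int ascii_isspace(int c) {
  assert(c >= 0 && c <= 255);
  return c == ' ' || (c >= '\t' && c <= '\r');
}

/* ASCII-only punctuation test covering every ASCII symbol character:
   0x21-0x2F, 0x3A-0x40, 0x5B-0x60 and 0x7B-0x7E. */
int ascii_ispunct(int c) {
  assert(c >= 0 && c <= 255);
  return (c >= 0x21 && c <= 0x2F) || (c >= 0x3A && c <= 0x40) ||
         (c >= 0x5B && c <= 0x60) || (c >= 0x7B && c <= 0x7E);
}
```

Characters such as `$`, `+` and `~` fall in Unicode's S* (symbol) classes rather than P*, but the range test above treats them as punctuation, in line with the adjustment described in the commit message.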
2014-12-28 | Improved end column/end line calculations in finalize. | John MacFarlane
2014-12-28 | Added end_column to cmark_node struct. | John MacFarlane
API exports cmark_node_get_column. XML writer indicates start and end line and column for block-level nodes.
2014-12-28 | blocks.c - removed unneeded start_line parameter from make_block. | John MacFarlane
2014-12-28 | blocks.c: removed redundant line_number param in finalize. | John MacFarlane
Also break_out_of_lists.
2014-12-28 | Rename CMARK_NODE_LIST_ITEM -> CMARK_NODE_ITEM. | John MacFarlane
2014-12-16 | Added 'literal' field to 'code' struct. | John MacFarlane
In the last few commits we were using as.code.fenced and as.literal at the same time for NODE_CODE_BLOCK, which obviously led to problems.
2014-12-15 | Re-added cmark_ prefix to strbuf and chunk. | John MacFarlane
Reverts 225d720.
2014-12-14 | Use cmark_iter to avoid stack allocation in process_inlines. | John MacFarlane
2014-12-14 | Use chunk for fenced code info, instead of strbuf. | John MacFarlane
2014-12-14 | Use as.literal instead of string_content for HTML and code blocks. | John MacFarlane
This is for consistency with the other types of nodes that have literal strings as contents.