path: root/src/blocks.c
Age | Commit message | Author
2016-06-06 | cmake: Global handler for OOM situations | Vicent Marti
2016-06-06 | buffer: proper safety checks for unbounded memory | Vicent Marti
The previous work for unbounded memory usage and overflows on the buffer API had several shortcomings:

1. The total size of the buffer was limited by arbitrarily small precision on the storage type for buffer indexes (typedef'd as `bufsize_t`). This is not a good design pattern in secure applications, particularly since it requires the addition of helper functions to cast to/from the native `size` types and the custom type for the buffer, and to check for overflows.

2. The library was calling `abort` on overflow and memory allocation failures. This is not a good practice for production libraries, since it turns a potential RCE into a trivial, guaranteed DoS to the whole application that is linked against the library. It defeats the whole point of performing overflow or allocation checks when the checks will crash the library and the enclosing program anyway.

3. The default size limits for buffers were essentially unbounded (capped to the precision of the storage type) and could lead to DoS attacks by simple memory exhaustion (particularly critical on 32-bit platforms). This is not a good practice for a library that handles arbitrary user input.

Hence, this patchset provides slight (but in my opinion critical) improvements in this area, copying some of the patterns we've used in the past for high-throughput, security-sensitive Markdown parsers:

1. The storage type for buffer sizes is now platform native (`ssize_t`). Ideally, this would be a `size_t`, but several parts of the code expect buffer indexes to be possibly negative. Either way, switching to a `size` type is a strict improvement, particularly on 64-bit platforms. All the helpers that assured that values cannot escape the `size` range have been removed, since they are superfluous.

2. The overflow checks have been removed. Instead, the maximum size for a buffer has been set to a safe value for production usage (32 MB) that can be proven not to overflow in practice. Users that need to parse particularly large Markdown documents can increase this value. A static, compile-time check has been added to ensure that the maximum buffer size cannot overflow on any growth operations.

3. The library no longer aborts on buffer overflow. The CMark library now follows the convention of other Markdown implementations (such as Hoedown and Sundown) and silently handles buffer overflows and allocation failures by dropping data from the buffer. The result is that pathological Markdown documents that try to exploit the library will instead generate truncated (but valid, and safe) outputs.

All tests after these small refactorings have been verified to pass.

---

NOTE: Regarding 32-bit overflows, generating test cases that crash the library is trivial (any input document larger than 2 GB will crash CMark), but most Python implementations have issues with large strings to begin with, so a test case cannot be added to the pathological test suite, since it's written in Python.
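The pattern described in this commit message (a fixed maximum size, a compile-time check that growth cannot overflow, and dropping data instead of aborting) can be sketched roughly as follows. This is a minimal illustration, not the cmark buffer code: `sketch_buf`, `sketch_buf_grow`, `sketch_buf_put`, `SKETCH_BUF_MAX`, and the 1.5x growth factor are all assumptions made for the example, and dropping a whole append (rather than part of it) is one possible truncation policy.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical cap; 32 MB is the value quoted in the commit message. */
#define SKETCH_BUF_MAX ((size_t)32 * 1024 * 1024)

/* Compile-time check: growing by 1.5x from the cap cannot overflow size_t. */
_Static_assert(SKETCH_BUF_MAX < SIZE_MAX / 2,
               "SKETCH_BUF_MAX too large for safe growth");

typedef struct {
  unsigned char *ptr;
  size_t size;  /* bytes in use */
  size_t asize; /* bytes allocated */
} sketch_buf;

/* Ensure room for `target` bytes. Returns 0 if the cap would be exceeded or
 * allocation fails; the buffer is left untouched in both cases. */
static int sketch_buf_grow(sketch_buf *buf, size_t target) {
  assert(buf != NULL);
  if (target > SKETCH_BUF_MAX)
    return 0;
  if (target <= buf->asize)
    return 1;
  size_t new_size = target + target / 2; /* safe: target <= SKETCH_BUF_MAX */
  if (new_size > SKETCH_BUF_MAX)
    new_size = SKETCH_BUF_MAX;
  unsigned char *p = realloc(buf->ptr, new_size);
  if (p == NULL)
    return 0;
  buf->ptr = p;
  buf->asize = new_size;
  return 1;
}

/* Append `len` bytes. If they do not fit, the data is silently dropped and
 * 0 is returned; the existing contents stay valid. */
static size_t sketch_buf_put(sketch_buf *buf, const void *data, size_t len) {
  assert(buf != NULL && data != NULL);
  if (len == 0)
    return 0;
  if (len > SKETCH_BUF_MAX - buf->size) /* cannot wrap: size <= cap */
    return 0;
  if (!sketch_buf_grow(buf, buf->size + len))
    return 0;
  memcpy(buf->ptr + buf->size, data, len);
  buf->size += len;
  return len;
}
```

Because every size is bounded by the cap before any arithmetic happens, no per-operation overflow check is needed, which is the trade-off the message argues for.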
2016-04-09 | Reformatted. | John MacFarlane
2016-04-09 | Correctly handle list marker followed only by spaces. | John MacFarlane
This change allows us to pass the new test introduced in 75f231503d2b5854f1ff517402d2751811295bf7. Previously when a list marker was followed only by spaces, cmark expected the following content to be indented by the same number of spaces. But in this case we should treat the line just like a blank line and set list padding accordingly.
2016-03-26 | Handle buffer split across a CRLF line ending (closes #117). | John MacFarlane
Adds an internal field to the parser struct to keep track of last_buffer_ended_with_cr.
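A rough sketch of why such a flag is needed when input arrives in chunks: if one chunk ends with `\r` and the next begins with `\n`, the pair must count as a single line ending. Only the field name `last_buffer_ended_with_cr` comes from the commit message; the `sketch_parser` struct, `sketch_feed`, and the line counter are hypothetical stand-ins, not the cmark parser.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified parser state. */
typedef struct {
  bool last_buffer_ended_with_cr;
  size_t line_count; /* illustration only: number of line endings seen */
} sketch_parser;

/* Feed one chunk and count line endings. "\r\n" counts once, even when the
 * '\r' is the last byte of one chunk and the '\n' the first of the next. */
static void sketch_feed(sketch_parser *p, const char *buf, size_t len) {
  assert(p != NULL && (buf != NULL || len == 0));
  size_t i = 0;
  /* Previous chunk ended with '\r': a leading '\n' completes that CRLF. */
  if (p->last_buffer_ended_with_cr && len > 0 && buf[0] == '\n')
    i = 1;
  if (len > 0)
    p->last_buffer_ended_with_cr = false;
  for (; i < len; i++) {
    if (buf[i] == '\r') {
      p->line_count++;
      if (i + 1 < len && buf[i + 1] == '\n')
        i++; /* CRLF contained in a single chunk */
      else if (i + 1 == len)
        p->last_buffer_ended_with_cr = true; /* decide on the next feed */
    } else if (buf[i] == '\n') {
      p->line_count++;
    }
  }
}
```

Feeding `"a\r"` followed by `"\nb\n"` yields the same count as feeding `"a\r\nb\n"` in one call.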
2016-03-26 | Reset partially_consumed_tab on every new line | Nick Wellnhofer
Fixes issue #114.
2016-03-12 | Switch from "inline" to "CMARK_INLINE" | Nick Wellnhofer
Newer MSVC versions support enough of C99 to be able to compile cmark in plain C mode. Only the "inline" keyword is still unsupported. We have to use "__inline" instead.
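One common way to express this is a macro that expands to `__inline` under MSVC in C mode and to `inline` elsewhere. The sketch below assumes that shape; the exact preprocessor conditions used in cmark are not shown in this log, and `sketch_skip_blanks` is a hypothetical helper added only to show the macro in use.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed definition: MSVC compiling plain C lacks the "inline" keyword,
 * so fall back to its "__inline" spelling there. */
#ifndef CMARK_INLINE
#if defined(_MSC_VER) && !defined(__cplusplus)
#define CMARK_INLINE __inline
#else
#define CMARK_INLINE inline
#endif
#endif

/* Hypothetical helper declared with the macro instead of the bare keyword:
 * returns the number of leading spaces and tabs in s. */
static CMARK_INLINE size_t sketch_skip_blanks(const char *s) {
  assert(s != NULL);
  size_t i = 0;
  while (s[i] == ' ' || s[i] == '\t')
    i++;
  return i;
}
```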
2016-02-12 | blocks: More documentation and refactoring | Mathieu Duponchelle
2016-02-10 | Removed unnecessary check for empty string_content. | John MacFarlane
2016-02-10 | Revert "Simplified condition for lazy line." | John MacFarlane
This reverts commit 4d2d486333c358eb3adf3d0649163e319a3b8b69. This commit caused a valgrind invalid read.

==29731== Invalid read of size 4
==29731==    at 0x40500E: S_process_line (blocks.c:1050)
==29731==    by 0x403CF7: S_parser_feed (blocks.c:526)
==29731==    by 0x403BC9: cmark_parser_feed (blocks.c:494)
==29731==    by 0x433A95: main (main.c:168)
==29731==  Address 0x51d5b60 is 64 bytes inside a block of size 128 free'd
==29731==    at 0x4C27D4E: free (vg_replace_malloc.c:427)
==29731==    by 0x4015F0: S_free_nodes (node.c:134)
==29731==    by 0x401634: cmark_node_free (node.c:142)
==29731==    by 0x4033B1: finalize (blocks.c:259)
==29731==    by 0x40365E: add_child (blocks.c:337)
==29731==    by 0x4046D8: try_new_container_starts (blocks.c:836)
==29731==    by 0x404F12: S_process_line (blocks.c:1015)
==29731==    by 0x403CF7: S_parser_feed (blocks.c:526)
==29731==    by 0x403BC9: cmark_parser_feed (blocks.c:494)
==29731==    by 0x433A95: main (main.c:168)
2016-02-09 | Factored out contains_inlines. | John MacFarlane
2016-02-09 | Simplified condition for lazy line. | John MacFarlane
2016-02-09 | Added code comments. | John MacFarlane
2016-02-09 | Added code comment. | John MacFarlane
2016-02-06 | Code cleanup: add function to test for space or tab. | John MacFarlane
2016-02-06 | Use an assertion to check for in-range html_block_type. | John MacFarlane
It's a programming error if the type is out of range.
2016-02-06 | Merge branch 'refactor-S_processLine' of https://github.com/MathieuDuponchelle/cmark into MathieuDuponchelle-refactor-S_processLine | John MacFarlane
2016-02-06 | Fixed handling of tabs in lists. | John MacFarlane
2016-02-07 | blocks: Factorize S_processLines | Mathieu Duponchelle
It's the core of the program and I had too much trouble making sense of it; two loops with many cases and other code interspersed hurt my head. All the tests still passed before rebasing; now I've got the exact same set of issues as master.
2016-02-06 | Properly handle tabs with blockquotes and fenced blocks. | John MacFarlane
2016-02-06 | Clarify logic in S_advance_offset. | John MacFarlane
2016-02-06 | S_advance_offset: Only set partially_consumed_tab in columns mode. | John MacFarlane
2016-02-05 | Simplified add_line (only need parser parameter). | John MacFarlane
2016-02-05 | Properly handle partially consumed tab. | John MacFarlane
E.g. in

```
- foo
<TAB><TAB>bar
```

we should consume two spaces from the second tab, including two spaces in the code block.
2016-02-05 | Added partially_consumed_tab to parser. | John MacFarlane
This keeps track of when we have gotten partway through a tab when consuming initial indentation.
2016-02-05 | Fixed tabs in indentation. | John MacFarlane
Closes #101. This patch fixes `S_advance_offset` so that it doesn't gobble a tab character when advancing less than the width of a tab.
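The tab handling described in these entries can be sketched as a routine that tracks both a byte offset and a visual column, and that only steps past a tab byte once all of its columns have been consumed. This is a simplified illustration under assumptions: `sketch_pos`, `sketch_advance`, and `SKETCH_TAB_STOP` are hypothetical names, and only `partially_consumed_tab`, the "columns mode" distinction, and the tab stop of 4 are taken from the log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_TAB_STOP 4 /* tab stop width named TAB_STOP in the log */

typedef struct {
  size_t offset; /* byte index into the line */
  size_t column; /* visual column, with tabs expanded */
  bool partially_consumed_tab;
} sketch_pos;

/* Advance by `count` bytes, or by `count` columns when `columns` is true.
 * In columns mode a tab wider than the remaining count is only partly
 * consumed: the column moves, the byte offset does not. */
static void sketch_advance(sketch_pos *p, const char *line, size_t count,
                           bool columns) {
  assert(p != NULL && line != NULL);
  while (count > 0 && line[p->offset] != '\0') {
    if (line[p->offset] == '\t') {
      size_t to_tab = SKETCH_TAB_STOP - (p->column % SKETCH_TAB_STOP);
      if (columns) {
        p->partially_consumed_tab = to_tab > count;
        size_t step = to_tab < count ? to_tab : count;
        p->column += step;
        if (!p->partially_consumed_tab)
          p->offset += 1; /* whole tab used up: move past the byte */
        count -= step;
      } else {
        p->partially_consumed_tab = false;
        p->column += to_tab;
        p->offset += 1;
        count -= 1;
      }
    } else {
      p->partially_consumed_tab = false;
      p->offset += 1;
      p->column += 1; /* assumes single-column ASCII */
      count -= 1;
    }
  }
}
```

For a line starting with a tab at column 0, advancing two columns leaves the offset on the tab with the flag set; advancing two more clears the flag and moves past it.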
2016-01-07 | Allow multiline setext header content, as per spec. | John MacFarlane
2015-12-28 | Reformat sources. | John MacFarlane
2015-12-28 | Replaced hard-coded 4 with TAB_STOP. | John MacFarlane
2015-12-28 | Rename NODE_HTML -> NODE_HTML_BLOCK, NODE_INLINE_HTML -> NODE_HTML_INLINE. | John MacFarlane
API change. Sorry, but this is the time to break things, before 1.0 is released. This matches the recent changes to CommonMark.dtd.
2015-12-28 | Use input not parser->curline to determine last line length. | John MacFarlane
Ultimately I think we can get rid of parser->curline and avoid an unnecessary allocation per line.
2015-12-22 | Rename hrule -> thematic_break. | John MacFarlane
CMARK_NODE_HRULE -> CMARK_NODE_THEMATIC_BREAK. However we've defined the former as the latter to keep backwards compatibility. See jgm/CommonMark 8fa94cb460f5e516b0e57adca33f50a669d51f6c
2015-12-22 | CMARK_NODE_HEADER -> CMARK_NODE_HEADING. | John MacFarlane
Defined CMARK_NODE_HEADER to CMARK_NODE_HEADING to ease the transition.
2015-12-22 | Rename 'header' -> 'heading'. | John MacFarlane
See jgm/CommonMark commit 0cdbcee4e840abd0ac7db93797b2b75ca4104314. Note that we have defined cmark_node_get_header_level = cmark_node_get_heading_level and cmark_node_set_header_level = cmark_node_set_heading_level for backwards compatibility in the API.
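Keeping the old names alive after a rename is typically done with preprocessor aliases. The sketch below shows that approach for the renames in these entries; the identifiers on both sides of each `#define` come from the commit messages, but the enum values, the `sketch_node` struct, and the function body are stand-ins so the example compiles on its own, and the real header may define the aliases differently.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in declarations; values and layout are hypothetical. */
typedef enum {
  CMARK_NODE_HEADING = 1,
  CMARK_NODE_THEMATIC_BREAK = 2
} sketch_node_type;

typedef struct {
  sketch_node_type type;
  int heading_level;
} sketch_node;

/* New name; returns 0 for nodes that are not headings. */
static int cmark_node_get_heading_level(const sketch_node *node) {
  assert(node != NULL);
  return node->type == CMARK_NODE_HEADING ? node->heading_level : 0;
}

/* Old names kept as aliases so existing callers keep compiling. */
#define CMARK_NODE_HEADER CMARK_NODE_HEADING
#define CMARK_NODE_HRULE CMARK_NODE_THEMATIC_BREAK
#define cmark_node_get_header_level cmark_node_get_heading_level
```

Code written against the old API, such as `cmark_node_get_header_level(&n)`, resolves to the new function at preprocessing time with no runtime cost.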
2015-12-19 | Use fully qualified versions of constants. | John MacFarlane
2015-08-10 | Remove need to disable MSVC warning 4244 | Kevin Wojniak
2015-08-09 | Fixed bug with HRULE after blank line. | John MacFarlane
This previously caused cmark to break out of a list, thinking it had two consecutive blanks.
2015-08-09 | Check for empty string before trying to look at line ending. | John MacFarlane
2015-08-09 | Make sure every line fed to S_process_line ends with `\n`. | John MacFarlane
So `S_process_line` sees only unix style line endings. Closes #72, avoiding mixed line endings. Ultimately we probably want a better solution, allowing the line ending style of the input file to be preserved. This solution forces output with newlines.
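Normalizing each line so that it ends in exactly one `\n` might look roughly like the following. This is a sketch under assumptions, not the code in `blocks.c`: `sketch_normalize_line` and its output-buffer interface are hypothetical, chosen to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy `line` (length `len`) into `out`, replacing a trailing "\r\n", "\r",
 * "\n", or a missing line ending with a single '\n', and NUL-terminate.
 * Returns the number of bytes written before the terminator, or 0 if `out`
 * (capacity `cap`) is too small. */
static size_t sketch_normalize_line(const char *line, size_t len, char *out,
                                    size_t cap) {
  assert(line != NULL && out != NULL);
  if (len > 0 && line[len - 1] == '\n')
    len--;
  if (len > 0 && line[len - 1] == '\r')
    len--;
  if (len + 2 > cap) /* content + '\n' + terminator */
    return 0;
  memcpy(out, line, len);
  out[len] = '\n';
  out[len + 1] = '\0';
  return len + 1;
}
```

With this in place the downstream line processor only ever has to recognize one line-ending style, at the cost of not preserving the input's original endings, which is the limitation the message acknowledges.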
2015-08-08 | Treat line ending with EOF as ending with newline. | John MacFarlane
Closes #71. Added a test to api_test.
2015-08-06 | Prefix utf8proc functions to avoid conflict with existing library | Kevin Wojniak
2015-07-27 | Disallow list item starting with multiple blank lines. | John MacFarlane
See jgm/CommonMark#332
2015-07-27 | Use clang-format, llvm style, for formatting. | John MacFarlane
* Reformatted all source files.
* Added 'format' target to Makefile.
* Removed 'astyle' target.
* Updated .editorconfig.
2015-07-16 | Allow tabs before closing ##s in ATX header | John MacFarlane
2015-07-14 | astyle reformatting. | John MacFarlane
2015-07-14 | Limit 'start' to 8 digits to avoid undefined behavior (overflows). | John MacFarlane
This should be added to the spec.
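The reasoning behind the digit limit: accumulating an arbitrarily long run of digits into a signed `int` eventually overflows, which is undefined behavior in C, whereas eight digits (at most 99,999,999) always fit in a 32-bit `int`. A minimal sketch of a parser with that cap follows; `sketch_parse_list_start` and `SKETCH_MAX_START_DIGITS` are hypothetical names, and rejecting over-long numbers outright is an assumption about how the limit is applied.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_START_DIGITS 8 /* limit stated in the commit subject */

/* Parse the leading decimal digits of `s` as an ordered-list start number.
 * Returns the number of digits consumed and stores the value in *start, or
 * returns 0 if there are no digits or more than the allowed maximum. */
static size_t sketch_parse_list_start(const char *s, int *start) {
  assert(s != NULL && start != NULL);
  size_t i = 0;
  int value = 0;
  while (s[i] >= '0' && s[i] <= '9') {
    if (i == SKETCH_MAX_START_DIGITS)
      return 0; /* a ninth digit: reject instead of risking overflow */
    value = value * 10 + (s[i] - '0');
    i++;
  }
  if (i == 0)
    return 0;
  *start = value;
  return i;
}
```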
2015-07-11 | Removed dependence on debug.h. | John MacFarlane
(It uses GNU extensions, and we don't need it anyway.)
2015-07-10 | Updates for new HTML block spec. | John MacFarlane
* Rewrote spec for HTML blocks. A few other spec examples also changed as a result.
* Removed old `html_block_tag` scanner. Added new `html_block_start` and `html_block_start_7`, as well as `html_block_end_n` for n = 1-5.
* Rewrote block parser for new HTML block spec.
2015-06-18 | Minor astyle reformatting. | John MacFarlane
2015-06-17 | Fixed off-by-one error in line splitting routine. | John MacFarlane
This caused certain NUL characters not to be replaced. Found by 'make fuzztest'.