Age | Commit message (Collapse) | Author |
|
We dispense with the hashes and just do string comparsions.
Since the array is in order, we can search intelligently
and should never need to do more than 8 or so comparisons.
This reduces binary size even further, at a small cost
in performance. (This shouldn't matter too much, as
it's only detectable in really entity-heavy sources.)
|
|
We now use -1 instead of 0 to indicate leaf nodes.
|
|
|
|
The primary advantage is a big reduction in the size of
the compiled library and executable (> 100K).
There should be no measurable performance difference in
normal documents. I detected a slight performance
hit (around 5%) in a file containing 1,000,000 entities.
* Removed `src/html_unescape.gperf` and `src/html_unescape.h`.
* Added `src/entities.h` (generated by `tools/make_entities_h.py`).
* Added binary tree lookup functions to `houdini_html_u.c`, and
use the data in `src/entities.h`.
|
|
The old one had many errors.
The new one is derived from the list in the npm entities package.
Since the sequences can now be longer (multi-code-point), we
have bumped the length limit from 4 to 8, which also affects
houdini_html_u.c.
An example of the kind of error that was fixed in given
in jgm/commonmark.js#47: `≧̸` should be rendered as "≧̸" (U+02267
U+00338), but it's actually rendered as "≧" (which is the same as
`≧`).
|
|
There are probably a couple of places I missed. But this will only
be a problem if we use a 64-bit bufsize_t at some point. Then, we'll
get warnings from -Wshorten-64-to-32.
|
|
This closes #33.
|
|
|
|
Don't store length of UTF-8 string. It can be computed by
NULL-terminating strings shorter than 4 bytes and using strnlen.
Use gperf's string pool option. This allows to use an 'int' index into the
string pool instead of a pointer and is helpful on 64-bit systems.
Shaves about 75 KB off the 32-bit binaries on Linux and 128 KB off the
64-bit binaries on OS X.
|
|
Reverts 225d720.
|
|
The separate directory presents problems for some simple
extension building systems, like luarocks.
|