improve parsing whitespace after end tag names - webdump - HTML to plain-text c…
git clone git://git.codemadness.org/webdump
---
commit 72b23084b7c64c298c6b90ae6ad9f53f497cec57
parent a0118e672fd3fa0004ccf2850eaef4ec4bc6fb39
Author: Hiltjo Posthuma <[email protected]>
Date: Sat, 29 Jun 2024 18:29:21 +0200
improve parsing whitespace after end tag names
Real site example:
https://www.gnupg.org/gph/en/manual.html
Has HTML such as:
<P
CLASS="COPYRIGHT"
>Copyright &copy; 1999 by <SPAN
CLASS="HOLDER"
>The Free Software Foundation</SPAN
></P
>
...
This incorrectly showed ">" in the end tag as data.
Reported by Jason Hood, thanks!
Diffstat:
M xml.c | 2 ++
1 file changed, 2 insertions(+), 0 deletions(-)
---
diff --git a/xml.c b/xml.c
@@ -386,6 +386,8 @@ xml_parse(XMLParser *x)
else if (c == '>' || ISSPACE(c)) {
x->tag[x->taglen] = '\0';
if (isend) { /* end tag, start…
+ while (c != '>' && c !…
+ c = GETNEXT();
if (x->xmltagend)
x->xmltagend(x…
x->tag[0] = '\0';
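
The diff lines above are truncated, so the exact loop condition is not visible here. As a rough, standalone sketch of the same idea (not the webdump code): once an end tag name has been read, keep consuming input until the real '>' so that trailing whitespace and the '>' itself are never passed on as data. The function name skip_end_tag and its parameters are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of webdump: s points just past "</".
 * Copies the tag name into name (truncating if it does not fit), then
 * consumes everything up to and including the closing '>' and returns
 * a pointer to the input that follows.
 */
static const char *
skip_end_tag(const char *s, char *name, size_t namesz)
{
	size_t len = 0;

	assert(namesz > 0);

	/* the tag name ends at '>' or at the first whitespace character */
	while (*s && *s != '>' && !isspace((unsigned char)*s)) {
		if (len + 1 < namesz)
			name[len++] = *s;
		s++;
	}
	name[len] = '\0';

	/* similar in spirit to the loop added by this commit: skip ahead
	 * to the closing '>' so it is not treated as character data */
	while (*s && *s != '>')
		s++;
	if (*s == '>')
		s++;

	return s;
}
```

With the input from the commit message, calling skip_end_tag on "SPAN\n></P\n>" should store "SPAN" in name and return a pointer to "</P\n>", so the '>' on its own line is consumed as part of the end tag.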