On Mon, 2007-06-04 at 01:07 -0600, Frank Cox wrote:
> I have found a bug in tidy-0.99.0-12.20070228.fc7.
>
> However, tidy and libtidy are not listed as a component on
> bugzilla.redhat.com, even though the tidy packager is listed as "Fedora Project
> <http://bugzilla.redhat.com/bugzilla>" in the rpm.
>
> So, what's my next step?
>
> (The bug turns a 328-line html file into a 32110-line monster, consisting
> mostly of font directives. The same file turns into 1433 lines using tidy on
> Fedora Core 6.)

I'm not too surprised that you get a monster file out of it: you're
feeding it broken HTML in the first place. HTML Tidy "tidies" HTML code
(reformats it in a neat way); it's not a fix-up-broken-HTML tool.

HTML Tidy could, possibly, be made to fix that particular error, but the
real fault is elsewhere. Copying a simplified example line from your
source:

    <P><FONT><FONT><U>stuff</U> <U>stuff</U></FONT></P>

Notice there are two opening font tags, but only one closing one. There's
a huge number of lines like that.

To attempt a fix, Tidy has to either put in an extra closing font tag or
merge the two opening ones. But that's really not a job HTML Tidy should
have to sort out; whatever generated the HTML in the first place needs
fixing.

You'd be better off with one CSS rule applied to the paragraphs on the
page anyway, rather than a gazillion font elements. For example:

    <style type="text/css">
    p {margin-bottom: 0; font-family: "Times New Roman", serif; font-size: 12pt;}
    </style>

    <p>advert text</p>
    <p>advert text</p>
    <p>advert text</p>

-- 
(This box runs FC6, my others run FC4 & FC5, in case that's important to
the thread.)

Don't send private replies to my address, the mailbox is ignored.
I read messages from the public lists.
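
P.S. If you want to confirm that the source file really has mismatched font tags before filing anything, a rough check is easy to write. The sketch below is only an illustration: it uses Python's standard html.parser module to count opening and closing font tags, and the names FontTagCounter and font_tag_balance are made up for this example. It says nothing about how Tidy itself handles the input.

```python
from html.parser import HTMLParser


class FontTagCounter(HTMLParser):
    """Counts opening and closing font tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.opened = 0
        self.closed = 0

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag names, so <FONT> arrives as "font".
        if tag == "font":
            self.opened += 1

    def handle_endtag(self, tag):
        if tag == "font":
            self.closed += 1


def font_tag_balance(html_text):
    """Return (opened, closed) counts of font tags in html_text."""
    counter = FontTagCounter()
    counter.feed(html_text)
    counter.close()
    return counter.opened, counter.closed


# The simplified line quoted in the message above.
sample = "<P><FONT><FONT><U>stuff</U> <U>stuff</U></FONT></P>"
opened, closed = font_tag_balance(sample)
print(f"font tags opened: {opened}, closed: {closed}")
# prints: font tags opened: 2, closed: 1
```

To check a whole file, read it into a string and pass that to font_tag_balance. If the two counts differ, the problem is in whatever generated the HTML rather than in Tidy.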