Wanted: OSS to monitor changes to websites or diff HTML files

Larry Cook lcook at sybase.com
Fri Feb 4 15:21:01 EST 2005


Drew,

> Something that converts HTML to text, like this?
> 
> http://www.icewalkers.com/Linux/Software/51170/html2txt.html

Thanks for the pointer.  Unfortunately it doesn't do a better job than my 
simple Perl filter, and it even appears to have a bug where it prints 
letters in the first column twice.  But there were also two other HTML-to-text 
converters.  One did only a little better than my script, but the other one 
actually does some parsing, so it appears to get rid of some of the M$ crap. 
Maybe I'll be able to enhance it to get rid of all of it.
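
To illustrate why parsing helps, here is a minimal Python sketch.  It is not 
any of the converters mentioned above, and the names TextExtractor, 
html_to_text, and text_diff are made up for this example.  It assumes the 
unwanted markup is mostly <style>/<script> blocks and comments (including 
Office-style conditional comments), extracts only the visible text, and then 
diffs two versions of a page on that text alone.

```python
import difflib
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text; skip <script>/<style> bodies and all comments."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        # Comments never reach this method: HTMLParser routes them to
        # handle_comment, which does nothing by default.
        if self.skip_depth == 0 and data.strip():
            # Collapse runs of whitespace so layout changes don't show up.
            self.chunks.append(" ".join(data.split()))


def html_to_text(html):
    """Return the visible text of an HTML string, one chunk per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


def text_diff(old_html, new_html):
    """Unified diff of the visible text of two HTML strings."""
    old = html_to_text(old_html).splitlines()
    new = html_to_text(new_html).splitlines()
    return list(difflib.unified_diff(old, new, "old", "new", lineterm=""))
```

With this approach, two pages that differ only in styling or comments produce 
an empty diff, while a change in the visible wording shows up as ordinary 
-/+ lines.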

Thanks,
Larry



More information about the gnhlug-discuss mailing list