hotshitu
August 26th, 2002, 10:35
Is there an easy way to convert a couple thousand HTML files (110+ megs) plus the accompanying index files (hdr, tre, tps and such) into an offline readable format while retaining some structure (dunno, pdf or doc would be nice)? As far as I can see the CD was made using some Macromedia presentation app (Director??) that taps into a database thingie. I would actually use the original frontend, but unfortunately the GUI is so fugly and slow and clunky, and only works at 640x480 res or in a wee window, that I'd really love to be able to extract/convert the content instead. So if you know of a program (preferably free/shareware) that can do the job please let me know. Good ideas appreciated. :)
PS The catch is, some of the pages are apparently word-wrapped while others aren't. I'm absolutely clueless as to what their actual format is; you can read single pages, but I think the htm extension is just a random placeholder.
Eagle
August 26th, 2002, 17:45
They are HTML? Why not load them in IE?
2fast4u
August 26th, 2002, 18:51
Originally posted by Eagle
They are HTML? Why not load them in IE?
ya, ie or some similar browser is very likely installed on 98% of all systems by now. so no problem with html actually.
hotshitu
August 27th, 2002, 10:30
You both misunderstood.
Here is what I have
- about 25 folders, each comprising approx. 1500 files (=pages). They are plain text; some are just formatted in a really weird way, like, right to left or vertically. My guess is that's just to prevent a straight cut'n'paste rip though.
- a bunch of files with weird extensions which I take it are database index files, but I'm not 100% certain (did some research but no viable results)
- an extremely sucky reader/frontend
This is what I want:
EITHER: an alternative frontend, one that recognizes the indices, displays the vanilla text and has a full text search option,
OR a program that can somehow convert the indices or merge them with the text files so that I'd get some kinda hyperlinked structure that is readable in a standard word processor/explorer... yadda yadda.
Looks like I'm gonna have to return the CDROM (Goethe. Zeit Leben Werk) which is a real bummer cause it's the only one I know of that has G.'s Complete Works, Berlin ed. Thanks though, I somehow think this wasn't quite the right place to ask anyway.
edit: Problem solved for now. A colleague of mine just popped in with a copy of digibib volume 4. It's not nearly as complete as mine but the text viewer is way better and has a neat export function. PhD here I come :)
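For anyone landing here with a similar pile of files: the second option above (a hyperlinked structure plus full text search) can be roughly approximated without touching the proprietary index files at all. The following is only a minimal sketch under several assumptions: the pages really are readable text with an .htm/.html extension, they sit in one folder per section, and they are encoded as latin-1 (a guess for a German CD of that era). The function names build_index and search are made up for illustration, and nothing here reads the hdr/tre/tps files, whose format is unknown.

```python
import html
from pathlib import Path
from urllib.parse import quote

PAGE_SUFFIXES = {".htm", ".html"}  # assumption: pages use these extensions


def build_index(root, out_name="index.html"):
    """Write one HTML page linking to every page under root, grouped by folder."""
    root = Path(root)
    out_path = root / out_name
    pages = sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in PAGE_SUFFIXES and p != out_path
    )
    lines = ["<html><body>", "<h1>Contents</h1>"]
    current = None
    for page in pages:
        rel = page.relative_to(root)
        folder = rel.parent.as_posix()
        if folder != current:
            # Start a new section whenever the folder changes.
            if current is not None:
                lines.append("</ul>")
            lines.append(f"<h2>{html.escape(folder)}</h2>")
            lines.append("<ul>")
            current = folder
        href = quote(rel.as_posix())  # percent-encode spaces etc., keep "/"
        lines.append(f'<li><a href="{href}">{html.escape(page.name)}</a></li>')
    if current is not None:
        lines.append("</ul>")
    lines.append("</body></html>")
    out_path.write_text("\n".join(lines), encoding="utf-8")
    return out_path


def search(root, term, encoding="latin-1"):
    """Return relative paths of pages whose text contains term (case-insensitive)."""
    root = Path(root)
    needle = term.lower()
    hits = []
    for page in sorted(root.rglob("*")):
        if page.is_file() and page.suffix.lower() in PAGE_SUFFIXES:
            # latin-1 is an assumed encoding; errors="replace" keeps odd bytes from aborting
            text = page.read_text(encoding=encoding, errors="replace")
            if needle in text.lower():
                hits.append(page.relative_to(root).as_posix())
    return hits
```

Copy the folders off the CD first, run build_index on the copy, and open the resulting index.html in any browser. The search is a plain substring scan over every file, so it will be slow on tens of thousands of pages and it will not undo any reversed or vertical formatting, but it needs no index at all.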