Can anyone recommend any decent HTML clean-up programs to get rid of the HTML nonsense that MS Word inserts? I've got a few Word docs that need HTMLification, and it just puts loads of unnecessary code in there.
How would a program know what to get rid of and what to keep?
It would need to have a telepathic link into my brain.
:-P
What I mean is something that converts this, output from Word:
Code:
<html>
<head>
<meta http-equiv=Content-Type content="text/html; charset=windows-1252">
<meta name=Generator content="Microsoft Word 11 (filtered)">
<title>This is an html test page for word</title>
<style>
<!--
 /* Style Definitions */
 p.MsoNormal, li.MsoNormal, div.MsoNormal
	{margin:0cm; margin-bottom:.0001pt; font-size:12.0pt; font-family:"Times New Roman";}
@page Section1
	{size:612.0pt 792.0pt; margin:39.7pt 39.7pt 39.7pt 39.7pt;}
div.Section1
	{page:Section1;}
-->
</style>
</head>
<body lang=EN-US>
<div class=Section1>
<p class=MsoNormal><b><span lang=EN-GB>This</span></b><span lang=EN-GB> is an </span><span lang=EN-GB style='font-size:14.0pt'>html</span><span lang=EN-GB> test page for <span style='color:red'>word</span>.</span></p>
</div>
</body>
</html>
to something like this:
Code:
<html>
<head>
<meta http-equiv=Content-Type content="text/html; charset=windows-1252">
<title>This is an html test page for word</title>
</head>
<body>
<p><b>This</b> is an <font size="4">html</font> test page for <font color="#FF0000">word</font>.</p>
</body>
</html>
or something similar. It doesn't have to be as complex as that, I'm just showing that to illustrate the point.
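For anyone finding this thread later: the basic job of a cleaner like this can be roughed out in a few lines of Python using only the standard library. This is just a sketch of the general idea, not how any of the tools mentioned in this thread actually work. The names `WordCleaner`, `clean_word_html` and `KEEP_TAGS` are made up for illustration, and the list of tags to keep is an assumption you'd tune to taste. It keeps a small whitelist of tags, throws away every attribute except `href` on links, and drops `<style>` blocks and comments entirely. Note that it also drops all `<meta>` tags and does not turn the inline font sizes and colours into `<font>` tags, so it is cruder than the hand-cleaned example above.

```python
from html import escape
from html.parser import HTMLParser

# Assumption: tags worth keeping. Any other tag (span, div, meta, ...) is
# removed, but the text inside it is kept.
KEEP_TAGS = {"html", "head", "title", "body", "p", "b", "i", "u", "a", "br",
             "ul", "ol", "li", "h1", "h2", "h3", "table", "tr", "td", "th"}
# Tags that are removed together with everything inside them.
DROP_WITH_CONTENT = {"style", "script"}
# Attributes allowed per tag; everything else (class, lang, style) is dropped.
KEEP_ATTRS = {"a": {"href"}}
# Kept tags that never get a closing tag.
VOID_TAGS = {"br"}


class WordCleaner(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []        # pieces of cleaned output
        self.skip_depth = 0  # > 0 while inside <style> or <script>

    def handle_starttag(self, tag, attrs):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in KEEP_TAGS:
            return
        allowed = KEEP_ATTRS.get(tag, set())
        kept = "".join(f' {name}="{escape(value or "")}"'
                       for name, value in attrs if name in allowed)
        self.out.append(f"<{tag}{kept}>")

    def handle_endtag(self, tag):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if self.skip_depth or tag not in KEEP_TAGS or tag in VOID_TAGS:
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Text is kept unless we are inside a dropped block.
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))

    # Comments are dropped: HTMLParser's default handle_comment does nothing.


def clean_word_html(source):
    parser = WordCleaner()
    parser.feed(source)
    parser.close()
    return "".join(parser.out)


sample = ("<p class=MsoNormal><b><span lang=EN-GB>This</span></b>"
          "<span lang=EN-GB> is an </span>"
          "<span lang=EN-GB style='font-size:14.0pt'>html</span>"
          "<span lang=EN-GB> test page for <span style='color:red'>word</span>."
          "</span></p>")
print(clean_word_html(sample))
# prints: <p><b>This</b> is an html test page for word.</p>
```

A whitelist seems safer than trying to list everything Word might insert, since anything unexpected is dropped by default. It still can't know which formatting you actually care about, so expect to tidy up by hand afterwards.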
Because it's cleverer than you
Fishcake: Wordcleaner - http://www.zapadoo.com/wordcleaner/index.htm or http://textism.com/wordcleaner/
for a start
(\__/)
(='.'=)
(")_(")
Cheers stoo, I'll check those out!!
If you've got Dreamweaver then it has a built-in HTML stripper designed specifically to take out all the crap that Word puts in.
Didn't know Dreamweaver did that. Haven't used Word for web design for ages though.
Don't use Word for web design ever.....
Don't have Dreamweaver unfortunately... I don't design pages in Word, but it was for transferring a document that was already in Word format into HTML.
All done now anyway
Got there before me! (in reply to Iain)
Translating something from Word? Not a task I'd relish
I've had to do that several times, mainly HR documents that need converting before they're put up on the intranet at work... (in reply to RoGuE|SaBeR)
Takes *ages* even with a word-cleaner..
If you use Linux, I've found that AbiWord does a good job of converting text to XHTML. I haven't used it extensively, but it is certainly better than Word (although I still use a text editor to do it all).
"Well, there was your Uncle Tiberius who died wrapped in cabbage leaves but we assumed that was a freak accident."
Why would anyone build a site with anything other than Notepad or equivalent?
IT'S CRAZY TALK!!!!!!!!!!!!!!
(in reply to nvening) Because they don't want to learn HTML? Because they don't have the time? Because they can't learn HTML? If you do it all yourself, things can go wrong quite easily, things can happen that you didn't expect, and it can take a lot of time to solve or correct them. Some people just want to build a website that works, in which case a WYSIWYG editor comes in very handy!