Tidy up your HTML with ... tidy!

May 23, '06 07:30:01AM

Contributed by: benholt

I was poking around in the /usr/bin folder and found a binary called tidy, which is installed by default on OS X. Curious, I looked up what it does: it generates cleaned-up versions of HTML, XML, and XHTML files, and it can even convert between them. This is useful if you code web pages by hand, as I do. It fixes mistakes like the following, and many more:

<h1>heading
<h2>subheading</h3>
tidy reports the errors it finds and then prints a fixed version, which you can optionally save to a new file. Here are some examples:

Convert HTML to well-formed XML and write it to a new file (put the options before the input file; if -output comes after it, tidy prints the result to the terminal and then waits for input on standard input):
tidy -asxml -output fixed.xml test.html
Just show the errors and quit:
tidy -errors test.html
Use upper case tags for output:
tidy -upper test.html

For more options, try man tidy or tidy -h.

One use for tidy I can think of is cleaning up those horribly formatted Word webpages. tidy doesn't seem to like them either, but with a configuration file and the word-2000 option, it cleans them up pretty nicely. A simple script could be written to feed a Word webpage through tidy and then strip out the remaining annoyances. See this page for more details.
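
As a rough idea of what such a script might look like, here is a minimal sketch. The function names clean_word_html and strip_word_leftovers are made up for illustration, and the "annoyances" it strips (leftover <o:p> tags and Mso* class attributes) are only a guess at what might survive tidy; adjust the patterns to whatever your own files contain. The word-2000 option is passed on the command line here, though it could just as well live in a configuration file loaded with -config.

```shell
# Hypothetical helper: remove Word leftovers that may survive tidy.
# The two patterns are assumptions, not a complete list.
strip_word_leftovers() {
    sed -E \
        -e 's#</?o:p[^>]*>##g' \
        -e 's# class="?Mso[A-Za-z0-9]*"?##g'
}

# Hypothetical wrapper: run a Word-generated page through tidy with the
# word-2000 option, then strip what is left. tidy exits with status 1
# when it only has warnings, so its exit status is not treated as fatal.
clean_word_html() {
    tidy -quiet --word-2000 yes --show-warnings no "$1" 2>/dev/null |
        strip_word_leftovers
}

# Example use (writes the cleaned page to a new file):
#   clean_word_html word-page.html > cleaned.html
```

Keeping the sed step in its own function makes it easy to add or drop patterns without touching the tidy call.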


Mac OS X Hints
http://hints.macworld.com/article.php?story=20060518123322675