Convert HTML Docs into Text Docs for Better Readability


Suppose you have an HTML document on your computer and you want its content in plain text. The obvious approach is to copy the content from the HTML document and paste it into a text file. That works if you have only a few HTML documents. But what if you have many? Would you repeat the same procedure for every one of them? I certainly would not, because it is a very tiring job.

There is a very simple tool called HTMLAsText that can convert a single HTML document, or a whole batch of them, into text. Suppose you saved some HTML pages while browsing the web. Later, when you are offline, you want to read those pages, but they may not display properly without an Internet connection. With HTMLAsText, you do not need to go back online to read them: it presents the same content in an orderly layout in a text document instead of an HTML document.

In fact, this is a very useful tool. You do not need to install it. When it runs, it does not store, change, or associate itself with any file on your computer. Simply double-click HTMLAsText and its window opens. Choose the source HTML document and the destination text file, experiment with the formatting options shown in the window, and you get a text file that is more readable than the HTML file. Its features are:

* HTMLAsText automatically removes all tags and scripts from the document.
* The remaining text is formatted according to the number of characters per line that you select.
* All HTML entities (e.g. &amp;, &lt;) are converted into the corresponding ASCII characters.
* Unordered lists (<ul> tag) and ordered lists (<ol> tag) are formatted accordingly. The bullets beside the items of unordered lists are replaced by ASCII characters according to your selection.
* Definition lists are formatted by adding spaces on the left side of the definition lines.
* Optionally, centered and right-aligned paragraphs are formatted accordingly by adding space characters on the left side of the lines.
* Optionally, HTMLAsText can add a line under each heading (<h1> – <h6> tags).
* Simple tables can be delimited by spaces, tab characters, commas, or CRLF.
* Preformatted text blocks (<pre> tag) are copied “as is”, without formatting the text.
* You can convert multiple HTML files in the same folder at once by using a wildcard (e.g. c:\files\*.html).
* You can run the conversion process without displaying any user interface, by using the /run command-line option.
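
To make a few of these features concrete, here is a minimal Python sketch of the same general idea: dropping tags and scripts, converting entities, marking list items with a bullet character, wrapping lines to a chosen width, and converting every file that matches a wildcard. This is not how HTMLAsText itself is implemented, and it covers only a small part of what the tool does. The names `TextExtractor`, `html_to_text`, and `convert_folder` are made up for this illustration, and the sketch assumes the input files are UTF-8.

```python
import glob
import os
import textwrap
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Hypothetical helper: collects visible text, skipping script/style."""

    SKIP = {"script", "style"}
    BLOCK = {"p", "div", "br", "ul", "ol", "tr",
             "h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self, bullet="*"):
        # convert_charrefs=True turns entities such as &amp; into characters
        super().__init__(convert_charrefs=True)
        self.bullet = bullet
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
        elif tag == "li":
            # start each list item on its own line with the chosen bullet
            self.parts.append("\n" + self.bullet + " ")
        elif tag in self.BLOCK:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth:
            self.skip_depth -= 1
        elif tag == "li" or tag in self.BLOCK:
            self.parts.append("\n")

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(data)


def html_to_text(markup, width=80, bullet="*"):
    """Return the visible text of `markup`, wrapped to `width` characters."""
    parser = TextExtractor(bullet)
    parser.feed(markup)
    parser.close()  # flush any text still buffered by the parser
    lines = []
    for raw in "".join(parser.parts).split("\n"):
        line = " ".join(raw.split())  # collapse runs of whitespace
        if line:
            lines.extend(textwrap.wrap(line, width=width))
    return "\n".join(lines)


def convert_folder(pattern, width=80):
    """Convert every file matching a wildcard pattern; return output paths."""
    written = []
    for src in glob.glob(pattern):
        dst = os.path.splitext(src)[0] + ".txt"
        # assumption: input is UTF-8; undecodable bytes are replaced
        with open(src, encoding="utf-8", errors="replace") as f:
            text = html_to_text(f.read(), width)
        with open(dst, "w", encoding="utf-8") as f:
            f.write(text + "\n")
        written.append(dst)
    return written
```

Calling `convert_folder("saved_pages/*.html", width=72)` (the folder name is only an example) would write a `.txt` file next to each matching HTML file. Unlike HTMLAsText, this sketch does not handle tables, definition lists, paragraph alignment, or preformatted blocks.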

The HTMLAsText home page, which includes a complete guide to using it, is on the NirSoft website.

Remember! This is a free tool.

