text2html Perl script README V2.0

INTRODUCTION

text2html is a Perl script that converts arbitrarily formatted text to HTML. The program looks for common formatting conventions and converts them to HTML markup. The result is usually a good-looking HTML file.

You can use the program in an environment with a huge number of legacy files in text format, or when your system cannot generate anything better than a text file. This can also be because of the people behind the tools.

You can also use the tool if you need really good-looking Web pages from text documents. In this case you can use the formatter to get a web page that serves as a starting point for your editing. Even this is usually better than starting from the plain text.

This tool is NOT intended to be a replacement for any HTML editor or any format2html converter where the format is something like RTF, DOC, TeX or the like. Those formats are better defined, and their converters can produce better-looking output because they can rely more on the source formatting and need fewer heuristics.

To see an example of what this program can do, see the HTML version of this file. Or, if you are reading the HTML version, compare what you are reading now to the original.

USAGE

Just start it on the command line:

    perl text2html.pl input_file output_file

It reads input_file and writes output_file. If you do not like the result, and the benefit is worth the effort, edit the result manually or with an HTML editor. (I also have scripts that repair some of the damage that some HTML editors do, so don't worry.)

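For the legacy-file scenario mentioned in the introduction, the command line above can be repeated over a whole directory. The following is only a minimal sketch in Python, not part of text2html itself. It assumes that perl is on the PATH, that text2html.pl sits in the current directory, and that the input files end in .txt. The function names, the .txt and .html extensions, and the directory names are all hypothetical.

```python
import subprocess
from pathlib import Path


def build_commands(src_dir, dst_dir, script="text2html.pl"):
    """Build one 'perl text2html.pl input_file output_file' command per .txt file."""
    commands = []
    # Assumption: legacy files use the .txt extension; output gets .html.
    for src in sorted(Path(src_dir).glob("*.txt")):
        dst = Path(dst_dir) / (src.stem + ".html")
        commands.append(["perl", script, str(src), str(dst)])
    return commands


def convert_all(src_dir, dst_dir, script="text2html.pl"):
    """Run the converter once per input file."""
    Path(dst_dir).mkdir(parents=True, exist_ok=True)
    for command in build_commands(src_dir, dst_dir, script):
        # check=True raises CalledProcessError if perl exits with a non-zero status.
        subprocess.run(command, check=True)
```

Calling convert_all("legacy_docs", "html_out") would then run the converter once for every .txt file in the (hypothetical) legacy_docs directory. Keeping build_commands separate makes it possible to inspect the commands before anything is executed.
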
FORMATTING

The program tries to find

o bulleted paragraphs, but no sub bullets, like this
  - this is the first,
  - the second, and
  - the third sub bullet
o numbered paragraphs, but no sub numbers
o eMail addresses
o verbatim paragraphs
o http references, like http://www.isys.hu/c/verhas, ftp://wuarchive.uw.edu or gopher://mygopher.com or news://nntp.net.com
o table of contents references
o headlines
o block quotes
o (C), (R) and [TM] symbols

And finally the program should take care of centered paragraphs as well.

In most cases the program finds the right constructs and formats the text properly. It can still happen quite often that the resulting format does not really match that of the text version. In such a case you are unlucky. The program was designed to work with most formatting. When judging formatting failures, please bear in mind that this program was tuned to convert this very read.me file flawlessly :-)

FEEDBACK

Praise is OK; honestly, I like it.

Telling me that this is shit as it is:

o without reasoning: I can stand it
o with a good, real explanation of why: I appreciate it and thank you (e.g. you know a really better one with high quality AI built in that does a better job)

If the program fails:

o Is it really a failure, or were you just unlucky and got badly formatted text? Is it in an infinite loop, or do you have a very long text and a very slow machine?
o Do you have the latest version?
o Try to find out why, and tell me the result.
o If you cannot find out the reason, produce the shortest possible input file that triggers the bug, and send it to me along with the following information: operating system, perl version, text2html version, your machine type, and your eMail address.
o Send your comments by eMail.

DISCLAIMER

The usual stuff, but anyway, here it is:

This software is provided on an "AS IS" basis, without warranty of any kind, including without limitation the warranties of merchantability, fitness for a particular purpose and non-infringement.
The entire risk as to the quality and performance of the Software is borne by you. Should the Software prove defective, you and not the author assume the entire cost of any service and repair.

AUTHOR

Peter Verhas
peter@verhas.com

CURRENT VERSION

2.0

HISTORY

- July 12, 1997. V0.0 created
- July 13, 1997. V1.0
- November 16, 1998. V2.0