1. - Overview
1.1. - Prerequisites
1.2. - Why
1.3. - What
1.4. - How
1.5. - HTML output
2. - Terminology / Jargon
3. - Language elements
3.1. - Document structure
3.1.1. - Line concatenations
3.1.2. - Headings
3.1.3. - Underlining
3.1.4. - HTML lists
3.1.5. - Sections containing source code / literal sections
3.2. - Keyword indexing
3.3. - Sections containing jargon or terminology
The "NML parser" is a Perl script which reads plain-text file(s) containing any kind of documentation or notes.
It generates nicely formatted, readable, indexed HTML from those input files.
Special markup tags may be added to the source text files, which affect the generated HTML.
If you want to see an example of an NML-generated document: you are currently reading one.
You must have Perl installed.
You may be able to get a free version of Perl from one of these sites:
· www.perl.com // O'Reilly site
· www.activeperl.com // Standard free Perl distribution
· www.indigostar.com // IndigoStar distributes Perl and Perl2Exe (commercial)
NML was created in order to help the author manage a large collection of technical notes.
The following are key features of NML:
· The source documents are themselves readable and usable.
· Compact
· Index generation - to quickly add search capability
· Platform independence (it's just plain text)
· Intuitive syntax
When the HTML files are generated, certain structures are recognised by the parser in the input files.
E.g.:
· Keyword cross-indexing
· Heading level management
· Some HTML markup (e.g. lists, underlining)
However, the NML parser will read any textual input file(s), and generate HTML from them.
If the input contains no NML markup, useful features like cross-indexing and document linking will be absent, but HTML will still be generated.
This is what happens:
Source text file(s) --> NML parser (Perl) --> Generated HTML
The NML grammar is simple enough to be implemented as a Perl script, so it will run on any platform which supports Perl.
The concept is similar to HTML, insofar as an HTML file is really just a plain text file containing "magic" markup tags (like <U>), which are meaningful to a browser.
An NML file just contains plain text with tags which are meaningful to the NML parser.
Note that the parser will overwrite any files in the output directory which were created earlier. This is probably what you want, but it means that any manual edits to the generated HTML will be lost: make your changes in the source text instead.
The command line interface
The NML parser is provided as a Perl script.
The CLI is as follows:
In general:
nml_parse.pl destination-directory file1 file2 ... fileN
E.g.
nml_parse.pl generated_docs my_doc1.txt my_doc2.txt
Unix users can set the magic string (the "#!" line) in the first line of the script to point at the Perl interpreter. The usual location is
/usr/bin/perl
So the magic string is
#!/usr/bin/perl
If you have installed Perl somewhere else, you will have to change the magic string.
DOS or Windows users may like to create a batch (.bat) file like the following:
perl c:\utils\nml_parse.pl c:\output file1.txt file2.txt
Technical details
The main development and testing was performed using Perl 5.6 under Red Hat Linux 7.2.
If you want to alter the way the HTML looks, you have to do it by altering the Perl code: there is no way to do it via NML markup. This keeps the source file as natural and human-readable as possible.
The generated HTML should conform to HTML 4.01 Transitional, as validated by the W3C Markup Validation Service at
http://validator.w3.org
Some simple CSS is also used, to give consistency in the output.
-
CLI - Command Line Interface
- How to run a script from a command line.
-
NML - Note Markup Language
- Formatting rules for plain text documents.
- Sorry but you get nowhere unless you have a fancy acronym ;-)
- The NML parser is implemented as a Perl script.
-
Perl - Practical Extraction and Report Language
- A very portable programming language, available free for many operating systems
- For example: Unix, Linux, MS-DOS, VMS, OS/2, Macintosh, Windows
This section describes how to write NML documents.
Each input filename (minus any suffix) will be used to generate the output HTML filename.
The first line in each source document will be used as the document's title.
It should be followed by one blank line.
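The two rules above can be sketched in Python. This is only an illustration, not the parser's actual Perl code: the function names are hypothetical, and the ".html" output suffix is an assumption.

```python
from pathlib import Path
from typing import Tuple


def output_path(dest_dir: str, source_file: str) -> Path:
    """Map a source file to its generated HTML file (".html" suffix is assumed)."""
    # Path.stem drops the final suffix: "my_doc1.txt" -> "my_doc1"
    return Path(dest_dir) / (Path(source_file).stem + ".html")


def split_title(text: str) -> Tuple[str, str]:
    """Return (title, body): the first line is the title, then one blank line."""
    lines = text.splitlines()
    title = lines[0].strip() if lines else ""
    rest = lines[1:]
    # Skip the single blank line that should follow the title, if present.
    if rest and rest[0].strip() == "":
        rest = rest[1:]
    return title, "\n".join(rest)
```

For example, output_path("generated_docs", "my_doc1.txt") gives the path generated_docs/my_doc1.html.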
If you want to generate a single long line of HTML, but keep the source text within a fixed line width, indent subsequent lines with a single space.
For example:
This
 will
 generate a
 single long
 line of
 HTML.
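A rough Python sketch of this joining rule follows. It is not the parser's Perl code. It assumes that a continuation line starts with exactly one space and that the pieces are joined with a single space; the function name is hypothetical.

```python
def join_continuations(lines):
    """Merge lines indented by exactly one space into the previous line."""
    merged = []
    for line in lines:
        # Assumption: one leading space marks a continuation; two or more do not.
        is_continuation = (
            line.startswith(" ")
            and not line.startswith("  ")
            and line.strip() != ""
        )
        if is_continuation and merged:
            merged[-1] = merged[-1].rstrip() + " " + line.strip()
        else:
            merged.append(line)
    return merged
```

Given the six source lines of the example above (with the last five indented by one space), this returns one line: "This will generate a single long line of HTML."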
Headings are generated by lines which start with a single digit inside curly braces.
E.g. Top level heading (a <H1>):
{0}Main Section
E.g. Lower level headings (<H2> etc):
{1}Subsection
E.g. Still lower level...
{2}Sub-subsection
etc.
Numbering is performed automatically, and heading text can contain keywords.
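To make the automatic numbering concrete, here is a small Python sketch. It is not the parser's Perl code. It assumes that {0} maps to <H1>, {1} to <H2> and so on, and that numbers are written in the "3.1.1." style used by this document's table of contents. All names are hypothetical.

```python
import re

# A heading line starts with a single digit inside curly braces, e.g. "{1}Subsection".
HEADING = re.compile(r"^\{(\d)\}(.*)$")


def number_headings(lines):
    """Return a list of (html_level, number, text) for each heading line."""
    counters = []  # counters[i] is the current count at nesting depth i
    result = []
    for line in lines:
        match = HEADING.match(line)
        if not match:
            continue
        depth = int(match.group(1))
        # Grow the counter list for deeper headings, then drop stale deeper levels.
        while len(counters) <= depth:
            counters.append(0)
        del counters[depth + 1:]
        counters[depth] += 1
        number = "".join(f"{n}." for n in counters)
        result.append((depth + 1, number, match.group(2).strip()))
    return result
```

Each new heading increments the counter at its own depth and resets all deeper counters, so a {1} heading after a new {0} heading starts again at ".1.".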
Entire lines can be underlined in two ways:
· Lines which begin with an underscore will be underlined.
· A line completely made up of dashes (-) will cause the previous line to be underlined.
Examples:
Lines beginning with underscore:
_This will be underlined
Lines underlined manually:
This line will also be underlined
---------------------------------
Note that:
It doesn't matter if your underlines don't quite line up
--------------------------------------------------------------
Individual words can be underlined by surrounding them with underscores.
E.g.
Only one _word_ will be underlined in this sentence.
Note that individual words are not underlined like this in literal sections, because C source code sometimes contains #define names which begin with an underscore.
HTML lists are easy to generate: just start the line with an asterisk character.
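The three underlining rules can be sketched in Python as follows. This is an illustration only, not the parser's Perl code. The <U> tag matches the tag mentioned earlier in this document, but the regular expressions, the order in which the rules are tried, and the function name are assumptions.

```python
import re

DASH_LINE = re.compile(r"^-+\s*$")         # a line made up only of dashes
WORD = re.compile(r"(?<!\w)_(\w+)_(?!\w)")  # a single _word_ between underscores
LITERAL_PREFIX = "\t[ "                     # literal sections are left untouched


def underline(lines):
    out = []
    for line in lines:
        if line.startswith(LITERAL_PREFIX):
            out.append(line)
        elif DASH_LINE.match(line) and out:
            # A dash-only line underlines the previous line, whatever its length.
            out[-1] = f"<U>{out[-1]}</U>"
        elif line.startswith("_"):
            # A leading underscore underlines the whole line.
            out.append(f"<U>{line[1:]}</U>")
        else:
            out.append(WORD.sub(r"<U>\1</U>", line))
    return out
```

In this sketch, a line that begins with an underscored word is treated as a whole-line underline; how the real parser resolves that case is not stated here.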
E.g.
This is some text before the list
* This will be the first item in the list
* Second item
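A minimal Python sketch of the list rule is shown below. It is not the parser's Perl code: the choice of <UL>/<LI> tags and the function name are assumptions made for illustration.

```python
def render_lists(lines):
    """Wrap runs of lines starting with "*" in an unordered HTML list."""
    out = []
    in_list = False
    for line in lines:
        if line.startswith("*"):
            if not in_list:
                out.append("<UL>")
                in_list = True
            out.append(f"<LI>{line[1:].strip()}</LI>")
        else:
            if in_list:
                out.append("</UL>")
                in_list = False
            out.append(line)
    if in_list:
        out.append("</UL>")  # close a list that runs to the end of the input
    return out
```

Consecutive asterisk lines become items of the same list; the first line which does not start with an asterisk closes it.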
To mark a line as source code (a literal section), begin the line with TAB Open-square-bracket SPACE ( [ ).
E.g.
Here are some normal notes
[ // Here is some source code
[ // There are two lines
Back to some more notes
The square-bracket character will disappear from the final HTML, so you can copy and paste source code samples straight into an editor.
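The literal-section rule can be sketched in Python like this. It is an illustration, not the parser's Perl code: wrapping the lines in <PRE>, escaping HTML special characters, and the function name are all assumptions.

```python
import html

LITERAL_PREFIX = "\t[ "  # TAB, open square bracket, SPACE


def render_literals(lines):
    """Strip the literal prefix and wrap runs of literal lines in <PRE>."""
    out = []
    in_literal = False
    for line in lines:
        if line.startswith(LITERAL_PREFIX):
            if not in_literal:
                out.append("<PRE>")
                in_literal = True
            # Drop the prefix so the code can be copied straight from the HTML.
            out.append(html.escape(line[len(LITERAL_PREFIX):], quote=False))
        else:
            if in_literal:
                out.append("</PRE>")
                in_literal = False
            out.append(line)
    if in_literal:
        out.append("</PRE>")
    return out
```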
Words are added to the index by surrounding them with special "highlighting" characters, e.g.
This {*word*} will be indexed.
Note:
· There must be no space between the "{" and the "*"
· Keyword indexing is case-sensitive
· Keyword indexing is not done in literal sections
One index page is generated per letter of the alphabet, regardless of whether there are any entries for that letter.
Anything indexed which doesn't start with a letter will be indexed under "symbols".
NML provides very specialised handling for technical definition and jargon lists, like those you can see in section 2 of this document.
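The indexing rules can be sketched in Python as follows. This is not the parser's Perl code. It assumes that index pages are keyed by upper-case letter and that literal lines are recognised by their TAB, "[", SPACE prefix. The names are hypothetical.

```python
import re
import string

KEYWORD = re.compile(r"\{\*(.+?)\*\}")  # no space allowed between "{" and "*"


def build_index(lines):
    """Return {bucket: sorted keywords}, with one bucket per letter plus "symbols"."""
    # One page per letter is created even when it has no entries.
    index = {letter: set() for letter in string.ascii_uppercase}
    index["symbols"] = set()
    for line in lines:
        if line.startswith("\t[ "):
            continue  # keyword indexing is not done in literal sections
        for keyword in KEYWORD.findall(line):
            first = keyword[0].upper()
            bucket = first if first in index else "symbols"
            # Keywords stay case-sensitive: "Word" and "word" are separate entries.
            index[bucket].add(keyword)
    return {bucket: sorted(words) for bucket, words in index.items()}
```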
The NML syntax is very rigidly defined, with the following rules:
· The entire section must be titled "Terminology/Jargon"
· Definitions start with the word, then a TAB, then its definition. They are automatically indexed.
· The meanings go on subsequent line(s), and must start with a TAB.
· There must be a blank line after the entire definition is finished.
Here is an example:
{0}Terminology / Jargon
-----------------------
CLI Command Line Interface
How to run a script from a command line.
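The definition rules above can be sketched in Python. This is an illustration under stated assumptions, not the parser's Perl code: the function name and the returned structure are hypothetical, and the input is assumed to be the lines that follow the section heading.

```python
def parse_definitions(lines):
    """Return {term: (definition, [meaning, ...])} from a terminology section body."""
    entries = {}
    current = None
    for line in lines:
        if not line.strip():
            current = None  # a blank line ends the current definition
        elif line.startswith("\t"):
            # Meanings go on subsequent lines and must start with a TAB.
            if current is not None:
                entries[current][1].append(line.strip())
        elif "\t" in line:
            # A definition line is: word, TAB, definition.
            term, definition = line.split("\t", 1)
            current = term.strip()
            entries[current] = (definition.strip(), [])
    return entries
```

In this sketch, each term found would also be added to the keyword index, since definitions are indexed automatically.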