Use all HTML 2.0 tags (except forms)


Contents: headings, lists, markup, entities, images, end

headings This is level 2

heading level 3

heading level 4

heading level 5
heading level 6

List constructs

HTML supports 5 different list constructs:
ul
Unordered lists
ol
Ordered lists
dl
definition lists (like this)
menu
menu lists
dir
directory lists

All of the list constructs can nest, with each successive nesting level indented relative to the previous one. The HTML blockquote element, although not technically a list construct, nests with the other lists.

Lists have two rendering styles, regular

Which leaves a little space between the elements, and compact

Compact lists are normally a bit more scrunched together, but when converting to ASCII text there is no difference.

Examples of list elements

This is a summary of the list types:

Here is an example of an ordered list

  1. This is item 1
  2. This is item 2
  3. You can see the list items are numbered
    1. When ordered lists are nested
    2. Each nesting level gets its own numbers
  4. This is the end of the ordered list
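The numbering behaviour shown above can be sketched in a few lines of Python. This is only an illustration, not the converter's actual code: the function name render_ordered and the (text, sublist) tuple representation are assumptions made for this example.

```python
def render_ordered(items, depth=1):
    """Render a nested ordered list as numbered, indented text lines.

    Each item is either a string or a (string, sublist) tuple.
    This data layout is hypothetical and chosen only for the sketch.
    """
    lines = []
    indent = "  " * depth
    for number, item in enumerate(items, start=1):
        text, children = item if isinstance(item, tuple) else (item, [])
        lines.append(f"{indent}{number}. {text}")
        # A nested list restarts its numbering and is indented one level deeper.
        lines.extend(render_ordered(children, depth + 1))
    return lines


example = [
    "This is item 1",
    "This is item 2",
    ("You can see the list items are numbered", [
        "When ordered lists are nested",
        "Each nesting level gets its own numbers",
    ]),
    "This is the end of the ordered list",
]
rendered = render_ordered(example)
print("\n".join(rendered))
```

Because the recursion passes depth + 1 and enumerate restarts at 1 on every call, each nesting level gets its own numbers and its own indentation.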

The other two list styles are directory lists and menu lists. For now, directory lists are rendered exactly like unordered lists, because no one uses them. Menu lists are also like unordered lists, but they're not supposed to nest, and they get a different list marker, like this:

  • menu item 1
  • menu item 2
  • menu item 3
    Mixed lists

    Definition lists, ordered lists, and un-ordered lists can be freely intermingled (block quotes too). Menu lists and directory lists can be combined with the others, but officially only at the highest nesting level.


    Phrase markup

    HTML has many tags that affect the font style. One can use either functional tags such as strong or cite, which is the preferred method, or markup tags such as bold or italic.

    The basic font style is a plain medium-roman font.

    HTML defines many tags that alter the font style. The three primary style alterations are:

    There are two additional styles that are deprecated, but still used in some contexts. They are

    Font styles may be arbitrarily nested to obtain combinations like bold italic or underlined bold mono-spaced. All of the combinations are enumerated somewhere else.

    The functional tags, which map to one or more of the combinations above, are:

  • cite - for citations
  • code - for code fragments
  • em - for emphasis
  • kbd - for user input
  • samp - for samples
  • strong - for extra emphasis
  • var - for variable references
    And here is a simple address (using the address tag):

    123 Main Anytown USA

    HTML entities

    In HTML, the characters &, <, >, and " are special, and need to be represented as: &amp;, &lt;, &gt;, and &quot;.

    In addition, the entire Latin-1 character set can be represented similarly, such as &reg; (®) or &ouml; (ö).
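
    As a small illustration of the entity rules above, Python's standard html module can encode the special characters and decode named entities. This is a sketch for readers, not part of HTML2text itself; the variable names are arbitrary.

```python
import html

# Encode the four special characters as entities.
escaped = html.escape('& < > "')
print(escaped)

# Decode named Latin-1 entities back to their characters.
decoded = html.unescape("&reg; &ouml;")
print(decoded)
```

    Note that html.escape also encodes the single quote by default, which goes beyond the four characters listed here.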


    Images

    Inline images are ignored when converting to text unless the alt attribute is used, in which case the alternate text given is output.
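
    The rule above can be sketched with Python's standard html.parser module. The class name AltTextExtractor and the helper to_text are hypothetical names introduced for this example, and the sketch makes no claim about how HTML2text implements the rule.

```python
from html.parser import HTMLParser


class AltTextExtractor(HTMLParser):
    """Collect plain text, substituting alt text for inline images."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Images without an alt attribute contribute nothing to the output.
        if tag == "img":
            alt = dict(attrs).get("alt")
            if alt:
                self.parts.append(alt)

    def handle_data(self, data):
        self.parts.append(data)


def to_text(markup):
    parser = AltTextExtractor()
    parser.feed(markup)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.parts)


with_alt = to_text('See <img src="logo.gif" alt="[logo]"> here')
without_alt = to_text('See <img src="logo.gif"> here')
print(with_alt)
print(without_alt)
```

    The first call yields the alternate text in place of the image; the second simply drops the image, leaving the surrounding text untouched.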


    Ending

    This file is distributed as part of HTML2text by Joe Moss (joe@morton.rain.com).