Everything hsc knows about HTML, it retrieves from a file named hsc.prefs at startup. This file contains information about all tags, entities and icon entities. Some special attributes are also set up there.
The main advantage of this concept is that it's rather easy to add new syntax elements. For this purpose the hsc tags <$deftag>, <$defent>, <$defstyle> and <$deficon> can be used.
It is a serious problem with HTML that no one can give you a competent answer to the question ``Now which tags are part of HTML?''. On the one hand, there is w3c, which you can meanwhile ignore; on the other hand, there are the developers of popular browsers, who implement whatever they like.
The hsc.prefs coming with this distribution should support most elements needed for everyday use. With the hsc V0.923 release, the prefs have been updated to HTML 4.01; since V0.925 there has also been support for automatic distinction between ``classic'' HTML and XHTML. If you run hsc in XHTML mode, some obsolete attributes will no longer be known, and new ones will be added.
hsc looks for hsc.prefs in several places, among them:
$HOME/lib, /usr/local/lib/hsc/ and /usr/lib/hsc/
PROGDIR:, which is automatically assigned to the directory where the hsc binary resides when hsc is invoked
If it is unable to find hsc.prefs anywhere, it will abort with an error message.
If you want to find out where hsc has read hsc.prefs from, you can use STATUS=VERBOSE when invoking hsc. This will display the preferences used.
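The lookup described above amounts to trying a list of directories in order and stopping at the first one that contains hsc.prefs. A minimal Python sketch of that idea follows; the function name find_prefs and the order of the candidate list are assumptions made for illustration, not hsc's actual code.

```python
import os

# Unix directories named in the text. The order hsc really uses is an
# assumption here, and "~" stands in for $HOME.
CANDIDATE_DIRS = [
    os.path.join(os.path.expanduser("~"), "lib"),
    "/usr/local/lib/hsc/",
    "/usr/lib/hsc/",
]


def find_prefs(dirs=CANDIDATE_DIRS, name="hsc.prefs"):
    """Return the first existing path to `name`, or None if no directory has it."""
    for directory in dirs:
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            return path
    return None
```

A caller would treat a None result as the fatal case, which is where hsc itself aborts with an error message.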
The <$defent> tag defines a new entity. The (required) attribute NAME declares the name of the entity, RPLC the character that should be replaced by this entity if found in the hsc-source, and NUM is the numeric representation of this entity. NUM may be in the range 128-65535, allowing any Unicode (UCS-2, to be exact) character to be assigned a corresponding entity. Definitions in the range 128-255 are done in the prefs file to allow users with character sets other than ISO-8859-1 (Latin-1) to change the replacement characters; some other characters, such as mathematical symbols or typographical entities, are predefined internally by hsc. They reside at fixed positions in the Unicode charset and are unlikely to ever change.
Example: <$defent NAME="uuml" RPLC="ü" NUM="252">
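To make the NUM constraint concrete, here is a small Python sketch that checks the documented range and builds the symbolic and numeric spellings of an entity. The helper names are hypothetical; this only illustrates the rule and says nothing about how hsc is implemented.

```python
def check_num(num):
    """Accept only the range the text gives for NUM: 128-65535."""
    if not 128 <= num <= 65535:
        raise ValueError("NUM must be in 128-65535, got %d" % num)
    return num


def symbolic(name):
    """Symbolic spelling of an entity, built from its NAME."""
    return "&%s;" % name


def numeric(num):
    """Numeric spelling of an entity, built from its NUM."""
    return "&#%d;" % check_num(num)


# Values taken from the example: NAME="uuml", NUM="252"
print(symbolic("uuml"), numeric(252))
```

For the example entity this yields the pair &uuml; and &#252;, and 252 is indeed the code point of the RPLC character in both Latin-1 and Unicode.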
The ENTITYSTYLE commandline option affects the way hsc will render entities in the resulting HTML file. Setting the PREFNUM attribute for an entity will make it use the numeric representation if ENTITYSTYLE=replace, no matter what representation was used in the source text.
Unlike previous versions, hsc 0.931 and later allow redefinition of entities. In this case, the symbolic and numeric representation must match the previous definition; only the PREFNUM flag and the RPLC character will be updated. This allows you to change the default rendering/replacement of internally defined entities. Warning #92 will be issued and should be ignored if you really want to do this.
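One way to read the redefinition rule: NAME and NUM must agree with the existing definition, while RPLC and PREFNUM are the only fields that may change. The following Python sketch models that reading with a hypothetical in-memory table. The data layout, the function name and the lookup by name are all assumptions, and the warning is represented simply as a returned flag.

```python
def define_entity(table, name, num, rplc=None, prefnum=False):
    """Add or redefine an entity in `table`, a dict keyed by entity name.

    Returns True when an existing entity was redefined, which is the case
    where the text says warning #92 is issued.
    """
    old = table.get(name)
    if old is None:
        table[name] = {"num": num, "rplc": rplc, "prefnum": prefnum}
        return False
    if old["num"] != num:
        # Symbolic and numeric representation must match the old definition.
        raise ValueError(
            "entity %r is already defined with NUM=%d" % (name, old["num"])
        )
    # Only the replacement character and the PREFNUM flag are updated.
    old["rplc"] = rplc
    old["prefnum"] = prefnum
    return True
```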
The <$deficon> tag defines a new icon-entity. Its only attribute, NAME, is required and declares the name of the icon.
Example: <$deficon NAME="mail">
The <$deftag> tag defines a new tag and is used much like <$macro>, except that a tag definition requires no macro text and no end tag to follow.
Example: <$deftag IMG SRC:uri/x/z/r ALT:string ALIGN:enum("top|bottom|middle") ISMAP:bool WIDTH:string HEIGHT:string>
To fully understand the above line, you might also want to read the sections about attributes and options for tags and macros.
For those who are not smart enough or simply too lazy, here are some simple examples, which should also work somehow, though some features of hsc might not work:
<$deftag BODY /CLOSE BGCOLOR:string>
<$deftag IMG SRC:uri ALT:string ALIGN:string ISMAP:bool>
The <$defstyle> tag lets you define a new CSS property and optionally a list of values that are allowed for it. If you omit the VAL attribute, any value will be permitted. Otherwise it should be a list in pretty much the same style as for enum parameters: words (which may include spaces) separated by vertical bars.
<$defstyle name="text-align" val="left|center|right|justify">
<$defstyle name="text-indent" val="%P">
<$defstyle name="clip" val="%r|auto">
The text-align property has a short list of four possible values, so they are simply listed as an enumeration. text-indent, on the other hand, is numeric, so its values cannot be listed exhaustively. Therefore, a special code resembling C-style format strings is used. The following are supported:
A code for values like those of the z-index property.
A code for values like those of word-spacing.
Like %n, but also allows percentages, e.g. for font-size.
%P: e.g. for text-indent.
A colour, e.g. for background-color. One of:
    a name from HSC.COLOR-NAMES,
    ``#rgb'' or ``#rrggbb'',
    ``rgb(r,g,b)'', where each of r, g and b may be a decimal value between 0 and 255 or a percentage between 0 and 100.
``uri(...)'', e.g. for background-image.
%r: ``rect(a,b,c,d)'' with a, b, c and d being numeric specs with a dimension, e.g. for clip.
Note: If both the above placeholders and an enumeration of values are used, as for ``clip'', the placeholder must be the first element!
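As an illustration of the colour forms listed above, the following Python sketch checks a value against the three spellings. It is not hsc's validator: the function names are invented, the set of colour names is a small stand-in for whatever HSC.COLOR-NAMES contains, and only whole numbers are accepted for the rgb() components to keep the example short.

```python
import re

# Stand-in for the names defined in HSC.COLOR-NAMES (assumed subset).
COLOR_NAMES = {"black", "white", "red", "green", "blue"}

_HEX = re.compile(r"#(?:[0-9a-fA-F]{3}|[0-9a-fA-F]{6})\Z")
_RGB = re.compile(
    r"rgb\(\s*([^,()]+?)\s*,\s*([^,()]+?)\s*,\s*([^,()]+?)\s*\)\Z"
)


def _component_ok(text):
    """True for a whole number 0-255 or a whole percentage 0-100."""
    if text.endswith("%"):
        digits, limit = text[:-1], 100
    else:
        digits, limit = text, 255
    return re.fullmatch(r"[0-9]+", digits) is not None and int(digits) <= limit


def is_color(value):
    """True if `value` is a known colour name, #rgb, #rrggbb or rgb(r,g,b)."""
    value = value.strip()
    if value in COLOR_NAMES or _HEX.match(value):
        return True
    match = _RGB.match(value)
    return bool(match) and all(_component_ok(c) for c in match.groups())
```

Each component is checked on its own, so a mix such as rgb(255, 0, 50%) passes, in line with the wording that each of r, g and b may be either a decimal value or a percentage.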
The <$varlist> tag defines an attribute list shortcut to support your laziness when editing the prefs file. It lets you collect an arbitrary number of attribute declarations under a single name that you can use later in <$deftag> or <$macro> tags by putting the shortcut name in square brackets.
Example:
<$varlist HVALIGN
ALIGN:enum("left|center|right|justify|char")
VALIGN:enum("top|middle|bottom|baseline")>
<$deftag THEAD /AUTOCLOSE /LAZY=(__attrs) /MBI="table" [HVALIGN]>
This is the same as:
<$deftag THEAD /AUTOCLOSE /LAZY=(__attrs) /MBI="table"
ALIGN:enum("left|center|right|justify|char")
VALIGN:enum("top|middle|bottom|baseline")>
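The shortcut behaves like textual substitution: a bracketed name in an attribute list stands for the declarations collected under that name. A rough Python sketch of that expansion, using the HVALIGN example above, might look as follows; the function name and the dictionary used to store the lists are assumptions for illustration only.

```python
import re


def expand_varlists(definition, varlists):
    """Replace each [NAME] in `definition` with the list stored under NAME."""

    def substitute(match):
        name = match.group(1)
        if name not in varlists:
            raise KeyError("unknown varlist: " + name)
        return varlists[name]

    return re.sub(r"\[(\w+)\]", substitute, definition)


varlists = {
    "HVALIGN": 'ALIGN:enum("left|center|right|justify|char") '
               'VALIGN:enum("top|middle|bottom|baseline")',
}

short = '<$deftag THEAD /AUTOCLOSE /LAZY=(__attrs) /MBI="table" [HVALIGN]>'
print(expand_varlists(short, varlists))
```

The printed line matches the long form of the THEAD definition shown above, apart from being written on a single line.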
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
Browsers should read that line, obtain the DTD and parse the source according to it. The problem about DTDs: they are written in SGML. And the problem about SGML: It's awful. It's unreadable. It's a pure brain-wanking concept born by some wireheads probably never seriously thinking about using it themselves. Even when there is free code available to SGML-parse text.
As a result, only a few browsers supported this. Because it was so easy to write a browser that spat on the SGML trash and simply parsed the code ``tag-by-tag'', developers decided to spend more time on making their product user-friendly rather than computer-friendly (which is really understandable).
These browsers became even more popular when they supported tags that certain people liked but that were not part of the DTDs. As DTDs were published by w3c, and w3c did not like those tags, they did not make it into the DTDs for a long time, or even not at all (which is really understandable, too).
This worked to a certain degree until HTML-2.0. Several people (at least most of the serious w3-authoring people) preferred to conform to w3c rather than use the funky-crazy-cool tags of some special browsers, and the funky-crazy-cool people did not care about DTDs or HTML-validators anyway.
However, after HTML-2.0, w3c fucked up. They proposed the infamous HTML-3.0 standard, which was never officially released, and tried to ignore things most browsers had already implemented (not all of which were useless crap, I daresay). After more than a year without any remarkable news from w3c, they finally canceled HTML-3.0, and instead came out with the pathetic HTML-0.32.
Nevertheless, many people were very happy about HTML-0.32, as it finally was a statement after which many things became clear. It became clear that you should not expect anything useful from w3c anymore. It became clear that the browser developers rule. It became clear that no one is going to provide useful DTDs in future, as browser developers are too lazy and incompetent to do so. It became clear that anarchy has broken out for HTML-specifications.
So, in conclusion, these are the reasons not to use DTDs but a format of its own.
Quite unexpectedly, with HTML-4.0 this has changed to some extent, as the DTDs are quite readable and well documented. The general syntax of course still sucks, error handling is unbearable for ``normal'' users and so on. Although it will take them more than this to get back the trust they abused in recent years, at least it is a little signal suggesting there are some small pieces of brain intact somewhere in this consortium.
There is also a disadvantage to this concept: reading hsc.prefs on every startup takes an awful lot of time. Usually, processing your main data takes less time than reading the preferences. You can reduce this time by creating your own hsc.prefs with all the tags and entities you don't need removed. But I recommend against this, because you might have to edit your preferences again with the next update of hsc if any new features have been added.