


NSGMLS(1)                                               NSGMLS(1)


NAME
       nsgmls - a validating SGML parser

       An SGML System Conforming to
       International Standard ISO 8879 --
       Standard Generalized Markup Language

SYNOPSIS
       nsgmls  [ -deglprsuvx ] [ -alinktype ] [ -ffile ] [ -iname
       ] [ -mfile ] [ -tfile ] [ -wwarning_type ]  [  filename...
       ]

DESCRIPTION
       Nsgmls  parses  and  validates the SGML document entity in
       filename...  and prints on the standard  output  a  simple
       text  representation  of its Element Structure Information
       Set.  (This is the  information  set  which  a  structure-
       controlled  conforming  SGML application should act upon.)
       Note that the document entity may be spread  amongst  sev-
       eral  files;  for  example, the SGML declaration, document
       type declaration and document instance set could  each  be
       in  a  separate file.  If no filenames are specified, then
       nsgmls will read the document  entity  from  the  standard
       input.   Each filename is actually interpreted as a system
       identifier.  A command line filename of - can be  used  to
       refer  to the standard input.  (Normally in a system iden-
       tifier, fd:0 is used to refer to standard input.)

       The following options are available:

       -alinktype
              Make link  type  linktype  active.   Not  all  ESIS
              information is output in this case: the active LPDs
              are not explicitly  reported,  although  each  link
              attribute  is  qualified  with  its link type name;
              there is no information about result elements; when
              there  are  multiple  link  rules applicable to the
              current element, nsgmls always chooses the first.

       -d     Warn about duplicate entity declarations.

       -e     Describe open entities in  error  messages.   Error
              messages  always  include  the position of the most
              recently opened external entity.

       -ffile Redirect errors to file.   This  is  useful  mainly
              with  shells  that  do  not  support redirection of
              stderr.

       -g     Show the GIs of open elements in error messages.

       -iname Pretend that

                     <!ENTITY % name "INCLUDE">

              occurs at the start of the document  type  declara-
              tion  subset  in  the  SGML document entity.  Since
              repeated definitions of an entity are ignored, this
              definition will take precedence over any other def-
              initions of this entity in the document type decla-
              ration.   Multiple  -i options are allowed.  If the
              SGML declaration replaces the reserved name INCLUDE
              then  the new reserved name will be the replacement
              text of the entity.  Typically  the  document  type
              declaration will contain

                     <!ENTITY % name "IGNORE">

              and  will use %name; in the status keyword specifi-
              cation of a marked section  declaration.   In  this
              case  the effect of the option will be to cause the
              marked section not to be ignored.

       -l     Output L commands giving the  current  line  number
              and filename.

       -mfile Map  public  identifiers and entity names to system
              identifiers using the catalog entry file whose sys-
              tem  identifier  is  file.  Multiple -m options are
              allowed.  Catalog entry files specified with the -m
              option will be searched before the defaults.

       -p     Parse  only  the  prolog.   Nsgmls  will exit after
              parsing the document type declaration.  Implies -s.

       -r     Warn about defaulted references.

       -s     Suppress  output.   Error  messages  will  still be
              printed.

       -tfile Output to  file  the  RAST  result  as  defined  by
              ISO/IEC 13673:1995 (actually this isn't quite an IS
              yet;  this  implements  the  Intermediate  Editor's
              Draft  of  1994/08/29,  with  changes  to implement
              ISO/IEC JTC1/SC18/WG8 N1777).  The normal output is
              not produced.

       -u     Warn about undefined elements: elements used in the
              DTD but not defined.

       -v     Print the version number.

       -wwarning_type
              Give warnings  according  to  the  value  of  warn-
              ing_type:

              mixed  Warn  about mixed content models that do not
                     allow #pcdata anywhere.

              sgmldecl
                     Warn about various dubious constructions  in
                     the SGML declaration.

              should Warn  about  various recommendations made in
                     ISO 8879 that the document does  not  comply
                     with.   (Recommendations  are expressed with
                     ``should'', as  distinct  from  requirements
                     which are usually expressed with ``shall''.)

              default
                     Warn about defaulted references.   (Same  as
                     -r.)

              duplicate
                     Warn  about  duplicate  entity declarations.
                     (Same as -d.)

              undefined
                     Warn about undefined elements: elements used
                     in the DTD but not defined.  (Same as -u.)

              all    Give all available warnings.

              Multiple -w options are allowed.

       -x     Suppress  check  that  for  each ID reference value
              there is an element with that ID.

       -X     If the -t option is being  used,  do  not  give  an
              error  when  a  character that is not a significant
              character in the reference concrete  syntax  occurs
              in  a literal in the SGML declaration.  This may be
              useful  in  conjunction  with  certain  buggy  test
              suites.
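
       As a rough illustration of how the options above combine,
       the following Python sketch assembles an nsgmls command
       line.  The helper name nsgmls_argv and the file names are
       hypothetical; the sketch only builds the argument list and
       does not run nsgmls.

```python
def nsgmls_argv(files, catalogs=(), warnings=(), include=(),
                suppress_output=False):
    """Build an nsgmls argument list from a few documented options."""
    argv = ["nsgmls"]
    if suppress_output:
        argv.append("-s")                  # -s: suppress normal output
    argv += ["-m" + c for c in catalogs]   # -mfile: extra catalog entry files
    argv += ["-w" + w for w in warnings]   # -wwarning_type
    argv += ["-i" + n for n in include]    # -iname: treat %name; as INCLUDE
    argv += list(files)                    # each one is a system identifier
    return argv


# Hypothetical file names, for illustration only.
print(" ".join(nsgmls_argv(["doc.sgm"], catalogs=["catalog"],
                           warnings=["all"], suppress_output=True)))
```

       The resulting list could be handed to a process launcher;
       option values are attached directly to the option letter,
       as in the synopsis.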

   External entities
       An external entity resides in one or more storage objects,
       each of which contains a sequence of  bytes.   The  entity
       manager  component  of  nsgmls  maps a sequence of storage
       objects into an entity as follows:

       1.     The bytes in each storage object are converted into
              characters, each represented by a single bit combi-
              nation, according to the encoding translation asso-
              ciated with the storage object.

       2.     The  characters in each storage object are concate-
              nated.

       3.     The sequence of characters is treated as a sequence
              of lines each terminated by a line terminator.  The
              line terminator is either a line feed or a carriage
              return or a carriage return followed by a line
              feed.  Nsgmls determines which line  terminator  to
              use  for a storage object according to which of the
              possible line terminators is  used  for  the  first
              line  of  the  storage  object.   A record start is
              inserted at the  beginning  of  each  line,  and  a
              record  end at the end of each line.  If there is a
              partial line (a line that doesn't end with the line
              terminator) at the end of the entity, then a record
              start will be inserted before it but no record  end
              will be inserted after it.
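
       Step 3 can be sketched in Python as follows.  This is an
       illustration of the rule as described, not the nsgmls
       implementation: the function name to_records is hypo-
       thetical, the sketch handles a single storage object, and
       the visible <RS> and <RE> markers stand in for the record
       start and record end characters.

```python
RS = "<RS>"   # stand-in for the record start character
RE = "<RE>"   # stand-in for the record end character


def to_records(text):
    """Wrap each line of one storage object in RS ... RE."""
    # The terminator used by the first line applies to the whole object.
    term = None
    for i, ch in enumerate(text):
        if ch == "\n":
            term = "\n"
            break
        if ch == "\r":
            term = "\r\n" if text[i + 1:i + 2] == "\n" else "\r"
            break
    if term is None:
        # No terminator at all: at most one partial line.
        # (Treating empty input as "no lines" is an assumption.)
        return RS + text if text else ""
    lines = text.split(term)
    out = [RS + line + RE for line in lines[:-1]]
    if lines[-1]:
        out.append(RS + lines[-1])   # partial last line: RS but no RE
    return "".join(out)


print(to_records("a\r\nb\r\nc"))
```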

       An  encoding translation defines a translation between the
       storage coding system and the entity coding  system.   The
       storage  coding  system represents characters by sequences
       of bytes; it can be  variable  width  and  stateful.   The
       entity coding system represents each character by a single
       bit combination; it is fixed-width (but not limited  to  8
       bits)  and  stateless.   Note  that  the  SGML declaration
       describes the entity coding system, not the storage coding
       system.

   System identifiers
       A  system  identifier  describes  a  sequence  of  storage
       objects, each optionally associated with an encoding trans-
       lation.  Nsgmls will attempt to interpret a system identi-
       fier as a keyword  followed  by  a  colon  followed  by  a
       string,  which  is interpreted in a keyword-dependent way.
       Keywords are case-insensitive.  The following keywords are
       recognized:

       file   The  string is interpreted as a filename.  The sys-
              tem identifier describes a  single  storage  object
              that will be read from the named file.

       fd     The string is interpreted as a number.  The system
              identifier describes a single storage object that
              will be read from the file descriptor with that
              number.  For example, fd:0 will read the storage
              object from standard input.

       concat The string is treated as a list of substrings sepa-
              rated by + characters.  Each of the  substrings  is
              in turn interpreted as a system identifier, and the
              sequences of storage objects that each  denote  are
              concatenated.    The   concat   system   identifier
              describes  the  resulting   sequence   of   storage
              objects.

       http   The  string  together  with  the  http:  prefix  is
              treated as a URL.  This is implemented  only  under
              Unix.

       utf8   The string is interpreted as a system identifier.
              Each storage object that it describes that is not
              associated with an encoding translation is associ-
              ated with an encoding translation that translates
              UTF8 to a fixed-width encoding.  Invalid multi-byte
              sequences are represented by the character  0xFFFD.
              This  keyword  is recognized only in the multi-byte
              version of nsgmls.

       replace
              The string is interpreted as a  system  identifier.
              Numeric  character references using the SGML refer-
              ence  concrete  syntax  will  be   recognized   and
              replaced  within  each  storage  object  identifier
              occurring in the system identifier.

       ucs2   The string is interpreted as a system identifier.
              Each storage object that it describes that is not
              associated with an encoding translation is associ-
              ated with an encoding translation that translates
              UCS2 to a fixed-width encoding.  The more signifi-
              cant octet of each character always precedes the
              less significant octet irrespective of the system's
              native byte-order.  The codes 0xFFFE and 0xFEFF are
              not treated specially in any way.  This keyword  is
              recognized   only  in  the  multi-byte  version  of
              nsgmls.

       unicode
              The string is interpreted as a system identifier.
              Each storage object that it describes that is not
              associated with an encoding translation is associ-
              ated with an encoding translation that translates
              the Unicode coding system to a fixed-width
              encoding.   The  Unicode  coding system treats each
              pair of octets as a character in the system's  byte
              order.   If  the  first character is the byte order
              mark character  (0xFEFF),  it  will  be  discarded.
              (This  is necessary to avoid problems with the SGML
              document entity: a byte order mark before the  SGML
              declaration would be a syntax error.)  If the first
              character is the byte order  mark  character  byte-
              swapped,  it  will  be  discarded and the remaining
              characters will be byte-swapped.  This  keyword  is
              recognized   only  in  the  multi-byte  version  of
              nsgmls.

       ujis   The string is interpreted as a system identifier.
              Each storage object that it describes that is not
              associated with an encoding translation is associ-
              ated with an encoding translation where the storage
              coding  system  is  variable-width  (packed)   UJIS
              (EUC), and the entity coding system represents each
              character in the same way as the EUC complete  two-
              byte  format.  In the entity coding system the code
              of characters in the G0 set (usually the Japanese
              version of ISO 646) is unchanged; the code of char-
              acters in the G1 set (usually JIS X 0208-1990) is
              ORed  with 0x8080; the code of characters in the G2
              set  (usually  half-width  katakana  from   JIS   X
              0201-1986) is ORed with 0x0080; the code of charac-
              ters in the G3 set (JIS X 0212-1990) is  ORed  with
              0x8000.   This  keyword  is  recognized only in the
              multi-byte version of nsgmls.

       sjis   The string is interpreted as a system identifier.
              Each storage object that it describes that is not
              associated with an encoding translation is associ-
              ated with an encoding translation where the storage
              coding system is Shift JIS and  the  entity  coding
              system is the same as with the ujis encoding trans-
              lation (except for characters in the G3  set  which
              are  not representable using Shift JIS.)  This key-
              word is recognized only in the  multi-byte  version
              of nsgmls.

       identity
              The string is interpreted as a system identifier.
              Each storage object that it describes that is not
              associated with an encoding translation is associ-
              ated with the identity encoding translation.  The
              identity coding system converts bytes to characters
              by zero-extending each byte.

       raw    The string is interpreted as a  system  identifier.
              No  translation  of line-terminators onto RS and RE
              characters  will  be  performed  for  each  storage
              object that it describes.  Error messages referring
              to these storage objects will not contain line num-
              bers.

       cooked The  string  is interpreted as a system identifier.
              This undoes the effect of any earlier raw  keyword.

       huge   This  keyword  is intended for use with huge files,
              for which the cost of keeping track of line  bound-
              aries  (roughly  one  byte  per line) is too large.
              The string is interpreted as a  system  identifier.
              For  each  storage object that it describes, nsgmls
              will not keep track of where line boundaries  occur
              as  it  usually  does.  Error messages referring to
              these storage objects will not  contain  line  num-
              bers.
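
       The identity, ucs2 and unicode translations described
       above can be sketched in Python.  The function names are
       hypothetical, each function returns a list of character
       numbers, and ignoring a trailing odd byte is an assumption
       that the text above does not address.

```python
import sys


def decode_identity(data):
    """identity: each byte is zero-extended to a character number."""
    return list(data)


def decode_ucs2(data):
    """ucs2: more significant octet first; 0xFEFF/0xFFFE not special."""
    return [(data[i] << 8) | data[i + 1]
            for i in range(0, len(data) - 1, 2)]


def decode_unicode(data, byteorder=sys.byteorder):
    """unicode: pairs of octets in the system's byte order, with BOM handling."""
    codes = [int.from_bytes(data[i:i + 2], byteorder)
             for i in range(0, len(data) - 1, 2)]
    if codes and codes[0] == 0xFEFF:
        return codes[1:]                      # discard the byte order mark
    if codes and codes[0] == 0xFFFE:
        # Byte-swapped mark: discard it and swap the remaining characters.
        return [((c & 0xFF) << 8) | (c >> 8) for c in codes[1:]]
    return codes


print(decode_ucs2(b"\xfe\xff\x00A"))
print(decode_unicode(b"\xfe\xff\x00A", "little"))
```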

       If  a system identifier does not contain a keyword or uses
       a keyword that is not recognized, then the system  identi-
       fier  will be treated as a filename.  Note that the system
       identifier file:utf8:doc.sgm  identifies  the  file  named
       utf8:doc.sgm  but  utf8:file:doc.sgm  identifies  the file
       named doc.sgm using the utf8 coding scheme.

       A relative filename in a system identifier is  interpreted
       relative  to  the  file  in which the system identifier is
       specified, if any, and otherwise relative to  the  current
       directory.  This applies both to system identifiers speci-
       fied in SGML documents, and to system  identifiers  speci-
       fied in catalog entry files.

       If  a  system  identifier  does  not  specify the encoding
       translation,  the  encoding  translation  of  the  storage
       object  in  which the system identifier was specified will
       be used.

       The raw keyword will be implied for an  NDATA  entity  and
       for  a  system identifier defined in a storage object that
       was raw.  This can be overridden using the cooked keyword.
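
       The keyword prefixes described in this section can be
       illustrated with a small Python parser.  It is a sketch
       under stated assumptions, not the nsgmls algorithm: the
       function name parse_sysid and the tuple layout are
       hypothetical, and the way modifiers are carried into
       concat substrings is a guess.

```python
MODIFIERS = {"utf8", "replace", "ucs2", "unicode", "ujis", "sjis",
             "identity", "raw", "cooked", "huge"}


def parse_sysid(sysid, modifiers=()):
    """Return a list of (kind, target, modifiers) storage objects."""
    keyword, colon, rest = sysid.partition(":")
    key = keyword.lower()                 # keywords are case-insensitive
    if colon and key in MODIFIERS:
        # The remainder is itself a system identifier.
        return parse_sysid(rest, modifiers + (key,))
    if colon and key == "concat":
        objects = []
        for part in rest.split("+"):
            objects += parse_sysid(part, modifiers)
        return objects
    if colon and key == "http":
        return [("http", sysid, modifiers)]   # URL includes the prefix
    if colon and key in ("file", "fd"):
        return [(key, rest, modifiers)]
    # No keyword, or an unrecognized one: treat it all as a filename.
    return [("file", sysid, modifiers)]


print(parse_sysid("file:utf8:doc.sgm"))
print(parse_sysid("utf8:file:doc.sgm"))
```

       The two calls reproduce the distinction drawn above: the
       first names a file called utf8:doc.sgm, the second names
       doc.sgm read with the utf8 translation.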

   System identifier generation
       If  a  system identifier is not specified, then the entity
       manager will attempt to generate one using  catalog  entry
       files in the format defined in the SGML Open Draft Techni-
       cal Resolution on Entity Management.  A catalog entry file
       contains  a  sequence  of  entries in one of the following
       forms:

       PUBLIC pubid sysid
              This specifies that sysid should  be  used  as  the
              system  identifier  if  the  public  identifier  is
              pubid.  Sysid is a system identifier as defined  in
              ISO  8879  and  pubid  is  a  public  identifier as
              defined in ISO 8879.

       ENTITY name sysid
              This specifies that sysid should  be  used  as  the
              system identifier if the entity is a general entity
              whose name is name.

       ENTITY %name sysid
              This specifies that sysid should  be  used  as  the
              system  identifier  if  the  entity  is a parameter
              entity whose name is name.  Note that there  is  no
              space between the % and the name.

       DOCTYPE name sysid
              This  specifies  that  sysid  should be used as the
              system  identifier  if  the  entity  is  an  entity
              declared in a document type declaration whose docu-
              ment type name is name.

       LINKTYPE name sysid
              This specifies that sysid should  be  used  as  the
              system  identifier  if  the  entity  is  an  entity
              declared in a link type declaration whose link type
              name is name.

       OVERRIDE
              This specifies that system identifiers specified in
              the  catalog  should  override  system  identifiers
              specified  in the document.  Normally, if an entity
              declaration in  the  document  specifies  a  system
              identifier, the catalog is not consulted.  If OVER-
              RIDE is specified, then  the  catalog  is  searched
              first; the system identifier specified in the
              document is used only if no match is found in the
              catalog.

       SGMLDECL sysid
              This  specifies  that if the document does not con-
              tain an SGML declaration, the SGML  declaration  in
              sysid should be implied.

       The last four forms are extensions to the SGML Open for-
       mat.  The delimiters can be omitted from the sysid pro-
       vided it does not contain any white space.  Comments,
       delimited by -- as in SGML, are allowed between parameters.
       The  environment  variable  SGML_CATALOG_FILES  contains a
       list of catalog entry files.  The  list  is  separated  by
       colons  under  Unix and by semi-colons under MSDOS.  These
       will be searched after any catalog entry  files  specified
       using  the -m option.  If this environment variable is not
       set, then a system dependent list of catalog  entry  files
       will  be used.  A match in a catalog entry file for a PUB-
       LIC entry will take precedence over a match  in  the  same
       file for an ENTITY, DOCTYPE or LINKTYPE entry.
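
       A minimal Python sketch of reading one catalog entry file
       and applying the lookup rules above follows.  The names
       parse_catalog and resolve, the sample public identifier and
       the file names are hypothetical.  Letting the first of two
       duplicate entries win is an assumption, and searching
       several catalog files in order is not modelled.

```python
import re

# A comment, a quoted literal, or a bare token.
TOKEN = re.compile(r"""--.*?--|"([^"]*)"|'([^']*)'|(\S+)""", re.S)


def tokenize(text):
    """Split catalog text into tokens, dropping -- comments --."""
    return [m.group(m.lastindex) for m in TOKEN.finditer(text)
            if m.lastindex is not None]


def parse_catalog(text):
    toks = tokenize(text)
    cat = {"PUBLIC": {}, "ENTITY": {}, "DOCTYPE": {}, "LINKTYPE": {},
           "OVERRIDE": False, "SGMLDECL": None}
    i = 0
    while i < len(toks):
        kw = toks[i]
        if kw == "OVERRIDE":
            cat["OVERRIDE"] = True
            i += 1
        elif kw == "SGMLDECL" and i + 1 < len(toks):
            cat["SGMLDECL"] = toks[i + 1]
            i += 2
        elif kw in ("PUBLIC", "ENTITY", "DOCTYPE", "LINKTYPE") \
                and i + 2 < len(toks):
            cat[kw].setdefault(toks[i + 1], toks[i + 2])  # first entry wins
            i += 3
        else:
            i += 1                                        # skip unknown token
    return cat


def resolve(cat, kind, name, pubid=None, declared_sysid=None):
    """kind is ENTITY, DOCTYPE or LINKTYPE; parameter entities use %name."""
    if declared_sysid is not None and not cat["OVERRIDE"]:
        return declared_sysid          # catalog is not consulted
    if pubid is not None and pubid in cat["PUBLIC"]:
        return cat["PUBLIC"][pubid]    # PUBLIC match takes precedence
    if name in cat[kind]:
        return cat[kind][name]
    return declared_sysid              # fall back to the document's sysid


sample = """
-- hypothetical entries --
PUBLIC "-//Example//DTD Doc//EN" doc.dtd
ENTITY %chars "chars.ent"
DOCTYPE doc other.dtd
"""
cat = parse_catalog(sample)
print(resolve(cat, "DOCTYPE", "doc", pubid="-//Example//DTD Doc//EN"))
```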

   System declaration
       The system declaration for nsgmls is as follows:

                               SYSTEM "ISO 8879:1986"
                                       CHARSET
       BASESET  "ISO 646-1983//CHARSET
                 International Reference Version (IRV)//ESC 2/5 4/0"
       DESCSET  0 128 0
       CAPACITY PUBLIC  "ISO 8879:1986//CAPACITY Reference//EN"
                                      FEATURES
       MINIMIZE DATATAG NO        OMITTAG  YES     RANK     YES   SHORTTAG YES
       LINK     SIMPLE  YES 65535 IMPLICIT YES     EXPLICIT YES 1
       OTHER    CONCUR  NO        SUBDOC   YES 100 FORMAL   YES
       SCOPE    DOCUMENT
       SYNTAX   PUBLIC  "ISO 8879:1986//SYNTAX Reference//EN"
       SYNTAX   PUBLIC  "ISO 8879:1986//SYNTAX Core//EN"
                                      VALIDATE
                GENERAL YES       MODEL    YES     EXCLUDE  YES   CAPACITY NO
                NONSGML YES       SGML     YES     FORMAL   YES
                                        SDIF
                PACK    NO        UNPACK   NO

       The limit for the SUBDOC parameter is memory dependent.

       Any legal concrete syntax may be used.

   SGML declaration
       The SGML declaration may be omitted, in which case the
       following declaration will be implied:
                           <!SGML "ISO 8879:1986"
                                   CHARSET
       BASESET  "ISO 646-1983//CHARSET
                 International Reference Version (IRV)//ESC 2/5 4/0"
       DESCSET    0  9 UNUSED
                  9  2  9
                 11  2 UNUSED
                 13  1 13
                 14 18 UNUSED
                 32 95 32
                127  1 UNUSED
       CAPACITY PUBLIC    "ISO 8879:1986//CAPACITY Reference//EN"
       SCOPE    DOCUMENT
       SYNTAX
       SHUNCHAR CONTROLS 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
                18 19 20 21 22 23 24 25 26 27 28 29 30 31 127 255
       BASESET  "ISO 646-1983//CHARSET International Reference Version
                 (IRV)//ESC 2/5 4/0"
       DESCSET  0 128 0
       FUNCTION RE                    13
                RS                    10
                SPACE                 32
                TAB       SEPCHAR     9
       NAMING   LCNMSTRT  ""
                UCNMSTRT  ""
                LCNMCHAR  "-."
                UCNMCHAR  "-."
                NAMECASE  GENERAL     YES
                          ENTITY      NO
       DELIM    GENERAL   SGMLREF
                SHORTREF  SGMLREF
       NAMES    SGMLREF
       QUANTITY SGMLREF
                ATTCNT    99999999
                ATTSPLEN  99999999
                DTEMPLEN  24000
                ENTLVL    99999999
                GRPCNT    99999999
                GRPGTCNT  99999999
                GRPLVL    99999999
                LITLEN    24000
                NAMELEN   99999999
                PILEN     24000
                TAGLEN    99999999
                TAGLVL    99999999
                                  FEATURES
       MINIMIZE DATATAG   NO
                OMITTAG   YES
                RANK      YES
                SHORTTAG  YES
       LINK     SIMPLE    YES 1000
                IMPLICIT  YES
                EXPLICIT  YES 1
       OTHER    CONCUR    NO
                SUBDOC    YES 99999999
                FORMAL    YES
                                APPINFO NONE>
       with the exception that characters 160 through 254 will be
       assigned to DATACHAR.
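
       To make the document character set of the implied decla-
       ration concrete, the Python sketch below expands its
       DESCSET triples (described character number, number of
       characters, base character number or UNUSED) into a table.
       The names are hypothetical, UNUSED is represented by None,
       and the exception for characters 160 through 254 is not
       modelled.

```python
# (described number, count, base number or None for UNUSED),
# copied from the CHARSET section of the implied declaration.
IMPLIED_DESCSET = [
    (0, 9, None), (9, 2, 9), (11, 2, None), (13, 1, 13),
    (14, 18, None), (32, 95, 32), (127, 1, None),
]


def expand_descset(entries):
    """Map each described character number to its base number (or None)."""
    mapping = {}
    for start, count, base in entries:
        for offset in range(count):
            mapping[start + offset] = (None if base is None
                                       else base + offset)
    return mapping


charset = expand_descset(IMPLIED_DESCSET)
used = sorted(n for n, base in charset.items() if base is not None)
print(len(charset), "described,", len(used), "in use")
```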

       A character in a base character set is described either by
       giving its number in a  universal  character  set,  or  by
       specifying  a  minimum  literal.   The  constraints on the
       choice of universal character set are that characters that
       are significant in the SGML reference concrete syntax must
       be in the universal character set and must have  the  same
       number in the universal character set as in ISO 646; that
       each character in the character set must be represented by
       exactly one number; and that character numbers in the
       ranges 0 to 31 and 127 to 159 are control characters (for
       the purpose of enforcing SHUNCHAR CONTROLS).  It is
       recommended that ISO 10646 (Unicode) be used as the
       universal character set, except in environments where the
       normal document character sets are large character sets
       that cannot be compactly described in terms of ISO 10646.
       The public identifier of a base character set can be asso-
       ciated  with an entity that describes it by using a PUBLIC
       entry in the catalog entry file.  The  entity  must  be  a
       fragment of an SGML declaration consisting of the portion
       of a character set description following the DESCSET
       keyword; that is, it must be a sequence of character
       descriptions, where each character description specifies a
       described  character  number, the number of characters and
       either a character number in the universal character  set,
       a  minimum  literal or the keyword UNUSED.  Character num-
       bers in the universal character  set  can  be  as  big  as
       99999999.

       In addition nsgmls has built in knowledge of a few charac-
       ter sets.  These  are  identified  using  the  designating
       sequence  in  the public identifier.  The following desig-
       nating sequences are recognized:
       Designating       ISO         Minimum      Number
         Escape      Registration   Character       of             Description
        Sequence        Number       Number     Characters
       ------------------------------------------------------------------------------
       ESC 2/5 4/0        -             0          128       full set of ISO 646 IRV
       ESC 2/8 4/0        2             0          128       G0 set of ISO 646 IRV
       ESC 2/8 4/2        6             0          128       G0 set of ASCII
       ESC 2/1 4/0        1             0           32       C0 set of ISO 646

       The graphic character sets do not strictly include C0  and
       C1  control  character sets.  For convenience, nsgmls aug-
       ments the graphic character sets with the appropriate con-
       trol character sets.

       It  is  not  necessary for every character set used in the
       SGML declaration to be known to nsgmls provided that char-
       acters  in the document character set that are significant
       both in the reference concrete syntax and in the described
       concrete  syntax  are described using known base character
       sets and that  characters  that  are  significant  in  the
       described  concrete  syntax  are  described using the same
       base character sets or the same minimum literals  in  both
       the document character set description and the syntax ref-
       erence character set description.

       The public identifier for a public concrete syntax can  be
       associated with an entity that describes it by using a PUBLIC
       entry in the catalog entry file.  The  entity  must  be  a
       fragment  of  an SGML declaration consisting of a concrete
       syntax description starting with the SHUNCHAR  keyword  as
       in  an  SGML declaration.  The entity can also make use of
       the following extensions:

              An added function can be expressed as  a  parameter
              literal instead of a name.

              The  replacement  for a reference reserved name can
              be expressed as a parameter literal  instead  of  a
              name.

              The  LCNMSTRT, UCNMSTRT, LCNMCHAR and UCNMCHAR key-
              words may each be followed by more than one parame-
              ter  literal.  A sequence of parameter literals has
              the same meaning  as  a  single  parameter  literal
              whose  content  is the concatenation of the content
              of each of the  literals  in  the  sequence.   This
              extension  is  useful because of the restriction on
              the length of a parameter literal in the SGML  dec-
              laration to 240 characters.

              The  total number of characters specified for UCNM-
              CHAR or UCNMSTRT may exceed  the  total  number  of
              characters   specified  for  LCNMCHAR  or  LCNMSTRT
              respectively.  Each character in UCNMCHAR or  UCNM-
              STRT  which does not have a corresponding character
              in the same position in  LCNMCHAR  or  LCNMSTRT  is
              simply  assigned  to  UCNMCHAR  or UCNMSTRT without
              making it the upper-case form of any character.

              A parameter literal following any of the LCNMSTRT,
              UCNMSTRT, LCNMCHAR and UCNMCHAR keywords may be
              followed by the name token ... and another
              parameter literal.  This has the same meaning as
              the two parameter literals with a parameter
              literal in between
              containing  in order each character whose number is
              greater than the number of the  last  character  in
              the  first parameter literal and less than the num-
              ber of the first character in the second  parameter
              literal.  A parameter literal must contain at least
              one character for each ...  to which  it  is  adja-
              cent.

              A  number  may be used as a parameter following the
              LCNMSTRT, UCNMSTRT, LCNMCHAR and UCNMCHAR  keywords
              or  as  a  delimiter  in the DELIM section with the
              same meaning as a parameter literal containing just
              a numeric character reference with that number.

              The  parameters  following  the LCNMSTRT, UCNMSTRT,
              LCNMCHAR and  UCNMCHAR  keywords  may  be  omitted.
              This  has  the  same meaning as specifying an empty
              parameter literal.

              Within the specification  of  the  short  reference
              delimiters,  a parameter literal containing exactly
              one character may be followed by the name token ...
              and  another  parameter  literal containing exactly
              one character.  This has  the  same  meaning  as  a
              sequence of parameter literals, one for each char-
              acter number that is greater than or equal to the
              number of the character in the first parameter lit-
              eral and less than or equal to the  number  of  the
              character in the second parameter literal.
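
              Unlike the naming ranges, this range includes both
              end points and yields one literal per character.
              A minimal Python sketch, assuming that ord() gives
              the character number; the function name
              expand_shortref_range is hypothetical:

```python
from typing import List


def expand_shortref_range(first: str, second: str) -> List[str]:
    """Expand  "first" ... "second"  into single-character literals.

    Both end points are included in the result.
    """
    if len(first) != 1 or len(second) != 1:
        raise ValueError("each literal must contain exactly one character")
    return [chr(n) for n in range(ord(first), ord(second) + 1)]


print(expand_shortref_range("a", "d"))
```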

       The  public  identifier  for  a public capacity set can be
       associated with an entity that describes it using a PUBLIC
       entry  in  the  catalog  entry file.  The entity must be a
       fragment of an SGML declaration consisting of  a  sequence
       of capacity names and numbers.

   Output format
       The output is a series of lines.  Lines can be arbitrarily
       long.  Each line consists of an initial command  character
       and  one  or more arguments.  Arguments are separated by a
       single space, but when a command takes a fixed  number  of
       arguments  the last argument can contain spaces.  There is
       no space between the command character and the first argu-
       ment.    Arguments   can   contain  the  following  escape
       sequences.

       \\     A \.

       \n     A record end character.

       \|     Internal SDATA entities are bracketed by these.

       \nnn   The character whose code is nnn octal.

       A record start character  will  be  represented  by  \012.
       Most  applications  will need to ignore \012 and translate
       \n into newline.
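
       An application might decode these escape sequences along
       the following lines.  This is a sketch under stated
       assumptions rather than part of nsgmls: the name
       decode_argument is hypothetical, \nnn is assumed to have
       exactly three octal digits, and the \| brackets around
       SDATA text are simply replaced by a caller-chosen marker
       (empty by default).

```python
import re

# One escape: backslash followed by \, n, | or three octal digits.
_ESCAPE = re.compile(r"\\(\\|n|\||[0-7]{3})")


def decode_argument(arg, sdata_mark=""):
    """Decode the escape sequences in one argument."""

    def replace(match):
        body = match.group(1)
        if body == "\\":
            return "\\"
        if body == "n":
            return "\n"            # record end becomes newline
        if body == "|":
            return sdata_mark      # SDATA bracket
        code = int(body, 8)
        if code == 0o12:
            return ""              # record start is ignored
        return chr(code)

    return _ESCAPE.sub(replace, arg)


print(decode_argument(r"first line\n\012second line"))
```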

       The possible command characters and arguments are as  fol-
       lows:

       (gi    The start of an element whose generic identifier is
              gi.  Any attributes for this element will have been
              specified with A commands.

       )gi    The end of an element whose generic identifier is gi.

       -data  Data.

       &name  A reference to an external data entity  name;  name
              will have been defined using an E command.

       ?pi    A processing instruction with data pi.

       Aname val
              The  next  element  to  start has an attribute name
              with value val which takes  one  of  the  following
              forms:

              IMPLIED
                     The value of the attribute is implied.

              CDATA data
                     The  attribute  is  character data.  This is
                     used for attributes whose declared value  is
                     CDATA.

              NOTATION nname
                     The attribute is a notation name; nname will
                     have been defined using an N command.  This
                     is  used for attributes whose declared value
                     is NOTATION.

              ENTITY name...
                     The attribute is a list  of  general  entity
                     names.   Each  entity  name  will  have been
                     defined using an I, E or S command.  This is
                     used  for attributes whose declared value is
                     ENTITY or ENTITIES.

              TOKEN token...
                     The attribute is a list of tokens.  This  is
                     used  for attributes whose declared value is
                     anything else.

       Dename name val
              This is the same as the A command, except  that  it
              specifies  a  data attribute for an external entity
              named ename.  Any D commands will come after the  E
              command  that  defines  the  entity  to  which they
              apply, but before any & or A commands  that  refer-
              ence the entity.

       atype name val
              The next element to start has a link attribute with
              link type type, name name,  and  value  val,  which
              takes the same form as with the A command.

       Nnname Define a notation nname.  This command will be
              preceded by a p command if the notation was
              declared with a public identifier, and by an s
              command if the notation was declared with a system
              identifier.   A notation will only be defined if it
              is to be referenced in an E command or in an A com-
              mand  for  an  attribute  with  a declared value of
              NOTATION.

       Eename typ nname
              Define an external data entity named ename with
              type typ (CDATA, NDATA or SDATA) and notation
              nname.  This command will be preceded by one or
              more f commands giving the filenames generated by
              the entity manager from the system and public
              identifiers, by a p command if a public identifier
              was declared for the entity, and by an s command if
              a system identifier was declared for the entity.
              nname will have been defined using an N command.
              Data attributes may be specified for the entity
              using D commands.  An external data entity will
              only be defined if it is to be referenced in an &
              command or in an A command for an attribute whose
              declared value is ENTITY or ENTITIES.

       Iename typ text
              Define  an  internal  data  entity named ename with
              type typ (CDATA or SDATA) and entity text text.  An
              internal  data entity will only be defined if it is
              referenced in an A command for an  attribute  whose
              declared value is ENTITY or ENTITIES.

       Sename Define a subdocument entity named ename.  This com-
              mand will be preceded by one  or  more  f  commands
              giving  the  filenames generated by the entity man-
              ager from the system and public identifiers, by a p
              command if a public identifier was declared for the
              entity, and by an s command if a system identifier
              was  declared for the entity.  A subdocument entity
              will only be defined if it is  referenced  in  a  {
              command  or  in an A command for an attribute whose
              declared value is ENTITY or ENTITIES.

       ssysid This command applies to the next E, S or N  command
              and specifies the associated system identifier.

       ppubid This  command applies to the next E, S or N command
              and specifies the associated public identifier.

       ffilename
              This command applies to the next E or S command and
              specifies  the  effective  system  identifier.  The
              effective system identifier is the  system  identi-
              fier  generated  by  the  system from the specified
              external identifier and other information about the
              entity.

       {ename The  start  of  the  SGML subdocument entity ename;
              ename will have been defined using an S command.

       }ename The end of the SGML subdocument entity ename.

       Llineno file
       Llineno
              Set the current  line  number  and  filename.   The
              filename  argument will be omitted if only the line
              number has changed.  This will be  output  only  if
              the -l option has been given.

       #text  An  APPINFO  parameter of text was specified in the
              SGML declaration.  This is not strictly part of the
              ESIS,  but  a  structure-controlled  application is
              permitted to act on it.  No # command will be  out-
              put  if  APPINFO NONE  was  specified.  A # command
              will occur at most once, and may be  preceded  only
              by a single L command.

       C      This command indicates that the document was a con-
              forming SGML document.  If this command is  output,
              it  will  be the last command.  An SGML document is
              not  conforming  if  it  references  a  subdocument
              entity that is not conforming.
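
       A program reading this output could split each line into
       its command character and arguments roughly as shown
       below.  This is a hedged sketch, not a reference
       implementation: the names SPLITS, parse_line and
       parse_attribute_value are hypothetical, and the split
       counts are one reading of the command list above.
       Escape sequences are left undecoded.

```python
# How many times to split the text after the command character on a
# single space.  The last argument keeps any remaining spaces.
# Commands not listed take the whole remainder as one argument.
SPLITS = {"A": 1, "D": 2, "a": 2, "E": 2, "I": 2, "L": 1}


def parse_line(line):
    """Return (command_character, [arguments]) for one output line."""
    line = line.rstrip("\n")
    if not line:
        raise ValueError("empty line")
    command, rest = line[0], line[1:]
    if rest == "":
        return command, []          # for example the C command
    return command, rest.split(" ", SPLITS.get(command, 0))


def parse_attribute_value(val):
    """Split the val part of an A, D or a command into (kind, data)."""
    kind, _, data = val.partition(" ")
    if kind == "IMPLIED":
        return kind, None
    if kind in ("ENTITY", "TOKEN"):
        return kind, data.split(" ")    # a list of names or tokens
    return kind, data                   # CDATA data or NOTATION nname


for raw in ["ATITLE CDATA Hello world", "(DOC", "-Some text", ")DOC", "C"]:
    print(parse_line(raw))
```

       Splitting a bounded number of times keeps spaces inside
       the last argument intact, which matters for CDATA
       attribute values and for filenames in L commands.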

ENVIRONMENT
       NSGMLS_CODE
              If this is set to the name of an encoding translation,
              then that encoding translation will be used
              as  the default encoding translation for everything
              (including file input, file output, message output,
              filenames  and  command line arguments).  Otherwise
              the identity encoding  translation  will  be  used.
              Setting this to ucs2 or unicode is unlikely to give
              reasonable results.

SEE ALSO
       The SGML Handbook, Charles F. Goldfarb
       ISO 8879 (Standard Generalized Markup Language),  Interna-
       tional Organization for Standardization

BUGS
       Not all ESIS information for LINK is reported.

AUTHOR
       James Clark (jjc@jclark.com).