SAP NetWeaver '04
| Interface Summary | |
| IHTMLContentHandler | IHTMLContentHandler receives events from an IHTMLReader. |
| IHTMLElement | Represents an HTML tag for an event. |
| IHTMLElementStart | Extends IHTMLElement to handle attributes. |
| IHTMLFilter | Processes HTML events from a parent reader. |
| IHTMLReader | Reads HTML documents and generates events. |
| Class Summary | |
| HTMLFilterImpl | Default implementation of IHTMLFilter. |
| HTMLInputStream | An InputStream on top of an IHTMLReader. |
| HTMLReaderFactory | HTMLReaderFactory creates instances of IHTMLReader. |
| HTMLScriptRemover | Removes script content and noscript tags. |
| HTMLStreamWriter | Writes events from an IHTMLReader onto a stream. |
| HtmlTag | Copyright (c) SAP AG 2001-2002 |
| HtmlTokenizer | HtmlTokenizer Copyright (c) SAP AG 2001-2003 |
| Exception Summary | |
| HTMLException | HTMLException is the base class for all exceptions in this package. |
Contains classes that handle the parsing of HTML.
HtmlTokenizer and
HtmlTag implement a "pull"-style
parsing of HTML documents.
The client of HtmlTokenizer calls next() until the end
of the document is reached. The tokenizer returns the type of the next
parsed token and also its string content. A client can then use HtmlTag
to access the string content of a TAG token in a structured way.
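The pull loop described above can be sketched as follows. This is a minimal stand-in, not the SAP HtmlTokenizer: the class PullSketch, its token constants, and the content() accessor are hypothetical names chosen for illustration, and only the idea of calling next() until the end of the document comes from the description.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a pull tokenizer; NOT the SAP HtmlTokenizer API.
class PullSketch {
    static final int EOF = 0, TAG = 1, TEXT = 2;

    private final String doc;
    private int pos = 0;
    private String content = "";

    PullSketch(String doc) { this.doc = doc; }

    // Advances to the next token and returns its type.
    int next() {
        if (pos >= doc.length()) { content = ""; return EOF; }
        int start = pos;
        if (doc.charAt(pos) == '<') {
            int close = doc.indexOf('>', pos);
            if (close >= 0) {
                pos = close + 1;
                content = doc.substring(start, pos);
                return TAG;
            }
            // Unterminated '<': report the rest of the document as text.
            pos = doc.length();
            content = doc.substring(start);
            return TEXT;
        }
        int lt = doc.indexOf('<', pos);
        pos = (lt < 0) ? doc.length() : lt;
        content = doc.substring(start, pos);
        return TEXT;
    }

    // String content of the token returned by the last call to next().
    String content() { return content; }

    // Client loop: call next() until the end of the document is reached.
    static List<String> tokens(String html) {
        PullSketch t = new PullSketch(html);
        List<String> out = new ArrayList<>();
        for (int type = t.next(); type != EOF; type = t.next()) {
            out.add((type == TAG ? "TAG:" : "TEXT:") + t.content());
        }
        return out;
    }
}
```

The client, not the parser, drives the loop here, which is what makes this style "pull".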
IHTMLReader and
IHTMLContentHandler are the
basic interfaces for "push"-style parsing of HTML documents.
IHTMLReader closely follows the SAX API approach. A content handler
is installed in a reader and receives events for every parsed
document part. A client of IHTMLReader invokes parse()
on the reader, whereupon the complete document is read. During this, all
events are sent to the installed content handler.
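A minimal sketch of the push style, under stated assumptions: the Handler and Reader types below are hypothetical stand-ins, not IHTMLContentHandler or IHTMLReader, and the callback names tag() and text() as well as setHandler() are invented for illustration. Only the parse() call that reads the complete document is taken from the description.

```java
import java.util.ArrayList;
import java.util.List;

class PushSketch {
    // Hypothetical callback interface, loosely modeled on the SAX idea.
    interface Handler {
        void tag(String raw);
        void text(String chars);
    }

    // Hypothetical reader: parse() reads the whole document and pushes events.
    static class Reader {
        private final String doc;
        private Handler handler;

        Reader(String doc) { this.doc = doc; }

        void setHandler(Handler h) { handler = h; }

        void parse() {
            int pos = 0;
            while (pos < doc.length()) {
                int close = doc.charAt(pos) == '<' ? doc.indexOf('>', pos) : -1;
                if (close >= 0) {
                    handler.tag(doc.substring(pos, close + 1));
                    pos = close + 1;
                } else {
                    // Text runs up to the next '<' (or to the end of the document).
                    int lt = doc.indexOf('<', pos + 1);
                    int end = lt < 0 ? doc.length() : lt;
                    handler.text(doc.substring(pos, end));
                    pos = end;
                }
            }
        }
    }

    // Installs a collecting handler, parses, and returns the received events.
    static List<String> collect(String html) {
        List<String> events = new ArrayList<>();
        Reader r = new Reader(html);
        r.setHandler(new Handler() {
            public void tag(String raw) { events.add("tag " + raw); }
            public void text(String chars) { events.add("text " + chars); }
        });
        r.parse();
        return events;
    }
}
```

In contrast to the pull style, the client makes a single call and the reader decides when each callback fires.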
As a mixture between "push" and "pull", IHTMLReader also offers a
way of "controlled-push" parsing. The client can invoke
parseNextEvent(), whereupon the reader sends one event
to the content handler and then returns to the client.
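The controlled-push idea can be sketched like this. The types are hypothetical stand-ins, and the boolean return value of parseNextEvent() (false at end of document) is an assumption made for this sketch; the description above does not state what the real method returns.

```java
import java.util.ArrayList;
import java.util.List;

class ControlledPushSketch {
    // Hypothetical single-method handler; the real handler interface differs.
    interface Handler { void event(String part); }

    static class Reader {
        private final String doc;
        private final Handler handler;
        private int pos = 0;

        Reader(String doc, Handler handler) {
            this.doc = doc;
            this.handler = handler;
        }

        // Sends at most one event to the handler, then returns to the caller.
        // Assumption: returns false once the end of the document is reached.
        boolean parseNextEvent() {
            if (pos >= doc.length()) return false;
            int close = doc.charAt(pos) == '<' ? doc.indexOf('>', pos) : -1;
            int end;
            if (close >= 0) {
                end = close + 1;
            } else {
                int lt = doc.indexOf('<', pos + 1);
                end = lt < 0 ? doc.length() : lt;
            }
            handler.event(doc.substring(pos, end));
            pos = end;
            return true;
        }
    }

    // The client keeps control: it stops after 'limit' events were delivered.
    static List<String> firstEvents(String html, int limit) {
        List<String> seen = new ArrayList<>();
        Reader r = new Reader(html, seen::add);
        while (seen.size() < limit && r.parseNextEvent()) {
            // Each iteration delivers exactly one event to the handler.
        }
        return seen;
    }
}
```

Because control returns after every event, the client can stop early without reading the rest of the document.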
<meta>
tag as explained here.
IHTMLFilter is a general filter
interface for the push parser. Filters can be chained and appear as
an IHTMLReader to the client. Each filter installs itself as content
handler in its parent IHTMLReader. The default implementation is
HTMLFilterImpl, which implements the identity function, i.e. all
events are forwarded unchanged.
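A minimal sketch of filter chaining, assuming hypothetical Handler and Reader interfaces (these are not the SAP interfaces, and the method names are invented). IdentityFilter plays the role of an identity implementation that forwards every event unchanged; UpperCaseText is an illustrative subclass that overrides a single callback.

```java
import java.util.Locale;

class FilterSketch {
    // Hypothetical interfaces standing in for the handler and reader roles.
    interface Handler { void tag(String raw); void text(String chars); }
    interface Reader { void setHandler(Handler h); void parse(); }

    // Source reader that replays a fixed list of document parts.
    static class ListReader implements Reader {
        private final String[] parts;
        private Handler handler;
        ListReader(String... parts) { this.parts = parts; }
        public void setHandler(Handler h) { handler = h; }
        public void parse() {
            for (String p : parts) {
                if (p.startsWith("<")) handler.tag(p); else handler.text(p);
            }
        }
    }

    // Identity filter: a Reader to its client, a Handler to its parent.
    static class IdentityFilter implements Reader, Handler {
        private final Reader parent;
        private Handler next;
        IdentityFilter(Reader parent) {
            this.parent = parent;
            parent.setHandler(this); // the filter installs itself in its parent
        }
        public void setHandler(Handler h) { next = h; }
        public void parse() { parent.parse(); }
        public void tag(String raw) { next.tag(raw); }
        public void text(String chars) { next.text(chars); }
    }

    // Example filter: changes text events, forwards tags unchanged.
    static class UpperCaseText extends IdentityFilter {
        UpperCaseText(Reader parent) { super(parent); }
        @Override public void text(String chars) {
            super.text(chars.toUpperCase(Locale.ROOT));
        }
    }

    // Chains two filters on a source reader and collects the output.
    static String run(String... parts) {
        Reader chain = new IdentityFilter(new UpperCaseText(new ListReader(parts)));
        StringBuilder out = new StringBuilder();
        chain.setHandler(new Handler() {
            public void tag(String raw) { out.append(raw); }
            public void text(String chars) { out.append(chars); }
        });
        chain.parse();
        return out.toString();
    }
}
```

Starting from an identity filter means a concrete filter only has to override the events it wants to change.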
The output from a filter/reader can be written to an OutputStream
or Writer by using the HTMLStreamWriter.
Likewise, the output from a filter/reader can be used as an InputStream
to read from by using the HTMLInputStream.
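To illustrate the idea of turning events back into a character or byte stream, here is a sketch that uses only standard java.io classes. WriterHandler and the single-method Handler interface are hypothetical; they do not reproduce HTMLStreamWriter or HTMLInputStream, and the InputStream variant simply buffers everything in memory, which the real class may well avoid.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

class StreamSketch {
    // Hypothetical event callback; the real handler methods are not shown here.
    interface Handler { void event(String part); }

    // Writes every received event verbatim to a Writer.
    static class WriterHandler implements Handler {
        private final Writer out;
        WriterHandler(Writer out) { this.out = out; }
        public void event(String part) {
            try {
                out.write(part);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    // Sends the events to a Writer and returns what was written.
    static String serialize(String... events) {
        StringWriter sink = new StringWriter();
        Handler h = new WriterHandler(sink);
        for (String e : events) h.event(e);
        return sink.toString();
    }

    // Exposes the serialized events as an InputStream.
    // Assumption of this sketch: the whole output is buffered in memory first.
    static InputStream asInputStream(String... events) {
        byte[] bytes = serialize(events).getBytes(StandardCharsets.UTF_8);
        return new ByteArrayInputStream(bytes);
    }
}
```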
<html>).
The basic working assumption for the parsers is: "report anything which does
not look like a tag as a text token/event."
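One possible reading of this rule is sketched below. The heuristic in looksLikeTag() (a '<' followed by a letter, '/', '!' or '?', with a later '>') is purely an assumption for illustration; the actual criteria used by the parsers are not stated here.

```java
import java.util.ArrayList;
import java.util.List;

class TagHeuristicSketch {
    // Assumed heuristic, not the documented behavior of the SAP parsers.
    static boolean looksLikeTag(String doc, int pos) {
        if (pos + 1 >= doc.length() || doc.charAt(pos) != '<') return false;
        char c = doc.charAt(pos + 1);
        boolean plausibleStart = Character.isLetter(c) || c == '/' || c == '!' || c == '?';
        return plausibleStart && doc.indexOf('>', pos) > pos;
    }

    // Splits a document into TAG and TEXT parts; anything that does not
    // look like a tag is accumulated into the surrounding text.
    static List<String> split(String doc) {
        List<String> out = new ArrayList<>();
        StringBuilder text = new StringBuilder();
        int pos = 0;
        while (pos < doc.length()) {
            if (looksLikeTag(doc, pos)) {
                if (text.length() > 0) {
                    out.add("TEXT:" + text);
                    text.setLength(0);
                }
                int close = doc.indexOf('>', pos);
                out.add("TAG:" + doc.substring(pos, close + 1));
                pos = close + 1;
            } else {
                text.append(doc.charAt(pos++));
            }
        }
        if (text.length() > 0) out.add("TEXT:" + text);
        return out;
    }
}
```

With this heuristic, a stray '<' as in "1 < 2" stays part of the text instead of starting a tag.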
Neither parser handles namespace declarations (they are reported as attributes on the tag/element) or namespace prefixes. IHTMLReader elements have only a name, and the prefix is part of that name. As a consequence, XHTML documents that use a non-empty namespace prefix for the XHTML namespace will not be handled properly by content handlers.
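The consequence can be illustrated with plain string handling. The helper names below are hypothetical, and stripping the prefix is only a possible workaround: it ignores which namespace the prefix is actually bound to.

```java
class PrefixSketch {
    // A handler that compares the reported element name directly
    // will miss prefixed names such as "xhtml:body".
    static boolean isBody(String elementName) {
        return "body".equalsIgnoreCase(elementName);
    }

    // Possible workaround (assumption): drop everything up to the first colon.
    static String localName(String elementName) {
        int colon = elementName.indexOf(':');
        return colon < 0 ? elementName : elementName.substring(colon + 1);
    }
}
```

For example, isBody("xhtml:body") is false, while isBody(localName("xhtml:body")) is true.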