AddTag method
Adds an HTML tag to be parsed. Only the tags added to the parser
are recognized. The other parts of the page are packed into
plain/text nodes and treated as regular text. Thus the document
model returned by the ParseString method contains only the parts
of interest: those explicitly specified through the AddTag method.
Syntax:
parserobject.AddTag TagName, HasClosing, SelfNestingAllowed, _
CopyContent, CannotContainOtherTags, _
RequiredAttr, RequiredAttrValue
Parameters:
TagName - String, case insensitive. The tag name, for example
"TITLE" or "BODY".
HasClosing - Boolean. If True, the tag has a closing tag, for
example </TITLE>. If False, the tag is assumed to be self-closing.
SelfNestingAllowed - Boolean. If True, the tag may contain tags
of the same kind, e.g. a <DIV> may contain other <DIV> tags.
If False, the content of the tag is not searched for nested
tags of the same kind, which is appropriate for TITLE, for
example.
CopyContent - Boolean. If True, the content of the tag is
placed in the "__content" item of the resulting node
(node("__content") will contain the text between
<TAG> and </TAG>). This can be convenient when
searching for data. Changing the "__content" item in
the resulting node will NOT change the document regenerated by
the GenerateDoc method! In most cases False should be used to
lower memory consumption.
CannotContainOtherTags - Boolean. Instructs the parser to not
search the tag content for the other tags specified through
other calls to the AddTag method.
RequiredAttr and RequiredAttrValue - Strings. If they are
empty they are ignored. If RequiredAttr is set to a nonempty
string, then the tag is recognized only if it contains an HTML
attribute with this name. If RequiredAttrValue is also nonempty,
then this attribute must also be set to the specified value for
the tag to be recognized. For example, this technique is used
for SCRIPT RUNAT=SERVER tags in ASP pages. It can be useful
for your own needs too: it is often useful to extract
information from META tags, for example. META tags that have a
NAME attribute are typically related to the page content
(they contain keywords, for example). Using these parameters you
can request only the META tags that have this attribute and
ignore the others.
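
The RequiredAttr / RequiredAttrValue rule described above can be
modelled in a few lines of plain VBScript. This is only an
illustrative sketch of the rule, not the parser's own code: the
function name TagIsRecognized, the use of a Scripting.Dictionary
to hold one tag's attributes, and the case-insensitive comparison
of the attribute value are all assumptions made for the example.
It runs under Windows Script Host (cscript), not inside the parser.

```vbscript
Option Explicit

' Hypothetical helper modelling the documented rule.
' attrs: Scripting.Dictionary with one tag's attributes,
'        keys stored in upper case (an assumption of this sketch).
Function TagIsRecognized(attrs, requiredAttr, requiredAttrValue)
    If requiredAttr = "" Then
        ' Empty RequiredAttr is ignored: the tag name alone decides
        TagIsRecognized = True
    ElseIf Not attrs.Exists(UCase(requiredAttr)) Then
        ' The required attribute is missing from the tag
        TagIsRecognized = False
    ElseIf requiredAttrValue = "" Then
        ' Attribute present and no particular value requested
        TagIsRecognized = True
    Else
        ' Assumption: values are compared without regard to case
        TagIsRecognized = (StrComp(attrs(UCase(requiredAttr)), _
            requiredAttrValue, vbTextCompare) = 0)
    End If
End Function

' Attributes of <META NAME="keywords" CONTENT="parser, template">
Dim metaAttrs
Set metaAttrs = CreateObject("Scripting.Dictionary")
metaAttrs.Add "NAME", "keywords"
metaAttrs.Add "CONTENT", "parser, template"

' Has NAME with the requested value: recognized
WScript.Echo CStr(TagIsRecognized(metaAttrs, "NAME", "Keywords"))
' Has no HTTP-EQUIV attribute: not recognized
WScript.Echo CStr(TagIsRecognized(metaAttrs, "HTTP-EQUIV", ""))
```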
Samples:
parser.AddTag "META", False, False, False, True, _
"NAME", "Keywords"
Instructs the parser to extract the META tags whose NAME
attribute is set to "Keywords".
parser.AddTag "DIV", True, False, False, False, _
"DIVTYPE", "Advert"
Instructs the parser to extract the DIV elements whose HTML
attribute "DIVTYPE" is set to "Advert", for example so that
the script can insert rotating advertisements into them.
Remarks:
Using the TextEmbedParser makes it possible to think about the
HTML in a way similar to the way a script sees the page in the
browser, but on the server side. The object model is stripped
down to the tags you request, and is thus simplified.
The parser does not encode or decode the content of the tags,
e.g. &nbsp; (non-breaking space) will appear "as is". When HTML
is used as a template, decoding and encoding the template are
not very useful because the unchanged parts of the template are
just copied or moved around the document tree. But when placing
text into template nodes or replacing them, the script may need
to use ASP's Server.HTMLEncode method. Decoding and encoding are
also ambiguous because the parser extracts only part of the HTML
model and cannot be sure which part of the unparsed text is HTML
code and which is plain text. In the rare cases when part of the
template must be decoded, only the script can perform such
operations, based on prior knowledge of the template structure.
Such techniques are rarely needed when working with templates.
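
As a minimal sketch of the encoding step mentioned above, the
following ASP fragment encodes a piece of text with
Server.HTMLEncode before it would be placed into a template node.
The variable names and the sample string are hypothetical, and how
the encoded text is then assigned to a node is deliberately left
out because it depends on the node interface.

```vbscript
<%
' Hypothetical text coming from a form field or a database
Dim rawText, safeText
rawText = "Fish & Chips <special>"

' The parser will not encode this for us, so the script does it
safeText = Server.HTMLEncode(rawText)

' safeText now holds: Fish &amp; Chips &lt;special&gt;
Response.Write safeText
%>
```

Text that is already HTML (for example a fragment copied from
another template node) should not be passed through
Server.HTMLEncode, or its markup will show up as literal text.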