Parser is an abstract class with the following interfaces:
Node& nextNode();
void setRoot(Node& root);
NodeStream& copy();
Parser is implemented by several concrete classes, the most useful of which are:
Parser is very similar to NodeSource, but differs in offering a guarantee that it provides a series of NEW nodes, which are not associated with any existing document or subtree.
When nextNode() is invoked on a parser, it returns a reference to a node. To construct a document from a parser (an XMLParser, say), we call nextNode() repeatedly and use each returned node reference to build the document. This is typically done in a loop:
XMLParser parser(sourcePath);
while (true) {
    Node& node = parser.nextNode();
    if (!node.isValid()) break;
    // process the node here
}
Within the loop, the node can be processed using hand-written code:
XMLParser parser(sourcePath);
while (true) {
    Node& node = parser.nextNode();
    if (!node.isValid()) break;
    // hand-written processing
    // skip anything that is not an element:
    if (node.getNodeType() != Node::ELEMENT_NODE) continue;
    DOMString nodeName = node.getQualifiedName();
    // output the node name:
    wrt::Console << "FOUND: " << nodeName << wrt::endl;
}
Or, for standard operations, it is easier to use a Processor:
XMLParser parser(sourcePath);
while (true) {
    Node& node = parser.nextNode();
    if (!node.isValid()) break;
    // use a Processor to output the node
    XMLWriter().process(node);
}
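To illustrate why a Processor is convenient, the sketch below packages the hand-written "FOUND:" logic from the earlier loop as a reusable object. It is only an illustration: ToyNode, ToyProcessor, NameReporter and report() are hypothetical stand-ins invented here, and the real Limpid Processor interface may well differ.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for a node; not the Limpid Node class.
struct ToyNode {
    enum Type { ELEMENT_NODE, TEXT_NODE };
    Type type;
    std::string name;
};

// Assumed shape of a processor: one operation applied to one node.
class ToyProcessor {
public:
    virtual ~ToyProcessor() = default;
    virtual void process(const ToyNode& node) = 0;
};

// Packages the hand-written "FOUND: name" logic as a processor.
class NameReporter : public ToyProcessor {
public:
    explicit NameReporter(std::ostream& out) : out_(out) {}

    void process(const ToyNode& node) override {
        if (node.type != ToyNode::ELEMENT_NODE) return;  // skip non-elements
        out_ << "FOUND: " << node.name << '\n';
    }

private:
    std::ostream& out_;
};

// Drives a processor over a sequence of nodes, much as the loops above do.
std::string report(const std::vector<ToyNode>& nodes) {
    std::ostringstream out;
    NameReporter reporter(out);
    ToyProcessor& processor = reporter;  // the loop only sees the interface
    for (const ToyNode& node : nodes) processor.process(node);
    assert(out.good());  // writing to an in-memory buffer should not fail
    return out.str();
}
```

Because the loop only depends on the processor interface, a different operation can be swapped in without touching the loop itself.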
A ProcessBox is a container for processors, which are added with the void addProcessor(processor) function. It has several constructors, which you should investigate. We deal here with two of them:
The first of these constructs a ProcessBox that manages a Parser; the other, a ProcessBox that is itself managed, and which is added to a ProcessBox, just like other processors, because it is itself a Processor.
ProcessBox has one more interface function, void process(), which takes no parameters. When process() is invoked on a ProcessBox that has no Parser, it does nothing. Otherwise, a Node reference is obtained from the parser and passed, in order, to each of the added Processors, but with an important feature:
This may sound complicated, but it has two important advantages:
A Parser makes nodes available on condition that they be respected. Specifically, each reference returned by nextNode() refers to a node held inside the parser, and that node is replaced the next time nextNode() is called. It is therefore essential to persist each node (using Node::copy()) before the next call. For example, Builder (a Processor) constructs a document from the nodes supplied by a Parser, and this succeeds only if each node is copied before it is handed over. The construction takes place in a loop:
Document getDocument(const DOMString& sourcePath) {
    XMLParser parser(sourcePath);
    Builder builder;
    while (true) {
        Node& node = parser.nextNode();
        if (!node.isValid()) break;
        builder.process(node.copy());
    }
    return builder.getDocument(); // return a document value (not reference)
}
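The need for the copy can be seen in a small self-contained sketch. It does not use Limpid: ToyNode, ToyParser, collectWithoutCopy() and collectWithCopy() are hypothetical names invented for illustration, and the toy parser merely imitates the behaviour described above by reusing one internal node for every call to nextNode().

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a node; not the Limpid Node class.
struct ToyNode {
    std::string name;
    bool valid = false;
    bool isValid() const { return valid; }
    ToyNode copy() const { return *this; }  // independent value copy
};

// Toy parser that, like the parsers described above, hands out a
// reference to a single node held inside the parser.
class ToyParser {
public:
    explicit ToyParser(std::vector<std::string> names)
        : names_(std::move(names)) {}

    // Overwrites the internal node on every call.
    ToyNode& nextNode() {
        if (pos_ < names_.size()) {
            current_.name = names_[pos_++];
            current_.valid = true;
        } else {
            current_.name.clear();
            current_.valid = false;
        }
        return current_;
    }

private:
    std::vector<std::string> names_;
    std::size_t pos_ = 0;
    ToyNode current_;
};

// Wrong: keeps pointers to the parser's internal node.
std::vector<std::string> collectWithoutCopy(ToyParser& parser) {
    std::vector<const ToyNode*> kept;
    while (true) {
        ToyNode& node = parser.nextNode();
        if (!node.isValid()) break;
        // Every retained pointer aliases the same internal node.
        assert(kept.empty() || kept.front() == &node);
        kept.push_back(&node);
    }
    std::vector<std::string> names;
    for (const ToyNode* n : kept) names.push_back(n->name);
    return names;
}

// Right: persists each node before the next call to nextNode().
std::vector<std::string> collectWithCopy(ToyParser& parser) {
    std::vector<ToyNode> kept;
    while (true) {
        ToyNode& node = parser.nextNode();
        if (!node.isValid()) break;
        kept.push_back(node.copy());
    }
    std::vector<std::string> names;
    for (const ToyNode& n : kept) names.push_back(n.name);
    return names;
}
```

With a toy parser over the names "a", "b", "c", collectWithoutCopy() ends up with three entries that all reflect the parser's final (cleared) node, whereas collectWithCopy() retains the three names as they were delivered.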
Limpid provides an adapter class, CloneParser, to help out here, so that we can use any of the standard streams to generate a safe series of nodes:
ListStream listStream(document.getDocumentElement());
CloneParser parser(listStream, true); // true specifies deep clones
Document newDocument;
Element newRoot("ROOT");
newDocument.appendChild(newRoot);
while (true) {
    Node& node = parser.nextNode();
    if (!node.isValid()) break;
    newRoot.appendChild(node);
}
Here, we have set up a new document, added a document element, and then attached to it a deep clone (a complete copy, including descendants) of each child node of the original document element, regardless of node type. The original document is not changed in any way, because the CloneParser handles the cloning details. If we wished to copy only the direct children themselves, without their descendants, we would pass false as the flag in the CloneParser constructor.
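The difference between the two flag values can be illustrated with a toy tree. The sketch below is not Limpid code: ToyElement, clone() and cloneChildrenInto() are hypothetical names, and it assumes that a false flag produces a copy of each child without its descendants.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an element; not the Limpid Element class.
struct ToyElement {
    std::string name;
    std::vector<std::unique_ptr<ToyElement>> children;

    explicit ToyElement(std::string n) : name(std::move(n)) {}

    ToyElement& appendChild(std::unique_ptr<ToyElement> child) {
        assert(child && "child must not be null");
        children.push_back(std::move(child));
        return *children.back();
    }

    // deep == true copies the whole subtree; false copies this node only.
    std::unique_ptr<ToyElement> clone(bool deep) const {
        auto copy = std::make_unique<ToyElement>(name);
        if (deep) {
            for (const auto& child : children)
                copy->appendChild(child->clone(true));
        }
        return copy;
    }

    // Counts this node plus all of its descendants.
    std::size_t countNodes() const {
        std::size_t total = 1;
        for (const auto& child : children) total += child->countNodes();
        return total;
    }
};

// Mirrors the loop above: attach a clone of each child of `source`
// to a new root, leaving `source` itself untouched.
std::unique_ptr<ToyElement> cloneChildrenInto(const ToyElement& source,
                                              bool deep) {
    auto newRoot = std::make_unique<ToyElement>("ROOT");
    for (const auto& child : source.children)
        newRoot->appendChild(child->clone(deep));
    return newRoot;
}
```

For a source element with two children, one of which has two children of its own, the deep variant yields a new tree of five nodes (the root plus four clones) and the shallow variant yields three, while the source keeps all five of its nodes in both cases.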