Limpid is about the DOM and using the DOM, and the centre of the DOM is the Node. So the first set of classes that I developed were those in the limpid/dom directory: classes for each of the Node types, plus NodeList and NamedNodeMap, all conforming to the W3C Recommendation.
It did not take me long to discover the deceptive simplicity of a DOM tree: child nodes appended to parent nodes and attributes set on elements. To construct a small document, the code in Java is pretty straightforward:
Document document = new Document();
Element docElement = new Element("DE");
document.appendChild(docElement);
docElement.setAttribute("attr1", "attr1Val");
Element child = new Element("child");
docElement.appendChild(child);
...etc
Now all this is simple and safe. Using the new keyword, we create nodes and connect them together. We need not be concerned about where these nodes are located or how their lifetimes are managed; Java's memory management does all the worrying for us. Now let us transport this code into C++. Perhaps not surprisingly, it is pretty similar:
Document *document = new Document();
Element *docElement = new Element("DE");
document->appendChild(docElement);
docElement->setAttribute("attr1", "attr1Val");
Element *child = new Element("child");
docElement->appendChild(child);
...etc
It is really the same code; it is just that Java has chosen to use reference (dot) syntax instead of the pointer (->) syntax of C/C++. Java still uses pointers, though.
One of the policy decisions that I made very early in the development of Limpid was that the user should not be exposed to the creation of pointers (the new keyword) or to pointer syntax (one important lesson of Java is that this approach is well accepted by users of an API). I wanted code along the following lines:
Document document;
Element docElement("DE");
document.appendChild(docElement);
docElement.setAttribute("attr1", "attr1Val");
Element child("child");
docElement.appendChild(child);
...etc
And that is what Limpid code looks like. Values (which are local) are used everywhere instead of pointers, and the user leaves the (often thorny) problem of memory management to the Limpid framework.
All nodes, from the primitive node (Node) to complicated ones (like Element, Attribute and Document) have the same essential structure:
Parent nodes, including Element, Attribute, Entity and Document, have a NodeList in the node content. This is used to record, in sequence, the child nodes that have been appended to it.
If a node value is created locally, it has a local head (on the stack) and a body on the heap/free store; if a node is created using the new keyword, both the head and the body are on the heap. If a node currently on the stack is copied (using its copy() interface), all that happens is:
On the other hand, if a node is copied on the stack (ie, within a function), an additional head is created in the stack space and the reference counter in the content incremented.
This separation of head and content means that the true (structural) polymorphism of a node resides in its node content, while its behavioural (interface) polymorphism is associated with the head (the node proper?).
All forms of NodeContent contain fields that define the status, type and location of a node:
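To make the head/body split concrete, here is a minimal sketch of a stack-resident head sharing a heap-resident, reference-counted body. It is not Limpid's actual code: the names Node, NodeContent, refCount, heads() and sameBody() are assumptions chosen for illustration, and the real classes presumably carry far more state.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical body: lives on the free store and carries the reference counter.
struct NodeContent {
    std::string name;
    int refCount = 1;  // number of heads currently attached to this body
    explicit NodeContent(std::string n) : name(std::move(n)) {}
};

// Hypothetical head: a small value object that can live on the stack.
class Node {
public:
    explicit Node(const std::string& name) : body_(new NodeContent(name)) {}

    // Copying creates a second head only; the body is shared and its counter incremented.
    Node(const Node& other) : body_(other.body_) { ++body_->refCount; }

    Node& operator=(const Node& other) {
        if (body_ != other.body_) {
            ++other.body_->refCount;  // attach to the new body first
            release();                // then detach from the old one
            body_ = other.body_;
        }
        return *this;
    }

    // Destroying a head detaches it; the last head to go deletes the body.
    ~Node() { release(); }

    const std::string& nodeName() const { return body_->name; }
    int heads() const { return body_->refCount; }
    bool sameBody(const Node& other) const { return body_ == other.body_; }

private:
    void release() {
        assert(body_->refCount > 0);
        if (--body_->refCount == 0) delete body_;
    }

    NodeContent* body_;
};
```

With this sketch, writing Node b = a; inside a function adds a head on the stack and raises the shared counter to 2; when b goes out of scope the counter drops back to 1 and the body stays alive for a.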
CharacterData nodes contain a pointer to a text array:
ParentNodes (Document, Element, Attribute, Entity) have a NodeContent that contains a NodeList to hold a list of pointers to child nodes:
In addition, ElementContent contains a NamedNodeMap to hold a list of pointers to attribute nodes:
While AttributeContent contains a pointer to its ownerElement:
NodeList and NamedNodeMap have a pointer to the NodeContent containing it:
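The relationships described above can be sketched as a small family of structures. This is only an illustration under stated assumptions: the member names (type, parent, children, attributes, owner, items), the NodeType enumeration, and the helpers append() and setAttributeNode() are hypothetical, std::string stands in for the text array, and ownership and reference counting are deliberately left out.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

struct NodeContent;

// Hypothetical NodeList and NamedNodeMap: pointers to nodes plus a back
// pointer to the NodeContent that contains the list.
struct NodeList {
    NodeContent* owner = nullptr;
    std::vector<NodeContent*> items;
};
struct NamedNodeMap {
    NodeContent* owner = nullptr;
    std::vector<NodeContent*> items;
};

enum class NodeType { Element, Attribute, Text };

// Common fields: the type of the node and its location in the tree.
struct NodeContent {
    NodeType type;
    std::string name;
    NodeContent* parent = nullptr;
    NodeContent(NodeType t, std::string n) : type(t), name(std::move(n)) {}
    NodeContent(const NodeContent&) = delete;  // back pointers must not be copied blindly
    NodeContent& operator=(const NodeContent&) = delete;
    virtual ~NodeContent() = default;
};

// CharacterData content: holds the text (a std::string in this sketch).
struct CharacterDataContent : NodeContent {
    std::string text;
    explicit CharacterDataContent(std::string t)
        : NodeContent(NodeType::Text, "#text"), text(std::move(t)) {}
};

// Parent content: records child nodes in the order they were appended.
struct ParentContent : NodeContent {
    NodeList children;
    ParentContent(NodeType t, std::string n) : NodeContent(t, std::move(n)) {
        children.owner = this;
    }
    void append(NodeContent* child) {
        assert(child != nullptr);
        child->parent = this;
        children.items.push_back(child);
    }
};

struct ElementContent;

// Attribute content: a parent node that also points at its owner element.
struct AttributeContent : ParentContent {
    ElementContent* ownerElement = nullptr;
    explicit AttributeContent(std::string n)
        : ParentContent(NodeType::Attribute, std::move(n)) {}
};

// Element content: children plus a map of attribute nodes.
struct ElementContent : ParentContent {
    NamedNodeMap attributes;
    explicit ElementContent(std::string n)
        : ParentContent(NodeType::Element, std::move(n)) {
        attributes.owner = this;
    }
    void setAttributeNode(AttributeContent* attr) {
        assert(attr != nullptr);
        attr->ownerElement = this;
        attributes.items.push_back(attr);
    }
};
```

The back pointers are raw and non-owning here, which keeps the sketch short; a real implementation has to decide who is responsible for each content object's lifetime.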
We are now in a position to define the sequence of events for these deceptively simple operations:
Document getDocument() {
Document document; // step 1
Element docElement("DE"); // step 2
document.appendChild(docElement); // step 3
docElement.setAttribute("attr1", "attr1Val"); // step 4
Element child("child"); // step 5
docElement.appendChild(child); // step 6
return document; // step 7
}
Step 1
Step 2
Step 3
Step 4
Steps 5 and 6
Step 7
Like any other local object on the stack, a node is destroyed when a function exits. To do this, C++ invokes the appropriate destructor. The sequence of events in destroying a node is:
The important point is that all this complexity should be invisible to the user of the Limpid API. In combination, RAII and RLID ensure that all the correct events occur at the correct time.
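As a rough illustration of that invisibility, the following sketch shows local heads being destroyed at function exit while the bodies they refer to survive for as long as the tree needs them. It swaps in std::shared_ptr for whatever counting mechanism Limpid really uses, and the names Content, liveBodies, childCount() and buildTree() are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical body; liveBodies counts the bodies currently on the free store.
struct Content {
    static int liveBodies;
    std::string name;
    std::vector<std::shared_ptr<Content>> children;
    explicit Content(std::string n) : name(std::move(n)) { ++liveBodies; }
    ~Content() { --liveBodies; }
    Content(const Content&) = delete;
    Content& operator=(const Content&) = delete;
};
int Content::liveBodies = 0;

// Hypothetical head: a value object whose destructor simply drops one reference.
class Element {
public:
    explicit Element(const std::string& name)
        : body_(std::make_shared<Content>(name)) {}

    // The parent's body takes a shared reference to the child's body.
    void appendChild(const Element& child) {
        assert(child.body_ != body_);  // a node cannot be its own child
        body_->children.push_back(child.body_);
    }

    std::size_t childCount() const { return body_->children.size(); }

private:
    std::shared_ptr<Content> body_;
};

// Same shape as getDocument(): the local head 'child' is destroyed on return,
// but its body survives because the root's body still refers to it.
Element buildTree() {
    Element root("DE");
    Element child("child");
    root.appendChild(child);
    return root;
}
```

A caller that writes Element doc = buildTree(); holds two live bodies through one head, and both are released when doc goes out of scope. A parent back pointer would have to be a raw pointer or std::weak_ptr in this scheme, since a shared reference in both directions would form a cycle that is never freed.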