HTML Content Delivery Phases
Smooks HTML content delivery is a three-phase process:
-
Assembly: The assembly phase is the process of assembling the content
to be manipulated/transformed, i.e. getting it into a Document Object Model (DOM).
This means parsing the input document and iterating over
it to apply all AssemblyUnits. This phase
can result in DOM elements being added to, or trimmed from, the DOM. This phase is also
very useful for gathering information about the DOM, which can be used during the
transformation phase (see below).
-
Transformation: The transformation phase takes the assembled DOM and
iterates over it to apply all TransUnits.
This phase will only operate on DOM elements that were present in the assembled
document; TransUnits will not be applied
to elements that are introduced to the DOM during this phase.
-
Serialisation: The serialisation phase takes the transformed DOM and
iterates over it to apply all SerializationUnits,
which write the document to the target device output stream.
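The three phases above can be sketched in a few lines of Java. This is a minimal illustration, not the Smooks implementation: the class ThreePhaseSketch and its methods are hypothetical names, it uses only the JDK's org.w3c.dom API, and the "units" are hard-coded (trim script elements during assembly, mark p elements during transformation). It shows in particular that the transformation loop runs over a snapshot of the assembled elements, so an element added during that phase is not itself transformed.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical illustration only; none of these names are Smooks classes.
class ThreePhaseSketch {

    /** Parse well-formed markup into a DOM (the first step of assembly). */
    static Document parse(String markup) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(markup)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse input", e);
        }
    }

    /** Copy the live element list into a fixed list, in document order. */
    static List<Element> snapshot(Document doc) {
        NodeList live = doc.getElementsByTagName("*");
        List<Element> fixed = new ArrayList<>();
        for (int i = 0; i < live.getLength(); i++) {
            fixed.add((Element) live.item(i));
        }
        return fixed;
    }

    /** Assembly: here it trims every script element from the DOM. */
    static void assemble(Document doc) {
        for (Element e : snapshot(doc)) {
            if (e.getTagName().equals("script")) {
                e.getParentNode().removeChild(e);
            }
        }
    }

    /** Transformation: marks each assembled p and adds a new p after it. */
    static void transform(Document doc) {
        // The snapshot is taken once, so the p elements created below
        // are never visited by this loop.
        for (Element e : snapshot(doc)) {
            if (e.getTagName().equals("p")) {
                e.setAttribute("visited", "true");
                Element added = doc.createElement("p");
                added.setTextContent("added");
                e.getParentNode().insertBefore(added, e.getNextSibling());
            }
        }
    }

    /** Serialisation: writes elements and text (no escaping, for brevity). */
    static void serialise(Node node, StringBuilder out) {
        if (node.getNodeType() == Node.TEXT_NODE) {
            out.append(node.getNodeValue());
            return;
        }
        if (node.getNodeType() != Node.ELEMENT_NODE) {
            return;
        }
        Element e = (Element) node;
        out.append('<').append(e.getTagName());
        NamedNodeMap attrs = e.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            Node a = attrs.item(i);
            out.append(' ').append(a.getNodeName())
               .append("=\"").append(a.getNodeValue()).append('"');
        }
        out.append('>');
        for (Node c = e.getFirstChild(); c != null; c = c.getNextSibling()) {
            serialise(c, out);
        }
        out.append("</").append(e.getTagName()).append('>');
    }

    /** Run the three phases in order and return the serialised result. */
    static String run(String markup) {
        Document doc = parse(markup);
        assemble(doc);
        transform(doc);
        StringBuilder out = new StringBuilder();
        serialise(doc.getDocumentElement(), out);
        return out.toString();
    }

    public static void main(String[] args) {
        // prints <body><p visited="true">one</p><p>added</p></body>
        System.out.println(run("<body><script>x()</script><p>one</p></body>"));
    }
}
```

Note that the added p element appears in the output without the visited attribute, because it was not present in the assembled document.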
This whole process sounds like a lot of processing, and it is: three iterations
over the DOM. However, the thinking on this is that:
-
The assembly phase is bypassed if there are no AssemblyUnits
configured for the requesting device.
-
AssemblyUnits are stateless, which means
that multiple instances don't need to be instantiated.
-
The transformation phase is likely to be the most processing-intensive of the three
phases, but the DOM to be transformed should have been reduced as much as possible by the
assembly phase. Remember, assembly doesn't just mean "adding" to the DOM.
-
The transformation phase is only applied to elements that were in the DOM
at the start of that phase.
-
TransUnits are normally stateless (TransUnitPrototype),
which means that multiple instances shouldn't need to be instantiated.
-
A single instance of the DefaultSerializationUnit
performs the vast majority of the work in the serialisation phase. Only elements that need
special attention require a specialised SerializationUnit.
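The last point, and the statelessness that makes it cheap, can be illustrated with a small lookup. This is a sketch under assumptions rather than the Smooks code: SerialiserLookupSketch, SerialiserUnit, unitFor and the "br" special case are all hypothetical names chosen for the example. One shared default instance handles every element that has no specialised unit registered, and because the units hold no per-document state, no further instances are ever created.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical names throughout; this is not the Smooks API.
class SerialiserLookupSketch {

    /** A stateless unit: everything it needs arrives as arguments. */
    interface SerialiserUnit {
        String writeStart(String tagName);
    }

    /** The single shared default, used for almost every element. */
    static final SerialiserUnit DEFAULT = tag -> "<" + tag + ">";

    /** Specialised units, registered only for elements that need them. */
    static final Map<String, SerialiserUnit> SPECIALISED = new HashMap<>();
    static {
        // Assumed special case: write br as an empty element.
        SPECIALISED.put("br", tag -> "<" + tag + " />");
    }

    /** Return the specialised unit if one is registered, else the default. */
    static SerialiserUnit unitFor(String tagName) {
        return SPECIALISED.getOrDefault(tagName, DEFAULT);
    }

    public static void main(String[] args) {
        System.out.println(unitFor("p").writeStart("p"));    // prints <p>
        System.out.println(unitFor("br").writeStart("br"));  // prints <br />
        // Both lookups return the same shared default instance.
        System.out.println(unitFor("p") == unitFor("div"));  // prints true
    }
}
```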