org.apache.xml.serialize

Class BaseMarkupSerializer

Implemented Interfaces:
org.xml.sax.ContentHandler, org.xml.sax.ext.DeclHandler, org.xml.sax.DocumentHandler, DOMSerializer, org.xml.sax.DTDHandler, org.xml.sax.ext.LexicalHandler, Serializer
Known Direct Subclasses:
HTMLSerializer, TextSerializer, XMLSerializer

public abstract class BaseMarkupSerializer
extends java.lang.Object
implements org.xml.sax.ContentHandler, org.xml.sax.DocumentHandler, org.xml.sax.ext.LexicalHandler, org.xml.sax.DTDHandler, org.xml.sax.ext.DeclHandler, DOMSerializer, Serializer

Base class for a serializer supporting both DOM and SAX pretty serializing of XML/HTML/XHTML documents. Derived classes perform the method-specific serializing; this class provides the common serializing mechanisms.

Before it can be used, the serializer must be initialized with the proper writer and output format: call setOutputCharStream(Writer) or setOutputByteStream(OutputStream) for the writer, and setOutputFormat(OutputFormat) for the output format.

The serializer can be reused any number of times, but cannot be used concurrently by two threads.

If an output stream is used, the encoding is taken from the output format (defaults to UTF-8). If a writer is used, make sure the writer uses the same encoding (if applicable) as specified in the output format.
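Why the writer's encoding matters can be shown with JDK classes alone. The sketch below is not part of this package; EncodingMatchSketch and encode are hypothetical names. It pushes the same text through two writers bound to different charsets and returns the resulting bytes, which is the mismatch that arises when a writer's charset disagrees with the encoding declared in the output format.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;

// Hypothetical helper, not part of org.apache.xml.serialize.
class EncodingMatchSketch {

    // Encodes the text through a Writer bound to the given charset
    // and returns the bytes that reached the underlying stream.
    static byte[] encode(String text, Charset charset) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(bytes, charset)) {
            writer.write(text);
        } catch (IOException e) {
            // An in-memory stream is not expected to fail; rethrow unchecked.
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }
}
```

For the four-character string "caf\u00e9", encode returns 5 bytes with StandardCharsets.UTF_8 and 4 bytes with StandardCharsets.ISO_8859_1, so a document declared as UTF-8 but written through an ISO-8859-1 writer would not decode as declared.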

The serializer supports both DOM and SAX. DOM serializing is done by calling serialize(Document) and SAX serializing is done by firing SAX events and using the serializer as a document handler. This also applies to derived classes.
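The two routes can be illustrated without this package on the classpath. The sketch below uses only JDK stand-ins: an LSSerializer (listed under See Also) for the DOM route, and a javax.xml.transform.sax.TransformerHandler playing the part of a handler that receives SAX events. DomAndSaxSketch, viaDom and viaSax are hypothetical names, and neither method calls BaseMarkupSerializer or its subclasses.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;
import org.xml.sax.helpers.AttributesImpl;

// JDK-only stand-ins; nothing here touches org.apache.xml.serialize.
class DomAndSaxSketch {

    // DOM route: build a tiny document and hand the whole tree to a serializer.
    static String viaDom() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            doc.appendChild(doc.createElement("root")).setTextContent("hi");
            DOMImplementationLS ls = (DOMImplementationLS)
                    doc.getImplementation().getFeature("LS", "3.0");
            LSSerializer serializer = ls.createLSSerializer();
            // Leave out the XML declaration to keep the output short.
            serializer.getDomConfig().setParameter("xml-declaration", Boolean.FALSE);
            return serializer.writeToString(doc);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // SAX route: fire events at a handler that writes markup as they arrive.
    static String viaSax() {
        try {
            StringWriter out = new StringWriter();
            SAXTransformerFactory factory =
                    (SAXTransformerFactory) TransformerFactory.newInstance();
            TransformerHandler handler = factory.newTransformerHandler();
            handler.getTransformer()
                    .setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            handler.setResult(new StreamResult(out));
            char[] text = "hi".toCharArray();
            handler.startDocument();
            handler.startElement("", "root", "root", new AttributesImpl());
            handler.characters(text, 0, text.length);
            handler.endElement("", "root", "root");
            handler.endDocument();
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

On a stock JDK both calls are expected to return markup containing <root>hi</root>. The difference lies in who drives the traversal: the DOM route walks a finished tree, while the SAX route is driven by the caller one event at a time.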

If an I/O exception occurs while serializing, the serializer will not throw an exception directly, but will throw it only at the end of serializing (either DOM or SAX's org.xml.sax.DocumentHandler.endDocument).
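A minimal sketch of this deferred-error pattern, under the assumption that the first failure is remembered and later output is skipped. DeferredIoSketch and its methods are hypothetical and are not the classes used by this package.

```java
import java.io.IOException;
import java.io.Writer;

// Hypothetical sketch of "remember the first I/O error, report it at the end".
class DeferredIoSketch {
    private final Writer out;
    private IOException firstError; // stays null until a write fails

    DeferredIoSketch(Writer out) {
        this.out = out;
    }

    // Never throws: a failure is recorded and all later writes are skipped.
    void print(String text) {
        if (firstError != null) {
            return;
        }
        try {
            out.write(text);
        } catch (IOException e) {
            firstError = e;
        }
    }

    // Called once at the end; surfaces the recorded failure, if any.
    void finish() throws IOException {
        if (firstError != null) {
            throw firstError;
        }
        out.flush();
    }

    // Drives the sketch against a writer that always fails.
    static String demo() {
        Writer broken = new Writer() {
            @Override
            public void write(char[] buf, int off, int len) throws IOException {
                throw new IOException("disk full");
            }

            @Override
            public void flush() {
            }

            @Override
            public void close() {
            }
        };
        DeferredIoSketch printer = new DeferredIoSketch(broken);
        printer.print("<root>");  // fails, but the caller sees nothing yet
        printer.print("</root>"); // skipped because an error is pending
        try {
            printer.finish();
            return "no error";
        } catch (IOException e) {
            return "reported at end: " + e.getMessage();
        }
    }
}
```

DeferredIoSketch.demo() returns "reported at end: disk full". The benefit of the pattern is that the many methods printing small pieces of markup do not each need to handle I/O failures.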

For elements that are not specified as whitespace preserving, the serializer will potentially break long text lines at space boundaries, indent lines, and serialize elements on separate lines. Line terminators will be regarded as spaces, and spaces at beginning of line will be stripped.
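The line-breaking rule can be sketched as a simple word wrap. This is a simplification under stated assumptions: LineWrapSketch and wrap are hypothetical, the sketch collapses every run of whitespace into a single space (which the real printer does not necessarily do), and it ignores indentation.

```java
// Hypothetical word-wrap sketch; not the printer used by this package.
class LineWrapSketch {

    // Breaks text at space boundaries so that no line is longer than width,
    // unless a single word is itself longer than width.
    static String wrap(String text, int width) {
        StringBuilder out = new StringBuilder();
        int lineLength = 0;
        // Line terminators count as spaces, and leading spaces are dropped.
        for (String word : text.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // only happens for blank input
            }
            if (lineLength > 0 && lineLength + 1 + word.length() > width) {
                out.append('\n');
                lineLength = 0;
            }
            if (lineLength > 0) {
                out.append(' ');
                lineLength++;
            }
            out.append(word);
            lineLength += word.length();
        }
        return out.toString();
    }
}
```

With a width of 9, wrap("one two\nthree   four", 9) returns three lines: "one two", "three" and "four".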

When indenting, the serializer can detect what appears to be element content and serialize those elements indented on separate lines. An element is serialized indented when it is the first or last child of an element, or when it immediately follows or precedes another element.

Version:
$Revision: 1.51 $ $Date: 2004/02/12 16:56:07 $

Authors:
Assaf Arkin
Rahul Srivastava
Elena Litani, IBM

See Also:
Serializer, LSSerializer

Field Summary

protected String
_docTypePublicId
The public identifier of the document type, if known.
protected String
_docTypeSystemId
The system identifier of the document type, if known.
protected EncodingInfo
_encodingInfo
protected OutputFormat
_format
The output format associated with this serializer.
protected boolean
_indenting
True if indenting printer.
protected Hashtable
_prefixes
Association between namespace URIs (keys) and prefixes (values).
protected Printer
_printer
The printer used for printing text parts.
protected boolean
_started
If the document has been started (header serialized), this flag is set to true so it's not started twice.
protected org.w3c.dom.Node
fCurrentNode
Current node that is being processed
protected org.apache.xerces.dom.DOMErrorImpl
fDOMError
protected org.apache.xerces.dom3.DOMErrorHandler
fDOMErrorHandler
protected org.w3c.dom.ls.LSSerializerFilter
fDOMFilter
protected StringBuffer
fStrBuffer
Temporary buffer to store character data
protected short
features

Constructor Summary

BaseMarkupSerializer(OutputFormat format)
Protected constructor; can only be used by derived classes.

Method Summary

org.xml.sax.ContentHandler
asContentHandler()
DOMSerializer
asDOMSerializer()
org.xml.sax.DocumentHandler
asDocumentHandler()
void
attributeDecl(String eName, String aName, String type, String valueDefault, String value)
protected void
characters(String text)
Called to print the text contents in the prevailing element format.
void
characters(char[] chars, int start, int length)
protected void
checkUnboundNamespacePrefixedNode(org.w3c.dom.Node node)
DOM level 3: Check a node to determine if it contains unbound namespace prefixes.
void
comment(String text)
void
comment(char[] chars, int start, int length)
protected ElementState
content()
Must be called by a method about to print any type of content.
void
elementDecl(String name, String model)
void
endCDATA()
void
endDTD()
void
endDocument()
Called at the end of the document to wrap it up.
void
endEntity(String name)
void
endNonEscaping()
void
endPrefixMapping(String prefix)
void
endPreserving()
protected ElementState
enterElementState(String namespaceURI, String localName, String rawName, boolean preserveSpace)
Enter a new element state for the specified element.
void
externalEntityDecl(String name, String publicId, String systemId)
protected void
fatalError(String message)
protected ElementState
getElementState()
Return the state of the current element.
protected String
getEntityRef(int ch)
Returns the suitable entity reference for this character value, or null if no such entity exists.
protected String
getPrefix(String namespaceURI)
Returns the namespace prefix for the specified URI.
void
ignorableWhitespace(char[] chars, int start, int length)
void
internalEntityDecl(String name, String value)
protected boolean
isDocumentState()
Returns true if in the state of the document.
protected ElementState
leaveElementState()
Leave the current element state and return to the state of the parent element.
protected org.apache.xerces.dom3.DOMError
modifyDOMError(String message, short severity, org.w3c.dom.Node node)
Modifies the global DOM error object.
void
notationDecl(String name, String publicId, String systemId)
protected void
prepare()
protected void
printCDATAText(String text)
protected void
printDoctypeURL(String url)
Print a document type public or system identifier URL.
protected void
printEscaped(String source)
Escapes a string so it may be printed as text content or attribute value.
protected void
printEscaped(int ch)
protected void
printText(String text, boolean preserveSpace, boolean unescaped)
protected void
printText(char[] chars, int start, int length, boolean preserveSpace, boolean unescaped)
Called to print additional text with whitespace handling.
void
processingInstruction(String target, String code)
void
processingInstructionIO(String target, String code)
boolean
reset()
void
serialize(org.w3c.dom.Document doc)
Serializes the DOM document using the previously specified writer and output format.
void
serialize(org.w3c.dom.DocumentFragment frag)
Serializes the DOM document fragment using the previously specified writer and output format.
void
serialize(org.w3c.dom.Element elem)
Serializes the DOM element using the previously specified writer and output format.
protected void
serializeElement(org.w3c.dom.Element elem)
Called to serialize the DOM element.
protected void
serializeNode(org.w3c.dom.Node node)
Serialize the DOM node.
protected void
serializePreRoot()
Comments and PIs cannot be serialized before the root element, because the root element serializes the document type, which generally comes first.
void
setDocumentLocator(org.xml.sax.Locator locator)
void
setOutputByteStream(OutputStream output)
void
setOutputCharStream(Writer writer)
void
setOutputFormat(OutputFormat format)
void
skippedEntity(String name)
void
startCDATA()
void
startDTD(String name, String publicId, String systemId)
void
startDocument()
void
startEntity(String name)
void
startNonEscaping()
void
startPrefixMapping(String prefix, String uri)
void
startPreserving()
protected void
surrogates(int high, int low)
void
unparsedEntityDecl(String name, String publicId, String systemId, String notationName)

Field Details

_docTypePublicId

protected String _docTypePublicId
The public identifier of the document type, if known.


_docTypeSystemId

protected String _docTypeSystemId
The system identifier of the document type, if known.


_encodingInfo

protected EncodingInfo _encodingInfo


_format

protected OutputFormat _format
The output format associated with this serializer. This will never be a null reference. If no format was passed to the constructor, the default one for this document type will be used. The format object is never changed by the serializer.


_indenting

protected boolean _indenting
True if indenting printer.


_prefixes

protected Hashtable _prefixes
Association between namespace URIs (keys) and prefixes (values). Accumulated here prior to starting an element and placing this list in the element state.


_printer

protected Printer _printer
The printer used for printing text parts.


_started

protected boolean _started
If the document has been started (header serialized), this flag is set to true so it's not started twice.


fCurrentNode

protected org.w3c.dom.Node fCurrentNode
Current node that is being processed


fDOMError

protected final org.apache.xerces.dom.DOMErrorImpl fDOMError


fDOMErrorHandler

protected org.apache.xerces.dom3.DOMErrorHandler fDOMErrorHandler


fDOMFilter

protected org.w3c.dom.ls.LSSerializerFilter fDOMFilter


fStrBuffer

protected final StringBuffer fStrBuffer
Temporary buffer to store character data


features

protected short features

Constructor Details

BaseMarkupSerializer

protected BaseMarkupSerializer(OutputFormat format)
Protected constructor; can only be used by derived classes. The serializer must be initialized before serializing any document by first calling setOutputCharStream(Writer) or setOutputByteStream(OutputStream).

Method Details

asContentHandler

public org.xml.sax.ContentHandler asContentHandler()
            throws IOException
Specified by:
asContentHandler in interface Serializer


asDOMSerializer

public DOMSerializer asDOMSerializer()
            throws IOException
Specified by:
asDOMSerializer in interface Serializer


asDocumentHandler

public org.xml.sax.DocumentHandler asDocumentHandler()
            throws IOException
Specified by:
asDocumentHandler in interface Serializer


attributeDecl

public void attributeDecl(String eName,
                          String aName,
                          String type,
                          String valueDefault,
                          String value)
            throws org.xml.sax.SAXException
Specified by:
attributeDecl in interface org.xml.sax.ext.DeclHandler


characters

protected void characters(String text)
            throws IOException
Called to print the text contents in the prevailing element format. Since this method is capable of printing text as CDATA, it is used for that purpose as well. White space handling is determined by the current element state. In addition, the output format can dictate whether the text is printed as CDATA or unescaped.

Parameters:
text - The text to print


characters

public void characters(char[] chars,
                       int start,
                       int length)
            throws org.xml.sax.SAXException


checkUnboundNamespacePrefixedNode

protected void checkUnboundNamespacePrefixedNode(org.w3c.dom.Node node)
            throws IOException
DOM level 3: Check a node to determine if it contains unbound namespace prefixes.

Parameters:
node - The node to check for unbound namespace prefixes


comment

public void comment(String text)
            throws IOException


comment

public void comment(char[] chars,
                    int start,
                    int length)
            throws org.xml.sax.SAXException


content

protected ElementState content()
            throws IOException
Must be called by a method about to print any type of content. If the element was just opened, the opening tag is closed and will be matched to a closing tag. Returns the current element state with empty and afterElement set to false.

Returns:
The current element state
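The idea of closing a pending start tag before any content is printed can be sketched as follows. OpenTagSketch is hypothetical: it tracks a single flag rather than a full element state, does no escaping, and writes an element with no content as an empty-element tag, which is an assumption of the sketch.

```java
// Hypothetical sketch of closing a pending start tag before printing content.
class OpenTagSketch {
    private final StringBuilder out = new StringBuilder();
    private boolean tagOpen; // true while "<name" is written without its '>'

    void startElement(String name) {
        content(); // a child element is content of its parent
        out.append('<').append(name);
        tagOpen = true;
    }

    // Must run before any content is printed: closes a just-opened start tag.
    void content() {
        if (tagOpen) {
            out.append('>');
            tagOpen = false;
        }
    }

    void text(String s) {
        content();
        out.append(s);
    }

    void endElement(String name) {
        if (tagOpen) {
            // Nothing was printed inside the element, so close it in place.
            out.append("/>");
            tagOpen = false;
        } else {
            out.append("</").append(name).append('>');
        }
    }

    String result() {
        return out.toString();
    }
}
```

Calling startElement("a"), startElement("b"), endElement("b"), text("x") and endElement("a") in that order produces <a><b/>x</a>.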


elementDecl

public void elementDecl(String name,
                        String model)
            throws org.xml.sax.SAXException
Specified by:
elementDecl in interface org.xml.sax.ext.DeclHandler


endCDATA

public void endCDATA()
Specified by:
endCDATA in interface org.xml.sax.ext.LexicalHandler


endDTD

public void endDTD()
Specified by:
endDTD in interface org.xml.sax.ext.LexicalHandler


endDocument

public void endDocument()
            throws org.xml.sax.SAXException
Called at the end of the document to wrap it up. Flushes the output stream and throws an exception if any I/O error occurred while serializing.
Specified by:
endDocument in interface org.xml.sax.ContentHandler
endDocument in interface org.xml.sax.DocumentHandler

Throws:
org.xml.sax.SAXException - An I/O exception occurred during serializing


endEntity

public void endEntity(String name)
Specified by:
endEntity in interface org.xml.sax.ext.LexicalHandler


endNonEscaping

public void endNonEscaping()


endPrefixMapping

public void endPrefixMapping(String prefix)
            throws org.xml.sax.SAXException
Specified by:
endPrefixMapping in interface org.xml.sax.ContentHandler


endPreserving

public void endPreserving()


enterElementState

protected ElementState enterElementState(String namespaceURI,
                                         String localName,
                                         String rawName,
                                         boolean preserveSpace)
Enter a new element state for the specified element. The tag name and space-preserving flag are specified; the element state is initially empty.

Returns:
Current element state, or null


externalEntityDecl

public void externalEntityDecl(String name,
                               String publicId,
                               String systemId)
            throws org.xml.sax.SAXException
Specified by:
externalEntityDecl in interface org.xml.sax.ext.DeclHandler


fatalError

protected void fatalError(String message)
            throws IOException


getElementState

protected ElementState getElementState()
Return the state of the current element.

Returns:
Current element state


getEntityRef

protected String getEntityRef(int ch)
Returns the suitable entity reference for this character value, or null if no such entity exists. Calling this method with '&' will return "&amp;".

Parameters:
ch - Character value

Returns:
Character entity name, or null


getPrefix

protected String getPrefix(String namespaceURI)
Returns the namespace prefix for the specified URI. If the URI has been mapped to a prefix, returns the prefix, otherwise returns null.

Parameters:
namespaceURI - The namespace URI

Returns:
The namespace prefix if known, or null
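A minimal sketch of such a lookup, assuming prefixes declared for the next element are held in a pending map and then pushed onto a stack of per-element scopes. PrefixScopeSketch and its methods are hypothetical; the real lookup lives in this class and its element states.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a scoped namespace URI to prefix lookup.
class PrefixScopeSketch {
    // URI -> prefix, collected for the element that is about to start.
    private Map<String, String> pending = new HashMap<>();
    // One map per open element, innermost element first.
    private final Deque<Map<String, String>> scopes = new ArrayDeque<>();

    void startPrefixMapping(String prefix, String uri) {
        pending.put(uri, prefix);
    }

    void startElement() {
        scopes.push(pending);
        pending = new HashMap<>();
    }

    void endElement() {
        scopes.pop();
    }

    // Returns the prefix mapped to the URI, or null if the URI is unmapped.
    String getPrefix(String namespaceURI) {
        String prefix = pending.get(namespaceURI);
        if (prefix != null) {
            return prefix;
        }
        // ArrayDeque iterates from the most recently pushed scope outward.
        for (Map<String, String> scope : scopes) {
            prefix = scope.get(namespaceURI);
            if (prefix != null) {
                return prefix;
            }
        }
        return null;
    }
}
```

A mapping is visible from the moment it is declared until the element that carries it ends; after that, getPrefix returns null for that URI again.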


ignorableWhitespace

public void ignorableWhitespace(char[] chars,
                                int start,
                                int length)
            throws org.xml.sax.SAXException


internalEntityDecl

public void internalEntityDecl(String name,
                               String value)
            throws org.xml.sax.SAXException
Specified by:
internalEntityDecl in interface org.xml.sax.ext.DeclHandler


isDocumentState

protected boolean isDocumentState()
Returns true if in the state of the document. Returns true before entering any element and after leaving the root element.

Returns:
True if in the state of the document


leaveElementState

protected ElementState leaveElementState()
Leave the current element state and return to the state of the parent element. If this was the root element, return to the state of the document.

Returns:
Previous element state
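The enter/leave bookkeeping can be pictured as a stack with a distinguished document state at the bottom. The sketch below is a trimmed-down assumption, not the ElementState class of this package; ElementStateSketch, State and their members are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of element-state bookkeeping as a stack.
class ElementStateSketch {

    // Trimmed-down stand-in for an element state record.
    static final class State {
        final String rawName;
        final boolean preserveSpace;
        boolean empty = true; // no content printed yet

        State(String rawName, boolean preserveSpace) {
            this.rawName = rawName;
            this.preserveSpace = preserveSpace;
        }
    }

    private final State documentState = new State(null, false);
    private final Deque<State> stack = new ArrayDeque<>();

    // Pushes a fresh, initially empty state for the element.
    State enter(String rawName, boolean preserveSpace) {
        State state = new State(rawName, preserveSpace);
        stack.push(state);
        return state;
    }

    // Pops the current element and returns the state that is now current:
    // the parent's state, or the document state after the root is left.
    State leave() {
        if (stack.isEmpty()) {
            throw new IllegalStateException("no element to leave");
        }
        stack.pop();
        return current();
    }

    State current() {
        return stack.isEmpty() ? documentState : stack.peek();
    }

    // True before entering any element and after leaving the root element.
    boolean isDocumentState() {
        return stack.isEmpty();
    }
}
```

After enter("root", false) and enter("child", true), one call to leave() returns the state whose rawName is "root", and a second call brings the sketch back to the document state.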


modifyDOMError

protected org.apache.xerces.dom3.DOMError modifyDOMError(String message,
                                                         short severity,
                                                         org.w3c.dom.Node node)
The method modifies global DOM error object

Parameters:
message -
severity -

Returns:
a DOMError


notationDecl

public void notationDecl(String name,
                         String publicId,
                         String systemId)
            throws org.xml.sax.SAXException
Specified by:
notationDecl in interface org.xml.sax.DTDHandler


prepare

protected void prepare()
            throws IOException


printCDATAText

protected void printCDATAText(String text)
            throws IOException


printDoctypeURL

protected void printDoctypeURL(String url)
            throws IOException
Print a document type public or system identifier URL. Encapsulates the URL in double quotes, escapes non-printing characters, and prints it in a manner equivalent to printText.

Parameters:
url - The document type url to print


printEscaped

protected void printEscaped(String source)
            throws IOException
Escapes a string so it may be printed as text content or attribute value. Non-printable characters are escaped using character references. Where the format specifies a default entity reference, that reference is used (e.g. &lt;).

Parameters:
source - The string to escape
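The escaping rule can be sketched as below. Several details are assumptions of the sketch rather than documented behavior: the entity table is limited to four names, anything outside printable ASCII (apart from tab, line feed and carriage return) is treated as non-printable, and character references are written in hexadecimal. EscapeSketch, entityName and escape are hypothetical names.

```java
// Hypothetical escaping sketch; the real rules depend on the output format
// and encoding in use.
class EscapeSketch {

    // Returns an entity name for the character, or null if none is defined here.
    static String entityName(int ch) {
        switch (ch) {
            case '<': return "lt";
            case '>': return "gt";
            case '&': return "amp";
            case '"': return "quot";
            default:  return null;
        }
    }

    // Escapes a string for use as text content or an attribute value.
    static String escape(String source) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < source.length(); ) {
            int ch = source.codePointAt(i);
            i += Character.charCount(ch);
            String name = entityName(ch);
            if (name != null) {
                // A named entity reference wins when one is available.
                out.append('&').append(name).append(';');
            } else if ((ch >= 0x20 && ch < 0x7F)
                    || ch == '\n' || ch == '\r' || ch == '\t') {
                out.append((char) ch);
            } else {
                // Assumption: everything else becomes a hex character reference.
                out.append("&#x")
                   .append(Integer.toHexString(ch).toUpperCase())
                   .append(';');
            }
        }
        return out.toString();
    }
}
```

For example, escape("a<b & \u00e9") returns a&lt;b &amp; &#xE9;.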


printEscaped

protected void printEscaped(int ch)
            throws IOException


printText

protected void printText(String text,
                         boolean preserveSpace,
                         boolean unescaped)
            throws IOException


printText

protected void printText(char[] chars,
                         int start,
                         int length,
                         boolean preserveSpace,
                         boolean unescaped)
            throws IOException
Called to print additional text with whitespace handling. If spaces are preserved, the text is printed as if by calling printText(String,boolean,boolean) with a call to Printer.breakLine for each new line. If spaces are not preserved, the text is broken at space boundaries if longer than the line width. Multiple spaces are printed as such, but spaces at the beginning of a line are removed.

Parameters:
preserveSpace - Space preserving flag
unescaped - Print unescaped


processingInstruction

public final void processingInstruction(String target,
                                        String code)
            throws org.xml.sax.SAXException
Specified by:
processingInstruction in interface org.xml.sax.ContentHandler
processingInstruction in interface org.xml.sax.DocumentHandler


processingInstructionIO

public void processingInstructionIO(String target,
                                    String code)
            throws IOException


reset

public boolean reset()


serialize

public void serialize(org.w3c.dom.Document doc)
            throws IOException
Serializes the DOM document using the previously specified writer and output format. Throws an exception only if an I/O exception occurred while serializing.
Specified by:
serialize in interface DOMSerializer

Parameters:
doc - The document to serialize


serialize

public void serialize(org.w3c.dom.DocumentFragment frag)
            throws IOException
Serializes the DOM document fragment using the previously specified writer and output format. Throws an exception only if an I/O exception occurred while serializing.
Specified by:
serialize in interface DOMSerializer

Parameters:


serialize

public void serialize(org.w3c.dom.Element elem)
            throws IOException
Serializes the DOM element using the previously specified writer and output format. Throws an exception only if an I/O exception occurred while serializing.
Specified by:
serialize in interface DOMSerializer

Parameters:
elem - The element to serialize


serializeElement

protected void serializeElement(org.w3c.dom.Element elem)
            throws IOException
Called to serialize the DOM element. The element is serialized based on the serializer's method (XML, HTML, XHTML).

Parameters:
elem - The element to serialize


serializeNode

protected void serializeNode(org.w3c.dom.Node node)
            throws IOException

Parameters:
node - The node to serialize

See Also:
serializeElement(Element)


serializePreRoot

protected void serializePreRoot()
            throws IOException
Comments and PIs cannot be serialized before the root element, because the root element serializes the document type, which generally comes first. Instead, such PIs and comments are accumulated inside a vector and serialized by calling this method. It is called when the root element is serialized and when the document has finished serializing.
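The buffering described here can be sketched as follows. PreRootSketch is hypothetical: it handles comments only, always writes a SYSTEM document type, and flushes the buffer immediately after the document type, all of which are simplifying assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of holding back pre-root comments until the
// document type has been written.
class PreRootSketch {
    private final StringBuilder out = new StringBuilder();
    private final List<String> preRoot = new ArrayList<>();
    private boolean rootStarted;

    void comment(String text) {
        String markup = "<!--" + text + "-->";
        if (rootStarted) {
            out.append(markup);
        } else {
            preRoot.add(markup); // too early: the DOCTYPE must come first
        }
    }

    void startRoot(String name, String systemId) {
        out.append("<!DOCTYPE ").append(name)
           .append(" SYSTEM \"").append(systemId).append("\">\n");
        rootStarted = true;
        flushPreRoot();
        out.append('<').append(name).append('>');
    }

    // Writes everything that was held back, in its original order.
    void flushPreRoot() {
        for (String markup : preRoot) {
            out.append(markup).append('\n');
        }
        preRoot.clear();
    }

    String result() {
        return out.toString();
    }
}
```

Calling comment("hello") followed by startRoot("doc", "doc.dtd") yields the document type line first, then the comment on its own line, then the opening <doc> tag.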


setDocumentLocator

public void setDocumentLocator(org.xml.sax.Locator locator)
Specified by:
setDocumentLocator in interface org.xml.sax.ContentHandler
setDocumentLocator in interface org.xml.sax.DocumentHandler


setOutputByteStream

public void setOutputByteStream(OutputStream output)
Specified by:
setOutputByteStream in interface Serializer


setOutputCharStream

public void setOutputCharStream(Writer writer)
Specified by:
setOutputCharStream in interface Serializer


setOutputFormat

public void setOutputFormat(OutputFormat format)
Specified by:
setOutputFormat in interface Serializer


skippedEntity

public void skippedEntity(String name)
            throws org.xml.sax.SAXException
Specified by:
skippedEntity in interface org.xml.sax.ContentHandler


startCDATA

public void startCDATA()
Specified by:
startCDATA in interface org.xml.sax.ext.LexicalHandler


startDTD

public final void startDTD(String name,
                           String publicId,
                           String systemId)
            throws org.xml.sax.SAXException
Specified by:
startDTD in interface org.xml.sax.ext.LexicalHandler


startDocument

public void startDocument()
            throws org.xml.sax.SAXException
Specified by:
startDocument in interface org.xml.sax.ContentHandler
startDocument in interface org.xml.sax.DocumentHandler


startEntity

public void startEntity(String name)
Specified by:
startEntity in interface org.xml.sax.ext.LexicalHandler


startNonEscaping

public void startNonEscaping()


startPrefixMapping

public void startPrefixMapping(String prefix,
                               String uri)
            throws org.xml.sax.SAXException
Specified by:
startPrefixMapping in interface org.xml.sax.ContentHandler


startPreserving

public void startPreserving()


surrogates

protected void surrogates(int high,
                          int low)
            throws IOException


unparsedEntityDecl

public void unparsedEntityDecl(String name,
                               String publicId,
                               String systemId,
                               String notationName)
            throws org.xml.sax.SAXException
Specified by:
unparsedEntityDecl in interface org.xml.sax.DTDHandler


Copyright © 1999-2004 Apache XML Project. All Rights Reserved.