Package com.ctc.wstx.io
Class UTF32Reader
- java.lang.Object
-
- java.io.Reader
-
- com.ctc.wstx.io.UTF32Reader
-
- All Implemented Interfaces: Closeable, AutoCloseable, Readable
public final class UTF32Reader extends Reader
Since the JDK does not come with a UTF-32/UCS-4 decoder, this class implements a simple one.
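As a rough illustration of what such a decoder has to do, the sketch below combines four bytes into one code point, honoring byte order. The class and method names are hypothetical and are not part of this class; only the idea of a big-endian flag (compare mBigEndian) comes from this page.

```java
// Hypothetical sketch, not Woodstox code: decode one UTF-32 unit from a byte buffer.
final class Utf32DecodeSketch {
    /** Combines the four bytes starting at ptr into a single code point. */
    static int decodeCodePoint(byte[] buf, int ptr, boolean bigEndian) {
        // Mask each byte so that negative byte values do not sign-extend.
        int b0 = buf[ptr] & 0xFF;
        int b1 = buf[ptr + 1] & 0xFF;
        int b2 = buf[ptr + 2] & 0xFF;
        int b3 = buf[ptr + 3] & 0xFF;
        return bigEndian
                ? (b0 << 24) | (b1 << 16) | (b2 << 8) | b3
                : (b3 << 24) | (b2 << 16) | (b1 << 8) | b0;
    }
}
```

The same four bytes yield different values depending on the flag, which is why the byte order has to be known before decoding starts.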
-
-
Field Summary
Fields (Modifier and Type, Field, Description):
- protected static char CONVERT_LSEP_TO: In xml 1.1, LSEP behaves a bit like \n or \r.
- protected static char CONVERT_NEL_TO: In xml 1.1, NEL (0x85) behaves much the way \n does (it can follow \r as part of the linefeed).
- protected boolean mBigEndian
- protected byte[] mByteBuffer
- protected int mByteBufferEnd: Pointer to the end marker, that is, the position one after the last valid available byte.
- protected int mByteCount: Total read byte count; used for error reporting purposes.
- protected int mBytePtr: Pointer to the next available byte (if any), iff less than mByteBufferEnd.
- protected int mCharCount: Total read character count; used for error reporting purposes.
- protected ReaderConfig mConfig
- protected char mSurrogate: Although input is fine with full Unicode set, Java still uses 16-bit chars, so we may have to split high-order chars into surrogate pairs.
- protected char[] mTmpBuf
- protected boolean mXml11
- protected static char NULL_BYTE
- protected static char NULL_CHAR
-
Constructor Summary
Constructors (Constructor, Description):
- UTF32Reader(ReaderConfig cfg, InputStream in, byte[] buf, int ptr, int len, boolean recycleBuffer, boolean isBigEndian)
-
Method Summary
Methods (Modifier and Type, Method, Description):
- protected boolean canModifyBuffer(): Method that can be used to see if we can actually modify the underlying buffer.
- void close()
- void freeBuffers(): This method should be called along with (or instead of) normal close.
- protected InputStream getStream()
- int read(): Although this method is implemented by the base class, and should never be called by Woodstox code, it is implemented here a bit more efficiently, just in case.
- int read(char[] cbuf, int start, int len)
- protected int readBytes(): Method for reading as many bytes from the underlying stream as possible (that fit in the buffer), to the beginning of the buffer.
- protected int readBytesAt(int offset): Method for reading as many bytes from the underlying stream as possible (that fit in the buffer considering offset), to the specified offset.
- protected void reportBounds(char[] cbuf, int start, int len)
- protected void reportInvalidXml11(int value, int bytePos, int charPos)
- protected void reportStrangeStream()
- void setXmlCompliancy(int xmlVersion): Method that can be called to indicate the xml conformance used when reading content using this reader.
-
-
-
Field Detail
-
mBigEndian
protected final boolean mBigEndian
-
mXml11
protected boolean mXml11
-
mSurrogate
protected char mSurrogate
Although input is fine with full Unicode set, Java still uses 16-bit chars, so we may have to split high-order chars into surrogate pairs.
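The split into surrogate pairs follows the standard UTF-16 arithmetic. The sketch below shows it with a hypothetical helper (toUtf16 is not a method of this class). How the reader itself stores the pending second surrogate is not documented on this page, so that part is left out.

```java
// Hypothetical sketch, not Woodstox code: split a code point into UTF-16 chars.
final class SurrogateSplitSketch {
    /** Returns one char for BMP code points, or a high/low surrogate pair otherwise. */
    static char[] toUtf16(int codePoint) {
        if (codePoint < 0x10000) {
            return new char[] {(char) codePoint};
        }
        int offset = codePoint - 0x10000;               // 20 significant bits remain
        char high = (char) (0xD800 + (offset >> 10));   // top 10 bits
        char low = (char) (0xDC00 + (offset & 0x3FF));  // bottom 10 bits
        return new char[] {high, low};
    }
}
```

For valid supplementary code points the result matches what java.lang.Character.toChars(int) returns.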
-
mCharCount
protected int mCharCount
Total read character count; used for error reporting purposes
-
mByteCount
protected int mByteCount
Total read byte count; used for error reporting purposes
-
NULL_CHAR
protected static final char NULL_CHAR
- See Also:
- Constant Field Values
-
NULL_BYTE
protected static final char NULL_BYTE
- See Also:
- Constant Field Values
-
CONVERT_NEL_TO
protected static final char CONVERT_NEL_TO
In xml 1.1, NEL (0x85) behaves much the way \n does (it can follow \r as part of the linefeed).
- See Also:
- Constant Field Values
-
CONVERT_LSEP_TO
protected static final char CONVERT_LSEP_TO
In xml 1.1, LSEP behaves a bit like \n or \r, so one of them must be chosen as the result; \n is used, for simplicity.
- See Also:
- Constant Field Values
-
mConfig
protected final ReaderConfig mConfig
-
mByteBuffer
protected byte[] mByteBuffer
-
mBytePtr
protected int mBytePtr
Pointer to the next available byte (if any), iff less than mByteBufferEnd.
-
mByteBufferEnd
protected int mByteBufferEnd
Pointer to the end marker, that is, the position one after the last valid available byte.
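Because the end marker is exclusive, the number of unread bytes is simply the difference between the two pointers. A minimal sketch of that bookkeeping follows; the helper names are hypothetical, and the assumption that one character needs four buffered bytes follows from UTF-32 itself rather than from this page.

```java
// Hypothetical sketch, not Woodstox code: pointer arithmetic on the byte buffer.
final class BufferBoundsSketch {
    /** Number of unread bytes between the read pointer and the exclusive end marker. */
    static int available(int bytePtr, int byteBufferEnd) {
        return byteBufferEnd - bytePtr;
    }

    /** True if at least one complete 4-byte UTF-32 unit is buffered. */
    static boolean hasFullUnit(int bytePtr, int byteBufferEnd) {
        return available(bytePtr, byteBufferEnd) >= 4;
    }
}
```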
-
mTmpBuf
protected char[] mTmpBuf
-
-
Constructor Detail
-
UTF32Reader
public UTF32Reader(ReaderConfig cfg, InputStream in, byte[] buf, int ptr, int len, boolean recycleBuffer, boolean isBigEndian)
-
-
Method Detail
-
setXmlCompliancy
public void setXmlCompliancy(int xmlVersion)
Method that can be called to indicate the xml conformance used when reading content using this reader. Some of the character validity checks need to be done at reader level, and sometimes they depend on xml level (for example, xml 1.1 has new linefeeds and both more and less restricted characters).
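To make the xml 1.1 linefeed point concrete, here is a minimal sketch of version-dependent normalization. It assumes that both conversion constants map to \n (the CONVERT_LSEP_TO description says \n is used; the CONVERT_NEL_TO value is an assumption), and it ignores the case where NEL follows \r. The class and method names are hypothetical.

```java
// Hypothetical sketch, not Woodstox code: xml 1.1 adds NEL and LSEP as linefeeds.
final class Xml11LinefeedSketch {
    static final int NEL = 0x85;     // NEXT LINE
    static final int LSEP = 0x2028;  // LINE SEPARATOR
    static final char CONVERT_TO = '\n'; // assumed value of both conversion constants

    /** Maps the xml 1.1 linefeed characters to \n; leaves everything else unchanged. */
    static int normalize(int c, boolean xml11) {
        if (xml11 && (c == NEL || c == LSEP)) {
            return CONVERT_TO;
        }
        return c;
    }
}
```

In xml 1.0 mode the same characters pass through unchanged, which is why the reader needs to be told the xml version.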
-
read
public int read(char[] cbuf, int start, int len) throws IOException
- Specified by: read in class Reader
- Throws:
IOException
-
canModifyBuffer
protected final boolean canModifyBuffer()
Method that can be used to see if we can actually modify the underlying buffer. This is the case if we are managing the buffer, but not if it was just given to us.
-
close
public void close() throws IOException
- Specified by: close in interface AutoCloseable
- Specified by: close in interface Closeable
- Specified by: close in class Reader
- Throws:
IOException
-
read
public int read() throws IOException
Although this method is implemented by the base class, and should never be called by Woodstox code, it is implemented here a bit more efficiently, just in case.
- Overrides: read in class Reader
- Throws:
IOException
-
getStream
protected final InputStream getStream()
-
readBytes
protected final int readBytes() throws IOException
Method for reading as many bytes from the underlying stream as possible (that fit in the buffer), to the beginning of the buffer.
- Throws:
IOException
-
readBytesAt
protected final int readBytesAt(int offset) throws IOException
Method for reading as many bytes from the underlying stream as possible (that fit in the buffer considering offset), to the specified offset.
- Returns:
- Number of bytes read, if any; -1 to indicate none available (that is, end of input)
- Throws:
IOException
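The contract described above (fill the buffer from a given offset, return -1 at end of input) can be sketched with plain java.io calls. This is an illustration under assumptions, not the actual implementation: the static helper takes the stream and buffer as parameters, whereas the real method works on the reader's own fields.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical sketch, not Woodstox code: read as much as fits after the offset.
final class ReadBytesAtSketch {
    /** Returns the number of bytes read, or -1 if no input is available. */
    static int readBytesAt(InputStream in, byte[] buf, int offset) throws IOException {
        if (in == null) {
            return -1; // no underlying stream: treat as end of input
        }
        return in.read(buf, offset, buf.length - offset);
    }

    /** Small demo: reads 3 bytes at offset 2, then reaches end of input. */
    static int[] demo() {
        try {
            InputStream in = new ByteArrayInputStream(new byte[] {1, 2, 3});
            byte[] buf = new byte[8];
            int first = readBytesAt(in, buf, 2);
            int second = readBytesAt(in, buf, 0);
            return new int[] {first, second, buf[2], buf[4]};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A caller that needs a complete 4-byte unit would keep calling until enough bytes are buffered or -1 is returned.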
-
freeBuffers
public final void freeBuffers()
This method should be called along with (or instead of) normal close. After calling this method, no further reads should be tried. Method will try to recycle read buffers (if any).
-
reportBounds
protected void reportBounds(char[] cbuf, int start, int len) throws IOException
- Throws:
IOException
-
reportStrangeStream
protected void reportStrangeStream() throws IOException
- Throws:
IOException
-
reportInvalidXml11
protected void reportInvalidXml11(int value, int bytePos, int charPos) throws IOException
- Throws:
IOException
-
-