java.lang.Object
- jahuwaldt.js.util.TextTokenizer

All Implemented Interfaces:

java.lang.Iterable<javolution.text.Text>, java.util.Enumeration<javolution.text.Text>, java.util.Iterator<javolution.text.Text>, javolution.lang.Realtime, javolution.lang.Reusable
```
public final class TextTokenizer
extends java.lang.Object
implements java.util.Enumeration<javolution.text.Text>, java.util.Iterator<javolution.text.Text>, java.lang.Iterable<javolution.text.Text>, javolution.lang.Realtime, javolution.lang.Reusable
```
The text tokenizer class allows an application to break a Text object into tokens. The tokenization method is much simpler than the one used by the StreamTokenizer class. The TextTokenizer methods do not distinguish among identifiers, numbers, and quoted strings, nor do they recognize and skip comments.
The set of delimiters (the characters that separate tokens) may be specified either at creation time or on a per-token basis.
An instance of TextTokenizer behaves in one of two ways, depending on whether it was created with the returnDelims flag having the value true or false:
- If the flag is false, delimiter characters serve to separate tokens. A token is a maximal sequence of consecutive characters that are not delimiters.
- If the flag is true, delimiter characters are themselves considered to be tokens. A token is thus either one delimiter character, or a maximal sequence of consecutive characters that are not delimiters.
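The two modes can be seen side by side in a short sketch. Because TextTokenizer is described as being based on java.util.StringTokenizer, the example below uses StringTokenizer as a stand-in so that it runs without the jahuwaldt and javolution libraries; the class name `ReturnDelimsDemo` and its `tokens` helper are hypothetical, and TextTokenizer itself is assumed, not verified, to behave the same way.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Stand-in demo: uses java.util.StringTokenizer, not TextTokenizer itself.
final class ReturnDelimsDemo {

    // Collects every token produced for the given text, delimiters and flag.
    static List<String> tokens(String text, String delims, boolean returnDelims) {
        StringTokenizer st = new StringTokenizer(text, delims, returnDelims);
        List<String> out = new ArrayList<>();
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        // Flag false: delimiters only separate tokens.
        System.out.println(tokens("a+b-c", "+-", false));
        // Flag true: each delimiter is returned as its own one-character token.
        System.out.println(tokens("a+b-c", "+-", true));
    }
}
```

With the flag set to true the caller can tell which delimiter separated two tokens, which is useful for simple expression-style input.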
A TextTokenizer object internally maintains a current position within the text to be tokenized. Some operations advance this current position past the characters processed.
A token is returned by taking a subtext of the text that was used to create the TextTokenizer object.
The following is one example of the use of the tokenizer. The code:
```
     TextTokenizer tt = TextTokenizer.valueOf("this is a test");
     while (tt.hasMoreTokens()) {
         System.out.println(tt.nextToken());
     }
 
```
prints the following output:
```
     this
     is
     a
     test
 
```
TextTokenizer is heavily based on java.util.StringTokenizer. However, it adds some improvements and additional methods and capabilities, such as optional quote handling, optional empty tokens, and restOfText().

Modified by: Joseph A. Huwaldt
Version:

February 17, 2025

Author:

Joseph A. Huwaldt Date: March 12, 2009

Method Summary

Modifier and Type	Method	Description
`int`	`countTokens()`	Calculates the number of times that this tokenizer's `nextToken` method can be called before it generates an exception.
`int`	`countTokens(java.lang.CharSequence delims)`	Calculates the number of times that this tokenizer's `nextToken` method can be called before it generates an exception using the given set of delimiters.
`boolean`	`getHonorQuotes()`	Returns `true` if this tokenizer honors quoted text (counts it as a single token).
`boolean`	`hasMoreElements()`	Returns the same value as the `hasMoreTokens` method.
`boolean`	`hasMoreTokens()`	Tests if there are more tokens available from this tokenizer's text.
`boolean`	`hasNext()`	Returns the same value as the `hasMoreTokens()` method.
`java.util.Iterator<javolution.text.Text>`	`iterator()`	Returns an iterator over the tokens returned by this tokenizer.
`static void`	`main(java.lang.String[] args)`	Testing code for this class.
`static TextTokenizer`	`newInstance()`	Return a text tokenizer with an initially empty string of text and with no delimiters.
`javolution.text.Text`	`next()`	Returns the same value as the `nextToken()` method.
`javolution.text.Text`	`nextElement()`	Returns the same value as the `nextToken` method.
`javolution.text.Text`	`nextToken()`	Returns the next token from this text tokenizer.
`javolution.text.Text`	`nextToken(java.lang.CharSequence delim)`	Returns the next token in this text tokenizer's text.
`static void`	`recycle(TextTokenizer instance)`	Recycles a `TextTokenizer` instance immediately (on the stack when executing in a `StackContext`).
`void`	`remove()`	This implementation always throws `UnsupportedOperationException`.
`void`	`reset()`	Resets the internal state of this object to its default values.
`javolution.text.Text`	`restOfText()`	Retrieves the rest of the text as a single token.
`void`	`setDelimiters(java.lang.CharSequence delim)`	Set the delimiters for this TextTokenizer.
`void`	`setHonorQuotes(boolean honorQuotes)`	Sets whether or not this tokenizer recognizes quoted text using the specified quote character.
`void`	`setQuoteChar(char quote)`	Set the character to use as the "quote" character.
`void`	`setReturnEmptyTokens(boolean returnEmptyTokens)`	Set whether empty tokens should be returned from this point in the tokenizing process onward.
`void`	`setText(java.lang.CharSequence text)`	Set the text to be tokenized in this TextTokenizer.
`javolution.text.Text`	`toText()`	Returns the same value as the `nextToken()` method.
`static TextTokenizer`	`valueOf(java.lang.CharSequence text)`	Return a text tokenizer for the specified character sequence.
`static TextTokenizer`	`valueOf(java.lang.CharSequence text, java.lang.CharSequence delim)`	Return a text tokenizer for the specified character sequence.
`static TextTokenizer`	`valueOf(java.lang.CharSequence text, java.lang.CharSequence delim, boolean returnDelims)`	Return a text tokenizer for the specified character sequence.

Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface java.util.Enumeration
asIterator

Methods inherited from interface java.lang.Iterable
forEach, spliterator

Methods inherited from interface java.util.Iterator
forEachRemaining

Method Detail

newInstance
```
public static TextTokenizer newInstance()
```
Return a text tokenizer with an initially empty string of text and with no delimiters. Use setText(java.lang.CharSequence) and setDelimiters(java.lang.CharSequence) to make this instance useful.

reset
```
public void reset()
```
Resets the internal state of this object to its default values.

Specified by:

reset in interface javolution.lang.Reusable

valueOf
```
public static TextTokenizer valueOf(java.lang.CharSequence text,
                                    java.lang.CharSequence delim,
                                    boolean returnDelims)
```
Return a text tokenizer for the specified character sequence. All characters in the delim argument are the delimiters for separating tokens.
If the returnDelims flag is true, then the delimiter characters are also returned as tokens. Each delimiter is returned as a string of length one. If the flag is false, the delimiter characters are skipped and only serve as separators between tokens.
Note that if delim is null, this method does not throw an exception. However, trying to invoke other methods on the resulting TextTokenizer may result in a NullPointerException.

Parameters:

text - the text to be parsed.

delim - the delimiters.

returnDelims - flag indicating whether to return the delimiters as tokens.

valueOf
```
public static TextTokenizer valueOf(java.lang.CharSequence text,
                                    java.lang.CharSequence delim)
```
Return a text tokenizer for the specified character sequence. The characters in the delim argument are the delimiters for separating tokens. Delimiter characters themselves will not be treated as tokens.

Parameters:

text - the text to be parsed.

delim - the delimiters.

valueOf
```
public static TextTokenizer valueOf(java.lang.CharSequence text)
```
Return a text tokenizer for the specified character sequence. The tokenizer uses the default delimiter set, which is " \t\n\r\f": the space character, the tab character, the newline character, the carriage-return character, and the form-feed character. Delimiter characters themselves will not be treated as tokens.

Parameters:

text - the text to be parsed.

setText
```
public void setText(java.lang.CharSequence text)
```
Set the text to be tokenized in this TextTokenizer.
This is useful for TextTokenizer re-use, so that a new tokenizer does not have to be created for each piece of text you want to tokenize.
The text will be tokenized from the beginning of the text.

Parameters:

text - the text to be parsed.

setDelimiters
```
public void setDelimiters(java.lang.CharSequence delim)
```
Set the delimiters for this TextTokenizer. The position must be initialized before this method is used (setText does this and it is called from the constructor).

Parameters:

delim - the delimiters

setQuoteChar
```
public void setQuoteChar(char quote)
```
Set the character to use as the "quote" character. All text between quote characters is considered a single token. The default quote character is '"'.

See Also:

setHonorQuotes(boolean)

setHonorQuotes
```
public void setHonorQuotes(boolean honorQuotes)
```
Sets whether or not this tokenizer recognizes quoted text using the specified quote character. If true is passed, this tokenizer will consider any text between the specified quote characters as a single token. Honoring of quotes defaults to false.

See Also:

setQuoteChar(char)
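The idea of honoring quotes can be illustrated with a small standalone splitter. This is a sketch of the general technique only, not the TextTokenizer implementation: the class `QuoteAwareSplit` and its `split` method are hypothetical, and the choice to keep the quote characters inside the returned token is an assumption that the documentation above does not confirm.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of quote-aware splitting; not library code.
final class QuoteAwareSplit {

    // Splits text on any character in delims, except between a pair of quote characters.
    // Assumption: quote characters are kept as part of the token.
    static List<String> split(String text, String delims, char quote) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == quote) {
                inQuotes = !inQuotes;          // entering or leaving quoted text
                current.append(c);
            } else if (!inQuotes && delims.indexOf(c) >= 0) {
                if (current.length() > 0) {    // a delimiter ends the current token
                    tokens.add(current.toString());
                    current.setLength(0);
                }
            } else {
                current.append(c);             // ordinary character, or delimiter inside quotes
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(split("say \"hello world\" now", " ", '"'));
    }
}
```

Passing a different `quote` argument, such as a single quote, plays the role that setQuoteChar(char) plays for the real class.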

getHonorQuotes
```
public boolean getHonorQuotes()
```
Returns true if this tokenizer honors quoted text (counts it as a single token).

setReturnEmptyTokens

public void setReturnEmptyTokens(boolean returnEmptyTokens)

Set whether empty tokens should be returned from this point in the tokenizing process onward.

Empty tokens occur when two delimiters are next to each other, or when a delimiter occurs at the beginning or end of a string. If empty tokens are set to be returned, and a comma is the delimiter (and delimiters are not themselves returned as tokens), the following table shows how many tokens are in each string.

String	Number of tokens
"one,two"	2 - normal case with no empty tokens.
"one,,three"	3 including the empty token in the middle.
"one,"	2 including the empty token at the end.
",two"	2 including the empty token at the beginning.
","	2 including the empty tokens at the beginning and the end.
""	1 - all strings will have at least one token if empty tokens are returned.

Parameters:

returnEmptyTokens - true if and only if empty tokens should be returned.
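The counts in the table can be reproduced with plain Java. The sketch below does not use TextTokenizer; it relies on `String.split` with a negative limit, which also keeps leading, interior and trailing empty strings, and so happens to give the same counts for these six inputs. The class name `EmptyTokenCount` and its helper are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Illustration only: String.split with a negative limit, not TextTokenizer.
final class EmptyTokenCount {

    // Splits on commas and keeps leading, interior and trailing empty strings.
    static List<String> splitKeepingEmpty(String s) {
        return Arrays.asList(s.split(",", -1));
    }

    public static void main(String[] args) {
        String[] samples = {"one,two", "one,,three", "one,", ",two", ",", ""};
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> " + splitKeepingEmpty(s).size());
        }
    }
}
```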

hasMoreTokens
```
public boolean hasMoreTokens()
```
Tests if there are more tokens available from this tokenizer's text. If this method returns true, then a subsequent call to nextToken with no argument will successfully return a token.

Returns:

true if and only if there is at least one token in the text after the current position; false otherwise.

nextToken
```
public javolution.text.Text nextToken()
```
Returns the next token from this text tokenizer.

Returns:

the next token from this text tokenizer.

Throws:

java.util.NoSuchElementException - if there are no more tokens in this tokenizer's text.

nextToken
```
public javolution.text.Text nextToken(java.lang.CharSequence delim)
```
Returns the next token in this text tokenizer's text. First, the set of characters considered to be delimiters by this TextTokenizer object is changed to be the characters in the string delim. Then the next token in the text after the current position is returned. The current position is advanced beyond the recognized token. The new delimiter set remains the default after this call.

Parameters:

delim - the new delimiters.

Returns:

the next token, after switching to the new delimiter set.

Throws:

java.util.NoSuchElementException - if there are no more tokens in this tokenizer's text.
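Switching delimiters part-way through the text has one consequence that is easy to miss. The sketch below shows it with java.util.StringTokenizer, whose `nextToken(String)` has the same documented contract; whether TextTokenizer behaves identically is an assumption. The class `SwitchDelimsDemo` and its `parse` helper are hypothetical.

```java
import java.util.Arrays;
import java.util.StringTokenizer;

// Stand-in demo: uses java.util.StringTokenizer, not TextTokenizer itself.
final class SwitchDelimsDemo {

    // Reads a key using "=" as the delimiter, then two values using ",".
    static String[] parse(String line) {
        StringTokenizer st = new StringTokenizer(line, "=");
        String key = st.nextToken();       // "=" is the delimiter here
        String first = st.nextToken(",");  // switch the delimiter set to ","
        String second = st.nextToken();    // "," remains in effect
        return new String[] {key, first, second};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parse("name=alice,bob")));
    }
}
```

With StringTokenizer the second token comes back as `=alice`: the current position was left on the old delimiter, which is no longer a delimiter once the set changes, so it becomes part of the next token. Callers that switch delimiter sets may need to strip such a leading character.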

hasMoreElements
```
public boolean hasMoreElements()
```
Returns the same value as the hasMoreTokens method. It exists so that this class can implement the Enumeration interface.

Specified by:

hasMoreElements in interface java.util.Enumeration<javolution.text.Text>

Returns:

true if there are more tokens; false otherwise.

See Also:

Enumeration, hasMoreTokens()

nextElement
```
public javolution.text.Text nextElement()
```
Returns the same value as the nextToken method. It exists so that this class can implement the Enumeration interface.

Specified by:

nextElement in interface java.util.Enumeration<javolution.text.Text>

Returns:

the next token in the text.

Throws:

java.util.NoSuchElementException - if there are no more tokens in this tokenizer's text.

See Also:

Enumeration, nextToken()

iterator
```
public java.util.Iterator<javolution.text.Text> iterator()
```
Returns an iterator over the tokens returned by this tokenizer.

Specified by:

iterator in interface java.lang.Iterable<javolution.text.Text>
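Because the class implements both Iterable and Iterator, it can be used directly in a for-each loop. The pattern can be sketched with a hypothetical adapter, `TokenIterable`, that wraps java.util.StringTokenizer. Returning `this` from `iterator()` is an assumption made for the sketch and is not confirmed for TextTokenizer; it makes the object a single-pass sequence.

```java
import java.util.Iterator;
import java.util.StringTokenizer;

// Hypothetical adapter showing the Iterable + Iterator pattern; not library code.
final class TokenIterable implements Iterable<String>, Iterator<String> {
    private final StringTokenizer st;

    TokenIterable(String text) {
        this.st = new StringTokenizer(text);
    }

    @Override
    public Iterator<String> iterator() {
        return this; // single pass: the tokenizer is its own iterator
    }

    @Override
    public boolean hasNext() {
        return st.hasMoreTokens();
    }

    @Override
    public String next() {
        return st.nextToken(); // throws NoSuchElementException when exhausted
    }

    @Override
    public void remove() {
        throw new UnsupportedOperationException();
    }

    // Joins all tokens of the text with '|' using a for-each loop.
    static String join(String text) {
        StringBuilder sb = new StringBuilder();
        for (String token : new TokenIterable(text)) {
            if (sb.length() > 0) {
                sb.append('|');
            }
            sb.append(token);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(join("this is a test"));
    }
}
```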

hasNext
```
public boolean hasNext()
```
Returns the same value as the hasMoreTokens() method. It exists so that this class can implement the Iterator interface.

Specified by:

hasNext in interface java.util.Iterator<javolution.text.Text>

Returns:

true if there are more tokens; false otherwise.

See Also:

Iterator, hasMoreTokens()

next
```
public javolution.text.Text next()
```
Returns the same value as the nextToken() method. It exists so that this class can implement the Iterator interface.

Specified by:

next in interface java.util.Iterator<javolution.text.Text>

Returns:

the next token in the text.

Throws:

java.util.NoSuchElementException - if there are no more tokens in this tokenizer's text.

See Also:

Iterator, nextToken()

remove
```
public void remove()
```
This implementation always throws UnsupportedOperationException. It exists so that this class can implement the Iterator interface.

Specified by:

remove in interface java.util.Iterator<javolution.text.Text>

Throws:

java.lang.UnsupportedOperationException - always thrown.

See Also:

Iterator

countTokens
```
public int countTokens()
```
Calculates the number of times that this tokenizer's nextToken method can be called before it generates an exception. The current position is not advanced.

Returns:

the number of tokens remaining in the text using the current delimiter set.

See Also:

nextToken()
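That counting does not consume tokens can be checked with a short sketch. It again uses java.util.StringTokenizer as a stand-in, on the assumption that TextTokenizer follows the same contract; `CountTokensDemo` and `counts` are hypothetical names.

```java
import java.util.Arrays;
import java.util.StringTokenizer;

// Stand-in demo: uses java.util.StringTokenizer, not TextTokenizer itself.
final class CountTokensDemo {

    // Returns the count taken twice in a row, then once more after consuming one token.
    static int[] counts(String text) {
        StringTokenizer st = new StringTokenizer(text);
        int first = st.countTokens();
        int second = st.countTokens(); // unchanged: counting does not advance the position
        st.nextToken();                // consume one token
        int afterOne = st.countTokens();
        return new int[] {first, second, afterOne};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(counts("this is a test")));
    }
}
```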

countTokens
```
public int countTokens(java.lang.CharSequence delims)
```
Calculates the number of times that this tokenizer's nextToken method can be called before it generates an exception using the given set of delimiters. The delimiters given will be used for future calls to nextToken() unless new delimiters are given. The current position is not advanced.

Parameters:

delims - the new set of delimiters.

Returns:

the number of tokens remaining in the text using the new delimiter set.

See Also:

countTokens()

restOfText
```
public javolution.text.Text restOfText()
```
Retrieves the rest of the text as a single token. After calling this method hasMoreTokens() will always return false.

Returns:

any part of the text that has not yet been tokenized.

toText
```
public javolution.text.Text toText()
```
Returns the same value as the nextToken() method. It exists so that this class can implement the Realtime interface.

Specified by:

toText in interface javolution.lang.Realtime

Returns:

the next token in the text.

Throws:

java.util.NoSuchElementException - if there are no more tokens in this tokenizer's text.

See Also:

Realtime, nextToken()

recycle
```
public static void recycle(TextTokenizer instance)
```
Recycles a TextTokenizer instance immediately (on the stack when executing in a StackContext).

main

public static void main(java.lang.String[] args)

Testing code for this class.
