How to use StringTokenizer to parse Strings into tokens in Java | Tutorial with examples
This tutorial explains how to use the java.util.StringTokenizer class to parse a String containing delimited data tokens. It first explains what a StringTokenizer does, along with the basic concepts of delimiters and tokens. Next, it uses a Java code example to show how to code using a StringTokenizer.
What does the java.util.StringTokenizer class do
The StringTokenizer class breaks a given String containing data into smaller tokens. To do so, it uses the concept of delimiters. Tokens and delimiters are the two key terms one needs to understand to use StringTokenizer. Their definitions follow -
What is a token? The portion of the string between two delimiters (or between a delimiter and either end of the string) is a token. Tokens contain the actual information which we want to extract from the input string.
What is a delimiter? A character marking the end of a token is a delimiter. StringTokenizer accepts a set of such characters, and each character in the set acts as a delimiter on its own.
So, given a String containing data, if we want to read the tokens present in this string using the StringTokenizer class, then there needs to be a delimiter defined for separating the tokens. We can then instruct the StringTokenizer class to extract the tokens from between these delimiters.
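To make the role of the delimiter concrete, here is a minimal sketch of handing delimiters to a StringTokenizer. The class name DelimiterSetExample, the helper method tokenize() and the sample data are assumptions made up for this illustration. The sketch relies on the documented behaviour that every character in the second constructor argument is treated as a separate delimiter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class DelimiterSetExample {

    // Hypothetical helper (not part of StringTokenizer): collects all tokens into a list.
    static List<String> tokenize(String rawData, String delimiters) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(rawData, delimiters);
        while (tokenizer.hasMoreTokens()) {
            tokens.add(tokenizer.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Assumed sample data: names separated by a comma, a semicolon or a space.
        String rawData = "John,David;George Frank";
        // Each character of ",; " is a delimiter on its own.
        System.out.println(tokenize(rawData, ",; ")); // prints [John, David, George, Frank]
    }
}
```

Note that the delimiters themselves are not returned as tokens here; only the text between them is.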
How to extract tokens from a String using StringTokenizer
The StringTokenizer class provides a set of methods to iterate through and extract the tokens read from the input String. For most practical scenarios you will need only the following 2 methods -
- hasMoreTokens(): This method returns a boolean value indicating whether any more 'unprocessed' tokens are present in the input string.
- nextToken(): This method returns the next token as a String.
hasMoreTokens() and nextToken() work in tandem to move through the tokens, much as the hasNext() and next() methods of the Iterator interface work together. You check whether any tokens remain using the hasMoreTokens() method before accessing the next token. Just like an Iterator, StringTokenizer maintains an internal pointer to the next token to be read. A call to nextToken() reads the token the pointer is currently pointing to and moves the pointer ahead so that it now points to the next token.
If you try to access the next token using nextToken() without first checking hasMoreTokens(), you run the risk of a java.util.NoSuchElementException being thrown when there are no more tokens remaining. So, as an accepted practice, a call to nextToken() is always preceded by a check for the next token's existence using hasMoreTokens().
Let us now see the StringTokenizer in action. In the Java code example that follows we will extract 5 names (tokens) which are delimited by commas (delimiter).
Java code example showing StringTokenizer usage
    package com.javabrahman.corejava;

    import java.util.StringTokenizer;

    public class StringTokenizerExample {
        public static void main(String[] args) {
            // Comma-delimited input holding the 5 names
            String rawData = "John,David,George,Frank,Tom";
            // The second argument is the delimiter
            StringTokenizer tokenizer = new StringTokenizer(rawData, ",");
            // Keep reading while unprocessed tokens remain
            while (tokenizer.hasMoreTokens()) {
                System.out.println(tokenizer.nextToken());
            }
        }
    }
OUTPUT of the above code
    John
    David
    George
    Frank
    Tom
Explanation of the code
- In the StringTokenizerExample class's main() method we first create a string named rawData containing the comma-delimited names.
- Next we create a StringTokenizer instance, named tokenizer, using its 2-parameter constructor. The first parameter is the string rawData, while the second parameter is the delimiter, i.e. ',' (comma).
- We then create a while loop which keeps executing as long as the method hasMoreTokens() returns true, i.e. as long as there are tokens remaining to be read from tokenizer.
- Inside the loop we keep getting the next token values, i.e. names, and we keep printing them. After each nextToken() call, tokenizer's internal pointer moves ahead to point to the next token. This sequence of token fetching, moving forward, and re-entering the loop continues till hasMoreTokens() returns false and the loop ends.
- The output of the above program is as expected - the 5 names printed on 5 lines. (Note - Each name is printed on a separate line since we used the println() method to print them.)
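The tutorial also warns that calling nextToken() without checking hasMoreTokens() can throw java.util.NoSuchElementException. The sketch below shows that situation under assumed names: the class UncheckedNextTokenExample, the helper readPastEnd() and the fallback message are hypothetical and exist only for this illustration.

```java
import java.util.NoSuchElementException;
import java.util.StringTokenizer;

class UncheckedNextTokenExample {

    // Hypothetical helper: consumes every token, then makes one unchecked nextToken() call.
    static String readPastEnd(String rawData, String delimiters) {
        StringTokenizer tokenizer = new StringTokenizer(rawData, delimiters);
        while (tokenizer.hasMoreTokens()) {
            tokenizer.nextToken(); // the guarded calls are safe
        }
        try {
            // No hasMoreTokens() check here, and no tokens are left.
            return tokenizer.nextToken();
        } catch (NoSuchElementException e) {
            return "no more tokens"; // assumed fallback message for this sketch
        }
    }

    public static void main(String[] args) {
        System.out.println(readPastEnd("John,David", ",")); // prints no more tokens
    }
}
```

In ordinary code the try/catch is unnecessary; guarding each nextToken() call with hasMoreTokens(), as the main example does, avoids the exception altogether.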
In this tutorial we looked at the use of the StringTokenizer class in parsing a given input String to retrieve the values (sub-strings) stored in it. We then saw the two main methods of StringTokenizer, followed by a Java example showing the StringTokenizer and its methods in action.