What's the best way to have StringTokenizer split up a line of text into predefined variables?

I'm not sure if the title is very clear, but basically what I have to do is read a line of text from a file and split it up into 8 different string variables. Each line will have the same 8 chunks in the same order (title, author, price, etc). So for each line of text, I want to end up with 8 strings.

The first problem is that the last two fields in the line may or may not be present, so I need to check StringTokenizer.hasMoreTokens() before reading them; otherwise it will die messily when fields 7 and 8 are not present.

I would ideally like to do it in one while or for loop, but I'm not sure how to tell that loop what the order of the fields is going to be so it can fill all 8 (or 6) strings correctly. Please tell me there's a better way than using 8 nested if statements!

EDIT: The String.split solution definitely seems to be part of it, so I will use that instead of StringTokenizer. However, I'm still not sure what the best way of feeding the individual strings into the constructor is. Would the best way be to have the class expect an array, and then just do something like this in the constructor:

isbn = line[1];
title = line[2];

Asked by: Daryl454 | Posted: 23-01-2022

Answer 1

The best way is to not use a StringTokenizer at all, but use String's split method. It returns an array of Strings, and you can get the length from that.

For each line in your file you can do the following:

String[] tokens = line.split("#");

tokens will now have 6 to 8 Strings. Use tokens.length (a field on arrays, not a method) to find out how many, then create your object from the array.
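As a rough sketch of how the split result could feed a constructor that expects an array: the class names BookLineParser and Book, the field order, and the choice to pad missing fields with empty strings are all assumptions made for illustration, since the question does not give the exact layout. The "#" delimiter is the one used above. Passing -1 as the limit stops split from silently dropping trailing empty fields.

```java
import java.util.Arrays;

public class BookLineParser {
    // Assumed total number of fields per line (from the question: 8, last two optional).
    static final int FIELD_COUNT = 8;
    static final int REQUIRED_COUNT = 6;

    /** Splits one line on '#' and pads missing optional fields with empty strings. */
    static String[] parseLine(String line) {
        // A limit of -1 keeps trailing empty strings, so "a#b##" yields 4 tokens, not 2.
        String[] tokens = line.split("#", -1);
        if (tokens.length < REQUIRED_COUNT || tokens.length > FIELD_COUNT) {
            throw new IllegalArgumentException(
                    "Expected 6 to 8 fields but found " + tokens.length);
        }
        // copyOf pads the new slots with null; replace those with empty strings.
        String[] fields = Arrays.copyOf(tokens, FIELD_COUNT);
        for (int i = tokens.length; i < FIELD_COUNT; i++) {
            fields[i] = "";
        }
        return fields;
    }

    /** Hypothetical value class; only two of the eight fields are shown. */
    static class Book {
        final String isbn;
        final String title;

        Book(String[] fields) {
            // Assumed order: isbn first, title second (arrays are 0-indexed).
            this.isbn = fields[0];
            this.title = fields[1];
        }
    }

    public static void main(String[] args) {
        Book book = new Book(parseLine("111#Some Title#An Author#9.99#Pub#2001"));
        System.out.println(book.isbn + " / " + book.title);
    }
}
```

Because parseLine always returns exactly 8 entries, the constructor never has to check the length itself; whether an empty string or null is the better marker for an absent field depends on the rest of the program.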

Answered by: Lily525 | Posted: 24-02-2022

Answer 2

A regular expression is the way to go. You can convert your incoming String into an array of Strings using the split method, which takes a regular expression as its delimiter.


Answered by: Freddie124 | Posted: 24-02-2022

Answer 3

Would a regular expression with capture groups work for you? You can certainly make parts of the expression optional.

An example line of data or three might be helpful.
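Without sample data this can only be a sketch, but one possible shape for the capture-group idea is below. It assumes '#'-separated fields (borrowed from another answer), six mandatory fields, and two optional trailing ones; the class name OptionalFieldRegex is made up for the example. An optional group that does not take part in the match comes back as null from Matcher.group, which the code turns into an empty string.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OptionalFieldRegex {
    // Six mandatory fields, then up to two optional "#field" suffixes.
    // [^#]* matches a field that contains no delimiter (it may be empty).
    private static final Pattern LINE = Pattern.compile(
            "([^#]*)#([^#]*)#([^#]*)#([^#]*)#([^#]*)#([^#]*)"
            + "(?:#([^#]*))?(?:#([^#]*))?");

    /** Returns exactly 8 fields; absent optional fields become empty strings. */
    static String[] parse(String line) {
        Matcher m = LINE.matcher(line);
        // matches() requires the whole line to fit the pattern.
        if (!m.matches()) {
            throw new IllegalArgumentException("Unexpected line format: " + line);
        }
        String[] fields = new String[8];
        for (int i = 0; i < fields.length; i++) {
            String group = m.group(i + 1); // null if the optional group was skipped
            fields[i] = (group == null) ? "" : group;
        }
        return fields;
    }

    public static void main(String[] args) {
        String[] fields = parse("111#Some Title#An Author#9.99#Pub#2001#extra");
        System.out.println("Seventh field: " + fields[6]);
    }
}
```

For a plain delimited format this is more machinery than split needs; it mainly pays off when individual fields have to be validated as part of the same pattern.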

Answered by: Kevin216 | Posted: 24-02-2022

Answer 4

Is this a CSV or similar file by any chance? If so, there are libraries to help you, for example Apache Commons CSV (link to alternatives on their page too). It will get you a String[] for each line in the file. Just check the array size to know what optional fields are present.

Answered by: Arnold514 | Posted: 24-02-2022

Similar questions

java - StringTokenizer - first string?

I have this code to parse url string such as "?var=val" but when "search" is just "var=val" this code fails, how to make just "var=val" work as well? StringTokenizer st1 = new StringTokenizer(search, "?&;"); while(st1.hasMoreTokens()){ String st2= st1.nextToken(); int ii = st2.indexOf("="); if (ii > 0) { int ib = st2.length(); myparms.put( ...

java - Why is this StringTokenizer not tokenizing properly following the second time?

I want to parse the following using StringTokenizer for every string matching agent>. I tried it using the code like this. Where I am going wrong? StringTokenizer stringtokenizer=new StringTokenize(hari,"agent>"); while(stringtokenizer.hasMoreTokens()) { String token = stringtokenizer.nextToken(); System.out.println("output ="+token); } It is tokenizing properly...

java - trouble with StringTokenizer

I'm getting the following error message and I can't seem to figure out the problem. Would really appreciate any help. The error message reads as:- BaseStaInstance.java:68: cannot find symbol symbol : constructor StringTokenizer(java.lang.Object,java.lang.String) location: class java.util.StringTokenizer st = new StringTokenizer(buf,","); ...

java - StringTokenizer problem of tokenizing

String a ="the STRING TOKENIZER CLASS ALLOWS an APPLICATION to BREAK a STRING into TOKENS.  "; StringTokenizer st = new StringTokenizer(a); while (st.hasMoreTokens()){ System.out.println(st.nextToken()); Given above codes, the output is following, the STRING TOKENIZER CLASS ALLOWS an APPLICATION to BREAK a STRING into TOKENS.  My only question is why the ...

java - Consecutive Delimiters in StringTokenizer

I have to tokenize the following String 12/12/2010:{content1:[{xyz,abc}],13/12/2010:{content2:[{xyz,abc},{content3:[{aa,bb}]}] I need to split up the above string if it has }] consecutively. So I did, String[] tokens = null; StringTokenizer csvToken = new StringTokenizer(csvString,"]}"); tokens = new String[csvToken.countTokens()]; int tmp = 0; while(csvToken.hasMoreToke...

stringtokenizer - Infinite Loop in Java

So I'm trying to look at an input, and count words that fit a certain criteria (or rather, excluding words that I don't want to count). The error is in the following code: BufferedReader br; BufferedWriter bw; String line; int identifiers = 0; boolean count = false; try { br = new BufferedReader(new FileReader("A1.input")); line = br.readLine(); while(li...

Precision error when parsing a double variable using StringTokenizer in Java

I'm using the StringTokenizer class to read a text file. In the text file, there's a couple of double values. interest = Double.parseDouble(aString.nextToken()); System.out.println(interest); It shows up fine in the console, however when I try to print it later, System.out.println("Fixed Daily Interest = " + customers[i].get_interest() + "\n"); Precision ...

java - stringtokenizer and an unknown variable type

I’m currently working on a project in which I stream in a text document and tokenize it. The only problem is the types in the text document is unknown, is there any way to check what variable type it is before I set it to a variable in the program?

java - Error whilst using StringTokenizer on text file with multiple lines

I'm trying to read a text file and split the words individually using string tokenizer utility in java. The text file looks like this; a 2000 4 b 3000 c 4000 d 5000 Now, what I'm trying to do is get each individual character from the text file and store it into an array list. I then try and print every element in the arraylist in the end. Here is my code;

Performance of StringTokenizer class vs. String.split method in Java

In my software I need to split string into words. I currently have more than 19,000,000 documents with more than 30 words each. Which of the following two ways is the best way to do this (in terms of performance)? StringTokenizer sTokenize = new StringTokenizer(s," "); while (sTokenize.hasMoreTokens()) { or String[] splitS = s.split(" "); for(int i =0; i < splitS...
