To speed up lookups in a multi-record file, I want to store its elements in an array of String arrays, so that a search for a string like "AF" only looks at similar strings ("AA", "AB", ..., "AZ") and not at the whole file.

The original file is like this:

AA
ABC
AF
(...)
AP
BE
BEND
(...)
BZ
(...)
SHORT
VERYLONGRECORD
ZX

which I want to translate into

AA  ABC     AF  (...)   AP
BE  BEND    (...)   BZ
(...)
SHORT
VERYLONGRECORD
ZX

I don't know how many records there are or how many "elements" each "row" will have, as the source file can change over time (although, once read into memory, the array is only read).

I tried this solution:

In a class I declared the array of String arrays, without defining its dimensions:

public static String[][] tldTabData;

then, in another class, I read the file:

public static void tldLoadTable() {

    String rec = null;
    int previdx = 0;
    int rowidx = 0;
    // this will hold each row
    ArrayList<String> mVector = new ArrayList<String>();

    FileInputStream fStream;
    BufferedReader bufRead = null;

    try {
        fStream = new FileInputStream(eVal.appPath+eVal.tldTabDataFilename);
        // Use DataInputStream to read binary NOT text.
        bufRead = new BufferedReader(new InputStreamReader(fStream));
    } catch (Exception er1) {
        /* if we fail the 1.st try maybe we're working into some "package" (e.g. debugging)
         * so we'll try a second time with a modified path (e.g. adding "bin\") instead of
         * raising an error and exiting.
         */
        try {
            fStream = new FileInputStream(eVal.appPath +
                "bin"+ File.separatorChar + eVal.tldTabDataFilename);
            // Use DataInputStream to read binary NOT text.
            bufRead = new BufferedReader(new InputStreamReader(fStream));
        } catch (FileNotFoundException er2) {
            System.err.println("Error: " + er2.getMessage());
            er2.printStackTrace();
            System.exit(1);
        }
    }
    try {
        while((rec = bufRead.readLine()) != null) {
            // strip comments and short (empty) rows
            if(!rec.startsWith("#") && rec.length() > 1) {
                // work with uppercase only (maybe unuseful)
                //rec.toUpperCase();
                // use the 1st char as a row index
                rowidx = rec.charAt(0);
                // if row changes (e.g. A->B and is not the 1.st line we read)
                if(previdx != rowidx && previdx != 0)
                {
                    // store the (completed) collection into the Array
                    eVal.tldTabData[previdx] = mVector.toArray(new String[mVector.size()]);
                    // clear the collection itself
                    mVector.clear();
                    // and restart to fill it from scratch
                    mVector.add(rec);
                } else
                {
                    // continue filling the collection
                    mVector.add(rec);
                }
                // and sync the indexes
                previdx = rowidx;
            }
        }
        bufRead.close();
        // globally flag the table as loaded
        eVal.tldTabLoaded = true;
    } catch (Exception er2) {
        System.err.println("Error: " + er2.getMessage());
        er2.printStackTrace();
        System.exit(1);
    }
}

When I run the program, it correctly accumulates the strings into mVector but, when it tries to copy them into eVal.tldTabData, I get a NullPointerException.

I suspect I have to create/initialize the array at some point, but I can't figure out where and how.
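
For illustration, here is a minimal sketch of what such an initialization could look like. It assumes, as in the code above, that the first character of each record is used as the row index and that it is an ASCII character (so 128 outer slots are enough). The class name JaggedArrayInit is made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

public class JaggedArrayInit {
    // Declared only: this reference is null until an array is allocated,
    // so indexing it throws a NullPointerException.
    static String[][] tldTabData;

    public static void main(String[] args) {
        // Allocate the outer dimension once. 128 slots is an assumption
        // that covers every ASCII first character used as a row index.
        tldTabData = new String[128][];

        List<String> row = new ArrayList<String>();
        row.add("AA");
        row.add("ABC");
        row.add("AF");

        // Each inner array can have its own length.
        tldTabData['A'] = row.toArray(new String[0]);

        System.out.println(tldTabData['A'].length);  // prints 3
        System.out.println(tldTabData['B'] == null); // prints true: unfilled rows stay null
    }
}
```

With this layout, any row that was never filled remains null and has to be checked before iterating over it.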

This is my first time coding in Java... hello world aside. :-)

Peter Lawrey
eewing
  • The application you are writing seems to be like a dictionary, and you need it to be dynamic, so I suggest switching from arrays to another data structure such as a TreeMap/HashMap (the best data structure for a dictionary is a trie). The arrays you are using now don't help make it dynamic. Also, please provide the stack trace showing where you get the NullPointerException. – Kaushik Sivakumar Oct 02 '13 at 16:27
  • java.lang.NullPointerException at readFile.tldLoadTable(readFile.java:189) at ruleset.rule_07(ruleset.java:197) at ruleset.main(ruleset.java:70) at readFile.main(readFile.java:77) at eVal.main(eVal.java:97). I got it when executing this line: eVal.tldTabData[previdx] = mVector.toArray(new String[mVector.size()]); My main concern is not performance while loading the items into memory, but while searching through them once they are loaded. I hope I can do something like: foreach(item in eVal.tldTabData[row]) ... and try matching a string against "item". – eewing Oct 02 '13 at 16:41

1 Answer

You can use a Map to store your strings per row.

Here is something along the lines of what you'll need:

        //Assuming that mVector already holds all your input strings
        Map<String,List<String>> map = new HashMap<String,List<String>>();

        for (String str : mVector){
            List<String> storedList;
            if (map.containsKey(str.substring(0, 1))){
                storedList = map.get(str.substring(0, 1));
            }else{
                storedList = new ArrayList<String>();
                map.put(str.substring(0, 1), storedList);
            }
            storedList.add(str);
        }

        Set<String> unOrdered = map.keySet();
        List<String> orderedIndexes = new ArrayList<String>(unOrdered);
        Collections.sort(orderedIndexes);

        for (String key : orderedIndexes){//get strings for every row
            List<String> values = map.get(key);
            for (String value : values){//writing strings on the same row
                System.out.print(value + "\t"); // change this to writing to some file
            }
            System.out.println(); // add new line at the end of the row
        }
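
Since the goal in the question is the lookup rather than the printing, the same idea can be sketched with the search included. This is only an illustration under a few assumptions: the class name PrefixTable and its methods are made up for the example, the key is the first character rather than a one-letter String, and a TreeMap is swapped in for the HashMap so the keys stay sorted without a separate Collections.sort step.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class PrefixTable {
    // TreeMap keeps the keys in sorted order.
    private final Map<Character, List<String>> rows =
            new TreeMap<Character, List<String>>();

    // Add one record to the row selected by its first character.
    public void add(String rec) {
        if (rec.isEmpty()) {
            return;
        }
        Character key = rec.charAt(0);
        List<String> row = rows.get(key);
        if (row == null) {
            row = new ArrayList<String>();
            rows.put(key, row);
        }
        row.add(rec);
    }

    // Search only the row that shares its first character with the query.
    public boolean contains(String query) {
        if (query.isEmpty()) {
            return false;
        }
        List<String> row = rows.get(query.charAt(0));
        return row != null && row.contains(query);
    }

    public static void main(String[] args) {
        PrefixTable table = new PrefixTable();
        for (String rec : Arrays.asList("AA", "ABC", "AF", "BE", "BEND", "ZX")) {
            table.add(rec);
        }
        System.out.println(table.contains("AF")); // true
        System.out.println(table.contains("AG")); // false
        System.out.println(table.contains("QQ")); // false
    }
}
```

In the file-reading loop from the question, each accepted line could be passed to add(rec) directly, which removes the need to track previdx and rowidx.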
Eugen Halca
  • Thank you very much, Eugen, your solution and sample fit perfectly. I had been trying for days to use the vector-of-vectors of strings directly, without understanding how to define and use it correctly. The program now debugs well and I just have to fix a few small remaining bugs (mainly logical ones), but the big rock in my way was my lack of **Map** knowledge. – eewing Oct 03 '13 at 16:49