0

I have an interesting bug where the program hits this line

System.out.println("!tempLine.equals(raf.readLine().toString())");

every time, at a random index in the loop. I'm not sure how it is possible that tempLine does not equal raf.readLine().toString() when the assignment just before it is always applied. Another interesting thing (not sure whether it's related) is that at some point raf.readLine() and raf.readLine().toString() have two different values. Desperate for help :-)

   private static Map<String, List<KeyPhraseAnnotation>> getKeyPhrasesFromNewNlp(String filename) throws Exception
{
    String  manualMemoPrefix = "Caller/Customer Name:";

    FreeTextProcessingPipeline nlpPipeline = springContext.getBean(FreeTextProcessingPipeline.class);
    nlpPipeline.initialize();
    Map<String, List<KeyPhraseAnnotation>> kpMatrix = new HashMap<String, List<KeyPhraseAnnotation>>();

    //Map<String, FreeTextProcessingResult> results = nlpPipeline.processFiles(folder, "en-US");
    Random rand = new Random();
    BufferedReader br = new BufferedReader(new FileReader(filename));
    RandomAccessFile raf = new RandomAccessFile(filename,"rw");
    String line;
    long counter = 0;
    int lines = 0;
    int k = 0;
    int s_max = 0;
    int s_min = 0;
    int t = 0;
    int hit=0;
    double e ;
    int max_num_of_documents = 500;
    String[] parts = null;

    while (br.readLine() != null) lines++;

    t = (int) Math.floor((lines/max_num_of_documents));
    k = t;
    e = 0.1 * k;
    String  tempLine = null;
    String  memo_manual = null;
    int current_num_docs = 0;
    while (current_num_docs<max_num_of_documents){

        System.out.println("this is u in beginning of loop: " + current_num_docs);
        tempLine = null;
        s_max = (int) (k+e);
        s_min = (int) (k-e);
        hit = s_min + (int)(Math.random() * ((s_max - s_min) + 1))  ;

        if(hit<lines && hit>0){
        raf.seek(hit);
        }
        else{
        break;  
        }

        tempLine = raf.readLine().toString();
        if (!tempLine.equals(raf.readLine().toString())) 
        {
            System.out.println("!tempLine.equals(raf.readLine().toString())");
        }
        parts = tempLine.split("\\|");
        //String sessionId = parts[0];
        if(parts.length == 21){
            memo_manual = parts[15];
        }
        else { 
            memo_manual="";
            System.out.println(raf.readLine() + "               " + tempLine);
        }

        if (memo_manual.toLowerCase().contains(manualMemoPrefix.toLowerCase())){
            FreeTextProcessingRequest request = new FreeTextProcessingRequest();
            request.setText(memo_manual);
            FreeTextProcessingResult result =  nlpPipeline.processRequest(request);
            List<KeyPhraseAnnotation> list = Arrays.asList(result.getDefaultView().getKeyPhraseAnnotations());
            kpMatrix.put(Long.toString(counter), list);


                for (KeyPhraseAnnotation kp : list){
                    System.out.println(kp.getValue() +" : " +kp.getImportance());

                }
            //t += s_max+1;
            current_num_docs++;
            k = k + t;  
        }
        System.out.println("this is u in end of loop: " + current_num_docs);
    }

    System.out.println("OUT OF FOR");
    /*while ((line = br.readLine()) != null && DocCounter < 50000) {

        String[] parts = line.split("\\|");
        //String sessionId = parts[0];
        String memo_manual = parts[15];
        //String category = parts[2];

        //String  AccountBalance = "2139";
        String  manualMemoPrefix = "Caller/Customer Name:";
        //if (category.equals(AccountBalance) &&  memo_manual.toLowerCase().contains(manualMemoPrefix.toLowerCase())){
        if (memo_manual.toLowerCase().contains(manualMemoPrefix.toLowerCase())){
        DocCounter ++ ;

        FreeTextProcessingRequest request = new FreeTextProcessingRequest();
        request.setText(memo_manual);
        FreeTextProcessingResult result =  nlpPipeline.processRequest(request);
        List<KeyPhraseAnnotation> list = Arrays.asList(result.getDefaultView().getKeyPhraseAnnotations());
        kpMatrix.put(Long.toString(counter), list);


            for (KeyPhraseAnnotation kp : list){
                System.out.println(kp.getValue() +" : " +kp.getImportance());

            }
        }
        counter++;
    }*/
    br.close(); 

    }
Oomph Fortuity
user3628777

2 Answers

3
    tempLine = raf.readLine().toString(); // first readLine
    if (!tempLine.equals(raf.readLine().toString())) // second readLine

Each readLine reads a new line, so of course tempLine.equals(raf.readLine().toString()) would return false (since you are comparing two different lines). It would only be true if two consecutive lines are equal.
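To make the point above concrete, here is a minimal sketch. The class name `ReadLineOnce`, the helper `readTwoLines`, and the sample file contents are made up for illustration and are not from the question. It shows that two consecutive `readLine()` calls on the same `RandomAccessFile` return two different lines, so the line should be read once, stored in a variable, and that variable reused.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class ReadLineOnce {

    /** Calls readLine() twice to show that each call advances the file pointer. */
    static String[] readTwoLines(Path file) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            String first = raf.readLine();   // consumes line 1
            String second = raf.readLine();  // consumes line 2, NOT line 1 again
            return new String[] { first, second };
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample data, written to a temporary file.
        Path file = Files.createTempFile("sample", ".txt");
        Files.write(file, "alpha|1\nbeta|2\ngamma|3\n".getBytes(StandardCharsets.US_ASCII));

        String[] lines = readTwoLines(file);
        System.out.println(lines[0]); // prints alpha|1
        System.out.println(lines[1]); // prints beta|2

        Files.delete(file);
    }
}
```

In the question's loop, this means keeping only the single `tempLine = raf.readLine();` call and using `tempLine` everywhere afterwards, including in the debug print. Note also that `readLine()` already returns a `String` (or `null` at end of file), so the `.toString()` adds nothing and would throw a `NullPointerException` at end of file.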

Eran
0
    tempLine = raf.readLine().toString();
    if (!tempLine.equals(raf.readLine().toString())) 
    {
        System.out.println("!tempLine.equals(raf.readLine().toString())");
    }

raf.readLine() is a method; each time you call it, it reads the next line.

ekaerovets
  • How come raf.seek(hit) is NOT moving the pointer to the beginning of a line? My use case is that I need to read a random sample of lines from the file. So I randomly determine the variable 'hit' and want to read the line in the file at position 'hit'. Apparently, raf.seek() is taking me to the middle of a line and not to the beginning as desired. Could you advise? – user3628777 Sep 05 '14 at 12:35
  • `hit` here is not the line number, as you'd like it to be, but a byte offset into the file, which usually does not fall at the beginning of a line. Try this: `raf.seek(hit); raf.readLine(); // reads the rest of the line, from hit to its end; String theLineYouWant = raf.readLine(); // reads the next line, which should be a complete, unabridged line;`. Of course, you must handle the case where there is no line after the one at the `hit` position. – ekaerovets Sep 05 '14 at 14:20
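The approach described in the last comment can be sketched as follows. This is an illustration under stated assumptions, not the asker's final code: the class `RandomLineSampler` and the methods `lineAfterOffset` and `randomLine` are hypothetical names. The sketch seeks to a byte offset, discards the (probably partial) line it lands in, and returns the next complete line, or `null` when there is none.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.Random;

public class RandomLineSampler {

    /**
     * Seeks to byteOffset, skips the rest of the line it lands in, and
     * returns the next complete line. Returns null if the offset is out of
     * range or no further line exists.
     */
    static String lineAfterOffset(RandomAccessFile raf, long byteOffset) throws IOException {
        if (byteOffset < 0 || byteOffset >= raf.length()) {
            return null;
        }
        raf.seek(byteOffset);  // byte position, not a line number
        raf.readLine();        // discard the remainder of the current line
        return raf.readLine(); // next full line, or null at end of file
    }

    /** Picks a random byte offset in the file and returns the line that follows it. */
    static String randomLine(RandomAccessFile raf, Random rnd) throws IOException {
        long length = raf.length();
        if (length == 0) {
            return null;
        }
        long offset = (long) (rnd.nextDouble() * length);
        return lineAfterOffset(raf, offset);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample data, written to a temporary file.
        File file = File.createTempFile("sampler", ".txt");
        Files.write(file.toPath(), "aaa\nbbb\nccc\n".getBytes(StandardCharsets.US_ASCII));

        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            System.out.println(lineAfterOffset(raf, 1)); // prints bbb
            System.out.println(randomLine(raf, new Random()));
        }
        file.delete();
    }
}
```

A few caveats with this sketch: the first line of the file is never returned, because the line containing the offset is always discarded; lines that follow long lines are picked more often than lines that follow short ones; and the caller has to handle a `null` result, for example by drawing another offset. `RandomAccessFile.readLine()` also does not decode a character set (each byte becomes one character), so it is only reliable for single-byte encodings.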