I'm building an Android/Java program that reads a text file and stores each sentence from the file in an ArrayList. It then counts how often each word occurs and prints all the sentences that contain repeated words.

This is the code that I am using to print out the final result:

    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.text4);
        text = (TextView) findViewById(R.id.info2);
        BufferedReader reader = null;

        try {
            reader = new BufferedReader(
                    new InputStreamReader(getAssets().open("input3.txt")));

            String line;

            List<String> sentences = new ArrayList<String>();

            for (String line2; (line2 = reader.readLine()) != null;) {

                for (String sentence : line2.split("(?<=[.?!\t])")) {
                    sentence = sentence.trim();
                    if (!sentence.isEmpty()) {
                        sentences.add(sentence);
                    }
                }

                String[] keys = line2.split(" ");
                String[] uniqueKeys;

                int count = 0;
                uniqueKeys = getUniqueKeys(keys);

                for (String key : uniqueKeys) {
                    if (null == key) {
                        break;
                    }
                    for (String s : keys) {
                        if (key.equals(s)) {
                            count++;
                        }
                    }

                    if (key.equals("a") || key.equals("the") || key.equals("is")
                            || key.equals("of") || key.equals("and") || key.equals("The")
                            || key.equals("some") || key.equals("on") || key.equals("during")
                            || key.equals("to") || key.equals("since") || key.equals("in")
                            || key.equals("by") || key.equals("for") || key.equals("were")
                            || key.equals("--") || key.equals("as") || key.equals("that")
                            || key.equals("may") || key.equals("can") || key.equals("without")
                            || key.equals("You")) {
                        count = 0;
                    }

                    if (count > 1) {
                        MyKey = key;

                        Pattern word = Pattern.compile("\\b" + key + "\\b", Pattern.CASE_INSENSITIVE);

                        //sentences is the arrayList of sentences in this program
                        LinkedHashSet<String> lhs = new LinkedHashSet<String>();
                        for (String sentence : sentences) {
                            //checks the occurrence of keyword within each sentence
                            if (word.matcher(sentence).find()) {
                                lhs.add(sentence);
                            }
                        }
                        for (String sentence2 : lhs) {
                            text.append(sentence2);
                        }
                    }
                    count = 0;
                }
            }
        } catch (IOException e) {
            Toast.makeText(getApplicationContext(), "Error reading file!", Toast.LENGTH_LONG).show();
            e.printStackTrace();
        } finally {
            if (reader != null) {
                try {
                    reader.close();
                } catch (IOException e) {
                    //log the exception
                }
            }
        }
    }

  1. My program first reads a text file and stores each sentence from the file in an ArrayList called "sentences".

  2. Then it reads each word in the text file and treats every word that occurs more than once as a "key".

  3. Then it checks whether the "key" exists in each sentence and, if it does, adds that sentence to a LinkedHashSet called "lhs".

  4. Then it should display all the sentences in the LinkedHashSet on the output screen.
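
Steps 1 and 2 above can be sketched in isolation as plain Java. This is only a minimal sketch and not the code from the app: the class name `RepeatedWords`, its two methods, and the shortened stop-word set are assumptions made for illustration, and the sentence regex is the one from the code above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class RepeatedWords {
    // Hypothetical, shortened stop-word list; the real program ignores more words.
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "the", "is", "of", "and", "to", "in"));

    // Step 1: split a line into trimmed, non-empty sentences.
    static List<String> splitSentences(String line) {
        List<String> sentences = new ArrayList<>();
        for (String sentence : line.split("(?<=[.?!\t])")) {
            sentence = sentence.trim();
            if (!sentence.isEmpty()) {
                sentences.add(sentence);
            }
        }
        return sentences;
    }

    // Step 2: collect the words that occur more than once, skipping stop words.
    static Set<String> repeatedWords(String line) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String word : line.split(" ")) {
            if (word.isEmpty() || STOP_WORDS.contains(word.toLowerCase())) {
                continue;
            }
            Integer seen = counts.get(word);
            counts.put(word, seen == null ? 1 : seen + 1);
        }
        Set<String> repeated = new LinkedHashSet<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() > 1) {
                repeated.add(entry.getKey());
            }
        }
        return repeated;
    }

    public static void main(String[] args) {
        String line = "The rate fell. Analysts said the rate may fall again!";
        System.out.println(splitSentences(line));
        System.out.println(repeatedWords(line));
    }
}
```

Keeping the stop words in a `Set` replaces the long chain of `key.equals(...)` checks with a single lookup, which is a design choice of this sketch rather than something the program above does.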

On this occasion, the values of "key" are "rate", "states" and "government".

However, my text file contains this sentence: "Thirteen states reported an unemployment rate above the current national rate."

As you can see, it contains both "states" and "rate", which are two of my keywords.

When I run this program, this particular sentence appears twice on the output screen, because the program looks for each "key" separately and so treats the two matches as two different sentences.

This is why I used a LinkedHashSet to prevent this, but the sentence is still displayed twice on the output screen.

How should I fix this?

1 Answer

You are creating a new LinkedHashSet instance every time a word matches, so the set never remembers the sentences that were added for a previous key.

Try this:

    //sentences is the arrayList of sentences in this program
    LinkedHashSet<String> lhs = new LinkedHashSet<String>();
    for (String sentence : sentences) {
        //checks the occurrence of keyword within each sentence
        if (word.matcher(sentence).find()) {
            lhs.add(sentence);
        }
    }

    //displays the final result on the output window
    //remove the next line if text is already declared (e.g. as your TextView); String has no append() method
    String text = "";
    for (String sentence2 : lhs) {
        text.append(sentence2);
    }
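
To make the idea concrete, here is a minimal, self-contained sketch of one way to keep a single set across all keys. It is an illustration under assumptions, not the asker's activity code: the class `SentenceDeduper` and the method `matchingSentences` are hypothetical names, the second and third sample sentences are made up, and `Pattern.quote` is added only so that a key such as "--" cannot be misread as regex syntax.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

class SentenceDeduper {
    // Returns every sentence that contains at least one key, each sentence once,
    // in first-seen order. The set is created once, outside the loop over keys.
    static Set<String> matchingSentences(List<String> sentences, List<String> keys) {
        Set<String> lhs = new LinkedHashSet<>();
        for (String key : keys) {
            Pattern word = Pattern.compile(
                    "\\b" + Pattern.quote(key) + "\\b", Pattern.CASE_INSENSITIVE);
            for (String sentence : sentences) {
                if (word.matcher(sentence).find()) {
                    lhs.add(sentence); // adding a sentence that is already present is a no-op
                }
            }
        }
        return lhs;
    }

    public static void main(String[] args) {
        List<String> sentences = Arrays.asList(
                "Thirteen states reported an unemployment rate above the current national rate.",
                "The government released new figures.", // made-up sample sentence
                "Nothing relevant here.");               // made-up sample sentence
        List<String> keys = Arrays.asList("rate", "states", "government");

        // Display step: runs once, after all keys have been processed.
        for (String sentence : matchingSentences(sentences, keys)) {
            System.out.println(sentence);
        }
    }
}
```

In the activity from the question, this would roughly correspond to declaring `lhs` once before the loop over `uniqueKeys` (or before the loop over lines, since `sentences` keeps growing from line to line) and running the `text.append(...)` loop once after that loop has finished.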
febaisi
  • Why did you add String text = "";? text is the name of my TextView – user5679217 Dec 14 '15 at 20:48
  • To avoid a NullPointerException. If you have already declared it, just remove my declaration. If it works for you, please upvote! (: – febaisi Dec 14 '15 at 20:55
  • Can you please check it now? I have added all the code from my program – user5679217 Dec 14 '15 at 20:55
  • Is it a build error? A runtime error? Please describe the problem. – febaisi Dec 14 '15 at 21:01
  • When I run the program above, it displays the sentences that contain a "key", as I explained in my question. However, some sentences, like the example above, contain two "key" words, so the program treats that particular sentence as two separate sentences. That's why I tried to use a LinkedHashSet to prevent duplicate sentences from being displayed. – user5679217 Dec 14 '15 at 21:05
  • I have now updated my code with your code (check the question), but it still gives the same result – user5679217 Dec 14 '15 at 21:12