I'm building an Android/Java program that reads a text file and stores each sentence from the file in an ArrayList. It then counts how often each word occurs and prints every sentence that contains a repeated word.
This is the code I am using to print the final result:
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.text4);
        text = (TextView) findViewById(R.id.info2);
        BufferedReader reader = null;
        try {
            reader = new BufferedReader(
                    new InputStreamReader(getAssets().open("input3.txt")));
            String line;
            List<String> sentences = new ArrayList<String>();
            for (String line2; (line2 = reader.readLine()) != null;) {
                for (String sentence : line2.split("(?<=[.?!\t])")) {
                    sentence = sentence.trim();
                    if (!sentence.isEmpty()) {
                        sentences.add(sentence);
                    }
                }
                String[] keys = line2.split(" ");
                String[] uniqueKeys;
                int count = 0;
                uniqueKeys = getUniqueKeys(keys);
                for (String key : uniqueKeys) {
                    if (null == key) {
                        break;
                    }
                    for (String s : keys) {
                        if (key.equals(s)) {
                            count++;
                        }
                    }
                    if (key.equals("a") || key.equals("the") || key.equals("is") || key.equals("of")
                            || key.equals("and") || key.equals("The") || key.equals("some") || key.equals("on")
                            || key.equals("during") || key.equals("to") || key.equals("since") || key.equals("in")
                            || key.equals("by") || key.equals("for") || key.equals("were") || key.equals("--")
                            || key.equals("in") || key.equals("as") || key.equals("that") || key.equals("may")
                            || key.equals("can") || key.equals("without") || key.equals("You")) {
                        count = 0;
                    }
                    if (count > 1) {
                        MyKey = key;
                        Pattern word = Pattern.compile("\\b" + key + "\\b", Pattern.CASE_INSENSITIVE);
                        // sentences is the ArrayList of sentences in this program
                        LinkedHashSet<String> lhs = new LinkedHashSet<String>();
                        for (String sentence : sentences) {
                            // checks the occurrence of the keyword within each sentence
                            if (word.matcher(sentence).find()) {
                                lhs.add(sentence);
                            }
                        }
                        for (String sentence2 : lhs) {
                            text.append(sentence2);
                        }
                    }
                    count = 0;
                }
            }
        } catch (IOException e) {
            Toast.makeText(getApplicationContext(), "Error reading file!", Toast.LENGTH_LONG).show();
            e.printStackTrace();
        } finally {
            if (reader != null) {
                try {
                    reader.close();
                } catch (IOException e) {
                    // log the exception
                }
            }
        }
    }
My program first reads the text file and stores each sentence in an ArrayList called "sentences".
It then reads each word in the text file, and each word that occurs more than once is held in a variable called "key".
It then checks whether "key" occurs in each sentence and, if it does, adds that sentence to a LinkedHashSet called "lhs".
Finally, it should display all the sentences in the LinkedHashSet on the output screen.
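The first two steps above can be sketched in plain Java, outside Android. This is only an illustration under some assumptions: the class name RepeatedWordSketch, the method names and the shortened stop-word list are hypothetical and not taken from my app, and it uses Map.merge, so it needs Java 8 or later. It reuses the same sentence-splitting regex as the code above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper class, for illustration only.
class RepeatedWordSketch {

    // Shortened, hypothetical stop-word list (the real code checks many more words).
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "an", "the", "The", "is", "of", "and"));

    // Step 1: split the text after '.', '?', '!' or a tab, and keep non-empty sentences.
    static List<String> splitSentences(String text) {
        List<String> sentences = new ArrayList<>();
        for (String sentence : text.split("(?<=[.?!\t])")) {
            sentence = sentence.trim();
            if (!sentence.isEmpty()) {
                sentences.add(sentence);
            }
        }
        return sentences;
    }

    // Step 2: count space-separated words and return those seen more than once,
    // in order of first appearance. Punctuation is not stripped, so "rate." and
    // "rate" count as different words, just as in the split(" ") approach.
    static List<String> repeatedWords(String text) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String word : text.split(" ")) {
            if (word.isEmpty() || STOP_WORDS.contains(word)) {
                continue;
            }
            counts.merge(word, 1, Integer::sum);
        }
        List<String> repeated = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() > 1) {
                repeated.add(entry.getKey());
            }
        }
        return repeated;
    }
}
```

Counting with a map replaces the nested loops over keys and uniqueKeys, but the idea is the same: a word becomes a "key" only if it appears more than once and is not a stop word.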
In this case, the values of "key" are "rate", "states" and "government".
However, my text file contains this sentence: "Thirteen states reported an unemployment rate above the current national rate."
As you can see, it contains both "states" and "rate", which are two of my keywords.
When I run the program, this sentence appears twice on the output screen, because the program searches for each "key" separately and so treats the two matches as two different sentences.
I used a LinkedHashSet to prevent this, but the sentence is still displayed twice.
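A minimal plain-Java sketch that reproduces this behaviour outside Android is given here. The class and method names are hypothetical, and Pattern.quote is an addition that is not in the original code. The method perKeySets mirrors the loop above, where a new LinkedHashSet is created for every key. The method sharedSet is included only for comparison, on the assumption that the per-key set is the cause: it keeps one set across all keys. The real app may have further sources of repetition, for example because the matching runs again for every line that is read.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical demo class, for illustration only.
class DuplicateSentenceSketch {

    // Whole-word, case-insensitive match, as in the original Pattern.
    // Pattern.quote is an addition so that keys such as "--" are treated literally.
    static boolean containsWord(String sentence, String key) {
        return Pattern.compile("\\b" + Pattern.quote(key) + "\\b", Pattern.CASE_INSENSITIVE)
                .matcher(sentence)
                .find();
    }

    // Mirrors the code above: a fresh set per key, appended to the output per key.
    // The set only removes duplicates found for the same key.
    static List<String> perKeySets(List<String> sentences, List<String> keys) {
        List<String> output = new ArrayList<>();
        for (String key : keys) {
            Set<String> lhs = new LinkedHashSet<>();
            for (String sentence : sentences) {
                if (containsWord(sentence, key)) {
                    lhs.add(sentence);
                }
            }
            output.addAll(lhs);
        }
        return output;
    }

    // Comparison variant (an assumption, not verified in the app):
    // one set shared by all keys, read out once after the loop.
    static List<String> sharedSet(List<String> sentences, List<String> keys) {
        Set<String> lhs = new LinkedHashSet<>();
        for (String key : keys) {
            for (String sentence : sentences) {
                if (containsWord(sentence, key)) {
                    lhs.add(sentence);
                }
            }
        }
        return new ArrayList<>(lhs);
    }
}
```

With the example sentence and the keys "rate" and "states", perKeySets returns the sentence once per matching key, while sharedSet returns it once in total.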
How should I fix this?