I'm working on a project where I want to parse some text and find the nouns. A lot of the text I want to parse contains pronouns, for example: "Emma the parrot was a bird. She lived in a tall tree".

I don't want to work with "She" etc., as they aren't treated as nouns in the dictionary I'm working with, so I've been writing a method that replaces each pronoun with the previous occurrence of a name. The example above would become: "Emma the parrot was a bird. Emma lived in a tall tree".

The method works fine on a small sample, but it doesn't work when the text involves 3-4 different people.

public static String replacePronouns(String text, ArrayList<String> dictionary) {
        String[] strArray = text.replaceAll("\\.", " .").replaceAll("\\,", "").split("\\s+");
        String previousName = "";
        for(int i = 0; i < strArray.length; i++ ) {
            //we'll have to set this to be more dynamic -> change to pronouns in dictionary
            if(strArray[i].equals("His") || strArray[i].equals("She") || strArray[i].equals("she") || strArray[i].equals("him") || strArray[i].equals("he") || strArray[i].equals("her")) {
                for(int j = (i-1); j>=0; j--) {
                    int count = dictionary.size()-1;
                    boolean flag = false;
                    while(count>=0 && flag==false) {
                        if(strArray[j].equals(dictionary.get(count).split(": ")[1]) && dictionary.get(count).split(": ")[0].equals("Name")) {
                            previousName = strArray[j];
                            flag = true; }
                        count--;
                    } }
                strArray[i] = previousName; } }
        return Arrays.toString(strArray).replaceAll("\\[", "").replaceAll("\\,", "").replaceAll("\\]", "");
    }

It takes in my text

String text = "Karla was a bird and she had beautifully colorful feathers. She lived in a tall tree.";

And a "dictionary"

ArrayList<String> dictionary = new ArrayList<>();
        dictionary.add("Name: hunter");
        dictionary.add("Name: Karla");
        dictionary.add("Noun: hawk");
        dictionary.add("Noun: feathers");
        dictionary.add("Noun: tree");
        dictionary.add("Noun: arrows");
        dictionary.add("Verb: was a");
        dictionary.add("Verb: had");
        dictionary.add("Verb: missed");
        dictionary.add("Verb: knew");
        dictionary.add("Verb: offered");
        dictionary.add("Verb: pledged");
        dictionary.add("Verb: shoot");

But it always outputs Karla in this example, even if the same string also contains "The hunter shot his gun". Any help on why this isn't working would be appreciated.

  • If I understand correctly, this could prove challenging, take for example `Emma talked to Karla, She told her ...`, this could mean 2 things, `Emma talked to Karla, Emma told her ...` OR `Emma talked to Karla, Karla told her ...`. Which is it ? – BaSsGaz Oct 17 '17 at 18:35
  • @BaSsGaz That's a problem I'm yet to even look at (however it will be a problem down the line). For now it's more so just about getting the actual pronoun method working as expected. And in your example when I do reach that problem I'll be treating issues like that with "Emma talked to Karla, Emma told Karla" –  Oct 17 '17 at 18:38

1 Answer

This isn't working because you continue looping over j even after you've found a match in the dictionary. That is - you keep looking back towards the beginning of the string, and eventually find "Karla", even though you've already matched "hunter".

There are many ways you could fix this. One very simple one would be to move boolean flag = false; up to before the for loop over j, and change that loop's condition from j >= 0 to j >= 0 && !flag, so that you stop looking backwards as soon as flag is true. Like so:

public static String replacePronouns(String text, ArrayList<String> dictionary) {
        String[] strArray = text.replaceAll("\\.", " .").replaceAll("\\,", "").split("\\s+");
        String previousName = "";
        for (int i = 0; i < strArray.length; i++) {
            boolean flag = false;
            // we'll have to set this to be more dynamic -> change to pronouns in dictionary
            if (strArray[i].equals("His") || strArray[i].equals("She") || strArray[i].equals("she") || strArray[i].equals("him") || strArray[i].equals("he") || strArray[i].equals("her")) {
                for (int j = (i - 1); j >= 0 && flag == false; j--) {
                    int count = dictionary.size() - 1;
                    while (count >= 0) {
                        if (strArray[j].equals(dictionary.get(count).split(": ")[1]) && dictionary.get(count).split(": ")[0].equals("Name")) {
                            previousName = strArray[j];
                            flag = true;
                        }
                        count--;
                    }
                }
                strArray[i] = previousName;
            }
        }
        return Arrays.toString(strArray).replaceAll("\\[", "").replaceAll("\\,", "").replaceAll("\\]", "");
    }

If you placed your } characters in a more standard way, this kind of error would be easier to see.
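
For comparison, here is a minimal sketch of the same "stop at the nearest name" idea using break instead of a flag. It is not the asker's method verbatim: the class name PronounReplacer, the PRONOUNS set, and the case-insensitive pronoun match are assumptions added for illustration. It also differs in one behavior: when no name precedes a pronoun, the pronoun is left unchanged rather than replaced with the previously found name.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class PronounReplacer {

    // Assumption: a fixed pronoun list, matched case-insensitively.
    // The original method hard-codes a handful of exact spellings instead.
    private static final Set<String> PRONOUNS =
            new HashSet<>(Arrays.asList("he", "him", "his", "she", "her"));

    public static String replacePronouns(String text, List<String> dictionary) {
        // Collect the "Name: ..." entries once, rather than splitting
        // every dictionary entry again for every token.
        Set<String> names = new HashSet<>();
        for (String entry : dictionary) {
            String[] parts = entry.split(": ", 2);
            if (parts.length == 2 && parts[0].equals("Name")) {
                names.add(parts[1]);
            }
        }

        // Same tokenisation as the original: detach full stops, drop commas.
        String[] tokens = text.replaceAll("\\.", " .").replaceAll(",", "").split("\\s+");

        for (int i = 0; i < tokens.length; i++) {
            if (!PRONOUNS.contains(tokens[i].toLowerCase(Locale.ROOT))) {
                continue;
            }
            // Walk backwards and stop at the NEAREST preceding name.
            for (int j = i - 1; j >= 0; j--) {
                if (names.contains(tokens[j])) {
                    tokens[i] = tokens[j];
                    break; // without this, the search would run on to the earliest name
                }
            }
        }
        return String.join(" ", tokens);
    }

    public static void main(String[] args) {
        List<String> dictionary = Arrays.asList("Name: hunter", "Name: Karla", "Noun: tree");
        String text = "Karla was a bird. She lived in a tall tree. The hunter shot his gun.";
        // Prints: Karla was a bird . Karla lived in a tall tree . The hunter shot hunter gun .
        System.out.println(replacePronouns(text, dictionary));
    }
}
```

Like the original, this substitutes the bare name even for possessives, so "his gun" becomes "hunter gun"; handling that would need extra logic that is outside the scope of the question.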

BaSsGaz
Dawood ibn Kareem