I have a really big feed file with a lot of columns. I will represent one of the columns as a list of strings, and I want to check these strings.
Let's say we have these string values (in one column); the feed is obviously fictional :)
"Gi Joe Black Viper"
"Street Fighter...Ken"
"Mortal Kombat, Scorpion"
"Gi Joe Desert Fox"
"Mortal Kombat, Sub Zero"
"Street Fighter...Ryu"
I want to find the parts that these strings have in common. To simplify, the task is: find a substring of one string inside another string, and collect these substrings into a HashSet.
So basically the resulting tags would be:
Gi Joe
Mortal Kombat
Street Fighter
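To make the expected result concrete, here is a minimal sketch of what a "tag" could mean for a single pair of strings: the longest run of leading words that both strings share. The names `CommonPrefixSketch`, `words` and `commonWordPrefix` are made up for illustration, and splitting on anything that is not an ASCII letter or digit is an assumption about how the feed separates the title from the rest.

```java
import java.util.ArrayList;
import java.util.List;

class CommonPrefixSketch {
    // Assumption: words are separated by runs of non-alphanumeric characters,
    // so "Street Fighter...Ken" becomes [Street, Fighter, Ken].
    static String[] words(String s) {
        // Drop leading separators first so split() does not yield an empty first token.
        return s.replaceFirst("^[^\\p{Alnum}]+", "").split("[^\\p{Alnum}]+");
    }

    // Returns the leading words shared by both strings, joined with single spaces.
    static String commonWordPrefix(String a, String b) {
        String[] wa = words(a);
        String[] wb = words(b);
        List<String> shared = new ArrayList<>();
        int limit = Math.min(wa.length, wb.length);
        for (int k = 0; k < limit && wa[k].equals(wb[k]); k++) {
            shared.add(wa[k]);
        }
        return String.join(" ", shared);
    }

    public static void main(String[] args) {
        // prints "Mortal Kombat"
        System.out.println(commonWordPrefix("Mortal Kombat, Scorpion", "Mortal Kombat, Sub Zero"));
        // prints "Street Fighter"
        System.out.println(commonWordPrefix("Street Fighter...Ken", "Street Fighter...Ryu"));
    }
}
```

Two strings with no shared leading word give an empty string, which would simply not be collected as a tag.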
I wrote some simple code to test the algorithm, but I want to minimize the time complexity of this task; space complexity is not as important as time. (Think of a feed of about 10,000 lines, so low time complexity is crucial.) You can find my code below:
String[] stringArray = new String[6];
stringArray[0] = "Mortal Kombat - Scorpion";
stringArray[1] = "Street Fighter - Ken";
stringArray[2] = "Mortal Kombat - Scorpion";
stringArray[3] = "Gi Joe - Desert Fox";
stringArray[4] = "Gi Joe - Desert Dog";
stringArray[5] = "Street Fighter - Ryu";
// Needs: import java.util.HashSet; import java.util.Set;
Set<String> commonStrings = new HashSet<>();
for (int i = 0; i < stringArray.length; i++) {
    String[] splittedString = stringArray[i].split("[ ]");
    for (int j = 0; j < stringArray.length; j++) {
        if (j == i) continue; // never compare a string with itself
        String matchable = "";
        for (int k = 0; k < splittedString.length; k++) {
            // Grow the candidate one word at a time: "Gi", "Gi Joe", "Gi Joe -", ...
            matchable = (k == 0) ? splittedString[k] : matchable + " " + splittedString[k];
            // indexOf returns 0 for a match at the very start, so test >= 0, not > 0
            if (stringArray[j].indexOf(matchable) >= 0) {
                // Note: this collects every shared prefix, not only the longest one
                commonStrings.add(matchable);
            }
        }
    }
}
System.out.println(commonStrings);
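For comparison, one possible direction that avoids comparing every pair of rows, as a rough sketch and not a tested solution for the real feed: count how often each word prefix occurs in a HashMap, then keep, for each row, the longest prefix that occurs in at least two rows. The names `PrefixCountSketch`, `words` and `findTags` are hypothetical, and the tokenization (split on anything that is not an ASCII letter or digit) is an assumption. Under those assumptions the work grows roughly linearly with the number of rows, instead of quadratically as with the nested loops.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class PrefixCountSketch {
    // Assumption: words are separated by runs of non-alphanumeric characters.
    static String[] words(String s) {
        return s.replaceFirst("^[^\\p{Alnum}]+", "").split("[^\\p{Alnum}]+");
    }

    static Set<String> findTags(String[] rows) {
        // Pass 1: count how many times each word prefix occurs across all rows.
        Map<String, Integer> counts = new HashMap<>();
        for (String row : rows) {
            StringBuilder prefix = new StringBuilder();
            for (String w : words(row)) {
                if (prefix.length() > 0) prefix.append(' ');
                prefix.append(w);
                counts.merge(prefix.toString(), 1, Integer::sum);
            }
        }
        // Pass 2: for each row, keep its longest prefix that occurs at least twice.
        Set<String> tags = new HashSet<>();
        for (String row : rows) {
            StringBuilder prefix = new StringBuilder();
            String longest = null;
            for (String w : words(row)) {
                if (prefix.length() > 0) prefix.append(' ');
                prefix.append(w);
                // Once a prefix is unique, every longer prefix is unique as well.
                if (counts.get(prefix.toString()) < 2) break;
                longest = prefix.toString();
            }
            if (longest != null && !longest.isEmpty()) tags.add(longest);
        }
        return tags;
    }

    public static void main(String[] args) {
        String[] rows = {
            "Gi Joe Black Viper", "Street Fighter...Ken", "Mortal Kombat, Scorpion",
            "Gi Joe Desert Fox", "Mortal Kombat, Sub Zero", "Street Fighter...Ryu"
        };
        // Contains "Gi Joe", "Mortal Kombat" and "Street Fighter" (HashSet order is unspecified).
        System.out.println(findTags(rows));
    }
}
```

Exact duplicate rows count as two occurrences here, so a repeated row would become a tag in full; deduplicating the rows first would avoid that if it is not wanted.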
Any suggestions to make my code better are appreciated, thank you!