I got a challenge to write an algorithm in Java that calculates how many possible DNA chains a string can form. The string can contain these 5 characters: A, G, C, T and ?.
A ? in the string can stand for A, G, C or T, but it may not create a pair of identical adjacent characters. For example, in the string "A?G" the ? can only be C or T. There can be any number of consecutive question marks, since they all become letters in the end.
The function signature is this:
public static int chains(String base) {
    // return the amount of chains
}
If the base string is "A?C?", there are 6 possible combinations: AGCA, AGCG, AGCT, ATCA, ATCG, ATCT.
More cases (input - expected result): (??? - 36) (AGAG - 1) (A???T - 20)
(? - 4) (A? - 3) (?A - 3) (?? - 12) (A?A - 3) (A?C - 2) ...
The maximum length of the given base string (called pohja in my code) is 10.
Criteria: 1. Combinations that contain the same character twice in a row are illegal, so they don't count.
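To make the rules concrete, here is a minimal brute-force sketch that tries every letter for every ? and skips fillings that would repeat the previous character. The class name BruteForceChains and the helper fill are made up for illustration, and it assumes that a base string which already contains two equal neighbours (such as "AA") counts as 0, which the challenge text does not state explicitly. It is only meant as a reference for checking the cases above, which is feasible because the string is at most 10 characters long.

```java
final class BruteForceChains {
    private static final char[] LETTERS = {'A', 'G', 'C', 'T'};

    // Counts the ways to replace every '?' so that no two neighbours are equal.
    static int chains(String base) {
        return fill(base.toCharArray(), 0);
    }

    // Everything left of pos is already a concrete letter when this is called.
    private static int fill(char[] arr, int pos) {
        if (pos == arr.length) {
            return 1; // reached the end without creating a pair
        }
        if (arr[pos] != '?') {
            // Fixed letter: assumed to invalidate the chain if it repeats its left neighbour.
            if (pos > 0 && arr[pos] == arr[pos - 1]) {
                return 0;
            }
            return fill(arr, pos + 1);
        }
        int count = 0;
        for (char c : LETTERS) {
            if (pos > 0 && arr[pos - 1] == c) {
                continue; // would repeat the previous character
            }
            arr[pos] = c;
            count += fill(arr, pos + 1);
        }
        arr[pos] = '?'; // restore the placeholder for the caller's next attempt
        return count;
    }

    public static void main(String[] args) {
        System.out.println(chains("A?C?")); // prints 6
    }
}
```

Running it against the listed cases should give the same numbers, so it can serve as a check while working on a faster version.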
What I have so far:
public static int chains(String pohja) {
    int sum = 1;
    int length = pohja.length();
    char[] arr = pohja.toCharArray();
    int questionMarks = 0;
    if (length == 1) {
        if (pohja.equals("?"))
            return 4;
        else
            return 1;
    } else if (length == 2) {
        boolean allQuestionMarks = true;
        for (int i = 0; i < 2; i++) {
            if (arr[i] != '?')
                allQuestionMarks = false;
            else
                questionMarks++;
        }
        if (allQuestionMarks) return 12;
        if (questionMarks == 1) {
            return 3;
        } else {
            return 2;
        }
    } else {
        questionMarks = 0;
        for (int i = 0; i < length; i++) {
            if (arr[i] == '?') questionMarks++;
        }
        for (int i = 1; i < length - 1; i++) {
            boolean leftIsLetter = isLetter(arr[i - 1]);
            boolean rightIsLetter = isLetter(arr[i + 1]);
            boolean sameSides = false;
            if (arr[i - 1] == arr[i + 1]) sameSides = true;
            if (arr[i] != '?') { // middle is char
                if (leftIsLetter && rightIsLetter) { // letter(left) == letter(right)
                    if (sameSides) {
                        // Do nothing!
                    } else {
                        sum *= 3;
                    }
                } else if (!leftIsLetter && !rightIsLetter) { // !letter(left) == !letter(right)
                } else { // letter(left) != letter(right)
                }
            } else { // Middle is ?
                if (leftIsLetter && rightIsLetter) { // letter(left) == letter(right)
                    if (sameSides) {
                        sum *= 3;
                    } else {
                        sum *= 2;
                    }
                } else if (!leftIsLetter && !rightIsLetter) { // !letter(left) == !letter(right)
                    sum *= 9;
                } else { // letter(left) != letter(right)
                    if (arr[i - 1] == '?') { // ? is on the left
                    } else { // ? is on the right
                        sum *= 2;
                    }
                }
            }
        }
    }
    return sum;
}
public static boolean isLetter(char c) {
    boolean isLetter = false;
    char[] dna = { 'A', 'G', 'C', 'T' };
    for (int i = 0; i < 4; i++) {
        if (c == dna[i]) isLetter = true;
    }
    return isLetter;
}
Yeah, I know, my code is a mess. If the length of pohja (base) is 3 or more, my algorithm checks 3 characters at a time and modifies sum depending on those three characters.
Could anyone give a hint on how I can solve this? :) Thanks in advance, TuukkaX.
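For comparison with the three-character window, one possible direction is to walk the string left to right and keep, for each of the four letters, the number of valid fillings of the prefix that end in that letter. This is only a sketch under that assumption, not a verified solution to the challenge: the class name ChainsDp is hypothetical, an already repeated pair such as "AA" is assumed to give 0, and the empty string is not handled specially.

```java
final class ChainsDp {
    private static final String LETTERS = "AGCT";

    static int chains(String base) {
        // ways[k] = number of valid fillings of the prefix read so far
        // whose last character is LETTERS.charAt(k)
        int[] ways = new int[4];
        for (int i = 0; i < base.length(); i++) {
            char c = base.charAt(i);
            int[] next = new int[4];
            for (int k = 0; k < 4; k++) {
                if (c != '?' && c != LETTERS.charAt(k)) {
                    continue; // this position is fixed to a different letter
                }
                if (i == 0) {
                    next[k] = 1; // a single character can never form a pair
                    continue;
                }
                // The previous character may be any letter except letter k itself.
                for (int j = 0; j < 4; j++) {
                    if (j != k) {
                        next[k] += ways[j];
                    }
                }
            }
            ways = next;
        }
        int total = 0;
        for (int w : ways) {
            total += w;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(chains("A???T")); // prints 20
    }
}
```

Because each step only looks at the counts from the previous position, the work grows linearly with the length of the string, and the largest possible result for 10 characters (4 * 3^9 = 78732) fits comfortably in an int.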