How to get the low and high counts of characters from a string?

Question

I am having trouble with the second part of this project. The code below gives the total count for each base, but I do not know how to get the highs and lows. Thanks in advance!

A1Adept

This program should process the input as A1Novice does, but in addition to producing the counts, it should also keep track of the DNA strand with the smallest and largest number of each of the nucleobases and print those strands to the output. So, given the following input:

A
CC
AATA
GGG
TTT
end

The program should produce the following output:

A count: 4
C count: 2
G count: 3
T count: 4
Low A count: A
High A count: AATA
Low C count: CC
High C count: CC
Low G count: GGG
High G count: GGG
Low T count: AATA
High T count: TTT

package a1;

import java.util.Scanner;

public class A1Novice {
    public static void main(String[] args){
        Scanner s = new Scanner(System.in);
        System.out.println("Enter nucleobases: (enter end when done)");
        process(s);
    }

    public static void process(Scanner s){
        int a = 0, c = 0, g = 0, t = 0;
        while(s.hasNext()){
            String id = s.next();
            if(id.equalsIgnoreCase("end")){
                break;
            }
            for(int i = 0; i < id.length(); i++){
                char singleChar = id.charAt(i);
                if (singleChar=='A' || singleChar=='a'){
                    a++;
                }
                else if(singleChar=='C' || singleChar=='c'){
                    c++;
                }
                else if(singleChar=='G' || singleChar=='g'){
                    g++;
                }
                else if(singleChar=='T' || singleChar=='t'){
                    t++;
                }

            }
        }  
        System.out.println("A count: " + a);
        System.out.println("C count: " + c);
        System.out.println("G count: " + g);
        System.out.println("T count: " + t);
    }
}

Since this is a school project, I think you should try harder and accomplish the goal by yourself. But here is a tip: keep the highest number of `A`s seen so far, and the string that contained them, in two variables; if the next string has more `A`s than the stored count, update both. Do the same for the other letters. – Stefano Sanfilippo, Jan 21 '14 at 22:08
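A minimal sketch of that tip, covering only the high `A` strand. The class name `HighATracker` and its methods are hypothetical, and keeping the first strand on a tie is an assumption; the assignment text does not say how ties are broken.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of the "two variables" tip; not part of the assignment code.
class HighATracker {
    // Counts how often base occurs in strand, ignoring case.
    static int count(String strand, char base) {
        char target = Character.toUpperCase(base);
        int n = 0;
        for (int i = 0; i < strand.length(); i++) {
            if (Character.toUpperCase(strand.charAt(i)) == target) {
                n++;
            }
        }
        return n;
    }

    // Returns the first strand with the most occurrences of base,
    // or null if no strand contains it.
    static String highStrand(List<String> strands, char base) {
        int bestCount = 0;        // highest count seen so far
        String bestStrand = null; // strand that produced bestCount
        for (String strand : strands) {
            int n = count(strand, base);
            if (n > bestCount) {  // strict '>' keeps the earlier strand on a tie (assumption)
                bestCount = n;
                bestStrand = strand;
            }
        }
        return bestStrand;
    }

    public static void main(String[] args) {
        List<String> strands = Arrays.asList("A", "CC", "AATA", "GGG", "TTT");
        System.out.println("High A count: " + highStrand(strands, 'A')); // prints "High A count: AATA"
    }
}
```

The low strand works the same way with the comparison reversed, starting from a count larger than any possible strand.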

score 0 · Answer 1 · answered Jan 21 '14 at 22:09

What you could do is write a class that stores the data for just one of the four bases:

the letter that represents that base,
the total count so far,
the line that so far has the most of that base (and what that number is),
the line that so far has the fewest of that base (and what that number is).

Instantiate four of these (you'll want to pass the letter in the constructor).

Then write a method in that class that takes a line of text as a parameter, and updates all the fields of the class according to that line. You'll also want some methods to display the fields of the class. Lastly, have nested loops in your main method (or some other method) that pass each line of text to each of the objects in turn.
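One possible shape for that design, as a sketch. The names `BaseTracker`, `A1AdeptSketch`, `update`, `countLine` and `lowHighLines` are made up for illustration. Two assumptions are baked in: strands that contain none of a base are ignored for the low and high (this is inferred from the sample output, where "Low A count" is `A` and not `CC`), and the first strand wins a tie.

```java
import java.util.Scanner;

// Hypothetical class: tracks one base across all strands seen so far.
class BaseTracker {
    private final char base;
    private int total = 0;
    private int highCount = 0;
    private String highLine = null; // stays null if no strand contains the base
    private int lowCount = Integer.MAX_VALUE;
    private String lowLine = null;

    BaseTracker(char base) {
        this.base = Character.toUpperCase(base);
    }

    // Updates the total, low and high fields for one strand.
    void update(String line) {
        int n = 0;
        for (int i = 0; i < line.length(); i++) {
            if (Character.toUpperCase(line.charAt(i)) == base) {
                n++;
            }
        }
        if (n == 0) {
            return; // assumption: strands without this base do not compete for low/high
        }
        total += n;
        if (n > highCount) { // strict comparisons keep the first strand on a tie
            highCount = n;
            highLine = line;
        }
        if (n < lowCount) {
            lowCount = n;
            lowLine = line;
        }
    }

    String countLine() {
        return base + " count: " + total;
    }

    String lowHighLines() {
        return "Low " + base + " count: " + lowLine + "\n"
             + "High " + base + " count: " + highLine;
    }
}

class A1AdeptSketch {
    // Reads strands until "end" and returns the report text.
    static String process(Scanner s) {
        BaseTracker[] trackers = {
            new BaseTracker('A'), new BaseTracker('C'),
            new BaseTracker('G'), new BaseTracker('T')
        };
        while (s.hasNext()) {
            String strand = s.next();
            if (strand.equalsIgnoreCase("end")) {
                break;
            }
            for (BaseTracker t : trackers) { // inner loop: each strand goes to each tracker
                t.update(strand);
            }
        }
        StringBuilder out = new StringBuilder();
        for (BaseTracker t : trackers) {
            out.append(t.countLine()).append('\n');
        }
        for (BaseTracker t : trackers) {
            out.append(t.lowHighLines()).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(process(new Scanner(System.in)));
    }
}
```

Fed the sample input from the question, this sketch produces the twelve output lines listed there. Because each tracker only remembers its current best strands, no input needs to be stored.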

I'm not going to write your code for you, though. Stack Overflow doesn't pay me enough.

score 0 · Answer 2 · Alex Goja · 2014-01-21T22:35:45.700

I might be overthinking this, but here I go. My initial thought would be to store each line of input in an ArrayList, reading with hasNextLine() instead of hasNext(). This way you would end up with an ArrayList, arrInput, with the contents {"A","CC","AATA","GGG","TTT"}. Now you create an array, let's say processArray, that has the same size as arrInput. Each entry of processArray is another array of length 4 (assuming that only the letters A, C, T and G can occur in the input), which stores the number of A's, C's, T's and G's for that line of input. I've attached a graphical representation of the concept, but again, like I said before, I think I'm overthinking this.
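A rough sketch of that table, assuming the names `arrInput` and `processArray` from the paragraph above. The class `CountTable`, the helper `rowOfMax`, and the column order A, C, G, T are assumptions made for the example. Skipping rows with a zero count follows the sample output in the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration of the per-line count table.
class CountTable {
    static final String BASES = "ACGT"; // column order chosen for this sketch

    // Builds one row per input line; each row holds the counts of A, C, G, T.
    static int[][] build(List<String> lines) {
        int[][] table = new int[lines.size()][BASES.length()];
        for (int row = 0; row < lines.size(); row++) {
            String line = lines.get(row).toUpperCase(Locale.ROOT);
            for (int i = 0; i < line.length(); i++) {
                int col = BASES.indexOf(line.charAt(i));
                if (col >= 0) { // characters other than A, C, G, T are ignored
                    table[row][col]++;
                }
            }
        }
        return table;
    }

    // Returns the index of the first row with the largest non-zero count
    // in the given column, or -1 if the whole column is zero.
    static int rowOfMax(int[][] table, int col) {
        int best = -1;
        for (int row = 0; row < table.length; row++) {
            int n = table[row][col];
            if (n > 0 && (best < 0 || n > table[best][col])) {
                best = row;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> arrInput = new ArrayList<>(Arrays.asList("A", "CC", "AATA", "GGG", "TTT"));
        int[][] processArray = build(arrInput);
        System.out.println(Arrays.toString(processArray[2])); // prints [3, 0, 0, 1] for "AATA"
        System.out.println("High A count: " + arrInput.get(rowOfMax(processArray, 0)));
    }
}
```

Since the row index in `processArray` matches the index in `arrInput`, the winning row leads straight back to the text of the strand.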

edited Jan 21 '14 at 22:35

answered Jan 21 '14 at 22:28


This is good, but it only solves part of the problem. We also need to know the text of the line with the most and the fewest of each letter. – Dawood ibn Kareem Jan 21 '14 at 23:03
What if we extend processArr by 2 entries with the same format, so that instead of indexes 0 to 4 it has indexes 0 to 6? I will explain why; I hope it makes sense. We use two variables, indexOfMaxOccurrence and numberOfMaxOccurrences. In entry 5 of processArr we store indexOfMaxOccurrence (which will be between 0 and 4), and in entry 6 we store numberOfMaxOccurrences. Now we have both the maximum count and the index of the line where it occurred, which can be used to get the text. – Alex Goja Jan 24 '14 at 00:19
