
I have a CSV file full of data downloaded from Fitbit. The data inside the CSV file follows a basic format:

<Type of Data>
<Columns-comma-separated>
<Data-related-to-columns>

Here is a small example of the layout of the file:

Activities
Date,Calories Burned,Steps,Distance,Floors,Minutes Sedentary,Minutes Lightly Active,Minutes Fairly Active,Minutes Very Active,Activity Calories
"2016-07-17","3,442","9,456","4.41","12","612","226","18","44","1,581"
"2016-07-18","2,199","7,136","3.33","10","370","93","12","46","1,092"
...other logs
Sleep
Date,Minutes Asleep,Minutes Awake,Number of Awakenings,Time in Bed
"2016-07-17","418","28","17","452"
"2016-07-18","389","26","10","419"

Now, I am using CSVParser from the Apache Commons CSV library to go through this data. My goal is to turn this into Java objects that can turn the relevant data into JSON (I need the JSON to upload to a different website). CSVParser has an iterator that I can use to iterate through the CSVRecords in the file. So, essentially, I have a "list" of all of the data.
Because the file contains different types of data (Sleep logs, Activity logs, etc), I need to get a subsection/sub-list of the file, and pass it into a class to analyse it.

I need to iterate over the list and look for the keyword that identifies a new section of the file (e.g. Activities, Foods, Sleep, etc). Once I have identified what the next part of the file is, I need to select all of the following rows up until the next category.

Now, for the actual question: I don't know how to use an iterator to get the equivalent of List.subList(). Here is what I have been trying:

while (iterator.hasNext())
{
    CSVRecord current = iterator.next();
    if (current.get(0).equals("Activities"))
    {
        iterator.next(); //Columns
        while (iterator.hasNext() && iterator.next().get(0).isData()) //isData isn't real, but I can't figure out what I need to do.
        {
            //How do I sublist it here?
        }
    }
}

So, I need to determine if the next CSVRecord begins with a quote/has data, and loop until I find the next category, and finally pass a subsection of the file (using the iterator) to another function to do something with the correct log.

Edit

I considered converting it first to a List with a while loop, and then sub-listing, but that seemed wasteful. Correct me if I am wrong.

Also, I can't assume that each section will have the same number of rows following it. They might be similar, but there are also the food logs, which follow a completely different pattern. Here are two different days. Foods follows the normal pattern, but the Food Logs do not.

Foods
Date,Calories In
"2016-07-17","0"
"2016-07-18","1,101"

Food Log 20160717
Daily Totals
"","Calories","0"
"","Fat","0 g"
"","Fiber","0 g"
"","Carbs","0 g"
"","Sodium","0 mg"
"","Protein","0 g"
"","Water","0 fl oz"

Food Log 20160718
Meal,Food,Calories
"Lunch"
"","Raspberry Yogurt","190"
"","Almond Sweet & Salty Granola Bar","140"
"","Goldfish Baked Snack Crackers, Cheddar","140"
"","Bagels, Whole Wheat","190"
"","Braided Twists Honey Wheat Pretzels","343"
"","Apples, raw, gala, with skin - 1 medium","98"
"Daily Totals"
"","Calories","1,101"
"","Fat","21 g"
"","Fiber","13 g"
"","Carbs","202 g"
"","Sodium","1,538 mg"
"","Protein","28 g"
"","Water","24 fl oz"
Cache Staheli
  • If I were in your place I would use something that is friendly to structured data, like BeanIO. This API is too low level for my taste. – Alexander Petrov Jul 18 '16 at 20:26
  • On another note, if you want to use List.subList(), why don't you create an ArrayList and fill it with the contents of the Iterator? It is a bit of a naive solution, but you already said you are ready to use subList. – Alexander Petrov Jul 18 '16 at 20:27
  • @AlexanderPetrov I considered it, but it seemed like it would be wasteful. Regardless, the other part of my question stands. I'm unsure how to figure out what portions of the file to sublist. – Cache Staheli Jul 18 '16 at 20:29
  • If I understand correctly "2016-07-17","3,442","9,456","4.41","12","612","226","18","44","1,581" is a separate CSVRecord and "2016-07-18","2,199","7,136","3.33","10","370","93","12","46","1,092" is also a separate CSV record so essentially you have a list section after every 2 non list sections. Also every List section starts with a Date. – Alexander Petrov Jul 18 '16 at 20:37
  • @AlexanderPetrov Correct. Each is its own `CSVRecord`. However, the sections won't always have the same number of rows. – Cache Staheli Jul 18 '16 at 20:39
  • Well the CSVRecord is Iterator as well. The fact that the column number is not fixed is not an issue as long as all columns in the row are of the same type and are relevant only to this list. – Alexander Petrov Jul 18 '16 at 20:42
  • @AlexanderPetrov how is the `CSVRecord` an Iterator as well? – Cache Staheli Jul 18 '16 at 20:44
  • Sorry my bad. It is not Iterator, but you have a method under CSVRecord Iterator iterator(); And you can obtain an iterator over all columns defined in this record. https://commons.apache.org/proper/commons-csv/apidocs/org/apache/commons/csv/CSVRecord.html – Alexander Petrov Jul 18 '16 at 20:47
  • Also you have size() not that it matters. – Alexander Petrov Jul 18 '16 at 20:48

1 Answer


The easiest way to do what you want is to simply remember the previous category's data, and when you hit a new category, process that data and reset for the next category. This should work:

//needs java.util.ArrayList, java.util.List, org.apache.commons.csv.CSVRecord,
//and CollectionUtils from Apache Commons Collections
String categoryName = null;
List<List<String>> categoryData = new ArrayList<>();
while (iterator.hasNext()) {
    CSVRecord current = iterator.next();
    if (current.size() == 1) { //single-column row: start of next category
        processCategory(categoryName, categoryData); //flush what was collected so far
        categoryName = current.get(0);
        categoryData.clear(); //only safe if processCategory doesn't keep a reference to the list
        if (iterator.hasNext()) {
            iterator.next(); //skip header
        }
    } else { //category data
        List<String> rowData = new ArrayList<>(current.size());
        CollectionUtils.addAll(rowData, current.iterator()); //uses Apache Commons Collections, but you can use whatever
        categoryData.add(rowData);
    }
}
processCategory(categoryName, categoryData); //last category of file

And then:

void processCategory(String categoryName, List<List<String>> categoryData) {
    if (categoryName != null) { //null only on the call made before the first category has been read, skip
        //do stuff
    }
}

The above assumes that a List<List<String>> is the data structure that you want to deal with, but you can tweak as you see fit. I might even recommend simply passing List<Iterable<String>> to the process method (CSVRecord implements Iterable<String>) and handling the row data there.
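
As a rough illustration of that variation, here is a sketch that keeps each row as an Iterable<String> instead of copying it. It is written against plain java.util types so it runs without Commons CSV; since CSVRecord implements Iterable<String>, passing the parser's iterator should work the same way. The names SectionSplitter, split, and columnCount are made up for this example, and it keeps the same "single-column row starts a category" assumption as the loop above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.BiConsumer;

public class SectionSplitter {

    // Walks the rows once and calls handler(categoryName, rows) per section.
    // Assumption (same as above): a row with exactly one column starts a new
    // category, and the row right after it is a header to skip.
    public static <R extends Iterable<String>> void split(
            Iterator<R> rows,
            BiConsumer<String, List<Iterable<String>>> handler) {
        String categoryName = null;
        List<Iterable<String>> categoryData = new ArrayList<>();
        while (rows.hasNext()) {
            R current = rows.next();
            if (columnCount(current) == 1) {
                if (categoryName != null) {
                    handler.accept(categoryName, categoryData);
                }
                categoryName = current.iterator().next();
                // Fresh list, so the handler may keep the previous one.
                categoryData = new ArrayList<>();
                if (rows.hasNext()) {
                    rows.next(); // skip the header row
                }
            } else {
                categoryData.add(current);
            }
        }
        if (categoryName != null) {
            handler.accept(categoryName, categoryData); // last section
        }
    }

    // Counts columns by iterating, so it works for any Iterable<String>.
    private static int columnCount(Iterable<String> row) {
        int n = 0;
        for (String ignored : row) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        List<List<String>> rows = Arrays.asList(
            Arrays.asList("Activities"),
            Arrays.asList("Date", "Calories Burned", "Steps"),
            Arrays.asList("2016-07-17", "3,442", "9,456"),
            Arrays.asList("2016-07-18", "2,199", "7,136"),
            Arrays.asList("Sleep"),
            Arrays.asList("Date", "Minutes Asleep"),
            Arrays.asList("2016-07-17", "418"));
        split(rows.iterator(), (name, data) ->
            System.out.println(name + " -> " + data.size()));
    }
}
```

Allocating a new list per section, rather than calling clear(), means the handler can hold on to the rows it was given without them changing underneath it.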

This can definitely be cleaned up further, but it should get you started.
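
One thing to watch with the size() == 1 check: in the Food Log sections from your edit, rows such as "Lunch" and "Daily Totals" are also single-column, so they would be mistaken for new categories. One possible way around that, assuming the section titles are limited to the ones shown in the question, is to test the value itself. The CategoryCheck class, the KNOWN set, and the isCategoryStart helper are hypothetical names for this sketch, and the list of titles would need adjusting if the export contains other sections.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class CategoryCheck {

    // Assumption: these are the only fixed section titles in the export.
    private static final Set<String> KNOWN =
        new HashSet<>(Arrays.asList("Activities", "Sleep", "Foods"));

    // True if a parsed row looks like a section title rather than a
    // single-column data row such as "Lunch" or "Daily Totals".
    public static boolean isCategoryStart(List<String> row) {
        if (row.size() != 1) {
            return false;
        }
        String value = row.get(0).trim();
        return KNOWN.contains(value) || value.startsWith("Food Log ");
    }

    public static void main(String[] args) {
        System.out.println(isCategoryStart(Arrays.asList("Sleep")));             // true
        System.out.println(isCategoryStart(Arrays.asList("Food Log 20160718"))); // true
        System.out.println(isCategoryStart(Arrays.asList("Lunch")));             // false
        System.out.println(isCategoryStart(Arrays.asList("", "Fat", "21 g")));   // false
    }
}
```

With a CSVRecord you would run the same test using current.size() and current.get(0) in place of the List calls.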

ach