11

I'm reading a CSV file downloaded from Google Trends. Here are the contents of the file when opened in Notepad (first two lines only):

ferrari ferrari (std error)
0.735 2%

When I read the file using readLine, each line comes back with a space between every character. For the lines above, the output is:

f e r r a r i f e r r a r i ( s t d e r r o r )
0 . 7 3 5 2 %

(There are tabs between "ferrari" and "ferrari" and between 0.735 and 2%, which Stack Overflow is not showing.)

The newline at the end of each line is also read twice. Why is that, and is there a solution?

Here is the code I'm using to read the file:

BufferedReader Reader = new BufferedReader(new FileReader("trend.csv"));
String line = null;
while ((line = Reader.readLine()) != null)
    System.out.println(line);

Edit: there are also some strange characters read at the start of the file.

Edit: Got the solution.

It was an encoding problem. I changed the first line to:

BufferedReader Reader = new BufferedReader(new InputStreamReader(new FileInputStream("trend.csv"), "UTF-16"));
Uzair Farooq
  • I ran your exact code on my machine and it printed correctly. What environment are you running that in? Windows 7, Eclipse Helios here. – Logan Jan 05 '12 at 04:47
  • I'm using windows 7 and eclipse. You've copied the file from my question. Use this file: http://www.google.com/trends/viz?q=ferrari&date=2011-9&geo=all&graph=all_csv&sort=0&scale=1&sa=N – Uzair Farooq Jan 05 '12 at 04:50
  • @Jonathan Wood you were right. thanks – Uzair Farooq Jan 05 '12 at 04:56
  • 1
    Since Santhosh Reddy Mandad's answer solved your problem, you should click a checkmark near his answer to make it "accepted". This will greatly help both people looking for answers and people looking for unanswered questions, as this will confirm correct answer and remove question from unanswered list. – Oleg V. Volkov Jul 09 '12 at 14:01

2 Answers

17

It is due to the character encoding. I've just downloaded the file from Trends and tried it, and it had the same problem.

I got around this by reading the file with the UTF-16 character set.

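import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.InputStreamReader;

public class TrendReader
{
    public static void main(String args[]) throws Exception
    {
        // FileReader decodes with the default charset, which garbles this file:
        //BufferedReader Reader = new BufferedReader(new FileReader("trends.csv"));
        // Name the file's actual encoding (UTF-16) explicitly instead.
        BufferedReader Reader = new BufferedReader(new InputStreamReader(new FileInputStream("trends.csv"), "UTF-16"));
        String line = null;
        while ((line = Reader.readLine()) != null)
        {
            System.out.println(line);
        }
        Reader.close();
    }
}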
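
To see why the wrong charset produces the symptoms in the question, here is a small sketch that needs no file. It assumes the export is UTF-16 little-endian with a byte-order mark, and it uses ISO-8859-1 as a stand-in for a single-byte default charset; the class and method names are made up for illustration. Each ASCII character in UTF-16 is followed by a zero byte, so a single-byte decoder yields a NUL after every letter (many consoles show it as a space), and the two BOM bytes become the "strange characters" at the start. The same thing splits "\r\n" into "\r", NUL, "\n", which readLine treats as two line endings.

```java
import java.nio.charset.StandardCharsets;

public class Utf16MisreadDemo {
    // Build the bytes a UTF-16LE file with a BOM (FF FE) would contain.
    // Assumption: this mirrors the layout of the downloaded CSV.
    public static byte[] utf16LeWithBom(String text) {
        byte[] body = text.getBytes(StandardCharsets.UTF_16LE);
        byte[] out = new byte[body.length + 2];
        out[0] = (byte) 0xFF;
        out[1] = (byte) 0xFE;
        System.arraycopy(body, 0, out, 2, body.length);
        return out;
    }

    // Wrong: one char per byte, so every zero byte becomes a NUL char.
    public static String decodeAsSingleByte(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    // Right: the UTF-16 decoder reads the BOM to pick the byte order
    // and does not include it in the decoded text.
    public static String decodeAsUtf16(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_16);
    }

    public static void main(String[] args) {
        byte[] bytes = utf16LeWithBom("ferrari");
        // Replace NUL with a space so the gap is visible on any console.
        System.out.println(decodeAsSingleByte(bytes).replace('\0', ' '));
        System.out.println(decodeAsUtf16(bytes));
    }
}
```

The first line printed shows two junk characters followed by the spaced-out word; the second is the clean text.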
0

You need to check the encoding of the file and specify that encoding when reading it. For example, for a UTF-8 file:

BufferedReader Reader = new BufferedReader(new InputStreamReader(new FileInputStream("trends.csv"), "UTF-8"));

If you expect the file to be in UTF-8, change the encoding of the file rather than your code. It's pretty simple: open the file in any open-source tool that reads CSV, such as OpenOffice, and specify the encoding while opening it :)
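
One rough way to check the encoding from code is to look at the first few bytes for a byte-order mark. The sketch below is only a heuristic under stated assumptions: the class name BomSniffer and its methods are hypothetical, it recognises only the UTF-8 and UTF-16 marks, and a file without a BOM simply falls back to whatever charset the caller passes in.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class BomSniffer {
    // Guess a charset from the leading bytes; return fallback if no BOM is found.
    public static Charset detect(byte[] head, Charset fallback) {
        if (head.length >= 3 && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB && (head[2] & 0xFF) == 0xBF) {
            return StandardCharsets.UTF_8;
        }
        if (head.length >= 2) {
            int b0 = head[0] & 0xFF;
            int b1 = head[1] & 0xFF;
            // Either byte order: Java's "UTF-16" decoder reads the BOM itself.
            if ((b0 == 0xFF && b1 == 0xFE) || (b0 == 0xFE && b1 == 0xFF)) {
                return StandardCharsets.UTF_16;
            }
        }
        return fallback;
    }

    // Read up to three bytes from the start of the file and sniff them.
    public static Charset detectFile(String path, Charset fallback) throws IOException {
        byte[] head = new byte[3];
        int n = 0;
        try (InputStream in = new FileInputStream(path)) {
            while (n < head.length) {
                int r = in.read(head, n, head.length - n);
                if (r < 0) break; // file shorter than three bytes
                n += r;
            }
        }
        return detect(Arrays.copyOf(head, n), fallback);
    }

    public static void main(String[] args) throws IOException {
        // "trends.csv" is the file name used in the answers above.
        String path = args.length > 0 ? args[0] : "trends.csv";
        System.out.println(detectFile(path, Charset.defaultCharset()));
    }
}
```

The returned Charset can be passed straight to the InputStreamReader constructor in place of the hard-coded name.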

IKavanagh