I'm trying to read the header values for each record in a CSV file hosted on GitHub, using the Apache Commons CSV library.

This is my code:

@Service
public class CoronaVirusDataService {

    private static String virus_data_url = "https://raw.githubusercontent.com/CSSEGISandData/COVID-19/Aysen_Chile_07032021/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv";
    
    @PostConstruct
    public void getVirusData()
    {
        try
        {
        URL url = new URL(virus_data_url);
        HttpURLConnection con = (HttpURLConnection) url.openConnection();
        BufferedReader in = new BufferedReader( new InputStreamReader(con.getInputStream()));
        
        while((in.readLine()) != null)
        {
            StringReader csvReader = new StringReader(in.readLine());
            Iterable<CSVRecord> records = CSVFormat.DEFAULT.withFirstRecordAsHeader().parse(csvReader);
            for (CSVRecord record : records) {
                String country = record.get("Country/Region");
                System.out.println(country);
            }       
        }
        in.close();
        }
        catch(Exception e) 
        {
            e.printStackTrace();
        }
    }
}

When I run the application, I get this error:

java.lang.IllegalArgumentException: A header name is missing in [, Afghanistan, 33.93911, 67.709953, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 2, 4, 4, 4, 4, 5, 7, 8, 11, 12, 13, 15, 16, 18, 20, 24, 25, 29, 30, 34, 41, 43, 76, 80, 91, 107, 118, 146, 175, 197, 240, 275, 300, 338, 368, 424, 445, 485, 532, 556, 608, 666, 715, 785, 841, 907, 934, 997, 1027, 1093]
at org.apache.commons.csv.CSVParser.createHeaders(CSVParser.java:501)
at org.apache.commons.csv.CSVParser.<init>(CSVParser.java:412)
at org.apache.commons.csv.CSVParser.<init>(CSVParser.java:378)
at org.apache.commons.csv.CSVFormat.parse(CSVFormat.java:1157)
at com.p1.Services.CoronaVirusDataService.getVirusData(CoronaVirusDataService.java:34)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
Rahul

2 Answers

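Don't read the file line by line if you want the first line to serve as the header. Each call to parse() builds a new parser that treats the first record it sees as the header, so every line you hand it is read as a header row. Your loop also calls readLine() twice per iteration, so the real header line is discarded and the first line that reaches the parser is the Afghanistan row. Its first field is empty, which is why you get "A header name is missing". Instead, pass the reader itself to parse() so the whole file is parsed once. The code below works: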

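import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

import javax.annotation.PostConstruct;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVRecord;
import org.springframework.stereotype.Service;

@Service
public class CoronaVirusDataService {

    private static String virus_data_url = "https://raw.githubusercontent.com/CSSEGISandData/COVID-19/Aysen_Chile_07032021/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv";

    @PostConstruct
    public void getVirusData()
    {
        try
        {
            URL url = new URL(virus_data_url);
            HttpURLConnection con = (HttpURLConnection) url.openConnection();
            // try-with-resources closes the reader even if parsing fails
            try (BufferedReader in = new BufferedReader(new InputStreamReader(con.getInputStream(), StandardCharsets.UTF_8)))
            {
                // Hand the whole stream to the parser once, so only the
                // first line of the file is used as the header
                Iterable<CSVRecord> records = CSVFormat.DEFAULT.withFirstRecordAsHeader().parse(in);
                for (CSVRecord record : records) {
                    String country = record.get("Country/Region");
                    System.out.println(country);
                }
            }
        }
        catch(Exception e)
        {
            e.printStackTrace();
        }
    }
}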
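
To see why the question's loop never hands the real header to the parser, here is a minimal sketch that uses only the JDK and no CSV library. The class name ReadLineDemo, the method linesSeenByLoopBody and the four-line sample are made up for illustration; the sample only imitates the shape of the real file. The null check inside the loop is also an addition, so the demo stays safe with an odd number of lines.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

class ReadLineDemo {

    // Hypothetical sample: a header line plus three data rows,
    // each with an empty first field like the rows in the error message.
    static final String SAMPLE =
            "Province/State,Country/Region,Lat,Long\n"
          + ",Afghanistan,33.93911,67.709953\n"
          + ",Albania,41.1533,20.1683\n"
          + ",Algeria,28.0339,1.6596\n";

    // Mirrors the loop from the question and records which lines
    // actually reach the loop body (the lines that were passed to parse()).
    static List<String> linesSeenByLoopBody(String csv) throws IOException {
        List<String> seen = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(new StringReader(csv))) {
            while (in.readLine() != null) {   // reads a line and throws it away
                String line = in.readLine();  // reads the NEXT line
                if (line == null) {
                    break;                    // guard added for this demo only
                }
                seen.add(line);
            }
        }
        return seen;
    }

    public static void main(String[] args) throws IOException {
        for (String line : linesSeenByLoopBody(SAMPLE)) {
            System.out.println(line);
        }
    }
}
```

With this sample the loop body only sees the Afghanistan and Algeria rows. The header line and the Albania row are consumed by the while condition, and the first line that would be parsed as a header starts with a comma, so its first header name is empty.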
Wai Ha Lee
Sujit Sharma

You want to parse a standard CSV file with a header row, fetched over HTTP. Doing the parsing by hand in Java takes a fair amount of code, but it is simple with SPL, an open-source Java package. One line is enough:

    A
1   =httpfile("https://raw.githubusercontent.com/CSSEGISandData/COVID-19/Aysen_Chile_07032021/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv").import@ct(Country/Region)

SPL provides a JDBC driver that Java can call. Save the SPL script above as httpcsv.splx and invoke it from Java the way you would call a stored procedure:

…
Class.forName("com.esproc.jdbc.InternalDriver");
con= DriverManager.getConnection("jdbc:esproc:local://");
st=con.prepareCall("call httpcsv()");
st.execute();
…

Or execute the SPL string directly in a Java program, the way you would execute a SQL statement:

…
st = con.prepareStatement("==httpfile(\"https://raw.githubusercontent.com/CSSEGISandData/COVID-19/Aysen_Chile_07032021/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv\").import@ct(Country/Region)");
st.execute();
…