
I want to prepare a dataset from the data available at http://stat.data.abs.gov.au/Index.aspx?DataSetCode=ATSI_BIRTHS_SUMM

Data API:

http://stat.data.abs.gov.au/restsdmx/sdmx.ashx/GetData/ATSI_BIRTHS_SUMM/1+4+5+7+8+9+10+13+14+15+18+19+20.IM+IB.0+1+2+3+4+5+6+7.A/all
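For reference, the key in that URL appears to be one `+`-joined list of codes per dimension, with the dimensions joined by `.`. A minimal sketch of how it can be assembled, assuming the dimension order is MEASURE, INDIGENOUS_STATUS, ASGS_2011, FREQUENCY (`build_data_url` is a hypothetical helper written for illustration, not part of pandasdmx):

```python
# Base of the ABS.Stat SDMX GetData endpoint, taken from the URL above.
BASE = "http://stat.data.abs.gov.au/restsdmx/sdmx.ashx/GetData"


def build_data_url(dataset_id, dimensions, suffix="all"):
    """Join codes with '+' inside a dimension and dimensions with '.'.

    Hypothetical helper: the dimension order is an assumption and must
    match the order defined in the dataset's data structure.
    """
    key = ".".join("+".join(str(code) for code in codes) for codes in dimensions)
    return f"{BASE}/{dataset_id}/{key}/{suffix}"


url = build_data_url(
    "ATSI_BIRTHS_SUMM",
    [
        [1, 4, 5, 7, 8, 9, 10, 13, 14, 15, 18, 19, 20],  # MEASURE (assumed)
        ["IM", "IB"],                                    # INDIGENOUS_STATUS (assumed)
        range(8),                                        # ASGS_2011 codes 0..7 (assumed)
        ["A"],                                           # FREQUENCY (assumed)
    ],
)
print(url)
```

This only builds the string; it does not contact the service.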

from pandasdmx import Request

Agency_Code = 'ABS'
Dataset_Id = 'ATSI_BIRTHS_SUMM'

ABS = Request(Agency_Code)
data_response = ABS.data(resource_id=Dataset_Id)
print(data_response.url)

DF = data_response.write(data_response.data.obs(with_values=True, with_attributes=True), parse_time=False)

The call above raises an error: ValueError: Type names and field names cannot be a keyword: 'None'

DF = data_response.write(data_response.data.series, parse_time=False)

This works, but the dimension items come out column-wise.

Supporting links:

http://stat.data.abs.gov.au/restsdmx/sdmx.ashx/GetDataStructure/all
http://stat.data.abs.gov.au/restsdmx/sdmx.ashx/GetDataStructure/ATSI_BIRTHS_SUMM
http://stat.data.abs.gov.au/Index.aspx?DataSetCode=ATSI_BIRTHS_SUMM

Please suggest a better way to retrieve the data.

Learnings

1 Answer


Your example

DF = data_response.write(data_response.data.series, parse_time=False)

produces a DataFrame with the dimension values spread across the columns. Calling unstack().reset_index() on it gives a "flat" DataFrame with one row per observation:

data_response.write().unstack().reset_index()
  MEASURE INDIGENOUS_STATUS ASGS_2011 FREQUENCY TIME_PERIOD       0
0       1                IM         0         A        2001  8334.0

Is this what you are looking for?
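To see what the reshaping does without calling the ABS service, here is a minimal pandas-only sketch. It assumes the frame returned by write() has TIME_PERIOD as its index and the dimensions as column levels; the numbers are placeholders, not ABS data, and the VALUE column name is my own choice.

```python
import pandas as pd

# Column levels mimic the SDMX dimensions (assumed layout of the write() result).
columns = pd.MultiIndex.from_tuples(
    [("1", "IM", "0", "A"), ("1", "IB", "0", "A")],
    names=["MEASURE", "INDIGENOUS_STATUS", "ASGS_2011", "FREQUENCY"],
)
index = pd.Index(["2001", "2002"], name="TIME_PERIOD")

# Placeholder values only; these are NOT real ABS figures.
wide = pd.DataFrame([[10.0, 20.0], [30.0, 40.0]], index=index, columns=columns)

# With a single-level index, unstack() returns a Series keyed by the
# column levels plus TIME_PERIOD; reset_index() turns those keys into columns.
flat = wide.unstack().reset_index()

# The value column is named 0 by default; give it a readable name.
flat = flat.rename(columns={0: "VALUE"})
print(flat)
```

Each row of flat is then one observation, with one column per dimension plus TIME_PERIOD and VALUE.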

Fredrik Erlandsson