7
import pandas as pd

df = pd.read_stata('file.dta')
for col in df.columns:
    name = col.lower()
    dtype = df[col].dtype  # named dtype so the builtin `type` is not shadowed
    #label = ...

I need to get the label/description of each column in Python.

Nick Cox
Alam

3 Answers

9

In pandas 0.22, you can also get the labels by creating an iterator:

import pandas as pd
itr = pd.read_stata('file.dta', iterator=True)
itr.variable_labels()

This will return a dictionary where the keys are variable names and the values are variable labels. I think this is easier to remember than pd.io.stata.StataReader.
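To tie this back to the loop in the question, here is a minimal sketch that pairs each column's name and dtype with its label. The helper name `describe_stata_columns` and the throwaway `example.dta` file are made up for illustration, and the sketch assumes a reasonably recent pandas in which the reader can be used as a context manager.

```python
import os
import tempfile

import pandas as pd


def describe_stata_columns(path):
    """Return one (name, dtype, label) tuple per column of a Stata file.

    Hypothetical helper, not part of pandas.
    """
    # iterator=True returns a reader object instead of a DataFrame;
    # the with-block closes the underlying file when we are done.
    with pd.read_stata(path, iterator=True) as reader:
        labels = reader.variable_labels()  # {variable name: variable label}
        df = reader.read()  # load the data through the same reader
    return [
        (col.lower(), str(df[col].dtype), labels.get(col, ""))
        for col in df.columns
    ]


# Build a tiny labelled file so the sketch runs without 'file.dta'.
tmp_path = os.path.join(tempfile.mkdtemp(), "example.dta")
pd.DataFrame({"x": [1.0, 2.0], "y": [3.0, 4.0]}).to_stata(
    tmp_path,
    write_index=False,
    variable_labels={"x": "x label", "y": "y label"},
)

rows = describe_stata_columns(tmp_path)
for name, dtype, label in rows:
    print(name, dtype, label)
```

With a real dataset you would pass the path to your own file instead of the temporary one.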

Kyle Barron
5

This will return a dictionary of labels:

>>> pd.io.stata.StataReader('file.dta').variable_labels()
{'x': 'x label', 'y': 'y label'}
Nick Cox
JohnE
  • 1
    `reader` is not defined in that answer so it wasn't clear where it came from. From your answer it seems it is from pd.io so that means something new for me. :) – ayhan Jun 28 '17 at 20:48
  • 1
    Ah, yes, good point! Thanks! I presume it was just a typo (now fixed, btw), but I'm happy to have added something of value in any event. – JohnE Jun 28 '17 at 20:51
1

I got it working like this:

import pandas as pd

reader = pd.io.stata.StataReader('file.dta')
header = reader.variable_labels()  # dict: variable name -> variable label
for var in header:
    name = var
    label = header[name]
Alam
  • I was about to comment on the typo, but you fixed it. I am not sure what you were trying to do with the `for` loop though (?), as "header" is already a dictionary. Btw, in retrospect I would have just posted my answer as a comment, but it got two quick upvotes so I decided to leave it. – JohnE Jun 28 '17 at 20:54
  • Yes, I was writing it to a csv file row by row and then doing a little more manipulation with it. But yes, thanks for your input! :) – Alam Jun 28 '17 at 21:01
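
For the CSV use case mentioned in the last comment, a rough sketch might look like the following. It only assumes that the labels are already available as a dictionary, such as the one returned by `variable_labels()`; the function name `labels_to_csv`, the column headings, and the sample dictionary are illustrative rather than taken from the thread's actual data.

```python
import csv
import io


def labels_to_csv(labels, out):
    """Write one (name, label) row per variable to the open text file `out`.

    Hypothetical helper for illustration.
    """
    writer = csv.writer(out)
    writer.writerow(["name", "label"])  # header row (column names are assumed)
    writer.writerows(labels.items())    # each dict item becomes one CSV row


# Sample dictionary in the shape shown in the second answer.
header = {"x": "x label", "y": "y label"}

buffer = io.StringIO()  # stands in for a file opened with open(..., newline="")
labels_to_csv(header, buffer)
print(buffer.getvalue())
```

Because the labels are already a dictionary, `writerows` can consume `labels.items()` directly, so no explicit row-by-row loop is needed unless each row requires extra processing.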