I have data in the form of JSON and CSV, and sometimes need to provide it to external analysts. They're used to working with `.sav` files, though, so I would like to give them a utility function that lets them load the data and work with it in the same way they would if it had been a `.sav` file.

I'm familiar with Python, and I have access to Stata to try things out (though I have never opened it).

An example of the data to be used for this is the following:

import pandas as pd
import numpy as np

variable_value_labels = {
    "col_a": {
        1: "first thing",
        2: "something second",
        3: "meaning of three",
    },
    "col_b": {
        1: "No",
        2: "Yes",
    },
}

column_names_to_labels = {
    "col_a": "this is a column here",
    "col_b": "and another",
    "col_c": "this is a column without variable value labels",
}

N = 10

df = pd.DataFrame(
    {
        "col_a": np.random.choice(list(variable_value_labels["col_a"].keys()), N),
        "col_b": np.random.choice(list(variable_value_labels["col_b"].keys()), N),
        "col_c": np.random.rand(N),
    }
)

The raw data for the above is:

column names to labels json:

{"col_a": "this is a column here", "col_b": "and another", "col_c": "this is a column without variable value labels"}

variable value labels json:

{"col_a": {"1": "first thing", "2": "something second", "3": "meaning of three"}, "col_b": {"1": "No", "2": "Yes"}}

dataframe csv:

col_a,col_b,col_c
1,1,0.8360787635373775
2,1,0.3373961604172684
1,2,0.6481718720511972
2,1,0.36824153984054797
2,2,0.9571551589530464
3,2,0.14035078041264515
1,1,0.8700872583584364
3,1,0.4736080452737105
1,2,0.8009107519796442
1,2,0.5204774795512048

What I would like is some function / process which enables something such as:

def function(dataframe_path, variable_value_labels_path, column_names_to_labels_path):
    ...  # build and return a sav file containing the info above

If it's possible to pass it a path to a directory containing these files instead, that would also be good.
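
For illustration, a minimal sketch of what such a function might look like. It assumes the third-party `pyreadstat` package is available (it is not mentioned above) and that all value codes are integers. The names `load_value_labels`, `load_column_labels`, `csv_to_sav` and the `sav_path` argument are hypothetical. The main wrinkle is that JSON object keys are always strings, so they have to be converted back to integers to match the codes read from the CSV.

```python
import json
from pathlib import Path


def load_value_labels(path):
    """Read the value-labels JSON and convert its string keys to ints.

    Assumption: every code is an integer, as in the example data.
    """
    with open(path, encoding="utf-8") as fh:
        raw = json.load(fh)
    return {
        column: {int(code): label for code, label in mapping.items()}
        for column, mapping in raw.items()
    }


def load_column_labels(path):
    """Read the column-name -> column-label JSON as a plain dict."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def csv_to_sav(dataframe_path, variable_value_labels_path,
               column_names_to_labels_path, sav_path):
    """Write a .sav file from the CSV plus the two JSON label files."""
    # Imported lazily so the two helpers above need only the stdlib.
    import pandas as pd
    import pyreadstat

    df = pd.read_csv(dataframe_path)
    value_labels = load_value_labels(variable_value_labels_path)
    column_labels = load_column_labels(column_names_to_labels_path)

    pyreadstat.write_sav(
        df,
        str(sav_path),
        # One label per column, in column order; None where no label exists.
        column_labels=[column_labels.get(col) for col in df.columns],
        variable_value_labels=value_labels,
    )
    return Path(sav_path)
```

A directory-based variant could wrap `csv_to_sav` and look up the three files by agreed names inside the directory. Whether the resulting file behaves identically to the analysts' usual `.sav` files would still need checking in their own tools.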

baxx
  • Importing CSV files into Stata can be done with the `import` command. Type `help import_delimited` into the Stata console for an explanation of the syntax. As for importing JSON data, I will redirect you to a question that was asked on the Stata forum: (https://www.statalist.org/forums/forum/general-stata-discussion/general/1357829-creating-a-stata-data-file-from-a-json-formatted-file) – JR96 Mar 23 '21 at 14:53
  • @JR96 this does not answer the question. I'm specifically asking how the data can be imported such that, following the import, anything originally written with a sav file in mind would function in the same way – baxx Mar 23 '21 at 15:50