
For an assignment, I am required to download a .dta file and answer some questions in a do-file. However, when I run the `use` command in Stata and select the .dta file, I get this error: "too many observations. Dataset contains more than 2 gigaobs (billion observations). r(1001);"

Is there a way to overcome the problem?

  • On this evidence, no; you've been set an impossible assignment. See https://www.stata.com/help.cgi?limits for the limits here and take the matter up immediately with your teachers. – Nick Cox Feb 21 '23 at 17:18
  • That said, something else is wrong because by the same principle it should not have been possible to hold the data in Stata or `save` the dataset as .dta. – Nick Cox Feb 21 '23 at 17:32

1 Answer


I'm not sure what is causing the problem itself, but you could try loading only part of the data and analyzing it in pieces, for example:

use in 1/20000 using "yourdata.dta", clear

If you want to analyze it systematically, or apply a condition such as "keep only certain observations", you could do something like:

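* Sketch: assumes -describe using- can read the file header and return r(N)
describe using "yourdata.dta", short
local total = r(N)
local chunk = 1000000    // observations read per pass
local h = 1
forvalues i = 1(`chunk')`total' {
    * Last observation of this chunk, capped at the end of the file
    local j = min(`i' + `chunk' - 1, `total')
    use in `i'/`j' using "yourdata.dta", clear
    keep if condition == 1
    tempfile myfile`h'
    save `myfile`h''
    local h = `h' + 1
}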

local k = `h'-1

use `myfile1'
forvalues r = 2(1)`k' {
    append using `myfile`r''
}

save "yourdata_aux.dta", replace

In this case I'm reading 1,000,000 observations at a time. If you want a different chunk size, change it wherever it appears in the code, and make sure the total number of observations matches your data.
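Before running this on the real file, it may help to check the loop boundaries on something small. The sketch below is only an illustration: it uses Stata's bundled auto dataset as a hypothetical stand-in for "yourdata.dta", a chunk size of 20, and illustrative tempfile names (`demo`, `part1`, ...). It reads the observation count from the file header with `describe using`, splits the file into chunks, appends them back together, and counts the result. Whether `describe using` succeeds on a file that `use` rejects with r(1001) is not something this confirms.

```stata
* Hypothetical small-scale check: auto.dta stands in for "yourdata.dta"
sysuse auto, clear
tempfile demo
save `demo'

* Read the observation count from the file header without loading the data
describe using `demo', short
local total = r(N)
local chunk = 20

* Split the file into chunks, capping the last one at the end of the file
local h = 0
forvalues i = 1(`chunk')`total' {
    local j = min(`i' + `chunk' - 1, `total')
    use in `i'/`j' using `demo', clear
    local ++h
    tempfile part`h'
    save `part`h''
}

* Stack the pieces back together and count the observations
use `part1', clear
forvalues r = 2/`h' {
    append using `part`r''
}
count
```

If the final count matches the total reported by `describe using`, no observations were dropped at the chunk boundaries.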

Ignacio2424