I need help running a simple regression as well as an xt regression on panel data.

The dataset consists of 16 participants, each observed daily.

I would like to compare the pre-test (the first date on which observations were taken) with the post-test (the last date on which observations were made) across several variables.

I was also advised to use xtreg, re.

What is this re, and what is its significance?

Nick Cox
lei10003
  • Welcome to Stack Overflow. Could you take a look at the documentation, try your Stata model, and tell us what errors you see? This is a nice problem, but if we solve it for you, you won't learn anything about Stata. Also, SO generally does not supply code for you. – rajah9 Jul 27 '16 at 13:22
  • Take a look at `help xtreg`. – lmo Jul 28 '16 at 17:56
  • I did. I am more concerned about picking out the first and last observation in a particular variable. As in the first and last observation in a panel. I believe once I do that I can move forward with the regression on my own. – lei10003 Jul 28 '16 at 23:00

2 Answers

1

Perhaps this sample code will set you in the direction you seek.

clear
input id year x
1 2001 11
1 2002 12
1 2003 13
1 2004 14
2 2001 21
2 2002 22
2 2003 23
3 2005 35
end
xtset id year
* Within each id, sorted by year, copy the first and last values of x to every row
bysort id (year): generate firstx = x[1]
bysort id (year): generate lastx = x[_N]
list, sepby(id)

With regard to xtreg, re: that fits a random-effects model. See help xtreg for more details, as well as the xtreg entry in the Stata Longitudinal-Data/Panel-Data Reference Manual included in your Stata documentation.
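As a rough illustration of what the re option does, here is a minimal sketch on simulated data. The variable names (y, x, u, day) and the data-generating process are assumptions made up for this example, not the asker's data. Each participant gets a random intercept u, and xtreg with re treats that intercept as a random draw that is uncorrelated with x.

```stata
* Simulated panel: 16 participants, 10 daily observations each (hypothetical data)
clear
set seed 12345
set obs 16
generate id = _n
* Participant-specific random intercept, constant within id
generate u = rnormal()
expand 10
bysort id: generate day = _n
generate x = rnormal()
generate y = 1 + 0.5*x + u + rnormal()

* Declare the panel structure, then fit the random-effects model
xtset id day
xtreg y x, re
```

Replacing re with fe would instead fit a fixed-effects model, which does not require the participant intercepts to be uncorrelated with x.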

1

If the goal is to fit some xt model at the end, you will need the data in long form. I would use:

bysort id (year): keep if inlist(_n,1,_N)

For each id, this puts the data in ascending chronological order, and keeps the first and last observation for each id.

The RE part of your question is off-topic here. Try Statalist or the Cross Validated (CV) Stack Exchange site, but do augment your question with details of the data and what you hope to accomplish. Those details may also reveal that getting rid of the intermediate data is a bad idea.
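If dropping the intermediate observations turns out to be undesirable, one alternative is to flag the first and last rows instead of deleting the rest. The sketch below assumes the same id, year and x layout as the sample data; the indicator names is_first and is_last are made up for illustration.

```stata
* Small example panel with the same layout as above
clear
input id year x
1 2001 11
1 2002 12
1 2003 13
2 2001 21
2 2002 22
end

* Flag, rather than drop, the first and last observation in each panel
bysort id (year): generate byte is_first = (_n == 1)
bysort id (year): generate byte is_last  = (_n == _N)

* Intermediate rows are still in memory for any later panel model
list if is_first | is_last, sepby(id)
```

The flags can then be used in if qualifiers, so the full panel stays available for xtreg while pre/post comparisons use only the flagged rows.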


Edit:

Add this after the part above:

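* Number the remaining rows within each id: 1 = first, 2 = last
bysort id (year): generate t = _n
* One row per id, with x1/year1 (first) and x2/year2 (last)
reshape wide x year, i(id) j(t)
order id x1 x2 year1 year2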
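Once the data are in wide form, a pre/post comparison could be sketched roughly as follows. This is a self-contained illustration on simulated data, not the asker's dataset: the 16 participants with 5 days each, the variable day, and the derived variable change are all assumptions. Whether a regression of the last value on the first or a paired t test is the better tool depends on the research question.

```stata
* Simulated panel: 16 participants, 5 daily observations each (hypothetical data)
clear
set seed 2016
set obs 16
generate id = _n
expand 5
bysort id: generate day = _n
generate x = 10 + id + 0.5*day + rnormal()

* Keep the first and last observation per participant, then go wide
bysort id (day): keep if inlist(_n, 1, _N)
bysort id (day): generate t = _n
reshape wide x day, i(id) j(t)

* x1 is the pre-test value, x2 the post-test value
generate change = x2 - x1
regress x2 x1
ttest x2 == x1
```

Here regress x2 x1 relates the last value to the first across participants, while ttest x2 == x1 is a paired test of whether the mean change differs from zero.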
dimitriy
  • This really helped, thanks! re means random error. So you think getting rid of the intermediate data is bad? Why? – lei10003 Aug 01 '16 at 04:12
  • Panel data estimates need data. Having more may allow you to estimate the effect more precisely or to look at dynamics of how the effect changes over time. Maybe that is a lofty goal with 16 panels. – dimitriy Aug 01 '16 at 04:19
  • That's true, I understand. That was in fact what I was trying to do before, but my supervisor made changes, hence my confusion. – lei10003 Aug 01 '16 at 21:14
  • How can I separate it so that I can analyze by regression the last date against the first date in the panel? – lei10003 Aug 06 '16 at 05:53