I have checked a previous post about it (SPSS Last available measurement in a row of variables), but I still have doubts on how to perform this task more efficiently. I have a dataset with 98 million rows and 120 variables (one for each month, from January/2005 to December/2014). I need to identify the first and the last valid (non-missing) measurements in a row of variables, for each observation in the dataset. The dataset looks like:
v1 v2 ... v120
1 2 ... 5
. 2 ... 5
3 1 ... .
I have tried the version using loops (also suggested on this topic: SPSS Last available measurement in a row of variables). I used the syntaxes below, but they didn’t work. I keep getting error messages… It is very likely that, since I did not understand all the steps in it, I misused it.
DEFINE LAST_VALID ()
!DO !@ = 1 !TO 120 .
!LET !a = !CONCAT("v", !@) .
COMPUTE LAST_VALID = !a .
!DOEND .
!ENDDEFINE.
LAST_VALID .
EXECUTE.
ERROR MESSAGES (some of them):
Error # 4382 in column 1024. Text: (End of Command) >An equals sign was >not found when expected after a target variable in a >COMPUTE command. >Execution of this command stops.
Warning # 231 >The depth of macro nesting has reached the current limit. >To increase the >limit, use SET MNEST. To check the limit use the SHOW >command.
Warning # 210 in column 9. Text: !ERROR_MACRO >A macro symbol is invalid >in this context. >The symbol will be treated as an invalid special >character.
Error # 4285 in column 9. Text: !ERROR_MACRO >Incorrect variable name: >either the name is more than 64 characters, or it is >not defined by a >previous command. >Execution of this command stops.
DEFINE FIRST_VALID ()
!DO !@ = 1 !TO 120 .
!LET !a = !CONCAT("v", !@) .
LOOP IF MISSING (FIRST_VALID) = 1.
COMPUTE FIRST_VALID = !a .
END LOOP IF FIRST_VALID > 0.
!DOEND .
!ENDDEFINE.
FIRST_VALID.
EXECUTE.
ERROR MESSAGES (some of them):
Warning # 231 >The depth of macro nesting has reached the current limit. >To increase the >limit, use SET MNEST. To check the limit use the SHOW >command.
Warning # 210 in column 18. Text: !ERROR_MACRO >A macro symbol is invalid >in this context. >The symbol will be treated as an invalid special >character.
Error # 4007 in column 18. Text: !ERROR_MACRO >The expression is >incomplete. Check for missing operands, invalid operators, >unmatched >parentheses or excessive string length. >Execution of this command stops.
Error # 4846 in column 18. Text: !ERROR_MACRO >The LOOP command contains >unrecognized text after the end of the IF clause.
Warning # 210 in column 9. Text: !ERROR_MACRO >A macro symbol is invalid >in this context. >The symbol will be treated as an invalid special >character.
Error # 4285 in column 9. Text: !ERROR_MACRO >Incorrect variable name: >either the name is more than 64 characters, or it is >not defined by a >previous command. >Execution of this command stops.
Error # 4014 in column 13. Text: !ERROR_MACRO >SPSS Statistics was >expecting an expression but encountered the end of the >command. Check the >expression for omitted or extra operands, operators, and >parentheses. >Execution of this command stops.
Error # 4045. Command name: END LOOP >The END LOOP command does not follow >an unclosed LOOP command. Maybe the LOOP >command was not recognized >because of an error. Use the level-of-control >shown to the left of the >SPSS Statistics commands to determine the range of >LOOPs and DO IFs.*
I am probably forgetting to adjust my syntax for something, but I cannot figure out what. I apologize if it is too obvious…