In a dataset, there are 10 variables V1, V2,..., V10.

How can I select cases in which the value of any of those variables is greater than or equal to, say, 10?

I tried this but it didn't work:

temporary.
select if any(v1, v2, v3, v4, v5, v6, v7, v8, v9, v10, ge 10).
list id.

This and a couple of others didn't work either:

select if ((v1, v2, v3, v4, v5, v6, v7, v8, v9, v10) ge 10).
NonSleeper

3 Answers

You could use a VECTOR/LOOP approach here, specifying that the loop exits as soon as the first variable meets the given criterion, so that it does not continue looping over the remaining variables unnecessarily:

*****************************************.
* set up dummy data.
set seed = 10.
input program.
loop #i = 1 to 500.
compute case = #i.
end case.
end loop.
end file.
end input program.
dataset name sim.
execute.
vector v(10, F2.0).
do repeat v = v1 to v10.
compute v = TRUNC(RV.UNIFORM(1,12)).
end repeat.
execute.
*****************************************.

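vector v=v1 to v10.
* flag the case and leave the loop at the first variable that is 10 or more.
* #i is a scratch variable, so no extra index variable is saved to the file.
loop #i=1 to 10.
  if (v(#i) >= 10) Keep=1.
end loop if v(#i) >= 10.
select if Keep.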
eli-k
Jignesh Sutar

You'll have to loop for it:

do repeat vr=v1 to v10.
   if vr ge 10 KeepMe=1.
end repeat.
select if KeepMe=1.
eli-k
  • Thanks for your and @Jignesh Sutar's solutions. However, I wonder if it could also be done in a style similar to the following. For example, if the criterion is to select cases with a value equal to 10, then it can be: `if any(10, v1, v2, v3, v4, v5, v6, v7, v8, v9, v10) keep = 1.` Or is it that it can't be written in this style if the selection criterion is a range instead of a single value? – NonSleeper Aug 23 '16 at 00:28
  • `any` can only be used with a single value (e.g. 10, as in your example above); `range`, on the other hand, can only use one variable: `select if range(var1,10,999)` – horace_vr Aug 23 '16 at 04:50
  • Of course, if you absolutely want to avoid looping you could `select if v1>=10 or v2>=10 or ....`, but as horace_vr says, there's no command/function that will make all these comparisons in one fell swoop. – eli-k Aug 23 '16 at 05:41
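
As a side note to the comments above, one way the comparison might still be collapsed into a single expression is the `MAX` function: some variable is at least 10 exactly when the largest of them is at least 10. This is a sketch rather than something proposed in the thread, and it assumes `v1` to `v10` are adjacent in the file so that the `to` keyword resolves to all ten variables.

```spss
* Sketch, not from the original answers: keep a case when its largest value is 10 or more.
* Assumes v1 to v10 are consecutive in the file and that id exists, as in the question.
temporary.
select if max(v1 to v10) ge 10.
list id.
```

`MAX` ignores missing values among its arguments; a case in which all ten variables are missing gets a missing result and is not selected.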

This will also work:

count cnt_ = v1 v2 v3 v4 v5 v6 v7 v8 v9 v10 (10 thru highest).
exe.
select if cnt_>0.
exe.

The `cnt_` variable counts how many of the variables have a value of 10 or greater. The selection command then keeps the cases where that count is above zero.

Also, don't forget about execute, which applies all pending transformations. Otherwise nothing will happen until a later procedure reads the data.

horace_vr
  • Would it be `count cnt_ = v1 v2 v3 v4 v5 v6 v7 v8 v9 v10 (10 thru highest).`? – NonSleeper Aug 23 '16 at 05:01
  • Nice! I always forget about the `COUNT` command. Remember you can reference variables in consecutive order simply as `V1 to V10`. Also, the first `EXE` isn't necessary: it introduces an extra data pass when everything can be done together in the second data pass. – Jignesh Sutar Aug 23 '16 at 08:51
  • @JigneshSutar you are correct on both counts: `v1 to v10` works IF the variables are consecutive. About the `exe`: I have seen many instances when people forgot about it completely (even in some of the answers to this question). I have made a habit of adding one whenever I move to another set of transformations (e.g. from counting to selections), but you are correct that the first one is not strictly necessary. – horace_vr Aug 23 '16 at 09:27
  • @horace_vr +1 for the COUNT from me too... Regarding the `exe`, I only use it when I need transformations run NOW (either to see the results or because further analysis depends on the transformations being run first, e.g. after using the LAG function). Otherwise I try to avoid it, as it is not a necessary part of the analysis and it may waste a significant amount of time when running your syntax on a large file. – eli-k Aug 23 '16 at 10:07
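
Putting the suggestions from these comments together, the `COUNT` approach might be condensed as in the sketch below. It assumes `v1` to `v10` are consecutive in the file, reuses the `id` variable and the `temporary` command from the question, and relies on `list` (a procedure) to trigger the single data pass, so no `execute` is needed.

```spss
* Sketch combining the comment suggestions; assumes v1 to v10 are adjacent in the file.
* cnt_ counts how many of the ten variables hold a value of 10 or more.
count cnt_ = v1 to v10 (10 thru highest).
* temporary limits the selection to the next procedure, as in the question.
temporary.
select if cnt_ > 0.
list id.
```

Because `count` comes before `temporary`, the `cnt_` variable stays in the dataset, while the case selection applies only to the `list` output.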