Identify rows where a specific pattern change occurs

Question

Introduction/background

I am required to analyse respondent data in an spss data sheet but i believe it may be best to copy this data into excel or google sheets for this particular analysis I am about to describe.

600+ participants have been asked to respond to 100 questions. The participants will each have their own row in the data and their response to the questions will be in 100 columns from left to right in their respective row. Each response will either be coded as 1 or 2 (it is a binary response with only one of these two options to be provided in each column/cell).

There should be a trend in the data (left to right) where all participants will reach a point (i.e. column/cell) where they change from strictly responding with a 1 to strictly responding with a 2 from that point onwards. If the participants are paying attention and following logic, they will then only answer 2 for the remainder of their responses.

Analysis

I need to identify all participants that did not strictly continue answering with a 2 response after switching over to first providing a 2 response instead of strictly 1 responses.

i.e. if someone answers 1, 1, 1, 1, 1, 1 then switches to 2, 2, 2, 2, 2, but switches back to 1 instead of continuing to answer 2 for the remainder of the questions, i need to identify these outliers so that they can be isolated from the rest of the data-set.

Does anyone know of a function (either in excel, google sheets or a platform of your choice) that will allow me to bulk identify which participants erroneously entered a 1 after first changing their response to 2's.

I hope all that makes sense and sorry if it was a bit of a ramble.

score 2 · Accepted Answer · answered May 14 '20 at 01:27

You can do a TEXTJOIN() for each row and look for the pattern "21" (i.e. an '1' after a '2'). If found, this is an outliner.

Sth like the followings:

=IF(ISERROR(FIND("21",TEXTJOIN("",TRUE,[data range of a row]))),FALSE,TRUE)

TRUE will be an outliner

score 1 · Answer 2 · answered May 14 '20 at 08:03

1

This can be done easily in SPSS, no need to copy data to a different platform.
(although I love @OliverLeung's neat Excel one liner!)

The following code will loop through the variables and look for the value 1 appearing after the value 2. Where that happens the variable oops will get the value 1:

compute pvr=0.
do repeat vr=var1 to var40.
if vr=1 and pvr=2 oops=1.
compute pvr=vr.
end repeat.

answered May 14 '20 at 08:03

eli-k

10,898
11
40
44

Hi @eli-k - apologies for the super late reply! I have tried to run your spss syntax on a test data file with only one row of 6 variables displaying the pattern mentioned. Unfortunately it doesn't seem to behave as expected, my syntax and output below: Syntax: compute pvr=0. do repeat vr=var1 to var6. if vr=1 and pvr=2 oops=1. compute pvr=vr. end repeat. Result/Output file: if vr=1 and pvr=2 oops=1. Appreciate any insight as I am a noob with SPSS! I was hoping that all rows with the offending pattern (1 in the case of this test data set) would be highlighted or listed somehow – Martinc1402 Jun 19 '20 at 09:57
take a look in the data - you'll see a new variable has been created called `oops`. That variable gets the value 1 in your example, and every time there is a `1` after a `2` (otherwise the value is `0`). You can now sort or filter the file by `oops` and examine those cases. You can run `freq oops` to see a count of the occurrences in the output. – eli-k Jun 19 '20 at 12:35

Identify rows where a specific pattern change occurs

Introduction/background

Analysis

2 Answers2

Linked