
I am trying to read a data file with SAS that has a hierarchical structure, but the file has no record-type variable, which seems to be a requirement for creating several observations per header.

The data looks something like this:

Monkey & Horse Dance HORSE1 DDD4226 0001
3232233321221121.........
3222233333321332.........
Monkey & Horse Dance HORSE2 DDD5210 0001
1222121212221222.........
Monkey & Horse Dance HORSE3 DDD5405 0001
1111123211111211.........
1111111111111111.........

The desired output would be something like this:

Monkey & Horse Dance HORSE1 DDD4226 0001 3 2 3 2 2 3 3 3 2 1 2 2 1 1 2 1
Monkey & Horse Dance HORSE1 DDD4226 0001 3 2 2 2 2 3 3 3 3 3 3 2 1 3 3 2
Monkey & Horse Dance HORSE2 DDD5210 0001 1 2 2 2 1 2 1 2 1 2 2 2 1 2 2 2
Monkey & Horse Dance HORSE3 DDD5405 0001 1 1 1 1 1 2 3 2 1 1 1 1 1 2 1 1

I have been trying something like this:

data monkey;
    infile monkey;
    informat var7-var22 1;
    retain var1 var2 var3 var4 var6 var7;
    input define 1 @;
    if define='M' then input @1 var1 $14. var2 $char5. var3 $char5. var4 7. +0 var6;
    if define=('1' or '2' or '3' or '4' or '5') then input var7-var22;
run;

Could anyone point me in the right direction?

TLama
statsNoob

1 Answer


Using this as test data:

Monkey & Horse Dance HORSE1 DDD4226 0001
123456789012345
123456789012345
Monkey & Horse Dance HORSE2 DDD5210 0001
123456789012345
Monkey & Horse Dance HORSE3 DDD5405 0001
123456789012345
123456789012345

This code should do what (I think) you want:

data monkey;
    infile "C:\monkey.txt" lrecl=40 truncover;

    length    var1 $14
            var2 $5
            var3 $6
            var4 $7
            var5 $4
            var6-var20 3;

    retain var1-var5;

    input @1 testchar $1. @;

    if testchar="M" then do;
        input    @1 var1 $14.
                @16 var2 $5.
                @22 var3 $6.
                @29 var4 $7.
                @37 var5 $4.;  /* "0001" occupies columns 37-40 of the header line */
    end;
    else do;
        input @1 (var6-var20) (1.);  /* read 15 consecutive one-digit values */
        output;
    end;
run;

I would also recommend a few things, though:

1) Give your variables names that mean something, not just "var1" or "var2".
2) Can you be sure the first character on a header line will always be "M"? If not, this code will not work.
3) Have you picked up a good guide to SAS programming? Concepts like these for reading in data are usually covered in the first chapter or two.
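If header lines might not always start with "M", one possible variation is to recognise the detail lines instead, on the assumption that a detail line consists of digits only (as in the test data above). The following is only a sketch under that assumption. The variable names (title, event, horse_id, code, seq, score1-score15) and the helper variable line are made up for illustration, and the column positions are taken from the 40-character test lines above.

```sas
data monkey;
    infile "C:\monkey.txt" lrecl=40 truncover;

    /* Hypothetical names; pick ones that match what the fields mean */
    length title    $14
           event    $5
           horse_id $6
           code     $7
           seq      $4
           score1-score15 3;

    /* Header values carry over to every detail line that follows */
    retain title event horse_id code seq;

    /* Read the whole line and hold it (trailing @) so it can be re-read */
    input @1 line $char40. @;

    /* Assumption: a non-blank line with no non-digit characters is a detail line */
    if not missing(line) and notdigit(strip(line)) = 0 then do;
        input @1 (score1-score15) (1.);
        output;
    end;
    /* Any other non-blank line is treated as a header */
    else if not missing(line) then do;
        input @1  title    $14.
              @16 event    $5.
              @22 horse_id $6.
              @29 code     $7.
              @37 seq      $4.;
    end;

    drop line;
run;
```

If the real detail lines contain anything other than digits, the digit test would need to be adjusted to match.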

Good luck!

John Chrysostom