I have two questions about comparing character vectors in Dyalog APL. The following code compares two character vectors element by element:

a←'ATCG'
b←'GTCA'
a=b
  • To speed this up (both when comparing two vectors and when comparing many vectors to a single vector), should I convert each character vector to a numeric vector, or does it not matter in APL (as with comparing chars in C)?
  • I am comparing DNA sequences, which consist only of letters from the ATCG alphabet. Is there anything I can do to speed up various operations on such vectors?
syntagma
  • No, it shouldn't matter, because in the end the characters are encoded numerically in some way and the comparison is a numeric comparison. I'm not sure whether a small alphabet gives any advantage. However, if you don't care how two strings match or mismatch but only whether they are the same, I would recommend the match function (≡), which returns 1 if they are equivalent and 0 otherwise (I'm guessing it returns as soon as a single difference has been found) – Chris Zhang Aug 18 '14 at 23:02
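
To make the difference between the two comparisons concrete, here is a minimal sketch using the vectors from the question. The matrix m is a hypothetical example added only to illustrate comparing many sequences against one; it is not from the original post, and nothing here says anything about relative speed.

```apl
a←'ATCG'
b←'GTCA'
a=b                     ⍝ 0 1 1 0 : one Boolean per position
a≡b                     ⍝ 0 : a single Boolean for the whole vector
m←3 4⍴'ATCGGTCAATCG'    ⍝ hypothetical matrix, one sequence per row
m∧.=a                   ⍝ 1 0 1 : which rows are identical to a
```

The inner product ∧.= applies = between each row and a, then reduces with ∧, so it gives one Boolean per row.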

1 Answer

Interestingly, on my (old) version of Dyalog APL, converting the characters to small integers first makes the comparison run some 25% faster. Character comparison may have been sped up in more recent versions.

Try

a←⎕AV⍳'ATCG'   ⍝ index of each character in the atomic vector ⎕AV
b←⎕AV⍳'GTCA'
a=b

Be sure that the largest value is less than 128, so that every element fits in a 1-byte integer.

To check that you have the smallest possible integer representation, use the ⎕DR (data representation) function. ⎕DR a should return 83 for integers in the range -128 <= x <= 127.

Dyalog APL will automagically convert to the lowest possible integer width.
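
As a rough sketch of that check, the lines below convert with ⎕UCS rather than ⎕AV. That substitution is my assumption, not what was timed above: ⎕UCS returns Unicode code points, and for the letters A, T, C and G these are all below 128.

```apl
a←⎕UCS 'ATCG'    ⍝ code points 65 84 67 71 (⎕UCS is an assumed substitute for ⎕AV⍳)
b←⎕UCS 'GTCA'
⎕DR a            ⍝ a result of 83 would indicate 1-byte integers
a=b              ⍝ same element-wise comparison, now on integers
```

Whether this is actually faster than comparing the characters directly depends on the interpreter version, so it is worth timing both forms on your own data.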

Lobachevsky