arrayOfWords BYTE "BICYCLE", "CANOE", "SKATEBOARD", "OFFSIDE", "TENNIS"
is just another way to write
arrayOfWords BYTE "BICYCLECANOESKATEBOARDOFFSIDETENNIS"
and this is far from being an array.
Furthermore, the instruction mov edx, offset arrayOfWords[2] does not index an array.
Brackets in assembly are used to denote an addressing mode, not array indexing.
That's why I can't stress enough that you should NOT [1] use the syntax <symbol>[<displacement>] (your arrayOfWords[2]): it is a very silly and confusing way to write [<symbol> + <displacement>] (in your case [arrayOfWords + 2]).
You can see that mov edx, OFFSET [arrayOfWords + 2] (which, in my opinion, is more clearly written as mov edx, OFFSET arrayOfWords + 2, since the instruction does not access any memory) just loads edx with the address of the C character in BICYCLE (the third char of the big string).
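As a rough analogy in C (not the language of the question; the names packed_words and address_at are made up for illustration), the packed definition behaves like one flat run of bytes, and adding a displacement to its name is plain address arithmetic:

```c
#include <stddef.h>

/* Adjacent string literals are concatenated by the C compiler, much like
   the comma-separated BYTE initializers: the result is one flat run of
   bytes with nothing marking where one word ends and the next begins.
   Only three of the words are used here to keep the sketch short. */
static const char packed_words[] = "BICYCLE" "CANOE" "TENNIS";

/* Hypothetical helper mirroring OFFSET arrayOfWords + displacement.
   It only computes an address; it knows nothing about word boundaries. */
const char *address_at(size_t displacement)
{
    return packed_words + displacement;
}
```

Here address_at(2) points at the C of BICYCLE, not at a "third word": the displacement counts bytes from the start of the blob.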
MASM has a lot of high-level machinery that I never bothered learning, but after a quick glance at the manual linked in the footnotes, it seems that it has no high-level support for arrays.
That's a good thing: we can write cleaner assembly.
An array of strings is not a contiguous block of strings; it is a contiguous block of pointers to strings.
The strings can be anywhere.
; Table of 32-bit pointers, one entry per string
arrayOfWords  DWORD OFFSET strBicycle,
                    OFFSET strCanoe,
                    OFFSET strSkateboard,
                    OFFSET strOffside,
                    OFFSET strTennis

; The NUL-terminated strings themselves
strBicycle    BYTE "BICYCLE", 0
strCanoe      BYTE "CANOE", 0
strSkateboard BYTE "SKATEBOARD", 0
strOffside    BYTE "OFFSIDE", 0
strTennis     BYTE "TENNIS", 0
Remember: the nice feature of arrays is constant access time. If the strings were all packed together we'd get a more compact data structure, but we'd lose constant access time, since the only way to find where a string starts would be to scan the whole thing.
With pointers we have constant access time. In general, we require all the elements of an array to be homogeneous (in particular, all of the same size), as the pointers are.
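The trade-off can be sketched in C. This is only an illustration under stated assumptions: the packed layout is given NUL terminators (the original BYTE definition has none, so it could not be scanned at all), and the names packed, table, nth_by_scanning and nth_by_table are hypothetical.

```c
#include <stddef.h>

/* Packed layout: NUL-terminated strings back to back. The literal's own
   implicit terminator adds a final empty string that marks the end. */
static const char packed[] = "BICYCLE\0CANOE\0TENNIS\0";

/* Pointer table: every element has the same size (one pointer). */
static const char *const table[] = { "BICYCLE", "CANOE", "TENNIS" };
static const size_t table_len = sizeof table / sizeof table[0];

/* Finding string n in the packed layout means walking over every
   string before it, so the cost grows with n and the string lengths. */
const char *nth_by_scanning(size_t n)
{
    const char *p = packed;
    while (n > 0 && *p != '\0') {
        while (*p != '\0')
            p++;            /* skip to this string's terminator */
        p++;                /* step past the terminator */
        n--;
    }
    return *p != '\0' ? p : NULL;   /* NULL when n is out of range */
}

/* With the table, finding string n is a single indexed load. */
const char *nth_by_table(size_t n)
{
    return n < table_len ? table[n] : NULL;
}
```

Both functions return the same strings for valid indexes; only the table version does so without looking at the other strings.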
To load the address of the i-th [2] string in the array we simply read the i-th pointer.
Suppose i is in ecx, then
mov edx, DWORD PTR [arrayOfWords + ecx*4]
call writeString
since each pointer is four bytes.
If you want to read byte j of string i then, supposing j is in ebx and i in ecx:
mov esi, DWORD PTR [arrayOfWords + ecx*4]
mov al, BYTE PTR [esi + ebx]
The registers used are arbitrary.
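For comparison, the same two-step access can be written in C. This is a sketch for illustration only: byte_at is a made-up helper, it does no bounds checking, and the comments map each line to the instruction it loosely corresponds to.

```c
#include <stddef.h>

/* Array of pointers to NUL-terminated strings, as in the DWORD table. */
static const char *const array_of_words[] = {
    "BICYCLE", "CANOE", "SKATEBOARD", "OFFSIDE", "TENNIS"
};

/* Returns byte j of string i; the caller must pass valid indexes. */
char byte_at(size_t i, size_t j)
{
    /* Like mov esi, DWORD PTR [arrayOfWords + ecx*4]: fetch the i-th
       pointer. The assembly scales by 4 by hand because each 32-bit
       pointer is four bytes; C scales by the pointer size for us. */
    const char *s = array_of_words[i];

    /* Like mov al, BYTE PTR [esi + ebx]: bytes need no scaling. */
    return s[j];
}
```

For example, byte_at(0, 2) is the C of BICYCLE and byte_at(2, 0) is the S of SKATEBOARD.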
[1] Despite what Microsoft writes in its MASM 6.1 manual:
Referencing Arrays
Each element in an array is referenced with an index number, beginning with zero. The array index appears in brackets after the array name, as in
array[9]
Assembly-language indexes differ from indexes in high-level languages, where the index number
always corresponds to the element’s position. In C, for example, array[9] references the array’s
tenth element, regardless of whether each element is 1 byte or 8 bytes in size.
In assembly language, an element’s index refers to the number of bytes between the element and the start of the array.
[2] Counting from zero.