I have the following function (which I cleaned up a bit to make it easier to understand) which takes the destination array gets the element at index n
adds to it the src1[i]
and then multiplies it with src2[i]
(nothing too fancy):
static void F(long[] dst, long[] src1, long[] src2, ulong n)
{
dst[n] += src1[n];
dst[n] *= src2[n];
}
no this generates following ASM
:
<Program>$.<<Main>$>g__F|0_0(Int64[], Int64[], Int64[], UInt64)
L0000: sub rsp, 0x28
L0004: test r9, r9
L0007: jl short L0051
L0009: mov rax, r9
L000c: mov r9d, [rcx+8]
L0010: movsxd r9, r9d
L0013: cmp rax, r9
L0016: jae short L0057
L0018: lea rcx, [rcx+rax*8+0x10]
L001d: mov r9, rcx
L0020: mov r10, [r9]
L0023: mov r11d, [rdx+8]
L0027: movsxd r11, r11d
L002a: cmp rax, r11
L002d: jae short L0057
L002f: add r10, [rdx+rax*8+0x10]
L0034: mov [r9], r10
L0037: mov edx, [r8+8]
L003b: movsxd rdx, edx
L003e: cmp rax, rdx
L0041: jae short L0057
L0043: imul r10, [r8+rax*8+0x10]
L0049: mov [rcx], r10
L004c: add rsp, 0x28
L0050: ret
L0051: call 0x00007ffc9dadb710
L0056: int3
L0057: call 0x00007ffc9dadbc70
L005c: int3
as you can it adds bunch of stuff and because I can guarantee that the n
will be in between the legal range: I can use pointers.
static unsafe void G(long* dst, long* src1, long* src2, ulong n)
{
dst[n] += src1[n];
dst[n] *= src2[n];
}
Now this generates much simpler ASM
:
<Program>$.<<Main>$>g__G|0_1(Int64*, Int64*, Int64*, UInt64)
L0000: lea rax, [rcx+r9*8]
L0004: mov rcx, rax
L0007: mov rdx, [rdx+r9*8]
L000b: add [rcx], rdx
L000e: mov rdx, [rax] ; loads the value again?
L0011: imul rdx, [r8+r9*8]
L0016: mov [rax], rdx
L0019: ret
As you may have noticed, there is an extra MOV
there (I think, at least I can't reason why is it there).
Question
- How can I remove that line? In
C
I could use the keywordrestrict
if I'm not wrong. Is there such keyword in C#? I couldn't find anything on internet sadly.
Note
- Here is SharpLab link.
- Here is the
C
example:
void
f(int64_t *dst,
int64_t *src1,
int64_t *src2,
uint64_t n) {
dst[n] += src1[n];
dst[n] *= src2[n];
}
void
g(int64_t *restrict dst,
int64_t *restrict src1,
int64_t *restrict src2,
uint64_t n) {
dst[n] += src1[n];
dst[n] *= src2[n];
}
this generates:
f:
mov r10, rdx
lea rdx, [rcx+r9*8]
mov rax, QWORD PTR [rdx]
add rax, QWORD PTR [r10+r9*8]
mov QWORD PTR [rdx], rax ; this is strange. It loads the value back to [RDX]?
; shouldn't that be other way around? I don't know.
imul rax, QWORD PTR [r8+r9*8]
mov QWORD PTR [rdx], rax
ret
g:
mov r10, rdx
lea rdx, [rcx+r9*8]
mov rax, QWORD PTR [rdx]
add rax, QWORD PTR [r10+r9*8]
imul rax, QWORD PTR [r8+r9*8]
mov QWORD PTR [rdx], rax
ret
and here is the Godbolt link.