First, here are some existing questions about MOVDQA and MOVDQU in this community:
- MOVDQU instruction + page boundary
- MOVUPD vs. MOVDQU (x86/x64 assembly)
- Difference between MOVDQA and MOVAPS x86 instructions?
- Assembly "movdqa" access violation
Here are links that describe these two instructions:
- https://c9x.me/x86/html/file_module_x86_id_184.html
- https://mudongliang.github.io/x86/html/file_module_x86_id_183.html
Now to my problem. I am using 64-bit Linux. I have created a test project in C++ that uses several assembly implementations, and I need a better understanding of how MOVDQA and MOVDQU behave when loading data into xmm registers. Here are some of my experiments:
// Initialization 1
std::string_view lhs{"Once upon a time in Germany"}; // length = 27
std::string_view rhs = lhs.substr(20, 7); // rhs points to "Germany"
# Experiment 1.1
# Here: %rax = lhs, %rsi = rhs
movdqa (%rax), %xmm11 // SIGSEGV on this line, even though enough memory is allocated
movdqa (%rsi), %xmm12
# Experiment 1.2
# Here: %rax = lhs, %rsi = rhs
movdqu (%rax), %xmm11 // data successfully loaded into register
movdqu (%rsi), %xmm12 // data successfully loaded into register with some garbage
// Initialization 2
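void *lhs = mmap(NULL, 27 * sizeof(unsigned char), PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); // fd is -1 for anonymous mappings
void *rhs = static_cast<unsigned char *>(lhs) + 20; // cast first: arithmetic on void* is not standard C++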
# Experiment 2.1
# Here: %rax = lhs, %rsi = rhs
movdqa (%rax), %xmm11 // data successfully loaded into register
movdqa (%rsi), %xmm12 // SIGSEGV in this line... Expected!
# Experiment 2.2
# Here: %rax = lhs, %rsi = rhs
movdqu (%rax), %xmm11 // data successfully loaded into register
movdqu (%rsi), %xmm12 // data successfully loaded into register with some garbage
// Initialization 3
unsigned char *lhs = new unsigned char[27];
unsigned char *rhs = lhs + 20;
# Experiment 3.1
# Here: %rax = lhs, %rsi = rhs
movdqa (%rax), %xmm11 // data successfully loaded into register
movdqa (%rsi), %xmm12 // SIGSEGV in this line
# Experiment 3.2
# Here: %rax = lhs, %rsi = rhs
movdqu (%rax), %xmm11 // data successfully loaded into register
movdqu (%rsi), %xmm12 // data successfully loaded into register with some garbage
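For reference, the alignment of each pointer used in the experiments above can be checked before issuing the load. This is only a minimal sketch: `is_aligned_16` is a hypothetical helper name (it is not part of my test project), and it assumes the 16-byte alignment requirement for MOVDQA memory operands that the linked instruction references describe.

```cpp
#include <cstdint>

// Hypothetical helper, named here for illustration only.
// Returns true when the address held in p is a multiple of 16,
// which is the alignment MOVDQA expects for a memory operand.
inline bool is_aligned_16(const void *p) {
    return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}
```

Calling it on `lhs.data()` and `rhs.data()` (Initialization 1), or on `lhs` and `rhs` directly (Initializations 2 and 3), shows which operands sit on a 16-byte boundary. Note that a pointer at offset 20 from a 16-byte-aligned base can never pass this check.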
Questions:
- Can anyone explain experiment 1.1?
- It seems to me that using MOVDQU is always safe. If it is not, in which scenarios does MOVDQU raise SIGSEGV?
- Likewise, in which scenarios does MOVDQA raise SIGSEGV?
- Can anyone share a reference comparing the performance of MOVDQA and MOVDQU (i.e., which is faster)?