Questions tagged [neon]

NEON is a vector-processing instruction set for ARM processors. Please use this tag together with [arm] if asking about the AArch32 version of NEON (to run on 32-bit ARM processors), or [arm64] for AArch64. See also the [simd] tag.

NEON is also known as Advanced SIMD (Single Instruction, Multiple Data).

NEON can be used on either 32-bit or 64-bit ARM processors, as part of the AArch32 or AArch64 architectures respectively. However, there are significant differences between the AArch32 and AArch64 versions of NEON (register usage, instruction mnemonics, instruction availability), so please use this tag together with either [arm] for AArch32, or [arm64] for AArch64.

The [simd] tag may also be appropriate, especially for questions about SIMD algorithms that may be implemented with NEON.

Don't forget to include a tag for the programming language you are coding in. Where relevant, also consider a tag for how you access the instructions.
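As a rough illustration of one way to reach NEON from C, the sketch below uses the intrinsics declared in arm_neon.h. It is not taken from the tag description: the helper name add_f32 is hypothetical, and the scalar fallback is an assumption added so the same source also builds on targets without NEON.

```c
#include <stddef.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: dst[i] = a[i] + b[i] for n floats. */
static void add_f32(float *dst, const float *a, const float *b, size_t n)
{
    size_t i = 0;
#if defined(__ARM_NEON)
    /* Process four floats per iteration using 128-bit NEON vectors. */
    for (; i + 4 <= n; i += 4) {
        float32x4_t va = vld1q_f32(a + i);      /* load 4 floats from a */
        float32x4_t vb = vld1q_f32(b + i);      /* load 4 floats from b */
        vst1q_f32(dst + i, vaddq_f32(va, vb));  /* add lanes and store  */
    }
#endif
    /* Scalar tail, or the whole loop when NEON is not available. */
    for (; i < n; i++)
        dst[i] = a[i] + b[i];
}
```

The same intrinsics are available on both AArch32 and AArch64, which is one reason intrinsics are often easier to keep portable than hand-written assembly.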

More information at

  1. Neon page on the ARM website
  2. Wikipedia article on ARM
885 questions
-1
votes
1 answer

How to get the half 64bit of Vn.8h in armv8 like D register in armv7?

I load the data like this: ld1 {v8.8h, v9.8h, v10.8h, v11.8h}, [%8], #64 But when I use the data to calculate, it goes wrong: smlal v16.4s, v8.2d[0], v0.h[0] The error is: /tmp/cc2h1F9Y.s:523: Error: operand 2 must be a SIMD vector register…
Y.Zhu
  • 27
  • 8
-1
votes
1 answer

Eventual ARM Linux Memory Fragmentation with NEON Copy but not memcpy

I am running Linux 4.4 on a BeagleBone X-15 ( ARM Cortex-A15 ) board. My application mmaps the output of the SGX GPU and needs to copy the DRM backing store. Both memcpy and my custom NEON copy code work... but the NEON code is much faster ( ~11ms…
PhilBot
  • 748
  • 18
  • 85
  • 173
-1
votes
1 answer

ARM NEON: How do I bit-shift a whole 64-bit d register?

I want to logically left or right shift a d register (64-bit) by an arbitrary number of bits, with the count in another register. (Not an assemble-time constant.) It contains integer values and what I need to do is, to "move" them to the right…
Florian S
  • 552
  • 3
  • 15
-1
votes
3 answers

NEON int32 conversion to float gives wrong result

In NEON inline assembly, after conversion from Signed int32 to Float the number is different. Here the output for Float and Signed int32 is printed: It differs randomly (not only for each even number). There is only conversion (no any other…
RanL
  • 139
  • 9
-1
votes
1 answer

Cannot compile NEON code on xcode 8.3.2

I wrote an ARM NEON function in an individual file csc_rotation.S to do the colorspace conversion and I added the pure assembly file into a iOS app project to test it, and then compile the code under armv7 arch on Xcode. Then I got these…
Andy Hu
  • 97
  • 1
  • 7
-1
votes
2 answers

armv8-a: test if SIMD register is != 0

It's a question very similar to this one. On armv7-a, I have the following assembly code: vcmp.f64 d0, #0 vmrs APSR_nzcv, fpscr beq .jumpover How can I convert this code to armv8-a? I want to test if there is any non-zero pixel in v0.16b. EDIT #1 I…
gregoiregentil
  • 1,793
  • 1
  • 26
  • 56
-1
votes
1 answer

arm neon instruction

I have some code that I want to rewrite using NEON instructions, but I really don't know how to do it. Can anyone help me? void add(int n,float *a,float *b,float t) { int i, size = (n+2) * (n+2); for(i = 0; i < size; i++) …
-1
votes
1 answer

Windows phone 8 neon inline assembly ffmpeg

Possible Duplicate: windows phone8 wp8 arm neon assembly I am about to transplant a project just like ffmpeg onto wp8(ARM). Unfortunately, most part of the project was written by arm neon inline assembly code (NEON inline assembly) with AT&T…
-2
votes
1 answer

Why matrix multiply (float32_4x4) with armv8 NEON instructions is slower?

Below code is using NEON instructions (from UE4) void matrixMultiplyNeon(float* ret, float32x4_t* A, float32x4_t* B) { float32x4_t * R = (float32x4_t*)ret; float32x4_t temp, r0, r1, r2, r3; auto low = vget_low_f32(A[0]); auto high…
Jihui Xu
  • 22
  • 3
-2
votes
1 answer

Add NEON to Android.mk but get "Invalid address 0xe76a4080 passed to free: value not allocated"

I tried to run a project on Android and plan to add NEON code in the future. I don't get an error when I run my regular code, but when I add NEON flags in Android.mk, without changing any other code, I get the error Invalid address 0xe76a4080 passed to…
debug_all_the_time
  • 564
  • 1
  • 5
  • 18
-2
votes
1 answer

Why they still have separate floating point unit , if there is Neon for fast processing of floating points in ARM cortex processors.

Neon (advanced SIMD) is very fast for add,subtract,multiply and floating point operations like single precision and double precision. Why ARM company still have another separate unit for floating point calculation as you can see in picture. i am…
Jawwad Rafiq
  • 327
  • 3
  • 20
-2
votes
1 answer

How many functional units does NEON on Cortex-a8 have?

My question is how many and what all functional units does the NEON unit on ARM cortex-a8 have? If I have read correctly, the TRM doesn't explicitly say anything about the number of functional units on NEON core of ARM cortex-a8.
nguns
  • 440
  • 6
  • 21
-3
votes
3 answers

how to initialize and process arrays in arm neon assembly

I want to convert this C program into ARM NEON assembly: int main() { int str1[]={1,2,3,4,5,6,7,8,9,10}; int str2[]={11,12,3,4,8,1,4,5,8,3}; int str3[10],i; for(i=0;i<10;i++) { str3[i] = str1[i]+str2[i]; …
jacob
  • 1
  • 1
  • 2
-6
votes
1 answer

Spectre: Is SIMD the reason?

Short question: I read an article about the Spectre vulnerability. It says that only high-end ARM processors are affected, not the low-end ones. Since low-end ARM CPUs don't support SIMD instructions (aka the NEON extension on ARM), it sounds to me like…
Citrullin
  • 2,269
  • 14
  • 29
-6
votes
1 answer

Difference between intrinsic, inline, external in embedded system?

I need to know about the difference between intrinsic, inline and external function in C/C++ programming. Thnx for help ^^
SES
  • 13
  • 3