
Can a swizzle in GLSL include a minus sign? For example: vec.-yx-wz
The purpose is to get 2D normals with a simple define:

#define NORMALE_PACK(v) (v).-yx-wz
#define NORMALE_1(v) (v).-yx
#define NORMALE_2(v) (v).-wz

void main(){
... 
float l = dot( NORMALE_PACK(dir), dir2);
}

Without this, I achieve it with:

void main(){
... 
vec4 normale = vec4(-dir.y, dir.x, -dir.w, dir.z);      // +1 cycle on modern hardware, more on older hardware
float l = dot( normale, dir2);
}
tower120

1 Answer


+1 cycle compared to what? Do you have another working option that takes only one instruction and has no additional overhead?

No shader assembly language I've seen can perform per-component negation. What you described is two swizzled MOVs (one of them with a negation prefix), like:

MOV result.position.xz, -vertex.position.yyww;
MOV result.position.yw, vertex.position.xxzz;

You can reduce it to one instruction by using something like vec4 n = vec4(dir.yxwz) * vec4(-1.0, 1.0, -1.0, 1.0). That would be like:

PARAM c[1] = { { -1, 1 } };
MUL result.position, vertex.position.yxwz, c[0].xyxy;

(In both cases I've used result.position and vertex.position from an ARB vertex program just as an example.)

However, it uses an extra constant register, so it is not necessarily better.

Of course, both versions could be wrapped in a macro.
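
As a minimal sketch of that wrapping (the macro names NORMAL_PACK_CTOR and NORMAL_PACK_MUL and the helper packedDot are hypothetical, not from the question; v is assumed to be a vec4 holding two 2D vectors as xy and zw), it might look like this:

```glsl
// Hypothetical macro names; v is assumed to be a vec4 that packs
// two 2D vectors as (x, y) and (z, w).

// Variant A: build the vector with per-component negation.
// Note: v is expanded four times, so pass a plain variable, not an expression
// that is expensive or has side effects.
#define NORMAL_PACK_CTOR(v) vec4(-(v).y, (v).x, -(v).w, (v).z)

// Variant B: swizzle first, then multiply by a constant sign vector.
#define NORMAL_PACK_MUL(v) ((v).yxwz * vec4(-1.0, 1.0, -1.0, 1.0))

// Example use: dot product of the packed normals with another vec4.
float packedDot(vec4 dir, vec4 dir2) {
    return dot(NORMAL_PACK_MUL(dir), dir2);
}
```

Both macros produce the same value, (-y, x, -w, z); which one compiles to fewer instructions depends on the driver's compiler, so it is worth checking the generated assembly for the target hardware.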

update

I now see what you wanted to do: something that generates code like this (for the latest AMD hardware):

  0  x: DOT4        R0.x, -R0.y,  R1.x      
     y: DOT4        ____,  R0.x,  R1.y      
     z: DOT4        ____, -R0.w,  R1.z      
     w: DOT4        ____,  R0.z,  R1.w 

Instead, you see extra MOVs. However, that does not look like valid code (I don't think DOT can take partially negated arguments; I can't say more without carefully reading the instruction manual), so the compiler adds an extra MOV (which does not necessarily result in an extra cycle, by the way; it depends on the other instructions nearby).

keltar
  • "+1 cycle compared to what?" Not compared to anything: that instruction just takes 1 cycle (in the best case, according to AMD shader analyzer). With a working define-based solution there would be no overhead at all. – tower120 Mar 31 '14 at 05:23
  • And it looks like AMD can do per-component negation, if I understand this right: z: MOV R0.z, R1.z w: MOV R1.w, -R1.w t: MOV R1.y, -R1.y. That's from HD2900 assembly (from AMD shader analyzer). – tower120 Mar 31 '14 at 05:31
  • Perhaps. They don't have an assembly representation and I haven't read their latest instruction manual. But I'm not getting what you want: to make an assignment cost 0 cycles? That's not going to happen; any operation costs something. 1 is very good already (as in my second example). – keltar Mar 31 '14 at 05:35
  • Even swizzling? Without reassigning to another value? Look at my non-working 1st example :) – tower120 Mar 31 '14 at 05:39
  • No, swizzling costs virtually nothing, but it isn't an instruction. An assignment (MOV), however, is an instruction and costs at least one cycle (if not bundled with other compatible instructions; in that case, they will run in parallel but still consume at least one cycle together). – keltar Mar 31 '14 at 05:43