openmp - Use omp simd without a for
I have the following instructions:
    unsigned long int xdiff = seq1.x ^ seq2.x;
    unsigned long int ydiff = seq1.y ^ seq2.y;
    unsigned long int zdiff = seq1.z ^ seq2.z;
Is it possible to vectorize them using omp simd?
Actually, if you define the positions as an array, you don't need anything: the compiler will vectorize it for you.
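    struct position {
        unsigned long pos[3];
    };

    struct position foo(struct position seq1, struct position seq2) {
        struct position diff;
        /* Bound of 2 kept from the original post; it matches the single
           128-bit pxor in the listing below. Use i < 3 to cover pos[2] too. */
        for (int i = 0; i < 2; ++i)
            diff.pos[i] = seq1.pos[i] ^ seq2.pos[i];
        return diff;
    }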
GCC (since 4.6) will vectorize this when you use the -O3 flag. If you provide architecture-specific flags (for example, for the Intel vector extensions: -msse4.2, -mavx, etc.), you can control which vector instruction set the compiler has to use. If you are building for your own machine, you can compile with -march=native.
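To check which instruction set a given combination of flags actually enabled, one option is to look at the macros the compiler predefines. The sketch below assumes GCC or Clang on x86, where -mavx defines __AVX__, -msse4.2 defines __SSE4_2__, and so on; the function name enabled_vector_isa is made up for illustration.

```c
/* Hypothetical helper (not from the original answer): report the widest
   x86 vector extension enabled at compile time. It relies on the macros
   GCC and Clang predefine for -mavx2, -mavx, -msse4.2, -march=native... */
const char *enabled_vector_isa(void)
{
#if defined(__AVX2__)
    return "AVX2";
#elif defined(__AVX__)
    return "AVX";
#elif defined(__SSE4_2__)
    return "SSE4.2";
#elif defined(__SSE2__)
    return "SSE2";
#else
    return "none detected";
#endif
}
```

Printing the returned string from a test program built once with -mavx and once without should show whether the flag took effect.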
    foo(position, position):
            movdqu  xmm1, xmmword ptr [rsp+32]
            mov     rax, rdi
            movdqu  xmm0, xmmword ptr [rsp+8]
            pxor    xmm0, xmm1
            movdqu  xmmword ptr [rdi], xmm0
            ret
If you "unroll the loop manually" (as in your example):
    diff.pos[0] = seq1.pos[0] ^ seq2.pos[0];
    diff.pos[1] = seq1.pos[1] ^ seq2.pos[1];
    diff.pos[2] = seq1.pos[2] ^ seq2.pos[2];
this is no longer the case:
    foo(position, position):
            mov     rdx, qword ptr [rsp+32]
            xor     rdx, qword ptr [rsp+8]
            mov     rax, rdi
            mov     qword ptr [rdi], rdx
            mov     rdx, qword ptr [rsp+40]
            xor     rdx, qword ptr [rsp+16]
            mov     qword ptr [rdi+8], rdx
            mov     rdx, qword ptr [rsp+48]
            xor     rdx, qword ptr [rsp+24]
            mov     qword ptr [rdi+16], rdx
            ret
Also, the #pragma omp simd directive can only be applied to loops:
simd [2.8.1]: Applied to a loop to indicate that the loop can be transformed into a SIMD loop.
    #pragma omp simd [clause[ [,] clause] ...]
    for-loops
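Putting the two points together, here is a minimal sketch of the directive applied to the array version of the XOR. The names position_xor and POSITION_DIMS are made up for illustration, and the loop runs over all three components. With GCC the pragma is honored when compiling with -fopenmp or -fopenmp-simd; without either flag it is ignored and the loop runs as plain C with the same result.

```c
#define POSITION_DIMS 3  /* x, y, z */

struct position {
    unsigned long pos[POSITION_DIMS];
};

/* Hypothetical helper (name not from the original post): XOR two
   positions component by component. */
struct position position_xor(struct position seq1, struct position seq2)
{
    struct position diff;

    /* The directive must be followed directly by a for loop. */
    #pragma omp simd
    for (int i = 0; i < POSITION_DIMS; ++i)
        diff.pos[i] = seq1.pos[i] ^ seq2.pos[i];

    return diff;
}
```

Whether a three-iteration loop actually gains anything from SIMD depends on the compiler and the target, so it is worth checking the generated assembly as above.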