To quote Goldberg (a paper that every Computer Scientist, Computational Scientist, and possibly even every scientist who programs, should read):
Due to roundoff errors, the associative laws of algebra do not
necessarily hold for floating-point numbers.
This means that when you change the order of operations—even when using ostensibly associative arithmetic—you are likely to get slightly different answers.
Parallelism, by definition, results in different ordering of operations relative to serial arithmetic. "Embarrasingly parallel" computations, that is, computations where each output element is computed independently from all others, sometimes do not have to worry about this. But collective operations, like reductions or scans, and spatial neighborhood computations, such stencils (as in FDTD), do experience this effect.
In practice, even using a different compiler (and even different compiler options) can change the result of floating point computation, even when compiling the same code, with or without parallelism.