
There are a lot of popular talks this year on C++ cache utilization optimizations (like this one). From those videos it seems like the advice is to have "god objects" (pseudocode):

class apples {
    std::vector<int>   property_1_Values;   // all property_1 values, contiguous
    std::vector<float> property_2_Values;   // all property_2 values, contiguous
};

instead of

class apple {
    int   property_1;
    float property_2;
};

so that iterating from the N'th element to the M'th is cache-optimal (they also note that the CPU's prefetcher can predict not only ++/-- access patterns but also constant-stride ones).
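To make that concrete, here is a minimal sketch I wrote for illustration (the names `apple_aos`, `apples_soa`, `sum_aos` and `sum_soa` are my own, not from the talks): a hot loop that reads only `property_1` touches far fewer cache lines with the SoA layout, because every fetched line is packed with nothing but the values the loop actually uses.

#include <cstddef>
#include <vector>

// Array-of-structures (AoS): each apple's fields sit next to each other.
struct apple_aos {
    int   property_1;
    float property_2;
};

// Structure-of-arrays (SoA): each field lives in its own contiguous array.
struct apples_soa {
    std::vector<int>   property_1_Values;
    std::vector<float> property_2_Values;
};

// Hot loop that reads only property_1. With AoS every cache line also
// drags in the unused property_2 bytes; with SoA each line is packed
// with property_1 values only, so fewer lines have to be fetched.
long long sum_aos(const std::vector<apple_aos>& a, std::size_t n, std::size_t m) {
    long long sum = 0;
    for (std::size_t i = n; i < m; ++i)
        sum += a[i].property_1;
    return sum;
}

long long sum_soa(const apples_soa& a, std::size_t n, std::size_t m) {
    long long sum = 0;
    for (std::size_t i = n; i < m; ++i)
        sum += a.property_1_Values[i];
    return sum;
}

Which version wins still depends on the access pattern, as the comments below point out: if the loop reads property_1 and property_2 together, the AoS layout can be the better one.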

Well, I can see the point, and I can also see how to restructure my program's logic to fit such a model... yet it feels like a really bad architectural idea: creating god objects, reinventing inheritance... So it seems to me that this should be a compiler optimization, not the programmer's headache.

So I wonder: which OO language VM/compiler has already implemented such object restructuring at program compile/execution time (so that an OO programmer would not have to make such hand-made optimizations)? .NET, JVM, Clang, anyone?

Update:

Using a profiler as guidance for implementing such a thing is a really sad answer: to implement such a god object well one would need tons of profiling, debugging, etc. (such a god object is in a sense a micro-GC, because shrinking the vectors on every object addition or removal would be painful...). That is why I hoped existing VMs had already done it. I have not seen code generators or template classes in C++ that would provide a factory with a reasonable interface... so the idea of rolling one by hand seems scary when you have 100k+ lines of code and a new, not-well-tested god object...
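For reference, this is roughly the bookkeeping I mean. A hypothetical hand-rolled SoA container (the `apple_store` name and its interface are made up for illustration, not taken from any library) has to keep every per-property vector in lock-step on every add and remove:

#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical hand-rolled SoA container: every add/remove must keep all
// per-property vectors in lock-step, which is the "micro GC"-like
// bookkeeping mentioned above.
class apple_store {
public:
    std::size_t add(int p1, float p2) {
        property_1_Values.push_back(p1);
        property_2_Values.push_back(p2);
        return property_1_Values.size() - 1;   // index of the new element
    }

    // Swap-and-pop removal: O(1), avoids shifting the whole vector, but it
    // moves the last element into slot i, so any external index pointing at
    // that element becomes stale.
    void remove(std::size_t i) {
        std::swap(property_1_Values[i], property_1_Values.back());
        std::swap(property_2_Values[i], property_2_Values.back());
        property_1_Values.pop_back();
        property_2_Values.pop_back();
    }

    std::size_t size() const { return property_1_Values.size(); }

    int&   property_1(std::size_t i) { return property_1_Values[i]; }
    float& property_2(std::size_t i) { return property_2_Values[i]; }

private:
    std::vector<int>   property_1_Values;
    std::vector<float> property_2_Values;
};

Even this toy version has to pick a removal policy (swap-and-pop here, which invalidates the index of the moved element); that is exactly the kind of decision I would prefer a VM to make from profile data.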

    My gut feeling is that it would be incredibly unlikely for anything like this to be an automatic, invisible optimization. There are so many high-level visible differences between the two cases that you'd have to perform an insanely deep analysis to guarantee that this change is not observable. Moreover, which version is better is *highly* profile dependent. Don't take this one point as something general and too important. Optimizations like this are highly situational, and always metric-driven. – Kerrek SB Jul 06 '14 at 10:34
  • Problem for me is that it seems quite a scary idea to create a mix of god objects and good traditional OO patterns... as you said, it is essentially profile-guided optimization... – DuckQueen Jul 06 '14 at 10:42
  • The good news is that you really need to do this kind of optimization for critical code paths only. Remember that premature optimization is the root of all evil. – Erbureth Jul 06 '14 at 10:43
  • At the presentation they talked about a 100x difference for some cases... The problem I see is that it would take a really long time, even for an experienced programmer, to implement such an optimization throughout a 100k+ line program running in a production environment... – DuckQueen Jul 06 '14 at 10:48
  • Yes, but this is likely to be that effective only on heavily-repeated subroutines operating on large amounts of data. The majority of the code is likely not to be affected at all. – Erbureth Jul 06 '14 at 10:53
  • The standard answer for any perf question applies again: use a profiler. – Hans Passant Jul 06 '14 at 11:18
  • Wouldn't this fall to pieces if you were accessing property_1 and property_2 pairs? There is no optimal layout that is independent of access patterns. – juanchopanza Jul 06 '14 at 11:23
  • @juanchopanza: no - 64-byte cache lines from each of the 2 (or N, what have you) arrays would be loaded into L1 cache (according to the linked video), and you have up to ~16 KB per thread in L1 and much more in L2, up to ~256 KB, depending on the CPU/accelerator – DuckQueen Jul 06 '14 at 11:39
  • Then you should explain why that would be better than accessing each `apple` in an array of apples. You can't expect everyone to watch the video. – juanchopanza Jul 06 '14 at 11:42

0 Answers