6

OK, I have 80,000 "Box" meshes with simple textures. I have set a view distance and only draw the ones you can see, which leaves 600 to 1,000 for the DrawModel function below. The problem is that I only get 10 frames per second, and my view distance is poor. Also, I have tested all my code, and the "mesh.Draw()" call costs 30 frames per second; nothing else takes anywhere near that much. Any help?

        private void DrawModel(MeshHolder tmpMH)
        {          
            Model tmpDrawModel = (Model)_Meshs[tmpMH.MeshFileName];
            Matrix[] transforms = new Matrix[tmpDrawModel.Bones.Count];
            tmpDrawModel.CopyAbsoluteBoneTransformsTo(transforms);
            foreach (ModelMesh mesh in tmpDrawModel.Meshes)
            {
                foreach (BasicEffect effect in mesh.Effects)
                {

                    effect.LightingEnabled = false;

                    effect.TextureEnabled = true;
                    effect.Texture = (Texture2D)_Textures[tmpMH.GetTexture(Count)]; 



                    effect.View = _MainCam.View;
                    effect.Projection = _projection;
                    effect.World =
                         transforms[mesh.ParentBone.Index] *
                        Matrix.CreateFromYawPitchRoll(tmpMH.Rotation.Y, tmpMH.Rotation.X, tmpMH.Rotation.Z) *
                        Matrix.CreateScale(tmpMH.Scale) *
                        Matrix.CreateTranslation(tmpMH.Position);
                }

                mesh.Draw();
            }
        }
codegreen
  • This is very similar to [this question](http://stackoverflow.com/questions/5268192/how-many-low-poly-models-can-xna-handle). – Andrew Russell Jun 03 '11 at 04:21

2 Answers

7

What is killing your performance - as you say - is ModelMesh.Draw. When you draw models, it works like this:

for each frame
  for each Model
    for each ModelMesh // you call Draw(), which does:
      for each ModelMeshPart
        for each Effect
          for each EffectPass
            Draw some triangles // sends a batch of instructions to the GPU

So the question is: how many times each frame are you sending a batch to the GPU? You can only send a few thousand batches per frame before you saturate the CPU, hitting the "batch limit". (Each batch uses up CPU time in the graphics driver. It also uses some bandwidth and GPU time, but CPU time dominates.)

You may want to read this answer and this answer and this slide deck for additional information.

The solution is to modify your scene (eg: combine some mesh parts, do some culling, add instancing support) to reduce the number of batches you are sending to the GPU.
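As a rough illustration of the "combine some mesh parts" option, here is a minimal sketch that merges many static boxes into one vertex list and one index list, so they could be submitted as a single batch rather than one batch per box. It is only a sketch under assumptions: `BoxBatcher` and `Merge` are hypothetical names, `System.Numerics` types stand in for XNA's `Vector3` and `Matrix` so the snippet runs outside XNA, texture coordinates are omitted, and it assumes all merged boxes share one texture (or a texture atlas).

```csharp
using System.Collections.Generic;
using System.Numerics;

// Hypothetical helper, not part of XNA: bakes each box's world transform
// into its vertices and appends them to shared vertex/index lists.
public static class BoxBatcher
{
    // Corners of a unit cube centred on the origin (positions only).
    static readonly Vector3[] Corners =
    {
        new Vector3(-0.5f, -0.5f, -0.5f), new Vector3( 0.5f, -0.5f, -0.5f),
        new Vector3( 0.5f,  0.5f, -0.5f), new Vector3(-0.5f,  0.5f, -0.5f),
        new Vector3(-0.5f, -0.5f,  0.5f), new Vector3( 0.5f, -0.5f,  0.5f),
        new Vector3( 0.5f,  0.5f,  0.5f), new Vector3(-0.5f,  0.5f,  0.5f),
    };

    // Two triangles per face, six faces (36 indices into Corners).
    static readonly int[] Indices =
    {
        0, 2, 1,  0, 3, 2,
        4, 5, 6,  4, 6, 7,
        0, 1, 5,  0, 5, 4,
        3, 6, 2,  3, 7, 6,
        0, 4, 7,  0, 7, 3,
        1, 2, 6,  1, 6, 5,
    };

    // Appends one transformed cube per world matrix to the output lists.
    public static void Merge(IReadOnlyList<Matrix4x4> worlds,
                             List<Vector3> vertices, List<int> indices)
    {
        foreach (Matrix4x4 world in worlds)
        {
            int baseVertex = vertices.Count;   // index offset for this box
            foreach (Vector3 corner in Corners)
                vertices.Add(Vector3.Transform(corner, world));
            foreach (int i in Indices)
                indices.Add(baseVertex + i);
        }
    }
}
```

In a real XNA project the merged lists would be uploaded once to a vertex buffer and an index buffer and drawn with a single indexed draw call. This only suits boxes that do not move, since the world transform is baked into the vertices.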


Also, try to avoid things like this in your Draw and Update loop:

Matrix[] transforms = new Matrix[tmpDrawModel.Bones.Count];

You should really do your best to avoid memory allocations that happen each frame - as they will eventually cause an expensive garbage collection and potentially a frame-rate hiccup (especially on Xbox). Try to store your buffer somewhere and reuse it.
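One way to store and reuse the buffer is a small cache keyed by model name. The sketch below is hedged: `TransformCache` and `Get` are hypothetical names, and `System.Numerics.Matrix4x4` stands in for XNA's `Matrix` so the snippet compiles outside XNA.

```csharp
using System.Collections.Generic;
using System.Numerics;

// Hypothetical cache: one bone-transform array per model name,
// allocated the first time it is requested and reused afterwards.
public sealed class TransformCache
{
    private readonly Dictionary<string, Matrix4x4[]> _buffers =
        new Dictionary<string, Matrix4x4[]>();

    public Matrix4x4[] Get(string modelName, int boneCount)
    {
        Matrix4x4[] buffer;
        // Allocate only when there is no buffer yet or the bone count changed.
        if (!_buffers.TryGetValue(modelName, out buffer)
            || buffer.Length != boneCount)
        {
            buffer = new Matrix4x4[boneCount];
            _buffers[modelName] = buffer;
        }
        return buffer;
    }
}
```

In the question's DrawModel, the `new Matrix[...]` line would then become something like `Matrix[] transforms = cache.Get(tmpMH.MeshFileName, tmpDrawModel.Bones.Count);` (with the cache holding XNA `Matrix` arrays), followed by the existing `CopyAbsoluteBoneTransformsTo(transforms)` call.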

Andrew Russell
  • The BatchBatchBatch.pdf link doesn't seem to be working now. Here's one that is working: http://origin-developer.nvidia.com/docs/IO/8230/BatchBatchBatch.pdf?q=docs/IO/8230/BatchBatchBatch.pdf – Venesectrix Jun 16 '11 at 19:02
  • That one is broken too, now. I've updated it to a newer one that works. – Andrew Russell Dec 10 '13 at 08:38
3
effect.World =
    transforms[mesh.ParentBone.Index] *
    Matrix.CreateFromYawPitchRoll(
      tmpMH.Rotation.Y, tmpMH.Rotation.X, tmpMH.Rotation.Z) *
    Matrix.CreateScale(tmpMH.Scale) *
    Matrix.CreateTranslation(tmpMH.Position);

I haven't profiled this, but I suspect this line is a bottleneck. Creating and multiplying matrices is quite expensive! I understand this code is necessary, so unless you can precompute these matrices, I'd try:

// Assumes tmpMH.Scale and tmpMH.Position are Vector3 fields;
// ref arguments cannot be properties.
Matrix rotation, scale, translation, temp1, temp2, world;

Matrix.CreateFromYawPitchRoll(
    tmpMH.Rotation.Y, tmpMH.Rotation.X, tmpMH.Rotation.Z, out rotation);
Matrix.CreateScale(ref tmpMH.Scale, out scale);
Matrix.CreateTranslation(ref tmpMH.Position, out translation);
Matrix.Multiply(ref transforms[mesh.ParentBone.Index], ref rotation, out temp1);
Matrix.Multiply(ref temp1, ref scale, out temp2);
Matrix.Multiply(ref temp2, ref translation, out world);

// effect.World is a property, so it cannot be passed as an out argument.
effect.World = world;

This might be faster since there's no need to copy each matrix on the stack for parameter passing (more than 20 times less stuff to copy!)

BlackBear
  • I *seriously* doubt that the un-micro-optimised matrix calculations are dominating CPU usage here. Also `effect.World` can't be an `out` parameter. – Andrew Russell Jun 03 '11 at 03:46
  • @Andrew Russel: In my opinion this will help. Copying things around is quite expensive, not in terms of CPU usage; it simply slows everything down because the CPU needs to wait for the memory (http://en.wikipedia.org/wiki/Wait_state) – BlackBear Jun 03 '11 at 10:56
  • You're not going to hit main memory with either version of the code. The small amount of stack data here will easily fit in CPU cache. And putting the CPU into a wait state (such as by cache-miss) still counts as "usage" towards your CPU-limit. Your code will "help", due to requiring fewer instructions - but only by a *minuscule* amount - and certainly not enough to get the OP below the CPU-limit. – Andrew Russell Jun 04 '11 at 05:33