Monday, March 8, 2010 at 11:15 AM
I finally got to the stage of being able to render our first level using the deferred lighting renderer. It is not complete yet, however. Only the default materials required for level01 have been updated. Also, because of these changes, refraction needs to be redone with a different approach.
Along the way, I hit a few problems. The first one was my own mistake. When designing the G-Buffer, I assumed I would be able to construct the lighting pass with just the depth and normal. I was wrong. I needed the specular power value as well. Hence, I had to go back to the drawing board, and decided to go for the least accurate layout, R8G8B8A8, where we store the compacted view space normal in the RGB channels and the spec power in the alpha channel. Interestingly, it turned out pretty good. So much for the "not accurate" crap mentioned in the CryEngine3 PowerPoint presentation slides. Personally, I can't really distinguish the inaccuracy. Besides, with good textures provided by artists, this small error isn't really a big deal, especially considering that we are not trying to achieve realism. What we want is beauty and style. :-)
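To make the R8G8B8A8 layout concrete, here is a minimal CPU-side sketch of that kind of packing. It is not the actual shader code: the names `Vec3`, `Texel`, `packGBuffer` and `unpackGBuffer` are made up for illustration, and the `kMaxSpecPower` scale of 128 is an assumption, since the post does not say how the spec power is mapped into the alpha channel.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper types for illustration only.
struct Vec3 { float x, y, z; };
struct Texel { std::uint8_t r, g, b, a; };
struct GBufferSample { Vec3 normal; float specPower; };

// Assumed upper bound used to squeeze spec power into [0, 1].
constexpr float kMaxSpecPower = 128.0f;

// Clamp a float to [0, 1] and quantize it to 8 bits with rounding.
inline std::uint8_t toByte(float v) {
    v = std::max(0.0f, std::min(1.0f, v));
    return static_cast<std::uint8_t>(std::lround(v * 255.0f));
}

// Store a unit view space normal in RGB (remapped from [-1, 1] to [0, 1])
// and the spec power in A.
Texel packGBuffer(const Vec3& n, float specPower) {
    assert(specPower >= 0.0f);
    return Texel{ toByte(n.x * 0.5f + 0.5f),
                  toByte(n.y * 0.5f + 0.5f),
                  toByte(n.z * 0.5f + 0.5f),
                  toByte(specPower / kMaxSpecPower) };
}

// Reverse the remap and renormalize, since 8-bit quantization
// leaves the decoded vector slightly off unit length.
GBufferSample unpackGBuffer(const Texel& t) {
    Vec3 n{ t.r / 255.0f * 2.0f - 1.0f,
            t.g / 255.0f * 2.0f - 1.0f,
            t.b / 255.0f * 2.0f - 1.0f };
    const float len = std::sqrt(n.x * n.x + n.y * n.y + n.z * n.z);
    if (len > 0.0f) { n.x /= len; n.y /= len; n.z /= len; }
    return GBufferSample{ n, t.a / 255.0f * kMaxSpecPower };
}
```

Each component only gets 256 steps, which is where the inaccuracy comes from; renormalizing on unpack hides part of it.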
Another issue was the way Ogre does its rendering. For every render target, Ogre does a scene traversal to find all visible renderables and render them. I found this unacceptable, because it means Ogre would traverse the scene at least twice: first for the G-Buffer stage, and again for the final compositing and forward rendering stage. This is a waste of CPU resources, so I ended up listening to the render queue events during the G-Buffer pass and keeping my own copy of all queued renderables. Then I manually inject that render queue during the final compositing stage with a custom subclass of SceneMgrQueuedRenderableVisitor that refetches the right material technique based on the new material scheme.
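The idea of capturing the visible set once and replaying it under a different material scheme can be sketched without Ogre at all. Everything below is a hypothetical stand-in (`Material`, `Renderable`, `CapturedQueue` and the scheme names are not Ogre types or the real implementation); it only illustrates the "traverse once, render twice" pattern.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins; none of these are Ogre classes.
struct Material {
    // Maps a material scheme name to the technique used under that scheme.
    std::map<std::string, std::string> techniques;
};

struct Renderable {
    std::string name;
    const Material* material;
};

class CapturedQueue {
public:
    // Called once per visible renderable while the G-Buffer pass is queued.
    // Only a pointer is kept, so the renderable must outlive the frame.
    void onRenderableQueued(const Renderable& r) {
        assert(r.material != nullptr);
        captured_.push_back(&r);
    }

    // Replay the captured list under another scheme without a new scene
    // traversal. Returns "name:technique" entries in place of draw calls.
    std::vector<std::string> replay(const std::string& scheme) const {
        std::vector<std::string> draws;
        for (const Renderable* r : captured_) {
            auto it = r->material->techniques.find(scheme);
            if (it == r->material->techniques.end()) {
                continue;  // no technique for this scheme, skip it
            }
            draws.push_back(r->name + ":" + it->second);
        }
        return draws;
    }

    // Drop the captured list at the end of the frame.
    void clear() { captured_.clear(); }

private:
    std::vector<const Renderable*> captured_;
};
```

The cost of the second pass becomes a linear walk over an already culled list plus one technique lookup per renderable, instead of a second visibility query.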
And the end result? I had our first level running at ~40-70fps, with an average of 50fps, at 1024x768. This is with HDR on but without shadows. Not too bad for a crappy 8600GTS.
Oh, one interesting thing to note is that since the G-Buffer stage does not require much texture sampling, it actually renders really fast. Because of that, and because we keep the Z-Buffer intact throughout the whole process, we actually gain some performance during the final compositing pass due to early Z rejection. So you lose some, you win some. :-P (In theory, if you do a Z-only pre-pass before filling the G-Buffer, you might speed things up further if your scene is complex. But it will also increase the batch count, so I'm not too sure it's worthwhile.)
Unfortunately for us, because we did not plan to have deferred lighting from the start, we have to live with some bad decisions made in the past. One notable issue is that our G-Buffer stage requires the diffuse, normal and spec maps in the worst case. This is because for an alpha-rejection material, we need to sample the alpha channel of the diffuse map for alpha rejection, and the specular power from the spec map. This means that we are sampling at least two textures (the normal map and the spec map) for each material during the G-Buffer stage. This is not ideal, as we should try to sample as few textures as possible in this pass.
That said, if I could fix this, I would put the specular power in the normal map's alpha instead of the spec map's alpha, so that typically only one texture sample is needed in the G-Buffer stage. This would also free up the spec map's alpha for a reflection factor on envmap reflective materials, which would be a win-win solution (one less reflection factor texture map). Sadly, we're already a long way into art asset creation. Changing this now would mean loads of work fixing the old textures and materials.
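For reference, the repacking itself is simple if the textures are available as raw pixel buffers. The sketch below assumes both maps are tightly packed 8-bit RGBA with the same dimensions; the function name `moveSpecPowerToNormalAlpha` is made up, and loading and saving the image files is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns a copy of the normal map whose alpha channel holds the spec power
// taken from the spec map's alpha channel.
// Assumes both buffers are tightly packed RGBA8 with identical dimensions.
std::vector<std::uint8_t> moveSpecPowerToNormalAlpha(
        const std::vector<std::uint8_t>& normalRgba,
        const std::vector<std::uint8_t>& specRgba) {
    assert(normalRgba.size() == specRgba.size());
    assert(normalRgba.size() % 4 == 0);

    std::vector<std::uint8_t> out(normalRgba);
    // Alpha is the fourth byte of every RGBA texel.
    for (std::size_t i = 3; i < out.size(); i += 4) {
        out[i] = specRgba[i];
    }
    return out;
}
```

The hard part is not this loop but re-exporting every existing texture and updating the materials and shaders that expect the old layout.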
Posted by
Lf3T-Hn4D
2 comments:
I'm glad to hear that you got that far. Can you compare the framerates you get with DS against your previous forward rendering?
Maybe you can batch your assets with a script. Putting together alpha channels etc. might be worth writing a small C++ program (e.g. with Ogre or FreeImage only). With this script you could keep creating your assets like you do now.
Same goes for Material Script batching as well.
Thanks :) I just did a test on an older build and it got me about 58fps on the same scene posted above. Obviously DS is always slightly slower than doing them forward. But as you can see, the decrease in fps is probably not too bad.
As for batching assets with a script, I could probably do that. However, I don't have the time to build more tools as it is. I'm already behind schedule on the level editor features. I need to add a few things so the artists can set up lighting more easily.
There's also a lot I want to do to cut down RAM and CPU usage, namely allowing artists to specify which batches have no physics interaction, and their shadow casting state.
But I need to see how much I can improve. I still have to get script integration in, so that each level can have its own unique scripts handling its unique level features.
As for the material script side, I have no issue there as I'm using my own custom material script generator. I basically built a system that generates abstract materials from special templates based on user settings (shadow, parallax, etc). The artist then simply inherits from these fixed-name abstract materials and applies textures through Ogre's material script variables. Unfortunately I did not blog about this system since I felt it was quite a hacked up job.
On a side note, I'm currently adding shadows back for directional light. I'm almost done :)