Daz Studio Iray - Rendering Hardware Benchmarking

Comments

  • outrider42 Posts: 2,964
    edited August 11

    chrislb said:

    outrider42 said:

    ....

    It would be interesting if we could build a couple of benchmarks designed to purposely favor one of these metrics over the other, like the opposite of the Show me the power scene. But we have to be careful. Any such scene needs to balance using assets that all users have, and also keep in mind VRAM limitations. That is why the scene in this thread is like it is. It is a low render resolution and capped at 1800 iterations so that users with weaker hardware can run it within a reasonable time frame.

    With the benchmarks being limited to free assets, I wonder if there are enough free Daz assets available out there to make new benchmarks that highlight performance differences in different types of rendering? I tried looking before at sites with free-to-the-public Daz and 3D rendering assets, and some of them had limited availability in certain types of assets.

    It can be done, the result may not be pretty but it would serve its purpose. We can use the textures from the base models but mix them using the new surfaces, along with some free to use textures. Regardless of how it looks, Iray still needs to make the calculations and resolve the material. It doesn't even need to be human skin, just something that takes up space in the surface tab to make Iray do more work.

    We do have more assets now. Remember Daz gave us free textures recently for Genesis 8.1 to add more diversity. These are available to anyone who downloads the updated Genesis 8 essentials. We could use these as our source, and simply mix them in the surfaces. We could also ignore 8.1 here and use these on regular Genesis 8. This would leave a gap in the neck area, but allow more people to do the test because not everyone has upgraded to 4.15 and 8.1 yet. Again, how they look in the render doesn't matter. Then drop them inside a simple box. This box also has complex surfaces to shade, with the light bouncing SSS everywhere. This sort of scattering can be expensive to resolve, sometimes more so than a perfect mirror reflection. We can use a few shaded cubes, rather than balls, because the cubes use far less geometry than spheres.

    We can also use complex surface settings in the clothing and hair. We can make the clothing surfaces like the skin, just maybe change the color to not freak people out by them wearing skin-shaded clothes. It's just leather, not people! And fake leather, too. Yep, no people were skinned in the creation of this leather jacket. Now get back in line for your Soylent Green! <.<

    The resolution of the render is not too important for the test. Render resolution should scale equally as you go up, across any hardware. Resolution mainly just increases VRAM and RAM use. It takes longer because there are more pixels to resolve. The content of the scene is what we want to test. A longer render might help get past boost clocks, but it would also make the test harder for people who lack decent hardware. We have to be careful how the scene is balanced so that users with older cards can still use it. Being able to run a bench and then see exactly how much faster other cards are can help users make that decision to upgrade. I think a test that takes roughly the same time this bench does on the same hardware would be fine. I don't want to have people cranking out 30-second renders with a 3080, but I don't want people to take hours because they have some 900 series card, either. I think a real shading test will be quite different from geometry tests: RTX cards cannot benefit much from their RT cores in it, so they will not run away from the GTX cards quite as easily. There might even be some surprises.

  • RayDAnt Posts: 914

    outrider42 said:

    chrislb said:

    outrider42 said:

    ....

    It would be interesting if we could build a couple of benchmarks designed to purposely favor one of these metrics over the other, like the opposite of the Show me the power scene. But we have to be careful. Any such scene needs to balance using assets that all users have, and also keep in mind VRAM limitations. That is why the scene in this thread is like it is. It is a low render resolution and capped at 1800 iterations so that users with weaker hardware can run it within a reasonable time frame.

    With the benchmarks being limited to free assets, I wonder if there are enough free Daz assets available out there to make new benchmarks that highlight performance differences in different types of rendering? I tried looking before at sites with free-to-the-public Daz and 3D rendering assets, and some of them had limited availability in certain types of assets.

    It can be done, the result may not be pretty but it would serve its purpose. We can use the textures from the base models but mix them using the new surfaces, along with some free to use textures. Regardless of how it looks, Iray still needs to make the calculations and resolve the material. It doesn't even need to be human skin, just something that takes up space in the surface tab to make Iray do more work.

    We do have more assets now. Remember Daz gave us free textures recently for Genesis 8.1 to add more diversity. These are available to anyone who downloads the updated Genesis 8 essentials. We could use these as our source, and simply mix them in the surfaces. We could also ignore 8.1 here and use these on regular Genesis 8. This would leave a gap in the neck area, but allow more people to do the test because not everyone has upgraded to 4.15 and 8.1 yet. Again, how they look in the render doesn't matter. Then drop them inside a simple box. This box also has complex surfaces to shade, with the light bouncing SSS everywhere. This sort of scattering can be expensive to resolve, sometimes more so than a perfect mirror reflection. We can use a few shaded cubes, rather than balls, because the cubes use far less geometry than spheres.

    We can also use complex surface settings in the clothing and hair. We can make the clothing surfaces like the skin, just maybe change the color to not freak people out by them wearing skin-shaded clothes. It's just leather, not people! And fake leather, too. Yep, no people were skinned in the creation of this leather jacket. Now get back in line for your Soylent Green! <.<

    The resolution of the render is not too important for the test. Render resolution should scale equally as you go up, across any hardware. Resolution mainly just increases VRAM and RAM use. It takes longer because there are more pixels to resolve. The content of the scene is what we want to test. A longer render might help get past boost clocks, but it would also make the test harder for people who lack decent hardware. We have to be careful how the scene is balanced so that users with older cards can still use it. Being able to run a bench and then see exactly how much faster other cards are can help users make that decision to upgrade. I think a test that takes roughly the same time this bench does on the same hardware would be fine. I don't want to have people cranking out 30-second renders with a 3080, but I don't want people to take hours because they have some 900 series card, either. I think a real shading test will be quite different from geometry tests: RTX cards cannot benefit much from their RT cores in it, so they will not run away from the GTX cards quite as easily. There might even be some surprises.

    For a bulleted list of the basic requirements a potential benchmarking scene really needs to meet in order to be widely accessible see this section from the start of this thread.

    A little trivia: The benchmarking scene around which this thread is based was actually designed (by me) prior to the release of RTX compatibility in Iray as a best-guess attempt at creating a scene that would take advantage of RTX accelerated rendering processes specifically while still being widely user-compatible. I thought (perhaps naively) at the time that putting a limited amount of geometric surfaces inside a lit, reflective box (i.e., endless light-bouncing opportunities) would do the trick. As it turned out, even a few light bounces interacting with complex surface geometry is where it's at for RT acceleration. If I could go back and do this thread's benchmarking scene again, it would probably look something like the previously mentioned strand hair benchmark scene created by @Takeo.Kensei. Although I don't think that one is even runnable on lower end hardware.

  • outrider42 Posts: 2,964

    Takeo's bench has been run by CPUs, so it isn't restricted to high end hardware. Interestingly, if you look over the results, some CPUs actually handled the test pretty well. The reason why some people could not run the test was because it uses strand hair, which at the time the thread was made was still very new. Strand hair required a newer version of Daz Studio that some people had still not updated to. By now this should no longer be an issue.

    But Takeo's strand hair test only shows one aspect of rendering.

    Most benchmarking sites run tests on different software or scenes to better demonstrate the differences in performance. It makes sense for Daz, because there is no such thing as a "real world scene" or "average scene" for Daz users. Everybody does their own thing and can have wildly different use cases. So I believe it makes sense to have different scenes that are each built to focus on one specific task.

    -A good geometry test. We kind of have this with Takeo's scene. But maybe we could make one a little tougher.

    -A scene focused on shading.

    -A scene that combines geometry and shading for a more balanced scene.

    -We could also do a reflection based test, too, that could be interesting. But 4 tests might be too many.

    I think such a suite would be really helpful for future buyers. The geometry would show off the RT cores, and the shading would show how well the CUDA cores run. The 3rd test could be considered a more "real world" like test. These tests would give buyers a better idea of just what their hardware can do, and how much performance they may get. GPUs are constantly changing, and the introduction of RT cores had a huge impact on this type of rendering. But this can always change, too. Nvidia may choose to create GPUs that focus more on ray tracing than shading in the future, or they may reverse course (this sounds unlikely right now, but you never know). No one knows just what the future holds.

    But the hardware is not the only thing that can change. The software can certainly change as well. A lot of tasks are handled with pure brute force, but maybe one day they will create a solution to make those tasks faster. Just recently we saw that Iray changed how it handled normal maps, and this change made objects with normal maps render faster than they did before. But this change was a very specific one, given it only impacted scenes that used normal maps. A scene that did not have any normal maps was not affected by this change.

  • JamesJAB Posts: 1,754
    edited August 12

    Just got and installed my shiny new RTX A4000 16GB card, so let's see how well it performs!

    System Configuration
    System/Motherboard: Dell Precision T7610
    CPU: Dual Intel Xeon E5-2650 V2 @ 2.60GHz
    GPU: PNY Quadro RTX A5000 24GB, PNY Quadro RTX A4000 16GB
    System Memory: 64GB quad channel 1600MHz Reg ECC
    OS Drive: 1TB WD SATA SSD - WDS100T2B0B-00YS70
    Asset Drive: 4TB WD RE4 - WD40EZRX-00S
    Operating System: Windows 10 Pro 21H1
    Nvidia Drivers Version: 471.68
    Daz Studio Version: 4.15.0.14 Public Build

    #1 RTX A4000 only

    Benchmark Results
    2021-08-11 20:56:04.778 Finished Rendering
    2021-08-11 20:56:04.855 Total Rendering Time: 3 minutes 7.83 seconds

    2021-08-11 20:56:32.350 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : Device statistics:
    2021-08-11 20:56:32.350 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : CUDA device 1 (NVIDIA RTX A4000): 1800 iterations, 14.973s init, 166.740s render

    Iteration Rate: (1800 / 166.740) 10.8
    Loading Time: ((0 + 180 + 7.8) - 166.7) = 21.1 seconds

     

     

    #2 RTX A5000 + RTX A4000

    Benchmark Results
    2021-08-11 21:09:13.739 Finished Rendering
    2021-08-11 21:09:13.816 Total Rendering Time: 1 minutes 18.21 seconds

    2021-08-11 21:09:25.730 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : Device statistics:
    2021-08-11 21:09:25.730 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : CUDA device 0 (NVIDIA RTX A5000): 1064 iterations, 2.751s init, 69.878s render
    2021-08-11 21:09:25.730 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : CUDA device 1 (NVIDIA RTX A4000): 736 iterations, 3.018s init, 70.111s render

    Iteration Rate: (1800 / 70.111) 25.7
    Loading Time: ((0 + 60 + 18.2) - 70.7) = 7.5 seconds

  • skyeshots Posts: 49
    edited August 14
    Thank you for sharing this! These are two excellent video cards. I would love to see the A5000 benchmarked on its own.
  • skyeshots Posts: 49
    edited August 14

    JamesJAB said:

    Just got and installed my shiny new RTX A4000 16GB card, so let's see how well it performs!

    ...
    Benchmark Results
    2021-08-11 20:56:04.778 Finished Rendering
    2021-08-11 20:56:04.855 Total Rendering Time: 3 minutes 7.83 seconds

    2021-08-11 20:56:32.350 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : Device statistics:
    2021-08-11 20:56:32.350 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : CUDA device 1 (NVIDIA RTX A4000): 1800 iterations, 14.973s init, 166.740s render

    Iteration Rate: (1800 / 166.740) 10.8
    Loading Time: ((0 + 180 + 7.8) - 166.7) = 21.1 seconds

     

    #2 RTX A5000 + RTX A4000

    Benchmark Results
    2021-08-11 21:09:13.739 Finished Rendering
    2021-08-11 21:09:13.816 Total Rendering Time: 1 minutes 18.21 seconds

    2021-08-11 21:09:25.730 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : Device statistics:
    2021-08-11 21:09:25.730 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : CUDA device 0 (NVIDIA RTX A5000): 1064 iterations, 2.751s init, 69.878s render
    2021-08-11 21:09:25.730 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : CUDA device 1 (NVIDIA RTX A4000): 736 iterations, 3.018s init, 70.111s render

    Iteration Rate: (1800 / 70.111) 25.7
    Loading Time: ((0 + 60 + 18.2) - 70.7) = 7.5 seconds

    For your test #2 results, I think this should read:

    • Rendering Performance: (1800/70.111) = 25.67 iterations per second
    • Loading Time: (78.21-70.111) = 8.1 seconds
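
    For anyone who wants to check these two figures without doing the arithmetic by hand, here is a rough Python sketch. It assumes the log lines look exactly like the ones pasted in this thread; the function names (`total_seconds`, `summarize`) and the optional "hours" part of the pattern are my own assumptions, not anything Daz Studio or Iray provides. For multi-GPU runs it divides the summed iterations by the longest per-device render time, which is how the numbers above were worked out.

```python
import re

# Per-device statistics line, in the format pasted in this thread.
DEVICE_RE = re.compile(
    r"CUDA device (\d+) \((.+?)\):\s+(\d+) iterations, ([\d.]+)s init, ([\d.]+)s render"
)
# "Total Rendering Time: 1 minutes 18.21 seconds" (the hours part is an assumption).
TOTAL_RE = re.compile(
    r"Total Rendering Time:(?: (\d+) hours?)?(?: (\d+) minutes?)?(?: ([\d.]+) seconds?)?"
)

# Log excerpt copied from test #2 (RTX A5000 + RTX A4000).
SAMPLE_LOG = """\
2021-08-11 21:09:13.816 Total Rendering Time: 1 minutes 18.21 seconds
2021-08-11 21:09:25.730 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : CUDA device 0 (NVIDIA RTX A5000): 1064 iterations, 2.751s init, 69.878s render
2021-08-11 21:09:25.730 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : CUDA device 1 (NVIDIA RTX A4000): 736 iterations, 3.018s init, 70.111s render
"""


def total_seconds(line):
    """Convert a 'Total Rendering Time' line to seconds."""
    match = TOTAL_RE.search(line)
    if match is None:
        raise ValueError("no total rendering time in line")
    hours, minutes, seconds = match.groups()
    return int(hours or 0) * 3600 + int(minutes or 0) * 60 + float(seconds or 0)


def summarize(log_text):
    """Return the iteration rate and loading time for one benchmark run."""
    total = None
    devices = []  # (device name, iterations, render seconds)
    for line in log_text.splitlines():
        if "Total Rendering Time:" in line:
            total = total_seconds(line)
        match = DEVICE_RE.search(line)
        if match:
            devices.append((match.group(2), int(match.group(3)), float(match.group(5))))
    if total is None or not devices:
        raise ValueError("log excerpt is missing the total time or device statistics")
    iterations = sum(d[1] for d in devices)
    render = max(d[2] for d in devices)  # the slowest device sets the render time
    return {"rate": iterations / render, "loading": total - render}


if __name__ == "__main__":
    result = summarize(SAMPLE_LOG)
    print(f"Rendering Performance: {result['rate']:.2f} iterations per second")
    print(f"Loading Time: {result['loading']:.1f} seconds")
```

    Pasting a different run's lines into `SAMPLE_LOG` should give that run's figures, as long as the log format matches.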

    I wanted to check the math because of the load times, especially the A4000 test > 21 seconds. Are you using Slots 2 and 4 on the T7610? This may seem counter-intuitive given the layout of the board, but if you are using Slots 1 and 3, it may be slowing your overall workloads. Moving your assets to an SSD may also (dramatically) change the pace of your real-world creative processes.

    These peripheral speeds (drives, CPU, bus, etc.) do not mean much in terms of final rendering performance, but they certainly do affect the overall creative processes.

  • JamesJAB Posts: 1,754

    skyeshots said:

    JamesJAB said:

    Just got and installed my shiny new RTX A4000 16GB card, so let's see how well it performs!

    ...
    Benchmark Results
    2021-08-11 20:56:04.778 Finished Rendering
    2021-08-11 20:56:04.855 Total Rendering Time: 3 minutes 7.83 seconds

    2021-08-11 20:56:32.350 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : Device statistics:
    2021-08-11 20:56:32.350 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : CUDA device 1 (NVIDIA RTX A4000): 1800 iterations, 14.973s init, 166.740s render

    Iteration Rate: (1800 / 166.740) 10.8
    Loading Time: ((0 + 180 + 7.8) - 166.7) = 21.1 seconds

     

    #2 RTX A5000 + RTX A4000

    Benchmark Results
    2021-08-11 21:09:13.739 Finished Rendering
    2021-08-11 21:09:13.816 Total Rendering Time: 1 minutes 18.21 seconds

    2021-08-11 21:09:25.730 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : Device statistics:
    2021-08-11 21:09:25.730 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : CUDA device 0 (NVIDIA RTX A5000): 1064 iterations, 2.751s init, 69.878s render
    2021-08-11 21:09:25.730 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : CUDA device 1 (NVIDIA RTX A4000): 736 iterations, 3.018s init, 70.111s render

    Iteration Rate: (1800 / 70.111) 25.7
    Loading Time: ((0 + 60 + 18.2) - 70.7) = 7.5 seconds

    For your test #2 results, I think this should read:

    • Rendering Performance: (1800/70.111) = 25.67 iterations per second
    • Loading Time: (78.21-70.111) = 8.1

    I wanted to check the math because of the load times, especially the A4000 test > 21 seconds. Are you using Slots 2 and 4 on the T7610? This may seem counter-intuitive given the layout of the board, but if you are using Slots 1 and 3, it may be slowing your overall workloads. Moving your assets to an SSD may also (dramatically) change the pace of your real-world creative processes.

    These peripheral speeds (drives, CPU, bus, etc.) do not mean much in terms of final rendering performance, but they certainly do affect the overall creative processes.

    Not sure why the load time was so short with both GPUs running together.  I think I hit render on a blank scene before loading the benchmark scene on that Daz Studio load. 

    If you go a few posts up I already did the A5000 alone.  It arrived a week or so before the A4000.

    "System Configuration
    System/Motherboard: Dell Precision T7610
    CPU: Dual Intel Xeon E5-2650 V2 @ 2.60GHz
    GPU: PNY Quadro RTX A5000 24GB
    System Memory: 64GB quad channel 1600MHz Reg ECC
    OS Drive: 1TB WD SATA SSD - WDS100T2B0B-00YS70
    Asset Drive: 4TB WD RE4 - WD40EZRX-00S
    Operating System: Windows 10 Pro 21H1
    Nvidia Drivers Version: 471.11
    Daz Studio Version: 4.15.0.14 Public Build

    Benchmark Results
    2021-07-20 18:52:23.964 Finished Rendering
    2021-07-20 18:52:24.064 Total Rendering Time: 2 minutes 15.7 seconds
    2021-07-20 18:55:04.599 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : Device statistics:
    2021-07-20 18:55:04.599 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : CUDA device 0 (NVIDIA RTX A5000): 1800 iterations, 15.520s init, 113.449s render

    Iteration Rate: (1800 / 113.449) 15.8
    Loading Time: ((0 + 120 + 15.7) - 113.5) = 22.2 seconds"

    In my machine, the Quadro RTX A5000 is using PCIE lanes from one CPU and the RTX A4000 is using lanes from the other.  The Xeon E5 CPUs have 40 PCIE lanes each, so the Precision T7610 has 2 PCIE 16x slots for each CPU (4 16x slots total).  In my opinion, the RTX A5000 and A4000 are so much better than the consumer RTX 30x0 cards because of power draw.  These cards are single power connector cards (8 pin for the A5000 and 6 pin for the A4000).  There are no wild, crazy power use spikes, just a hard cap of 230W (A5000) and 140W (A4000).  And it doesn't hurt that the A4000 is a single slot card with 16GB of VRAM.
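
    To put rough numbers on how the two cards compare and how well they scale together, here is a small Python sketch using only the figures posted in this thread. Two caveats: the solo A5000 run was on driver 471.11 while the other two runs used 471.68, and the per-watt figure divides by the board power cap (230W and 140W), not by measured draw, so treat it as a ballpark only. The variable names are just for illustration.

```python
# (iterations, render seconds) copied from the device statistics lines in this thread.
SOLO = {
    "RTX A5000": (1800, 113.449),
    "RTX A4000": (1800, 166.740),
}
COMBINED = (1800, 70.111)  # both cards together; the slowest device's render time

# Board power caps quoted above. These are caps, not measured power draw.
POWER_CAP_W = {"RTX A5000": 230, "RTX A4000": 140}


def rate(iterations, seconds):
    """Iterations per second."""
    return iterations / seconds


solo_rates = {name: rate(*run) for name, run in SOLO.items()}
ideal = sum(solo_rates.values())   # what perfect scaling would give
measured = rate(*COMBINED)         # what the dual-GPU run actually gave
scaling = measured / ideal
per_watt = {name: solo_rates[name] / POWER_CAP_W[name] for name in SOLO}

if __name__ == "__main__":
    for name in SOLO:
        print(f"{name}: {solo_rates[name]:.2f} it/s, {per_watt[name]:.4f} it/s per watt of cap")
    print(f"Combined: {measured:.2f} it/s vs {ideal:.2f} it/s ideal ({scaling:.0%} scaling)")
```

    On these numbers the pair lands within a few percent of the sum of the two solo runs, and the A4000 comes out slightly ahead of the A5000 per watt of cap.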

  • RayDAnt Posts: 914
    JamesJAB said:

    skyeshots said:

    JamesJAB said:

    Just got and installed my shiny new RTX A4000 16GB card, so let's see how well it performs!

    ...
    Benchmark Results
    2021-08-11 20:56:04.778 Finished Rendering
    2021-08-11 20:56:04.855 Total Rendering Time: 3 minutes 7.83 seconds

    2021-08-11 20:56:32.350 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : Device statistics:
    2021-08-11 20:56:32.350 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : CUDA device 1 (NVIDIA RTX A4000): 1800 iterations, 14.973s init, 166.740s render

    Iteration Rate: (1800 / 166.740) 10.8
    Loading Time: ((0 + 180 + 7.8) - 166.7) = 21.1 seconds

     

    #2 RTX A5000 + RTX A4000

    Benchmark Results
    2021-08-11 21:09:13.739 Finished Rendering
    2021-08-11 21:09:13.816 Total Rendering Time: 1 minutes 18.21 seconds

    2021-08-11 21:09:25.730 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : Device statistics:
    2021-08-11 21:09:25.730 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : CUDA device 0 (NVIDIA RTX A5000): 1064 iterations, 2.751s init, 69.878s render
    2021-08-11 21:09:25.730 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : CUDA device 1 (NVIDIA RTX A4000): 736 iterations, 3.018s init, 70.111s render

    Iteration Rate: (1800 / 70.111) 25.7
    Loading Time: ((0 + 60 + 18.2) - 70.7) = 7.5 seconds

    For your test #2 results, I think this should read:

    • Rendering Performance: (1800/70.111) = 25.67 iterations per second
    • Loading Time: (78.21-70.111) = 8.1

    I wanted to check the math because of the load times, especially the A4000 test > 21 seconds. Are you using Slots 2 and 4 on the T7610? This may seem counter-intuitive given the layout of the board, but if you are using Slots 1 and 3, it may be slowing your overall workloads. Moving your assets to an SSD may also (dramatically) change the pace of your real-world creative processes.

    These peripheral speeds (drives, CPU, bus, etc.) do not mean much in terms of final rendering performance, but they certainly do affect the overall creative processes.

    Not sure why the load time was so short with both GPUs running together.  I think I hit render on a blank scene before loading the benchmark scene on that Daz Studio load. 

    If you go a few posts up I already did the A5000 alone.  It arrived a week or so before the A4000.

    "System Configuration
    System/Motherboard: Dell Precision T7610
    CPU: Dual Intel Xeon E5-2650 V2 @ 2.60GHz
    GPU: PNY Quadro RTX A5000 24GB
    System Memory: 64GB quad channel 1600MHz Reg ECC
    OS Drive: 1TB WD SATA SSD - WDS100T2B0B-00YS70
    Asset Drive: 4TB WD RE4 - WD40EZRX-00S
    Operating System: Windows 10 Pro 21H1
    Nvidia Drivers Version: 471.11
    Daz Studio Version: 4.15.0.14 Public Build

    Benchmark Results
    2021-07-20 18:52:23.964 Finished Rendering
    2021-07-20 18:52:24.064 Total Rendering Time: 2 minutes 15.7 seconds
    2021-07-20 18:55:04.599 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : Device statistics:
    2021-07-20 18:55:04.599 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : CUDA device 0 (NVIDIA RTX A5000): 1800 iterations, 15.520s init, 113.449s render

    Iteration Rate: (1800 / 113.449) 15.8
    Loading Time: ((0 + 120 + 15.7) - 113.5) = 22.2 seconds"

    In my machine, the Quadro RTX A5000 is using PCIE lanes from one CPU and the RTX A4000 is using lanes from the other.  The Xeon E5 CPUs have 40 PCIE lanes each, so the Precision T7610 has 2 PCIE 16x slots for each CPU (4 16x slots total).  In my opinion, the RTX A5000 and A4000 are so much better than the consumer RTX 30x0 cards because of power draw.  These cards are single power connector cards (8 pin for the A5000 and 6 pin for the A4000).  There are no wild, crazy power use spikes, just a hard cap of 230W (A5000) and 140W (A4000).  And it doesn't hurt that the A4000 is a single slot card with 16GB of VRAM.

    What's your cooling setup for your A4000 and A5000? I've been sorely tempted for some time to pick up an A5000 to accompany my Titan RTX. However, the lack of water-cooling blocks for any of the Ampere-era pro cards has been holding me back, since I run a fully spec'd out custom WC rig with room to spare, which really isn't designed for anything but watercooling (Google "Tower 900" from Thermaltake.)

  • RayDAntRayDAnt Posts: 914
    edited August 15
    JamesJAB said:

    skyeshots said:

    JamesJAB said:

    Just got and installed my shiny new RTX A4000 16GB card, so lets see how well it performs!

    ...
    Benchmark Results
    2021-08-11 20:56:04.778 Finished Rendering
    2021-08-11 20:56:04.855 Total Rendering Time: 3 minutes 7.83 seconds

    2021-08-11 20:56:32.350 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : Device statistics:
    2021-08-11 20:56:32.350 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : CUDA device 1 (NVIDIA RTX A4000): 1800 iterations, 14.973s init, 166.740s render

    Iteration Rate: (1800 / 166.740) 10.8
    Loading Time: ((0 + 180 + 7.8) - 166.7) = 21.1 seconds

     

    #2 RTX A5000 + RTX A4000

    Benchmark Results
    2021-08-11 21:09:13.739 Finished Rendering
    2021-08-11 21:09:13.816 Total Rendering Time: 1 minutes 18.21 seconds

    2021-08-11 21:09:25.730 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : Device statistics:
    2021-08-11 21:09:25.730 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : CUDA device 0 (NVIDIA RTX A5000): 1064 iterations, 2.751s init, 69.878s render
    2021-08-11 21:09:25.730 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : CUDA device 1 (NVIDIA RTX A4000): 736 iterations, 3.018s init, 70.111s render

    Iteration Rate: (1800 / 70.111) 25.7
    Loading Time: ((0 + 60 + 18.2) - 70.7) = 7.5 seconds

    For your test #2 results, I think this should read:

    • Rendering Performance: (1800/70.111) = 25.67 iterations per second
    • Loading Time: (78.21-70.111) = 8.1

    I wanted to check the math because of the load times, especially the A4000 test > 21 seconds. Are you using Slots 2 and 4 on the T7610? This may seem counter-intuitive given the layout of the board, but if you are using Slots 1 and 3, it may be slowing your overall workloads. Moving your assets to an SSD may also (dramatically) change the pace of your real-world creative processes.

    These peripheral speeds (drives, CPU, bus, etc.) do not mean much in terms of final rendering performance, but they certainly do affect the overall creative processes.

    Not sure why the load time was so short with both GPUs running together.  I think I hit render on a blank scene before loading the benchmark scene on that Daz Studio load.

    If you go a few posts up I already did the A5000 alone.  It arrived a week or so before the A4000.

    "System Configuration
    System/Motherboard: Dell Precision T7610
    CPU: Dual Intel Xeon E5-2650 V2 @ 2.60GHz
    GPU: PNY Quadro RTX A5000 24GB
    System Memory: 64GB quad channel 1600MHz Reg ECC
    OS Drive: 1TB WD SATA SSD - WDS100T2B0B-00YS70
    Asset Drive: 4TB WD RE4 - WD40EZRX-00S
    Operating System: Windows 10 Pro 21H1
    Nvidia Drivers Version: 471.11
    Daz Studio Version: 4.15.0.14 Public Build

    Benchmark Results
    2021-07-20 18:52:23.964 Finished Rendering
    2021-07-20 18:52:24.064 Total Rendering Time: 2 minutes 15.7 seconds
    2021-07-20 18:55:04.599 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : Device statistics:
    2021-07-20 18:55:04.599 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : CUDA device 0 (NVIDIA RTX A5000): 1800 iterations, 15.520s init, 113.449s render

    Iteration Rate: (1800 / 113.449) 15.8
    Loading Time: ((0 + 120 + 15.7) - 113.5) = 22.2 seconds"

    In my machine, the Quadro RTX A5000 is using PCIE lanes from one CPU and the RTX A4000 is using lanes from the other.  The Xeon E5 CPUs do have 48 lanes each, though, so the Precision T7610 has 2 PCIE 16x slots for each CPU (4 16x slots total).  In my opinion, the RTX A5000 and A4000 are so much better than the consumer RTX 30x0 cards because of power draw.  These cards are single power connector cards (8-pin for the A5000 and 6-pin for the A4000).  There are no wild power spikes, just a hard cap of 230W (A5000) and 140W (A4000).  And it doesn't hurt that the A4000 is a single-slot card with 16GB of VRAM.

    What's your cooling setup for your A4000/5000's? I've been sorely tempted for some time to pick up an A5000 to accompany my Titan RTX. However, the lack of water-cooling blocks for any of the Ampere era pro cards has been holding me back since I run a fully spec'd out custom WC rig with room to spare, which really isn't designed for anything but watercooling (Google "Tower 900" from Thermaltake.)

  • outrider42outrider42 Posts: 2,964
    A big portion of that is because of how hard the gaming cards are cranked up. Undervolting the 3090 can drive the crazy power consumption down, and downclocking can drive it even further. It is possible to drop as much as 100 Watts off, and that will impact temps as well. Of course this can affect performance some, too.

    That said, the A series cards are extremely well tuned. It is incredible they can get that kind of performance on a single 8-pin connector. It is a shame that Nvidia didn't drop a card like that to gamers because it would have been mind blowing.
  • JamesJABJamesJAB Posts: 1,754

    RayDAnt said:

    JamesJAB said:

    skyeshots said:

    JamesJAB said:

    Just got and installed my shiny new RTX A4000 16GB card, so lets see how well it performs!

    ...
    Benchmark Results
    2021-08-11 20:56:04.778 Finished Rendering
    2021-08-11 20:56:04.855 Total Rendering Time: 3 minutes 7.83 seconds

    2021-08-11 20:56:32.350 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : Device statistics:
    2021-08-11 20:56:32.350 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : CUDA device 1 (NVIDIA RTX A4000): 1800 iterations, 14.973s init, 166.740s render

    Iteration Rate: (1800 / 166.740) 10.8
    Loading Time: ((0 + 180 + 7.8) - 166.7) = 21.1 seconds

     

    #2 RTX A5000 + RTX A4000

    Benchmark Results
    2021-08-11 21:09:13.739 Finished Rendering
    2021-08-11 21:09:13.816 Total Rendering Time: 1 minutes 18.21 seconds

    2021-08-11 21:09:25.730 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : Device statistics:
    2021-08-11 21:09:25.730 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : CUDA device 0 (NVIDIA RTX A5000): 1064 iterations, 2.751s init, 69.878s render
    2021-08-11 21:09:25.730 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : CUDA device 1 (NVIDIA RTX A4000): 736 iterations, 3.018s init, 70.111s render

    Iteration Rate: (1800 / 70.111) 25.7
    Loading Time: ((0 + 60 + 18.2) - 70.7) = 7.5 seconds

    For your test #2 results, I think this should read:

    • Rendering Performance: (1800/70.111) = 25.67 iterations per second
    • Loading Time: (78.21-70.111) = 8.1

    I wanted to check the math because of the load times, especially the A4000 test > 21 seconds. Are you using Slots 2 and 4 on the T7610? This may seem counter-intuitive given the layout of the board, but if you are using Slots 1 and 3, it may be slowing your overall workloads. Moving your assets to an SSD may also (dramatically) change the pace of your real-world creative processes.

    These peripheral speeds (drives, CPU, bus, etc.) do not mean much in terms of final rendering performance, but they certainly do affect the overall creative processes.

    Not sure why the load time was so short with both GPUs running together.  I think I hit render on a blank scene before loading the benchmark scene on that Daz Studio load.

    If you go a few posts up I already did the A5000 alone.  It arrived a week or so before the A4000.

    "System Configuration
    System/Motherboard: Dell Precision T7610
    CPU: Dual Intel Xeon E5-2650 V2 @ 2.60GHz
    GPU: PNY Quadro RTX A5000 24GB
    System Memory: 64GB quad channel 1600MHz Reg ECC
    OS Drive: 1TB WD SATA SSD - WDS100T2B0B-00YS70
    Asset Drive: 4TB WD RE4 - WD40EZRX-00S
    Operating System: Windows 10 Pro 21H1
    Nvidia Drivers Version: 471.11
    Daz Studio Version: 4.15.0.14 Public Build

    Benchmark Results
    2021-07-20 18:52:23.964 Finished Rendering
    2021-07-20 18:52:24.064 Total Rendering Time: 2 minutes 15.7 seconds
    2021-07-20 18:55:04.599 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : Device statistics:
    2021-07-20 18:55:04.599 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : CUDA device 0 (NVIDIA RTX A5000): 1800 iterations, 15.520s init, 113.449s render

    Iteration Rate: (1800 / 113.449) 15.8
    Loading Time: ((0 + 120 + 15.7) - 113.5) = 22.2 seconds"

    In my machine, the Quadro RTX A5000 is using PCIE lanes from one CPU and the RTX A4000 is using lanes from the other.  The Xeon E5 CPUs do have 48 lanes each, though, so the Precision T7610 has 2 PCIE 16x slots for each CPU (4 16x slots total).  In my opinion, the RTX A5000 and A4000 are so much better than the consumer RTX 30x0 cards because of power draw.  These cards are single power connector cards (8-pin for the A5000 and 6-pin for the A4000).  There are no wild power spikes, just a hard cap of 230W (A5000) and 140W (A4000).  And it doesn't hurt that the A4000 is a single-slot card with 16GB of VRAM.

    What's your cooling setup for your A4000/5000's? I've been sorely tempted for some time to pick up an A5000 to accompany my Titan RTX. However, the lack of water-cooling blocks for any of the Ampere era pro cards has been holding me back since I run a fully spec'd out custom WC rig with room to spare, which really isn't designed for anything but watercooling (Google "Tower 900" from Thermaltake.)

    My cooling setup is the stock setup for the Dell Precision T7610 Workstation.  It is a pure front intake / rear exhaust setup.  This setup is perfect for cooling blower style GPUs like the Nvidia Quadro line.

  • Anyone interested in a benchmark run on a massively sub-par computer (a 10-year-old Dell XPS laptop)?

    System Configuration
    System/Motherboard: Dell Inc. 0XN71K
    CPU: Intel Core i7-2670QM @ stock (2.2GHz)
    GPU: 3GB NVIDIA GeForce  GT 555M Graphics Card @ stock
    System Memory: (BRAND MODEL unknown) 6GB (1X4GB + 1X2GB) @ 1333MHz DDR3 Dual Channel
    OS Drive: (BRAND MODEL unknown) 1TB (2x500GB) Serial ATA (7200RPM) Dual HDD
    Asset Drive: SAME
    Operating System: Windows7 Home Premium 64 bit Service Pack 1
    Nvidia Drivers Version: 391.35
    Daz Studio Version: 4.15.0.2 Pro Edition 64 bit
    Optix Prime Acceleration: n/a

    Benchmark Results
    2021-08-17 16:41:17.971 Finished Rendering
    2021-08-17 16:41:18.066 Total Rendering Time: 6 hours 51 minutes 20.2 seconds

    2021-08-17 16:41:17.590 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : Device statistics:
    2021-08-17 16:41:17.590 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : CPU:      1800 iterations, 20.147s init, 24655.225s render


    Iteration Rate: (1800 / 24655.225s) = 0.073006837

    Loading Time: ((6 * 3600 + 51 * 60 + 20.2) - 24655.225) = 24.975 seconds

     

  • outrider42outrider42 Posts: 2,964
    edited August 17

    the_assassin said:

    Anyone interested in a benchmark run on a massively sub-par computer (a 10-year-old Dell XPS laptop)?

    System Configuration
    System/Motherboard: Dell Inc. 0XN71K
    CPU: Intel Core i7-2670QM @ stock (2.2GHz)
    GPU: 3GB NVIDIA GeForce  GT 555M Graphics Card @ stock
    System Memory: (BRAND MODEL unknown) 6GB (1X4GB + 1X2GB) @ 1333MHz DDR3 Dual Channel
    OS Drive: (BRAND MODEL unknown) 1TB (2x500GB) Serial ATA (7200RPM) Dual HDD
    Asset Drive: SAME
    Operating System: Windows7 Home Premium 64 bit Service Pack 1
    Nvidia Drivers Version: 391.35
    Daz Studio Version: 4.15.0.2 Pro Edition 64 bit
    Optix Prime Acceleration: n/a

    Benchmark Results
    2021-08-17 16:41:17.971 Finished Rendering
    2021-08-17 16:41:18.066 Total Rendering Time: 6 hours 51 minutes 20.2 seconds

    2021-08-17 16:41:17.590 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : Device statistics:
    2021-08-17 16:41:17.590 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : CPU:      1800 iterations, 20.147s init, 24655.225s render


    Iteration Rate: (1800 / 24655.225s) = 0.073006837

    Loading Time: ((6 * 3600 + 51 * 60 + 20.2) - 24655.225) = 24.975 seconds

    Actually, yes, this is always interesting to see. The GPU in this device is no longer supported by Iray, so this render is all on CPU. This is a 4 core 8 thread mobile from 2011, and this demonstrates how bad things can be when trying to use old hardware. It took almost 7 hours, while even a modest Nvidia GPU can run this test in under 10 minutes.

    I am surprised you were able to get Daz Studio to function on this. On the old laptop I have, Daz Studio is so laggy and unresponsive it is basically unusable. But it only has 2 cores and it was always junk to begin with.

    However, you may be in a situation where you could STREAM a desktop to your laptop, so the desktop is doing all the work and the laptop is only displaying the video feed. This requires a decent network in the home to do, but is very doable. The laptop may be so old that it lacks modern wifi standards, in which case you can pop a $10 wifi USB dongle on it and make streaming work better. A proper mouse will likely be needed, too.

  • skyeshotsskyeshots Posts: 49

    JamesJAB said:

    skyeshots said:

    ...

    I wanted to check the math because of the load times, especially the A4000 test > 21 seconds. Are you using Slots 2 and 4 on the T7610? This may seem counter-intuitive given the layout of the board, but if you are using Slots 1 and 3, it may be slowing your overall workloads. Moving your assets to an SSD may also (dramatically) change the pace of your real-world creative processes.

    These peripheral speeds (drives, CPU, bus, etc.) do not mean much in terms of final rendering performance, but they certainly do affect the overall creative processes.

    Not sure why the load time was so short with both GPUs running together.  I think I hit render on a blank scene before loading the benchmark scene on that Daz Studio load.

    If you go a few posts up I already did the A5000 alone.  It arrived a week or so before the A4000.

    ..

    The load times just seem out of balance with the overall build. The second test may have benefited from cached resources. Just keep away from slot 3 on the T7610 as it will always be limited to 4x transfers, regardless of the CPU count.

    It looks like there are 2 additional PCIe 3.0x16 slots above the CPUs as well. These are the ones that specifically benefit from the 2nd Xeon CPU installed. Using a PCIe SSD in one of those for the Daz asset drive would be a modest expense and big step up from the 5400 RPM WD you have now. Perhaps 20x faster transfer rates, with quicker load times.

  • skyeshotsskyeshots Posts: 49

    What's your cooling setup for your A4000/5000's? I've been sorely tempted for some time to pick up an A5000 to accompany my Titan RTX. However, the lack of water-cooling blocks for any of the Ampere era pro cards has been holding me back since I run a fully spec'd out custom WC rig with room to spare, which really isn't designed for anything but watercooling (Google "Tower 900" from Thermaltake.)

    Re: Thermaltake Tower 900: You should be able to vertical mount an exhaust style card with the tail facing up. There is no harm in mixing water and air, assuming no mechanical collisions. The one super important point about the blower cards though is that you need to offset the air that they exhaust with enough positive air pressure (intake fans) to properly feed the card(s). This might be the bigger obstacle in that tower, especially if you are using 480mm rads.

    For my setup, I have 4x 200mm fans as intake to offset 4 blower cards. Not sure if this is perfect, but it seems to get the job done.

  • RayDAntRayDAnt Posts: 914
    edited August 18

    skyeshots said:

    What's your cooling setup for your A4000/5000's? I've been sorely tempted for some time to pick up an A5000 to accompany my Titan RTX. However, the lack of water-cooling blocks for any of the Ampere era pro cards has been holding me back since I run a fully spec'd out custom WC rig with room to spare, which really isn't designed for anything but watercooling (Google "Tower 900" from Thermaltake.)

    Re: Thermaltake Tower 900: You should be able to vertical mount an exhaust style card with the tail facing up. There is no harm in mixing water and air, assuming no mechanical collisions. The one super important point about the blower cards though is that you need to offset the air that they exhaust with enough positive air pressure (intake fans) to properly feed the card(s). This might be the bigger obstacle in that tower, especially if you are using 480mm rads.

    For my setup, I have 4x 200mm fans as intake to offset 4 blower cards. Not sure if this is perfect, but it seems to get the job done.

    Yeah... I have double 560mm rads in completely independent GPU/CPU loops, making my system a great platform for both low temps and low noise intensive computing (even under full continuous load, my Titan RTX never goes beyond 13c above ambient room temp.) However the design of the Tower 900 is such that - regardless of how big the radiators in the back compartment are - the entire front motherboard chamber only gets a single 140mm fan's worth of active cooling. For the first six months or so of having/using the case I didn't yet have watercooling for the GPU side. So I ran it with the stock cooler. And let me tell you - the GPU temps weren't pretty.

    Although a blower card wouldn't be as bad as all that, I just can't in good conscience fork out the sort of money needed for an Ampere series pro card and not be able to take full advantage of the system I have prepared for it (the insane cooling headroom of my system was by design - from day one my plan was to eventually upgrade it to a quad card configuration.) WC hardware doesn't come cheap. (Availability permitting, of course) I would've picked up at least a 3090 ages ago (since waterblocks for those abound) if not for their hamstrung driver situation.

  • nonesuch00nonesuch00 Posts: 15,619

    outrider42 said:

    the_assassin said:

    Anyone interested in a benchmark run on a massively sub-par computer (a 10-year-old Dell XPS laptop)?

    System Configuration
    System/Motherboard: Dell Inc. 0XN71K
    CPU: Intel Core i7-2670QM @ stock (2.2GHz)
    GPU: 3GB NVIDIA GeForce  GT 555M Graphics Card @ stock
    System Memory: (BRAND MODEL unknown) 6GB (1X4GB + 1X2GB) @ 1333MHz DDR3 Dual Channel
    OS Drive: (BRAND MODEL unknown) 1TB (2x500GB) Serial ATA (7200RPM) Dual HDD
    Asset Drive: SAME
    Operating System: Windows7 Home Premium 64 bit Service Pack 1
    Nvidia Drivers Version: 391.35
    Daz Studio Version: 4.15.0.2 Pro Edition 64 bit
    Optix Prime Acceleration: n/a

    Benchmark Results
    2021-08-17 16:41:17.971 Finished Rendering
    2021-08-17 16:41:18.066 Total Rendering Time: 6 hours 51 minutes 20.2 seconds

    2021-08-17 16:41:17.590 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : Device statistics:
    2021-08-17 16:41:17.590 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : CPU:      1800 iterations, 20.147s init, 24655.225s render


    Iteration Rate: (1800 / 24655.225s) = 0.073006837

    Loading Time: ((6 * 3600 + 51 * 60 + 20.2) - 24655.225) = 24.975 seconds

    Actually, yes, this is always interesting to see. The GPU in this device is no longer supported by Iray, so this render is all on CPU. This is a 4 core 8 thread mobile from 2011, and this demonstrates how bad things can be when trying to use old hardware. It took almost 7 hours, while even a modest Nvidia GPU can run this test in under 10 minutes.

    I am surprised you were able to get Daz Studio to function on this. On the old laptop I have, Daz Studio is so laggy and unresponsive it is basically unusable. But it only has 2 cores and it was always junk to begin with.

    However, you may be in a situation where you could STREAM a desktop to your laptop, so the desktop is doing all the work and the laptop is only displaying the video feed. This requires a decent network in the home to do, but is very doable. The laptop may be so old that it lacks modern wifi standards, in which case you can pop a $10 wifi USB dongle on it and make streaming work better. A proper mouse will likely be needed, too.

    My PNY GeForce GTX 1650 Super 4GB did the scene in close to 13 minutes.

  • JamesJABJamesJAB Posts: 1,754

    Since we are now talking about 11 min render times....  The important takeaway on this one is that we are talking about a notebook with a 16GB GPU.

     

    System Configuration
    System/Motherboard: Dell Precision 7710
    CPU: Intel Xeon E3-1535M v5
    GPU: Quadro P5000 16GB
    System Memory: 64GB dual channel 2133MHz
    OS Drive: 1TB Crucial NVME SSD - CT1000P5SSD8
    Asset Drive: 2TB Intel NVME SSD - SSDPEKNW020T8
    Operating System: Windows 10 Pro 20H2
    Nvidia Drivers Version: 452.39
    Daz Studio Version: 4.15.0.25 Public Build

    Benchmark Results
    2021-08-18 23:38:28.200 Finished Rendering
    2021-08-18 23:38:28.402 Total Rendering Time: 11 minutes 13.46 seconds

    2021-08-18 23:40:40.765 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : Device statistics:
    2021-08-18 23:40:40.765 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : CUDA device 0 (Quadro P5000): 1800 iterations, 3.779s init, 663.504s render

    Iteration Rate: (1800 / 663.504) 2.7
    Loading Time: ((0 + 660 + 13.46) - 663.5) = 9.96 seconds

  • outrider42outrider42 Posts: 2,964

    I don't think that card has been benched here before, so that is a nice one. The P5000 is mostly a GTX 1070, with a few extra CUDA cores and of course double the VRAM.

    We have not had a 1080, 1070, or 1070ti benchmark done in a really long time, so this gives a small indication of where their performance may line up. At 2.7 iterations per second, this laptop P5000 slots right between the 1070ti and 1070 shown on the chart, edging closer to the 1070ti. The 3 GTX cards don't have marks for 4.14+ on the chart, so they should be a little faster now. That would be because of clock speed: since this P5000 is both a Quadro and a laptop part, it has lower clocks.

    I am curious where the 3050ti and 3050 fall, even though with 4GB of VRAM they would be limited in Daz today. They are currently only in laptops, and a desktop release has not been announced. But the laptops are out in the wild now.

  • Saxa -- SDSaxa -- SD Posts: 524
    edited August 19

    RayDAnt said:

    skyeshots said:

    What's your cooling setup for your A4000/5000's? I've been sorely tempted for some time to pick up an A5000 to accompany my Titan RTX. However, the lack of water-cooling blocks for any of the Ampere era pro cards has been holding me back since I run a fully spec'd out custom WC rig with room to spare, which really isn't designed for anything but watercooling (Google "Tower 900" from Thermaltake.)

    Re: Thermaltake Tower 900: You should be able to vertical mount an exhaust style card with the tail facing up. There is no harm in mixing water and air, assuming no mechanical collisions. The one super important point about the blower cards though is that you need to offset the air that they exhaust with enough positive air pressure (intake fans) to properly feed the card(s). This might be the bigger obstacle in that tower, especially if you are using 480mm rads.

    For my setup, I have 4x 200mm fans as intake to offset 4 blower cards. Not sure if this is perfect, but it seems to get the job done.

    Yeah... I have double 560mm rads in completely independent GPU/CPU loops, making my system a great platform for both low temps and low noise intensive computing (even under full continuous load, my Titan RTX never goes beyond 13c above ambient room temp.) However the design of the Tower 900 is such that - regardless of how big the radiators in the back compartment are - the entire front motherboard chamber only gets a single 140mm fan's worth of active cooling. For the first six months or so of having/using the case I didn't yet have watercooling for the GPU side. So I ran it with the stock cooler. And let me tell you - the GPU temps weren't pretty.

    Although a blower card wouldn't be as bad as all that, I just can't in good conscience fork out the sort of money needed for an Ampere series pro card and not be able to take full advantage of the system I have prepared for it (the insane cooling headroom of my system was by design - from day one my plan was to eventually upgrade it to a quad card configuration.) WC hardware doesn't come cheap. (Availability permitting, of course) I would've picked up at least a 3090 ages ago (since waterblocks for those abound) if not for their hamstrung driver situation.

    have the 900 tower too.

    My fix was to leave the glass on the main side between me & components.  The other 2 sides & back are open so negative air pressure is a non-issue.  Pure air and most extreme gaming my RTX3090 ends up less than my new AMD cpu at peak 48degC with AC Valhalla (fan curve is custom but still can't hear with headphones).  Do have a pile of fans and room temp was 21degC.  Vacuum out the PCs a few times a year anyways so dust is not a big issue.   Rather do that so far than maintain water connections.  Or that's what I think so far.

    In retrospect, do luv the vertical hang for these beefy GPUS, but that tower, despite being spacious and so much room for fans is curiously a bit shy for air.  But was especially French designed for liquid.  Would have preferred a 3rd air vent on top for GPU series.

    Still figuring out the best setup for me.  So always interested to read other people's choices.
  • JamesJABJamesJAB Posts: 1,754

    I've got a Quadro P4000, Geforce GTX 1080 and GTX 1080 ti installed in Win Server 2016 machines.  I'll see if Studio will play nice in Win Server, and run some benchmark renders for you

  • outrider42outrider42 Posts: 2,964

    I have two 1080tis, unless you just want to compare numbers to see if they line up with each other. I have run newer benches but they are not on the chart yet it seems. They are in a post somewhere, but I can dig the numbers up.

    I think at this point we have enough 3090 numbers, LOL. I think we could use some more variety. Even a lot of Turing has been forgotten since Ampere came along.

  • JamesJABJamesJAB Posts: 1,754
    edited August 20

    My two GTX 1080 ti cards are very different from each other.  One is an EVGA GTX 1080 ti SC2, and the other is a PNY GTX 1080 ti with a blower cooler.

    OK, so after figuring out how to get around the Win Server issue where Daz Studio does not want to run with elevated permissions....  And then figuring out what content needed installing for the benchmark.  I am now running the benchmark renders for my GTX 1080 and Quadro P4000 on Windows Server 2016.  So interesting fun fact with Windows Server, even though the video output device is built onto the motherboard (And I'm running from a remote desktop), it will choose the appropriate installed OpenGL device to render 3D content.

     

    System Configuration
    System/Motherboard: Dell Poweredge R720
    CPU: Dual Intel Xeon E5-2630L @ 2.0GHz
    GPU: Quadro P4000 8GB and Geforce GTX 1080 8GB
    System Memory: 64GB 1333MHz REG ECC DDR3
    OS Drive: 512GB SATA SSD
    Asset Drive = OS Drive
    Operating System: Windows Server 2016 Standard (ver 1607)
    Nvidia Drivers Version: 462.31 (Quadro Drivers)
    Daz Studio Version: 4.15.0.25 Public Build

    GTX 1080 only 

    Benchmark Results
    2021-08-20 17:25:55.473 Finished Rendering
    2021-08-20 17:25:55.518 Total Rendering Time: 10 minutes 17.85 seconds

    2021-08-20 17:26:54.347 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : Device statistics:
    2021-08-20 17:26:54.347 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : CUDA device 0 (GeForce GTX 1080): 1800 iterations, 5.906s init, 603.231s render

    Iteration Rate: (1800 / 603.231) 2.98
    Loading Time: ((0 + 600 + 17.85) - 603.23) = 14.62 seconds

     

    Quadro P4000 only

    Benchmark Results
    2021-08-20 17:40:58.659 Finished Rendering
    2021-08-20 17:40:58.734 Total Rendering Time: 11 minutes 23.26 seconds

    2021-08-20 17:42:48.085 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : Device statistics:
    2021-08-20 17:42:48.085 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : CUDA device 1 (Quadro P4000): 1800 iterations, 5.374s init, 670.393s render

    Iteration Rate: (1800 / 670.393) 2.68
    Loading Time: ((0 + 660 + 23.26) - 670.39) = 12.87 seconds

     

    GTX 1080 + Quadro P4000

    Benchmark Results

    2021-08-20 17:50:04.014 Finished Rendering
    2021-08-20 17:50:04.065 Total Rendering Time: 5 minutes 37.40 seconds

    2021-08-20 17:50:56.804 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : Device statistics:
    2021-08-20 17:50:56.804 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : CUDA device 0 (GeForce GTX 1080): 945 iterations, 5.792s init, 323.912s render
    2021-08-20 17:50:56.804 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : CUDA device 1 (Quadro P4000): 855 iterations, 5.771s init, 323.878s render

    Iteration Rate: (1800 / 323.91) 5.55
    Loading Time: ((0 + 300 + 37.40) - 323.91) = 13.49 seconds
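
    For anyone who would rather not do this arithmetic by hand, here is a rough Python sketch that pulls the two figures out of the log lines above. The regular expressions and the summarize() helper are my own illustration, not part of Daz Studio or Iray, and the sketch assumes the usual convention in this thread: iteration rate is total iterations divided by the longest device render time, and loading time is total rendering time minus that same render time.

```python
import re

# Log excerpt copied from the GTX 1080 + Quadro P4000 run above.
LOG = """\
2021-08-20 17:50:04.065 Total Rendering Time: 5 minutes 37.40 seconds
2021-08-20 17:50:56.804 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : CUDA device 0 (GeForce GTX 1080): 945 iterations, 5.792s init, 323.912s render
2021-08-20 17:50:56.804 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : CUDA device 1 (Quadro P4000): 855 iterations, 5.771s init, 323.878s render
"""

# Hypothetical patterns written for the log format shown in this thread.
TOTAL_RE = re.compile(
    r"Total Rendering Time:\s*(?:(\d+) hours? )?(?:(\d+) minutes? )?([\d.]+) seconds"
)
DEVICE_RE = re.compile(r"(\d+) iterations, ([\d.]+)s init, ([\d.]+)s render")


def summarize(log):
    """Return (iterations per second, loading time in seconds) for one run."""
    hours, minutes, seconds = TOTAL_RE.search(log).groups()
    total = int(hours or 0) * 3600 + int(minutes or 0) * 60 + float(seconds)
    # One (iterations, render seconds) pair per device statistics line.
    devices = [(int(i), float(r)) for i, _init, r in DEVICE_RE.findall(log)]
    iterations = sum(i for i, _ in devices)
    render = max(r for _, r in devices)  # the slowest device sets the pace
    return iterations / render, total - render


rate, load = summarize(LOG)
print(f"Iteration Rate: {rate:.2f}")
print(f"Loading Time: {load:.2f} seconds")
```

    The same function should also handle the single-GPU and CPU-only lines posted earlier, since it only looks at the "iterations, init, render" part of each device line.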

  • outrider42outrider42 Posts: 2,964

    Thank you for your efforts!

    So the 1080 did see a decent boost from 4.14. The previous bench was on a 2018 version of Iray, and they got 2.342 iterations per second. So hitting 2.98 is a solid boost, though like all the speed boosts observed, this involves scenes that use normal maps. If a scene happens to have no normal maps there is not much of a speed boost. But I'd wager most people have normal maps somewhere in their scene.

    The P4000 sits in a strange position between the 1060 6GB and 1070. The P4000 has 1792 CUDA cores. The 1070 has 1920, and the 1060 6GB only has 1280. The 1080 has 2560 cores. So the P4000 should slot closer to the 1070. However its iteration rate here was 2.68, which is not so far off the 1080. I expected it to be farther down from the 1080. Unless my math is wrong, the P4000 has 70% of the 1080's CUDA cores, but offers 89% of the performance? Very interesting.
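
    To double check that comparison, here is a quick Python sketch using only the core counts and iteration rates quoted in this thread. The variable names are just for illustration, and it says nothing about clocks or memory bandwidth, which also differ between the two cards.

```python
# CUDA core counts and single-card iteration rates as quoted in this thread.
cuda_cores = {"GTX 1080": 2560, "Quadro P4000": 1792}
iteration_rate = {"GTX 1080": 2.98, "Quadro P4000": 2.68}

# Share of the 1080's cores and of its measured performance that the P4000 has.
core_ratio = cuda_cores["Quadro P4000"] / cuda_cores["GTX 1080"]
perf_ratio = iteration_rate["Quadro P4000"] / iteration_rate["GTX 1080"]

# Values above 1.0 mean the P4000 gets more iterations out of each core.
per_core_advantage = perf_ratio / core_ratio

print(f"cores: {core_ratio:.0%}, performance: {perf_ratio:.1%}, "
      f"per-core advantage: {per_core_advantage:.2f}x")
```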

    Iray has always scaled really well with multiple cards, and it worked especially well here. If you add the individual iteration rates (2.98 + 2.68) you get 5.66. Together they hit 5.55, which is 98% scaling.
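
    The scaling figure is simply the combined iteration rate divided by the sum of the single-card rates. A minimal Python sketch, assuming that definition (scaling_efficiency is a name I made up) and using the rates posted in this thread for both multi-GPU runs:

```python
def scaling_efficiency(combined_rate, single_rates):
    """Combined iteration rate as a fraction of the summed single-card rates."""
    return combined_rate / sum(single_rates)


# GTX 1080 (2.98) + Quadro P4000 (2.68) rendered together at 5.55 it/s.
pascal_pair = scaling_efficiency(5.55, [2.98, 2.68])

# RTX A5000 (15.8) + RTX A4000 (10.8) rendered together at 25.7 it/s.
ampere_pair = scaling_efficiency(25.7, [15.8, 10.8])

print(f"GTX 1080 + P4000: {pascal_pair:.0%}")
print(f"A5000 + A4000:    {ampere_pair:.0%}")
```

    Both inputs are already rounded in the posts, so treat the last digit of the percentage as approximate.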

    I have an EVGA 1080ti SC as well. My other 1080ti is an MSI Gaming X. The Gaming X is massive compared to the SC, but that mass actually does give it better cooling. It cools far better, and as a result the fans are not as loud. So I use my Gaming X as my monitor card for, well, gaming. My EVGA only runs when I use Iray. However, the EVGA does perform a tiny bit better in Daz; I suspect that is because it is not driving the monitor. The difference is very slight: it always does a few more iterations than the Gaming X, and only a few.

  • JamesJABJamesJAB Posts: 1,754
    edited August 21

    The thing that makes the P4000 really stand out is two often overlooked specs....  Single slot and 105W (Quadro cards do not boost above the power limit like Geforce cards).

    Even with gaming, since my wife and I both use Precision 7710 mobile workstations as our gaming laptops (she has my old P4000 and mine is running a P5000), these cards do a really good job of gaming above their theoretical performance level.  There must be some voodoo magic at work in the Pascal based Quadro cards.  Heck, my notebook Quadro P5000 scored 2.7 on the Iray benchmark out of a 100W MXM card.

    It's pretty amazing what the Pascal quadro cards do on a limited hard power cap.

  • JamesJABJamesJAB Posts: 1,754

    And here is my other PowerEdge R720, home of the GTX 1080 Ti cards.

    System Configuration
    System/Motherboard: Dell PowerEdge R720
    CPU: Dual Intel Xeon E5-2640 @ 2.5GHz
    GPU: dual Geforce GTX 1080 ti 11GB
    System Memory: 48GB 1333MHz REG ECC DDR3
    OS Drive: 512GB SATA SSD
    Asset Drive = OS Drive
    Operating System: Windows Server 2016 Standard (ver 1607)
    Nvidia Drivers Version: 462.31 (Quadro Drivers)
    Daz Studio Version: 4.15.0.25 Public Build

    EVGA GTX 1080 ti SC2

    Benchmark Results

    2021-08-20 20:40:39.106 Finished Rendering
    2021-08-20 20:40:39.175 Total Rendering Time: 6 minutes 53.93 seconds

    2021-08-20 20:40:46.889 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : Device statistics:
    2021-08-20 20:40:46.889 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : CUDA device 0 (GeForce GTX 1080 Ti): 1800 iterations, 4.755s init, 400.922s render

    Iteration Rate: (1800 / 400.922) 4.49
    Loading Time: ((0 + 360 + 53.93) - 400.92) = 13.01 seconds

     

    PNY GTX 1080 ti Blower

    Benchmark Results

    2021-08-20 20:49:55.233 Finished Rendering
    2021-08-20 20:49:55.321 Total Rendering Time: 6 minutes 57.43 seconds

    2021-08-20 20:52:08.098 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : Device statistics:
    2021-08-20 20:52:08.098 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : CUDA device 1 (GeForce GTX 1080 Ti): 1800 iterations, 4.697s init, 403.989s render

    Iteration Rate: (1800 / 403.989) 4.46
    Loading Time: ((0 + 360 + 57.43) - 403.99) = 13.44 seconds

     

    Dual GTX 1080 ti

    Benchmark Results

    2021-08-20 20:29:25.763 Finished Rendering
    2021-08-20 20:29:25.850 Total Rendering Time: 3 minutes 46.48 seconds

    2021-08-20 20:29:31.673 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : Device statistics:
    2021-08-20 20:29:31.673 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : CUDA device 0 (GeForce GTX 1080 Ti): 922 iterations, 8.226s init, 209.300s render
    2021-08-20 20:29:31.673 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : CUDA device 1 (GeForce GTX 1080 Ti): 878 iterations, 5.883s init, 211.322s render

    Iteration Rate: (1800 / 211.322) 8.52
    Loading Time: ((0 + 180 + 46.48) - 211.322) = 15.16 seconds
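
    The iteration rate and loading time figures above can also be pulled straight out of the log text. The following is a rough Python sketch, not an official tool: the regular expressions are modeled only on the log lines quoted in this thread, the optional "hours" part is an assumption, and `summarize` and `SAMPLE_LOG` are illustrative names.

```python
import re

# Patterns modeled on the log lines quoted in this thread. The optional
# "hours" part is an assumption; no render in these posts ran that long.
TOTAL_RE = re.compile(
    r"Total Rendering Time: (?:(\d+) hours? )?(?:(\d+) minutes? )?([\d.]+) seconds"
)
DEVICE_RE = re.compile(
    r"CUDA device (\d+) \((.+?)\): (\d+) iterations, ([\d.]+)s init, ([\d.]+)s render"
)


def summarize(log_text):
    """Return (iteration_rate, loading_time_seconds) for one benchmark run."""
    total = TOTAL_RE.search(log_text)
    if total is None:
        raise ValueError("no 'Total Rendering Time' line found")
    hours, minutes, seconds = total.groups()
    total_s = int(hours or 0) * 3600 + int(minutes or 0) * 60 + float(seconds)

    devices = DEVICE_RE.findall(log_text)
    if not devices:
        raise ValueError("no 'CUDA device' statistics lines found")
    iterations = sum(int(d[2]) for d in devices)
    # With several GPUs, use the longest device render time, as in the post.
    render_s = max(float(d[4]) for d in devices)

    return iterations / render_s, total_s - render_s


# The dual GTX 1080 Ti run quoted above.
SAMPLE_LOG = """\
2021-08-20 20:29:25.850 Total Rendering Time: 3 minutes 46.48 seconds
2021-08-20 20:29:31.673 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : CUDA device 0 (GeForce GTX 1080 Ti): 922 iterations, 8.226s init, 209.300s render
2021-08-20 20:29:31.673 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : CUDA device 1 (GeForce GTX 1080 Ti): 878 iterations, 5.883s init, 211.322s render
"""

rate, loading = summarize(SAMPLE_LOG)
print(f"Iteration Rate: {rate:.2f}")
print(f"Loading Time: {loading:.2f} seconds")
```

    Run on the dual-card log it gives the same 8.52 and 15.16 seconds as the hand calculation. Pasting in one of the single-card logs should work the same way, since the sum and the maximum then cover just one device.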

  • JamesJABJamesJAB Posts: 1,754

    outrider42 said:

    Thank you for your efforts!

    So the 1080 did see a decent boost from 4.14. The previous bench was on a 2018 version of Iray, and they got 2.342 iterations per second. So hitting 2.98 is a solid boost, though like all the speed boosts observed, this involves scenes that use normal maps. If a scene happens to have no normal maps there is not much of a speed boost. But I'd wager most people have normal maps somewhere in their scene.

    The P4000 sits in a strange position between the 1060 6GB and 1070. The P4000 has 1792 CUDA cores. The 1070 has 1920, and the 1060 6GB only has 1280. The 1080 has 2560 cores. So the P4000 should slot closer to the 1070. However its iteration rate here was 2.68, which is not so far off the 1080. I expected it to be farther down from the 1080. Unless my math is wrong, the P4000 has 70% of the 1080's CUDA cores, but offers 89% of the performance? Very interesting.

    Iray has always scaled really well with multiple cards, and it worked especially well here. If you were to add the individual iteration rates it would be 5.66. They hit 5.55, which is 98% scaling.

    I have an EVGA 1080ti SC as well. My other 1080ti is an MSI Gaming X. The Gaming X is massive compared to the SC, but that mass actually does give it better cooling. It cools far better, and as a result the fans are not as loud. So I use my Gaming X as my monitor card for, well, gaming. My EVGA only runs when I use Iray. However, the EVGA does perform a tiny bit better in Daz; I suspect that is because it is not driving the monitor. The difference is very slight: it always does a few more iterations than the Gaming X, and only a few.

    I would definitely like to see how your GTX 1080 Ti cards perform on desktop Windows with GeForce drivers vs mine on Windows Server using Quadro drivers.  Also, if you would like it for your benchmark records, I could run it on my son's Precision 7510 that's running a Quadro M2200 4GB (workstation version of the GeForce GTX 965M), and on my wife's 7710 running a Quadro P4000 8GB.  And I'm pretty sure there is no need for another RTX 3070 benchmark in here.

    It might be interesting for someone with a newer LHR (Low Hash Rate) card to run a few Iray tests to see if there is any effect on render times.  (For example, using an LHR card with no video output connected and/or connected to a PCIe x4 slot.)

  • outrider42outrider42 Posts: 2,964
    edited August 22

    I posted it in this thread, but with 24 pages, it may be tough to track down. (Edit: I found them on page 20 of this thread.) I have the notepad I saved. My times are faster than yours, but not by a lot. One thing I have noticed is that my 1080tis have always performed well compared to other 1080tis (I can't help but compare!). In fact I don't think anyone has posted faster times with either the single or dual 1080tis. Most are close. I don't know why that is. I do use an aggressive fan curve, but I don't think cooling affects the benchmark too much because of how short it is.

    This was done with the first version of 4.15. I can try doing the test again with the newer drivers. I need to update one of my Daz folders to get the current beta, too. I forget which GPU this is, I am pretty sure it is the EVGA, which is not hooked to the monitor. You can see the full details on page 20, but here are the chief parts.

    Daz 4.15.0.2

    Driver 457.30

    2021-02-25 00:28:27.967 Total Rendering Time: 6 minutes 37.22 seconds

    2021-02-25 00:28:40.071 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : Device statistics:

    2021-02-25 00:28:40.071 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : CUDA device 0 (GeForce GTX 1080 Ti): 1800 iterations, 4.482s init, 389.181s render

    Using both I got these results

    2021-02-27 22:12:49.632 Total Rendering Time: 3 minutes 22.48 seconds

    2021-02-27 22:12:59.685 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : Device statistics:

    2021-02-27 22:12:59.685 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : CUDA device 0 (GeForce GTX 1080 Ti): 893 iterations, 1.395s init, 199.483s render

    2021-02-27 22:12:59.689 Iray [INFO] - IRAY:RENDER ::   1.0   IRAY   rend info : CUDA device 1 (GeForce GTX 1080 Ti): 907 iterations, 1.789s init, 199.044s render

    These times are pretty consistent for me with this version of Daz. I also tested some VRAM overclocking: with both GPUs at +800 I was able to cut the render to 178 seconds. However, I didn't want to push things, so I reset the VRAM after that. I believe I may have caused instability in my PC with some VRAM overclocking. I had some serious crashes after long renders...not just Daz, but my entire PC crashed.
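
    To put those numbers in the same terms as the rest of the thread, here is a quick Python sketch of the iteration rates. It rests on two assumptions that the post does not confirm: that the 178 second overclocked figure is a device render time (not the total rendering time), and that the second 1080 Ti on its own would match the single-card run quoted above.

```python
ITERATIONS = 1800  # iteration cap of the benchmark scene


def rate(render_seconds, iterations=ITERATIONS):
    """Iterations per second for one benchmark run."""
    return iterations / render_seconds


single = rate(389.181)  # one 1080 Ti, render time from the log above
dual = rate(199.483)    # both cards, using the longer device render time

# Assumption: "178 seconds" is a device render time for the dual-card run.
overclocked = rate(178.0)

# Assumption: the second card alone would match the first card's rate.
scaling = dual / (2 * single)
oc_gain = overclocked / dual - 1

print(f"Single 1080 Ti: {single:.2f} it/s")
print(f"Dual 1080 Ti:   {dual:.2f} it/s ({scaling:.0%} scaling)")
print(f"VRAM +800:      {overclocked:.2f} it/s ({oc_gain:.0%} faster)")
```

    Under those assumptions the VRAM overclock is worth roughly 12% on this scene.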
