General GPU/testing discussion from benchmark thread
Comments
https://www.evga.com/products/product.aspx?pn=11G-P4-2281-KR
based on the 8% Windows takes (grrr) and not really getting twice the VRAM from NVLink, that was why I was targeting ~16GB for a texture
I'm doing something really wrong. I have Bend on the River, Rivendale and Mighty Oak going, cranked the compression (I think) as suggested. I understand there may be issues w/ reporting tools like HW Monitor with VRAM and all, but ultimately I'm looking for when the GPU dumps everything to the CPU. Am I wrong?
I'll start dumping people in it now.
Whoa. Seeing something interesting/different. Bumped min/max compression to 8192/16384. The GPU started averaging 70%-ish and CPU was kicking in. Added 2nd GPU in and memory on both cards maxed, one GPU went to 0%, 2nd GPU maxed and CPU was minimal.
It seems the limit for texture compression is 16384
A 16384x16384 texture should weigh 1GB before compression
To fill 11GB you need 11 of these textures to be loaded -> could be done with 1 fully clothed Genesis
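For anyone who wants to sanity-check those numbers, here's a rough back-of-the-envelope calculation (just a sketch - it assumes 4 bytes per pixel, i.e. uncompressed 8-bit RGBA, and ignores mip maps and any Iray-side overhead):

    # Rough uncompressed texture memory estimate. Assumes 4 bytes per pixel
    # (8-bit RGBA), no mip maps, and no renderer-side overhead.
    def texture_bytes(width, height, bytes_per_pixel=4):
        return width * height * bytes_per_pixel

    gib = 1024 ** 3
    big = texture_bytes(16384, 16384)               # 1,073,741,824 bytes = exactly 1 GiB
    print(f"16384x16384: {big / gib:.2f} GiB each")
    print(f"Needed to fill 11 GiB: {11 * gib // big} textures")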
If I were you, I'd disable NVLink during the scene build-up. Just use one GPU to get accurate VRAM monitoring
Can anyone explain this? I set the limit to 16384. I set only one GPU to be used. Yet the VRAM shown across cards is pretty low AND both GPUs are going.
Edit: Also, normally the cards (esp the top one) kick out a lot of heat, generally running up to 70-80 degrees. Here they are maxed out for several minutes and barely heating up.
Can you post the contents of your log file? Specifically everything after the most recent line that looks like this:
ETA: Also, I'd highly suggest switching over to using GPU-Z's Sensors tab for monitoring in an experiment like yours. It has a field called "Bus Interface Load" which will tell you how much data is flowing over the PCI-E bus to each of your cards at any given time (extremely useful for detecting the sort of slower memory pooling @ebergerly described several posts ago).
had cleared it. here's a run from just a couple minutes ago. same thing occurred. except VRAM on both cards never went above 11% (i assume that is due to a fresh daz restart)
Edit: I dislike GPUz. I can't see both cards at the same time
You don't seem to understand what an Iteration is. By definition, an iteration is a single full completion of some sort of cyclical process. In the case of 3D rendering, the process in question is calculating one sample pixel value for each and every pixel location in a scene's framebuffer - ie. the same number of samples as there exist pixels in the final rendered image. So in the case of the benchmarking scene I constructed (which renders out to a 900x900 pixel square) each and every iteration works out to be:
810,000 samples.
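Or, written out as the trivial bit of arithmetic it is:

    # One iteration = one sample per pixel in the scene's framebuffer.
    width, height = 900, 900
    samples_per_iteration = width * height   # 810,000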
Render time isn't a control value in the test I constructed. It is the main test value - which is the total opposite of a control value in a designed experiment.
Yes, I chose iterations as a control value in my test - which is why it is a control value in my test.
It means EXACTLY that.
Reopen your scene, switch to the "Progressive Render Settings" section and set Render Quality (which controls convergence assessments in Iray - not Rendering Converged Ratio) to the maximum value allowed (currently 10000). Then set Rendering Converged Ratio to 100% and click Render. Let us know how long it takes.
You want a surefire way to hammer your VRAM? Put in a Genesis 8 or two, and crank their subdivision up. I had one scene that was working, working, working, then all of a sudden kept dropping to CPU no matter what. Took me like an hour to figure out that the SubD got cranked up to something dumb like 11 lol.
Yeah, subdivision can certainly bump it up. Also, that HDRI you are using is really old and probably not using much data at all.
And again, try to disable Nvlink if it is being used so that only one card is used for the render.
had subd level at 2 and render subd level min at 4 on the render above
i have evga's precision x1, sounds similar to afterburner. didn't think of using it, but now that i look at it i see it does do bus and fb, i'll watch that next run
Well, going by this log you are definitely only using one of your cards with Iray (if the other is also showing activity, it appears to be coming from some other process) and your texture memory usage is only up to 3.8155 GB across the board.
Totally forgot about SubD. You should absolutely play around with that and see how it affects things as well.
Honestly can't really disagree. However, it shows certain things (like the previously mentioned "Bus Interface Load") which are both extremely useful in this sort of testing and hard to find in other similar tools. So... yeah.
well, let me back up a sec! i just went 3/5 respectively and it definitely shunted everything to the CPU ;)
Yeah, the jump from 4 to 5 is massive.
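A rough illustration of why (assuming Catmull-Clark style subdivision, where every level roughly quadruples the face count - the 16k base-quad figure below is just a ballpark stand-in, and actual Iray memory use depends on more than face count):

    # Rough face-count growth under Catmull-Clark style subdivision:
    # each level multiplies the quad count by ~4.
    base_faces = 16_000          # ballpark figure for a Genesis-class base mesh
    for level in range(6):
        print(f"SubD {level}: ~{base_faces * 4 ** level:,} faces")
    # SubD 4 -> ~4.1M faces, SubD 5 -> ~16.4M faces, so that single step
    # adds roughly 12M faces' worth of geometry data.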
I do know. Let's go on. If samples per iteration are constant, whatever hardware and whatever render, why don't renders always converge at the same iteration for a fixed convergence or a fixed time?
see timon630's post, and any of the others as well
Personally, from this point I don't really need any answer and we could stop discussing the subject because it's going nowhere. And I don't really care what settings you use for your benchmark
I know the data received for each iteration is highly variable (you get way more samples at the beginning of a render than at the end)
The number of iterations, and thus the render time to completion, is highly luck-dependent because you can't predict whether you'll get a hit or a miss when you shoot a ray
The only way to get something consistent to bench the hardware is to know how many Gigarays/s the hardware is capable of, as well as another measurement that would be something like shading rate/s
I don't have the Gigarays/s atm (and don't know if I want it), and the question for me is whether your benchmark can be a good enough indication of the shading capability of the hardware
But still, you only showed that total render time in your test has little variation (that's not the case in my tests, but well)
So now you're trying to make me do a 10000x longer render by touching a parameter that is not the convergence ratio? Rough math would give 333 days, which is still a finite time
Changing the quality setting will just make the render time longer but doesn't prove anything, since it's not specific to unbiased renderers. Changing the quality requirement for any renderer, biased or unbiased, would change the render time either way
The quality setting is at least a good way to artificially adapt a benchmark scene to new hardware without changing anything else (we could have done that to the SY scene), but it could give linear results and then not be useful at all
I showed you that Convergence Ratio at 100% is practical. That's all there is to know for the moment
A 95% convergence ratio can be practical for difficult scenes where you know that there are some places in the rendered picture where it will be difficult to get samples (low light)
A 100% convergence can be used if you know all your pixels will converge at some point if enough rays are shot (well lit scenes)
You didn't upscale any texture to 16384x16384. That's what your log shows
But you're right in saying that HWInfo's numbers are not logical, but there's no way to tell whether those numbers are reliable or not
For info, a 4096x4096 texture will only weigh 68MB when uncompressed. So you'll need 161 of these to fill 11GB
That's a good idea if geometry information can be divided to be shared between GPUs. We don't know if that is the case. The only information we have is that textures can be divided
At least SubD can be used to fill up the GPU to a certain limit (let's say 10GB); then new geometry and textures must be added.
@AelfinMaegik: All the needed information can be found in your log:
Sidenote: NVLink is not supported in Iray Interactive
I tested a G8F at SubD 5 and it consumes around 850MB of geometry data + around 750MB of texture data. The sum of both is 1.6GB
So from your scene, you need to add three different characters and crank their SubD to 5 to get the memory consumption between 10 and 11 GB
From there you save your scene before trying to add one more different character, and render with one GPU to see if it falls back to CPU rendering
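To put rough numbers on that (a simple additive estimate based on the per-character figures above - it ignores whatever the existing scene already uses, any texture reuse between characters, and working-memory overhead):

    # Per-character estimate from the G8F-at-SubD-5 figures above.
    geometry_gb = 0.85
    textures_gb = 0.75
    per_character_gb = geometry_gb + textures_gb       # ~1.6 GB
    added_gb = 3 * per_character_gb                     # ~4.8 GB
    print(f"Three extra SubD-5 characters add roughly {added_gb:.1f} GB on top of the existing scene")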
I remember reading somewhere that it's only materials that are shared across NVLink. Geometry still has to be on all cards.
For those who may be interested in setting up NVLINK with their RTX cards, aside from the info I already posted previously from Puget Systems and Chaos Group (saying you need to first set up SLI mode, then verify NVLINK with the software Puget developed, and also not trust some/many/all of the reporting apps), here's some more info direct from NVIDIA last month on things to consider:
I'm not sure of the present state of PCI availability, but I assume this last item could be a significant issue for some.
And of course since this is a CUDA 10 feature it's assumed the correct NVIDIA drivers are installed. And personally, FWIW, I wouldn't trust any of the reporting apps besides Task Manager, for reasons stated previously. And even that I'd be wary of.
With GPU-Z you can make plots of any or all of the GPU data, even for multiple GPUs, even on the same chart, even for separate renders.
I tend to agree about what an iteration is, but to expand a bit on samples...
In ray tracing/path tracing you send out a ray for each and every pixel in your image to determine a color for that pixel. In its simplest form you just determine what object the ray hit and look up its material color at that point. For a 1920 x 1080 image that comes out to over 2 million pixels. Initially, you send out one ray at the center of each pixel into the scene, and it bounces all over the scene to determine what final color to use for that pixel, based on effects of scene lights and shadows, etc.
However, as you can see if you turn the samples down to "1" or something like that, the resulting image looks like garbage, because even a very small pixel still covers a part of an object in the scene that is probably a range of colors. So for example, if "behind" pixel # 1,234,786 in the scene is a flower with a brown stem and bright blue leaf, and the first ray sees the brown stem and colors the pixel brown, it won't reflect the "average" color that pixel should be (ie, mostly bright blue).
So the idea is to, inside each pixel, send out more "sample" rays at random directions/locations inside that pixel and into the scene. And you get a color from each of those "samples", and when you're done making the set number of samples per pixel (I think Studio/Iray defaults to 5,000?) you average them together to get the single, final pixel color for your image. In its simplest form it's just a "for" loop that calculates random rays inside the bounds of the pixel and sends them into the scene, depending on the number of samples you specify. And with 5,000 samples and 2 million pixels, that's 10 BILLION initial rays, and that doesn't even include the bounces. Of course in real renderers there are things like BVH's to greatly optimize all those ray calculations, but that's the general idea.
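If it helps, that inner loop can be sketched in a few lines of Python (purely illustrative - trace_ray here is a hypothetical stand-in for all the bounce/lighting/shadow work, and this is not how Iray is actually structured):

    import random

    def render_pixel(x, y, samples, trace_ray):
        # Average `samples` randomly jittered rays fired through one pixel.
        total = [0.0, 0.0, 0.0]
        for _ in range(samples):
            # Pick a random sample position inside the pixel's bounds.
            sx, sy = x + random.random(), y + random.random()
            r, g, b = trace_ray(sx, sy)        # bounces around the scene, returns a color
            total[0] += r; total[1] += g; total[2] += b
        # The final pixel color is the average of all the samples.
        return [c / samples for c in total]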
Of course, at some point you're wasting time since the change in average sample color for each new sample is negligible, and more samples won't help. So if, in iteration 17, the color obtained in a pixel is RGB = (223, 114, 6), and in iteration 18 it's (223, 115, 6), it's really not changing enough to matter, so why do more samples? And that's the basic idea behind convergence.
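And the "why do more samples?" decision can be sketched as a simple threshold test (again just an illustration - the 1.0 threshold on a 0-255 scale is an arbitrary assumption, not anything pulled from Iray):

    def pixel_converged(prev_avg, new_avg, threshold=1.0):
        # Call a pixel converged once its running average (0-255 per channel)
        # moves by no more than `threshold` after adding another sample.
        return all(abs(a - b) <= threshold for a, b in zip(prev_avg, new_avg))

    # e.g. (223, 114, 6) -> (223, 115, 6) changes by at most 1 per channel,
    # so with a 1-unit threshold this pixel would count as converged.
    print(pixel_converged((223, 114, 6), (223, 115, 6)))   # True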
Of course, in a real app like Iray there are many optimizations and software decisions made that might affect whether the software decides it needs another sample for any particular pixel or not. For example, if half the pixels are still changing color significantly between samples, but the rest have negligible changes, why bother sending more rays for the non-changing ones? I've never looked inside the Iray code, so I have no idea what it actually does, but I suppose it's possible they may use some logic to stop sampling those pixels that have converged, and maybe use some of those GPU cores for other pixels?
BTW, I'm guessing that's why there are Min and Max Samples settings, but that may not be true.
And it also raises the question regarding very high rez images, where each pixel is only "covering" a tiny portion of the scene, and maybe doesn't need many samples to get quick convergence? Just a thought, I haven't really played with it.
That's not the only reason for iterations though, is it? In fact the main reason should be the fact that light rays scatter randomly when they hit particular surfaces, especially rough ones. You could shoot the same traced ray from the camera on the exact same spot on an object and it can bounce in a totally different direction to finding a totally different light source in the end. In fact, this is probably the main source of 'noise' on an image, since for every pixel you need an incredibly large amount of samples for it to average out with neighboring pixels when there is a lot of bouncing going on, usually in confined or dark spaces.
As for your second point, in the latest Iray patch notes we have this:
I'm not exactly sure if this is used in DAZ though.
Here is an image from Studio of a render with 1 sample. You can see reflections and shadows and lights that bounced and so on. So even with 1 sample each ray bounced around many times in the image. And that depends on the "Max Path Length" setting. In my case I have it set to -1, which is a lot of bounces.
As far as what is an iteration? Who knows? In the simple raytracer I wrote last year it iterated across all pixels and all samples. I have no clue what Iray calls an iteration, and I'm not sure it really matters. In software terms you need loops to iterate across 2 million pixels and however many samples you want. And each ray bounces as many times as you tell it to.
Shouldn't a ray bounce as many times as it can until it loses energy? A ray of light can't bounce around forever, since some of its energy is absorbed each time it's reflected, especially by dark colors. That's why a sample image of one iteration like yours should have a ton of dark pixels, since when the rays were shot at some of those pixels, they never found a source of light before losing their energy. And also why there are pixels that are almost pure white even though they've hit relatively dark areas, since the rays for those pixels bounced around in exactly the way they needed to in order to hit a source of light directly.
But ultimately, I think it pretty much shoots a single ray for each pixel randomly per iteration, although in a seeded manner (otherwise you can't really explain why some fireflies always appear on the same spot when the image is rendered over and over again.)
I don't think the average (or even non-average) raytracer models/simulates light energy.
Then how come there are dark pixels (0,0,0 pixels) in a single-iteration render?
Just to be clear BTW, when I'm talking about light rays and energy, I'm not talking about the real-world physics of how light bounces, but rather how the renderer simulates it. When I say 'losing energy', I mean the average rate of absorption for a particular color, not the rate of kinetic energy loss.
It's not about energy, it's about material properties and how they're modelled. A ray might randomly hit a material that might cause some or all of the incoming rays to be attenuated or to not reflect (ie, they may bounce into the material, or do subsurface scattering, etc.), depending on how the material is modelled. The material design defines how the ray(s) respond when they hit the surface.
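One common way generic path tracers express that idea in code (a sketch only, not a claim about Iray's internals): each path carries a "throughput" color that gets multiplied by the material's reflectance at every bounce, so absorbing materials shrink the contribution toward zero without any explicit energy bookkeeping:

    def shade_path(hits, max_bounces=8):
        # `hits` is a hypothetical sequence of (albedo, emitted) RGB triples,
        # one per surface the ray hits along its path.
        throughput = [1.0, 1.0, 1.0]
        radiance = [0.0, 0.0, 0.0]
        for bounce, (albedo, emitted) in enumerate(hits):
            if bounce >= max_bounces:                     # cf. Iray's Max Path Length
                break
            # Add any light emitted at this hit, weighted by what survived so far.
            radiance = [r + t * e for r, t, e in zip(radiance, throughput, emitted)]
            # Dark/absorbing materials attenuate the throughput toward zero.
            throughput = [t * a for t, a in zip(throughput, albedo)]
        # A path that never reaches anything emissive comes back (0, 0, 0) - a black pixel.
        return radiance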
Apologies, I'm really not seeing what point you're trying to make, so I'll leave it at that.
...you know that iterations - by definition - consist of a finite number of pixel value samples in any given scene (one that anyone can determine via a very simple math equation)? Then why would you be asking me to prove that to you?
Quite simply - because Iray's implementations of both the time limit and the convergence limit aren't designed to be very accurate, in order to avoid negatively impacting overall rendering performance (convergence isn't calculated after each and every iteration during a render because it is a computationally costly thing to do. And the time limit only stops Iray from starting additional iterations at its set point in time - it doesn't interrupt in-progress ones. Thereby leading to a situation where the set render time limit and the true render time are never actually the same.) If Iray's developers had made different choices about how these factors were calculated, then things would be different. But since that isn't the case...
As we have already established, each and every iteration is always going to be composed of exactly the same number of pixel value samples in a given scene. With that said, no two iterations - even ones from the same scene render - will ever take exactly the same amount of time to render, because the amount of time it takes for each and every pixel sample varies based on both the scene content at that location and the physical processing characteristics of the processing unit being used to perform that sample's calculation for that iteration (Iray uses a pseudo-random number mechanism to decide which GPU SM/CPU core gets the task of calculating a given pixel's value on each iteration to aid in load balancing - you can read all about it here.) But this doesn't change the fact that each and every iteration on a scene is always made up of the same number of individual pixel sample values - and is therefore a consistent measure of overall scene render progress.
To rely on iterations, since that is the only one of these statistics that is both a form of discrete data and constantly adhered to as a fixed constant by Iray itself.
The test I designed has many control values and just the following two test values:
What I illustrated here about my test is that if you keep all of its control values constant and one of its test values constant (render device) the remaining test value is always the same to within a dozen or so milliseconds. Which means that the test itself is accurate to within a dozen or so milliseconds.
This is how you evaluate the accuracy of a test - keep all values constant except for one, and then take note of any variances across multiple runs. In an absolutely perfect test there will be zero variance. My test demonstrably isn't a perfect test (again, see here.) But it doesn't need to be. All it needs to be is accurate to below one of the more common levels of statistical significance - which it demonstrably is, by several orders of magnitude.
Here is Iray's own documentation on what those two controls do:
Float32 progressive_rendering_quality = 1
A convergence estimate for a pixel has to reach a certain threshold before a pixel is considered converged. This attribute is a relative quality factor for this threshold. A higher quality setting asks for better converged pixels, which means a longer rendering time. Render times will change roughly linearly with the given value, so that doubling the quality roughly doubles the render time.
Float32 progressive_rendering_converged_pixel_ratio = 0.95
If the progressive rendering quality is enabled, this attribute specifies a threshold that controls the stopping criterion for progressive rendering. As soon as the ratio of converged pixels of the entire image is above this given threshold, Iray Photoreal returns the final result for forthcoming render requests. Additionally, the render call will return 1 in this case indicating to the application that further render calls will have no more effect. Note that setting this attribute to a value larger than the default of 0.95 can lead to extremely long render times.
Rendering Quality is what determines convergence and not Rendering Converged Ratio. Rendering Converged Ratio just controls how many pixels (as a percentage) in a scene's framebuffer need to have been determined by the Rendering Quality factor provided to be "finished" in order for the overall scene as a whole to be classified as "finished".
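Read literally, that stopping rule could be sketched like this (a paraphrase of the documentation above, not Iray's actual code - the base threshold and the per-pixel error values are hypothetical placeholders):

    def render_should_stop(per_pixel_error, quality, converged_pixel_ratio):
        # Higher quality => stricter per-pixel convergence threshold (roughly linear,
        # per the docs). The 0.01 base value is an arbitrary placeholder.
        threshold = 0.01 / quality
        converged = sum(1 for err in per_pixel_error if err <= threshold)
        # Stop once enough of the framebuffer's pixels individually pass the test.
        return converged / len(per_pixel_error) >= converged_pixel_ratio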
In order to dial in the closest thing possible to true 100% convergence, the values needed would be:
Which, as anyone still following this particular conversation thread can guess, would take a mighty long time.