Test if NVLink is working for VRAM pooling?

Illidanstorm Posts: 655
edited September 2019 in The Commons

So I've got a second 2080 Ti now and I'm wondering how I can check if NVLink is pooling the VRAM together in Iray?


Comments

  • Illidanstorm Posts: 655
    edited September 2019

    OK, it says that NVLink is working if I understand this correctly, but I don't see anything about VRAM pooling?

    20.09.2019 18:14:30

     

    NVLink appears to be enabled and functioning on this system.

     

    [P2P (Peer-to-Peer) GPU Bandwidth Latency Test]

    Device: 0, GeForce RTX 2080 Ti, pciBusID: 8, pciDeviceID: 0, pciDomainID:0

    Device: 1, GeForce RTX 2080 Ti, pciBusID: 9, pciDeviceID: 0, pciDomainID:0

    Device=0 CAN Access Peer Device=1

    Device=1 CAN Access Peer Device=0

     

    ***NOTE: In case a device doesn't have P2P access to other one, it falls back to normal memcopy procedure.

    So you can see lesser Bandwidth (GB/s) and unstable Latency (us) in those cases.

     

    P2P Connectivity Matrix

         D\D     0     1

         0      1     1

         1      1     1

    Unidirectional P2P=Disabled Bandwidth Matrix (GB/s)

       D\D     0      1 

         0 167.96   5.63 

         1   5.71 508.75 

    Unidirectional P2P=Enabled Bandwidth (P2P Writes) Matrix (GB/s)

       D\D     0      1 

         0 521.29  46.85 

         1  41.91 242.42 

    Bidirectional P2P=Disabled Bandwidth Matrix (GB/s)

       D\D     0      1 

         0 138.03   7.96 

         1   7.94 520.66 

    Bidirectional P2P=Enabled Bandwidth Matrix (GB/s)

       D\D     0      1 

         0 387.12  91.46 

         1  45.66 137.29 

    P2P=Disabled Latency Matrix (us)

       GPU     0      1 

         0   3.17 131.71 

         1 139.34   3.22 

     

       CPU     0      1 

         0   3.42  76.37 

         1  85.06   3.59 

    P2P=Enabled Latency (P2P Writes) Matrix (us)

       GPU     0      1 

         0  61.32   2.26 

         1   3.29   8.81 

     

       CPU     0      1 

         0   3.42   2.04 

         1   2.03   3.36 

     

    NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.

  • Mattymanx Posts: 6,879

    I don't see why it would work in Iray, since the way Iray works with dual cards is to send the scene to all the cards and have them render the scene at the same time.

  • RayDAnt Posts: 1,120

    So I've got a second 2080 Ti now and I'm wondering how I can check if NVLink is pooling the VRAM together in Iray?

    Memory pooling between devices in a system is both supported and (demonstrably) working in current versions of Iray. However, due to the way current versions of Windows and OS X manage dedicated video memory at a sub-OEM-driver level, support for VRAM pooling via NVLink (aka GPU oversubscription) on both of these platforms is currently broken. From the current CUDA programming guide:

    GPUs with SM architecture 6.x or higher (Pascal class or newer) provide additional Unified Memory features such as on-demand page migration and GPU memory oversubscription that are outlined throughout this document. Note that currently these features are only supported on Linux operating systems. Applications running on Windows (whether in TCC or WDDM mode) or macOS will use the basic Unified Memory model as on pre-6.x architectures even when they are running on hardware with compute capability 6.x or higher. See Data Migration and Coherency for details.

    As things stand, the only path to successfully getting VRAM pooling to work as expected with Iray is to use Iray Server on Linux.
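
    If you want to probe what that quoted guide passage means for your own machine, the CUDA runtime exposes it as a device attribute: concurrentManagedAccess reads 1 where the extended Unified Memory features (on-demand migration, oversubscription) are available, and 0 where only the basic model applies, which is what current Windows and macOS report. A minimal standalone sketch in CUDA C++ (my own illustration, compiled with nvcc; not anything Iray or Daz ships):

    #include <cstdio>
    #include <cuda_runtime.h>

    int main() {
        int deviceCount = 0;
        cudaGetDeviceCount(&deviceCount);
        for (int dev = 0; dev < deviceCount; ++dev) {
            int concurrentManaged = 0;
            // 1 = full Unified Memory (on-demand page migration, oversubscription);
            // 0 = basic pre-Pascal model (what Windows/macOS currently report).
            cudaDeviceGetAttribute(&concurrentManaged,
                                   cudaDevAttrConcurrentManagedAccess, dev);
            cudaDeviceProp prop;
            cudaGetDeviceProperties(&prop, dev);
            printf("Device %d (%s): concurrentManagedAccess = %d\n",
                   dev, prop.name, concurrentManaged);
        }
        return 0;
    }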

  • RayDAnt said:

    As things stand, the only path to successfully getting VRAM pooling to work as expected with Iray is to use Iray Server on Linux.

     

    But how do I check if it works or not?

     

  • RayDAnt Posts: 1,120
    edited September 2019
    Mattymanx said:

    I don't see why it would work in Iray, since the way Iray works with dual cards is to send the scene to all the cards and have them render the scene at the same time.

    Iray has been programmed from the get-go to take advantage of Nvidia's proprietary Unified Memory resource-management system to whatever extent it happens to be supported by a particular hardware/OS/driver/GPU combination. And although VRAM pooling itself isn't currently supported on Windows systems (see my previous post), non-concurrent memory sharing between any GPU in a system and system memory, up to the size of the VRAM on that card, is both supported and working in Iray if you test for it. In essence, memory access in Iray is fully virtualized. Whether each card needs a full duplicate copy of everything in a scene for Iray rendering to work is just a matter of how flexible the Nvidia driver sitting under it can be in managing resources.
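
    To make "fully virtualized" a bit more concrete: under Unified Memory, a single pointer from cudaMallocManaged is valid on the host and on every GPU, and the driver migrates pages to whichever processor touches them. A toy sketch in CUDA C++ (my own illustration; nothing Iray-specific, and the kernel and sizes are made up):

    #include <cstdio>
    #include <cuda_runtime.h>

    __global__ void scale(float* data, int n, float factor) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) data[i] *= factor;  // GPU touch: the driver migrates these pages over
    }

    int main() {
        const int n = 1 << 20;
        float* data = nullptr;
        cudaMallocManaged(&data, n * sizeof(float));   // one pointer, visible everywhere

        for (int i = 0; i < n; ++i) data[i] = 1.0f;    // CPU writes: pages start in system RAM

        scale<<<(n + 255) / 256, 256>>>(data, n, 2.0f);
        cudaDeviceSynchronize();

        printf("data[0] = %f\n", data[0]);             // CPU touch migrates pages back
        cudaFree(data);
        return 0;
    }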

  • RayDAnt Posts: 1,120
    edited September 2019
    But how do I check if it works or not?

     

    1. Install Linux (AFAIK any distribution will do) on a separate drive.
    2. Install the latest version of Nvidia's driver for Unix systems.
    3. Grab a copy of Iray Server (the first month is free) and install it.
    4. Connect your two Turing GPUs together with a compatible NVLink bridge (a quick way to verify the link from code is sketched after this list).
    5. Launch/configure Iray Server.
    6. From a separate computer (or VM) running Windows, launch Daz Studio, go to Render Settings > Advanced > Bridge [BETA], and configure it to point to your Iray Server installation.
    7. Load up a scene that you know is too large to render without CPU fallback and either queue it for rendering or (this is new for non-Quadro cards as of this April) enable it for streaming.
    8. What you will (should) get is successful GPU rendering at marginally slower speeds than a single card alone, but dramatically faster than CPU fallback.
    9. BONUS: If you're feeling really adventurous, load up a scene that is larger than ALL of your GPUs' VRAM capacity combined and set it to render on Iray Server. What you will get is an extremely sluggish system and a significantly slower render than step #7, but still no CPU fallback. This is because Unified Memory also transparently enables out-of-core (OOC) rendering/memory sharing between GPUs and CPUs on Linux (and actually even does so to a limited extent on current Windows).
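
    Before (or instead of) committing to the full setup above, you can sanity-check the link itself: nvidia-smi nvlink -s prints per-link status, and the heart of the PugetSystems/CUDA-samples test from the first post is just the peer-access query below. A minimal sketch in CUDA C++ (my own illustration, assuming the two cards sit at device indices 0 and 1):

    #include <cstdio>
    #include <cuda_runtime.h>

    int main() {
        int canAccess01 = 0, canAccess10 = 0;
        cudaDeviceCanAccessPeer(&canAccess01, 0, 1);  // can device 0 address device 1's memory?
        cudaDeviceCanAccessPeer(&canAccess10, 1, 0);
        printf("0 -> 1: %s, 1 -> 0: %s\n",
               canAccess01 ? "CAN access peer" : "no P2P",
               canAccess10 ? "CAN access peer" : "no P2P");

        if (canAccess01 && canAccess10) {
            // Enabling peer access routes device-to-device copies over NVLink/PCIe P2P.
            cudaSetDevice(0);
            cudaDeviceEnablePeerAccess(1, 0);  // second argument is flags, must be 0
            cudaSetDevice(1);
            cudaDeviceEnablePeerAccess(0, 0);
            printf("Peer access enabled in both directions.\n");
        }
        return 0;
    }

    Note this only proves the cards can talk P2P (the same thing the "CAN Access Peer" lines in the first post show); it says nothing about whether an application on your OS can oversubscribe VRAM.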
  • I'd rather not install Linux because I don't know anything about it.

    Is there no way to check it under Windows? I just want to check if it works. Maybe it works already without me noticing.

  • RayDAnt Posts: 1,120
    edited September 2019

    I'd rather not install Linux because I don't know anything about it.

    Is there no way to check it under Windows? I just want to check if it works. Maybe it works already without me noticing.

    1. Make a scene that is big enough to trigger CPU fallback mode when you attempt to render it with either/both of your GPUs enabled.
    2. Install a compatible NVLink bridge between your two GPUs.
    3. UNVERIFIED: Enable SLI functionality in the Nvidia Control Panel *
    4. Re-attempt to render the scene with BOTH NVLinked GPUs enabled for rendering. The Nvidia driver should enact GPU Oversubscription to transparently enable VRAM pooling, thereby preventing CPU fallback from taking place (a programmatic probe of whether your OS can do this at all is sketched below the footnote).

     

    * I would suggest trying both with and without this enabled. Technically speaking, SLI functionality directly conflicts with Iray's self-contained programming model. However, there have been rumors (stemming from the initial Turing release) that NVLink functionality was only really being exposed on Windows systems if SLI was enabled (since SLI on Turing cards is virtualized as just another data stream over the NVLink interconnect). That said, the test results you posted here indicate pretty conclusively that this is not the case. At least they do if you didn't have SLI enabled at that point. If you did, I'd also suggest disabling it and then running the PugetSystems test again to see if it changes anything, just to be sure and potentially lay some rumors to rest.
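
    For what it's worth, one programmatic way to see whether step 4 can ever succeed on a given OS is to ask the driver for a managed allocation bigger than a single card's VRAM: under the full Unified Memory model (Linux, Pascal or newer, with enough system RAM to back it) the allocation and a kernel touch should succeed, while under the basic model the allocation itself typically fails. A hedged sketch in CUDA C++ (my own illustration; the 1.5x factor is arbitrary):

    #include <cstdio>
    #include <cuda_runtime.h>

    __global__ void touch(char* p, size_t n) {
        size_t i = (size_t)blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) p[i] = 1;  // write so the touched pages really get backed/migrated
    }

    int main() {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, 0);
        size_t bytes = prop.totalGlobalMem + prop.totalGlobalMem / 2;  // ~1.5x VRAM

        char* p = nullptr;
        cudaError_t err = cudaMallocManaged(&p, bytes);
        if (err != cudaSuccess) {
            printf("Oversubscription unavailable: %s\n", cudaGetErrorString(err));
            return 1;
        }
        touch<<<4096, 256>>>(p, bytes);  // only touches the first ~1 MB, which is enough here
        err = cudaDeviceSynchronize();
        printf("Kernel touch: %s\n", cudaGetErrorString(err));
        cudaFree(p);
        return 0;
    }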

  • Hurdy3D Posts: 1,038

    I can confirm that it's not working on Windows 10.

    I tried it a few days ago with the newest Daz 4.12 beta.

    I just created a scene which is slightly bigger than 11 GB, and Iray went into CPU mode.

  • It won't work for me either, but it's no problem I guess.
    I had to put in 5 Stonemason environments, 6 HD figures and a car in there to even reach over 11 GB.

    I was just curious 

  • Hurdy3D Posts: 1,038
    RayDAnt said:

    As things stand, the only path to successfully getting VRAM pooling to work as expected with Iray is to use Iray Server on Linux.

    @RayDAnt

    Do you know if this issue has to be fixed by Microsoft or Nvidia?

    If it's Microsoft, we are doomed ;)

  • Asari Posts: 703
    edited September 2019

    It won't work for me either, but it's no problem I guess.
    I had to put in 5 Stonemason environments, 6 HD figures and a car in there to even reach over 11 GB.

    I was just curious 

    Wow! Thank you for testing this, it's certainly interesting!
  • Sevrin Posts: 6,301
    gerster said:

    @RayDAnt

    Do you know if this issue has to be fixed by Microsoft or Nvidia?

    If it's Microsoft, we are doomed ;)

    Nvidia has no incentive to fix it because that would reduce the market for Titans, and Microsoft's gonna be like ¯\_(ツ)_/¯

  • Hurdy3D Posts: 1,038
    Sevrin said:

    Nvidia has no incentive to fix it because that would reduce the market for Titans, and Microsoft's gonna be like ¯\_(ツ)_/¯

    Why do they offer this feature if they have no interest in it? ;)

  • kyoto kid Posts: 40,575
    edited September 2019

    ...good question. 

    Although I do admit the cost differential between two 2080 Tis (about $1,100 each, not counting applicable sales tax/VAT) plus the bridge (an additional $79) and a single Titan RTX is not all that much, and with the Titan you get an additional 2 GB of VRAM as well as the ability to switch off Windows WDDM.

    True, you don't get double the core count for rendering speed, but then you aren't driving two 250 W cards either (the Titan RTX draws 280 W at peak).

  • RayDAnt Posts: 1,120
    edited September 2019
    gerster said:

    @RayDAnt

    Do you know if this issue has to be fixed by Microsoft or Nvidia?

    If it's Microsoft, we are doomed ;)

    It's a Microsoft issue, I'm afraid. :(

    At the risk of oversimplifying things, the underlying source of the problem lies in both Windows' and OS X's implementations of page faulting. Normally, operating systems only support page faulting from the CPU perspective (since OSs are designed to run on CPUs). However, in order for VRAM pooling (GPU Oversubscription) to work, there also needs to be support for page faulting from the GPU perspective. Consequently, memory sharing on Nvidia GPUs is dependent upon Nvidia being able to implement their own extension to the operating system's default page-faulting mechanism.

    On Linux (an open-source operating system ecosystem) this is easy for Nvidia to overcome by just including whatever additional operating-system code is required as part of its driver package. However, on closed-source OSs like Windows and OS X this is a total no-go: Microsoft/Apple are the sole arbiters of what code does and doesn't go into core OS functionality like page faulting. Hence the current lack of functionality despite all the individual components (drivers, hardware interconnects) being present.
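
    You can actually watch the two models diverge from application code: where GPU page faulting exists, a kernel can rely on on-demand migration (optionally helped along by an explicit cudaMemPrefetchAsync hint), whereas without it the driver has to populate managed pages on the GPU before the kernel ever runs. A sketch in CUDA C++ (my own illustration, assuming device 0):

    #include <cstdio>
    #include <cuda_runtime.h>

    __global__ void inc(int* v, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) v[i] += 1;
    }

    int main() {
        const int n = 1 << 20;
        int* v = nullptr;
        cudaMallocManaged(&v, n * sizeof(int));
        for (int i = 0; i < n; ++i) v[i] = 0;  // pages are resident host-side after this

        int concurrent = 0;
        cudaDeviceGetAttribute(&concurrent, cudaDevAttrConcurrentManagedAccess, 0);
        if (concurrent) {
            // GPU page faulting available (e.g. Linux + Pascal or newer): the kernel
            // could fault pages over on demand; prefetching just avoids the fault storm.
            cudaMemPrefetchAsync(v, n * sizeof(int), 0 /* device */, 0 /* stream */);
        }
        // Without GPU faulting (current Windows/macOS), the driver instead migrates
        // all managed pages to the GPU before launch - the "basic model" above.
        inc<<<(n + 255) / 256, 256>>>(v, n);
        cudaDeviceSynchronize();
        printf("v[0] = %d\n", v[0]);
        cudaFree(v);
        return 0;
    }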

  • Hurdy3D Posts: 1,038
    RayDAnt said:
    It's a Microsoft issue, I'm afraid. :(

    Interesting, thank you!

    If Nvidia is really interested in giving us memory pooling with GeForce cards, why didn't they do the same as with the Quadro cards, which seem to support memory pooling on Windows?

    Is there any information on whether Nvidia is working together with Microsoft to get memory pooling for GeForce cards to work?

  • RayDAnt Posts: 1,120
    edited September 2019
    gerster said:

    Interesting, thank you!

    If Nvidia is really interested in giving us memory pooling with GeForce cards, why didn't they do the same as with the Quadro cards, which seem to support memory pooling on Windows?

    The support (or lack thereof) situation between Windows and Nvidia drivers is the SAME across all GPU product SKUs - GeForce, Titan, Quadro and Tesla. Hence why current Nvidia pre-built computing/rendering systems like the RTX Server only offer Linux as an included operating system option.

     

    Is there any information on whether Nvidia is working together with Microsoft to get memory pooling for GeForce cards to work?

     

    None, I'm afraid. Microsoft has negative incentive to adopt proprietary code from Nvidia for a core component of its own operating system, since A) they'd have to pay Nvidia for that, and B) it would put overall operating system stability in the hands of a single sub-component device manufacturer. At the same time, Nvidia has little incentive to make their own proprietary driver code open-source and royalty-free (what it would need to be for it to be worth Microsoft adopting), since A) that would hamper their ability to continue developing new hardware closely coupled to software, and B) they already have a perfectly viable distribution path in Linux, which also happens to be the operating system of choice for about 95% of the software in which GPU Oversubscription functionality currently matters anyway.

    Unfortunately for us Windows/Mac-based Daz Studio (which is to say Iray) users, we are an extreme edge case.

     

  • Hurdy3D Posts: 1,038

    Looks like V-Ray implemented memory pooling for non-Quadro cards:

    https://www.chaosgroup.com/blog/profiling-the-nvidia-rtx-cards

    Maybe Iray could do the same?

  • RayDAnt Posts: 1,120
    edited September 2019
    gerster said:

    Looks like V-Ray implemented memory pooling for non-Quadro cards:

    https://www.chaosgroup.com/blog/profiling-the-nvidia-rtx-cards

    Maybe Iray could do the same?

    Iray has already implemented it via interoperability with Nvidia's enhanced Unified Memory architecture. IMO the chances of them going through the trouble of developing (from scratch) a second, entirely different way to achieve the same thing at likely significantly worse levels of performance (code at the driver level always trumps code at the application level) are very slim.

  • Hurdy3D Posts: 1,038

    @Jack Tomalin

    What about migrating your Iray Server from Windows to Linux, for memory pooling? ;)

  • outrider42 Posts: 3,679
    RayDAnt said:

    Iray has already implemented it via interoperability with Nvidia's enhanced Unified Memory architecture. IMO the chances of them going through the trouble of developing (from scratch) a second, entirely different way to achieve the same thing at likely significantly worse levels of performance (code at the driver level always trumps code at the application level) are very slim.

    That doesn't make a lot of sense to me. They just reworked Iray for RTX, but you are saying they didn't bother reworking memory so it could be pooled? That is extremely BAD for the future of Iray. It doesn't matter what the reason is. This is a market filled with competition, and the fact is the competition is taking all of the necessary steps to evolve. With Iray being so pitifully limited to the strict amount of VRAM available on a single GPU, why on earth would anybody choose Iray over other rendering options that allow the user to easily combine VRAM between two GPUs? Iray might make pretty people, but that doesn't matter if the scene drops to CPU-only mode or doesn't render at all if there is no CPU fallback. Not to mention that CPU-only mode uses Intel Embree in Daz. If Nvidia ever wants Iray to catch on...this isn't going to help. This is what people who suggest Nvidia wants users to buy the larger-VRAM Quadros instead of pooling VRAM overlook. This market is extremely competitive and constantly changing.

    Vray has supported VRAM pooling for a long time, and they have memory pooling working on every GPU that can use NVLink. Not only that, but Chaos Group has VRAM pooling working in Windows, not just Linux. This would seem to prove to me that this is not really a Windows issue; it is quite clearly an Nvidia issue. Also, when Vray uses NVLink, it only loses about 12% in performance. Of course, considering you are able to render larger scenes on the GPU, that still means rendering way faster than CPU only.

    I found a 3rd-party site that did some Vray benchmarks with NVLink. They benched some of the biggest GPUs on offer, including the monster RTX 8000 with 48GB a pop. That means two of these combined can handle scenes that are almost 96GB in size. BTW, they did this on a Windows machine.

    https://blog.boxx.com/2019/08/20/v-ray/

  • RayDAnt Posts: 1,120
    edited September 2019
    outrider42 said:

    That doesn't make a lot of sense to me. They just reworked Iray for RTX, but you are saying they didn't bother reworking memory so it could be pooled?

    Iray's developers already re-worked memory so that it could be pooled prior to January of 2017, with the release of Iray 2017.1 beta, build 296300.616 (skip to page 24).

     That is extremely BAD for the future of Iray. It doesn't matter what the reason is. This is a market filled with competition, and the fact is the competition is taking all of the necessary steps to evolve.

    Nvidia - with Iray - already evolved in this direction years prior to the competition. They just did it using the following use-case model:

    Windows thin-client/workstation (running an Iray-plugin-compatible application, e.g. Daz Studio, 3DS Max, Maya, etc.) >>> LAN connection >>> Linux server (running Iray Server)

    Which - while sucky for small-time rendering operators like you and me who can't necessarily afford dedicated Iray Server boxes - works directly into the primary use case of Iray on the enterprise side of things. And since the major bankrollers behind Iray's continued development over the years have been companies like Honda (primarily for use with in-house R&D), I think it's safe to say that what Iray use-cases would work best for you and me hasn't exactly been much on the Iray development team's radar. Although I wouldn't be too surprised to see that change in the not-so-distant future as memory-eating hi-def textures for EVERYTHING gradually become the norm in order to meet the raw rendering firepower of RTX+ generation hardware.

     

    If Nvidia ever wants Iray to catch on...

    Iray has long since caught on with major players in the professional graphics rendering world. Here's a pull-quote from the Honda link found above (which was originally published 4 years ago, right around the time Nvidia would've been debuting early memory pooling support in Iray to its corporate clients, including Honda):

    Honda prides itself on creating quality cars that are both economical and stylish. Honda's engineers are actually so dedicated to the details of their automotive designs that they've taken to using Iray as the gold standard for verifying what their cars will look like when they roll off the assembly line and into the real world. But aside from concentrating on the curvature of each panel, Honda uses 38 GPU servers to dig deep into its designs to discover designer flaws that most companies might miss.

    What does that mean?

    In one of their recent models, Honda's engineers noticed that the housing in their brake lights wasn't behaving as it should. When the brake light was engaged, the light from the brake light chamber would leak into other areas of the rear light housing, creating a light spillage that was unsightly and unacceptable. After noticing the light leak, Honda's engineers took a deeper look at the problem area by creating a few more renders. With those renders in hand, Honda's design team went back to its model and fixed its brake light housing flaw before it was shipped out for tooling. Because Iray accurately predicted the final results of a design, Honda was able to catch a flaw that could have cost the company tens if not hundreds of thousands of dollars in faulty parts and manufacturing delays.

    Iray is not only a tool for visual artists and the marketing team. Its photo real rendering capabilities can be a powerful design evaluation tool that can save time and money while also leading to better designs.

    Although it can seem that Daz Studio users are large stakeholders in the development path that Iray takes (just do a Google search for Iray and see how many of those results involve Daz Studio), the reality is that we are only a very small player in all of this (especially monetarily speaking).

     

    I found a 3rd-party site that did some Vray benchmarks with NVLink. They benched some of the biggest GPUs on offer, including the monster RTX 8000 with 48GB a pop. That means two of these combined can handle scenes that are almost 96GB in size. BTW, they did this on a Windows machine.

    The thing is, if a design entity has the funds to afford two RTX 8000s (circa $11,000), they most likely also have the money to afford a pre-built RTX Server (c. $20,000), which, in addition to things like memory pooling support for scenes up to 224 GB in size (system memory also counts for GPU rendering with Iray on Linux), would offer out-of-the-box handling of all rendering needs for an entire team of designers simultaneously rather than just a single person. And I think it's safe to say that any 3D design house with $10k+ to throw around on a single-machine computing solution likely has more than one employee. This is the crux of the issue.

  • kyoto kid Posts: 40,575

    ...so what about Otoy? They were also working on getting memory pooling to work for Octane 4.

  • outrider42 Posts: 3,679

    I'm sorry, but it's going to take a lot more than Honda to persuade me differently. The fact that Googling Iray leads to countless Daz-related things, and very little else, says it all. If Iray can pool that much memory with GPUs and system memory, then tell me why Iray hasn't become a de facto standard for Hollywood, the place where that kind of memory and GPU acceleration would be quite welcome? Rather, they are using just about anything else BUT Iray. That includes Vray, which saw action rendering some backdrops for movies like Black Panther; why didn't they use Iray for that purpose? Go to almost any 3D-related site, and while you might find some bits about Iray, it's not much. Instead, Daz, and nearly exclusively Daz, is the place to find any information at all about Iray. Even Nvidia's own site doesn't have as much info as Daz, nor is it nearly as active. Like I have said before, Iray is a niche inside a niche.

    And I rather think that the Hollywood-backed firms would have the cash to get one of those servers. Where is Iray in Hollywood? The render engine that Iray has largely superseded, 3Delight, was featured in numerous Hollywood movies, but Iray gets no such love. After all, CEO Huang's entire push for GPU is the cost-to-performance (and space) ratio over CPU server farms. My comment still stands. I forgot the Quadro RTX 8000 is now selling for $5,500; it used to be $8,000. So at that price, you can build the same server for $18K quite easily just using PCPartPicker ( https://pcpartpicker.com/list/8hd7YH ). I didn't include the NVLink bridge, but the build I just linked is only $17.5K, and a Quadro NVLink bridge is only $80 on Nvidia's site. It's not going to dent the budget. And that also means that the server you linked would have cost a lot more when the Quadro RTX 8000 was at its original price.

    If you want to bring up the point that Hollywood doesn't want true realism in movies, and that Iray cannot really do the magical sparkles and fantasy stuff well...that's not my problem. And frankly, that is a pretty big negative considering that Daz is about artists, not architecture or cars. If anything, it is really odd that Daz chose a rendering engine that is better suited for architecture when so many people here love magic and fairies. And it doesn't even have to be magic. Iray straight up sucks at rendering something as basic as fire! Fire, cigars, burning embers; Iray has trouble with these things, and most of us have to resort to cheap tricks to render them. Which isn't all that real, then. You won't see a lit campground fire in a Honda render! Video games can do that better.

    Also, I don't see how one of the many other physically based render engines could not have helped Honda the exact same way they talk about Iray helping them. It's just math.

    And besides, competition isn't just about the big fish. It's about the smaller ones, too. You cannot just ignore an entire market segment. That server you mentioned? Nvidia only supplied the GPUs; they are not getting a dime for the other components, so whether you buy Nvidia GPUs by themselves or installed in giant prebuilt servers, Nvidia is still making the same general cut and profit. You dropped a couple grand on a Titan RTX; the Quadro RTX 8000 by comparison isn't really that much more expensive considering the VRAM is quadruple the Titan RTX. We've got all sorts of people here in this forum who have 1, 2, or more cards of 2080 Ti caliber or better. We've got a guy sporting $3,000 Titan Vs. The hobbyist and gaming market is pretty big; that's where Nvidia made its bread for most of the company's history. That has changed in recent years as Nvidia has pushed towards AI and workstations; however, the company still has gamers and hobbyists to support. And if they don't, the market can flip on a dime. Gamers have been very agitated about RTX, primarily the price. Things have been so bad that AMD is actually catching up to Nvidia in some markets with its 5700 XT. To be blunt, that should not be happening. AMD was practically on life support for several years; even as Ryzen took off in the CPU space, AMD had nothing to compete with Nvidia in the GPU space...and really, they still don't. Nvidia still has the fastest cards on the market. Nvidia cannot afford to go around and get cocky.

    And at the end of the day, it's still on Nvidia and Iray for making a solution that ultimately didn't work in Windows like it should. Other engines did.

  • RayDAnt Posts: 1,120
    edited September 2019

    I'm sorry, but it's going to take a lot more than Honda to persuade me differently.

    OK. Add to that:

    • BMW
    • Daimler Chrysler
    • Airbus
    • Boeing
    • Lockheed Martin

    In 2016, Honda alone spent upwards of 6 billion dollars just on research & development. I think it is fair to say that what companies like Honda choose to rely on for 3D visualization tools matters on a cross-industry level.

     

    The fact that Googling Iray leads to countless Daz-related things, and very little else, says it all.

    All it really says is that Daz Studio is the most popular application of Iray in the public domain. Iray is a heavily favored rendering solution - just primarily for in-house pre-visualization and production work. Speaking of which:

     

    If Iray can pool that much memory with GPUs and system memory, then tell me why Iray hasn't become a de facto standard for Hollywood, the place where that kind of memory and GPU acceleration would be quite welcome?

    Iray/Mental Ray has been a de facto standard in the Hollywood production world for almost 30 years. From a 2003 Mental Ray (Iray pre-Nvidia's buyout) press release:

    "The Academy Award is a wonderful recognition of our work", said Thomas Driemeyer, the chief architect of mental ray . "This is an appreciation of the increasing relevance of our fundamental research and software development for the movie industry," added Rolf Herken, President and Director of R&D at mental images. "The film industry is becoming increasingly dependent on technology to achieve ambitious creative goals, and mental ray is well positioned to become the rendering software of choice among Hollywood's leading studios."

    As innovative technologies further influence the movie industry, mental ray has become an indispensable tool for many filmmakers, almost as necessary as a camera. The world's leading cinematographers use mental ray to generate images of unsurpassed realism in visual effects and with films that are produced entirely using digital techniques. mental ray was initially released in 1989 and has been used in the production of over 100 major motion pictures such as:

    • "Harry Potter and the Chamber of Secrets"
    • "The Matrix" (including its 2003 sequels "The Matrix Reloaded" and "The Matrix Revolutions")
    • "Spider Man"
    • "Star Wars: Episode II - Attack of the Clones"

    as well as in numerous earlier motion pictures including "AI - Artificial Intelligence", "Jurassic Park III," "Panic Room," "Fight Club," and "The City of Lost Children."

    From another later press release - this time in conjunction with Industrial Light & Magic on making the movie Poseidon:

    To achieve realistic and life-like effects ILM used mental ray to have complete control over lighting and design of the entire vessel. ILM worked closely with the mental images's engineers to develop proprietary algorithms specifically designed for the project. One of the most difficult, but important challenges for the team was to create the natural light effects that would integrate the ship with the live actors. By utilizing the global illumination algorithms in mental ray the artists were able to create the lighting that makes the ship appear real to the human eye.

    Some more recent films/franchises in which Iray (post Mental Ray) has played a significant role:

    • Star Trek (J.J. era)
    • Iron Man 1, 2, 3
    • Jurassic World
    • Oz the Great and Powerful

     

    That includes Vray, which saw action rendering some backdrops for movies like Black Panther; why didn't they use Iray for that purpose?

    Did you know that core components of Vray, like its application of QMC (quasi-Monte Carlo) sampling, are actually just parts of Iray used under license? At the end of the day, all photoreal renderers are striving to achieve the same thing. And which ones borrow what code from whom can tell a very interesting story.

  • kyoto kid Posts: 40,575

    ...umm, the Titan RTX has 24 GB of VRAM, the Quadro RTX 8000 48 GB, which is only double, not quadruple, that.

    https://www.nvidia.com/en-us/titan/titan-rtx/#specs

  • Late to the party, just leaving this here in case anyone stumbles across this thread:

    Working NVLink memory pooling in Daz on Win10 with 2x 2080 Ti:
    https://www.daz3d.com/forums/discussion/comment/5701921/#Comment_5701921
