OT: New Nvidia Cards to come in RTX and GTX versions?! RTX Titan first whispers.

nicstt · September 2018

Takeo.Kensei said:

ebergerly said:

I think there are a few misconceptions going on there. For example, VRAM stacking is more about CUDA and Optix, not Iray. And certainly not the DAZ Studio Iray.

No misconception; and you forget about the TCC. Iray is built on Cuda and Optix. No TCC = no mem stacking with Optix = no mem stacking in Iray (it's the same for specific features in Cuda which need TCC)

Anyway I don't think we should go on speculating. Just have to wait for the cards to be out and an updated version of DS Iray

And to end the debate here is a snip from the Cuda 9.2 Doc with important points in red

9.1.4. Unified Virtual Addressing

Devices of compute capability 2.0 and later support a special addressing mode called Unified Virtual Addressing (UVA) on 64-bit Linux, Mac OS, and Windows XP and on Windows Vista/7 when using TCC driver mode. With UVA, the host memory and the device memories of all installed supported devices share a single virtual address space.

Prior to UVA, an application had to keep track of which pointers referred to device memory (and for which device) and which referred to host memory as a separate bit of metadata (or as hard-coded information in the program) for each pointer. Using UVA, on the other hand, the physical memory space to which a pointer points can be determined simply by inspecting the value of the pointer using cudaPointerGetAttributes().

Under UVA, pinned host memory allocated with cudaHostAlloc() will have identical host and device pointers, so it is not necessary to call cudaHostGetDevicePointer() for such allocations. Host memory allocations pinned after-the-fact via cudaHostRegister(), however, will continue to have different device pointers than their host pointers, so cudaHostGetDevicePointer() remains necessary in that case.

UVA is also a necessary precondition for enabling peer-to-peer (P2P) transfer of data directly across the PCIe bus for supported GPUs in supported configurations, bypassing host memory.

See the CUDA C Programming Guide for further explanations and software requirements for UVA and P2P.

The P2P is also an important feature to get quick communications https://developer.nvidia.com/gpudirect

Good luck to manage a multi GPU system's memory without NVDIA's API (read further the doc about coherency and other problems)

nicstt said:

sudo code

var isRtx = true;

if (isRtx){

DisableSharedRam();

}

a simple flag could be used to enable and disable features; the software can easily determine what card its running on; I'm not saying either way what they'll do; they could enable RAM pooling on RTX cards, but as has been said, they do have different markets. WDDM is responsible for the currently situation with missing RAM; Nvidia could disable that 'feature' I understand, but haven't.

I'll correct something there : It"s not about using a RTX or not. It's about running in TCC mode or not

var TCCactive = true;

if not (TCCactive){

DisableSharedRam();

}

But it"s more complex than that. The point is that memory should be stacked even with older GFX Card even with an updated software provided the configuration is correct

Should - by what definition?

They aren't is all that currently matters; they might be in the future, 'should' may or may not have a bearing on that outcome.

It's 'pretend-code' no correction was necessary; it's purpose was to get a point accross. It did that as it was a demonstation of how easy it is to add flags to disable or enable functionality... Well relatively easy as we're talking about coding, which can be fun.

bluejaunte · September 2018

ebergerly said:

bluejaunte said:

Reasonably sure. It's like any other "offline" renderer giving you a progressive preview, starting with a really rough, noisy approximation of the render and then it gradually clears up. Until you move the viewport again and it starts over. This isn't realtime, if it were it would render at like 30 to 60 frames per second and the finnished render would appear instantly.

Check this out...

https://www.youtube.com/watch?time_continue=23&v=l-5NVNgT70U

Looks cool. Denoising can be really nice when it works, but it has to keep fine details like skin pores, tiny "noisy" details in tact. I have to yet see that in action. It won't help me much when everything gets a clean flat look and then I'll have to let the render run until the detail is back again which is just as long as without denoising. I hope AI can really do this intelligently.

kyoto kid · September 2018

...general rule of thumb is twice the GPU memory. That means for two RTX8000's, 192 GB would be sufficient which is doable even under W7 Pro (W8.1 Pro will support up to 512 GB).

Not saying it wouldn't be an expensive system, but memory prices are falling and expected to fall by 25% next year as more fabrication plants come online. Using a slightly older series Xeon or i7 CPU (pre Kaby Lake) will also help with the cost.

Imagine though, having near real time rendering even on big high quality jobs.

Crikey, 32 GB of VRAM is nothing to sneeze at, and that is doable for around 8,000$.

The bottom line from all this is it appears that GeForce cards will continue to be limited to under 12 GB and there will be no memory stacking available to us as they don't support TCC. The only alternatives then are building a high memory "render farm in a box" with a couple older Xeon CPUs (still much faster than what I currently have), or find a way to scrape up the funds for those two RTX5000s the link bridge, and a system built around that.

ebergerly · September 2018

Takeo.Kensei said:

9.1.4. Unified Virtual Addressing

Devices of compute capability 2.0 and later support a special addressing mode called Unified Virtual Addressing (UVA) on 64-bit Linux, Mac OS, and Windows XP and on Windows Vista/7 when using TCC driver mode. With UVA, the host memory and the device memories of all installed supported devices share a single virtual address space.

FWIW, Unified Virtual Addressing (UVA) is a very old technology that's been pretty much replaced by Unified ("managed") Memory since way back in CUDA 6 days. I'm now running CUDA 9.2, but CUDA 10 is being released with the RTX stuff.

I've mentioned Unified Memory before, and the CUDA "cudaMallocManaged" function to allocate memory, and use all the GPU VRAM and system RAM as one big chunk of RAM. And it's one reason I wouldn't be surprised if the GeForce RTX cards with their NVLink might be able to stack VRAM, perhaps independent of TCC, as was mentioned by NVIDIA previously.

Out of curiosity I fired up some of my sample CUDA code in Visual Studio, and they have a very nice CUDA app that tells you just about everything about your GPU(s). And I found there's a way to determine if your GPU(s) support Managed (Unified) Memory. So I tweaked the code to check whether my 1080ti and 1070 support Managed Memory. And it turns out they do.

Note down in the very bottom of the attached list for my 1080ti is a "Device supports Managed Memory:" entry, which says "Yes", and the CUDA device driver mode is WDDM.

Now I'm not sure if this actually means there's a way to do stacked memory, and I'd have to dive much deeper to look into the inner workings of cudaMallocManaged to see if I can make one big stacked VRAM (even if it's over PCIe).

ebergerly · September 2018

kyoto kid said:

...general rule of thumb is twice the GPU memory. That means for two 8000's, 192 GB which is doable even under W7 Pro (W8.1 Pro will support up to 512 GB).

FWIW, on my big scenes I get 3 times. 10GB in GPU, 30+GB in system RAM.

kyoto kid · September 2018

...depends on how you optimise your scenes.

ebergerly · September 2018

kyoto kid said:

The bottom line from all this is it appears that GeForce cards will continue to be limited to under 12 GB and there will be no memory stacking available to us as they don't support TCC.

I don't suppose it would help to repeat what NIVIDIA said :

"With Quadro RTX cards, NVLink combines the memory of each card to create a single, larger memory pool. Petersen explained that this would not be the case for GeForce RTX cards. The NVLink interface would allow such a use case, but developers would need to build their software around that function."

So NVLink with GeForce cards would allow memory to be combined, but developers would need to build their software around that function.

kyoto kid · September 2018

...if GeForce cards will not support TCC like Tesla and the new Quadro cards, I don't see how that would be possible without maybe programming some form of TCC emulation or hacking the drivers to do so.

ebergerly · September 2018

An interesting quote from guruof3d a few years ago by a rep from AMD regarding DirectX 12's ability to stack VRAM;

"Robert Hallock (Head of Global Technical Marketing at AMD) shared something interesting on Twitter earlier on. You guys know that when your have a GPU graphics card combo with dual-GPUs that the memory is split up per GPU right ?

Thus an 8GB graphics card is really 2x4GB. As it stands right now, this will be different when DirectX 12 comes into play and apparently already is with Mantle. Basically he states that two GPUs finally will be acting as ‘one big’ GPU. Here's his complete quote:

Mantle is the first graphics API to transcend this behavior and allow that much-needed explicit control. For example, you could do split-frame rendering with each GPU ad its respective framebuffer handling 1/2 of the screen. In this way, the GPUs have extremely minimal information, allowing both GPUs to effectively behave as a single large/faster GPU with a correspondingly large pool of memory.

Ultimately the point is that gamers believe that two 4GB cards can’t possibly give you the 8GB of useful memory. That may have been true for the last 25 years of PC gaming, but thats not true with Mantle and its not true with the low overhead APIs that follow in Mantle’s footsteps. – @Thracks (Robert Hallock, AMD)

There is a catch though, this is not done automatically, the new APIs allow memory stacking but game developers will need to specifically optimize games as such. An interesting statement."

So, yet another indication that the capability may be available, but requires developers to jump thru hoops to implement. Again, the devil is in the details, and DirectX 12 is a bit of a different issue, but at least I think this gives some reason not to jump to conclusions at this point.

nicstt · September 2018

bluejaunte said:

ebergerly said:

bluejaunte said:

Reasonably sure. It's like any other "offline" renderer giving you a progressive preview, starting with a really rough, noisy approximation of the render and then it gradually clears up. Until you move the viewport again and it starts over. This isn't realtime, if it were it would render at like 30 to 60 frames per second and the finnished render would appear instantly.

Check this out...

https://www.youtube.com/watch?time_continue=23&v=l-5NVNgT70U

Looks cool. Denoising can be really nice when it works, but it has to keep fine details like skin pores, tiny "noisy" details in tact. I have to yet see that in action. It won't help me much when everything gets a clean flat look and then I'll have to let the render run until the detail is back again which is just as long as without denoising. I hope AI can really do this intelligently.

All I've seen so far is it is useless for skin at any noticeable reduction in time. Distance, and toony look might be ok.

bluejaunte · September 2018

nicstt said:

bluejaunte said:

ebergerly said:

bluejaunte said:

Reasonably sure. It's like any other "offline" renderer giving you a progressive preview, starting with a really rough, noisy approximation of the render and then it gradually clears up. Until you move the viewport again and it starts over. This isn't realtime, if it were it would render at like 30 to 60 frames per second and the finnished render would appear instantly.

Check this out...

https://www.youtube.com/watch?time_continue=23&v=l-5NVNgT70U

Looks cool. Denoising can be really nice when it works, but it has to keep fine details like skin pores, tiny "noisy" details in tact. I have to yet see that in action. It won't help me much when everything gets a clean flat look and then I'll have to let the render run until the detail is back again which is just as long as without denoising. I hope AI can really do this intelligently.

All I've seen so far is it is useless for skin at any noticeable reduction in time. Distance, and toony look might be ok.

Yeah. In theory the AI could go in and really analyze the scene and textures, find out what noise should be left alone. Don't know if/when this will happen. Until then I think we might have a new case of too clean CG look.

ebergerly · September 2018

Y'know I've been thinking about this VRAM stacking issue, and honestly I can't figure out what the big issue is. In its simplest form, from a programmers perspective RAM is nothing more than addresses and memory contents. So the VRAM on GPU1 has a bunch of memory addresses, and in each address there's some data. Same for GPU2. So as long as you know the addresses of all the combined GPU VRAM, it shouldn't be that big a deal to make them all one.

Except for the fact that the two GPU's are separated by a relatively slow PCIe bus. So if GPU1 wants scene data from GPU2 cuz part of the scene is located over there, it has to grab the data over the PCIe bus, but for other data its on the fast, local VRAM. So then you have to stop and wait, which will slow everything down.

But if you have a high speed NVLink to bypass that, then almost by definition it's all one VRAM, tied by a high speed NVLink. So it seems like it's just a matter of CUDA assigning all the VRAM of the combined GPU's as one big chunk of VRAM (aka, Unified Memory), and the software that's talking to CUDA (like Optix) doesn't care, because CUDA is handling the interface. And CUDA already has the Unified Memory model where everything is treated like one big chunk of memory, so I'm not sure what the problem is.

The more I think about this, the more it looks like there is some enabling/disabling of features that makes VRAM stacking a challenge on lower end cards. I'm in the middle of reading a book on CUDA, so maybe I'll give it a try and see if I can stack some VRAM.

Has anyone figured out what the issue is, and what I'm missing?

kyoto kid · September 2018

...well for now "out of the box", memory pooling is a "no go" unless you have a couple Volta or RTX Quadros and the appropriate NVLink bridge(s).

They walked that back somewhat saying it could be "possible", but only if the software itself (the render engine and possibly the actual graphics production programme used) is made compatible (still have trouble seeing it happening without TCC). Now Daz3D is a small company with a small development staff compared to say Autodesk or Adobe. It is primarily focused on their flagship programme, as well as a (much needed update of Hexagon. Furthermore the version of Iray embedded in Daz is not the full featured one that you can get directly from Nvidia (just as with 3DL). To use the standalone version, there also needs to be an plugin or bridge with Daz like for Octane or LuxRender. That doesn't exist, unless of course you are a programmer well versed in C++ and have access to both SDKs along with experience in the CUDA graphics language to tie everything together.

Kevin Sanderson · September 2018

To my knowledge from other threads, DAZ has no ability to change Iray. That's totally up to NVIDIA. DAZ only pops the Iray code into Studio and gets it working with Studio. Soon iClone users will have Iray, but users have to buy the plugin.

Mendoman · September 2018

Maybe it's just me, but I don't really get what's the big deal about NVLink memory pooling. People have been arguing about it for the last 5 pages or so, but to be able to use that, you'd actually have to buy 2 of those expensive RTX cards... and what I've been reading, very few seem to be willing to buy even one. If somebody here really needs 22GB VRAM for their scenes, I think there's lots of optimization to be done in their scenes. If they render 20k X 20k resolution posters or something, and need every possible normal map etc. in their scenes, then in my opinion it is already professional level work. Since Geforce cards are not professional level cards, it's kinda silly to blame them for not being able to do professional level job. Quadros etc are meant for that kind of work.

Sure, it would be super if we could stack 2080ti cards, but I won't loose my sleep if we can't, since I have zero intention to buy a second card.

ebergerly · September 2018

I agree Mendoman. My interest is that maybe the new CUDA 10 which is coming soon with RTX might conceivably allow memory stacking with existing GeForce cards. That would be nice. Not sure I really need it, but cool nonetheless.

ebergerly · September 2018

Oh wait.....DOH!!! Existing cards don't have NVLink. Yeah, you're right. Nevermind....

kyoto kid · September 2018

...I am one of those (as I've mentioned) who does need high quality at large resolution for fine art printing. While yes a more professional approach (as I used to work in oils and watercolours before arthritis put an end to that), I need to do this on a shoestring budget compared to professional studios, so most likely I will have to be content with the "Render farm in a box" solution built with older components.which I am saving for. When working on highly detailed and very involved scenes, optimisation methods many use can become an exercise in diminishing returns when it comes to workflow and some, like the Scene Optimiser do have an affect on final render quality when set for maximum render performance.

While a slightly higher price is always expected when more advanced technology is rolled out, in the end I saw GeForce RTX as I did Genesis 8, not enough significant change for my purposes or needs to justify the extra cost of buying into it as all those RTX and Tesla cores mean nothing if a scene cannot be contained in memory.

It seems every "new GPU model year" we get all the speculation and hype that gets us pumped up, often to be incorrect in the end. Again this time around was no different as it reminds me of when everyone (self included) was so excited about the forthcoming 980 Ti with "double the VRAM of the standard card" (8 GB). That was a letdown. After that I learned to become more skeptical and tend to take with such incredible predictions with a "salt lick" (Like the early talk of a 16 GB GeForce card that we would see with Volta or Turing). Given that initial statement from Mr. Peterson about pooling GeForce cards from by Nvidia, I am also convinced this will not happen either (save maybe with Otoy and Octane4, I'll see once the commercial subscription version becomes available [although out of core rendering already gives Octane an advantage over Iray]).

In retrospect, I see Pascal as being a more of an advance from the previous generation than Turing is from it as Pascal not only offered improved performance, but Nvidia delivered on all the hype around the release of not one but two 8 GB (1070 and 1080) cards which were fully compatible with and offered better performance for the games on the market at the time for the price (and eventually for us, allowed for rendering of bigger scene files).

With respect to gaming, Turing RTX is ahead of its time, which can be a bad thing if game development doesn't embrace ray tracing as hoped. It was also mentioned in the interview that microstutter, which affects smooth game play, will continue to be an ongoing annoyance even with NVLink. We CG enthusiasts and hobbyists are still a very small segment of the consumer GPU market and don't account for large sales numbers like the gaming community does. With 10xx cards becoming more available and continuing to drop in price, along with AMD making strides on their end (again games are not locked to specific graphics languages/APIs like 3D render engines are) not sure how this move will pan out for Nvidia. On the other hand I see the Turing RTX Quadro's possibly being more successful among the pro sector because they are specifically designed for CGI production and offer true memory stacking. I wouldn't be surprised to see film studios considering the move from CPU to NVLinked GPU rendering to reduce production time. 96 GB of video memory per node is a lot of graphics horsepower and I'm sure they can get a bulk purchase discount like supercomputer developers and large datacentres do.

Mendoman · September 2018

@kyoto kid This is totally off topic in a off topic thread, but if you are still interested in doing oil or watercolor paintings, there's really nice free program called PhotoSketcher that can turn digital images into paintings. I'm not sure if the program is 64 or 32 bit, so not 100% sure if it can handle really big images sizes, but that way you wouldn't need to use 4k texture maps in DS if you later turn your images to paintings.

kyoto kid · September 2018

.....I just mentioned painting as I have a lifetime background in traditional art media so given that, yes I am (or was) a "professional".

BTW do you mean FotoSketcher because PhotoSketcher is an Apple/iPhone app. I have played around with similar apps for Android and while fun, they are primarily just a limited set of filters that emulate various schools/artist styles rather than letting me develop a more personalised style on my own.

When I got into this 3D thing, it was with the realisation that it would be a whole different ballgame. In many ways I find it closer to photography, stagecraft, and set lighting (I also was a photography enthusiast for several years in the days when film was the medium as well as worked in set and lighting design in theatre).

Basically what I should have said is that I take the same approach to 3D CG as I did with traditional art mediums, looking to push it for all its got and seeing how far I can take it. If I just wanted to emulate what I used to do on canvas or paper with a computer, I'd get Corel's Painter, though I wouldn't be able really get as much out of it as I have lost a good deal of pressure sensitivity and as well as steadiness of hand (which is why I don't use a tablet or do a lot of post other than apply a filter or text to a rendered image). Hence, I have to get as much out of the render process as I can.

outrider42 · September 2018

Takeo.Kensei said:

ebergerly said:

From the NVIDIA Optix 5.1 Programming Guide:

"On system configurations without NVLINK support, the board with the smallest VRAM amount will be the limit for on-device resources in the OptiX context. In homogeneous multi-GPU systems with NVLINK bridges and the driver running in the Tesla Compute Cluster (TCC) mode, OptiX will automatically use peer-to-peer access across the NVLINK connections to use the combined VRAM of the individual boards together which allows bigger scene sizes."

With that it's already clear that you can forget memory stacking with Iray and any Optix based renderer.

For gamers Nvlink will have a big interrest vs SLI. I'm rather skecptical for Iray unless you buy Quadros or Titans

Now he does mention that performance may drop when two cards pool memory, but you need to stop and think about what he is talking about...video games. Video games need this memory speed because they are constantly swapping data in and out as they render over 60 frames every second. But Iray and other render engines are not like gaming at all. For Iray, the entire scene is loaded to GPU. You already have data flowing between cards in mGPU setups. NVLink greatly speeds up this flow of data as it bypasses pcie entirely.

That is not the case. PCIe communication is still required. See https://fuse.wikichip.org/news/1224/a-look-at-nvidias-nvlink-interconnect-and-the-nvswitch

outrider42 said:
So I believe that using NVLink to pool memory would not only increase memory, it may potentially even increase the render speed over the standard mGPU setup.

outrider42 said:
And as NVLink bypasses pcie, I see no reason why anyone would need a new motherboard to support such a feature. No place mentions needing new boards, and more over, there are no new motherboards coming, we would certainly have preorders for them if this was true.

Since Nvlink will be a dead end for Iray and Optix based renderers, you can already forget that for consumer cards. So no need for a new motherboard in the end. Unless some motherboard with PCIe 4.0 come out and the Real Time Raytracing would benefit of a larger bandwith I don' t really see the need

We can always dream that Nvidia would allow TCC on Geforce but I don't think that may happen

Otherwise, the only possibility I see is to use Linux.

ebergerly said:

bluejaunte said:

I do think 8x is a bit too optimistic after the comments by the Redshift dev who noted that a frame can need up to (if I understood correctly) 50% of calculations on shading where RT cores won't help.

I agree. As explained elsewhere, the "8x" boost that was mentioned is a "rays per second" speed increase, not a "relative decrease in render times". And there are a lot more steps to the overall rendering process than just the ray tracing calculations. As you said there's figuring out the color/shading of the surface that the ray hits, there's de-noising, there's any physics calculations, and on and on.

That being said, those other components also have been separated out to a large extent as has been mentioned elsewhere, and hardware and software has been developed and updated in RTX (NGX, Optix 5.0, Physx, CUDA10, etc.) designed to speed up those elements as well. So the big question is when and how well will that all come together to result in significant improvements in render times?

Clearly there will be improvements, probably resulting in render time improvements greater than the historical 33% decreases we've seen in past generations. And solely based on price, I think the 2080ti will have to cut render times in half compared to a 1080ti for it to be viable for those of us doing renders. Not many would pay $1,200 for a card that can't do what two $600 cards can do.

My only prediction is that one of the nicer and most notable effects of a fully implemented RTX is that the Iray preview will be much faster, much like the beautiful realtime previews in Blender's Eevee, which is done with existing hardware. And that probably requires you enable the new AI de-noising if/when it's available.

As far as bottom line render times, like I say it has to at least cut Iray render times in half compared to a 1080ti, so I won't be surprised at something like a 60-70% cut in render times. Which means a 10 minute render becomes a 3 or 4 minute render.

Personally, I'd be far more excited if Iray's realtime preview gets updated to become like Eevee using my existing hardware. I've already got two GPU's and I'm not interested in buying another one, especially at those prices.

The 5-8x speed up seems reasonable in my POV. I think it should even be possible to achieve higher performance. The introduction of tensor cores should open up some new ways to speed up renders and also bring some new possibilities in what we can do. For me it's just the beginning as software and techs should improve after a while and we'll see more gain after 1-2 years

Obviously the gpu will need to talk to the CPU, I never said otherwise. You are confusing the issue. I'm talking about how the GPUs talk to each other. Without these links, the gpu only has one way to share information with other cards in the system, and that is over pcie. NVLink bypasses this bottleneck.

The dude in the video straight up says that it is possible on gaming RTX, it just has to be done by the developer. This man is the Director of Technical Marketing, so if anybody knows what they are talking about, it is him.

Anyway, the NVLink part of the video takes place around 37 minutes. If you watch the video, it is more clear than the quotation that was brought up. Tom says it is very much possible for a game developer to set up a game to use both cards as a single pool of memory. But he also said that doing so would cause a performance hit because of latency. This is why I said what I said about Iray not being a video game. That latency will not be an issue at all, because we ALREADY use multiple gpus in Iray with the worst latency possible, given that is without any direct connection between the gpus at all. Iray can handle this latency.

I don't see why people think that whatever RTX does would encroach on Quadro in any meaningful way. If you have the cash, YOU WANT QUADRO. It is just that simple. The idea of linking two 2080ti's to create a homebrew Quadro monster is just silly. For CAD programs, you can use GTX, but you want Quadro. It does depend on what you do, so here's a video to demonstrate the difference. Note the CAD benchmarks, it is not even funny. Quadro will keep its user base. Nvidia is making mad stacks of cash off these DGX boxes as customers like freakin' Disney buy them up.

Nvidia is already reaching out to non gamers with everything that RTX does as it is. We have ray tracing, something that when enabled, all reports show it is a massive performance hog. No gamer on the planet is going to spend this crazy cash just to play 1080p. 1080p has basically been mastered for years. A 4 year old 970 will play almost anything at 1080p, let alone the more recent 1070, which is almost overkill for 1080p games. People who are buying anything beyond the 1070 are looking at 1440p and 4K, period, no exceptions. I have a feeling that gamers will reject RT for the most part because of this.

Now lets look at sales. Preorders for 2080ti have had no trouble selling out. Even with how absurdly high these costs are and all the controversy around ray tracing not even doing 1080p60, preorders are STILL selling very well. This may not speak for the long term, but at least for right now these sky high prices have not stopped RTX from selling. So this make me wonder...are gamers really the ones buying these?

You see, I believe the segment of non gamers is actually much larger than you guys believe. Certainly Daz Studio is super niche, but at the same time it looks like Daz has never been as popular as it is right now. You have dozens of other software in graphics design out there, and many are doing pretty well. I personally believe that graphics design is on the major upswing, and that yes, RTX is reaching out to these people. With the promising numbers we have seen proclaiming 5-8 times performance for OTOY, you can bet that the people in any ray tracing field will be snatching RTX cards up like hotcakes. Some might have Quadros, but many probably have Titans or x80ti class cards, because Quadro is not a huge benefit for them. Just look at Iray...Quadro does not help Iray in any way aside from a larger frame buffer. For most small time graphic artists, Quadro is just not worth it or it is simply unaffordable. For all the talk that kyoto says about Quadro, he does not own one, and that is because of the simple fact that Quadro is far too expensive. There are many people like that. I am pretty sure just about every person in this thread would love a nice big Quadro in their machine (or 2 or 3, LOL.) But the harsh reality is that most of us can not afford one. And Nvidia knows this. Nvidia undercuts themselves all the time! This is not new!

The $3000 Titan V is not even a full year old yet, and now it is totally outclassed by a card less than half its price. Who in their right minds is buying a Titan V right now as we speak? High end GTX series cards have been using the same chipsets as Quadros do for years. The x70 and x80 in particular have a Quadro counterpart each generation. And that is what you see right here. The 2080 and 2080ti have Quadro counterparts that cost thousands more. Does memory pooling really enchroach any further than they already do? No, because of what Quadro is specifically made to do. None of that changes with memory pooling.

As for the performance. You have ray tracing drastically enhanced, so the more ray tracing in a scene, the more of a boost there will be. On the shading side, CUDA seems to have around a 30% boost here, so the shading portion will get faster too. Another thought, I am thinking that with faster ray tracing, this will also help speed up shading, because the shading elements do not need to wait as long to begin. So the shading starts sooner, and is faster. You may see Tensor playing a role here as well, but that is me guessing. But again, I will point to the migenius benchmark which showed the Titan V performaing 2.5 times faster than 1080ti in actual Iray testing.

https://www.migenius.com/products/nvidia-iray/iray-2017-1-benchmarks

It is my believe that Iray is using Tensor to a degree to achieve this. In fact, all of the Volta cards that had Tensor showed major performance gains over Pascal. This is the scene that migenius used to test with.

This scene is indeed heavily ray traced, with reflections of light everywhere, dark corners in the back of the room that can challenge any GPU. It features light from ouside a window. But even without any ray tracing cores, Volta shows massive performance gains. We already know that with Turing, CUDA has only gained about 30% over Pascal, so there is no way that CUDA alone explains the 2.5X increase.

So this is why I believe that at minimum you will see around a 2.5X boost if nd when Daz gets the updated Iray plugin. It mat be possible that a portrait scene with all that SSS and other good stuff could see less benefit. But it will see a big benefit.

kyoto kid · September 2018

...of course I am not working with CAD so the embedded video above (which was definitely not anywhere near 37 minutes) pretty much says little to me as I am into creating fully rendered scenes in very high quality and large format. For GPU based rendering, video memory is very important, and the more, the better.

Maybe Otoy will get stacking to work in Octane4, we will just have to wait and see what transpires by next year (and hope the forthcoming subscription version supports it as well).

Believe me, I really would so much love to be wrong on this count, as yes it would be great to have almost the same amount of video memory (along with more CUDA, Tensor, and RTX cores) as the 6,300$ RTX6000 for less than half the price. I just feel that expectations are being raised further than they should be and there may be disappointment in the end. In this case better to err on the skeptical side than place a lot of hope in it.

Nvidia, Daz, "surprise me".

As to the question of would I love to have two RTX5000s with a NVLink bridge? You Betcha', however unless there is some sort of small windfall, most likely it won't happen so I have to keep going with the original option of throwing as many CPU cores and as much memory at the process in a dedicated render box as I can afford.

7thOmen · September 2018

outrider42 said:

The dude in the video straight up says that it is possible on gaming RTX, it just has to be done by the developer. This man is the Director of Technical Marketing, so if anybody knows what they are talking about, it is him.

Hmmmm....

outrider42 said:

Now lets look at sales. Preorders for 2080ti have had no trouble selling out. Even with how absurdly high these costs are and all the controversy around ray tracing not even doing 1080p60, preorders are STILL selling very well. This may not speak for the long term, but at least for right now these sky high prices have not stopped RTX from selling. So this make me wonder...are gamers really the ones buying these?

A couple problems here.

First we have no idea what the actual numbers are for preorders. Is it 1 or 10,000 examples? It's easy to sell out a low volume and use the 'Sold Out' sign to further increase the interest.

Second, if history has taught us anything, it's that on release day the volume is still rather low. The formal reviews have become available and potential buyers can now make much more educated buying decisions. If the new GPU (or any exciting new product) actually has something worth investing in, then the buying frenzy begins. But, oops, retailers run out! This is where the preorder scalpers move in, and they do. So, over time it has become obvious that a portion of preorders (and release day orders) are made as a short term investment, anticipating a market that favors the supplier. I've witnessed it over and over since at least the GTX 6xx. How does that equate to the preorder sales?

Preorders are a bad metric since the only data available is the number of 'Sold Out' signs.

outrider42 said:

The $3000 Titan V is not even a full year old yet, and now it is totally outclassed by a card less than half its price.

"Outclassed" is a strong sentiment. On paper, the Titan V is superior - more CUDA, more memory with greater bandwidth, more Streaming Multiprocessors/Tensor Cores/Texture Units/(you get the point I hope). We also do not know if nVidia will add ray tracing to the Titan V, but if they do, there is just more to work with. If you look very closely at the Turing core, you should find it difficult to distinguish a difference from Volta - even the non-HBM die size is the same. As I stated earlier in the thread, segregate some Tensors, change their algorithm, and market them as 'RT Cores'. Oh, and have marketing create some slick new naming conventions like "Giga Rays". I still have my doubts about NVLink being nothing more than a higher bandwidth SLI link - see the questions I raised in my earlier post.

I really don't want to debate this and I hope you are at least partially correct in that the new GPU family will give us 'renderers' some exciting new technology to enhance our experience. When the unbiased reviews appear in a week or so the facts will lay to rest all these 'what ifs'. Hard data, not theories, fuel my excitement and purchasing decisions, but I certianly can't help but wonder...

Omen

Ghosty12 · September 2018

Here is some more news and if true could be very concerning, with this bit of news at 4:26 in the video.. It seems that the cost of these new cards could be even more than first thought if bought from an AIB such as ASUS, MSI and so on.. As it seems that the AIB's are having to pay MSRP themselves and then putting their margin on top of that..

kyoto kid · September 2018

...well that would be pretty stupid on NVida's part.

Artini · September 2018

Technology costs - especially the new technology.

Maybe that explains, why ASUS only put 2 fans on their RTX 2080Ti card (instead of 3 fans, they have on RTX 2080 card).

They would like to save every penny, to maximize their already little extra income, that will come from selling these new cards ;)

Gator · September 2018

Mendoman said:

Maybe it's just me, but I don't really get what's the big deal about NVLink memory pooling. People have been arguing about it for the last 5 pages or so, but to be able to use that, you'd actually have to buy 2 of those expensive RTX cards... and what I've been reading, very few seem to be willing to buy even one. If somebody here really needs 22GB VRAM for their scenes, I think there's lots of optimization to be done in their scenes. If they render 20k X 20k resolution posters or something, and need every possible normal map etc. in their scenes, then in my opinion it is already professional level work. Since Geforce cards are not professional level cards, it's kinda silly to blame them for not being able to do professional level job. Quadros etc are meant for that kind of work.

Sure, it would be super if we could stack 2080ti cards, but I won't loose my sleep if we can't, since I have zero intention to buy a second card.

There's a lot of interest out there. 3D graphics have come a long way, and one thing that easily separates or separated CG from other types is/were lack of details. Clean, bare not-lived in areas just aren't very believable. Or an outdoors with just a few trees and plants. So now that we have a lot more computing power even at consumer levels, that means more details which means more memory.

Then on the professional side, lots of folks are using gaming cards for 3D other than Daz Studio. Lots of people's budgets don't allow for Quadro cards. Heck, there's lots of businesses getting by with GeForce series cards. Just because Nvidia markets it as a gamer's card doesn't mean it's strictly used for gaming.

Can't speak for others, but it seems there is a fair amount of interest in the memory pooling here. I am. Especially for folks not looking to spend 3K or easliy more - for example stacking a few 8 GB 2080's to give 16GB usable VRAM to Iray.

nicstt · September 2018

kyoto kid said:

...well that would be pretty stupid on NVida's part.

But wouldn't be the first time.

kyoto kid · September 2018

..yeah.

Pricing of the Titan Xp, and Titan V.

The 2 x 6 GB Titan-Z. (which cost 300$ more than the 12 GB Titan-X).

Having both the 12 GB Quadro M6000 and 12 GB Maxwell Titan-X on the market at the same time.

nicstt · September 2018

I wasn't alluding to the pricing, but it is true enough that they do make me wonder. They are the current top-dog; funny how companies always forget - that it never lasts. And usually in part at least because of their own actions.

Notifications

OT: New Nvidia Cards to come in RTX and GTX versions?! RTX Titan first whispers.

Comments

9.1.4. Unified Virtual Addressing

9.1.4. Unified Virtual Addressing

Adding to Cart…