Computer failure. Solved. Now playing with 1070 ti!


Comments

  • outrider42 Posts: 3,679
    edited July 2018
    ebergerly said:

    Nvidia video cards for consumers are designed for gaming, not Iray. The fan profiles and cooling solutions are designed with this in mind. This is why even a really well-built card with excellent cooling is going to exceed its designed heating threshold when subjected to Iray. The way video cards are designed is like American cars: they are built to break. Consequently we are left with the need to do everything in our power to extend the MTBF of our investment in these expensive video cards as best we can. Use MSI Afterburner, open the case, and point a fan at the computer to blow cool air in, especially during the summer months.

    I realize that's a popular belief, but do you have any references to support any of that? Because there are a lot of people who successfully use their GPUs on the standard fan profiles for rendering Iray for years, and I don't know of any facts to support the belief that their cards would have lasted longer if they were just gaming and not doing Iray. Electronics have thermal limits because of physical characteristics of the silicon, etc. And they have continuous ratings, because that's what they can withstand continuously. For some reason some like to believe that using something at the continuous ratings for a longer period is worse. I don't think the facts support that. My GPUs' fans keep the temps at or below about 80C, whether I'm rendering for 20 minutes or 20 hours. 80C max. That's not a damaging temperature. Just check the equipment specs.

     

    If video card makers wanted, they could use something other than solder to connect the GPU to the video card, which is the Achilles heel of the card: most failure is due to the constant cooling and reheating of the solder connecting the chip to the board. When this starts to happen your GPU becomes disconnected from the video board and that's it for your video card (time to buy another). Keeping the card cool lengthens the life of these solder points. If solder was a good solution we would have it on expensive desktop CPU sockets, but most good motherboard manufacturers don't solder CPUs, because the CPU is the hottest part of the system and the most expensive. Consequently, we have sockets that make a direct connection to the CPU without solder, and really good air or water cooling for the CPU. Solder is a cheap way for companies to attach something to a board, and it breaks down!  Do CPUs and GPUs wear out? Yes they do, but most of the time it is bad solder contacts or bad resistors on the boards that bring the CPU/motherboard system or video card down. Yes, I'm oversimplifying, but it would take pages and pages to explain why things are engineered to break. Just ask any good engineer about the problem Henry Ford ran into for making a car that could run too long without breaking.

    Oh, Kyoto Kid yes MSI Afterburner will work with any video card.

     

    If they are designed to break, then why would a company offer a 10 year warranty? Warranties are only profitable when they are NOT used by customers, so such an offer would be really bad for business. And since GPUs are not cheap parts, when one dies, whether under warranty or not, the user is not going to be very pleased that their GPU died. That's not good for business, either.

    I get what you are trying to say: the launch of the Xbox 360 suffered from poor soldering. It was the infamous "Red Ring of Death" and it affected extremely high numbers of the console. This problem led to a PR nightmare and extremely angry customers. Eventually MS extended the warranty period. It's generally not a good idea to engineer a product to expire before its warranty, because the customer backlash is not worth skimping on decent solder. Microsoft lost a lot of customers because of this issue.

    But here's the thing...Quadros and GTX are manufactured with essentially the same components. There is no super special secret solder being used in Quadro or Tesla, and they all use the same kinds of cooling solutions. They may cherry pick some parts, but so do 3rd party board makers. Take note that 3rd party boards almost always have better cooling solutions than Nvidia's own (like the Founder's Edition.) Nvidia only supplies the chip itself to these 3rd parties, and while the 3rd parties must conform to certain specs, like VRAM, they are free to use what components they want. So the notion that Quadro has better components is a fallacy, because that only holds when comparing Nvidia's own boards; the situation can change when comparing a Quadro to a 3rd party GTX. The biggest hardware difference between Quadro and gaming cards is that Quadro is downclocked for stability. Gaming cards, on the other hand, may be overclocked by the gamer, and BTW that is also why I generally do not recommend overclocking for Iray. The performance gains are not enough to justify pushing the hardware beyond spec. The equivalent Quadro chip will in fact perform most tasks slower because it is clocked slower. The other hardware difference is that Quadro uses ECC memory, again for stability. The rest is basically software, as Nvidia takes special care to separate Quadro, Tesla and gaming as much as possible. It is about business. Remember, Nvidia prohibits using GTX in workstations, even though you could if you really wanted to.

    As for suggesting that Iray is not really intended for GTX, that is simply not true. Iray is a CUDA application, and thus any CUDA based hardware can run it. If Nvidia had intended for GTX to not run Iray, then they would have done something to lock GTX out of Iray, because other pro features are indeed locked out of GTX, like CAD. But it should be obvious why an Iray user would desire a Quadro, they have much more VRAM for Iray, and that is pretty important as every user of Daz can attest to. You will not see a 32GB GTX card (at least for many more years.)

    Ultimately, the downclocking is the biggest reason why a Quadro may outlast a GTX card. If you clock the cards the same, I would bet you a pretty large sum of cash that the cards will last the same amount of time.

     

    I don't mean to derail, but this is a topic that doesn't get enough coverage. Now that you have a 1070 Ti, that is really awesome, and I think you will love it. I believe you made a great choice going with that card, and I invite you to run some of the benchmarks on it from the benchmark thread.

    https://www.daz3d.com/forums/discussion/53771/iray-starter-scene-post-your-benchmarks/p1

    I made my own benchmark scene which is posted on one of the last pages of that thread. I'd love to see what your stats are so we can truly test if it runs Iray faster than the 1080.

    BTW, I very highly recommend using some kind of fan controlling software on any GPU. It is important to make sure the card is cooling itself properly, and as long as it does so, it should run fine for a long time. Personally I use the EVGA PrecisionX software, but you can certainly use MSI Afterburner. I would not use both; pick one or the other, and if you want to test one, disable the other before doing so. Anyway, you can set a fan curve that adjusts the fan speed more aggressively than what your card shipped with. This is super important! The easiest way is to simply select the option for performance, or whatever it is called. But you can also totally customize the fan curve to run at the exact speed you desire at the exact temps you select. The goal should be to keep temps below 70 C, which is pretty easy with most fan curves. Mine stay below 60 at max load.
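    A custom fan curve is really just a mapping from temperature to fan speed. Here's a hypothetical sketch of how one could be evaluated, assuming linear interpolation between user-set points (PrecisionX and Afterburner may interpolate differently; the specific curve points are examples, not recommendations):

```python
# Hypothetical fan-curve evaluator: maps GPU temperature (C) to fan speed (%).
# Assumes linear interpolation between user-set points, which is how most
# fan-control tools behave, though the exact method is tool-specific.

def fan_speed(temp_c, curve):
    """curve: list of (temperature_C, fan_percent) points, sorted by temp."""
    if temp_c <= curve[0][0]:
        return curve[0][1]          # below the first point: floor speed
    if temp_c >= curve[-1][0]:
        return curve[-1][1]         # above the last point: max speed
    for (t0, s0), (t1, s1) in zip(curve, curve[1:]):
        if t0 <= temp_c <= t1:
            # linear interpolation between the two neighbouring points
            return s0 + (s1 - s0) * (temp_c - t0) / (t1 - t0)

# An aggressive example curve: quiet at idle, ramping hard before 70 C
curve = [(30, 20), (50, 40), (60, 70), (70, 100)]
print(fan_speed(40, curve))   # 30.0
print(fan_speed(65, curve))   # 85.0
```

    The shape of the curve is the whole game: the steeper the ramp just below your target temperature, the harder the fans fight to hold the card there.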

    Here is how to set up a fan curve in PrecisionX.

    https://www.evga.com/support/faq/afmviewfaq.aspx?faqid=59657

    And do note that the software must be running in order to control the fans. And it is good to check sometimes to make sure it is engaged. I have had a couple of times where PrecisionX was running, but it didn't apply the fan curve. I noticed this because I didn't hear my fans kicking up during the render, so I checked and indeed it wasn't applying my curve, and my temps were climbing. So my point here is to be vigilant about your temps, because at the end of the day, this is what will determine the life expectancy of your GPU.
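    For the "curve silently not applied" case above, a small watchdog can do the checking for you. This is only a sketch: the `nvidia-smi` query flags are real, but the 80 C alarm level, the polling interval, and the function names are assumptions for illustration.

```python
# Hypothetical temperature watchdog: poll the GPU with nvidia-smi and warn
# if the temperature creeps past a level your fan curve should never allow.

import subprocess
import time

ALARM_C = 80  # assumed alarm level; set it above your normal rendering temps

def parse_temp(raw):
    # "nvidia-smi --query-gpu=temperature.gpu --format=csv,noheader" prints
    # one integer (degrees C) per GPU, one per line; take the first GPU.
    return int(raw.strip().splitlines()[0])

def read_gpu_temp():
    raw = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=temperature.gpu", "--format=csv,noheader"],
        text=True)
    return parse_temp(raw)

def watch(poll_seconds=30):
    while True:
        temp = read_gpu_temp()
        if temp >= ALARM_C:
            print(f"WARNING: GPU at {temp} C - check that your fan curve is active")
        time.sleep(poll_seconds)
```

    Leaving something like `watch()` running in a console during a long render means a quiet fan failure shows up as a warning instead of as heat.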

    Post edited by outrider42 on
  • ebergerly Posts: 3,255
    edited July 2018

    Outrider42, why do you think it's "super important" to keep temps below 70C? Here's what NVIDIA says:

    "NVIDIA GPUs are designed to operate reliably up to their maximum specified operating temperature. This maximum temperature varies by GPU, but is generally in the 105C range (refer to the nvidia.com product page for individual GPU specifications). If a GPU hits the maximum temperature, the driver will throttle down performance to attempt to bring temperature back underneath the maximum specification. If the GPU temperature continues to increase despite the performance throttling, the GPU will shutdown the system to prevent damage to the graphics card. Performance utilities such as EVGA Precision or GPU-Z can be used to monitor temperature of NVIDIA GPUs. If a GPU is hitting the maximum temperature, improved system cooling via an added system fan in the PC can help to reduce temperatures."

    My GPUs are designed to maintain a max of 80C. If it's that important to be below 70C, why didn't NVIDIA use a 70C setting?

    Post edited by ebergerly on
  • Taoz Posts: 10,256
    edited July 2018
    ebergerly said:
    My GPUs are designed to maintain a max of 80C. If it's that important to be below 70C, why didn't NVIDIA use a 70C setting?

    MSI (and possibly other brands) have their "Silent Fan" feature, which means the fan speed adjusts dynamically to the temperature, and generally the fans don't run at all unless you are gaming or rendering. Still, I've never seen the temperature go beyond 65-67 C even during hour-long renders, where the fans run at about 65% speed. There must be a reason why they keep the temperature that low when possible (there may be cases where the temperature goes higher because the fans can't hold it at that level even at 100% speed, due to poor case cooling).

     

    Post edited by Taoz on
  • ebergerly Posts: 3,255
    edited July 2018
    GPUs can run at different temps for many reasons. How much heat do they generate? 250 watts? 100 watts? How much cooling do you have? How effective is it? Do you have multiple GPUs side-by-side? The point here is whether it's super important to stay below 70C when NVIDIA says they will operate reliably below 100C. If you run at 60C, maybe it's because it's a low-power unit with effective cooling and a manufacturer who designed a higher safety margin into its fan controls. That doesn't mean all GPUs need to run at that temperature or else they'll get damaged.
    Post edited by ebergerly on
  • Daikatana said:
    Taoz said:
    ebergerly said:

    Nvidia video cards for consumers are designed for gaming, not Iray. The fan profiles and cooling solutions are designed with this in mind. This is why even a really well-built card with excellent cooling is going to exceed its designed heating threshold when subjected to Iray. The way video cards are designed is like American cars: they are built to break. Consequently we are left with the need to do everything in our power to extend the MTBF of our investment in these expensive video cards as best we can. Use MSI Afterburner, open the case, and point a fan at the computer to blow cool air in, especially during the summer months.

    I realize that's a popular belief, but do you have any references to support any of that? Because there are a lot of people who successfully use their GPUs on the standard fan profiles for rendering Iray for years, and I don't know of any facts to support the belief that their cards would have lasted longer if they were just gaming and not doing Iray. Electronics have thermal limits because of physical characteristics of the silicon, etc. And they have continuous ratings, because that's what they can withstand continuously. For some reason some like to believe that using something at the continuous ratings for a longer period is worse. I don't think the facts support that. My GPUs' fans keep the temps at or below about 80C, whether I'm rendering for 20 minutes or 20 hours. 80C max. That's not a damaging temperature. Just check the equipment specs.

    I used to think so too but there must be a reason why nVidia reduces the warranty to 90 days (normally 3 years for gaming cards) on cards designed specifically for mining:

    "What’s being stripped away? Details are scarce, but the manufacturers are removing HDMI and/or Display ports as image outputs from the mining-specific hardware. They will also come with reduced 90-day warranty periods due to the intensive 24-hour operation that the bitcoin mining GPUs are likely to see."

    https://www.ccn.com/nvidia-amd-to-release-cheaper-bitcoin-mining-gpus/

    First off, a lot of mining setups are not engineered well. The airflow is not directed correctly.  In fact, some are not even set up in proper cases but simply mounted on a backplane and running in ambient room air.  That's going to shorten the life of any electronics.  In an intelligently designed PC, the case itself assists with cooling in that it constrains the airflow brought in and routed through by intelligently placed fans.  This is just simple engineering at work and it's a beautiful thing.  My PC case, for example, has cool air from almost floor level pulled in by a fan, and that airflow is directed up and over what needs to be cooled before being pushed out of the case by a strategically placed exhaust fan at the top rear of the case.  Now if I had 3 or more GPUs in my case I would consider a liquid cooling setup for the GPUs necessary, because you can only do so much with engineered airflow.  However, for a single GPU, or two GPUs in a well-designed case with intelligent fan placement and direction, an ambient room temperature of about 68F-75F is perfectly fine for both hours of gaming and long rendering stretches.  I think a lot of people get alarmed when the thermal sensors cause the fans to ramp up, but that's what they were designed and built to do.

    BINGO!!!  I'm not an engineer, but after over 35 years in operations and manufacturing within computer and medical device settings, one thing I've learned about controlling heat in a critical enclosed environment is to REMOVE heat, rather than trying to cool it.  It is ridiculously inefficient to try to cool hot air.  There will ALWAYS be hot spots, even inside a computer case.  As much hot air as possible needs to be removed from the enclosure, THEN you can start to consider how to cool it further.

  • drzap Posts: 795
    edited July 2018

    I think we can all agree that heat is the enemy of electronic parts in our computers.  The higher the temperature, the better chance of failure.  We want as low a chance of failure as reasonably possible, so it is always a good idea to bolster your cooling system in some way.  Sure, Nvidia says their products are safe at 80 degrees, but I'm not going to test that statement with my equipment!  Don't put yourself in the poor house over it, but lowering the temp in your computer box is a good move, whether it is by liquid cooling, higher capacity fans, or just using the software to boost the settings.  Because anomalies do exist and some boards come from the factory with weaker connections than others.  It's just not worth it (at least for me) to simply let things go at stock levels and hope nothing goes wrong.  As has been mentioned, a well-engineered cooling and ventilation system in your case is a strong first step.

    Post edited by drzap on
  • ebergerly Posts: 3,255
    drzap said:

    I think we can all agree that heat is the enemy of electronic parts in our computers.  The higher the temperature, the better chance of failure.  

    I think that's what causes so many misconceptions and so much paranoia. Heat is not the enemy. Heat above a certain level can be damaging. But just like you don't go out to your car and upgrade the cooling system and radiator fan (on something that's far more expensive than a GPU), you don't need to upgrade the cooling on your computer/GPU. As long as you operate your components within their design specifications, and are aware of the specifications, you should be okay. Just like your car...you can operate it at 70mph all day, and as long as the TEMP light doesn't go on you're okay. And just like you need to make sure there are no radiator hose leaks or loose fan belts, make sure your GPU cooling fans aren't clogged and monitor temps regularly.

    Now, if someone wants to be extra extra careful and doesn't mind spending time and money, then that's their choice. But finding a fact-based, rational proof that it is making any difference whatsoever to the overall life of the device might be somewhere near impossible.  

     

  • kyoto kid Posts: 41,851

    Remember, Nvidia prohibits using GTX in workstations, even though you could if you really wanted to.

    ...interesting as I've been to custom build sites that build workstations which also offer options for a GTX 1070, 1080, 1080 Ti, and Titan Xp along with the Quadro series.

    Not sure what they would or even could do. 

  • drzap Posts: 795
    kyoto kid said:

    Remember, Nvidia prohibits using GTX in workstations, even though you could if you really wanted to.

    ...interesting as I've been to custom build sites that build workstations which also offer options for a GTX 1070, 1080, 1080 Ti, and Titan Xp along with the Quadro series.

    Not sure what they would or even could do. 

    Nvidia's prohibitions are limited to commercial services involving their processors, such as render farms.  Even then, it's not very enforceable.  I use a European render farm that rents out workstations with eight 1080 Tis.  Maybe the fact that they rent access to the PCs instead of running the jobs for the customers allows them to skirt the prohibition.

  • AnotherUserName Posts: 2,727
    edited July 2018

    This one ran at 64 C for a little over 10 minutes.

    Neon.png
    Post edited by AnotherUserName on
  • AnotherUserName Posts: 2,727

    AnotherUserName said:

    So maybe this is a no-brainer, but should I run my fans at 100% while I render or is that unnecessary? I never have anything super complex going on...

     

    Firstly, congratulations on your new card - good to see you enjoying the new speed. 

    I have a very similar setup to Nanffuak who posted earlier, and I also use a custom fan profile.  I have used MSI Afterburner for quite a long time, but I've recently had to stop using it and go with the software that came with my card, because Riva Tuner Statistics Server is used to display the graphs and I'm having problems with it since the update to Windows 10 1803.  But if you're not running that build of Windows 10 then I would definitely recommend it.  I usually run the fans at 30% at 50 Celsius and at 60% at 70 Celsius, and this is enough to keep them at around 70 Celsius during a prolonged render.  If it's a very hot day then I turn the GPU fans up for long renders.

    Hi Reaper. Quick question(s). I haven't looked at MSI Afterburner yet but will soon. I looked in my thermal controls and it looks like I have direct control over my TOP fan and my PCI fan. I can create custom profiles. Which of these fans should I adjust to keep the card cool? Should I use both? And can I overdo it? I don't have direct control over my GPU fan. Is it possible to run the computer too cool?

  • Dim Reaper Posts: 687
    edited July 2018

    Hi Reaper. Quick question(s). I haven't looked at MSI Afterburner yet but will soon. I looked in my thermal controls and it looks like I have direct control over my TOP fan and my PCI fan. I can create custom profiles. Which of these fans should I adjust to keep the card cool? Should I use both? And can I overdo it? I don't have direct control over my GPU fan. Is it possible to run the computer too cool?

     

    I'm not 100% sure (so correct me if I'm wrong) which thermal controls you mean - ones that control your case fans, or ones that came with the 1070 Ti?  Also, when you say "TOP fan" are you referring to one on the 1070 Ti or a case fan?  I'm going to take a guess that you are referring to case fans, in which case I would say that if you had no previous heat problems, leave them as they are, unless they are very quiet or your computer is somewhere that extra noise won't bother you.

    Over the past couple of months I upgraded from a 980Ti to a 1080Ti and it ran at a lower temperature and fan setting.  A couple of weeks later I upgraded the power supply and added back in the 980Ti and this was when I had to make some changes to the fan profiles for the graphics cards.  I have also moved a case fan that used to be on the bottom of the case pulling air in (new PSU needed that space).  I now have it blowing air between the two cards, towards the back of the computer.  To be honest, it doesn't make a huge difference to graphics card temperatures, whereas adjusting the fan profiles definitely did.

    It might be a good idea to download GPU-Z.  There is a tab labelled "Sensors" that will allow you to monitor your card.  Have that running and then set your machine off on an Iray GPU render and keep an eye on the temperature of the GPU.  You may find that even under a rendering load your temperatures stay low with the default fan settings - in which case you don't need to tinker with fan profiles in MSI Afterburner or similar.

    EDIT:  I've just seen your post above the image you posted (nice work btw).  I would try a render that takes around 30 minutes.  If your GPU temp is still around 64C after that then don't worry about changing anything.
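    If you log the GPU-Z sensor readings (or nvidia-smi output) during that 30-minute test render, a quick summary makes the decision for you. This is a hypothetical helper; the 70 C comfort level echoes the advice in this thread, not an official spec:

```python
# Hypothetical log summariser: given temperature samples (degrees C)
# captured during a long render, report peak and average and say whether
# the stock fan profile looks good enough. The 70 C "comfort" threshold
# is this thread's rule of thumb, not a manufacturer limit.

def summarise(samples, comfort_c=70):
    peak = max(samples)
    avg = sum(samples) / len(samples)
    verdict = "fine as-is" if peak <= comfort_c else "consider a steeper fan curve"
    return peak, round(avg, 1), verdict

peak, avg, verdict = summarise([55, 60, 64, 64, 63])
print(peak, avg, verdict)  # 64 61.2 fine as-is
```

    A steady 64C peak over a long render, as reported above, would land comfortably in the "fine as-is" bucket.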

    Post edited by Dim Reaper on
  • AnotherUserName Posts: 2,727
    edited July 2018

    Hi Reaper. Quick question(s). I haven't looked at MSI Afterburner yet but will soon. I looked in my thermal controls and it looks like I have direct control over my TOP fan and my PCI fan. I can create custom profiles. Which of these fans should I adjust to keep the card cool? Should I use both? And can I overdo it? I don't have direct control over my GPU fan. Is it possible to run the computer too cool?

     

    "I'm not 100% sure ( so correct me if I'm wrong) which thermal controls you mean - ones that control your case fans, or ones that came with the 1070 Ti?  Also, when you say "Top fan" are you referring to one on the 1070 Ti or a case fan?  I'm going to take a guess that you are referring to case fans, in which case I would say that if you had no previous heat problems, leave them as they are unless they are very quiet or your computer is somewhere that extra noise won't bother you.

    Over the past couple of months I upgraded from a 980Ti to a 1080Ti and it ran at a lower temperature and fan setting.  A couple of weeks later I upgraded the power supply and added back in the 980Ti and this was when I had to make some changes to the fan profiles for the graphics cards.  I have also moved a case fan that used to be on the bottom of the case pulling air in (new PSU needed that space).  I now have it blowing air between the two cards, towards the back of the computer.  To be honest, it doesn't make a huge difference to graphics card temperatures, whereas adjusting the fan profiles definitely did.

    It might be a good idea to download GPU-Z.  There is a tab labelled "Sensors" that will allow you to monitor your card.  Have that running and then set your machine off on an Iray GPU render and keep an eye on the temperature of the GPU.  You may find that even under a rendering load your temperatures stay low with the default fan settings - in which case you don't need to tinker with fan profiles in MSI Afterburner or similar.

    EDIT:  I've just seen your post above the image you posted (nice work btw).  I would try a render that takes around 30 minutes.  If your GPU temp is still around 64C after that then don't worry about changing anything".

    The fans I was referring to do seem to be case fans. I have two fans on the card but only one is displayed in my thermal controls. I just did a render that took 20 minutes and I saw consistent results with the temperature. The GPU fan was at a high of 46% to keep the GPU at a constant 64C.

    Now that I have this new card, my paranoia can finally level up!

    EDIT: Whats up with the quotes function? Seems to be broken...

    Post edited by AnotherUserName on
  • sapat Posts: 1,735
    sapat said:

    Well, now I'm looking at the 1080Ti since it has 11GB.  I have 3GB currently, so 6 or 8 doesn't sound like a big enough leap.  Do you think a 750W power supply is enough for a 1080Ti?

  • ebergerly Posts: 3,255
    edited July 2018
    sapat said:
    sapat said:

    Well, now I'm looking at the 1080Ti since it has 11GB.  I have 3GB currently, so 6 or 8 doesn't sound like a big enough leap.  Do you think a 750W power supply is enough for a 1080Ti?

    More than enough. It draws 250 watts. I have a 1080 Ti with a 1070, and a 750 watt power supply, and I barely reach 400 watts during a render for my entire computer.

    I recommend people spend $30 on a power meter to measure actual power usage rather than getting wrapped up in the "you need MORE POWER" paranoia.
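    The arithmetic behind that is simple enough to sketch. The 250 W and 150 W figures below are Nvidia's listed board powers for the 1080 Ti and 1070; the 100 W for the CPU and the rest of the system is a rough assumption, which is exactly why measuring with a meter beats guessing:

```python
# Back-of-envelope PSU headroom check. Board-power figures for the GPUs
# come from Nvidia's specs; the 100 W "rest of system" number is a rough
# assumption - a power meter gives the real total.

def psu_headroom(psu_w, loads_w):
    total = sum(loads_w)
    return psu_w - total, total / psu_w * 100  # (spare watts, % utilised)

headroom, pct = psu_headroom(750, [250, 150, 100])
print(headroom, round(pct))  # 250 67
```

    Running a PSU at roughly two-thirds of its rating, as here, also tends to keep it in its most efficient operating range.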

     

    Post edited by ebergerly on