What is the performance difference between workstation and desktop GPUs for CUDA computations?


Comparing workstation and desktop graphics cards, it is not clear what is actually different between the two. I understand that a workstation card is aimed at professional use and a desktop card at gaming, but what does that mean for CUDA computations? The posts I have seen seem to focus on CAD and 3D rendering. I am interested in running C/C++ code written to run on the CUDA cores of a Linux system.

Comparing two similarly priced cards, the desktop card provides twice as many CUDA cores and 50% more memory.

  1. Quadro K4200, currently priced at $789
    • 1344 CUDA cores
    • 4 GB 256-bit GDDR5
  2. GeForce GTX 980 Ti, currently priced at $699
    • 2816 CUDA cores
    • 6 GB 384-bit GDDR5

Are there any other differences that should be considered?
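
For reference, here is a minimal sketch (my own addition, not part of the original question) that queries the CUDA runtime for the properties relevant to this comparison. Note that CUDA reports streaming multiprocessors (SMs) rather than CUDA cores; the cores-per-SM count depends on the architecture (192 on Kepler cards like the K4200, 128 on Maxwell cards like the 980 Ti).

    // Minimal device-property query (illustrative sketch).
    // Compile with: nvcc query_devices.cu -o query_devices
    #include <cstdio>
    #include <cuda_runtime.h>

    int main() {
        int count = 0;
        cudaGetDeviceCount(&count);
        for (int i = 0; i < count; ++i) {
            cudaDeviceProp prop;
            cudaGetDeviceProperties(&prop, i);
            printf("Device %d: %s\n", i, prop.name);
            printf("  Compute capability: %d.%d\n", prop.major, prop.minor);
            printf("  Multiprocessors:    %d\n", prop.multiProcessorCount);
            printf("  Global memory:      %.1f GB\n",
                   prop.totalGlobalMem / (1024.0 * 1024.0 * 1024.0));
            printf("  Memory bus width:   %d-bit\n", prop.memoryBusWidth);
            printf("  ECC enabled:        %s\n", prop.ECCEnabled ? "yes" : "no");
        }
        return 0;
    }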

by Steven C. Howell 31.07.2015 / 23:27

1 answer


Both cards have their pros and cons:

The GTX is much better in terms of performance: link

The GTX 980 is based on the newer second-generation Maxwell architecture, which means it is more likely to support newer technologies than the Quadro K4200 (which is still based on the first-generation Kepler architecture, despite being part of the newest Quadro lineup).

About the Quadro:

So I was going to buy the Nvidia Quadro K4200, because I was told that its drivers supported better viewport performance. However, I saw many users on various forums say that they were not impressed by the viewport performance bump it provided, and that it was negligible. They were in the "GTX-has-much-better-specs-for-the-price" camp ("Team GTX").

I've then seen people with the viewpoint that specs don't matter at all, and that it's "all in the drivers" ("Team Quadro"). They proclaim that Quadro's superior drivers make a dramatic difference in Max workflow, and are totally worth the hefty price. They also say that there are important hardware differences as well, that it's not just optimized Quadro/throttled GTX drivers.

"Team GTX" then counters that this USED to be true, but that Quadro and GTX have converged in recent years. They give anecdotes on how well their Many of the benchmarks and discussions online are either outdated (Quadro NON-Kepler series compared, for instance), or they just compare just gaming cards/workstation cards without crossover. I've used head-to-head benchmark sites which show the GTX 980 being superior by a wide margin. But again, the benchmarks seem to be targeted at gamers.

Further complicating things are the GTX 970/980 vs. the Titan. It seems that there is little advantage offered by the Titan to justify the price for me.

source

See also:

GTX 900 series graphics cards are not yet supported by Adobe developers; there are no Nvidia drivers certified as Adobe-signed GTX 900 drivers, the way Nvidia does with Microsoft for Windows (WHQL, Windows Hardware Quality Labs). For now, GTX 900 GPUs work practically based on the CUDA API, and the GM204 GPU inside the GTX 980 works without the Maxwell improvements, behaving like an old Fermi or Kepler part.

For the moment, all the beasts such as the HP Z840, Precision T7810, Celsius R940 and ThinkStation P900 are based on Quadro cards, because the drivers signed for these GPUs have ISV certifications for all media, including video decoders and encoders for AE and PR.

It's not important to select a GPU that has a lot of Gpixel/s or Gtexel/s or a lot of memory bandwidth (OK, those are important, but...). First select a certified GPU, such as the low-budget Quadro 2000/4000/5000/6000, the Quadro K2000/K2200/K4000/K4200/K5000/K5200, or the special gaming GPUs that are certified for AE and PR (GTX 780, GTX Titan and GTX 780 Ti), and then look at the specs.

source

There is a much more detailed explanation in a similar question:

In general:

  1. If you need lots of memory then you need a Tesla or Quadro. Consumer cards ATM have at most 1.5 GB (GTX 480), while Teslas and Quadros go up to 6 GB.

  2. GF10x series cards have their double-precision (FP64) performance capped at 1/8th of the single-precision (FP32) performance, while the architecture is capable of 1/2. Yet another market-segmentation trick, quite popular nowadays among hardware manufacturers. Crippling the GeForce line is meant to give the Tesla line an advantage in HPC; the GTX 480 is in fact faster than the Tesla 20x0: 1.34 TFlops vs 1.03 TFlops, 177.4 GB/s vs 144 GB/s (peak). [A small benchmark sketch illustrating this cap follows after this quote.]

  3. Tesla and Quadro are (supposed to be) more thoroughly tested and therefore less prone to produce errors that are pretty much irrelevant in gaming, but when it comes to scientific computing, just a single bit flip can trash the results. NVIDIA claims that Tesla cards are QC-d for 24/7 use.

A recent paper (Haque and Pande, Hard Data on Soft Errors: A Large-Scale Assessment of Real-World Error Rates in GPGPU) suggests that Tesla is indeed less error prone.

  4. My experience is that GeForce cards tend to be less reliable, especially under constant high load. Proper cooling is very important, as is avoiding overclocked cards, including factory-overclocked models.

source
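
Point 2 above, the FP64 cap, can be observed from CUDA C++ directly. Below is a rough microbenchmark sketch of my own (not part of the quoted answer): it times the same serially dependent multiply-add loop in float and in double, and the ratio between the two runs shows how far the card's double-precision rate is throttled relative to single precision.

    // Rough FP32-vs-FP64 throughput sketch (illustrative, not rigorous).
    // Compile with: nvcc -O2 fp_ratio.cu -o fp_ratio
    #include <cstdio>
    #include <cuda_runtime.h>

    template <typename T>
    __global__ void fma_loop(T *out, int iters) {
        T a = (T)1.000001, b = (T)0.999999, c = (T)0.5;
        for (int i = 0; i < iters; ++i)
            c = a * c + b;  // one dependent multiply-add per iteration
        out[blockIdx.x * blockDim.x + threadIdx.x] = c;  // keep result live
    }

    template <typename T>
    float time_ms(int iters) {
        T *out;
        cudaMalloc(&out, 256 * 256 * sizeof(T));
        cudaEvent_t start, stop;
        cudaEventCreate(&start);
        cudaEventCreate(&stop);
        cudaEventRecord(start);
        fma_loop<T><<<256, 256>>>(out, iters);  // 65536 threads total
        cudaEventRecord(stop);
        cudaEventSynchronize(stop);
        float ms = 0.0f;
        cudaEventElapsedTime(&ms, start, stop);
        cudaEventDestroy(start);
        cudaEventDestroy(stop);
        cudaFree(out);
        return ms;
    }

    int main() {
        const int iters = 1 << 18;
        time_ms<float>(iters);  // warm-up / context creation
        float f = time_ms<float>(iters);
        float d = time_ms<double>(iters);
        printf("float: %.1f ms  double: %.1f ms  double/float: %.1fx\n",
               f, d, d / f);
        return 0;
    }

On a card with full-rate (1/2) FP64 the double run should take roughly twice as long as the float run; a much larger ratio (around 8x on GF10x GeForces; later consumer architectures cap FP64 even harder) reflects the cap described above.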

There is also a discussion about this in the comments section of this post

There is a clear performance difference in general-purpose GPU computing using CUDA. While GeForces do support double-precision arithmetic, their performance appears to be artificially capped at 1/8 of single-precision performance, whereas Quadros get 1/2 of single-precision performance, as one would expect. Disclaimer: this is second-hand knowledge, I don’t do CUDA myself. In OpenGL, on the other hand, both Quadros and GeForces only use single-precision arithmetic. This is based on observing our CAVE, where the Quadros running the render nodes show exactly the same rounding problems when working with large-extent model data as regular GeForces do. Fortunately, there are workarounds in place.
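
The single-precision rounding problem mentioned there is easy to reproduce even on the host, with nothing beyond standard C++ (the numbers below are my own illustration): at coordinates around 1e7, typical for large-extent model data, adjacent float values are a full unit apart, so sub-unit offsets are silently rounded away.

    // Demonstrates float precision loss at large coordinate magnitudes.
    #include <cstdio>

    int main() {
        float big = 10000000.0f;    // ~1e7: adjacent floats are 1.0 apart here
        float moved = big + 0.25f;  // a quarter-unit offset, as in large scenes
        printf("float : offset lost? %s\n", moved == big ? "yes" : "no");

        double bigd = 10000000.0;   // double still resolves tiny offsets here
        double movedd = bigd + 0.25;
        printf("double: offset lost? %s\n", movedd == bigd ? "yes" : "no");
        return 0;
    }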

Another useful link that stvn66 found and will summarize: link

by 01.08.2015 / 00:41