13.05.2004, 09:05     # 48
grogi
Moderator

Registered: 09.08.2002
Location: Kaliningrad
Gender: Male
Posts: 15,485
In English =):
No 6800 Ultra Extreme, Nvidia says
By Fuad Abazovic:
NVIDIA SAID THAT all GeForce 6800 Ultra cards will remain clocked at 400MHz, the same speed they were introduced at in San Francisco and Geneva last month.
Sources close to Nvidia confirmed that the 6800 Ultra cards will stay at 400MHz, since it can actually ship them in volume at this speed, while its 450MHz+ chips will be reserved for special partners.

Special cards will be introduced by, yes, Gainward, a company that will soon be ready with its Cool FX water-cooled card clocked at over 450MHz. We suspect that it may reach over 500MHz.

Gainward is still testing its final clock speeds, but we are sure that this will be one of the fastest cards produced to date.

Another company which will do this is the ambitious XFX, which will also introduce a limited number of cards clocked to at least 450MHz on the core and close to 1200MHz on the memory. We have no idea about availability.

The third partner, US-only BFG, has the green light to introduce these 450MHz cards as well.

We don't have a solid date for when any of these cards will be available, but Gainward should be ready to sample early next week, with production just a week or two beyond that.

In this product cycle we expect the water-cooled Radeon X800 XT Platinum to put up a nice little fight. µ
_http://www.theinquirer.net/?article=15864

GeForce 6 Series Question & Answer with NVIDIA
By Mike Chambers - May 13, 2004
INTRODUCTION

nV News was given the opportunity to conduct a question and answer session with NVIDIA's Tony Tomasi, Senior Director of Desktop Product Management, and Ujesh Desai, General Manager of Desktop Graphics Processing Units. The questions naturally relate to the GeForce 6 Series, which NVIDIA announced on April 14.
I would like to thank the nV News staff and the selected visitors who helped compile our list of questions for NVIDIA. A total of 30 questions were submitted, and I had the unfortunate duty of trimming the list down to the 10 requested by NVIDIA. Since most of the submitted questions were technically oriented, I chose to keep the Q&A along those lines.

nV News:

What are the primary factors that would lead to a difference in performance between FP16 and FP32 calculations with the GeForce 6800?

Tony Tomasi:

FP16 uses less storage than FP32, and GeForce 6800 supports FP16 texture filtering and frame buffer blending in hardware. In particular, support for FP16 texture filtering and frame buffer blending can be a very significant performance win for high dynamic range applications. There are also some operations that can be performed faster, or in some cases "for free" using FP16 in the shading hardware. For example, partial precision normalize (FP16 normalize) is essentially "free" on GeForce 6800 hardware.
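To make the storage point concrete, here is a minimal numpy sketch (modern tooling, purely illustrative; NV40-era shaders obviously weren't written in Python): an FP16 texel takes half the bytes of an FP32 texel, and an FP16 normalize trades a little accuracy for speed.

import numpy as np

# An RGBA texel in FP32 vs FP16: half the storage, hence half the
# bandwidth and cache footprint per pixel.
rgba32 = np.zeros(4, dtype=np.float32)
rgba16 = np.zeros(4, dtype=np.float16)
print(rgba32.nbytes, rgba16.nbytes)           # 16 vs 8 bytes

# Normalizing in FP16 costs a little accuracy; on GeForce 6800 a
# partial-precision (FP16) normalize is essentially free.
v = np.array([0.3, 0.59, 0.11], dtype=np.float16)
n = v / np.sqrt((v * v).sum())
print(np.linalg.norm(n.astype(np.float32)))   # ~1.0, small FP16 rounding error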

nV News:

X-bit labs has shown benchmark results (chart at bottom of page) of the GeForce 6800 Ultra generating more than its theoretical limit of 32 z-pixels per clock cycle when color writes are disabled. Are these results correct?

Tony Tomasi:

Depending on the impact of occlusion culling, one could measure rates beyond 32 pixels per clock of effective fill. But the Z-ROP hardware in GeForce 6800 is capable of 32 pixels per clock of z/stencil-only rendering. Rates beyond that would be due to other factors. High z/stencil rendering rates are a nice win for applications that do 2-pass shadow algorithms, like Doom3, which is why GeForce 6800 is capable of rendering z/stencil-only at such high rates.
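As a back-of-envelope check, the theoretical z-only rate at the 400MHz clock quoted earlier in this thread works out as follows (a quick Python sketch):

# Theoretical z/stencil-only fill for NV40: 32 z-pixels per clock
# at the 400MHz announced for the 6800 Ultra.
clock_hz = 400e6
z_pixels_per_clock = 32
print(clock_hz * z_pixels_per_clock / 1e9, "Gpixels/s")   # 12.8

# Anything measured above this must come from occlusion culling
# rejecting pixels before they ever reach the Z-ROP hardware.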

nV News:

We have seen partial precision benchmark results (compare "PS 2.0 - Simple" to "PS 2.0 PP - Simple" in the first table) where the GeForce 6800 drops in performance relative to full precision. Is it possible that the GeForce 6800 could lose performance in some applications because their Pixel Shader 2.0 shaders were optimized for the GeForce FX?

Tony Tomasi:

We have filed a bug against the results generated by Marko Dolenc's fill-rate tester and are looking into it. Something is definitely up there. We're not aware of any reason that partial precision floating point should be slower than full precision (FP32) floating point.

There are architectural differences between the GeForce FX architecture and the GeForce 6800 architecture that can lead the GeForce FX to have higher performance per quad of shading horsepower in a limited number of partial precision cases, but since the GeForce 6800 has 4x the number of quads for shading computation, it should deliver better absolute performance.

While it's possible to build a pathologically bad register combiner program that could potentially run faster on the GeForce FX than on the GeForce 6800, in practice no application we've seen does or would do that.
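The "4x the quads" arithmetic, sketched in Python with made-up per-quad rates (only the quad counts come from the answer above; the ratios are hypothetical):

# GeForce FX shades one quad (4 pixels) per clock; GeForce 6800 has
# four quads. Even if NV3x did more partial-precision work per quad,
# 4x the quads wins on absolute throughput. Per-quad rates are made up.
nv3x_quads, nv40_quads = 1, 4
nv3x_pp_ops, nv40_pp_ops = 1.5, 1.0    # hypothetical PP ops/quad/clock
print(nv3x_quads * nv3x_pp_ops, nv40_quads * nv40_pp_ops)   # 1.5 vs 4.0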

nV News:

Reviewers have noticed image quality issues with Far Cry and are perplexed as to why the game performs better on the newly announced Radeon X800 graphics chipsets than on the GeForce 6800. Comments? Is a future patch in the works that will support the new features of Shader Model 3.0?

Ujesh Desai:

We are aware of all the issues with Far Cry and we are working with Crytek to solve them as soon as possible. Some of the issues are driver bugs that we are working on, and some of the issues are application related and Crytek is working on a patch for this. We are also working with Crytek to get a Shader Model 3.0 patch added to the game.

nV News:

Performance of the GeForce FX sometimes dropped when a Pixel Shader 1.x shader was executed compared to executing a similar shader under Pixel Shader 2.0. Will the GeForce 6800 have a similar performance drop when executing a shader using Pixel Shader 3.0?

Ujesh Desai:

Actually, it will be the opposite. I think there is a bit of confusion about Shader Model 3.0 and Shader Model 2.0. While Shader Model 3.0 will enable some "new" effects, it is better characterized by ease of programming, more efficient use of the hardware, and higher scene complexity and/or frame rates. Shader Model 3.0 makes developers' lives easier thanks to its support for advanced programming features such as loops and branches.

This is a fundamental requirement and will improve the efficiency with which programmers can write their code. Without the loops and branches enabled by Shader Model 3.0, developers are forced to break longer shader programs into smaller segments that can run on Shader Model 2.0 hardware. This absorbs clock cycles and hampers performance in games that use the latest version of DirectX and have more sophisticated Shader Model 3.0 pixel shaders.

It is important to note that in some cases, developers can create the same effect with Shader Model 2.0 and Shader Model 3.0, however it may take longer to program using Shader Model 2.0 and may require more passes through the hardware to render.

Shader Model 3.0 does introduce some new functionality - particularly dynamic branching in the pixel shader, which must be used carefully for good performance. But in general, Shader Model 3.0 should actually make development easier, and can offer some nice performance benefits for complex shaders that can be executed in Pixel Shader 2.0 but more efficiently in Shader Model 3.0. (So here's confirmation that they aren't leaning on SM3 quite as hard as it seemed)...
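A rough numpy emulation of the multi-pass point above (synthetic numbers; real shaders run on the GPU, not in numpy): on Shader Model 2.0 hardware every pixel pays for the whole shader or an extra pass, while a Shader Model 3.0 branch lets unlit pixels skip the expensive term.

import numpy as np

rng = np.random.default_rng(0)
n_dot_l = rng.uniform(-1.0, 1.0, size=100_000)   # per-pixel N.L
lit = n_dot_l > 0.0

full_cost = n_dot_l.size                         # SM2.0: shade every pixel
branch_cost = lit.sum() + 0.1 * n_dot_l.size     # SM3.0: lit pixels + branch overhead
print(f"work saved by branching: {1 - branch_cost / full_cost:.0%}")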

nV News:

Do you believe that mixed precision is still relevant, given that using 64-bit floating point precision instead of 128-bit still benefits the GeForce 6 series?

Ujesh Desai:

Yes. In general, as with any processor, you should always use the fewest bits of precision that perform the function with the degree of accuracy you are after. On a CPU, people don't always declare doubles, for good reason - there are performance trade-offs. I expect much the same behavior with GPUs. While 128-bit pixels are certainly higher precision than 96-bit or 64-bit pixels, they have greater storage requirements as well.

Additionally, most high dynamic range applications being developed work quite well with 16 bits of floating point per component, and that, in combination with texture filtering and frame buffer blending of 64-bit floating point data, makes mixed precision a large benefit. In fact, 64-bit floating point (four 16-bit "half" components) is enough precision for many high quality rendering and image processing systems; OpenEXR has some great examples of this.
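The dynamic-range point is easy to verify: the 16-bit "half" format used per component (the same format OpenEXR popularized) covers values up to 65504, far beyond displayable intensities. A quick numpy check:

import numpy as np

# FP16 ("half") range: small storage, wide dynamic range - the reason
# 64-bit (4 x FP16) pixels suffice for most HDR pipelines.
print(np.finfo(np.float16).max)    # 65504.0
print(np.finfo(np.float16).tiny)   # ~6.1e-05, smallest normal value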

nV News:

When using static branching, are those branches "free"? That is, as long as the constant value that changes which branch is executed is not changed, can that shader run at exactly the same speed as an "unrolled" shader with no branches? If not, about how many clock cycles does a static branch cost?

Tony Tomasi:

On the pixel shader side, shaders are recompiled based on constant state, so the hardware should see an unrolled shader independent of the input. This is not true in the vertex shader, but static branches are pretty cheap, all things considered (~2 clocks / branch in vertex shader).
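"Recompiled based on constant state" means the driver folds the constant in and hands the hardware a straight-line program. A hypothetical Python stand-in for that specialization step:

# The driver specializes the shader per constant value, so the untaken
# branch disappears from the compiled program entirely. Names are
# hypothetical; real drivers emit GPU microcode, not Python.
def compile_pixel_shader(use_specular: bool):
    if use_specular:
        return lambda diff, spec: diff + spec   # branch folded in
    return lambda diff, spec: diff              # other side removed

shader = compile_pixel_shader(use_specular=False)
print(shader(0.6, 0.3))   # 0.6 - no per-pixel branch ever executes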

nV News:

When using static branching, does changing the constant value that changes which branch is executed result in a state change? That is, can "uber shaders" be used to avoid state changes, and thus increase performance?

Tony Tomasi:

Yes, this is a state change (and shader recompile in pixel shader). Uber shaders can be used to avoid state changes, but uber shaders use extremely coherent dynamic branching (branch condition supplied as a vertex attribute), rather than static branching.
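A sketch of the uber-shader idea under the same caveats: the branch selector arrives as ordinary per-draw or per-vertex data, so switching materials changes inputs, not pipeline state, and triggers no recompile.

# One shader covers several materials; the selector is just data
# (e.g. a vertex attribute), so no state change between draws.
def uber_shader(material_id: int, diff: float, spec: float) -> float:
    if material_id == 0:        # coherent dynamic branch: a whole draw
        return diff             # call takes the same path together
    return diff + spec

for material in (0, 1):         # two "draws", zero state changes
    print(uber_shader(material, 0.6, 0.3))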

nV News:

About how many cycles does a dynamic branch cost at a minimum? Under what situations would a developer want to use dynamic branching and why?

Tony Tomasi:

There is a 2 cycle latency per branch instruction in the pixel shader, so IF/ELSE/ENDIF adds 6 cycles to a program (IF/ENDIF adds 4). If the branches are coherent (such as uber shaders, or potentially skipping calculations if N.L <= 0), and the number of instructions that can be skipped is greater than the latency, a developer should try dynamic branching. The vertex shader has a 2 cycle latency, too, but branch coherence isn't important, since the vertex shaders are MIMD.
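Plugging the quoted latencies into a break-even estimate (the instruction count and skip fraction below are assumptions for illustration):

# IF/ENDIF adds 4 cycles, IF/ELSE/ENDIF adds 6 (figures quoted above).
# Branching pays off when the skipped work exceeds the branch latency.
skippable_instructions = 20    # assumed size of the guarded block
if_endif_overhead = 4
skip_fraction = 0.5            # assumed fraction of pixels with N.L <= 0
net_saving = skippable_instructions * skip_fraction - if_endif_overhead
print(f"net cycles saved per pixel: {net_saving}")   # 6.0 -> worth branching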

nV News:

What programming feature of the GeForce 6 series do you find the most exciting?

Ujesh Desai:

I am really excited to see NVIDIA continuing to push technology forward and to lead the industry by pioneering Shader Model 3.0. The bottom line is that developers want Shader Model 3.0. It is really disappointing to see some GPU manufacturers trying to hold the industry back by downplaying the significance of Shader Model 3.0.

Next time someone downplays Shader Model 3.0, ask them if a Shader Model 3.0 part is on their roadmap.
(No surprise that a few jabs at ATI made it in :gigi:) Then ask Microsoft what they think about their next version of DirectX 9. If given the choice between supporting it or not, any GPU manufacturer would want Shader Model 3.0 support.

Tony Tomasi:

I would say both Shader Model 3.0 AND floating point texture filtering and frame buffer blending. FP16 texture filtering and frame buffer blending make developers' lives much better for high dynamic range applications, and they have a significant positive impact on performance and quality for those classes of applications, since texture filtering can be done at higher precision without using shader passes (and/or without performing the filtering at lower precision).

In the same way, frame buffer blenders can be used on this same FP16 data, again avoiding extra passes in the pixel shader. By making 64-bit floating point fully orthogonal (mipmapping, all texture filtering modes, etc.) developers no longer have to special case high dynamic range functionality.
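To see what hardware FP16 filtering saves, here is a tiny numpy emulation of a bilinear fetch (illustrative only): without filtering support, the shader must fetch four texels and blend them itself, costing extra instructions and possibly extra passes.

import numpy as np

# A 2x2 FP16 texture; sampling at its exact center blends all four
# texels with equal weight - work the texture unit does "for free"
# when FP16 filtering is supported in hardware.
tex = np.array([[1.0, 3.0],
                [5.0, 7.0]], dtype=np.float16)
weights = np.full((2, 2), 0.25, dtype=np.float16)
sample = (tex * weights).sum()
print(sample)   # 4.0 - otherwise four fetches + three lerps in the shader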

nV News:

Is there any information on the low and mid-range GeForce 6 product lineup that you can share?

Ujesh Desai:

Sorry, but we cannot discuss unannounced products. Stay tuned. We're quite pleased with the per-clock performance of the NV4x architecture, and look forward to delivering its benefits in a top-to-bottom family of GeForce 6 GPUs.

We would like to thank Brian Burke, NVIDIA's Desktop Public Relations Manager, for suggesting a Q&A session with Tony and Ujesh. And of course thanks go out to Tony and Ujesh for taking time to provide us with their responses!

_http://www.nvnews.net/articles/geforce_6_interview/
__________________
"Самый аккуратный водитель тот, кто забыл свои права дома"
Дружно переходим по ссылке
Строим город для имхо!!!