I don't care who is an insider and who isn't. I will go by what we can see right on VGleaks, and as you will see, the GPU vs. GPU war is over.
Let's start here: http://www.vgleaks.com/durango-gpu-2/2/
Now scroll to Compute. Here you see, vgleaks: "Each of the 12 Durango SCs has its own L1 cache, LSM (Local Shared Memory), and scheduler, AND FOUR SIMD UNITS."
Now open a new tab and go to: http://www.vgleaks.com/world-exclusi...is-unveiled-2/
Scroll down past the GPU and Update crap and stop at, vgleaks: "Each CU contains dedicated: - ALU (32 64-bit operations per cycle)"
Now here is what all this tells us. Durango can, vgleaks: "A SIMD executes a vector instruction on 64 threads at once in lockstep". There are 4 of them in one SC, but let's use just one.
PS4 can, vgleaks: "- ALU 32 64-bit operations per cycle"
1. One SIMD: 64-bit ops on 64 threads at once (per cycle) vs. one CU: 64-bit ops on 32 threads at once.
Threads per cycle [winner: Durango with 64 threads vs. PS4 with 32 threads]
2. 12 SCs at 64 threads each = 768 vs. 14 CUs at 32 threads each = 448. If you add the other 4 CUs it's 18 x 32 = 576.
Total threads [winner: Durango with 768 threads vs. PS4 with 576]
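The tally above is easy to recheck. Here is a minimal sketch that just repeats the post's own counting (64 threads per Durango SC, 32 per PS4 CU, as read from the VGleaks quotes). The function name `total_threads` is made up for illustration, and the per-unit figures are the post's reading, not independently confirmed numbers.

```python
# Per-unit figures as used in the post (its reading of the VGleaks quotes).
DURANGO_THREADS_PER_SC = 64
PS4_THREADS_PER_CU = 32


def total_threads(units: int, threads_per_unit: int) -> int:
    """Threads per cycle under the post's counting: units x threads per unit."""
    return units * threads_per_unit


durango = total_threads(12, DURANGO_THREADS_PER_SC)  # 12 SCs
ps4_render = total_threads(14, PS4_THREADS_PER_CU)   # 14 CUs
ps4_all = total_threads(18, PS4_THREADS_PER_CU)      # 14 + 4 extra CUs

print(durango, ps4_render, ps4_all)
```

With all 18 CUs counted, the PS4 side of this tally comes to 18 x 32.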
Here is why Sony wants to "code to the metal", as they say. Durango, under SIMD on vgleaks: "Shaders no longer waste processing power-"
Sony knows that the old-type CUs in the PS4 and in PCs waste up to 40-50% of their processing power. Look it up!
So by coding to the metal they can recover maybe 30% of that. This is also why they use 1.8 TF or 14 CUs vs. 12 SCs (12 CUs) at 1.2 TF.
When you buy a car with 300 bhp and 150 ft-lb of torque vs. one with 280 hp and 280 ft-lb, the one with less hp but more torque will win every time.
Sony added 4 CUs that are, vgleaks: "Minor boost if used FOR rendering", but I put them in for the PS4 boys.
Even if there was GDDR8 in the PS4, it still would not help it pass Durango's computation at this point.
Now that we are done with that: with the last-gen systems I called out the PS3's GPU in forums as not being more powerful than the 360's. I WAS RIGHT!
It's that time again... The Durango as it is on VGleaks is in FACT A BEAST. And that is without knowing what the added power of the 4 Move Engines on the GPU chip can do, and I did not even go into the fucking scheduler, vgleaks: "they perform an operation per vector rather than an operation per thread." Wooowwwww.
The scheduler will not need a fucking GPU cycle to pull any of the data from the 4 SIMDs. Holy shit!!! Well, Sony/PS4, pray MS doesn't use 2x GPU. It's bad for you!
The DDR3, if it stays, is OK! That's where the ESRAM comes in. A boxcar on a train can only move as fast as the head engine pulls it.
One last thing: there are 4 SIMDs, so the scheduler can, vgleaks: "A SIMD can have up to 10 VECTORS (PS4 is 1) in flight at any time. The scheduler selects one or more of these 10 candidate vectors to execute an instruction", without the GPU's cycle count. Damn, that's badass!!!!
The Lava demo that ran on the Nvidia GPUs? It can do that, I think with the new devkit.
The Infiltrator demo, it can do that, and yes, it can do that BF4 demo at 1080p. You know what I like most about this? I shat on VGleaks, but it was always right in our FACE. vgleaks: "The company has built a powerful system, more than the WIIU". They know, they know!!!!
Also, this is where the 64 threads DP vs. 32 threads SP of the SC and the CU comes in, with the SC at 100% efficiency! vgleaks: "FLOPs 768 ops/clock * (1 mul + 1 add) * 800 MHz = 1.2 TFLOPS" should really be x2. That figure is the old way of counting CUs, and it does not count the work of an SC as 2x the work of a CU.
Misterxmedia: How big do you think the possibility is that MS will go 2x GPU?
Marc Berry: I can only use the data on vgleaks, so I say no for now. Remember, this is without the help of the Move Engines, which can pass data and work on, vgleaks: "From a sub-box of a texture. To a sub-box of a 3D texture." There are more, and they can do it with no penalty to the GPU. This is where they will get DX11/12 ray tracing from. This is not a part of the 1.2 TF x2.
I knew that something was wrong. How about this, vgleaks: "This example assumes what's expected to be a typical CPU load and a maximum GPU load:"
"Three display planes (PS4 has 2) are enabled at 1080p resolution."
"Display write-back is writing a 1080p image at 60FPS."
That's fucking 1080p x3, and one will run at 1080p 60 FPS. BF4, here I come!! Do you know the workload of 3 planes? WOW! Scroll down to the blue part: did anyone see that the GPU can read/write from DRAM and ESRAM at the same time? (PS4 just has one main memory!) http://www.vgleaks.com/durango-memory-system-example/
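To put a rough number on that display load, here is a back-of-the-envelope sketch. The 4 bytes per pixel and the idea that all three planes are read at 60 Hz are assumptions made for illustration. The vgleaks example quoted above only gives the 1080p resolution, the three planes, and the 60 FPS write-back.

```python
WIDTH, HEIGHT = 1920, 1080   # 1080p, from the quoted example
FPS = 60                     # write-back rate, from the quoted example
PLANES = 3                   # display planes, from the quoted example
BYTES_PER_PIXEL = 4          # ASSUMPTION: 32-bit pixels, not stated in the quote

pixels_per_frame = WIDTH * HEIGHT

# Traffic for writing one 1080p image back 60 times per second.
writeback_bytes_per_s = pixels_per_frame * BYTES_PER_PIXEL * FPS

# ASSUMPTION: each of the three planes is also read once per refresh at 60 Hz.
planes_bytes_per_s = PLANES * pixels_per_frame * BYTES_PER_PIXEL * FPS

print(f"write-back: {writeback_bytes_per_s / 1e9:.3f} GB/s")
print(f"3 planes:   {planes_bytes_per_s / 1e9:.3f} GB/s")
```

Changing `BYTES_PER_PIXEL` or `FPS` shows how quickly the totals move if those assumptions are off.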
josefajardo: Nice analysis...
...and it's good you estimated using 1 SIMD execution for Durango, even though an SC has 4 SIMDs, because only 1 SIMD is ever processed per cycle. That is, it takes 4 cycles to process the 4 SIMDs in an SC.
Mistercteam: To make it simple:
12 SC = 48 CU = 768 threads
7970 = 32 CU
PS4 = 18 CU = 1/3 of Xbox Next
PS4 = 18 * 16 * 4 * 2 flops (MADD) * 0.8 GHz = 1.843 TF
That's why they use a 7970M in the devkit.
Xbox Next:
12 SC = 48 CU = 3x PS4 per GPU
XB 720 = 48 * 16 * 4 * 2 flops (MADD) * 0.8 GHz = xxxx TF
That's why they use a 7970 as the devkit (a multiple one).
If Xbox Next uses float then it will be more efficient.
The ALU core will be bigger but more efficient.
Float = less wide to operate (more efficient)
+ eSRAM + Move engine + Workload manager (L3 tile base )
A more detailed analysis from Mistercteam is here: http://misterxmedia.livejournal.com/...25769#t2125769
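The formula in Mistercteam's comment (units * 16 * 4 * 2 flops * clock) can be written out as a small sketch. The function name `peak_tflops` and its parameter names are made up for illustration, and reading the 16 and the 4 as lanes per SIMD and SIMDs per unit is an assumption. The same arithmetic also reproduces the vgleaks line quoted earlier, since 12 * 4 * 16 = 768 ops/clock.

```python
def peak_tflops(units: int, clock_ghz: float, simds_per_unit: int = 4,
                lanes_per_simd: int = 16, flops_per_lane: int = 2) -> float:
    """Peak TFLOPS as units * SIMDs * lanes * flops per lane (MADD = 2) * GHz.

    ASSUMPTION: the 16 and 4 in the comment's formula are lanes per SIMD
    and SIMDs per unit; the comment itself does not label them.
    """
    ops_per_clock = units * simds_per_unit * lanes_per_simd
    gflops = ops_per_clock * flops_per_lane * clock_ghz
    return gflops / 1000.0


# The PS4 line from the comment: 18 units at 0.8 GHz.
print(round(peak_tflops(18, 0.8), 3))
# The vgleaks Durango line: 12 units, i.e. 768 ops/clock, at 800 MHz.
print(round(peak_tflops(12, 0.8), 3))
```

Any other unit count can be plugged into the same function to see what the formula gives.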
Misterxmedia: So here is what is new for me. We can all relax, because even with 1 SoC the Xbox Next will be around 3.2 TFLOPS SP unoptimized. But I hope MS will go with 2x SoC for about 6-7 TFLOPS SP unoptimized. I think they will take that option because BF4 or Unreal Engine 4 needs about 3-4 TFLOPS, plus Kinect, IllumiRoom, advanced AI and so on: we need about 5-6 TFLOPS SP.
Sony should also surprise us with an upgrade to 3.2 TFLOPS SP.