Description

CUDA parallel processing cores: 2304

NVIDIA Tensor Cores: 288

NVIDIA RT Cores: 36

Memory: 8 GB GDDR6

RTX-OPS: 43T

Rays cast: 6 Giga Rays/Sec

Maximum single precision (FP32) performance: 7.1 TFLOPS

Maximum half precision (FP16) performance: 14.2 TFLOPS

Maximum integer (INT8) performance: 28.5 TOPS

Deep Learning (Tensor) performance: 57.0 TFLOPS

Memory Interface: 256-bit

Memory Bandwidth: Up to 416 GB/s

Maximum power consumption: 160 W

Bus: PCI Express 3.0 x16

Display Connector: DP 1.4 (3) + VirtualLink (1)

Form Factor: 4.4”H x 9.5”L

Weight: 479 g

Cooling scheme: Active

NVIDIA® 3D Vision® and 3D Vision Pro: supported via 3-pin mini-DIN

Frame Lock Compatible (with Quadro Sync II)

NVLink Interconnect Technology: None

Auxiliary power connector: 8-pin PCIe



Performance characteristics


Turing GPU Architecture

The Quadro RTX 4000 GPU is built on 12nm FFN (FinFET NVIDIA), a high-performance manufacturing process customized for NVIDIA. With 2304 CUDA cores, it targets HPC, AI, VR, and graphics workloads on the professional desktop. NVIDIA describes the Turing GPU architecture as the biggest leap forward in real-time computer graphics since it introduced programmable shaders in 2001. The GPU integrates 13.6 billion transistors on a 545 mm² die and delivers up to 7.1 TFLOPS of single-precision (FP32), 14.2 TFLOPS of half-precision (FP16), 28.5 TOPS of integer (INT8), and 57.0 TFLOPS of Tensor performance, making it well suited to compute-intensive workloads.
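As a rough sanity check on these figures, peak FP32 throughput is commonly estimated as cores x 2 operations per fused multiply-add x clock. The sketch below inverts that relation to find the boost clock implied by the quoted 7.1 TFLOPS. The 2-operations-per-FMA convention is an assumption not stated in this listing, and the resulting clock is a derived estimate, not an official specification.

```python
# Figures quoted in the listing.
CUDA_CORES = 2304
FP32_TFLOPS = 7.1
FP16_TFLOPS = 14.2

# Assumption: one fused multiply-add per core per clock, counted as 2 operations.
OPS_PER_FMA = 2


def implied_clock_ghz(tflops, cores, ops_per_clock=OPS_PER_FMA):
    """Clock (GHz) needed for `cores` units to reach `tflops` at peak."""
    return tflops * 1e12 / (cores * ops_per_clock) / 1e9


clock = implied_clock_ghz(FP32_TFLOPS, CUDA_CORES)
print(f"Implied boost clock: {clock:.2f} GHz")
print(f"FP16 / FP32 throughput ratio: {FP16_TFLOPS / FP32_TFLOPS:.1f}x")
```

The FP16 figure is exactly twice the FP32 figure, which matches the listing's claim that 16-bit operations run at double throughput.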


RT Cores

New hardware ray tracing technology lets the GPU render film-quality, photorealistic objects and environments in real time for the first time, with physically accurate shadows, reflections, and refractions. The real-time ray tracing engine works with NVIDIA OptiX, Microsoft DXR, and the Vulkan API to deliver a level of realism far beyond what traditional rendering techniques can achieve. The RT Cores accelerate Bounding Volume Hierarchy (BVH) traversal and ray casting, with rays cast on a per-pixel basis.


Enhanced Tensor Cores

The new mixed-precision cores are designed for deep learning matrix operations and deliver up to 8x the training TFLOPS of the previous generation. The Quadro RTX 4000 has 288 Tensor Cores, each of which can perform 64 floating-point fused multiply-add (FMA) operations per clock, for a total of 1024 individual floating-point operations per SM per clock. In addition to FP16/FP32 matrix operations, the new Tensor Cores add INT8 (2048 integer operations per clock) and experimental INT4 and INT1 (binary) precision modes for matrix operations.
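The per-SM figure above follows from the per-core figure by simple arithmetic. The sketch below works it through under two assumptions that are not stated in this listing: that the card has one RT Core per SM (so 36 SMs), and that each FMA counts as two floating-point operations.

```python
# Figures quoted in the listing.
TENSOR_CORES = 288
RT_CORES = 36
FMA_PER_TENSOR_CORE_PER_CLOCK = 64
TENSOR_TFLOPS = 57.0

# Assumption: one RT Core per SM, so the SM count equals the RT Core count.
sm_count = RT_CORES
# Assumption: one FMA = 1 multiply + 1 add = 2 floating-point operations.
FLOPS_PER_FMA = 2

tensor_cores_per_sm = TENSOR_CORES // sm_count
flops_per_sm_per_clock = (
    tensor_cores_per_sm * FMA_PER_TENSOR_CORE_PER_CLOCK * FLOPS_PER_FMA
)

# Clock implied by the quoted peak Tensor throughput (derived, not official).
flops_per_clock_total = TENSOR_CORES * FMA_PER_TENSOR_CORE_PER_CLOCK * FLOPS_PER_FMA
tensor_clock_ghz = TENSOR_TFLOPS * 1e12 / flops_per_clock_total / 1e9

print(f"Tensor Cores per SM: {tensor_cores_per_sm}")
print(f"FP operations per SM per clock: {flops_per_sm_per_clock}")
print(f"Implied clock: {tensor_clock_ghz:.3f} GHz")
```

Under these assumptions, 8 Tensor Cores per SM x 64 FMAs x 2 reproduces the 1024 operations per SM per clock quoted above.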


Advanced Shading Technology

Mesh Shading: A compute-based geometry pipeline that speeds up geometry processing and culling for geometrically complex models and scenes. Mesh shading provides up to a 2x performance improvement on geometry-bound workloads.

Variable Rate Shading (VRS): Varies the shading rate based on scene content, gaze direction, and motion to improve rendering efficiency. Variable rate shading delivers similar image quality while shading 50% fewer pixels.

Texture-Space Shading: Object/texture-space shading improves performance for pixel-shading-heavy workloads such as depth of field and motion blur. For pixel-shading-heavy VR workloads, it reuses pre-shaded texels to improve throughput and increase fidelity.


High-performance GDDR6 memory

Built on Turing's highly optimized 8 GB GDDR6 memory subsystem, with fast graphics memory delivering up to 416 GB/s of peak bandwidth, the Quadro RTX 4000 is well suited to latency-sensitive applications that process large datasets. The Quadro RTX 4000 offers 70% more memory bandwidth than the previous generation.
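Peak memory bandwidth is conventionally the bus width in bytes multiplied by the per-pin data rate. The sketch below uses the 256-bit interface and 416 GB/s figures from this listing to back out the implied per-pin rate and the previous-generation bandwidth implied by the 70% claim. Both results are derived estimates; neither value is quoted in the listing itself.

```python
# Figures quoted in the listing.
BUS_WIDTH_BITS = 256
PEAK_BANDWIDTH_GB_S = 416
BANDWIDTH_GAIN = 0.70  # "70% more than the previous generation"


def per_pin_rate_gbps(bandwidth_gb_s, bus_width_bits):
    """Per-pin data rate (Gbit/s) implied by a peak bandwidth (GB/s)."""
    return bandwidth_gb_s * 8 / bus_width_bits


def peak_bandwidth_gb_s(rate_gbps, bus_width_bits):
    """Peak bandwidth (GB/s) for a given per-pin rate and bus width."""
    return rate_gbps * bus_width_bits / 8


rate = per_pin_rate_gbps(PEAK_BANDWIDTH_GB_S, BUS_WIDTH_BITS)
previous_gen = PEAK_BANDWIDTH_GB_S / (1 + BANDWIDTH_GAIN)

print(f"Implied per-pin data rate: {rate:.1f} Gbit/s")
print(f"Implied previous-generation bandwidth: {previous_gen:.0f} GB/s")
```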


Single Instruction, Multiple Threads (SIMT)

The new independent thread scheduling feature shares resources among smaller jobs, enabling finer synchronization and cooperation between parallel threads.


Advanced Streaming Multiprocessor (SM) Architecture

Combines shared memory and L1 cache into a single unit to significantly improve performance while simplifying programming and reducing the tuning needed for optimal application performance. Each SM contains 96 KB of L1/shared memory, which can be partitioned differently for compute and graphics workloads. For compute workloads, up to 64 KB can be allocated to either the L1 cache or shared memory; for graphics workloads, the split is up to 48 KB of shared memory, 32 KB of L1 cache, and 16 KB for the texture units. Combining the L1 cache and shared memory reduces latency and provides higher bandwidth.
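The capacities quoted above can be checked against the 96 KB total with a few lines of arithmetic. This is only an illustrative sketch: the dictionary keys are hypothetical names, and the compute-mode remainder assumes that whatever is not given to one side goes to the other, which the listing does not state explicitly.

```python
# Total configurable L1/shared memory, in KB, as quoted in the listing.
L1_SHARED_TOTAL_KB = 96

# Graphics-mode split quoted in the listing (key names are illustrative).
graphics_split_kb = {"shared_memory": 48, "l1_cache": 32, "texture_units": 16}

# Compute mode: up to 64 KB for either L1 or shared memory.
COMPUTE_MAX_ONE_SIDE_KB = 64
# Assumption: the remainder of the 96 KB goes to the other side.
compute_remainder_kb = L1_SHARED_TOTAL_KB - COMPUTE_MAX_ONE_SIDE_KB

graphics_total_kb = sum(graphics_split_kb.values())

print(f"Graphics split total: {graphics_total_kb} KB")
print(f"Compute split: {COMPUTE_MAX_ONE_SIDE_KB} KB + {compute_remainder_kb} KB")
```

The graphics-mode figures add up to exactly the 96 KB total, so the three quoted capacities account for the whole block.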


Mixed-Precision Arithmetic

Performs 16-bit floating-point operations at double the throughput with reduced storage requirements, enabling training and deployment of larger neural networks. The Turing SM has independent, parallel integer and floating-point data paths, making it more efficient for workloads that mix arithmetic and address calculations.


Graphics preemption

Pixel-level preemption provides more granular control and better support for time-dependent work, such as VR motion tracking.


Compute preemption

Instruction-level preemption provides finer-grained control over compute work, preventing long-running applications from monopolizing system resources or timing out.


H.264 and HEVC encoding/decoding engines

Two dedicated H.264 and HEVC encoding engines and one dedicated decoding engine, all independent of the 3D/compute pipeline, provide faster-than-real-time performance for transcoding, video editing, and other encoding applications.


NVIDIA GPU BOOST 4.0

Automatically maximizes application performance without exceeding the card's power and thermal envelope. Applications can stay at the boost clock for longer under a higher temperature threshold before dropping to the base clock set at a secondary temperature threshold. GPU Boost engages automatically under application load; it is not a standalone program.
