Overview of NVIDIA GPU architectures. The Fermi architecture is shown to consist of 16 Streaming Multiprocessors (SMs). Each SM consists of 64 KB of on-chip memory, which can configured as 16 KB of shared memory and 48 KB of L1 data cache or vice versa. Also present on each SM is 128 KB of L2 data cache. The GDDR5 memory controllers facilitate data accesses to and from the global memory. The GT200 architecture consists of 30 SMs. Each SM consists of 16 KB of shared memory but no data caches, instead it contains L1/L2 texture memory space. Also present are GDDR3 memory controllers to facilitate global memory accesses.
Daga and Feng BMC Bioinformatics 2012 13(Suppl 5):S4 doi:10.1186/1471-2105-13-S5-S4