Let’s assume that an SM has an execution width of 32 threads and can accommodate 128 threads. Each instruction performs at most 2 register reads and 1 register write. What is the minimum number of register read ports needed to execute each instruction in one cycle, given that the register file has 8 banks and a register read takes 1 cycle?
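
One way to set up the arithmetic is sketched below. It is not a definitive answer: it assumes that all 32 threads issue their reads in the same cycle, that the operands are spread evenly across the 8 banks with no bank conflicts, and that the 128-thread capacity affects register file size rather than port count. The function names are hypothetical and chosen only for illustration.

```python
import math


def total_register_reads(exec_width: int, reads_per_instr: int) -> int:
    """Register reads that must complete in one cycle for one instruction."""
    return exec_width * reads_per_instr


def read_ports_per_bank(exec_width: int, reads_per_instr: int, banks: int) -> int:
    """Read ports each bank needs, ASSUMING reads are spread evenly over banks."""
    reads = total_register_reads(exec_width, reads_per_instr)
    # Round up so that a non-divisible case still gets enough ports.
    return math.ceil(reads / banks)


if __name__ == "__main__":
    # Values taken from the question: 32-wide execution, 2 reads, 8 banks.
    print(total_register_reads(32, 2))    # 32 * 2 = 64 reads per cycle
    print(read_ports_per_bank(32, 2, 8))  # 64 / 8 = 8 ports per bank
```

Under these assumptions, the instruction needs 64 register reads per cycle in total, which works out to 8 read ports on each of the 8 banks. If operands can collide on the same bank, the per-bank figure would have to be higher.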