Towards a Better Cache Utilization by Selective Data Storage for CMP Last Level Caches
, H.K. Kapoor
Published in IEEE Computer Society
2016
Volume: 2016-March
Pages: 92-97
Abstract
Tiled chip multiprocessors (TCMPs) have become an essential scalable multicore architecture for the next generation. The cores in a TCMP commonly share a large Last Level Cache (LLC). Non-uniform cache architecture (NUCA) divides the LLC into multiple banks so that each bank can be accessed independently. Static NUCA uses a fixed address mapping policy, whereas dynamic NUCA (DNUCA) allows blocks to relocate nearer to the processing cores at runtime. A DNUCA-based TCMP can distribute the load uniformly across the banks for better global utilization, but such designs cannot improve the local utilization of each bank: within a bank, memory accesses are not uniformly distributed among the sets. The flexibility to store data items in the unused portion of a bank can therefore improve its utilization. In this paper we propose a DNUCA-based design, called STD-NUCA, that improves the local utilization of each bank. We observe that, on average, 24% of the blocks in L2 are useless because they are exclusively owned by some L1. These blocks, called stale blocks, cannot be used without contacting the owner. STD-NUCA removes the data portion of these stale blocks from L2 and stores only their tags. STD-NUCA can be used either to improve performance or to reduce hardware overhead: increasing the number of tag entries improves performance, while keeping the tag entries unchanged and shrinking the data array reduces hardware overhead. Reducing the data array size yields a 9% improvement in energy consumption and a 10.4% improvement in energy-delay product over an existing design, TLD-NUCA. With a more highly associative tag array, performance improves by 5%. © 2016 IEEE.
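The tag-only storage idea in the abstract can be illustrated with a toy model. The sketch below is not the STD-NUCA implementation: the class names, the set sizes (8 tag entries, 6 data slots), and the policy for freeing and reattaching data slots are all assumptions made for illustration. It only shows how a set with more tag entries than data slots could track extra blocks once the data of stale blocks is dropped.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class TagEntry:
    """One L2 tag entry (hypothetical layout)."""
    owner_l1: Optional[int] = None   # id of the L1 that owns the block exclusively
    data_slot: Optional[int] = None  # index into the data array; None = tag only


@dataclass
class ToyL2Set:
    """Toy L2 set with more tag entries than data slots (sizes are assumed)."""
    num_tags: int = 8
    num_data: int = 6
    tags: Dict[int, TagEntry] = field(default_factory=dict, init=False)
    free_slots: List[int] = field(default_factory=list, init=False)

    def __post_init__(self) -> None:
        # Initially every data slot is free.
        self.free_slots = list(range(self.num_data))

    def insert(self, tag: int) -> bool:
        """Insert a block with data; fail if no tag entry or data slot is free."""
        if tag in self.tags or len(self.tags) >= self.num_tags:
            return False
        if not self.free_slots:
            return False
        self.tags[tag] = TagEntry(data_slot=self.free_slots.pop())
        return True

    def grant_exclusive(self, tag: int, l1_id: int) -> None:
        """An L1 takes exclusive ownership, so the L2 copy becomes stale.

        Assumed policy: keep the tag, release the data slot for other blocks.
        """
        entry = self.tags[tag]
        entry.owner_l1 = l1_id
        if entry.data_slot is not None:
            self.free_slots.append(entry.data_slot)
            entry.data_slot = None

    def stale_fraction(self) -> float:
        """Fraction of tracked blocks that currently hold a tag but no data."""
        if not self.tags:
            return 0.0
        stale = sum(e.data_slot is None for e in self.tags.values())
        return stale / len(self.tags)


if __name__ == "__main__":
    s = ToyL2Set()
    for t in range(6):
        s.insert(t)                   # fills all 6 data slots
    print(s.insert(6))                # no data slot left
    s.grant_exclusive(0, l1_id=1)     # blocks 0 and 1 become stale
    s.grant_exclusive(1, l1_id=2)
    print(s.insert(6), s.insert(7))   # freed slots are reused
    print(s.stale_fraction())
```

In this toy model, the two data slots released by stale blocks let the set track eight blocks with only six data slots. Write-backs from the owning L1 and eviction are deliberately left out, since the abstract does not describe how STD-NUCA handles them.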
About the journal
Journal: Proceedings of the IEEE International Conference on VLSI Design
Publisher: IEEE Computer Society
ISSN: 1063-9667