Why you should consider using a memory latency metric instead of memory bandwidth
On multicore platforms with shared memory, the performance of the memory bus is critical to the overall performance of the system. Heavy contention on the memory bus leads to unpredictable task completion times in real-time systems and throughput fluctuations in server systems. Currently, people monitor memory bus contention by looking at the whole system's memory bandwidth (e.g., bytes/sec). When the observed bandwidth usage exceeds a certain threshold (which is hardware dependent), the system is considered to be under memory bus contention. While the memory bandwidth metric has been widely used, both in academia and industry, I would argue that it is not accurate enough to identify contention. Of course, if you use something like Valgrind to profile your application, you certainly get accurate information on almost everything. But here we are talking about online performance monitoring, where you don't want to dramatically slow down your application in a production environment. So most likely,...