Posts

Showing posts from April, 2018

Why you should consider using memory latency metric instead of memory bandwidth

On multicore platforms with shared memory, the performance of the memory bus is critical to the overall performance of the system. Heavy contention on the memory bus leads to unpredictable task completion times in real-time systems and throughput fluctuations in server systems. Currently, people monitor memory bus contention by looking at the whole system's memory bandwidth (e.g., bytes/sec). When the observed bandwidth usage exceeds a certain threshold (which is hardware dependent), the system is considered to be under memory bus contention. While the memory bandwidth metric has been widely used in both academia and industry, I would argue that it is not accurate enough to identify contention. Of course, if you use something like Valgrind to profile your application, you get accurate information on almost everything. But here we are talking about online performance monitoring, where you don't want to dramatically slow down your application in a production environment. So most likely,

The foreground/background scheduling model for real-time and cloud systems

In real-time systems, the most basic task model assumes that every real-time task (process) is either periodic or can be modeled as a periodic one (e.g., sporadic servers). A periodic task is generally assigned three properties: period (T), worst-case execution time (WCET, or C), and deadline (D). In every period T, the task needs to complete some job before its deadline D. In the worst possible case, with all sorts of execution interference (e.g., interrupts, synchronization, cache evictions, memory bus contention), it takes C amount of time to finish the job. In no period does the job's execution time exceed C. In normal execution, though, the job time is much smaller than C. We require C <= D, because if the relative deadline D were smaller than C, the task would always miss its deadline in the worst case. Notice that D can be greater than T, although most scheduling algorithms and analysis methods assume D <= T or just D = T. You can definitely write a web server that handles one request in every period and has D > T
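To make the (T, C, D) task model concrete, here is a minimal Python sketch. The class name `PeriodicTask`, its field names, and the `deadline_kind` helper are hypothetical, chosen only for illustration; the labels "implicit", "constrained", and "arbitrary" are the terms commonly used in the real-time literature for D = T, D < T, and D > T, and do not come from the post itself. The sketch only checks the C <= D constraint described above and is not a schedulability test.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PeriodicTask:
    """Hypothetical container for the three properties of a periodic task."""
    period: float    # T: a new job is released every T time units
    wcet: float      # C: worst-case execution time of one job
    deadline: float  # D: relative deadline, measured from the job's release

    def __post_init__(self):
        # All three parameters are durations, so they must be positive.
        if min(self.period, self.wcet, self.deadline) <= 0:
            raise ValueError("T, C and D must all be positive")
        # If C > D, the job cannot finish in time in the worst case.
        if self.wcet > self.deadline:
            raise ValueError("C must not exceed D")

    @property
    def deadline_kind(self):
        # Classify the task by how its deadline relates to its period.
        if self.deadline == self.period:
            return "implicit"      # D = T
        if self.deadline < self.period:
            return "constrained"   # D < T
        return "arbitrary"         # D > T


# A task like the web server mentioned above: one request per period, D > T.
web_server = PeriodicTask(period=10.0, wcet=2.0, deadline=15.0)
print(web_server.deadline_kind)
```

Rejecting C > D at construction time keeps an infeasible task from ever reaching a scheduler or an analysis routine, which is one reasonable design choice among several.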