Challenges and opportunities in developing a hash table optimized for persistent memory

Abstract

Most programs have traditionally been divisible into either "in-memory" or "database" applications. When writing in-memory applications, all that matters for speed is the efficiency of the algorithms and data structures in the CPU/memory system during the run of the program, whereas with database applications the main determinant of speed is the number and pattern of accesses to storage (e.g., random accesses vs. sequential ones). The reason that database application programmers don't need to be too concerned with in-memory algorithmic efficiency is that the time it takes for one storage access, typically measured in hundreds of microseconds, corresponds to tens or hundreds of thousands of CPU instruction times, which are measured in picoseconds. Thus, tuning the CPU part of the problem, other than by selecting a reasonable algorithm, is generally a waste of time in such applications. However, when programming with persistent memory, the luxury of concentrating on only one of these sets of constraints is gone. One access to persistent memory takes a few hundred nanoseconds, which is roughly the time it takes to execute 1000 sequential CPU instructions. Thus, with such storage devices, the efficiency of CPU processing is critical to overall program performance in a way that is not true with slower storage devices. This new significance of CPU processing time, along with the fact that persistent memory is accessed through the virtual memory mechanism, implies new restrictions on the design and implementation of such programs if optimal performance is to be obtained. A few of these restrictions are: 1. Random accesses to arrays or vectors that won't fit in the CPU caches must be avoided for optimal performance. 2. The time to calculate a hash code or similar operation, normally considered insignificant with a database program, can cause performance bottlenecks and must be executed in as efficient a manner as possible. 3. System calls that cause a context switch cannot be executed in the critical path because they take too much time (on the order of a microsecond). The design of a heterogeneous hash table tuned for persistent memory provides an instructive example of such restrictions.

Steve Heller
2Misses Corp.
Related Sessions