and the RAM-drive deployment does not change such internal functioning. These
processes "go through the motions" even when no longer needed, adding several distinct
types of performance overhead.
Caching Overhead
Due to the significant performance drain of physical disk access, virtually all disk-based
databases incorporate sophisticated techniques to minimize the need to go to disk.
Foremost among these is database caching, which strives to keep the most frequently
used portions of the database in memory. Caching logic includes cache synchronization,
which makes sure that an image of a database page in cache is consistent with the
physical database page on disk, to prevent the application from reading invalid data.
Another process, cache lookup, determines if data requested by the application is in cache
and, if not, retrieves the page and adds it to the cache for future reference. It also selects
data to be removed from cache, to make room for incoming pages. If the outgoing page
is "dirty" (holds one or more modified records), additional logic is invoked to protect
other applications from seeing the modified data until the transaction is committed.
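The lookup and eviction steps described above can be sketched in C as follows. This is a
minimal illustration, not db.linux's actual cache: the names PageCache, cache_get and
cache_write_byte, the tiny page and cache sizes, and the least-recently-used eviction
policy are all assumptions made for the example, and an in-process array stands in for
the disk. It models write-back of dirty pages on eviction only; it does not attempt the
transaction isolation logic mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE   64   /* hypothetical page size, kept small for illustration */
#define CACHE_SLOTS 2    /* hypothetical cache capacity */
#define DISK_PAGES  8    /* hypothetical size of the simulated "disk" */

/* Simulated disk: a plain array stands in for the database file. */
static unsigned char disk[DISK_PAGES][PAGE_SIZE];

typedef struct {
    int           page_no;   /* -1 means the slot is empty */
    int           dirty;     /* nonzero if modified since it was loaded */
    unsigned long last_used; /* logical clock value used for LRU selection */
    unsigned char data[PAGE_SIZE];
} CacheSlot;

typedef struct {
    CacheSlot     slots[CACHE_SLOTS];
    unsigned long clock;
    unsigned long hits, misses, writebacks;
} PageCache;

void cache_init(PageCache *c)
{
    memset(c, 0, sizeof *c);
    for (int i = 0; i < CACHE_SLOTS; i++)
        c->slots[i].page_no = -1;
}

/* Cache lookup: return the cached page, loading it from "disk" on a miss. */
CacheSlot *cache_get(PageCache *c, int page_no)
{
    assert(page_no >= 0 && page_no < DISK_PAGES);
    CacheSlot *victim = &c->slots[0];

    for (int i = 0; i < CACHE_SLOTS; i++) {
        CacheSlot *s = &c->slots[i];
        if (s->page_no == page_no) {           /* hit: no disk access needed */
            c->hits++;
            s->last_used = ++c->clock;
            return s;
        }
        /* Track the eviction candidate: an empty slot if one exists,
         * otherwise the least recently used slot seen so far. */
        if (victim->page_no != -1 &&
            (s->page_no == -1 || s->last_used < victim->last_used))
            victim = s;
    }

    c->misses++;
    /* Synchronization: a dirty outgoing page is written back before reuse. */
    if (victim->page_no != -1 && victim->dirty) {
        memcpy(disk[victim->page_no], victim->data, PAGE_SIZE);
        c->writebacks++;
    }
    memcpy(victim->data, disk[page_no], PAGE_SIZE);
    victim->page_no   = page_no;
    victim->dirty     = 0;
    victim->last_used = ++c->clock;
    return victim;
}

/* Modify one byte of a page through the cache and mark the page dirty. */
void cache_write_byte(PageCache *c, int page_no, size_t offset, unsigned char value)
{
    assert(offset < PAGE_SIZE);
    CacheSlot *s = cache_get(c, page_no);
    s->data[offset] = value;
    s->dirty = 1;
}
```

Even in this reduced form, every access pays for a slot scan, bookkeeping updates and,
on a miss, one or two page copies, whether or not the backing store is actually slow.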
These caching functions impose only minor overhead when considered individually, but
significant overhead in aggregate. Each process plays out every time the
application makes a function call to read a record from disk (in the case of db.linux,
examples are d_recfrst, d_recnext, d_findnm, d_keyfind, etc.). In the demonstration
application above, this amounts to some 90,000 function calls: 30,000 d_fillnew, 30,000
d_keyfind and 30,000 d_recread. In contrast, all records in a main memory database such
as eXtremeDB are always in memory, and therefore require no caching.
Transaction Processing Overhead
Transaction processing logic is a major source of processing latency. In the event of a
catastrophic failure such as loss of power, a disk-based database recovers by committing
or rolling back complete or partial transactions from one or more log files when the
system is restarted. Disk-based databases are hard-wired to keep transaction logs, and to
flush transaction log files and cache to disk after the transactions are committed. A
disk-based database doesn't know that it is running on a RAM-drive, so this complicated
processing continues even when the log file exists only in memory and cannot aid in
recovery should the system fail.
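To make the cost of the log flush concrete, the following sketch shows one common way a
commit record can be forced to stable storage on a POSIX system. The function name
log_commit and the record layout are hypothetical, and the source does not say which
system calls db.linux itself uses; the open, write, fsync and close calls are standard
POSIX.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Append one log record and force it to stable storage.
 * Returns 0 on success, -1 on any failure. */
int log_commit(const char *log_path, const void *record, size_t len)
{
    assert(log_path != NULL && record != NULL);

    int fd = open(log_path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd < 0)
        return -1;

    /* write() may transfer fewer bytes than requested, so loop until done. */
    const unsigned char *p = record;
    size_t remaining = len;
    while (remaining > 0) {
        ssize_t n = write(fd, p, remaining);
        if (n < 0) {
            if (errno == EINTR)
                continue;            /* interrupted by a signal: retry */
            close(fd);
            return -1;
        }
        p += n;
        remaining -= (size_t)n;
    }

    /* The flush step: ask the OS to push the data to the storage device. */
    if (fsync(fd) != 0) {
        close(fd);
        return -1;
    }
    return close(fd);
}
```

When log_path points at a file on a RAM-drive, every call still executes the same
system-call sequence, but the "stable storage" it reaches is volatile memory, so the
work buys no recoverability.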
Main memory databases must also provide transactional integrity, or so-called
ACID-compliant transactions. In plain English, a main memory database application thread
must be able to commit or abort a series of updates as a single unit. To do this,
eXtremeDB maintains a before-image of the objects that are updated or deleted, and a list
of database pages added during a transaction. When the application commits the
transaction, the memory for before-images and page references returns to the memory
pool