Caching, a fundamental metaphor in modern computing, finds wide application in storage systems,(1) databases, Web servers, middleware, processors, file systems, disk drives, redundant array of independent disks (RAID) controllers, operating systems, and other applications such as data compression and list updating.(2) In a two-level memory hierarchy, a cache performs faster than auxiliary storage, but it is more expensive. Cost concerns thus usually limit cache size to a fraction of the auxiliary memory's size. Both cache and auxiliary memory handle uniformly sized items called pages. Requests for pages go first to the cache. When a page is found in the cache, a hit occurs; otherwise, a cache miss happens, and the request goes to the auxiliary memory. In the latter case, a copy is paged into the cache. This practice, called demand paging, rules out prefetching pages from the auxiliary memory into the cache. If the cache is full, the system must page out one of the currently cached pages before it can page in a new one. A replacement policy determines which page is evicted. A commonly used criterion for evaluating a replacement policy is its hit ratio, the frequency with which it finds a page in the cache. Of course, the replacement policy's implementation overhead should not exceed the anticipated time savings.

Discarding the least recently used (LRU) page is the policy of choice in cache management. Until recently, attempts to outperform LRU in practice had not succeeded because of overhead issues and the need to pretune parameters. The adaptive replacement cache is a self-tuning, low-overhead algorithm that responds online to changing access patterns. ARC continually balances between the recency and frequency features of the workload, demonstrating that adaptation eliminates the need for the workload-specific pretuning that plagued many previous proposals to improve LRU.

ARC's online adaptation will likely benefit real-life workloads because of their richness and variability over time. These workloads can contain long sequential I/Os or moving hot spots, with the frequency and scale of temporal locality changing over time, and they can fluctuate between stable, repeating access patterns and patterns with transient clustered references. Like LRU, ARC is easy to implement, and its running time per request is essentially independent of the cache size. A real-life implementation revealed that ARC has a low space overhead of 0.75 percent of the cache size. Also, unlike LRU, ARC is scan-resistant in that it allows one-time sequential requests to pass through without polluting the cache or flushing pages that have temporal locality. Likewise, ARC effectively handles long periods of low temporal locality. ARC leads to substantial performance gains in terms of an improved hit ratio compared with LRU for a wide range of cache sizes.
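The hit/miss, demand-paging, and eviction mechanics described above can be made concrete with a small sketch. The following minimal Python example implements plain LRU replacement (not the ARC algorithm itself); the class and parameter names, such as LRUCache and auxiliary, are illustrative assumptions rather than anything from the original implementation.

```python
from collections import OrderedDict

class LRUCache:
    """Minimal demand-paging cache with least-recently-used eviction (illustrative sketch)."""

    def __init__(self, capacity, auxiliary):
        self.capacity = capacity      # number of uniformly sized pages the cache can hold
        self.auxiliary = auxiliary    # slower backing store: maps page id -> page contents
        self.pages = OrderedDict()    # cached pages, ordered from least to most recently used
        self.hits = 0
        self.misses = 0

    def request(self, page_id):
        if page_id in self.pages:
            # Cache hit: refresh the page's recency.
            self.hits += 1
            self.pages.move_to_end(page_id)
            return self.pages[page_id]
        # Cache miss: fetch the page from auxiliary memory on demand (no prefetching).
        self.misses += 1
        page = self.auxiliary[page_id]
        if len(self.pages) >= self.capacity:
            # Replacement policy: evict the least recently used page.
            self.pages.popitem(last=False)
        self.pages[page_id] = page
        return page

    def hit_ratio(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


# Example usage with a hypothetical auxiliary store of 100 pages.
aux = {i: f"page-{i}" for i in range(100)}
cache = LRUCache(capacity=4, auxiliary=aux)
for pid in [1, 2, 3, 1, 4, 5, 1, 2]:
    cache.request(pid)
print(cache.hit_ratio())
```

Because every operation on the ordered dictionary is constant time, the per-request cost is essentially independent of the cache size, the same property the text attributes to both LRU and ARC; what ARC adds on top of a structure like this is the online balancing between recency and frequency.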