A Qualitative Study of Application-level Caching

Latency and cost of Internet-based services are driving the proliferation of application-level caching, which aims to improve user-perceived latency, reduce Internet traffic, and improve the scalability and availability of origin servers. However, there is no transparent and automatic way to design and implement this type of caching, since it usually depends on specific details of the application being cached. As a result, application developers must reason about an orthogonal aspect that is unrelated to the business logic. Moreover, this concern is typically addressed in an ad-hoc way and, consequently, is a time-consuming and error-prone task, becoming a common source of bugs. In this paper, we present the results of a qualitative study of how developers handle caching logic in their web applications, which involved the investigation of 10 software projects. The analysis of our results allowed us to extract cache-related concerns addressed by developers to improve their applications' performance and scalability. Based on our analysis, we derived guidelines and patterns for application-level caching, which provide guidance to developers while designing, implementing, and managing application-level caching, thus supporting them in this challenging task, which is crucial in enterprise web applications.

Jhonny Mertz and Ingrid Nunes
jhonnymertz@gmail.com
Universidade Federal do Rio Grande do Sul
Avenida Bento Gonçalves, 9500 - Instituto de Informática
Porto Alegre, Brazil

Applications


In order to investigate different aspects of caching, it was important to select representative systems that make extensive use of application-level caching. To obtain such applications, we searched through open-source repositories, whose information can be easily retrieved for our study. Using text search mechanisms, we assessed how many occurrences of caching implementation and caching-related issues were present in each application, to ensure it would contribute to the study: the more caching-related code and issues, the better the chances of extracting the rationale behind choices regarding application-level caching. Moreover, we managed to obtain commercial applications from partner companies interested in the results of this study.

Project name | Domain | Language | KLOC | Analyzed commit
S1 | Market trend analysis | Ruby | 21 | —
Pencilblue | CMS and blogging platform | Javascript | 33 | a3b378bb195a1641427a023f30728adb8fd85a94
Spree | E-commerce | Ruby | 50 | 8e74d76c41ce3b9dbb5e8f66f015793176a7451d
Shopizer | E-commerce | Java | 57 | d3bb68ec908c440267e7084ed95392afe2aa1a1f
Discourse | Platform for community discussion | Ruby | 88 | 16f509afb94553ee16fa50258de6937a8360878b
OpenCart | E-commerce | PHP | 123 | 76a772f9a11f37d3ceb37866b45cd2beee42fc7a
OpenMRS API and web application | Patient-based medical record system | Java | 127 | 971ada09e1763cb9df57dc290958f0de34029788
ownCloud core | File sharing solution for online collaboration and storage | PHP | 193 | b877b0508bb08868909b2b270a7753a9ba868c7a
PrestaShop | E-commerce | PHP | 245 | a36c389729429fb7395f903d2cc6ebd57ab958e9
Open edX | Online courses platform | Python | 250 | 22e01a8c9527add15e8bcc9589c02900fbb2c63b

Guidelines


Based on the analysis, we have identified application-level caching decisions and behaviors adopted by developers. Our findings allowed us to propose a set of guidelines for the development of a caching approach.

Design Guidelines

Evaluate different abstraction levels to cache. It is important to cache data where it saves the most processing and round trips, choosing locations that support the lifetime needed for the cached items, regardless of where the data is produced in the application. Different levels of caching provide different behavior, and the possibilities must be analyzed. For instance, caching at the model or database level offers higher hit ratios, while caching at the presentation level can significantly reduce the application processing overhead in the case of a hit; in the latter case, however, hit ratios are generally lower. It is possible to cache data at various layers of an application, according to the following layer-by-layer considerations.

  • Controller layer. Caching data at the controller layer should be considered when data needs to be frequently displayed to the user and is not cached on a per-user basis. At this level, controllers usually serve parameterized content, and the parameters can be used as the cache identifier. For example, if a list of states is presented to the user, the application can load it once from the database and then cache it, according to the parameters passed in the first request.
  • Business or service layer. Caching data at the business layer should be considered when an application needs to process requests from the presentation layer or when the data cannot be efficiently retrieved from the database or another service. It can be implemented with hash tables or with a library or framework (see the sketch after this list). However, at this level, a large amount of data tends to be manipulated, and caching it can consume more memory and lead to memory bottlenecks.
  • Model or database layer. At the model or database layer, a large amount of data can be cached for lengthy periods. Caching data from a database is useful when queries take a long time to process, avoiding unnecessary round trips.
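
As referenced in the business-layer item above, the following is a minimal sketch of business-layer caching with an in-memory hash table. ProductService and Product are hypothetical names, and the loader is a placeholder for a data-store-dependent query.

    import java.util.List;
    import java.util.concurrent.ConcurrentHashMap;

    // Minimal sketch of business-layer caching with a thread-safe map;
    // ProductService and Product are hypothetical names.
    public class ProductService {

        public static class Product { /* domain fields omitted */ }

        private final ConcurrentHashMap<String, List<Product>> byCategory =
                new ConcurrentHashMap<>();

        public List<Product> getProductsByCategory(String category) {
            // computeIfAbsent runs the loader only on a cache miss,
            // so repeated requests for the same category skip the database
            return byCategory.computeIfAbsent(category, this::loadFromDatabase);
        }

        private List<Product> loadFromDatabase(String category) {
            // expensive query omitted (data-store dependent)
            return List.of();
        }
    }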

Stack caching layers. The more layers at which data is cached, the lower the chance that a request reaches the application with no content already loaded. Caching might happen at the client, at a proxy server, inside the application (in the presentation, business, and model logic), or at the database. Although the same data may be cached in multiple locations, when the cache expires in one of them, the application is not hit with entirely uncached content, avoiding processing and network round trips. However, it is important to keep in mind that each caching layer implies a range of responsibilities, such as consistency conditions and constraints, and extra code and configuration. Therefore, it is important to consider multiple caching layers while, at the same time, achieving a good trade-off between caching benefits and implementation effort.

Separate dynamic from static data. Content can be classified as static, dynamic, or user-specific. By partitioning the content, it is easier to select portions of the data to cache.

Evaluate application boundaries. Communication between the application and external components is a common bottleneck and, consequently, an opportunity for caching. Consider caching database queries, remote server calls, and requests to web services, all of which are made across a network.
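
As a hedged illustration of boundary caching, the sketch below caches responses of remote GET calls by URL; RemoteServiceClient is a hypothetical name, and a real implementation would also need an expiration policy (discussed later).

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;
    import java.util.concurrent.ConcurrentHashMap;

    // Hypothetical sketch of caching at an application boundary: responses
    // of remote GET calls are cached by URL, avoiding repeated round trips.
    public class RemoteServiceClient {

        private final HttpClient http = HttpClient.newHttpClient();
        private final ConcurrentHashMap<String, String> responseCache =
                new ConcurrentHashMap<>();

        public String fetch(String url) {
            // on a miss, the remote call is performed exactly once per URL
            return responseCache.computeIfAbsent(url, this::doGet);
        }

        private String doGet(String url) {
            try {
                HttpRequest request =
                        HttpRequest.newBuilder(URI.create(url)).GET().build();
                return http.send(request, HttpResponse.BodyHandlers.ofString())
                           .body();
            } catch (Exception e) {
                throw new RuntimeException("Remote call failed: " + url, e);
            }
        }
    }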

Specify selection criteria. Selecting the right data to cache requires considerable reasoning, given that data manipulated by web applications ranges in dynamicity from completely static to constantly changing. To optimize this selection process, there are four primary criteria used by developers to detect cacheable content, which should inform decisions on whether to cache. These criteria are described below, ordered by importance: the higher the influence, the earlier the criterion is presented.

  • Data change frequency. Developers should seek data with some degree of stability, i.e.\ data that is used more than it is changed. Even if data is volatile and changes at regular intervals, caching can still bring a benefit. This is the first factor to consider, since caching volatile data implies implementing consistency mechanisms, which is not trivial and requires extra effort and reasoning from developers. In short, the cost of the consistency approach cannot be higher than the benefit of caching. Besides, when stale data is not a critical issue, a weak consistency approach can be employed, such as time-to-live (TTL) eviction, where data is expired after a period in the cache, regardless of possible changes.
  • Data usage frequency. Frequent requests, operations, and queries, as well as shared content (accessed by multiple users), must be identified, focusing on avoiding recomputation. Even if some processing looks fast enough at a glance, it can become a bottleneck when invoked many times. Despite being frequently used, user-specific data cannot be shared and may not bring the benefit of caching, so it is usually left out of the cache.
  • Data computation complexity. Data that is expensive to retrieve, compute, or render, regardless of its dynamicity, is always considered a good caching opportunity.
  • Size of the data. The size of the content being cached should be considered when using size-limited caches. If items are large, the result can be cache thrashing: the cache fills with data and purges it soon after, yielding no caching benefit.

Evaluate staleness and lifetime of cached data. Every piece of cached data is potentially stale; it is thus important to consider the degree of integrity and potential staleness that the application can tolerate in exchange for increased performance and scalability. Many cache implementations adopt an expiration policy that invalidates cached data based on a timeout, since weak consistency is easier than defining a hard-to-maintain, but more robust, invalidation process. In short, developers must ensure that the expiration policy matches the access pattern of the applications that use the data, which involves determining how long the cached information is allowed to be outdated, and relaxing freshness when possible.
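
As a minimal sketch of such timeout-based expiration, assuming a hypothetical generic in-process cache named TtlCache: each entry records when it was stored and is treated as a miss once its time-to-live elapses.

    import java.time.Duration;
    import java.time.Instant;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Minimal sketch of TTL-based (weak consistency) expiration;
    // TtlCache is a hypothetical name.
    public class TtlCache<K, V> {

        private static final class Entry<V> {
            final V value;
            final Instant storedAt;
            Entry(V value, Instant storedAt) {
                this.value = value;
                this.storedAt = storedAt;
            }
        }

        private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
        private final Duration ttl;

        public TtlCache(Duration ttl) {
            this.ttl = ttl;
        }

        public void put(K key, V value) {
            entries.put(key, new Entry<>(value, Instant.now()));
        }

        public V get(K key) {
            Entry<V> e = entries.get(key);
            if (e == null) {
                return null; // never cached
            }
            if (Instant.now().isAfter(e.storedAt.plus(ttl))) {
                entries.remove(key); // expired: evict and report a miss
                return null;
            }
            return e.value;
        }
    }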

Avoid caching per-user data. It is recommended to avoid caching per-user data unless the user base is small and the total size of the cached data does not require an excessive amount of memory; otherwise, it can cause a memory bottleneck. However, if users tend to be active for a while and then go away, caching per-user data for short periods may be appropriate. An example is a search engine that caches query results per user, so that it can page through results efficiently.

Avoid caching volatile data. Data should be cached when it is frequently used and not continually changing. Developers should remember that caching is most effective for relatively stable data, or data that is frequently read. Caching volatile data that is required to be accurate or updated in real time should be avoided.

Do not discard small improvements. Any caching solution employed reduces user-perceived latency. This means that even non-obvious scenarios should be targets of caching; i.e.\ it is not true that only data that is frequently used and expensive to retrieve or create should be considered for caching. Furthermore, data that is expensive to retrieve but modified periodically can still improve performance and scalability when properly managed. Caching data even for a few seconds can make a large difference in high-volume sites. If data is read more often than it is updated, it is also a candidate for caching.

Implementation Guidelines

Keep the cache API simple. Caching logic tends to be spread all over the application, so a simple and uniform interface should be employed to avoid writing messy code that incurs high maintenance effort.

Define naming conventions. To define appropriate names for cached data, it is important to assign a name related to the data's context, the data itself, and the caching location. This provides two direct benefits: (a) prevention of key conflicts, and (b) guidance for cache actions such as updating and deleting stale data when the source of information changes.
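
For illustration, a hypothetical naming convention of the form <area>.<entity>:<qualifiers> could be encoded in a small helper; the names and format below are assumptions, not a convention from the study. Related entries share a common prefix, which helps debugging and targeted invalidation.

    // Hypothetical helper encoding a key convention of the form
    // <area>.<entity>:<qualifiers>.
    public final class CacheKeys {

        private CacheKeys() {
        }

        public static String product(String productId) {
            return String.format("catalog.product:%s", productId);
        }

        public static String searchResults(String query, int page) {
            return String.format("search.results:%s:page=%d", query, page);
        }
    }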

Perform cache actions asynchronously. For large caches, it is advisable to load the cache asynchronously, with a separate thread or a batch process. Moreover, when an expired cache entry is requested, it needs to be repopulated, and doing so synchronously affects response time and blocks the request processing thread.

Do not use cache as data storage. An application can modify data held in a cache, but the cache should be considered as a transient data store that can disappear at any time. Therefore, developers should not save valuable data only in the cache, but keep the information where it should be as well, minimizing the chances of losing data if the cache unexpectedly becomes unavailable.

Maintenance Guidelines

Perform measurements. Caching is an optimization technique and, as with any optimization, it is important to perform measurements before making substantial changes, given that not all application performance and scalability problems can be solved with caching. Furthermore, if unnecessarily employed, caching can decrease performance rather than improve it.

Document and report measurements. To compare and reproduce the performance tests employed, it is important to document the setup used in the application analysis, including enabled modules, particular configurations, and hardware settings.

Consider using supporting libraries and frameworks. Supporting libraries and frameworks can raise the level of abstraction of cache implementation and provide useful features. In addition, they can scale in a much easier and faster way than application-wide solutions.

Tune default configurations. Default configurations provided by external components serve as a starting point. However, because they are generic and may not fit the application's specificities, other configurations must be evaluated and possibly adopted.

Use transparent caching components. Transparent caching solutions that address bottlenecks outside the application boundaries, such as databases, final HTML pages or fragments, and static assets, can provide fast results. These solutions do not exploit application specificities, but can still provide performance benefits in typical usage scenarios.

Patterns


Based on our study, we derived caching patterns, which can be used to support caching decisions in a software system. Moreover, we identified the components these patterns must have, which comprise a template for a caching pattern catalog. These components are (i) a classification, (ii) the pattern intent, (iii) the proposed solution, and (iv) an example.

Asynchronous Loading Implementation


Intent

  • Design an intermediary to deal with caching asynchronously.
  • Avoid synchronous caching operations, which can affect client response time and block the request processing thread.

Problem

Under certain circumstances, populating the cache is very expensive (e.g., data provided by third-party systems via web services, or the result of a heavy calculation process). In such scenarios, whenever the cache is invalidated (by automatic expiration or explicit invalidation), request processing is blocked while fresh data is retrieved, which affects client response time.

Solution

Load the cache asynchronously with a separate thread or by using a batch process.
Rules of thumb:
  • For the initial population of the cache, especially for large caches, load the cache asynchronously with a separate thread or a batch process.
  • At runtime, when the cache is invalidated, repopulate it in a background thread, hiding the cost of data retrieval from end users.
  • The use of a third-party library or framework that already provides basic cache operations asynchronously is strongly recommended. Options exist for most programming languages and cache providers.

Example

The methods in the following code example show an implementation of the Cache-Aside pattern based on asynchronous processing. An object is identified by using an ID as the key. The asyncGet method uses this key and attempts to retrieve an item from the cache. If a matching item is found, it is returned. If there is no match in the cache, the object is retrieved from the data store, added to the cache, and then returned (the code that actually retrieves the data from the data store has been omitted because it is data-store dependent). Note that the load from the cache has a timeout, to ensure that the cache interaction does not block processing.


    // Get a cache client
    // it can be a third-party library or an implemented module;
    // this facade is assumed to provide at least typed asynchronous
    // get, set and delete methods
    Cache cache = Cache.getInstance();

    public List<Product> getProducts() {

        List<Product> products = null;
        Future<List<Product>> f = cache.asyncGet("products");
        try {
            // Try to get a value, for up to 5 seconds, and cancel if it
            // doesn't return
            products = f.get(5, TimeUnit.SECONDS);

        // expecting InterruptedException, ExecutionException
        // or TimeoutException
        } catch (Exception e) {
            // Since we no longer need the result, cancel the operation.
            // This is not strictly necessary, but it saves some work on
            // the server. It is okay to cancel it while running.
            f.cancel(true);
            // Do other timeout-related work
        }

        if (products == null) {
            products = getProductsFromDB();

            // updates to the cache should not block the request;
            // return to the user as soon as possible
            cache.asyncSet("products", products);
        }

        return products;
    }

    public Product getProduct(String id) {

        Product product = null;
        Future<Product> f = cache.asyncGet("product" + id);
        try {
            product = f.get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            f.cancel(true);
        }

        if (product == null) {
            product = getProductFromDB(id);

            // updates to the cache should not block the request
            cache.asyncSet("product" + id, product);
        }

        return product;
    }
            

The code below demonstrates how to invalidate an object in the cache when its value is changed by the application. The code updates the original data store and then removes the cached item by calling the asynchronous delete method, specifying the key.

The order of the steps in this sequence is important: update the data store first, then remove the cached item. If the item were removed first, there would be a small window in which a client application could fetch the data (because it is not found in the cache) before the item in the data store has been changed, repopulating the cache with stale data.


    public void updateProduct(Product product) {
        updateProductIntoDB(product);
        cache.asyncDelete("products");

        // optionally, the cached item can be refreshed with the new value
        // (assuming Product exposes its identifier via getId())
        cache.asyncSet("product" + product.getId(), product);
    }

    public void deleteProduct(String id) {
        deleteProductFromDB(id);
        cache.asyncDelete("products");
        cache.asyncDelete("product" + id);
    }
          

Cacheability Design


Intent

  • Provide an intuitive process to decide whether or not to cache particular data.

Problem

Caches have limited size, so it is important to use the available space for data that maximizes the benefit to the application. Otherwise, caching can end up reducing application performance instead of improving it, consuming cache memory while still suffering from cache misses, in which data is not served from the cache but fetched from the source.

Solution

Even though many criteria contribute to identifying the level of data cacheability, there is a subset that can confirm the decision regardless of the values of the other criteria. The figure below shows a flowchart of the reasoning process to decide whether or not to cache data, based on the observation of data and cache properties. Changeability is the first criterion that should be analyzed when selecting cacheable data; then usage frequency, shareability, computation complexity, and cache properties should be considered. All criteria are tightly related to the application's specificities and should be specified by the developer. The decision process is also sketched as code after the rules of thumb below.

Cacheability Flowchart
Rules of thumb:
  • Despite being frequently used, user-specific data is not shareable and may not bring the benefit of caching, so developers usually avoid caching it. In this case, a dedicated session component is used to keep and retrieve user sessions.
  • If data changes frequently, it should not be immediately discarded as a caching candidate. The performance benefit of caching should be evaluated against the cost of maintaining the cache. Caching frequently changing data can provide benefits if slightly stale data is acceptable.
  • Expensive spots (where much processing is required to retrieve or create data) are bottlenecks that directly affect application performance and should be cached, even though this increases the complexity and responsibilities to deal with. Methods with high latency, or that involve a large call stack, are examples of this situation and opportunities for caching.
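
As referenced above, the reasoning process can be sketched as a decision function; all thresholds below are illustrative assumptions, not values from the study.

    // Illustrative encoding of the cacheability criteria as a decision
    // function; the thresholds are assumptions, not values from the study.
    public final class CacheabilityCheck {

        private CacheabilityCheck() {
        }

        public static boolean shouldCache(double readsPerChange,
                                          boolean userSpecific,
                                          long retrievalCostMillis,
                                          long itemSizeBytes,
                                          long cacheCapacityBytes) {
            if (userSpecific) {
                return false; // not shareable across users
            }
            if (readsPerChange < 1.0) {
                return false; // changes more often than it is read
            }
            if (itemSizeBytes > cacheCapacityBytes / 10) {
                return false; // large items risk cache thrashing
            }
            // cache data that is expensive to produce or heavily read
            return retrievalCostMillis > 100 || readsPerChange > 10.0;
        }
    }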

In addition, we list content properties that should be avoided, since they conflict with the influence factors and lead to problems such as cache thrashing.

  • (a) User-specific data. Avoid caching content that varies depending on particularities of the request, unless weak consistency is acceptable. Otherwise, the cache can end up filled with small, less beneficial objects. As a result, the caching component reaches its maximum capacity earlier and is flushed or replaced many times in a brief period, which is cache thrashing.
  • (b) Highly time-sensitive data. Content that changes more often than it is used should not be cached, given that it will not benefit from caching. The cost of designing and implementing an efficient consistency policy may not be compensated.
  • (c) Large-sized objects. Unless the cache is large enough, do not cache large objects; doing so will likely result in cache thrashing, where the caching component is flushed or replaced many times in a short period.

Example

We list some typical scenarios where data should be cached and also give explanations based on the criteria presented.

  • (a) Headlines. In most cases, headlines are shared by multiple users and updated infrequently.
  • (b) Dashboards. Usually, much data needs to be gathered across several application modules and manipulated to build summarized information about the application.
  • (c) Catalogs. Catalogs need to be updated at specific intervals, are shared across the application, and manipulated before sending the content to the client.
  • (d) Metadata/configuration. Settings that do not frequently change, such as country/state lists, external resource addresses, logic/branching settings and tax definitions.
  • (e) Historical datasets for reports. These are costly to retrieve or create and do not need to change frequently.

Data Expiration Design and Maintenance


Intent

  • Given the cacheable content, provide an intuitive process to decide the consistency management approach based on data specificities.

Problem

It is usually impractical to expect that cached data will always be completely consistent with the data in the data store. Applications should implement a strategy that helps ensure that the data in the cache is as up to date as possible, but that can also detect and handle situations in which the data in the cache has become stale. An inappropriate expiration policy may result in frequent invalidation of cached data, which negates the benefits of caching.

Solution

Every piece of cached data is potentially stale, and a good trade-off between performance benefits and the cost of the invalidation approach should be achieved. It is necessary to determine the appropriate time interval to refresh data, and to design a notification process to indicate that the cache needs refreshing. If data is held too long, the application runs the risk of using stale data; if data is expired too frequently, performance may suffer.

Deciding on the expiration algorithm that is right for the scenario includes the following possibilities:

  • Heuristic-based. Traditional algorithms such as least recently used and least frequently used can be used.
  • Absolute expiration after a fixed interval. Expiration based on a pre-defined TTL, applied to all content.
  • Invalidation. Cache expiration based on a change in an external dependency, such as modifications to the data caused by user actions.
  • Flushing. Cleaning up the cache if a resource threshold (such as a memory limit) is reached.

The figure below shows a flowchart of the reasoning process to decide on the appropriate consistency approach, based on the observation of data properties. Changeability is the first property that should be analyzed; then the staleness level and the number of operations and dependencies related to the data should be considered. All properties are tightly related to the application's specificities and should be defined by the developer.

Data Expiration Flowchart
Rules of thumb:
  • While deciding the best consistency approach, it is important to measure the staleness degree and the lifetime of cached data.
  • Frequently changed data is easily managed when associated with a TTL.
  • Infrequently changed data provides more benefits when cached for long periods, so manual invalidation or replacement is recommended.
  • Determining how often the cached information is allowed to be stale and working with weak consistency can be easier than defining a hard-to-maintain invalidation process. However, if the expiration period is too short, objects expire too quickly, reducing the benefits of the cache. On the other hand, if the expiration period is too long, the data risks becoming stale.
  • Do not give all your keys the same TTL, so they do not all expire at the same time (see the jitter sketch after this list). This ensures that you do not get spikes of requests hitting your database because cache keys expired simultaneously.
  • If the lifetime depends on how frequently the data is used, traditional heuristic-based eviction policies are the right choice.
  • If you frequently expire the cache to keep in synchronization with the rapidly changing data, you might end up using more system resources such as CPU, memory, and network.
  • If the data does change frequently, you should evaluate the acceptable time limit during which stale data can be served to the user.
  • Even if the data is quite volatile and changes, for example, every two minutes, the application can still take advantage of caching. For instance, if 20 clients request the same data within a 2-minute interval, caching saves at least 20 round trips to the server.
  • Do not make the expiration period too short because this can cause applications to continually retrieve data from the data store and add it to the cache.
  • Similarly, do not make the expiration period so long that the cached data is likely to become stale.
  • Most caches adopt a least-recently-used policy for selecting items to evict, but this may be customizable. Configure the global expiration property and other properties of the cache, and the expiration property of each cached item, to help ensure that the cache is cost effective. It may not always be appropriate to apply a global eviction policy to every item in the cache. For example, if a cached item is very expensive to retrieve from the data store, it may be beneficial to retain this item in cache at the expense of more frequently accessed but less costly items.
  • Cache services typically evict data on a least-recently-used (LRU) basis, but you can usually override this policy and prevent items from being evicted. However, if you adopt this approach, you risk the cache exceeding the memory that it has available, and an application that attempts to add an item to the cache will fail with an exception.
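
As referenced in the rules above, a minimal sketch of TTL jitter follows; the helper name and bounds are assumptions. Adding a random offset to a base TTL prevents keys written together from expiring at the same instant.

    import java.time.Duration;
    import java.util.concurrent.ThreadLocalRandom;

    // Hypothetical helper: adds random jitter to a base TTL so that keys
    // written together do not all expire at the same time.
    public final class Ttls {

        private Ttls() {
        }

        public static Duration withJitter(Duration base, Duration maxJitter) {
            long jitterMillis =
                    ThreadLocalRandom.current().nextLong(maxJitter.toMillis() + 1);
            return base.plusMillis(jitterMillis);
        }
    }

For instance, withJitter(Duration.ofMinutes(10), Duration.ofMinutes(2)) yields an expiration somewhere between 10 and 12 minutes.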

Example

Consider a stock ticker that shows stock quotes. Although stock rates are continuously updated, the cached ticker can safely be removed, or even updated, after a fixed time interval of a few minutes.

Name Assignment Implementation


Intent

  • Ensure a unique key.
  • Keep track of the content cached.

Problem

Keys are important to keep track of cached content while debugging, and when it is necessary to invalidate and delete stale data from the cache after changes in the source of information.

Solution

When choosing a cache key, you should ensure that it is unique to the object being cached, and that it appropriately varies by any contextual values.

Rules of thumb:
  • Cache keys are not restricted to simple strings; they can also be complex types such as hashes, lists, sets, or sorted sets. Content can be identified by method signature and an individual identifier. Moreover, tag-based identification can be used to group related content.
  • If the object being cached depends on the current user (for instance, via HttpContext.Current.User), then the cache key should include a variable that uniquely identifies that user.

Example

The key can be scoped to its particular functional area and formatted with the varying parameters. This makes it easy to debug and track cached values.


    // C# example: key scoped to class and method, varied by parameters
    var key = string.Format("MyClass.MyMethod:{0}:{1}", myParam1, myParam2);