4,285 result(s) for "Caching"
A review of modern caching strategies in named data network: overview, classification, and research directions
Named Data Networking (NDN) is an extended form of Content-Centric Networking and one of the most significant architectures in the Information-Centric Networking paradigm. It is critical for accessing the majority of Internet-based applications, because content is accessed by its name rather than by the physical location of its host. Furthermore, the current Internet design is unsuitable for the enormous volume of Internet traffic. As a result, the paradigm is shifting from a location-based to a content-based one. The most important area to explore in the NDN architecture is in-network data distribution (data caching), which helps subscribers get the required content from the nearest caching node; however, this is costly due to high bandwidth and popularity. To achieve higher cache performance, the cache needs to be managed using a more efficient technique. There are numerous content placement and replacement strategies for managing an NDN-based cache; this work reviews cache placement and replacement strategies that address the problem of cache management in the NDN architecture. This paper provides an overview of modern caching strategies and related issues such as caching characteristics, caching challenges, caching simulation environments, and caching evaluation metrics. It also aims to present useful research papers to researchers interested in NDN so that they can get an overview of the studies and topics that have been and are being designed and developed in this caching area.
XRootD caching for Belle II
The Belle II experiment at the second generation e+/e− B-factory SuperKEKB has been collecting data since 2019 and aims to accumulate a 50 PB data set. To efficiently process these steadily growing data sets of recorded and simulated data as well as support Grid-based analysis workflows using the DIRAC Workload Management System, an XRootD-based caching architecture is presented. The presented mechanism decreases job waiting time for often-used data sets by transparently adding copies of these files at smaller sites without managed storage. The described architecture seamlessly integrates local storage services and supports the use of dynamic computing resources with minimal deployment effort. This is especially useful in environments with many institutions providing comparatively small numbers of cores and limited personpower.
Mobility-Aware Proactive Edge Caching Optimization Scheme in Information-Centric IoV Networks
Edge caching is a promising approach to alleviate the burden on the backhaul of network links. It plays a significant role in the performance of Internet of Vehicles (IoV) networks by providing cached data at the edge and reducing the burden on the core network caused by the number of participating vehicles and the data volume. However, due to the limited computing and storage capabilities of edge devices, it is hard to guarantee that all contents are cached and that every requirement is satisfied for all users. In this paper, we design an Information-Centric Network (ICN) with a mobility-aware proactive caching scheme to provide delay-sensitive services on IoV networks. The real-time status and interaction of vehicles with other vehicles and Roadside Units (RSUs) are modeled using a Markov process. A mobility-aware proactive edge caching decision that maximizes network performance while minimizing transmission delay is applied. Our numerical simulation results show that the proposed scheme outperforms related caching schemes by 20–25% in terms of latency and by 15–23% in cache hits.
Joint computation offloading and task caching for multi-user and multi-task MEC systems: reinforcement learning-based algorithms
Computation offloading at mobile edge computing (MEC) servers can mitigate the resource limitation and reduce the communication latency for mobile devices. Therefore, in this study, we propose an offloading model for a multi-user, multi-task MEC system. In addition, a new caching concept is introduced for the computation tasks, where the application program and related code for the completed tasks are cached at the edge server. Furthermore, an efficient model that integrates task offloading and caching is formulated as a nonlinear problem whose goal is to reduce the total overhead of time and energy. However, solving these types of problems is computationally prohibitive, especially for a large number of mobile users. Thus, an equivalent reinforcement learning formulation is created in which the state space is defined over all possible solutions and the actions are defined as movements between the different states. Afterwards, two effective algorithms, based on Q-learning and Deep-Q-Network, are proposed to derive the near-optimal solution for this problem. Finally, experimental evaluations verify that our proposed model can substantially minimize the mobile devices’ overhead by deploying computation offloading and task caching strategy reasonably.
Mixed Micro/Macro Cache for Device-to-Device Caching Systems in Multi-Operator Environments
In a device-to-device (D2D) caching system that utilizes a device’s available storage space as a content cache, a device called a helper can provide content requested by neighboring devices, thereby reducing the burden on the wireless network. To enhance the efficiency of a limited-size cache, one can consider not only macro caching, which is content-based caching based on content popularity, but also micro caching, which is chunk-based sequential prefetching and stores content chunks slightly behind the one that a nearby device is currently viewing. If the content in a cache can be updated intermittently even during peak hours, the helper can improve the hit ratio by performing micro caching, which stores chunks that are expected to be requested by nearby devices in the near future. In this paper, we discuss the performance and effectiveness of micro D2D caching when there are multiple operators, the helpers can communicate with the devices of other operators, and the operators are under a low load independently of each other. We also discuss the ratio of micro caching in the cache area when the cache space is divided into macro and micro cache areas. Good performance can be achieved by using micro D2D caching in conjunction with macro D2D caching when macro caching alone does not provide sufficient performance, when users are likely to continue viewing the content they are currently viewing, when the content update cycle for the cache is short and a sufficient number of chunks can be updated for micro caching, and when there are multiple operators in the region.
PCICaching: Learning-Driven and Resilient UAV Caching with Cache-Aware User Association in SAGINs
Space–air–ground integrated networks (SAGINs) enable flexible content delivery through satellite–UAV–ground cooperation, yet time-varying user demand and dynamic backhaul conditions pose significant challenges to efficient UAV caching. To address these challenges, this paper proposes PCICaching, a backhaul-aware and prediction-driven UAV caching framework that integrates LSTM-based popularity forecasting, cache-aware user association, and conditionally activated cooperative caching. Under normal satellite backhaul conditions, PCICaching operates in a latency-oriented mode and reduces average content delivery latency by up to 33.9% and 38.9% compared with representative GTGA-based and history-based baselines, respectively. When backhaul connectivity degrades, the proposed C3 mechanism enlarges cluster-level content coverage and maintains service continuity with only a moderate latency increase of approximately 14.2%. Moreover, the proposed sequential decomposition enables scalable online operation with per-update execution time below 100 ms. These results demonstrate that PCICaching provides a structurally adaptive and computationally efficient solution for UAV-assisted caching in SAGINs, effectively balancing latency efficiency and content availability under time-varying demand and infrastructure uncertainty.
MiniPIC: Flexible Position-Independent Caching in <100LOC
Retrieval-augmented and agentic workloads repeatedly prefill recurring predictable structured inputs (which we call "spans") such as documents and code files. Yet, prefix caching in engines such as vLLM cannot reuse their KV entries unless they share identical prefixes with another request, while Position-Independent Caching (PIC) implementations within production-grade inference servers typically either require substantial server code changes or keep KV state outside the server, incurring host-to-device transfer overhead. We present Minimalistic PIC (MiniPIC): a minimal, flexible and fast vLLM design built from two ingredients: positional-encoding-free KV cache and user-controlled cache-reuse primitives. MiniPIC stores unrotated K vectors in the KV cache, applies RoPE to K tiles inside attention using per-request logical positions, and exposes three user-facing and token-level primitives: block-aligned padding, span separator (SSep), and prompt depend (PDep), that modify hashing behavior and effective block-level causal attention structure. With fewer than 100 lines of core-engine changes plus a custom attention backend, these primitives are sufficient to realize multiple PIC methods, including Block-Attention, EPIC, and Prompt Cache, within the same running vLLM instance, while natively integrating with KV cache CPU offload implementations. On 2WikiMultihopQA, MiniPIC with interleaved scheduling improves prefill throughput by 49% over baseline vLLM, reduces cached-span time-to-first-token by up to two orders of magnitude, preserves the linear prefill scaling of uncached spans, and incurs only 5.7% worst-case overhead.
Caching strategy for Web application – a systematic literature review
Purpose: Internet users and Web-based applications continue to grow every day. The response time on a Web application really determines the convenience of its users. Caching Web content is one strategy that can be used to speed up response time. This strategy is divided into three main techniques, namely, Web caching, Web prefetching and application-level caching. The purpose of this paper is to put forward a literature review of caching strategy research that can be used in Web-based applications.
Design/methodology/approach: The methods used in this paper were as follows: determined the review method, conducted a review process, pros and cons analysis and explained conclusions. The review method is carried out by searching literature from leading journals and conferences. The first search process starts by determining keywords related to caching strategies. To limit the latest literature in accordance with current developments in website technology, search results are limited to the past 10 years, in English only and related to computer science only.
Findings: Note in advance that Web caching and Web prefetching are slightly overlapping techniques because they have the same goal of reducing latency on the user’s side. But actually, the two techniques are motivated by different basic mechanisms. Web caching uses the basic mechanism of cache replacement or the algorithm to change cache objects in memory when the cache capacity is full, whereas Web prefetching uses the basic mechanism of predicting cache objects that can be accessed in the future. This paper also contributes practical guidelines for choosing the appropriate caching strategy for Web-based applications.
Originality/value: This paper conducts a state-of-the-art review of caching strategies that can be used in Web applications. Exclusively, this paper presents taxonomy, pros and cons of selected research and discusses data sets that are often used in caching strategy research. This paper also provides another contribution, namely, practical instructions for Web developers to decide the caching strategy.
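The distinction this abstract draws between cache replacement and prefetching can be illustrated with a minimal sketch. It is not taken from the reviewed paper: the least-recently-used (LRU) policy is only one common replacement algorithm, chosen here as an assumption, and the names `LRUCache`, `prefetch`, and `fetch` are hypothetical.

```python
from collections import OrderedDict


class LRUCache:
    """Illustrative cache with LRU replacement (an assumed policy)."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items = OrderedDict()  # oldest entry first, newest last

    def get(self, key):
        """Return the cached value, or None on a miss."""
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        """Store a value; evict the least recently used entry when full."""
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # replacement step

    def prefetch(self, predicted_keys, fetch):
        """Load objects that are predicted to be requested soon.

        `fetch` is a hypothetical callable that retrieves an object
        from the origin; the prediction itself is out of scope here.
        """
        for key in predicted_keys:
            if key not in self._items:
                self.put(key, fetch(key))
```

Replacement reacts to a full cache by choosing what to discard, while prefetching acts ahead of a request by choosing what to load; in this sketch both go through the same `put` path, so prefetched objects can themselves trigger evictions.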
Beyond Fixed Formulas: Data-Driven Linear Predictor for Efficient Diffusion Models
To address the high sampling cost of Diffusion Transformers (DiTs), feature caching offers a training-free acceleration method. However, existing methods rely on hand-crafted forecasting formulas that fail under aggressive skipping. We propose L2P (Learnable Linear Predictor), a simple data-driven caching framework that replaces fixed coefficients with learnable per-timestep weights. Rapidly trained in ~20 seconds on a single GPU, L2P accurately reconstructs current features from past trajectories. L2P significantly outperforms existing baselines: it achieves a 4.55x FLOPs reduction and 4.15x latency speedup on FLUX.1-dev, and maintains high visual fidelity under up to 7.18x acceleration on Qwen-Image models, where prior methods show noticeable quality degradation. Our results show learning linear predictors is highly effective for efficient DiT inference. Code is available at https://github.com/Aredstone/L2P-Cache.
Security, Privacy, and Linear Function Retrieval in Combinatorial Multi-Access Coded Caching with Private Caches
We consider combinatorial multi-access coded caching with private caches, where users are connected to two types of caches: private caches and multi-access caches. Each user has its own private cache, while multi-access caches are connected in the same way as caches are connected in a combinatorial topology. A scheme is proposed that satisfies the following three requirements simultaneously: (a) Linear Function Retrieval (LFR), (b) content security against an eavesdropper, and (c) demand privacy against a colluding set of users. It is shown that the private caches included in this work enable the proposed scheme to provide privacy against colluding users. For the same rate, our scheme requires less total memory accessed by each user and less total system memory than the existing scheme for multi-access combinatorial topology (no private caches) in the literature. We derive a cut-set lower bound and prove optimality when r≥C−1. For r