Linux kernel work in progress news


Shuah Khan
 

Another round of Linux kernel development news:

Stray-write protection for persistent memory

"Persistent memory has a number of advantages; it is fast, CPU-addressable,
available in large quantities and, of course, persistent. But it also,
arguably, poses a higher risk of suffering corruption as a result of bugs
in the kernel." LWN.

You can read about the work in progress in this LWN article:
https://lwn.net/Articles/883352/

A memory allocator for BPF code

- Work-in-progress special memory allocator that reduces memory usage
with a small performance gain. It allocates two buffers: one for the
JIT compiler to use during compilation, and a second into which the
generated code is copied once the compile step is done.

You can find details here: https://lwn.net/Articles/883454/

thanks,
-- Shuah


Wenhui Zhang
 

Hi, Shuah:

Thanks so much for sharing.
The two links are very interesting.

We have three proposals on which we would like your comments; I have listed them below.

Thanks so much for your help!

Wenhui

PMem Issues:

  1. Cache coherence issue with fabric-attached PMem: the cache may not be located on the same node as the PMem --> partially resolved with PMDK, libpmem
  2. Ordering uncertainty in map-and-sync affects crash consistency --> CRC error checking helps, but we encountered performance issues; advanced data structures might help
  3. Identity and ownership association, and missing checks --> unresolved when the kernel is bypassed; also, RoCE/PMem Fabric does not address authentication, authorization, or encryption of data in flight.
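
The CRC error checking mentioned in item 2 could look roughly like the sketch below. It is only an illustration under assumptions: the record layout (struct pm_record), the payload size, and the helper names are hypothetical and do not come from PMDK or any existing library. A writer fills the record and stores the checksum last; after a crash, a reader treats any record whose checksum does not match as torn.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAYLOAD_MAX 48  /* hypothetical payload size */

/* Hypothetical on-media record: the checksum covers len and payload. */
struct pm_record {
    uint32_t crc;
    uint32_t len;
    uint8_t  payload[PAYLOAD_MAX];
};

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320), no lookup table. */
uint32_t crc32_calc(const void *buf, size_t n)
{
    const uint8_t *p = buf;
    uint32_t crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < n; i++) {
        crc ^= p[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

/* Checksum over everything in the record that follows the crc field. */
static uint32_t record_crc(const struct pm_record *r)
{
    size_t off = offsetof(struct pm_record, len);

    return crc32_calc((const uint8_t *)r + off, sizeof(*r) - off);
}

/* Fill a record and compute its checksum as the final step. */
void pm_record_fill(struct pm_record *r, const void *data, uint32_t len)
{
    assert(len <= PAYLOAD_MAX);
    memset(r, 0, sizeof(*r));
    r->len = len;
    memcpy(r->payload, data, len);
    r->crc = record_crc(r);
}

/* Returns 1 if the record is intact, 0 if it looks torn or corrupted. */
int pm_record_valid(const struct pm_record *r)
{
    return r->len <= PAYLOAD_MAX && r->crc == record_crc(r);
}
```

The performance concern raised above is visible here: every write and every validation walks the whole record, which is one reason data structures that need less checking are attractive.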

Proposed: Investigate and Implement PMem Aware Crash Consistent Data Structures

This work focuses on the atomicity (both failure atomicity and multi-thread atomicity) of NVM programming models, within the broader atomicity, consistency, isolation, and durability (ACID) properties, while avoiding unnecessary serialization or redundant instrumentation. For example, PMem can be accessed without locks in some cases; by selectively omitting the read/write fences that enforce ordering, we could improve performance.

We aim to build a mechanism for reliably moving data from one consistent state to another, which ensures crash consistency.

PMem-aware data structures are mechanisms that address the unique characteristics of PMem, such as the need to avoid PMem memory leaks. Mismanaging PMem allocations, for example failing to free a list or tree after use, can lead to a memory leak in PMem.

We will investigate how these data structures in PMem libraries (or to be added to PMem libraries), and their associated compiler optimization techniques (instruction selection and reordering), achieve atomicity.

Both failure atomicity (where PMem-aware libraries are sensitive to CPU cache behavior) and multi-thread atomicity (which might be our contribution to the open source community) are to be investigated for PMem-aware data structures.

Some possible techniques for these data structures, such as techniques for managing concurrent updates ([ISCA'90] Memory Consistency and Event Ordering in Scalable Shared-Memory Multiprocessors.pdf), include:

  • log-structured filesystems/databases/transactions that record pending or in-progress persistent updates, and recovery logic that rolls back incomplete atomic updates
  • secure deletion: (1) remove the owner's logical path to the data; (2) render the data inaccessible to all software; (3) change the status of the storage space from occupied to available for reuse.
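
The first technique, recording a pending update and rolling it back during recovery, can be sketched as a single-entry undo log. This is a minimal illustration under assumptions, not an existing library's API: struct pm_region, the tx_* functions, and recover() are hypothetical names, and persist() is a placeholder for the flush-and-fence step a real PMem implementation would need.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NSLOTS 4  /* hypothetical number of data slots */

/* Hypothetical single-entry undo log kept next to the data it protects. */
struct undo_entry {
    uint32_t valid;    /* 1 while an update is in flight */
    uint32_t index;    /* slot being updated */
    uint64_t old_val;  /* value to restore on rollback */
};

struct pm_region {
    struct undo_entry log;
    uint64_t slots[NSLOTS];
};

/* Placeholder: a real implementation would flush and fence here. */
static void persist(const void *addr, size_t len)
{
    (void)addr;
    (void)len;
}

/* Record the old value, then mark the log entry valid. */
void tx_begin(struct pm_region *r, uint32_t index)
{
    assert(index < NSLOTS && !r->log.valid);
    r->log.index = index;
    r->log.old_val = r->slots[index];
    persist(&r->log, sizeof(r->log));
    r->log.valid = 1;
    persist(&r->log.valid, sizeof(r->log.valid));
}

/* Update the slot named by the in-flight log entry. */
void tx_store(struct pm_region *r, uint64_t new_val)
{
    assert(r->log.valid);
    r->slots[r->log.index] = new_val;
    persist(&r->slots[r->log.index], sizeof(uint64_t));
}

/* Commit by invalidating the log entry. */
void tx_commit(struct pm_region *r)
{
    r->log.valid = 0;
    persist(&r->log.valid, sizeof(r->log.valid));
}

/* Run after a restart: returns 1 if an incomplete update was rolled back. */
int recover(struct pm_region *r)
{
    if (!r->log.valid || r->log.index >= NSLOTS)
        return 0;
    r->slots[r->log.index] = r->log.old_val;
    persist(&r->slots[r->log.index], sizeof(uint64_t));
    r->log.valid = 0;
    persist(&r->log.valid, sizeof(r->log.valid));
    return 1;
}
```

A crash between tx_begin() and tx_commit() leaves the log entry valid, so recover() restores the old value; a crash after tx_commit() leaves the new value in place.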

While designing and evaluating these PMem-aware data structures, cache-line, atomic-write, and kernel copy sizes have to be taken into account. We will also consider hardware-aligned operations on these data structures that enable optimal CPU/cache/memory performance, such as lockless data structures. Management of atomicity over a larger range of PMem (multi-thread atomicity) also has to be considered. For example, when updating a large range of PMem, the updater needs to lock out other threads and follow its set of store operations with a fence operation.

Multi-thread atomicity workflow (no failure atomicity is guaranteed in this workflow):

lock -> multiple stores of fundamental data type sizes -> flush -> fence -> unlock
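
A minimal sketch of this workflow follows, under stated assumptions: the spinlock and the pm_* function names are hypothetical, the 64-byte cache line is assumed, and the x86 intrinsics _mm_clflush and _mm_sfence stand in for the flush and fence steps where SSE2 is available (elsewhere the flush is a no-op and a C11 fence is used, so the code still builds but gives no persistence guarantee).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>  /* _mm_clflush, _mm_sfence */
#endif

#define CACHELINE 64  /* assumed cache-line size */

/* Flush every cache line that overlaps [addr, addr + len). */
static void pm_flush(const void *addr, size_t len)
{
#if defined(__SSE2__)
    uintptr_t p = (uintptr_t)addr & ~(uintptr_t)(CACHELINE - 1);
    uintptr_t end = (uintptr_t)addr + len;

    for (; p < end; p += CACHELINE)
        _mm_clflush((const void *)p);
#else
    (void)addr;  /* no portable flush: placeholder only */
    (void)len;
#endif
}

/* Order the preceding stores and flushes before later stores. */
static void pm_fence(void)
{
#if defined(__SSE2__)
    _mm_sfence();
#else
    atomic_thread_fence(memory_order_seq_cst);
#endif
}

/* Hypothetical spinlock built on a C11 atomic_flag. */
static void pm_lock(atomic_flag *lock)
{
    while (atomic_flag_test_and_set_explicit(lock, memory_order_acquire))
        ;  /* spin until the holder releases the lock */
}

static void pm_unlock(atomic_flag *lock)
{
    atomic_flag_clear_explicit(lock, memory_order_release);
}

/* lock -> multiple 8-byte stores -> flush -> fence -> unlock */
void pm_range_update(atomic_flag *lock, uint64_t *dst,
                     const uint64_t *src, size_t nwords)
{
    assert(dst != NULL && src != NULL);
    pm_lock(lock);
    for (size_t i = 0; i < nwords; i++)
        dst[i] = src[i];
    pm_flush(dst, nwords * sizeof(uint64_t));
    pm_fence();
    pm_unlock(lock);
}
```

Other threads that take the same lock never observe a half-updated range, but a crash in the middle of the store loop can still leave a partial update on media, which is why the workflow alone does not provide failure atomicity.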

Proposed: Extended data structure for persistent memory for ownership identification and information flow tracking

After permission checks are performed by the kernel, data located on persistent memory is directly accessible (i.e. in DAX filesystem mode) to userspace programs. No kernel code, file system page caches, or interrupts are involved in the data path. This creates a potential hazard for data whose integrity must be preserved. Once a tenant has established a root of trust with an NVDIMM, there is nothing within the NVDIMM to keep other tenants from accessing the protected memory addresses. Thus the NVDIMM is exposed to software accesses that cross tenant boundaries, from attackers such as administrators or other tenants, which affects the privacy and confidentiality of data in use.

Previous solutions include classical authorization, authentication, separation of roles, and memory protection. The kernel manages memory as virtual memory ranges/regions associated with a process identity. However, this does not hold when PMem is used in a kernel-bypass fashion with DMA or RDMA.

Proposed: Linux abstraction layer for protected read-only memory

An NVDIMM is subject to physical manipulation by attackers such as administrators or repair personnel, which affects the privacy and confidentiality of data at rest. Also, software defects might introduce data integrity and accessibility issues, due to mismanaged pointers and memory resources.

State data could be compromised by being overwritten by another userspace program; thus kernel support for read-only persistent memory is needed. This read-only persistent memory would be used to store passwords and integrity-sensitive records.
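
As a rough userspace analogy only (not the proposed kernel abstraction), the idea of sealing data as read-only can be sketched with the standard mmap() and mprotect() calls. The function name ro_seal is hypothetical, and the sketch uses anonymous memory; a PMem version would presumably map a file on a DAX filesystem instead, and would still need the kernel-side enforcement that this proposal asks for.

```c
#define _DEFAULT_SOURCE  /* for MAP_ANONYMOUS under strict -std=c11 */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Copy `data` into a fresh page-aligned mapping, then drop write permission.
 * Returns the read-only mapping (and its size in *map_len), or NULL on error.
 */
const void *ro_seal(const void *data, size_t len, size_t *map_len)
{
    long page = sysconf(_SC_PAGESIZE);

    assert(page > 0 && len > 0 && map_len != NULL);

    /* Round the length up to a whole number of pages. */
    size_t size = ((len + (size_t)page - 1) / (size_t)page) * (size_t)page;

    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    memcpy(p, data, len);

    /* After this call, any store to the region faults. */
    if (mprotect(p, size, PROT_READ) != 0) {
        munmap(p, size);
        return NULL;
    }

    *map_len = size;
    return p;
}
```

This only protects against stray writes from the same process; another program with write access to the underlying media is not stopped, which is the gap the proposed kernel support is meant to close.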

References

  1. Persistent Memory Hardware Threat Model.pdf
  2. SNIA-NVM-PM-Remote-Access-for-High-Availability-v1.0.pdf
  3. SNIA-Persistent-Memory-Atomics-Transactions-WP.pdf
  4. SNIA-NVM-Programming-Model-v1.2.pdf
  5. PMDK Installation, https://docs.pmem.io/persistent-memory/getting-started-guide/installing-pmdk/compiling-pmdk-from-source
  6. Persistent Memory Documentation, https://docs.pmem.io/persistent-memory/
  7. Facebook Patch of CXL PMem on cold page swap, https://lore.kernel.org/lkml/cover.1637778851.git.hasanalmaruf@.../
  8. (ASPLOS'19) PMTest - A Fast and Flexible Testing Framework for Persistent Memory Programs.pdf
  9. (SOSP'19) Index Data Structure for PMem, RECIPE - Converting Concurrent DRAM Indexes to Persistent-Memory Indexes.pdf
  10. (ASPLOS'20) Cross-Failure Bug Detection in Persistent Memory Programs.pdf
  11. (OSDI'21) Storage Stack for PMem, Rearchitecting Linux Storage Stack for µs Latency and High Throughput.pdf
  12. (NVMW'22) Redis-PMem Perf, Making Volatile Index Structures Persistent using TIPS.pdf
  13. (OSDI'20) Stacking Ordering, Write Dependency Disentanglement with Horae.pdf
  14. (SOSP'21) Transactional NVME Protocol, Crash Consistent Non-Volatile Memory Express
  15. (ISCA'90) Atomicity of PMem, Memory Consistency and Event Ordering in Scalable Shared-Memory Multiprocessors.pdf
  16. (PLDI'19) AutoPersist - An Easy-To-Use Java NVM Framework Based on Reachability.pdf
  17. (MICRO'19) Distributed Logless Atomic Durability with Persistent Memory.pdf

On Feb 7, 2022, at 8:19 AM, Shuah Khan <skhan@...> wrote:

[...]