kernel/kernel-std/debian/patches/0046-kernfs-also-call-kernfs_set_rev-for-positive-dentry.patch
Jim Somerville 63492b8ddd Reduce kernfs_mutex contention
This is a backport of a collection of 12 upstream patches.
The main one switches kernfs_mutex to a rwsem.
The next most important one makes that rwsem a per-filesystem
lock instead of a global one.

See the individual patches for details.  They did not require
much work or wiggling to get them applied.

They all come from Linus' tree and are easily located.  As such
I have not modified their individual headers with upstream
commit ids.

Verification:
- two scripts, based on a concept supplied by Vefa Bicakci.
The first one causes a lot of concurrent contention in sysfs.
The second one shows how well systemd fares while also contending.
Run Script1, then Script2.

Without this change, Script2 has timeouts and fails.

Script1:
for i in `seq 20`; do
  (while :; do find /sys/fs/cgroup/ -type f -readable -print0 \
    2>/dev/null | xargs -0 -n 20 -r cat >&/dev/null ; done) &
done

for i in `seq 10`; do
  (while :; do systemd-run --scope -q sleep 0.5 >/dev/null; done) &
done

Script2:
while true; do
        date -Is
        /usr/bin/time -f %e systemctl enable  -q lighttpd.service || break
        /usr/bin/time -f %e systemctl disable -q lighttpd.service || break
        /usr/bin/time -f %e systemctl restart -q lighttpd.service || break
        sleep 0.5 || break
done

- also soak testing to ensure that these patches don't introduce issues
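
Script1 starts its loops in the background and never stops them. The following is a minimal sketch, not part of the original verification, of one way to track those background loops and stop them once Script2 has finished. The names start_loop, stop_loops and pids are hypothetical helpers introduced here for illustration.

```shell
#!/bin/sh
# Hypothetical helpers (not from the original commit): remember the PID of
# every background loop so that the load can be stopped after the test.
pids=""

# start_loop CMD [ARGS...]: run CMD repeatedly in a background subshell.
start_loop() {
    ( while :; do "$@" >/dev/null 2>&1; done ) &
    pids="$pids $!"
}

# stop_loops: terminate every loop started by start_loop and reap it.
# A command that is mid-run when its loop is killed may still finish on
# its own; it is simply not restarted.
stop_loops() {
    for p in $pids; do
        kill "$p" 2>/dev/null
        wait "$p" 2>/dev/null
    done
    pids=""
}
```

With these helpers, a Script1-style load could be started with repeated calls such as `start_loop systemd-run --scope -q sleep 0.5`, and `trap stop_loops EXIT INT TERM` would stop the loops when the test shell exits.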

Partial-Bug: 2016028

Signed-off-by: Jim Somerville <jim.somerville@windriver.com>
Change-Id: I6ad64cd7c90f756c6eb904065febfeb516e73009
2023-04-25 17:51:09 +00:00


From 51d9e2881717f650b762c322ddf484e9928e8355 Mon Sep 17 00:00:00 2001
From: Hou Tao <houtao1@huawei.com>
Date: Tue, 28 Sep 2021 22:07:50 +0800
Subject: [PATCH] kernfs: also call kernfs_set_rev() for positive dentry

A KMSAN warning is reported by Alexander Potapenko:

BUG: KMSAN: uninit-value in kernfs_dop_revalidate+0x61f/0x840 fs/kernfs/dir.c:1053
 kernfs_dop_revalidate+0x61f/0x840 fs/kernfs/dir.c:1053
 d_revalidate fs/namei.c:854
 lookup_dcache fs/namei.c:1522
 __lookup_hash+0x3a6/0x590 fs/namei.c:1543
 filename_create+0x312/0x7c0 fs/namei.c:3657
 do_mkdirat+0x103/0x930 fs/namei.c:3900
 __do_sys_mkdir fs/namei.c:3931
 __se_sys_mkdir fs/namei.c:3929
 __x64_sys_mkdir+0xda/0x120 fs/namei.c:3929
 do_syscall_x64 arch/x86/entry/common.c:51

It seems a positive dentry in kernfs becomes a negative dentry directly
through d_delete() in vfs_rmdir(). dentry->d_time is uninitialized
when it is accessed in kernfs_dop_revalidate(), because it is only
initialized when the dentry is created as a negative dentry in
kernfs_iop_lookup().

The problem can be reproduced by the following command:

  cd /sys/fs/cgroup/pids && mkdir hi && stat hi && rmdir hi && stat hi
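
The one-line reproducer above can be sketched as a small shell function that reports which step failed. This is an illustrative sketch only: the function name repro_stale_dentry and its two optional arguments are assumptions, not part of the upstream patch. It only exercises the lookup sequence. The KMSAN report itself would appear in the kernel log of a KMSAN-enabled kernel, which this sketch does not check.

```shell
#!/bin/sh
# Hypothetical helper (not from the upstream patch): repeat the
# mkdir/stat/rmdir/stat sequence under a given parent directory.
# Usage: repro_stale_dentry [PARENT_DIR] [NAME]
repro_stale_dentry() {
    parent=${1:-/sys/fs/cgroup/pids}
    name=${2:-hi}
    target="$parent/$name"

    mkdir "$target" || return 2            # create the directory
    stat "$target" >/dev/null || return 2  # look it up while it exists
    rmdir "$target" || return 2            # vfs_rmdir() path from the commit message
    # Second lookup, now of a removed name; it is expected to fail
    # because the directory no longer exists.
    if stat "$target" >/dev/null 2>&1; then
        echo "unexpected: $target still exists"
        return 1
    fi
    echo "ok: $target is gone after rmdir"
    return 0
}
```

Calling `repro_stale_dentry` with no arguments runs the sequence in /sys/fs/cgroup/pids, as in the original command, and typically needs root. Passing another directory, for example a scratch directory, checks the helper itself without touching kernfs.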

A simple fix seems to be initializing d->d_time for positive dentry
in kernfs_iop_lookup() as well. The downside is that the negative dentry
will be revalidated again after it becomes negative in d_delete(),
because the revision of its parent must have been increased due to
its removal.

An alternative solution is to implement .d_iput for kernfs, and assign
d_time for the newly generated negative dentry in it. But we may need to
take kernfs_rwsem to protect against a concurrent kernfs_link_sibling()
on the parent directory, which is a little overkill. So the simple
fix is chosen.

Link: https://marc.info/?l=linux-fsdevel&m=163249838610499
Fixes: c7e7c04274b1 ("kernfs: use VFS negative dentry caching")
Reported-by: Alexander Potapenko <glider@google.com>
Signed-off-by: Hou Tao <houtao1@huawei.com>
Link: https://lore.kernel.org/r/20210928140750.1274441-1-houtao1@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Jim Somerville <jim.somerville@windriver.com>
---
fs/kernfs/dir.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
index 51b35d6f9739..dd7ef74ee0ff 100644
--- a/fs/kernfs/dir.c
+++ b/fs/kernfs/dir.c
@@ -1124,8 +1124,13 @@ static struct dentry *kernfs_iop_lookup(struct inode *dir,
 		if (!inode)
 			inode = ERR_PTR(-ENOMEM);
 	}
-	/* Needed only for negative dentry validation */
-	if (!inode)
+	/*
+	 * Needed for negative dentry validation.
+	 * The negative dentry can be created in kernfs_iop_lookup()
+	 * or transforms from positive dentry in dentry_unlink_inode()
+	 * called from vfs_rmdir().
+	 */
+	if (!IS_ERR(inode))
 		kernfs_set_rev(parent, dentry);
 	up_read(&kernfs_rwsem);
--
2.25.1