
This is a backport of a collection of 12 upstream patches.
The main one is the switch to use a rwsem instead of a mutex.
The next most important one is the switch of the rwsem to be
a per-filesystem lock instead of a global one.

See the individual patches for details.

They did not require much work or wiggling to get them applied.
They all come from Linus' tree and are easily located. As such,
I have not modified their individual headers with upstream commit
ids.

Verification:
- two scripts, the concept behind them supplied by Vefa Bicakci.
  The first one causes a lot of concurrent contention in sysfs.
  The second script highlights how heavily systemd is also contending.
  Run Script1 followed by Script2.
  Without this change, Script2 has timeouts and fails.

  Script1:
    for i in `seq 20`; do
        (while :; do
            find /sys/fs/cgroup/ -type f -readable -print0 \
                2>/dev/null | xargs -0 -n 20 -r cat >&/dev/null ;
        done) &
    done
    for i in `seq 10`; do
        (while :; do
            systemd-run --scope -q sleep 0.5 >/dev/null;
        done) &
    done

  Script2:
    while true; do
        date -Is
        /usr/bin/time -f %e systemctl enable -q lighttpd.service || break
        /usr/bin/time -f %e systemctl disable -q lighttpd.service || break
        /usr/bin/time -f %e systemctl restart -q lighttpd.service || break
        sleep 0.5 || break
    done

- also soak testing to ensure that these patches don't introduce issues

Partial-Bug: 2016028

Signed-off-by: Jim Somerville <jim.somerville@windriver.com>
Change-Id: I6ad64cd7c90f756c6eb904065febfeb516e73009
From 51d9e2881717f650b762c322ddf484e9928e8355 Mon Sep 17 00:00:00 2001
From: Hou Tao <houtao1@huawei.com>
Date: Tue, 28 Sep 2021 22:07:50 +0800
Subject: [PATCH] kernfs: also call kernfs_set_rev() for positive dentry

A KMSAN warning is reported by Alexander Potapenko:

BUG: KMSAN: uninit-value in kernfs_dop_revalidate+0x61f/0x840
fs/kernfs/dir.c:1053
 kernfs_dop_revalidate+0x61f/0x840 fs/kernfs/dir.c:1053
 d_revalidate fs/namei.c:854
 lookup_dcache fs/namei.c:1522
 __lookup_hash+0x3a6/0x590 fs/namei.c:1543
 filename_create+0x312/0x7c0 fs/namei.c:3657
 do_mkdirat+0x103/0x930 fs/namei.c:3900
 __do_sys_mkdir fs/namei.c:3931
 __se_sys_mkdir fs/namei.c:3929
 __x64_sys_mkdir+0xda/0x120 fs/namei.c:3929
 do_syscall_x64 arch/x86/entry/common.c:51

It seems a positive dentry in kernfs becomes a negative dentry directly
through d_delete() in vfs_rmdir(). dentry->d_time is uninitialized
when accessing it in kernfs_dop_revalidate(), because it is only
initialized when created as a negative dentry in kernfs_iop_lookup().

The problem can be reproduced by the following command:

  cd /sys/fs/cgroup/pids && mkdir hi && stat hi && rmdir hi && stat hi

A simple fix seems to be initializing d->d_time for positive dentries
in kernfs_iop_lookup() as well. The downside is that the dentry
will be revalidated again after it becomes negative in d_delete(),
because the revision of its parent must have been increased due to
its removal.

An alternative solution is to implement .d_iput for kernfs and assign
d_time for the newly generated negative dentry in it. But we may need to
take kernfs_rwsem to protect against a concurrent kernfs_link_sibling()
on the parent directory, which is a little overkill. So the simple
fix is chosen.

Link: https://marc.info/?l=linux-fsdevel&m=163249838610499
Fixes: c7e7c04274b1 ("kernfs: use VFS negative dentry caching")
Reported-by: Alexander Potapenko <glider@google.com>
Signed-off-by: Hou Tao <houtao1@huawei.com>
Link: https://lore.kernel.org/r/20210928140750.1274441-1-houtao1@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Jim Somerville <jim.somerville@windriver.com>
---
 fs/kernfs/dir.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
index 51b35d6f9739..dd7ef74ee0ff 100644
--- a/fs/kernfs/dir.c
+++ b/fs/kernfs/dir.c
@@ -1124,8 +1124,13 @@ static struct dentry *kernfs_iop_lookup(struct inode *dir,
 		if (!inode)
 			inode = ERR_PTR(-ENOMEM);
 	}
-	/* Needed only for negative dentry validation */
-	if (!inode)
+	/*
+	 * Needed for negative dentry validation.
+	 * The negative dentry can be created in kernfs_iop_lookup()
+	 * or transforms from positive dentry in dentry_unlink_inode()
+	 * called from vfs_rmdir().
+	 */
+	if (!IS_ERR(inode))
 		kernfs_set_rev(parent, dentry);
 	up_read(&kernfs_rwsem);
 
--
2.25.1
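The effect of the change can be illustrated with a small userspace model of the sequence in the reproducer (lookup of an existing directory, rmdir, then a second lookup). This is only a sketch under stated assumptions: every `model_*` name and struct below is hypothetical and is not kernel code, and the `time_valid` flag exists solely to make "d_time was never written" observable, which the real dentry cannot express.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the parent directory's revision counter. */
struct model_dir {
	unsigned long rev;
};

/* Hypothetical stand-in for a dentry; time_valid is a model-only flag. */
struct model_dentry {
	bool positive;        /* true while the dentry has an inode */
	bool time_valid;      /* true once d_time has been written */
	unsigned long d_time; /* parent revision recorded at lookup */
};

/* Models the role of kernfs_set_rev(): stamp the parent's revision. */
static void model_set_rev(const struct model_dir *parent,
			  struct model_dentry *d)
{
	d->d_time = parent->rev;
	d->time_valid = true;
}

/*
 * Models the lookup path. With stamp_positive == false only negative
 * dentries are stamped (behaviour before the patch); with true, positive
 * dentries are stamped as well (behaviour after the patch).
 */
static void model_lookup(const struct model_dir *parent,
			 struct model_dentry *d, bool found,
			 bool stamp_positive)
{
	d->positive = found;
	d->time_valid = false;
	d->d_time = 0;
	if (!found || stamp_positive)
		model_set_rev(parent, d);
}

/* Models rmdir: the dentry turns negative in place, the parent changes. */
static void model_rmdir(struct model_dir *parent, struct model_dentry *d)
{
	d->positive = false;
	parent->rev++;
}

/*
 * Models revalidation of a negative dentry.
 * Returns 1 if still valid, 0 if stale (a fresh lookup is needed),
 * and -1 if the stamp was never initialised (the KMSAN case).
 */
static int model_revalidate_negative(const struct model_dir *parent,
				     const struct model_dentry *d)
{
	if (!d->time_valid)
		return -1;
	return d->d_time == parent->rev ? 1 : 0;
}

/* Replays "mkdir hi && stat hi && rmdir hi && stat hi" in the model. */
int model_scenario(bool stamp_positive)
{
	struct model_dir parent = { .rev = 0 };
	struct model_dentry hi;

	model_lookup(&parent, &hi, true, stamp_positive);
	model_rmdir(&parent, &hi);
	return model_revalidate_negative(&parent, &hi);
}
```

In this model, `model_scenario(false)` returns -1 because the stamp is read without ever having been written, while `model_scenario(true)` returns 0: the stamp is initialised but stale, so the dentry is simply looked up again, which corresponds to the downside the commit message accepts.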