kernel/kernel-std/debian/patches/0044-kernfs-use-i_lock-to-protect-concurrent-inode-update.patch
Jim Somerville 63492b8ddd Reduce kernfs_mutex contention
This is a backport of a collection of 12 upstream patches.
The main one switches kernfs from a mutex to a rwsem.
The next most important one makes that rwsem a per-filesystem
lock instead of a global one.

See the individual patches for details.  They did not require
much work or wiggling to get them applied.

They all come from Linus' tree and are easily located.  As such
I have not modified their individual headers with upstream
commit ids.

Verification:
- Two scripts, based on a concept supplied by Vefa Bicakci.
The first one generates heavy concurrent contention in sysfs.
The second one shows how systemd fares while contending for the same lock.
Run Script1, followed by Script2.

Without this change, Script2 has timeouts and fails.

Script1:
for i in `seq 20`; do
  (while :; do find /sys/fs/cgroup/ -type f -readable -print0 \
    2>/dev/null | xargs -0 -n 20 -r cat >&/dev/null ; done) &
done

for i in `seq 10`; do
  (while :; do systemd-run --scope -q sleep 0.5 >/dev/null; done) &
done

Script2:
while true; do
        date -Is
        /usr/bin/time -f %e systemctl enable  -q lighttpd.service || break
        /usr/bin/time -f %e systemctl disable -q lighttpd.service || break
        /usr/bin/time -f %e systemctl restart -q lighttpd.service || break
        sleep 0.5 || break
done

- Soak testing, to ensure that these patches do not introduce issues.

Partial-Bug: 2016028

Signed-off-by: Jim Somerville <jim.somerville@windriver.com>
Change-Id: I6ad64cd7c90f756c6eb904065febfeb516e73009
2023-04-25 17:51:09 +00:00

From 1ba2bcb994271391f6719d76987a4eac8b91f625 Mon Sep 17 00:00:00 2001
From: Ian Kent <raven@themaw.net>
Date: Fri, 16 Jul 2021 17:28:34 +0800
Subject: [PATCH] kernfs: use i_lock to protect concurrent inode updates

The inode operations .permission() and .getattr() use the kernfs node
write lock but all that's needed is the read lock to protect against
partial updates of these kernfs node fields which are all done under
the write lock.

And .permission() is called frequently during path walks and can cause
quite a bit of contention between kernfs node operations and path
walks when the number of concurrent walks is high.

To change kernfs_iop_getattr() and kernfs_iop_permission() to take
the rw sem read lock instead of the write lock an additional lock is
needed to protect against multiple processes concurrently updating
the inode attributes and link count in kernfs_refresh_inode().

The inode i_lock seems like the sensible thing to use to protect these
inode attribute updates so use it in kernfs_refresh_inode().

The last hunk in the patch, applied to kernfs_fill_super(), is possibly
not needed but taking the lock was present originally. I prefer to
continue to take it to protect against a partial update of the source
kernfs fields during the call to kernfs_refresh_inode() made by
kernfs_get_inode().

Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Ian Kent <raven@themaw.net>
Link: https://lore.kernel.org/r/162642771474.63632.16295959115893904470.stgit@web.messagingengine.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Jim Somerville <jim.somerville@windriver.com>
---
 fs/kernfs/inode.c | 18 ++++++++++++------
 fs/kernfs/mount.c |  4 ++--
 2 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/fs/kernfs/inode.c b/fs/kernfs/inode.c
index ddaf18198935..73d7d4a24c51 100644
--- a/fs/kernfs/inode.c
+++ b/fs/kernfs/inode.c
@@ -189,11 +189,13 @@ int kernfs_iop_getattr(const struct path *path, struct kstat *stat,
 	struct inode *inode = d_inode(path->dentry);
 	struct kernfs_node *kn = inode->i_private;
 
-	down_write(&kernfs_rwsem);
+	down_read(&kernfs_rwsem);
+	spin_lock(&inode->i_lock);
 	kernfs_refresh_inode(kn, inode);
-	up_write(&kernfs_rwsem);
-
 	generic_fillattr(inode, stat);
+	spin_unlock(&inode->i_lock);
+	up_read(&kernfs_rwsem);
+
 	return 0;
 }
 
@@ -275,17 +277,21 @@ void kernfs_evict_inode(struct inode *inode)
 int kernfs_iop_permission(struct inode *inode, int mask)
 {
 	struct kernfs_node *kn;
+	int ret;
 
 	if (mask & MAY_NOT_BLOCK)
 		return -ECHILD;
 
 	kn = inode->i_private;
 
-	down_write(&kernfs_rwsem);
+	down_read(&kernfs_rwsem);
+	spin_lock(&inode->i_lock);
 	kernfs_refresh_inode(kn, inode);
-	up_write(&kernfs_rwsem);
+	ret = generic_permission(inode, mask);
+	spin_unlock(&inode->i_lock);
+	up_read(&kernfs_rwsem);
 
-	return generic_permission(inode, mask);
+	return ret;
 }
 
 int kernfs_xattr_get(struct kernfs_node *kn, const char *name,
diff --git a/fs/kernfs/mount.c b/fs/kernfs/mount.c
index baa4155ba2ed..f2f909d09f52 100644
--- a/fs/kernfs/mount.c
+++ b/fs/kernfs/mount.c
@@ -255,9 +255,9 @@ static int kernfs_fill_super(struct super_block *sb, struct kernfs_fs_context *k
 	sb->s_shrink.seeks = 0;
 
 	/* get root inode, initialize and unlock it */
-	down_write(&kernfs_rwsem);
+	down_read(&kernfs_rwsem);
 	inode = kernfs_get_inode(sb, info->root->kn);
-	up_write(&kernfs_rwsem);
+	up_read(&kernfs_rwsem);
 	if (!inode) {
 		pr_debug("kernfs: could not get root inode\n");
 		return -ENOMEM;
--
2.25.1