
We received a bug report indicating that the "Dirty" field in
/proc/meminfo was increasing without bounds, to the point that the
number of dirty file pages would eventually reach the limit enforced by
the vm.dirty_bytes sysctl (set to 800_000_000 bytes in StarlingX),
causing any task attempting disk I/O to block.

Upon further debugging, we noticed that this issue occurred on nohz_full
CPUs where a user application was carrying out disk I/O by writing to
and rotating log files. The issue was very reliably reproducible with
the preempt-rt patch set.

This commit addresses the issue by reverting commit 62cb1188ed86
("sched/idle: Move quiet_vmstate() into the NOHZ code"), which was
merged in the v4.15-rc1 time frame. The revert, in effect, moves the
quiet_vmstat function call from hard IRQ context back to the start of
the idle loop. Please see the patch description for a more detailed
overview.

Note that this commit does not introduce a "novel" change: the
4.14.298-rt140 kernel, released on 2022-11-04, does not have the
reverted commit either, which should preclude the need for regression
testing in terms of functionality and performance.

I would like to acknowledge the extensive help and guidance provided by
Jim Somerville <jim.somerville@windriver.com> during the debugging and
investigation of this issue.

Verification
- The issue was reproduced on an older CentOS-based StarlingX system
  running a StarlingX/linux-yocto preempt-rt kernel based on
  v5.10.112-rt61, by running a test application for about 4 to 5 hours.
  In this configuration, the issue becomes apparent within about 1 hour,
  when the Dirty field in /proc/meminfo reaches the
  vm.dirty_background_bytes threshold (set to 600_000_000 bytes in
  StarlingX). By the end of the test, the Dirty field was very close to
  the vm.dirty_bytes threshold (800_000_000 bytes).

  Afterwards, a kernel patched with this commit was found to no longer
  reproduce the issue when running the same test application for ~12.5
  hours. (Note that the second test had Meltdown/Spectre mitigations
  enabled by accident, but we are confident that this does not affect
  the test results.) The Dirty value in /proc/meminfo stayed around
  180_000 KiB for the duration of the test. A test re-run with the
  Meltdown/Spectre mitigations disabled, for a duration of 1.75 hours,
  had similar results.

  The test application that reproduces this issue writes to and rotates
  log files rapidly, with a usleep(0) call between every log file
  rotation. The issue reproduces on nohz_full CPUs, and does so more
  reliably with the preempt-rt kernel.
- A Debian-based StarlingX ISO image was successfully built with this
  commit.
- The ISO image was successfully installed into a qemu/KVM-based virtual
  machine using the All-in-One Simplex, low-latency profile, and the
  Ansible bootstrap procedure was successful.
- The issue was confirmed to no longer exist with this commit, by
  running multiple concurrent instances of a simplified test application
  for about 30 minutes on the installation made from the Debian-based
  StarlingX ISO image built with this commit. Without a patched kernel,
  the issue becomes apparent within 10 minutes of test runtime in this
  configuration.

Closes-Bug: 2002039
Change-Id: I818d8bd751f4b1941a26530a99a4a635e98d5c54
Signed-off-by: M. Vefa Bicakci <vefa.bicakci@windriver.com>
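
The Dirty readings quoted in the verification notes can be collected
with a small monitor. The following is a minimal sketch and not the tool
used during the investigation: the function names are hypothetical, the
two thresholds are the StarlingX values quoted above, and comparing the
raw Dirty value against them is only a rough indicator of how the
kernel applies these limits.

```python
import re

# Thresholds quoted in the commit message (StarlingX sysctl settings).
DIRTY_BACKGROUND_BYTES = 600_000_000  # vm.dirty_background_bytes
DIRTY_BYTES = 800_000_000             # vm.dirty_bytes


def parse_dirty_kib(meminfo_text):
    """Return the 'Dirty' field, in KiB, from the text of /proc/meminfo."""
    # /proc/meminfo labels the unit "kB", but the value is in units of 1024 bytes.
    match = re.search(r"^Dirty:\s+(\d+)\s+kB$", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("no Dirty field found")
    return int(match.group(1))


def classify(dirty_kib):
    """Roughly place a Dirty reading relative to the two thresholds."""
    dirty_bytes = dirty_kib * 1024
    if dirty_bytes >= DIRTY_BYTES:
        return "at-or-above-dirty_bytes"
    if dirty_bytes >= DIRTY_BACKGROUND_BYTES:
        return "above-background"
    return "below-background"


def read_dirty_kib(path="/proc/meminfo"):
    """Read the current Dirty value (KiB) from a Linux system."""
    with open(path) as f:
        return parse_dirty_kib(f.read())
```

Calling classify(read_dirty_kib()) at a fixed interval while the test
application runs would show whether Dirty keeps climbing toward the
thresholds or stays flat, as it did with the patched kernel.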
0001-Notification-of-death-of-arbitrary-processes.patch
0002-PCI-Add-ACS-quirk-for-Intel-Fortville-NICs.patch
0003-affine-compute-kernel-threads.patch
0004-Affine-irqs-and-workqueues-with-kthread_cpus.patch
0005-Make-kernel-start-eth-devices-at-offset.patch
0006-intel-iommu-allow-ignoring-Ethernet-device-RMRR-with.patch
0007-turn-off-write-same-in-smartqpi-driver.patch
0008-Allow-dmar-quirks-for-broken-bioses.patch
0009-tpm-ignore-burstcount-to-improve-tpm_tis-send-perfor.patch
0010-bpf-cgroups-Fix-cgroup-v2-fallback-on-v1-v2-mixed-mo.patch
0011-scsi-smartpqi-Enable-sas_address-sysfs-for-SATA-dev.patch
0012-workqueue-Affine-rescuer-threads-and-unbound-wqs.patch
0015-Revert-scsi-sd-Inline-sd_probe_part2.patch
0016-Revert-commit-f049cf1a7b.patch
0017-genirq-Export-affinity-setter-for-modules.patch
0018-genirq-Provide-new-interfaces-for-affinity-hints.patch
0019-ixgbe-Use-irq_update_affinity_hint.patch
0020-Add-auxiliary-bus-support.patch
0021-driver-core-auxiliary-bus-move-slab.h-from-include-f.patch
0022-driver-core-auxiliary-bus-make-remove-function-retur.patch
0023-driver-core-auxiliary-bus-minor-coding-style-tweaks.patch
0024-driver-core-auxiliary-bus-Fix-auxiliary-bus-shutdown.patch
0025-driver-core-auxiliary-bus-Fix-calling-stage-for-auxi.patch
0026-driver-core-auxiliary-bus-Remove-unneeded-module-bit.patch
0027-driver-core-auxiliary-bus-Fix-memory-leak-when-drive.patch
0028-driver-core-auxiliary-bus-Enable-by-default.patch
0029-Enable-CONFIG_PAGE_POOL-by-default.patch
0030-x86-Enumerate-AVX512-FP16-CPUID-feature-flag.patch
0031-KVM-x86-Expose-AVX512_FP16-for-supported-CPUID.patch
0032-tools-headers-cpufeatures-Sync-with-the-kernel-sourc.patch
0033-rcu-Avoid-running-boost-kthreads-on-isolated-CPUs.patch
0034-xfs-drop-submit-side-trans-alloc-for-append-ioends.patch
0035-xfs-open-code-ioend-needs-workqueue-helper.patch
0036-xfs-drop-unused-ioend-private-merge-and-setfilesize-.patch
0037-xfs-drop-unnecessary-setfilesize-helper.patch
0038-samples-bpf-use-kprobe-and-urandom_read_iter.patch
0039-Revert-sched-idle-Move-quiet_vmstate-into-the-NOHZ-c.patch