
When the system is stressed by running pods on isolated cores (using
stress-ng [1], for instance) while the Power Metrics App [2] is also
running, the system hangs.

[1] https://github.com/ColinIanKing/stress-ng
[2] https://opendev.org/starlingx/app-power-metrics

Dmesg shows the following output:

WARNING: CPU: 16 PID: 207561 at kernel/events/core.c:868 perf_cgroup_switch+0x222/0x230
RIP: 0010:perf_cgroup_switch+0x222/0x230
Call Trace:
 ? __warn+0x79/0xc0
 ? perf_cgroup_switch+0x222/0x230
 ? report_bug+0x9e/0xc0
 ? handle_bug+0x41/0x90
 ? exc_invalid_op+0x14/0x70
 ? asm_exc_invalid_op+0x12/0x20
 ? perf_cgroup_switch+0x222/0x230
 ? perf_cgroup_switch+0xff/0x230
 __perf_event_task_sched_in+0x169/0x330
 ? __perf_event_task_sched_out+0x27c/0x6d0
 ? newidle_balance+0x3fd/0x480
 finish_task_switch.isra.0+0x118/0x4b0
 __schedule+0x2ae/0x930
 ? hrtimer_start_range_ns+0x2fc/0x420
 schedule+0xa7/0x110
 do_nanosleep+0x7c/0x1a0
 hrtimer_nanosleep+0x9b/0x140
 ? __hrtimer_init+0xe0/0xe0
 __x64_sys_nanosleep+0xad/0xe0
 do_syscall_64+0x30/0x40
 entry_SYSCALL_64_after_hwframe+0x61/0xc6

There is an upstream patch set that fixes a race condition in
perf_cgroup_switch. Applying these patches to the stx kernel solved
the issue.

* commit a0827713e298 ("perf/core: Don't pass task around when ctx
  sched in") (v5.18-rc2~8^2~3)
* commit 6875186aea5c ("perf/core: Use perf_cgroup_info->active to
  check if cgroup is active") (v5.18-rc2~8^2~2)
* commit 96492a6c558a ("perf/core: Fix perf_cgroup_switch()")
  (v5.18-rc2~8^2~1)
* commit e19cd0b6fa59 ("perf/core: Always set cpuctx cgrp when enable
  cgroup event") (v5.18-rc2~8^2)

Note: It was verified that there are no "fixes" commits in the mainline
kernel for the commits mentioned above.

Test plan:
PASS: Build ISO successfully for rt and std.
PASS: Install successfully onto an AIO-SX lab with both rt and std kernels.
PASS: Apply the power-metrics app, launch stress pods and confirm the
      system is stable.

Closes-Bug: 2035124
Change-Id: I30fcb63e4564a23cdb26794f4dfefa748eaa0cee
Signed-off-by: Alyson Deives Pereira <alyson.deivespereira@windriver.com>
From 38afc590e9d4a5b428d28da4b530be5799331767 Mon Sep 17 00:00:00 2001
From: Chengming Zhou <zhouchengming@bytedance.com>
Date: Tue, 29 Mar 2022 23:45:23 +0800
Subject: [PATCH 74/74] perf/core: Always set cpuctx cgrp when enable cgroup
 event

When enable a cgroup event, cpuctx->cgrp setting is conditional
on the current task cgrp matching the event's cgroup, so have to
do it for every new event. It brings complexity but no advantage.

To keep it simple, this patch would always set cpuctx->cgrp
when enable the first cgroup event, and reset to NULL when disable
the last cgroup event.

Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220329154523.86438-5-zhouchengming@bytedance.com
(cherry picked from commit e19cd0b6fa5938c51d7b928010d584f0de93913a)
Signed-off-by: Alyson Deives Pereira <alyson.deivespereira@windriver.com>
---
 kernel/events/core.c | 18 ++----------------
 1 file changed, 2 insertions(+), 16 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 5f0265a4809d..807375faaa98 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -966,22 +966,10 @@ perf_cgroup_event_enable(struct perf_event *event, struct perf_event_context *ct
 	 */
 	cpuctx = container_of(ctx, struct perf_cpu_context, ctx);
 
-	/*
-	 * Since setting cpuctx->cgrp is conditional on the current @cgrp
-	 * matching the event's cgroup, we must do this for every new event,
-	 * because if the first would mismatch, the second would not try again
-	 * and we would leave cpuctx->cgrp unset.
-	 */
-	if (ctx->is_active && !cpuctx->cgrp) {
-		struct perf_cgroup *cgrp = perf_cgroup_from_task(current, ctx);
-
-		if (cgroup_is_descendant(cgrp->css.cgroup, event->cgrp->css.cgroup))
-			cpuctx->cgrp = cgrp;
-	}
-
 	if (ctx->nr_cgroups++)
 		return;
 
+	cpuctx->cgrp = perf_cgroup_from_task(current, ctx);
 	list_add(&cpuctx->cgrp_cpuctx_entry,
 			per_cpu_ptr(&cgrp_cpuctx_list, event->cpu));
 }
@@ -1003,9 +991,7 @@ perf_cgroup_event_disable(struct perf_event *event, struct perf_event_context *c
 	if (--ctx->nr_cgroups)
 		return;
 
-	if (ctx->is_active && cpuctx->cgrp)
-		cpuctx->cgrp = NULL;
-
+	cpuctx->cgrp = NULL;
 	list_del(&cpuctx->cgrp_cpuctx_entry);
 }
 
--
2.25.1
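
The rule this patch introduces (set cpuctx->cgrp when the first cgroup event is enabled, reset it to NULL when the last one is disabled) can be illustrated with a small userspace model. This is only a sketch under stated assumptions: struct model_cgroup, struct model_cpuctx, model_event_enable() and model_event_disable() are hypothetical stand-ins invented for illustration, not kernel definitions, and the model ignores locking, the per-CPU list handling and the cgroup switching done in perf_cgroup_switch().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct perf_cgroup (illustration only). */
struct model_cgroup {
	const char *name;
};

/* Hypothetical stand-in for the fields of the CPU context touched here. */
struct model_cpuctx {
	int nr_cgroups;                   /* number of enabled cgroup events */
	const struct model_cgroup *cgrp;  /* models cpuctx->cgrp */
};

/*
 * Models the patched perf_cgroup_event_enable(): only the first
 * enabled cgroup event sets cgrp, and it does so unconditionally.
 */
static void model_event_enable(struct model_cpuctx *cpuctx,
			       const struct model_cgroup *current_cgrp)
{
	if (cpuctx->nr_cgroups++)
		return; /* not the first event, cgrp is already set */

	cpuctx->cgrp = current_cgrp;
}

/*
 * Models the patched perf_cgroup_event_disable(): only the last
 * disabled cgroup event clears cgrp, and it does so unconditionally.
 */
static void model_event_disable(struct model_cpuctx *cpuctx)
{
	assert(cpuctx->nr_cgroups > 0);

	if (--cpuctx->nr_cgroups)
		return; /* other cgroup events are still enabled */

	cpuctx->cgrp = NULL;
}
```

In this model, cgrp is non-NULL exactly while nr_cgroups is greater than zero, which is the invariant the simplified enable and disable paths are meant to maintain.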