kernel/kernel-std/debian/patches/0074-perf-core-Always-set-cpuctx-cgrp-when-enable-cgroup-.patch
Alyson Deives Pereira 2f15f5cb6c perf/core: Fix perf_cgroup_switch()
When the system is stressed running pods on isolated cores (using
stress-ng for instance [1]) and the Power Metrics App [2] is also
being executed, the system hangs.

[1] https://github.com/ColinIanKing/stress-ng
[2] https://opendev.org/starlingx/app-power-metrics

Dmesg shows the following output:
WARNING: CPU: 16 PID: 207561 at
  kernel/events/core.c:868 perf_cgroup_switch+0x222/0x230
RIP: 0010:perf_cgroup_switch+0x222/0x230
Call Trace:
 ? __warn+0x79/0xc0
 ? perf_cgroup_switch+0x222/0x230
 ? report_bug+0x9e/0xc0
 ? handle_bug+0x41/0x90
 ? exc_invalid_op+0x14/0x70
 ? asm_exc_invalid_op+0x12/0x20
 ? perf_cgroup_switch+0x222/0x230
 ? perf_cgroup_switch+0xff/0x230
 __perf_event_task_sched_in+0x169/0x330
 ? __perf_event_task_sched_out+0x27c/0x6d0
 ? newidle_balance+0x3fd/0x480
 finish_task_switch.isra.0+0x118/0x4b0
 __schedule+0x2ae/0x930
 ? hrtimer_start_range_ns+0x2fc/0x420
 schedule+0xa7/0x110
 do_nanosleep+0x7c/0x1a0
 hrtimer_nanosleep+0x9b/0x140
 ? __hrtimer_init+0xe0/0xe0
 __x64_sys_nanosleep+0xad/0xe0
 do_syscall_64+0x30/0x40
 entry_SYSCALL_64_after_hwframe+0x61/0xc6

There is an upstream patch set that fixes a race condition in
perf_cgroup_switch(). Applying these patches to the stx kernel resolved
the issue.

* commit a0827713e298
  ("perf/core: Don't pass task around when ctx sched in")
  (v5.18-rc2~8^2~3)

* commit 6875186aea5c
  ("perf/core: Use perf_cgroup_info->active to check if cgroup is
  active") (v5.18-rc2~8^2~2)

* commit 96492a6c558a
  ("perf/core: Fix perf_cgroup_switch()") (v5.18-rc2~8^2~1)

* commit e19cd0b6fa59
  ("perf/core: Always set cpuctx cgrp when enable cgroup event")
  (v5.18-rc2~8^2)

Note: It was verified that there are no "fixes" commits in the mainline
kernel for the commits mentioned above.

Test plan:
PASS: Build iso success for rt and std.
PASS: Install success onto an AIO-SX lab with both rt and std kernel.
PASS: Apply power-metrics app, launch stress pods and confirm the
      system is stable.

Closes-Bug: 2035124
Change-Id: I30fcb63e4564a23cdb26794f4dfefa748eaa0cee
Signed-off-by: Alyson Deives Pereira <alyson.deivespereira@windriver.com>
2023-09-13 10:16:15 -03:00


From 38afc590e9d4a5b428d28da4b530be5799331767 Mon Sep 17 00:00:00 2001
From: Chengming Zhou <zhouchengming@bytedance.com>
Date: Tue, 29 Mar 2022 23:45:23 +0800
Subject: [PATCH 74/74] perf/core: Always set cpuctx cgrp when enable cgroup
 event

When enable a cgroup event, cpuctx->cgrp setting is conditional
on the current task cgrp matching the event's cgroup, so have to
do it for every new event. It brings complexity but no advantage.

To keep it simple, this patch would always set cpuctx->cgrp
when enable the first cgroup event, and reset to NULL when disable
the last cgroup event.

Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220329154523.86438-5-zhouchengming@bytedance.com
(cherry picked from commit e19cd0b6fa5938c51d7b928010d584f0de93913a)
Signed-off-by: Alyson Deives Pereira <alyson.deivespereira@windriver.com>
---
 kernel/events/core.c | 18 ++----------------
 1 file changed, 2 insertions(+), 16 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 5f0265a4809d..807375faaa98 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -966,22 +966,10 @@ perf_cgroup_event_enable(struct perf_event *event, struct perf_event_context *ct
 	 */
 	cpuctx = container_of(ctx, struct perf_cpu_context, ctx);
 
-	/*
-	 * Since setting cpuctx->cgrp is conditional on the current @cgrp
-	 * matching the event's cgroup, we must do this for every new event,
-	 * because if the first would mismatch, the second would not try again
-	 * and we would leave cpuctx->cgrp unset.
-	 */
-	if (ctx->is_active && !cpuctx->cgrp) {
-		struct perf_cgroup *cgrp = perf_cgroup_from_task(current, ctx);
-
-		if (cgroup_is_descendant(cgrp->css.cgroup, event->cgrp->css.cgroup))
-			cpuctx->cgrp = cgrp;
-	}
-
 	if (ctx->nr_cgroups++)
 		return;
 
+	cpuctx->cgrp = perf_cgroup_from_task(current, ctx);
 	list_add(&cpuctx->cgrp_cpuctx_entry,
 			per_cpu_ptr(&cgrp_cpuctx_list, event->cpu));
 }
@@ -1003,9 +991,7 @@ perf_cgroup_event_disable(struct perf_event *event, struct perf_event_context *c
 	if (--ctx->nr_cgroups)
 		return;
 
-	if (ctx->is_active && cpuctx->cgrp)
-		cpuctx->cgrp = NULL;
-
+	cpuctx->cgrp = NULL;
 	list_del(&cpuctx->cgrp_cpuctx_entry);
 }
 
--
2.25.1
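
The bookkeeping that the patch above leaves in place can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: the types `model_cgrp` and `model_cpuctx`, the `on_list` flag, and all `model_*` functions are hypothetical stand-ins for `struct perf_cgroup`, `struct perf_cpu_context`, membership in `cgrp_cpuctx_list`, and the patched `perf_cgroup_event_enable()` / `perf_cgroup_event_disable()`. It shows that, with the unconditional assignment, the modelled `cgrp` pointer is set exactly while at least one cgroup event is enabled.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct perf_cgroup; not the kernel definition. */
struct model_cgrp {
	int id;
};

/* Hypothetical stand-in for the fields of interest in the CPU context. */
struct model_cpuctx {
	struct model_cgrp *cgrp; /* models cpuctx->cgrp */
	int nr_cgroups;          /* models ctx->nr_cgroups */
	int on_list;             /* models membership in cgrp_cpuctx_list */
};

/* Models the patched enable path: only the first cgroup event does work. */
static void model_event_enable(struct model_cpuctx *c,
			       struct model_cgrp *current_cgrp)
{
	if (c->nr_cgroups++)
		return; /* not the first cgroup event */

	/* Unconditional; stands in for perf_cgroup_from_task(current, ctx). */
	c->cgrp = current_cgrp;
	c->on_list = 1;
}

/* Models the patched disable path: only the last cgroup event does work. */
static void model_event_disable(struct model_cpuctx *c)
{
	if (--c->nr_cgroups)
		return; /* other cgroup events are still enabled */

	c->cgrp = NULL;
	c->on_list = 0;
}

/* cgrp is set and the context is listed exactly while nr_cgroups > 0. */
static int model_invariant_ok(const struct model_cpuctx *c)
{
	int active = c->nr_cgroups > 0;

	return (c->cgrp != NULL) == active && c->on_list == active;
}

/* Enables two events, disables both; returns 0 if the invariant held. */
static int model_self_check(void)
{
	struct model_cgrp task_cgrp = { 42 };
	struct model_cpuctx c = { NULL, 0, 0 };

	model_event_enable(&c, &task_cgrp);
	if (!model_invariant_ok(&c))
		return -1;
	model_event_enable(&c, &task_cgrp);
	if (!model_invariant_ok(&c))
		return -1;
	model_event_disable(&c);
	if (!model_invariant_ok(&c))
		return -1;
	model_event_disable(&c);
	if (!model_invariant_ok(&c))
		return -1;
	return 0;
}
```

Calling `model_self_check()` from a `main()` of your own exercises the enable/enable/disable/disable sequence. The model deliberately omits `ctx->is_active` and the `cgroup_is_descendant()` check, since the patch removes both from these two functions.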