kernel/kernel-rt/debian/patches/ice-dpll/0040-dpll-fix-possible-deadlock-during-netlink-dump-opera.patch
Jiping Ma 853c60e2e8 ice: back port DPLL related commits from the upstream kernel
This commit back ports the DPLL related commits from the upstream
kernel that are identified by Intel to provide the expected
SyncE/GNSS functionality.

There are totally 46 back ported commits included the four commits
I added are used to resolve the conflicts during back porting.

The 0046 patch is cherry picked from kernel-6.9.
The 0031-0045 patches are cherry picked from kernel-6.8.
The 0001-0030 patches are cherry picked from kernel-6.7.

We also change the in-tree ice driver version to 6.6.40-stx.2 from
6.6.40-stx.1.

* To fix the conflict of 91e43ca0090b ("ice: fix linking when
  CONFIG_PTP_1588_CLOCK=n"), we cherry pick 12a5a28b565b
  ("ice: remove ICE_F_PTP_EXTTS feature flag") and 89776a6a702e
  ("ice: check netlist before enabling ICE_F_GNSS").

  Adjust 12a5a28b565b because 0d1b22367ec2 ("ice: fix pin
  assignment for E810-T without SMA control") already included
  the part code of 12a5a28b565b.
  https://git.yoctoproject.org/linux-yocto/commit/?id=0d1b22367ec2
* Cherry pick 7049fd5df7 ("netlink: specs: remove redundant type keys
  from attributes in subsets") to fix the the conflict of c3c6ab95c397
  ("dpll: spec: add support for pin-dpll signal phase offset/adjust.")
* Cherry pick be16574609f1 ("ice: introduce hw->phy_model for handling
  PTP PHY differences") to fix the confilict of 6db5f2cd9ebb ("ice:
  dpll:fix output pin capabilities").

Verification:
- Build kernel and out of tree modules success for rt and std.
- Install success onto a All-in-One lab with rt kernel.
- Boot up successfully in the lab.
- interfaces are up and pass packets for rt and std.
- Check dmesg to see DDP package is loaded successfully and
  the version is 1.3.36.0 for rt and std, that is same with the OOT
  ice-1.14.9 driver.
- The SyncE/GNSS functionality tests were done by the network team.

Story: 2011056
Task: 50797

Change-Id: I715480681c7c43d53b0a0126b34135562e9d02a0
Signed-off-by: Jiping Ma <jiping.ma2@windriver.com>
2024-08-09 10:13:09 +00:00

218 lines
7.8 KiB
Diff

From e01f5b5e574aca4688e649ae69d7a00b37f95b87 Mon Sep 17 00:00:00 2001
From: Jiri Pirko <jiri@nvidia.com>
Date: Wed, 7 Feb 2024 12:59:02 +0100
Subject: [PATCH 40/46] dpll: fix possible deadlock during netlink dump
operation
Recently, I've been hitting following deadlock warning during dpll pin
dump:
[52804.637962] ======================================================
[52804.638536] WARNING: possible circular locking dependency detected
[52804.639111] 6.8.0-rc2jiri+ #1 Not tainted
[52804.639529] ------------------------------------------------------
[52804.640104] python3/2984 is trying to acquire lock:
[52804.640581] ffff88810e642678 (nlk_cb_mutex-GENERIC){+.+.}-{3:3}, at: netlink_dump+0xb3/0x780
[52804.641417]
but task is already holding lock:
[52804.642010] ffffffff83bde4c8 (dpll_lock){+.+.}-{3:3}, at: dpll_lock_dumpit+0x13/0x20
[52804.642747]
which lock already depends on the new lock.
[52804.643551]
the existing dependency chain (in reverse order) is:
[52804.644259]
-> #1 (dpll_lock){+.+.}-{3:3}:
[52804.644836] lock_acquire+0x174/0x3e0
[52804.645271] __mutex_lock+0x119/0x1150
[52804.645723] dpll_lock_dumpit+0x13/0x20
[52804.646169] genl_start+0x266/0x320
[52804.646578] __netlink_dump_start+0x321/0x450
[52804.647056] genl_family_rcv_msg_dumpit+0x155/0x1e0
[52804.647575] genl_rcv_msg+0x1ed/0x3b0
[52804.648001] netlink_rcv_skb+0xdc/0x210
[52804.648440] genl_rcv+0x24/0x40
[52804.648831] netlink_unicast+0x2f1/0x490
[52804.649290] netlink_sendmsg+0x36d/0x660
[52804.649742] __sock_sendmsg+0x73/0xc0
[52804.650165] __sys_sendto+0x184/0x210
[52804.650597] __x64_sys_sendto+0x72/0x80
[52804.651045] do_syscall_64+0x6f/0x140
[52804.651474] entry_SYSCALL_64_after_hwframe+0x46/0x4e
[52804.652001]
-> #0 (nlk_cb_mutex-GENERIC){+.+.}-{3:3}:
[52804.652650] check_prev_add+0x1ae/0x1280
[52804.653107] __lock_acquire+0x1ed3/0x29a0
[52804.653559] lock_acquire+0x174/0x3e0
[52804.653984] __mutex_lock+0x119/0x1150
[52804.654423] netlink_dump+0xb3/0x780
[52804.654845] __netlink_dump_start+0x389/0x450
[52804.655321] genl_family_rcv_msg_dumpit+0x155/0x1e0
[52804.655842] genl_rcv_msg+0x1ed/0x3b0
[52804.656272] netlink_rcv_skb+0xdc/0x210
[52804.656721] genl_rcv+0x24/0x40
[52804.657119] netlink_unicast+0x2f1/0x490
[52804.657570] netlink_sendmsg+0x36d/0x660
[52804.658022] __sock_sendmsg+0x73/0xc0
[52804.658450] __sys_sendto+0x184/0x210
[52804.658877] __x64_sys_sendto+0x72/0x80
[52804.659322] do_syscall_64+0x6f/0x140
[52804.659752] entry_SYSCALL_64_after_hwframe+0x46/0x4e
[52804.660281]
other info that might help us debug this:
[52804.661077] Possible unsafe locking scenario:
[52804.661671] CPU0 CPU1
[52804.662129] ---- ----
[52804.662577] lock(dpll_lock);
[52804.662924] lock(nlk_cb_mutex-GENERIC);
[52804.663538] lock(dpll_lock);
[52804.664073] lock(nlk_cb_mutex-GENERIC);
[52804.664490]
The issue as follows: __netlink_dump_start() calls control->start(cb)
with nlk->cb_mutex held. In control->start(cb) the dpll_lock is taken.
Then nlk->cb_mutex is released and taken again in netlink_dump(), while
dpll_lock still being held. That leads to ABBA deadlock when another
CPU races with the same operation.
Fix this by moving dpll_lock taking into dumpit() callback which ensures
correct lock taking order.
Fixes: 9d71b54b65b1 ("dpll: netlink: Add DPLL framework base functions")
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Link: https://lore.kernel.org/r/20240207115902.371649-1-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
(cherry picked from commit 53c0441dd2c44ee93fddb5473885fd41e4bc2361)
Signed-off-by: Jiping Ma <jiping.ma2@windriver.com>
---
Documentation/netlink/specs/dpll.yaml | 4 ----
drivers/dpll/dpll_netlink.c | 20 ++++++--------------
drivers/dpll/dpll_nl.c | 4 ----
drivers/dpll/dpll_nl.h | 2 --
4 files changed, 6 insertions(+), 24 deletions(-)
diff --git a/Documentation/netlink/specs/dpll.yaml b/Documentation/netlink/specs/dpll.yaml
index cf8abe1c0550..2b4c4bcd8361 100644
--- a/Documentation/netlink/specs/dpll.yaml
+++ b/Documentation/netlink/specs/dpll.yaml
@@ -374,8 +374,6 @@ operations:
- type
dump:
- pre: dpll-lock-dumpit
- post: dpll-unlock-dumpit
reply: *dev-attrs
-
@@ -462,8 +460,6 @@ operations:
- phase-adjust
dump:
- pre: dpll-lock-dumpit
- post: dpll-unlock-dumpit
request:
attributes:
- id
diff --git a/drivers/dpll/dpll_netlink.c b/drivers/dpll/dpll_netlink.c
index 7cc99d627942..c8c2e836193a 100644
--- a/drivers/dpll/dpll_netlink.c
+++ b/drivers/dpll/dpll_netlink.c
@@ -1171,6 +1171,7 @@ int dpll_nl_pin_get_dumpit(struct sk_buff *skb, struct netlink_callback *cb)
unsigned long i;
int ret = 0;
+ mutex_lock(&dpll_lock);
xa_for_each_marked_start(&dpll_pin_xa, i, pin, DPLL_REGISTERED,
ctx->idx) {
if (!dpll_pin_available(pin))
@@ -1190,6 +1191,8 @@ int dpll_nl_pin_get_dumpit(struct sk_buff *skb, struct netlink_callback *cb)
}
genlmsg_end(skb, hdr);
}
+ mutex_unlock(&dpll_lock);
+
if (ret == -EMSGSIZE) {
ctx->idx = i;
return skb->len;
@@ -1345,6 +1348,7 @@ int dpll_nl_device_get_dumpit(struct sk_buff *skb, struct netlink_callback *cb)
unsigned long i;
int ret = 0;
+ mutex_lock(&dpll_lock);
xa_for_each_marked_start(&dpll_device_xa, i, dpll, DPLL_REGISTERED,
ctx->idx) {
hdr = genlmsg_put(skb, NETLINK_CB(cb->skb).portid,
@@ -1361,6 +1365,8 @@ int dpll_nl_device_get_dumpit(struct sk_buff *skb, struct netlink_callback *cb)
}
genlmsg_end(skb, hdr);
}
+ mutex_unlock(&dpll_lock);
+
if (ret == -EMSGSIZE) {
ctx->idx = i;
return skb->len;
@@ -1411,20 +1417,6 @@ dpll_unlock_doit(const struct genl_split_ops *ops, struct sk_buff *skb,
mutex_unlock(&dpll_lock);
}
-int dpll_lock_dumpit(struct netlink_callback *cb)
-{
- mutex_lock(&dpll_lock);
-
- return 0;
-}
-
-int dpll_unlock_dumpit(struct netlink_callback *cb)
-{
- mutex_unlock(&dpll_lock);
-
- return 0;
-}
-
int dpll_pin_pre_doit(const struct genl_split_ops *ops, struct sk_buff *skb,
struct genl_info *info)
{
diff --git a/drivers/dpll/dpll_nl.c b/drivers/dpll/dpll_nl.c
index eaee5be7aa64..1e95f5397cfc 100644
--- a/drivers/dpll/dpll_nl.c
+++ b/drivers/dpll/dpll_nl.c
@@ -95,9 +95,7 @@ static const struct genl_split_ops dpll_nl_ops[] = {
},
{
.cmd = DPLL_CMD_DEVICE_GET,
- .start = dpll_lock_dumpit,
.dumpit = dpll_nl_device_get_dumpit,
- .done = dpll_unlock_dumpit,
.flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DUMP,
},
{
@@ -129,9 +127,7 @@ static const struct genl_split_ops dpll_nl_ops[] = {
},
{
.cmd = DPLL_CMD_PIN_GET,
- .start = dpll_lock_dumpit,
.dumpit = dpll_nl_pin_get_dumpit,
- .done = dpll_unlock_dumpit,
.policy = dpll_pin_get_dump_nl_policy,
.maxattr = DPLL_A_PIN_ID,
.flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DUMP,
diff --git a/drivers/dpll/dpll_nl.h b/drivers/dpll/dpll_nl.h
index 92d4c9c4f788..f491262bee4f 100644
--- a/drivers/dpll/dpll_nl.h
+++ b/drivers/dpll/dpll_nl.h
@@ -30,8 +30,6 @@ dpll_post_doit(const struct genl_split_ops *ops, struct sk_buff *skb,
void
dpll_pin_post_doit(const struct genl_split_ops *ops, struct sk_buff *skb,
struct genl_info *info);
-int dpll_lock_dumpit(struct netlink_callback *cb);
-int dpll_unlock_dumpit(struct netlink_callback *cb);
int dpll_nl_device_id_get_doit(struct sk_buff *skb, struct genl_info *info);
int dpll_nl_device_get_doit(struct sk_buff *skb, struct genl_info *info);
--
2.43.0