From: Alexander Lobakin
To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
Date: Tue, 28 Jun 2022 21:48:00 +0200
Message-Id: <20220628194812.1453059-41-alexandr.lobakin@intel.com>
In-Reply-To: <20220628194812.1453059-1-alexandr.lobakin@intel.com>
References: <20220628194812.1453059-1-alexandr.lobakin@intel.com>
CC: Alexander Lobakin, Larysa Zaremba, Michal Swiatkowski,
 Jesper Dangaard Brouer, Björn Töpel, Magnus Karlsson,
 Maciej Fijalkowski, Jonathan Lemon, Toke Hoiland-Jorgensen,
 Lorenzo Bianconi, "David S. Miller", Eric Dumazet, Jakub Kicinski,
 Paolo Abeni, Jesse Brandeburg, John Fastabend, Yajun Deng,
 Willem de Bruijn, bpf@vger.kernel.org, netdev@vger.kernel.org,
 linux-kernel@vger.kernel.org, xdp-hints@xdp-project.net
Subject: [xdp-hints] [PATCH RFC bpf-next 40/52] net, xdp: add an RCU version of xdp_attachment_setup()
List-Id: XDP hardware hints design discussion

Currently, xdp_attachment_setup() uses plain assignments and puts the
previous BPF program before updating the pointer, which makes it
dangerous for program hot-swaps due to pointer tearing and potential
use-after-frees.
At the same time, &xdp_attachment_info comes in handy in drivers as a
main container, including on the hotpath -- the BTF ID and meta
threshold values are now being used there as well, not to mention that
it reduces some boilerplate code.
Add an RCU-protected pointer to the XDP program to that structure and
an RCU version of xdp_attachment_setup(), which makes sure that none of
the values are corrupted and that the old BPF program is freed only
after the pointer has been updated. The only thing left is that RCU
read-side critical sections might run in between the individual
assignments, but since the relations between XDP prog, BTF ID and meta
threshold are not vital, it's totally fine to allow this.
A caller must ensure it's being executed under the RTNL lock. Reader
sides must ensure they're being executed under the RCU read lock.
Once all the current users of xdp_attachment_setup() are switched to
the RCU-aware version (with appropriate adjustments), the "regular"
one will be removed.

Partially inspired by commit fe45386a2082 ("net/mlx5e: Use RCU to
protect rq->xdp_prog").

Signed-off-by: Alexander Lobakin
---
 include/net/xdp.h |  7 ++++++-
 net/bpf/core.c    | 28 ++++++++++++++++++++++++++++
 2 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/include/net/xdp.h b/include/net/xdp.h
index 5762ce18885f..49e562e4fcca 100644
--- a/include/net/xdp.h
+++ b/include/net/xdp.h
@@ -379,7 +379,10 @@ int xdp_reg_mem_model(struct xdp_mem_info *mem,
 void xdp_unreg_mem_model(struct xdp_mem_info *mem);

 struct xdp_attachment_info {
-	struct bpf_prog *prog;
+	union {
+		struct bpf_prog __rcu *prog_rcu;
+		struct bpf_prog *prog;
+	};
 	union {
 		__le64 btf_id_le;
 		u64 btf_id;
@@ -391,6 +394,8 @@ struct xdp_attachment_info {
 struct netdev_bpf;
 void xdp_attachment_setup(struct xdp_attachment_info *info,
 			  struct netdev_bpf *bpf);
+void xdp_attachment_setup_rcu(struct xdp_attachment_info *info,
+			      struct netdev_bpf *bpf);

 #define DEV_MAP_BULK_SIZE XDP_BULK_QUEUE_SIZE

diff --git a/net/bpf/core.c b/net/bpf/core.c
index 65f25019493d..d444d0555057 100644
--- a/net/bpf/core.c
+++ b/net/bpf/core.c
@@ -557,6 +557,34 @@ void xdp_attachment_setup(struct xdp_attachment_info *info,
 }
 EXPORT_SYMBOL_GPL(xdp_attachment_setup);

+/**
+ * xdp_attachment_setup_rcu - an RCU-powered version of xdp_attachment_setup()
+ * @info: pointer to the target container
+ * @bpf: pointer to the container passed to ::ndo_bpf()
+ *
+ * Protects sensitive values with RCU to allow program hot-swaps without
+ * stopping an interface. Write side (this) must be called under the RTNL lock
+ * and reader sides must fetch any data only under the RCU read lock -- old BPF
+ * program will be freed only after a critical section is finished (see
+ * bpf_prog_put()).
+ */
+void xdp_attachment_setup_rcu(struct xdp_attachment_info *info,
+			      struct netdev_bpf *bpf)
+{
+	struct bpf_prog *old_prog;
+
+	ASSERT_RTNL();
+
+	old_prog = rcu_replace_pointer(info->prog_rcu, bpf->prog,
+				       lockdep_rtnl_is_held());
+	WRITE_ONCE(info->btf_id, bpf->btf_id);
+	WRITE_ONCE(info->meta_thresh, bpf->meta_thresh);
+
+	if (old_prog)
+		bpf_prog_put(old_prog);
+}
+EXPORT_SYMBOL_GPL(xdp_attachment_setup_rcu);
+
 struct xdp_frame *xdp_convert_zc_to_xdp_frame(struct xdp_buff *xdp)
 {
 	unsigned int metasize, totsize;
-- 
2.36.1
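
For readers who want to play with the ordering outside the kernel, here
is a minimal userspace sketch of the "publish the new pointer first, put
the old program last" sequence that xdp_attachment_setup_rcu() follows.
It is only an approximation: every mock_* name is hypothetical, C11
atomics stand in for rcu_replace_pointer(), WRITE_ONCE() and
rcu_dereference(), and grace periods are not modelled at all (the mock
put just drops a counter right away).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct bpf_prog. */
struct mock_prog {
	int refcnt;	/* plays the role of the program's reference count */
	int id;
};

/* Hypothetical stand-in for struct xdp_attachment_info. */
struct mock_attachment_info {
	_Atomic(struct mock_prog *) prog;	/* plays the role of prog_rcu */
	_Atomic uint64_t btf_id;
	_Atomic uint32_t meta_thresh;
};

/* Hypothetical stand-in for struct netdev_bpf. */
struct mock_netdev_bpf {
	struct mock_prog *prog;
	uint64_t btf_id;
	uint32_t meta_thresh;
};

/* Stand-in for bpf_prog_put(): only drops one reference, frees nothing. */
static void mock_prog_put(struct mock_prog *prog)
{
	prog->refcnt--;
}

/* Writer side: swap the pointer first, release the old program last. */
static void mock_attachment_setup(struct mock_attachment_info *info,
				  const struct mock_netdev_bpf *bpf)
{
	struct mock_prog *old_prog;

	/* Approximates rcu_replace_pointer() with an atomic exchange. */
	old_prog = atomic_exchange_explicit(&info->prog, bpf->prog,
					    memory_order_acq_rel);
	/* Approximates WRITE_ONCE(): untorn stores, no mutual ordering. */
	atomic_store_explicit(&info->btf_id, bpf->btf_id,
			      memory_order_relaxed);
	atomic_store_explicit(&info->meta_thresh, bpf->meta_thresh,
			      memory_order_relaxed);

	/* The old program is released only after the new one is visible. */
	if (old_prog)
		mock_prog_put(old_prog);
}

/* Reader side: approximates rcu_dereference() with an acquire load. */
static int mock_reader_prog_id(struct mock_attachment_info *info)
{
	struct mock_prog *prog;

	prog = atomic_load_explicit(&info->prog, memory_order_acquire);
	return prog ? prog->id : -1;
}

/* Hot-swaps program A for program B; returns 0 when all checks pass. */
static int mock_hot_swap_demo(void)
{
	struct mock_prog a = { .refcnt = 1, .id = 1 };
	struct mock_prog b = { .refcnt = 1, .id = 2 };
	struct mock_attachment_info info = { 0 };
	struct mock_netdev_bpf req = {
		.prog = &a, .btf_id = 42, .meta_thresh = 64,
	};

	assert(mock_reader_prog_id(&info) == -1);	/* nothing attached */

	mock_attachment_setup(&info, &req);		/* attach A */
	assert(mock_reader_prog_id(&info) == 1);
	assert(a.refcnt == 1);				/* A is still held */

	req.prog = &b;
	mock_attachment_setup(&info, &req);		/* swap A for B */
	assert(mock_reader_prog_id(&info) == 2);
	assert(a.refcnt == 0);				/* A put after the swap */
	assert(b.refcnt == 1);

	return 0;
}
```

Calling mock_hot_swap_demo() from a test harness walks through one
attach and one hot-swap. In the kernel, the reader would additionally
have to run under the RCU read lock and the writer under RTNL, as the
commit message requires; the sketch leaves both out.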