From: Alexander Lobakin <alexandr.lobakin@intel.com>
To: Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Andrii Nakryiko <andrii@kernel.org>
Cc: "Alexander Lobakin" <alexandr.lobakin@intel.com>,
"Larysa Zaremba" <larysa.zaremba@intel.com>,
"Michal Swiatkowski" <michal.swiatkowski@linux.intel.com>,
"Jesper Dangaard Brouer" <hawk@kernel.org>,
"Björn Töpel" <bjorn@kernel.org>,
"Magnus Karlsson" <magnus.karlsson@intel.com>,
"Maciej Fijalkowski" <maciej.fijalkowski@intel.com>,
"Jonathan Lemon" <jonathan.lemon@gmail.com>,
"Toke Hoiland-Jorgensen" <toke@redhat.com>,
"Lorenzo Bianconi" <lorenzo@kernel.org>,
"David S. Miller" <davem@davemloft.net>,
"Eric Dumazet" <edumazet@google.com>,
"Jakub Kicinski" <kuba@kernel.org>,
"Paolo Abeni" <pabeni@redhat.com>,
"Jesse Brandeburg" <jesse.brandeburg@intel.com>,
"John Fastabend" <john.fastabend@gmail.com>,
"Yajun Deng" <yajun.deng@linux.dev>,
"Willem de Bruijn" <willemb@google.com>,
bpf@vger.kernel.org, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org, xdp-hints@xdp-project.net
Subject: [xdp-hints] [PATCH RFC bpf-next 33/52] bpf, cpumap: add option to set a timeout for deferred flush
Date: Tue, 28 Jun 2022 21:47:53 +0200
Message-ID: <20220628194812.1453059-34-alexandr.lobakin@intel.com>
In-Reply-To: <20220628194812.1453059-1-alexandr.lobakin@intel.com>
GRO efficiency depends heavily on the batch size. With batches of 8
frames, it is less efficient than e.g. NAPI with batches of up to 64.
To reduce the share of full flushes without holding GRO packets for
too long, use the GRO hrtimer to wake up the kthread even when there
are no new frames in the ptr_ring. The timeout is passed from
userspace inside the corresponding &bpf_cpumap_val when the map entry
is created, in nanoseconds. Values above 1 ms are rejected with
-ERANGE.
When the timeout is 0/unset, the behaviour is the same as it was
prior to this change.
Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
---
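A minimal userspace sketch of the value layout and the checks described above, assuming a local mirror of the structure so it builds without patched headers. `struct cpumap_val_sketch`, `value_size_ok()`, `value_check()` and `SKETCH_NSEC_PER_MSEC` are hypothetical names used only for illustration; the limits are taken from the diff below.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirror of struct bpf_cpumap_val with the new field. This is an
 * assumption for illustration, not the UAPI header itself.
 */
struct cpumap_val_sketch {
	uint32_t qsize;		/* queue size to remote target CPU */
	union {
		int fd;		/* prog fd on map write */
		uint32_t id;	/* prog id on map read */
	} bpf_prog;
	uint64_t timeout;	/* timeout to wait for new packets, in ns */
};

#define offsetofend(type, member) \
	(offsetof(type, member) + sizeof(((type *)0)->member))

#define SKETCH_NSEC_PER_MSEC	1000000ULL

/* Mirrors the value_size check in cpu_map_alloc(): three layouts are
 * accepted, each ending right after one of the fields.
 */
static int value_size_ok(size_t value_size)
{
	return value_size == offsetofend(struct cpumap_val_sketch, qsize) ||
	       value_size == offsetofend(struct cpumap_val_sketch, bpf_prog.fd) ||
	       value_size == offsetofend(struct cpumap_val_sketch, timeout);
}

/* Mirrors the sanity limits in cpu_map_update_elem() */
static int value_check(const struct cpumap_val_sketch *val)
{
	if (val->qsize > 16384)
		return -EOVERFLOW;
	if (val->timeout > 1 * SKETCH_NSEC_PER_MSEC)
		return -ERANGE;

	return 0;
}
```

With this layout the accepted value sizes come out as 4, 8 and 16 bytes, so existing users passing the shorter layouts keep working and simply get a zero timeout.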
include/uapi/linux/bpf.h | 1 +
kernel/bpf/cpumap.c | 39 +++++++++++++++++++++++++++++-----
tools/include/uapi/linux/bpf.h | 1 +
3 files changed, 36 insertions(+), 5 deletions(-)
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 1caaec1de625..097719ee2172 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -5989,6 +5989,7 @@ struct bpf_cpumap_val {
int fd; /* prog fd on map write */
__u32 id; /* prog id on map read */
} bpf_prog;
+ __u64 timeout; /* timeout to wait for new packets, in ns */
};
enum sk_action {
diff --git a/kernel/bpf/cpumap.c b/kernel/bpf/cpumap.c
index 2d0edf8f6a05..145f49de0931 100644
--- a/kernel/bpf/cpumap.c
+++ b/kernel/bpf/cpumap.c
@@ -95,7 +95,8 @@ static struct bpf_map *cpu_map_alloc(union bpf_attr *attr)
/* check sanity of attributes */
if (attr->max_entries == 0 || attr->key_size != 4 ||
(value_size != offsetofend(struct bpf_cpumap_val, qsize) &&
- value_size != offsetofend(struct bpf_cpumap_val, bpf_prog.fd)) ||
+ value_size != offsetofend(struct bpf_cpumap_val, bpf_prog.fd) &&
+ value_size != offsetofend(struct bpf_cpumap_val, timeout)) ||
attr->map_flags & ~BPF_F_NUMA_NODE)
return ERR_PTR(-EINVAL);
@@ -312,18 +313,42 @@ static void cpu_map_gro_flush(struct bpf_cpu_map_entry *rcpu,
/* If the ring is not empty, there'll be a new iteration
* soon, and we only need to do a full flush if a tick is
* long (> 1 ms).
- * If the ring is empty, to not hold GRO packets in the
- * stack for too long, do a full flush.
+ * If the ring is empty, and there were some new packets
+ * processed, either do a partial flush and spin up a timer
+ * to flush the rest if the timeout is set, or do a full
+ * flush otherwise.
+ * No new packets with non-zero gro_bitmask can mean that we
+ * probably came from the timer call and/or there's [almost]
+ * no activity here right now. To not hold GRO packets in
+ * the stack for too long, do a full flush.
* This is equivalent to how NAPI decides whether to perform
* a full flush (by batches of up to 64 frames tho).
*/
if (__ptr_ring_empty(rcpu->queue))
- flush_old = false;
+ flush_old = new ? !!rcpu->value.timeout : false;
__gro_flush(&rcpu->gro, flush_old);
}
gro_normal_list(&rcpu->gro);
+
+ /* Non-zero gro_bitmask at this point means that we have some packets
+ * held in the GRO engine after a partial flush. If we have a timeout
+ * set up, and there are no signs of a new kthread iteration, launch
+ * a timer to flush them as well.
+ */
+ if (rcpu->gro.bitmask && __ptr_ring_empty(rcpu->queue))
+ gro_timer_start(&rcpu->gro, rcpu->value.timeout);
+}
+
+static enum hrtimer_restart cpu_map_gro_watchdog(struct hrtimer *timer)
+{
+ const struct bpf_cpu_map_entry *rcpu;
+
+ rcpu = container_of(timer, typeof(*rcpu), gro.timer);
+ wake_up_process(rcpu->kthread);
+
+ return HRTIMER_NORESTART;
}
static int cpu_map_kthread_run(void *data)
@@ -489,8 +514,9 @@ __cpu_map_entry_alloc(struct bpf_map *map, struct bpf_cpumap_val *value,
rcpu->cpu = cpu;
rcpu->map_id = map->id;
rcpu->value.qsize = value->qsize;
+ rcpu->value.timeout = value->timeout;
- gro_init(&rcpu->gro, NULL);
+ gro_init(&rcpu->gro, cpu_map_gro_watchdog);
if (fd > 0 && __cpu_map_load_bpf_program(rcpu, map, fd))
goto free_gro;
@@ -606,6 +632,9 @@ static int cpu_map_update_elem(struct bpf_map *map, void *key, void *value,
return -EEXIST;
if (unlikely(cpumap_value.qsize > 16384)) /* sanity limit on qsize */
return -EOVERFLOW;
+ /* Don't allow timeout longer than 1 ms -- 1 tick on HZ == 1000 */
+ if (unlikely(cpumap_value.timeout > 1 * NSEC_PER_MSEC))
+ return -ERANGE;
/* Make sure CPU is a valid possible cpu */
if (key_cpu >= nr_cpumask_bits || !cpu_possible(key_cpu))
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 436b925adfb3..a3579cdb0225 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -5989,6 +5989,7 @@ struct bpf_cpumap_val {
int fd; /* prog fd on map write */
__u32 id; /* prog id on map read */
} bpf_prog;
+ __u64 timeout; /* timeout to wait for new packets, in ns */
};
enum sk_action {
--
2.36.1
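The flush decision in cpu_map_gro_flush() can be modelled in plain C as a sketch, not as the kernel code itself. `pick_flush()`, `should_arm_timer()` and `tick_is_short` are hypothetical names. Two assumptions are made here: that the default for flush_old (not visible in this hunk) is "tick is no longer than 1 ms", as the comment suggests, and that gro_timer_start() does nothing for a zero timeout.

```c
#include <stdbool.h>
#include <stdint.h>

enum flush_kind {
	FLUSH_FULL,	/* flush everything held in the GRO engine */
	FLUSH_OLD_ONLY,	/* partial flush: only packets older than a tick */
};

/* Model of how flush_old is chosen before calling __gro_flush() */
static enum flush_kind pick_flush(bool ring_empty, bool new_pkts,
				  uint64_t timeout_ns, bool tick_is_short)
{
	/* Assumed default: a partial flush is enough when a tick is short */
	bool flush_old = tick_is_short;

	/* Ring is empty: partial flush only if there were new packets and
	 * a timeout is configured, full flush otherwise.
	 */
	if (ring_empty)
		flush_old = new_pkts ? timeout_ns != 0 : false;

	return flush_old ? FLUSH_OLD_ONLY : FLUSH_FULL;
}

/* Model of the condition for starting the GRO timer after the flush.
 * The timeout_ns test reflects the assumption about gro_timer_start().
 */
static bool should_arm_timer(bool gro_pending, bool ring_empty,
			     uint64_t timeout_ns)
{
	return gro_pending && ring_empty && timeout_ns != 0;
}
```

For example, an empty ring with new packets and a 20 us timeout gives a partial flush and an armed timer. The same state reached later from the timer wakeup (no new packets) gives a full flush, so nothing is left held in the GRO engine.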