XDP hardware hints discussion mail archive
From: Stanislav Fomichev <sdf@google.com>
To: "Toke Høiland-Jørgensen" <toke@redhat.com>
Cc: bpf@vger.kernel.org, ast@kernel.org, daniel@iogearbox.net,
	andrii@kernel.org, martin.lau@linux.dev, song@kernel.org,
	yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org,
	haoluo@google.com, jolsa@kernel.org,
	David Ahern <dsahern@gmail.com>, Jakub Kicinski <kuba@kernel.org>,
	Willem de Bruijn <willemb@google.com>,
	Jesper Dangaard Brouer <brouer@redhat.com>,
	Anatoly Burakov <anatoly.burakov@intel.com>,
	Alexander Lobakin <alexandr.lobakin@intel.com>,
	Magnus Karlsson <magnus.karlsson@gmail.com>,
	Maryam Tahhan <mtahhan@redhat.com>,
	xdp-hints@xdp-project.net, netdev@vger.kernel.org
Subject: [xdp-hints] Re: [PATCH bpf-next 03/11] bpf: Support inlined/unrolled kfuncs for xdp metadata
Date: Tue, 15 Nov 2022 10:37:47 -0800
Message-ID: <CAKH8qBv0oOnZY3YiXu_SNnRYTgd64KhMgBOgKT2zMmkRiiNHHw@mail.gmail.com>
In-Reply-To: <87k03wi46i.fsf@toke.dk>

On Tue, Nov 15, 2022 at 8:16 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> Stanislav Fomichev <sdf@google.com> writes:
>
> > Kfuncs have to be defined with KF_UNROLL for an attempted unroll.
> > For now, only XDP programs can have their kfuncs unrolled, but
> > we can extend this later on if more programs would like to use it.
> >
> > For XDP, we define a new kfunc set (xdp_metadata_kfunc_ids) which
> > implements all possible metadata kfuncs. Not all devices have to
> > implement them. If unrolling is not supported by the target device,
> > the default implementation is called instead. The default
> > implementation is unconditionally unrolled to 'return false/0/NULL'
> > for now.
> >
> > Upon loading, if BPF_F_XDP_HAS_METADATA is passed via prog_flags,
> > we treat prog_ifindex as target device for kfunc unrolling.
> > net_device_ops gains new ndo_unroll_kfunc which does the actual
> > dirty work per device.
> >
> > The kfunc unrolling itself largely follows the existing map_gen_lookup
> > unrolling example, so there is nothing new here.
> >
> > Cc: John Fastabend <john.fastabend@gmail.com>
> > Cc: David Ahern <dsahern@gmail.com>
> > Cc: Martin KaFai Lau <martin.lau@linux.dev>
> > Cc: Jakub Kicinski <kuba@kernel.org>
> > Cc: Willem de Bruijn <willemb@google.com>
> > Cc: Jesper Dangaard Brouer <brouer@redhat.com>
> > Cc: Anatoly Burakov <anatoly.burakov@intel.com>
> > Cc: Alexander Lobakin <alexandr.lobakin@intel.com>
> > Cc: Magnus Karlsson <magnus.karlsson@gmail.com>
> > Cc: Maryam Tahhan <mtahhan@redhat.com>
> > Cc: xdp-hints@xdp-project.net
> > Cc: netdev@vger.kernel.org
> > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > ---
> >  Documentation/bpf/kfuncs.rst   |  8 +++++
> >  include/linux/bpf.h            |  1 +
> >  include/linux/btf.h            |  1 +
> >  include/linux/btf_ids.h        |  4 +++
> >  include/linux/netdevice.h      |  5 +++
> >  include/net/xdp.h              | 24 +++++++++++++
> >  include/uapi/linux/bpf.h       |  5 +++
> >  kernel/bpf/syscall.c           | 28 ++++++++++++++-
> >  kernel/bpf/verifier.c          | 65 ++++++++++++++++++++++++++++++++++
> >  net/core/dev.c                 |  7 ++++
> >  net/core/xdp.c                 | 39 ++++++++++++++++++++
> >  tools/include/uapi/linux/bpf.h |  5 +++
> >  12 files changed, 191 insertions(+), 1 deletion(-)
> >
> > diff --git a/Documentation/bpf/kfuncs.rst b/Documentation/bpf/kfuncs.rst
> > index 0f858156371d..1723de2720bb 100644
> > --- a/Documentation/bpf/kfuncs.rst
> > +++ b/Documentation/bpf/kfuncs.rst
> > @@ -169,6 +169,14 @@ rebooting or panicking. Due to this additional restrictions apply to these
> >  calls. At the moment they only require CAP_SYS_BOOT capability, but more can be
> >  added later.
> >
> > +2.4.8 KF_UNROLL flag
> > +-----------------------
> > +
> > +The KF_UNROLL flag is used for kfuncs that the verifier can attempt to unroll.
> > +Unrolling is currently implemented only for XDP programs' metadata kfuncs.
> > +The main motivation behind unrolling is to remove function call overhead
> > +and allow efficient inlined kfuncs to be generated.
> > +
> >  2.5 Registering the kfuncs
> >  --------------------------
> >
> > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > index 798aec816970..bf8936522dd9 100644
> > --- a/include/linux/bpf.h
> > +++ b/include/linux/bpf.h
> > @@ -1240,6 +1240,7 @@ struct bpf_prog_aux {
> >               struct work_struct work;
> >               struct rcu_head rcu;
> >       };
> > +     const struct net_device_ops *xdp_kfunc_ndo;
> >  };
> >
> >  struct bpf_prog {
> > diff --git a/include/linux/btf.h b/include/linux/btf.h
> > index d80345fa566b..950cca997a5a 100644
> > --- a/include/linux/btf.h
> > +++ b/include/linux/btf.h
> > @@ -51,6 +51,7 @@
> >  #define KF_TRUSTED_ARGS (1 << 4) /* kfunc only takes trusted pointer arguments */
> >  #define KF_SLEEPABLE    (1 << 5) /* kfunc may sleep */
> >  #define KF_DESTRUCTIVE  (1 << 6) /* kfunc performs destructive actions */
> > +#define KF_UNROLL       (1 << 7) /* kfunc unrolling can be attempted */
> >
> >  /*
> >   * Return the name of the passed struct, if exists, or halt the build if for
> > diff --git a/include/linux/btf_ids.h b/include/linux/btf_ids.h
> > index c9744efd202f..eb448e9c79bb 100644
> > --- a/include/linux/btf_ids.h
> > +++ b/include/linux/btf_ids.h
> > @@ -195,6 +195,10 @@ asm(                                                     \
> >  __BTF_ID_LIST(name, local)                           \
> >  __BTF_SET8_START(name, local)
> >
> > +#define BTF_SET8_START_GLOBAL(name)                  \
> > +__BTF_ID_LIST(name, global)                          \
> > +__BTF_SET8_START(name, global)
> > +
> >  #define BTF_SET8_END(name)                           \
> >  asm(                                                 \
> >  ".pushsection " BTF_IDS_SECTION ",\"a\";      \n"    \
> > diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> > index 02a2318da7c7..2096b4f00e4b 100644
> > --- a/include/linux/netdevice.h
> > +++ b/include/linux/netdevice.h
> > @@ -73,6 +73,8 @@ struct udp_tunnel_info;
> >  struct udp_tunnel_nic_info;
> >  struct udp_tunnel_nic;
> >  struct bpf_prog;
> > +struct bpf_insn;
> > +struct bpf_patch;
> >  struct xdp_buff;
> >
> >  void synchronize_net(void);
> > @@ -1604,6 +1606,9 @@ struct net_device_ops {
> >       ktime_t                 (*ndo_get_tstamp)(struct net_device *dev,
> >                                                 const struct skb_shared_hwtstamps *hwtstamps,
> >                                                 bool cycles);
> > +     void                    (*ndo_unroll_kfunc)(const struct bpf_prog *prog,
> > +                                                 u32 func_id,
> > +                                                 struct bpf_patch *patch);
> >  };
> >
> >  /**
> > diff --git a/include/net/xdp.h b/include/net/xdp.h
> > index 55dbc68bfffc..2a82a98f2f9f 100644
> > --- a/include/net/xdp.h
> > +++ b/include/net/xdp.h
> > @@ -7,6 +7,7 @@
> >  #define __LINUX_NET_XDP_H__
> >
> >  #include <linux/skbuff.h> /* skb_shared_info */
> > +#include <linux/btf_ids.h> /* btf_id_set8 */
> >
> >  /**
> >   * DOC: XDP RX-queue information
> > @@ -409,4 +410,27 @@ void xdp_attachment_setup(struct xdp_attachment_info *info,
> >
> >  #define DEV_MAP_BULK_SIZE XDP_BULK_QUEUE_SIZE
> >
> > +#define XDP_METADATA_KFUNC_xxx       \
> > +     XDP_METADATA_KFUNC(XDP_METADATA_KFUNC_RX_TIMESTAMP_SUPPORTED, \
> > +                        bpf_xdp_metadata_rx_timestamp_supported) \
> > +     XDP_METADATA_KFUNC(XDP_METADATA_KFUNC_RX_TIMESTAMP, \
> > +                        bpf_xdp_metadata_rx_timestamp) \
> > +
> > +enum {
> > +#define XDP_METADATA_KFUNC(name, str) name,
> > +XDP_METADATA_KFUNC_xxx
> > +#undef XDP_METADATA_KFUNC
> > +MAX_XDP_METADATA_KFUNC,
> > +};
> > +
> > +#ifdef CONFIG_DEBUG_INFO_BTF
> > +extern struct btf_id_set8 xdp_metadata_kfunc_ids;
> > +static inline u32 xdp_metadata_kfunc_id(int id)
> > +{
> > +     return xdp_metadata_kfunc_ids.pairs[id].id;
> > +}
> > +#else
> > +static inline u32 xdp_metadata_kfunc_id(int id) { return 0; }
> > +#endif
> > +
> >  #endif /* __LINUX_NET_XDP_H__ */
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index fb4c911d2a03..b444b1118c4f 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -1156,6 +1156,11 @@ enum bpf_link_type {
> >   */
> >  #define BPF_F_XDP_HAS_FRAGS  (1U << 5)
> >
> > +/* If BPF_F_XDP_HAS_METADATA is used in BPF_PROG_LOAD command, the loaded
> > + * program becomes device-bound but can access its XDP metadata.
> > + */
> > +#define BPF_F_XDP_HAS_METADATA       (1U << 6)
> > +
> >  /* link_create.kprobe_multi.flags used in LINK_CREATE command for
> >   * BPF_TRACE_KPROBE_MULTI attach type to create return probe.
> >   */
> > diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> > index 85532d301124..597c41949910 100644
> > --- a/kernel/bpf/syscall.c
> > +++ b/kernel/bpf/syscall.c
> > @@ -2426,6 +2426,20 @@ static bool is_perfmon_prog_type(enum bpf_prog_type prog_type)
> >  /* last field in 'union bpf_attr' used by this command */
> >  #define      BPF_PROG_LOAD_LAST_FIELD core_relo_rec_size
> >
> > +static int xdp_resolve_netdev(struct bpf_prog *prog, int ifindex)
> > +{
> > +     struct net *net = current->nsproxy->net_ns;
> > +     struct net_device *dev;
> > +
> > +     for_each_netdev(net, dev) {
> > +             if (dev->ifindex == ifindex) {
>
> So this is basically dev_get_by_index(), except you're not doing
> dev_hold()? Which also means there's no protection against the netdev
> going away?

Yeah, good point, will use dev_get_by_index() here instead, with proper
refcounting.
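
For reference, the difference between the open-coded loop and a refcounted
lookup can be modeled in plain userspace C. This is only a sketch of the
pattern, not kernel code: the mock_* names and the struct are hypothetical
stand-ins, and the only thing taken from the kernel is the convention that
dev_get_by_index() returns the device with a reference held and that the
caller drops it with dev_put().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct net_device. */
struct mock_netdev {
	int ifindex;
	int refcnt;
};

/* Models a lookup that takes a reference on success, the way
 * dev_get_by_index() does. Returns NULL if no device matches.
 */
struct mock_netdev *mock_dev_get_by_index(struct mock_netdev *devs,
					  size_t n, int ifindex)
{
	for (size_t i = 0; i < n; i++) {
		if (devs[i].ifindex == ifindex) {
			devs[i].refcnt++;	/* models dev_hold() */
			return &devs[i];
		}
	}
	return NULL;
}

/* Models dev_put(): drop the reference taken by the lookup. */
void mock_dev_put(struct mock_netdev *dev)
{
	assert(dev->refcnt > 0);
	dev->refcnt--;
}
```

The point of the pattern is that whoever stores a pointer derived from the
device keeps the reference for as long as that pointer is in use, and drops
it exactly once afterwards.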

> > +                     prog->aux->xdp_kfunc_ndo = dev->netdev_ops;
> > +                     return 0;
> > +             }
> > +     }
>
> > +     return -EINVAL;
> > +}
> > +
> >  static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr)
> >  {
> >       enum bpf_prog_type type = attr->prog_type;
> > @@ -2443,7 +2457,8 @@ static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr)
> >                                BPF_F_TEST_STATE_FREQ |
> >                                BPF_F_SLEEPABLE |
> >                                BPF_F_TEST_RND_HI32 |
> > -                              BPF_F_XDP_HAS_FRAGS))
> > +                              BPF_F_XDP_HAS_FRAGS |
> > +                              BPF_F_XDP_HAS_METADATA))
> >               return -EINVAL;
> >
> >       if (!IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) &&
> > @@ -2531,6 +2546,17 @@ static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr)
> >       prog->aux->sleepable = attr->prog_flags & BPF_F_SLEEPABLE;
> >       prog->aux->xdp_has_frags = attr->prog_flags & BPF_F_XDP_HAS_FRAGS;
> >
> > +     if (attr->prog_flags & BPF_F_XDP_HAS_METADATA) {
> > +             /* Reuse prog_ifindex to carry request to unroll
> > +              * metadata kfuncs.
> > +              */
> > +             prog->aux->offload_requested = false;
> > +
> > +             err = xdp_resolve_netdev(prog, attr->prog_ifindex);
> > +             if (err < 0)
> > +                     goto free_prog;
> > +     }
> > +
> >       err = security_bpf_prog_alloc(prog->aux);
> >       if (err)
> >               goto free_prog;
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index 07c0259dfc1a..b657ed6eb277 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -9,6 +9,7 @@
> >  #include <linux/types.h>
> >  #include <linux/slab.h>
> >  #include <linux/bpf.h>
> > +#include <linux/bpf_patch.h>
> >  #include <linux/btf.h>
> >  #include <linux/bpf_verifier.h>
> >  #include <linux/filter.h>
> > @@ -14015,6 +14016,43 @@ static int fixup_call_args(struct bpf_verifier_env *env)
> >       return err;
> >  }
> >
> > +static int unroll_kfunc_call(struct bpf_verifier_env *env,
> > +                          struct bpf_insn *insn,
> > +                          struct bpf_patch *patch)
> > +{
> > +     enum bpf_prog_type prog_type;
> > +     struct bpf_prog_aux *aux;
> > +     struct btf *desc_btf;
> > +     u32 *kfunc_flags;
> > +     u32 func_id;
> > +
> > +     desc_btf = find_kfunc_desc_btf(env, insn->off);
> > +     if (IS_ERR(desc_btf))
> > +             return PTR_ERR(desc_btf);
> > +
> > +     prog_type = resolve_prog_type(env->prog);
> > +     func_id = insn->imm;
> > +
> > +     kfunc_flags = btf_kfunc_id_set_contains(desc_btf, prog_type, func_id);
> > +     if (!kfunc_flags)
> > +             return 0;
> > +     if (!(*kfunc_flags & KF_UNROLL))
> > +             return 0;
> > +     if (prog_type != BPF_PROG_TYPE_XDP)
> > +             return 0;
>
> Should this just handle XDP_METADATA_KFUNC_EXPORT_TO_SKB instead of
> passing that into the driver (to avoid every driver having to
> reimplement the same call to xdp_metadata_export_to_skb())?

Good idea, will try to move it here.
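
Roughly, the dispatch order being discussed would look like the userspace
model below. Everything here is a hypothetical sketch: the mock_* names, the
result enum and the callback signature are made up for illustration and are
not the interfaces from this series. It only shows the ordering: generic
kfuncs are handled once in common code, device-specific ones go to the
driver callback, and anything left falls back to the default unrolling.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kfunc identifiers; only the split between a generic
 * kfunc and a device-specific one matters here.
 */
enum mock_kfunc {
	MOCK_KFUNC_EXPORT_TO_SKB,
	MOCK_KFUNC_RX_TIMESTAMP,
	MOCK_KFUNC_MAX,
};

enum mock_unroll_result {
	UNROLLED_BY_CORE,
	UNROLLED_BY_DRIVER,
	UNROLLED_DEFAULT,
};

/* Hypothetical per-driver callback: returns nonzero if it emitted code. */
typedef int (*mock_unroll_fn)(enum mock_kfunc id);

enum mock_unroll_result mock_unroll(enum mock_kfunc id, mock_unroll_fn driver)
{
	assert(id < MOCK_KFUNC_MAX);

	/* Generic kfuncs are handled once here and never reach drivers. */
	if (id == MOCK_KFUNC_EXPORT_TO_SKB)
		return UNROLLED_BY_CORE;

	/* Device-specific kfuncs go to the driver, if it has a callback. */
	if (driver && driver(id))
		return UNROLLED_BY_DRIVER;

	/* Otherwise fall back to the default unrolling. */
	return UNROLLED_DEFAULT;
}

/* Example driver that only knows how to unroll the RX timestamp kfunc. */
int mock_driver_unroll(enum mock_kfunc id)
{
	return id == MOCK_KFUNC_RX_TIMESTAMP;
}
```

With this ordering a driver never sees the generic kfunc at all, which is
what avoids every driver repeating the same call.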
