XDP hardware hints discussion mail archive
From: Jesper Dangaard Brouer <jbrouer@redhat.com>
To: Larysa Zaremba <larysa.zaremba@intel.com>,
	Jesper Dangaard Brouer <jbrouer@redhat.com>
Cc: brouer@redhat.com, John Fastabend <john.fastabend@gmail.com>,
	bpf@vger.kernel.org, ast@kernel.org, daniel@iogearbox.net,
	andrii@kernel.org, martin.lau@linux.dev, song@kernel.org,
	yhs@fb.com, kpsingh@kernel.org, sdf@google.com,
	haoluo@google.com, jolsa@kernel.org,
	David Ahern <dsahern@gmail.com>, Jakub Kicinski <kuba@kernel.org>,
	Willem de Bruijn <willemb@google.com>,
	Anatoly Burakov <anatoly.burakov@intel.com>,
	Alexander Lobakin <alexandr.lobakin@intel.com>,
	Magnus Karlsson <magnus.karlsson@gmail.com>,
	Maryam Tahhan <mtahhan@redhat.com>,
	xdp-hints@xdp-project.net, netdev@vger.kernel.org,
	Andrew Lunn <andrew@lunn.ch>
Subject: [xdp-hints] Re: [PATCH bpf-next v2 09/20] xdp: Add VLAN tag hint
Date: Fri, 7 Jul 2023 15:57:13 +0200
Message-ID: <bb8e2be1-4df9-8b26-468e-4d5d13e006c1@redhat.com>
In-Reply-To: <ZKbTxDKCRlnJxyf0@lincoln>



On 06/07/2023 16.46, Larysa Zaremba wrote:
> On Tue, Jul 04, 2023 at 04:18:04PM +0200, Jesper Dangaard Brouer wrote:
>>
>>
>> On 04/07/2023 13.02, Larysa Zaremba wrote:
>>> On Tue, Jul 04, 2023 at 12:23:45PM +0200, Jesper Dangaard Brouer wrote:
>>>>
>>>> On 04/07/2023 10.23, Larysa Zaremba wrote:
>>>>> On Mon, Jul 03, 2023 at 01:15:34PM -0700, John Fastabend wrote:
>>>>>> Larysa Zaremba wrote:
>>>>>>> Implement functionality that enables drivers to expose VLAN tag
>>>>>>> to XDP code.
>>>>>>>
>>>>>>> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
>>>>>>> ---
>>>>>>>     Documentation/networking/xdp-rx-metadata.rst |  8 +++++++-
>>>>>>>     include/linux/netdevice.h                    |  2 ++
>>>>>>>     include/net/xdp.h                            |  2 ++
>>>>>>>     kernel/bpf/offload.c                         |  2 ++
>>>>>>>     net/core/xdp.c                               | 20 ++++++++++++++++++++
>>>>>>>     5 files changed, 33 insertions(+), 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/Documentation/networking/xdp-rx-metadata.rst b/Documentation/networking/xdp-rx-metadata.rst
>>>>>>> index 25ce72af81c2..ea6dd79a21d3 100644
>>>>>>> --- a/Documentation/networking/xdp-rx-metadata.rst
>>>>>>> +++ b/Documentation/networking/xdp-rx-metadata.rst
>>>>>>> @@ -18,7 +18,13 @@ Currently, the following kfuncs are supported. In the future, as more
>>>>>>>     metadata is supported, this set will grow:
>>>>>>>     .. kernel-doc:: net/core/xdp.c
>>>>>>> -   :identifiers: bpf_xdp_metadata_rx_timestamp bpf_xdp_metadata_rx_hash
>>>>>>> +   :identifiers: bpf_xdp_metadata_rx_timestamp
>>>>>>> +
>>>>>>> +.. kernel-doc:: net/core/xdp.c
>>>>>>> +   :identifiers: bpf_xdp_metadata_rx_hash
>>>>>>> +
>>>>>>> +.. kernel-doc:: net/core/xdp.c
>>>>>>> +   :identifiers: bpf_xdp_metadata_rx_vlan_tag
>>>>>>>     An XDP program can use these kfuncs to read the metadata into stack
>>>>>>>     variables for its own consumption. Or, to pass the metadata on to other
>>>> [...]
>>>>>>> diff --git a/net/core/xdp.c b/net/core/xdp.c
>>>>>>> index 41e5ca8643ec..f6262c90e45f 100644
>>>>>>> --- a/net/core/xdp.c
>>>>>>> +++ b/net/core/xdp.c
>>>>>>> @@ -738,6 +738,26 @@ __bpf_kfunc int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, u32 *hash,
>>>>>>>     	return -EOPNOTSUPP;
>>>>>>>     }
>>>>>>> +/**
>>>>>>> + * bpf_xdp_metadata_rx_vlan_tag - Get XDP packet outermost VLAN tag with protocol
>>>>>>> + * @ctx: XDP context pointer.
>>>>>>> + * @vlan_tag: Destination pointer for VLAN tag
>>>>>>> + * @vlan_proto: Destination pointer for VLAN protocol identifier in network byte order.
>>>>>>> + *
>>>>>>> + * In case of success, vlan_tag contains VLAN tag, including 12 least significant bytes
>>>>>>> + * containing VLAN ID, vlan_proto contains protocol identifier.
>>>>>>
>>>>>> Above is a bit confusing to me at least.
>>>>>>
>>>>>> The vlan tag would be both the 16bit TPID and 16bit TCI. What fields
>>>>>> are to be included here? The VlanID or the full 16bit TCI meaning the
>>>>>> PCP+DEI+VID?
>>>>>
>>>>> It contains PCP+DEI+VID, in patch 16 ("selftests/bpf: Add flags and new hints to
>>>>> xdp_hw_metadata") this is more clear, because the tag is parsed.
>>>>>
>>>>
>>>> Do we really care about the "EtherType" proto (in VLAN speak TPID = Tag
>>>> Protocol IDentifier)?
>>>> I mean, it can basically only have two values[1], and we just wanted to
>>>> know if it is a VLAN (that hardware offloaded/removed for us):
>>>
>>> If we assume everyone follows the standard, this would be correct.
>>> But apparently, some applications use some ambiguous value as a TPID [0].
>>>
>>> So it is not hard to imagine that some NICs could allow you to configure your
>>> custom TPID. I am not sure if any in-tree drivers actually do this, but I think
>>> it's nice to provide some flexibility on XDP level, especially considering
>>> network stack stores full vlan_proto.
>>>
>>
>> I'm buying your argument, and agree it makes sense to provide TPID in
>> the call signature, given that weird hardware exists that allows people
>> to configure a custom TPID.
>>
>> Looking through kernel defines (in uapi/linux/if_ether.h) I see evidence
>> that funky QinQ EtherTypes have been used in the past:
>>
>>   #define ETH_P_QINQ1	0x9100		/* deprecated QinQ VLAN [ NOT AN OFFICIALLY REGISTERED ID ] */
>>   #define ETH_P_QINQ2	0x9200		/* deprecated QinQ VLAN [ NOT AN OFFICIALLY REGISTERED ID ] */
>>   #define ETH_P_QINQ3	0x9300		/* deprecated QinQ VLAN [ NOT AN OFFICIALLY REGISTERED ID ] */
>>
>>
>>> [0]
>>> https://techhub.hpe.com/eginfolib/networking/docs/switches/7500/5200-1938a_l2-lan_cg/content/495503472.htm
>>>
>>>>
>>>>    static __always_inline int proto_is_vlan(__u16 h_proto)
>>>>    {
>>>> 	return !!(h_proto == bpf_htons(ETH_P_8021Q) ||
>>>> 		  h_proto == bpf_htons(ETH_P_8021AD));
>>>>    }
>>>>
>>>> [1] https://github.com/xdp-project/bpf-examples/blob/master/include/xdp/parsing_helpers.h#L75-L79
>>>>
>>>> Cc. Andrew Lunn, as I notice DSA has a fake VLAN define ETH_P_DSA_8021Q
>>>> (in file include/uapi/linux/if_ether.h)
>>>> Is this actually in use?
>>>> Maybe some hardware can "VLAN" offload this?
>>>>
>>>>
>>>>> What about rephrasing it this way:
>>>>>
>>>>> In case of success, vlan_proto contains VLAN protocol identifier (TPID),
>>>>> vlan_tag contains the remaining 16 bits of a 802.1Q tag (PCP+DEI+VID).
>>>>>
>>>>
>>>> Hmm, I think we can improve this further. This text becomes part of the
>>>> documentation for end-users (target audience).  Thus, I think it is
>>>> worth being more verbose and even mention the existing defines that we
>>>> are expecting end-users to take advantage of.
>>>>
>>>> What about:
>>>>
>>>> In case of success, the VLAN EtherType, also known as the TPID (Tag
>>>> Protocol IDentifier), is stored in vlan_proto (usually either
>>>> ETH_P_8021Q or ETH_P_8021AD). The VLAN tag is stored in vlan_tag, a
>>>> 16-bit field containing the sub-fields PCP+DEI+VID. The VLAN ID (VID) is
>>>> 12 bits wide and is commonly extracted using the mask VLAN_VID_MASK
>>>> (0x0fff). For the meaning of the sub-fields Priority Code Point (PCP)
>>>> and Drop Eligible Indicator (DEI, formerly CFI), please refer to other
>>>> documentation. Remember that these 16-bit fields are stored in network
>>>> byte order. Thus, conversion with byte-order helper functions like
>>>> bpf_ntohs() is needed.
>>>>
>>>
>>> AFAIK, vlan_tag is stored in host byte order, this is how it is in skb.
>>
>> I'm not sure we should follow SKB storage scheme for XDP.
>>
> 
> I think following SKB convention is a good idea in this particular case. As I
> have mentioned below, in ice the VLAN TCI in the descriptor already comes in
> LE, so there is no point in converting it into BE only for somebody to use
> bpf_ntohs() on it later anyway. We are not the only manufacturer that does this.
> 

As long as other NIC hardware does the same, this seems okay.
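
For illustration only, here is a minimal userspace sketch of what consuming
the tag could look like if vlan_tag is delivered in host byte order, as
proposed above. The EX_ names and the struct are hypothetical; the mask
values mirror the kernel's VLAN_VID_MASK, VLAN_CFI_MASK and
VLAN_PRIO_MASK/VLAN_PRIO_SHIFT, redefined locally so the sketch builds
outside the kernel tree.

```c
#include <stdint.h>

/* Local copies of the kernel's VLAN TCI masks (assumed values, see lead-in). */
#define EX_VLAN_VID_MASK   0x0fffu  /* 12-bit VLAN ID */
#define EX_VLAN_DEI_MASK   0x1000u  /* Drop Eligible Indicator (formerly CFI) */
#define EX_VLAN_PRIO_MASK  0xe000u  /* 3-bit Priority Code Point */
#define EX_VLAN_PRIO_SHIFT 13

/* Hypothetical container for the decoded sub-fields. */
struct ex_vlan_tci {
	uint8_t  pcp;
	uint8_t  dei;
	uint16_t vid;
};

/* Split a host-byte-order TCI into PCP, DEI and VID.
 * No byte swap is done here: the input is assumed to be host order already. */
static inline struct ex_vlan_tci ex_split_tci(uint16_t tci_host)
{
	struct ex_vlan_tci out = {
		.pcp = (uint8_t)((tci_host & EX_VLAN_PRIO_MASK) >> EX_VLAN_PRIO_SHIFT),
		.dei = (tci_host & EX_VLAN_DEI_MASK) ? 1 : 0,
		.vid = (uint16_t)(tci_host & EX_VLAN_VID_MASK),
	};
	return out;
}
```

With this convention a consumer applies the masks directly to the value it
receives; a TCI of 0xA064 decodes to PCP 5, DEI 0, VID 100.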


>>> In ice, we receive VLAN tag in descriptor already in LE.
>>> Only protocol is BE (network byte order). So I would replace the last 2
>>> sentences with the following:
>>>
>>> vlan_tag is stored in host byte order, so no byte order conversion is needed.
>>
>> Yikes, that was unexpected.  This needs to be heavily documented in docs.
> 
> You mean the motivation, why it is so and not the other way around?
> 

No, I don't mean the motivation.
I simply mean write it in *bold*.

Look at the description for bpf_xdp_metadata_rx_hash, how it gets
rendered [1] and how the code comments look [2].

  [1] https://kernel.org/doc/html/latest/networking/xdp-rx-metadata.html#general-design
  [2] https://elixir.bootlin.com/linux/v6.4/source/net/core/xdp.c#L724

To save you some time compiling htmldocs target:

  make SPHINXDIRS="networking" V=1  htmldocs

>>
>> When parsing packets, it is in network-byte-order, else my code is wrong
>> here[1]:
>>
>>    [1] https://github.com/xdp-project/bpf-examples/blob/master/include/xdp/parsing_helpers.h#L122
>>
>> I'm accessing the skb->vlan_tci here [2], and I notice I don't do any
>> byte-order conversions, so fortunately I didn't make a code mistake.
>>
>>    [2] https://github.com/xdp-project/bpf-examples/blob/master/traffic-pacing-edt/edt_pacer_vlan.c#L215
>>
> 
> In raw packet, VLAN TCI is in network byte order, but skb requires NIC/driver
> to convert it into host byte order before putting it into skb.
>

I'm interested in whether *most* NIC hardware delivers this in LE
(little-endian), which is host byte order on x86.
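
To make the raw-packet versus descriptor distinction concrete, below is a
small userspace sketch (not driver code). It reads the TCI from the two
bytes that follow the TPID in a raw frame, where it is big-endian, and
converts it to host order. The function names are hypothetical, and the
0x0fff mask is the VLAN_VID_MASK value mentioned earlier in the thread.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Read the 16-bit TCI from the wire bytes of a raw frame.
 * On the wire the TCI is big-endian, so ntohs() is required here. */
static inline uint16_t ex_tci_from_wire(const uint8_t *tci_bytes)
{
	uint16_t be;

	memcpy(&be, tci_bytes, sizeof(be)); /* avoids unaligned loads */
	return ntohs(be);
}

/* Under the host-byte-order convention discussed above, the value handed
 * to the consumer is already converted, so the mask applies directly. */
static inline uint16_t ex_vid_from_host_tci(uint16_t tci_host)
{
	return tci_host & 0x0fff;
}
```

The point of the sketch is that the byte swap belongs to exactly one side:
either the parser of raw packet bytes does it, or the driver has done it
before exposing the value.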


>>> vlan_proto is stored in network byte order, the suggested way to use this value:
>>>
>>> vlan_proto == bpf_htons(ETH_P_8021Q)
>>>
>>>>
>>>>
>>
>> --Jesper
>>
> 
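
As a sketch of why exposing the TPID is useful, the userspace C below
compares a network-byte-order vlan_proto against a caller-supplied list of
accepted TPIDs, in the same style as the bpf_htons() comparison quoted
above. The function name and the EX_ constants are hypothetical; the
numeric values are those of ETH_P_8021Q, ETH_P_8021AD and the deprecated
ETH_P_QINQ1 from uapi/linux/if_ether.h.

```c
#include <arpa/inet.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Local copies of EtherType values (host byte order). */
#define EX_ETH_P_8021Q  0x8100
#define EX_ETH_P_8021AD 0x88A8
#define EX_ETH_P_QINQ1  0x9100 /* deprecated, not an officially registered ID */

/* Return true if vlan_proto_be (network byte order) matches any TPID in
 * tpids_host (host byte order). Each list entry is converted with htons()
 * so the comparison itself happens in network byte order. */
static inline bool ex_tpid_accepted(uint16_t vlan_proto_be,
				    const uint16_t *tpids_host, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (vlan_proto_be == htons(tpids_host[i]))
			return true;
	}
	return false;
}
```

A program that only cares about standard tags would pass the two standard
TPIDs; one deployed behind hardware with a custom TPID could extend the
list without any other change.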

