* [xdp-hints] [PATCH bpf-next v8 00/18] XDP metadata via kfuncs for ice + VLAN hint
From: Larysa Zaremba @ 2023-12-05 21:08 UTC
To: bpf
Cc: Larysa Zaremba, ast, daniel, andrii, martin.lau, song, yhs,
	john.fastabend, kpsingh, sdf, haoluo, jolsa, David Ahern,
	Jakub Kicinski, Willem de Bruijn, Jesper Dangaard Brouer,
	Anatoly Burakov, Alexander Lobakin, Magnus Karlsson, Maryam Tahhan,
	xdp-hints, netdev, Willem de Bruijn, Alexei Starovoitov,
	Tariq Toukan, Saeed Mahameed, Maciej Fijalkowski

This series introduces XDP hints via kfuncs [0] to the ice driver.

Series brings the following existing hints to the ice driver:
 - HW timestamp
 - RX hash with type

Series also introduces VLAN tag with protocol XDP hint, which can now be
accessed by XDP and userspace (AF_XDP) programs. The hints can also be
checked with xdp_metadata test and xdp_hw_metadata program.

Impact of these patches on ice performance:
ZC:
* Full hints implementation decreases pps in ZC mode by less than 3%
  (64B, rxdrop)

skb (packets with invalid IP, dropped by stack):
* Overall, patchset improves peak performance in skb mode by about 0.5%

[0] https://patchwork.kernel.org/project/netdevbpf/cover/20230119221536.3349901-1-sdf@google.com/

v7:
https://lore.kernel.org/bpf/20231115175301.534113-1-larysa.zaremba@intel.com/
v6:
https://lore.kernel.org/bpf/20231012170524.21085-1-larysa.zaremba@intel.com/
Intermediate RFC v2:
https://lore.kernel.org/bpf/20230927075124.23941-1-larysa.zaremba@intel.com/
Intermediate RFC v1:
https://lore.kernel.org/bpf/20230824192703.712881-1-larysa.zaremba@intel.com/
v5:
https://lore.kernel.org/bpf/20230811161509.19722-1-larysa.zaremba@intel.com/
v4:
https://lore.kernel.org/bpf/20230728173923.1318596-1-larysa.zaremba@intel.com/
v3:
https://lore.kernel.org/bpf/20230719183734.21681-1-larysa.zaremba@intel.com/
v2:
https://lore.kernel.org/bpf/20230703181226.19380-1-larysa.zaremba@intel.com/
v1:
https://lore.kernel.org/all/20230512152607.992209-1-larysa.zaremba@intel.com/

Changes since v7:
* shorten timestamp assignment in ice
* change first argument of ice_fill_rx_descs back to xsk_buff_pool
* fix kernel-doc for ice_run_xdp_zc
* add missing XSK_CHECK_PRIV_TYPE() in ice
* resolved selftests merge conflicts with TX hints
* AF_INET patch adds new packet generation instead of replacing the
  AF_XDP one
* fix destination port in xdp_metadata

Changes since v6:
* add ability to fill cb of all xdp_buffs in xsk_buff_pool
* place just pointer to packet context in ice_xdp_buff
* add const qualifiers in veth implementation
* generate uapi for VLAN hint

Changes since v5:
* drop checksum hint from the patchset entirely
* Alex's patch that lifts the data_meta size limitation is no longer
  required in this patchset, so will be sent separately
* new patch: hide some ice hints code behind a static key
* fix several bugs in ZC mode (ice)
* change argument order in VLAN hint kfunc (tci, proto -> proto, tci)
* cosmetic changes
* analyze performance impact

Changes since v4:
* Drop the concept of partial checksum from the hint design
* Drop the concept of checksum level from the hint design

Changes since v3:
* use XDP_CHECKSUM_VALID_LVL0 + csum_level instead of csum_level + 1
* fix spelling mistakes
* read XDP timestamp unconditionally
* add TO_STR() macro

Changes since v2:
* redesign checksum hint, so now it gives full status
* rename vlan_tag -> vlan_tci, where applicable
* use open_netns() and close_netns() in xdp_metadata
* improve VLAN hint documentation
* replace CFI with DEI
* use VLAN_VID_MASK in xdp_metadata
* make vlan_get_tag() return -ENODATA
* remove unused rx_ptype in ice_xsk.c
* fix ice timestamp code division between patches

Changes since v1:
* directly return RX hash, RX timestamp and RX checksum status
  in skb-common functions
* use intermediate enum value for checksum status in ice
* get rid of ring structure dependency in ice kfunc implementation
* make variables const, when possible, in ice implementation
* use -ENODATA instead of -EOPNOTSUPP for driver implementation
* instead of having 2 separate functions for c-tag and s-tag,
  use 1 function that outputs both VLAN tag and protocol ID
* improve documentation for introduced hints
* update xdp_metadata selftest to test new hints
* implement new hints in veth, so they can be tested in xdp_metadata
* parse VLAN tag in xdp_hw_metadata

Larysa Zaremba (17):
  ice: make RX hash reading code more reusable
  ice: make RX HW timestamp reading code more reusable
  ice: Make ptype internal to descriptor info processing
  ice: Introduce ice_xdp_buff
  ice: Support HW timestamp hint
  ice: Support RX hash XDP hint
  ice: Support XDP hints in AF_XDP ZC mode
  xdp: Add VLAN tag hint
  ice: Implement VLAN tag hint
  ice: use VLAN proto from ring packet context in skb path
  veth: Implement VLAN tag XDP hint
  net: make vlan_get_tag() return -ENODATA instead of -EINVAL
  mlx5: implement VLAN tag XDP hint
  selftests/bpf: Allow VLAN packets in xdp_hw_metadata
  selftests/bpf: Add flags and VLAN hint to xdp_hw_metadata
  selftests/bpf: Add AF_INET packet generation to xdp_metadata
  selftests/bpf: Check VLAN tag and proto in xdp_metadata

Maciej Fijalkowski (1):
  xsk: add functions to fill control buffer

 Documentation/netlink/specs/netdev.yaml       |   4 +
 Documentation/networking/xdp-rx-metadata.rst  |   8 +-
 drivers/net/ethernet/intel/ice/ice.h          |   2 +
 drivers/net/ethernet/intel/ice/ice_base.c     |  15 +
 .../net/ethernet/intel/ice/ice_lan_tx_rx.h    | 412 +++++++++---------
 drivers/net/ethernet/intel/ice/ice_main.c     |  21 +
 drivers/net/ethernet/intel/ice/ice_ptp.c      |  22 +-
 drivers/net/ethernet/intel/ice/ice_ptp.h      |  16 +-
 drivers/net/ethernet/intel/ice/ice_txrx.c     |  19 +-
 drivers/net/ethernet/intel/ice/ice_txrx.h     |  32 +-
 drivers/net/ethernet/intel/ice/ice_txrx_lib.c | 207 ++++++++-
 drivers/net/ethernet/intel/ice/ice_txrx_lib.h |  18 +-
 drivers/net/ethernet/intel/ice/ice_xsk.c      |  17 +-
 .../net/ethernet/mellanox/mlx5/core/en/xdp.c  |  15 +
 drivers/net/veth.c                            |  19 +
 include/linux/if_vlan.h                       |   4 +-
 include/linux/mlx5/device.h                   |   2 +-
 include/net/xdp.h                             |   9 +
 include/net/xdp_sock_drv.h                    |  17 +
 include/net/xsk_buff_pool.h                   |   2 +
 include/uapi/linux/netdev.h                   |   3 +
 net/core/xdp.c                                |  33 ++
 net/xdp/xsk_buff_pool.c                       |  12 +
 tools/include/uapi/linux/netdev.h             |   3 +
 tools/net/ynl/generated/netdev-user.c         |   1 +
 .../selftests/bpf/prog_tests/xdp_metadata.c   | 134 +++++-
 .../selftests/bpf/progs/xdp_hw_metadata.c     |  38 +-
 .../selftests/bpf/progs/xdp_metadata.c        |   5 +
 tools/testing/selftests/bpf/testing_helpers.h |   3 +
 tools/testing/selftests/bpf/xdp_hw_metadata.c |  34 +-
 tools/testing/selftests/bpf/xdp_metadata.h    |  34 +-
 31 files changed, 851 insertions(+), 310 deletions(-)

-- 
2.41.0

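The VLAN hint in this series reports a protocol together with a TCI, and the changelog refers to VLAN_VID_MASK and DEI. As a rough userspace sketch, assuming the standard IEEE 802.1Q layout of the 16-bit TCI (3-bit PCP, 1-bit DEI, 12-bit VID) and a TCI already in host byte order, the fields could be split as follows. The macro, struct and function names are hypothetical and are not taken from the series.

```c
#include <stdint.h>

/* Hypothetical names; the bit layout follows IEEE 802.1Q. */
#define TCI_VID_MASK	0x0fffu	/* low 12 bits: VLAN ID */
#define TCI_DEI_MASK	0x1000u	/* bit 12: drop eligible indicator */
#define TCI_PCP_SHIFT	13	/* top 3 bits: priority code point */

struct vlan_fields {
	uint16_t vid;
	uint8_t dei;
	uint8_t pcp;
};

/* Split a host-order TCI into its three fields. */
static struct vlan_fields decode_tci(uint16_t tci)
{
	struct vlan_fields f = {
		.vid = (uint16_t)(tci & TCI_VID_MASK),
		.dei = (tci & TCI_DEI_MASK) ? 1 : 0,
		.pcp = (uint8_t)(tci >> TCI_PCP_SHIFT),
	};

	return f;
}
```

A consumer of the hint would typically compare only the VID part, since PCP and DEI can differ between packets of the same VLAN.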
* [xdp-hints] [PATCH bpf-next v8 01/18] ice: make RX hash reading code more reusable
From: Larysa Zaremba @ 2023-12-05 21:08 UTC
To: bpf
Cc: Larysa Zaremba, ast, daniel, andrii, martin.lau, song, yhs,
	john.fastabend, kpsingh, sdf, haoluo, jolsa, David Ahern,
	Jakub Kicinski, Willem de Bruijn, Jesper Dangaard Brouer,
	Anatoly Burakov, Alexander Lobakin, Magnus Karlsson, Maryam Tahhan,
	xdp-hints, netdev, Willem de Bruijn, Alexei Starovoitov,
	Tariq Toukan, Saeed Mahameed, Maciej Fijalkowski

Previously, we only needed RX hash in skb path,
hence all related code was written with skb in mind.
But with the addition of XDP hints via kfuncs to the ice driver,
the same logic will be needed in .xmo_() callbacks.

Separate generic process of reading RX hash from a descriptor
into a separate function.

Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_txrx_lib.c | 36 +++++++++++++------
 1 file changed, 25 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
index 7e06373e14d9..17530359aaf8 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
@@ -63,28 +63,42 @@ static enum pkt_hash_types ice_ptype_to_htype(u16 ptype)
 }
 
 /**
- * ice_rx_hash - set the hash value in the skb
+ * ice_get_rx_hash - get RX hash value from descriptor
+ * @rx_desc: specific descriptor
+ *
+ * Returns hash, if present, 0 otherwise.
+ */
+static u32 ice_get_rx_hash(const union ice_32b_rx_flex_desc *rx_desc)
+{
+	const struct ice_32b_rx_flex_desc_nic *nic_mdid;
+
+	if (unlikely(rx_desc->wb.rxdid != ICE_RXDID_FLEX_NIC))
+		return 0;
+
+	nic_mdid = (struct ice_32b_rx_flex_desc_nic *)rx_desc;
+	return le32_to_cpu(nic_mdid->rss_hash);
+}
+
+/**
+ * ice_rx_hash_to_skb - set the hash value in the skb
  * @rx_ring: descriptor ring
  * @rx_desc: specific descriptor
  * @skb: pointer to current skb
  * @rx_ptype: the ptype value from the descriptor
  */
 static void
-ice_rx_hash(struct ice_rx_ring *rx_ring, union ice_32b_rx_flex_desc *rx_desc,
-	    struct sk_buff *skb, u16 rx_ptype)
+ice_rx_hash_to_skb(const struct ice_rx_ring *rx_ring,
+		   const union ice_32b_rx_flex_desc *rx_desc,
+		   struct sk_buff *skb, u16 rx_ptype)
 {
-	struct ice_32b_rx_flex_desc_nic *nic_mdid;
 	u32 hash;
 
 	if (!(rx_ring->netdev->features & NETIF_F_RXHASH))
 		return;
 
-	if (rx_desc->wb.rxdid != ICE_RXDID_FLEX_NIC)
-		return;
-
-	nic_mdid = (struct ice_32b_rx_flex_desc_nic *)rx_desc;
-	hash = le32_to_cpu(nic_mdid->rss_hash);
-	skb_set_hash(skb, hash, ice_ptype_to_htype(rx_ptype));
+	hash = ice_get_rx_hash(rx_desc);
+	if (likely(hash))
+		skb_set_hash(skb, hash, ice_ptype_to_htype(rx_ptype));
 }
 
 /**
@@ -186,7 +200,7 @@ ice_process_skb_fields(struct ice_rx_ring *rx_ring,
 		       union ice_32b_rx_flex_desc *rx_desc,
 		       struct sk_buff *skb, u16 ptype)
 {
-	ice_rx_hash(rx_ring, rx_desc, skb, ptype);
+	ice_rx_hash_to_skb(rx_ring, rx_desc, skb, ptype);
 
 	/* modifies the skb - consumes the enet header */
 	skb->protocol = eth_type_trans(skb, rx_ring->netdev);
-- 
2.41.0

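The split made by this patch (a generic reader that returns 0 when no hash is present, plus a path-specific consumer) can be modelled outside the kernel. The sketch below is only an illustration under simplifying assumptions: the types are hypothetical stand-ins for the descriptor and the skb, the descriptor ID value is illustrative, and the hash is assumed to be in CPU byte order already.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the driver types. */
#define RXDID_FLEX_NIC	2u	/* illustrative value only */

struct rx_desc {
	uint8_t rxdid;
	uint32_t rss_hash;	/* assumed to be in CPU byte order here */
};

struct pkt {
	uint32_t hash;
	bool hash_valid;
};

/* Generic reader: returns the hash, or 0 when the descriptor has none. */
static uint32_t get_rx_hash(const struct rx_desc *desc)
{
	if (desc->rxdid != RXDID_FLEX_NIC)
		return 0;

	return desc->rss_hash;
}

/* Path-specific consumer: records a hash only when one was read. */
static void rx_hash_to_pkt(const struct rx_desc *desc, struct pkt *p)
{
	uint32_t hash = get_rx_hash(desc);

	if (hash) {
		p->hash = hash;
		p->hash_valid = true;
	}
}
```

One consequence of using 0 as the "absent" value, visible in the sketch as well, is that a genuine hash of 0 is treated the same as a missing one.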
* [xdp-hints] [PATCH bpf-next v8 02/18] ice: make RX HW timestamp reading code more reusable
From: Larysa Zaremba @ 2023-12-05 21:08 UTC
To: bpf
Cc: Larysa Zaremba, ast, daniel, andrii, martin.lau, song, yhs,
	john.fastabend, kpsingh, sdf, haoluo, jolsa, David Ahern,
	Jakub Kicinski, Willem de Bruijn, Jesper Dangaard Brouer,
	Anatoly Burakov, Alexander Lobakin, Magnus Karlsson, Maryam Tahhan,
	xdp-hints, netdev, Willem de Bruijn, Alexei Starovoitov,
	Tariq Toukan, Saeed Mahameed, Maciej Fijalkowski

Previously, we only needed RX HW timestamp in skb path,
hence all related code was written with skb in mind.
But with the addition of XDP hints via kfuncs to the ice driver,
the same logic will be needed in .xmo_() callbacks.

Put generic process of reading RX HW timestamp from a descriptor
into a separate function.
Move skb-related code into another source file.

Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_ptp.c      | 20 +++++++------------
 drivers/net/ethernet/intel/ice/ice_ptp.h      | 16 +++++++++------
 drivers/net/ethernet/intel/ice/ice_txrx_lib.c | 20 ++++++++++++++++++-
 3 files changed, 36 insertions(+), 20 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_ptp.c b/drivers/net/ethernet/intel/ice/ice_ptp.c
index 71f405f8a6fe..bb54f43b5a18 100644
--- a/drivers/net/ethernet/intel/ice/ice_ptp.c
+++ b/drivers/net/ethernet/intel/ice/ice_ptp.c
@@ -2127,30 +2127,26 @@ int ice_ptp_set_ts_config(struct ice_pf *pf, struct ifreq *ifr)
 }
 
 /**
- * ice_ptp_rx_hwtstamp - Check for an Rx timestamp
- * @rx_ring: Ring to get the VSI info
+ * ice_ptp_get_rx_hwts - Get packet Rx timestamp in ns
  * @rx_desc: Receive descriptor
- * @skb: Particular skb to send timestamp with
+ * @rx_ring: Ring to get the cached time
  *
  * The driver receives a notification in the receive descriptor with timestamp.
- * The timestamp is in ns, so we must convert the result first.
  */
-void
-ice_ptp_rx_hwtstamp(struct ice_rx_ring *rx_ring,
-		    union ice_32b_rx_flex_desc *rx_desc, struct sk_buff *skb)
+u64 ice_ptp_get_rx_hwts(const union ice_32b_rx_flex_desc *rx_desc,
+			struct ice_rx_ring *rx_ring)
 {
-	struct skb_shared_hwtstamps *hwtstamps;
 	u64 ts_ns, cached_time;
 	u32 ts_high;
 
 	if (!(rx_desc->wb.time_stamp_low & ICE_PTP_TS_VALID))
-		return;
+		return 0;
 
 	cached_time = READ_ONCE(rx_ring->cached_phctime);
 
 	/* Do not report a timestamp if we don't have a cached PHC time */
 	if (!cached_time)
-		return;
+		return 0;
 
 	/* Use ice_ptp_extend_32b_ts directly, using the ring-specific cached
 	 * PHC value, rather than accessing the PF. This also allows us to
@@ -2161,9 +2157,7 @@ ice_ptp_rx_hwtstamp(struct ice_rx_ring *rx_ring,
 	ts_high = le32_to_cpu(rx_desc->wb.flex_ts.ts_high);
 	ts_ns = ice_ptp_extend_32b_ts(cached_time, ts_high);
 
-	hwtstamps = skb_hwtstamps(skb);
-	memset(hwtstamps, 0, sizeof(*hwtstamps));
-	hwtstamps->hwtstamp = ns_to_ktime(ts_ns);
+	return ts_ns;
 }
 
 /**
diff --git a/drivers/net/ethernet/intel/ice/ice_ptp.h b/drivers/net/ethernet/intel/ice/ice_ptp.h
index 06a330867fc9..45327cb92bc6 100644
--- a/drivers/net/ethernet/intel/ice/ice_ptp.h
+++ b/drivers/net/ethernet/intel/ice/ice_ptp.h
@@ -298,9 +298,8 @@ void ice_ptp_extts_event(struct ice_pf *pf);
 s8 ice_ptp_request_ts(struct ice_ptp_tx *tx, struct sk_buff *skb);
 enum ice_tx_tstamp_work ice_ptp_process_ts(struct ice_pf *pf);
 
-void
-ice_ptp_rx_hwtstamp(struct ice_rx_ring *rx_ring,
-		    union ice_32b_rx_flex_desc *rx_desc, struct sk_buff *skb);
+u64 ice_ptp_get_rx_hwts(const union ice_32b_rx_flex_desc *rx_desc,
+			struct ice_rx_ring *rx_ring);
 void ice_ptp_reset(struct ice_pf *pf);
 void ice_ptp_prepare_for_reset(struct ice_pf *pf);
 void ice_ptp_init(struct ice_pf *pf);
@@ -329,9 +328,14 @@ static inline bool ice_ptp_process_ts(struct ice_pf *pf)
 {
 	return true;
 }
-static inline void
-ice_ptp_rx_hwtstamp(struct ice_rx_ring *rx_ring,
-		    union ice_32b_rx_flex_desc *rx_desc, struct sk_buff *skb) { }
+
+static inline u64
+ice_ptp_get_rx_hwts(const union ice_32b_rx_flex_desc *rx_desc,
+		    struct ice_rx_ring *rx_ring)
+{
+	return 0;
+}
+
 static inline void ice_ptp_reset(struct ice_pf *pf) { }
 static inline void ice_ptp_prepare_for_reset(struct ice_pf *pf) { }
 static inline void ice_ptp_init(struct ice_pf *pf) { }
diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
index 17530359aaf8..02d70a96a5a4 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
@@ -184,6 +184,24 @@ ice_rx_csum(struct ice_rx_ring *ring, struct sk_buff *skb,
 	ring->vsi->back->hw_csum_rx_error++;
 }
 
+/**
+ * ice_ptp_rx_hwts_to_skb - Put RX timestamp into skb
+ * @rx_ring: Ring to get the VSI info
+ * @rx_desc: Receive descriptor
+ * @skb: Particular skb to send timestamp with
+ *
+ * The timestamp is in ns, so we must convert the result first.
+ */
+static void
+ice_ptp_rx_hwts_to_skb(struct ice_rx_ring *rx_ring,
+		       const union ice_32b_rx_flex_desc *rx_desc,
+		       struct sk_buff *skb)
+{
+	u64 ts_ns = ice_ptp_get_rx_hwts(rx_desc, rx_ring);
+
+	skb_hwtstamps(skb)->hwtstamp = ns_to_ktime(ts_ns);
+}
+
 /**
  * ice_process_skb_fields - Populate skb header fields from Rx descriptor
  * @rx_ring: Rx descriptor ring packet is being transacted on
@@ -208,7 +226,7 @@ ice_process_skb_fields(struct ice_rx_ring *rx_ring,
 	ice_rx_csum(rx_ring, skb, rx_desc, ptype);
 
 	if (rx_ring->ptp_rx)
-		ice_ptp_rx_hwts_to_skb(rx_ring, rx_desc, skb);
+		ice_ptp_rx_hwts_to_skb(rx_ring, rx_desc, skb);
 }
 
 /**
-- 
2.41.0

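The patch above calls ice_ptp_extend_32b_ts() with a cached 64-bit PHC time and the 32-bit value from the descriptor, but its body is not part of this series. The following is therefore only a sketch of one common way to widen such a timestamp, assuming the 32-bit value lies within half of the 32-bit range of the cached time; the function name is hypothetical and the code should not be read as the driver's implementation.

```c
#include <stdint.h>

/*
 * Extend a 32-bit timestamp to 64 bits around a cached 64-bit time.
 * Assumption: ts32 was captured less than 2^31 ns before or after
 * cached_time, so the nearer of the two candidates is the right one.
 */
static uint64_t extend_32b_ts(uint64_t cached_time, uint32_t ts32)
{
	uint32_t cached_lo = (uint32_t)cached_time;
	uint32_t delta = ts32 - cached_lo;	/* wraps modulo 2^32 */

	if (delta > UINT32_MAX / 2) {
		/* ts32 is older than the cached time: step backwards */
		delta = cached_lo - ts32;
		return cached_time - delta;
	}

	/* ts32 is at or after the cached time: step forwards */
	return cached_time + delta;
}
```

The unsigned subtraction handles the case where the low 32 bits wrap between the cached time and the packet timestamp, which is why the cached value has to be refreshed often enough for the half-range assumption to hold.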
* [xdp-hints] [PATCH bpf-next v8 03/18] ice: Make ptype internal to descriptor info processing
From: Larysa Zaremba @ 2023-12-05 21:08 UTC
To: bpf
Cc: Larysa Zaremba, ast, daniel, andrii, martin.lau, song, yhs,
	john.fastabend, kpsingh, sdf, haoluo, jolsa, David Ahern,
	Jakub Kicinski, Willem de Bruijn, Jesper Dangaard Brouer,
	Anatoly Burakov, Alexander Lobakin, Magnus Karlsson, Maryam Tahhan,
	xdp-hints, netdev, Willem de Bruijn, Alexei Starovoitov,
	Tariq Toukan, Saeed Mahameed, Maciej Fijalkowski

Currently, rx_ptype variable is used only as an argument
to ice_process_skb_fields() and is computed
just before the function call.

Therefore, there is no reason to pass this value as an argument.
Instead, remove this argument and compute the value directly inside
ice_process_skb_fields() function.

Also, separate its calculation into a short function, so the code
can later be reused in .xmo_() callbacks.

Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_txrx.c     |  6 +-----
 drivers/net/ethernet/intel/ice/ice_txrx_lib.c | 15 +++++++++++++--
 drivers/net/ethernet/intel/ice/ice_txrx_lib.h |  2 +-
 drivers/net/ethernet/intel/ice/ice_xsk.c      |  6 +-----
 4 files changed, 16 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
index 9e97ea863068..6afe4cf1de8a 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx.c
+++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
@@ -1181,7 +1181,6 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget)
 		unsigned int size;
 		u16 stat_err_bits;
 		u16 vlan_tag = 0;
-		u16 rx_ptype;
 
 		/* get the Rx desc from Rx ring based on 'next_to_clean' */
 		rx_desc = ICE_RX_DESC(rx_ring, ntc);
@@ -1286,10 +1285,7 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget)
 		total_rx_bytes += skb->len;
 
 		/* populate checksum, VLAN, and protocol */
-		rx_ptype = le16_to_cpu(rx_desc->wb.ptype_flex_flags0) &
-			ICE_RX_FLEX_DESC_PTYPE_M;
-
-		ice_process_skb_fields(rx_ring, rx_desc, skb, rx_ptype);
+		ice_process_skb_fields(rx_ring, rx_desc, skb);
 
 		ice_trace(clean_rx_irq_indicate, rx_ring, rx_desc, skb);
 		/* send completed skb up the stack */
diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
index 02d70a96a5a4..8904b22bfba7 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
@@ -202,12 +202,21 @@ ice_ptp_rx_hwts_to_skb(struct ice_rx_ring *rx_ring,
 	skb_hwtstamps(skb)->hwtstamp = ns_to_ktime(ts_ns);
 }
 
+/**
+ * ice_get_ptype - Read HW packet type from the descriptor
+ * @rx_desc: RX descriptor
+ */
+static u16 ice_get_ptype(const union ice_32b_rx_flex_desc *rx_desc)
+{
+	return le16_to_cpu(rx_desc->wb.ptype_flex_flags0) &
+	       ICE_RX_FLEX_DESC_PTYPE_M;
+}
+
 /**
  * ice_process_skb_fields - Populate skb header fields from Rx descriptor
  * @rx_ring: Rx descriptor ring packet is being transacted on
  * @rx_desc: pointer to the EOP Rx descriptor
  * @skb: pointer to current skb being populated
- * @ptype: the packet type decoded by hardware
  *
  * This function checks the ring, descriptor, and packet information in
  * order to populate the hash, checksum, VLAN, protocol, and
@@ -216,8 +225,10 @@ ice_ptp_rx_hwts_to_skb(struct ice_rx_ring *rx_ring,
 void
 ice_process_skb_fields(struct ice_rx_ring *rx_ring,
 		       union ice_32b_rx_flex_desc *rx_desc,
-		       struct sk_buff *skb, u16 ptype)
+		       struct sk_buff *skb)
 {
+	u16 ptype = ice_get_ptype(rx_desc);
+
 	ice_rx_hash_to_skb(rx_ring, rx_desc, skb, ptype);
 
 	/* modifies the skb - consumes the enet header */
diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.h b/drivers/net/ethernet/intel/ice/ice_txrx_lib.h
index 115969ecdf7b..e1d49e1235b3 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.h
+++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.h
@@ -148,7 +148,7 @@ void ice_release_rx_desc(struct ice_rx_ring *rx_ring, u16 val);
 void
 ice_process_skb_fields(struct ice_rx_ring *rx_ring,
 		       union ice_32b_rx_flex_desc *rx_desc,
-		       struct sk_buff *skb, u16 ptype);
+		       struct sk_buff *skb);
 void
 ice_receive_skb(struct ice_rx_ring *rx_ring, struct sk_buff *skb, u16 vlan_tag);
 #endif /* !_ICE_TXRX_LIB_H_ */
diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c
index 99954508184f..906e383e864a 100644
--- a/drivers/net/ethernet/intel/ice/ice_xsk.c
+++ b/drivers/net/ethernet/intel/ice/ice_xsk.c
@@ -864,7 +864,6 @@ int ice_clean_rx_irq_zc(struct ice_rx_ring *rx_ring, int budget)
 		struct sk_buff *skb;
 		u16 stat_err_bits;
 		u16 vlan_tag = 0;
-		u16 rx_ptype;
 
 		rx_desc = ICE_RX_DESC(rx_ring, ntc);
 
@@ -944,10 +943,7 @@ int ice_clean_rx_irq_zc(struct ice_rx_ring *rx_ring, int budget)
 
 		vlan_tag = ice_get_vlan_tag_from_rx_desc(rx_desc);
 
-		rx_ptype = le16_to_cpu(rx_desc->wb.ptype_flex_flags0) &
-				       ICE_RX_FLEX_DESC_PTYPE_M;
-
-		ice_process_skb_fields(rx_ring, rx_desc, skb, rx_ptype);
+		ice_process_skb_fields(rx_ring, rx_desc, skb);
 		ice_receive_skb(rx_ring, skb, vlan_tag);
 	}
 
-- 
2.41.0

* [xdp-hints] [PATCH bpf-next v8 04/18] ice: Introduce ice_xdp_buff
From: Larysa Zaremba @ 2023-12-05 21:08 UTC
To: bpf
Cc: Larysa Zaremba, ast, daniel, andrii, martin.lau, song, yhs,
	john.fastabend, kpsingh, sdf, haoluo, jolsa, David Ahern,
	Jakub Kicinski, Willem de Bruijn, Jesper Dangaard Brouer,
	Anatoly Burakov, Alexander Lobakin, Magnus Karlsson, Maryam Tahhan,
	xdp-hints, netdev, Willem de Bruijn, Alexei Starovoitov,
	Tariq Toukan, Saeed Mahameed, Maciej Fijalkowski

In order to use XDP hints via kfuncs we need to put
RX descriptor and miscellaneous data next to xdp_buff.
Same as in hints implementations in other drivers, we achieve
this through putting xdp_buff into a child structure.

Currently, xdp_buff is stored in the ring structure,
so replace it with union that includes child structure.
This way enough memory is available while existing XDP code
remains isolated from hints.

Minimum size of the new child structure (ice_xdp_buff) is exactly
64 bytes (single cache line). To place it at the start of a cache line,
move 'next' field from CL1 to CL4, as it isn't used often. This still
leaves 192 bits available in CL3 for packet context extensions.

Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_txrx.c     |  7 +++++--
 drivers/net/ethernet/intel/ice/ice_txrx.h     | 18 +++++++++++++++---
 drivers/net/ethernet/intel/ice/ice_txrx_lib.h | 10 ++++++++++
 3 files changed, 30 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
index 6afe4cf1de8a..99ea47011fe0 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx.c
+++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
@@ -557,13 +557,14 @@ ice_rx_frame_truesize(struct ice_rx_ring *rx_ring, const unsigned int size)
  * @xdp_prog: XDP program to run
  * @xdp_ring: ring to be used for XDP_TX action
  * @rx_buf: Rx buffer to store the XDP action
+ * @eop_desc: Last descriptor in packet to read metadata from
  *
  * Returns any of ICE_XDP_{PASS, CONSUMED, TX, REDIR}
  */
 static void
 ice_run_xdp(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp,
 	    struct bpf_prog *xdp_prog, struct ice_tx_ring *xdp_ring,
-	    struct ice_rx_buf *rx_buf)
+	    struct ice_rx_buf *rx_buf, union ice_32b_rx_flex_desc *eop_desc)
 {
 	unsigned int ret = ICE_XDP_PASS;
 	u32 act;
@@ -571,6 +572,8 @@ ice_run_xdp(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp,
 	if (!xdp_prog)
 		goto exit;
 
+	ice_xdp_meta_set_desc(xdp, eop_desc);
+
 	act = bpf_prog_run_xdp(xdp_prog, xdp);
 	switch (act) {
 	case XDP_PASS:
@@ -1240,7 +1243,7 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget)
 		if (ice_is_non_eop(rx_ring, rx_desc))
 			continue;
 
-		ice_run_xdp(rx_ring, xdp, xdp_prog, xdp_ring, rx_buf);
+		ice_run_xdp(rx_ring, xdp, xdp_prog, xdp_ring, rx_buf, rx_desc);
 		if (rx_buf->act == ICE_XDP_PASS)
 			goto construct_skb;
 		total_rx_bytes += xdp_get_buff_len(xdp);
diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.h b/drivers/net/ethernet/intel/ice/ice_txrx.h
index daf7b9dbb143..cd93394fab17 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx.h
+++ b/drivers/net/ethernet/intel/ice/ice_txrx.h
@@ -257,6 +257,14 @@ enum ice_rx_dtype {
 	ICE_RX_DTYPE_SPLIT_ALWAYS	= 2,
 };
 
+struct ice_xdp_buff {
+	struct xdp_buff xdp_buff;
+	const union ice_32b_rx_flex_desc *eop_desc;
+};
+
+/* Required for compatibility with xdp_buffs from xsk_pool */
+static_assert(offsetof(struct ice_xdp_buff, xdp_buff) == 0);
+
 /* indices into GLINT_ITR registers */
 #define ICE_RX_ITR	ICE_IDX_ITR0
 #define ICE_TX_ITR	ICE_IDX_ITR1
@@ -298,7 +306,6 @@ enum ice_dynamic_itr {
 /* descriptor ring, associated with a VSI */
 struct ice_rx_ring {
 	/* CL1 - 1st cacheline starts here */
-	struct ice_rx_ring *next;	/* pointer to next ring in q_vector */
 	void *desc;			/* Descriptor ring memory */
 	struct device *dev;		/* Used for DMA mapping */
 	struct net_device *netdev;	/* netdev ring maps to */
@@ -310,12 +317,16 @@ struct ice_rx_ring {
 	u16 count;			/* Number of descriptors */
 	u16 reg_idx;			/* HW register index of the ring */
 	u16 next_to_alloc;
-	/* CL2 - 2nd cacheline starts here */
+
 	union {
 		struct ice_rx_buf *rx_buf;
 		struct xdp_buff **xdp_buf;
 	};
-	struct xdp_buff xdp;
+	/* CL2 - 2nd cacheline starts here */
+	union {
+		struct ice_xdp_buff xdp_ext;
+		struct xdp_buff xdp;
+	};
 	/* CL3 - 3rd cacheline starts here */
 	struct bpf_prog *xdp_prog;
 	u16 rx_offset;
@@ -332,6 +343,7 @@ struct ice_rx_ring {
 	/* CL4 - 4th cacheline starts here */
 	struct ice_channel *ch;
 	struct ice_tx_ring *xdp_ring;
+	struct ice_rx_ring *next;	/* pointer to next ring in q_vector */
 	struct xsk_buff_pool *xsk_pool;
 	dma_addr_t dma;			/* physical address of ring */
 	u64 cached_phctime;
diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.h b/drivers/net/ethernet/intel/ice/ice_txrx_lib.h
index e1d49e1235b3..81b8856d8e13 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.h
+++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.h
@@ -151,4 +151,14 @@ ice_process_skb_fields(struct ice_rx_ring *rx_ring,
 		       struct sk_buff *skb);
 void
 ice_receive_skb(struct ice_rx_ring *rx_ring, struct sk_buff *skb, u16 vlan_tag);
+
+static inline void
+ice_xdp_meta_set_desc(struct xdp_buff *xdp,
+		      union ice_32b_rx_flex_desc *eop_desc)
+{
+	struct ice_xdp_buff *xdp_ext = container_of(xdp, struct ice_xdp_buff,
+						    xdp_buff);
+
+	xdp_ext->eop_desc = eop_desc;
+}
 #endif /* !_ICE_TXRX_LIB_H_ */
-- 
2.41.0

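The "child structure" idea from this patch, where the generic buffer is embedded as the first member and recovered with container_of(), can be tried in plain userspace C. The sketch below uses hypothetical stand-in types instead of xdp_buff and the ice descriptor, and a simplified container_of() without the kernel's type checking, so it only illustrates the layout trick.

```c
#include <stddef.h>

/* Simplified userspace container_of(), without the kernel's type checks. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-ins for xdp_buff and the RX descriptor. */
struct base_buff {
	void *data;
};

struct rx_desc {
	unsigned int status;
};

/* Driver-private wrapper: generic buffer first, metadata pointer after. */
struct ext_buff {
	struct base_buff base;
	const struct rx_desc *eop_desc;
};

/* Code that only sees a base_buff pointer relies on offset 0. */
_Static_assert(offsetof(struct ext_buff, base) == 0, "base must be first");

/* Attach a descriptor to the wrapper, given only the generic pointer. */
static void ext_set_desc(struct base_buff *b, const struct rx_desc *desc)
{
	struct ext_buff *ext = container_of(b, struct ext_buff, base);

	ext->eop_desc = desc;
}

/* Read the descriptor back, as a metadata callback would. */
static const struct rx_desc *ext_get_desc(const struct base_buff *b)
{
	const struct ext_buff *ext = container_of(b, const struct ext_buff,
						  base);

	return ext->eop_desc;
}
```

Keeping the generic buffer at offset 0 means a pointer to the wrapper and a pointer to the embedded buffer are interchangeable, which is what the static assertion in the patch guards for buffers coming from the xsk pool.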
* [xdp-hints] Re: [PATCH bpf-next v8 04/18] ice: Introduce ice_xdp_buff 2023-12-05 21:08 ` [xdp-hints] [PATCH bpf-next v8 04/18] ice: Introduce ice_xdp_buff Larysa Zaremba @ 2023-12-12 13:07 ` Maciej Fijalkowski 0 siblings, 0 replies; 27+ messages in thread From: Maciej Fijalkowski @ 2023-12-12 13:07 UTC (permalink / raw) To: Larysa Zaremba Cc: bpf, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend, kpsingh, sdf, haoluo, jolsa, David Ahern, Jakub Kicinski, Willem de Bruijn, Jesper Dangaard Brouer, Anatoly Burakov, Alexander Lobakin, Magnus Karlsson, Maryam Tahhan, xdp-hints, netdev, Willem de Bruijn, Alexei Starovoitov, Tariq Toukan, Saeed Mahameed On Tue, Dec 05, 2023 at 10:08:33PM +0100, Larysa Zaremba wrote: > In order to use XDP hints via kfuncs we need to put > RX descriptor and miscellaneous data next to xdp_buff. > Same as in hints implementations in other drivers, we achieve > this through putting xdp_buff into a child structure. > > Currently, xdp_buff is stored in the ring structure, > so replace it with union that includes child structure. > This way enough memory is available while existing XDP code > remains isolated from hints. > > Minimum size of the new child structure (ice_xdp_buff) is exactly > 64 bytes (single cache line). To place it at the start of a cache line, > move 'next' field from CL1 to CL4, as it isn't used often. This still > leaves 192 bits available in CL3 for packet context extensions. 
> > Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> > --- > drivers/net/ethernet/intel/ice/ice_txrx.c | 7 +++++-- > drivers/net/ethernet/intel/ice/ice_txrx.h | 18 +++++++++++++++--- > drivers/net/ethernet/intel/ice/ice_txrx_lib.h | 10 ++++++++++ > 3 files changed, 30 insertions(+), 5 deletions(-) > > diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c > index 6afe4cf1de8a..99ea47011fe0 100644 > --- a/drivers/net/ethernet/intel/ice/ice_txrx.c > +++ b/drivers/net/ethernet/intel/ice/ice_txrx.c > @@ -557,13 +557,14 @@ ice_rx_frame_truesize(struct ice_rx_ring *rx_ring, const unsigned int size) > * @xdp_prog: XDP program to run > * @xdp_ring: ring to be used for XDP_TX action > * @rx_buf: Rx buffer to store the XDP action > + * @eop_desc: Last descriptor in packet to read metadata from > * > * Returns any of ICE_XDP_{PASS, CONSUMED, TX, REDIR} > */ > static void > ice_run_xdp(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp, > struct bpf_prog *xdp_prog, struct ice_tx_ring *xdp_ring, > - struct ice_rx_buf *rx_buf) > + struct ice_rx_buf *rx_buf, union ice_32b_rx_flex_desc *eop_desc) > { > unsigned int ret = ICE_XDP_PASS; > u32 act; > @@ -571,6 +572,8 @@ ice_run_xdp(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp, > if (!xdp_prog) > goto exit; > > + ice_xdp_meta_set_desc(xdp, eop_desc); > + > act = bpf_prog_run_xdp(xdp_prog, xdp); > switch (act) { > case XDP_PASS: > @@ -1240,7 +1243,7 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget) > if (ice_is_non_eop(rx_ring, rx_desc)) > continue; > > - ice_run_xdp(rx_ring, xdp, xdp_prog, xdp_ring, rx_buf); > + ice_run_xdp(rx_ring, xdp, xdp_prog, xdp_ring, rx_buf, rx_desc); > if (rx_buf->act == ICE_XDP_PASS) > goto construct_skb; > total_rx_bytes += xdp_get_buff_len(xdp); > diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.h b/drivers/net/ethernet/intel/ice/ice_txrx.h > index 
daf7b9dbb143..cd93394fab17 100644 > --- a/drivers/net/ethernet/intel/ice/ice_txrx.h > +++ b/drivers/net/ethernet/intel/ice/ice_txrx.h > @@ -257,6 +257,14 @@ enum ice_rx_dtype { > ICE_RX_DTYPE_SPLIT_ALWAYS = 2, > }; > > +struct ice_xdp_buff { > + struct xdp_buff xdp_buff; > + const union ice_32b_rx_flex_desc *eop_desc; > +}; > + > +/* Required for compatibility with xdp_buffs from xsk_pool */ > +static_assert(offsetof(struct ice_xdp_buff, xdp_buff) == 0); > + > /* indices into GLINT_ITR registers */ > #define ICE_RX_ITR ICE_IDX_ITR0 > #define ICE_TX_ITR ICE_IDX_ITR1 > @@ -298,7 +306,6 @@ enum ice_dynamic_itr { > /* descriptor ring, associated with a VSI */ > struct ice_rx_ring { > /* CL1 - 1st cacheline starts here */ > - struct ice_rx_ring *next; /* pointer to next ring in q_vector */ > void *desc; /* Descriptor ring memory */ > struct device *dev; /* Used for DMA mapping */ > struct net_device *netdev; /* netdev ring maps to */ > @@ -310,12 +317,16 @@ struct ice_rx_ring { > u16 count; /* Number of descriptors */ > u16 reg_idx; /* HW register index of the ring */ > u16 next_to_alloc; > - /* CL2 - 2nd cacheline starts here */ > + > union { > struct ice_rx_buf *rx_buf; > struct xdp_buff **xdp_buf; > }; > - struct xdp_buff xdp; > + /* CL2 - 2nd cacheline starts here */ > + union { > + struct ice_xdp_buff xdp_ext; > + struct xdp_buff xdp; > + }; > /* CL3 - 3rd cacheline starts here */ > struct bpf_prog *xdp_prog; > u16 rx_offset; > @@ -332,6 +343,7 @@ struct ice_rx_ring { > /* CL4 - 4th cacheline starts here */ > struct ice_channel *ch; > struct ice_tx_ring *xdp_ring; > + struct ice_rx_ring *next; /* pointer to next ring in q_vector */ > struct xsk_buff_pool *xsk_pool; > dma_addr_t dma; /* physical address of ring */ > u64 cached_phctime; > diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.h b/drivers/net/ethernet/intel/ice/ice_txrx_lib.h > index e1d49e1235b3..81b8856d8e13 100644 > --- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.h > +++ 
b/drivers/net/ethernet/intel/ice/ice_txrx_lib.h > @@ -151,4 +151,14 @@ ice_process_skb_fields(struct ice_rx_ring *rx_ring, > struct sk_buff *skb); > void > ice_receive_skb(struct ice_rx_ring *rx_ring, struct sk_buff *skb, u16 vlan_tag); > + > +static inline void > +ice_xdp_meta_set_desc(struct xdp_buff *xdp, > + union ice_32b_rx_flex_desc *eop_desc) > +{ > + struct ice_xdp_buff *xdp_ext = container_of(xdp, struct ice_xdp_buff, > + xdp_buff); > + > + xdp_ext->eop_desc = eop_desc; > +} > #endif /* !_ICE_TXRX_LIB_H_ */ > -- > 2.41.0
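The wrapper-struct pattern used by ice_xdp_buff and ice_xdp_meta_set_desc() in the patch above can be sketched in plain userspace C. This is only an illustration under stated assumptions: every demo_* name is hypothetical, the container_of() macro is a simplified stand-in for the kernel one, and the two small structs merely take the place of struct xdp_buff and the Rx flex descriptor.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for struct xdp_buff. */
struct demo_xdp_buff {
	void *data;
	void *data_end;
};

/* Hypothetical stand-in for the Rx descriptor. */
struct demo_rx_desc {
	unsigned int status;
};

/* Driver-private wrapper: the generic buffer stays the first member,
 * so a pointer to it is also a pointer to the wrapper.
 */
struct demo_ext_buff {
	struct demo_xdp_buff xdp_buff;
	const struct demo_rx_desc *eop_desc;
};

static_assert(offsetof(struct demo_ext_buff, xdp_buff) == 0,
	      "generic buffer must be the first member");

/* Same shape as ice_xdp_meta_set_desc(): recover the wrapper from the
 * generic buffer pointer and remember the last descriptor of the packet.
 */
static inline void demo_meta_set_desc(struct demo_xdp_buff *xdp,
				      const struct demo_rx_desc *eop_desc)
{
	struct demo_ext_buff *ext = container_of(xdp, struct demo_ext_buff,
						 xdp_buff);

	ext->eop_desc = eop_desc;
}
```

Keeping the generic buffer at offset 0 is what lets code that only knows about the generic type hand the pointer back to the driver, which can then reach its private fields without an extra lookup.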
* [xdp-hints] [PATCH bpf-next v8 05/18] ice: Support HW timestamp hint
From: Larysa Zaremba @ 2023-12-05 21:08 UTC
To: bpf
Cc: Larysa Zaremba, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend, kpsingh, sdf, haoluo, jolsa, David Ahern, Jakub Kicinski, Willem de Bruijn, Jesper Dangaard Brouer, Anatoly Burakov, Alexander Lobakin, Magnus Karlsson, Maryam Tahhan, xdp-hints, netdev, Willem de Bruijn, Alexei Starovoitov, Tariq Toukan, Saeed Mahameed, Maciej Fijalkowski

Use the previously refactored code to create a function that lets XDP code read the HW timestamp.

Also introduce a packet context, where hints-related data will be stored. ice_xdp_buff contains only a pointer to this structure, which avoids copying it in ZC mode later in the series.

HW timestamp is the first hint supported by the driver, so also add xdp_metadata_ops.

Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> --- drivers/net/ethernet/intel/ice/ice.h | 2 ++ drivers/net/ethernet/intel/ice/ice_base.c | 1 + drivers/net/ethernet/intel/ice/ice_main.c | 1 + drivers/net/ethernet/intel/ice/ice_ptp.c | 6 ++--- drivers/net/ethernet/intel/ice/ice_ptp.h | 4 +-- drivers/net/ethernet/intel/ice/ice_txrx.h | 10 +++++++- drivers/net/ethernet/intel/ice/ice_txrx_lib.c | 25 ++++++++++++++++++- 7 files changed, 42 insertions(+), 7 deletions(-) diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h index cd7dcd0fa7f2..9cf4ed3d2885 100644 --- a/drivers/net/ethernet/intel/ice/ice.h +++ b/drivers/net/ethernet/intel/ice/ice.h @@ -996,4 +996,6 @@ static inline void ice_clear_rdma_cap(struct ice_pf *pf) set_bit(ICE_FLAG_UNPLUG_AUX_DEV, pf->flags); clear_bit(ICE_FLAG_RDMA_ENA, pf->flags); } + +extern const struct xdp_metadata_ops ice_xdp_md_ops; #endif /* _ICE_H_ */ diff --git a/drivers/net/ethernet/intel/ice/ice_base.c b/drivers/net/ethernet/intel/ice/ice_base.c index 7fa43827a3f0..2d83f3c029e7 100644 --- a/drivers/net/ethernet/intel/ice/ice_base.c +++ b/drivers/net/ethernet/intel/ice/ice_base.c @@ -575,6 +575,7 @@ int ice_vsi_cfg_rxq(struct ice_rx_ring *ring) xdp_init_buff(&ring->xdp, ice_rx_pg_size(ring) / 2, &ring->xdp_rxq); ring->xdp.data = NULL; + ring->xdp_ext.pkt_ctx = &ring->pkt_ctx; err = ice_setup_rx_ctx(ring); if (err) { dev_err(dev, "ice_setup_rx_ctx failed for RxQ %d, err %d\n", diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c index 43ba3e55b8c1..0a2415dd78f1 100644 --- a/drivers/net/ethernet/intel/ice/ice_main.c +++ b/drivers/net/ethernet/intel/ice/ice_main.c @@ -3397,6 +3397,7 @@ static void ice_set_ops(struct ice_vsi *vsi) netdev->netdev_ops = &ice_netdev_ops; netdev->udp_tunnel_nic_info = &pf->hw.udp_tunnel_nic; + netdev->xdp_metadata_ops = &ice_xdp_md_ops; 
ice_set_ethtool_ops(netdev); if (vsi->type != ICE_VSI_PF) diff --git a/drivers/net/ethernet/intel/ice/ice_ptp.c b/drivers/net/ethernet/intel/ice/ice_ptp.c index bb54f43b5a18..a4d3a9ee409a 100644 --- a/drivers/net/ethernet/intel/ice/ice_ptp.c +++ b/drivers/net/ethernet/intel/ice/ice_ptp.c @@ -2129,12 +2129,12 @@ int ice_ptp_set_ts_config(struct ice_pf *pf, struct ifreq *ifr) /** * ice_ptp_get_rx_hwts - Get packet Rx timestamp in ns * @rx_desc: Receive descriptor - * @rx_ring: Ring to get the cached time + * @pkt_ctx: Packet context to get the cached time * * The driver receives a notification in the receive descriptor with timestamp. */ u64 ice_ptp_get_rx_hwts(const union ice_32b_rx_flex_desc *rx_desc, - struct ice_rx_ring *rx_ring) + const struct ice_pkt_ctx *pkt_ctx) { u64 ts_ns, cached_time; u32 ts_high; @@ -2142,7 +2142,7 @@ u64 ice_ptp_get_rx_hwts(const union ice_32b_rx_flex_desc *rx_desc, if (!(rx_desc->wb.time_stamp_low & ICE_PTP_TS_VALID)) return 0; - cached_time = READ_ONCE(rx_ring->cached_phctime); + cached_time = READ_ONCE(pkt_ctx->cached_phctime); /* Do not report a timestamp if we don't have a cached PHC time */ if (!cached_time) diff --git a/drivers/net/ethernet/intel/ice/ice_ptp.h b/drivers/net/ethernet/intel/ice/ice_ptp.h index 45327cb92bc6..5c6450e4f2f2 100644 --- a/drivers/net/ethernet/intel/ice/ice_ptp.h +++ b/drivers/net/ethernet/intel/ice/ice_ptp.h @@ -299,7 +299,7 @@ s8 ice_ptp_request_ts(struct ice_ptp_tx *tx, struct sk_buff *skb); enum ice_tx_tstamp_work ice_ptp_process_ts(struct ice_pf *pf); u64 ice_ptp_get_rx_hwts(const union ice_32b_rx_flex_desc *rx_desc, - struct ice_rx_ring *rx_ring); + const struct ice_pkt_ctx *pkt_ctx); void ice_ptp_reset(struct ice_pf *pf); void ice_ptp_prepare_for_reset(struct ice_pf *pf); void ice_ptp_init(struct ice_pf *pf); @@ -331,7 +331,7 @@ static inline bool ice_ptp_process_ts(struct ice_pf *pf) static inline u64 ice_ptp_get_rx_hwts(const union ice_32b_rx_flex_desc *rx_desc, - struct ice_rx_ring *rx_ring) + 
const struct ice_pkt_ctx *pkt_ctx) { return 0; } diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.h b/drivers/net/ethernet/intel/ice/ice_txrx.h index cd93394fab17..ce3434c73a4b 100644 --- a/drivers/net/ethernet/intel/ice/ice_txrx.h +++ b/drivers/net/ethernet/intel/ice/ice_txrx.h @@ -257,9 +257,14 @@ enum ice_rx_dtype { ICE_RX_DTYPE_SPLIT_ALWAYS = 2, }; +struct ice_pkt_ctx { + u64 cached_phctime; +}; + struct ice_xdp_buff { struct xdp_buff xdp_buff; const union ice_32b_rx_flex_desc *eop_desc; + const struct ice_pkt_ctx *pkt_ctx; }; /* Required for compatibility with xdp_buffs from xsk_pool */ @@ -328,6 +333,10 @@ struct ice_rx_ring { struct xdp_buff xdp; }; /* CL3 - 3rd cacheline starts here */ + union { + struct ice_pkt_ctx pkt_ctx; + u64 cached_phctime; + }; struct bpf_prog *xdp_prog; u16 rx_offset; @@ -346,7 +355,6 @@ struct ice_rx_ring { struct ice_rx_ring *next; /* pointer to next ring in q_vector */ struct xsk_buff_pool *xsk_pool; dma_addr_t dma; /* physical address of ring */ - u64 cached_phctime; u16 rx_buf_len; u8 dcb_tc; /* Traffic class of ring */ u8 ptp_rx; diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c index 8904b22bfba7..13b8a9addfac 100644 --- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c +++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c @@ -197,7 +197,7 @@ ice_ptp_rx_hwts_to_skb(struct ice_rx_ring *rx_ring, const union ice_32b_rx_flex_desc *rx_desc, struct sk_buff *skb) { - u64 ts_ns = ice_ptp_get_rx_hwts(rx_desc, rx_ring); + u64 ts_ns = ice_ptp_get_rx_hwts(rx_desc, &rx_ring->pkt_ctx); skb_hwtstamps(skb)->hwtstamp = ns_to_ktime(ts_ns); } @@ -507,3 +507,26 @@ void ice_finalize_xdp_rx(struct ice_tx_ring *xdp_ring, unsigned int xdp_res, spin_unlock(&xdp_ring->tx_lock); } } + +/** + * ice_xdp_rx_hw_ts - HW timestamp XDP hint handler + * @ctx: XDP buff pointer + * @ts_ns: destination address + * + * Copy HW timestamp (if available) to the destination address. 
+ */ +static int ice_xdp_rx_hw_ts(const struct xdp_md *ctx, u64 *ts_ns) +{ + const struct ice_xdp_buff *xdp_ext = (void *)ctx; + + *ts_ns = ice_ptp_get_rx_hwts(xdp_ext->eop_desc, + xdp_ext->pkt_ctx); + if (!*ts_ns) + return -ENODATA; + + return 0; +} + +const struct xdp_metadata_ops ice_xdp_md_ops = { + .xmo_rx_timestamp = ice_xdp_rx_hw_ts, +}; -- 2.41.0
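The handler-plus-ops-table arrangement from the patch above can be modelled in a few lines of standalone C. This is a rough sketch, not the kernel API: the demo_* types and functions are hypothetical, the packet struct replaces the descriptor and packet context, and the -EOPNOTSUPP fallback for a missing callback is an assumption of this example.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical per-packet state; 0 means "no valid timestamp". */
struct demo_pkt {
	uint64_t hw_ts_ns;
};

/* Hypothetical ops table standing in for struct xdp_metadata_ops. */
struct demo_metadata_ops {
	int (*rx_timestamp)(const struct demo_pkt *pkt, uint64_t *ts_ns);
};

/* Same shape as ice_xdp_rx_hw_ts(): copy the timestamp out and report
 * -ENODATA when the hardware did not provide one.
 */
static int demo_rx_hw_ts(const struct demo_pkt *pkt, uint64_t *ts_ns)
{
	*ts_ns = pkt->hw_ts_ns;
	if (!*ts_ns)
		return -ENODATA;

	return 0;
}

static const struct demo_metadata_ops demo_md_ops = {
	.rx_timestamp = demo_rx_hw_ts,
};

/* Caller side (assumed behaviour): dispatch through the table and treat
 * a missing callback as "hint not supported".
 */
static int demo_read_timestamp(const struct demo_metadata_ops *ops,
			       const struct demo_pkt *pkt, uint64_t *ts_ns)
{
	if (!ops || !ops->rx_timestamp)
		return -EOPNOTSUPP;

	return ops->rx_timestamp(pkt, ts_ns);
}
```

Using distinct error codes lets a caller tell "this driver has no such hint" apart from "this particular packet carries no timestamp".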
* [xdp-hints] [PATCH bpf-next v8 06/18] ice: Support RX hash XDP hint
From: Larysa Zaremba @ 2023-12-05 21:08 UTC
To: bpf
Cc: Larysa Zaremba, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend, kpsingh, sdf, haoluo, jolsa, David Ahern, Jakub Kicinski, Willem de Bruijn, Jesper Dangaard Brouer, Anatoly Burakov, Alexander Lobakin, Magnus Karlsson, Maryam Tahhan, xdp-hints, netdev, Willem de Bruijn, Alexei Starovoitov, Tariq Toukan, Saeed Mahameed, Maciej Fijalkowski

The RX hash XDP hint requests both the hash value and its type. The type is XDP-specific, so the driver needs a separate way to map hardware ptypes to these values; create a lookup table for that. Instead of writing out a new long list, reuse the contents of ice_ptype_lkup[], the table behind ice_decode_rx_desc_ptype(), through the preprocessor.

The current hash type enum does not contain an ICMP packet type, but ice devices support it, so also add the new type to the core code.

Then use the previously refactored code to create a function that lets XDP code read the RX hash.

Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> --- .../net/ethernet/intel/ice/ice_lan_tx_rx.h | 412 +++++++++--------- drivers/net/ethernet/intel/ice/ice_txrx_lib.c | 73 ++++ include/net/xdp.h | 3 + 3 files changed, 284 insertions(+), 204 deletions(-) diff --git a/drivers/net/ethernet/intel/ice/ice_lan_tx_rx.h b/drivers/net/ethernet/intel/ice/ice_lan_tx_rx.h index 89f986a75cc8..d384ddfcb83e 100644 --- a/drivers/net/ethernet/intel/ice/ice_lan_tx_rx.h +++ b/drivers/net/ethernet/intel/ice/ice_lan_tx_rx.h @@ -673,6 +673,212 @@ struct ice_tlan_ctx { * Use the enum ice_rx_l2_ptype to decode the packet type * ENDIF */ +#define ICE_PTYPES \ + /* L2 Packet types */ \ + ICE_PTT_UNUSED_ENTRY(0), \ + ICE_PTT(1, L2, NONE, NOF, NONE, NONE, NOF, NONE, PAY2), \ + ICE_PTT_UNUSED_ENTRY(2), \ + ICE_PTT_UNUSED_ENTRY(3), \ + ICE_PTT_UNUSED_ENTRY(4), \ + ICE_PTT_UNUSED_ENTRY(5), \ + ICE_PTT(6, L2, NONE, NOF, NONE, NONE, NOF, NONE, NONE), \ + ICE_PTT(7, L2, NONE, NOF, NONE, NONE, NOF, NONE, NONE), \ + ICE_PTT_UNUSED_ENTRY(8), \ + ICE_PTT_UNUSED_ENTRY(9), \ + ICE_PTT(10, L2, NONE, NOF, NONE, NONE, NOF, NONE, NONE), \ + ICE_PTT(11, L2, NONE, NOF, NONE, NONE, NOF, NONE, NONE), \ + ICE_PTT_UNUSED_ENTRY(12), \ + ICE_PTT_UNUSED_ENTRY(13), \ + ICE_PTT_UNUSED_ENTRY(14), \ + ICE_PTT_UNUSED_ENTRY(15), \ + ICE_PTT_UNUSED_ENTRY(16), \ + ICE_PTT_UNUSED_ENTRY(17), \ + ICE_PTT_UNUSED_ENTRY(18), \ + ICE_PTT_UNUSED_ENTRY(19), \ + ICE_PTT_UNUSED_ENTRY(20), \ + ICE_PTT_UNUSED_ENTRY(21), \ + \ + /* Non Tunneled IPv4 */ \ + ICE_PTT(22, IP, IPV4, FRG, NONE, NONE, NOF, NONE, PAY3), \ + ICE_PTT(23, IP, IPV4, NOF, NONE, NONE, NOF, NONE, PAY3), \ + ICE_PTT(24, IP, IPV4, NOF, NONE, NONE, NOF, UDP, PAY4), \ + ICE_PTT_UNUSED_ENTRY(25), \ + ICE_PTT(26, IP, IPV4, NOF, NONE, NONE, NOF, TCP, PAY4), \ + ICE_PTT(27, IP, IPV4, NOF, NONE, NONE, NOF, SCTP, PAY4), \ + ICE_PTT(28, IP, IPV4, NOF, NONE, NONE, NOF, ICMP, PAY4), \ + \ + /* IPv4 --> IPv4 */ \ + ICE_PTT(29, IP, IPV4, NOF, IP_IP, IPV4, FRG, NONE, PAY3), \ + 
ICE_PTT(30, IP, IPV4, NOF, IP_IP, IPV4, NOF, NONE, PAY3), \ + ICE_PTT(31, IP, IPV4, NOF, IP_IP, IPV4, NOF, UDP, PAY4), \ + ICE_PTT_UNUSED_ENTRY(32), \ + ICE_PTT(33, IP, IPV4, NOF, IP_IP, IPV4, NOF, TCP, PAY4), \ + ICE_PTT(34, IP, IPV4, NOF, IP_IP, IPV4, NOF, SCTP, PAY4), \ + ICE_PTT(35, IP, IPV4, NOF, IP_IP, IPV4, NOF, ICMP, PAY4), \ + \ + /* IPv4 --> IPv6 */ \ + ICE_PTT(36, IP, IPV4, NOF, IP_IP, IPV6, FRG, NONE, PAY3), \ + ICE_PTT(37, IP, IPV4, NOF, IP_IP, IPV6, NOF, NONE, PAY3), \ + ICE_PTT(38, IP, IPV4, NOF, IP_IP, IPV6, NOF, UDP, PAY4), \ + ICE_PTT_UNUSED_ENTRY(39), \ + ICE_PTT(40, IP, IPV4, NOF, IP_IP, IPV6, NOF, TCP, PAY4), \ + ICE_PTT(41, IP, IPV4, NOF, IP_IP, IPV6, NOF, SCTP, PAY4), \ + ICE_PTT(42, IP, IPV4, NOF, IP_IP, IPV6, NOF, ICMP, PAY4), \ + \ + /* IPv4 --> GRE/NAT */ \ + ICE_PTT(43, IP, IPV4, NOF, IP_GRENAT, NONE, NOF, NONE, PAY3), \ + \ + /* IPv4 --> GRE/NAT --> IPv4 */ \ + ICE_PTT(44, IP, IPV4, NOF, IP_GRENAT, IPV4, FRG, NONE, PAY3), \ + ICE_PTT(45, IP, IPV4, NOF, IP_GRENAT, IPV4, NOF, NONE, PAY3), \ + ICE_PTT(46, IP, IPV4, NOF, IP_GRENAT, IPV4, NOF, UDP, PAY4), \ + ICE_PTT_UNUSED_ENTRY(47), \ + ICE_PTT(48, IP, IPV4, NOF, IP_GRENAT, IPV4, NOF, TCP, PAY4), \ + ICE_PTT(49, IP, IPV4, NOF, IP_GRENAT, IPV4, NOF, SCTP, PAY4), \ + ICE_PTT(50, IP, IPV4, NOF, IP_GRENAT, IPV4, NOF, ICMP, PAY4), \ + \ + /* IPv4 --> GRE/NAT --> IPv6 */ \ + ICE_PTT(51, IP, IPV4, NOF, IP_GRENAT, IPV6, FRG, NONE, PAY3), \ + ICE_PTT(52, IP, IPV4, NOF, IP_GRENAT, IPV6, NOF, NONE, PAY3), \ + ICE_PTT(53, IP, IPV4, NOF, IP_GRENAT, IPV6, NOF, UDP, PAY4), \ + ICE_PTT_UNUSED_ENTRY(54), \ + ICE_PTT(55, IP, IPV4, NOF, IP_GRENAT, IPV6, NOF, TCP, PAY4), \ + ICE_PTT(56, IP, IPV4, NOF, IP_GRENAT, IPV6, NOF, SCTP, PAY4), \ + ICE_PTT(57, IP, IPV4, NOF, IP_GRENAT, IPV6, NOF, ICMP, PAY4), \ + \ + /* IPv4 --> GRE/NAT --> MAC */ \ + ICE_PTT(58, IP, IPV4, NOF, IP_GRENAT_MAC, NONE, NOF, NONE, PAY3), \ + \ + /* IPv4 --> GRE/NAT --> MAC --> IPv4 */ \ + ICE_PTT(59, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, 
FRG, NONE, PAY3), \ + ICE_PTT(60, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, NOF, NONE, PAY3), \ + ICE_PTT(61, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, NOF, UDP, PAY4), \ + ICE_PTT_UNUSED_ENTRY(62), \ + ICE_PTT(63, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, NOF, TCP, PAY4), \ + ICE_PTT(64, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, NOF, SCTP, PAY4), \ + ICE_PTT(65, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, NOF, ICMP, PAY4), \ + \ + /* IPv4 --> GRE/NAT -> MAC --> IPv6 */ \ + ICE_PTT(66, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, FRG, NONE, PAY3), \ + ICE_PTT(67, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, NOF, NONE, PAY3), \ + ICE_PTT(68, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, NOF, UDP, PAY4), \ + ICE_PTT_UNUSED_ENTRY(69), \ + ICE_PTT(70, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, NOF, TCP, PAY4), \ + ICE_PTT(71, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, NOF, SCTP, PAY4), \ + ICE_PTT(72, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, NOF, ICMP, PAY4), \ + \ + /* IPv4 --> GRE/NAT --> MAC/VLAN */ \ + ICE_PTT(73, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, NONE, NOF, NONE, PAY3), \ + \ + /* IPv4 ---> GRE/NAT -> MAC/VLAN --> IPv4 */ \ + ICE_PTT(74, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, FRG, NONE, PAY3), \ + ICE_PTT(75, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, NONE, PAY3), \ + ICE_PTT(76, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, UDP, PAY4), \ + ICE_PTT_UNUSED_ENTRY(77), \ + ICE_PTT(78, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, TCP, PAY4), \ + ICE_PTT(79, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, SCTP, PAY4), \ + ICE_PTT(80, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, ICMP, PAY4), \ + \ + /* IPv4 -> GRE/NAT -> MAC/VLAN --> IPv6 */ \ + ICE_PTT(81, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, FRG, NONE, PAY3), \ + ICE_PTT(82, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, NONE, PAY3), \ + ICE_PTT(83, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, UDP, PAY4), \ + ICE_PTT_UNUSED_ENTRY(84), \ + ICE_PTT(85, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, TCP, PAY4), \ + ICE_PTT(86, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, SCTP, PAY4), \ + 
ICE_PTT(87, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, ICMP, PAY4), \ + \ + /* Non Tunneled IPv6 */ \ + ICE_PTT(88, IP, IPV6, FRG, NONE, NONE, NOF, NONE, PAY3), \ + ICE_PTT(89, IP, IPV6, NOF, NONE, NONE, NOF, NONE, PAY3), \ + ICE_PTT(90, IP, IPV6, NOF, NONE, NONE, NOF, UDP, PAY4), \ + ICE_PTT_UNUSED_ENTRY(91), \ + ICE_PTT(92, IP, IPV6, NOF, NONE, NONE, NOF, TCP, PAY4), \ + ICE_PTT(93, IP, IPV6, NOF, NONE, NONE, NOF, SCTP, PAY4), \ + ICE_PTT(94, IP, IPV6, NOF, NONE, NONE, NOF, ICMP, PAY4), \ + \ + /* IPv6 --> IPv4 */ \ + ICE_PTT(95, IP, IPV6, NOF, IP_IP, IPV4, FRG, NONE, PAY3), \ + ICE_PTT(96, IP, IPV6, NOF, IP_IP, IPV4, NOF, NONE, PAY3), \ + ICE_PTT(97, IP, IPV6, NOF, IP_IP, IPV4, NOF, UDP, PAY4), \ + ICE_PTT_UNUSED_ENTRY(98), \ + ICE_PTT(99, IP, IPV6, NOF, IP_IP, IPV4, NOF, TCP, PAY4), \ + ICE_PTT(100, IP, IPV6, NOF, IP_IP, IPV4, NOF, SCTP, PAY4), \ + ICE_PTT(101, IP, IPV6, NOF, IP_IP, IPV4, NOF, ICMP, PAY4), \ + \ + /* IPv6 --> IPv6 */ \ + ICE_PTT(102, IP, IPV6, NOF, IP_IP, IPV6, FRG, NONE, PAY3), \ + ICE_PTT(103, IP, IPV6, NOF, IP_IP, IPV6, NOF, NONE, PAY3), \ + ICE_PTT(104, IP, IPV6, NOF, IP_IP, IPV6, NOF, UDP, PAY4), \ + ICE_PTT_UNUSED_ENTRY(105), \ + ICE_PTT(106, IP, IPV6, NOF, IP_IP, IPV6, NOF, TCP, PAY4), \ + ICE_PTT(107, IP, IPV6, NOF, IP_IP, IPV6, NOF, SCTP, PAY4), \ + ICE_PTT(108, IP, IPV6, NOF, IP_IP, IPV6, NOF, ICMP, PAY4), \ + \ + /* IPv6 --> GRE/NAT */ \ + ICE_PTT(109, IP, IPV6, NOF, IP_GRENAT, NONE, NOF, NONE, PAY3), \ + \ + /* IPv6 --> GRE/NAT -> IPv4 */ \ + ICE_PTT(110, IP, IPV6, NOF, IP_GRENAT, IPV4, FRG, NONE, PAY3), \ + ICE_PTT(111, IP, IPV6, NOF, IP_GRENAT, IPV4, NOF, NONE, PAY3), \ + ICE_PTT(112, IP, IPV6, NOF, IP_GRENAT, IPV4, NOF, UDP, PAY4), \ + ICE_PTT_UNUSED_ENTRY(113), \ + ICE_PTT(114, IP, IPV6, NOF, IP_GRENAT, IPV4, NOF, TCP, PAY4), \ + ICE_PTT(115, IP, IPV6, NOF, IP_GRENAT, IPV4, NOF, SCTP, PAY4), \ + ICE_PTT(116, IP, IPV6, NOF, IP_GRENAT, IPV4, NOF, ICMP, PAY4), \ + \ + /* IPv6 --> GRE/NAT -> IPv6 */ \ + ICE_PTT(117, IP, IPV6, 
NOF, IP_GRENAT, IPV6, FRG, NONE, PAY3), \ + ICE_PTT(118, IP, IPV6, NOF, IP_GRENAT, IPV6, NOF, NONE, PAY3), \ + ICE_PTT(119, IP, IPV6, NOF, IP_GRENAT, IPV6, NOF, UDP, PAY4), \ + ICE_PTT_UNUSED_ENTRY(120), \ + ICE_PTT(121, IP, IPV6, NOF, IP_GRENAT, IPV6, NOF, TCP, PAY4), \ + ICE_PTT(122, IP, IPV6, NOF, IP_GRENAT, IPV6, NOF, SCTP, PAY4), \ + ICE_PTT(123, IP, IPV6, NOF, IP_GRENAT, IPV6, NOF, ICMP, PAY4), \ + \ + /* IPv6 --> GRE/NAT -> MAC */ \ + ICE_PTT(124, IP, IPV6, NOF, IP_GRENAT_MAC, NONE, NOF, NONE, PAY3), \ + \ + /* IPv6 --> GRE/NAT -> MAC -> IPv4 */ \ + ICE_PTT(125, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, FRG, NONE, PAY3), \ + ICE_PTT(126, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, NOF, NONE, PAY3), \ + ICE_PTT(127, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, NOF, UDP, PAY4), \ + ICE_PTT_UNUSED_ENTRY(128), \ + ICE_PTT(129, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, NOF, TCP, PAY4), \ + ICE_PTT(130, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, NOF, SCTP, PAY4), \ + ICE_PTT(131, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, NOF, ICMP, PAY4), \ + \ + /* IPv6 --> GRE/NAT -> MAC -> IPv6 */ \ + ICE_PTT(132, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, FRG, NONE, PAY3), \ + ICE_PTT(133, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, NOF, NONE, PAY3), \ + ICE_PTT(134, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, NOF, UDP, PAY4), \ + ICE_PTT_UNUSED_ENTRY(135), \ + ICE_PTT(136, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, NOF, TCP, PAY4), \ + ICE_PTT(137, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, NOF, SCTP, PAY4), \ + ICE_PTT(138, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, NOF, ICMP, PAY4), \ + \ + /* IPv6 --> GRE/NAT -> MAC/VLAN */ \ + ICE_PTT(139, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, NONE, NOF, NONE, PAY3), \ + \ + /* IPv6 --> GRE/NAT -> MAC/VLAN --> IPv4 */ \ + ICE_PTT(140, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV4, FRG, NONE, PAY3), \ + ICE_PTT(141, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, NONE, PAY3), \ + ICE_PTT(142, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, UDP, PAY4), \ + ICE_PTT_UNUSED_ENTRY(143), \ + ICE_PTT(144, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, 
IPV4, NOF, TCP, PAY4), \ + ICE_PTT(145, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, SCTP, PAY4), \ + ICE_PTT(146, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, ICMP, PAY4), \ + \ + /* IPv6 --> GRE/NAT -> MAC/VLAN --> IPv6 */ \ + ICE_PTT(147, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, FRG, NONE, PAY3), \ + ICE_PTT(148, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, NONE, PAY3), \ + ICE_PTT(149, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, UDP, PAY4), \ + ICE_PTT_UNUSED_ENTRY(150), \ + ICE_PTT(151, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, TCP, PAY4), \ + ICE_PTT(152, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, SCTP, PAY4), \ + ICE_PTT(153, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, ICMP, PAY4), + +#define ICE_NUM_DEFINED_PTYPES 154 /* macro to make the table lines short, use explicit indexing with [PTYPE] */ #define ICE_PTT(PTYPE, OUTER_IP, OUTER_IP_VER, OUTER_FRAG, T, TE, TEF, I, PL)\ @@ -695,212 +901,10 @@ struct ice_tlan_ctx { /* Lookup table mapping in the 10-bit HW PTYPE to the bit field for decoding */ static const struct ice_rx_ptype_decoded ice_ptype_lkup[BIT(10)] = { - /* L2 Packet types */ - ICE_PTT_UNUSED_ENTRY(0), - ICE_PTT(1, L2, NONE, NOF, NONE, NONE, NOF, NONE, PAY2), - ICE_PTT_UNUSED_ENTRY(2), - ICE_PTT_UNUSED_ENTRY(3), - ICE_PTT_UNUSED_ENTRY(4), - ICE_PTT_UNUSED_ENTRY(5), - ICE_PTT(6, L2, NONE, NOF, NONE, NONE, NOF, NONE, NONE), - ICE_PTT(7, L2, NONE, NOF, NONE, NONE, NOF, NONE, NONE), - ICE_PTT_UNUSED_ENTRY(8), - ICE_PTT_UNUSED_ENTRY(9), - ICE_PTT(10, L2, NONE, NOF, NONE, NONE, NOF, NONE, NONE), - ICE_PTT(11, L2, NONE, NOF, NONE, NONE, NOF, NONE, NONE), - ICE_PTT_UNUSED_ENTRY(12), - ICE_PTT_UNUSED_ENTRY(13), - ICE_PTT_UNUSED_ENTRY(14), - ICE_PTT_UNUSED_ENTRY(15), - ICE_PTT_UNUSED_ENTRY(16), - ICE_PTT_UNUSED_ENTRY(17), - ICE_PTT_UNUSED_ENTRY(18), - ICE_PTT_UNUSED_ENTRY(19), - ICE_PTT_UNUSED_ENTRY(20), - ICE_PTT_UNUSED_ENTRY(21), - - /* Non Tunneled IPv4 */ - ICE_PTT(22, IP, IPV4, FRG, NONE, NONE, NOF, NONE, PAY3), - ICE_PTT(23, IP, IPV4, 
NOF, NONE, NONE, NOF, NONE, PAY3), - ICE_PTT(24, IP, IPV4, NOF, NONE, NONE, NOF, UDP, PAY4), - ICE_PTT_UNUSED_ENTRY(25), - ICE_PTT(26, IP, IPV4, NOF, NONE, NONE, NOF, TCP, PAY4), - ICE_PTT(27, IP, IPV4, NOF, NONE, NONE, NOF, SCTP, PAY4), - ICE_PTT(28, IP, IPV4, NOF, NONE, NONE, NOF, ICMP, PAY4), - - /* IPv4 --> IPv4 */ - ICE_PTT(29, IP, IPV4, NOF, IP_IP, IPV4, FRG, NONE, PAY3), - ICE_PTT(30, IP, IPV4, NOF, IP_IP, IPV4, NOF, NONE, PAY3), - ICE_PTT(31, IP, IPV4, NOF, IP_IP, IPV4, NOF, UDP, PAY4), - ICE_PTT_UNUSED_ENTRY(32), - ICE_PTT(33, IP, IPV4, NOF, IP_IP, IPV4, NOF, TCP, PAY4), - ICE_PTT(34, IP, IPV4, NOF, IP_IP, IPV4, NOF, SCTP, PAY4), - ICE_PTT(35, IP, IPV4, NOF, IP_IP, IPV4, NOF, ICMP, PAY4), - - /* IPv4 --> IPv6 */ - ICE_PTT(36, IP, IPV4, NOF, IP_IP, IPV6, FRG, NONE, PAY3), - ICE_PTT(37, IP, IPV4, NOF, IP_IP, IPV6, NOF, NONE, PAY3), - ICE_PTT(38, IP, IPV4, NOF, IP_IP, IPV6, NOF, UDP, PAY4), - ICE_PTT_UNUSED_ENTRY(39), - ICE_PTT(40, IP, IPV4, NOF, IP_IP, IPV6, NOF, TCP, PAY4), - ICE_PTT(41, IP, IPV4, NOF, IP_IP, IPV6, NOF, SCTP, PAY4), - ICE_PTT(42, IP, IPV4, NOF, IP_IP, IPV6, NOF, ICMP, PAY4), - - /* IPv4 --> GRE/NAT */ - ICE_PTT(43, IP, IPV4, NOF, IP_GRENAT, NONE, NOF, NONE, PAY3), - - /* IPv4 --> GRE/NAT --> IPv4 */ - ICE_PTT(44, IP, IPV4, NOF, IP_GRENAT, IPV4, FRG, NONE, PAY3), - ICE_PTT(45, IP, IPV4, NOF, IP_GRENAT, IPV4, NOF, NONE, PAY3), - ICE_PTT(46, IP, IPV4, NOF, IP_GRENAT, IPV4, NOF, UDP, PAY4), - ICE_PTT_UNUSED_ENTRY(47), - ICE_PTT(48, IP, IPV4, NOF, IP_GRENAT, IPV4, NOF, TCP, PAY4), - ICE_PTT(49, IP, IPV4, NOF, IP_GRENAT, IPV4, NOF, SCTP, PAY4), - ICE_PTT(50, IP, IPV4, NOF, IP_GRENAT, IPV4, NOF, ICMP, PAY4), - - /* IPv4 --> GRE/NAT --> IPv6 */ - ICE_PTT(51, IP, IPV4, NOF, IP_GRENAT, IPV6, FRG, NONE, PAY3), - ICE_PTT(52, IP, IPV4, NOF, IP_GRENAT, IPV6, NOF, NONE, PAY3), - ICE_PTT(53, IP, IPV4, NOF, IP_GRENAT, IPV6, NOF, UDP, PAY4), - ICE_PTT_UNUSED_ENTRY(54), - ICE_PTT(55, IP, IPV4, NOF, IP_GRENAT, IPV6, NOF, TCP, PAY4), - ICE_PTT(56, IP, IPV4, 
NOF, IP_GRENAT, IPV6, NOF, SCTP, PAY4), - ICE_PTT(57, IP, IPV4, NOF, IP_GRENAT, IPV6, NOF, ICMP, PAY4), - - /* IPv4 --> GRE/NAT --> MAC */ - ICE_PTT(58, IP, IPV4, NOF, IP_GRENAT_MAC, NONE, NOF, NONE, PAY3), - - /* IPv4 --> GRE/NAT --> MAC --> IPv4 */ - ICE_PTT(59, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, FRG, NONE, PAY3), - ICE_PTT(60, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, NOF, NONE, PAY3), - ICE_PTT(61, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, NOF, UDP, PAY4), - ICE_PTT_UNUSED_ENTRY(62), - ICE_PTT(63, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, NOF, TCP, PAY4), - ICE_PTT(64, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, NOF, SCTP, PAY4), - ICE_PTT(65, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, NOF, ICMP, PAY4), - - /* IPv4 --> GRE/NAT -> MAC --> IPv6 */ - ICE_PTT(66, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, FRG, NONE, PAY3), - ICE_PTT(67, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, NOF, NONE, PAY3), - ICE_PTT(68, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, NOF, UDP, PAY4), - ICE_PTT_UNUSED_ENTRY(69), - ICE_PTT(70, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, NOF, TCP, PAY4), - ICE_PTT(71, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, NOF, SCTP, PAY4), - ICE_PTT(72, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, NOF, ICMP, PAY4), - - /* IPv4 --> GRE/NAT --> MAC/VLAN */ - ICE_PTT(73, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, NONE, NOF, NONE, PAY3), - - /* IPv4 ---> GRE/NAT -> MAC/VLAN --> IPv4 */ - ICE_PTT(74, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, FRG, NONE, PAY3), - ICE_PTT(75, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, NONE, PAY3), - ICE_PTT(76, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, UDP, PAY4), - ICE_PTT_UNUSED_ENTRY(77), - ICE_PTT(78, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, TCP, PAY4), - ICE_PTT(79, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, SCTP, PAY4), - ICE_PTT(80, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, ICMP, PAY4), - - /* IPv4 -> GRE/NAT -> MAC/VLAN --> IPv6 */ - ICE_PTT(81, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, FRG, NONE, PAY3), - ICE_PTT(82, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, NONE, PAY3), - ICE_PTT(83, IP, 
IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, UDP, PAY4), - ICE_PTT_UNUSED_ENTRY(84), - ICE_PTT(85, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, TCP, PAY4), - ICE_PTT(86, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, SCTP, PAY4), - ICE_PTT(87, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, ICMP, PAY4), - - /* Non Tunneled IPv6 */ - ICE_PTT(88, IP, IPV6, FRG, NONE, NONE, NOF, NONE, PAY3), - ICE_PTT(89, IP, IPV6, NOF, NONE, NONE, NOF, NONE, PAY3), - ICE_PTT(90, IP, IPV6, NOF, NONE, NONE, NOF, UDP, PAY4), - ICE_PTT_UNUSED_ENTRY(91), - ICE_PTT(92, IP, IPV6, NOF, NONE, NONE, NOF, TCP, PAY4), - ICE_PTT(93, IP, IPV6, NOF, NONE, NONE, NOF, SCTP, PAY4), - ICE_PTT(94, IP, IPV6, NOF, NONE, NONE, NOF, ICMP, PAY4), - - /* IPv6 --> IPv4 */ - ICE_PTT(95, IP, IPV6, NOF, IP_IP, IPV4, FRG, NONE, PAY3), - ICE_PTT(96, IP, IPV6, NOF, IP_IP, IPV4, NOF, NONE, PAY3), - ICE_PTT(97, IP, IPV6, NOF, IP_IP, IPV4, NOF, UDP, PAY4), - ICE_PTT_UNUSED_ENTRY(98), - ICE_PTT(99, IP, IPV6, NOF, IP_IP, IPV4, NOF, TCP, PAY4), - ICE_PTT(100, IP, IPV6, NOF, IP_IP, IPV4, NOF, SCTP, PAY4), - ICE_PTT(101, IP, IPV6, NOF, IP_IP, IPV4, NOF, ICMP, PAY4), - - /* IPv6 --> IPv6 */ - ICE_PTT(102, IP, IPV6, NOF, IP_IP, IPV6, FRG, NONE, PAY3), - ICE_PTT(103, IP, IPV6, NOF, IP_IP, IPV6, NOF, NONE, PAY3), - ICE_PTT(104, IP, IPV6, NOF, IP_IP, IPV6, NOF, UDP, PAY4), - ICE_PTT_UNUSED_ENTRY(105), - ICE_PTT(106, IP, IPV6, NOF, IP_IP, IPV6, NOF, TCP, PAY4), - ICE_PTT(107, IP, IPV6, NOF, IP_IP, IPV6, NOF, SCTP, PAY4), - ICE_PTT(108, IP, IPV6, NOF, IP_IP, IPV6, NOF, ICMP, PAY4), - - /* IPv6 --> GRE/NAT */ - ICE_PTT(109, IP, IPV6, NOF, IP_GRENAT, NONE, NOF, NONE, PAY3), - - /* IPv6 --> GRE/NAT -> IPv4 */ - ICE_PTT(110, IP, IPV6, NOF, IP_GRENAT, IPV4, FRG, NONE, PAY3), - ICE_PTT(111, IP, IPV6, NOF, IP_GRENAT, IPV4, NOF, NONE, PAY3), - ICE_PTT(112, IP, IPV6, NOF, IP_GRENAT, IPV4, NOF, UDP, PAY4), - ICE_PTT_UNUSED_ENTRY(113), - ICE_PTT(114, IP, IPV6, NOF, IP_GRENAT, IPV4, NOF, TCP, PAY4), - ICE_PTT(115, IP, IPV6, NOF, IP_GRENAT, IPV4, 
NOF, SCTP, PAY4), - ICE_PTT(116, IP, IPV6, NOF, IP_GRENAT, IPV4, NOF, ICMP, PAY4), - - /* IPv6 --> GRE/NAT -> IPv6 */ - ICE_PTT(117, IP, IPV6, NOF, IP_GRENAT, IPV6, FRG, NONE, PAY3), - ICE_PTT(118, IP, IPV6, NOF, IP_GRENAT, IPV6, NOF, NONE, PAY3), - ICE_PTT(119, IP, IPV6, NOF, IP_GRENAT, IPV6, NOF, UDP, PAY4), - ICE_PTT_UNUSED_ENTRY(120), - ICE_PTT(121, IP, IPV6, NOF, IP_GRENAT, IPV6, NOF, TCP, PAY4), - ICE_PTT(122, IP, IPV6, NOF, IP_GRENAT, IPV6, NOF, SCTP, PAY4), - ICE_PTT(123, IP, IPV6, NOF, IP_GRENAT, IPV6, NOF, ICMP, PAY4), - - /* IPv6 --> GRE/NAT -> MAC */ - ICE_PTT(124, IP, IPV6, NOF, IP_GRENAT_MAC, NONE, NOF, NONE, PAY3), - - /* IPv6 --> GRE/NAT -> MAC -> IPv4 */ - ICE_PTT(125, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, FRG, NONE, PAY3), - ICE_PTT(126, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, NOF, NONE, PAY3), - ICE_PTT(127, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, NOF, UDP, PAY4), - ICE_PTT_UNUSED_ENTRY(128), - ICE_PTT(129, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, NOF, TCP, PAY4), - ICE_PTT(130, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, NOF, SCTP, PAY4), - ICE_PTT(131, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, NOF, ICMP, PAY4), - - /* IPv6 --> GRE/NAT -> MAC -> IPv6 */ - ICE_PTT(132, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, FRG, NONE, PAY3), - ICE_PTT(133, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, NOF, NONE, PAY3), - ICE_PTT(134, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, NOF, UDP, PAY4), - ICE_PTT_UNUSED_ENTRY(135), - ICE_PTT(136, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, NOF, TCP, PAY4), - ICE_PTT(137, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, NOF, SCTP, PAY4), - ICE_PTT(138, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, NOF, ICMP, PAY4), - - /* IPv6 --> GRE/NAT -> MAC/VLAN */ - ICE_PTT(139, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, NONE, NOF, NONE, PAY3), - - /* IPv6 --> GRE/NAT -> MAC/VLAN --> IPv4 */ - ICE_PTT(140, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV4, FRG, NONE, PAY3), - ICE_PTT(141, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, NONE, PAY3), - ICE_PTT(142, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, UDP, PAY4), - 
	ICE_PTT_UNUSED_ENTRY(143),
-	ICE_PTT(144, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, TCP,  PAY4),
-	ICE_PTT(145, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, SCTP, PAY4),
-	ICE_PTT(146, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, ICMP, PAY4),
-
-	/* IPv6 --> GRE/NAT -> MAC/VLAN --> IPv6 */
-	ICE_PTT(147, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, FRG, NONE, PAY3),
-	ICE_PTT(148, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, NONE, PAY3),
-	ICE_PTT(149, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, UDP,  PAY4),
-	ICE_PTT_UNUSED_ENTRY(150),
-	ICE_PTT(151, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, TCP,  PAY4),
-	ICE_PTT(152, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, SCTP, PAY4),
-	ICE_PTT(153, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, ICMP, PAY4),
+	ICE_PTYPES

 	/* unused entries */
-	[154 ... 1023] = { 0, 0, 0, 0, 0, 0, 0, 0, 0 }
+	[ICE_NUM_DEFINED_PTYPES ... 1023] = { 0, 0, 0, 0, 0, 0, 0, 0, 0 }
 };

 static inline struct ice_rx_ptype_decoded ice_decode_rx_desc_ptype(u16 ptype)
diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
index 13b8a9addfac..09610c5615a8 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
@@ -527,6 +527,79 @@ static int ice_xdp_rx_hw_ts(const struct xdp_md *ctx, u64 *ts_ns)
 	return 0;
 }

+/* Define a ptype index -> XDP hash type lookup table.
+ * It uses the same ptype definitions as ice_decode_rx_desc_ptype[],
+ * avoiding possible copy-paste errors.
+ */
+#undef ICE_PTT
+#undef ICE_PTT_UNUSED_ENTRY
+
+#define ICE_PTT(PTYPE, OUTER_IP, OUTER_IP_VER, OUTER_FRAG, T, TE, TEF, I, PL)\
+	[PTYPE] = XDP_RSS_L3_##OUTER_IP_VER | XDP_RSS_L4_##I | XDP_RSS_TYPE_##PL
+
+#define ICE_PTT_UNUSED_ENTRY(PTYPE) [PTYPE] = 0
+
+/* A few supplementary definitions for when XDP hash types do not coincide
+ * with what can be generated from ptype definitions
+ * by means of preprocessor concatenation.
+ */
+#define XDP_RSS_L3_NONE		XDP_RSS_TYPE_NONE
+#define XDP_RSS_L4_NONE		XDP_RSS_TYPE_NONE
+#define XDP_RSS_TYPE_PAY2	XDP_RSS_TYPE_L2
+#define XDP_RSS_TYPE_PAY3	XDP_RSS_TYPE_NONE
+#define XDP_RSS_TYPE_PAY4	XDP_RSS_L4
+
+static const enum xdp_rss_hash_type
+ice_ptype_to_xdp_hash[ICE_NUM_DEFINED_PTYPES] = {
+	ICE_PTYPES
+};
+
+#undef XDP_RSS_L3_NONE
+#undef XDP_RSS_L4_NONE
+#undef XDP_RSS_TYPE_PAY2
+#undef XDP_RSS_TYPE_PAY3
+#undef XDP_RSS_TYPE_PAY4
+
+#undef ICE_PTT
+#undef ICE_PTT_UNUSED_ENTRY
+
+/**
+ * ice_xdp_rx_hash_type - Get XDP-specific hash type from the RX descriptor
+ * @eop_desc: End of Packet descriptor
+ */
+static enum xdp_rss_hash_type
+ice_xdp_rx_hash_type(const union ice_32b_rx_flex_desc *eop_desc)
+{
+	u16 ptype = ice_get_ptype(eop_desc);
+
+	if (unlikely(ptype >= ICE_NUM_DEFINED_PTYPES))
+		return 0;
+
+	return ice_ptype_to_xdp_hash[ptype];
+}
+
+/**
+ * ice_xdp_rx_hash - RX hash XDP hint handler
+ * @ctx: XDP buff pointer
+ * @hash: hash destination address
+ * @rss_type: XDP hash type destination address
+ *
+ * Copy RX hash (if available) and its type to the destination address.
+ */
+static int ice_xdp_rx_hash(const struct xdp_md *ctx, u32 *hash,
+			   enum xdp_rss_hash_type *rss_type)
+{
+	const struct ice_xdp_buff *xdp_ext = (void *)ctx;
+
+	*hash = ice_get_rx_hash(xdp_ext->eop_desc);
+	*rss_type = ice_xdp_rx_hash_type(xdp_ext->eop_desc);
+	if (!likely(*hash))
+		return -ENODATA;
+
+	return 0;
+}
+
 const struct xdp_metadata_ops ice_xdp_md_ops = {
 	.xmo_rx_timestamp		= ice_xdp_rx_hw_ts,
+	.xmo_rx_hash			= ice_xdp_rx_hash,
 };
diff --git a/include/net/xdp.h b/include/net/xdp.h
index 349c36fb5fd8..eb77040b4825 100644
--- a/include/net/xdp.h
+++ b/include/net/xdp.h
@@ -427,6 +427,7 @@ enum xdp_rss_hash_type {
 	XDP_RSS_L4_UDP		= BIT(5),
 	XDP_RSS_L4_SCTP		= BIT(6),
 	XDP_RSS_L4_IPSEC	= BIT(7), /* L4 based hash include IPSEC SPI */
+	XDP_RSS_L4_ICMP		= BIT(8),

 	/* Second part: RSS hash type combinations used for driver HW mapping */
 	XDP_RSS_TYPE_NONE            = 0,
@@ -442,11 +443,13 @@ enum xdp_rss_hash_type {
 	XDP_RSS_TYPE_L4_IPV4_UDP     = XDP_RSS_L3_IPV4 | XDP_RSS_L4 | XDP_RSS_L4_UDP,
 	XDP_RSS_TYPE_L4_IPV4_SCTP    = XDP_RSS_L3_IPV4 | XDP_RSS_L4 | XDP_RSS_L4_SCTP,
 	XDP_RSS_TYPE_L4_IPV4_IPSEC   = XDP_RSS_L3_IPV4 | XDP_RSS_L4 | XDP_RSS_L4_IPSEC,
+	XDP_RSS_TYPE_L4_IPV4_ICMP    = XDP_RSS_L3_IPV4 | XDP_RSS_L4 | XDP_RSS_L4_ICMP,

 	XDP_RSS_TYPE_L4_IPV6_TCP     = XDP_RSS_L3_IPV6 | XDP_RSS_L4 | XDP_RSS_L4_TCP,
 	XDP_RSS_TYPE_L4_IPV6_UDP     = XDP_RSS_L3_IPV6 | XDP_RSS_L4 | XDP_RSS_L4_UDP,
 	XDP_RSS_TYPE_L4_IPV6_SCTP    = XDP_RSS_L3_IPV6 | XDP_RSS_L4 | XDP_RSS_L4_SCTP,
 	XDP_RSS_TYPE_L4_IPV6_IPSEC   = XDP_RSS_L3_IPV6 | XDP_RSS_L4 | XDP_RSS_L4_IPSEC,
+	XDP_RSS_TYPE_L4_IPV6_ICMP    = XDP_RSS_L3_IPV6 | XDP_RSS_L4 | XDP_RSS_L4_ICMP,

 	XDP_RSS_TYPE_L4_IPV6_TCP_EX  = XDP_RSS_TYPE_L4_IPV6_TCP  | XDP_RSS_L3_DYNHDR,
 	XDP_RSS_TYPE_L4_IPV6_UDP_EX  = XDP_RSS_TYPE_L4_IPV6_UDP  | XDP_RSS_L3_DYNHDR,
-- 
2.41.0
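The table-generation trick used in the patch above (one ptype list, expanded twice with different ICE_PTT() definitions) can be tried outside the kernel. What follows is only a minimal userspace sketch of that idea, not the driver code: the DEMO_PTYPES list, the DEMO_PTT() entry macro, the flag values in enum demo_rss_type and both lookup tables are hypothetical stand-ins invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag values, loosely modelled on enum xdp_rss_hash_type. */
enum demo_rss_type {
	DEMO_RSS_NONE    = 0,
	DEMO_RSS_L3_IPV4 = 1 << 0,
	DEMO_RSS_L3_IPV6 = 1 << 1,
	DEMO_RSS_L4      = 1 << 2,
	DEMO_RSS_L4_TCP  = 1 << 3,
	DEMO_RSS_L4_UDP  = 1 << 4,
};

/* Aliases for list tokens that have no flag of their own, so that
 * token pasting always produces a defined name.
 */
#define DEMO_RSS_L4_NONE	DEMO_RSS_NONE
#define DEMO_RSS_PAY3		DEMO_RSS_NONE
#define DEMO_RSS_PAY4		DEMO_RSS_L4

/* Single source of truth: index, L3 version, L4 protocol, payload layer. */
#define DEMO_PTYPES \
	DEMO_PTT(0, IPV4, NONE, PAY3), \
	DEMO_PTT(1, IPV4, TCP,  PAY4), \
	DEMO_PTT(2, IPV6, UDP,  PAY4)

#define DEMO_NUM_PTYPES 3

/* First expansion: a human-readable name per ptype. */
#define DEMO_PTT(idx, ipver, l4, pl) [idx] = #ipver "/" #l4
static const char *const demo_ptype_name[] = { DEMO_PTYPES };
#undef DEMO_PTT

/* Second expansion: the same list turned into hash-type flags. */
#define DEMO_PTT(idx, ipver, l4, pl) \
	[idx] = DEMO_RSS_L3_##ipver | DEMO_RSS_L4_##l4 | DEMO_RSS_##pl
static const unsigned int demo_ptype_to_hash[] = { DEMO_PTYPES };
#undef DEMO_PTT

static_assert(sizeof(demo_ptype_to_hash) / sizeof(demo_ptype_to_hash[0]) ==
	      DEMO_NUM_PTYPES, "table must cover every listed ptype");

/* Bounds-checked lookup, mirroring the shape of ice_xdp_rx_hash_type(). */
static unsigned int demo_hash_type(uint16_t ptype)
{
	if (ptype >= DEMO_NUM_PTYPES)
		return DEMO_RSS_NONE;

	return demo_ptype_to_hash[ptype];
}
```

Because both tables come from the same list, adding a row to DEMO_PTYPES updates them together, which is the copy-paste protection the patch comment refers to.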
* [xdp-hints] [PATCH bpf-next v8 07/18] xsk: add functions to fill control buffer
  2023-12-05 21:08 [xdp-hints] [PATCH bpf-next v8 00/18] XDP metadata via kfuncs for ice + VLAN hint Larysa Zaremba
                   ` (5 preceding siblings ...)
  2023-12-05 21:08 ` [xdp-hints] [PATCH bpf-next v8 06/18] ice: Support RX hash XDP hint Larysa Zaremba
@ 2023-12-05 21:08 ` Larysa Zaremba
  2023-12-12 13:51   ` [xdp-hints] " Magnus Karlsson
  2023-12-05 21:08 ` [xdp-hints] [PATCH bpf-next v8 08/18] ice: Support XDP hints in AF_XDP ZC mode Larysa Zaremba
                   ` (11 subsequent siblings)
  18 siblings, 1 reply; 27+ messages in thread
From: Larysa Zaremba @ 2023-12-05 21:08 UTC
  To: bpf
  Cc: Larysa Zaremba, ast, daniel, andrii, martin.lau, song, yhs,
	john.fastabend, kpsingh, sdf, haoluo, jolsa, David Ahern,
	Jakub Kicinski, Willem de Bruijn, Jesper Dangaard Brouer,
	Anatoly Burakov, Alexander Lobakin, Magnus Karlsson,
	Maryam Tahhan, xdp-hints, netdev, Willem de Bruijn,
	Alexei Starovoitov, Tariq Toukan, Saeed Mahameed,
	Maciej Fijalkowski

From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>

Commit 94ecc5ca4dbf ("xsk: Add cb area to struct xdp_buff_xsk") has added
a buffer for custom data to xdp_buff_xsk. Particularly, this memory is used
for data, consumed by XDP hints kfuncs. It does not always change on
a per-packet basis and some parts can be set for example, at the same time
as RX queue info.

Add functions to fill all cbs in xsk_buff_pool with the same metadata.

Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
---
 include/net/xdp_sock_drv.h  | 17 +++++++++++++++++
 include/net/xsk_buff_pool.h |  2 ++
 net/xdp/xsk_buff_pool.c     | 12 ++++++++++++
 3 files changed, 31 insertions(+)

diff --git a/include/net/xdp_sock_drv.h b/include/net/xdp_sock_drv.h
index 81e02de3f453..b62bb8525a5f 100644
--- a/include/net/xdp_sock_drv.h
+++ b/include/net/xdp_sock_drv.h
@@ -14,6 +14,12 @@

 #ifdef CONFIG_XDP_SOCKETS

+struct xsk_cb_desc {
+	void *src;
+	u8 off;
+	u8 bytes;
+};
+
 void xsk_tx_completed(struct xsk_buff_pool *pool, u32 nb_entries);
 bool xsk_tx_peek_desc(struct xsk_buff_pool *pool, struct xdp_desc *desc);
 u32 xsk_tx_peek_release_desc_batch(struct xsk_buff_pool *pool, u32 max);
@@ -47,6 +53,12 @@ static inline void xsk_pool_set_rxq_info(struct xsk_buff_pool *pool,
 	xp_set_rxq_info(pool, rxq);
 }

+static inline void xsk_pool_fill_cb(struct xsk_buff_pool *pool,
+				    struct xsk_cb_desc *desc)
+{
+	xp_fill_cb(pool, desc);
+}
+
 static inline unsigned int xsk_pool_get_napi_id(struct xsk_buff_pool *pool)
 {
 #ifdef CONFIG_NET_RX_BUSY_POLL
@@ -274,6 +286,11 @@ static inline void xsk_pool_set_rxq_info(struct xsk_buff_pool *pool,
 {
 }

+static inline void xsk_pool_fill_cb(struct xsk_buff_pool *pool,
+				    struct xsk_cb_desc *desc)
+{
+}
+
 static inline unsigned int xsk_pool_get_napi_id(struct xsk_buff_pool *pool)
 {
 	return 0;
diff --git a/include/net/xsk_buff_pool.h b/include/net/xsk_buff_pool.h
index 8d48d37ab7c0..99dd7376df6a 100644
--- a/include/net/xsk_buff_pool.h
+++ b/include/net/xsk_buff_pool.h
@@ -12,6 +12,7 @@

 struct xsk_buff_pool;
 struct xdp_rxq_info;
+struct xsk_cb_desc;
 struct xsk_queue;
 struct xdp_desc;
 struct xdp_umem;
@@ -135,6 +136,7 @@ static inline void xp_init_xskb_dma(struct xdp_buff_xsk *xskb, struct xsk_buff_p

 /* AF_XDP ZC drivers, via xdp_sock_buff.h */
 void xp_set_rxq_info(struct xsk_buff_pool *pool, struct xdp_rxq_info *rxq);
+void xp_fill_cb(struct xsk_buff_pool *pool, struct xsk_cb_desc *desc);
 int xp_dma_map(struct xsk_buff_pool *pool, struct device *dev,
 	       unsigned long attrs, struct page **pages, u32 nr_pages);
 void xp_dma_unmap(struct xsk_buff_pool *pool, unsigned long attrs);
diff --git a/net/xdp/xsk_buff_pool.c b/net/xdp/xsk_buff_pool.c
index 4f6f538a5462..28711cc44ced 100644
--- a/net/xdp/xsk_buff_pool.c
+++ b/net/xdp/xsk_buff_pool.c
@@ -125,6 +125,18 @@ void xp_set_rxq_info(struct xsk_buff_pool *pool, struct xdp_rxq_info *rxq)
 }
 EXPORT_SYMBOL(xp_set_rxq_info);

+void xp_fill_cb(struct xsk_buff_pool *pool, struct xsk_cb_desc *desc)
+{
+	u32 i;
+
+	for (i = 0; i < pool->heads_cnt; i++) {
+		struct xdp_buff_xsk *xskb = &pool->heads[i];
+
+		memcpy(xskb->cb + desc->off, desc->src, desc->bytes);
+	}
+}
+EXPORT_SYMBOL(xp_fill_cb);
+
 static void xp_disable_drv_zc(struct xsk_buff_pool *pool)
 {
 	struct netdev_bpf bpf;
-- 
2.41.0
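For readers who want to experiment with the fill-all-cbs idea from this patch without a kernel tree, here is a small userspace sketch. It is an illustration under assumptions, not the kernel implementation: struct demo_buff, struct demo_pool, struct demo_cb_desc and DEMO_CB_SIZE are hypothetical stand-ins for xdp_buff_xsk, xsk_buff_pool, xsk_cb_desc and the real cb size, and the bounds assert is an addition of the sketch that the patch itself does not contain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_CB_SIZE 24	/* hypothetical; not claimed to match the kernel */

/* Stand-in for a pool buffer head that carries a private cb area. */
struct demo_buff {
	uint8_t cb[DEMO_CB_SIZE];
};

/* Stand-in for the pool: an array of heads plus its length. */
struct demo_pool {
	struct demo_buff *heads;
	uint32_t heads_cnt;
};

/* Same three fields as the patch's descriptor: source, offset, length. */
struct demo_cb_desc {
	const void *src;
	uint8_t off;
	uint8_t bytes;
};

/* Copy the same `bytes` from `src` into every head's cb at offset `off`. */
static void demo_fill_cb(struct demo_pool *pool,
			 const struct demo_cb_desc *desc)
{
	uint32_t i;

	/* Extra safety check for the sketch only. */
	assert((size_t)desc->off + desc->bytes <= DEMO_CB_SIZE);

	for (i = 0; i < pool->heads_cnt; i++)
		memcpy(pool->heads[i].cb + desc->off, desc->src, desc->bytes);
}
```

The point of doing this once at setup time is that the per-packet path then only needs to write the fields that really differ between packets.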
* [xdp-hints] Re: [PATCH bpf-next v8 07/18] xsk: add functions to fill control buffer
  2023-12-05 21:08 ` [xdp-hints] [PATCH bpf-next v8 07/18] xsk: add functions to fill control buffer Larysa Zaremba
@ 2023-12-12 13:51   ` Magnus Karlsson
  0 siblings, 0 replies; 27+ messages in thread
From: Magnus Karlsson @ 2023-12-12 13:51 UTC
  To: Larysa Zaremba
  Cc: bpf, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, David Ahern, Jakub Kicinski,
	Willem de Bruijn, Jesper Dangaard Brouer, Anatoly Burakov,
	Alexander Lobakin, Maryam Tahhan, xdp-hints, netdev,
	Willem de Bruijn, Alexei Starovoitov, Tariq Toukan,
	Saeed Mahameed, Maciej Fijalkowski

On Tue, 5 Dec 2023 at 22:11, Larysa Zaremba <larysa.zaremba@intel.com> wrote:
>
> From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
>
> Commit 94ecc5ca4dbf ("xsk: Add cb area to struct xdp_buff_xsk") has added
> a buffer for custom data to xdp_buff_xsk. Particularly, this memory is used
> for data, consumed by XDP hints kfuncs. It does not always change on
> a per-packet basis and some parts can be set for example, at the same time
> as RX queue info.
>
> Add functions to fill all cbs in xsk_buff_pool with the same metadata.

Thanks Larysa and Maciej.

Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>

> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
> ---
>  include/net/xdp_sock_drv.h  | 17 +++++++++++++++++
>  include/net/xsk_buff_pool.h |  2 ++
>  net/xdp/xsk_buff_pool.c     | 12 ++++++++++++
>  3 files changed, 31 insertions(+)
>
> diff --git a/include/net/xdp_sock_drv.h b/include/net/xdp_sock_drv.h
> index 81e02de3f453..b62bb8525a5f 100644
> --- a/include/net/xdp_sock_drv.h
> +++ b/include/net/xdp_sock_drv.h
> @@ -14,6 +14,12 @@
>
>  #ifdef CONFIG_XDP_SOCKETS
>
> +struct xsk_cb_desc {
> +	void *src;
> +	u8 off;
> +	u8 bytes;
> +};
> +
>  void xsk_tx_completed(struct xsk_buff_pool *pool, u32 nb_entries);
>  bool xsk_tx_peek_desc(struct xsk_buff_pool *pool, struct xdp_desc *desc);
>  u32 xsk_tx_peek_release_desc_batch(struct xsk_buff_pool *pool, u32 max);
> @@ -47,6 +53,12 @@ static inline void xsk_pool_set_rxq_info(struct xsk_buff_pool *pool,
>  	xp_set_rxq_info(pool, rxq);
>  }
>
> +static inline void xsk_pool_fill_cb(struct xsk_buff_pool *pool,
> +				    struct xsk_cb_desc *desc)
> +{
> +	xp_fill_cb(pool, desc);
> +}
> +
>  static inline unsigned int xsk_pool_get_napi_id(struct xsk_buff_pool *pool)
>  {
>  #ifdef CONFIG_NET_RX_BUSY_POLL
> @@ -274,6 +286,11 @@ static inline void xsk_pool_set_rxq_info(struct xsk_buff_pool *pool,
>  {
>  }
>
> +static inline void xsk_pool_fill_cb(struct xsk_buff_pool *pool,
> +				    struct xsk_cb_desc *desc)
> +{
> +}
> +
>  static inline unsigned int xsk_pool_get_napi_id(struct xsk_buff_pool *pool)
>  {
>  	return 0;
> diff --git a/include/net/xsk_buff_pool.h b/include/net/xsk_buff_pool.h
> index 8d48d37ab7c0..99dd7376df6a 100644
> --- a/include/net/xsk_buff_pool.h
> +++ b/include/net/xsk_buff_pool.h
> @@ -12,6 +12,7 @@
>
>  struct xsk_buff_pool;
>  struct xdp_rxq_info;
> +struct xsk_cb_desc;
>  struct xsk_queue;
>  struct xdp_desc;
>  struct xdp_umem;
> @@ -135,6 +136,7 @@ static inline void xp_init_xskb_dma(struct xdp_buff_xsk *xskb, struct xsk_buff_p
>
>  /* AF_XDP ZC drivers, via xdp_sock_buff.h */
>  void xp_set_rxq_info(struct xsk_buff_pool *pool, struct xdp_rxq_info *rxq);
> +void xp_fill_cb(struct xsk_buff_pool *pool, struct xsk_cb_desc *desc);
>  int xp_dma_map(struct xsk_buff_pool *pool, struct device *dev,
>  	       unsigned long attrs, struct page **pages, u32 nr_pages);
>  void xp_dma_unmap(struct xsk_buff_pool *pool, unsigned long attrs);
> diff --git a/net/xdp/xsk_buff_pool.c b/net/xdp/xsk_buff_pool.c
> index 4f6f538a5462..28711cc44ced 100644
> --- a/net/xdp/xsk_buff_pool.c
> +++ b/net/xdp/xsk_buff_pool.c
> @@ -125,6 +125,18 @@ void xp_set_rxq_info(struct xsk_buff_pool *pool, struct xdp_rxq_info *rxq)
>  }
>  EXPORT_SYMBOL(xp_set_rxq_info);
>
> +void xp_fill_cb(struct xsk_buff_pool *pool, struct xsk_cb_desc *desc)
> +{
> +	u32 i;
> +
> +	for (i = 0; i < pool->heads_cnt; i++) {
> +		struct xdp_buff_xsk *xskb = &pool->heads[i];
> +
> +		memcpy(xskb->cb + desc->off, desc->src, desc->bytes);
> +	}
> +}
> +EXPORT_SYMBOL(xp_fill_cb);
> +
>  static void xp_disable_drv_zc(struct xsk_buff_pool *pool)
>  {
>  	struct netdev_bpf bpf;
> --
> 2.41.0
>
* [xdp-hints] [PATCH bpf-next v8 08/18] ice: Support XDP hints in AF_XDP ZC mode
  2023-12-05 21:08 [xdp-hints] [PATCH bpf-next v8 00/18] XDP metadata via kfuncs for ice + VLAN hint Larysa Zaremba
                   ` (6 preceding siblings ...)
  2023-12-05 21:08 ` [xdp-hints] [PATCH bpf-next v8 07/18] xsk: add functions to fill control buffer Larysa Zaremba
@ 2023-12-05 21:08 ` Larysa Zaremba
  2023-12-12 13:20   ` [xdp-hints] " Maciej Fijalkowski
  2023-12-05 21:08 ` [xdp-hints] [PATCH bpf-next v8 09/18] xdp: Add VLAN tag hint Larysa Zaremba
                   ` (10 subsequent siblings)
  18 siblings, 1 reply; 27+ messages in thread
From: Larysa Zaremba @ 2023-12-05 21:08 UTC
  To: bpf
  Cc: Larysa Zaremba, ast, daniel, andrii, martin.lau, song, yhs,
	john.fastabend, kpsingh, sdf, haoluo, jolsa, David Ahern,
	Jakub Kicinski, Willem de Bruijn, Jesper Dangaard Brouer,
	Anatoly Burakov, Alexander Lobakin, Magnus Karlsson,
	Maryam Tahhan, xdp-hints, netdev, Willem de Bruijn,
	Alexei Starovoitov, Tariq Toukan, Saeed Mahameed,
	Maciej Fijalkowski

In AF_XDP ZC, xdp_buff is not stored on ring,
instead it is provided by xsk_buff_pool.
Space for metadata sources right after such buffers was already reserved
in commit 94ecc5ca4dbf ("xsk: Add cb area to struct xdp_buff_xsk").

Some things (such as pointer to packet context) do not change on a
per-packet basis, so they can be set at the same time as RX queue info.
On the other hand, RX descriptor is unique for each packet, but is already
known when setting DMA addresses. This minimizes performance impact of
hints on regular packet processing.

Update AF_XDP ZC packet processing to support XDP hints.

Co-developed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_base.c | 14 ++++++++++++++
 drivers/net/ethernet/intel/ice/ice_xsk.c  |  5 +++++
 2 files changed, 19 insertions(+)

diff --git a/drivers/net/ethernet/intel/ice/ice_base.c b/drivers/net/ethernet/intel/ice/ice_base.c
index 2d83f3c029e7..a040f02a342e 100644
--- a/drivers/net/ethernet/intel/ice/ice_base.c
+++ b/drivers/net/ethernet/intel/ice/ice_base.c
@@ -519,6 +519,19 @@ static int ice_setup_rx_ctx(struct ice_rx_ring *ring)
 	return 0;
 }

+static void ice_xsk_pool_fill_cb(struct ice_rx_ring *ring)
+{
+	void *ctx_ptr = &ring->pkt_ctx;
+	struct xsk_cb_desc desc = {};
+
+	XSK_CHECK_PRIV_TYPE(struct ice_xdp_buff);
+	desc.src = &ctx_ptr;
+	desc.off = offsetof(struct ice_xdp_buff, pkt_ctx) -
+		   sizeof(struct xdp_buff);
+	desc.bytes = sizeof(ctx_ptr);
+	xsk_pool_fill_cb(ring->xsk_pool, &desc);
+}
+
 /**
  * ice_vsi_cfg_rxq - Configure an Rx queue
  * @ring: the ring being configured
@@ -553,6 +566,7 @@ int ice_vsi_cfg_rxq(struct ice_rx_ring *ring)
 			if (err)
 				return err;
 			xsk_pool_set_rxq_info(ring->xsk_pool, &ring->xdp_rxq);
+			ice_xsk_pool_fill_cb(ring);

 			dev_info(dev, "Registered XDP mem model MEM_TYPE_XSK_BUFF_POOL on Rx ring %d\n",
 				 ring->q_index);
diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c
index 906e383e864a..11b6114ab83d 100644
--- a/drivers/net/ethernet/intel/ice/ice_xsk.c
+++ b/drivers/net/ethernet/intel/ice/ice_xsk.c
@@ -458,6 +458,11 @@ static u16 ice_fill_rx_descs(struct xsk_buff_pool *pool, struct xdp_buff **xdp,
 		rx_desc->read.pkt_addr = cpu_to_le64(dma);
 		rx_desc->wb.status_error0 = 0;

+		/* Put private info that changes on a per-packet basis
+		 * into xdp_buff_xsk->cb.
+		 */
+		ice_xdp_meta_set_desc(*xdp, rx_desc);
+
 		rx_desc++;
 		xdp++;
 	}
-- 
2.41.0
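The offset arithmetic in ice_xsk_pool_fill_cb() above is easy to misread, so here is a hedged userspace sketch of the same layout reasoning. All type names (demo_xdp_buff, demo_pkt_ctx, demo_drv_buff, demo_pool_buff) and the 24-byte cb are hypothetical stand-ins; the sketch only assumes what the patch shows, namely that the driver-private struct starts with the generic buffer and that its remaining fields live in the pool's cb area. The static asserts are written in the spirit of what the XSK_CHECK_PRIV_TYPE() call appears to guard against, which is an assumption about that macro.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the generic struct xdp_buff. */
struct demo_xdp_buff {
	void *data;
	void *data_end;
};

/* Stand-in for per-ring packet context shared by all buffers of a ring. */
struct demo_pkt_ctx {
	uint64_t cached_time;
	uint16_t vlan_proto;
};

/* Driver view: generic buffer first, private fields after it. */
struct demo_drv_buff {
	struct demo_xdp_buff base;
	const void *eop_desc;			/* changes per packet */
	const struct demo_pkt_ctx *pkt_ctx;	/* fixed per ring */
};

/* Pool view: generic buffer first, opaque cb bytes after it. */
struct demo_pool_buff {
	struct demo_xdp_buff base;
	uint8_t cb[24];
};

static_assert(offsetof(struct demo_pool_buff, cb) ==
	      sizeof(struct demo_xdp_buff), "cb must directly follow base");
static_assert(sizeof(struct demo_drv_buff) <= sizeof(struct demo_pool_buff),
	      "driver-private fields must fit into cb");

/* Offset of pkt_ctx relative to the start of cb, as computed in the patch. */
static size_t demo_ctx_cb_offset(void)
{
	return offsetof(struct demo_drv_buff, pkt_ctx) -
	       sizeof(struct demo_xdp_buff);
}

/* Store the pointer value itself (not the pointed-to struct) in cb. */
static void demo_store_ctx_ptr(struct demo_pool_buff *pb,
			       const struct demo_pkt_ctx *ctx)
{
	memcpy(pb->cb + demo_ctx_cb_offset(), &ctx, sizeof(ctx));
}

/* Read the pointer back from cb. */
static const struct demo_pkt_ctx *
demo_load_ctx_ptr(const struct demo_pool_buff *pb)
{
	const struct demo_pkt_ctx *ctx;

	memcpy(&ctx, pb->cb + demo_ctx_cb_offset(), sizeof(ctx));
	return ctx;
}
```

Note the detail that matches the patch: the copy source is the address of a pointer variable and the length is the pointer size, so every buffer ends up holding a pointer to the ring's context rather than a copy of it.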
* [xdp-hints] Re: [PATCH bpf-next v8 08/18] ice: Support XDP hints in AF_XDP ZC mode
  2023-12-05 21:08 ` [xdp-hints] [PATCH bpf-next v8 08/18] ice: Support XDP hints in AF_XDP ZC mode Larysa Zaremba
@ 2023-12-12 13:20   ` Maciej Fijalkowski
  0 siblings, 0 replies; 27+ messages in thread
From: Maciej Fijalkowski @ 2023-12-12 13:20 UTC
  To: Larysa Zaremba
  Cc: bpf, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, David Ahern, Jakub Kicinski,
	Willem de Bruijn, Jesper Dangaard Brouer, Anatoly Burakov,
	Alexander Lobakin, Magnus Karlsson, Maryam Tahhan, xdp-hints,
	netdev, Willem de Bruijn, Alexei Starovoitov, Tariq Toukan,
	Saeed Mahameed

On Tue, Dec 05, 2023 at 10:08:37PM +0100, Larysa Zaremba wrote:
> In AF_XDP ZC, xdp_buff is not stored on ring,
> instead it is provided by xsk_buff_pool.
> Space for metadata sources right after such buffers was already reserved
> in commit 94ecc5ca4dbf ("xsk: Add cb area to struct xdp_buff_xsk").
>
> Some things (such as pointer to packet context) do not change on a
> per-packet basis, so they can be set at the same time as RX queue info.
> On the other hand, RX descriptor is unique for each packet, but is already
> known when setting DMA addresses. This minimizes performance impact of
> hints on regular packet processing.
>
> Update AF_XDP ZC packet processing to support XDP hints.
>
> Co-developed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>

Not sure if I am supposed/allowed to provide review here, but:

Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>

LGTM

> ---
>  drivers/net/ethernet/intel/ice/ice_base.c | 14 ++++++++++++++
>  drivers/net/ethernet/intel/ice/ice_xsk.c  |  5 +++++
>  2 files changed, 19 insertions(+)
>
> diff --git a/drivers/net/ethernet/intel/ice/ice_base.c b/drivers/net/ethernet/intel/ice/ice_base.c
> index 2d83f3c029e7..a040f02a342e 100644
> --- a/drivers/net/ethernet/intel/ice/ice_base.c
> +++ b/drivers/net/ethernet/intel/ice/ice_base.c
> @@ -519,6 +519,19 @@ static int ice_setup_rx_ctx(struct ice_rx_ring *ring)
>  	return 0;
>  }
>
> +static void ice_xsk_pool_fill_cb(struct ice_rx_ring *ring)
> +{
> +	void *ctx_ptr = &ring->pkt_ctx;
> +	struct xsk_cb_desc desc = {};
> +
> +	XSK_CHECK_PRIV_TYPE(struct ice_xdp_buff);
> +	desc.src = &ctx_ptr;
> +	desc.off = offsetof(struct ice_xdp_buff, pkt_ctx) -
> +		   sizeof(struct xdp_buff);
> +	desc.bytes = sizeof(ctx_ptr);
> +	xsk_pool_fill_cb(ring->xsk_pool, &desc);
> +}
> +
>  /**
>   * ice_vsi_cfg_rxq - Configure an Rx queue
>   * @ring: the ring being configured
> @@ -553,6 +566,7 @@ int ice_vsi_cfg_rxq(struct ice_rx_ring *ring)
>  			if (err)
>  				return err;
>  			xsk_pool_set_rxq_info(ring->xsk_pool, &ring->xdp_rxq);
> +			ice_xsk_pool_fill_cb(ring);
>
>  			dev_info(dev, "Registered XDP mem model MEM_TYPE_XSK_BUFF_POOL on Rx ring %d\n",
>  				 ring->q_index);
> diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c
> index 906e383e864a..11b6114ab83d 100644
> --- a/drivers/net/ethernet/intel/ice/ice_xsk.c
> +++ b/drivers/net/ethernet/intel/ice/ice_xsk.c
> @@ -458,6 +458,11 @@ static u16 ice_fill_rx_descs(struct xsk_buff_pool *pool, struct xdp_buff **xdp,
>  		rx_desc->read.pkt_addr = cpu_to_le64(dma);
>  		rx_desc->wb.status_error0 = 0;
>
> +		/* Put private info that changes on a per-packet basis
> +		 * into xdp_buff_xsk->cb.
> +		 */
> +		ice_xdp_meta_set_desc(*xdp, rx_desc);
> +
>  		rx_desc++;
>  		xdp++;
>  	}
> --
> 2.41.0
>
* [xdp-hints] [PATCH bpf-next v8 09/18] xdp: Add VLAN tag hint
  2023-12-05 21:08 [xdp-hints] [PATCH bpf-next v8 00/18] XDP metadata via kfuncs for ice + VLAN hint Larysa Zaremba
                   ` (7 preceding siblings ...)
  2023-12-05 21:08 ` [xdp-hints] [PATCH bpf-next v8 08/18] ice: Support XDP hints in AF_XDP ZC mode Larysa Zaremba
@ 2023-12-05 21:08 ` Larysa Zaremba
  2023-12-06  8:25   ` [xdp-hints] " Jesper Dangaard Brouer
  2023-12-05 21:08 ` [xdp-hints] [PATCH bpf-next v8 10/18] ice: Implement " Larysa Zaremba
                   ` (9 subsequent siblings)
  18 siblings, 1 reply; 27+ messages in thread
From: Larysa Zaremba @ 2023-12-05 21:08 UTC
  To: bpf
  Cc: Larysa Zaremba, ast, daniel, andrii, martin.lau, song, yhs,
	john.fastabend, kpsingh, sdf, haoluo, jolsa, David Ahern,
	Jakub Kicinski, Willem de Bruijn, Jesper Dangaard Brouer,
	Anatoly Burakov, Alexander Lobakin, Magnus Karlsson,
	Maryam Tahhan, xdp-hints, netdev, Willem de Bruijn,
	Alexei Starovoitov, Tariq Toukan, Saeed Mahameed,
	Maciej Fijalkowski

Implement functionality that enables drivers to expose VLAN tag
to XDP code.

VLAN tag is represented by 2 variables:
- protocol ID, which is passed to bpf code in BE
- VLAN TCI, in host byte order

Acked-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
---
 Documentation/netlink/specs/netdev.yaml      |  4 +++
 Documentation/networking/xdp-rx-metadata.rst |  8 ++++-
 include/net/xdp.h                            |  6 ++++
 include/uapi/linux/netdev.h                  |  3 ++
 net/core/xdp.c                               | 33 ++++++++++++++++++++
 tools/include/uapi/linux/netdev.h            |  3 ++
 tools/net/ynl/generated/netdev-user.c        |  1 +
 7 files changed, 57 insertions(+), 1 deletion(-)

diff --git a/Documentation/netlink/specs/netdev.yaml b/Documentation/netlink/specs/netdev.yaml
index eef6358ec587..aeec090e1387 100644
--- a/Documentation/netlink/specs/netdev.yaml
+++ b/Documentation/netlink/specs/netdev.yaml
@@ -54,6 +54,10 @@ definitions:
         name: hash
         doc:
           Device is capable of exposing receive packet hash via bpf_xdp_metadata_rx_hash().
+      -
+        name: vlan-tag
+        doc:
+          Device is capable of exposing receive packet VLAN tag via bpf_xdp_metadata_rx_vlan_tag().
   -
     type: flags
     name: xsk-flags
diff --git a/Documentation/networking/xdp-rx-metadata.rst b/Documentation/networking/xdp-rx-metadata.rst
index e3e9420fd817..a6e0ece18be5 100644
--- a/Documentation/networking/xdp-rx-metadata.rst
+++ b/Documentation/networking/xdp-rx-metadata.rst
@@ -20,7 +20,13 @@ Currently, the following kfuncs are supported. In the future, as more
 metadata is supported, this set will grow:

 .. kernel-doc:: net/core/xdp.c
-   :identifiers: bpf_xdp_metadata_rx_timestamp bpf_xdp_metadata_rx_hash
+   :identifiers: bpf_xdp_metadata_rx_timestamp
+
+.. kernel-doc:: net/core/xdp.c
+   :identifiers: bpf_xdp_metadata_rx_hash
+
+.. kernel-doc:: net/core/xdp.c
+   :identifiers: bpf_xdp_metadata_rx_vlan_tag

 An XDP program can use these kfuncs to read the metadata into stack
 variables for its own consumption. Or, to pass the metadata on to other
diff --git a/include/net/xdp.h b/include/net/xdp.h
index eb77040b4825..ef79f124dbcf 100644
--- a/include/net/xdp.h
+++ b/include/net/xdp.h
@@ -399,6 +399,10 @@ void xdp_attachment_setup(struct xdp_attachment_info *info,
 			   NETDEV_XDP_RX_METADATA_HASH, \
 			   bpf_xdp_metadata_rx_hash, \
 			   xmo_rx_hash) \
+	XDP_METADATA_KFUNC(XDP_METADATA_KFUNC_RX_VLAN_TAG, \
+			   NETDEV_XDP_RX_METADATA_VLAN_TAG, \
+			   bpf_xdp_metadata_rx_vlan_tag, \
+			   xmo_rx_vlan_tag) \

 enum xdp_rx_metadata {
 #define XDP_METADATA_KFUNC(name, _, __, ___) name,
@@ -460,6 +464,8 @@ struct xdp_metadata_ops {
 	int	(*xmo_rx_timestamp)(const struct xdp_md *ctx, u64 *timestamp);
 	int	(*xmo_rx_hash)(const struct xdp_md *ctx, u32 *hash,
 			       enum xdp_rss_hash_type *rss_type);
+	int	(*xmo_rx_vlan_tag)(const struct xdp_md *ctx, __be16 *vlan_proto,
+				   u16 *vlan_tci);
 };

 #ifdef CONFIG_NET
diff --git a/include/uapi/linux/netdev.h b/include/uapi/linux/netdev.h
index 6244c0164976..966638b08ccf 100644
--- a/include/uapi/linux/netdev.h
+++ b/include/uapi/linux/netdev.h
@@ -44,10 +44,13 @@ enum netdev_xdp_act {
  *   timestamp via bpf_xdp_metadata_rx_timestamp().
  * @NETDEV_XDP_RX_METADATA_HASH: Device is capable of exposing receive packet
  *   hash via bpf_xdp_metadata_rx_hash().
+ * @NETDEV_XDP_RX_METADATA_VLAN_TAG: Device is capable of exposing receive
+ *   packet VLAN tag via bpf_xdp_metadata_rx_vlan_tag().
  */
 enum netdev_xdp_rx_metadata {
 	NETDEV_XDP_RX_METADATA_TIMESTAMP = 1,
 	NETDEV_XDP_RX_METADATA_HASH = 2,
+	NETDEV_XDP_RX_METADATA_VLAN_TAG = 4,
 };

 /**
diff --git a/net/core/xdp.c b/net/core/xdp.c
index b6f1d6dab3f2..4869c1c2d8f3 100644
--- a/net/core/xdp.c
+++ b/net/core/xdp.c
@@ -736,6 +736,39 @@ __bpf_kfunc int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, u32 *hash,
 	return -EOPNOTSUPP;
 }

+/**
+ * bpf_xdp_metadata_rx_vlan_tag - Get XDP packet outermost VLAN tag
+ * @ctx: XDP context pointer.
+ * @vlan_proto: Destination pointer for VLAN Tag protocol identifier (TPID).
+ * @vlan_tci: Destination pointer for VLAN TCI (VID + DEI + PCP)
+ *
+ * In case of success, ``vlan_proto`` contains *Tag protocol identifier (TPID)*,
+ * usually ``ETH_P_8021Q`` or ``ETH_P_8021AD``, but some networks can use
+ * custom TPIDs. ``vlan_proto`` is stored in **network byte order (BE)**
+ * and should be used as follows:
+ * ``if (vlan_proto == bpf_htons(ETH_P_8021Q)) do_something();``
+ *
+ * ``vlan_tci`` contains the remaining 16 bits of a VLAN tag.
+ * Driver is expected to provide those in **host byte order (usually LE)**,
+ * so the bpf program should not perform byte conversion.
+ * According to 802.1Q standard, *VLAN TCI (Tag control information)*
+ * is a bit field that contains:
+ * *VLAN identifier (VID)* that can be read with ``vlan_tci & 0xfff``,
+ * *Drop eligible indicator (DEI)* - 1 bit,
+ * *Priority code point (PCP)* - 3 bits.
+ * For detailed meaning of DEI and PCP, please refer to other sources.
+ *
+ * Return:
+ * * Returns 0 on success or ``-errno`` on error.
+ * * ``-EOPNOTSUPP`` : device driver doesn't implement kfunc
+ * * ``-ENODATA``    : VLAN tag was not stripped or is not available
+ */
+__bpf_kfunc int bpf_xdp_metadata_rx_vlan_tag(const struct xdp_md *ctx,
+					     __be16 *vlan_proto, u16 *vlan_tci)
+{
+	return -EOPNOTSUPP;
+}
+
 __bpf_kfunc_end_defs();

 BTF_SET8_START(xdp_metadata_kfunc_ids)
diff --git a/tools/include/uapi/linux/netdev.h b/tools/include/uapi/linux/netdev.h
index 6244c0164976..966638b08ccf 100644
--- a/tools/include/uapi/linux/netdev.h
+++ b/tools/include/uapi/linux/netdev.h
@@ -44,10 +44,13 @@ enum netdev_xdp_act {
  *   timestamp via bpf_xdp_metadata_rx_timestamp().
  * @NETDEV_XDP_RX_METADATA_HASH: Device is capable of exposing receive packet
  *   hash via bpf_xdp_metadata_rx_hash().
+ * @NETDEV_XDP_RX_METADATA_VLAN_TAG: Device is capable of exposing receive
+ *   packet VLAN tag via bpf_xdp_metadata_rx_vlan_tag().
  */
 enum netdev_xdp_rx_metadata {
 	NETDEV_XDP_RX_METADATA_TIMESTAMP = 1,
 	NETDEV_XDP_RX_METADATA_HASH = 2,
+	NETDEV_XDP_RX_METADATA_VLAN_TAG = 4,
 };

 /**
diff --git a/tools/net/ynl/generated/netdev-user.c b/tools/net/ynl/generated/netdev-user.c
index 3b9dee94d4ce..e3fe748086bd 100644
--- a/tools/net/ynl/generated/netdev-user.c
+++ b/tools/net/ynl/generated/netdev-user.c
@@ -53,6 +53,7 @@ const char *netdev_xdp_act_str(enum netdev_xdp_act value)
 static const char * const netdev_xdp_rx_metadata_strmap[] = {
 	[0] = "timestamp",
 	[1] = "hash",
+	[2] = "vlan-tag",
 };

 const char *netdev_xdp_rx_metadata_str(enum netdev_xdp_rx_metadata value)
-- 
2.41.0
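The kernel-doc in this patch describes how a consumer should interpret the two returned values. As a plain userspace illustration of that description (not BPF code and not part of the series), the sketch below splits a host-order TCI into VID, DEI and PCP and compares a big-endian TPID against 802.1Q. The names struct demo_vlan, demo_decode_tci(), demo_is_8021q() and the DEMO_ETH_P_* constants are hypothetical; the numeric TPID values and the bit positions follow the 802.1Q tag layout.

```c
#include <assert.h>
#include <stdint.h>
#include <arpa/inet.h>	/* htons() */

#define DEMO_ETH_P_8021Q	0x8100	/* 802.1Q TPID */
#define DEMO_ETH_P_8021AD	0x88A8	/* 802.1ad TPID */

/* Decoded view of the 16-bit Tag Control Information field. */
struct demo_vlan {
	uint16_t vid;	/* VLAN identifier, low 12 bits */
	uint8_t dei;	/* drop eligible indicator, 1 bit */
	uint8_t pcp;	/* priority code point, top 3 bits */
};

/* The TCI is expected in host byte order, so no swap is done here. */
static struct demo_vlan demo_decode_tci(uint16_t tci)
{
	struct demo_vlan v = {
		.vid = tci & 0x0fff,
		.dei = (tci >> 12) & 0x1,
		.pcp = (tci >> 13) & 0x7,
	};

	return v;
}

/* The protocol is expected in network byte order, so the constant is
 * converted instead of the value read from the hint.
 */
static int demo_is_8021q(uint16_t vlan_proto_be)
{
	return vlan_proto_be == htons(DEMO_ETH_P_8021Q);
}
```

In an actual XDP program the same comparison would use bpf_htons(), as the kernel-doc above shows.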
* [xdp-hints] Re: [PATCH bpf-next v8 09/18] xdp: Add VLAN tag hint
  2023-12-05 21:08 ` [xdp-hints] [PATCH bpf-next v8 09/18] xdp: Add VLAN tag hint Larysa Zaremba
@ 2023-12-06  8:25   ` Jesper Dangaard Brouer
  0 siblings, 0 replies; 27+ messages in thread
From: Jesper Dangaard Brouer @ 2023-12-06 8:25 UTC
  To: Larysa Zaremba, bpf
  Cc: ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, David Ahern, Jakub Kicinski,
	Willem de Bruijn, Anatoly Burakov, Alexander Lobakin,
	Magnus Karlsson, Maryam Tahhan, xdp-hints, netdev,
	Willem de Bruijn, Alexei Starovoitov, Tariq Toukan,
	Saeed Mahameed, Maciej Fijalkowski

On 12/5/23 22:08, Larysa Zaremba wrote:
> Implement functionality that enables drivers to expose VLAN tag
> to XDP code.
>
> VLAN tag is represented by 2 variables:
> - protocol ID, which is passed to bpf code in BE
> - VLAN TCI, in host byte order
>
> Acked-by: Stanislav Fomichev <sdf@google.com>
> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
> ---

Small doc nitpicks below, but it can go in as-is.

Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>

>  Documentation/netlink/specs/netdev.yaml      |  4 +++
>  Documentation/networking/xdp-rx-metadata.rst |  8 ++++-
>  include/net/xdp.h                            |  6 ++++
>  include/uapi/linux/netdev.h                  |  3 ++
>  net/core/xdp.c                               | 33 ++++++++++++++++++++
>  tools/include/uapi/linux/netdev.h            |  3 ++
>  tools/net/ynl/generated/netdev-user.c        |  1 +
>  7 files changed, 57 insertions(+), 1 deletion(-)

[...]

> diff --git a/net/core/xdp.c b/net/core/xdp.c
> index b6f1d6dab3f2..4869c1c2d8f3 100644
> --- a/net/core/xdp.c
> +++ b/net/core/xdp.c
> @@ -736,6 +736,39 @@ __bpf_kfunc int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, u32 *hash,
>  	return -EOPNOTSUPP;
>  }
>
> +/**
> + * bpf_xdp_metadata_rx_vlan_tag - Get XDP packet outermost VLAN tag
> + * @ctx: XDP context pointer.
> + * @vlan_proto: Destination pointer for VLAN Tag protocol identifier (TPID).

I would have written: Tag Protocol Identifier (TPID).
  - like e.g. CCNA exam https://study-ccna.com/ieee-802-1q/

Capital letters leading up to the short version, but I don't think this
is a requirement. I noticed that Wikipedia also got this wrong. So, it
doesn't really matter.

If you need to do a respin, I would appreciate this changed, but you got
my ACK anyway.

> + * @vlan_tci: Destination pointer for VLAN TCI (VID + DEI + PCP)
> + *
> + * In case of success, ``vlan_proto`` contains *Tag protocol identifier (TPID)*,
> + * usually ``ETH_P_8021Q`` or ``ETH_P_8021AD``, but some networks can use
> + * custom TPIDs. ``vlan_proto`` is stored in **network byte order (BE)**
> + * and should be used as follows:
> + * ``if (vlan_proto == bpf_htons(ETH_P_8021Q)) do_something();``
> + *
> + * ``vlan_tci`` contains the remaining 16 bits of a VLAN tag.
> + * Driver is expected to provide those in **host byte order (usually LE)**,
> + * so the bpf program should not perform byte conversion.
> + * According to 802.1Q standard, *VLAN TCI (Tag control information)*
> + * is a bit field that contains:
> + * *VLAN identifier (VID)* that can be read with ``vlan_tci & 0xfff``,
> + * *Drop eligible indicator (DEI)* - 1 bit,

Drop Eligible Indicator (DEI)

> + * *Priority code point (PCP)* - 3 bits.

Priority Code Point (PCP)

> + * For detailed meaning of DEI and PCP, please refer to other sources.
> + *
> + * Return:
> + * * Returns 0 on success or ``-errno`` on error.
> + * * ``-EOPNOTSUPP`` : device driver doesn't implement kfunc
> + * * ``-ENODATA``    : VLAN tag was not stripped or is not available
> + */
> +__bpf_kfunc int bpf_xdp_metadata_rx_vlan_tag(const struct xdp_md *ctx,
> +					     __be16 *vlan_proto, u16 *vlan_tci)
> +{
> +	return -EOPNOTSUPP;
> +}
> +
>  __bpf_kfunc_end_defs();
* [xdp-hints] [PATCH bpf-next v8 10/18] ice: Implement VLAN tag hint 2023-12-05 21:08 [xdp-hints] [PATCH bpf-next v8 00/18] XDP metadata via kfuncs for ice + VLAN hint Larysa Zaremba ` (8 preceding siblings ...) 2023-12-05 21:08 ` [xdp-hints] [PATCH bpf-next v8 09/18] xdp: Add VLAN tag hint Larysa Zaremba @ 2023-12-05 21:08 ` Larysa Zaremba 2023-12-05 21:08 ` [xdp-hints] [PATCH bpf-next v8 11/18] ice: use VLAN proto from ring packet context in skb path Larysa Zaremba ` (8 subsequent siblings) 18 siblings, 0 replies; 27+ messages in thread From: Larysa Zaremba @ 2023-12-05 21:08 UTC (permalink / raw) To: bpf Cc: Larysa Zaremba, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend, kpsingh, sdf, haoluo, jolsa, David Ahern, Jakub Kicinski, Willem de Bruijn, Jesper Dangaard Brouer, Anatoly Burakov, Alexander Lobakin, Magnus Karlsson, Maryam Tahhan, xdp-hints, netdev, Willem de Bruijn, Alexei Starovoitov, Tariq Toukan, Saeed Mahameed, Maciej Fijalkowski Implement .xmo_rx_vlan_tag callback to allow XDP code to read packet's VLAN tag. At the same time, use vlan_tci instead of vlan_tag in touched code, because VLAN tag often refers to VLAN proto and VLAN TCI combined, while in the code we clearly store only VLAN TCI. 
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> --- drivers/net/ethernet/intel/ice/ice_main.c | 20 ++++++++++++++ drivers/net/ethernet/intel/ice/ice_txrx.c | 6 ++--- drivers/net/ethernet/intel/ice/ice_txrx.h | 6 ++++- drivers/net/ethernet/intel/ice/ice_txrx_lib.c | 26 +++++++++++++++++++ drivers/net/ethernet/intel/ice/ice_txrx_lib.h | 4 +-- drivers/net/ethernet/intel/ice/ice_xsk.c | 6 ++--- 6 files changed, 59 insertions(+), 9 deletions(-) diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c index 0a2415dd78f1..86f704850aa6 100644 --- a/drivers/net/ethernet/intel/ice/ice_main.c +++ b/drivers/net/ethernet/intel/ice/ice_main.c @@ -6043,6 +6043,23 @@ ice_fix_features(struct net_device *netdev, netdev_features_t features) return features; } +/** + * ice_set_rx_rings_vlan_proto - update rings with new stripped VLAN proto + * @vsi: PF's VSI + * @vlan_ethertype: VLAN ethertype (802.1Q or 802.1ad) in network byte order + * + * Store current stripped VLAN proto in ring packet context, + * so it can be accessed more efficiently by packet processing code. + */ +static void +ice_set_rx_rings_vlan_proto(struct ice_vsi *vsi, __be16 vlan_ethertype) +{ + u16 i; + + ice_for_each_alloc_rxq(vsi, i) + vsi->rx_rings[i]->pkt_ctx.vlan_proto = vlan_ethertype; +} + /** * ice_set_vlan_offload_features - set VLAN offload features for the PF VSI * @vsi: PF's VSI @@ -6085,6 +6102,9 @@ ice_set_vlan_offload_features(struct ice_vsi *vsi, netdev_features_t features) if (strip_err || insert_err) return -EIO; + ice_set_rx_rings_vlan_proto(vsi, enable_stripping ? 
+ htons(vlan_ethertype) : 0); + return 0; } diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c index 99ea47011fe0..59617f055e35 100644 --- a/drivers/net/ethernet/intel/ice/ice_txrx.c +++ b/drivers/net/ethernet/intel/ice/ice_txrx.c @@ -1183,7 +1183,7 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget) struct sk_buff *skb; unsigned int size; u16 stat_err_bits; - u16 vlan_tag = 0; + u16 vlan_tci; /* get the Rx desc from Rx ring based on 'next_to_clean' */ rx_desc = ICE_RX_DESC(rx_ring, ntc); @@ -1278,7 +1278,7 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget) continue; } - vlan_tag = ice_get_vlan_tag_from_rx_desc(rx_desc); + vlan_tci = ice_get_vlan_tci(rx_desc); /* pad the skb if needed, to make a valid ethernet frame */ if (eth_skb_pad(skb)) @@ -1292,7 +1292,7 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget) ice_trace(clean_rx_irq_indicate, rx_ring, rx_desc, skb); /* send completed skb up the stack */ - ice_receive_skb(rx_ring, skb, vlan_tag); + ice_receive_skb(rx_ring, skb, vlan_tci); /* update budget accounting */ total_rx_pkts++; diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.h b/drivers/net/ethernet/intel/ice/ice_txrx.h index ce3434c73a4b..b3379ff73674 100644 --- a/drivers/net/ethernet/intel/ice/ice_txrx.h +++ b/drivers/net/ethernet/intel/ice/ice_txrx.h @@ -259,6 +259,7 @@ enum ice_rx_dtype { struct ice_pkt_ctx { u64 cached_phctime; + __be16 vlan_proto; }; struct ice_xdp_buff { @@ -335,7 +336,10 @@ struct ice_rx_ring { /* CL3 - 3rd cacheline starts here */ union { struct ice_pkt_ctx pkt_ctx; - u64 cached_phctime; + struct { + u64 cached_phctime; + __be16 vlan_proto; + }; }; struct bpf_prog *xdp_prog; u16 rx_offset; diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c index 09610c5615a8..25ffb539b474 100644 --- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c +++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c @@ 
-599,7 +599,33 @@ static int ice_xdp_rx_hash(const struct xdp_md *ctx, u32 *hash, return 0; } +/** + * ice_xdp_rx_vlan_tag - VLAN tag XDP hint handler + * @ctx: XDP buff pointer + * @vlan_proto: destination address for VLAN protocol + * @vlan_tci: destination address for VLAN TCI + * + * Copy VLAN tag (if it was stripped) and corresponding protocol + * to the destination address. + */ +static int ice_xdp_rx_vlan_tag(const struct xdp_md *ctx, __be16 *vlan_proto, + u16 *vlan_tci) +{ + const struct ice_xdp_buff *xdp_ext = (void *)ctx; + + *vlan_proto = xdp_ext->pkt_ctx->vlan_proto; + if (!*vlan_proto) + return -ENODATA; + + *vlan_tci = ice_get_vlan_tci(xdp_ext->eop_desc); + if (!*vlan_tci) + return -ENODATA; + + return 0; +} + const struct xdp_metadata_ops ice_xdp_md_ops = { .xmo_rx_timestamp = ice_xdp_rx_hw_ts, .xmo_rx_hash = ice_xdp_rx_hash, + .xmo_rx_vlan_tag = ice_xdp_rx_vlan_tag, }; diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.h b/drivers/net/ethernet/intel/ice/ice_txrx_lib.h index 81b8856d8e13..3893af1c11f3 100644 --- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.h +++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.h @@ -84,7 +84,7 @@ ice_build_ctob(u64 td_cmd, u64 td_offset, unsigned int size, u64 td_tag) } /** - * ice_get_vlan_tag_from_rx_desc - get VLAN from Rx flex descriptor + * ice_get_vlan_tci - get VLAN TCI from Rx flex descriptor * @rx_desc: Rx 32b flex descriptor with RXDID=2 * * The OS and current PF implementation only support stripping a single VLAN tag @@ -92,7 +92,7 @@ ice_build_ctob(u64 td_cmd, u64 td_offset, unsigned int size, u64 td_tag) * one is found return the tag, else return 0 to mean no VLAN tag was found.
*/ static inline u16 -ice_get_vlan_tag_from_rx_desc(union ice_32b_rx_flex_desc *rx_desc) +ice_get_vlan_tci(const union ice_32b_rx_flex_desc *rx_desc) { u16 stat_err_bits; diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c index 11b6114ab83d..5d1ae8e4058a 100644 --- a/drivers/net/ethernet/intel/ice/ice_xsk.c +++ b/drivers/net/ethernet/intel/ice/ice_xsk.c @@ -868,7 +868,7 @@ int ice_clean_rx_irq_zc(struct ice_rx_ring *rx_ring, int budget) struct xdp_buff *xdp; struct sk_buff *skb; u16 stat_err_bits; - u16 vlan_tag = 0; + u16 vlan_tci; rx_desc = ICE_RX_DESC(rx_ring, ntc); @@ -946,10 +946,10 @@ int ice_clean_rx_irq_zc(struct ice_rx_ring *rx_ring, int budget) total_rx_bytes += skb->len; total_rx_packets++; - vlan_tag = ice_get_vlan_tag_from_rx_desc(rx_desc); + vlan_tci = ice_get_vlan_tci(rx_desc); ice_process_skb_fields(rx_ring, rx_desc, skb); - ice_receive_skb(rx_ring, skb, vlan_tag); + ice_receive_skb(rx_ring, skb, vlan_tci); } rx_ring->next_to_clean = ntc; -- 2.41.0 ^ permalink raw reply [flat|nested] 27+ messages in thread
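For readers following the ice patch above, the control flow can be sketched in plain userspace C. This is only a model under stated assumptions: `ring_model`, `model_set_rings_vlan_proto()` and `model_rx_vlan_tag()` are hypothetical names, the protocol value is kept in host byte order for simplicity (the driver stores it as `__be16`), and `desc_tci` stands in for whatever `ice_get_vlan_tci()` would read from the descriptor.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-ring packet context (one field only). */
struct ring_model {
	uint16_t vlan_proto;	/* 0 means VLAN stripping is disabled */
};

/* Models ice_set_rx_rings_vlan_proto(): cache the proto on every ring. */
static void model_set_rings_vlan_proto(struct ring_model *rings, size_t n,
				       uint16_t proto)
{
	assert(rings != NULL || n == 0);

	for (size_t i = 0; i < n; i++)
		rings[i].vlan_proto = proto;
}

/*
 * Models the two-step check in ice_xdp_rx_vlan_tag(): no cached proto
 * or a zero TCI both mean "no hint for this packet" (-ENODATA).
 */
static int model_rx_vlan_tag(const struct ring_model *ring, uint16_t desc_tci,
			     uint16_t *vlan_proto, uint16_t *vlan_tci)
{
	*vlan_proto = ring->vlan_proto;
	if (!*vlan_proto)
		return -ENODATA;

	*vlan_tci = desc_tci;
	if (!*vlan_tci)
		return -ENODATA;

	return 0;
}
```

As in the patch, the outputs may be written before an error is returned, so a caller of such a handler should presumably look at them only when the return value is 0.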
* [xdp-hints] [PATCH bpf-next v8 11/18] ice: use VLAN proto from ring packet context in skb path 2023-12-05 21:08 [xdp-hints] [PATCH bpf-next v8 00/18] XDP metadata via kfuncs for ice + VLAN hint Larysa Zaremba ` (9 preceding siblings ...) 2023-12-05 21:08 ` [xdp-hints] [PATCH bpf-next v8 10/18] ice: Implement " Larysa Zaremba @ 2023-12-05 21:08 ` Larysa Zaremba 2023-12-12 13:26 ` [xdp-hints] " Maciej Fijalkowski 2023-12-05 21:08 ` [xdp-hints] [PATCH bpf-next v8 12/18] veth: Implement VLAN tag XDP hint Larysa Zaremba ` (7 subsequent siblings) 18 siblings, 1 reply; 27+ messages in thread From: Larysa Zaremba @ 2023-12-05 21:08 UTC (permalink / raw) To: bpf Cc: Larysa Zaremba, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend, kpsingh, sdf, haoluo, jolsa, David Ahern, Jakub Kicinski, Willem de Bruijn, Jesper Dangaard Brouer, Anatoly Burakov, Alexander Lobakin, Magnus Karlsson, Maryam Tahhan, xdp-hints, netdev, Willem de Bruijn, Alexei Starovoitov, Tariq Toukan, Saeed Mahameed, Maciej Fijalkowski VLAN proto, used in ice XDP hints implementation is stored in ring packet context. Utilize this value in skb VLAN processing too instead of checking netdev features. At the same time, use vlan_tci instead of vlan_tag in touched code, because VLAN tag often refers to VLAN proto and VLAN TCI combined, while in the code we clearly store only VLAN TCI. 
Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> --- drivers/net/ethernet/intel/ice/ice_txrx_lib.c | 14 +++++--------- drivers/net/ethernet/intel/ice/ice_txrx_lib.h | 2 +- 2 files changed, 6 insertions(+), 10 deletions(-) diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c index 25ffb539b474..839e5da24ad5 100644 --- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c +++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c @@ -244,21 +244,17 @@ ice_process_skb_fields(struct ice_rx_ring *rx_ring, * ice_receive_skb - Send a completed packet up the stack * @rx_ring: Rx ring in play * @skb: packet to send up - * @vlan_tag: VLAN tag for packet + * @vlan_tci: VLAN TCI for packet * * This function sends the completed packet (via. skb) up the stack using * gro receive functions (with/without VLAN tag) */ void -ice_receive_skb(struct ice_rx_ring *rx_ring, struct sk_buff *skb, u16 vlan_tag) +ice_receive_skb(struct ice_rx_ring *rx_ring, struct sk_buff *skb, u16 vlan_tci) { - netdev_features_t features = rx_ring->netdev->features; - bool non_zero_vlan = !!(vlan_tag & VLAN_VID_MASK); - - if ((features & NETIF_F_HW_VLAN_CTAG_RX) && non_zero_vlan) - __vlan_hwaccel_put_tag(skb, htons(ETH_P_8021Q), vlan_tag); - else if ((features & NETIF_F_HW_VLAN_STAG_RX) && non_zero_vlan) - __vlan_hwaccel_put_tag(skb, htons(ETH_P_8021AD), vlan_tag); + if ((vlan_tci & VLAN_VID_MASK) && rx_ring->vlan_proto) + __vlan_hwaccel_put_tag(skb, rx_ring->vlan_proto, + vlan_tci); napi_gro_receive(&rx_ring->q_vector->napi, skb); } diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.h b/drivers/net/ethernet/intel/ice/ice_txrx_lib.h index 3893af1c11f3..762047508619 100644 --- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.h +++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.h @@ -150,7 +150,7 @@ ice_process_skb_fields(struct ice_rx_ring *rx_ring, union ice_32b_rx_flex_desc *rx_desc, struct sk_buff *skb); void -ice_receive_skb(struct ice_rx_ring *rx_ring, 
struct sk_buff *skb, u16 vlan_tag); +ice_receive_skb(struct ice_rx_ring *rx_ring, struct sk_buff *skb, u16 vlan_tci); static inline void ice_xdp_meta_set_desc(struct xdp_buff *xdp, -- 2.41.0 ^ permalink raw reply [flat|nested] 27+ messages in thread
* [xdp-hints] Re: [PATCH bpf-next v8 11/18] ice: use VLAN proto from ring packet context in skb path 2023-12-05 21:08 ` [xdp-hints] [PATCH bpf-next v8 11/18] ice: use VLAN proto from ring packet context in skb path Larysa Zaremba @ 2023-12-12 13:26 ` Maciej Fijalkowski 0 siblings, 0 replies; 27+ messages in thread From: Maciej Fijalkowski @ 2023-12-12 13:26 UTC (permalink / raw) To: Larysa Zaremba Cc: bpf, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend, kpsingh, sdf, haoluo, jolsa, David Ahern, Jakub Kicinski, Willem de Bruijn, Jesper Dangaard Brouer, Anatoly Burakov, Alexander Lobakin, Magnus Karlsson, Maryam Tahhan, xdp-hints, netdev, Willem de Bruijn, Alexei Starovoitov, Tariq Toukan, Saeed Mahameed On Tue, Dec 05, 2023 at 10:08:40PM +0100, Larysa Zaremba wrote: > VLAN proto, used in ice XDP hints implementation is stored in ring packet > context. Utilize this value in skb VLAN processing too instead of checking > netdev features. > > At the same time, use vlan_tci instead of vlan_tag in touched code, > because VLAN tag often refers to VLAN proto and VLAN TCI combined, > while in the code we clearly store only VLAN TCI. 
> > Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> IMHO series is good to go, however I'd like Magnus to take a look at 07/18 (no pressure:)) > --- > drivers/net/ethernet/intel/ice/ice_txrx_lib.c | 14 +++++--------- > drivers/net/ethernet/intel/ice/ice_txrx_lib.h | 2 +- > 2 files changed, 6 insertions(+), 10 deletions(-) > > diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c > index 25ffb539b474..839e5da24ad5 100644 > --- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c > +++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c > @@ -244,21 +244,17 @@ ice_process_skb_fields(struct ice_rx_ring *rx_ring, > * ice_receive_skb - Send a completed packet up the stack > * @rx_ring: Rx ring in play > * @skb: packet to send up > - * @vlan_tag: VLAN tag for packet > + * @vlan_tci: VLAN TCI for packet > * > * This function sends the completed packet (via. skb) up the stack using > * gro receive functions (with/without VLAN tag) > */ > void > -ice_receive_skb(struct ice_rx_ring *rx_ring, struct sk_buff *skb, u16 vlan_tag) > +ice_receive_skb(struct ice_rx_ring *rx_ring, struct sk_buff *skb, u16 vlan_tci) > { > - netdev_features_t features = rx_ring->netdev->features; > - bool non_zero_vlan = !!(vlan_tag & VLAN_VID_MASK); > - > - if ((features & NETIF_F_HW_VLAN_CTAG_RX) && non_zero_vlan) > - __vlan_hwaccel_put_tag(skb, htons(ETH_P_8021Q), vlan_tag); > - else if ((features & NETIF_F_HW_VLAN_STAG_RX) && non_zero_vlan) > - __vlan_hwaccel_put_tag(skb, htons(ETH_P_8021AD), vlan_tag); > + if ((vlan_tci & VLAN_VID_MASK) && rx_ring->vlan_proto) > + __vlan_hwaccel_put_tag(skb, rx_ring->vlan_proto, > + vlan_tci); > > napi_gro_receive(&rx_ring->q_vector->napi, skb); > } > diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.h b/drivers/net/ethernet/intel/ice/ice_txrx_lib.h > index 3893af1c11f3..762047508619 100644 > --- 
a/drivers/net/ethernet/intel/ice/ice_txrx_lib.h > +++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.h > @@ -150,7 +150,7 @@ ice_process_skb_fields(struct ice_rx_ring *rx_ring, > union ice_32b_rx_flex_desc *rx_desc, > struct sk_buff *skb); > void > -ice_receive_skb(struct ice_rx_ring *rx_ring, struct sk_buff *skb, u16 vlan_tag); > +ice_receive_skb(struct ice_rx_ring *rx_ring, struct sk_buff *skb, u16 vlan_tci); > > static inline void > ice_xdp_meta_set_desc(struct xdp_buff *xdp, > -- > 2.41.0 > ^ permalink raw reply [flat|nested] 27+ messages in thread
* [xdp-hints] [PATCH bpf-next v8 12/18] veth: Implement VLAN tag XDP hint 2023-12-05 21:08 [xdp-hints] [PATCH bpf-next v8 00/18] XDP metadata via kfuncs for ice + VLAN hint Larysa Zaremba ` (10 preceding siblings ...) 2023-12-05 21:08 ` [xdp-hints] [PATCH bpf-next v8 11/18] ice: use VLAN proto from ring packet context in skb path Larysa Zaremba @ 2023-12-05 21:08 ` Larysa Zaremba 2023-12-05 21:08 ` [xdp-hints] [PATCH bpf-next v8 13/18] net: make vlan_get_tag() return -ENODATA instead of -EINVAL Larysa Zaremba ` (6 subsequent siblings) 18 siblings, 0 replies; 27+ messages in thread From: Larysa Zaremba @ 2023-12-05 21:08 UTC (permalink / raw) To: bpf Cc: Larysa Zaremba, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend, kpsingh, sdf, haoluo, jolsa, David Ahern, Jakub Kicinski, Willem de Bruijn, Jesper Dangaard Brouer, Anatoly Burakov, Alexander Lobakin, Magnus Karlsson, Maryam Tahhan, xdp-hints, netdev, Willem de Bruijn, Alexei Starovoitov, Tariq Toukan, Saeed Mahameed, Maciej Fijalkowski In order to test VLAN tag hint in hardware-independent selftests, implement newly added hint in veth driver. 
Acked-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> --- drivers/net/veth.c | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) diff --git a/drivers/net/veth.c b/drivers/net/veth.c index 57efb3454c57..1efdbe4b92f5 100644 --- a/drivers/net/veth.c +++ b/drivers/net/veth.c @@ -1722,6 +1722,24 @@ static int veth_xdp_rx_hash(const struct xdp_md *ctx, u32 *hash, return 0; } +static int veth_xdp_rx_vlan_tag(const struct xdp_md *ctx, __be16 *vlan_proto, + u16 *vlan_tci) +{ + const struct veth_xdp_buff *_ctx = (void *)ctx; + const struct sk_buff *skb = _ctx->skb; + int err; + + if (!skb) + return -ENODATA; + + err = __vlan_hwaccel_get_tag(skb, vlan_tci); + if (err) + return err; + + *vlan_proto = skb->vlan_proto; + return err; +} + static const struct net_device_ops veth_netdev_ops = { .ndo_init = veth_dev_init, .ndo_open = veth_open, @@ -1746,6 +1764,7 @@ static const struct net_device_ops veth_netdev_ops = { static const struct xdp_metadata_ops veth_xdp_metadata_ops = { .xmo_rx_timestamp = veth_xdp_rx_timestamp, .xmo_rx_hash = veth_xdp_rx_hash, + .xmo_rx_vlan_tag = veth_xdp_rx_vlan_tag, }; #define VETH_FEATURES (NETIF_F_SG | NETIF_F_FRAGLIST | NETIF_F_HW_CSUM | \ -- 2.41.0 ^ permalink raw reply [flat|nested] 27+ messages in thread
* [xdp-hints] [PATCH bpf-next v8 13/18] net: make vlan_get_tag() return -ENODATA instead of -EINVAL 2023-12-05 21:08 [xdp-hints] [PATCH bpf-next v8 00/18] XDP metadata via kfuncs for ice + VLAN hint Larysa Zaremba ` (11 preceding siblings ...) 2023-12-05 21:08 ` [xdp-hints] [PATCH bpf-next v8 12/18] veth: Implement VLAN tag XDP hint Larysa Zaremba @ 2023-12-05 21:08 ` Larysa Zaremba 2023-12-05 21:08 ` [xdp-hints] [PATCH bpf-next v8 14/18] mlx5: implement VLAN tag XDP hint Larysa Zaremba ` (5 subsequent siblings) 18 siblings, 0 replies; 27+ messages in thread From: Larysa Zaremba @ 2023-12-05 21:08 UTC (permalink / raw) To: bpf Cc: Larysa Zaremba, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend, kpsingh, sdf, haoluo, jolsa, David Ahern, Jakub Kicinski, Willem de Bruijn, Jesper Dangaard Brouer, Anatoly Burakov, Alexander Lobakin, Magnus Karlsson, Maryam Tahhan, xdp-hints, netdev, Willem de Bruijn, Alexei Starovoitov, Tariq Toukan, Saeed Mahameed, Maciej Fijalkowski, Jesper Dangaard Brouer __vlan_hwaccel_get_tag() is used in the veth XDP hints implementation, and its return value (-EINVAL if the skb is not VLAN tagged) is passed to BPF code. However, the XDP hints specification requires drivers to return -ENODATA if a hint cannot be provided for a particular packet. Resolve this inconsistency by changing the error return value of __vlan_hwaccel_get_tag() from -EINVAL to -ENODATA. Do the same for __vlan_get_tag(), because this function is supposed to follow the same convention. This, in turn, makes -ENODATA the only non-zero value vlan_get_tag() can return. This can be done with no side effects, because none of the users of the 3 above-mentioned functions rely on the exact value. 
Suggested-by: Jesper Dangaard Brouer <jbrouer@redhat.com> Acked-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> --- include/linux/if_vlan.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/include/linux/if_vlan.h b/include/linux/if_vlan.h index 3028af87716e..c1645c86eed9 100644 --- a/include/linux/if_vlan.h +++ b/include/linux/if_vlan.h @@ -540,7 +540,7 @@ static inline int __vlan_get_tag(const struct sk_buff *skb, u16 *vlan_tci) struct vlan_ethhdr *veth = skb_vlan_eth_hdr(skb); if (!eth_type_vlan(veth->h_vlan_proto)) - return -EINVAL; + return -ENODATA; *vlan_tci = ntohs(veth->h_vlan_TCI); return 0; @@ -561,7 +561,7 @@ static inline int __vlan_hwaccel_get_tag(const struct sk_buff *skb, return 0; } else { *vlan_tci = 0; - return -EINVAL; + return -ENODATA; } } -- 2.41.0 ^ permalink raw reply [flat|nested] 27+ messages in thread
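The error-code convention discussed in the two patches above (veth forwarding the helper's return value, and the helper returning -ENODATA) can be illustrated with a small userspace model. It is a sketch, not kernel code: `skb_model`, `model_hwaccel_get_tag()` and `model_veth_rx_vlan_tag()` are hypothetical stand-ins that only mimic the shape of `__vlan_hwaccel_get_tag()` and `veth_xdp_rx_vlan_tag()` as shown in the diffs.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily simplified stand-in for the sk_buff VLAN fields. */
struct skb_model {
	bool vlan_present;
	uint16_t vlan_proto;
	uint16_t vlan_tci;
};

/* Mimics the post-patch convention: -ENODATA when no tag is present. */
static int model_hwaccel_get_tag(const struct skb_model *skb,
				 uint16_t *vlan_tci)
{
	assert(skb != NULL && vlan_tci != NULL);

	if (skb->vlan_present) {
		*vlan_tci = skb->vlan_tci;
		return 0;
	}

	*vlan_tci = 0;
	return -ENODATA;
}

/* Mimics the veth hint handler: the helper's error is forwarded as-is. */
static int model_veth_rx_vlan_tag(const struct skb_model *skb,
				  uint16_t *vlan_proto, uint16_t *vlan_tci)
{
	int err;

	if (!skb)
		return -ENODATA;

	err = model_hwaccel_get_tag(skb, vlan_tci);
	if (err)
		return err;

	*vlan_proto = skb->vlan_proto;
	return 0;
}
```

Because the handler forwards the helper's value unchanged, the helper's error code is exactly what a BPF caller would see, which is why the helper itself is changed rather than translating the code in veth.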
* [xdp-hints] [PATCH bpf-next v8 14/18] mlx5: implement VLAN tag XDP hint 2023-12-05 21:08 [xdp-hints] [PATCH bpf-next v8 00/18] XDP metadata via kfuncs for ice + VLAN hint Larysa Zaremba ` (12 preceding siblings ...) 2023-12-05 21:08 ` [xdp-hints] [PATCH bpf-next v8 13/18] net: make vlan_get_tag() return -ENODATA instead of -EINVAL Larysa Zaremba @ 2023-12-05 21:08 ` Larysa Zaremba 2023-12-06 8:52 ` [xdp-hints] " Jesper Dangaard Brouer 2023-12-05 21:08 ` [xdp-hints] [PATCH bpf-next v8 15/18] selftests/bpf: Allow VLAN packets in xdp_hw_metadata Larysa Zaremba ` (4 subsequent siblings) 18 siblings, 1 reply; 27+ messages in thread From: Larysa Zaremba @ 2023-12-05 21:08 UTC (permalink / raw) To: bpf Cc: Larysa Zaremba, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend, kpsingh, sdf, haoluo, jolsa, David Ahern, Jakub Kicinski, Willem de Bruijn, Jesper Dangaard Brouer, Anatoly Burakov, Alexander Lobakin, Magnus Karlsson, Maryam Tahhan, xdp-hints, netdev, Willem de Bruijn, Alexei Starovoitov, Tariq Toukan, Saeed Mahameed, Maciej Fijalkowski, Tariq Toukan Implement the newly added .xmo_rx_vlan_tag() hint function. 
Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> --- drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c | 15 +++++++++++++++ include/linux/mlx5/device.h | 2 +- 2 files changed, 16 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c index e2e7d82cfca4..9e695ed122ee 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c @@ -256,9 +256,24 @@ static int mlx5e_xdp_rx_hash(const struct xdp_md *ctx, u32 *hash, return 0; } +static int mlx5e_xdp_rx_vlan_tag(const struct xdp_md *ctx, __be16 *vlan_proto, + u16 *vlan_tci) +{ + const struct mlx5e_xdp_buff *_ctx = (void *)ctx; + const struct mlx5_cqe64 *cqe = _ctx->cqe; + + if (!cqe_has_vlan(cqe)) + return -ENODATA; + + *vlan_proto = htons(ETH_P_8021Q); + *vlan_tci = be16_to_cpu(cqe->vlan_info); + return 0; +} + const struct xdp_metadata_ops mlx5e_xdp_metadata_ops = { .xmo_rx_timestamp = mlx5e_xdp_rx_timestamp, .xmo_rx_hash = mlx5e_xdp_rx_hash, + .xmo_rx_vlan_tag = mlx5e_xdp_rx_vlan_tag, }; struct mlx5e_xsk_tx_complete { diff --git a/include/linux/mlx5/device.h b/include/linux/mlx5/device.h index 820bca965fb6..01275c6e8468 100644 --- a/include/linux/mlx5/device.h +++ b/include/linux/mlx5/device.h @@ -918,7 +918,7 @@ static inline u8 get_cqe_tls_offload(struct mlx5_cqe64 *cqe) return (cqe->tls_outer_l3_tunneled >> 3) & 0x3; } -static inline bool cqe_has_vlan(struct mlx5_cqe64 *cqe) +static inline bool cqe_has_vlan(const struct mlx5_cqe64 *cqe) { return cqe->l4_l3_hdr_type & 0x1; } -- 2.41.0 ^ permalink raw reply [flat|nested] 27+ messages in thread
* [xdp-hints] Re: [PATCH bpf-next v8 14/18] mlx5: implement VLAN tag XDP hint 2023-12-05 21:08 ` [xdp-hints] [PATCH bpf-next v8 14/18] mlx5: implement VLAN tag XDP hint Larysa Zaremba @ 2023-12-06 8:52 ` Jesper Dangaard Brouer 0 siblings, 0 replies; 27+ messages in thread From: Jesper Dangaard Brouer @ 2023-12-06 8:52 UTC (permalink / raw) To: Larysa Zaremba, bpf Cc: ast, daniel, andrii, martin.lau, song, yhs, john.fastabend, kpsingh, sdf, haoluo, jolsa, David Ahern, Jakub Kicinski, Willem de Bruijn, Anatoly Burakov, Alexander Lobakin, Magnus Karlsson, Maryam Tahhan, xdp-hints, netdev, Willem de Bruijn, Alexei Starovoitov, Tariq Toukan, Saeed Mahameed, Maciej Fijalkowski, Tariq Toukan On 12/5/23 22:08, Larysa Zaremba wrote: > Implement the newly added .xmo_rx_vlan_tag() hint function. > > Reviewed-by: Tariq Toukan<tariqt@nvidia.com> > Signed-off-by: Larysa Zaremba<larysa.zaremba@intel.com> > --- > drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c | 15 +++++++++++++++ > include/linux/mlx5/device.h | 2 +- > 2 files changed, 16 insertions(+), 1 deletion(-) LGTM Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> ^ permalink raw reply [flat|nested] 27+ messages in thread
* [xdp-hints] [PATCH bpf-next v8 15/18] selftests/bpf: Allow VLAN packets in xdp_hw_metadata 2023-12-05 21:08 [xdp-hints] [PATCH bpf-next v8 00/18] XDP metadata via kfuncs for ice + VLAN hint Larysa Zaremba ` (13 preceding siblings ...) 2023-12-05 21:08 ` [xdp-hints] [PATCH bpf-next v8 14/18] mlx5: implement VLAN tag XDP hint Larysa Zaremba @ 2023-12-05 21:08 ` Larysa Zaremba 2023-12-05 21:08 ` [xdp-hints] [PATCH bpf-next v8 16/18] selftests/bpf: Add flags and VLAN hint to xdp_hw_metadata Larysa Zaremba ` (3 subsequent siblings) 18 siblings, 0 replies; 27+ messages in thread From: Larysa Zaremba @ 2023-12-05 21:08 UTC (permalink / raw) To: bpf Cc: Larysa Zaremba, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend, kpsingh, sdf, haoluo, jolsa, David Ahern, Jakub Kicinski, Willem de Bruijn, Jesper Dangaard Brouer, Anatoly Burakov, Alexander Lobakin, Magnus Karlsson, Maryam Tahhan, xdp-hints, netdev, Willem de Bruijn, Alexei Starovoitov, Tariq Toukan, Saeed Mahameed, Maciej Fijalkowski Make VLAN c-tag and s-tag XDP hint testing more convenient by not skipping VLAN-ed packets. Allow both 802.1ad and 802.1Q headers. 
Acked-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> --- tools/testing/selftests/bpf/progs/xdp_hw_metadata.c | 10 +++++++++- tools/testing/selftests/bpf/xdp_metadata.h | 8 ++++++++ 2 files changed, 17 insertions(+), 1 deletion(-) diff --git a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c index f6d1cc9ad892..8767d919c881 100644 --- a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c +++ b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c @@ -26,15 +26,23 @@ int rx(struct xdp_md *ctx) { void *data, *data_meta, *data_end; struct ipv6hdr *ip6h = NULL; - struct ethhdr *eth = NULL; struct udphdr *udp = NULL; struct iphdr *iph = NULL; struct xdp_meta *meta; + struct ethhdr *eth; int err; data = (void *)(long)ctx->data; data_end = (void *)(long)ctx->data_end; eth = data; + + if (eth + 1 < data_end && (eth->h_proto == bpf_htons(ETH_P_8021AD) || + eth->h_proto == bpf_htons(ETH_P_8021Q))) + eth = (void *)eth + sizeof(struct vlan_hdr); + + if (eth + 1 < data_end && eth->h_proto == bpf_htons(ETH_P_8021Q)) + eth = (void *)eth + sizeof(struct vlan_hdr); + if (eth + 1 < data_end) { if (eth->h_proto == bpf_htons(ETH_P_IP)) { iph = (void *)(eth + 1); diff --git a/tools/testing/selftests/bpf/xdp_metadata.h b/tools/testing/selftests/bpf/xdp_metadata.h index 938a729bd307..6664893c2c77 100644 --- a/tools/testing/selftests/bpf/xdp_metadata.h +++ b/tools/testing/selftests/bpf/xdp_metadata.h @@ -9,6 +9,14 @@ #define ETH_P_IPV6 0x86DD #endif +#ifndef ETH_P_8021Q +#define ETH_P_8021Q 0x8100 +#endif + +#ifndef ETH_P_8021AD +#define ETH_P_8021AD 0x88A8 +#endif + struct xdp_meta { __u64 rx_timestamp; __u64 xdp_timestamp; -- 2.41.0 ^ permalink raw reply [flat|nested] 27+ messages in thread
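The header-skipping rule in the patch above (an outer 802.1ad or 802.1Q tag, optionally followed by an inner 802.1Q tag) can be tried outside BPF with a userspace sketch. The names `skip_vlan_headers()`, `read_be16()` and the `SKETCH_*` constants are hypothetical; the bounds checks are written for a plain byte buffer and are not meant to reproduce what the BPF verifier requires.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_ETH_P_8021Q	0x8100
#define SKETCH_ETH_P_8021AD	0x88A8
#define SKETCH_ETH_HLEN		14	/* dst MAC + src MAC + EtherType */
#define SKETCH_VLAN_HLEN	4	/* TCI + encapsulated EtherType */

/* Read a big-endian 16-bit value from the wire. */
static uint16_t read_be16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/*
 * Return the offset of the L3 header and store the final EtherType,
 * skipping at most two VLAN tags. Returns 0 if the frame is too short.
 */
static size_t skip_vlan_headers(const uint8_t *frame, size_t len,
				uint16_t *ethertype)
{
	size_t type_off = SKETCH_ETH_HLEN - 2;
	uint16_t type;

	assert(frame != NULL && ethertype != NULL);

	if (len < SKETCH_ETH_HLEN)
		return 0;
	type = read_be16(frame + type_off);

	/* Outer tag: either an S-tag (802.1ad) or a C-tag (802.1Q). */
	if (type == SKETCH_ETH_P_8021AD || type == SKETCH_ETH_P_8021Q) {
		if (len < type_off + SKETCH_VLAN_HLEN + 2)
			return 0;
		type_off += SKETCH_VLAN_HLEN;
		type = read_be16(frame + type_off);

		/* Inner tag: only a C-tag is accepted, as in the patch. */
		if (type == SKETCH_ETH_P_8021Q) {
			if (len < type_off + SKETCH_VLAN_HLEN + 2)
				return 0;
			type_off += SKETCH_VLAN_HLEN;
			type = read_be16(frame + type_off);
		}
	}

	*ethertype = type;
	return type_off + 2;
}
```

For an untagged frame this yields offset 14, for a single tag 18, and for a double tag 22, which matches advancing the `eth` pointer by `sizeof(struct vlan_hdr)` once or twice in the selftest.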
* [xdp-hints] [PATCH bpf-next v8 16/18] selftests/bpf: Add flags and VLAN hint to xdp_hw_metadata 2023-12-05 21:08 [xdp-hints] [PATCH bpf-next v8 00/18] XDP metadata via kfuncs for ice + VLAN hint Larysa Zaremba ` (14 preceding siblings ...) 2023-12-05 21:08 ` [xdp-hints] [PATCH bpf-next v8 15/18] selftests/bpf: Allow VLAN packets in xdp_hw_metadata Larysa Zaremba @ 2023-12-05 21:08 ` Larysa Zaremba 2023-12-05 21:08 ` [xdp-hints] [PATCH bpf-next v8 17/18] selftests/bpf: Add AF_INET packet generation to xdp_metadata Larysa Zaremba ` (2 subsequent siblings) 18 siblings, 0 replies; 27+ messages in thread From: Larysa Zaremba @ 2023-12-05 21:08 UTC (permalink / raw) To: bpf Cc: Larysa Zaremba, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend, kpsingh, sdf, haoluo, jolsa, David Ahern, Jakub Kicinski, Willem de Bruijn, Jesper Dangaard Brouer, Anatoly Burakov, Alexander Lobakin, Magnus Karlsson, Maryam Tahhan, xdp-hints, netdev, Willem de Bruijn, Alexei Starovoitov, Tariq Toukan, Saeed Mahameed, Maciej Fijalkowski Add the VLAN hint to the xdp_hw_metadata program. Also, to make the metadata layout more straightforward, add a flags field that reports the validity of each hint separately. 
Acked-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> --- .../selftests/bpf/progs/xdp_hw_metadata.c | 28 +++++++++++---- tools/testing/selftests/bpf/xdp_hw_metadata.c | 34 ++++++++++++++++--- tools/testing/selftests/bpf/xdp_metadata.h | 26 +++++++++++++- 3 files changed, 76 insertions(+), 12 deletions(-) diff --git a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c index 8767d919c881..330ece2eabdb 100644 --- a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c +++ b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c @@ -20,6 +20,9 @@ extern int bpf_xdp_metadata_rx_timestamp(const struct xdp_md *ctx, __u64 *timestamp) __ksym; extern int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, __u32 *hash, enum xdp_rss_hash_type *rss_type) __ksym; +extern int bpf_xdp_metadata_rx_vlan_tag(const struct xdp_md *ctx, + __be16 *vlan_proto, + __u16 *vlan_tci) __ksym; SEC("xdp.frags") int rx(struct xdp_md *ctx) @@ -84,15 +87,28 @@ int rx(struct xdp_md *ctx) return XDP_PASS; } + meta->hint_valid = 0; + + meta->xdp_timestamp = bpf_ktime_get_tai_ns(); err = bpf_xdp_metadata_rx_timestamp(ctx, &meta->rx_timestamp); - if (!err) - meta->xdp_timestamp = bpf_ktime_get_tai_ns(); + if (err) + meta->rx_timestamp_err = err; + else + meta->hint_valid |= XDP_META_FIELD_TS; + + err = bpf_xdp_metadata_rx_hash(ctx, &meta->rx_hash, + &meta->rx_hash_type); + if (err) + meta->rx_hash_err = err; else - meta->rx_timestamp = 0; /* Used by AF_XDP as not avail signal */ + meta->hint_valid |= XDP_META_FIELD_RSS; - err = bpf_xdp_metadata_rx_hash(ctx, &meta->rx_hash, &meta->rx_hash_type); - if (err < 0) - meta->rx_hash_err = err; /* Used by AF_XDP as no hash signal */ + err = bpf_xdp_metadata_rx_vlan_tag(ctx, &meta->rx_vlan_proto, + &meta->rx_vlan_tci); + if (err) + meta->rx_vlan_tag_err = err; + else + meta->hint_valid |= XDP_META_FIELD_VLAN_TAG; __sync_add_and_fetch(&pkts_redir, 1); return 
bpf_redirect_map(&xsk, ctx->rx_queue_index, XDP_PASS); diff --git a/tools/testing/selftests/bpf/xdp_hw_metadata.c b/tools/testing/selftests/bpf/xdp_hw_metadata.c index 3291625ba4fb..e3ad74eca8ec 100644 --- a/tools/testing/selftests/bpf/xdp_hw_metadata.c +++ b/tools/testing/selftests/bpf/xdp_hw_metadata.c @@ -21,6 +21,9 @@ #include "xsk.h" #include <error.h> +#include <linux/kernel.h> +#include <linux/bits.h> +#include <linux/bitfield.h> #include <linux/errqueue.h> #include <linux/if_link.h> #include <linux/net_tstamp.h> @@ -182,19 +185,31 @@ static void print_tstamp_delta(const char *name, const char *refname, (double)delta / 1000); } +#define VLAN_PRIO_MASK GENMASK(15, 13) /* Priority Code Point */ +#define VLAN_DEI_MASK GENMASK(12, 12) /* Drop Eligible Indicator */ +#define VLAN_VID_MASK GENMASK(11, 0) /* VLAN Identifier */ +static void print_vlan_tci(__u16 tag) +{ + __u16 vlan_id = FIELD_GET(VLAN_VID_MASK, tag); + __u8 pcp = FIELD_GET(VLAN_PRIO_MASK, tag); + bool dei = FIELD_GET(VLAN_DEI_MASK, tag); + + printf("PCP=%u, DEI=%d, VID=0x%X\n", pcp, dei, vlan_id); +} + static void verify_xdp_metadata(void *data, clockid_t clock_id) { struct xdp_meta *meta; meta = data - sizeof(*meta); - if (meta->rx_hash_err < 0) - printf("No rx_hash err=%d\n", meta->rx_hash_err); - else + if (meta->hint_valid & XDP_META_FIELD_RSS) printf("rx_hash: 0x%X with RSS type:0x%X\n", meta->rx_hash, meta->rx_hash_type); + else + printf("No rx_hash, err=%d\n", meta->rx_hash_err); - if (meta->rx_timestamp) { + if (meta->hint_valid & XDP_META_FIELD_TS) { __u64 ref_tstamp = gettime(clock_id); /* store received timestamps to calculate a delta at tx */ @@ -206,7 +221,16 @@ static void verify_xdp_metadata(void *data, clockid_t clock_id) print_tstamp_delta("XDP RX-time", "User RX-time", meta->xdp_timestamp, ref_tstamp); } else { - printf("No rx_timestamp\n"); + printf("No rx_timestamp, err=%d\n", meta->rx_timestamp_err); + } + + if (meta->hint_valid & XDP_META_FIELD_VLAN_TAG) { + 
printf("rx_vlan_proto: 0x%X\n", ntohs(meta->rx_vlan_proto)); + printf("rx_vlan_tci: "); + print_vlan_tci(meta->rx_vlan_tci); + } else { + printf("No rx_vlan_tci or rx_vlan_proto, err=%d\n", + meta->rx_vlan_tag_err); } } diff --git a/tools/testing/selftests/bpf/xdp_metadata.h b/tools/testing/selftests/bpf/xdp_metadata.h index 6664893c2c77..87318ad1117a 100644 --- a/tools/testing/selftests/bpf/xdp_metadata.h +++ b/tools/testing/selftests/bpf/xdp_metadata.h @@ -17,12 +17,36 @@ #define ETH_P_8021AD 0x88A8 #endif +#ifndef BIT +#define BIT(nr) (1 << (nr)) +#endif + +/* Non-existent checksum status */ +#define XDP_CHECKSUM_MAGIC BIT(2) + +enum xdp_meta_field { + XDP_META_FIELD_TS = BIT(0), + XDP_META_FIELD_RSS = BIT(1), + XDP_META_FIELD_VLAN_TAG = BIT(2), +}; + struct xdp_meta { - __u64 rx_timestamp; + union { + __u64 rx_timestamp; + __s32 rx_timestamp_err; + }; __u64 xdp_timestamp; __u32 rx_hash; union { __u32 rx_hash_type; __s32 rx_hash_err; }; + union { + struct { + __be16 rx_vlan_proto; + __u16 rx_vlan_tci; + }; + __s32 rx_vlan_tag_err; + }; + enum xdp_meta_field hint_valid; }; -- 2.41.0 ^ permalink raw reply [flat|nested] 27+ messages in thread
* [xdp-hints] [PATCH bpf-next v8 17/18] selftests/bpf: Add AF_INET packet generation to xdp_metadata 2023-12-05 21:08 [xdp-hints] [PATCH bpf-next v8 00/18] XDP metadata via kfuncs for ice + VLAN hint Larysa Zaremba ` (15 preceding siblings ...) 2023-12-05 21:08 ` [xdp-hints] [PATCH bpf-next v8 16/18] selftests/bpf: Add flags and VLAN hint to xdp_hw_metadata Larysa Zaremba @ 2023-12-05 21:08 ` Larysa Zaremba 2023-12-05 22:59 ` [xdp-hints] " Stanislav Fomichev 2023-12-05 21:08 ` [xdp-hints] [PATCH bpf-next v8 18/18] selftests/bpf: Check VLAN tag and proto in xdp_metadata Larysa Zaremba 2023-12-14 0:30 ` [xdp-hints] Re: [PATCH bpf-next v8 00/18] XDP metadata via kfuncs for ice + VLAN hint patchwork-bot+netdevbpf 18 siblings, 1 reply; 27+ messages in thread From: Larysa Zaremba @ 2023-12-05 21:08 UTC (permalink / raw) To: bpf Cc: Larysa Zaremba, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend, kpsingh, sdf, haoluo, jolsa, David Ahern, Jakub Kicinski, Willem de Bruijn, Jesper Dangaard Brouer, Anatoly Burakov, Alexander Lobakin, Magnus Karlsson, Maryam Tahhan, xdp-hints, netdev, Willem de Bruijn, Alexei Starovoitov, Tariq Toukan, Saeed Mahameed, Maciej Fijalkowski The easiest way to simulate a stripped VLAN tag in veth is to send a packet from a VLAN interface attached to veth. Unfortunately, this approach is incompatible with AF_XDP on the TX side, because VLAN interfaces do not support it. Check packets sent both via AF_XDP TX and via a regular socket. An AF_INET packet will also have a filled-in hash type (XDP_RSS_TYPE_L4), unlike an AF_XDP packet, so more values can be checked. 
Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> --- .../selftests/bpf/prog_tests/xdp_metadata.c | 116 +++++++++++++++--- 1 file changed, 97 insertions(+), 19 deletions(-) diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_metadata.c b/tools/testing/selftests/bpf/prog_tests/xdp_metadata.c index 33cdf88efa6b..e7f06cbdd845 100644 --- a/tools/testing/selftests/bpf/prog_tests/xdp_metadata.c +++ b/tools/testing/selftests/bpf/prog_tests/xdp_metadata.c @@ -20,7 +20,7 @@ #define UDP_PAYLOAD_BYTES 4 -#define AF_XDP_SOURCE_PORT 1234 +#define UDP_SOURCE_PORT 1234 #define AF_XDP_CONSUMER_PORT 8080 #define UMEM_NUM 16 @@ -33,6 +33,12 @@ #define RX_ADDR "10.0.0.2" #define PREFIX_LEN "8" #define FAMILY AF_INET +#define TX_NETNS_NAME "xdp_metadata_tx" +#define RX_NETNS_NAME "xdp_metadata_rx" +#define TX_MAC "00:00:00:00:00:01" +#define RX_MAC "00:00:00:00:00:02" + +#define XDP_RSS_TYPE_L4 BIT(3) struct xsk { void *umem_area; @@ -181,7 +187,7 @@ static int generate_packet(struct xsk *xsk, __u16 dst_port) ASSERT_EQ(inet_pton(FAMILY, RX_ADDR, &iph->daddr), 1, "inet_pton(RX_ADDR)"); ip_csum(iph); - udph->source = htons(AF_XDP_SOURCE_PORT); + udph->source = htons(UDP_SOURCE_PORT); udph->dest = htons(dst_port); udph->len = htons(sizeof(*udph) + UDP_PAYLOAD_BYTES); udph->check = ~csum_tcpudp_magic(iph->saddr, iph->daddr, @@ -204,6 +210,30 @@ static int generate_packet(struct xsk *xsk, __u16 dst_port) return 0; } +static int generate_packet_inet(void) +{ + char udp_payload[UDP_PAYLOAD_BYTES]; + struct sockaddr_in rx_addr; + int sock_fd, err = 0; + + /* Build a packet */ + memset(udp_payload, 0xAA, UDP_PAYLOAD_BYTES); + rx_addr.sin_addr.s_addr = inet_addr(RX_ADDR); + rx_addr.sin_family = AF_INET; + rx_addr.sin_port = htons(AF_XDP_CONSUMER_PORT); + + sock_fd = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP); + if (!ASSERT_GE(sock_fd, 0, "socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP)")) + return sock_fd; + + err = sendto(sock_fd, udp_payload, UDP_PAYLOAD_BYTES, MSG_DONTWAIT, + (void 
*)&rx_addr, sizeof(rx_addr)); + ASSERT_GE(err, 0, "sendto"); + + close(sock_fd); + return err; +} + static void complete_tx(struct xsk *xsk) { struct xsk_tx_metadata *meta; @@ -236,7 +266,7 @@ static void refill_rx(struct xsk *xsk, __u64 addr) } } -static int verify_xsk_metadata(struct xsk *xsk) +static int verify_xsk_metadata(struct xsk *xsk, bool sent_from_af_xdp) { const struct xdp_desc *rx_desc; struct pollfd fds = {}; @@ -290,17 +320,36 @@ static int verify_xsk_metadata(struct xsk *xsk) if (!ASSERT_NEQ(meta->rx_hash, 0, "rx_hash")) return -1; + if (!sent_from_af_xdp) { + if (!ASSERT_NEQ(meta->rx_hash_type & XDP_RSS_TYPE_L4, 0, "rx_hash_type")) + return -1; + goto done; + } + ASSERT_EQ(meta->rx_hash_type, 0, "rx_hash_type"); /* checksum offload */ ASSERT_EQ(udph->check, htons(0x721c), "csum"); +done: xsk_ring_cons__release(&xsk->rx, 1); refill_rx(xsk, comp_addr); return 0; } +static void switch_ns_to_rx(struct nstoken **tok) +{ + close_netns(*tok); + *tok = open_netns(RX_NETNS_NAME); +} + +static void switch_ns_to_tx(struct nstoken **tok) +{ + close_netns(*tok); + *tok = open_netns(TX_NETNS_NAME); +} + void test_xdp_metadata(void) { struct xdp_metadata2 *bpf_obj2 = NULL; @@ -318,27 +367,31 @@ void test_xdp_metadata(void) int sock_fd; int ret; - /* Setup new networking namespace, with a veth pair. */ + /* Setup new networking namespaces, with a veth pair. 
*/ + SYS(out, "ip netns add " TX_NETNS_NAME); + SYS(out, "ip netns add " RX_NETNS_NAME); - SYS(out, "ip netns add xdp_metadata"); - tok = open_netns("xdp_metadata"); + tok = open_netns(TX_NETNS_NAME); SYS(out, "ip link add numtxqueues 1 numrxqueues 1 " TX_NAME " type veth peer " RX_NAME " numtxqueues 1 numrxqueues 1"); - SYS(out, "ip link set dev " TX_NAME " address 00:00:00:00:00:01"); - SYS(out, "ip link set dev " RX_NAME " address 00:00:00:00:00:02"); + SYS(out, "ip link set " RX_NAME " netns " RX_NETNS_NAME); + + SYS(out, "ip link set dev " TX_NAME " address " TX_MAC); SYS(out, "ip link set dev " TX_NAME " up"); - SYS(out, "ip link set dev " RX_NAME " up"); SYS(out, "ip addr add " TX_ADDR "/" PREFIX_LEN " dev " TX_NAME); + + /* Avoid ARP calls */ + SYS(out, "ip -4 neigh add " RX_ADDR " lladdr " RX_MAC " dev " TX_NAME); + + switch_ns_to_rx(&tok); + + SYS(out, "ip link set dev " RX_NAME " address " RX_MAC); + SYS(out, "ip link set dev " RX_NAME " up"); SYS(out, "ip addr add " RX_ADDR "/" PREFIX_LEN " dev " RX_NAME); rx_ifindex = if_nametoindex(RX_NAME); - tx_ifindex = if_nametoindex(TX_NAME); - /* Setup separate AF_XDP for TX and RX interfaces. */ - - ret = open_xsk(tx_ifindex, &tx_xsk); - if (!ASSERT_OK(ret, "open_xsk(TX_NAME)")) - goto out; + /* Setup separate AF_XDP for RX interface. */ ret = open_xsk(rx_ifindex, &rx_xsk); if (!ASSERT_OK(ret, "open_xsk(RX_NAME)")) @@ -379,18 +432,38 @@ void test_xdp_metadata(void) if (!ASSERT_GE(ret, 0, "bpf_map_update_elem")) goto out; - /* Send packet destined to RX AF_XDP socket. */ + switch_ns_to_tx(&tok); + + /* Setup separate AF_XDP for TX interface nad send packet to the RX socket. */ + tx_ifindex = if_nametoindex(TX_NAME); + ret = open_xsk(tx_ifindex, &tx_xsk); + if (!ASSERT_OK(ret, "open_xsk(TX_NAME)")) + goto out; + if (!ASSERT_GE(generate_packet(&tx_xsk, AF_XDP_CONSUMER_PORT), 0, "generate AF_XDP_CONSUMER_PORT")) goto out; - /* Verify AF_XDP RX packet has proper metadata. 
*/ - if (!ASSERT_GE(verify_xsk_metadata(&rx_xsk), 0, + switch_ns_to_rx(&tok); + + /* Verify packet sent from AF_XDP has proper metadata. */ + if (!ASSERT_GE(verify_xsk_metadata(&rx_xsk, true), 0, "verify_xsk_metadata")) goto out; + switch_ns_to_tx(&tok); complete_tx(&tx_xsk); + /* Now check metadata of packet, generated with network stack */ + if (!ASSERT_GE(generate_packet_inet(), 0, "generate UDP packet")) + goto out; + + switch_ns_to_rx(&tok); + + if (!ASSERT_GE(verify_xsk_metadata(&rx_xsk, false), 0, + "verify_xsk_metadata")) + goto out; + /* Make sure freplace correctly picks up original bound device * and doesn't crash. */ @@ -408,11 +481,15 @@ void test_xdp_metadata(void) if (!ASSERT_OK(xdp_metadata2__attach(bpf_obj2), "attach freplace")) goto out; + switch_ns_to_tx(&tok); + /* Send packet to trigger . */ if (!ASSERT_GE(generate_packet(&tx_xsk, AF_XDP_CONSUMER_PORT), 0, "generate freplace packet")) goto out; + switch_ns_to_rx(&tok); + while (!retries--) { if (bpf_obj2->bss->called) break; @@ -427,5 +504,6 @@ void test_xdp_metadata(void) xdp_metadata__destroy(bpf_obj); if (tok) close_netns(tok); - SYS_NOFAIL("ip netns del xdp_metadata"); + SYS_NOFAIL("ip netns del " RX_NETNS_NAME); + SYS_NOFAIL("ip netns del " TX_NETNS_NAME); } -- 2.41.0 ^ permalink raw reply [flat|nested] 27+ messages in thread
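For readers following the new sent_from_af_xdp branch in verify_xsk_metadata(), the hash-type expectation can be isolated as a minimal sketch. The value of XDP_RSS_TYPE_L4 (BIT(3)) is taken from the patch; the BIT() definition and the helper name rx_hash_type_ok() are hypothetical and exist only for this illustration, they are not part of the selftest.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed definition, equivalent to what the selftest headers provide. */
#define BIT(nr) (1U << (nr))

/* Value copied from the patch. */
#define XDP_RSS_TYPE_L4 BIT(3)

/*
 * Hypothetical helper mirroring the branch added to verify_xsk_metadata():
 * a frame injected through AF_XDP TX is expected to carry no hash type,
 * while a UDP datagram sent through the regular network stack (AF_INET)
 * is expected to have the L4 bit set. Other bits are ignored in that case.
 */
static bool rx_hash_type_ok(uint32_t rx_hash_type, bool sent_from_af_xdp)
{
	if (sent_from_af_xdp)
		return rx_hash_type == 0;

	return (rx_hash_type & XDP_RSS_TYPE_L4) != 0;
}
```

Note that the AF_INET case is a mask test rather than an equality test, so additional L3 or protocol bits reported alongside the L4 bit do not fail the check.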
* [xdp-hints] Re: [PATCH bpf-next v8 17/18] selftests/bpf: Add AF_INET packet generation to xdp_metadata
  2023-12-05 21:08 ` [xdp-hints] [PATCH bpf-next v8 17/18] selftests/bpf: Add AF_INET packet generation to xdp_metadata Larysa Zaremba
@ 2023-12-05 22:59   ` Stanislav Fomichev
  0 siblings, 0 replies; 27+ messages in thread
From: Stanislav Fomichev @ 2023-12-05 22:59 UTC
  To: Larysa Zaremba
  Cc: bpf, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend, kpsingh, haoluo, jolsa, David Ahern, Jakub Kicinski, Willem de Bruijn, Jesper Dangaard Brouer, Anatoly Burakov, Alexander Lobakin, Magnus Karlsson, Maryam Tahhan, xdp-hints, netdev, Willem de Bruijn, Alexei Starovoitov, Tariq Toukan, Saeed Mahameed, Maciej Fijalkowski

On 12/05, Larysa Zaremba wrote:
> The easiest way to simulate stripped VLAN tag in veth is to send a packet
> from VLAN interface, attached to veth. Unfortunately, this approach is
> incompatible with AF_XDP on TX side, because VLAN interfaces do not have
> such feature.
>
> Check both packets sent via AF_XDP TX and regular socket.
>
> AF_INET packet will also have a filled-in hash type (XDP_RSS_TYPE_L4),
> unlike AF_XDP packet, so more values can be checked.
>
> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
> ---
>  .../selftests/bpf/prog_tests/xdp_metadata.c   | 116 +++++++++++++++---
>  1 file changed, 97 insertions(+), 19 deletions(-)
>
> diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_metadata.c b/tools/testing/selftests/bpf/prog_tests/xdp_metadata.c
> index 33cdf88efa6b..e7f06cbdd845 100644
> --- a/tools/testing/selftests/bpf/prog_tests/xdp_metadata.c
> +++ b/tools/testing/selftests/bpf/prog_tests/xdp_metadata.c
> @@ -20,7 +20,7 @@
>
>  #define UDP_PAYLOAD_BYTES 4
>
> -#define AF_XDP_SOURCE_PORT 1234
> +#define UDP_SOURCE_PORT 1234
>  #define AF_XDP_CONSUMER_PORT 8080
>
>  #define UMEM_NUM 16
> @@ -33,6 +33,12 @@
>  #define RX_ADDR "10.0.0.2"
>  #define PREFIX_LEN "8"
>  #define FAMILY AF_INET
> +#define TX_NETNS_NAME "xdp_metadata_tx"
> +#define RX_NETNS_NAME "xdp_metadata_rx"
> +#define TX_MAC "00:00:00:00:00:01"
> +#define RX_MAC "00:00:00:00:00:02"
> +
> +#define XDP_RSS_TYPE_L4 BIT(3)
>
>  struct xsk {
>  	void *umem_area;
> @@ -181,7 +187,7 @@ static int generate_packet(struct xsk *xsk, __u16 dst_port)
>  	ASSERT_EQ(inet_pton(FAMILY, RX_ADDR, &iph->daddr), 1, "inet_pton(RX_ADDR)");
>  	ip_csum(iph);
>
> -	udph->source = htons(AF_XDP_SOURCE_PORT);
> +	udph->source = htons(UDP_SOURCE_PORT);
>  	udph->dest = htons(dst_port);
>  	udph->len = htons(sizeof(*udph) + UDP_PAYLOAD_BYTES);
>  	udph->check = ~csum_tcpudp_magic(iph->saddr, iph->daddr,
> @@ -204,6 +210,30 @@ static int generate_packet(struct xsk *xsk, __u16 dst_port)
>  	return 0;
>  }
>
> +static int generate_packet_inet(void)
> +{
> +	char udp_payload[UDP_PAYLOAD_BYTES];
> +	struct sockaddr_in rx_addr;
> +	int sock_fd, err = 0;
> +
> +	/* Build a packet */
> +	memset(udp_payload, 0xAA, UDP_PAYLOAD_BYTES);
> +	rx_addr.sin_addr.s_addr = inet_addr(RX_ADDR);
> +	rx_addr.sin_family = AF_INET;
> +	rx_addr.sin_port = htons(AF_XDP_CONSUMER_PORT);
> +
> +	sock_fd = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
> +	if (!ASSERT_GE(sock_fd, 0, "socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP)"))
> +		return sock_fd;
> +
> +	err = sendto(sock_fd, udp_payload, UDP_PAYLOAD_BYTES, MSG_DONTWAIT,
> +		     (void *)&rx_addr, sizeof(rx_addr));
> +	ASSERT_GE(err, 0, "sendto");
> +
> +	close(sock_fd);
> +	return err;
> +}
> +
>  static void complete_tx(struct xsk *xsk)
>  {
>  	struct xsk_tx_metadata *meta;
> @@ -236,7 +266,7 @@ static void refill_rx(struct xsk *xsk, __u64 addr)
>  	}
>  }
>
> -static int verify_xsk_metadata(struct xsk *xsk)
> +static int verify_xsk_metadata(struct xsk *xsk, bool sent_from_af_xdp)
>  {
>  	const struct xdp_desc *rx_desc;
>  	struct pollfd fds = {};
> @@ -290,17 +320,36 @@ static int verify_xsk_metadata(struct xsk *xsk)
>  	if (!ASSERT_NEQ(meta->rx_hash, 0, "rx_hash"))
>  		return -1;
>
> +	if (!sent_from_af_xdp) {
> +		if (!ASSERT_NEQ(meta->rx_hash_type & XDP_RSS_TYPE_L4, 0, "rx_hash_type"))
> +			return -1;
> +		goto done;
> +	}
> +
>  	ASSERT_EQ(meta->rx_hash_type, 0, "rx_hash_type");
>
>  	/* checksum offload */
>  	ASSERT_EQ(udph->check, htons(0x721c), "csum");
>
> +done:
>  	xsk_ring_cons__release(&xsk->rx, 1);
>  	refill_rx(xsk, comp_addr);
>
>  	return 0;
>  }
>
> +static void switch_ns_to_rx(struct nstoken **tok)
> +{
> +	close_netns(*tok);
> +	*tok = open_netns(RX_NETNS_NAME);
> +}
> +
> +static void switch_ns_to_tx(struct nstoken **tok)
> +{
> +	close_netns(*tok);
> +	*tok = open_netns(TX_NETNS_NAME);
> +}
> +
>  void test_xdp_metadata(void)
>  {
>  	struct xdp_metadata2 *bpf_obj2 = NULL;
> @@ -318,27 +367,31 @@ void test_xdp_metadata(void)
>  	int sock_fd;
>  	int ret;
>
> -	/* Setup new networking namespace, with a veth pair. */
> +	/* Setup new networking namespaces, with a veth pair. */
> +	SYS(out, "ip netns add " TX_NETNS_NAME);
> +	SYS(out, "ip netns add " RX_NETNS_NAME);
>
> -	SYS(out, "ip netns add xdp_metadata");
> -	tok = open_netns("xdp_metadata");
> +	tok = open_netns(TX_NETNS_NAME);
>  	SYS(out, "ip link add numtxqueues 1 numrxqueues 1 " TX_NAME
>  	    " type veth peer " RX_NAME " numtxqueues 1 numrxqueues 1");
> -	SYS(out, "ip link set dev " TX_NAME " address 00:00:00:00:00:01");
> -	SYS(out, "ip link set dev " RX_NAME " address 00:00:00:00:00:02");
> +	SYS(out, "ip link set " RX_NAME " netns " RX_NETNS_NAME);
> +
> +	SYS(out, "ip link set dev " TX_NAME " address " TX_MAC);
>  	SYS(out, "ip link set dev " TX_NAME " up");
> -	SYS(out, "ip link set dev " RX_NAME " up");
>  	SYS(out, "ip addr add " TX_ADDR "/" PREFIX_LEN " dev " TX_NAME);
> +
> +	/* Avoid ARP calls */
> +	SYS(out, "ip -4 neigh add " RX_ADDR " lladdr " RX_MAC " dev " TX_NAME);
> +
> +	switch_ns_to_rx(&tok);
> +
> +	SYS(out, "ip link set dev " RX_NAME " address " RX_MAC);
> +	SYS(out, "ip link set dev " RX_NAME " up");
>  	SYS(out, "ip addr add " RX_ADDR "/" PREFIX_LEN " dev " RX_NAME);
>
>  	rx_ifindex = if_nametoindex(RX_NAME);
> -	tx_ifindex = if_nametoindex(TX_NAME);
>
> -	/* Setup separate AF_XDP for TX and RX interfaces. */
> -
> -	ret = open_xsk(tx_ifindex, &tx_xsk);
> -	if (!ASSERT_OK(ret, "open_xsk(TX_NAME)"))
> -		goto out;
> +	/* Setup separate AF_XDP for RX interface. */
>
>  	ret = open_xsk(rx_ifindex, &rx_xsk);
>  	if (!ASSERT_OK(ret, "open_xsk(RX_NAME)"))
> @@ -379,18 +432,38 @@ void test_xdp_metadata(void)
>  	if (!ASSERT_GE(ret, 0, "bpf_map_update_elem"))
>  		goto out;
>
> -	/* Send packet destined to RX AF_XDP socket. */
> +	switch_ns_to_tx(&tok);
> +
> +	/* Setup separate AF_XDP for TX interface nad send packet to the RX socket. */

Not sure we care, but s/nad/and/ if you happen to do another respin..

Acked-by: Stanislav Fomichev <sdf@google.com>
* [xdp-hints] [PATCH bpf-next v8 18/18] selftests/bpf: Check VLAN tag and proto in xdp_metadata
  2023-12-05 21:08 [xdp-hints] [PATCH bpf-next v8 00/18] XDP metadata via kfuncs for ice + VLAN hint Larysa Zaremba
                   ` (16 preceding siblings ...)
  2023-12-05 21:08 ` [xdp-hints] [PATCH bpf-next v8 17/18] selftests/bpf: Add AF_INET packet generation to xdp_metadata Larysa Zaremba
@ 2023-12-05 21:08 ` Larysa Zaremba
  2023-12-14 0:30 ` [xdp-hints] Re: [PATCH bpf-next v8 00/18] XDP metadata via kfuncs for ice + VLAN hint patchwork-bot+netdevbpf
  18 siblings, 0 replies; 27+ messages in thread
From: Larysa Zaremba @ 2023-12-05 21:08 UTC
  To: bpf
  Cc: Larysa Zaremba, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend, kpsingh, sdf, haoluo, jolsa, David Ahern, Jakub Kicinski, Willem de Bruijn, Jesper Dangaard Brouer, Anatoly Burakov, Alexander Lobakin, Magnus Karlsson, Maryam Tahhan, xdp-hints, netdev, Willem de Bruijn, Alexei Starovoitov, Tariq Toukan, Saeed Mahameed, Maciej Fijalkowski

Verify, whether VLAN tag and proto are set correctly.

To simulate "stripped" VLAN tag on veth, send test packet from VLAN
interface.

Also, add TO_STR() macro for convenience.

Acked-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
---
 .../selftests/bpf/prog_tests/xdp_metadata.c   | 20 +++++++++++++++++--
 .../selftests/bpf/progs/xdp_metadata.c        |  5 +++++
 tools/testing/selftests/bpf/testing_helpers.h |  3 +++
 3 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_metadata.c b/tools/testing/selftests/bpf/prog_tests/xdp_metadata.c
index e7f06cbdd845..05edcf32f528 100644
--- a/tools/testing/selftests/bpf/prog_tests/xdp_metadata.c
+++ b/tools/testing/selftests/bpf/prog_tests/xdp_metadata.c
@@ -38,7 +38,13 @@
 #define TX_MAC "00:00:00:00:00:01"
 #define RX_MAC "00:00:00:00:00:02"
 
+#define VLAN_ID 59
+#define VLAN_PROTO "802.1Q"
+#define VLAN_PID htons(ETH_P_8021Q)
+#define TX_NAME_VLAN TX_NAME "." TO_STR(VLAN_ID)
+
 #define XDP_RSS_TYPE_L4 BIT(3)
+#define VLAN_VID_MASK 0xfff
 
 struct xsk {
 	void *umem_area;
@@ -323,6 +329,12 @@ static int verify_xsk_metadata(struct xsk *xsk, bool sent_from_af_xdp)
 	if (!sent_from_af_xdp) {
 		if (!ASSERT_NEQ(meta->rx_hash_type & XDP_RSS_TYPE_L4, 0, "rx_hash_type"))
 			return -1;
+
+		if (!ASSERT_EQ(meta->rx_vlan_tci & VLAN_VID_MASK, VLAN_ID, "rx_vlan_tci"))
+			return -1;
+
+		if (!ASSERT_EQ(meta->rx_vlan_proto, VLAN_PID, "rx_vlan_proto"))
+			return -1;
 		goto done;
 	}
 
@@ -378,10 +390,14 @@ void test_xdp_metadata(void)
 
 	SYS(out, "ip link set dev " TX_NAME " address " TX_MAC);
 	SYS(out, "ip link set dev " TX_NAME " up");
-	SYS(out, "ip addr add " TX_ADDR "/" PREFIX_LEN " dev " TX_NAME);
+
+	SYS(out, "ip link add link " TX_NAME " " TX_NAME_VLAN
+		 " type vlan proto " VLAN_PROTO " id " TO_STR(VLAN_ID));
+	SYS(out, "ip link set dev " TX_NAME_VLAN " up");
+	SYS(out, "ip addr add " TX_ADDR "/" PREFIX_LEN " dev " TX_NAME_VLAN);
 
 	/* Avoid ARP calls */
-	SYS(out, "ip -4 neigh add " RX_ADDR " lladdr " RX_MAC " dev " TX_NAME);
+	SYS(out, "ip -4 neigh add " RX_ADDR " lladdr " RX_MAC " dev " TX_NAME_VLAN);
 
 	switch_ns_to_rx(&tok);
 
diff --git a/tools/testing/selftests/bpf/progs/xdp_metadata.c b/tools/testing/selftests/bpf/progs/xdp_metadata.c
index 5d6c1245c310..31ca229bb3c0 100644
--- a/tools/testing/selftests/bpf/progs/xdp_metadata.c
+++ b/tools/testing/selftests/bpf/progs/xdp_metadata.c
@@ -23,6 +23,9 @@ extern int bpf_xdp_metadata_rx_timestamp(const struct xdp_md *ctx,
 					 __u64 *timestamp) __ksym;
 extern int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, __u32 *hash,
 				    enum xdp_rss_hash_type *rss_type) __ksym;
+extern int bpf_xdp_metadata_rx_vlan_tag(const struct xdp_md *ctx,
+					__be16 *vlan_proto,
+					__u16 *vlan_tci) __ksym;
 
 SEC("xdp")
 int rx(struct xdp_md *ctx)
@@ -86,6 +89,8 @@ int rx(struct xdp_md *ctx)
 		meta->rx_timestamp = 1;
 
 	bpf_xdp_metadata_rx_hash(ctx, &meta->rx_hash, &meta->rx_hash_type);
+	bpf_xdp_metadata_rx_vlan_tag(ctx, &meta->rx_vlan_proto,
+				     &meta->rx_vlan_tci);
 
 	return bpf_redirect_map(&xsk, ctx->rx_queue_index, XDP_PASS);
 }
diff --git a/tools/testing/selftests/bpf/testing_helpers.h b/tools/testing/selftests/bpf/testing_helpers.h
index 5b7a55136741..35284faff4f2 100644
--- a/tools/testing/selftests/bpf/testing_helpers.h
+++ b/tools/testing/selftests/bpf/testing_helpers.h
@@ -9,6 +9,9 @@
 #include <bpf/libbpf.h>
 #include <time.h>
 
+#define __TO_STR(x) #x
+#define TO_STR(x) __TO_STR(x)
+
 int parse_num_list(const char *s, bool **set, int *set_len);
 __u32 link_info_prog_id(const struct bpf_link *link, struct bpf_link_info *info);
 int bpf_prog_test_load(const char *file, enum bpf_prog_type type,
-- 
2.41.0
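Two details of this patch may be easier to follow with a standalone sketch: why TO_STR() needs two macro levels, and what masking rx_vlan_tci with VLAN_VID_MASK selects. The macro definitions and the values of VLAN_ID and VLAN_VID_MASK are copied from the patch. The value "veTX" for TX_NAME is an assumption (it is defined outside the hunks shown here), and the helpers vlan_vid() and vlan_tci() are hypothetical names used only for illustration. The TCI layout (3-bit priority, 1-bit DEI, 12-bit VLAN ID) follows IEEE 802.1Q.

```c
#include <stdint.h>

/* Two-level stringification, as added to testing_helpers.h by the patch.
 * The extra level makes the preprocessor expand the argument first, so
 * TO_STR(VLAN_ID) yields "59", whereas __TO_STR(VLAN_ID) yields "VLAN_ID".
 */
#define __TO_STR(x) #x
#define TO_STR(x) __TO_STR(x)

#define VLAN_ID 59           /* from the patch */
#define VLAN_VID_MASK 0xfff  /* from the patch */
#define TX_NAME "veTX"       /* assumed value, not visible in this patch */

/* Adjacent string literals are concatenated, giving "<TX_NAME>.59". */
#define TX_NAME_VLAN TX_NAME "." TO_STR(VLAN_ID)

/* Hypothetical helper: extract the 12-bit VLAN ID from a host-order TCI. */
static uint16_t vlan_vid(uint16_t tci)
{
	return tci & VLAN_VID_MASK;
}

/* Hypothetical helper: compose a TCI from priority (PCP), DEI and VLAN ID. */
static uint16_t vlan_tci(unsigned int pcp, unsigned int dei, unsigned int vid)
{
	return (uint16_t)(((pcp & 0x7) << 13) | ((dei & 0x1) << 12) |
			  (vid & VLAN_VID_MASK));
}
```

Masking before comparing, as the test does, keeps the check independent of whatever priority bits the sender happens to set in the tag.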
* [xdp-hints] Re: [PATCH bpf-next v8 00/18] XDP metadata via kfuncs for ice + VLAN hint
  2023-12-05 21:08 [xdp-hints] [PATCH bpf-next v8 00/18] XDP metadata via kfuncs for ice + VLAN hint Larysa Zaremba
                   ` (17 preceding siblings ...)
  2023-12-05 21:08 ` [xdp-hints] [PATCH bpf-next v8 18/18] selftests/bpf: Check VLAN tag and proto in xdp_metadata Larysa Zaremba
@ 2023-12-14 0:30 ` patchwork-bot+netdevbpf
  18 siblings, 0 replies; 27+ messages in thread
From: patchwork-bot+netdevbpf @ 2023-12-14 0:30 UTC
  To: Larysa Zaremba
  Cc: bpf, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend, kpsingh, sdf, haoluo, jolsa, dsahern, kuba, willemb, hawk, anatoly.burakov, alexandr.lobakin, magnus.karlsson, mtahhan, xdp-hints, netdev, willemdebruijn.kernel, alexei.starovoitov, tariqt, saeedm, maciej.fijalkowski

Hello:

This series was applied to bpf/bpf-next.git (master)
by Alexei Starovoitov <ast@kernel.org>:

On Tue, 5 Dec 2023 22:08:29 +0100 you wrote:
> This series introduces XDP hints via kfuncs [0] to the ice driver.
>
> Series brings the following existing hints to the ice driver:
>  - HW timestamp
>  - RX hash with type
>
> Series also introduces VLAN tag with protocol XDP hint, it now be accessed by
> XDP and userspace (AF_XDP) programs. They can also be checked with xdp_metadata
> test and xdp_hw_metadata program.
>
> [...]

Here is the summary with links:
  - [bpf-next,v8,01/18] ice: make RX hash reading code more reusable
    https://git.kernel.org/bpf/bpf-next/c/9244384e811e
  - [bpf-next,v8,02/18] ice: make RX HW timestamp reading code more reusable
    https://git.kernel.org/bpf/bpf-next/c/3310aad20def
  - [bpf-next,v8,03/18] ice: Make ptype internal to descriptor info processing
    https://git.kernel.org/bpf/bpf-next/c/6b62a4214903
  - [bpf-next,v8,04/18] ice: Introduce ice_xdp_buff
    https://git.kernel.org/bpf/bpf-next/c/d951c14ad237
  - [bpf-next,v8,05/18] ice: Support HW timestamp hint
    https://git.kernel.org/bpf/bpf-next/c/9031d5f491b9
  - [bpf-next,v8,06/18] ice: Support RX hash XDP hint
    https://git.kernel.org/bpf/bpf-next/c/0e6a7b095970
  - [bpf-next,v8,07/18] xsk: add functions to fill control buffer
    https://git.kernel.org/bpf/bpf-next/c/b4e352ff1169
  - [bpf-next,v8,08/18] ice: Support XDP hints in AF_XDP ZC mode
    https://git.kernel.org/bpf/bpf-next/c/d68d707dcbbf
  - [bpf-next,v8,09/18] xdp: Add VLAN tag hint
    https://git.kernel.org/bpf/bpf-next/c/e6795330f88b
  - [bpf-next,v8,10/18] ice: Implement VLAN tag hint
    https://git.kernel.org/bpf/bpf-next/c/714ed949c6f3
  - [bpf-next,v8,11/18] ice: use VLAN proto from ring packet context in skb path
    https://git.kernel.org/bpf/bpf-next/c/b591137c4ec3
  - [bpf-next,v8,12/18] veth: Implement VLAN tag XDP hint
    https://git.kernel.org/bpf/bpf-next/c/fca783799f64
  - [bpf-next,v8,13/18] net: make vlan_get_tag() return -ENODATA instead of -EINVAL
    https://git.kernel.org/bpf/bpf-next/c/537fec0733c4
  - [bpf-next,v8,14/18] mlx5: implement VLAN tag XDP hint
    https://git.kernel.org/bpf/bpf-next/c/7978bad4b6b9
  - [bpf-next,v8,15/18] selftests/bpf: Allow VLAN packets in xdp_hw_metadata
    https://git.kernel.org/bpf/bpf-next/c/e71a9fa7fdb2
  - [bpf-next,v8,16/18] selftests/bpf: Add flags and VLAN hint to xdp_hw_metadata
    https://git.kernel.org/bpf/bpf-next/c/8e68a4beba94
  - [bpf-next,v8,17/18] selftests/bpf: Add AF_INET packet generation to xdp_metadata
    https://git.kernel.org/bpf/bpf-next/c/a3850af4ea25
  - [bpf-next,v8,18/18] selftests/bpf: Check VLAN tag and proto in xdp_metadata
    https://git.kernel.org/bpf/bpf-next/c/4c6612f6100c

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html