XDP hardware hints discussion mail archive
 help / color / mirror / Atom feed
* [xdp-hints] [PATCH bpf-next V2 0/6] XDP-hints kfuncs for Intel driver igc
@ 2023-03-21 13:47 Jesper Dangaard Brouer
  2023-03-21 13:47 ` [xdp-hints] [PATCH bpf-next V2 1/6] igc: enable and fix RX hash usage by netstack Jesper Dangaard Brouer
                   ` (5 more replies)
  0 siblings, 6 replies; 20+ messages in thread
From: Jesper Dangaard Brouer @ 2023-03-21 13:47 UTC (permalink / raw)
  To: bpf
  Cc: Jesper Dangaard Brouer, netdev, Stanislav Fomichev, martin.lau,
	ast, daniel, alexandr.lobakin, larysa.zaremba, xdp-hints,
	anthony.l.nguyen, yoong.siang.song, boon.leong.ong,
	intel-wired-lan, pabeni, jesse.brandeburg, kuba, edumazet,
	john.fastabend, hawk, davem

Implemented XDP-hints metadata kfuncs for Intel driver igc.

Primarily used the tool in tools/testing/selftests/bpf/ xdp_hw_metadata,
when doing driver development of these features. Recommend other driver
developers to do the same. In the process xdp_hw_metadata was updated to
help assist development. I've documented my practical experience with igc
and tool here[1].

[1] https://github.com/xdp-project/xdp-project/blob/master/areas/hints/xdp_hints_kfuncs02_driver_igc.org

This patchset implement RX-hash as a simple u32 value (as this is the
current kfunc API), but my experience with RX-hash is that we will also
need to provide the Hash-type for this raw value to be useful to
BPF-developers. This will be addressed in followup work once this patchset
lands.

---

Jesper Dangaard Brouer (6):
      igc: enable and fix RX hash usage by netstack
      selftests/bpf: xdp_hw_metadata track more timestamps
      selftests/bpf: xdp_hw_metadata RX hash return code info
      igc: add igc_xdp_buff wrapper for xdp_buff in driver
      igc: add XDP hints kfuncs for RX timestamp
      igc: add XDP hints kfuncs for RX hash


 drivers/net/ethernet/intel/igc/igc.h          | 35 +++++++
 drivers/net/ethernet/intel/igc/igc_main.c     | 94 ++++++++++++++++---
 .../selftests/bpf/progs/xdp_hw_metadata.c     | 18 ++--
 tools/testing/selftests/bpf/xdp_hw_metadata.c | 51 ++++++++--
 tools/testing/selftests/bpf/xdp_metadata.h    |  1 +
 5 files changed, 176 insertions(+), 23 deletions(-)

--



^ permalink raw reply	[flat|nested] 20+ messages in thread

* [xdp-hints] [PATCH bpf-next V2 1/6] igc: enable and fix RX hash usage by netstack
  2023-03-21 13:47 [xdp-hints] [PATCH bpf-next V2 0/6] XDP-hints kfuncs for Intel driver igc Jesper Dangaard Brouer
@ 2023-03-21 13:47 ` Jesper Dangaard Brouer
  2023-03-21 13:47 ` [xdp-hints] [PATCH bpf-next V2 2/6] selftests/bpf: xdp_hw_metadata track more timestamps Jesper Dangaard Brouer
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 20+ messages in thread
From: Jesper Dangaard Brouer @ 2023-03-21 13:47 UTC (permalink / raw)
  To: bpf
  Cc: Jesper Dangaard Brouer, netdev, Stanislav Fomichev, martin.lau,
	ast, daniel, alexandr.lobakin, larysa.zaremba, xdp-hints,
	anthony.l.nguyen, yoong.siang.song, boon.leong.ong,
	intel-wired-lan, pabeni, jesse.brandeburg, kuba, edumazet,
	john.fastabend, hawk, davem

When function igc_rx_hash() was introduced in v4.20 via commit 0507ef8a0372
("igc: Add transmit and receive fastpath and interrupt handlers"), the
hardware wasn't configured to provide RSS hash, thus it made sense to not
enable net_device NETIF_F_RXHASH feature bit.

The NIC hardware was configured to enable RSS hash info in v5.2 via commit
2121c2712f82 ("igc: Add multiple receive queues control supporting"), but
forgot to set the NETIF_F_RXHASH feature bit.

The original implementation of igc_rx_hash() didn't extract the associated
pkt_hash_type, but statically set PKT_HASH_TYPE_L3. The largest portions of
this patch are about extracting the RSS Type from the hardware and mapping
this to enum pkt_hash_types. This was based on Foxville i225 software user
manual rev-1.3.1 and tested on Intel Ethernet Controller I225-LM (rev 03).

For UDP it's worth noting that RSS (type) hashing have been disabled both for
IPv4 and IPv6 (see IGC_MRQC_RSS_FIELD_IPV4_UDP + IGC_MRQC_RSS_FIELD_IPV6_UDP)
because hardware RSS doesn't handle fragmented pkts well when enabled (can
cause out-of-order). This results in PKT_HASH_TYPE_L3 for UDP packets, and
hash value doesn't include UDP port numbers. Not being PKT_HASH_TYPE_L4, have
the effect that netstack will do a software based hash calc calling into
flow_dissect, but only when code calls skb_get_hash(), which doesn't
necessary happen for local delivery.

Fixes: 2121c2712f82 ("igc: Add multiple receive queues control supporting")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 drivers/net/ethernet/intel/igc/igc.h      |   28 ++++++++++++++++++++++++++
 drivers/net/ethernet/intel/igc/igc_main.c |   31 +++++++++++++++++++++++++----
 2 files changed, 55 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/intel/igc/igc.h b/drivers/net/ethernet/intel/igc/igc.h
index df3e26c0cf01..f83cbc4a1afa 100644
--- a/drivers/net/ethernet/intel/igc/igc.h
+++ b/drivers/net/ethernet/intel/igc/igc.h
@@ -13,6 +13,7 @@
 #include <linux/ptp_clock_kernel.h>
 #include <linux/timecounter.h>
 #include <linux/net_tstamp.h>
+#include <linux/bitfield.h>
 
 #include "igc_hw.h"
 
@@ -311,6 +312,33 @@ extern char igc_driver_name[];
 #define IGC_MRQC_RSS_FIELD_IPV4_UDP	0x00400000
 #define IGC_MRQC_RSS_FIELD_IPV6_UDP	0x00800000
 
+/* RX-desc Write-Back format RSS Type's */
+enum igc_rss_type_num {
+	IGC_RSS_TYPE_NO_HASH		= 0,
+	IGC_RSS_TYPE_HASH_TCP_IPV4	= 1,
+	IGC_RSS_TYPE_HASH_IPV4		= 2,
+	IGC_RSS_TYPE_HASH_TCP_IPV6	= 3,
+	IGC_RSS_TYPE_HASH_IPV6_EX	= 4,
+	IGC_RSS_TYPE_HASH_IPV6		= 5,
+	IGC_RSS_TYPE_HASH_TCP_IPV6_EX	= 6,
+	IGC_RSS_TYPE_HASH_UDP_IPV4	= 7,
+	IGC_RSS_TYPE_HASH_UDP_IPV6	= 8,
+	IGC_RSS_TYPE_HASH_UDP_IPV6_EX	= 9,
+	IGC_RSS_TYPE_MAX		= 10,
+};
+#define IGC_RSS_TYPE_MAX_TABLE		16
+#define IGC_RSS_TYPE_MASK		GENMASK(3,0) /* 4-bits (3:0) = mask 0x0F */
+
+/* igc_rss_type - Rx descriptor RSS type field */
+static inline u32 igc_rss_type(const union igc_adv_rx_desc *rx_desc)
+{
+	/* RSS Type 4-bits (3:0) number: 0-9 (above 9 is reserved)
+	 * Accessing the same bits via u16 (wb.lower.lo_dword.hs_rss.pkt_info)
+	 * is slightly slower than via u32 (wb.lower.lo_dword.data)
+	 */
+	return le32_get_bits(rx_desc->wb.lower.lo_dword.data, IGC_RSS_TYPE_MASK);
+}
+
 /* Interrupt defines */
 #define IGC_START_ITR			648 /* ~6000 ints/sec */
 #define IGC_4K_ITR			980
diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index 2928a6c73692..f6a54feec011 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -1677,14 +1677,36 @@ static void igc_rx_checksum(struct igc_ring *ring,
 		   le32_to_cpu(rx_desc->wb.upper.status_error));
 }
 
+/* Mapping HW RSS Type to enum pkt_hash_types */
+enum pkt_hash_types igc_rss_type_table[IGC_RSS_TYPE_MAX_TABLE] = {
+	[IGC_RSS_TYPE_NO_HASH]		= PKT_HASH_TYPE_L2,
+	[IGC_RSS_TYPE_HASH_TCP_IPV4]	= PKT_HASH_TYPE_L4,
+	[IGC_RSS_TYPE_HASH_IPV4]	= PKT_HASH_TYPE_L3,
+	[IGC_RSS_TYPE_HASH_TCP_IPV6]	= PKT_HASH_TYPE_L4,
+	[IGC_RSS_TYPE_HASH_IPV6_EX]	= PKT_HASH_TYPE_L3,
+	[IGC_RSS_TYPE_HASH_IPV6]	= PKT_HASH_TYPE_L3,
+	[IGC_RSS_TYPE_HASH_TCP_IPV6_EX] = PKT_HASH_TYPE_L4,
+	[IGC_RSS_TYPE_HASH_UDP_IPV4]	= PKT_HASH_TYPE_L4,
+	[IGC_RSS_TYPE_HASH_UDP_IPV6]	= PKT_HASH_TYPE_L4,
+	[IGC_RSS_TYPE_HASH_UDP_IPV6_EX] = PKT_HASH_TYPE_L4,
+	[10] = PKT_HASH_TYPE_NONE, /* RSS Type above 9 "Reserved" by HW  */
+	[11] = PKT_HASH_TYPE_NONE, /* keep array sized for SW bit-mask   */
+	[12] = PKT_HASH_TYPE_NONE, /* to handle future HW revisons       */
+	[13] = PKT_HASH_TYPE_NONE,
+	[14] = PKT_HASH_TYPE_NONE,
+	[15] = PKT_HASH_TYPE_NONE,
+};
+
 static inline void igc_rx_hash(struct igc_ring *ring,
 			       union igc_adv_rx_desc *rx_desc,
 			       struct sk_buff *skb)
 {
-	if (ring->netdev->features & NETIF_F_RXHASH)
-		skb_set_hash(skb,
-			     le32_to_cpu(rx_desc->wb.lower.hi_dword.rss),
-			     PKT_HASH_TYPE_L3);
+	if (ring->netdev->features & NETIF_F_RXHASH) {
+		u32 rss_hash = le32_to_cpu(rx_desc->wb.lower.hi_dword.rss);
+		u32 rss_type = igc_rss_type(rx_desc);
+
+		skb_set_hash(skb, rss_hash, igc_rss_type_table[rss_type]);
+	}
 }
 
 static void igc_rx_vlan(struct igc_ring *rx_ring,
@@ -6543,6 +6565,7 @@ static int igc_probe(struct pci_dev *pdev,
 	netdev->features |= NETIF_F_TSO;
 	netdev->features |= NETIF_F_TSO6;
 	netdev->features |= NETIF_F_TSO_ECN;
+	netdev->features |= NETIF_F_RXHASH;
 	netdev->features |= NETIF_F_RXCSUM;
 	netdev->features |= NETIF_F_HW_CSUM;
 	netdev->features |= NETIF_F_SCTP_CRC;



^ permalink raw reply	[flat|nested] 20+ messages in thread

* [xdp-hints] [PATCH bpf-next V2 2/6] selftests/bpf: xdp_hw_metadata track more timestamps
  2023-03-21 13:47 [xdp-hints] [PATCH bpf-next V2 0/6] XDP-hints kfuncs for Intel driver igc Jesper Dangaard Brouer
  2023-03-21 13:47 ` [xdp-hints] [PATCH bpf-next V2 1/6] igc: enable and fix RX hash usage by netstack Jesper Dangaard Brouer
@ 2023-03-21 13:47 ` Jesper Dangaard Brouer
  2023-03-21 13:47 ` [xdp-hints] [PATCH bpf-next V2 3/6] selftests/bpf: xdp_hw_metadata RX hash return code info Jesper Dangaard Brouer
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 20+ messages in thread
From: Jesper Dangaard Brouer @ 2023-03-21 13:47 UTC (permalink / raw)
  To: bpf
  Cc: Jesper Dangaard Brouer, netdev, Stanislav Fomichev, martin.lau,
	ast, daniel, alexandr.lobakin, larysa.zaremba, xdp-hints,
	anthony.l.nguyen, yoong.siang.song, boon.leong.ong,
	intel-wired-lan, pabeni, jesse.brandeburg, kuba, edumazet,
	john.fastabend, hawk, davem

To correlate the hardware RX timestamp with something, add tracking of
two software timestamps both clock source CLOCK_TAI (see description in
man clock_gettime(2)).

XDP metadata is extended with xdp_timestamp for capturing when XDP
received the packet. Populated with BPF helper bpf_ktime_get_tai_ns(). I
could not find a BPF helper for getting CLOCK_REALTIME, which would have
been preferred. In userspace when AF_XDP sees the packet another
software timestamp is recorded via clock_gettime() also clock source
CLOCK_TAI.

Example output shortly after loading igc driver:

  poll: 1 (0)
  xsk_ring_cons__peek: 1
  0x11fc958: rx_desc[7]->addr=10000000000f000 addr=f100 comp_addr=f000
  rx_hash: 0x00000000
  rx_timestamp:  1676297171760293047 (sec:1676297171.7603)
  XDP RX-time:   1676297208760355863 (sec:1676297208.7604) delta sec:37.0001
  AF_XDP time:   1676297208760416292 (sec:1676297208.7604) delta sec:0.0001 (60.429 usec)
  0x11fc958: complete idx=15 addr=f000

The first observation is that the 37 sec difference between RX HW vs XDP
timestamps, which indicate hardware is likely clock source
CLOCK_REALTIME, because (as of this writing) CLOCK_TAI is initialised
with a 37 sec offset.

The 60 usec (microsec) difference between XDP vs AF_XDP userspace is the
userspace wakeup time. On this hardware it was caused by CPU idle sleep
states, which can be reduced by tuning /dev/cpu_dma_latency.

View current requested/allowed latency bound via:
  hexdump --format '"%d\n"' /dev/cpu_dma_latency

More explanation of the output and how this can be used to identify
clock drift for the HW clock can be seen here[1]:

[1] https://github.com/xdp-project/xdp-project/blob/master/areas/hints/xdp_hints_kfuncs02_driver_igc.org

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Stanislav Fomichev <sdf@google.com>
---
 .../testing/selftests/bpf/progs/xdp_hw_metadata.c  |    9 +++-
 tools/testing/selftests/bpf/xdp_hw_metadata.c      |   46 ++++++++++++++++++--
 tools/testing/selftests/bpf/xdp_metadata.h         |    1 
 3 files changed, 48 insertions(+), 8 deletions(-)

diff --git a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
index 4c55b4d79d3d..40c17adbf483 100644
--- a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
+++ b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
@@ -69,10 +69,13 @@ int rx(struct xdp_md *ctx)
 		return XDP_PASS;
 	}
 
-	if (!bpf_xdp_metadata_rx_timestamp(ctx, &meta->rx_timestamp))
-		bpf_printk("populated rx_timestamp with %llu", meta->rx_timestamp);
-	else
+	if (!bpf_xdp_metadata_rx_timestamp(ctx, &meta->rx_timestamp)) {
+		meta->xdp_timestamp = bpf_ktime_get_tai_ns();
+		bpf_printk("populated rx_timestamp with  %llu", meta->rx_timestamp);
+		bpf_printk("populated xdp_timestamp with %llu", meta->xdp_timestamp);
+	} else {
 		meta->rx_timestamp = 0; /* Used by AF_XDP as not avail signal */
+	}
 
 	if (!bpf_xdp_metadata_rx_hash(ctx, &meta->rx_hash))
 		bpf_printk("populated rx_hash with %u", meta->rx_hash);
diff --git a/tools/testing/selftests/bpf/xdp_hw_metadata.c b/tools/testing/selftests/bpf/xdp_hw_metadata.c
index 1c8acb68b977..400bfe19abfe 100644
--- a/tools/testing/selftests/bpf/xdp_hw_metadata.c
+++ b/tools/testing/selftests/bpf/xdp_hw_metadata.c
@@ -27,6 +27,7 @@
 #include <sys/mman.h>
 #include <net/if.h>
 #include <poll.h>
+#include <time.h>
 
 #include "xdp_metadata.h"
 
@@ -134,14 +135,47 @@ static void refill_rx(struct xsk *xsk, __u64 addr)
 	}
 }
 
-static void verify_xdp_metadata(void *data)
+#define NANOSEC_PER_SEC 1000000000 /* 10^9 */
+static __u64 gettime(clockid_t clock_id)
+{
+	struct timespec t;
+	int res;
+
+	/* See man clock_gettime(2) for type of clock_id's */
+	res = clock_gettime(clock_id, &t);
+
+	if (res < 0)
+		error(res, errno, "Error with clock_gettime()");
+
+	return (__u64) t.tv_sec * NANOSEC_PER_SEC + t.tv_nsec;
+}
+
+static void verify_xdp_metadata(void *data, clockid_t clock_id)
 {
 	struct xdp_meta *meta;
 
 	meta = data - sizeof(*meta);
 
-	printf("rx_timestamp: %llu\n", meta->rx_timestamp);
 	printf("rx_hash: %u\n", meta->rx_hash);
+	printf("rx_timestamp:  %llu (sec:%0.4f)\n", meta->rx_timestamp,
+	       (double)meta->rx_timestamp / NANOSEC_PER_SEC);
+	if (meta->rx_timestamp) {
+		__u64 usr_clock = gettime(clock_id);
+		__u64 xdp_clock = meta->xdp_timestamp;
+		__s64 delta_X = xdp_clock - meta->rx_timestamp;
+		__s64 delta_X2U = usr_clock - xdp_clock;
+
+		printf("XDP RX-time:   %llu (sec:%0.4f) delta sec:%0.4f (%0.3f usec)\n",
+		       xdp_clock, (double)xdp_clock / NANOSEC_PER_SEC,
+		       (double)delta_X / NANOSEC_PER_SEC,
+		       (double)delta_X / 1000);
+
+		printf("AF_XDP time:   %llu (sec:%0.4f) delta sec:%0.4f (%0.3f usec)\n",
+		       usr_clock, (double)usr_clock / NANOSEC_PER_SEC,
+		       (double)delta_X2U / NANOSEC_PER_SEC,
+		       (double)delta_X2U / 1000);
+	}
+
 }
 
 static void verify_skb_metadata(int fd)
@@ -189,7 +223,7 @@ static void verify_skb_metadata(int fd)
 	printf("skb hwtstamp is not found!\n");
 }
 
-static int verify_metadata(struct xsk *rx_xsk, int rxq, int server_fd)
+static int verify_metadata(struct xsk *rx_xsk, int rxq, int server_fd, clockid_t clock_id)
 {
 	const struct xdp_desc *rx_desc;
 	struct pollfd fds[rxq + 1];
@@ -237,7 +271,8 @@ static int verify_metadata(struct xsk *rx_xsk, int rxq, int server_fd)
 			addr = xsk_umem__add_offset_to_addr(rx_desc->addr);
 			printf("%p: rx_desc[%u]->addr=%llx addr=%llx comp_addr=%llx\n",
 			       xsk, idx, rx_desc->addr, addr, comp_addr);
-			verify_xdp_metadata(xsk_umem__get_data(xsk->umem_area, addr));
+			verify_xdp_metadata(xsk_umem__get_data(xsk->umem_area, addr),
+					    clock_id);
 			xsk_ring_cons__release(&xsk->rx, 1);
 			refill_rx(xsk, comp_addr);
 		}
@@ -364,6 +399,7 @@ static void timestamping_enable(int fd, int val)
 
 int main(int argc, char *argv[])
 {
+	clockid_t clock_id = CLOCK_TAI;
 	int server_fd = -1;
 	int ret;
 	int i;
@@ -437,7 +473,7 @@ int main(int argc, char *argv[])
 		error(1, -ret, "bpf_xdp_attach");
 
 	signal(SIGINT, handle_signal);
-	ret = verify_metadata(rx_xsk, rxq, server_fd);
+	ret = verify_metadata(rx_xsk, rxq, server_fd, clock_id);
 	close(server_fd);
 	cleanup();
 	if (ret)
diff --git a/tools/testing/selftests/bpf/xdp_metadata.h b/tools/testing/selftests/bpf/xdp_metadata.h
index f6780fbb0a21..260345b2c6f1 100644
--- a/tools/testing/selftests/bpf/xdp_metadata.h
+++ b/tools/testing/selftests/bpf/xdp_metadata.h
@@ -11,5 +11,6 @@
 
 struct xdp_meta {
 	__u64 rx_timestamp;
+	__u64 xdp_timestamp;
 	__u32 rx_hash;
 };



^ permalink raw reply	[flat|nested] 20+ messages in thread

* [xdp-hints] [PATCH bpf-next V2 3/6] selftests/bpf: xdp_hw_metadata RX hash return code info
  2023-03-21 13:47 [xdp-hints] [PATCH bpf-next V2 0/6] XDP-hints kfuncs for Intel driver igc Jesper Dangaard Brouer
  2023-03-21 13:47 ` [xdp-hints] [PATCH bpf-next V2 1/6] igc: enable and fix RX hash usage by netstack Jesper Dangaard Brouer
  2023-03-21 13:47 ` [xdp-hints] [PATCH bpf-next V2 2/6] selftests/bpf: xdp_hw_metadata track more timestamps Jesper Dangaard Brouer
@ 2023-03-21 13:47 ` Jesper Dangaard Brouer
  2023-03-21 18:47   ` [xdp-hints] " Stanislav Fomichev
  2023-03-21 13:47 ` [xdp-hints] [PATCH bpf-next V2 4/6] igc: add igc_xdp_buff wrapper for xdp_buff in driver Jesper Dangaard Brouer
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 20+ messages in thread
From: Jesper Dangaard Brouer @ 2023-03-21 13:47 UTC (permalink / raw)
  To: bpf
  Cc: Jesper Dangaard Brouer, netdev, Stanislav Fomichev, martin.lau,
	ast, daniel, alexandr.lobakin, larysa.zaremba, xdp-hints,
	anthony.l.nguyen, yoong.siang.song, boon.leong.ong,
	intel-wired-lan, pabeni, jesse.brandeburg, kuba, edumazet,
	john.fastabend, hawk, davem

When driver developers add XDP-hints kfuncs for RX hash it is
practical to print the return code in bpf_printk trace pipe log.

Print hash value as a hex value, both AF_XDP userspace and bpf_prog,
as this makes it easier to spot poor quality hashes.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 .../testing/selftests/bpf/progs/xdp_hw_metadata.c  |    9 ++++++---
 tools/testing/selftests/bpf/xdp_hw_metadata.c      |    5 ++++-
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
index 40c17adbf483..ce07010e4d48 100644
--- a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
+++ b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
@@ -77,10 +77,13 @@ int rx(struct xdp_md *ctx)
 		meta->rx_timestamp = 0; /* Used by AF_XDP as not avail signal */
 	}
 
-	if (!bpf_xdp_metadata_rx_hash(ctx, &meta->rx_hash))
-		bpf_printk("populated rx_hash with %u", meta->rx_hash);
-	else
+	ret = bpf_xdp_metadata_rx_hash(ctx, &meta->rx_hash);
+	if (ret >= 0) {
+		bpf_printk("populated rx_hash with 0x%08X", meta->rx_hash);
+	} else {
+		bpf_printk("rx_hash not-avail errno:%d", ret);
 		meta->rx_hash = 0; /* Used by AF_XDP as not avail signal */
+	}
 
 	return bpf_redirect_map(&xsk, ctx->rx_queue_index, XDP_PASS);
 }
diff --git a/tools/testing/selftests/bpf/xdp_hw_metadata.c b/tools/testing/selftests/bpf/xdp_hw_metadata.c
index 400bfe19abfe..f3ec07ccdc95 100644
--- a/tools/testing/selftests/bpf/xdp_hw_metadata.c
+++ b/tools/testing/selftests/bpf/xdp_hw_metadata.c
@@ -3,6 +3,9 @@
 /* Reference program for verifying XDP metadata on real HW. Functional test
  * only, doesn't test the performance.
  *
+ * BPF-prog bpf_printk info outout can be access via
+ * /sys/kernel/debug/tracing/trace_pipe
+ *
  * RX:
  * - UDP 9091 packets are diverted into AF_XDP
  * - Metadata verified:
@@ -156,7 +159,7 @@ static void verify_xdp_metadata(void *data, clockid_t clock_id)
 
 	meta = data - sizeof(*meta);
 
-	printf("rx_hash: %u\n", meta->rx_hash);
+	printf("rx_hash: 0x%08X\n", meta->rx_hash);
 	printf("rx_timestamp:  %llu (sec:%0.4f)\n", meta->rx_timestamp,
 	       (double)meta->rx_timestamp / NANOSEC_PER_SEC);
 	if (meta->rx_timestamp) {



^ permalink raw reply	[flat|nested] 20+ messages in thread

* [xdp-hints] [PATCH bpf-next V2 4/6] igc: add igc_xdp_buff wrapper for xdp_buff in driver
  2023-03-21 13:47 [xdp-hints] [PATCH bpf-next V2 0/6] XDP-hints kfuncs for Intel driver igc Jesper Dangaard Brouer
                   ` (2 preceding siblings ...)
  2023-03-21 13:47 ` [xdp-hints] [PATCH bpf-next V2 3/6] selftests/bpf: xdp_hw_metadata RX hash return code info Jesper Dangaard Brouer
@ 2023-03-21 13:47 ` Jesper Dangaard Brouer
  2023-03-21 13:47 ` [xdp-hints] [PATCH bpf-next V2 5/6] igc: add XDP hints kfuncs for RX timestamp Jesper Dangaard Brouer
  2023-03-21 13:47 ` [xdp-hints] [PATCH bpf-next V2 6/6] igc: add XDP hints kfuncs for RX hash Jesper Dangaard Brouer
  5 siblings, 0 replies; 20+ messages in thread
From: Jesper Dangaard Brouer @ 2023-03-21 13:47 UTC (permalink / raw)
  To: bpf
  Cc: Jesper Dangaard Brouer, netdev, Stanislav Fomichev, martin.lau,
	ast, daniel, alexandr.lobakin, larysa.zaremba, xdp-hints,
	anthony.l.nguyen, yoong.siang.song, boon.leong.ong,
	intel-wired-lan, pabeni, jesse.brandeburg, kuba, edumazet,
	john.fastabend, hawk, davem

Driver specific metadata data for XDP-hints kfuncs are propagated via tail
extending the struct xdp_buff with a locally scoped driver struct.

Zero-Copy AF_XDP/XSK does similar tricks via struct xdp_buff_xsk. This
xdp_buff_xsk struct contains a CB area (24 bytes) that can be used for
extending the locally scoped driver into. The XSK_CHECK_PRIV_TYPE define
catch size violations build time.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 drivers/net/ethernet/intel/igc/igc.h      |    6 ++++++
 drivers/net/ethernet/intel/igc/igc_main.c |   30 ++++++++++++++++++++++-------
 2 files changed, 29 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/intel/igc/igc.h b/drivers/net/ethernet/intel/igc/igc.h
index f83cbc4a1afa..bc67a52e47e8 100644
--- a/drivers/net/ethernet/intel/igc/igc.h
+++ b/drivers/net/ethernet/intel/igc/igc.h
@@ -499,6 +499,12 @@ struct igc_rx_buffer {
 	};
 };
 
+/* context wrapper around xdp_buff to provide access to descriptor metadata */
+struct igc_xdp_buff {
+	struct xdp_buff xdp;
+	union igc_adv_rx_desc *rx_desc;
+};
+
 struct igc_q_vector {
 	struct igc_adapter *adapter;    /* backlink */
 	void __iomem *itr_register;
diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index f6a54feec011..a78d7e6bcfd6 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -2223,6 +2223,8 @@ static bool igc_alloc_rx_buffers_zc(struct igc_ring *ring, u16 count)
 	if (!count)
 		return ok;
 
+	XSK_CHECK_PRIV_TYPE(struct igc_xdp_buff);
+
 	desc = IGC_RX_DESC(ring, i);
 	bi = &ring->rx_buffer_info[i];
 	i -= ring->count;
@@ -2507,8 +2509,8 @@ static int igc_clean_rx_irq(struct igc_q_vector *q_vector, const int budget)
 		union igc_adv_rx_desc *rx_desc;
 		struct igc_rx_buffer *rx_buffer;
 		unsigned int size, truesize;
+		struct igc_xdp_buff ctx;
 		ktime_t timestamp = 0;
-		struct xdp_buff xdp;
 		int pkt_offset = 0;
 		void *pktbuf;
 
@@ -2542,13 +2544,14 @@ static int igc_clean_rx_irq(struct igc_q_vector *q_vector, const int budget)
 		}
 
 		if (!skb) {
-			xdp_init_buff(&xdp, truesize, &rx_ring->xdp_rxq);
-			xdp_prepare_buff(&xdp, pktbuf - igc_rx_offset(rx_ring),
+			xdp_init_buff(&ctx.xdp, truesize, &rx_ring->xdp_rxq);
+			xdp_prepare_buff(&ctx.xdp, pktbuf - igc_rx_offset(rx_ring),
 					 igc_rx_offset(rx_ring) + pkt_offset,
 					 size, true);
-			xdp_buff_clear_frags_flag(&xdp);
+			xdp_buff_clear_frags_flag(&ctx.xdp);
+			ctx.rx_desc = rx_desc;
 
-			skb = igc_xdp_run_prog(adapter, &xdp);
+			skb = igc_xdp_run_prog(adapter, &ctx.xdp);
 		}
 
 		if (IS_ERR(skb)) {
@@ -2570,9 +2573,9 @@ static int igc_clean_rx_irq(struct igc_q_vector *q_vector, const int budget)
 		} else if (skb)
 			igc_add_rx_frag(rx_ring, rx_buffer, skb, size);
 		else if (ring_uses_build_skb(rx_ring))
-			skb = igc_build_skb(rx_ring, rx_buffer, &xdp);
+			skb = igc_build_skb(rx_ring, rx_buffer, &ctx.xdp);
 		else
-			skb = igc_construct_skb(rx_ring, rx_buffer, &xdp,
+			skb = igc_construct_skb(rx_ring, rx_buffer, &ctx.xdp,
 						timestamp);
 
 		/* exit if we failed to retrieve a buffer */
@@ -2673,6 +2676,15 @@ static void igc_dispatch_skb_zc(struct igc_q_vector *q_vector,
 	napi_gro_receive(&q_vector->napi, skb);
 }
 
+static struct igc_xdp_buff *xsk_buff_to_igc_ctx(struct xdp_buff *xdp)
+{
+	/* xdp_buff pointer used by ZC code path is alloc as xdp_buff_xsk. The
+	 * igc_xdp_buff shares its layout with xdp_buff_xsk and private
+	 * igc_xdp_buff fields fall into xdp_buff_xsk->cb
+	 */
+       return (struct igc_xdp_buff *)xdp;
+}
+
 static int igc_clean_rx_irq_zc(struct igc_q_vector *q_vector, const int budget)
 {
 	struct igc_adapter *adapter = q_vector->adapter;
@@ -2691,6 +2703,7 @@ static int igc_clean_rx_irq_zc(struct igc_q_vector *q_vector, const int budget)
 	while (likely(total_packets < budget)) {
 		union igc_adv_rx_desc *desc;
 		struct igc_rx_buffer *bi;
+		struct igc_xdp_buff *ctx;
 		ktime_t timestamp = 0;
 		unsigned int size;
 		int res;
@@ -2708,6 +2721,9 @@ static int igc_clean_rx_irq_zc(struct igc_q_vector *q_vector, const int budget)
 
 		bi = &ring->rx_buffer_info[ntc];
 
+		ctx = xsk_buff_to_igc_ctx(bi->xdp);
+		ctx->rx_desc = desc;
+
 		if (igc_test_staterr(desc, IGC_RXDADV_STAT_TSIP)) {
 			timestamp = igc_ptp_rx_pktstamp(q_vector->adapter,
 							bi->xdp->data);



^ permalink raw reply	[flat|nested] 20+ messages in thread

* [xdp-hints] [PATCH bpf-next V2 5/6] igc: add XDP hints kfuncs for RX timestamp
  2023-03-21 13:47 [xdp-hints] [PATCH bpf-next V2 0/6] XDP-hints kfuncs for Intel driver igc Jesper Dangaard Brouer
                   ` (3 preceding siblings ...)
  2023-03-21 13:47 ` [xdp-hints] [PATCH bpf-next V2 4/6] igc: add igc_xdp_buff wrapper for xdp_buff in driver Jesper Dangaard Brouer
@ 2023-03-21 13:47 ` Jesper Dangaard Brouer
  2023-03-21 13:47 ` [xdp-hints] [PATCH bpf-next V2 6/6] igc: add XDP hints kfuncs for RX hash Jesper Dangaard Brouer
  5 siblings, 0 replies; 20+ messages in thread
From: Jesper Dangaard Brouer @ 2023-03-21 13:47 UTC (permalink / raw)
  To: bpf
  Cc: Jesper Dangaard Brouer, netdev, Stanislav Fomichev, martin.lau,
	ast, daniel, alexandr.lobakin, larysa.zaremba, xdp-hints,
	anthony.l.nguyen, yoong.siang.song, boon.leong.ong,
	intel-wired-lan, pabeni, jesse.brandeburg, kuba, edumazet,
	john.fastabend, hawk, davem

The NIC hardware RX timestamping mechanism adds an optional tailored
header before the MAC header containing packet reception time. Optional
depending on RX descriptor TSIP status bit (IGC_RXDADV_STAT_TSIP). In
case this bit is set driver does offset adjustments to packet data start
and extracts the timestamp.

The timestamp need to be extracted before invoking the XDP bpf_prog,
because this area just before the packet is also accessible by XDP via
data_meta context pointer (and helper bpf_xdp_adjust_meta). Thus, an XDP
bpf_prog can potentially overwrite this and corrupt data that we want to
extract with the new kfunc for reading the timestamp.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 drivers/net/ethernet/intel/igc/igc.h      |    1 +
 drivers/net/ethernet/intel/igc/igc_main.c |   20 ++++++++++++++++++++
 2 files changed, 21 insertions(+)

diff --git a/drivers/net/ethernet/intel/igc/igc.h b/drivers/net/ethernet/intel/igc/igc.h
index bc67a52e47e8..29941734f1a1 100644
--- a/drivers/net/ethernet/intel/igc/igc.h
+++ b/drivers/net/ethernet/intel/igc/igc.h
@@ -503,6 +503,7 @@ struct igc_rx_buffer {
 struct igc_xdp_buff {
 	struct xdp_buff xdp;
 	union igc_adv_rx_desc *rx_desc;
+	ktime_t rx_ts; /* data indication bit IGC_RXDADV_STAT_TSIP */
 };
 
 struct igc_q_vector {
diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index a78d7e6bcfd6..f66285c85444 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -2539,6 +2539,7 @@ static int igc_clean_rx_irq(struct igc_q_vector *q_vector, const int budget)
 		if (igc_test_staterr(rx_desc, IGC_RXDADV_STAT_TSIP)) {
 			timestamp = igc_ptp_rx_pktstamp(q_vector->adapter,
 							pktbuf);
+			ctx.rx_ts = timestamp;
 			pkt_offset = IGC_TS_HDR_LEN;
 			size -= IGC_TS_HDR_LEN;
 		}
@@ -2727,6 +2728,7 @@ static int igc_clean_rx_irq_zc(struct igc_q_vector *q_vector, const int budget)
 		if (igc_test_staterr(desc, IGC_RXDADV_STAT_TSIP)) {
 			timestamp = igc_ptp_rx_pktstamp(q_vector->adapter,
 							bi->xdp->data);
+			ctx->rx_ts = timestamp;
 
 			bi->xdp->data += IGC_TS_HDR_LEN;
 
@@ -6481,6 +6483,23 @@ u32 igc_rd32(struct igc_hw *hw, u32 reg)
 	return value;
 }
 
+static int igc_xdp_rx_timestamp(const struct xdp_md *_ctx, u64 *timestamp)
+{
+	const struct igc_xdp_buff *ctx = (void *)_ctx;
+
+	if (igc_test_staterr(ctx->rx_desc, IGC_RXDADV_STAT_TSIP)) {
+		*timestamp = ctx->rx_ts;
+
+		return 0;
+	}
+
+	return -ENODATA;
+}
+
+const struct xdp_metadata_ops igc_xdp_metadata_ops = {
+	.xmo_rx_timestamp		= igc_xdp_rx_timestamp,
+};
+
 /**
  * igc_probe - Device Initialization Routine
  * @pdev: PCI device information struct
@@ -6554,6 +6573,7 @@ static int igc_probe(struct pci_dev *pdev,
 	hw->hw_addr = adapter->io_addr;
 
 	netdev->netdev_ops = &igc_netdev_ops;
+	netdev->xdp_metadata_ops = &igc_xdp_metadata_ops;
 	igc_ethtool_set_ops(netdev);
 	netdev->watchdog_timeo = 5 * HZ;
 



^ permalink raw reply	[flat|nested] 20+ messages in thread

* [xdp-hints] [PATCH bpf-next V2 6/6] igc: add XDP hints kfuncs for RX hash
  2023-03-21 13:47 [xdp-hints] [PATCH bpf-next V2 0/6] XDP-hints kfuncs for Intel driver igc Jesper Dangaard Brouer
                   ` (4 preceding siblings ...)
  2023-03-21 13:47 ` [xdp-hints] [PATCH bpf-next V2 5/6] igc: add XDP hints kfuncs for RX timestamp Jesper Dangaard Brouer
@ 2023-03-21 13:47 ` Jesper Dangaard Brouer
  5 siblings, 0 replies; 20+ messages in thread
From: Jesper Dangaard Brouer @ 2023-03-21 13:47 UTC (permalink / raw)
  To: bpf
  Cc: Jesper Dangaard Brouer, netdev, Stanislav Fomichev, martin.lau,
	ast, daniel, alexandr.lobakin, larysa.zaremba, xdp-hints,
	anthony.l.nguyen, yoong.siang.song, boon.leong.ong,
	intel-wired-lan, pabeni, jesse.brandeburg, kuba, edumazet,
	john.fastabend, hawk, davem

This implements XDP hints kfunc for RX-hash (xmo_rx_hash) straightforward
by returning the u32 hash value.

The associated RSS-type for the hash value isn't available to the BPF-prog
caller. This is problematic if BPF-prog tries to do L4 load-balancing with
the hardware hash, but the RSS hash type is L3 based.

For this driver this issue occurs for UDP packets, as driver (default
config) does L3 hashing for UDP packets (excludes UDP src/dest ports in
hash calc). Tested that the igc_rss_type_num for UDP is either
IGC_RSS_TYPE_HASH_IPV4 or IGC_RSS_TYPE_HASH_IPV6.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 drivers/net/ethernet/intel/igc/igc_main.c |   13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index f66285c85444..846041119fd4 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -6496,8 +6496,21 @@ static int igc_xdp_rx_timestamp(const struct xdp_md *_ctx, u64 *timestamp)
 	return -ENODATA;
 }
 
+static int igc_xdp_rx_hash(const struct xdp_md *_ctx, u32 *hash)
+{
+	const struct igc_xdp_buff *ctx = (void *)_ctx;
+
+	if (!(ctx->xdp.rxq->dev->features & NETIF_F_RXHASH))
+		return -ENODATA;
+
+	*hash = le32_to_cpu(ctx->rx_desc->wb.lower.hi_dword.rss);
+
+	return 0;
+}
+
 const struct xdp_metadata_ops igc_xdp_metadata_ops = {
 	.xmo_rx_timestamp		= igc_xdp_rx_timestamp,
+	.xmo_rx_hash			= igc_xdp_rx_hash,
 };
 
 /**



^ permalink raw reply	[flat|nested] 20+ messages in thread

* [xdp-hints] Re: [PATCH bpf-next V2 3/6] selftests/bpf: xdp_hw_metadata RX hash return code info
  2023-03-21 13:47 ` [xdp-hints] [PATCH bpf-next V2 3/6] selftests/bpf: xdp_hw_metadata RX hash return code info Jesper Dangaard Brouer
@ 2023-03-21 18:47   ` Stanislav Fomichev
  2023-03-22 16:05     ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 20+ messages in thread
From: Stanislav Fomichev @ 2023-03-21 18:47 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: bpf, netdev, martin.lau, ast, daniel, alexandr.lobakin,
	larysa.zaremba, xdp-hints, anthony.l.nguyen, yoong.siang.song,
	boon.leong.ong, intel-wired-lan, pabeni, jesse.brandeburg, kuba,
	edumazet, john.fastabend, hawk, davem

On Tue, Mar 21, 2023 at 6:47 AM Jesper Dangaard Brouer
<brouer@redhat.com> wrote:
>
> When driver developers add XDP-hints kfuncs for RX hash it is
> practical to print the return code in bpf_printk trace pipe log.
>
> Print hash value as a hex value, both AF_XDP userspace and bpf_prog,
> as this makes it easier to spot poor quality hashes.
>
> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
> ---
>  .../testing/selftests/bpf/progs/xdp_hw_metadata.c  |    9 ++++++---
>  tools/testing/selftests/bpf/xdp_hw_metadata.c      |    5 ++++-
>  2 files changed, 10 insertions(+), 4 deletions(-)
>
> diff --git a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> index 40c17adbf483..ce07010e4d48 100644
> --- a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> +++ b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> @@ -77,10 +77,13 @@ int rx(struct xdp_md *ctx)
>                 meta->rx_timestamp = 0; /* Used by AF_XDP as not avail signal */
>         }
>
> -       if (!bpf_xdp_metadata_rx_hash(ctx, &meta->rx_hash))
> -               bpf_printk("populated rx_hash with %u", meta->rx_hash);
> -       else
> +       ret = bpf_xdp_metadata_rx_hash(ctx, &meta->rx_hash);
> +       if (ret >= 0) {
> +               bpf_printk("populated rx_hash with 0x%08X", meta->rx_hash);
> +       } else {
> +               bpf_printk("rx_hash not-avail errno:%d", ret);
>                 meta->rx_hash = 0; /* Used by AF_XDP as not avail signal */
> +       }
>
>         return bpf_redirect_map(&xsk, ctx->rx_queue_index, XDP_PASS);
>  }
> diff --git a/tools/testing/selftests/bpf/xdp_hw_metadata.c b/tools/testing/selftests/bpf/xdp_hw_metadata.c
> index 400bfe19abfe..f3ec07ccdc95 100644
> --- a/tools/testing/selftests/bpf/xdp_hw_metadata.c
> +++ b/tools/testing/selftests/bpf/xdp_hw_metadata.c
> @@ -3,6 +3,9 @@
>  /* Reference program for verifying XDP metadata on real HW. Functional test
>   * only, doesn't test the performance.
>   *
> + * BPF-prog bpf_printk info outout can be access via
> + * /sys/kernel/debug/tracing/trace_pipe

s/outout/output/

But let's maybe drop it? If you want to make it more usable, let's
have a separate patch to enable tracing and periodically dump it to
the console instead (as previously discussed).

With this addressed:
Acked-by: Stanislav Fomichev <sdf@google.com>

> + *
>   * RX:
>   * - UDP 9091 packets are diverted into AF_XDP
>   * - Metadata verified:
> @@ -156,7 +159,7 @@ static void verify_xdp_metadata(void *data, clockid_t clock_id)
>
>         meta = data - sizeof(*meta);
>
> -       printf("rx_hash: %u\n", meta->rx_hash);
> +       printf("rx_hash: 0x%08X\n", meta->rx_hash);
>         printf("rx_timestamp:  %llu (sec:%0.4f)\n", meta->rx_timestamp,
>                (double)meta->rx_timestamp / NANOSEC_PER_SEC);
>         if (meta->rx_timestamp) {
>
>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [xdp-hints] Re: [PATCH bpf-next V2 3/6] selftests/bpf: xdp_hw_metadata RX hash return code info
  2023-03-21 18:47   ` [xdp-hints] " Stanislav Fomichev
@ 2023-03-22 16:05     ` Jesper Dangaard Brouer
  2023-03-22 16:07       ` Alexei Starovoitov
  0 siblings, 1 reply; 20+ messages in thread
From: Jesper Dangaard Brouer @ 2023-03-22 16:05 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: brouer, bpf, netdev, martin.lau, ast, daniel, alexandr.lobakin,
	larysa.zaremba, xdp-hints, anthony.l.nguyen, yoong.siang.song,
	boon.leong.ong, intel-wired-lan, pabeni, jesse.brandeburg, kuba,
	edumazet, john.fastabend, hawk, davem



On 21/03/2023 19.47, Stanislav Fomichev wrote:
> On Tue, Mar 21, 2023 at 6:47 AM Jesper Dangaard Brouer
> <brouer@redhat.com> wrote:
>>
>> When driver developers add XDP-hints kfuncs for RX hash it is
>> practical to print the return code in bpf_printk trace pipe log.
>>
>> Print hash value as a hex value, both AF_XDP userspace and bpf_prog,
>> as this makes it easier to spot poor quality hashes.
>>
>> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
>> ---
>>   .../testing/selftests/bpf/progs/xdp_hw_metadata.c  |    9 ++++++---
>>   tools/testing/selftests/bpf/xdp_hw_metadata.c      |    5 ++++-
>>   2 files changed, 10 insertions(+), 4 deletions(-)
>>
>> diff --git a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
>> index 40c17adbf483..ce07010e4d48 100644
>> --- a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
>> +++ b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
>> @@ -77,10 +77,13 @@ int rx(struct xdp_md *ctx)
>>                  meta->rx_timestamp = 0; /* Used by AF_XDP as not avail signal */
>>          }
>>
>> -       if (!bpf_xdp_metadata_rx_hash(ctx, &meta->rx_hash))
>> -               bpf_printk("populated rx_hash with %u", meta->rx_hash);
>> -       else
>> +       ret = bpf_xdp_metadata_rx_hash(ctx, &meta->rx_hash);
>> +       if (ret >= 0) {
>> +               bpf_printk("populated rx_hash with 0x%08X", meta->rx_hash);
>> +       } else {
>> +               bpf_printk("rx_hash not-avail errno:%d", ret);
>>                  meta->rx_hash = 0; /* Used by AF_XDP as not avail signal */
>> +       }
>>
>>          return bpf_redirect_map(&xsk, ctx->rx_queue_index, XDP_PASS);
>>   }
>> diff --git a/tools/testing/selftests/bpf/xdp_hw_metadata.c b/tools/testing/selftests/bpf/xdp_hw_metadata.c
>> index 400bfe19abfe..f3ec07ccdc95 100644
>> --- a/tools/testing/selftests/bpf/xdp_hw_metadata.c
>> +++ b/tools/testing/selftests/bpf/xdp_hw_metadata.c
>> @@ -3,6 +3,9 @@
>>   /* Reference program for verifying XDP metadata on real HW. Functional test
>>    * only, doesn't test the performance.
>>    *
>> + * BPF-prog bpf_printk info outout can be access via
>> + * /sys/kernel/debug/tracing/trace_pipe
> 
> s/outout/output/
> 

Fixed in V3

> But let's maybe drop it? If you want to make it more usable, let's
> have a separate patch to enable tracing and periodically dump it to
> the console instead (as previously discussed).

Cat'ing /sys/kernel/debug/tracing/trace_pipe work for me regardless of
setting in
/sys/kernel/debug/tracing/events/bpf_trace/bpf_trace_printk/enable

We likely need a followup patch that adds a BPF config switch that can
disable bpf_printk calls, because this adds overhead and thus affects
the timestamps.

> With this addressed:
> Acked-by: Stanislav Fomichev <sdf@google.com>
> 
>> + *
>>    * RX:
>>    * - UDP 9091 packets are diverted into AF_XDP
>>    * - Metadata verified:
>> @@ -156,7 +159,7 @@ static void verify_xdp_metadata(void *data, clockid_t clock_id)
>>
>>          meta = data - sizeof(*meta);
>>
>> -       printf("rx_hash: %u\n", meta->rx_hash);
>> +       printf("rx_hash: 0x%08X\n", meta->rx_hash);
>>          printf("rx_timestamp:  %llu (sec:%0.4f)\n", meta->rx_timestamp,
>>                 (double)meta->rx_timestamp / NANOSEC_PER_SEC);
>>          if (meta->rx_timestamp) {
>>
>>
> 


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [xdp-hints] Re: [PATCH bpf-next V2 3/6] selftests/bpf: xdp_hw_metadata RX hash return code info
  2023-03-22 16:05     ` Jesper Dangaard Brouer
@ 2023-03-22 16:07       ` Alexei Starovoitov
  2023-03-22 19:00         ` Stanislav Fomichev
  0 siblings, 1 reply; 20+ messages in thread
From: Alexei Starovoitov @ 2023-03-22 16:07 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Stanislav Fomichev, Jesper Dangaard Brouer, bpf,
	Network Development, Martin KaFai Lau, Alexei Starovoitov,
	Daniel Borkmann, Alexander Lobakin, Larysa Zaremba, xdp-hints,
	anthony.l.nguyen, Song, Yoong Siang, Ong, Boon Leong,
	intel-wired-lan, Paolo Abeni, Jesse Brandeburg, Jakub Kicinski,
	Eric Dumazet, John Fastabend, Jesper Dangaard Brouer,
	David S. Miller

On Wed, Mar 22, 2023 at 9:05 AM Jesper Dangaard Brouer
<jbrouer@redhat.com> wrote:
>
>
>
> On 21/03/2023 19.47, Stanislav Fomichev wrote:
> > On Tue, Mar 21, 2023 at 6:47 AM Jesper Dangaard Brouer
> > <brouer@redhat.com> wrote:
> >>
> >> When driver developers add XDP-hints kfuncs for RX hash it is
> >> practical to print the return code in bpf_printk trace pipe log.
> >>
> >> Print hash value as a hex value, both AF_XDP userspace and bpf_prog,
> >> as this makes it easier to spot poor quality hashes.
> >>
> >> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
> >> ---
> >>   .../testing/selftests/bpf/progs/xdp_hw_metadata.c  |    9 ++++++---
> >>   tools/testing/selftests/bpf/xdp_hw_metadata.c      |    5 ++++-
> >>   2 files changed, 10 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> >> index 40c17adbf483..ce07010e4d48 100644
> >> --- a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> >> +++ b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> >> @@ -77,10 +77,13 @@ int rx(struct xdp_md *ctx)
> >>                  meta->rx_timestamp = 0; /* Used by AF_XDP as not avail signal */
> >>          }
> >>
> >> -       if (!bpf_xdp_metadata_rx_hash(ctx, &meta->rx_hash))
> >> -               bpf_printk("populated rx_hash with %u", meta->rx_hash);
> >> -       else
> >> +       ret = bpf_xdp_metadata_rx_hash(ctx, &meta->rx_hash);
> >> +       if (ret >= 0) {
> >> +               bpf_printk("populated rx_hash with 0x%08X", meta->rx_hash);
> >> +       } else {
> >> +               bpf_printk("rx_hash not-avail errno:%d", ret);
> >>                  meta->rx_hash = 0; /* Used by AF_XDP as not avail signal */
> >> +       }
> >>
> >>          return bpf_redirect_map(&xsk, ctx->rx_queue_index, XDP_PASS);
> >>   }
> >> diff --git a/tools/testing/selftests/bpf/xdp_hw_metadata.c b/tools/testing/selftests/bpf/xdp_hw_metadata.c
> >> index 400bfe19abfe..f3ec07ccdc95 100644
> >> --- a/tools/testing/selftests/bpf/xdp_hw_metadata.c
> >> +++ b/tools/testing/selftests/bpf/xdp_hw_metadata.c
> >> @@ -3,6 +3,9 @@
> >>   /* Reference program for verifying XDP metadata on real HW. Functional test
> >>    * only, doesn't test the performance.
> >>    *
> >> + * BPF-prog bpf_printk info outout can be access via
> >> + * /sys/kernel/debug/tracing/trace_pipe
> >
> > s/outout/output/
> >
>
> Fixed in V3
>
> > But let's maybe drop it? If you want to make it more usable, let's
> > have a separate patch to enable tracing and periodically dump it to
> > the console instead (as previously discussed).
>
> Cat'ing /sys/kernel/debug/tracing/trace_pipe work for me regardless of
> setting in
> /sys/kernel/debug/tracing/events/bpf_trace/bpf_trace_printk/enable
>
> We likely need a followup patch that adds a BPF config switch that can
> disable bpf_printk calls, because this adds overhead and thus affects
> the timestamps.

No. This is by design.
Do not use bpf_printk* in production.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [xdp-hints] Re: [PATCH bpf-next V2 3/6] selftests/bpf: xdp_hw_metadata RX hash return code info
  2023-03-22 16:07       ` Alexei Starovoitov
@ 2023-03-22 19:00         ` Stanislav Fomichev
  2023-03-22 19:17           ` Alexei Starovoitov
  0 siblings, 1 reply; 20+ messages in thread
From: Stanislav Fomichev @ 2023-03-22 19:00 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Jesper Dangaard Brouer, Jesper Dangaard Brouer, bpf,
	Network Development, Martin KaFai Lau, Alexei Starovoitov,
	Daniel Borkmann, Alexander Lobakin, Larysa Zaremba, xdp-hints,
	anthony.l.nguyen, Song, Yoong Siang, Ong, Boon Leong,
	intel-wired-lan, Paolo Abeni, Jesse Brandeburg, Jakub Kicinski,
	Eric Dumazet, John Fastabend, Jesper Dangaard Brouer,
	David S. Miller

On Wed, Mar 22, 2023 at 9:07 AM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Wed, Mar 22, 2023 at 9:05 AM Jesper Dangaard Brouer
> <jbrouer@redhat.com> wrote:
> >
> >
> >
> > On 21/03/2023 19.47, Stanislav Fomichev wrote:
> > > On Tue, Mar 21, 2023 at 6:47 AM Jesper Dangaard Brouer
> > > <brouer@redhat.com> wrote:
> > >>
> > >> When driver developers add XDP-hints kfuncs for RX hash it is
> > >> practical to print the return code in bpf_printk trace pipe log.
> > >>
> > >> Print hash value as a hex value, both AF_XDP userspace and bpf_prog,
> > >> as this makes it easier to spot poor quality hashes.
> > >>
> > >> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
> > >> ---
> > >>   .../testing/selftests/bpf/progs/xdp_hw_metadata.c  |    9 ++++++---
> > >>   tools/testing/selftests/bpf/xdp_hw_metadata.c      |    5 ++++-
> > >>   2 files changed, 10 insertions(+), 4 deletions(-)
> > >>
> > >> diff --git a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> > >> index 40c17adbf483..ce07010e4d48 100644
> > >> --- a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> > >> +++ b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> > >> @@ -77,10 +77,13 @@ int rx(struct xdp_md *ctx)
> > >>                  meta->rx_timestamp = 0; /* Used by AF_XDP as not avail signal */
> > >>          }
> > >>
> > >> -       if (!bpf_xdp_metadata_rx_hash(ctx, &meta->rx_hash))
> > >> -               bpf_printk("populated rx_hash with %u", meta->rx_hash);
> > >> -       else
> > >> +       ret = bpf_xdp_metadata_rx_hash(ctx, &meta->rx_hash);
> > >> +       if (ret >= 0) {
> > >> +               bpf_printk("populated rx_hash with 0x%08X", meta->rx_hash);
> > >> +       } else {
> > >> +               bpf_printk("rx_hash not-avail errno:%d", ret);
> > >>                  meta->rx_hash = 0; /* Used by AF_XDP as not avail signal */
> > >> +       }
> > >>
> > >>          return bpf_redirect_map(&xsk, ctx->rx_queue_index, XDP_PASS);
> > >>   }
> > >> diff --git a/tools/testing/selftests/bpf/xdp_hw_metadata.c b/tools/testing/selftests/bpf/xdp_hw_metadata.c
> > >> index 400bfe19abfe..f3ec07ccdc95 100644
> > >> --- a/tools/testing/selftests/bpf/xdp_hw_metadata.c
> > >> +++ b/tools/testing/selftests/bpf/xdp_hw_metadata.c
> > >> @@ -3,6 +3,9 @@
> > >>   /* Reference program for verifying XDP metadata on real HW. Functional test
> > >>    * only, doesn't test the performance.
> > >>    *
> > >> + * BPF-prog bpf_printk info outout can be access via
> > >> + * /sys/kernel/debug/tracing/trace_pipe
> > >
> > > s/outout/output/
> > >
> >
> > Fixed in V3
> >
> > > But let's maybe drop it? If you want to make it more usable, let's
> > > have a separate patch to enable tracing and periodically dump it to
> > > the console instead (as previously discussed).
> >
> > Cat'ing /sys/kernel/debug/tracing/trace_pipe work for me regardless of
> > setting in
> > /sys/kernel/debug/tracing/events/bpf_trace/bpf_trace_printk/enable
> >
> > We likely need a followup patch that adds a BPF config switch that can
> > disable bpf_printk calls, because this adds overhead and thus affects
> > the timestamps.
>
> No. This is by design.
> Do not use bpf_printk* in production.

But that's not for the production? xdp_hw_metadata is a small tool to
verify that the metadata being dumped is correct (during the
development).
We have a proper (less verbose) selftest in
{progs,prog_tests}/xdp_metadata.c (over veth).
This xdp_hw_metadata was supposed to be used for running it against
the real hardware, so having as much debugging at hand as possible
seems helpful? (at least it was helpful to me when playing with mlx4)

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [xdp-hints] Re: [PATCH bpf-next V2 3/6] selftests/bpf: xdp_hw_metadata RX hash return code info
  2023-03-22 19:00         ` Stanislav Fomichev
@ 2023-03-22 19:17           ` Alexei Starovoitov
  2023-03-22 19:22             ` Stanislav Fomichev
  0 siblings, 1 reply; 20+ messages in thread
From: Alexei Starovoitov @ 2023-03-22 19:17 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: Jesper Dangaard Brouer, Jesper Dangaard Brouer, bpf,
	Network Development, Martin KaFai Lau, Alexei Starovoitov,
	Daniel Borkmann, Alexander Lobakin, Larysa Zaremba, xdp-hints,
	anthony.l.nguyen, Song, Yoong Siang, Ong, Boon Leong,
	intel-wired-lan, Paolo Abeni, Jesse Brandeburg, Jakub Kicinski,
	Eric Dumazet, John Fastabend, Jesper Dangaard Brouer,
	David S. Miller

On Wed, Mar 22, 2023 at 12:00 PM Stanislav Fomichev <sdf@google.com> wrote:
>
> On Wed, Mar 22, 2023 at 9:07 AM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Wed, Mar 22, 2023 at 9:05 AM Jesper Dangaard Brouer
> > <jbrouer@redhat.com> wrote:
> > >
> > >
> > >
> > > On 21/03/2023 19.47, Stanislav Fomichev wrote:
> > > > On Tue, Mar 21, 2023 at 6:47 AM Jesper Dangaard Brouer
> > > > <brouer@redhat.com> wrote:
> > > >>
> > > >> When driver developers add XDP-hints kfuncs for RX hash it is
> > > >> practical to print the return code in bpf_printk trace pipe log.
> > > >>
> > > >> Print hash value as a hex value, both AF_XDP userspace and bpf_prog,
> > > >> as this makes it easier to spot poor quality hashes.
> > > >>
> > > >> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
> > > >> ---
> > > >>   .../testing/selftests/bpf/progs/xdp_hw_metadata.c  |    9 ++++++---
> > > >>   tools/testing/selftests/bpf/xdp_hw_metadata.c      |    5 ++++-
> > > >>   2 files changed, 10 insertions(+), 4 deletions(-)
> > > >>
> > > >> diff --git a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> > > >> index 40c17adbf483..ce07010e4d48 100644
> > > >> --- a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> > > >> +++ b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> > > >> @@ -77,10 +77,13 @@ int rx(struct xdp_md *ctx)
> > > >>                  meta->rx_timestamp = 0; /* Used by AF_XDP as not avail signal */
> > > >>          }
> > > >>
> > > >> -       if (!bpf_xdp_metadata_rx_hash(ctx, &meta->rx_hash))
> > > >> -               bpf_printk("populated rx_hash with %u", meta->rx_hash);
> > > >> -       else
> > > >> +       ret = bpf_xdp_metadata_rx_hash(ctx, &meta->rx_hash);
> > > >> +       if (ret >= 0) {
> > > >> +               bpf_printk("populated rx_hash with 0x%08X", meta->rx_hash);
> > > >> +       } else {
> > > >> +               bpf_printk("rx_hash not-avail errno:%d", ret);
> > > >>                  meta->rx_hash = 0; /* Used by AF_XDP as not avail signal */
> > > >> +       }
> > > >>
> > > >>          return bpf_redirect_map(&xsk, ctx->rx_queue_index, XDP_PASS);
> > > >>   }
> > > >> diff --git a/tools/testing/selftests/bpf/xdp_hw_metadata.c b/tools/testing/selftests/bpf/xdp_hw_metadata.c
> > > >> index 400bfe19abfe..f3ec07ccdc95 100644
> > > >> --- a/tools/testing/selftests/bpf/xdp_hw_metadata.c
> > > >> +++ b/tools/testing/selftests/bpf/xdp_hw_metadata.c
> > > >> @@ -3,6 +3,9 @@
> > > >>   /* Reference program for verifying XDP metadata on real HW. Functional test
> > > >>    * only, doesn't test the performance.
> > > >>    *
> > > >> + * BPF-prog bpf_printk info outout can be access via
> > > >> + * /sys/kernel/debug/tracing/trace_pipe
> > > >
> > > > s/outout/output/
> > > >
> > >
> > > Fixed in V3
> > >
> > > > But let's maybe drop it? If you want to make it more usable, let's
> > > > have a separate patch to enable tracing and periodically dump it to
> > > > the console instead (as previously discussed).
> > >
> > > Cat'ing /sys/kernel/debug/tracing/trace_pipe work for me regardless of
> > > setting in
> > > /sys/kernel/debug/tracing/events/bpf_trace/bpf_trace_printk/enable
> > >
> > > We likely need a followup patch that adds a BPF config switch that can
> > > disable bpf_printk calls, because this adds overhead and thus affects
> > > the timestamps.
> >
> > No. This is by design.
> > Do not use bpf_printk* in production.
>
> But that's not for the production? xdp_hw_metadata is a small tool to
> verify that the metadata being dumped is correct (during the
> development).
> We have a proper (less verbose) selftest in
> {progs,prog_tests}/xdp_metadata.c (over veth).
> This xdp_hw_metadata was supposed to be used for running it against
> the real hardware, so having as much debugging at hand as possible
> seems helpful? (at least it was helpful to me when playing with mlx4)

The only use of bpf_printk is for debugging of bpf progs themselves.
It should not be used in any tool.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [xdp-hints] Re: [PATCH bpf-next V2 3/6] selftests/bpf: xdp_hw_metadata RX hash return code info
  2023-03-22 19:17           ` Alexei Starovoitov
@ 2023-03-22 19:22             ` Stanislav Fomichev
  2023-03-22 19:30               ` Alexei Starovoitov
  0 siblings, 1 reply; 20+ messages in thread
From: Stanislav Fomichev @ 2023-03-22 19:22 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Jesper Dangaard Brouer, Jesper Dangaard Brouer, bpf,
	Network Development, Martin KaFai Lau, Alexei Starovoitov,
	Daniel Borkmann, Alexander Lobakin, Larysa Zaremba, xdp-hints,
	anthony.l.nguyen, Song, Yoong Siang, Ong, Boon Leong,
	intel-wired-lan, Paolo Abeni, Jesse Brandeburg, Jakub Kicinski,
	Eric Dumazet, John Fastabend, Jesper Dangaard Brouer,
	David S. Miller

On Wed, Mar 22, 2023 at 12:17 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Wed, Mar 22, 2023 at 12:00 PM Stanislav Fomichev <sdf@google.com> wrote:
> >
> > On Wed, Mar 22, 2023 at 9:07 AM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > On Wed, Mar 22, 2023 at 9:05 AM Jesper Dangaard Brouer
> > > <jbrouer@redhat.com> wrote:
> > > >
> > > >
> > > >
> > > > On 21/03/2023 19.47, Stanislav Fomichev wrote:
> > > > > On Tue, Mar 21, 2023 at 6:47 AM Jesper Dangaard Brouer
> > > > > <brouer@redhat.com> wrote:
> > > > >>
> > > > >> When driver developers add XDP-hints kfuncs for RX hash it is
> > > > >> practical to print the return code in bpf_printk trace pipe log.
> > > > >>
> > > > >> Print hash value as a hex value, both AF_XDP userspace and bpf_prog,
> > > > >> as this makes it easier to spot poor quality hashes.
> > > > >>
> > > > >> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
> > > > >> ---
> > > > >>   .../testing/selftests/bpf/progs/xdp_hw_metadata.c  |    9 ++++++---
> > > > >>   tools/testing/selftests/bpf/xdp_hw_metadata.c      |    5 ++++-
> > > > >>   2 files changed, 10 insertions(+), 4 deletions(-)
> > > > >>
> > > > >> diff --git a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> > > > >> index 40c17adbf483..ce07010e4d48 100644
> > > > >> --- a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> > > > >> +++ b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> > > > >> @@ -77,10 +77,13 @@ int rx(struct xdp_md *ctx)
> > > > >>                  meta->rx_timestamp = 0; /* Used by AF_XDP as not avail signal */
> > > > >>          }
> > > > >>
> > > > >> -       if (!bpf_xdp_metadata_rx_hash(ctx, &meta->rx_hash))
> > > > >> -               bpf_printk("populated rx_hash with %u", meta->rx_hash);
> > > > >> -       else
> > > > >> +       ret = bpf_xdp_metadata_rx_hash(ctx, &meta->rx_hash);
> > > > >> +       if (ret >= 0) {
> > > > >> +               bpf_printk("populated rx_hash with 0x%08X", meta->rx_hash);
> > > > >> +       } else {
> > > > >> +               bpf_printk("rx_hash not-avail errno:%d", ret);
> > > > >>                  meta->rx_hash = 0; /* Used by AF_XDP as not avail signal */
> > > > >> +       }
> > > > >>
> > > > >>          return bpf_redirect_map(&xsk, ctx->rx_queue_index, XDP_PASS);
> > > > >>   }
> > > > >> diff --git a/tools/testing/selftests/bpf/xdp_hw_metadata.c b/tools/testing/selftests/bpf/xdp_hw_metadata.c
> > > > >> index 400bfe19abfe..f3ec07ccdc95 100644
> > > > >> --- a/tools/testing/selftests/bpf/xdp_hw_metadata.c
> > > > >> +++ b/tools/testing/selftests/bpf/xdp_hw_metadata.c
> > > > >> @@ -3,6 +3,9 @@
> > > > >>   /* Reference program for verifying XDP metadata on real HW. Functional test
> > > > >>    * only, doesn't test the performance.
> > > > >>    *
> > > > >> + * BPF-prog bpf_printk info outout can be access via
> > > > >> + * /sys/kernel/debug/tracing/trace_pipe
> > > > >
> > > > > s/outout/output/
> > > > >
> > > >
> > > > Fixed in V3
> > > >
> > > > > But let's maybe drop it? If you want to make it more usable, let's
> > > > > have a separate patch to enable tracing and periodically dump it to
> > > > > the console instead (as previously discussed).
> > > >
> > > > Cat'ing /sys/kernel/debug/tracing/trace_pipe work for me regardless of
> > > > setting in
> > > > /sys/kernel/debug/tracing/events/bpf_trace/bpf_trace_printk/enable
> > > >
> > > > We likely need a followup patch that adds a BPF config switch that can
> > > > disable bpf_printk calls, because this adds overhead and thus affects
> > > > the timestamps.
> > >
> > > No. This is by design.
> > > Do not use bpf_printk* in production.
> >
> > But that's not for the production? xdp_hw_metadata is a small tool to
> > verify that the metadata being dumped is correct (during the
> > development).
> > We have a proper (less verbose) selftest in
> > {progs,prog_tests}/xdp_metadata.c (over veth).
> > This xdp_hw_metadata was supposed to be used for running it against
> > the real hardware, so having as much debugging at hand as possible
> > seems helpful? (at least it was helpful to me when playing with mlx4)
>
> The only use of bpf_printk is for debugging of bpf progs themselves.
> It should not be used in any tool.

Hmm, good point. I guess it also means we won't have to mess with
enabling/dumping ftrace (and don't need this comment about cat'ing the
file).
Jesper, maybe we can instead pass the status of those
bpf_xdp_metadata_xxx kfuncs via 'struct xdp_meta'? And dump this info
from the userspace if needed.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [xdp-hints] Re: [PATCH bpf-next V2 3/6] selftests/bpf: xdp_hw_metadata RX hash return code info
  2023-03-22 19:22             ` Stanislav Fomichev
@ 2023-03-22 19:30               ` Alexei Starovoitov
  2023-03-22 19:33                 ` Stanislav Fomichev
  0 siblings, 1 reply; 20+ messages in thread
From: Alexei Starovoitov @ 2023-03-22 19:30 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: Jesper Dangaard Brouer, Jesper Dangaard Brouer, bpf,
	Network Development, Martin KaFai Lau, Alexei Starovoitov,
	Daniel Borkmann, Alexander Lobakin, Larysa Zaremba, xdp-hints,
	anthony.l.nguyen, Song, Yoong Siang, Ong, Boon Leong,
	intel-wired-lan, Paolo Abeni, Jesse Brandeburg, Jakub Kicinski,
	Eric Dumazet, John Fastabend, Jesper Dangaard Brouer,
	David S. Miller

On Wed, Mar 22, 2023 at 12:23 PM Stanislav Fomichev <sdf@google.com> wrote:
>
> On Wed, Mar 22, 2023 at 12:17 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Wed, Mar 22, 2023 at 12:00 PM Stanislav Fomichev <sdf@google.com> wrote:
> > >
> > > On Wed, Mar 22, 2023 at 9:07 AM Alexei Starovoitov
> > > <alexei.starovoitov@gmail.com> wrote:
> > > >
> > > > On Wed, Mar 22, 2023 at 9:05 AM Jesper Dangaard Brouer
> > > > <jbrouer@redhat.com> wrote:
> > > > >
> > > > >
> > > > >
> > > > > On 21/03/2023 19.47, Stanislav Fomichev wrote:
> > > > > > On Tue, Mar 21, 2023 at 6:47 AM Jesper Dangaard Brouer
> > > > > > <brouer@redhat.com> wrote:
> > > > > >>
> > > > > >> When driver developers add XDP-hints kfuncs for RX hash it is
> > > > > >> practical to print the return code in bpf_printk trace pipe log.
> > > > > >>
> > > > > >> Print hash value as a hex value, both AF_XDP userspace and bpf_prog,
> > > > > >> as this makes it easier to spot poor quality hashes.
> > > > > >>
> > > > > >> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
> > > > > >> ---
> > > > > >>   .../testing/selftests/bpf/progs/xdp_hw_metadata.c  |    9 ++++++---
> > > > > >>   tools/testing/selftests/bpf/xdp_hw_metadata.c      |    5 ++++-
> > > > > >>   2 files changed, 10 insertions(+), 4 deletions(-)
> > > > > >>
> > > > > >> diff --git a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> > > > > >> index 40c17adbf483..ce07010e4d48 100644
> > > > > >> --- a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> > > > > >> +++ b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> > > > > >> @@ -77,10 +77,13 @@ int rx(struct xdp_md *ctx)
> > > > > >>                  meta->rx_timestamp = 0; /* Used by AF_XDP as not avail signal */
> > > > > >>          }
> > > > > >>
> > > > > >> -       if (!bpf_xdp_metadata_rx_hash(ctx, &meta->rx_hash))
> > > > > >> -               bpf_printk("populated rx_hash with %u", meta->rx_hash);
> > > > > >> -       else
> > > > > >> +       ret = bpf_xdp_metadata_rx_hash(ctx, &meta->rx_hash);
> > > > > >> +       if (ret >= 0) {
> > > > > >> +               bpf_printk("populated rx_hash with 0x%08X", meta->rx_hash);
> > > > > >> +       } else {
> > > > > >> +               bpf_printk("rx_hash not-avail errno:%d", ret);
> > > > > >>                  meta->rx_hash = 0; /* Used by AF_XDP as not avail signal */
> > > > > >> +       }
> > > > > >>
> > > > > >>          return bpf_redirect_map(&xsk, ctx->rx_queue_index, XDP_PASS);
> > > > > >>   }
> > > > > >> diff --git a/tools/testing/selftests/bpf/xdp_hw_metadata.c b/tools/testing/selftests/bpf/xdp_hw_metadata.c
> > > > > >> index 400bfe19abfe..f3ec07ccdc95 100644
> > > > > >> --- a/tools/testing/selftests/bpf/xdp_hw_metadata.c
> > > > > >> +++ b/tools/testing/selftests/bpf/xdp_hw_metadata.c
> > > > > >> @@ -3,6 +3,9 @@
> > > > > >>   /* Reference program for verifying XDP metadata on real HW. Functional test
> > > > > >>    * only, doesn't test the performance.
> > > > > >>    *
> > > > > >> + * BPF-prog bpf_printk info outout can be access via
> > > > > >> + * /sys/kernel/debug/tracing/trace_pipe
> > > > > >
> > > > > > s/outout/output/
> > > > > >
> > > > >
> > > > > Fixed in V3
> > > > >
> > > > > > But let's maybe drop it? If you want to make it more usable, let's
> > > > > > have a separate patch to enable tracing and periodically dump it to
> > > > > > the console instead (as previously discussed).
> > > > >
> > > > > Cat'ing /sys/kernel/debug/tracing/trace_pipe work for me regardless of
> > > > > setting in
> > > > > /sys/kernel/debug/tracing/events/bpf_trace/bpf_trace_printk/enable
> > > > >
> > > > > We likely need a followup patch that adds a BPF config switch that can
> > > > > disable bpf_printk calls, because this adds overhead and thus affects
> > > > > the timestamps.
> > > >
> > > > No. This is by design.
> > > > Do not use bpf_printk* in production.
> > >
> > > But that's not for the production? xdp_hw_metadata is a small tool to
> > > verify that the metadata being dumped is correct (during the
> > > development).
> > > We have a proper (less verbose) selftest in
> > > {progs,prog_tests}/xdp_metadata.c (over veth).
> > > This xdp_hw_metadata was supposed to be used for running it against
> > > the real hardware, so having as much debugging at hand as possible
> > > seems helpful? (at least it was helpful to me when playing with mlx4)
> >
> > The only use of bpf_printk is for debugging of bpf progs themselves.
> > It should not be used in any tool.
>
> Hmm, good point. I guess it also means we won't have to mess with
> enabling/dumping ftrace (and don't need this comment about cat'ing the
> file).
> Jesper, maybe we can instead pass the status of those
> bpf_xdp_metadata_xxx kfuncs via 'struct xdp_meta'? And dump this info
> from the userspace if needed.

There are so many other ways for bpf prog to communicate with user space.
Use ringbuf, perf_event buffer, global vars, maps, etc.
trace_pipe is debug only because it's global and will conflict with
all other debug sessions.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [xdp-hints] Re: [PATCH bpf-next V2 3/6] selftests/bpf: xdp_hw_metadata RX hash return code info
  2023-03-22 19:30               ` Alexei Starovoitov
@ 2023-03-22 19:33                 ` Stanislav Fomichev
  2023-03-23  8:51                   ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 20+ messages in thread
From: Stanislav Fomichev @ 2023-03-22 19:33 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Jesper Dangaard Brouer, Jesper Dangaard Brouer, bpf,
	Network Development, Martin KaFai Lau, Alexei Starovoitov,
	Daniel Borkmann, Alexander Lobakin, Larysa Zaremba, xdp-hints,
	anthony.l.nguyen, Song, Yoong Siang, Ong, Boon Leong,
	intel-wired-lan, Paolo Abeni, Jesse Brandeburg, Jakub Kicinski,
	Eric Dumazet, John Fastabend, Jesper Dangaard Brouer,
	David S. Miller

On Wed, Mar 22, 2023 at 12:30 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Wed, Mar 22, 2023 at 12:23 PM Stanislav Fomichev <sdf@google.com> wrote:
> >
> > On Wed, Mar 22, 2023 at 12:17 PM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > On Wed, Mar 22, 2023 at 12:00 PM Stanislav Fomichev <sdf@google.com> wrote:
> > > >
> > > > On Wed, Mar 22, 2023 at 9:07 AM Alexei Starovoitov
> > > > <alexei.starovoitov@gmail.com> wrote:
> > > > >
> > > > > On Wed, Mar 22, 2023 at 9:05 AM Jesper Dangaard Brouer
> > > > > <jbrouer@redhat.com> wrote:
> > > > > >
> > > > > >
> > > > > >
> > > > > > On 21/03/2023 19.47, Stanislav Fomichev wrote:
> > > > > > > On Tue, Mar 21, 2023 at 6:47 AM Jesper Dangaard Brouer
> > > > > > > <brouer@redhat.com> wrote:
> > > > > > >>
> > > > > > >> When driver developers add XDP-hints kfuncs for RX hash it is
> > > > > > >> practical to print the return code in bpf_printk trace pipe log.
> > > > > > >>
> > > > > > >> Print hash value as a hex value, both AF_XDP userspace and bpf_prog,
> > > > > > >> as this makes it easier to spot poor quality hashes.
> > > > > > >>
> > > > > > >> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
> > > > > > >> ---
> > > > > > >>   .../testing/selftests/bpf/progs/xdp_hw_metadata.c  |    9 ++++++---
> > > > > > >>   tools/testing/selftests/bpf/xdp_hw_metadata.c      |    5 ++++-
> > > > > > >>   2 files changed, 10 insertions(+), 4 deletions(-)
> > > > > > >>
> > > > > > >> diff --git a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> > > > > > >> index 40c17adbf483..ce07010e4d48 100644
> > > > > > >> --- a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> > > > > > >> +++ b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> > > > > > >> @@ -77,10 +77,13 @@ int rx(struct xdp_md *ctx)
> > > > > > >>                  meta->rx_timestamp = 0; /* Used by AF_XDP as not avail signal */
> > > > > > >>          }
> > > > > > >>
> > > > > > >> -       if (!bpf_xdp_metadata_rx_hash(ctx, &meta->rx_hash))
> > > > > > >> -               bpf_printk("populated rx_hash with %u", meta->rx_hash);
> > > > > > >> -       else
> > > > > > >> +       ret = bpf_xdp_metadata_rx_hash(ctx, &meta->rx_hash);
> > > > > > >> +       if (ret >= 0) {
> > > > > > >> +               bpf_printk("populated rx_hash with 0x%08X", meta->rx_hash);
> > > > > > >> +       } else {
> > > > > > >> +               bpf_printk("rx_hash not-avail errno:%d", ret);
> > > > > > >>                  meta->rx_hash = 0; /* Used by AF_XDP as not avail signal */
> > > > > > >> +       }
> > > > > > >>
> > > > > > >>          return bpf_redirect_map(&xsk, ctx->rx_queue_index, XDP_PASS);
> > > > > > >>   }
> > > > > > >> diff --git a/tools/testing/selftests/bpf/xdp_hw_metadata.c b/tools/testing/selftests/bpf/xdp_hw_metadata.c
> > > > > > >> index 400bfe19abfe..f3ec07ccdc95 100644
> > > > > > >> --- a/tools/testing/selftests/bpf/xdp_hw_metadata.c
> > > > > > >> +++ b/tools/testing/selftests/bpf/xdp_hw_metadata.c
> > > > > > >> @@ -3,6 +3,9 @@
> > > > > > >>   /* Reference program for verifying XDP metadata on real HW. Functional test
> > > > > > >>    * only, doesn't test the performance.
> > > > > > >>    *
> > > > > > >> + * BPF-prog bpf_printk info outout can be access via
> > > > > > >> + * /sys/kernel/debug/tracing/trace_pipe
> > > > > > >
> > > > > > > s/outout/output/
> > > > > > >
> > > > > >
> > > > > > Fixed in V3
> > > > > >
> > > > > > > But let's maybe drop it? If you want to make it more usable, let's
> > > > > > > have a separate patch to enable tracing and periodically dump it to
> > > > > > > the console instead (as previously discussed).
> > > > > >
> > > > > > Cat'ing /sys/kernel/debug/tracing/trace_pipe work for me regardless of
> > > > > > setting in
> > > > > > /sys/kernel/debug/tracing/events/bpf_trace/bpf_trace_printk/enable
> > > > > >
> > > > > > We likely need a followup patch that adds a BPF config switch that can
> > > > > > disable bpf_printk calls, because this adds overhead and thus affects
> > > > > > the timestamps.
> > > > >
> > > > > No. This is by design.
> > > > > Do not use bpf_printk* in production.
> > > >
> > > > But that's not for the production? xdp_hw_metadata is a small tool to
> > > > verify that the metadata being dumped is correct (during the
> > > > development).
> > > > We have a proper (less verbose) selftest in
> > > > {progs,prog_tests}/xdp_metadata.c (over veth).
> > > > This xdp_hw_metadata was supposed to be used for running it against
> > > > the real hardware, so having as much debugging at hand as possible
> > > > seems helpful? (at least it was helpful to me when playing with mlx4)
> > >
> > > The only use of bpf_printk is for debugging of bpf progs themselves.
> > > It should not be used in any tool.
> >
> > Hmm, good point. I guess it also means we won't have to mess with
> > enabling/dumping ftrace (and don't need this comment about cat'ing the
> > file).
> > Jesper, maybe we can instead pass the status of those
> > bpf_xdp_metadata_xxx kfuncs via 'struct xdp_meta'? And dump this info
> > from the userspace if needed.
>
> There are so many other ways for bpf prog to communicate with user space.
> Use ringbuf, perf_event buffer, global vars, maps, etc.
> trace_pipe is debug only because it's global and will conflict with
> all other debug sessions.

👍 makes sense, ty! hopefully we won't have to add a separate channel
for those and can (ab)use the metadata area.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [xdp-hints] Re: [PATCH bpf-next V2 3/6] selftests/bpf: xdp_hw_metadata RX hash return code info
  2023-03-22 19:33                 ` Stanislav Fomichev
@ 2023-03-23  8:51                   ` Jesper Dangaard Brouer
  2023-03-23 17:35                     ` Alexei Starovoitov
  0 siblings, 1 reply; 20+ messages in thread
From: Jesper Dangaard Brouer @ 2023-03-23  8:51 UTC (permalink / raw)
  To: Stanislav Fomichev, Alexei Starovoitov
  Cc: brouer, bpf, Network Development, Martin KaFai Lau,
	Alexei Starovoitov, Daniel Borkmann, Alexander Lobakin,
	Larysa Zaremba, xdp-hints, anthony.l.nguyen, Song, Yoong Siang,
	Ong, Boon Leong, intel-wired-lan, Paolo Abeni, Jesse Brandeburg,
	Jakub Kicinski, Eric Dumazet, John Fastabend,
	Jesper Dangaard Brouer, David S. Miller


On 22/03/2023 20.33, Stanislav Fomichev wrote:
> On Wed, Mar 22, 2023 at 12:30 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
>>
>> On Wed, Mar 22, 2023 at 12:23 PM Stanislav Fomichev <sdf@google.com> wrote:
>>>
>>> On Wed, Mar 22, 2023 at 12:17 PM Alexei Starovoitov
>>> <alexei.starovoitov@gmail.com> wrote:
>>>>
>>>> On Wed, Mar 22, 2023 at 12:00 PM Stanislav Fomichev <sdf@google.com> wrote:
>>>>>
>>>>> On Wed, Mar 22, 2023 at 9:07 AM Alexei Starovoitov
>>>>> <alexei.starovoitov@gmail.com> wrote:
>>>>>>
>>>>>> On Wed, Mar 22, 2023 at 9:05 AM Jesper Dangaard Brouer
>>>>>> <jbrouer@redhat.com> wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 21/03/2023 19.47, Stanislav Fomichev wrote:
>>>>>>>> On Tue, Mar 21, 2023 at 6:47 AM Jesper Dangaard Brouer
>>>>>>>> <brouer@redhat.com> wrote:
>>>>>>>>>
>>>>>>>>> When driver developers add XDP-hints kfuncs for RX hash it is
>>>>>>>>> practical to print the return code in bpf_printk trace pipe log.
>>>>>>>>>
>>>>>>>>> Print hash value as a hex value, both AF_XDP userspace and bpf_prog,
>>>>>>>>> as this makes it easier to spot poor quality hashes.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
>>>>>>>>> ---
>>>>>>>>>    .../testing/selftests/bpf/progs/xdp_hw_metadata.c  |    9 ++++++---
>>>>>>>>>    tools/testing/selftests/bpf/xdp_hw_metadata.c      |    5 ++++-
>>>>>>>>>    2 files changed, 10 insertions(+), 4 deletions(-)
>>>>>>>>>
>>>>>>>>> diff --git a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
>>>>>>>>> index 40c17adbf483..ce07010e4d48 100644
>>>>>>>>> --- a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
>>>>>>>>> +++ b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
>>>>>>>>> @@ -77,10 +77,13 @@ int rx(struct xdp_md *ctx)
>>>>>>>>>                   meta->rx_timestamp = 0; /* Used by AF_XDP as not avail signal */
>>>>>>>>>           }
>>>>>>>>>
>>>>>>>>> -       if (!bpf_xdp_metadata_rx_hash(ctx, &meta->rx_hash))
>>>>>>>>> -               bpf_printk("populated rx_hash with %u", meta->rx_hash);
>>>>>>>>> -       else
>>>>>>>>> +       ret = bpf_xdp_metadata_rx_hash(ctx, &meta->rx_hash);
>>>>>>>>> +       if (ret >= 0) {
>>>>>>>>> +               bpf_printk("populated rx_hash with 0x%08X", meta->rx_hash);
>>>>>>>>> +       } else {
>>>>>>>>> +               bpf_printk("rx_hash not-avail errno:%d", ret);
>>>>>>>>>                   meta->rx_hash = 0; /* Used by AF_XDP as not avail signal */
>>>>>>>>> +       }
>>>>>>>>>
>>>>>>>>>           return bpf_redirect_map(&xsk, ctx->rx_queue_index, XDP_PASS);
>>>>>>>>>    }
>>>>>>>>> diff --git a/tools/testing/selftests/bpf/xdp_hw_metadata.c b/tools/testing/selftests/bpf/xdp_hw_metadata.c
>>>>>>>>> index 400bfe19abfe..f3ec07ccdc95 100644
>>>>>>>>> --- a/tools/testing/selftests/bpf/xdp_hw_metadata.c
>>>>>>>>> +++ b/tools/testing/selftests/bpf/xdp_hw_metadata.c
>>>>>>>>> @@ -3,6 +3,9 @@
>>>>>>>>>    /* Reference program for verifying XDP metadata on real HW. Functional test
>>>>>>>>>     * only, doesn't test the performance.
>>>>>>>>>     *
>>>>>>>>> + * BPF-prog bpf_printk info outout can be access via
>>>>>>>>> + * /sys/kernel/debug/tracing/trace_pipe
>>>>>>>>
>>>>>>>> s/outout/output/
>>>>>>>>
>>>>>>>
>>>>>>> Fixed in V3
>>>>>>>
>>>>>>>> But let's maybe drop it? If you want to make it more usable, let's
>>>>>>>> have a separate patch to enable tracing and periodically dump it to
>>>>>>>> the console instead (as previously discussed).
>>>>>>>
>>>>>>> Cat'ing /sys/kernel/debug/tracing/trace_pipe work for me regardless of
>>>>>>> setting in
>>>>>>> /sys/kernel/debug/tracing/events/bpf_trace/bpf_trace_printk/enable
>>>>>>>
>>>>>>> We likely need a followup patch that adds a BPF config switch that can
>>>>>>> disable bpf_printk calls, because this adds overhead and thus affects
>>>>>>> the timestamps.
>>>>>>
>>>>>> No. This is by design.
>>>>>> Do not use bpf_printk* in production.

I fully agree do not use bpf_printk in *production*.

>>>>>
>>>>> But that's not for the production? xdp_hw_metadata is a small tool to
>>>>> verify that the metadata being dumped is correct (during the
>>>>> development).
>>>>> We have a proper (less verbose) selftest in
>>>>> {progs,prog_tests}/xdp_metadata.c (over veth).
>>>>> This xdp_hw_metadata was supposed to be used for running it against
>>>>> the real hardware, so having as much debugging at hand as possible
>>>>> seems helpful? (at least it was helpful to me when playing with mlx4)

My experience when developing these kfuncs for igc (real hardware), this
"tool" xdp_hw_metadata was super helpful, because it was very verbose
(and I was juggling reading chip registers BE/LE and see patch1 a buggy
implementation for RX-hash).

As I wrote in cover-letter, I recommend other driver developers to do
the same, because it really help speed up the development. In theory
xdp_hw_metadata doesn't belong in selftests directory and IMHO it should
have been placed in samples/bpf/, but given the relationship with real
selftest {progs,prog_tests}/xdp_metadata.c I think it makes sense to
keep here.


>>>>
>>>> The only use of bpf_printk is for debugging of bpf progs themselves.
>>>> It should not be used in any tool.
>>>
>>> Hmm, good point. I guess it also means we won't have to mess with
>>> enabling/dumping ftrace (and don't need this comment about cat'ing the
>>> file).
>>> Jesper, maybe we can instead pass the status of those
>>> bpf_xdp_metadata_xxx kfuncs via 'struct xdp_meta'? And dump this info
>>> from the userspace if needed.
>>
>> There are so many other ways for bpf prog to communicate with user space.
>> Use ringbuf, perf_event buffer, global vars, maps, etc.
>> trace_pipe is debug only because it's global and will conflict with
>> all other debug sessions.

I want to highlight above paragraph: It is very true for production
code. (Anyone Googling this pay attention to above paragraph).

> 
> 👍 makes sense, ty! hopefully we won't have to add a separate channel
> for those and can (ab)use the metadata area.
> 

Proposed solution: How about default disabling the bpf_printk's via a 
macro define, and then driver developer can manually reenable them 
easily via a single define, to enable a debugging mode.

I was only watching /sys/kernel/debug/tracing/trace_pipe when I was 
debugging a driver issue.  Thus, an extra step of modifying a single 
define in BPF seems easier, than instrumenting my driver with printk.

--Jesper


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [xdp-hints] Re: [PATCH bpf-next V2 3/6] selftests/bpf: xdp_hw_metadata RX hash return code info
  2023-03-23  8:51                   ` Jesper Dangaard Brouer
@ 2023-03-23 17:35                     ` Alexei Starovoitov
  2023-03-23 17:47                       ` Stanislav Fomichev
  0 siblings, 1 reply; 20+ messages in thread
From: Alexei Starovoitov @ 2023-03-23 17:35 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Stanislav Fomichev, Jesper Dangaard Brouer, bpf,
	Network Development, Martin KaFai Lau, Alexei Starovoitov,
	Daniel Borkmann, Alexander Lobakin, Larysa Zaremba, xdp-hints,
	anthony.l.nguyen, Song, Yoong Siang, Ong, Boon Leong,
	intel-wired-lan, Paolo Abeni, Jesse Brandeburg, Jakub Kicinski,
	Eric Dumazet, John Fastabend, Jesper Dangaard Brouer,
	David S. Miller

On Thu, Mar 23, 2023 at 1:51 AM Jesper Dangaard Brouer
<jbrouer@redhat.com> wrote:
>
>
> On 22/03/2023 20.33, Stanislav Fomichev wrote:
> > On Wed, Mar 22, 2023 at 12:30 PM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> >>
> >> On Wed, Mar 22, 2023 at 12:23 PM Stanislav Fomichev <sdf@google.com> wrote:
> >>>
> >>> On Wed, Mar 22, 2023 at 12:17 PM Alexei Starovoitov
> >>> <alexei.starovoitov@gmail.com> wrote:
> >>>>
> >>>> On Wed, Mar 22, 2023 at 12:00 PM Stanislav Fomichev <sdf@google.com> wrote:
> >>>>>
> >>>>> On Wed, Mar 22, 2023 at 9:07 AM Alexei Starovoitov
> >>>>> <alexei.starovoitov@gmail.com> wrote:
> >>>>>>
> >>>>>> On Wed, Mar 22, 2023 at 9:05 AM Jesper Dangaard Brouer
> >>>>>> <jbrouer@redhat.com> wrote:
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On 21/03/2023 19.47, Stanislav Fomichev wrote:
> >>>>>>>> On Tue, Mar 21, 2023 at 6:47 AM Jesper Dangaard Brouer
> >>>>>>>> <brouer@redhat.com> wrote:
> >>>>>>>>>
> >>>>>>>>> When driver developers add XDP-hints kfuncs for RX hash it is
> >>>>>>>>> practical to print the return code in bpf_printk trace pipe log.
> >>>>>>>>>
> >>>>>>>>> Print hash value as a hex value, both AF_XDP userspace and bpf_prog,
> >>>>>>>>> as this makes it easier to spot poor quality hashes.
> >>>>>>>>>
> >>>>>>>>> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
> >>>>>>>>> ---
> >>>>>>>>>    .../testing/selftests/bpf/progs/xdp_hw_metadata.c  |    9 ++++++---
> >>>>>>>>>    tools/testing/selftests/bpf/xdp_hw_metadata.c      |    5 ++++-
> >>>>>>>>>    2 files changed, 10 insertions(+), 4 deletions(-)
> >>>>>>>>>
> >>>>>>>>> diff --git a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> >>>>>>>>> index 40c17adbf483..ce07010e4d48 100644
> >>>>>>>>> --- a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> >>>>>>>>> +++ b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> >>>>>>>>> @@ -77,10 +77,13 @@ int rx(struct xdp_md *ctx)
> >>>>>>>>>                   meta->rx_timestamp = 0; /* Used by AF_XDP as not avail signal */
> >>>>>>>>>           }
> >>>>>>>>>
> >>>>>>>>> -       if (!bpf_xdp_metadata_rx_hash(ctx, &meta->rx_hash))
> >>>>>>>>> -               bpf_printk("populated rx_hash with %u", meta->rx_hash);
> >>>>>>>>> -       else
> >>>>>>>>> +       ret = bpf_xdp_metadata_rx_hash(ctx, &meta->rx_hash);
> >>>>>>>>> +       if (ret >= 0) {
> >>>>>>>>> +               bpf_printk("populated rx_hash with 0x%08X", meta->rx_hash);
> >>>>>>>>> +       } else {
> >>>>>>>>> +               bpf_printk("rx_hash not-avail errno:%d", ret);
> >>>>>>>>>                   meta->rx_hash = 0; /* Used by AF_XDP as not avail signal */
> >>>>>>>>> +       }
> >>>>>>>>>
> >>>>>>>>>           return bpf_redirect_map(&xsk, ctx->rx_queue_index, XDP_PASS);
> >>>>>>>>>    }
> >>>>>>>>> diff --git a/tools/testing/selftests/bpf/xdp_hw_metadata.c b/tools/testing/selftests/bpf/xdp_hw_metadata.c
> >>>>>>>>> index 400bfe19abfe..f3ec07ccdc95 100644
> >>>>>>>>> --- a/tools/testing/selftests/bpf/xdp_hw_metadata.c
> >>>>>>>>> +++ b/tools/testing/selftests/bpf/xdp_hw_metadata.c
> >>>>>>>>> @@ -3,6 +3,9 @@
> >>>>>>>>>    /* Reference program for verifying XDP metadata on real HW. Functional test
> >>>>>>>>>     * only, doesn't test the performance.
> >>>>>>>>>     *
> >>>>>>>>> + * BPF-prog bpf_printk info outout can be access via
> >>>>>>>>> + * /sys/kernel/debug/tracing/trace_pipe
> >>>>>>>>
> >>>>>>>> s/outout/output/
> >>>>>>>>
> >>>>>>>
> >>>>>>> Fixed in V3
> >>>>>>>
> >>>>>>>> But let's maybe drop it? If you want to make it more usable, let's
> >>>>>>>> have a separate patch to enable tracing and periodically dump it to
> >>>>>>>> the console instead (as previously discussed).
> >>>>>>>
> >>>>>>> Cat'ing /sys/kernel/debug/tracing/trace_pipe work for me regardless of
> >>>>>>> setting in
> >>>>>>> /sys/kernel/debug/tracing/events/bpf_trace/bpf_trace_printk/enable
> >>>>>>>
> >>>>>>> We likely need a followup patch that adds a BPF config switch that can
> >>>>>>> disable bpf_printk calls, because this adds overhead and thus affects
> >>>>>>> the timestamps.
> >>>>>>
> >>>>>> No. This is by design.
> >>>>>> Do not use bpf_printk* in production.
>
> I fully agree do not use bpf_printk in *production*.
>
> >>>>>
> >>>>> But that's not for the production? xdp_hw_metadata is a small tool to
> >>>>> verify that the metadata being dumped is correct (during the
> >>>>> development).
> >>>>> We have a proper (less verbose) selftest in
> >>>>> {progs,prog_tests}/xdp_metadata.c (over veth).
> >>>>> This xdp_hw_metadata was supposed to be used for running it against
> >>>>> the real hardware, so having as much debugging at hand as possible
> >>>>> seems helpful? (at least it was helpful to me when playing with mlx4)
>
> My experience when developing these kfuncs for igc (real hardware), this
> "tool" xdp_hw_metadata was super helpful, because it was very verbose
> (and I was juggling reading chip registers BE/LE and see patch1 a buggy
> implementation for RX-hash).
>
> As I wrote in cover-letter, I recommend other driver developers to do
> the same, because it really help speed up the development. In theory
> xdp_hw_metadata doesn't belong in selftests directory and IMHO it should
> have been placed in samples/bpf/, but given the relationship with real
> selftest {progs,prog_tests}/xdp_metadata.c I think it makes sense to
> keep here.
>
>
> >>>>
> >>>> The only use of bpf_printk is for debugging of bpf progs themselves.
> >>>> It should not be used in any tool.
> >>>
> >>> Hmm, good point. I guess it also means we won't have to mess with
> >>> enabling/dumping ftrace (and don't need this comment about cat'ing the
> >>> file).
> >>> Jesper, maybe we can instead pass the status of those
> >>> bpf_xdp_metadata_xxx kfuncs via 'struct xdp_meta'? And dump this info
> >>> from the userspace if needed.
> >>
> >> There are so many other ways for bpf prog to communicate with user space.
> >> Use ringbuf, perf_event buffer, global vars, maps, etc.
> >> trace_pipe is debug only because it's global and will conflict with
> >> all other debug sessions.
>
> I want to highlight above paragraph: It is very true for production
> code. (Anyone Googling this pay attention to above paragraph).
>
> >
> > 👍 makes sense, ty! hopefully we won't have to add a separate channel
> > for those and can (ab)use the metadata area.
> >
>
> Proposed solution: How about default disabling the bpf_printk's via a
> macro define, and then driver developer can manually reenable them
> easily via a single define, to enable a debugging mode.
>
> I was only watching /sys/kernel/debug/tracing/trace_pipe when I was
> debugging a driver issue.  Thus, an extra step of modifying a single
> define in BPF seems easier, than instrumenting my driver with printk.

It's certainly fine to have commented out bpf_printk in selftests
and sample code.
But the patch does:
+       if (ret >= 0) {
+               bpf_printk("populated rx_hash with 0x%08X", meta->rx_hash);
+       } else {
+               bpf_printk("rx_hash not-avail errno:%d", ret);
                meta->rx_hash = 0; /* Used by AF_XDP as not avail signal */
+       }

It feels that printk is the only thing that provides the signal to the user.
Doing s/ret >= 0/true/ won't make any difference to selftest and
to the consumer and that's my main concern with such selftest/samples.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [xdp-hints] Re: [PATCH bpf-next V2 3/6] selftests/bpf: xdp_hw_metadata RX hash return code info
  2023-03-23 17:35                     ` Alexei Starovoitov
@ 2023-03-23 17:47                       ` Stanislav Fomichev
  2023-03-24 12:14                         ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 20+ messages in thread
From: Stanislav Fomichev @ 2023-03-23 17:47 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Jesper Dangaard Brouer, Jesper Dangaard Brouer, bpf,
	Network Development, Martin KaFai Lau, Alexei Starovoitov,
	Daniel Borkmann, Alexander Lobakin, Larysa Zaremba, xdp-hints,
	anthony.l.nguyen, Song, Yoong Siang, Ong, Boon Leong,
	intel-wired-lan, Paolo Abeni, Jesse Brandeburg, Jakub Kicinski,
	Eric Dumazet, John Fastabend, Jesper Dangaard Brouer,
	David S. Miller

On Thu, Mar 23, 2023 at 10:35 AM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Thu, Mar 23, 2023 at 1:51 AM Jesper Dangaard Brouer
> <jbrouer@redhat.com> wrote:
> >
> >
> > On 22/03/2023 20.33, Stanislav Fomichev wrote:
> > > On Wed, Mar 22, 2023 at 12:30 PM Alexei Starovoitov
> > > <alexei.starovoitov@gmail.com> wrote:
> > >>
> > >> On Wed, Mar 22, 2023 at 12:23 PM Stanislav Fomichev <sdf@google.com> wrote:
> > >>>
> > >>> On Wed, Mar 22, 2023 at 12:17 PM Alexei Starovoitov
> > >>> <alexei.starovoitov@gmail.com> wrote:
> > >>>>
> > >>>> On Wed, Mar 22, 2023 at 12:00 PM Stanislav Fomichev <sdf@google.com> wrote:
> > >>>>>
> > >>>>> On Wed, Mar 22, 2023 at 9:07 AM Alexei Starovoitov
> > >>>>> <alexei.starovoitov@gmail.com> wrote:
> > >>>>>>
> > >>>>>> On Wed, Mar 22, 2023 at 9:05 AM Jesper Dangaard Brouer
> > >>>>>> <jbrouer@redhat.com> wrote:
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> On 21/03/2023 19.47, Stanislav Fomichev wrote:
> > >>>>>>>> On Tue, Mar 21, 2023 at 6:47 AM Jesper Dangaard Brouer
> > >>>>>>>> <brouer@redhat.com> wrote:
> > >>>>>>>>>
> > >>>>>>>>> When driver developers add XDP-hints kfuncs for RX hash it is
> > >>>>>>>>> practical to print the return code in bpf_printk trace pipe log.
> > >>>>>>>>>
> > >>>>>>>>> Print hash value as a hex value, both AF_XDP userspace and bpf_prog,
> > >>>>>>>>> as this makes it easier to spot poor quality hashes.
> > >>>>>>>>>
> > >>>>>>>>> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
> > >>>>>>>>> ---
> > >>>>>>>>>    .../testing/selftests/bpf/progs/xdp_hw_metadata.c  |    9 ++++++---
> > >>>>>>>>>    tools/testing/selftests/bpf/xdp_hw_metadata.c      |    5 ++++-
> > >>>>>>>>>    2 files changed, 10 insertions(+), 4 deletions(-)
> > >>>>>>>>>
> > >>>>>>>>> diff --git a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> > >>>>>>>>> index 40c17adbf483..ce07010e4d48 100644
> > >>>>>>>>> --- a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> > >>>>>>>>> +++ b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> > >>>>>>>>> @@ -77,10 +77,13 @@ int rx(struct xdp_md *ctx)
> > >>>>>>>>>                   meta->rx_timestamp = 0; /* Used by AF_XDP as not avail signal */
> > >>>>>>>>>           }
> > >>>>>>>>>
> > >>>>>>>>> -       if (!bpf_xdp_metadata_rx_hash(ctx, &meta->rx_hash))
> > >>>>>>>>> -               bpf_printk("populated rx_hash with %u", meta->rx_hash);
> > >>>>>>>>> -       else
> > >>>>>>>>> +       ret = bpf_xdp_metadata_rx_hash(ctx, &meta->rx_hash);
> > >>>>>>>>> +       if (ret >= 0) {
> > >>>>>>>>> +               bpf_printk("populated rx_hash with 0x%08X", meta->rx_hash);
> > >>>>>>>>> +       } else {
> > >>>>>>>>> +               bpf_printk("rx_hash not-avail errno:%d", ret);
> > >>>>>>>>>                   meta->rx_hash = 0; /* Used by AF_XDP as not avail signal */
> > >>>>>>>>> +       }
> > >>>>>>>>>
> > >>>>>>>>>           return bpf_redirect_map(&xsk, ctx->rx_queue_index, XDP_PASS);
> > >>>>>>>>>    }
> > >>>>>>>>> diff --git a/tools/testing/selftests/bpf/xdp_hw_metadata.c b/tools/testing/selftests/bpf/xdp_hw_metadata.c
> > >>>>>>>>> index 400bfe19abfe..f3ec07ccdc95 100644
> > >>>>>>>>> --- a/tools/testing/selftests/bpf/xdp_hw_metadata.c
> > >>>>>>>>> +++ b/tools/testing/selftests/bpf/xdp_hw_metadata.c
> > >>>>>>>>> @@ -3,6 +3,9 @@
> > >>>>>>>>>    /* Reference program for verifying XDP metadata on real HW. Functional test
> > >>>>>>>>>     * only, doesn't test the performance.
> > >>>>>>>>>     *
> > >>>>>>>>> + * BPF-prog bpf_printk info outout can be access via
> > >>>>>>>>> + * /sys/kernel/debug/tracing/trace_pipe
> > >>>>>>>>
> > >>>>>>>> s/outout/output/
> > >>>>>>>>
> > >>>>>>>
> > >>>>>>> Fixed in V3
> > >>>>>>>
> > >>>>>>>> But let's maybe drop it? If you want to make it more usable, let's
> > >>>>>>>> have a separate patch to enable tracing and periodically dump it to
> > >>>>>>>> the console instead (as previously discussed).
> > >>>>>>>
> > >>>>>>> Cat'ing /sys/kernel/debug/tracing/trace_pipe work for me regardless of
> > >>>>>>> setting in
> > >>>>>>> /sys/kernel/debug/tracing/events/bpf_trace/bpf_trace_printk/enable
> > >>>>>>>
> > >>>>>>> We likely need a followup patch that adds a BPF config switch that can
> > >>>>>>> disable bpf_printk calls, because this adds overhead and thus affects
> > >>>>>>> the timestamps.
> > >>>>>>
> > >>>>>> No. This is by design.
> > >>>>>> Do not use bpf_printk* in production.
> >
> > I fully agree do not use bpf_printk in *production*.
> >
> > >>>>>
> > >>>>> But that's not for the production? xdp_hw_metadata is a small tool to
> > >>>>> verify that the metadata being dumped is correct (during the
> > >>>>> development).
> > >>>>> We have a proper (less verbose) selftest in
> > >>>>> {progs,prog_tests}/xdp_metadata.c (over veth).
> > >>>>> This xdp_hw_metadata was supposed to be used for running it against
> > >>>>> the real hardware, so having as much debugging at hand as possible
> > >>>>> seems helpful? (at least it was helpful to me when playing with mlx4)
> >
> > My experience when developing these kfuncs for igc (real hardware), this
> > "tool" xdp_hw_metadata was super helpful, because it was very verbose
> > (and I was juggling reading chip registers BE/LE and see patch1 a buggy
> > implementation for RX-hash).
> >
> > As I wrote in cover-letter, I recommend other driver developers to do
> > the same, because it really help speed up the development. In theory
> > xdp_hw_metadata doesn't belong in selftests directory and IMHO it should
> > have been placed in samples/bpf/, but given the relationship with real
> > selftest {progs,prog_tests}/xdp_metadata.c I think it makes sense to
> > keep here.
> >
> >
> > >>>>
> > >>>> The only use of bpf_printk is for debugging of bpf progs themselves.
> > >>>> It should not be used in any tool.
> > >>>
> > >>> Hmm, good point. I guess it also means we won't have to mess with
> > >>> enabling/dumping ftrace (and don't need this comment about cat'ing the
> > >>> file).
> > >>> Jesper, maybe we can instead pass the status of those
> > >>> bpf_xdp_metadata_xxx kfuncs via 'struct xdp_meta'? And dump this info
> > >>> from the userspace if needed.
> > >>
> > >> There are so many other ways for bpf prog to communicate with user space.
> > >> Use ringbuf, perf_event buffer, global vars, maps, etc.
> > >> trace_pipe is debug only because it's global and will conflict with
> > >> all other debug sessions.
> >
> > I want to highlight above paragraph: It is very true for production
> > code. (Anyone Googling this pay attention to above paragraph).
> >
> > >
> > > 👍 makes sense, ty! hopefully we won't have to add a separate channel
> > > for those and can (ab)use the metadata area.
> > >
> >
> > Proposed solution: How about default disabling the bpf_printk's via a
> > macro define, and then driver developer can manually reenable them
> > easily via a single define, to enable a debugging mode.
> >
> > I was only watching /sys/kernel/debug/tracing/trace_pipe when I was
> > debugging a driver issue.  Thus, an extra step of modifying a single
> > define in BPF seems easier, than instrumenting my driver with printk.
>
> It's certainly fine to have commented out bpf_printk in selftests
> and sample code.
> But the patch does:
> +       if (ret >= 0) {
> +               bpf_printk("populated rx_hash with 0x%08X", meta->rx_hash);
> +       } else {
> +               bpf_printk("rx_hash not-avail errno:%d", ret);
>                 meta->rx_hash = 0; /* Used by AF_XDP as not avail signal */
> +       }
>
> It feels that printk is the only thing that provides the signal to the user.
> Doing s/ret >= 0/true/ won't make any difference to selftest and
> to the consumer and that's my main concern with such selftest/samples.

I agree, having this signal delivered to the user (without the ifdefs)
seems like a better way to go.
Seems trivial to do something like the following below? (untested,
doesn't compile, gmail-copy-pasted so don't try to apply it)

diff --git a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
index 4c55b4d79d3d..061c410f68ea 100644
--- a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
+++ b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
@@ -12,6 +12,9 @@ struct {
  __type(value, __u32);
 } xsk SEC(".maps");

+int received;
+int dropped;
+
 extern int bpf_xdp_metadata_rx_timestamp(const struct xdp_md *ctx,
  __u64 *timestamp) __ksym;
 extern int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx,
@@ -52,11 +55,11 @@ int rx(struct xdp_md *ctx)
  if (udp->dest != bpf_htons(9091))
  return XDP_PASS;

- bpf_printk("forwarding UDP:9091 to AF_XDP");
+ __sync_fetch_and_add(&received, 1);

  ret = bpf_xdp_adjust_meta(ctx, -(int)sizeof(struct xdp_meta));
  if (ret != 0) {
- bpf_printk("bpf_xdp_adjust_meta returned %d", ret);
+ __sync_fetch_and_add(&dropped, 1);
  return XDP_PASS;
  }

@@ -65,19 +68,12 @@ int rx(struct xdp_md *ctx)
  meta = data_meta;

  if (meta + 1 > data) {
- bpf_printk("bpf_xdp_adjust_meta doesn't appear to work");
+ __sync_fetch_and_add(&dropped, 1);
  return XDP_PASS;
  }

- if (!bpf_xdp_metadata_rx_timestamp(ctx, &meta->rx_timestamp))
- bpf_printk("populated rx_timestamp with %llu", meta->rx_timestamp);
- else
- meta->rx_timestamp = 0; /* Used by AF_XDP as not avail signal */
-
- if (!bpf_xdp_metadata_rx_hash(ctx, &meta->rx_hash))
- bpf_printk("populated rx_hash with %u", meta->rx_hash);
- else
- meta->rx_hash = 0; /* Used by AF_XDP as not avail signal */
+ meta->rx_timestamp_ret = bpf_xdp_metadata_rx_timestamp(ctx,
&meta->rx_timestamp);
+ meta->rx_hash_ret = bpf_xdp_metadata_rx_hash(ctx, &meta->rx_hash);

  return bpf_redirect_map(&xsk, ctx->rx_queue_index, XDP_PASS);
 }
diff --git a/tools/testing/selftests/bpf/xdp_hw_metadata.c
b/tools/testing/selftests/bpf/xdp_hw_metadata.c
index 1c8acb68b977..a4ea742547b5 100644
--- a/tools/testing/selftests/bpf/xdp_hw_metadata.c
+++ b/tools/testing/selftests/bpf/xdp_hw_metadata.c
@@ -140,8 +140,17 @@ static void verify_xdp_metadata(void *data)

  meta = data - sizeof(*meta);

- printf("rx_timestamp: %llu\n", meta->rx_timestamp);
- printf("rx_hash: %u\n", meta->rx_hash);
+ printf("received: %d dropped: %d\n", obj->xxx->received, obj->xxx->dropped);
+
+ if (meta->rx_timestamp_ret)
+ printf("rx_timestamp errno: %d\n", meta->rx_timestamp_ret);
+ else
+ printf("rx_timestamp: %llu\n", meta->rx_timestamp);
+
+ if (meta->rx_hash_ret)
+ printf("rx_hash errno: %d\n", meta->rx_hash_ret);
+ else
+ printf("rx_hash: %u\n", meta->rx_hash);
 }

 static void verify_skb_metadata(int fd)
diff --git a/tools/testing/selftests/bpf/xdp_metadata.h
b/tools/testing/selftests/bpf/xdp_metadata.h
index f6780fbb0a21..179f8d902059 100644
--- a/tools/testing/selftests/bpf/xdp_metadata.h
+++ b/tools/testing/selftests/bpf/xdp_metadata.h
@@ -10,6 +10,8 @@
 #endif

 struct xdp_meta {
+ int rx_timestamp_ret;
  __u64 rx_timestamp;
+ int rx_hash_ret;
  __u32 rx_hash;
 };

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [xdp-hints] Re: [PATCH bpf-next V2 3/6] selftests/bpf: xdp_hw_metadata RX hash return code info
  2023-03-23 17:47                       ` Stanislav Fomichev
@ 2023-03-24 12:14                         ` Jesper Dangaard Brouer
  2023-03-24 19:40                           ` Stanislav Fomichev
  0 siblings, 1 reply; 20+ messages in thread
From: Jesper Dangaard Brouer @ 2023-03-24 12:14 UTC (permalink / raw)
  To: Stanislav Fomichev, Alexei Starovoitov
  Cc: brouer, Jesper Dangaard Brouer, bpf, Network Development,
	Martin KaFai Lau, Alexei Starovoitov, Daniel Borkmann,
	Alexander Lobakin, Larysa Zaremba, xdp-hints, anthony.l.nguyen,
	Song, Yoong Siang, Ong, Boon Leong, intel-wired-lan, Paolo Abeni,
	Jesse Brandeburg, Jakub Kicinski, Eric Dumazet, John Fastabend,
	Jesper Dangaard Brouer, David S. Miller



On 23/03/2023 18.47, Stanislav Fomichev wrote:
> On Thu, Mar 23, 2023 at 10:35 AM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
>>
>> On Thu, Mar 23, 2023 at 1:51 AM Jesper Dangaard Brouer
>> <jbrouer@redhat.com> wrote:
>>>
>>>
>>> On 22/03/2023 20.33, Stanislav Fomichev wrote:
>>>> On Wed, Mar 22, 2023 at 12:30 PM Alexei Starovoitov
>>>> <alexei.starovoitov@gmail.com> wrote:
>>>>>
>>>>> On Wed, Mar 22, 2023 at 12:23 PM Stanislav Fomichev <sdf@google.com> wrote:
>>>>>>
>>>>>> On Wed, Mar 22, 2023 at 12:17 PM Alexei Starovoitov
>>>>>> <alexei.starovoitov@gmail.com> wrote:
>>>>>>>
>>>>>>> On Wed, Mar 22, 2023 at 12:00 PM Stanislav Fomichev <sdf@google.com> wrote:
>>>>>>>>
>>>>>>>> On Wed, Mar 22, 2023 at 9:07 AM Alexei Starovoitov
>>>>>>>> <alexei.starovoitov@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> On Wed, Mar 22, 2023 at 9:05 AM Jesper Dangaard Brouer
>>>>>>>>> <jbrouer@redhat.com> wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 21/03/2023 19.47, Stanislav Fomichev wrote:
>>>>>>>>>>> On Tue, Mar 21, 2023 at 6:47 AM Jesper Dangaard Brouer
>>>>>>>>>>> <brouer@redhat.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> When driver developers add XDP-hints kfuncs for RX hash it is
>>>>>>>>>>>> practical to print the return code in bpf_printk trace pipe log.
>>>>>>>>>>>>
>>>>>>>>>>>> Print hash value as a hex value, both AF_XDP userspace and bpf_prog,
>>>>>>>>>>>> as this makes it easier to spot poor quality hashes.
>>>>>>>>>>>>
>>>>>>>>>>>> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
>>>>>>>>>>>> ---
>>>>>>>>>>>>     .../testing/selftests/bpf/progs/xdp_hw_metadata.c  |    9 ++++++---
>>>>>>>>>>>>     tools/testing/selftests/bpf/xdp_hw_metadata.c      |    5 ++++-
>>>>>>>>>>>>     2 files changed, 10 insertions(+), 4 deletions(-)
>>>>>>>>>>>>
>>>>>>>>>>>> diff --git a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
>>>>>>>>>>>> index 40c17adbf483..ce07010e4d48 100644
>>>>>>>>>>>> --- a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
>>>>>>>>>>>> +++ b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
>>>>>>>>>>>> @@ -77,10 +77,13 @@ int rx(struct xdp_md *ctx)
>>>>>>>>>>>>                    meta->rx_timestamp = 0; /* Used by AF_XDP as not avail signal */
>>>>>>>>>>>>            }
>>>>>>>>>>>>
>>>>>>>>>>>> -       if (!bpf_xdp_metadata_rx_hash(ctx, &meta->rx_hash))
>>>>>>>>>>>> -               bpf_printk("populated rx_hash with %u", meta->rx_hash);
>>>>>>>>>>>> -       else
>>>>>>>>>>>> +       ret = bpf_xdp_metadata_rx_hash(ctx, &meta->rx_hash);
>>>>>>>>>>>> +       if (ret >= 0) {
>>>>>>>>>>>> +               bpf_printk("populated rx_hash with 0x%08X", meta->rx_hash);
>>>>>>>>>>>> +       } else {
>>>>>>>>>>>> +               bpf_printk("rx_hash not-avail errno:%d", ret);
>>>>>>>>>>>>                    meta->rx_hash = 0; /* Used by AF_XDP as not avail signal */
>>>>>>>>>>>> +       }
>>>>>>>>>>>>
>>>>>>>>>>>>            return bpf_redirect_map(&xsk, ctx->rx_queue_index, XDP_PASS);
>>>>>>>>>>>>     }
>>>>>>>>>>>> diff --git a/tools/testing/selftests/bpf/xdp_hw_metadata.c b/tools/testing/selftests/bpf/xdp_hw_metadata.c
>>>>>>>>>>>> index 400bfe19abfe..f3ec07ccdc95 100644
>>>>>>>>>>>> --- a/tools/testing/selftests/bpf/xdp_hw_metadata.c
>>>>>>>>>>>> +++ b/tools/testing/selftests/bpf/xdp_hw_metadata.c
>>>>>>>>>>>> @@ -3,6 +3,9 @@
>>>>>>>>>>>>     /* Reference program for verifying XDP metadata on real HW. Functional test
>>>>>>>>>>>>      * only, doesn't test the performance.
>>>>>>>>>>>>      *
>>>>>>>>>>>> + * BPF-prog bpf_printk info outout can be access via
>>>>>>>>>>>> + * /sys/kernel/debug/tracing/trace_pipe
>>>>>>>>>>>
>>>>>>>>>>> s/outout/output/
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Fixed in V3
>>>>>>>>>>
>>>>>>>>>>> But let's maybe drop it? If you want to make it more usable, let's
>>>>>>>>>>> have a separate patch to enable tracing and periodically dump it to
>>>>>>>>>>> the console instead (as previously discussed).
>>>>>>>>>>
>>>>>>>>>> Cat'ing /sys/kernel/debug/tracing/trace_pipe work for me regardless of
>>>>>>>>>> setting in
>>>>>>>>>> /sys/kernel/debug/tracing/events/bpf_trace/bpf_trace_printk/enable
>>>>>>>>>>
>>>>>>>>>> We likely need a followup patch that adds a BPF config switch that can
>>>>>>>>>> disable bpf_printk calls, because this adds overhead and thus affects
>>>>>>>>>> the timestamps.
>>>>>>>>>
>>>>>>>>> No. This is by design.
>>>>>>>>> Do not use bpf_printk* in production.
>>>
>>> I fully agree do not use bpf_printk in *production*.
>>>
>>>>>>>>
>>>>>>>> But that's not for the production? xdp_hw_metadata is a small tool to
>>>>>>>> verify that the metadata being dumped is correct (during the
>>>>>>>> development).
>>>>>>>> We have a proper (less verbose) selftest in
>>>>>>>> {progs,prog_tests}/xdp_metadata.c (over veth).
>>>>>>>> This xdp_hw_metadata was supposed to be used for running it against
>>>>>>>> the real hardware, so having as much debugging at hand as possible
>>>>>>>> seems helpful? (at least it was helpful to me when playing with mlx4)
>>>
>>> My experience when developing these kfuncs for igc (real hardware), this
>>> "tool" xdp_hw_metadata was super helpful, because it was very verbose
>>> (and I was juggling reading chip registers BE/LE and see patch1 a buggy
>>> implementation for RX-hash).
>>>
>>> As I wrote in cover-letter, I recommend other driver developers to do
>>> the same, because it really help speed up the development. In theory
>>> xdp_hw_metadata doesn't belong in selftests directory and IMHO it should
>>> have been placed in samples/bpf/, but given the relationship with real
>>> selftest {progs,prog_tests}/xdp_metadata.c I think it makes sense to
>>> keep here.
>>>
>>>
>>>>>>>
>>>>>>> The only use of bpf_printk is for debugging of bpf progs themselves.
>>>>>>> It should not be used in any tool.
>>>>>>
>>>>>> Hmm, good point. I guess it also means we won't have to mess with
>>>>>> enabling/dumping ftrace (and don't need this comment about cat'ing the
>>>>>> file).
>>>>>> Jesper, maybe we can instead pass the status of those
>>>>>> bpf_xdp_metadata_xxx kfuncs via 'struct xdp_meta'? And dump this info
>>>>>> from the userspace if needed.
>>>>>
>>>>> There are so many other ways for bpf prog to communicate with user space.
>>>>> Use ringbuf, perf_event buffer, global vars, maps, etc.
>>>>> trace_pipe is debug only because it's global and will conflict with
>>>>> all other debug sessions.
>>>
>>> I want to highlight above paragraph: It is very true for production
>>> code. (Anyone Googling this pay attention to above paragraph).
>>>
>>>>
>>>> 👍 makes sense, ty! hopefully we won't have to add a separate channel
>>>> for those and can (ab)use the metadata area.
>>>>
>>>
>>> Proposed solution: How about default disabling the bpf_printk's via a
>>> macro define, and then driver developer can manually reenable them
>>> easily via a single define, to enable a debugging mode.
>>>
>>> I was only watching /sys/kernel/debug/tracing/trace_pipe when I was
>>> debugging a driver issue.  Thus, an extra step of modifying a single
>>> define in BPF seems easier, than instrumenting my driver with printk.
>>
>> It's certainly fine to have commented out bpf_printk in selftests
>> and sample code.
>> But the patch does:
>> +       if (ret >= 0) {
>> +               bpf_printk("populated rx_hash with 0x%08X", meta->rx_hash);
>> +       } else {
>> +               bpf_printk("rx_hash not-avail errno:%d", ret);
>>                  meta->rx_hash = 0; /* Used by AF_XDP as not avail signal */
>> +       }
>>
>> It feels that printk is the only thing that provides the signal to the user.
>> Doing s/ret >= 0/true/ won't make any difference to selftest and
>> to the consumer and that's my main concern with such selftest/samples.
> 
> I agree, having this signal delivered to the user (without the ifdefs)
> seems like a better way to go.

I agree that we have a signal that we are not delivering to the user.

Just so we are on the same page, let me explain how above code is
problematic. As the rx_hash value zero have two meanings:

  (1) Negative 'ret' set rx_hash=0 as err signal to AF_XDP "user"
  (2) Hardware set rx_hash=0, when hash-type == IGC_RSS_TYPE_NO_HASH

Case (2) happens for all L2 packets e.g. ARP packets.  See in patch-1
where I map IGC_RSS_TYPE_NO_HASH to netstack type PKT_HASH_TYPE_L2.
I did consider return errno/negative number for IGC_RSS_TYPE_NO_HASH,
but I wanted to keep kfunc code as simple as possible (both for speed
and if we need to unroll as byte-code later). As i225 hardware still
writes zero into hash word I choose to keep code simple.


IMHO this symptom is related to an API problem of the kfunc, that
doesn't provide the hash-type.

> Seems trivial to do something like the following below? (untested,
> doesn't compile, gmail-copy-pasted so don't try to apply it)
> 

If the kfunc provided the hash-type, which will be a positive number.
Then I would add a signed integer to meta, which can store the hash-type
or the negative errno number.


> diff --git a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> index 4c55b4d79d3d..061c410f68ea 100644
> --- a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> +++ b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> @@ -12,6 +12,9 @@ struct {
>    __type(value, __u32);
>   } xsk SEC(".maps");
> 
> +int received;
> +int dropped;
> +
>   extern int bpf_xdp_metadata_rx_timestamp(const struct xdp_md *ctx,
>    __u64 *timestamp) __ksym;
>   extern int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx,
> @@ -52,11 +55,11 @@ int rx(struct xdp_md *ctx)
>    if (udp->dest != bpf_htons(9091))
>    return XDP_PASS;

It we wanted to make this program user "friendly", we should also count
packets that doesn't get redirected to AF_XDP "user" and instead skipped.

> - bpf_printk("forwarding UDP:9091 to AF_XDP");
> + __sync_fetch_and_add(&received, 1);
> 
>    ret = bpf_xdp_adjust_meta(ctx, -(int)sizeof(struct xdp_meta));
>    if (ret != 0) {
> - bpf_printk("bpf_xdp_adjust_meta returned %d", ret);
> + __sync_fetch_and_add(&dropped, 1);
>    return XDP_PASS;

Is it a "dropped" or a "skipped" packet (return XDP_PASS)?

>    }
[...]

--Jesper


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [xdp-hints] Re: [PATCH bpf-next V2 3/6] selftests/bpf: xdp_hw_metadata RX hash return code info
  2023-03-24 12:14                         ` Jesper Dangaard Brouer
@ 2023-03-24 19:40                           ` Stanislav Fomichev
  0 siblings, 0 replies; 20+ messages in thread
From: Stanislav Fomichev @ 2023-03-24 19:40 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Alexei Starovoitov, brouer, bpf, Network Development,
	Martin KaFai Lau, Alexei Starovoitov, Daniel Borkmann,
	Alexander Lobakin, Larysa Zaremba, xdp-hints, anthony.l.nguyen,
	yoong.siang.song, boon.leong.ong, intel-wired-lan, Paolo Abeni,
	Jesse Brandeburg, Jakub Kicinski, Eric Dumazet, John Fastabend,
	Jesper Dangaard Brouer, David S. Miller

On 03/24, Jesper Dangaard Brouer wrote:


> On 23/03/2023 18.47, Stanislav Fomichev wrote:
> > On Thu, Mar 23, 2023 at 10:35 AM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > On Thu, Mar 23, 2023 at 1:51 AM Jesper Dangaard Brouer
> > > <jbrouer@redhat.com> wrote:
> > > >
> > > >
> > > > On 22/03/2023 20.33, Stanislav Fomichev wrote:
> > > > > On Wed, Mar 22, 2023 at 12:30 PM Alexei Starovoitov
> > > > > <alexei.starovoitov@gmail.com> wrote:
> > > > > >
> > > > > > On Wed, Mar 22, 2023 at 12:23 PM Stanislav Fomichev  
> <sdf@google.com> wrote:
> > > > > > >
> > > > > > > On Wed, Mar 22, 2023 at 12:17 PM Alexei Starovoitov
> > > > > > > <alexei.starovoitov@gmail.com> wrote:
> > > > > > > >
> > > > > > > > On Wed, Mar 22, 2023 at 12:00 PM Stanislav Fomichev  
> <sdf@google.com> wrote:
> > > > > > > > >
> > > > > > > > > On Wed, Mar 22, 2023 at 9:07 AM Alexei Starovoitov
> > > > > > > > > <alexei.starovoitov@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > On Wed, Mar 22, 2023 at 9:05 AM Jesper Dangaard Brouer
> > > > > > > > > > <jbrouer@redhat.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On 21/03/2023 19.47, Stanislav Fomichev wrote:
> > > > > > > > > > > > On Tue, Mar 21, 2023 at 6:47 AM Jesper Dangaard  
> Brouer
> > > > > > > > > > > > <brouer@redhat.com> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > When driver developers add XDP-hints kfuncs for  
> RX hash it is
> > > > > > > > > > > > > practical to print the return code in bpf_printk  
> trace pipe log.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Print hash value as a hex value, both AF_XDP  
> userspace and bpf_prog,
> > > > > > > > > > > > > as this makes it easier to spot poor quality  
> hashes.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Signed-off-by: Jesper Dangaard Brouer  
> <brouer@redhat.com>
> > > > > > > > > > > > > ---
> > > > > > > > > > > >  
> >     .../testing/selftests/bpf/progs/xdp_hw_metadata.c  |    9 ++++++---
> > > > > > > > > > > > >      
> tools/testing/selftests/bpf/xdp_hw_metadata.c      |    5 ++++-
> > > > > > > > > > > > >     2 files changed, 10 insertions(+), 4  
> deletions(-)
> > > > > > > > > > > > >
> > > > > > > > > > > > > diff --git  
> a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c  
> b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> > > > > > > > > > > > > index 40c17adbf483..ce07010e4d48 100644
> > > > > > > > > > > > > ---  
> a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> > > > > > > > > > > > > +++  
> b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> > > > > > > > > > > > > @@ -77,10 +77,13 @@ int rx(struct xdp_md *ctx)
> > > > > > > > > > > > >                    meta->rx_timestamp = 0; /*  
> Used by AF_XDP as not avail signal */
> > > > > > > > > > > > >            }
> > > > > > > > > > > > >
> > > > > > > > > > > > > -       if (!bpf_xdp_metadata_rx_hash(ctx,  
> &meta->rx_hash))
> > > > > > > > > > > > > -               bpf_printk("populated rx_hash  
> with %u", meta->rx_hash);
> > > > > > > > > > > > > -       else
> > > > > > > > > > > > > +       ret = bpf_xdp_metadata_rx_hash(ctx,  
> &meta->rx_hash);
> > > > > > > > > > > > > +       if (ret >= 0) {
> > > > > > > > > > > > > +               bpf_printk("populated rx_hash  
> with 0x%08X", meta->rx_hash);
> > > > > > > > > > > > > +       } else {
> > > > > > > > > > > > > +               bpf_printk("rx_hash not-avail  
> errno:%d", ret);
> > > > > > > > > > > > >                    meta->rx_hash = 0; /* Used by  
> AF_XDP as not avail signal */
> > > > > > > > > > > > > +       }
> > > > > > > > > > > > >
> > > > > > > > > > > > >            return bpf_redirect_map(&xsk,  
> ctx->rx_queue_index, XDP_PASS);
> > > > > > > > > > > > >     }
> > > > > > > > > > > > > diff --git  
> a/tools/testing/selftests/bpf/xdp_hw_metadata.c  
> b/tools/testing/selftests/bpf/xdp_hw_metadata.c
> > > > > > > > > > > > > index 400bfe19abfe..f3ec07ccdc95 100644
> > > > > > > > > > > > > ---  
> a/tools/testing/selftests/bpf/xdp_hw_metadata.c
> > > > > > > > > > > > > +++  
> b/tools/testing/selftests/bpf/xdp_hw_metadata.c
> > > > > > > > > > > > > @@ -3,6 +3,9 @@
> > > > > > > > > > > > >     /* Reference program for verifying XDP  
> metadata on real HW. Functional test
> > > > > > > > > > > > >      * only, doesn't test the performance.
> > > > > > > > > > > > >      *
> > > > > > > > > > > > > + * BPF-prog bpf_printk info outout can be access  
> via
> > > > > > > > > > > > > + * /sys/kernel/debug/tracing/trace_pipe
> > > > > > > > > > > >
> > > > > > > > > > > > s/outout/output/
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Fixed in V3
> > > > > > > > > > >
> > > > > > > > > > > > But let's maybe drop it? If you want to make it  
> more usable, let's
> > > > > > > > > > > > have a separate patch to enable tracing and  
> periodically dump it to
> > > > > > > > > > > > the console instead (as previously discussed).
> > > > > > > > > > >
> > > > > > > > > > > Cat'ing /sys/kernel/debug/tracing/trace_pipe work for  
> me regardless of
> > > > > > > > > > > setting in
> > > > > > > > > > >  
> /sys/kernel/debug/tracing/events/bpf_trace/bpf_trace_printk/enable
> > > > > > > > > > >
> > > > > > > > > > > We likely need a followup patch that adds a BPF  
> config switch that can
> > > > > > > > > > > disable bpf_printk calls, because this adds overhead  
> and thus affects
> > > > > > > > > > > the timestamps.
> > > > > > > > > >
> > > > > > > > > > No. This is by design.
> > > > > > > > > > Do not use bpf_printk* in production.
> > > >
> > > > I fully agree do not use bpf_printk in *production*.
> > > >
> > > > > > > > >
> > > > > > > > > But that's not for the production? xdp_hw_metadata is a  
> small tool to
> > > > > > > > > verify that the metadata being dumped is correct (during  
> the
> > > > > > > > > development).
> > > > > > > > > We have a proper (less verbose) selftest in
> > > > > > > > > {progs,prog_tests}/xdp_metadata.c (over veth).
> > > > > > > > > This xdp_hw_metadata was supposed to be used for running  
> it against
> > > > > > > > > the real hardware, so having as much debugging at hand as  
> possible
> > > > > > > > > seems helpful? (at least it was helpful to me when  
> playing with mlx4)
> > > >
> > > > My experience when developing these kfuncs for igc (real hardware),  
> this
> > > > "tool" xdp_hw_metadata was super helpful, because it was very  
> verbose
> > > > (and I was juggling reading chip registers BE/LE and see patch1 a  
> buggy
> > > > implementation for RX-hash).
> > > >
> > > > As I wrote in cover-letter, I recommend other driver developers to  
> do
> > > > the same, because it really help speed up the development. In theory
> > > > xdp_hw_metadata doesn't belong in selftests directory and IMHO it  
> should
> > > > have been placed in samples/bpf/, but given the relationship with  
> real
> > > > selftest {progs,prog_tests}/xdp_metadata.c I think it makes sense to
> > > > keep here.
> > > >
> > > >
> > > > > > > >
> > > > > > > > The only use of bpf_printk is for debugging of bpf progs  
> themselves.
> > > > > > > > It should not be used in any tool.
> > > > > > >
> > > > > > > Hmm, good point. I guess it also means we won't have to mess  
> with
> > > > > > > enabling/dumping ftrace (and don't need this comment about  
> cat'ing the
> > > > > > > file).
> > > > > > > Jesper, maybe we can instead pass the status of those
> > > > > > > bpf_xdp_metadata_xxx kfuncs via 'struct xdp_meta'? And dump  
> this info
> > > > > > > from the userspace if needed.
> > > > > >
> > > > > > There are so many other ways for bpf prog to communicate with  
> user space.
> > > > > > Use ringbuf, perf_event buffer, global vars, maps, etc.
> > > > > > trace_pipe is debug only because it's global and will conflict  
> with
> > > > > > all other debug sessions.
> > > >
> > > > I want to highlight above paragraph: It is very true for production
> > > > code. (Anyone Googling this pay attention to above paragraph).
> > > >
> > > > >
> > > > > 👍 makes sense, ty! hopefully we won't have to add a separate  
> channel
> > > > > for those and can (ab)use the metadata area.
> > > > >
> > > >
> > > > Proposed solution: How about default disabling the bpf_printk's via  
> a
> > > > macro define, and then driver developer can manually reenable them
> > > > easily via a single define, to enable a debugging mode.
> > > >
> > > > I was only watching /sys/kernel/debug/tracing/trace_pipe when I was
> > > > debugging a driver issue.  Thus, an extra step of modifying a single
> > > > define in BPF seems easier, than instrumenting my driver with  
> printk.
> > >
> > > It's certainly fine to have commented out bpf_printk in selftests
> > > and sample code.
> > > But the patch does:
> > > +       if (ret >= 0) {
> > > +               bpf_printk("populated rx_hash with 0x%08X",  
> meta->rx_hash);
> > > +       } else {
> > > +               bpf_printk("rx_hash not-avail errno:%d", ret);
> > >                  meta->rx_hash = 0; /* Used by AF_XDP as not avail  
> signal */
> > > +       }
> > >
> > > It feels that printk is the only thing that provides the signal to  
> the user.
> > > Doing s/ret >= 0/true/ won't make any difference to selftest and
> > > to the consumer and that's my main concern with such selftest/samples.
> >
> > I agree, having this signal delivered to the user (without the ifdefs)
> > seems like a better way to go.

> I agree that we have a signal that we are not delivering to the user.

> Just so we are on the same page, let me explain how above code is
> problematic. As the rx_hash value zero have two meanings:

>   (1) Negative 'ret' set rx_hash=0 as err signal to AF_XDP "user"
>   (2) Hardware set rx_hash=0, when hash-type == IGC_RSS_TYPE_NO_HASH

> Case (2) happens for all L2 packets e.g. ARP packets.  See in patch-1
> where I map IGC_RSS_TYPE_NO_HASH to netstack type PKT_HASH_TYPE_L2.
> I did consider return errno/negative number for IGC_RSS_TYPE_NO_HASH,
> but I wanted to keep kfunc code as simple as possible (both for speed
> and if we need to unroll as byte-code later). As i225 hardware still
> writes zero into hash word I choose to keep code simple.


> IMHO this symptom is related to an API problem of the kfunc, that
> doesn't provide the hash-type.

> > Seems trivial to do something like the following below? (untested,
> > doesn't compile, gmail-copy-pasted so don't try to apply it)
> >

> If the kfunc provided the hash-type, which will be a positive number.
> Then I would add a signed integer to meta, which can store the hash-type
> or the negative errno number.

What's wrong with storing the return value from the kfuncs as is?
We are not writing super optimized production code here, so should be
fine?

> > diff --git a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> > b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> > index 4c55b4d79d3d..061c410f68ea 100644
> > --- a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> > +++ b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> > @@ -12,6 +12,9 @@ struct {
> >    __type(value, __u32);
> >   } xsk SEC(".maps");
> >
> > +int received;
> > +int dropped;
> > +
> >   extern int bpf_xdp_metadata_rx_timestamp(const struct xdp_md *ctx,
> >    __u64 *timestamp) __ksym;
> >   extern int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx,
> > @@ -52,11 +55,11 @@ int rx(struct xdp_md *ctx)
> >    if (udp->dest != bpf_htons(9091))
> >    return XDP_PASS;

> It we wanted to make this program user "friendly", we should also count
> packets that doesn't get redirected to AF_XDP "user" and instead skipped.

Sure, let's do that?

> > - bpf_printk("forwarding UDP:9091 to AF_XDP");
> > + __sync_fetch_and_add(&received, 1);
> >
> >    ret = bpf_xdp_adjust_meta(ctx, -(int)sizeof(struct xdp_meta));
> >    if (ret != 0) {
> > - bpf_printk("bpf_xdp_adjust_meta returned %d", ret);
> > + __sync_fetch_and_add(&dropped, 1);
> >    return XDP_PASS;

> Is it a "dropped" or a "skipped" packet (return XDP_PASS)?

Up to you. I was mostly throwing an idea on how to drop these bpf_printk
completely and report those events from the userspace.

> >    }
> [...]

> --Jesper


^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2023-03-24 19:40 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-21 13:47 [xdp-hints] [PATCH bpf-next V2 0/6] XDP-hints kfuncs for Intel driver igc Jesper Dangaard Brouer
2023-03-21 13:47 ` [xdp-hints] [PATCH bpf-next V2 1/6] igc: enable and fix RX hash usage by netstack Jesper Dangaard Brouer
2023-03-21 13:47 ` [xdp-hints] [PATCH bpf-next V2 2/6] selftests/bpf: xdp_hw_metadata track more timestamps Jesper Dangaard Brouer
2023-03-21 13:47 ` [xdp-hints] [PATCH bpf-next V2 3/6] selftests/bpf: xdp_hw_metadata RX hash return code info Jesper Dangaard Brouer
2023-03-21 18:47   ` [xdp-hints] " Stanislav Fomichev
2023-03-22 16:05     ` Jesper Dangaard Brouer
2023-03-22 16:07       ` Alexei Starovoitov
2023-03-22 19:00         ` Stanislav Fomichev
2023-03-22 19:17           ` Alexei Starovoitov
2023-03-22 19:22             ` Stanislav Fomichev
2023-03-22 19:30               ` Alexei Starovoitov
2023-03-22 19:33                 ` Stanislav Fomichev
2023-03-23  8:51                   ` Jesper Dangaard Brouer
2023-03-23 17:35                     ` Alexei Starovoitov
2023-03-23 17:47                       ` Stanislav Fomichev
2023-03-24 12:14                         ` Jesper Dangaard Brouer
2023-03-24 19:40                           ` Stanislav Fomichev
2023-03-21 13:47 ` [xdp-hints] [PATCH bpf-next V2 4/6] igc: add igc_xdp_buff wrapper for xdp_buff in driver Jesper Dangaard Brouer
2023-03-21 13:47 ` [xdp-hints] [PATCH bpf-next V2 5/6] igc: add XDP hints kfuncs for RX timestamp Jesper Dangaard Brouer
2023-03-21 13:47 ` [xdp-hints] [PATCH bpf-next V2 6/6] igc: add XDP hints kfuncs for RX hash Jesper Dangaard Brouer

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox