From mboxrd@z Thu Jan 1 00:00:00 1970
From: Toke Høiland-Jørgensen
To: John Fastabend, Martin KaFai Lau, Stanislav Fomichev
CC: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, song@kernel.org,
 yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org, haoluo@google.com,
 jolsa@kernel.org, David Ahern, Jakub Kicinski, Willem de Bruijn,
 Jesper Dangaard Brouer, Anatoly Burakov, Alexander Lobakin, Magnus Karlsson,
 Maryam Tahhan, xdp-hints@xdp-project.net, netdev@vger.kernel.org,
 bpf@vger.kernel.org
Subject: [xdp-hints] Re: [RFC bpf-next v2 06/14] xdp: Carry over xdp metadata into skb context
Date: Thu, 10 Nov 2022 15:32:32 +0100
Message-ID: <87v8nmyj5r.fsf@toke.dk>
In-Reply-To: <636c533231572_13c9f42087c@john.notmuch>
References: <20221104032532.1615099-1-sdf@google.com>
 <20221104032532.1615099-7-sdf@google.com>
 <187e89c3-d7de-7bec-c72e-d9d6eb5bcca0@linux.dev>
 <9a8fefe4-2fcb-95b7-cda0-06509feee78e@linux.dev>
 <6f57370f-7ec3-07dd-54df-04423cab6d1f@linux.dev>
 <87leokz8lq.fsf@toke.dk>
 <636c533231572_13c9f42087c@john.notmuch>
List-Id: XDP hardware hints design discussion

John Fastabend writes:

> Toke Høiland-Jørgensen wrote:
>> Snipping a bit of context to reply to this bit:
>>
>> >>>> Can the xdp prog still change the metadata through xdp->data_meta? tbh, I am not
>> >>>> sure it is solid enough by asking the xdp prog not to use the same random number
>> >>>> in its own metadata + not to change the metadata through xdp->data_meta after
>> >>>> calling bpf_xdp_metadata_export_to_skb().
>> >>>
>> >>> What do you think the usecase here might be? Or are you suggesting we
>> >>> reject further access to data_meta after
>> >>> bpf_xdp_metadata_export_to_skb somehow?
>> >>>
>> >>> If we want to let the programs override some of this
>> >>> bpf_xdp_metadata_export_to_skb() metadata, it feels like we can add
>> >>> more kfuncs instead of exposing the layout?
>> >>>
>> >>> bpf_xdp_metadata_export_to_skb(ctx);
>> >>> bpf_xdp_metadata_export_skb_hash(ctx, 1234);
>>
>
> Hi Toke,
>
> Trying not to bifurcate your thread. Can I start a new one here to
> elaborate on these use cases.
> I'm still a bit lost on any use case
> for this that makes sense to actually deploy on a network.
>
>> There are several use cases for needing to access the metadata after
>> calling bpf_xdp_metadata_export_to_skb():
>>
>> - Accessing the metadata after redirect (in a cpumap or devmap program,
>>   or on a veth device)
>
> I think for devmap there are still lots of opens how/where the skb
> is even built.

For veth it's pretty clear; i.e., when redirecting into containers.

> For cpumap I'm a bit unsure what the use case is. For ice, mlx and
> such you should use the hardware RSS if performance is top of mind.

Hardware RSS works fine if your hardware supports the hashing you want;
many do not. As an example, Jesper wrote this application that uses
cpumap to divide out ISP customer traffic among different CPUs (solving
an HTB scaling problem):

https://github.com/xdp-project/xdp-cpumap-tc

> And then for specific devices on cpumap (maybe realtime or ptp
> things?) could we just throw it through the xdp_frame?

Not sure what you mean here? Throw what through the xdp_frame?

>> - Transferring the packet+metadata to AF_XDP
>
> In this case we have the metadata and AF_XDP program and XDP program
> simply need to agree on metadata format. No need to have some magic
> numbers and driver specific kfuncs.

See my other reply to Martin: Yeah, AF_XDP users who write their own
kernel XDP programs can just do whatever they want. But many users just
rely on the default program in libxdp, so having a standard format to
include with that is useful.

>> - Returning XDP_PASS, but accessing some of the metadata first (whether
>>   to read or change it)
>>
>
> I don't get this case? XDP_PASS should go to stack normally through
> drivers build_skb routines. These will populate timestamp normally.
> My guess is simply descriptor->skb load/store is cheaper than carrying
> around this metadata and doing the call in BPF side.
> Anyways you
> just built an entire skb and hit the stack I don't think you will
> notice this noise in any benchmark.

If you modify the packet before returning XDP_PASS you may want to update
the metadata as well (for instance the RX hash, or in the future the
metadata could also carry transport header offsets).

-Toke