From: John Fastabend
To: Toke Høiland-Jørgensen, John Fastabend, Andrii Nakryiko,
    John Fastabend
CC: Jesper Dangaard Brouer, BPF-dev-list, Alexander Lobakin,
    "Karlsson, Magnus", Magnus Karlsson, David Ahern, Björn Töpel,
    Saeed Mahameed, "kurt@linutronix.de", "Raczynski, Piotr",
    "Zhang, Jessica", "Maloor, Kishen", "Gomes, Vinicius",
    "Brandeburg, Jesse", "Swiatkowski, Michal", "Plantykow, Marta A",
    "Desouza, Ederson", "Song, Yoong Siang", "Czapnik, Lukasz",
    "Joseph, Jithu", William Tu, Ong Boon Leong,
    xdp-hints@xdp-project.net
Date: Fri, 28 May 2021 07:35:34 -0700
Subject: Re: XDP-hints: Howto support multiple BTF types per packet basis?
Message-ID: <60b0ffb63a21a_1cf82089e@john-XPS-13-9370.notmuch>
In-Reply-To: <87fsy7gqv7.fsf@toke.dk>
References: <20210526125848.1c7adbb0@carbon>
    <60aeb01ebcd10_fe49208b8@john-XPS-13-9370.notmuch>
    <60aeeb5252147_19a622085a@john-XPS-13-9370.notmuch>
    <60b08442b18d5_1cf8208a0@john-XPS-13-9370.notmuch>
    <87fsy7gqv7.fsf@toke.dk>
List-Id: XDP hardware hints design discussion

Toke Høiland-Jørgensen wrote:
> John Fastabend writes:
>
> >> > > union and
> >> > > independent set of BTFs are two different things, I'll let
> >> > > you guys figure out which one you need, but I replied how it could
> >> > > look like in CO-RE world
> >> >
> >> > I think a union is sufficient and more aligned with how the
> >> > hardware would actually work.
> >>
> >> Sure. And I think those are two orthogonal concerns. You can start
> >> with a single struct mynic_metadata with union inside it, and later
> >> add the ability to swap mynic_metadata with another
> >> mynic_metadata___v2 that will have a similar union but with a
> >> different layout.
> >
> > Right and then you just have normal upgrade/downgrade problems with
> > any struct.
> >
> > Seems like a workable path to me. But, need to circle back to the
> > what we want to do with it part that Jesper replied to.
>
> So while this seems to be a viable path for getting libbpf to do all the
> relocations (and thanks for hashing that out, I did not have a good grip
> of the details), doing it all in userspace means that there is no way
> for the XDP program to react to changes once it has been loaded. So this
> leaves us with a selection of non-very-attractive options, IMO. I.e.,
> we would have to:

I don't really understand what 'having the XDP program react to changes
once it has been loaded' means. What would a dynamic program look like?
You can always version your metadata and write programs like this,

    if (meta->version == VERSION1) {do_foo} else {do_bar}

And then have a header,

    struct meta {
        int version;
        union ...    // union of versions
    };

I fail to see how a program could 'react' dynamically. An agent could
load new programs dynamically into tail call maps or fentry with the
needed handlers, which would work as well and avoid unions.
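To make the versioned-header idea above concrete, here is a minimal
sketch in plain C (userspace, not a loadable BPF program). Only
VERSION1, do_foo, do_bar and struct meta come from the text above;
VERSION2, the two per-version layouts, their fields and handle_meta()
are hypothetical and invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical version tags; the text above only names VERSION1. */
enum { VERSION1 = 1, VERSION2 = 2 };

/* Hypothetical per-version layouts, invented for this sketch. */
struct meta_v1 { uint32_t hash; };
struct meta_v2 { uint32_t hash; uint16_t vlan_tci; };

/* Header: a version field followed by a union of all known layouts. */
struct meta {
	int version;
	union {
		struct meta_v1 v1;
		struct meta_v2 v2;
	} u;
};

/* Stand-ins for do_foo/do_bar; the bodies are placeholders. */
static uint32_t do_foo(const struct meta_v1 *m)
{
	return m->hash;
}

static uint32_t do_bar(const struct meta_v2 *m)
{
	return m->hash ^ m->vlan_tci;
}

/* Branch on the version before touching any union member. */
static uint32_t handle_meta(const struct meta *meta)
{
	assert(meta != NULL);
	if (meta->version == VERSION1)
		return do_foo(&meta->u.v1);
	else
		return do_bar(&meta->u.v2);
}
```

The point of the sketch is only that the branch happens at run time on
data the producer filled in, so the same loaded program handles either
layout without being reloaded.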
>
> - have to block any modifications to the hardware config that would
>   change the metadata format; this will probably result in irate users

I'll need a concrete example. If I swap out my parser block, I should
also swap out my BPF for my shiny new protocol. I don't see how a user
might write programs for things they've not configured hardware for
yet. That leaves aside knobs like VLAN on/off, VXLAN on/off, and such,
which brings us to the next point.

>
> - require XDP programs to deal with all possible metadata permutations
>   supported by that driver (by exporting them all via a BTF union or
>   similar); this means a potential for combinatorial explosion of config
>   options and as NICs become programmable themselves I'm not even sure
>   if it's possible for the driver to know ahead of time

I don't see the problem, sorry. For things that exist today I can't
think of too many fields: vlan, timestamp, checksum(?), pkt_type, hash
maybe. For programmable pipelines (P4) I don't see a problem with
reloading your program or swapping out a program. I don't see the value
of adding, for example, a new protocol dynamically. Surely the hardware
is going to get hit with a big reset anyway.

>
> - throw up our hands and just let the user deal with it (i.e., to
>   nothing and so require XDP programs to be reloaded if the NIC config
>   changes); this is not very friendly and is likely to lead to subtle
>   bugs if an XDP program parses the metadata assuming it is in a
>   different format than it is

I'm not opposed to user error causing logic bugs. If I give users the
power to reprogram their NICs they should be capable of managing a few
BPF programs. And if not, then it's a space where a distro/vendor
should help them with tooling.

>
> Given that hardware config changes are not just done by ethtool, but
> also by things like running `tcpdump -j`, I really think we have to
> assume that they can be quite dynamic; which IMO means we have to solve
> this as part of the initial design.
> And I have a hard time seeing how
> this is possible without involving the kernel somehow.

I guess here you're talking about building an skb? Wouldn't it use
whatever logic it uses today to include the timestamp? This is a bit of
an aside from metadata in the BPF program. Building timestamps into
skbs doesn't require the BPF program to have the data. Or maybe the
point is that an XDP variant of tcpdump would like timestamps. But then
it should be in the metadata IMO.

>
> Unless I'm missing something? WDYT?

Distilling the above down: I think we disagree on how useful dynamic
programs are, for two reasons. First, I don't see a large list of
common attributes that would make the union approach as painful as you
fear. Second, I believe users who are touching core hardware firmware
also need to be smart enough (or have smart tools) to swap out their
BPF programs in the correct order so as to not create subtle races. I
didn't do it here, but if we agree, walking through that program swap
flow with a firmware update would be useful.

>
> -Toke
>
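
The tail-call option mentioned earlier in this reply (an agent loading
new handlers instead of one program carrying a union) can be modeled
roughly as below. This is a userspace analogy only: a plain C
function-pointer table stands in for a tail call program array, and
every name in it (meta_handler, handlers, agent_install, dispatch,
handler_old, handler_new, MAX_SLOTS) is hypothetical. It does not
address the ordering of the swap against a firmware update.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical handler type: returns a verdict for one packet's metadata. */
typedef int (*meta_handler)(const void *meta);

#define MAX_SLOTS 4

/* Stand-in for a tail call program array: one handler per slot. */
static meta_handler handlers[MAX_SLOTS];

/* Two placeholder handlers, e.g. for an old and a new metadata layout. */
static int handler_old(const void *meta) { (void)meta; return 1; }
static int handler_new(const void *meta) { (void)meta; return 2; }

/* The agent's side: put a handler in a slot. Returns 0, or -1 on a bad slot. */
static int agent_install(size_t slot, meta_handler h)
{
	assert(h != NULL);
	if (slot >= MAX_SLOTS)
		return -1;
	handlers[slot] = h;
	return 0;
}

/*
 * The entry program's side: jump to whatever is installed in the slot.
 * An empty or out-of-range slot returns 0 here, loosely mirroring how a
 * BPF tail call to an empty slot falls through to the caller.
 */
static int dispatch(size_t slot, const void *meta)
{
	if (slot >= MAX_SLOTS || handlers[slot] == NULL)
		return 0;
	return handlers[slot](meta);
}
```

The entry point never changes; the agent replaces handler_old with
handler_new in the same slot when the hardware config changes, so no
handler needs to understand more than one layout.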