From: Toke Høiland-Jørgensen
To: Stanislav Fomichev
References: <20221129193452.3448944-1-sdf@google.com> <8735a1zdrt.fsf@toke.dk>
Date: Thu, 01 Dec 2022 00:01:00 +0100
Message-ID: <87o7soxd1v.fsf@toke.dk>
CC: bpf@vger.kernel.org, ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org,
 martin.lau@linux.dev, song@kernel.org, yhs@fb.com, john.fastabend@gmail.com,
 kpsingh@kernel.org, haoluo@google.com, jolsa@kernel.org, David Ahern,
 Jakub Kicinski, Willem de Bruijn, Jesper Dangaard Brouer, Anatoly Burakov,
 Alexander Lobakin, Magnus Karlsson, Maryam Tahhan, xdp-hints@xdp-project.net,
 netdev@vger.kernel.org
Subject: [xdp-hints] Re: [PATCH bpf-next v3 00/11] xdp: hints via kfuncs
List-Id: XDP hardware hints design discussion

Stanislav Fomichev writes:

> On Tue, Nov 29, 2022 at 12:50 PM Toke Høiland-Jørgensen wrote:
>>
>> Stanislav Fomichev writes:
>>
>> > Please see the first patch in the series for the overall
>> > design and use-cases.
>> >
>> > Changes since v2:
>> >
>> > - Rework bpf_prog_aux->xdp_netdev refcnt (Martin)
>> >
>> >   Switched to dropping the count early, after loading / verification is
>> >   done. At attach time, the pointer value is used only for comparing
>> >   the actual netdev at attach vs netdev at load.
>>
>> So if we're not holding the netdev reference, we'll end up with a BPF
>> program with hard-coded CALL instructions calling into a module that
>> could potentially be unloaded while that BPF program is still alive,
>> right?
>>
>> I suppose that since we're checking that the attach iface is the same
>> that the program should not be able to run after the module is unloaded,
>> but it still seems a bit iffy. And we should definitely block
>> BPF_PROG_RUN invocations of programs with a netdev set (but we should do
>> that anyway).
>
> Ugh, good point about BPF_PROG_RUN, seems like it should be blocked
> regardless of the locking scheme though, right?
> Since our mlx4/mlx5 changes expect something after the xdp_buff, we
> can't use those per-netdev programs with our generic
> bpf_prog_test_run_xdp...

Yup, I think we should just block it for now; maybe it can be enabled
later if it turns out to be useful (and we find a way to resolve the
kfuncs for this case).

Also, speaking of things we need to disable, tail calls are another one.
And for freplace program attachment we need to add a check that the
target interfaces match as well.

>> > (potentially can be a problem if the same slub slot is reused
>> > for another netdev later on?)
>>
>> Yeah, this would be bad as well, obviously. I guess this could happen?
>
> Not sure, that's why I'm raising it here to see what others think :-)
> Seems like this has to be actively exploited to happen? (and it's a
> privileged operation)
>
> Alternatively, we can go back to the original version where the prog
> holds the device.
> Martin mentioned in the previous version that if we were to hold a
> netdev refcnt, we'd have to drop it also from unregister_netdevice.

Yeah; I guess we could keep a list of "bound" XDP programs in struct
net_device and clear each one on unregister? Also, bear in mind that the
"unregister" callback is also called when a netdev moves between
namespaces, which is probably not what we want in this case?

> It feels like beyond that extra dev_put, we'd need to reset our
> aux->xdp_netdev and/or add some flag or something else to indicate
> that this bpf program is "orphaned" and can't be attached anywhere
> anymore (since the device is gone; netdev_run_todo should free the
> netdev it seems).

You could add a flag, and change the check to:

+		if (new_prog->aux->xdp_has_netdev &&
+		    new_prog->aux->xdp_netdev != dev) {
+			NL_SET_ERR_MSG(extack, "Cannot attach to a different target device");
+			return -EINVAL;
+		}

That way the check will always fail if xdp_netdev is reset to NULL
(while keeping the flag) on dereg?

> That should address this potential issue with reusing the same addr
> for another netdev, but is a bit more complicated code-wise.
> Thoughts?
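
To make the flag-plus-reset idea a bit more concrete, here is a rough
userspace sketch of the bookkeeping (not kernel code). Only
xdp_has_netdev and xdp_netdev come from the snippet above; struct
prog_aux, the fixed-size bound[] array, bind_prog(), unregister_dev()
and check_attach() are made-up stand-ins for illustration, and locking
and the namespace-move case are ignored entirely:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_BOUND 8	/* arbitrary limit, for the sketch only */

struct net_device;

/* Hypothetical stand-in for the relevant fields of bpf_prog_aux. */
struct prog_aux {
	bool xdp_has_netdev;		/* set at load time, never cleared */
	struct net_device *xdp_netdev;	/* reset to NULL when the device goes away */
};

/* Hypothetical stand-in for a net_device that tracks its bound programs. */
struct net_device {
	struct prog_aux *bound[MAX_BOUND];
	size_t nbound;
};

/* Load time: remember the target device and record the program on it. */
static void bind_prog(struct prog_aux *aux, struct net_device *dev)
{
	assert(dev->nbound < MAX_BOUND);
	aux->xdp_has_netdev = true;
	aux->xdp_netdev = dev;
	dev->bound[dev->nbound++] = aux;
}

/* Unregister: orphan every bound program, but keep its flag set. */
static void unregister_dev(struct net_device *dev)
{
	for (size_t i = 0; i < dev->nbound; i++)
		dev->bound[i]->xdp_netdev = NULL;
	dev->nbound = 0;
}

/* Attach time: same condition as the check in the snippet above. */
static int check_attach(const struct prog_aux *aux,
			const struct net_device *dev)
{
	if (aux->xdp_has_netdev && aux->xdp_netdev != dev)
		return -EINVAL;
	return 0;
}
```

Since an orphaned program keeps the flag but points at NULL, it can
never compare equal to a real device again, even if a later netdev
happens to land on the same address.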

I'd be in favour of adding this tracking; I worry that we'll end up with
some very subtle and hard-to-debug bugs if we somehow do end up
executing the wrong kfuncs...

-Toke