Implement delete() with wildcards via for_each_elem #2922

viktormalik · 2024-01-15T08:57:46Z

This implements a new syntax, delete(@map[*, 10]) which allows partial deletes from maps, all within BPF. It uses bpf_for_each_map_elem helper with custom callback function.

It currently supports only integer values as concrete values.

Example usage:

@map[10, 10] = 1;
@map[20, 10] = 2;
@map[20, 1] = 3;
delete(@map[*, 10]);
// only @map[20, 1] is left here.

This is a re-submission of the last commit of #2321. The original PR contained a lot of commits for adding support for BPF subprograms (used for the custom callback). These were all merged separately in the meantime.

Resolves #2267.

Checklist

Language changes are updated in man/adoc/bpftrace.adoc and if needed in docs/reference_guide.md
User-visible and non-trivial changes updated in CHANGELOG.md
The new behaviour is covered by tests

src/ast/passes/codegen_llvm.cpp

danobi

Only had time to day to think about high level - seems good to me. Hope to have time later this week/weekend to do a full review.

This implements a new syntax, delete(@map[*, 10]) which allows partial deletes from maps, all within BPF. It uses bpf_for_each_map_elem helper with custom callback function.

It currently supports only integer values as concrete values.

Example usage:
@map[10, 10] = 1;
@map[20, 10] = 2;
@map[20, 1] = 3;
delete(@map[*, 10]);
// only @map[20, 1] is left here.
This is a re-submission of the last commit of #2321. The original PR contained a lot of commits for adding support for BPF subprograms (used for the custom callback). These were all merged separately in the meantime.

Resolves #2267.

Checklist

Language changes are updated in man/adoc/bpftrace.adoc and if needed in docs/reference_guide.md

User-visible and non-trivial changes updated in CHANGELOG.md

The new behaviour is covered by tests

danobi · 2024-01-24T15:49:23Z

docs/reference_guide.md

+```
+# bpftrace -e 'tracepoint:workqueue:workqueue_execute_start { @work[tid] = nsecs; }
+    profile:ms:1 /@work[tid]/ { @stacks[tid, kstack] = count(); }
+    tracepoint:workqueue:workqueue_execute_end { if (nsecs - @work[tid] <= 10_000_000) {
+        delete(@stacks[tid, *]); } delete(@work[tid]); }'


Perhaps a bit more readable if you only show contents of script and not use the one-liner invocation.

danobi · 2024-01-24T15:53:42Z

Another thought: would this feature overlap with #2955 ? For example, if we can loop over the key/value pairs of a map, would it also be possible to delete() entries in the body of the loop? If so, would this feature be needed?

viktormalik · 2024-01-25T06:29:57Z

Another thought: would this feature overlap with #2955 ? For example, if we can loop over the key/value pairs of a map, would it also be possible to delete() entries in the body of the loop? If so, would this feature be needed?

Good question. I think that #2955 could eventually support deleting the entries but I'd still argue that this syntax has its usage. After all it's much more convenient to write:

delete(@map[*, *, 100])

than

for ($key : @map) {
  if ($key.2 == 100) {
    delete(@map[$key]);
  }
}

In addition, it will take some time to get #2955 in and this seems to be an often requested capability.

ajor

Thanks again for resurrecting this old code - I know it wasn't you that wrote it in the first place! I've not fully gone over the semantic analyser and codegen changes, but have a few comments to start with.

One pretty major issue I've found it that this doesn't work! I haven't looked into why yet.

Delete doesn't delete:

$ sudo ./src/bpftrace -e 'BEGIN { @x[1,1] = 1; delete(@x[1,*]); exit(); }'
Attaching 1 probe...


@x[1, 1]: 1

The runtime tests should have caught this, but they are passing because they are not checking for the right thing. The EXPECT clause looks for a match, which it finds, and it just ignores the extra elements in the output that haven't been deleted.

ajor · 2024-01-16T15:59:13Z

docs/reference_guide.md

@@ -3396,6 +3396,7 @@ END
 - `hist(int n[, int k])` - Produce a log2 histogram of values of n with 2^k buckets per power of 2
 - `lhist(int n, int min, int max, int step)` - Produce a linear histogram of values of n
 - `delete(@x[key])` - Delete the map element passed in as an argument
+- `delete(@x[*, val])` - Delete all map elements with `val` as the second level


"second key" instead of "second level"?

ajor · 2024-01-16T16:01:13Z

src/ast/ast.cpp

+MapWildcard::~MapWildcard()
+{
+}
+


Suggested change

MapWildcard::~MapWildcard()

{

}

Don't think this is needed

ajor · 2024-01-16T16:03:25Z

src/ast/ast.h

+  size_t wildcards = 0;
+  size_t concrete_idx = 0; // only makes sense if wildcards != 0


I'm not very familiar with our LEAFCOPY semantics, but do these need to be included in copy-ctor since they're not nodes?

I believe they do, good catch.

ajor · 2024-01-16T16:05:00Z

src/ast/irbuilderbpf.cpp

-  : IRBuilder<>(context),
-    module_(module),
-    bpftrace_(bpftrace)
+    : IRBuilder<>(context), module_(module), bpftrace_(bpftrace)


Unrelated change: Let's save this for Daniel's big reformatting commit

ajor · 2024-01-22T18:12:53Z

src/ast/passes/semantic_analyser.cpp

@@ -694,6 +694,46 @@ void SemanticAnalyser::visit(Call &call)
      auto &arg = *call.vargs->at(0);
      if (!arg.is_map)
        LOG(ERROR, call.loc, err_) << "delete() expects a map to be provided";
+
+      auto &map = static_cast<Map &>(arg);


This is undefined behaviour - we aren't certain that arg is a map at this point as the check above carries on after logging an error. Probably easiest to log and then return return above if arg isn't a map

ajor · 2024-01-25T11:31:47Z

src/ast/passes/semantic_analyser.cpp

+  // intentionally keep nodes with wildcards untyped to prevent their use
+  // in unexpected locations.
+  if (search_val != map_val_.end() && map.wildcards == 0)


I don't think this is a good approach - it saves coding but leaves bad error messages presented to the user. "None" type is an implementation detail that users shouldn't see (it might leak in places currently, but there should also be a user-friendly accompanying error)

ajor · 2024-01-25T15:25:36Z

tests/runtime/delete_wildcard

@@ -0,0 +1,14 @@
+NAME delete_wildcard_idx_0
+PROG BEGIN { @map[10, 10] = 1; @map[2, 20] = 2; @map[10, 11] = 1; delete(@map[10, *]); print(@map); exit(); } END {}


Suggested change

PROG BEGIN { @map[10, 10] = 1; @map[2, 20] = 2; @map[10, 11] = 1; delete(@map[10, *]); print(@map); exit(); } END {}

PROG BEGIN { @map[10, 10] = 1; @map[2, 20] = 2; @map[10, 11] = 1; delete(@map[10, *]); print(@map); exit(); }

ajor · 2024-01-25T15:25:45Z

tests/runtime/delete_wildcard

+TIMEOUT 5
+
+NAME delete_wildcard_idx_3
+PROG BEGIN { @map[10, 0, 1241, 0xcafe] = 1; @map[2, 20, 12, 0] = 2; @map[10, 192, 23, 0xcafe] = 1; delete(@map[*, *, *, 0xcafe]); print(@map); exit(); } END {}


Suggested change

PROG BEGIN { @map[10, 0, 1241, 0xcafe] = 1; @map[2, 20, 12, 0] = 2; @map[10, 192, 23, 0xcafe] = 1; delete(@map[*, *, *, 0xcafe]); print(@map); exit(); } END {}

PROG BEGIN { @map[10, 0, 1241, 0xcafe] = 1; @map[2, 20, 12, 0] = 2; @map[10, 192, 23, 0xcafe] = 1; delete(@map[*, *, *, 0xcafe]); print(@map); exit(); }

ajor · 2024-01-25T15:28:25Z

tests/runtime/delete_wildcard

+TIMEOUT 5
+
+NAME delete_wildcard_before_usage
+PROG END { delete(@map[10, *]); } BEGIN {@map[10, 20] = 1; @map[20, 10] = 2; exit();}


Runtime tests are pretty slow to run, so I'd move this into a semantic analyser unit test. It's not really testing anything different to delete_wildcard_idx_0, runtime-wise

ajor · 2024-01-25T15:48:28Z

tests/semantic_analyser.cpp

We could do with some more tests here to cover the special errors conditions added to the semantic analyser with this change, e.g. wildcards with undefined maps, accessing the value of a wildcarded map. If you're feeling especially fancy you could add checks for the correct error messages to confirm we don't just fail with "none type" errors.

I was able to trigger different aborts and segvs with these two scripts:

BEGIN { @x[0, 1] = 1; @x[*, 1]; } BEGIN { @x[*, 1]; }

danobi · 2024-01-25T16:56:19Z

Another dimension to consider: since user-defined functions are close to merging, would it make sense to start writing a standard library? With map loops and user-defined functions, we could in theory keep the terseness of the new delete() overload without adding stuff to core language.

Since this PR does solve a concrete issue in the near term (the other efforts are medium term), I am still in favor of it. And it seems like it would be possible to have a migration path from moving stuff out of core language and into standard library.

danobi · 2024-01-28T20:21:31Z

src/ast/ast.h

+  DEFINE_LEAFCOPY(MapWildcard)
+
+  explicit MapWildcard(location loc);
+  ~MapWildcard();


Can we = default the destructor?

danobi · 2024-01-28T20:21:31Z

src/ast/passes/codegen_llvm.cpp

+   *
+   * struct ctx_t {
+   *   uint64_t offset;
+   *   uint64_t value;


Perhaps good to call out somewhere in this function or the above that integers are always stored as
64bit values in maps. I see that mentioned in AssignMapStatement code but might be useful to point
it out again

danobi · 2024-01-28T20:21:31Z

src/ast/passes/semantic_analyser.cpp

+  if (assignment.map->wildcards)
+  {
+    LOG(ERROR, assignment.loc, err_)
+        << "Wildcards are only valid for delete() calls";


Nit: perhaps a more maintainable error message is to say soemthing like "wildcards are not valid for assignments". So that if map wildcards gain more use cases, we don't need to update this messsage.

danobi · 2024-01-28T20:21:31Z

src/parser.yy

+                map_vargs "," expr { $$ = $1; $1->push_back($3); }
+        |       map_vargs "," "*"  { $$ = $1; $1->push_back(new ast::MapWildcard(@$)); }
+        |       "*"            { $$ = new ast::ExpressionList; $$->push_back(new ast::MapWildcard(@$)); }


Should we use MUL instead of * here?

danobi · 2024-01-28T20:21:31Z

tests/runtime/delete_wildcard

@@ -0,0 +1,14 @@
+NAME delete_wildcard_idx_0
+PROG BEGIN { @map[10, 10] = 1; @map[2, 20] = 2; @map[10, 11] = 1; delete(@map[10, *]); print(@map); exit(); } END {}
+EXPECT @map\[2, 20\]: 2


This doesn't actually check that the other values are gone now, right? Cuz this would be on its own line. To check that the other entries are deleted, you'd want to probably use json output here.

That's correct. We could either use JSON output or I'm thinking about adding a new EXPECT_NOT clause to runtime tests to specify regex which should not be present in the output. But using JSON is probably better.

danobi · 2024-01-28T20:21:31Z

tests/runtime/delete_wildcard

+EXPECT @map\[2, 20, 12, 0\]: 2
+TIMEOUT 5
+
+NAME delete_wildcard_before_usage
+PROG END { delete(@map[10, *]); } BEGIN {@map[10, 20] = 1; @map[20, 10] = 2; exit();}
+EXPECT @map\[20, 10\]: 2


Same as above

danobi · 2024-01-28T20:21:32Z

tests/semantic_analyser.cpp

+TEST(semantic_analyser, call_delete_wildcard)
+{
+  test("kprobe:f { @x[0, 1] = 1; delete(@x[*, 1]); }", 0);
+  test("kprobe:f { @x[0, 1] = 1; $y = 1; delete(@x[* ,$y]); }", 0);


Add a successful test case for:

trailing wildcard (ie delete(@x[1, *]))

multiple wildcards (ie delete(@x[*, 1, *]))

danobi · 2024-01-28T20:21:32Z

tests/semantic_analyser.cpp

+  test("kprobe:f { @x[0, 1] = 1; @y = delete(@x[10, *]); }", 1);
+  test("kprobe:f { @x[0, 1] = 1; @[delete(@x[10, *])] = 1; }", 1);
+  test("kprobe:f { @x[0, 1] = 1; if(delete(@x[10, *])) { 123 } }", 10);
+  test("kprobe:f { @x[0, 1] = 1; delete(@x[10, *]) ? 0 : 1; }", 10);


Add a negative test case for lone wildcard (ie delete(@x[*])) - should error and say "use clear()` isntead ?

danobi · 2024-01-28T20:21:32Z

tests/semantic_analyser.cpp

+{
+  test("kprobe:f { @x[0, 1] = 1; @x[*, 1] = 2; }", 1);
+  test("kprobe:f { @x[0, 1] = 1; print(@x[*, 1]); }", 1);
+  test("kprobe:f { @x[0, 1] = 1; $y = @x[*, 1]; }", 10);


Also a test case for undeduced map type? ie a lone delete(@[*, 1]) without any other usages to declare map type

danobi · 2024-01-28T20:21:33Z

src/parser.yy

Perhaps also add parser tests? Would be good to see some validation around negative test cases like:

delete(@["*"]) delete(@['*']) delete(@[**])

The first one will parse successfully as "*" is a string which is a valid map key. The latter two should fail so I'll add tests for them.

viktormalik · 2024-01-29T11:42:13Z

One pretty major issue I've found it that this doesn't work! I haven't looked into why yet.

Delete doesn't delete:
$ sudo ./src/bpftrace -e 'BEGIN { @x[1,1] = 1; delete(@x[1,*]); exit(); }'
Attaching 1 probe...


@x[1, 1]: 1

Damn, stupid mistake in LLVM IR. And shame on me for not checking it manually and relying solely on the tests.

The runtime tests should have caught this, but they are passing because they are not checking for the right thing. The EXPECT clause looks for a match, which it finds, and it just ignores the extra elements in the output that haven't been deleted.

This is not a good behaviour. I'm thinking about introducing a new directive EXPECT_NOT for checking that the given regex is not in the output.

viktormalik · 2024-01-30T08:04:21Z

Thanks for the reviews @ajor and @danobi! I believe I addressed all of your comments (except for adding semantic tests for particular error messages). I didn't resolve any comments so that you can double check.

I added a bunch of new semantic analyser and parser tests and changed runtime tests to use json output to be able to verify that map entries are truly missing.

danobi · 2024-01-31T20:10:12Z

codeql warning looks legit. Seems is_quoted_type() is missing map_wildcard

danobi · 2024-01-31T20:13:07Z

docs/reference_guide.md

missing man page update? I (maybe we?) should really finish that conversion...

Yeah, you're right, I updated it. This is really annoying though, we should make the conversion a priority. Let's discuss at the office hours.

viktormalik · 2024-02-01T09:10:59Z

clang-format failure is fixed by #2978.

@Map

This commit implements a new syntax, delete(@Map[*, 10]) which allows partial deletes from maps, all within bpf. It currently supports only integer values as concrete values. Example usage: ``` @Map[10, 10] = 1; @Map[20, 10] = 2; @Map[20, 1] = 3; delete(@Map[*, 10]); // only @Map[20, 1] is left here. ```

danobi · 2024-05-24T16:14:18Z

Do we still need this now that looping over map elements is supported?

viktormalik requested review from ajor, danobi and fbs as code owners January 15, 2024 08:57

viktormalik force-pushed the for_each branch from e581850 to 6b25d4a Compare January 15, 2024 08:58

viktormalik mentioned this pull request Jan 15, 2024

Filter on arbitrary level of map key #2267

Open

github-advanced-security bot found potential problems Jan 15, 2024

View reviewed changes

src/ast/passes/codegen_llvm.cpp Dismissed Show dismissed Hide dismissed

src/ast/passes/codegen_llvm.cpp Dismissed Show dismissed Hide dismissed

src/ast/passes/codegen_llvm.cpp Dismissed Show dismissed Hide dismissed

ajor mentioned this pull request Jan 19, 2024

Support of buffering string output #2938

Open

ajor self-assigned this Jan 19, 2024

danobi self-assigned this Jan 23, 2024

danobi reviewed Jan 24, 2024

View reviewed changes

viktormalik mentioned this pull request Jan 25, 2024

Provide method for looping over map elements #2955

Open

ajor requested changes Jan 25, 2024

View reviewed changes

danobi reviewed Jan 28, 2024

View reviewed changes

viktormalik force-pushed the for_each branch from 6b25d4a to bde7eb6 Compare January 30, 2024 08:04

danobi reviewed Jan 31, 2024

View reviewed changes

viktormalik force-pushed the for_each branch from bde7eb6 to 2d8773f Compare February 1, 2024 08:56

viktormalik force-pushed the for_each branch from 2d8773f to ff7d7d6 Compare February 1, 2024 11:41

danobi added the waiting-on-author label Mar 28, 2024

		size_t wildcards = 0;
		size_t concrete_idx = 0; // only makes sense if wildcards != 0

		@@ -0,0 +1,14 @@
		NAME delete_wildcard_idx_0
		PROG BEGIN { @map[10, 10] = 1; @map[2, 20] = 2; @map[10, 11] = 1; delete(@map[10, *]); print(@map); exit(); } END {}

	PROG BEGIN { @map[10, 0, 1241, 0xcafe] = 1; @map[2, 20, 12, 0] = 2; @map[10, 192, 23, 0xcafe] = 1; delete(@map[, , *, 0xcafe]); print(@map); exit(); } END {}
	PROG BEGIN { @map[10, 0, 1241, 0xcafe] = 1; @map[2, 20, 12, 0] = 2; @map[10, 192, 23, 0xcafe] = 1; delete(@map[, , *, 0xcafe]); print(@map); exit(); }

Implement delete() with wildcards via for_each_elem #2922

Are you sure you want to change the base?

Implement delete() with wildcards via for_each_elem #2922

Conversation

viktormalik commented Jan 15, 2024

Checklist

danobi left a comment

Choose a reason for hiding this comment

Checklist

Choose a reason for hiding this comment

danobi commented Jan 24, 2024

viktormalik commented Jan 25, 2024

ajor left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

danobi commented Jan 25, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

viktormalik commented Jan 29, 2024

viktormalik commented Jan 30, 2024

danobi commented Jan 31, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

viktormalik commented Feb 1, 2024

danobi commented May 24, 2024