Set the rule handle after flush #88

alexispires · 2019-12-26T11:47:10Z

Currently this very basic and useful workflow is not possible:

r1 := c.AddRule(&nftables.Rule{
	Table: filter,
	Chain: prerouting,
	Exprs: []expr.Any{
		&expr.Verdict{
			// [ immediate reg 0 drop ]
			Kind: expr.VerdictDrop,
		},
	},
})

c.Flush()

c.DelRule(r1)

c.Flush()

The solution:
Because of NLM_F_ECHO all netlink messages in the transaction are returned back. In the case of rules, there are returned back with Handle. Rules are linked to requests by keeping the index of netlink request of a rule. We rely on sequence number to retrieve the right request when receiving a netlink response.

What does the solution allow?

Manipulating rules without fetching and processing rules before
Adding comment to rules, because we don't need user data to retrieve rules anymore

sbezverk · 2019-12-26T15:36:24Z

@alexispires check out this library github.com/sbezverk/nftableslib it is based on github.com/google/nftables and it allows to do easily what you need.

sbezverk · 2019-12-26T19:20:06Z

I think this change would break 1 important use case, when Flush() is called only once for everything, chains, rules, sets etc. We have a use case when everything gets prepared and then programmed in a single shot. I do not see how proposed implementation would address such case.

alexispires · 2019-12-26T19:38:06Z

Look at my test function, I send 1 table, 1 chain and 2 rules in a single transaction

sbezverk · 2019-12-26T19:53:04Z

When there are several chains, multiple rules how do you insure correspondence with returned handle and rule? It seems you rely only on sequence, I do not think it is reliable enough. Still it is better to extend functionality which ensures backward compatibility and not change with breakage.

alexispires · 2019-12-27T16:20:10Z

I understand your concern. The same problem appeared to nft tool and the same assumptions were made, which makes them acceptable to me.

Look at this commit : http://git.netfilter.org/nftables/commit/?id=bb32d8db9a125d9676f87866e48ffbf0221ec16a

The basic principle is to not return a JSON object freshly created from
netlink responses, but just update the existing user-provided one to
make sure callers get back exactly what they expect.
To achieve that, keep the parsed JSON object around in a global variable
('cur_root') and provide a custom callback to insert handles into it
from received netlink messages. The tricky bit here is updating rules
since unique identification is problematic. Therefore drop possibly
present handles from input and later assume updates are received in
order so the first rule not having a handle set is the right one.

The proposal to extend the functionality bothers me since it introduces a behavior specific to the go implementation. The notion of Handle is already used to identify the rules and the echo function exist mainly to return the Handle.

Look at this commit too: http://git.netfilter.org/nftables/commit/?id=b99c4d072d9969f7a0dfc539b2b68b517f90af68

sbezverk · 2019-12-27T16:55:46Z

The work you quoted geared towards less dynamic and more static implementations, imho. My focus is in dynamic, concurrent use uses, example kube-proxy using nftables. Here is scenario when this approach pretty much guarantees the issue, there are 2 processes handling ipv4 and ipv6 tables concurrently, there is a possibility that ipv4 adn ipv6 rules are put in the same messages queue for Flush to program, if you cannot identify precisely which rule handle corresponds to which rule, then it is a matter of chance before they get mixed up. I think using user defined field to be able to identify a particular rule and its id is a safe and reliable way covering all possible use cases, well at least none of use cases presented in the past were not covered by this approach.

sbezverk · 2019-12-27T17:09:25Z

The same problem appeared to nft tool and the same assumptions were made, which makes them acceptable to me.

It is a public and shared library, what is acceptable to you might not work for others, let's find a solution which is not breaking the current behaviour and providing you with what you need for your use case.

conn.go

stapelberg · 2020-01-01T15:57:27Z

I’m not sure yet whether we should support this behavior.

For robustness, programs should generally do idempotent operations where possible. Changing the program from its current “modify/modify” approach to a “modify, query/modify” approach lends itself better to making the program work correctly when interrupted.

Can you share some more details regarding what you’re trying to do?

alexispires · 2020-01-02T09:48:09Z

Today there's only two way to delete or replace a rule with this library:

Using UserData and put a custom userland id (not standard), then fetch all rules and retrieve it
Fetch all rules and compare field by field to retrieve the right rule

IMO the right way is to get back the handle after flush. I can't tell much about what I'm trying to do (commercial product) but it's like a SDN application. I store in memory inside a container (so not problem with interruption it's almost stateless) all my rules to delete them after.

I'm trying to code it in a strong way by matching the sequence number of request (rule by rule) with sequence of the response wich contain handle. Maybe it can be stronger.

alexispires · 2020-01-03T09:38:55Z

@stapelberg @sbezverk
I'm trying to improve my loop control and I'm not sure about the best way to accomplish it. By using strace I see that nftables do a select syscall to detect if a netlink message is going to be received. Do you think this is the right way ?

select(4, [3], NULL, NULL, {tv_sec=0, tv_usec=0}) = 1 (in [3], left {tv_sec=0, tv_usec=0})
recvmsg(......)
select(4, [3], NULL, NULL, {tv_sec=0, tv_usec=0}) = 0 (Timeout)

alexispires · 2020-01-04T01:06:53Z

I made a proto with select, what do you think?

conn_linux.go (from github.com/mdlayher/netlink)

func (c *conn) Select() (int, error) {

	var fdSet unix.FdSet
	fdSet.Zero()
	fdSet.Set(c.s.FD())

	n, err := unix.Select(c.s.FD()+1, &fdSet, nil, nil, &unix.Timeval{})

	return n, err
}

conn.go (from github.com/google/nftables)

        smsg, err := conn.SendMessages(batch(cc.messages))

	if err != nil {
		return fmt.Errorf("SendMessages: %w", err)
	}

	// Retrieving of seq number associated to entities
	entitiesBySeq := make(map[uint32]Entity)
	for i, e := range cc.entities {
		entitiesBySeq[smsg[i].Header.Sequence] = e
	}

	// Trigger entities callback
	for i := 1; i > 0; i, _ = conn.Select() {
		rmsg, err := conn.Receive()

		if err != nil {
			return fmt.Errorf("Receive: %w", err)
		}

		for _, msg := range rmsg {
			if e, ok := entitiesBySeq[msg.Header.Sequence]; ok {
				e.HandleResponse(msg)
			}
		}
	}

conn.go

alexispires · 2020-01-06T14:09:03Z

I just added a more robust test suites (4096 rules added in // by 16 workers) with assert which rely on UserData to check Handle value.

rule.go

alexispires · 2020-01-08T13:43:24Z

Cleaner version below, but both works and are reliable.

diff --git a/conn.go b/conn.go
index 6a552d8..34cf0b9 100644
--- a/conn.go
+++ b/conn.go
@@ -17,7 +17,6 @@ package nftables
 import (
 	"fmt"
 	"sync"
-	"sync/atomic"
 
 	"github.com/google/nftables/expr"
 	"github.com/mdlayher/netlink"
@@ -41,7 +40,7 @@ type Conn struct {
 	sync.Mutex
 	put      sync.Mutex
 	messages []netlink.Message
-	entities map[int32]Entity
+	entities map[int]Entity
 	it       int32
 	err      error
 }
@@ -52,7 +51,6 @@ func (cc *Conn) Flush() error {
 	defer func() {
 		cc.messages = nil
 		cc.entities = nil
-		cc.it = 0
 		cc.Unlock()
 	}()
 	if len(cc.messages) == 0 {
@@ -71,7 +69,7 @@ func (cc *Conn) Flush() error {
 
 	cc.endBatch(cc.messages)
 
-	_, err = conn.SendMessages(cc.messages[:cc.it+1])
+	_, err = conn.SendMessages(cc.messages)
 
 	if err != nil {
 		return fmt.Errorf("SendMessages: %w", err)
@@ -104,36 +102,29 @@ func (cc *Conn) Flush() error {
 }
 
 // PutMessage store netlink message to sent after
-func (cc *Conn) PutMessage(msg netlink.Message) int32 {
+func (cc *Conn) PutMessage(msg netlink.Message) int {
 	cc.put.Lock()
 	defer cc.put.Unlock()
 
 	if cc.messages == nil {
-		cc.messages = make([]netlink.Message, 16)
-		cc.messages[0] = netlink.Message{
+		cc.messages = append(cc.messages, netlink.Message{
 			Header: netlink.Header{
 				Type:  netlink.HeaderType(unix.NFNL_MSG_BATCH_BEGIN),
 				Flags: netlink.Request,
 			},
 			Data: extraHeader(0, unix.NFNL_SUBSYS_NFTABLES),
-		}
-	}
-
-	i := atomic.AddInt32(&cc.it, 1)
-
-	if len(cc.messages) <= int(i) {
-		cc.messages = resize(cc.messages)
+		})
 	}
 
-	cc.messages[i] = msg
+	cc.messages = append(cc.messages, msg)
 
-	return i
+	return len(cc.messages) - 1
 }
 
 // PutEntity store entity to relate to netlink response
-func (cc *Conn) PutEntity(i int32, e Entity) {
+func (cc *Conn) PutEntity(i int, e Entity) {
 	if cc.entities == nil {
-		cc.entities = make(map[int32]Entity)
+		cc.entities = make(map[int]Entity)
 	}
 	cc.entities[i] = e
 }
@@ -214,23 +205,14 @@ func (cc *Conn) marshalExpr(e expr.Any) []byte {
 
 func (cc *Conn) endBatch(messages []netlink.Message) {
 
-	i := atomic.AddInt32(&cc.it, 1)
-
-	if len(cc.messages) <= int(i) {
-		cc.messages = resize(cc.messages)
-	}
+	cc.put.Lock()
+	defer cc.put.Unlock()
 
-	cc.messages[i] = netlink.Message{
+	cc.messages = append(cc.messages, netlink.Message{
 		Header: netlink.Header{
 			Type:  netlink.HeaderType(unix.NFNL_MSG_BATCH_END),
 			Flags: netlink.Request,
 		},
 		Data: extraHeader(0, unix.NFNL_SUBSYS_NFTABLES),
-	}
-}
-
-func resize(messages []netlink.Message) []netlink.Message {
-	new := make([]netlink.Message, cap(messages)*2)
-	copy(new, messages)
-	return new
+	})
 }

sbezverk · 2020-01-08T16:18:56Z

@alexispires could you please demonstrate that with this approach you can create rules with a reference to anonymous set. The reason for this request is the fact that creation of anonymous set must happen within the same transaction of a rule creation. It is not clear to me if it works with this solution.

alexispires · 2020-01-08T16:43:36Z

@alexispires could you please demonstrate that with this approach you can create rules with a reference to anonymous set. The reason for this request is the fact that creation of anonymous set must happen within the same transaction of a rule creation. It is not clear to me if it works with this solution.

I'm not sure to understand, there are tests for this case (TestCreateUseAnonymousSet), you mean if handle ref is still retrieved ?

sbezverk · 2020-01-08T16:49:14Z

@alexispires The test case only tests the sequence of bytes that gets sent, not the actual kernel/netfilter programming, unfortunately travis does not allow this type of "e2e" tests. I was asking you a favour, to build an actual simple binary, based on your change that programs a rule with a reference to an anonymous set and then run 'nft list table {your table name}' to confirm it.

alexispires · 2020-01-08T16:55:01Z

@sbezverk ok no problem, i'll come back to you

alexispires · 2020-01-08T17:43:32Z

It looks good:

Before flush rule Handle is 0
After flush rule Handle is 3
table ip filter {
        chain nfpoc {
                type filter hook postrouting priority 0; policy accept;
                tcp dport { tftp, 1163 } drop # handle 3
        }
}

Code (also in my github, with binary included for x64: https://github.com/alexispires/nfpoc):

package main

import (
	"fmt"
	"os/exec"

	"github.com/google/nftables"
	"github.com/google/nftables/binaryutil"
	"github.com/google/nftables/expr"
	"golang.org/x/sys/unix"
)

func main() {

	c := &nftables.Conn{}

	table := c.AddTable(&nftables.Table{
		Family: nftables.TableFamilyIPv4,
		Name:   "filter",
	})

	chain := c.AddChain(&nftables.Chain{
		Name:     "nfpoc",
		Table:    table,
		Type:     nftables.ChainTypeFilter,
		Hooknum:  nftables.ChainHookPostrouting,
		Priority: nftables.ChainPriorityFilter,
	})

	set := &nftables.Set{
		Anonymous: true,
		Constant:  true,
		Table:     table,
		KeyType:   nftables.TypeInetService,
	}

	if err := c.AddSet(set, []nftables.SetElement{
		{Key: binaryutil.BigEndian.PutUint16(69)},
		{Key: binaryutil.BigEndian.PutUint16(1163)},
	}); err != nil {
		fmt.Printf("c.AddSet() failed: %s", err.Error())
	}

	r := c.AddRule(&nftables.Rule{
		Table: table,
		Chain: chain,
		Exprs: []expr.Any{
			// [ meta load l4proto => reg 1 ]
			&expr.Meta{Key: expr.MetaKeyL4PROTO, Register: 1},
			// [ cmp eq reg 1 0x00000006 ]
			&expr.Cmp{
				Op:       expr.CmpOpEq,
				Register: 1,
				Data:     []byte{unix.IPPROTO_TCP},
			},

			// [ payload load 2b @ transport header + 2 => reg 1 ]
			&expr.Payload{
				DestRegister: 1,
				Base:         expr.PayloadBaseTransportHeader,
				Offset:       2,
				Len:          2,
			},
			// [ lookup reg 1 set __set%d ]
			&expr.Lookup{
				SourceRegister: 1,
				SetName:        set.Name,
				SetID:          set.ID,
			},
			// [ immediate reg 0 drop ]
			&expr.Verdict{
				Kind: expr.VerdictDrop,
			},
		},
	})

	fmt.Printf("Before flush rule Handle is %d\n", r.Handle)

	if err := c.Flush(); err != nil {
		fmt.Printf(err.Error())
	}

	fmt.Printf("After flush rule Handle is %d\n", r.Handle)

	out, err := exec.Command("/usr/sbin/nft", "-a", "list", "table", "filter").Output()
	if err != nil {
		fmt.Printf(err.Error())
	}
	fmt.Printf("%s\n", out)
}

alexispires · 2020-01-13T13:21:10Z

@stapelberg do you need more informations?

stapelberg

Thanks! First review pass done

conn.go

rule.go

stapelberg · 2020-01-14T16:02:13Z

conn.go

-		return fmt.Errorf("Receive: %w", err)
+	// Retrieving of seq number associated to entities
+	entitiesBySeq := make(map[uint32]Entity)
+	for i, e := range cc.entities {


Instead of the separate entites book-keeping, why not just iterate over cc.messages and check if the type implements HandleResponse?

for _, msg := range cc.messages { m, ok := msg.(interface { HandleResponse(netlink.Message) }) if !ok { continue } m.HandleResponse(rmsg) }

HandleResponse is not a method that belong to netlink.Message. It's belong to entity.
For a given entity, it's perform some operations on it by using the netlink response. We have to connect an entity (like a Rule) and a netlink message in case of the slice of responses is unordered. I connect them with the netlink sequence.

Okay. I’m still not particularly happy about having to manage messages and entities, which are separate but connected.

Instead of the current approach, we could make messages take an interface type which has a NetlinkMessage method. putMessages would then wrap a netlink.Message in a type which implements NetlinkMessage, and can optionally implement HandleResponse as well.

@stapelberg I'm not sure to understand what do you have in mind but the main limit I have is that the SendMessages method of netlink library take a slice of netlink.Message and not a slice of netlink.Message's pointer. So the only data struct that give me the netlink sequence number is the slice of netlink.Message returned by sendMessages. So the slice of netlink.Message have to be connected in one way or another to something else. In your solution I have to connect the type that wrap netlink.Message with the netlink.Message that contains the seq even if both are the same netlink message.

I’m a little confused now: you’re saying the sequence numbers are only set in the []netlink.Message that’s returned by (*netlink.Conn).SendMessages, but in your PR, you’re discarding that result:

if _, err = conn.SendMessages(cc.messages); err != nil {

Is that a bug in your PR, or am I misunderstanding?

(Independently, related to your question: the fact that the netlink package takes a []netlink.Message instead of []*netlink.Message doesn’t limit us in which data structures we want to use. If need be, we can do a copy from one to the other. A netlink.Message’s Header is just a few ints, and copying the Body byte slice won’t need to copy its contents.)

nftables_test.go

stapelberg · 2020-01-14T16:03:18Z

nftables_test.go

+			}
+		}
+	}
+


Why is this test running many workers concurrently? In other words: why is not sufficient to test with 1 worker?

To test the behaviors with concurrency as exprimed here: #88 (comment)
IMO it's safer to keep it to identify regression on concurent access. But it's not specific on this part of lib, I think concurrency have to be tested on the whole lib.

conn.go

sbezverk · 2020-01-15T14:46:13Z

@alexispires @stapelberg Can we preserve old Flush() call the way it is now, but new changes implement in something like FlushWithCallback. I will not be able to switch to new functionality right away, maybe eventually after running some performance tests.

alexispires · 2020-01-18T18:23:27Z

@alexispires @stapelberg Can we preserve old Flush() call the way it is now, but new changes implement in something like FlushWithCallback. I will not be able to switch to new functionality right away, maybe eventually after running some performance tests.

I let @stapelberg decide about it, but what do you suggest as API ?

sbezverk · 2020-01-18T18:39:48Z

@alexispires I was suggesting to have old API and new API in parallel, it will help to preserve the backward compatibility and give us a chance to move to new API when confidence/timing is right.

stapelberg · 2020-01-21T07:42:12Z

conn.go

+	TestDial   nltest.Func // for testing only; passed to nltest.Dial
+	NetNS      int         // Network namespace netlink will interact with.
+	entities   map[int]Entity
+	messagesMu sync.Mutex


Move messagesMu and messages down and separate them with a newline (to form a group of fields):

type Conn struct { TestDial nltest.Func // … entities map[int]Entity messagesMu sync.Mutex messages []netlink.Message }

stapelberg · 2020-01-21T07:47:57Z

conn.go

+	}
+
+	// Trigger entities callback
+	msg, err := cc.checkReceive(conn)


Why call checkReceive instead of just calling conn.Receive? It’s not clear to me whether you’re blocking until the next message is available, or only reading as long as messages are available. If the latter, that doesn’t seem sound to me—instead, we should read blockingly until we can identify the end of the response.

It's to read as long as message are available. I'm not sure how to detect the end of the response without select.

Here an example of batch with echo flag:

sendmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[{iov_base=[{{len=20, type=NFNL_MSG_BATCH_BEGIN, flags=NLM_F_REQUEST, seq=3, pid=0}, "\x00\x00\x0a\x00"}, {{len=36, type=NFNL_SUBSYS_NFTABLES<<8|NFT_MSG_NEWTABLE, flags=NLM_F_REQUEST|NLM_F_ECHO, seq=4, pid=0}, "\x02\x00\x00\x00\x08\x00\x01\x00\x6e\x61\x74\x00\x08\x00\x02\x00\x00\x00\x00\x00"}, {{len=84, type=NFNL_SUBSYS_NFTABLES<<8|NFT_MSG_NEWCHAIN, flags=NLM_F_REQUEST|NLM_F_ECHO|NLM_F_CREATE, seq=5, pid=0}, "\x02\x00\x00\x00\x08\x00\x01\x00\x6e\x61\x74\x00\x0f\x00\x03\x00\x70\x72\x65\x72\x6f\x75\x74\x69\x6e\x67\x00\x00\x14\x00\x04\x80\x08\x00\x01\x00\x00\x00\x00\x00\x08\x00\x02\x00\x00\x00\x00\x00\x08\x00\x05\x00\x00\x00\x00\x01\x0b\x00\x07\x00\x66\x69\x6c\x74\x65\x72\x00\x00"}, {{len=84, type=NFNL_SUBSYS_NFTABLES<<8|NFT_MSG_NEWCHAIN, flags=NLM_F_REQUEST|NLM_F_ECHO|NLM_F_CREATE, seq=6, pid=0}, "\x02\x00\x00\x00\x08\x00\x01\x00\x6e\x61\x74\x00\x10\x00\x03\x00\x70\x6f\x73\x74\x72\x6f\x75\x74\x69\x6e\x67\x00\x14\x00\x04\x80\x08\x00\x01\x00\x00\x00\x00\x04\x08\x00\x02\x00\x00\x00\x00\x64\x08\x00\x05\x00\x00\x00\x00\x01\x0b\x00\x07\x00\x66\x69\x6c\x74\x65\x72\x00\x00"}, {{len=96, type=NFNL_SUBSYS_NFTABLES<<8|NFT_MSG_NEWSET, flags=NLM_F_REQUEST|NLM_F_ECHO|NLM_F_CREATE, seq=7, pid=0}, "\x02\x00\x00\x00\x08\x00\x01\x00\x6e\x61\x74\x00\x0c\x00\x02\x00\x5f\x5f\x73\x65\x74\x25\x64\x00\x08\x00\x03\x00\x00\x00\x00\x03\x08\x00\x04\x00\x00\x00\x00\x07\x08\x00\x05\x00\x00\x00\x00\x04\x08\x00\x0a\x00\x00\x00\x00\x04\x0c\x00\x09\x80\x08\x00\x01\x00\x00\x00\x00\x04\x0a\x00\x0d\x00\x00\x04\x02\x00\x00\x00\x00\x00"}, {{len=116, type=NFNL_SUBSYS_NFTABLES<<8|NFT_MSG_NEWSETELEM, flags=NLM_F_REQUEST|NLM_F_ECHO|NLM_F_CREATE, seq=7, pid=0}, "\x02\x00\x00\x00\x0c\x00\x02\x00\x5f\x5f\x73\x65\x74\x25\x64\x00\x08\x00\x04\x00\x00\x00\x00\x04\x08\x00\x01\x00\x6e\x61\x74\x00\x44\x00\x03\x80\x10\x00\x01\x80\x0c\x00\x01\x80\x08\x00\x01\x00\x54\x4d\x28\x84\x10\x00\x02\x80\x0c\x00\x01\x80\x08\x00\x01\x00\xb0\x1f\x35\x63\x10\x00\x03\x80\x0c\x00\x01\x80\x08\x00\x01\x00\x51\x13\x60\x94\x10\x00\x04\x80\x0c\x00\x01\x80\x08\x00\x01\x00\x8a\x64\x3e\x08"}, {{len=168, type=NFNL_SUBSYS_NFTABLES<<8|NFT_MSG_NEWRULE, flags=NLM_F_REQUEST|NLM_F_ECHO|NLM_F_EXCL|NLM_F_CREATE|NLM_F_APPEND, seq=8, pid=0}, "\x02\x00\x00\x00\x08\x00\x01\x00\x6e\x61\x74\x00\x0f\x00\x02\x00\x70\x72\x65\x72\x6f\x75\x74\x69\x6e\x67\x00\x00\x7c\x00\x04\x80\x34\x00\x01\x80\x0c\x00\x01\x00\x70\x61\x79\x6c\x6f\x61\x64\x00\x24\x00\x02\x80\x08\x00\x01\x00\x00\x00\x00\x01\x08\x00\x02\x00\x00\x00\x00\x01\x08\x00\x03\x00\x00\x00\x00\x0c\x08\x00\x04\x00\x00\x00\x00\x04\x30\x00\x01\x80\x0b\x00\x01\x00\x6c\x6f\x6f\x6b\x75\x70\x00\x00\x20\x00\x02\x80\x08\x00\x02\x00\x00\x00\x00\x01\x0c\x00\x01\x00\x5f\x5f\x73\x65\x74\x25\x64\x00\x08\x00\x04\x00\x00\x00\x00\x04\x14\x00\x01\x80\x0c\x00\x01\x00\x63\x6f\x75\x6e\x74\x65\x72\x00\x04\x00\x02\x80"}, {{len=20, type=NFNL_MSG_BATCH_END, flags=NLM_F_REQUEST, seq=9, pid=0}, "\x00\x00\x0a\x00"}], iov_len=624}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 624 select(4, [3], NULL, NULL, {tv_sec=0, tv_usec=0}) = 1 (in [3], left {tv_sec=0, tv_usec=0}) recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=0x000040}, msg_namelen=12, msg_iov=[{iov_base={{len=104, type=NFNL_SUBSYS_NFTABLES<<8|NFT_MSG_NEWCHAIN, flags=0, seq=5, pid=3583}, "\x02\x00\x00\x07\x08\x00\x01\x00\x6e\x61\x74\x00\x0c\x00\x02\x00\x00\x00\x00\x00\x00\x00\x00\x01\x0f\x00\x03\x00\x70\x72\x65\x72\x6f\x75\x74\x69\x6e\x67\x00\x00\x14\x00\x04\x00\x08\x00\x01\x00\x00\x00\x00\x00\x08\x00\x02\x00\x00\x00\x00\x00\x08\x00\x05\x00\x00\x00\x00\x01\x0b\x00\x07\x00\x66\x69\x6c\x74\x65\x72\x00\x00\x08\x00\x06\x00\x00\x00\x00\x04"}, iov_len=4096}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 104 fstat(1, {st_dev=makedev(0, 22), st_ino=3, st_mode=S_IFCHR|0620, st_nlink=1, st_uid=1000, st_gid=5, st_blksize=1024, st_blocks=0, st_rdev=makedev(136, 0), st_atime=1579772816 /* 2020-01-23T10:46:56.597292734+0100 */, st_atime_nsec=597292734, st_mtime=1579772816 /* 2020-01-23T10:46:56.597292734+0100 */, st_mtime_nsec=597292734, st_ctime=1579772494 /* 2020-01-23T10:41:34.597292734+0100 */, st_ctime_nsec=597292734}) = 0 select(4, [3], NULL, NULL, {tv_sec=0, tv_usec=0}) = 1 (in [3], left {tv_sec=0, tv_usec=0}) recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=0x000040}, msg_namelen=12, msg_iov=[{iov_base={{len=104, type=NFNL_SUBSYS_NFTABLES<<8|NFT_MSG_NEWCHAIN, flags=0, seq=6, pid=3583}, "\x02\x00\x00\x07\x08\x00\x01\x00\x6e\x61\x74\x00\x0c\x00\x02\x00\x00\x00\x00\x00\x00\x00\x00\x02\x10\x00\x03\x00\x70\x6f\x73\x74\x72\x6f\x75\x74\x69\x6e\x67\x00\x14\x00\x04\x00\x08\x00\x01\x00\x00\x00\x00\x04\x08\x00\x02\x00\x00\x00\x00\x64\x08\x00\x05\x00\x00\x00\x00\x01\x0b\x00\x07\x00\x66\x69\x6c\x74\x65\x72\x00\x00\x08\x00\x06\x00\x00\x00\x00\x00"}, iov_len=4096}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 104 select(4, [3], NULL, NULL, {tv_sec=0, tv_usec=0}) = 1 (in [3], left {tv_sec=0, tv_usec=0}) recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=0x000040}, msg_namelen=12, msg_iov=[{iov_base={{len=100, type=NFNL_SUBSYS_NFTABLES<<8|NFT_MSG_NEWSET, flags=0, seq=7, pid=3583}, "\x02\x00\x00\x07\x08\x00\x01\x00\x6e\x61\x74\x00\x0b\x00\x02\x00\x5f\x5f\x73\x65\x74\x33\x00\x00\x0c\x00\x10\x00\x00\x00\x00\x00\x00\x00\x00\x09\x08\x00\x03\x00\x00\x00\x00\x03\x08\x00\x04\x00\x00\x00\x00\x07\x08\x00\x05\x00\x00\x00\x00\x04\x0a\x00\x0d\x00\x00\x04\x02\x00\x00\x00\x00\x00\x0c\x00\x09\x00\x08\x00\x01\x00\x00\x00\x00\x04"}, iov_len=4096}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 100 select(4, [3], NULL, NULL, {tv_sec=0, tv_usec=0}) = 1 (in [3], left {tv_sec=0, tv_usec=0}) recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=0x000040}, msg_namelen=12, msg_iov=[{iov_base={{len=60, type=NFNL_SUBSYS_NFTABLES<<8|NFT_MSG_NEWSETELEM, flags=0, seq=0, pid=3583}, "\x02\x00\x00\x07\x08\x00\x01\x00\x6e\x61\x74\x00\x0b\x00\x02\x00\x5f\x5f\x73\x65\x74\x33\x00\x00\x14\x00\x03\x00\x10\x00\x01\x00\x0c\x00\x01\x00\x08\x00\x01\x00\x54\x4d\x28\x84"}, iov_len=4096}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 60 select(4, [3], NULL, NULL, {tv_sec=0, tv_usec=0}) = 1 (in [3], left {tv_sec=0, tv_usec=0}) recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=0x000040}, msg_namelen=12, msg_iov=[{iov_base={{len=60, type=NFNL_SUBSYS_NFTABLES<<8|NFT_MSG_NEWSETELEM, flags=0, seq=0, pid=3583}, "\x02\x00\x00\x07\x08\x00\x01\x00\x6e\x61\x74\x00\x0b\x00\x02\x00\x5f\x5f\x73\x65\x74\x33\x00\x00\x14\x00\x03\x00\x10\x00\x01\x00\x0c\x00\x01\x00\x08\x00\x01\x00\xb0\x1f\x35\x63"}, iov_len=4096}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 60 select(4, [3], NULL, NULL, {tv_sec=0, tv_usec=0}) = 1 (in [3], left {tv_sec=0, tv_usec=0}) recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=0x000040}, msg_namelen=12, msg_iov=[{iov_base={{len=60, type=NFNL_SUBSYS_NFTABLES<<8|NFT_MSG_NEWSETELEM, flags=0, seq=0, pid=3583}, "\x02\x00\x00\x07\x08\x00\x01\x00\x6e\x61\x74\x00\x0b\x00\x02\x00\x5f\x5f\x73\x65\x74\x33\x00\x00\x14\x00\x03\x00\x10\x00\x01\x00\x0c\x00\x01\x00\x08\x00\x01\x00\x51\x13\x60\x94"}, iov_len=4096}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 60 select(4, [3], NULL, NULL, {tv_sec=0, tv_usec=0}) = 1 (in [3], left {tv_sec=0, tv_usec=0}) recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=0x000040}, msg_namelen=12, msg_iov=[{iov_base={{len=60, type=NFNL_SUBSYS_NFTABLES<<8|NFT_MSG_NEWSETELEM, flags=0, seq=0, pid=3583}, "\x02\x00\x00\x07\x08\x00\x01\x00\x6e\x61\x74\x00\x0b\x00\x02\x00\x5f\x5f\x73\x65\x74\x33\x00\x00\x14\x00\x03\x00\x10\x00\x01\x00\x0c\x00\x01\x00\x08\x00\x01\x00\x8a\x64\x3e\x08"}, iov_len=4096}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 60 select(4, [3], NULL, NULL, {tv_sec=0, tv_usec=0}) = 1 (in [3], left {tv_sec=0, tv_usec=0}) recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=0x000040}, msg_namelen=12, msg_iov=[{iov_base={{len=204, type=NFNL_SUBSYS_NFTABLES<<8|NFT_MSG_NEWRULE, flags=0, seq=8, pid=3583}, "\x02\x00\x00\x07\x08\x00\x01\x00\x6e\x61\x74\x00\x0f\x00\x02\x00\x70\x72\x65\x72\x6f\x75\x74\x69\x6e\x67\x00\x00\x0c\x00\x03\x00\x00\x00\x00\x00\x00\x00\x00\x0a\x94\x00\x04\x00\x34\x00\x01\x00\x0c\x00\x01\x00\x70\x61\x79\x6c\x6f\x61\x64\x00\x24\x00\x02\x00\x08\x00\x01\x00\x00\x00\x00\x01\x08\x00\x02\x00\x00\x00\x00\x01\x08\x00\x03\x00\x00\x00\x00\x0c\x08\x00\x04\x00\x00\x00\x00\x04\x30\x00\x01\x00\x0b\x00\x01\x00\x6c\x6f\x6f\x6b\x75\x70\x00\x00\x20\x00\x02\x00\x0b\x00\x01\x00\x5f\x5f\x73\x65\x74\x33\x00\x00\x08\x00\x02\x00\x00\x00\x00\x01\x08\x00\x05\x00\x00\x00\x00\x00\x2c\x00\x01\x00\x0c\x00\x01\x00\x63\x6f\x75\x6e\x74\x65\x72\x00\x1c\x00\x02\x00\x0c\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x0c\x00\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00"}, iov_len=4096}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 204 select(4, [3], NULL, NULL, {tv_sec=0, tv_usec=0}) = 0 (Timeout)

How about setting NLM_F_ACK or NLM_F_ECHO on the batch start and batch end messages? That way, you should be able to read until you encounter the sequence number for the batch end message.

stapelberg · 2020-01-21T07:51:51Z

conn.go

-		return fmt.Errorf("Receive: %w", err)
+	// Retrieving of seq number associated to entities
+	entitiesBySeq := make(map[uint32]Entity)
+	for i, e := range cc.entities {


Okay. I’m still not particularly happy about having to manage messages and entities, which are separate but connected.

Instead of the current approach, we could make messages take an interface type which has a NetlinkMessage method. putMessages would then wrap a netlink.Message in a type which implements NetlinkMessage, and can optionally implement HandleResponse as well.

prologic · 2022-09-08T15:29:48Z

Damn I need this! Any changes we can get this merged? 🙏

prologic · 2022-09-08T15:44:10Z

I'm not familiar enough with the codebase to help resolve the merge conflicts :/

alexispires · 2022-09-09T09:20:44Z

There was some reluctance about it. I could continue this work if it's deemed useful.

prologic · 2022-09-09T20:17:04Z

There was some reluctance about it. I could continue this work if it's deemed useful.

Please do 🙏 I could really use the improvement you set out to fix here 🙇‍♂️

I think the comments from @stapelberg should be easily addressable 👌

prologic · 2022-09-10T01:46:02Z

@alexispires Just an FYI, this is where I want to use this: https://git.mills.io/prologic/box/issues/18 so it would be awesome if you could revive this PR of yours, rebase it and hopefully we can get it merged soon 🤞 Let me know if I can help in any way, I'm not that familiar with the codebase (let alone nftables 🤣) but my Go is "okay" 😅

alexispires · 2022-09-13T15:17:06Z

@alexispires Just an FYI, this is where I want to use this: https://git.mills.io/prologic/box/issues/18 so it would be awesome if you could revive this PR of yours, rebase it and hopefully we can get it merged soon 🤞 Let me know if I can help in any way, I'm not that familiar with the codebase (let alone nftables 🤣) but my Go is "okay" 😅

I'm still waiting opinion (if it changed) from the maintainers about this feature.

However you can fix your problem by doing as this:

Maintain an unique id for each rule.
When creating your rules, set the UserData field with the defined id.
After the flush, fetch all rules.
Retrieve the right rule by comparing the UserData field with the defined id.
Retrieve the handle from the rule retrieved. After that you can return it, as doing here : https://git.mills.io/prologic/box/src/branch/master/nat/nat.go#L305

prologic · 2022-09-13T22:36:15Z

I'm still waiting opinion (if it changed) from the maintainers about this feature.

Is that @stapelberg ?

However you can fix your problem by doing as this:

Maintain an unique id for each rule.

When creating your rules, set the UserData field with the defined id.

After the flush, fetch all rules.

Retrieve the right rule by comparing the UserData field with the defined id.

Retrieve the handle from the rule retrieved. After that you can return it, as doing here : https://git.mills.io/prologic/box/src/branch/master/nat/nat.go#L305

Ahh brilliant! I didn't know you could do this 😅

prologic · 2022-09-13T22:37:00Z

@alexispires If you at least rebase this PR I can also test it in my project 👌 I'm not sure I wouldn't screw up the merge conflicts myself 😂

stapelberg · 2022-09-29T06:45:19Z

It’s been a long time since I last looked at this, so reading up always takes some time. Maybe a summary in the first post with open questions/issues would be helpful.

I will note that I’m not against this feature per se, but we need to introduce it in a way that doesn’t cause breakage.

There seem to be still a number of unaddressed code review comments. To move this forward, rebasing the PR and replying to the comments would be the first step :)

fungaren · 2023-04-04T03:47:58Z

However you can fix your problem by doing as this:

Maintain an unique id for each rule.

When creating your rules, set the UserData field with the defined id.

After the flush, fetch all rules.

Retrieve the right rule by comparing the UserData field with the defined id.

Retrieve the handle from the rule retrieved. After that you can return it, as doing here : https://git.mills.io/prologic/box/src/branch/master/nat/nat.go#L305

Ahh brilliant! I didn't know you could do this 😅

Anyway, this PR is important for us. As for now, use this code as described above:

var errNotFound = errors.New("not found")

// When a new rule is added or inserted, we do not know its handle.
// Use this method to get its handle if you want to delete it later.
func FindRule(c *nftables.Conn, rule *nftables.Rule) (*nftables.Rule, error) {
	rules, err := c.GetRules(rule.Table, rule.Chain)
	if err != nil {
		return nil, err
	}
	for _, r := range rules {
		if bytes.Equal(r.UserData, rule.UserData) {
			for idx, e := range r.Exprs {
				if _, ok := e.(*expr.Lookup); ok {
					continue
				} else if !reflect.DeepEqual(e, rule.Exprs[idx]) {
					goto next
				}
			}
			return r, nil
		}
	next:
	}
	return nil, errNotFound
}

   rule := conn.InsertRule(&nftables.Rule{...})
   err := conn.Flush()
   if err != nil {
      return nil, err
   }
   rule, err = FindRule(conn, rule)
   if err != nil {
      return nil, err
   }

set handle back

8cb78f7

alexispires changed the title ~~set handle back~~ Set the rule handle after flush Dec 26, 2019

alexispires marked this pull request as ready for review December 26, 2019 11:52

apiresatos added 2 commits December 27, 2019 23:18

improve reliability

5a645a1

fix typo

b00522b

sbezverk reviewed Dec 27, 2019

View reviewed changes

conn.go Outdated Show resolved Hide resolved

disable test on travis

39f8fec

generic

87f28ce

apiresatos added 2 commits January 6, 2020 13:19

better loop control

994c20d

fix tests

3650f83

sbezverk reviewed Jan 6, 2020

View reviewed changes

conn.go Show resolved Hide resolved

sbezverk reviewed Jan 6, 2020

View reviewed changes

conn.go Outdated Show resolved Hide resolved

sbezverk reviewed Jan 6, 2020

View reviewed changes

conn.go Show resolved Hide resolved

improved tests

9dc224d

apiresatos added 2 commits January 6, 2020 15:19

errors handling

e906f33

allow handle test in travis

3aaad4c

alexispires mentioned this pull request Jan 6, 2020

Add select mdlayher/netlink#160

Closed

sbezverk reviewed Jan 6, 2020

View reviewed changes

rule.go Outdated Show resolved Hide resolved

apiresatos added 4 commits January 8, 2020 10:52

improve reliability

c578ee3

fix

4edcb60

fix 2

c8335d6

remove slice limit

a400c5d

cleaner way to put msg

0d3f3ff

stapelberg reviewed Jan 14, 2020

View reviewed changes

refactoring conn.go

0f16d39

stapelberg reviewed Jan 21, 2020

View reviewed changes

danopia mentioned this pull request Jul 24, 2020

Receive: netlink receive: recvmsg: no buffer space available #103

Closed

elmeyer mentioned this pull request Jun 21, 2021

Rule returned from AddRule does not contain Handle, therefore impossible to delete #125

Closed

Set the rule handle after flush #88

Are you sure you want to change the base?

Set the rule handle after flush #88

Conversation

alexispires commented Dec 26, 2019 • edited

sbezverk commented Dec 26, 2019

sbezverk commented Dec 26, 2019

alexispires commented Dec 26, 2019

sbezverk commented Dec 26, 2019

alexispires commented Dec 27, 2019

sbezverk commented Dec 27, 2019

sbezverk commented Dec 27, 2019

stapelberg commented Jan 1, 2020

alexispires commented Jan 2, 2020

alexispires commented Jan 3, 2020 • edited

alexispires commented Jan 4, 2020

alexispires commented Jan 6, 2020 • edited

alexispires commented Jan 8, 2020

sbezverk commented Jan 8, 2020

alexispires commented Jan 8, 2020

sbezverk commented Jan 8, 2020

alexispires commented Jan 8, 2020

alexispires commented Jan 8, 2020 • edited

alexispires commented Jan 13, 2020

stapelberg left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

alexispires Jan 21, 2020 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

sbezverk commented Jan 15, 2020

alexispires commented Jan 18, 2020

sbezverk commented Jan 18, 2020

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

prologic commented Sep 8, 2022

prologic commented Sep 8, 2022

alexispires commented Sep 9, 2022

prologic commented Sep 9, 2022

prologic commented Sep 10, 2022

alexispires commented Sep 13, 2022

prologic commented Sep 13, 2022

prologic commented Sep 13, 2022

stapelberg commented Sep 29, 2022

fungaren commented Apr 4, 2023 • edited

alexispires commented Dec 26, 2019 •

edited

alexispires commented Jan 3, 2020 •

edited

alexispires commented Jan 6, 2020 •

edited

alexispires commented Jan 8, 2020 •

edited

alexispires Jan 21, 2020 •

edited

fungaren commented Apr 4, 2023 •

edited