Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rewrite arch/linux #4352

Merged
merged 3 commits into from Apr 28, 2024
Merged

Rewrite arch/linux #4352

merged 3 commits into from Apr 28, 2024

Conversation

gpotter2
Copy link
Member

@gpotter2 gpotter2 commented Apr 14, 2024

This rewrites most of the arch/linux code that handles the loading of conf.ifaces, conf.route and conf.route6, in order to use a RTNETLINK socket (instead of the current reading of /proc/net/XXX).

Among those:

  • the read_routes(6) functions
  • the linux interfaces provider
  • arch/linux util functions

This:

  • adds support for multiple IPv4 addresses to interfaces
  • handles interfaces aliases, etc. natively without needing additional tricks
  • allows us to have filter routes by routing tables, also giving us access to 'local', rather than only all tables, giving us generally much better handling of routes.

fixes #4201
closes #3275
(maybe other issues too?)

Note to all maintainers

This has been tested on 32 big endian architecture thanks to the folks at packit. (woooo)

This might need thorough TESTING on other platforms (that we don't include in the CI). I hope I didn't break anything on *BSD.

@gpotter2 gpotter2 added enhancement major Major changes labels Apr 14, 2024
@gpotter2 gpotter2 force-pushed the rewrite-linux-arch branch 2 times, most recently from 7afd27b to 725a8b3 Compare April 14, 2024 21:41
Copy link

codecov bot commented Apr 14, 2024

Codecov Report

Merging #4352 (7161b13) into master (86c7a05) will increase coverage by 0.02%.
Report is 1 commits behind head on master.
The diff coverage is 95.25%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #4352      +/-   ##
==========================================
+ Coverage   82.13%   82.16%   +0.02%     
==========================================
  Files         350      351       +1     
  Lines       83136    83207      +71     
==========================================
+ Hits        68283    68363      +80     
+ Misses      14853    14844       -9     
Files Coverage Δ
scapy/arch/__init__.py 72.72% <ø> (ø)
scapy/arch/common.py 71.69% <100.00%> (+4.30%) ⬆️
scapy/arch/linux/__init__.py 84.42% <100.00%> (ø)
scapy/arch/windows/__init__.py 62.45% <100.00%> (-0.28%) ⬇️
scapy/fields.py 92.39% <ø> (ø)
scapy/route.py 90.29% <100.00%> (+0.14%) ⬆️
scapy/utils6.py 87.87% <100.00%> (+0.06%) ⬆️
scapy/arch/linux/rtnetlink.py 94.71% <94.71%> (ø)

... and 4 files with indirect coverage changes

@gpotter2 gpotter2 added this to the 2.6.0 milestone Apr 14, 2024
@gpotter2 gpotter2 force-pushed the rewrite-linux-arch branch 3 times, most recently from 5234f90 to b79adc2 Compare April 20, 2024 00:39
@gpotter2 gpotter2 marked this pull request as ready for review April 20, 2024 00:49
@gpotter2
Copy link
Member Author

gpotter2 commented Apr 20, 2024

@guedou @p-l- This is ready for review. If you could just try playing around with routes, it would help a lot.

Just note an intended change: for IPv4, 'via' and 'default' routes show '0.0.0.0' as the outgoing IP, but that gets resolved properly when calling conf.route.route.

@evverx Hi ! I was wondering, if you have the time, if it was possible to test this PR with your packit setup. (BTW, If you have the time, feel free to submit a PR to include it here. It looks like It's working great). I'm a bit worried that this PR breaks on some plateforms, because of the rnetlink implementation. Especially on big-endian or 32-bits platforms. Thanks a lot.

@evverx
Copy link
Contributor

evverx commented Apr 20, 2024

@gpotter2 I cherry-picked the commit on top of my fork and triggered Packit. Looks like the tests passed on all the architectures: https://copr.fedorainfracloud.org/coprs/packit/evverx-scapy-2/build/7329726/. I'll open a PR a bit later. (I opened #4355)

@evverx
Copy link
Contributor

evverx commented Apr 20, 2024

Looks like with this PR applied I can see the following log messages in the journal when the test suite is run

Apr 20 03:10:26 H kernel: __nla_validate_parse: 6 callbacks suppressed
Apr 20 03:10:50 H kernel: netlink: 4 bytes leftover after parsing attributes in process `python'.                                                             
Apr 20 03:10:50 H kernel: netlink: 4 bytes leftover after parsing attributes in process `python'.                                                             
Apr 20 03:10:50 H kernel: netlink: 4 bytes leftover after parsing attributes in process `python'.                                                             
Apr 20 03:10:50 H kernel: netlink: 4 bytes leftover after parsing attributes in process `python'.                                                             
Apr 20 03:10:50 H kernel: netlink: 4 bytes leftover after parsing attributes in process `python'.                                                             
Apr 20 03:10:50 H kernel: netlink: 4 bytes leftover after parsing attributes in process `python'.                                                             
Apr 20 03:10:50 H kernel: netlink: 4 bytes leftover after parsing attributes in process `python'.                                                             
Apr 20 03:10:50 H kernel: netlink: 4 bytes leftover after parsing attributes in process `python'.                                                             
Apr 20 03:10:50 H kernel: netlink: 4 bytes leftover after parsing attributes in process `python'.                                                             
Apr 20 03:10:50 H kernel: netlink: 4 bytes leftover after parsing attributes in process `python'.        

I switched back to the master branch, ran the tests and the logs disappeared.

@gpotter2
Copy link
Member Author

Thanks a lot. This should be fixes in the latest version.

@gpotter2
Copy link
Member Author

/packit build

Copy link

No config file for packit (e.g. .packit.yaml) found in secdev/scapy on commit 9fab3ca

For more info, please check out the documentation or contact the Packit team. You can also use our CLI command validate-config or our pre-commit hooks for validation of the configuration.

2 similar comments

This comment was marked as spam.

This comment was marked as spam.

This rewrites much of the arch/linux code, in order to use a RTNETLINK
socket instead of reading /proc/net/XXX.

Among those:
- the read_routes(6) functions
- the linux interfaces provider
- arch/linux util functions

This adds support for multiple IPv4 addresses to interfaces, among with
a generally much better handling of routes.

fixes secdev#4201
@evverx
Copy link
Contributor

evverx commented Apr 21, 2024

Looks like the kernel no longer complains.

fixes #4201
(maybe other issues too?)

As far as I understand it should also address #2133 and #3275.

guedou
guedou previously approved these changes Apr 21, 2024
Copy link
Member

@guedou guedou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome PR! Thanks a lot for made it happen. I bet that it will also speed up loading Scapy on systems with many routes.

What's the minimal kernel version that is needed for this PR to work?

scapy/arch/linux/rtnetlink.py Outdated Show resolved Hide resolved
scapy/arch/linux/rtnetlink.py Outdated Show resolved Hide resolved
@gpotter2
Copy link
Member Author

gpotter2 commented Apr 21, 2024

So just a heads up: I'm really not sure this speeds up loading. It gives a much more more accurate result on my machine and in several test setups I tried, but I haven't tested it on a machine that has a billion routes, or similar. I'm not really sure how it performs on those, so maybe it would be a good idea to try that out.

It's very likely that it is possible to properly filter the routes when querying them, in order to add only the relevant (non BGP for instance) ones, but I haven't looked into it. Implemented.

@guedou
Copy link
Member

guedou commented Apr 21, 2024

I tried to load 1000 routes using the following shell and Python scripts, and got the following error:

bash bench_load_prefixes.sh 
gpotter2/rewrite-linux-arch 1000
9

real	0m1.848s
user	0m0.766s
sys	0m0.389s
root@vagrant:/vagrant# vim bench_load_prefixes.sh 
root@vagrant:/vagrant# bash bench_load_prefixes.sh 
gpotter2/rewrite-linux-arch 10000
Traceback (most recent call last):
  File "/vagrant/test_routes_loading.py", line 1, in <module>
    from scapy.all import *
  File "/vagrant/scapy/all.py", line 26, in <module>
    from scapy.route import *
  File "/vagrant/scapy/route.py", line 222, in <module>
    conf.route = Route()
                 ^^^^^^^
  File "/vagrant/scapy/route.py", line 37, in __init__
    self.resync()
  File "/vagrant/scapy/route.py", line 47, in resync
    self.routes = read_routes()
                  ^^^^^^^^^^^^^
  File "/vagrant/scapy/arch/linux/rtnetlink.py", line 796, in read_routes
    results = _read_routes(socket.AF_INET)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/vagrant/scapy/arch/linux/rtnetlink.py", line 772, in _read_routes
    results = _sr1_rtrequest(
              ^^^^^^^^^^^^^^^
  File "/vagrant/scapy/arch/linux/rtnetlink.py", line 647, in _sr1_rtrequest
    msgs = rtmsghdrs(sock.recv(65535))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/vagrant/scapy/base_classes.py", line 398, in __call__
    i.__init__(*args, **kargs)
  File "/vagrant/scapy/packet.py", line 183, in __init__
    self.dissect(_pkt)
  File "/vagrant/scapy/packet.py", line 1076, in dissect
    s = self.do_dissect(s)
        ^^^^^^^^^^^^^^^^^^
  File "/vagrant/scapy/packet.py", line 1014, in do_dissect
    s, fval = f.getfield(self, s)
              ^^^^^^^^^^^^^^^^^^^
  File "/vagrant/scapy/fields.py", line 1843, in getfield
    raise MaximumItemsCount(
scapy.fields.MaximumItemsCount: Maximum amount of items reached in PacketListField: 100 (defaults to conf.max_list_count)

real	0m1.329s
user	0m0.542s
sys	0m0.260s

Setting the value of conf.max_list_count to 1000 in scapy/config.py fixed this issue, but it seems a bit too much to change this value globally only to parse routes.

You can put random IPv4 prefixes into prefixes.txt.gz or populate it from real ones (as I did).

cat bench_load_prefixes.sh 
N=1000
for PREFIX in $(zcat prefixes.txt.gz |head -n $N)
do
  sudo ip route add $PREFIX dev eth0
done

echo $(git branch --show-current) $N
time python3 test_routes_loading.py

for PREFIX in $(zcat prefixes.txt.gz |head -n $N)
do
  sudo ip route del $PREFIX dev eth0
done
from scapy.all import *
print(len(conf.route.routes))

@guedou
Copy link
Member

guedou commented Apr 21, 2024

Empirical benchmarks show similar results (i.e. 2s to 5s loading time) between master and this PR. However, for approximately 50000 routes, master loads in 7s while this PR loads in 41s. I was not expecting this PR to be slower when dealing with more routes.

I did the test on Ubuntu 23.10 with kernel 6.5.0-15-generic.

@gpotter2
Copy link
Member Author

gpotter2 commented Apr 21, 2024

I expected that, we're popping a socket and parsing several messages instead of simply opening a file descriptor and parsing some text.

That being said, it's the only way of solving the 'how do you not load the BGP table' problem. And I think that it's not realistic to have 50k routes in main/local.

@gpotter2
Copy link
Member Author

gpotter2 commented Apr 21, 2024

I tried to load 1000 routes using the following shell and Python scripts, and got the following error:

This is fixed.

@guedou Do you agree with my reasoning regarding the slowdowns?

@gpotter2 gpotter2 merged commit cd01ec1 into secdev:master Apr 28, 2024
33 checks passed
@gpotter2 gpotter2 deleted the rewrite-linux-arch branch April 28, 2024 18:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

IPv4 routing table is missing entries for additional addresses on the loopback interface
3 participants