Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

*: migrate to netaddr #540

Open
wants to merge 6 commits into
base: main
Choose a base branch
from
Open

*: migrate to netaddr #540

wants to merge 6 commits into from

Conversation

jzelinskie
Copy link
Member

I'm sorry for the huge commit. This ended up touching far more than I thought it would.
I'm interested in seeing some perf numbers on real workloads before merging.

This change impacted almost the entire codebase:

  • All uses of net.IP are now netaddr.IP or netaddr.IPPort
  • Scrape now passes an entire peer
  • Testing in bittorrent package refactored
  • Parsing and API changes to Peers, PeerIDs, ClientIDs
  • Redis Storage now uses shorter keys

@jzelinskie jzelinskie requested a review from mrd0ll4r July 5, 2021 22:32
@jzelinskie jzelinskie force-pushed the netaddr branch 7 times, most recently from f036b87 to c5b626f Compare July 5, 2021 23:42
Copy link
Member

@mrd0ll4r mrd0ll4r left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some performance and memory things I disagree with, and a lot of small things :)

frontend/udp/parser.go Outdated Show resolved Hide resolved
frontend/udp/parser.go Show resolved Hide resolved
storage/memory/peer_store.go Outdated Show resolved Hide resolved
bittorrent/bittorrent.go Outdated Show resolved Hide resolved
bittorrent/bittorrent.go Outdated Show resolved Hide resolved
// "I4" => number of IPv4 InfoHashes
// "I6" => number of IPv6 InfoHashes
func infohashCountKey(ip netaddr.IP) string {
var b strings.Builder
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if the Builder is necessary or actually adds overhead. We know this is gonna be two bytes, so we might as well just allocate a slice for that.

storage/redis/peer_store.go Outdated Show resolved Hide resolved
storage/redis/peer_store.go Outdated Show resolved Hide resolved
storage/redis/peer_store.go Outdated Show resolved Hide resolved
storage/redis/peer_store.go Outdated Show resolved Hide resolved
@jzelinskie jzelinskie force-pushed the netaddr branch 5 times, most recently from ae7f43f to df28e7e Compare July 7, 2021 05:14
Copy link
Member

@mrd0ll4r mrd0ll4r left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just saw the one buffer growth thing. Apart from that: It's a big one, not sure if I saw everything... maybe we can/should run the various benchmarking things? tbomb for just raw performance, and the storage memory usage thing (although I haven't tried that out in ages and don't know if it still works).

var b strings.Builder
switch {
case ip.Is4(), ip.Is4in6():
b.Grow(1 + 1 + 4) // "4" + "S"/"L" + IPv4
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing 20 bytes for the actual infohash 🙃

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice catch

I also noticed that in PutLeecher it does not check if the infohash was new. It's apparently missing a

	if reply[1] == 1 { // The Infohash Key (and thus the infohash) was new.
		_, err = conn.Do("INCR", infohashCountKey(ip))
		if err != nil {
			return err
		}
	}

Is that on purpose? Should I add this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that's on purpose, because for redis we only count infohashes with at least one seeder. I don't remember why I did it like that, but I assume there was some reason for it...maybe it would have been difficult to correctly determine whether a seeder-only or leecher-only swarm already exists or something, not sure.

@Aranjedeath
Copy link

Yay!!! Thank you all. @mrd0ll4r we are running this branch in production currently :) I use regular memory storage backend, so redis issue does not affect me.

@mrd0ll4r
Copy link
Member

mrd0ll4r commented Jul 8, 2021

Oh nice @Aranjedeath ! I can see the reboot and most things look fine so far. One thing I can't see is CPU usage (and request durations iirc), let us know how that works out :)

@Aranjedeath
Copy link

Oh nice @Aranjedeath ! I can see the reboot and most things look fine so far. One thing I can't see is CPU usage (and request durations iirc), let us know how that works out :)

IPV6 has request timing enabled, we disabled it on v4 previously for performance :)

@Aranjedeath
Copy link

Oh nice @Aranjedeath ! I can see the reboot and most things look fine so far. One thing I can't see is CPU usage (and request durations iirc), let us know how that works out :)

OK I think we have performance details in v4 now :)

This change impacted almost the entire codebase.

- All uses of net.IP are now netaddr.IP or netaddr.IPPort
- Scrape now passes an entire peer
- Testing in bittorrent package refactored
- Parsing and API changes to Peers, PeerIDs, ClientIDs
- Redis Storage now uses shorter keys
@jzelinskie
Copy link
Member Author

I pushed a commit that migrates us to zerolog in this branch. It's for convenience for testing. If you want me to open it as another PR after this, I can.

@jzelinskie jzelinskie force-pushed the netaddr branch 2 times, most recently from 866886f to 0a75d53 Compare July 27, 2021 01:50
@@ -14,6 +14,7 @@ import (
"inet.af/netaddr"

"github.com/chihaya/chihaya/pkg/log"
"github.com/jzelinskie/stringz"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👀

Copy link
Member

@mrd0ll4r mrd0ll4r left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Every time I look at this I find some new small things..
I think it's good overall and I want it merged, but the PR is too big for me to confidently review it 🙃

Can we somehow split it up maybe?

log.Debug("sanitized scrape", r, log.Fields{
"maxScrapeInfoHashes": maxScrapeInfoHashes,
})
log.Debug().Uint32("maxScrapeInfoHashes", maxScrapeInfoHashes).Msg("sanitized scrape")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is missing the sanitized request

r.IPPort = r.IPPort.WithIP(netaddr.IPFrom4(ip.As4()))
}

log.Debug().
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is also missing the request

@@ -435,7 +437,7 @@ func AnnounceSeeder1kInfohash(b *testing.B, ps PeerStore) {
// ScrapeSwarm can run in parallel.
func ScrapeSwarm(b *testing.B, ps PeerStore) {
runBenchmark(b, ps, true, putPeers, func(i int, ps PeerStore, bd *benchData) error {
ps.ScrapeSwarm(bd.infohashes[0], bittorrent.IPv4)
ps.ScrapeSwarm(bd.infohashes[0], bd.peers[0].IPPort.IP())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these should be pulled outside the runner function

@@ -445,7 +447,7 @@ func ScrapeSwarm(b *testing.B, ps PeerStore) {
// ScrapeSwarm1kInfohash can run in parallel.
func ScrapeSwarm1kInfohash(b *testing.B, ps PeerStore) {
runBenchmark(b, ps, true, putPeers, func(i int, ps PeerStore, bd *benchData) error {
ps.ScrapeSwarm(bd.infohashes[i%1000], bittorrent.IPv4)
ps.ScrapeSwarm(bd.infohashes[i%1000], bd.peers[0].IPPort.IP())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

@jzelinskie
Copy link
Member Author

I'll break this down and rework everything now that netaddr got added to the standard library as net/netip

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants