Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix type mismatch in new/delete of addrinfo::ai_addr #1671

Open
wants to merge 15 commits into
base: master
Choose a base branch
from

Conversation

jtstrs
Copy link
Contributor

@jtstrs jtstrs commented Feb 13, 2024

ERROR: AddressSanitizer: new-delete-type-mismatch...:
size of the allocated type: 28 bytes;
size of the deallocated type: 16 bytes.
in Ip::Address::InitAddr(addrinfo*&) src/ip/Address.cc:678
in Ip::Address::FreeAddr(addrinfo*&) src/ip/Address.cc:690
  • delete used addrinfo::ai_addr type (i.e. sockaddr)
  • some new calls use sockaddr_in type
  • some new calls use sockaddr_in6 type

Mismatching new/delete types result in undefined behavior. However,
since these are all PODs, the bug may not have resulted in runtime
problems in most environments, even though sockaddr_in6 is usually
larger than sockaddr and sockaddr_in tructure (e.g., 28 vs. 16 bytes).

@squid-prbot

This comment was marked as outdated.

@kinkie

This comment was marked as resolved.

@jtstrs

This comment was marked as resolved.

@kinkie

This comment was marked as resolved.

code polish
yadij

This comment was marked as outdated.

@yadij yadij added M-cleared-for-merge https://github.com/measurement-factory/anubis#pull-request-labels S-could-use-an-approval An approval may speed this PR merger (but is not required) labels Feb 13, 2024
@yadij

This comment was marked as outdated.

src/ip/Address.cc Outdated Show resolved Hide resolved
src/ip/Address.cc Outdated Show resolved Hide resolved
src/ip/Address.cc Outdated Show resolved Hide resolved
@rousskov rousskov changed the title Fix memory leak at FreeAddr Fix type mismatch in new/delete of addrinfo::ai_addr Feb 13, 2024
@rousskov rousskov added S-waiting-for-author author action is expected (and usually required) and removed M-cleared-for-merge https://github.com/measurement-factory/anubis#pull-request-labels S-could-use-an-approval An approval may speed this PR merger (but is not required) labels Feb 13, 2024
@rousskov
Copy link
Contributor

Please also double check the adjusted PR title/description.

@jtstrs, have you seen this bug result in actual memory leaks with some common memory management library? I am just curious. It is a bug regardless of whether those memory leaks are common in some typical environments...

@jtstrs
Copy link
Contributor Author

jtstrs commented Feb 14, 2024

@rousskov
Actually there is just an new-delete-type-mismatch was caught with Address Sanitizer.
Memory leak was my assumption. Sorry, if it added some unclearance

There typical log from ASan for this case:

=================================================================
==(squid-1)==20157==ERROR: AddressSanitizer: new-delete-type-mismatch on 0x603000115990 in thread T0:
object passed to delete has wrong type:
size of the allocated type: 28 bytes;
size of the deallocated type: 16 bytes.
#0 0x7f900f51b5a5 in operator delete(void*, unsigned long) (/usr/lib64/libasan.so.5+0x10a5a5)
1 0xefd1c7 in Ip::Address::FreeAddr(addrinfo*&) SQUID_PATH//squid-5.9/src/ip/Address.cc:690

0x603000115990 is located 0 bytes inside of 28-byte region [0x603000115990,0x6030001159ac)
allocated by thread T0 here:
#0 0x7f900f51a11f in operator new(unsigned long) (/usr/lib64/libasan.so.5+0x10911f)
1 0xefd05a in Ip::Address::InitAddr(addrinfo*&) SQUID_PATH/squid-5.9/src/ip/Address.cc:678

SUMMARY: AddressSanitizer: new-delete-type-mismatch (/usr/lib64/libasan.so.5+0x10a5a5) in operator delete(void*, unsigned long)
==(squid-1)==20157==HINT: if you don't care about these errors you may set ASAN_OPTIONS=new_delete_type_mismatch=0

Command: (squid-1) --kid squid-1 -f /etc/squid/squid_captive.conf

@kinkie
Copy link
Contributor

kinkie commented Feb 14, 2024

@rousskov Actually there is just an new-delete-type-mismatch was caught with Address Sanitizer. Memory leak was my assumption. Sorry, if it added some unclearance

It looks to me like you found a genuine issue.
Not only that, Ip::Address is used quite a lot, which means it's also impactful.
@rousskov 's comment is spot on, I reckon that the best path forward is to ensure ai_addrlen is consistent with the actual size of what is being stored in ai_addr, and to use that information. Long term, the whole class should probably be redesigned but that's against the actual short-term objective of fixing the problem quickly

@rousskov
Copy link
Contributor

Francesco: It looks to me like you found a genuine issue.

I would like to second Francesco's opinion: @jtstrs, your research/PR is very valuable, even if there are usually no memory leaks associated with the new/delete mismatch bugs you have uncovered.

@jtstrs, would you like to refactor PR changes to isolate ai_addr construction (including correct length setting and zeroing) and destruction into dedicated functions or should we take a stab at that first? Either way is fine; just let us know.

@jtstrs
Copy link
Contributor Author

jtstrs commented Feb 14, 2024

I'll prefer to stick an approach with encapsulation of construction/deleting ai_addr

@rousskov

This comment was marked as outdated.

@rousskov rousskov removed the S-waiting-for-author author action is expected (and usually required) label Feb 16, 2024
This also reduces doubts about what FreeAddrMember() frees.
Code that frees memory is often used in destructors.
Destructors must not throw.

We could assert instead, but perhaps there is no need to kill worker in
this case? Current ip/Address.cc code is not consistent with regard to
assert-vs-report-a-bug choice.
Copy link
Contributor

@rousskov rousskov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed a few polishing adjustments and can now approve this PR. I am not requesting any further code changes in this PR.

@jtstrs, please review and retest my changes. I could have missed something! Thank you.

memset(ai->ai_addr, 0, sizeof(struct sockaddr_in6));

ai->ai_addrlen = sizeof(struct sockaddr_in6);
FreeAddrMember(*ai);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can also move this call into a new else clause for the above addrinfo allocation if statement.

debugs(14, DBG_CRITICAL, "ERROR: Squid Bug: Unexpected addrinfo::ai_addr size: " << ai.ai_addrlen <<
Debug::Extra << "sockaddr_in size: " << sizeof(struct sockaddr_in) <<
Debug::Extra << "sockaddr_in6 size: " << sizeof(struct sockaddr_in6));
// leak memory
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or we can assert() instead. Throwing in cleanup code is not a good idea.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assert() would also be a bad idea for now. There are struct sockaddr_un stored for UDS sockets Comm::Connection. They have variable-length allocations.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are struct sockaddr_un stored for UDS sockets Comm::Connection. They have variable-length allocations.

N.B. comm_open_uds() does not use this function (i.e. does not call Ip::Address::InitAddr() and FreeAddr()).

If you know of any Squid code that (directly or indirectly) allocates addrinfo::ai_addr using something other than the two PR-handled types and then calls (directly or indirectly) Ip::Address::FreeAddr(), please point me to it. I could not find it. I hope such code does not exist, but if it does exist, then this PR will need to be changed to reflect its existence. (This PR code will need to be changed for other reasons as well, of course, but I am documenting one of the assumptions that the current PR design relies on and that is not directly related to those other reasons).

There is code that allocates addrinfo::ai_addr using getaddrinfo(), but that code calls freeaddrinfo(), as it should. That code does not directly affect this part of the discussion.

@rousskov rousskov added the S-waiting-for-author author action is expected (and usually required) label Feb 16, 2024
@jtstrs
Copy link
Contributor Author

jtstrs commented Feb 22, 2024

Changes looks good from my side.
Thank you for polishing

@jtstrs
Copy link
Contributor Author

jtstrs commented Feb 22, 2024

I'll check it patch with ASan again and add comment with results little bit later

src/ip/Address.cc Outdated Show resolved Hide resolved
src/ip/Address.cc Outdated Show resolved Hide resolved
delete reinterpret_cast<struct sockaddr_in*>(ai.ai_addr);
} else if (ai.ai_addrlen == sizeof(struct sockaddr_in6)) {
delete reinterpret_cast<struct sockaddr_in6*>(ai.ai_addr);
} else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Squid also stores struct sockaddr_storage when the exact address type is unknown before being filled with values.

(Please double-check my suggestion. I am not sure if the ai_addrlen value actually survives the syscall()'s which fill in the address value.)

Suggested change
} else {
else if (ai.ai_addrlen == sizeof(struct sockaddr_storage))
delete reinterpret_cast<struct sockaddr_storage*>(ai.ai_addr);
else {

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if the ai_addrlen value actually survives the syscall()'s which fill in the address value.

Good catch! I agree that the PR code is buggy: We cannot always rely on ai_addrlen to be preserved during lifetime of an Ip::Address::InitAddr()-allocated addrinfo object:

recvfrom(int socket, void *restrict buffer, size_t length, int flags, struct sockaddr *restrict address,
         socklen_t *restrict address_len);

     ... The address_len argument is a value-result argument, initialized to the size of
     the buffer associated with address, and modified on return to indicate the actual size of the address
     stored there.

Same for getsockname() and probably other system calls.

    struct addrinfo *AI = nullptr;
    Ip::Address::InitAddr(AI);
    int x = recvfrom(fd, buf, len, flags, AI->ai_addr, &AI->ai_addrlen);
    ...
    Ip::Address::FreeAddr(AI);

Same for at least one getsockname() caller -- comm_local_port().

To move closer to a solution, we should stop (ab)using addrinfo type for affected recvfrom(), getsockname(), and possibly other system calls that have nothing to do with addrinfo. Most likely, all those calls should not use dynamic memory allocation at all! If we are lucky, eliminating Ip::Address::InitAddr() callers that do not need addrinfo will reduce the original problem footprint enough for the existing PR Init()/Free() implementation to start working reliably.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FTR; these were an attempt to standardize on one C-type for the IP::Address API. It did not get completed and the individual sockaddr_* type methods removed. struct addrinfo was chosen as it is the formal type used by modern IPv6-enabled APIs. It was begun before I learned of sockaddr_storage which is essentially the C equivalent of what Ip::Address is supposed to be.

In hindsight the syscalls using any sockaddr_* type should have been converted to sockaddr_storage. So we have addrinfo for just the DNS related APIs and sockaddr_storage for the socket APIs. Dropping support of the other methods from Ip::Address

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jtstrs, I believe Amos' recollection supports my recommendation and gives you an additional hint with regard to the right type to use for at least some of the system calls you are considering converting away from addrinfo.

debugs(14, DBG_CRITICAL, "ERROR: Squid Bug: Unexpected addrinfo::ai_addr size: " << ai.ai_addrlen <<
Debug::Extra << "sockaddr_in size: " << sizeof(struct sockaddr_in) <<
Debug::Extra << "sockaddr_in6 size: " << sizeof(struct sockaddr_in6));
// leak memory
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assert() would also be a bad idea for now. There are struct sockaddr_un stored for UDS sockets Comm::Connection. They have variable-length allocations.

@rousskov rousskov self-requested a review February 25, 2024 12:17
src/ip/Address.cc Outdated Show resolved Hide resolved
delete reinterpret_cast<struct sockaddr_in*>(ai.ai_addr);
} else if (ai.ai_addrlen == sizeof(struct sockaddr_in6)) {
delete reinterpret_cast<struct sockaddr_in6*>(ai.ai_addr);
} else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if the ai_addrlen value actually survives the syscall()'s which fill in the address value.

Good catch! I agree that the PR code is buggy: We cannot always rely on ai_addrlen to be preserved during lifetime of an Ip::Address::InitAddr()-allocated addrinfo object:

recvfrom(int socket, void *restrict buffer, size_t length, int flags, struct sockaddr *restrict address,
         socklen_t *restrict address_len);

     ... The address_len argument is a value-result argument, initialized to the size of
     the buffer associated with address, and modified on return to indicate the actual size of the address
     stored there.

Same for getsockname() and probably other system calls.

    struct addrinfo *AI = nullptr;
    Ip::Address::InitAddr(AI);
    int x = recvfrom(fd, buf, len, flags, AI->ai_addr, &AI->ai_addrlen);
    ...
    Ip::Address::FreeAddr(AI);

Same for at least one getsockname() caller -- comm_local_port().

To move closer to a solution, we should stop (ab)using addrinfo type for affected recvfrom(), getsockname(), and possibly other system calls that have nothing to do with addrinfo. Most likely, all those calls should not use dynamic memory allocation at all! If we are lucky, eliminating Ip::Address::InitAddr() callers that do not need addrinfo will reduce the original problem footprint enough for the existing PR Init()/Free() implementation to start working reliably.

debugs(14, DBG_CRITICAL, "ERROR: Squid Bug: Unexpected addrinfo::ai_addr size: " << ai.ai_addrlen <<
Debug::Extra << "sockaddr_in size: " << sizeof(struct sockaddr_in) <<
Debug::Extra << "sockaddr_in6 size: " << sizeof(struct sockaddr_in6));
// leak memory
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are struct sockaddr_un stored for UDS sockets Comm::Connection. They have variable-length allocations.

N.B. comm_open_uds() does not use this function (i.e. does not call Ip::Address::InitAddr() and FreeAddr()).

If you know of any Squid code that (directly or indirectly) allocates addrinfo::ai_addr using something other than the two PR-handled types and then calls (directly or indirectly) Ip::Address::FreeAddr(), please point me to it. I could not find it. I hope such code does not exist, but if it does exist, then this PR will need to be changed to reflect its existence. (This PR code will need to be changed for other reasons as well, of course, but I am documenting one of the assumptions that the current PR design relies on and that is not directly related to those other reasons).

There is code that allocates addrinfo::ai_addr using getaddrinfo(), but that code calls freeaddrinfo(), as it should. That code does not directly affect this part of the discussion.

@rousskov rousskov removed the S-waiting-for-reviewer ready for review: Set this when requesting a (re)review using GitHub PR Reviewers box label Feb 26, 2024
@jtstrs
Copy link
Contributor Author

jtstrs commented Feb 28, 2024

Im little bit lost track of the conversation.
@rousskov @yadij, now I supposed to push my efforts to getting rid of a dynamic allocations for all usages of addrinfo related to recv and getsockname syscalls in scope of current PR?

@rousskov
Copy link
Contributor

rousskov commented Feb 28, 2024

now I supposed to push my efforts to getting rid of a dynamic allocations for all usages of addrinfo related to recv and getsockname syscalls in scope of current PR?

Personally, I recommend a slightly different approach: Find one addrinfo use case where addrinfo object is dynamically allocated, its length argument is supplied to some system call like recvfrom() to be updated, but the object as a whole is not used for anything (e.g. it is not then given to some other function to do something else). That is the simplest use case I can think of. Post a pull request dedicated to removing that one unnecessary addrinfo usage. Make sure the new code handles any reasonable (for the given context) address and errors. Once that PR lands, we will know how to address several similar (and possibly even all of the remaining) cases. If that plan works, eventually, we will get back to this PR with no dynamic addrinfo allocation cases left except for those this PR already handles correctly.

Icmp4::Recv() is a candidate but there might be even simpler use cases (Icmp4::Recv() does feed that addrinfo object to one of the fifty1 Ip::Address assignment operators, but it is possible that you will find another Ip::Address method that accomplishes what we want without adding one).

Footnotes

  1. "Fifty" is a wild exaggeration, of course. There are "only" eight AFAICT.

@jtstrs

This comment was marked as resolved.

@kinkie

This comment was marked as resolved.

@jtstrs

This comment was marked as off-topic.

@kinkie

This comment was marked as off-topic.

@jtstrs
Copy link
Contributor Author

jtstrs commented Apr 2, 2024

I've started analysis of things which we discussed above, and I'm afraid, that I could'nt found any "dummy" usages
of addrinfo structure, even Icmp* classes use information provided by recvfrom function for some logging purposes (I mean Icmp::Log(...)). So I'm not sure that there is possible to fully remove any usage of addrinfo, but I think that there is a few places, where we can get rid of from InitAddr usages.

@rousskov
Copy link
Contributor

rousskov commented Apr 2, 2024

Find one addrinfo use case where addrinfo object is dynamically allocated, its length argument is supplied to some system call like recvfrom() to be updated, but the object as a whole is not used for anything (e.g. it is not then given to some other function to do something else).

I've started analysis of things which we discussed above, and I'm afraid, that I could'nt found any "dummy" usages of addrinfo structure, even Icmp* classes use information provided by recvfrom function for some logging purposes (I mean Icmp::Log(...)). So I'm not sure that there is possible to fully remove any usage of addrinfo

Icmp::Log() does not take addrinfo. It takes Ip::Address. Thus, Icmp::Log() is a red herring on the path to removing excessive addrinfo use.

If you are studying Icmp4::Recv(), then decide what recvfrom() argument to use to store the "from" address and then how to convert that filled "from" address to Ip::Address (i.e. to preply.from). That new argument type should be byte-compatible with sockaddr structure layout (i.e. the type required by recvfrom()), of course. Most likely, that argument will not be allocated on the heap.

One possible (but not guaranteed) way to find an acceptable solution for the above problem is the following process:

  1. What is the overall best solution for correctly calling recvfrom() without using any 3rd party libraries?
  2. How to implement that solution in Icmp4::Recv() context?

@jtstrs
Copy link
Contributor Author

jtstrs commented Apr 3, 2024

Got it, thanks for hints, will proceed analysis then

@squid-anubis squid-anubis added M-failed-other https://github.com/measurement-factory/anubis#pull-request-labels and removed M-failed-other https://github.com/measurement-factory/anubis#pull-request-labels labels Apr 5, 2024
@jtstrs
Copy link
Contributor Author

jtstrs commented Apr 11, 2024

Probably we could use sockaddr_storage type inside of Ip::Address class since its supposed to be wrapper for C api (according to @yadij said) and also wrap all system api calls inside of separate module like it's done in for example comm.cc:comm_udp_from, thus we get all interuction with this api in one place, also later there is possible to implement some wrappers for socket class instead of working with it handlers directly. What do you think?

@rousskov
Copy link
Contributor

Probably we could use sockaddr_storage type inside of Ip::Address class ... and also wrap all system api calls inside of separate module ... What do you think?

Ip::Address does need a lot of changes, but a major development like changing Ip::Address storage type is way outside this PR scope (regardless of its long-term merits). Correctly creating a new module to "wrap all system API calls" is also a huge undertaking. With all due respect, this kind of development should not be driven by a new Squid developer -- correct changes on those paths require too many Squid-specific (and difficult) decisions.

Please start small instead! Let's see how we can solve a few isolated/small problems first. Use Ip::Address as needed; add or adjust a method or two if you must, but stop if class changes snowball beyond that. Use sockaddr_storage as/if needed. Eliminate one or two specific addrinfo use cases. We will then decide how to generalize your fixes.

This PR changes may affect Ip::Address, its future rewrites, and even creation of new modules, of course, but I hope that we do not need to cross that bridge right here, right now, before we have settled a single addrinfo removal change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-for-author author action is expected (and usually required)
Projects
None yet
6 participants