Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Packet::setRawPacket parses layers it should not when parseUntilLayer is set #657

Open
weyrick opened this issue May 26, 2021 · 13 comments
Open

Comments

@weyrick
Copy link

weyrick commented May 26, 2021

Hello, I found that Packet::setRawPacket is parsing too many layers when parseUntilLayer is set. If parseUntilLayer is set to OsiModelTransportLayer and the packet contains DNS traffic, it will still parse into the DNS layer even though it should stop at the UDP layer.

This is the offending loop, which parses until <= parseUntilLayer (should be just <?):

while (curLayer != NULL && (curLayer->getProtocol() & parseUntil) == 0 && curLayer->getOsiModelLayer() <= parseUntilLayer)
{
m_ProtocolTypes |= curLayer->getProtocol();
curLayer->parseNextLayer();
curLayer->m_IsAllocatedInPacket = true;
curLayer = curLayer->getNextLayer();
if (curLayer != NULL)
m_LastLayer = curLayer;
}

There seems to be a hack in the code to clean up the extra layer if it went too far:

if (curLayer != NULL && curLayer->getOsiModelLayer() > parseUntilLayer)
{
m_LastLayer = curLayer->getPrevLayer();
delete curLayer;
m_LastLayer->m_NextLayer = NULL;
}

But this is extra work, and the result is that the DNS payload is parsed twice (because I do my own DNS parsing later based on just UDP payload from pcpp).

@seladb
Copy link
Owner

seladb commented May 28, 2021

@weyrick thanks for reaching out!
The issue is that there might be several layers of the same OSI layer. The most trivial example I can think of is multiple VLAN layers. If we change <= to < only the first VLAN layer will be parsed.
The only way to know what the next layer would be is to parse it. Maybe we can introduce "partial parsing" or "detect layer" functionality that does the minimum work needed to detect what is the next layer 🤔

@weyrick
Copy link
Author

weyrick commented Jun 1, 2021

Hi @seladb, ok interesting. Right so how about add a pure virtual Layer::identifyNextLayer and give each layer the ability to peek into the payload to see what is next without parsing it. This is essentially what parseNextLayer does already, except that it creates the layer immediately when it finds it, for example:

	if (HttpMessage::isHttpPort(portDst) && HttpRequestFirstLine::parseMethod((char*)payload, payloadLen) != HttpRequestLayer::HttpMethodUnknown)
		m_NextLayer = new HttpRequestLayer(payload, payloadLen, this, m_Packet);

So identifyNextLayer could do:

	if (HttpMessage::isHttpPort(portDst) && HttpRequestFirstLine::parseMethod((char*)payload, payloadLen) != HttpRequestLayer::HttpMethodUnknown)
        return std::make_pair(pcpp::HTTP, pcpp::OsiModelApplicationLayer);

or similar. Then this information could be used by setRawPacket and parseNextLayer. Make sense?

@seladb
Copy link
Owner

seladb commented Jun 3, 2021

That's an interesting idea @weyrick !
Maybe this pure virtual method should return the protocol, the OSI layer, and an indicator of which layer to create? 🤔
That way it can be used both instead of parseNextLayer(), for example (this is just pseudo code):

Layer* Layer::createLayer(protocolType, payload, payloadLen, ...)
{
    switch (protocolType)
    {
        case X:
            return new XLayer(payload, payloadLen, ...);
        case Y:
            return new YLayer(payload, payloadLen, ...);
        ...
        ...
    }
}
...

void Layer::parseNextLayer()
{
    pcpp::ProtocolInfo info = identifyNextLayer(payload, payloadLen);
    m_NextLayer = createLayer(info.protocolType, payload, payloadLen, ...);
}

And then identifyNextLayer() will contain pretty much all of the code that is currently in parseNextLayer(), except for creating the next layer. Please let me know what you think

@weyrick
Copy link
Author

weyrick commented Jun 3, 2021

Seems like that should work, although it means that adding new protocols requires changing this big createLayer() method - reduces cohesion. But not the end of the world. Are the arguments to the construction of all layers always the same? If there was special logic that required access to local member variables and that may be a problem too.

@seladb
Copy link
Owner

seladb commented Jun 4, 2021

although it means that adding new protocols requires changing this big createLayer() method

Yes, this is a good point. Maybe we can consider adding this method outside of the Layer class? 🤔

Are the arguments to the construction of all layers always the same?

Yes, as far as I remember the c'tors are always the same and include the pointer to the payload, its length, the previous layer and a pointer to the Packet object. I hope I haven't missed any layer...

@weyrick
Copy link
Author

weyrick commented Jun 5, 2021

I actually prefer my first suggestion, just splitting the identification from the creation and adding an identifyNextLayer, but otherwise leaving the architecture.

@seladb
Copy link
Owner

seladb commented Jun 9, 2021

I actually prefer my first suggestion, just splitting the identification from the creation and adding an identifyNextLayer, but otherwise leaving the architecture.

But this will create code duplication between identifyNextLayer() and parseNextLayer(), won't it? 🤔

@weyrick
Copy link
Author

weyrick commented Jun 9, 2021

Well they would both still have to have if blocks or switches able to handle any of the next layers, but no functionality would be duplicated no. parseNextLayer() would almost stay as is and be renamed to identifyNextLayer(), except it would never actually create a layer, only return what should be created. parseNextLayer() would then be added to take those returned values as input, and all it does is create the indicated layer. So the idea is just to separate those two pieces of logic into two functions.

So if this is a snippet of UdpLayer:

void UdpLayer::parseNextLayer()
{
	if (m_DataLen <= sizeof(udphdr))
		return;

	uint16_t portDst = getDstPort();
	uint16_t portSrc = getSrcPort();

	uint8_t* udpData = m_Data + sizeof(udphdr);
	size_t udpDataLen = m_DataLen - sizeof(udphdr);

	if ((portSrc == 68 && portDst == 67) || (portSrc == 67 && portDst == 68) || (portSrc == 67 && portDst == 67))
		m_NextLayer = new DhcpLayer(udpData, udpDataLen, this, m_Packet);
	else if (DnsLayer::isDataValid(udpData, udpDataLen) && (DnsLayer::isDnsPort(portDst) || DnsLayer::isDnsPort(portSrc)))
		m_NextLayer = new DnsLayer(udpData, udpDataLen, this, m_Packet);
...
}

we would get after refactor (pseudo)

NextLayerInfo UdpLayer::identifyNextLayer()
{
	if (m_DataLen <= sizeof(udphdr))
		return;

	uint16_t portDst = getDstPort();
	uint16_t portSrc = getSrcPort();

	uint8_t* udpData = m_Data + sizeof(udphdr);
	size_t udpDataLen = m_DataLen - sizeof(udphdr);
 
        NextLayerInfo result;

	if ((portSrc == 68 && portDst == 67) || (portSrc == 67 && portDst == 68) || (portSrc == 67 && portDst == 67)) {
                 result.protocol = pcpp::DHCP;
                 result.osiLayer = pcpp::OsiModelApplicationLayer;
        }
	else if (DnsLayer::isDataValid(udpData, udpDataLen) && (DnsLayer::isDnsPort(portDst) || DnsLayer::isDnsPort(portSrc))) {
                 result.protocol = pcpp::DNS;
                 result.osiLayer = pcpp::OsiModelApplicationLayer;
        }
...
}
void UdpLayer::parseNextLayer(NextLayerInfo info) {
	if (m_DataLen <= sizeof(udphdr))
		return;

	uint16_t portDst = getDstPort();
	uint16_t portSrc = getSrcPort();

	uint8_t* udpData = m_Data + sizeof(udphdr);
	size_t udpDataLen = m_DataLen - sizeof(udphdr);

        switch (info.protocol) {
        case pcpp.DHCP:
            m_NextLayer = new DhcpLayer(udpData, udpDataLen, this, m_Packet);
            break;
        case pcpp::DNS:
            m_NextLayer = new DnsLayer(udpData, udpDataLen, this, m_Packet);
            break;
        }

}

Make sense?

@seladb
Copy link
Owner

seladb commented Jun 10, 2021

hi @weyrick , yes this can be a nice refactoring. There is still some code duplication but I guess we can live with that

@weyrick
Copy link
Author

weyrick commented Jun 10, 2021

Actually I did a little thinking, and some downsides with this is that it will take some work, it breaks a public API, and has some small duplication as you mention. Perhaps a much easier refactor would be to simply propagate the parseUntilLayer parameter to the existing parseNextLayer() function (which can default to unknown), and rely on the layers to not parse too far if it's not requested. That seems simpler, doesn't break the API, and I think it solves the problem. What do you think?

virtual void parseNextLayer(ProtocolType parseUntil = UnknownProtocol, OsiModelLayer parseUntilLayer = OsiModelLayerUnknown) = 0;

@seladb
Copy link
Owner

seladb commented Jun 15, 2021

@weyrick that's a great idea, and a much cleaner solution! It will still require changes in each layer's parseNextLayer() to check if the next layer should be parsed, but as you mentioned it will require fewer AP changes.
Will you consider implementing this and open a PR?

@weyrick
Copy link
Author

weyrick commented Jun 17, 2021

Unfortunately won't have time in the near future.

@seladb
Copy link
Owner

seladb commented Jun 17, 2021

that makes sense @weyrick , thanks for letting me know. Let's keep this issue open for now, maybe someone will want to take it

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants