TSCH queue add error #2105

Closed
Conrad2210 opened this issue Feb 9, 2017 · 14 comments

@Conrad2210
I'm developing a forwarding protocol using Contiki OS. The protocol runs on top of IEEE 802.15.4 TSCH mode. It needs to enqueue a number of packets within a short period of time, and very often I get the following error:

[RLL]:Send to Parent 0 base timeslot: 40, currentTimeslot: 1, send timeslot: 45 at: asn-0.46c41d
TSCH: send packet to 255 with seqno 0, queue 0 1, len 8 120
[RLL]:Send to CS base timeslot: 40, currentTimeslot: 2, send timeslot: 50 at: asn-0.46c41e
TSCH-queue:! add packet failed: 0 #0x20003004 8 #0x0 #0x0
TSCH:! can't send packet to 255 with seqno 0, queue 1 1

It adds the first packet but can't add the second one. The queue is not full; I checked that. The error simply says that it's not possible to allocate memory for another packet, although there should be more than enough space.

It's probably just a simple setting I'm overlooking, but I can't find it. If anyone has a suggestion, please let me know.

Conrad

@simonduq
Member

simonduq commented Feb 9, 2017

Hi Conrad,
Did you try tuning QUEUEBUF_CONF_NUM?
Simon
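
For readers following along: QUEUEBUF_CONF_NUM is normally set in the application's project-conf.h. A minimal sketch, assuming a header guard name chosen only for illustration and the value 32 that is tried later in this thread:

```c
/* project-conf.h (sketch; the guard name is an arbitrary choice) */
#ifndef PROJECT_CONF_H_
#define PROJECT_CONF_H_

/* Number of queuebufs available to the network stack.
 * 32 is the value reported later in this thread, not a recommendation. */
#define QUEUEBUF_CONF_NUM 32

#endif /* PROJECT_CONF_H_ */
```

Raising this value only helps when the pool is genuinely too small; it will not cure a case where buffers are never returned to the pool.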

@Conrad2210
Author

Hi Simon,
Yes, I increased it to 32, but I found a similar problem in #1766.
It seems like that's the problem, so I'm working on a workaround at the moment.
If I learn more, I'll update this post.

@simonduq
Member

simonduq commented Feb 9, 2017

OK. This sounds like a problem in tsch_queue_reset. No time to dig myself now but please share any findings on your side :)

@atiselsts
Contributor

Hi,
I think that's related to TSCH locking. tsch_queue_remove_nbr grabs a lock and calls tsch_queue_flush_nbr_queue. Then tsch_queue_flush_nbr_queue calls tsch_queue_remove_packet_from_queue for each packet, but this function does the job only if TSCH is not locked.

Now that I think of it, I suspect I've seen this as well.

@simonduq
Member

simonduq commented Feb 9, 2017

Could be lock-related but not exactly what you describe: tsch_queue_remove_nbr releases the lock before calling tsch_queue_flush_nbr_queue.
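
Whichever call path turns out to be responsible, the general failure mode under discussion (a free routine that silently does nothing while a lock is held) can be shown with a toy model. This is a sketch only: `slot_used`, `locked`, `pool_alloc`, `pool_free_unless_locked`, `pool_used` and `demo_leak` are hypothetical names and none of this is Contiki code.

```c
#include <stdbool.h>

#define POOL_SIZE 4

static bool slot_used[POOL_SIZE]; /* toy stand-in for a fixed packet pool */
static bool locked;               /* toy stand-in for the TSCH lock */

/* Take a free slot; returns its index, or -1 when the pool is exhausted. */
static int pool_alloc(void)
{
  for(int i = 0; i < POOL_SIZE; i++) {
    if(!slot_used[i]) {
      slot_used[i] = true;
      return i;
    }
  }
  return -1;
}

/* Release a slot, but silently skip the release while the lock is held. */
static void pool_free_unless_locked(int i)
{
  if(!locked) {
    slot_used[i] = false;
  }
}

/* Count the slots currently marked as in use. */
static int pool_used(void)
{
  int n = 0;
  for(int i = 0; i < POOL_SIZE; i++) {
    n += slot_used[i] ? 1 : 0;
  }
  return n;
}

/* Allocate two slots, "flush" them while locked, and report what is left. */
int demo_leak(void)
{
  int a = pool_alloc();
  int b = pool_alloc();
  locked = true;
  pool_free_unless_locked(a); /* skipped: lock is held */
  pool_free_unless_locked(b); /* skipped: lock is held */
  locked = false;
  return pool_used();         /* both slots are still marked used */
}
```

Calling `demo_leak()` twice leaves all four slots occupied, after which `pool_alloc()` fails even though the caller believes the queue is empty. That matches the symptom in the original report, though it does not prove this is the cause.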

@Conrad2210
Author

I couldn't find what caused the problem, but I found a workaround for the moment. I tested it overnight and didn't get a single error. I will test it further during my experiments, and if it's working fine I'll create a PR out of it.

For the moment, here is what I did to solve it:

void tsch_queue_reset(void)
{
  /* Deallocate unneeded neighbors */
  if (!tsch_is_locked())
  {
    struct tsch_neighbor *n = list_head(neighbor_list);
    while (n != NULL)
    {
      struct tsch_neighbor *next_n = list_item_next(n);
      /* Flush queue */
      tsch_queue_flush_nbr_queue(n);
      /* Reset backoff exponent */
      tsch_queue_backoff_reset(n);
      n = next_n;
    }

    /* Workaround: re-initialise the buffers */
    memb_init(&packet_memb); /* <--- re-initialise packet buffer */
    queuebuf_init();         /* <--- re-initialise queue buffer */
  }
}

@simonduq
Member

Right, but I'd rather fix the root cause than aggressively re-init the modules. We need to find the memory leak.
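
One generic way to hunt for such a leak is to compare a pool's free-slot count before and after an operation that should be balanced. The sketch below uses a toy pool so that it compiles on its own; every `toy_*` name, `balanced_op` and `leaky_op` are hypothetical. In the real code, Contiki's `memb_numfree()` should serve the same purpose, which is an assumption to verify against your tree.

```c
#define TOY_POOL_SIZE 8

/* Toy fixed-size pool: one flag per slot, 1 = in use. */
struct toy_pool {
  unsigned char used[TOY_POOL_SIZE];
};

/* Take a free slot; returns its index, or -1 when the pool is exhausted. */
int toy_alloc(struct toy_pool *p)
{
  for(int i = 0; i < TOY_POOL_SIZE; i++) {
    if(!p->used[i]) {
      p->used[i] = 1;
      return i;
    }
  }
  return -1;
}

/* Return a slot to the pool. */
void toy_free(struct toy_pool *p, int i)
{
  p->used[i] = 0;
}

/* Count the slots that are currently free. */
int toy_numfree(const struct toy_pool *p)
{
  int n = 0;
  for(int i = 0; i < TOY_POOL_SIZE; i++) {
    n += p->used[i] ? 0 : 1;
  }
  return n;
}

/* Run op() and report how many slots it failed to give back. */
int toy_leak_check(struct toy_pool *p, void (*op)(struct toy_pool *))
{
  int before = toy_numfree(p);
  op(p);
  return before - toy_numfree(p);
}

/* Sample operation that allocates and then frees: no leak expected. */
void balanced_op(struct toy_pool *p)
{
  int i = toy_alloc(p);
  if(i >= 0) {
    toy_free(p, i);
  }
}

/* Sample operation that allocates and forgets the slot: leaks one. */
void leaky_op(struct toy_pool *p)
{
  (void)toy_alloc(p);
}
```

Wrapping a suspect call such as a queue reset in a check like `toy_leak_check()` narrows down which code path loses buffers, without resorting to re-initialising the modules.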

@Conrad2210
Author

I'll dig deeper when I have more time, but it might be a while before I can get to it.

@simonduq
Member

I fully understand that!

@yatch
Contributor

yatch commented Feb 10, 2017

@Conrad2210 The patches of #2046 could resolve the issue you are experiencing. See #2108 for more information.

@Conrad2210
Author

@yatch thanks for this, I will test it as soon as possible. In any case, what you describe sounds reasonable and could be the problem. As soon as I know more, I'll let you know.

@Conrad2210
Author

I had time today to check whether the solution mentioned in #2108 and #2046 solves the problem.
After running a few experiments, the error no longer appeared. I will run my experiments today and tomorrow. As soon as I get the results, I will confirm my first observations.

@Conrad2210
Author

I was running experiments all day yesterday and during the night, and the problem is gone.
The solution mentioned in #2108 and #2046 solves the problem.

Thanks @yatch for the help!!!

@yatch
Contributor

yatch commented Feb 14, 2017

@Conrad2210 Thank you for the test and the report! I'm happy to hear that!

alexrayne pushed a commit to alexrayne/contiki that referenced this issue Oct 13, 2022