|Subject:||Re: the Send bug - solved?|
|From:||Arshan Poursohi (Arsh...@sun.com)|
|Date:||Oct 2, 2008 3:50:55 pm|
Ron Goldman wrote:
I think I understand why Yggdrasil was having trouble sleeping when it couldn't reach the basestation. There are a number of bad interactions & wrong assumptions between Yggi and the Routing Manager --- plus some outright bugs in the Routing Manager.
The major problems include:
1. Yggi still shuts the Routing Manager (RM) down when it wants to deep sleep & restarts it when it wakes up. This causes the RM to spawn new RequestTable & RouteTable cleaning threads. But since Java ME doesn't let one explicitly kill a thread, the old cleaning threads are still alive, blocked on the same lock that the new threads immediately also block on. The original AODV code didn't allow for the cleaning threads to be stopped & restarted, so it uses notify() instead of notifyAll() with the result that the old cleaning thread gets woken up & then realizes that it is "dead" so it goes away. Meanwhile the new thread is still blocked until another table entry is added. And by that point it also has been marked for death & another cleaning thread started. The end result of all this is that the Route Request never times out.
Why do you say "still"? We saw this problem with the send before the latest changes, which I understand make explicitly sleeping the Routing Manager unnecessary. That is, we saw it back when we still had to make those calls to get the SPOT to sleep at all.
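For anyone following along, the notify() vs. notifyAll() behavior in point 1 can be reproduced outside the RM with a rough sketch like the one below. This is not the AODV code: the class, fields and methods are all made up for illustration, and it uses standard Java SE (lambdas, java.util.concurrent) rather than Java ME. Two "cleaner" threads wait on the same lock; the first is made stale by a restart but cannot be killed, so it keeps waiting. When an entry is added, notify() wakes only one waiter, and if that happens to be the stale one, it exits and the live cleaner never sees the entry.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Illustrative only: all names here are hypothetical, not taken from the RM/AODV source.
class CleanerWakeupSketch {
    private final Object lock = new Object();
    private final CountDownLatch handled = new CountDownLatch(1);
    private final boolean useNotifyAll;
    private int generation = 0; // bumped on every (re)start; older cleaners become stale
    private int pending = 0;    // table entries waiting for a cleaner
    private int waiting = 0;    // cleaners currently blocked in wait()

    private CleanerWakeupSketch(boolean useNotifyAll) {
        this.useNotifyAll = useNotifyAll;
    }

    private void startCleaner() {
        final int myGeneration;
        synchronized (lock) {
            myGeneration = ++generation;
        }
        Thread cleaner = new Thread(() -> {
            synchronized (lock) {
                waiting++;
                try {
                    // A stale cleaner stays blocked here until something wakes it.
                    while (myGeneration == generation && pending == 0) {
                        lock.wait();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                } finally {
                    waiting--;
                }
                if (myGeneration != generation) {
                    return; // stale cleaner: it used up a wake-up and simply exits
                }
                pending--;
                handled.countDown();
            }
        });
        cleaner.setDaemon(true);
        cleaner.start();
    }

    // Polls until at least n cleaners are blocked in wait(), or about 2 s pass.
    private void awaitWaiters(int n) throws InterruptedException {
        long deadline = System.currentTimeMillis() + 2000;
        while (System.currentTimeMillis() < deadline) {
            synchronized (lock) {
                if (waiting >= n) {
                    return;
                }
            }
            Thread.sleep(1);
        }
    }

    private void addEntry() {
        synchronized (lock) {
            pending++;
            if (useNotifyAll) {
                lock.notifyAll(); // every waiter re-checks its own state
            } else {
                lock.notify();    // exactly one waiter wakes; which one is up to the JVM
            }
        }
    }

    // Returns true if the current (non-stale) cleaner handled the new entry in time.
    static boolean currentCleanerHandlesEntry(boolean useNotifyAll) {
        CleanerWakeupSketch s = new CleanerWakeupSketch(useNotifyAll);
        try {
            s.startCleaner();  // original cleaner
            s.awaitWaiters(1);
            s.startCleaner();  // "restart": the old cleaner is stale but still waiting
            s.awaitWaiters(2);
            s.addEntry();
            return s.handled.await(500, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Calling CleanerWakeupSketch.currentCleanerHandlesEntry(true) should return true, since notifyAll() wakes both threads and the live one handles the entry. With false, the result depends on which waiter the JVM picks, which is the failure mode described above.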
2. Commenting out the Yggi Switchboard calls to start/stop the RM then exposes a second issue: AODV keeps timed-out Routing Requests around for a short time so that it can ignore any duplicates that arrive later, preventing loops, etc. The time it keeps them for is currently 60 seconds -- during which time the RequestTable cleaning thread will prevent deep sleep from happening. Now if Yggi is waking up every minute and trying to connect to a basestation that isn't there, that means that just before the old Route Request is flushed from the RequestTable (at which point the RM would allow deep sleep), a new Route Request is made that overlaps with the old one. So no deep sleep for the SPOT until it can successfully reach the basestation.
Don't know if this goes back to the 1st problem, but we were seeing the SPOTs hang entirely when using sleep periods of 30 min between samples.
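The overlap in point 2 comes down to simple arithmetic, sketched below. This is a simplified model, not RM code: the class and method names are made up, and it assumes each wake-up issues one Route Request that blocks deep sleep for its own timeout plus the retention time. Only the 60-second retention and the one-minute wake period come from the description above; the 5-second request timeout in the usage note is just an assumed placeholder.

```java
// Illustrative only: a simplified timing model with hypothetical names.
class DeepSleepWindowSketch {
    /**
     * Milliseconds per wake cycle in which no Route Request is pending or
     * retained, i.e. in which deep sleep would be allowed under this model.
     */
    static long sleepWindowMillis(long wakePeriodMillis,
                                  long requestTimeoutMillis,
                                  long retentionMillis) {
        // A request blocks deep sleep until it has timed out AND been flushed.
        long blockedMillis = requestTimeoutMillis + retentionMillis;
        // If the next request starts before the old one is flushed, the window is zero.
        return Math.max(0L, wakePeriodMillis - blockedMillis);
    }
}
```

With a 60,000 ms wake period, an assumed 5,000 ms request timeout and the 60,000 ms retention, sleepWindowMillis(60000, 5000, 60000) is 0: the next request always arrives before the old one is flushed. Under this model a window only opens once the wake period exceeds the timeout plus the retention time.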
I'll work with Pete to try to improve the RM to avoid these problems.
Could you guys note the mods that should be made to correspond with any new changes and with the changes made for the explicit sleeping of the Routing Manager?
Also, we are considering some changes to the way sleeping is managed in Yggi, and I'd love to talk to you a bit about this at Tuesday's Yggi/Solarium meeting if you can make it.
-- Ron --
p.s. I think this fully explains the problems that Yggi was seeing, but I ran into what I thought was the same bug and it still occurs despite fixes for the above....