Subject: Re: Confused about KeeperState.Disconnected and KeeperState.Expired
From: Benjamin Reed (bre@yahoo-inc.com)
Date: Jun 23, 2009 3:03:49 pm
List: org.apache.hadoop.zookeeper-user

ZooKeeper only tells you about states that it is sure about, so you will not get the Expired event until you reconnect to ZooKeeper. If you never connect to ZooKeeper again, you will never get the Expired event. If you want to time out using some sanity value, 2 times the session timeout for example, you can implement that yourself: set a timer when you get the Disconnected event (cancelling it if you reconnect first), and close the session explicitly when the timer goes off.
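
A minimal sketch of that timer in Java, in case it helps. The class `DisconnectGuard` and its method names are made up for illustration and are not part of the ZooKeeper API; the factor of 2 is just the sanity value mentioned above. It deliberately does not import ZooKeeper, so the wiring to your Watcher is only described in the comments and is an assumption about how you would hook it up.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/**
 * Hypothetical helper (not a ZooKeeper class): runs a close action if the
 * client stays disconnected for 2x the session timeout.
 *
 * Assumed wiring, inside your own Watcher.process(WatchedEvent event):
 *   KeeperState.Disconnected  -> guard.onDisconnected()
 *   KeeperState.SyncConnected -> guard.onReconnected()
 * with closeSession being a Runnable that calls zk.close().
 */
class DisconnectGuard {
    // Daemon thread so a pending timer never keeps the JVM alive.
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "disconnect-guard");
                t.setDaemon(true);
                return t;
            });
    private final long giveUpAfterMillis;
    private final Runnable closeSession;
    private ScheduledFuture<?> pending; // non-null while a timer is armed
    private long generation;            // bumped whenever a timer is armed or disarmed

    DisconnectGuard(long sessionTimeoutMillis, Runnable closeSession) {
        this.giveUpAfterMillis = 2 * sessionTimeoutMillis;
        this.closeSession = closeSession;
    }

    /** Arm the timer on the first Disconnected event; repeats are ignored. */
    synchronized void onDisconnected() {
        if (pending == null) {
            final long armedAt = ++generation;
            pending = scheduler.schedule(
                    () -> fire(armedAt), giveUpAfterMillis, TimeUnit.MILLISECONDS);
        }
    }

    /** Disarm the timer when the client reconnects. */
    synchronized void onReconnected() {
        if (pending != null) {
            pending.cancel(false);
            pending = null;
            generation++;
        }
    }

    private void fire(long armedAt) {
        synchronized (this) {
            // A reconnect may have raced with the timer; skip stale firings.
            if (armedAt != generation) {
                return;
            }
            pending = null;
        }
        closeSession.run();
    }
}
```

The generation counter is there only so that a timer that was already running when a reconnect arrived cannot close the session afterwards.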

There is a caveat in doing this: if your whole cluster goes down for 20 minutes and then comes back up, your session timeout will get reset, and the session will still be alive on the servers even though you have closed it on the client. It will then have to time out before it actually goes away. Closing the session while the client is disconnected just stops the client from trying to reconnect.

Does this make sense?

ben

Jean-Daniel Cryans wrote:

Hey all,

Working on integrating HBase with ZK, we came across an issue that we are unable to resolve. I was trying to see how well we handle network partitions and session expirations, so I started a single ZK instance with a very simple HBase setup and then killed the ZK server. The only thing I got from ZooKeeper was a KeeperState.Disconnected, then... nothing (for 20+ minutes). Normally, if I had a quorum, I would still get that message, but then I would get another one telling me it's connected to another ZK quorum server. So how do I know if I'm really partitioned from the ZK quorum? Shouldn't we get a session expired at some point? From what I understand, you can only get a KeeperState.Expired when you connect back to the quorum after x time, but what if you can "never" connect back to it?

BTW this is r785019.

Thx a lot!

J-D