Subject:Help with LeaseExpiredException
From:Michael Stack (sta@archive.org)
Date:Sep 20, 2006 2:35:11 pm
List:org.apache.hadoop.core-user

Dear Hadoopers:

I'm using hadoop 0.5.0 (my job is a derivative of the nutch fetch job). I've had success in the past with older versions of hadoop, but now jobs keep failing because one of the reduces invariably encounters four instances of the exception below:

org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.dfs.LeaseExpiredException: No lease on /user/stack/nla/2005-outputs/segments/20060920054847-nla2005/crawl_fetch/part-00018/data
        at org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:454)
        at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:228)
        at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:585)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:332)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:468)

        at org.apache.hadoop.ipc.Client$Connection.run(Client.java:159)

I've been experimenting with making the jobs smaller -- shrinking from multi-day to single-day, and on down -- but they continue to fail with the above. I was going to try Konstantin's suggestion from here --
http://mail-archives.apache.org/mod_mbox/lucene-hadoop-dev/200607.mbox/%3C331ED54F-9FA7-48FE-A604-017CC54DA524@yahoo-inc.com%3E -- of lowering the ipc timeout from 60 seconds to about 20, but I'm a little worried that'll provoke issues elsewhere.
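
For concreteness, the override I have in mind would look roughly like the fragment below, placed in hadoop-site.xml. This is only a sketch: I'm assuming the relevant property is ipc.client.timeout and that its value is given in milliseconds, so 20000 stands in for "about 20 seconds".

```xml
<!-- Sketch of a hadoop-site.xml override; the property name and the
     millisecond unit are assumptions to check against hadoop-default.xml. -->
<property>
  <name>ipc.client.timeout</name>
  <!-- roughly 20 seconds, down from the 60 seconds mentioned above -->
  <value>20000</value>
</property>
```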

I was wondering if anyone else is running into this issue or has pointers on things to try.

Thanks, St.Ack