

5 messages in net.sunsource.gridengine.users: Re: [GE users] JSV scripts running un...

| From | Sent On | Attachments |
|---|---|---|
| ah_sunsource | Jun 10, 2009 3:15 am | |
| ernst | Jun 10, 2009 4:32 am | |
| ah_sunsource | Jun 10, 2009 6:10 am | .Other |
| ernst | Jun 10, 2009 10:00 am | |
| dougalb | Aug 25, 2009 1:26 am | |

| Subject: | Re: [GE users] JSV scripts running unreliably | |
|---|---|---|
| From: | ernst (Erns...@sun.com) | |
| Date: | Jun 10, 2009 10:00:40 am | |
| List: | net.sunsource.gridengine.users | |
Hi Andreas,
The reason for your problem is the line "if [ $all_ok]; then". You should replace it with "if [ $all_ok -eq 1 ]; then". Because of that error, the jsv_reject() and jsv_accept() functions are both executed. As a result, an error message is sent to the master, which triggers a restart of the corresponding process. Unfortunately, the error message is not logged to the message file; that would have helped you find the error.
In general, I recommend that you implement your script in a different scripting language (e.g. Tcl or Perl). Bourne shell does not provide built-in parsing capabilities, and lines like "value=`echo $h_vmem | head -c -2`" fork a process. This might become a performance bottleneck.
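A minimal sketch of both points, assuming a POSIX-compatible /bin/sh. It is not the attached script: the jsv_accept and jsv_reject definitions are stand-ins so that the snippet runs on its own (a real JSV script gets them from jsv_include.sh), and the decide helper and the example h_vmem value are hypothetical.

```sh
#!/bin/sh
# Stand-ins so the sketch runs on its own; a real JSV script would get
# jsv_accept and jsv_reject from jsv_include.sh instead.
jsv_accept() { echo "ACCEPT: $1"; }
jsv_reject() { echo "REJECT: $1"; }

# Hypothetical helper: report exactly one decision per job.
decide() {
    all_ok=$1
    # Note the space before "]" and the explicit numeric comparison.
    if [ "$all_ok" -eq 1 ]; then
        jsv_accept "Job is accepted"
    else
        jsv_reject "Job is rejected"
    fi
}

# Drop the trailing unit character without forking echo and head.
h_vmem="4G"          # example value, assumed for illustration
value=${h_vmem%?}    # removes the last character, leaving "4"

decide 1
decide 0
```

The `${h_vmem%?}` expansion is evaluated by the shell itself, so it avoids the extra processes that the `echo | head` pipeline starts for every job.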
Cheers,
Ernst
ah_sunsource wrote:
> Hi Ernst,
> thanks for your reply.
> On Wed, 2009-06-10 at 13:32 +0200, ernst wrote:
> > Hi Andreas,
> > Your JSV scripts are restarted for two reasons:
> > 1) The message "JSV modification time in ..." indicates that the modification time stamp of your JSV script has changed. Within GE, a worker thread detects that and restarts the corresponding JSV process when the next incoming job is to be verified.
> OK.
> > 2) There is a protocol error between a JSV process and the corresponding thread in the master. I assume that your JSV script is not implemented correctly. The first job that is verified by the JSV process is handled correctly, but the second results in a protocol error. To debug your JSV script, you can set the "logging_enabled" and "log_file" variables in the file that is included in your JSV script (e.g. JSV.pm, jsv_include.tcl or jsv_include.sh). After enabling this, you can find the data that is exchanged between the master and the JSV process in the log_file.
> I'm assuming a bug in my JSV as well ... ;-) But I don't really see a reason for this. I'm attaching it. Nevertheless, I found a Perl example and rewrote the JSV in Perl, and this one works perfectly :-)
> I've also enabled logging, but I only get a logfile when the check runs correctly. Otherwise no log is written at all. Any idea?
> Cheers, Andreas
-- Sun Microsystems GmbH Ernst Bablick Dr.-Leo-Ritter-Str. 7 Software Engineer D-93049 Regensburg Phone: +49 (0)941 3075 135 Germany Fax: +49 (0)941 3075 222 http://www.sun.de mailto: erns...@sun.com
Sitz der Gesellschaft: Sun Microsystems GmbH, Sonnenallee 1, D-85551 Kirchheim-Heimstetten Amtsgericht München: HRB 161028 Geschäftsführer: Thomas Schröder, Wolfgang Engels, Wolf Frenkel Vorsitzender des Aufsichtsrates: Martin Häring
------------------------------------------------------ http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=201449