Discussion:
[gt-user] GRAM Problem .. Need your kind help
Abdul Rauf
2015-04-29 06:56:30 UTC
Hi all,

We are running a research grid with GT 6.0 as middleware and Open Grid
Engine as the LRM. Jobs submitted directly through SGE execute correctly.
However, when we run even a simple job through GRAM, like the one below,
it fails with an error. Can you please guide me in resolving this issue?
The other services, such as MyProxy and GridFTP, are working fine.

--------------------------------------------------------------------------------------------------------------
[***@client1 ~]$ globus-job-run g1mu01 /bin/hostname

GRAM Job failed because the job manager detected an invalid script status
(error code 25)
[***@client1 ~]$
-----------------------------------------------------------------------------------------------------------------

Awaiting your response.
Abdul Rauf
Steven Timm
2015-04-29 13:36:10 UTC
I have not tried Globus Toolkit 6.0 yet, but in my experience with
previous versions, Globus error 25 is usually due to some kind of
file-permissions error on your Globus gatekeeper. What Unix uid/gid is
the grid user getting mapped to? Does it have permission to write to the
various Globus areas? What does /var/log/globus-gatekeeper.log say?

Steve Timm
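
The first two questions above (which local account the grid user maps to, and whether that account can write where the job manager needs to) can be checked roughly as follows. This is a minimal sketch, not part of the original reply: the function names `mapped_users`, `can_write` and `report` are hypothetical, the grid-mapfile line shape (`"<certificate DN>" localuser`) is assumed, and the check looks only at classic mode bits, so ACLs and root privileges are ignored.

```python
import os
import pwd
import re
import stat

# Assumed grid-mapfile line shape: "<certificate DN>" localuser[,otheruser]
MAP_LINE = re.compile(r'^\s*"(?P<dn>[^"]+)"\s+(?P<users>\S+)\s*$')


def mapped_users(mapfile_text, dn):
    """Return the local accounts that `dn` maps to (empty list if none)."""
    for line in mapfile_text.splitlines():
        m = MAP_LINE.match(line)
        if m and m.group("dn") == dn:
            return m.group("users").split(",")
    return []


def can_write(mode, owner_uid, owner_gid, uid, gids):
    """Approximate the Unix write check for a non-root uid (mode bits only)."""
    if uid == owner_uid:
        return bool(mode & stat.S_IWUSR)
    if owner_gid in gids:
        return bool(mode & stat.S_IWGRP)
    return bool(mode & stat.S_IWOTH)


def report(user, paths):
    """Print whether `user` could write each path, judged by mode bits."""
    pw = pwd.getpwnam(user)
    gids = set(os.getgrouplist(user, pw.pw_gid))
    for path in paths:
        st = os.stat(path)
        ok = can_write(st.st_mode, st.st_uid, st.st_gid, pw.pw_uid, gids)
        verdict = "writable" if ok else "NOT writable"
        print(f"{path}: {verdict} by {user} (uid={pw.pw_uid}, gid={pw.pw_gid})")
```

The paths passed to `report` are whichever directories the job manager writes to on your installation; the sketch does not assume any particular locations.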



------------------------------------------------------------------
Steven C. Timm, Ph.D (630) 840-8525
***@fnal.gov http://home.fnal.gov/~timm/
Office: Feynman Computing Center 243
Fermilab Scientific Computing Division,
Scientific Computing Facilities Quadrant.,
Experimental Computing Facilities Dept.,
Project Lead for Virtual Facility Project.
Abdul Rauf
2015-04-30 09:04:25 UTC
Hi Steve,

Thanks for your response. Here is the log from
/var/log/globus/gram_globus.log, which may help with the diagnosis:

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
ts=2015-04-30T08:54:51.546868Z id=13871 event=gram.script_read.end
level=ERROR gramid=/16433941732015088016/17731012974724016525/ status=-25
reason="globus_xio: System error in read: Connection reset by
peer\nglobus_xio: A system call failed: Connection reset by peer\n"
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
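
When the GRAM log is long, entries like the one above are easier to scan once split into fields. The following is a minimal sketch that assumes each entry is a single line of `key=value` pairs with optionally double-quoted values, as the excerpt suggests; `parse_gram_line` and `is_script_error` are hypothetical helper names, not part of Globus.

```python
import re

# A value is either a double-quoted string (backslash escapes allowed)
# or a run of non-space characters.
FIELD = re.compile(r'(\w+)=("(?:[^"\\]|\\.)*"|\S+)')


def parse_gram_line(line):
    """Split one key=value log line into a dict of strings."""
    fields = {}
    for key, value in FIELD.findall(line):
        if value.startswith('"'):
            value = value[1:-1]  # drop the surrounding quotes
        fields[key] = value
    return fields


def is_script_error(fields):
    """True for ERROR-level entries whose event name starts with gram.script."""
    return (fields.get("level") == "ERROR"
            and fields.get("event", "").startswith("gram.script"))
```

Filtering the log with `is_script_error` would list only the job manager script failures, together with their `status` and `reason` fields.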

And here is the log from /var/log/globus/globus-gatekeeper.log:

-------------------------------------------------------------------------------------------
[***@g1mu01 log]# tail globus-gatekeeper.log
TIME: Thu Apr 30 11:54:50 2015
PID: 13871 -- Notice: 0: Set GATEWAY_INTERFACE to CGI/1.1
TIME: Thu Apr 30 11:54:50 2015
PID: 13871 -- Notice: 0: Set SERVER_NAME to g1mu01
TIME: Thu Apr 30 11:54:50 2015
PID: 13871 -- Notice: 0: Set SERVER_PORT to 2119
TIME: Thu Apr 30 11:54:51 2015
PID: 13870 -- Notice: 0: Read 269 bytes from proxy pipe
TIME: Thu Apr 30 11:54:51 2015
PID: 13870 -- Notice: 0: Child 13871 started
[***@g1mu01 log]#
-------------------------------------------------------------------------------------------
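
The gatekeeper log interleaves entries from the parent process (13870) and the child it starts (13871). Grouping entries by PID makes it easier to see what each process last reported. This is a minimal sketch that assumes the two-line `TIME:` / `PID:` layout shown in the excerpt; `group_by_pid` is a hypothetical helper name.

```python
import re
from collections import defaultdict

# Matches lines such as: PID: 13871 -- Notice: 0: Set SERVER_PORT to 2119
ENTRY = re.compile(r'^PID:\s*(\d+)\s+--\s+(\w+):\s*(\d+):\s*(.*)$')


def group_by_pid(lines):
    """Map each PID to its (time, severity, message) tuples, in file order."""
    by_pid = defaultdict(list)
    current_time = None
    for raw in lines:
        line = raw.strip()
        if line.startswith("TIME:"):
            # Remember the timestamp for the PID line that follows it.
            current_time = line[len("TIME:"):].strip()
            continue
        m = ENTRY.match(line)
        if m:
            pid, severity, _code, message = m.groups()
            by_pid[int(pid)].append((current_time, severity, message))
    return dict(by_pid)
```

Lines that match neither pattern, such as shell prompts, are skipped.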

I'm waiting for your response.

Thanks and best regards,
Abdul Rauf


Abdul Rauf
2015-05-02 05:49:24 UTC
Hi Steven & other GT users,

I am waiting for your response. Please help to resolve the issue.

Thanks and best regards,
Abdul Rauf




Sill, Alan
2015-05-02 07:12:03 UTC
I believe Steve had also asked these questions:

> On May 2, 2015, at 12:49 AM, Abdul Rauf <***@gmail.com> wrote:
>
> What unix uid/gid is the grid user getting mapped to? does it have
> permissions to write the various globus areas?