Subject: Re: Usefulness of JCR for GFac Descriptions (was Re: GFAC Type Architecture Design)
From: Suresh Marru (sma@apache.org)
Date: Aug 9, 2011 10:09:12 am
List: org.apache.incubator.airavata-dev

On Aug 9, 2011, at 1:00 PM, Mattmann, Chris A (388J) wrote:

Hey Guys,

I would check out the Apache Jackrabbit project:

http://jackrabbit.apache.org/

It's a full implementation of the JCR spec and very active and healthy as Apache
projects go.

Hi Chris,

It does look like a very active project, and it is the reference implementation
for JCR; thanks for the pointer. I have been poking through the documentation
but have not yet gotten my hands dirty. It might be quicker to ask you: do you
know how easy it is to add custom schemas and make the content of a document
searchable? For example, can I add a WSDL or a BPEL document and then find out,
across the repository, which application service WSDLs wrap the Gaussian
molecular chemistry model? This is just an illustrative example, but I am
curious how the indexes are built for content and how much performance will
suffer if we make a lot of content searchable.

Thanks for your insights, Suresh

On Aug 9, 2011, at 9:55 AM, Suresh Marru wrote:

Hi All,

We are stalled on this thread, so how about getting to a consensus? Since I did
not see any further discussion on the use of schemas, should we assume we want
to retain XML Schemas and add simplified beans that are easier to work with than
the generated xmlbeans? The schemas for reference are at [1]. Also, as
Patanachai explained in the original message below, GFAC uses three types of
schema documents, describing the computational host, the application deployment,
and the service interface. Using these three descriptions, an application
service WSDL is generated and GFAC manages the deployed application on various
computational resources. There is a mapping between these deployment
descriptions. I am reading the JCR API document [2] and am intrigued by its
relevance. But my inference is from a theoretical standpoint, and I am wondering
whether anyone on the list has experience, good or bad, working against the JCR
spec.

[1] - https://svn.apache.org/repos/asf/incubator/airavata/trunk/modules/commons/gfac-schema/schemas/
[2] - http://jcp.org/en/jsr/detail?id=283

Hi Patanachai,

Thanks for explaining the issue in detail. In simple terms, we need multiple
client components to register a description of an application and store it in a
registry. GFac then needs to pull the registered description document and
execute and manage the compute job. Besides XBaya, other clients also register
the document, including a gadget interface.

I agree that the current scheme has to be revisited (and minor issues fixed,
like the one you mention about the GridFTP tags). But moving from XML Schema to
a lightweight option is a bigger question. With a proper bean generation library
and serializing/deserializing methods, I personally favor XML Schema, but I do
not want to be biased either. I am -1 on POJOs simply because they would limit
non-Java-based clients such as a simple PHP web form. JSON in general sounds
like a good alternative, but I do not have experience with it in a validation
and schema sense.

I will wait for others to chime in. If no better alternatives are suggested, I
will import the missing GFac schema from the code donation into a commons
area -
https://svn.apache.org/repos/asf/incubator/airavata/donations/ogce-donation/modules/utils/schemas/gfac-schema-utils/

Cheers, Suresh

On Jul 29, 2011, at 2:09 PM, ptan@umail.iu.edu wrote:

Hi devs,

I want to discuss the type system in GFAC-Core.

Currently, the GFAC module reads and writes the information it needs using an XML schema (called GFAC-Schema) as the definition. The GFAC-Schema library is generated with XMLBeans (http://xmlbeans.apache.org/) and is referenced in the project.

Examples of GFAC-Schema types are: HostTypeDescription, which describes the environment of a host, such as Java version, temp directory, GridFTP endpoint, etc.; ServiceTypeDescription, which describes a service, such as its parameters, service name, etc.; and GFAC-SimpleType, which defines the simple parameter types a service can use, such as Boolean, Double, Integer, etc.

This is roughly how the system works: After deploying their software on a computing host, users register their host, application, and service descriptions via XBaya-GUI (Java Swing). This registration information is saved to XRegistry as an XML string conforming to the XML schema. When users invoke a (Web) service, GFAC loads the necessary information (host, application directory, parameters, etc.) and executes the deployed software. Then, GFAC parses the output from the software, wraps it, and sends it out in the appropriate parameter type format.
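
As a rough illustration of the register-then-load flow described above, here is a minimal Java sketch. Everything in it is an assumption made for illustration: the `RegistryFlowSketch` class, the in-memory map standing in for XRegistry, and the element names `hostName` and `executable` are hypothetical and are not the actual XRegistry or GFAC-Schema API. It only shows a description being stored as an XML string and read back at invocation time; it does not launch anything.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class RegistryFlowSketch {
    // Hypothetical stand-in for XRegistry: descriptions are kept as raw XML strings.
    private final Map<String, String> store = new HashMap<>();

    // What a client such as XBaya would do: save the description under a service name.
    void register(String serviceName, String descriptionXml) {
        store.put(serviceName, descriptionXml);
    }

    // What GFAC would do on invocation: load the description, read the fields
    // it needs, and assemble a command string (nothing is actually executed).
    String buildCommand(String serviceName, String inputValue) {
        String xml = store.get(serviceName);
        if (xml == null) {
            throw new IllegalArgumentException("No description registered for " + serviceName);
        }
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return text(doc, "hostName") + ":" + text(doc, "executable") + " " + inputValue;
        } catch (Exception e) {
            throw new IllegalStateException("Could not read description for " + serviceName, e);
        }
    }

    // Returns the text of the first element with the given (hypothetical) tag name.
    private static String text(Document doc, String tag) {
        Node node = doc.getElementsByTagName(tag).item(0);
        if (node == null) {
            throw new IllegalStateException("Missing element: " + tag);
        }
        return node.getTextContent();
    }

    public static void main(String[] args) {
        RegistryFlowSketch registry = new RegistryFlowSketch();
        registry.register("EchoService",
                "<service><hostName>localhost</hostName>"
              + "<executable>/opt/echo/run.sh</executable></service>");
        System.out.println(registry.buildCommand("EchoService", "hello"));
    }
}
```

Because the registry only ever sees a string, any client that can produce the XML can register a description, whatever language it is written in.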

So, the question is: do we want to continue using XML Schema? If we agree to use XML Schema, we should import some initial schemas from OGCE GFAC as a new module in Airavata. We also need to redesign some of the schemas. For instance, the current HostType schema requires a GridFTP endpoint element, which is unnecessary if a computing host doesn't have GridFTP.
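
If XML Schema is kept, one possible way to handle the GridFTP case is to mark the element optional with `minOccurs="0"`. The sketch below checks that idea with the standard `javax.xml.validation` API against a much-reduced, hypothetical host schema. The `HostSchemaSketch` class and the element names (`host`, `hostName`, `tmpDir`, `gridFTPEndpoint`) are invented here for illustration and are not taken from the real GFAC-Schema.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

final class HostSchemaSketch {
    // Hypothetical, reduced host schema: gridFTPEndpoint is optional (minOccurs='0'),
    // while hostName and tmpDir remain required.
    static final String XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='host'>"
        + "<xs:complexType><xs:sequence>"
        + "<xs:element name='hostName' type='xs:string'/>"
        + "<xs:element name='tmpDir' type='xs:string'/>"
        + "<xs:element name='gridFTPEndpoint' type='xs:anyURI' minOccurs='0'/>"
        + "</xs:sequence></xs:complexType>"
        + "</xs:element>"
        + "</xs:schema>";

    // Returns true if the XML string validates against the schema above.
    static boolean isValid(String xml) {
        try {
            SchemaFactory factory =
                    SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
            Validator validator = schema.newValidator();
            // With no custom ErrorHandler set, validation errors are thrown as SAXException.
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException | IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String withoutGridFtp =
                "<host><hostName>localhost</hostName><tmpDir>/tmp</tmpDir></host>";
        String missingTmpDir = "<host><hostName>localhost</hostName></host>";
        System.out.println("host without GridFTP valid: " + isValid(withoutGridFtp));
        System.out.println("host without tmpDir valid: " + isValid(missingTmpDir));
    }
}
```

A host with no GridFTP server then validates without a placeholder endpoint, while the elements that are still required continue to be enforced.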

Otherwise, what do you propose? POJO, JSON, etc.?