
Explorations with a VOEvent Ontology for a User-Annotated Solar Catalogue


Elizabeth Auden, VOTech

13 November 2005

Introduction

Solar event catalogues provide spatial, time, spectral and coordinate information for occurrences on the Sun such as flares, coronal mass ejections (CMEs) and solar waves. The use of this metadata makes solar events well suited to description by VOEvent packets. Most event catalogues are maintained by scientists associated with specific facilities or instruments, such as the Yohkoh SXT / TRACE flare list [1], the Hessi flare list [2], and the NOAA SGAS energetic event list [3]. As new solar missions are launched that produce increasingly large datasets, more solar events are likely to be discovered by scientists who analyse mission datasets but are not formally associated with the mission in question. Therefore, a solar physicist at MSSL has suggested the development of an online solar event catalogue that could receive contributions from any member of the solar community. The Solar User-Annotated VOEvent Catalogue (SuaveCAT) will be implemented as VOTech research into the use of a VOEvent ontology with practical space science applications.

Background

The EGSO Solar Event Catalogue [4] has become quite popular as solar researchers have begun to interact with virtual observatories. This event catalogue combines other online catalogues, such as the Yohkoh, Hessi and NOAA catalogues mentioned above, into a single searchable interface. Users may search either one or two catalogues simultaneously based on event start and end times, or they may freely search the combined system using SQL queries. The EGSO SEC has been integrated with AstroGrid using the DataSet Access (DSA) [5] module. This allows users to incorporate searches of the SEC into larger workflows, such as the AstroGrid solar movie maker. In order to integrate successfully with solar workflows, the SuaveCAT facility should offer similar event metadata to the catalogues incorporated in the EGSO SEC.

In addition to the first requirement that virtual observatory users be able to update this event catalogue, two further requirements were imposed to gain experimental value from the project. The second requirement was the ability to both search the catalogue and add new events to it using the AstroGrid infrastructure. Third, as the event metadata contained in most solar event catalogues overlapped with concepts encapsulated in the VOEvent schema developed by the International Virtual Observatory Alliance [6], it was decided that the SuaveCAT project would provide a good base for investigating a VOEvent ontology and accompanying software agents.

This paper examines the initial development of an ontology upon which to base the catalogue, development of the catalogue itself, and the configuration of AstroGrid tools to search the catalogue, add to the catalogue, and retrieve catalogue entries as VOEvent packets. In addition, I hope to raise several questions that can be explored in further work on SuaveCAT: what value can an ontology add to a space event catalogue? Are there science issues that an ontology-based software agent can answer better than an SQL query?

Ontologies

STC UML to OWL

The first draft of an ontology based on the VOEvent schema 0.90 [7] was developed in June 2005. Several issues were raised during discussion on the IVOA DM mailing list [8]. Two such issues were conducive to simultaneous investigation: first, subelements of the WhereWhen element of the VOEvent schema could “in general, be any legal VO STC expression.”[9] The Space-Time Coordinates Metadata for the Virtual Observatory (STC) is an IVOA schema that provides a precise format in which to specify the spatial, time, and spectral information for a VO resource [10]. Although this detailed schema had not yet been encoded as an OWL file that could be imported into a VOEvent ontology, it was available as a series of UML diagrams. This tied in with a second issue announced on the IVOA DM mailing list; Dragan Gasevic’s new XSLT tool could translate an XMI file to OWL, and this tool could be used in a wider context to convert UML diagrams to XMI and finally OWL [11].

Converting the existing STC UML diagrams to OWL with tools seemed to offer a potentially large saving in effort compared to building an STC ontology by hand from the schema. Arnold Rots kindly provided me with STC UML diagrams constructed in Microsoft Visio 2003. Microsoft offered a Visio plugin, XMIExprt, that could convert UML static structure diagrams to XMI. After some trial and error, I was able to build the XMIExprt plugin (with minor edits) in Microsoft Visio 2005 Beta, install the plugin in Visio 2003, and export the STC UML diagrams to a single XMI file [12]. Turning to Gasevic’s XMItoOWL.xslt tool, I read that the tool works best with XMI files created with the UML software Poseidon [13]. Applying the XMItoOWL.xslt tool to the STC.xmi file exported with Visio 2003 produced an OWL file containing only XML namespace declarations. I tried opening STC.xmi in Poseidon and re-exporting the file as XMI, but unfortunately the tool produced the same result. I was unable to convert the STC UML diagrams to an OWL file using these methods.

VOEvent 1.0 to OWL

Rather than forging ahead with the creation of an STC ontology by hand, I turned back to the VOEvent ontology. Between June and October 2005, the VOEvent schema had grown from version 0.90 to 1.0 [14]. The initial VOEvent ontology was updated to be contemporary with the 1.0 schema using the ontology tool Protégé [15].

The VOEvent ontology is based on three concepts detailed in the Protégé User Tutorial: classes, object properties, and datatype properties [16]. Each element and subelement of the VOEvent schema is represented by a class. Relationships between elements are represented with object properties; the most common relationship in this ontology is “has[SubElement]”. Each element has one has[SubElement] object property for each subelement it contains; this design decision was debated on the IVOA DM mailing list. Strictly defining each VOEvent element in terms of a specific number and type of subelements will hopefully aid the software agents and correlation tools later on. Finally, the XML attributes of relevant elements in the VOEvent schema have been included in the ontology as functional datatype properties. The use of “functional” restricts each element to at most one value of the corresponding attribute.
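The class / object property / functional datatype property structure described above can be illustrated with a small plain-Python sketch. This is not OWL and not the Protégé model itself; the `OntClass` helper and its method names are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class OntClass:
    """A VOEvent element modelled as an ontology class (hypothetical helper)."""
    name: str
    has: dict = field(default_factory=dict)         # object properties: has[SubElement]
    attributes: dict = field(default_factory=dict)  # functional datatype properties

    def add_sub(self, sub):
        # One has[SubElement] object property per contained subelement.
        self.has["has" + sub.name] = sub
        return sub

    def set_attribute(self, key, value):
        # "Functional" means at most one value per property, so a second,
        # different value is rejected rather than stored alongside the first.
        if key in self.attributes and self.attributes[key] != value:
            raise ValueError(f"{self.name}.{key} is functional and already set")
        self.attributes[key] = value


# The What class with a single Param subelement carrying name and value.
what = OntClass("What")
param = what.add_sub(OntClass("Param"))
param.set_attribute("name", "ARN")
param.set_attribute("value", "1024")
```

Attempting `param.set_attribute("name", "InstrumentName")` on the same Param instance raises `ValueError`, which mirrors the functional restriction; a second Param instance would be needed for the instrument name.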

Solar VOEvent Catalogue Ontology

A VOEvent packet contains up to eight subelements: Who, What, WhereWhen, Why, How, Citations, Description, and Reference. An individual VOEvent packet may contain at most one of each of these subelements. Five of these subelements were chosen for inclusion in the solar VOEvent catalogue ontology.

The Who element provides curation information encapsulated in PublisherID, Contact, and Date elements. Although the Contact element can contain a number of subelements that provide address, email and telephone information, for simplicity in the catalogue only two subelements were chosen: Name and Institution. Therefore, a Who class was created with “has” relationships to PublisherID, Date, and Contact classes. The Contact class has “has” relationships with Name and Institution.

The What element contains observational information; this may include Param elements grouped under Group elements, individual Param elements, References and Descriptions. For this ontology, the What class only has a “has” relationship with the Param class. This reflects the structure of the catalogue; observational information, here restricted to “ARN” (active region number) and “Instrument Name”, can be encapsulated in Param classes using name and value functional datatype properties without further need for Reference or Description classes. “Instrument Name” was included as a Param class under What instead of as a Reference under How for the user’s ease. Existing solar event catalogues simply include the name of the mission and instrument rather than a URI pointing to the instrument’s description. For this reason, the How element was included in the solar VOEvent ontology, but it is not used in the catalogue.

Further observational data is described inside the WhereWhen element. This element contains space-time coordinate metadata as described in the STC schema, such as spatial and time frames, observation time data, coordinates, and spectral data. Until an STC ontology has been built that can be imported into a VOEvent ontology, specific elements relevant to solar observations were chosen and encapsulated in an ObservationLocation class. This class has “has” relationships with AstroCoords, AstroCoordSystem, and AstroCoordArea classes. The AstroCoordSystem class contains SpaceFrame and TimeFrame classes; the chosen spatial frame for this ontology is “HGC”, or heliographic coordinates, and “TOPOCENTER” indicates the position of the instrument. The TimeFrame class has TimeScale “UTC” for Universal Time, “TOPOCENTER” indicating instrument time, and a Name that can be filled as “Time”. The AstroCoords class contains latitude and longitude data inside the Position2D class along with spectral units, name, value, and error data inside the Spectral class. Finally, the AstroCoordArea class contains time information; within a TimeInterval class, StartTime and StopTime classes each contain ISOTime classes in which users can add the event’s start and stop times as ISO 8601 dates.

The Why element contains Concept and Name subelements that can be grouped under an Inference element. In this ontology, the Why class has a “has” relationship with Inference, which in turn has “has” relationships with Concept and Name. Concept can be used in the solar VOEvent catalogue to describe the event type, such as flare, CME, or wave. Users could be more specific with the Concept class and mark an event as a “Class B X-ray Flare”. As work with this ontology develops, it may be sensible to separate broad event types from specific event classes. The Name class may be used to hold an event name if a particularly notable event later has a name attached to it, such as “The Valentine’s Day Flare”. This instance of the Name class may prove to be unnecessary to the catalogue.

Finally, the Citations element was created as a class with “has” relationships to EventID and Description along with a functional datatype property for “reason” (supersedes, followup, retraction). The EventID class allows users to specify events from either the SuaveCAT resource or other solar event catalogues that may be related to the catalogue entry being made, and a user can expound upon the relationship between two events with the Description class. The Citations class will be important to the solar VOEvent catalogue as future development with event correlation tools reveals not only events observed with different instruments in multiple catalogues, but also relationships between flares and coronal mass ejections.

Catalogue Creation

To create a user-updatable database from the solar VOEvent ontology described above, the various elements of a VOEvent packet representative of a solar catalogue entry were sorted into information that could be hard-coded for the catalogue and information that required user input. Much of the space-time “infrastructure” metadata contained in the WhereWhen class could be easily hard-coded along with the catalogue’s publisherID, Param element names, event roles and VOEvent version. The curation, coordinates, spectral data, time information and citation data were reduced to nineteen fields of user input; slightly more information required than the average solar event catalogue, but not overwhelming. Each catalogue entry’s EventID is generated automatically to ensure uniqueness. The catalogue’s user input is stored in a MySQL database.

| User-provided | Hard-coded | Generated |
| Active region number | Role (“observation”) | Event ID |
| Instrument name | Version (“1.0”) | |
| Event description | PublisherID (“ivo://mssl.ucl.ac.uk”) | |
| Solar longitude | Param element for ARN | |
| Solar latitude | Param element for instrument name | |
| Spectral unit | AstroCoordSystem ID | |
| Spectral name | (“HGC-UTC-TOPO”) | |
| Spectral value | TimeFrame | |
| Spectral error | Name="Time" | |
| Start time | TimeScale="UTC" | |
| End time | TOPOCENTER | |
| Event name | SpaceFrame | |
| Event concept | Name="Solar Space Frame" | |
| Event reporter | HGC | |
| Reporter’s institution | TOPOCENTER | |
| Reporting date | SPHERICAL | |
| Cited eventID | AstroCoords | |
| Citation reason | coord_system_id="HGC-UTC-TOPO" | |
| Citation description | Position2D | |
| | Unit="deg" | |
| | Name="Longitude, Latitude" | |
| | AstroCoordArea | |
| | ID="Sun" | |
| | coord_system_id="HGC-UTC-TOPO" | |
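As a rough illustration of how the nineteen user-provided fields and the generated identifier might map onto a single database table, the sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL. The table name, column names and the use of TEXT for every column are assumptions; the actual SuaveCAT schema is not given here.

```python
import sqlite3

# Hypothetical column names for the nineteen user-provided fields.
USER_FIELDS = [
    "active_region_number", "instrument_name", "event_description",
    "solar_longitude", "solar_latitude",
    "spectral_unit", "spectral_name", "spectral_value", "spectral_error",
    "start_time", "end_time",
    "event_name", "event_concept",
    "event_reporter", "reporter_institution", "reporting_date",
    "cited_event_id", "citation_reason", "citation_description",
]


def create_catalogue(conn):
    """Create the catalogue table: one generated key plus the user fields."""
    columns = ", ".join(f"{name} TEXT" for name in USER_FIELDS)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS suavecat ("
        "event_number INTEGER PRIMARY KEY AUTOINCREMENT, " + columns + ")"
    )


conn = sqlite3.connect(":memory:")  # in-memory stand-in for the MySQL database
create_catalogue(conn)
```

The hard-coded values in the middle column do not need columns of their own, since they are identical for every entry and can be written directly when a VOEvent packet is produced.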

AstroGrid Integration

DSA

The solar VOEvent catalogue data has been made available through an AstroGrid DSA module, using astrogrid-pal-skycatserver-1.1-004pl.war. This DSA instance has been configured as a TabularDB resource. Users can build queries with the Astronomical Data Query Language (ADQL) to search the catalogue entries. Using ADQL queries with DSA provides the ability to search events by date, a feature of most solar event catalogues. However, ADQL also allows searches to be performed on any combination of the nineteen user input columns plus the autogenerated EventID column.
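The kind of search described above, by date and optionally by another column, can be sketched as follows. This is ordinary SQL run against an in-memory SQLite table, not ADQL submitted through DSA; the table layout, column names and sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE suavecat "
    "(event_id TEXT, event_concept TEXT, start_time TEXT, end_time TEXT)"
)
# Made-up test rows; times are ISO 8601 strings, which sort chronologically.
conn.executemany("INSERT INTO suavecat VALUES (?, ?, ?, ?)", [
    ("suavecat1", "flare", "2002-12-02T12:00:00", "2002-12-02T13:00:00"),
    ("suavecat2", "CME", "2002-12-05T08:00:00", "2002-12-05T09:30:00"),
])


def search_by_date(conn, window_start, window_end, concept=None):
    """Return IDs of events whose time interval overlaps the search window."""
    sql = "SELECT event_id FROM suavecat WHERE start_time <= ? AND end_time >= ?"
    args = [window_end, window_start]
    if concept is not None:
        # Any other column can be combined with the date constraint.
        sql += " AND event_concept = ?"
        args.append(concept)
    return [row[0] for row in conn.execute(sql + " ORDER BY event_id", args)]


hits = search_by_date(conn, "2002-12-02T00:00:00", "2002-12-03T00:00:00")
```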

Currently, the catalogue’s DSA instance is only available through a development AstroGrid installation at MSSL while testing occurs with false event data. Once the catalogue is stable, it will be migrated to a live AstroGrid installation so that solar physicists may begin inputting data from observed solar events.

CEA

In addition to searching facilities, two other tools have been developed for use with this event catalogue: a tool to add new events and a tool to return a VOEvent packet given an event ID from the catalogue. Both tools have been deployed as AstroGrid CEA applications interfacing with unix commandline scripts.

The ability to add new events to this catalogue was one of the primary project requirements. Originally, functionality to add, edit and delete solar events was going to be provided through a JSP interface. However, the secondary requirement to use the AstroGrid infrastructure where possible shifted development of these tools to CEA applications. Currently, the SuaveCAT “add event” tool can be accessed through the portal or workbench. Users are presented with nineteen text boxes in which to enter curation, location, spectral, time, and event description metadata. The add tool then appends this information to a MySQL database that holds the event catalogue. An event ID of the format “suavecat#” is generated each time an event is added. No output is returned to the user, but the catalogue gains an additional entry.
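A minimal sketch of the “add event” step follows, assuming the event ID is formed from a database-generated row number. SQLite again stands in for MySQL, only four of the nineteen fields are shown, and the function and column names are hypothetical rather than taken from the deployed commandline script.

```python
import sqlite3


def add_event(conn, fields):
    """Append one catalogue entry and return its generated event ID."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS suavecat ("
        "n INTEGER PRIMARY KEY AUTOINCREMENT, event_id TEXT UNIQUE, "
        "instrument_name TEXT, event_concept TEXT, "
        "start_time TEXT, end_time TEXT)"
    )
    cur = conn.execute(
        "INSERT INTO suavecat "
        "(instrument_name, event_concept, start_time, end_time) "
        "VALUES (:instrument_name, :event_concept, :start_time, :end_time)",
        fields,
    )
    # The unique row number supplies the "#" part of "suavecat#".
    event_id = f"suavecat{cur.lastrowid}"
    conn.execute("UPDATE suavecat SET event_id = ? WHERE n = ?",
                 (event_id, cur.lastrowid))
    conn.commit()
    return event_id


conn = sqlite3.connect(":memory:")
first_id = add_event(conn, {
    "instrument_name": "TRACE", "event_concept": "flare",
    "start_time": "2002-12-02T12:00:00", "end_time": "2002-12-02T13:00:00",
})
```

Deriving the ID from an auto-incremented key is one simple way to guarantee uniqueness; the real tool may generate its IDs differently.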

The second tool developed for SuaveCAT generates a VOEvent packet from an individual catalogue entry. The user provides a single entry – the SuaveCAT event ID – and the tool extracts the relevant catalogue entry from the database. The commandline tool then constructs an XML file in accordance with the VOEvent schema. Information from the SuaveCAT database is written into curation, location, time, spectral and description elements, but many of the VOEvent subelements are hard-coded to uniform values for all catalogue entries, particularly spatial and time frame information along with some curation metadata.

A sample SuaveCAT VOEvent packet generated by the tool takes the following format:

<?xml version="1.0" encoding="UTF-8"?>
<VOEvent id="ivo://mssl.ucl.ac.uk/10249" role="observation" version="1.0"
    xmlns="http://www.ivoa.net/xml/VOEvent/v1.0"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.ivoa.net/xml/VOEvent/v1.0
    http://www.ivoa.net/internal/IVOA/IvoaVOEvent/VOEvent-v1.0.xsd">
    <Who>
        <PublisherID>ivo://mssl.ucl.ac.uk</PublisherID>
        <Contact>
            <Name>10243</Name>
            <Institution>10244</Institution>
        </Contact>
        <Date>2005-11-05T12:00:00</Date>
    </Who>
    <What>
        <Param name="ARN" value="1024" />
        <Param name="InstrumentName" value="TRACE" />
    </What>
    <Why>
        <Concept>10242</Concept>
    </Why>
    <WhereWhen>
        <ObservationLocation>
            <AstroCoordSystem xmlns="http://www.ivoa.net/xml/STC/stc-v1.20.xsd" ID="HGC-UTC-TOPO">
                <TimeFrame>
                    <Name>Time</Name>
                    <TimeScale>UTC</TimeScale>
                    <TOPOCENTER/>
                </TimeFrame>
                <SpaceFrame>
                    <Name>Solar Space Frame</Name>
                    <HGC/>
                    <TOPOCENTER/>
                    <SPHERICAL coord_naxes="2"></SPHERICAL>
                </SpaceFrame>
            </AstroCoordSystem>
            <AstroCoords xmlns="http://www.ivoa.net/xml/STC/STCcoords/v1.20" coord_system_id="HGC-UTC-TOPO">
                <Position2D unit="deg">
                    <Name>Longitude, Latitude</Name>
                    <Value2>148 72</Value2>
                </Position2D>
                <Spectral unit="Angstrom">
                    <Name>x-ray</Name>
                    <Value>600</Value>
                    <Error>0.8</Error>
                </Spectral>
            </AstroCoords>
            <AstroCoordArea ID="Sun" coord_system_id="HGC-UTC-TOPO">
                <TimeInterval>
                    <StartTime>
                        <ISOTime xmlns="http://www.ivoa.net/xml/STC/STCcoords/v1.20">2002-12-02T12:00:00</ISOTime>
                    </StartTime>
                    <StopTime>
                        <ISOTime xmlns="http://www.ivoa.net/xml/STC/STCcoords/v1.20">2002-12-02T13:00:00</ISOTime>
                    </StopTime>
                </TimeInterval>
            </AstroCoordArea>
        </ObservationLocation>
    </WhereWhen>
    <Citations>
        <EventID cite="followup">ivo://mssl.ucl.ac.uk/suavecat20</EventID>
        <Description>10248</Description>
    </Citations>
</VOEvent>
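A packet of the form shown above could be assembled from a catalogue entry along the following lines. This is a simplified sketch using Python's standard xml.etree.ElementTree module: it covers only the Who, What and Why elements, omits the namespace declarations, WhereWhen and Citations, and uses hypothetical dictionary keys for the catalogue columns. The way the packet id is composed from the publisher ID and the event ID is likewise an assumption.

```python
import xml.etree.ElementTree as ET

PUBLISHER_ID = "ivo://mssl.ucl.ac.uk"  # hard-coded for every catalogue entry


def build_voevent(entry):
    """Build a partial VOEvent packet (Who, What, Why) as an XML string."""
    root = ET.Element("VOEvent", {
        "id": f"{PUBLISHER_ID}/{entry['event_id']}",  # assumed id layout
        "role": "observation",
        "version": "1.0",
    })

    # Curation metadata supplied by the event reporter.
    who = ET.SubElement(root, "Who")
    ET.SubElement(who, "PublisherID").text = PUBLISHER_ID
    contact = ET.SubElement(who, "Contact")
    ET.SubElement(contact, "Name").text = entry["event_reporter"]
    ET.SubElement(contact, "Institution").text = entry["reporter_institution"]
    ET.SubElement(who, "Date").text = entry["reporting_date"]

    # Param names are hard-coded; only the values come from the catalogue.
    what = ET.SubElement(root, "What")
    ET.SubElement(what, "Param", name="ARN",
                  value=entry["active_region_number"])
    ET.SubElement(what, "Param", name="InstrumentName",
                  value=entry["instrument_name"])

    why = ET.SubElement(root, "Why")
    ET.SubElement(why, "Concept").text = entry["event_concept"]

    return ET.tostring(root, encoding="unicode")


packet = build_voevent({
    "event_id": "suavecat20",
    "event_reporter": "A. Reporter",          # made-up test data
    "reporter_institution": "Example Institute",
    "reporting_date": "2005-11-05T12:00:00",
    "active_region_number": "1024",
    "instrument_name": "TRACE",
    "event_concept": "flare",
})
```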

Future Work

New Tools

In addition to the existing add, search, and return VOEvent packet facilities, further tools will be developed to add functionality to this solar event catalogue. First, functionality to delete and edit event entries will be added to the catalogue. Next, a series of event correlation tools will use the solar VOEvent ontology to link related events.

The delete functionality may be approached in two ways: either events could be removed by deleting rows from the catalogue database, or the event entry could remain in the database but be set to “inactive”. The first approach has the advantage that, as scientists become accustomed to adding events to the catalogue, event entries made in error can be easily removed without cluttering up the database. However, this approach is open to accidental or malicious deletion of valid event entries. The second approach guards against this problem, as any event may be reset to “active” at any time. The concept of active and inactive catalogue entries reflects the philosophy of resource entries in VO registries. Also, users wishing to publish a VOEvent packet retraction will be able to cite inactive events that remain in the catalogue.

Edit functionality may be difficult to implement as an asynchronous tool. Ideally, a user could search the catalogue for the event to be edited, select the event, and generate a text field form pre-populated with the existing data for the event entry. However, the asynchronous nature of the AstroGrid workflow system’s interaction with CEA applications prevents the return of a pre-populated form for resubmission. One possible implementation would be for the user to generate a VOEvent packet using the existing tool, edit the VOEvent packet outside of the workflow, and submit the edited VOEvent packet to a CEA application that would extract the relevant information in order to update the catalogue database.
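The “inactive” approach to deletion discussed above can be sketched with a single status column. This is only an illustration on an in-memory SQLite table; the column name `active` and the helper functions are assumptions, and no such feature exists in the catalogue yet.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE suavecat "
    "(event_id TEXT PRIMARY KEY, active INTEGER NOT NULL DEFAULT 1)"
)
conn.executemany("INSERT INTO suavecat (event_id) VALUES (?)",
                 [("suavecat1",), ("suavecat2",)])


def set_active(conn, event_id, active):
    """Mark an entry active or inactive without removing its row."""
    conn.execute("UPDATE suavecat SET active = ? WHERE event_id = ?",
                 (1 if active else 0, event_id))


def active_events(conn):
    """Return the IDs that ordinary searches should still see."""
    query = "SELECT event_id FROM suavecat WHERE active = 1 ORDER BY event_id"
    return [row[0] for row in conn.execute(query)]


set_active(conn, "suavecat1", False)   # "delete" the entry
visible = active_events(conn)          # suavecat1 is hidden but still citable
```

Because the row is never removed, a later call to `set_active(conn, "suavecat1", True)` restores the entry, and a retraction packet can still cite its event ID in the meantime.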

Aside from edit and delete functionality, a set of event correlation tools will be developed to use the solar VOEvent ontology with the catalogue. The first such tool will correlate events observed with different instruments; not only will this involve events within SuaveCAT, but the tool should also examine events reported through the EGSO SEC. The next correlation function will associate event entries for related solar flares and coronal mass ejections. A third tool will attempt to assign event classifications such as flare, wave, or CME if none has been provided by the event reporter. These correlation tools will investigate whether software using an ontology can uncover richer relationships in a VOEvent context than software using database queries.
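As a baseline against which an ontology-based tool could be compared, the first correlation task (linking events observed with different instruments) can be approximated by a plain time-overlap test. The sketch below is that simple heuristic only, with no ontology reasoning; the field names and sample events are invented for illustration.

```python
from datetime import datetime
from itertools import combinations


def overlaps(a, b):
    """True if the two events' [start_time, end_time] intervals intersect."""
    a_start = datetime.fromisoformat(a["start_time"])
    a_end = datetime.fromisoformat(a["end_time"])
    b_start = datetime.fromisoformat(b["start_time"])
    b_end = datetime.fromisoformat(b["end_time"])
    return a_start <= b_end and b_start <= a_end


def correlate(events):
    """Pair up events from different instruments that overlap in time."""
    pairs = []
    for a, b in combinations(events, 2):
        if a["instrument_name"] != b["instrument_name"] and overlaps(a, b):
            pairs.append((a["event_id"], b["event_id"]))
    return pairs


events = [  # made-up test data
    {"event_id": "suavecat1", "instrument_name": "TRACE",
     "start_time": "2002-12-02T12:00:00", "end_time": "2002-12-02T13:00:00"},
    {"event_id": "suavecat2", "instrument_name": "Yohkoh SXT",
     "start_time": "2002-12-02T12:30:00", "end_time": "2002-12-02T12:45:00"},
    {"event_id": "suavecat3", "instrument_name": "TRACE",
     "start_time": "2002-12-02T12:10:00", "end_time": "2002-12-02T12:20:00"},
]
pairs = correlate(events)
```

A fuller tool would also compare heliographic position and, for flare and CME association, allow a time lag between the two events rather than requiring strict overlap.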

Full Ontology

As the solar VOEvent tools are developed, it may become apparent that a fuller VOEvent ontology would be more powerful than an ontology restricted to the needs of a single event catalogue. The main areas for expansion are creating and importing an STC ontology, using a unit ontology, and making greater use of the How element. The combination of full VOEvent and STC ontologies would open the ontology to a wide range of uses with astronomical and solar terrestrial physics events. This could benefit the solar VOEvent catalogue by allowing correlation of solar events with atmospheric magnetic and plasma events; alternatively, solar events could be catalogued and compared with stellar events in a broader catalogue.

The next ontology steps will be the development of an STC ontology and a full VOEvent ontology.

References

  1. Yohkoh SXT TRACE Flare List, http://www.lmsal.com/nitta/sxt_trace_flares/list.html, Updated 8 March 2002, Viewed 13 November 2005.
  2. Hessi Flare List, http://hesperia.gsfc.nasa.gov/ssw/hessi/dbase/, Updated 13 November 2005, Viewed 13 November 2005.
  3. NOAA SGAS Energetic Event List, http://www.nwra-az.com/spawx/listsgas.html, Updated 13 November 2005, Viewed 13 November 2005.
  4. EGSO SEC, http://sec.ts.astro.it/sec_ui.php, Viewed 13 November 2005.
  5. “Publisher’s Astrogrid Library Overview” (DSA). http://www.astrogrid.org/maven/docs/HEAD/pal/index.html, Updated 5 November 2005, Viewed 13 November 2005.
  6. IVOA Status Report. IVOA Executive. http://www.ivoa.net/pub/info/, Updated May 2005, Viewed 13 November 2005.
  7. Auden, E. “VOEvent Ontology”. http://wiki.eurovotech.org/bin/view/VOTech/VoEventOntology, Updated 26 May 2005, Viewed 13 November 2005.
  8. IVOA Data Modelling Forum, http://www.ivoa.net/forum/dm/0506/date.htm, 1-28 June 2005. Viewed 13 November 2005.
  9. “Sky Event Reporting Metadata”, IVOA WG Internal Draft 2005-07-11. http://www.ivoa.net/Documents/WD/VOEvent/VOEvent-20050714.html, Viewed 13 November 2005.
  10. “Space-Time Coordinate Metadata for the Virtual Observatory”, Version 1.21, IVOA Proposed Recommendation 15 March 2005. http://www.ivoa.net/Documents/PR/STC/STC-20050315.html, Viewed 13 November 2005.
  11. Gasevic, D. “UMLtoOWL: Converter from UML to OWL.” http://afrodita.rcub.bg.ac.yu/~gasevic/projects/UMLtoOWL/. Viewed 13 November 2005.
  12. Auden, E. “Creating Ontologies from UML Diagrams”, http://wiki.eurovotech.org/bin/view/VOTech/OntologiesFromUML, Updated 17 October 2005, Viewed 13 November 2005.
  13. “Poseidon for UML”, http://www.gentleware.com/index.php, Viewed 13 November 2005.
  14. VOEvent 1.0 Schema, http://www.ivoa.net/internal/IVOA/IvoaVOEvent/VOEvent-v1.0.xsd, Viewed 13 November 2005.
  15. “The Protégé Ontology Editor and Knowledge Acquisition System,” http://protege.stanford.edu/, Viewed 13 November 2005.
  16. Horridge, M. “A Practical Guide to Building OWL Ontologies Using the Protégé-OWL Plugin and CO-ODE Tools Edition 1.0”, http://www.co-ode.org/resources/tutorials/ProtegeOWLTutorial.pdf, Updated 27 August 2004, Viewed 13 November 2005.

-- ElizabethAuden - 13 Nov 2005
