Service model Telecon, July 27, 2001
Updated: August 2, 2001 9:33 - Page Maintained by Allan Doyle


Agenda in blue, discussion during telecon in black, action items in red

Present

Yonsook Enloe
Allan Doyle
John Evans
George Percivall
Peter Vretanos
Ko Fung (user - CCRS)
Les Whitney (user - CCRS)
Ken McDonald
Skip Reber (user - NASA)
Robin Pfister (user - NASA)
Mike Adair
Terry Fisher
Larry Bouzane
Larry Fishtahler

Agenda

1. Introductions
-- introduce the users
-- explain the purpose of this review

The issues we are working on are the use of data catalogs and service catalogs, and how the two relate to each other. People are probably familiar with data catalogs, searching, and metadata. In the Open GIS Consortium (OGC) we're working on setting up services and entering them into catalogs, e.g. coordinate transformation, data conversion, processing, and web mapping.

Rest of notes at bottom of page...

2. Use Case Review
http://www.intl-interfaces.net/servicemodel#usecases

There are 3 use cases at the bottom of the top level web page.

* Fire Validation Use Case (CCRS)
* ECHO Use Case
* GOFC Fire Use Case

3. Questions to discuss

Have you ever searched for data via data catalogs?

What are your first steps when you want to retrieve data of interest?
* Do you always start with a search?
* Do you search at collection level, product level, or both?

Have you ever searched for data services?

* subsetting
* reprojection
* data format translation (e.g. HDF to netCDF)
* processing
* map & image overlay visualization
* web feature service
* ....

Suppose that you could search for various data services in a services catalog...

Would you look for data first and then look for services that you want to apply to the data or
Would you look for services first and then see what data can be run through the services?

If there were two offerings of similar services, how would you choose one?

If we could build a catalog that offered information about data and services,

* should data and the services be linked?
* should data and services be independently described?
* should the catalog be data centric or service centric?

Discussion

- can't see any use for a service centric catalog. As a scientist, you are looking for the data. If you can't find the data, however many services you have, it doesn't matter.

- that's the kind of input we're looking for! Do you think this because you're familiar with data catalogs and not service catalogs?

- no - you really need to have data first.

- ECHO has identified services that are independent of data, and is working on how to characterize these from a data point of view. E.g. a data gathering service - someone could offer a service that can fly missions and gather data from different places - how do you tie this directly to data?

- CCRS deals with a number of users who work with remote sensing data. The emphasis is that data are important, but an image contains too much detail. Users want products that have been generated via some processing. Data is the bottom layer, the second layer is the knowledge (i.e. processed data), and the top is the decision making. Services are important. E.g. a disaster in Canada: you start with the location, say a fire. Local, then provincial, then the federal govt gets involved. Lots of information exchange is via networks, but they don't tend to use a lot of raw data.

- When you say the user is interested in data, this means that the user is looking for something, but not just a low-level data product.

- If I'm working on oceans - I might know what kind of data is available, or I'll look around to see what kind of data there is. The initial action is to find data. Either by knowing it's there or by looking for data.

- People don't always store higher level products, they will store algorithms and re-run them on raw data. Other systems store and disseminate derived products.

- The derived products are analogous to the "information" in the CCRS sense.

- Agreement - you're looking for the information content. The question is how would you guide someone through the system and help them get the higher level info out? How should it be described in a catalog? As data or as the services (algorithms, etc.) that are run in order to produce them?

- CCRS use case assumes that the fire polygons have already been generated. The question here might be if the polygons have not already been generated - how would you store info about the ability to derive the polygons from NOAA data?

- Question - if you're looking for fire information - you could start at the top of a tree, if you find the fire polygons, you would be pretty far down the tree.

- Question - are we talking about users as scientists or as decision makers? The latter care about accuracy and currency but not so much about how to actually derive the info.

- CCRS scenario is a workflow to facilitate the end-of-season validation of derived fire polygons against raw data.

- Can we look at these scenarios at a higher level - do they make any sense from a science point of view?

- The scenarios are fairly specific. Unfortunately, one of the NASA ones is for some people who are not on this call...

- Question to the users - suppose the user can pull up all the data and info related to a keyword, and various info is represented in that metadata. Most of the scenarios talk about the user getting base-level info, then calling a service, overlaying info from another place, etc., i.e. a sequence of actions. How realistic is this? If the system provided those capabilities, could the users do their jobs?

- Think of the phone menu trees where you have to push buttons as you are given menus - the longer the sequence you have to go through, the less likely you are to follow through to the end.

- Comment on that - there are applications that really want to access info on the net, on the fly. Can be operational. Disaster point of view - fire - want to know how the fire is moving, want wind info - how is the fire going to move? People are willing to follow steps if they have to. But we may be concentrating in this telecon on info that's already there. That kind of data needs to be available w/o a long sequence of actions.

- What complicates this is that we're talking about so many different kinds of services, at many different levels. WMS vs. data collection - these are very different. Some tie well into dataset databases, others don't, or do so only to varying degrees. We have to look at services independently of the dataset database, and take a fresh look at how to relate the services info to the data info.

- We should be looking at how the data and the services interact. CCRS fire scenario has a-priori created polygons. Could also look at the steps that went into building the polygons. We have to start looking at different categories and see how they affect the design.

- Key assumption - we assume that there are products that users are going to want that can't be (or won't be) precomputed. The assumption here is that a user has to be able to request the generation of the products.

- A long string of activities would be much more palatable if at the end of the sequence you could save the steps, sort of like saving a bookmark on the web, then you could re-run that sequence.

- It's not inconsistent with the data-centric view to look for data that may not be available and to find instead the things you need to do in order to generate or derive that data.

- Users may have their own models. They are looking for high level data to get from a catalog. Thinking about OGC services - systems should also have services available so users can find them in order to customize things for their needs. Some users have a lot of ability to generate their own, other users don't have much "native" capacity to generate their own.

- What is being discussed here sounds very close to the model that's on the web. We're talking about both ad hoc and structured discovery of data and/or services. This matches very much the Yahoo model and something similar needs to exist for geodata processing.

- We need to find a way that's the equivalent of a saved sequence - this relates back to the web model.

- There was some discussion about a portal from OGC's point of view.

- Example of services on the web is routing services. People are interested in getting from A to B. They don't care about the data underneath.

- Example - an out-of-control fire. Someone may be looking for an overhead view service. If there's no satellite, they may look for aerial photography service.

- Orbimage, etc. are operating partially on this model.

- There may be a duality between data and services - e.g. if I find a route, is my result a route so I think of this as having found data? Or is the result a route because I found a service?

- You may also want to supply your own data to a service.

- Key difference - when you find a data product, all the metadata are fixed. Now we're talking about parameterized metadata. When you find this, did you find data or a service? You can probably dress up the result to make it look like either, and you can probably have started your search for data or for services. Our job is to figure out how to package this.

- Think of AVS (visualization tool) - all the services are boxes and you can visually drop the boxes into your workspace and connect them together. Then you can string things together and develop a sequence. The resulting sequence can be saved and reused.

- Think of data, prepackaged bundles of data, saved processing steps.

- Remember - we're looking for additional use cases. If we can get use cases from the scientists, that would be great! Getting them from different disciplines would be good, too.

- CCRS will try to develop a disaster response use case.

- Land-based vs. Ocean vs. Atmosphere studies are very different in their use of data, their geographic extent per investigation, kinds of tools, kinds of processing and processing steps, and data format.
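
Several points above (linking data and services in one catalog, starting a search from either side, and saving a sequence of steps for reuse) can be illustrated with a minimal sketch. This is only an illustration under stated assumptions, not a design agreed on in the telecon: every class, field, and entry name below (`Catalog`, `SavedSequence`, `noaa-scene`, `detect-hotspots`, `build-polygons`) is hypothetical, and the "services" are stand-in functions that just wrap a string.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class DataEntry:
    name: str
    kind: str  # hypothetical data type label, e.g. "raw-imagery"


@dataclass
class ServiceEntry:
    name: str
    accepts: str  # kind of data the service takes as input
    produces: str  # kind of data the service yields
    run: Callable[[str], str]  # stand-in for invoking the real service


@dataclass
class Catalog:
    """Describes data and services independently, linked by data kind."""
    data: Dict[str, DataEntry] = field(default_factory=dict)
    services: Dict[str, ServiceEntry] = field(default_factory=dict)

    def services_for(self, data_name: str) -> List[str]:
        # Data-first search: which services can be applied to this data?
        kind = self.data[data_name].kind
        return [s.name for s in self.services.values() if s.accepts == kind]

    def data_for(self, service_name: str) -> List[str]:
        # Service-first search: which data can be run through this service?
        accepts = self.services[service_name].accepts
        return [d.name for d in self.data.values() if d.kind == accepts]


@dataclass
class SavedSequence:
    """A stored list of service names that can be re-run, like a bookmark."""
    steps: List[str]

    def run(self, catalog: Catalog, data_name: str) -> str:
        kind = catalog.data[data_name].kind
        result = data_name
        for step in self.steps:
            service = catalog.services[step]
            if service.accepts != kind:
                raise ValueError(f"{step} cannot accept data of kind {kind}")
            result = service.run(result)
            kind = service.produces
        return result


# Hypothetical entries loosely modelled on the fire polygon discussion.
catalog = Catalog()
catalog.data["noaa-scene"] = DataEntry("noaa-scene", "raw-imagery")
catalog.services["detect-hotspots"] = ServiceEntry(
    "detect-hotspots", "raw-imagery", "hotspots", lambda x: f"hotspots({x})"
)
catalog.services["build-polygons"] = ServiceEntry(
    "build-polygons", "hotspots", "fire-polygons", lambda x: f"polygons({x})"
)

fire_workflow = SavedSequence(["detect-hotspots", "build-polygons"])
```

In this sketch the same catalog answers both question orders from the agenda: `catalog.services_for("noaa-scene")` starts from data, `catalog.data_for("detect-hotspots")` starts from a service, and `fire_workflow.run(catalog, "noaa-scene")` replays the saved steps to derive a product that was not precomputed.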

We had some conclusions about data/services from the user point of view:

Telecon Conclusions: