E-Agriculture

Question 2: What are the prospects for interoperability in the future?

"Interoperability"1 is a feature both of data sets and of the information services that give access to them. When a data set or a service is interoperable, data coming from it can easily be "operated on" by other systems as well. The easier it is for other systems to retrieve, process, re-use and re-package data from a source, and the less coordination and tweaking of tools this requires, the more interoperable that source is.

Interoperability ensures that distributed data can be exchanged and re-used by and between partners without the need to centralize data or standardize software.
Some examples of scenarios where data sets need to be interoperable:

   transferring data from one repository to another;
   harmonizing different data and metadata sets;
   aggregating different data and metadata sets;
   building virtual research environments;
   creating documents from distributed data sets;
   reasoning on distributed data sets;
   creating new information services using distributed data sets.


There are current examples of how an interesting degree of internal interoperability is achieved through centralized systems. Facebook and Google are the largest examples of centralized systems that allow easy sharing of data and a very good level of interoperation within their own services. This is due to the use of uniform environments (software and database schemas) that can easily make physically distributed information repositories interoperable, but only within the limits of that environment. What is interesting is that centralized services like Google, Facebook and other social networks are adopting interoperable technologies in order to expose part of their data to other applications, because the huge range of social platforms is distributed and has to meet users' needs for easier access to information across different platforms.

Since there are social, political and practical reasons why centralization of repositories or homogenization of software and working tools will not happen, a higher degree of standardization and generalization ("abstraction") is needed to make data sets interoperable across systems.

The alternative to centralizing data or homogenizing working environments is the development of a set of standards, protocols and tools that make distributed data sets interoperable and make sharing possible among heterogeneous and un-coordinated systems ("loose coupling").

This has been addressed by the W3C with the concept of the "semantic web", which sets the goal of global interoperability of data on the WWW. The concept was proposed more than 10 years ago, and since then the W3C has developed a range of standards to achieve this goal, in particular the semantic description languages RDF and OWL, which are meant to get data out of isolated database silos and to give structure to text that was born unstructured. Interoperability is achieved when machines understand the meaning of distributed data and are therefore able to process it in the correct way.
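
As a rough, purely illustrative sketch of what "getting data out of a silo" can mean in practice, the snippet below takes a single record as it might sit in a local database and re-expresses it as RDF triples using Python's rdflib library. The field names, identifier and namespace are invented for the example and do not come from any particular system.

    # A minimal sketch, assuming invented field names and URIs: one database record as RDF
    from rdflib import Graph, Literal, Namespace
    from rdflib.namespace import DC

    # A record as it might come out of a local, non-interoperable database
    record = {"id": "doc-123", "title": "Rice irrigation field report", "language": "en"}

    EX = Namespace("http://example.org/documents/")  # hypothetical namespace
    g = Graph()
    doc = EX[record["id"]]

    # The same facts, now expressed with widely shared Dublin Core properties
    g.add((doc, DC.title, Literal(record["title"])))
    g.add((doc, DC.language, Literal(record["language"])))

    print(g.serialize(format="turtle"))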

 


1 Interoperability http://en.wikipedia.org/wiki/Interoperability 

Jim Cory, Horizon Mapping, United States of America

Interoperability seems more like an irresistible force than a strategy. People want information, and providers that wish to be accessed supply it in several usable formats. As standards become commonly available, major data managers, document publishers and content streams adopt them in order to remain competitive or viable. This is a natural progression, and it can be observed as one looks back on the evolution of information technology.

Data sharing standards are inevitably accompanied by open access tools that form the glue tying separate bits together. Information consumers follow, looking to create new analyses and perspectives. On an as-needed basis, the pieces are arranged and connected in a freeform construction of content and functionality. Each of these unique triangles of standards, tools and consumers is designed for a subset of information consumers, and each can in turn be linked to other information networks by using standards to create yet more community-specific applications.

There is no one-size-fits-all. Standards and flexible linking provide for all the uses one can imagine. They are also constantly evolving to provide the next great trend in information sharing.

I agree that interoperability looks to be inevitable. The only questions are how quickly it will happen and what we can do to make it happen faster.

Education is one thing. Although it may seem obvious to the likes of us, there is still a lot of basic awareness raising needed about the benefits of standards and sharing.
 
Building standards and mechanisms for data exchange into the tools that people use in their everyday life is another, so that the data is captured in an interoperable form from the start.
 
Standards compliance should 'just happen' without people having to think about it.
Jim Cory, Horizon Mapping, United States of America

Thinking is good. That's why any of this is happening. It is also inevitable. It is one of the things we are good at. In regard to that, I have been thinking...

I agree with you that there are things we can do to facilitate the adoption and use of standards rather than letting things take their own course. You mentioned education, which is something that needs to happen at all levels of the information spectrum, from data/tool producers to information consumers. One way of spreading the word might be to come up with a set of recommendations for different categories of interaction that are tailored to the needs of specific user groups. Are there different technical requirements for research groups as opposed to community farmers, and can those information tool kits be adjusted to accommodate different cultural expectations?

Possible criteria for the tool kits might include ease of implementation, affordability, robustness of the standard, current level of adoption, flexibility, extensibility, and infrastructure opportunities and limitations. Development of new and better standards will continue, but for putting tools in place now we need solutions that work out of the box.

Thomas Baker, Dublin Core Metadata Initiative, United States of America

san_jay writes about the Interoperability Triangle:
> It is good to see that some of us are trying to bring the human factor in
> interoperability. ...
> But if I summarise from everything from this thread, doesn't everything
> comes to people, processes and technology?

kbheenick writes:
> I feel that the concept of 'interoperability' needs to be considered ,
> ranging all the way from people collaborating to systems collaborating,
> with concepts and information interoperability being somewhere in
> between. ...
> People successfully interoperating means that there has been...

> an agreed set of communication protocols...

I like Sanjay's notion of an Interoperability Triangle
of "People, Processes, and Technology", and I also like
Krishan's point that "processes" have to do with "concepts"
and "communication".

One might summarize this as a triangle of "People --
Communication -- Technology".

PEOPLE

I enthusiastically agree with the emerging emphasis in this
discussion on the "human factor" in interoperability.  VIVO is
an excellent example, as the emphasis since its beginnings
some five years ago has been on "connecting people" and
"creating a community" [1].

COMMUNICATION

What makes Linked Data technology different from traditional
IT approaches is that it is analogous to the most familiar
of all communication technologies -- human language.

RDF is the grammar for a language of data.  The words of
that language are URIs -- URIs for naming both the things
described and the concepts used to describe those things, from
verb-like "properties" to noun-like "classes" and "concepts".
The sentences of that grammar -- RDF triples -- mirror the
simple three-part grammar of subject, predicate, and object
common to all natural languages.  It is a language designed
by humans for processing by machines.
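
To make the analogy a little more concrete, here is a minimal sketch in Python (using the rdflib library) of two such "sentences". The people and the namespace are invented for illustration, while FOAF and RDF are real, widely used vocabularies.

    # A minimal sketch, with invented URIs: RDF "sentences" as subject, predicate, object
    from rdflib import Graph, Namespace
    from rdflib.namespace import FOAF, RDF

    EX = Namespace("http://example.org/people/")  # hypothetical namespace
    g = Graph()

    # "valeria is a Person"  -> RDF.type points to a noun-like class
    g.add((EX.valeria, RDF.type, FOAF.Person))
    # "valeria knows thomas" -> foaf:knows acts as a verb-like property
    g.add((EX.valeria, FOAF.knows, EX.thomas))

    for subject, predicate, obj in g:
        print(subject, predicate, obj)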

The language of Linked Data does not itself solve the
difficulties of human communication any more than the
prevalence of English guarantees world understanding.
However, it does support communication across a similarly
broad spectrum.

When used with "core" vocabularies such as the fifteen-element
Dublin Core, the result may be a "pidgin" for the sort
of rudimentary but serviceable communication that occurs
between speakers of different languages.  When used with
richer vocabularies, it supports the precision needed for
communication among specialists.  And just as English provides
a basis for second-language communication among non-native
speakers, RDF provides a common second language into which
local data formats can be translated and exposed.
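
As a small sketch of that contrast (Python with rdflib; the resource URIs and the "richer" property are invented), the fragment below first describes an event with nothing but core Dublin Core elements, the "pidgin" that almost any system can re-use, and then adds one domain-specific property of the kind a richer vocabulary would provide for specialists.

    # A minimal sketch: a "pidgin" Dublin Core description plus one invented domain-specific term
    from rdflib import Graph, Literal, Namespace
    from rdflib.namespace import DC

    EX = Namespace("http://example.org/events/")        # hypothetical resource namespace
    RICH = Namespace("http://example.org/agri-terms/")  # stands in for a richer domain vocabulary
    g = Graph()
    workshop = EX["workshop-42"]

    # Rudimentary but serviceable: anyone who knows Dublin Core can interpret this
    g.add((workshop, DC.title, Literal("Regional workshop on soil data sharing")))
    g.add((workshop, DC.coverage, Literal("Eastern Africa")))

    # Extra precision for specialists, using an invented property as a placeholder
    g.add((workshop, RICH.cropFocus, Literal("sorghum")))

    print(g.serialize(format="turtle"))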

TECHNOLOGY

Given the speed of technical change, it is inevitable that the
software applications and user interfaces we use today will
soon be superseded.  The Linked Data approach acknowledges this
by addressing the problem on a level above specific formats and
software solutions, expressing data in a generic form designed
for ease of translation into different formats.  It is an
approach designed to make data available for unanticipated uses
-- uses unanticipated both in the present and for the future.

[1] http://www.dlib.org/dlib/july07/devare/07devare.html

Valeria Pesce, Global Forum on Agricultural Research and Innovation (GFAR), Italy

It seems we all agree that Linked Data is the way to go, so the framework is set. But within this framework, the issue of defining the minimum set of data that allows information of a certain type to be interoperated by other systems is still open.

It is not so much an issue of which description vocabularies (Dublin Core, FOAF, MODS, AgMES, Darwin Core, geoRSS...) to use, since this can be tackled by mapping vocabularies and using stylesheets (although the LOD recommendation is always to use widely adopted vocabularies); it is more an issue of which data should be included in an information object so that it is fully interoperable.
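
As a very small illustration of why the vocabulary question is technically tractable, the sketch below translates a record that uses one source's field names into Dublin Core terms with a plain lookup table. The source field names and the mapping pairs are invented for the example and are not an official crosswalk.

    # A minimal sketch, with invented source field names: mapping one vocabulary onto another
    SOURCE_TO_DC = {
        "titleInfo": "dc:title",    # illustrative pairs, not an official crosswalk
        "originDate": "dc:date",
        "topic": "dc:subject",
    }

    source_record = {
        "titleInfo": "Maize market prices, week 12",
        "originDate": "2011-03-25",
        "topic": "markets",
    }

    mapped = {SOURCE_TO_DC[field]: value
              for field, value in source_record.items()
              if field in SOURCE_TO_DC}
    print(mapped)  # {'dc:title': ..., 'dc:date': ..., 'dc:subject': ...}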

For instance, if we are exchanging data about events, is it enough to use the basic RSS metadata set? RSS 1.0 is RDF, can use URIs and can be LOD-compliant, but if we don't include information on the dates and the location of the event in specific RDF properties, is an RSS feed of events fully interoperable?

An example of a service that aggregates events from different sources is AgriFeeds. The added-value service that AgriFeeds offers in aggregating events is that users can browse events chronologically in a calendar and geographically by region and country. A feed of events that doesn't have properties for the start and end dates of the event and for its location cannot be interoperated by AgriFeeds in this way: it is not discarded, but it is treated as a basic news feed, without the possibility of exploiting the advanced chronological and geographical browsing.
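
A rough sketch of the difference this makes for an aggregator: the code below decides whether an incoming item carries enough data to be browsed as an event (dates and location) or only as a basic news item. The field names are invented for illustration; a real feed would carry the same information in specific RDF properties, and this is not a description of how AgriFeeds is actually implemented.

    # A minimal sketch, assuming invented field names, of an aggregator's decision
    def classify_item(item: dict) -> str:
        required_event_fields = ("start_date", "end_date", "location")
        if all(item.get(field) for field in required_event_fields):
            return "event"  # can be placed on the calendar and browsed geographically
        return "news"       # still usable, but only as a basic news item

    item_with_dates = {
        "title": "Regional seed fair",
        "start_date": "2011-05-02",
        "end_date": "2011-05-04",
        "location": "Nairobi, Kenya",
    }
    item_without_dates = {"title": "Regional seed fair announced"}

    print(classify_item(item_with_dates))     # event
    print(classify_item(item_without_dates))  # news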

Another similar issue is subject indexing. Since none of the sources aggregated by AgriFeeds uses Agrovoc or other subject lists mapped to Agrovoc to tag news and events, no coherent subject browsing is possible.

In this sense, defining the actual data (or the metadata set, in traditional terms) that are recommended for each information type is more important than agreeing on a specific standard in terms of DTD or RDF schema (the "description vocabulary"). Vocabulary issues can be solved from a technical point of view, but if the data we need are not there, "interoperation" and therefore re-use of information may not be possible.

Valeria Pesce, Global Forum on Agricultural Research and Innovation (GFAR), Italy

Just a hint at yet another "prospect", which will maybe be better covered in the next thread on the latest developments.

It is good to agree on LOD as the future of interoperability, but what are we going to say to institutions that are supposed to produce and consume LOD and don't have tools that allow them to do it?

It is true that software tools are clearly moving towards LOD, but we have to keep monitoring developments in this field in order to be able to recommend tools that are not only capable of producing LOD (and therefore of creating a triple store of all the content managed in the system) but also flexible enough to allow customization of the classes and properties used in the triple store.
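
As a minimal sketch of the kind of flexibility meant here (all names are invented and this is not modelled on any particular tool), the snippet below writes a content management system's records into an RDF graph while keeping the class and property choices in a small configuration that an institution could change.

    # A minimal sketch, with invented URIs: exporting CMS records with a configurable class and properties
    from rdflib import Graph, Literal, Namespace, URIRef
    from rdflib.namespace import RDF

    BASE = Namespace("http://example.org/myrepo/")             # hypothetical base URI for records
    RECORD_CLASS = URIRef("http://example.org/terms/Report")   # the class is a configuration choice
    FIELD_TO_PROPERTY = {                                      # so are the properties
        "title": URIRef("http://purl.org/dc/elements/1.1/title"),
        "author": URIRef("http://purl.org/dc/elements/1.1/creator"),
    }

    def export_record(graph: Graph, record_id: str, fields: dict) -> None:
        subject = BASE[record_id]
        graph.add((subject, RDF.type, RECORD_CLASS))
        for name, value in fields.items():
            if name in FIELD_TO_PROPERTY:
                graph.add((subject, FIELD_TO_PROPERTY[name], Literal(value)))

    g = Graph()
    export_record(g, "rep-7", {"title": "Irrigation trial results", "author": "J. Doe"})
    print(g.serialize(format="turtle"))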

More perhaps in the next thread.
 

Interoperability is particularly needed within an international networking environment in order to make the shared data accessible, available and usable by all interested stakeholders. This will surely foster and enhance agricultural data and information sharing, collaboration and re-use, as well as value-added services. For this, there must be some general regulations for data exchange and re-use; standardized tools (platforms or systems) for transferring and aggregating various data sets and creating new service content; and high-quality data sets produced and provided by institutions and others from each country.

I very much like the concept of people - communication - technology. In my case, as head of the open access network in my organisation, INRA, I can act on the people side and do everything possible to communicate.

The Information System Division in the loop

But even though I am aware of what LOD can bring to data dissemination, I have to work with the Information System Division on all institutional projects. They have different purposes. They choose the technology: SQL databases at first, then XML ones. They don't want to invest in RDF. A group of information managers inside INRA are working on semantic projects to demonstrate the ability to use this technology and to achieve scientific goals. It is the only way to convince the IS Division to go further with RDF and LOD...

Question of skill

It is not easy in France to find RDF-skilled computer scientists to work with. Most of them have never even heard of OAI-PMH, and OAI is much easier to learn than RDF. Even companies that provide computing services are not yet ready for RDF development. I would be interested in knowing the situation in other countries, as well as potential subcontractors!

Diane

Dear colleagues, I was late to join this very interesting and useful discussion, and it also took time to read the more than 100 posts. Practically all aspects of information sharing and interoperability have been considered. I will allow myself to draw your attention to only one point.
 
In countries/regions where agricultural scientists and consultants have limited ICT skills and limited access to the Internet, some organizations, at the current stage at least, have to take responsibility for collecting data and for maintaining, as san_jay says, "websites / platforms etc with a minimum level of features that will allow them to handle meta data (and all the associated tit bits) and to communicate among themselves".
 
Thus the important task is to identify these organizations and build their capacity. In particular, they have to explain to researchers, consultants and other stakeholders the benefits of information sharing. They should also contribute to stakeholders gradually acquiring more ICT skills, which may finally end up with all the necessary pre-processing of data being done by the stakeholders themselves.
 

Dear colleagues. I might return to some ideas concerning the re-thinking of how to encourage information sharing and create shared values. In addition to the obstacles to information sharing and interoperability already raised, such as the lack of clear policy and investment plans, lack of incentives, time constraints, cultural heritage and lack of relevant knowledge-packed products, in my experience the community of information technologists and information experts is moving much faster than the research and development community, especially researchers and extensionists in developing countries: too many terminologies, data formats and information platforms are being introduced, and the mismatch between national, regional and international information systems is increasing.

This situation makes it difficult to create a shared culture and harmony, as well as mutual trust, in the step-by-step improvement of information structuring and information sharing. So, with this in mind, we should think of mechanisms and ways of reducing this gap. Creating learning processes, integration and synergy that allow all stakeholders to contribute is important.