E-Agriculture

Question 3: What are the emerging tools, standards and infrastructures?

The new paradigm for interoperability on the web, and for building the basic layer of a semantic web, is the concept of Linked Open Data (LOD) [1].

Instead of pursuing ad hoc solutions for the exchange of specific data sets, linked open data makes it possible to express structured data so that it can be linked to other data sets that follow the same principle. Examples of extensive use of linked open data technologies are the NYT and the BBC news service. Some governments are also pushing hard to publish administrative information as LOD.

                             


Figure: The Linking Open Data cloud diagram


The technology of LOD is based on W3C standards such as the Resource Description Framework (RDF) [2], which facilitates the exchange of structured information regardless of the specific structure in which it is expressed at the source level. Any database can be expressed in RDF, and so can structured textual information from content management systems. Presenting data in RDF makes it understandable and processable by machines, which can then mash up data from different sites. Mainstream open source data management tools such as Drupal and Fedora Commons already include RDF as a way to present data.
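As a rough illustration of the mash-up idea, the sketch below holds RDF-style statements as (subject, predicate, object) tuples, serializes them as N-Triples, and follows a link from one data set into another. The `http://example.org/` namespace, the vocabulary terms and the two "sites" are all hypothetical, and a real application would normally use an RDF library rather than plain tuples.

```python
EX = "http://example.org/"  # hypothetical namespace, for illustration only


def to_ntriples(triples):
    """Serialize (subject, predicate, object) tuples as N-Triples lines."""
    lines = []
    for s, p, o in triples:
        # Objects that look like web addresses are written as URIs,
        # everything else as a plain string literal.
        if o.startswith(("http://", "https://")):
            obj = f"<{o}>"
        else:
            escaped = o.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        lines.append(f"<{s}> <{p}> {obj} .")
    return lines


def describe(uri, triples):
    """Collect every predicate/object pair stated about one subject."""
    return {p: o for s, p, o in triples if s == uri}


# Two independently published data sets (both made up).
site_a = [
    (EX + "crop/maize", EX + "vocab/label", "Maize"),
    (EX + "crop/maize", EX + "vocab/pest", EX + "pest/stem-borer"),
]
site_b = [
    (EX + "pest/stem-borer", EX + "vocab/label", "Stem borer"),
]

# Because both sets use the same URI for the pest, merging is just concatenation.
merged = site_a + site_b

# Follow the link from the crop (site A) to the pest description (site B).
pest_uri = describe(EX + "crop/maize", merged)[EX + "vocab/pest"]
pest_label = describe(pest_uri, merged)[EX + "vocab/label"]

for line in to_ntriples(merged):
    print(line)
print(pest_label)
```

The point of the sketch is that no site-specific exchange format is needed: the shared URI is what links the two data sets.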

Within agricultural research for development, an infrastructure is needed to facilitate the production of linked open data. The four key elements that make this possible are:

   a registry of services and data sets (CIARD RING, http://www.ring.ciard.net);

   common vocabularies to facilitate automatic data linking (thesauri, authority files, value vocabularies);

   technology (content management systems, RDF wrappers for legacy systems);

   training and capacity development.

 



[1] Linked Data - Connect Distributed Data across the Web, http://linkeddata.org/ (last accessed March 2011)
[2] Resource Description Framework, http://www.w3.org/RDF/ (last accessed March 2011)

Jim Cory, Horizon Mapping, United States of America

We have talked about systems for sharing data within a broad group of users. Perhaps we need to think about these systems as a hierarchy. At one level is a set of systems that fully support global standards and provide automatic semantic matching.

At an intermediate level is a group of more localized systems that understand the global standards and exchange data with global systems, but focus on incorporating and sharing local data gathered in less formal ways. These intermediate systems work with local developers to provide access to global and local information in a format that is relevant to local practitioners.

At the practitioner level there are flexible delivery and reporting tools that provide current, localized, market-oriented data and also incorporate crowdsourcing methods for gathering information from individual practitioners.

Not all levels need to support the same standards, but there could be localized methods for taking data in and then translating them into global standards as the data move up the hierarchy.
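A minimal sketch of what such a translation step might look like at the intermediate level, assuming local records arrive as simple field/value pairs. The field names, the global vocabulary URIs and the `translate` helper are all hypothetical and only illustrate the idea of mapping local terms onto shared ones while keeping track of what could not be mapped.

```python
# Hypothetical mapping from local field names to terms in a shared vocabulary.
LOCAL_TO_GLOBAL = {
    "crop_name": "http://example.org/global/cropName",
    "price_kg": "http://example.org/global/pricePerKilogram",
}


def translate(record, mapping):
    """Split a local record into globally expressed fields and untranslated leftovers."""
    translated, unmapped = {}, {}
    for field, value in record.items():
        if field in mapping:
            translated[mapping[field]] = value
        else:
            # Keep local-only fields so nothing is silently lost.
            unmapped[field] = value
    return translated, unmapped


# A record gathered informally at the practitioner level (made-up values).
local_record = {"crop_name": "maize", "price_kg": 0.35, "collector_note": "market day"}

translated, unmapped = translate(local_record, LOCAL_TO_GLOBAL)
print(translated)
print(unmapped)
```

Only the translated part would move up the hierarchy; the unmapped fields stay available to the local system.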

Laurent Lefort, CSIRO ICT Centre, Australia

Data citation 

One way to encourage the sharing of data is to develop the practice of data citation.

Here are two useful background documents, one from the Australian National Data Service (ANDS) and the other from Gen2Phen, an EU project focusing on health and life science research data.

- Data Citation Awareness http://ands.org.au/guides/data-citation-awareness.html

- D9.3 Draft Report on Incentives and Rewards in the Field of Biomedical Research Databases http://www.gen2phen.org/system/files/private/D9.3%20Draft%20Report%20on%...

 

Standards and tools to transform data from SQL to RDF

I recommend that RDF beginners start with tools that implement a Direct Mapping from a database. A Direct Mapping mirrors the database schema in RDF and requires minimal effort to implement. There have also been efforts to let users annotate the SQL code to provide the same capability, e.g. the work done by the FlyWeb project.

- My first mapping from RDB to RDF using a direct mapping http://ivan-herman.name/2010/11/19/my-first-mapping-from-direct-mapping/

- Future of FlyWeb work on Chado OWL ontology and RDF mapping http://generic-model-organism-system-database.450254.n5.nabble.com/Futur...
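To make the Direct Mapping idea concrete, here is a simplified sketch using Python's built-in `sqlite3` module: each row becomes a subject, the table becomes its type, and each column becomes a predicate. The `crop` table, the `http://example.org/db/` base URI and the `direct_map` helper are assumptions made for illustration. The sketch writes every value as a plain literal and ignores foreign keys, so it is not a complete implementation of any particular Direct Mapping specification or tool.

```python
import sqlite3

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def direct_map(conn, table, pk, base):
    """Yield N-Triples lines that mirror one table: row -> subject, column -> predicate."""
    cur = conn.execute(f'SELECT * FROM "{table}" ORDER BY "{pk}"')
    columns = [d[0] for d in cur.description]
    for row in cur:
        record = dict(zip(columns, row))
        # The subject URI is built from the table name and the primary key value.
        subject = f"<{base}{table}/{pk}={record[pk]}>"
        yield f"{subject} <{RDF_TYPE}> <{base}{table}> ."
        for col, value in record.items():
            if value is None:
                continue  # a NULL cell produces no triple
            literal = str(value).replace("\\", "\\\\").replace('"', '\\"')
            yield f'{subject} <{base}{table}#{col}> "{literal}" .'


# A tiny in-memory database standing in for a legacy system (made-up data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE crop (id INTEGER PRIMARY KEY, name TEXT, yield_t_ha REAL)")
conn.execute("INSERT INTO crop VALUES (1, 'Maize', 4.5)")
conn.execute("INSERT INTO crop VALUES (2, 'Cassava', NULL)")

triples = list(direct_map(conn, "crop", "id", "http://example.org/db/"))
for line in triples:
    print(line)
```

Nothing about the schema has to be redesigned for this to work, which is why the approach suits a first experiment.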

These tools are not only of interest to users who want to transform an existing database. MOLGENIS (http://sourceforge.net/projects/molgenis/) is a tool that lets users create their own tab-delimited format for recording science data and then move the data into RDF (using D2RQ) so that other applications can process them. The starting point is a pair of XML files (one for the data model and one for the UI) with a simple syntax; a utility can extract the data model file from an existing database. From these original "models", the MOLGENIS project aims to derive a range of tools, including an R API and RDF access using D2RQ, comparable to what has been done in the FlyWeb project.

 

LOD and ontology 

The gap between "lightweight semantics" like LOD and ontology-based approaches is much smaller than it used to be. The datatype reasoning capabilities enabled by the OWL 2 standard (http://www.w3.org/TR/owl2-overview/) and the new features provided by ontology engineering tools like SPARQL-DL (http://www.w3.org/2001/sw/wiki/SPARQL-DL) can help LOD users exploit ontology content even when it is mixed with numerical data.

 

Graph databases (NoSQL) 

Finally, graph databases may also have a role to play in future Linked Open Data infrastructures: new (and also old) products are now competing for a market niche roughly halfway between traditional databases and triple stores.

Sandro Hawke, "Toward Standards for NoSQL", NoSQL Live … from Boston, March 11, 2010. http://www.w3.org/2010/Talks/0311-nosql/talk.pdf