PROV (Provenance)

PROV
Status: Published, W3C Recommendation
Year started: 2013
Editors: Paul Groth, Luc Moreau
Related standards: RDF, OWL, XML
Domain: Semantic Web
Abbreviation: PROV
Website: www.w3.org/TR/prov-overview/

The PROV standard defines a data model, serializations, and definitions to support the interchange of provenance information on the Web.[1] Here provenance includes all "information about entities, activities, and people involved in producing a piece of data or thing, which can be used to form assessments about its quality, reliability or trustworthiness".

PROV is a set of recommended standards of the World Wide Web Consortium.[2] These include its data model,[3] an XML schema for that model, an OWL2 ontology mapping that model to RDF, and a mapping from that ontology to Dublin Core. It also includes a human-readable notation for provenance, methods for accessing and querying PROV data, and a few other subspecifications.[1]

PROV model overview

The core concepts defined by the PROV Model are Entity, Activity and Agent.[4] The remaining concepts are relationships between these (e.g. Derivation, Usage, Generation) or specializations (e.g. Person, Collection, Plan).

Figure: Overview of the W3C PROV model.

An Entity captures a thing in the world (in a particular state). The entity was derived from some other entity, and was generated by an Activity that used other entities.

An Agent (e.g. a person or software execution) was associated with the activity, and the entity that was generated by the activity was attributed to that agent.
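The relationships described above can be sketched as a small in-memory graph. This is only an illustration under stated assumptions: the `Node` and `ProvGraph` classes and the `ex:draft`, `ex:report`, `ex:editing` and `ex:alice` identifiers are hypothetical, while the relation names follow the PROV-N spelling of Derivation, Generation, Usage, Association and Attribution.

```python
from dataclasses import dataclass, field

# Relation name -> (kind of first argument, kind of second argument).
RELATIONS = {
    "wasDerivedFrom": ("entity", "entity"),
    "wasGeneratedBy": ("entity", "activity"),
    "used": ("activity", "entity"),
    "wasAssociatedWith": ("activity", "agent"),
    "wasAttributedTo": ("entity", "agent"),
}


@dataclass(frozen=True)
class Node:
    """A core PROV concept: kind is 'entity', 'activity' or 'agent'."""
    kind: str
    id: str


@dataclass
class ProvGraph:
    """Hypothetical store of (relation, subject, object) statements."""
    statements: list = field(default_factory=list)

    def relate(self, relation, subject, obj):
        # Reject statements whose arguments are of the wrong kind.
        expected = RELATIONS[relation]
        if (subject.kind, obj.kind) != expected:
            raise ValueError(
                f"{relation} expects {expected}, got {(subject.kind, obj.kind)}"
            )
        self.statements.append((relation, subject, obj))

    def objects(self, relation, subject):
        # All objects linked to `subject` by `relation`.
        return [o for r, s, o in self.statements if r == relation and s == subject]


# Illustrative identifiers (assumptions, not taken from the specifications).
draft = Node("entity", "ex:draft")
report = Node("entity", "ex:report")
editing = Node("activity", "ex:editing")
alice = Node("agent", "ex:alice")

g = ProvGraph()
g.relate("wasDerivedFrom", report, draft)     # entity derived from entity
g.relate("wasGeneratedBy", report, editing)   # entity generated by activity
g.relate("used", editing, draft)              # activity used entity
g.relate("wasAssociatedWith", editing, alice) # agent associated with activity
g.relate("wasAttributedTo", report, alice)    # entity attributed to agent

for relation, subject, obj in g.statements:
    print(f"{relation}({subject.id}, {obj.id})")
```

The kind check in `relate` is a simplification: it covers only the five relations listed and ignores specializations such as Person, Collection or Plan.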

PROV serializations

Provenance statements can be serialized in several PROV formats, all of which express the same PROV model. Some type and relationship names differ slightly from the PROV model concepts so that they are idiomatic in each format.

For example, PROV-N is a textual format that has a direct mapping to the PROV model:

document
 prefix ex <http://example.com/>

 entity(ex:e1)

 activity(ex:a2, 2011-11-16T16:00:00, 2011-11-16T16:00:01)

 wasGeneratedBy(ex:e1, ex:a2, -)

endDocument

The above can be expressed as XML using the PROV-XML schema:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<prov:document xmlns:prov="http://www.w3.org/ns/prov#"
               xmlns:ex="http://example.com/">

    <prov:entity prov:id="ex:e1"/>

    <prov:activity prov:id="ex:a2">
        <prov:startTime>2011-11-16T16:00:00.000Z</prov:startTime>
        <prov:endTime>2011-11-16T16:00:01.000Z</prov:endTime>
    </prov:activity>

    <prov:wasGeneratedBy>
        <prov:entity prov:ref="ex:e1"/>
        <prov:activity prov:ref="ex:a2"/>
    </prov:wasGeneratedBy>

</prov:document>

The same statements can be expressed using PROV-O, the mapping to the OWL2 ontology language, which in turn can be serialized in the RDF format Turtle:

@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ex: <http://example.com/> .

ex:e1 a prov:Entity .

ex:a2 a prov:Activity ;
	prov:startedAtTime "2011-11-16T16:00:00.000Z"^^xsd:dateTime ;
	prov:endedAtTime "2011-11-16T16:00:01.000Z"^^xsd:dateTime .

ex:e1 prov:wasGeneratedBy ex:a2 .
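Because the listings above carry the same statements, one format can be mechanically rewritten into another. The following is a minimal sketch, using only the Python standard library, that reads the PROV-XML example and prints the corresponding PROV-N statements. The function name `xml_to_provn` is hypothetical, and the sketch assumes only the three element types used in the example; it does not validate against the PROV-XML schema or resolve the `ex:` prefix.

```python
import xml.etree.ElementTree as ET

PROV_NS = "http://www.w3.org/ns/prov#"

PROV_XML = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<prov:document xmlns:prov="http://www.w3.org/ns/prov#"
               xmlns:ex="http://example.com/">
    <prov:entity prov:id="ex:e1"/>
    <prov:activity prov:id="ex:a2">
        <prov:startTime>2011-11-16T16:00:00.000Z</prov:startTime>
        <prov:endTime>2011-11-16T16:00:01.000Z</prov:endTime>
    </prov:activity>
    <prov:wasGeneratedBy>
        <prov:entity prov:ref="ex:e1"/>
        <prov:activity prov:ref="ex:a2"/>
    </prov:wasGeneratedBy>
</prov:document>
"""


def q(name):
    """Return the ElementTree form of a name in the PROV namespace."""
    return f"{{{PROV_NS}}}{name}"


def xml_to_provn(xml_text):
    """Translate the entity, activity and wasGeneratedBy elements to PROV-N."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    lines = []
    for child in root:
        if child.tag == q("entity"):
            lines.append(f"entity({child.get(q('id'))})")
        elif child.tag == q("activity"):
            # "-" is the PROV-N marker for an omitted optional argument.
            start = child.findtext(q("startTime"), default="-")
            end = child.findtext(q("endTime"), default="-")
            lines.append(f"activity({child.get(q('id'))}, {start}, {end})")
        elif child.tag == q("wasGeneratedBy"):
            entity = child.find(q("entity")).get(q("ref"))
            activity = child.find(q("activity")).get(q("ref"))
            lines.append(f"wasGeneratedBy({entity}, {activity}, -)")
    return lines


for line in xml_to_provn(PROV_XML):
    print(line)
```

The timestamps are copied verbatim from the XML, so they keep the fractional seconds and `Z` suffix that the PROV-N listing above omits.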

Tooling

Software tools have been developed to convert between PROV formats and to generate and parse PROV documents in different programming languages.

References

  1. "PROV-Overview" (in en). https://www.w3.org/TR/2013/NOTE-prov-overview-20130430/.
  2. Moreau, Luc; Groth, Paul; Cheney, James; Lebo, Timothy; Miles, Simon (2015-12-01). "The rationale of PROV" (in en). Web Semantics: Science, Services and Agents on the World Wide Web 35: 235–257. doi:10.1016/j.websem.2015.04.001. ISSN 1570-8268. 
  3. "PROV-DM: The PROV Data Model" (in en). https://www.w3.org/TR/2013/REC-prov-dm-20130430/. 
  4. "PROV Model Primer" (in en). W3C. https://www.w3.org/TR/2013/NOTE-prov-primer-20130430/#intuitive-overview-of-prov.