Quality of Data (QoD)
Quality-of-Data (QoD) is a designation coined by L. Veiga that specifies the Quality of Service required of a distributed storage system with respect to the consistency of its data. It can be used to support Big Data management frameworks, workflow management, and HPC systems, mainly for data replication and consistency. It takes data semantics into account along three dimensions: Time, the interval of data freshness; Sequence, the tolerable number of outstanding versions of the data read before a refresh; and Value, the divergence allowed before the data is displayed. It was initially based on a model from existing research on vector-field consistency,[1] which was awarded the best-paper prize at the ACM/IFIP/Usenix Middleware Conference 2007, and was later enhanced for increased scalability and fault tolerance.[2]
This consistency model has been applied to the Big Data key/value store Apache HBase,[3] where it was initially designed as a middleware[4] module sitting between clusters in separate data centres. The HBase-QoD coupling[5] minimises bandwidth usage and optimises resource allocation during replication, achieving the desired consistency level at a finer granularity.
QoD is defined by the three-dimensional vector k = (θ, σ, ν), whose components bound time, sequence, and value respectively. It takes a broader view of the issue, however, and also applies to large-scale data management techniques with regard to their timely delivery.[6]
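As an illustration of how such a vector might be enforced, the following is a minimal Python sketch and not the HBase-QoD implementation. It assumes that θ is measured in seconds, σ counts updates not yet applied to the replica, ν is a relative divergence in a numeric value, and that a replica must be refreshed as soon as any one of the three bounds is exceeded. The names `QoDBound` and `ReplicaTracker` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QoDBound:
    """Hypothetical container for the vector k = (theta, sigma, nu)."""
    theta: float  # assumed: max seconds a replica may go without a refresh
    sigma: int    # assumed: max number of updates not yet applied to the replica
    nu: float     # assumed: max relative value divergence (0.05 means 5%)


class ReplicaTracker:
    """Tracks how far one replicated numeric value has drifted from the primary."""

    def __init__(self, bound: QoDBound, value: float, now: float) -> None:
        self.bound = bound
        self.primary_value = value
        self.replica_value = value
        self.last_sync = now
        self.pending = 0  # updates seen by the primary but not by the replica

    def write(self, new_value: float) -> None:
        """Record an update on the primary without propagating it yet."""
        self.primary_value = new_value
        self.pending += 1

    def must_sync(self, now: float) -> bool:
        """Return True if any of the three bounds is exceeded (assumed policy)."""
        elapsed = now - self.last_sync
        base = abs(self.replica_value)
        diff = abs(self.primary_value - self.replica_value)
        if base == 0:
            # Relative divergence is undefined at zero; treat any change as unbounded.
            divergence = 0.0 if diff == 0 else float("inf")
        else:
            divergence = diff / base
        return (
            elapsed > self.bound.theta
            or self.pending > self.bound.sigma
            or divergence > self.bound.nu
        )

    def sync(self, now: float) -> None:
        """Propagate the primary value to the replica and reset all counters."""
        self.replica_value = self.primary_value
        self.last_sync = now
        self.pending = 0
```

Under these assumptions, a tracker created with `QoDBound(theta=10, sigma=3, nu=0.5)` would hold back a single small update until ten seconds have passed, but would request replication immediately after an update that moves the value by more than half.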
Other descriptions
Quality-of-Data should not be confused with other definitions for data quality,[7][8] such as:
- Completeness
- Validity
- Accuracy
References
- Nuno Santos; Luís Veiga; Paulo Ferreira (2007). "Vector-Field Consistency for Adhoc Gaming". http://www.gsd.inesc-id.pt/~lveiga/msc-08-09/vfc-middleware-07.pdf.
- Luís Veiga; André Negrão; Nuno Santos; Paulo Ferreira (2010). "Unifying Divergence Bounding and Locality Awareness in Replicated Systems with Vector-Field Consistency". http://www.gsd.inesc-id.pt/~lveiga/vfc-JISA-2010.pdf.
- "Apache HBase – Apache HBase™ Home". https://hbase.apache.org/.
- Sergio Estéves; João Silva; Luís Veiga (2013). "Quality-of-service for consistency of data geo-replication in cloud computing". http://www.gsd.inesc-id.pt/~sesteves/papers/vfc3-europar12.pdf.
- Álvaro García-Recuero; Sergio Estéves; Luís Veiga (2013). "Quality-of-Data for Consistency Levels in Geo-replicated Cloud Data Stores". http://www.inesc-id.pt/ficheiros/publicacoes/9253.pdf.
- Data Quality Published by IBM
- Richard Y. Wang (1992). "Toward quality data : an attribute-based approach". http://web.mit.edu/tdqm/www/tdqmpub/Toward%20Quality%20Data.pdf.
- George A. Mihaila; Louiqa Raschid; María-Esther Vidal (2000). "Using Quality of Data Metadata for Source Selection and Ranking".