Software:Druid (open-source data store)

Druid
Original author(s)	Eric Tschetter, Fangjin Yang
Developer(s)	The Druid community
Stable release	0.12.3 / 18 September 2018
Written in	Java
Operating system	Cross-platform
Type	distributed, real-time, column-oriented data store
License	Apache License 2.0
Website	druid.io

Druid is a column-oriented, open-source, distributed data store written in Java. Druid is designed to quickly ingest massive quantities of event data, and provide low-latency queries on top of the data.^[1] The name Druid comes from the shapeshifting Druid class in many role-playing games, to reflect the fact that the architecture of the system can shift to solve different types of data problems.

Druid is commonly used in business intelligence/OLAP applications to analyze high volumes of real-time and historical data.^[2] Druid is used in production by technology companies such as Alibaba,^[2] Airbnb,^[2] Cisco,^[3] eBay,^[4] Netflix,^[5] PayPal,^[2] and Yahoo,^[6] as well as by the Wikimedia Foundation.^[7]

History

Druid was started in 2011 to power the analytics product of a company named Metamarkets. The project was open-sourced under the GPL in October 2012,^[8]^[9] and moved to the Apache License in February 2015.^[10]^[11]

Over time, a number of organizations and companies have integrated Druid into their backend technology,^[2] and committers from many different organizations have joined the project.^[12]

In October 2015, the commercial company Imply launched to provide an enterprise product built around Druid.^[13]

Architecture

When fully deployed, Druid runs as a cluster of specialized processes (called nodes in Druid) that supports a fault-tolerant architecture^[14] in which data is stored redundantly and there is no single point of failure.^[15] The cluster relies on external dependencies for coordination (Apache ZooKeeper), metadata storage (e.g. MySQL, PostgreSQL, or Derby), and deep storage (e.g. HDFS or Amazon S3) for permanent data backup.

Query management

Client queries first hit broker nodes, which forward them to the appropriate data nodes (either historical or real-time). Since Druid segments may be partitioned, an incoming query can require data from multiple segments and partitions (or shards) stored on different nodes in the cluster. Brokers are able to learn which nodes have the required data, and also merge partial results before returning the aggregated result.
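The scatter-gather pattern described above can be illustrated with a minimal sketch. This is not Druid's implementation or API: the `DataNode` and `Broker` classes, the segment identifiers, and the sum aggregation are all hypothetical names chosen for illustration, and the nodes are simulated in memory rather than contacted over the network.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class DataNode:
    """Hypothetical stand-in for a historical or real-time node."""
    name: str
    # segment id -> list of (dimension value, metric value) rows
    segments: dict = field(default_factory=dict)

    def query(self, segment_ids):
        """Return a partial sum per dimension value over the requested segments."""
        partial = defaultdict(int)
        for seg_id in segment_ids:
            for dim, value in self.segments[seg_id]:
                partial[dim] += value
        return dict(partial)


class Broker:
    """Hypothetical broker: routes a query to the nodes holding the data, then merges."""

    def __init__(self, nodes):
        self.nodes = {node.name: node for node in nodes}
        # The broker's view of which node serves which segment.
        self.segment_locations = {
            seg_id: node.name for node in nodes for seg_id in node.segments
        }

    def query(self, segment_ids):
        # Scatter: group the requested segments by the node that holds them.
        by_node = defaultdict(list)
        for seg_id in segment_ids:
            by_node[self.segment_locations[seg_id]].append(seg_id)

        # Gather: merge the partial results returned by each node.
        merged = defaultdict(int)
        for node_name, segs in by_node.items():
            for dim, value in self.nodes[node_name].query(segs).items():
                merged[dim] += value
        return dict(merged)


historical = DataNode("historical-1", {"day1_shard0": [("us", 3), ("de", 1)]})
realtime = DataNode("realtime-1", {"day1_shard1": [("us", 2)], "day2_shard0": [("fr", 5)]})
broker = Broker([historical, realtime])
result = broker.query(["day1_shard0", "day1_shard1"])
```

In this sketch the query touches two shards of the same day that live on different nodes, so the broker contacts both nodes and adds their partial sums. A sum merges trivially; other aggregations would need their own merge step.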

Cluster management

Operations relating to data management in historical nodes are overseen by coordinator nodes. Apache ZooKeeper is used to register all nodes, manage certain aspects of internode communications, and provide for leader elections.
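As a rough illustration of the leader-election idea, the sketch below simulates a commonly used coordination-service recipe in which each participant registers under an increasing sequence number and the lowest live number is the leader. It is an assumption-laden toy, not ZooKeeper's API and not necessarily how Druid elects its leader: the `ElectionRegistry` class and the node names are hypothetical, and everything runs in memory.

```python
import itertools


class ElectionRegistry:
    """In-memory stand-in for a coordination service (hypothetical, not ZooKeeper's API)."""

    def __init__(self):
        self._counter = itertools.count()
        self._members = {}  # sequence number -> node name

    def register(self, node_name):
        """Register a node and return its sequence number."""
        seq = next(self._counter)
        self._members[seq] = node_name
        return seq

    def unregister(self, seq):
        """Remove a node, e.g. when its session ends."""
        self._members.pop(seq, None)

    def leader(self):
        """The live member with the lowest sequence number leads; None if empty."""
        if not self._members:
            return None
        return self._members[min(self._members)]


registry = ElectionRegistry()
seq_a = registry.register("coordinator-a")
seq_b = registry.register("coordinator-b")
first_leader = registry.leader()

# When the current leader goes away, the next registered node takes over.
registry.unregister(seq_a)
second_leader = registry.leader()
```

The point of the sketch is that leadership follows registration order, so when the leader disappears the remaining nodes agree on a successor without negotiating among themselves.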

Features

Low latency (streaming) data ingestion
Arbitrary slice and dice data exploration
Sub-second analytic queries
Approximate and exact computations

References

[1] Hemsoth, Nicole. "Druid Summons Strength in Real-Time", Datanami, 08 November 2012
[2] druid. "Druid | Powered by Druid". http://druid.io/druid-powered.html.
[3] Butler, Brandon. "Under the hood of Cisco’s Tetration Analytics platform". http://www.networkworld.com/article/3086250/cisco-subnet/under-the-hood-of-cisco-s-tetration-analytics-platform.html.
[4] "Druid at Pulsar - ebay的专栏 - 博客频道 - CSDN.NET". http://blog.csdn.net/ebay/article/details/50205611.
[5] "The Netflix Tech Blog: Announcing Suro: Backbone of Netflix's Data Pipeline". http://techblog.netflix.com/2013/12/announcing-suro-backbone-of-netflixs.html.
[6] "Complementing Hadoop at Yahoo: Interactive Analytics with Druid". http://yahooeng.tumblr.com/post/125287346011/complementing-hadoop-at-yahoo-interactive.
[7] https://conferences.oreilly.com/strata/strata-ny/public/schedule/detail/60986
[8] Tschetter, Eric. "Introducing Druid", Druid.io, 24 October 2012
[9] Higginbotham, Stacey. "Metamarkets open sources Druid, its in-memory database", GigaOM, 24 October 2012
[10] Harris, Derrick (2015-02-20). "The Druid real-time database moves to an Apache license". https://gigaom.com/2015/02/20/the-druid-real-time-database-moves-to-an-apache-license/. Retrieved 2015-08-04.
[11] "Druid Gets Open Source-ier Under the Apache License". https://metamarkets.com/2015/druids-now-open-source-ier-under-an-apache-license/. Retrieved 2015-08-04.
[12] druid. "Druid | Druid Community". http://druid.io/community/.
[13] Novet, Jordan. "Imply launches with $2M to commercialize the Druid open-source data store", VentureBeat, 19 October 2015
[14] Druid Project Documentation
[15] Yang, Fangjin; Tschetter, Eric; Léauté, Xavier; Ray, Nelson; Merlino, Gian; Ganguli, Deep. "Druid: A Real-time Analytical Data Store", Metamarkets, retrieved 6 February 2014

External links

Official website
