Apache Beam

Short description: Unified programming model for data processing pipelines
Original author(s): Google
Developer(s): Apache Software Foundation
Initial release: June 15, 2016
Stable release: 2.53.0 (January 4, 2024)[1]
Repository: Beam Repository
Written in: Java, Python, Go
Operating system: Cross-platform
License: Apache License 2.0
Website: beam.apache.org

Apache Beam is an open-source, unified programming model for defining and executing data processing pipelines, including ETL, batch, and stream (continuous) processing.[2] Beam pipelines are defined using one of the provided SDKs and executed on one of Beam's supported runners (distributed processing back-ends), including Apache Flink, Apache Samza, Apache Spark, and Google Cloud Dataflow.[3]
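The separation between defining a pipeline and choosing where it runs can be illustrated with a small, self-contained sketch. It deliberately does not use the Beam SDK: `ToyPipeline`, `flat_map`, `SequentialRunner`, and `ChunkedRunner` are hypothetical names invented for this illustration, and the real SDKs and runners are considerably richer. The sketch only assumes that every step is element-wise, so a back-end is free to split the input however it likes.

```python
from typing import Callable, Iterable, List

# Hypothetical names throughout: this is NOT the Beam SDK, only a toy model
# of "define the pipeline once, pick the runner later".


class ToyPipeline:
    """An ordered list of element-wise steps; building it executes nothing."""

    def __init__(self) -> None:
        self.steps: List[Callable[[Iterable], Iterable]] = []

    def apply(self, step: Callable[[Iterable], Iterable]) -> "ToyPipeline":
        self.steps.append(step)
        return self  # returning self allows chained .apply() calls


def flat_map(fn: Callable) -> Callable[[Iterable], Iterable]:
    """Build a step that expands each element into zero or more outputs."""

    def step(elements: Iterable) -> Iterable:
        for element in elements:
            yield from fn(element)

    return step


class SequentialRunner:
    """Runs every step over the whole input in a single pass."""

    def run(self, pipeline: ToyPipeline, source: Iterable) -> list:
        data = source
        for step in pipeline.steps:
            data = step(data)
        return list(data)


class ChunkedRunner:
    """Runs the same steps chunk by chunk, standing in for a back-end
    that splits the input before processing it."""

    def __init__(self, chunk_size: int) -> None:
        self.chunk_size = chunk_size

    def run(self, pipeline: ToyPipeline, source: Iterable) -> list:
        items = list(source)
        results = []
        for start in range(0, len(items), self.chunk_size):
            chunk = items[start:start + self.chunk_size]
            results.extend(SequentialRunner().run(pipeline, chunk))
        return results


def build_pipeline() -> ToyPipeline:
    # Split each line into words, then normalise every word to lower case.
    return (
        ToyPipeline()
        .apply(flat_map(str.split))
        .apply(flat_map(lambda word: [word.lower()]))
    )


lines = ["Beam unifies batch", "and stream", "Batch and Stream"]
sequential = SequentialRunner().run(build_pipeline(), lines)
chunked = ChunkedRunner(chunk_size=2).run(build_pipeline(), lines)
```

Both runners receive the same pipeline definition and produce the same list of lower-cased words; only the execution strategy differs. In this toy model that holds because each step looks at one element at a time, which is the property that lets a back-end partition the work.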

History

Apache Beam[3] is one implementation of the model described in the Dataflow model paper.[4] The Dataflow model builds on earlier work on distributed processing abstractions at Google, in particular FlumeJava[5] and MillWheel.[6][7]

In 2014, Google released an open SDK implementation of the Dataflow model, together with an environment for executing Dataflow pipelines either locally (non-distributed) or in the Google Cloud Platform service.

Timeline

Apache Beam aims to make a minor release every 6 weeks.[8]

Version Release date
2.53.0 2024-01-04
2.52.0 2023-11-17
2.51.0 2023-10-11
2.50.0 2023-08-30
2.49.0 2023-07-17
2.48.0 2023-05-31
2.47.0 2023-05-10
2.46.0 2023-03-10
2.45.0 2023-02-15
2.44.0 2023-01-12
2.43.0 2022-11-17
2.42.0 2022-10-17
2.41.0 2022-08-23
2.40.0 2022-06-27
2.39.0 2022-05-25
2.38.0 2022-04-20
2.37.0 2022-03-04
2.36.0 2022-02-07
2.35.0 2021-12-29
2.34.0 2021-11-11
2.33.0 2021-10-07
2.32.0 2021-08-25
2.31.0 2021-07-08
2.30.0 2021-06-09
2.29.0 2021-04-27
2.28.0 2021-02-22
2.27.0 2021-01-08
2.26.0 2020-12-11
2.25.0 2020-10-23
2.24.0 2020-09-18
2.23.0 2020-07-29
2.22.0 2020-06-08
2.21.0 2020-05-27
2.20.0 2020-04-15
2.19.0 2020-02-04
2.18.0 2020-01-23
2.17.0 2020-01-06
2.16.0 2019-10-07
2.15.0 2019-08-22
2.14.0 2019-08-01
2.13.0 2019-05-22
2.12.0 2019-04-25
2.11.0 2019-02-26
2.10.0 2019-02-01
2.9.0 2018-12-13
2.8.0 2018-10-29
2.7.0 (LTS) 2018-10-03
2.6.0 2018-08-08
2.5.0 2018-06-26
2.4.0 2018-03-20
2.3.0 2018-01-30
2.2.0 2017-12-02
2.1.0 2017-08-23
2.0.0 2017-05-17
0.6.0 2017-03-11
0.5.0 2017-02-02
0.4.0 2016-12-29
0.3.0 2016-10-31
0.2.0 2016-08-08
0.1.0 2016-06-15

References

  1. "Blogs". The Apache Software Foundation. https://beam.apache.org/blog/beam-2.53.0/. 
  2. Woodie, Alex (22 April 2016). "Apache Beam's Ambitious Goal: Unify Big Data Development". https://www.datanami.com/2016/04/22/apache-beam-emerges-ambitious-goal-unify-big-data-development/. 
  3. "Cloud Dataflow - Batch & Stream Data Processing". https://cloud.google.com/dataflow/. 
  4. Akidau, Tyler; Schmidt, Eric; Whittle, Sam; Bradshaw, Robert; Chambers, Craig; Chernyak, Slava; Fernández-Moctezuma, Rafael J.; Lax, Reuven et al. (1 August 2015). "The dataflow model". Proceedings of the VLDB Endowment 8 (12): 1792–1803. doi:10.14778/2824032.2824076. http://www.vldb.org/pvldb/vol8/p1792-Akidau.pdf. Retrieved 4 August 2016. 
  5. Chambers, Craig; Raniwala, Ashish; Perry, Frances; Adams, Stephen; Henry, Robert R.; Bradshaw, Robert; Weizenbaum, Nathan (1 January 2010). "FlumeJava: Easy, efficient data-parallel pipelines". Proceedings of the 31st ACM SIGPLAN Conference on Programming Language Design and Implementation. ACM. pp. 363–375. doi:10.1145/1806596.1806638. ISBN 9781450300193. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/35650.pdf. Retrieved 4 August 2016. 
  6. Akidau, Tyler; Whittle, Sam; Balikov, Alex; Bekiroğlu, Kaya; Chernyak, Slava; Haberman, Josh; Lax, Reuven; McVeety, Sam et al. (27 August 2013). "MillWheel". Proceedings of the VLDB Endowment 6 (11): 1033–1044. doi:10.14778/2536222.2536229. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/41378.pdf. Retrieved 4 August 2016. 
  7. Pointer, Ian (14 April 2016). "Apache Beam wants to be uber-API for big data". InfoWorld. http://www.infoworld.com/article/3056172/application-development/apache-beam-wants-to-be-uber-api-for-big-data.html. 
  8. "Policies". https://beam.apache.org/community/policies/.