Software:Tidyverse

Tidyverse
The tidyverse hex logo
Initial release: September 15, 2016[1][2]
Repository: github.com/tidyverse/tidyverse
Written in: R
Type: Package collection
License: MIT
Website: www.tidyverse.org

The tidyverse is a collection of open-source packages for the R programming language, introduced by Hadley Wickham[3] and his team, that "share an underlying design philosophy, grammar, and data structures"[4] built around tidy data. Characteristic features of tidyverse packages include extensive use of non-standard evaluation and encouragement of piping.[5][6][7]
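The two features named above can be illustrated with a minimal sketch. It assumes the dplyr package is installed and uses mtcars, a data set bundled with R; the particular choice of columns and summary is only an illustration, not an example taken from the cited sources.

```r
library(dplyr)

# Piping: each step passes its result to the next function via %>%.
# Non-standard evaluation: cyl, gear and mpg are column names of mtcars,
# written without quotes and without the mtcars$ prefix.
result <- mtcars %>%
  filter(cyl == 4) %>%                          # keep four-cylinder cars
  group_by(gear) %>%                            # one group per gear count
  summarise(mean_mpg = mean(mpg), n = n()) %>%  # per-group mean and row count
  arrange(desc(mean_mpg))                       # highest mean first

# For comparison, the same row filter in base R refers to the
# data frame explicitly each time a column is used.
base_subset <- mtcars[mtcars$cyl == 4, ]

print(result)
```

Reading the pipeline top to bottom gives the order in which the steps run, which is the readability argument usually made for piping.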

As of November 2018, the tidyverse package and some of its individual packages accounted for 5 of the 10 most downloaded R packages.[8] The tidyverse is the subject of multiple books and papers.[9][10][11][12] In 2019, a paper describing the ecosystem was published in the Journal of Open Source Software.[13]

Its syntax has been described as "supremely readable",[14] and some have argued that the tidyverse is an effective way to introduce complete beginners to programming, because it lets students begin doing data processing tasks quickly.[16][15] Some practitioners have also pointed out that data processing steps are easier to chain together with the tidyverse than with pandas, Python's equivalent data processing package.[17] There is an active R community around the tidyverse. For example, the TidyTuesday social data project, organised by the Data Science Learning Community (DSLC),[18] releases varied real-world datasets each week so that community members can participate, share, and practise working with data.[19] Critics of the tidyverse have argued that it promotes tools that are harder to teach and learn than their built-in base R equivalents and that are too dissimilar to some programming languages.[20][21]

More generally, the tidyverse principles encourage a "universe" of streamlined packages that share conventions, which in principle helps alleviate dependency issues and improves compatibility with current and future features.[22] An example of this approach is the pharmaverse, a collection of R packages for clinical reporting in the pharmaceutical industry.[23]

Packages

The core tidyverse packages, which provide functionality to model, transform, and visualize data, include:[24]

  • ggplot2 – for data visualization
  • dplyr – for wrangling and transforming data
  • tidyr – helps transform data into tidy data, where each variable is a column, each observation is a row, and each value is a cell
  • readr – helps read in common delimited text files with data
  • purrr – a functional programming toolkit
  • tibble – a modern implementation of the built-in data frame data structure
  • stringr – helps to manipulate string data types
  • forcats – helps to manipulate categorical data types
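The tidy data layout described in the tidyr entry can be sketched as follows. This assumes the tidyr and tibble packages are installed; the table, its column names, and its values are hypothetical and exist only for illustration.

```r
library(tidyr)
library(tibble)

# Hypothetical "wide" table: the years are spread across column headers,
# so the variable "year" is not stored in a column of its own.
wide <- tibble(
  country = c("A", "B"),
  `2019`  = c(10, 20),
  `2020`  = c(11, 21)
)

# pivot_longer() moves the year columns into two new columns, giving one
# row per observation (a country in a year) and one column per variable.
long <- pivot_longer(
  wide,
  cols      = c(`2019`, `2020`),
  names_to  = "year",
  values_to = "value"
)

print(long)
```

In the reshaped table each of the three variables (country, year, value) has its own column, which is the form the other core packages generally expect as input.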

Additional packages assist the core collection.[25] Other packages based on the tidy data principles are regularly developed, such as tidytext[26] for text analysis, tidymodels[27] for machine learning, or tidyquant[28] for financial operations.

References

  1. Wickham, Hadley. "tidyverse 1.0.0". https://posit.co/blog/tidyverse-1-0-0/. 
  2. Wickham, Hadley (April 15, 2025). "A personal history of the tidyverse". https://hadley.github.io/25-tidyverse-history/index.pdf. 
  3. "Welcome to the Tidyverse". https://blog.revolutionanalytics.com/2016/09/tidyverse.html. 
  4. "Tidyverse" (in en-us). https://www.tidyverse.org/. 
  5. Stefan Milton Bache; Hadley Wickham (2014-11-22), magrittr: A Forward-Pipe Operator for R, https://cran.r-project.org/package=magrittr, retrieved 2020-04-20 
  6. Wickham, Hadley. 4 Pipes | The tidyverse style guide. https://style.tidyverse.org/pipes.html. 
  7. Wickham, Hadley (May 30, 2019). Advanced R (2nd ed.). New York: Chapman & Hall. ISBN 978-0815384571. 
  8. "RDocumentation". https://www.rdocumentation.org/trends. 
  9. Duggan, Jim (2018-09-07). "Input and output data analysis for system dynamics modelling using the tidyverse libraries of R" (in en). System Dynamics Review 34 (3): 438–461. doi:10.1002/sdr.1600. ISSN 0883-7066. 
  10. Chang, Winston (2013). R Graphics Cookbook (in en). O'Reilly Media, Inc. ISBN 9781449316952. https://books.google.com/books?id=_iVFgKTRYrQC&q=ggplot2. 
  11. Boehmke, Bradley C. (2016-11-17). Data wrangling with R. Cham: Springer. ISBN 9783319455990. OCLC 964404346. 
  12. Wickham, Hadley; Grolemund, Garrett (2017). R for data science: import, tidy, transform, visualize, and model data (First ed.). Sebastopol, CA: O'Reilly Media. ISBN 9781491910399. OCLC 968213225. 
  13. Wickham, Hadley; Averick, Mara; Bryan, Jennifer; Chang, Winston; McGowan, Lucy D'Agostino; François, Romain; Grolemund, Garrett; Hayes, Alex et al. (21 November 2019). "Welcome to the Tidyverse". Journal of Open Source Software 4 (43): 1686. doi:10.21105/joss.01686. Bibcode: 2019JOSS....4.1686W. 
  14. Steinmetz, Art (2024-04-10). "Outsider Data Science - The Truth About Tidy Wrappers" (in en). https://outsiderdata.netlify.app/posts/2024-04-10-the-truth-about-tidy-wrappers/benchmark_wrappers.html. 
  15. Heppler, Jason (2018-02-27). "Teaching the tidyverse to R novices" (in en). https://medium.com/@jaheppler/teaching-the-tidyverse-to-r-novices-7747e8ce14e. 
  16. "Teach the tidyverse to beginners" (in en). 5 July 2017. http://varianceexplained.org/r/teach-tidyverse/. 
  17. "Why pandas feels clunky when coming from R" (in en-us). https://sumsar.net/blog/pandas-feels-clunky-when-coming-from-r/. 
  18. "dslc.io" (in en). https://dslc.io/. 
  19. rfordatascience/tidytuesday, Data Science Learning Community, 2024-08-11, https://github.com/rfordatascience/tidytuesday, retrieved 2024-08-11 
  20. Matloff, Norm (30 September 2019). "An opinionated view of the Tidyverse "dialect" of the R language". https://github.com/matloff/TidyverseSkeptic. Retrieved 28 October 2019. 
  21. Muenchen, Bob (23 March 2017). "The Tidyverse Curse" (in en). http://r4stats.com/2017/03/23/the-tidyverse-curse/. 
  22. "The Power of Transitioning to a '-verse' Approach in R Package Development" (in en). https://www.appsilon.com/post/the-power-of-transitioning-to-a-verse. 
  23. "pharmaverse". https://pharmaverse.org/. 
  24. "Tidyverse packages - Tidyverse" (in en-us). https://www.tidyverse.org/packages/. 
  25. "Tidyverse packages" (in en-us). https://www.tidyverse.org/packages/. 
  26. Silge, Julia (2023-02-01), tidytext: Text mining using tidy tools, https://github.com/juliasilge/tidytext, retrieved 2023-02-03 
  27. "Tidymodels" (in en-us). https://www.tidymodels.org/. 
  28. "Tidy Quantitative Financial Analysis" (in en). https://business-science.github.io/tidyquant/. 
