R package
R packages are extensions to the R statistical programming language. R packages contain code, data, and documentation in a standardised collection format that can be installed by users of R, typically via a centralised software repository such as CRAN (the Comprehensive R Archive Network).[1][2] The large number of packages available for R, and the ease of installing and using them, have been cited as major factors driving the widespread adoption of the language in data science.[3][4][5][6]
Compared to libraries in other programming languages, R packages must conform to a relatively strict specification.[3] The Writing R Extensions manual[7] specifies a standard directory structure for R source code, data, documentation, and package metadata, which enables them to be installed and loaded using R's in-built package management tools.[3] Packages distributed on CRAN must meet additional standards.[3][8] According to John Chambers, whilst these requirements "impose considerable demands" on package developers, they improve the usability and long-term stability of packages for end users.[3]
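The idea of a standard directory structure can be illustrated with a small check. The following is only a rough sketch, written in Python for illustration rather than in R, and it is not how R's own tooling validates packages. It assumes the layout described in the Writing R Extensions manual, in which a source package holds DESCRIPTION and NAMESPACE files alongside conventional subdirectories such as R and man. The function names and the demonstration package "demopkg" are hypothetical.

```python
import tempfile
from pathlib import Path

# Files that a source package is expected to contain (assumed from the manual).
REQUIRED_FILES = ("DESCRIPTION", "NAMESPACE")
# Conventional subdirectories; a package may omit any of them.
CONVENTIONAL_DIRS = ("R", "man", "data", "src", "tests", "vignettes")


def check_package_layout(pkg_dir):
    """Report missing required files and the conventional subdirectories found."""
    root = Path(pkg_dir)
    missing = [name for name in REQUIRED_FILES if not (root / name).is_file()]
    present = [name for name in CONVENTIONAL_DIRS if (root / name).is_dir()]
    return {"missing_files": missing, "present_dirs": present}


def make_demo_package(parent_dir, name="demopkg"):
    """Create a tiny, hypothetical package skeleton (deliberately without NAMESPACE)."""
    root = Path(parent_dir) / name
    (root / "R").mkdir(parents=True)
    (root / "man").mkdir()
    (root / "DESCRIPTION").write_text(
        "Package: demopkg\nVersion: 0.1.0\nTitle: Demo Package\n"
    )
    (root / "R" / "hello.R").write_text('hello <- function() "hello"\n')
    return root


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # The skeleton lacks a NAMESPACE file, so the report flags it.
        print(check_package_layout(make_demo_package(tmp)))
```

Because every package follows the same layout, a generic tool can inspect any package without package-specific knowledge, which is the property the specification is meant to provide.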
Repositories
Comprehensive R Archive Network (CRAN)
The Comprehensive R Archive Network (CRAN) is R's central software repository, supported by the R Foundation.[9] It contains an archive of the latest and previous versions of the R distribution, documentation, and contributed R packages.[10] It includes both source packages and pre-compiled binaries for Windows and macOS.[11] As of November 2020, more than 16,000 packages were available.[12] CRAN was created by Kurt Hornik and Friedrich Leisch in 1997,[13][14] with the name paralleling other early packaging systems such as TeX's CTAN (released 1992) and Perl's CPAN (released 1995).[15] As of 2021, it was still maintained by Hornik and a team of volunteers.[9] The master site is located at the Vienna University of Economics and Business and is mirrored on servers around the world.[10]
The "Task Views" page (subject list) on the CRAN website[16] lists a wide range of tasks (in fields such as finance, genetics, high performance computing, machine learning, medical imaging, meta-analysis, social sciences and spatial statistics) for which R packages are available. Another way to browse CRAN packages is provided by Metacran,[17] which also maintains lists of featured, most downloaded, trending or most depended upon packages.
The number of CRAN packages has grown exponentially for many years,[18] and as of 2018 an average of 21 new or updated packages were submitted every day.[6] Each submission is manually reviewed by a small team of CRAN maintainers, many of whom, according to R core developer Peter Dalgaard, are "approaching pensionable age", so there is concern that this system is not sustainable in the long term.[6] The growth of CRAN has also exposed limitations of its dependency management infrastructure. In particular, CRAN assumes that a dependency always refers to the latest version of a package, so new releases of CRAN packages must always be backwards compatible,[19] and CRAN packages cannot have dependencies that are not on CRAN.[20] This growth has also led to concerns about declining package quality.[21]
MRAN and Posit Package Manager
The Microsoft R Application Network (MRAN) was a mirror of CRAN maintained by Microsoft and based on the company's downstream distribution of R, Microsoft R Open (formerly Revolution R Open).[22] It also included an archive of daily CRAN snapshots, branded as the "CRAN Time Machine", which enabled users of MRAN to bypass the dependency versioning limitations of CRAN by installing a fixed set of R package versions via the checkpoint package.[23][24] In January 2023, Microsoft announced that MRAN was being retired, and the associated websites and repositories became unavailable in July 2023.[25]
The Posit Package Manager (formerly RStudio Package Manager) is a similar tool produced by the developers of RStudio which, in addition to CRAN snapshots, includes an archive of R packages from Bioconductor and Python packages from the Python Package Index.[26] It also distributes pre-compiled binary packages for Linux (only Windows and macOS binaries are included on CRAN).[27]
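The difference between installing the latest release and installing from a dated snapshot can be sketched as follows. This is a simplified illustration in Python, not the actual implementation of MRAN, the checkpoint package, or Posit Package Manager. The package name "pkgA", its release history, and both function names are hypothetical.

```python
from datetime import date

# Hypothetical release history: package name -> list of (version, release date).
RELEASES = {
    "pkgA": [
        ("1.0.0", date(2019, 3, 1)),
        ("1.1.0", date(2019, 9, 15)),
        ("2.0.0", date(2020, 6, 1)),
    ],
}


def latest_version(releases, package):
    """Version served by a repository that only offers the newest release."""
    return max(releases[package], key=lambda item: item[1])[0]


def snapshot_version(releases, package, snapshot_date):
    """Newest release published on or before snapshot_date, or None if there is none."""
    candidates = [item for item in releases[package] if item[1] <= snapshot_date]
    if not candidates:
        return None
    return max(candidates, key=lambda item: item[1])[0]


if __name__ == "__main__":
    print(latest_version(RELEASES, "pkgA"))
    print(snapshot_version(RELEASES, "pkgA", date(2020, 1, 1)))
```

In this toy model, an analysis pinned to the snapshot of 1 January 2020 keeps resolving "pkgA" to version 1.1.0 even after 2.0.0 is released, whereas a latest-only repository would serve 2.0.0.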
Other repositories
The Bioconductor project provides R packages for the analysis of genomic data. This includes object-oriented data-handling and analysis tools for data from Affymetrix, cDNA microarray, and next-generation high-throughput sequencing methods.[28]
R-Forge[29] is a central platform for the collaborative development of R packages, R-related software, and projects. R-Forge also hosts many unpublished beta packages and development versions of CRAN packages.
Base and recommended packages
R is distributed with fifteen "base packages": base, compiler, datasets, grDevices, graphics, grid, methods, parallel, splines, stats, stats4, tcltk, tools, translations, and utils.[30]
In addition, there are fifteen "recommended packages" from CRAN which are included with binary distributions of R: KernSmooth, MASS, Matrix, boot, class, cluster, codetools, foreign, lattice, mgcv, nlme, nnet, rpart, spatial, and survival.[30]
Other packages
A group of packages called the Tidyverse, which can be considered a "dialect of the R language", is increasingly popular in the R ecosystem. As of 2020-06-13, Metacran[17] listed 7 of the 8 core packages of the Tidyverse in the list of most downloaded R packages. The group of packages strives to provide a cohesive collection of functions to deal with common data science tasks, including data import, cleaning, transformation and visualisation (notably with the ggplot2 package).
The R Infrastructure packages[31] support coding and the development of R packages. As of 2021-05-04, Metacran[17] listed 16 of these packages among the 25 most downloaded packages.
References
1. Hornik, Kurt (2020-02-20). "Frequently Asked Questions on R". 7.29: What is the difference between package and library?. https://cran.r-project.org/doc/FAQ/R-FAQ.html#R-Add_002dOn-Packages.
2. Wickham, Hadley; Bryan, Jennifer. "Introduction". R Packages (2nd ed.). https://r-pkgs.org/intro.html. Retrieved 2020-11-02.
3. Chambers, John M. (2020). "S, R, and Data Science" (in en). The R Journal 12 (1): 462–476. doi:10.32614/RJ-2020-028. ISSN 2073-4859. https://journal.r-project.org/archive/2020/RJ-2020-028/index.html. Retrieved 2020-11-02.
4. Vance, Ashlee (2009-01-06). "Data Analysts Captivated by R's Power". New York Times. https://www.nytimes.com/2009/01/07/technology/business-computing/07program.html.
5. Tippmann, Sylvia (2014-12-29). "Programming tools: Adventures with R" (in en). Nature News 517 (7532): 109–110. doi:10.1038/517109a. PMID 25557714.
6. Thieme, Nick (2018). "R generation" (in en). Significance 15 (4): 14–19. doi:10.1111/j.1740-9713.2018.01169.x. ISSN 1740-9713.
7. "Writing R Extensions". https://cran.r-project.org/doc/manuals/r-release/R-exts.html.
8. "CRAN Repository Policy". https://cran.r-project.org/web/packages/policies.html.
9. "CRAN Repository Policy". R Project. https://cran.r-project.org/web/packages/policies.html.
10. Hornik, Kurt (2020-02-20). "Frequently Asked Questions on R". 2.1: What is CRAN?: R Project. https://cran.r-project.org/doc/FAQ/R-FAQ.html#What-is-CRAN_003f.
11. CRAN Repository Maintainers. "The Comprehensive R Archive Network". R Project. https://cran.r-project.org/.
12. CRAN Repository Maintainers. "CRAN - Contributed Packages". CRAN. https://cran.r-project.org/web/packages.
13. Hornik, Kurt (1997-04-23). "ANNOUNCE: CRAN". r-announce (Mailing list). Archived from the original on 2021-03-08. Retrieved 2020-11-20.
14. Thieme, Nick (2018). "R generation" (in en). Significance 15 (4): 14–19. doi:10.1111/j.1740-9713.2018.01169.x. ISSN 1740-9713.
15. Fitzgerald, Brian (2016-02-09). "A Survey of Programming Language Package Systems". https://neurocline.github.io/papers/survey-of-programming-language-packaging-systems.html.
16. "CRAN Task Views". https://cran.r-project.org/web/views/.
17. "Metacran". https://www.r-pkg.org/downloaded.
18. Asay, Matt (2016-04-21). "Exponential growth of R's open source community threatens commercial competitors" (in en). https://www.techrepublic.com/article/exponential-growth-of-rs-open-source-community-threatens-commercial-competitors/.
19. Ooms, Jeroen (2013). "Possible Directions for Improving Dependency Versioning in R" (in en). The R Journal 5 (1): 197–206. doi:10.32614/RJ-2013-019. ISSN 2073-4859. https://journal.r-project.org/archive/2013/RJ-2013-019/index.html. Retrieved 2020-11-02.
20. Decan, A.; Mens, T.; Claes, M.; Grosjean, P. (2016). "When GitHub Meets CRAN: An Analysis of Inter-Repository Package Dependency Problems". 2016 IEEE 23rd International Conference on Software Analysis, Evolution, and Reengineering (SANER). 1. pp. 493–504. doi:10.1109/SANER.2016.12. ISBN 978-1-5090-1855-0. https://ieeexplore.ieee.org/document/7476669. Retrieved 2021-05-12.
21. Hornik, Kurt (2012). "Are There Too Many R Packages?" (in en). Austrian Journal of Statistics 41 (1): 59–66. doi:10.17713/ajs.v41i1.188. ISSN 1026-597X. https://www.ajs.or.at/index.php/ajs/article/view/vol41%2C%20no1%20-%205. Retrieved 2020-11-02.
22. "Welcome to MRAN". Microsoft. https://mran.microsoft.com/.
23. "Reproducibility: Using Fixed CRAN Repository Snapshots". Microsoft. https://mran.microsoft.com/documents/rro/reproducibility.
24. Smith, David (2019-05-22). "MRAN snapshots, and you". Revolution Analytics. https://blog.revolutionanalytics.com/2019/05/cran-snapshots-and-you.html.
25. "Microsoft R Application Network retirement" (in en). https://techcommunity.microsoft.com/t5/azure-sql-blog/microsoft-r-application-network-retirement/ba-p/3707161.
26. Lopp, Sean (2020-12-07). "RStudio Package Manager 1.2.0 - Bioconductor & PyPI" (in en-us). RStudio. https://blog.rstudio.com/2020/12/07/package-manager-1-2-0/.
27. Lopp, Sean (2020-07-01). "Announcing Public Package Manager and v1.1.6" (in en-us). RStudio. https://blog.rstudio.com/2020/07/01/announcing-public-package-manager/.
28. Huber, W; Carey, VJ; Gentleman, R; Anders, S; Carlson, M; Carvalho, BS; Bravo, HC; Davis, S et al. (2015). "Orchestrating high-throughput genomic analysis with Bioconductor". Nature Methods (Nature Publishing Group) 12 (2): 115–121. doi:10.1038/nmeth.3252. PMID 25633503.
29. "R-Forge: Welcome". https://r-forge.r-project.org/.
30. Hornik, Kurt (2020-02-20). "Frequently Asked Questions on R". 5.1: Which add-on packages exist for R?. https://cran.r-project.org/doc/FAQ/R-FAQ.html#R-Add_002dOn-Packages.
31. "R infrastructure". https://github.com/r-lib.
Further reading
- Claes, M.; Mens, T.; Grosjean, P. (2014). "On the maintainability of CRAN packages". 2014 Software Evolution Week - IEEE Conference on Software Maintenance, Reengineering, and Reverse Engineering (CSMR-WCRE). pp. 308–312. doi:10.1109/CSMR-WCRE.2014.6747183. ISBN 978-1-4799-3752-3. https://ieeexplore.ieee.org/document/6747183.
- Decan, Alexandre; Mens, Tom; Claes, Maelick; Grosjean, Philippe (2015-09-07). "On the Development and Distribution of R Packages". Proceedings of the 2015 European Conference on Software Architecture Workshops. ECSAW '15. Dubrovnik, Cavtat, Croatia: Association for Computing Machinery. pp. 1–6. doi:10.1145/2797433.2797476. ISBN 978-1-4503-3393-1. https://doi.org/10.1145/2797433.2797476.
- Fox, John (2009). "Aspects of the Social Organization and Trajectory of the R Project" (in en). The R Journal 1 (2): 5–13. doi:10.32614/RJ-2009-014. ISSN 2073-4859. https://journal.r-project.org/archive/2009/RJ-2009-014/index.html.
- Fox, John; Leanage, Allison (12 September 2016). "R and the Journal of Statistical Software" (in en). Journal of Statistical Software 73 (1): 1–13. doi:10.18637/jss.v073.i02. ISSN 1548-7660. https://www.jstatsoft.org/article/view/v073i02.
- Plakidas, Konstantinos; Schall, Daniel; Zdun, Uwe (2017). "Evolution of the R software ecosystem: Metrics, relationships, and their impact on qualities" (in en). Journal of Systems and Software 132: 119–146. doi:10.1016/j.jss.2017.06.095. ISSN 0164-1212. http://www.sciencedirect.com/science/article/pii/S0164121217301371.
External links
- The Comprehensive R Archive Network (CRAN)
- METACRAN, a directory of R packages
- CRAN Task Views, listing of CRAN packages by topics
Original source: https://en.wikipedia.org/wiki/R package.