DraCor

DraCor (Drama Corpora) is an open digital infrastructure developed for the computational study of European drama from Greco-Roman antiquity to the 20th century. The platform hosts plays in various languages, encoded in the Text Encoding Initiative (TEI) format, and supports comparative and computational methods in drama studies. As of 2025, the collection comprised over 4,000 texts in more than 20 languages. Data provided by DraCor has been widely used in digital humanities research.[1] The project was awarded the Rahtz Prize for TEI Ingenuity by the TEI Consortium in 2022.[2]

Overview

DraCor aims to create reliable, expandable, and interoperable corpora of dramatic literature. The project emphasises the concept of Programmable Corpora,[3] where the data is not only accessible but also designed for computational analysis through APIs and integration with other tools. The platform strives to adhere to FAIR data principles (Findability, Accessibility, Interoperability, Reusability).

Key features

  • Multilingual corpora: Contains drama corpora in more than 20 languages, primarily European.
  • TEI encoding: Texts are encoded according to the TEI guidelines to maintain structural and semantic consistency.
  • API access: Provides a documented Application Programming Interface for programmatic access to texts and metadata.
  • Network visualisations: Generates network graphs representing character co-occurrences within plays.
  • Data download: Offers options to download subsets of texts, such as speeches or stage directions, as well as network data.
  • Open access: Data is openly available for research and related purposes.
  • Programmable Corpora: Supports integration with external analytical tools and programming languages, with API wrappers available for Python (pydracor[4]) and R (rdracor[5]).
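As a rough illustration of the API access described above, the following Python sketch builds request URLs and decodes a JSON response using only the standard library. The base URL, the path layout (`corpora/<corpus>/plays/<play>`) and the helper names `play_url` and `fetch_json` are assumptions made for this example, not a statement of the official interface; the API documentation and the pydracor and rdracor wrappers remain the authoritative route.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumed base URL of the public API; verify against the API documentation.
API_BASE = "https://dracor.org/api/v1"


def play_url(corpus: str, play: str, resource: str = "") -> str:
    """Build the URL of a play, or of one of its sub-resources (hypothetical helper)."""
    parts = ["corpora", corpus, "plays", play]
    if resource:
        parts.extend(resource.split("/"))
    # Percent-encode each path segment so unusual identifiers stay valid.
    return API_BASE + "/" + "/".join(quote(part, safe="") for part in parts)


def fetch_json(url: str, timeout: float = 30.0):
    """Fetch a URL and decode its JSON body (requires network access)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Under these assumptions, `fetch_json(play_url("ger", "lessing-emilia-galotti"))` would request the metadata of a single play, provided the corpus and play identifiers exist in that form.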

Corpora

DraCor's collection of corpora is continuously growing and covers plays in Dutch, English, French, German, Ancient Greek, Hungarian, Italian, Latin, Polish, Russian, Spanish, Swedish, Ukrainian, and other languages. Each corpus is curated by individual scholars or teams[6] and provides rich metadata alongside TEI-encoded texts, supporting analyses of dramatic structures, character interactions, and related topics.
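Because the texts are TEI-encoded, structural units such as speeches and stage directions can be read with a generic XML parser. The sketch below is a minimal example on an invented TEI fragment; it assumes that speeches are marked with `<sp who="#id">` and stage directions with `<stage>`, as is common in TEI drama encoding. The sample text, the character identifiers and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# The TEI namespace; element names below follow general TEI drama conventions.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Invented miniature play used only for illustration.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <div type="scene">
      <stage>Enter two characters.</stage>
      <sp who="#anna"><speaker>Anna</speaker><p>Good morning.</p></sp>
      <sp who="#boris"><speaker>Boris</speaker><p>Is it?</p></sp>
      <sp who="#anna"><speaker>Anna</speaker><p>It is.</p></sp>
    </div>
  </body></text>
</TEI>"""


def speeches_per_speaker(tei_xml: str) -> Counter:
    """Count <sp> elements per character reference in the who attribute."""
    root = ET.fromstring(tei_xml)
    counts = Counter()
    for sp in root.iterfind(".//tei:sp", TEI_NS):
        # who may list several references separated by whitespace.
        for ref in sp.get("who", "").split():
            counts[ref.lstrip("#")] += 1
    return counts


def stage_directions(tei_xml: str) -> list:
    """Return the text of every <stage> element."""
    root = ET.fromstring(tei_xml)
    return ["".join(stage.itertext()).strip()
            for stage in root.iterfind(".//tei:stage", TEI_NS)]
```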

Tools and usage

The DraCor platform includes basic visualisation tools, particularly for network analysis. It also supports programmatic access to the corpora, enabling integration into computational research workflows. This facilitates various types of analysis, such as the study of character networks and dramatic structures.
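As one illustration of a network-oriented workflow, the sketch below derives a weighted co-occurrence network from lists of speakers per segment (for instance, per scene). It assumes that two characters are linked whenever both speak in the same segment; the rule used by the platform itself may differ, and the function names and sample data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(segments):
    """Count, for each pair of characters, the segments in which both speak."""
    edges = Counter()
    for speakers in segments:
        # Deduplicate and sort so each pair has one canonical orientation.
        for a, b in combinations(sorted(set(speakers)), 2):
            edges[(a, b)] += 1
    return edges


def degrees(edges):
    """Number of distinct co-occurring partners per character."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    return {name: len(others) for name, others in neighbours.items()}


# Invented example: three segments of a hypothetical play.
segments = [["anna", "boris"], ["anna", "clara", "boris"], ["clara"]]
edges = cooccurrence_edges(segments)
```

Edge weights here count shared segments, so a pair that meets in two scenes receives weight 2; characters who only ever speak alone do not appear in the degree table.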

Community, development, impact

DraCor is developed collaboratively by researchers from multiple institutions. The platform is maintained by researchers at the Freie Universität Berlin, the University of Potsdam, the University of Göttingen, and the University of Würzburg.[7] As an open-source project, it encourages community contributions and feedback. The DraCor community presented its corpora and associated research projects at the DraCor Summit, a five-day event held in Berlin in September 2025.[8]

References

  1. "DraCor Research". https://dracor.org/doc/research. 
  2. "Rahtz Prize for TEI Ingenuity". https://tei-c.org/activities/rahtz-prize-for-tei-ingenuity/. 
  3. Fischer, Frank; Börner, Ingo; Göbel, Mathias; Hechtl, Angelika; Kittel, Christopher; Milling, Carsten; Trilcke, Peer (2019). "Programmable Corpora: Introducing DraCor, an Infrastructure for the Research on European Drama". DH2019: “Complexities”. Utrecht University. doi:10.5281/zenodo.4284002. 
  4. "pydracor". https://pypi.org/project/pydracor/. 
  5. "rdracor". 26 September 2024. https://cran.r-project.org/web/packages/rdracor/index.html. 
  6. "DraCor Corpus Registry". https://dracor.org/doc/corpora. 
  7. "DraCor Credits". https://dracor.org/doc/credits. 
  8. "DraCor Summit". https://summit.dracor.org/.