D4Science

From HandWiki
Nickname: D4Science
Headquarters: Istituto di Scienza e Tecnologie dell'Informazione, Pisa, Italy
Products: Virtual Research Environments, Science Gateways, cloud computing, e-infrastructure
Website: www.d4science.org

D4Science is an organisation operating a data infrastructure that offers a rich array of services through community-driven virtual research environments (VREs).[1] In particular, it supports communities of practice willing to implement open science practices.[2] The infrastructure follows the system of systems approach: the constituent systems (service providers) offer “resources” (namely services and, through them, data, computing and storage) that are assembled to implement the overall set of D4Science services.[3] D4Science aggregates “domain agnostic” service providers as well as community-specific ones to build a unifying space where the aggregated resources can be exploited via virtual research environments and their services.

The organisation is hosted by the Istituto di Scienza e Tecnologie dell'Informazione of the National Research Council (Italy).

At the heart of this infrastructure is an open-source software system named gCube.[4]

Services

D4Science offers a rich array of services:

  • Virtual Research Environment as a Service, providing any community of practice with a dedicated working environment that supports any knowledge production process in a collaborative way; every VRE enables computer-supported cooperative work by design. D4Science-based VREs are web-based, community-oriented, collaborative, user-friendly working environments that enable open science for scientists and practitioners willing to work together on a set of (research) tasks. From the end-user perspective, each VRE appears as a single web application (with a set of application programming interfaces (APIs)) that (a) comprises several applications organised under specific menu items and (b) runs in a plain web browser. Every application provides VRE users with facilities implemented by relying on one or more services provisioned by diverse providers. Every VRE is equipped with the following basic services:
    • a Social Networking area enabling collaborative and open discussions on any topic and disseminating information of interest for the community, for example, the availability of a research outcome;
    • a Workspace for storing, organizing and sharing any version of a research artifact, including datasets and model implementations;
    • a User Management dashboard for managing membership and roles;
    • a Catalogue Service recording the assets worth publishing, so that others can learn about these assets and make use of them.
  • Science Gateway as a Service providing a community of practice with a dedicated science gateway hosting a selected set of virtual research environments.
  • Data Analytics at scale providing the members of a VRE with a rich array of solutions for data analytics including:
    • a dedicated data analytics platform (DataMiner)[5][6] for executing analytics tasks using methods provided either by the user or by others. It offers facilities for importing and sharing analytics methods implemented in heterogeneous forms, including R, Java, Python, and KNIME. The platform executes tasks on a distributed, hybrid computing infrastructure. A noteworthy feature of the platform is its open-science friendliness: all the analytics methods integrated in it are exposed through a standard protocol (the OGC WPS protocol), which clients can use to discover the available methods, start processes, monitor their execution and access the results. Every analytics task performed by the platform automatically produces a provenance record that supports the reproducibility of the task;
    • an RStudio-based development environment for R that enables statistical computing tasks to be performed in the cloud. This RStudio environment (i) is preconfigured with libraries and packages to ease the execution of common data analytics tasks, and (ii) provides seamless access to the VRE Workspace, enabling sharing of resources with other members of the same working environment;
    • a Jupyter-based notebook environment for developing and executing interactive computations in JupyterLab instances. Each JupyterLab instance (i) is preconfigured with libraries and packages to ease the execution of common data analytics tasks, and (ii) provides access to the VRE Workspace, enabling sharing of resources with other members of the same working environment.
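
The WPS-based interaction mentioned for the data analytics platform can be sketched as follows. This is a minimal illustration of how a generic OGC WPS 1.0.0 client might build key-value-pair requests and read the list of advertised processes; it is not the DataMiner client. The endpoint URL, the process identifier and the sample capabilities document are hypothetical placeholders, and authentication (which depends on the deployment) is left out.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Hypothetical endpoint, for illustration only (not a real D4Science URL).
ENDPOINT = "https://wps.example.org/wps/WebProcessingService"

# XML namespaces used by WPS 1.0.0 documents.
NS = {
    "wps": "http://www.opengis.net/wps/1.0.0",
    "ows": "http://www.opengis.net/ows/1.1",
}

def wps_url(request, **params):
    """Build a WPS key-value-pair (KVP) GET URL for the given operation."""
    query = {"service": "WPS", "request": request, **params}
    return f"{ENDPOINT}?{urlencode(query)}"

def capabilities_url():
    """URL asking the server which processes (methods) it offers."""
    return wps_url("GetCapabilities", AcceptVersions="1.0.0")

def describe_url(process_id):
    """URL asking for the inputs and outputs of one process."""
    return wps_url("DescribeProcess", version="1.0.0", identifier=process_id)

def execute_url(process_id, inputs):
    """URL starting a process asynchronously.

    KVP DataInputs are name=value pairs joined by ';'. urlencode
    percent-encodes the whole value.
    """
    data_inputs = ";".join(f"{name}={value}" for name, value in inputs.items())
    return wps_url(
        "Execute",
        version="1.0.0",
        identifier=process_id,
        DataInputs=data_inputs,
        storeExecuteResponse="true",  # ask the server to store the response
        status="true",                # ask for status updates while running
    )

def list_processes(capabilities_xml):
    """Return (identifier, title) pairs found in a capabilities document."""
    root = ET.fromstring(capabilities_xml)
    return [
        (
            proc.findtext("ows:Identifier", namespaces=NS),
            proc.findtext("ows:Title", namespaces=NS),
        )
        for proc in root.findall(".//wps:Process", NS)
    ]

# Hand-written sample response, standing in for a real server reply.
SAMPLE = """<wps:Capabilities xmlns:wps="http://www.opengis.net/wps/1.0.0"
    xmlns:ows="http://www.opengis.net/ows/1.1">
  <wps:ProcessOfferings>
    <wps:Process wps:processVersion="1.0">
      <ows:Identifier>org.example.DemoClustering</ows:Identifier>
      <ows:Title>Demo clustering</ows:Title>
    </wps:Process>
  </wps:ProcessOfferings>
</wps:Capabilities>"""

if __name__ == "__main__":
    print(capabilities_url())
    print(list_processes(SAMPLE))
    print(execute_url("org.example.DemoClustering", {"k": 3}))
```

In WPS 1.0.0, an Execute request sent with `storeExecuteResponse` and `status` enabled returns a response carrying a `statusLocation` URL, which a client polls to monitor execution and eventually retrieve the results.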

The D4Science infrastructure serves more than 20,000 registered users (as of June 2023) through 178 active VREs offered via 20 science gateways.

History

The D4Science initiative has been developed and supported by several European-funded projects.

DILIGENT (2004-2007), funded under the Sixth Framework Programme for Research and Technological Development, was the forerunner: it conceived and developed a testbed infrastructure that integrated digital library and grid computing technologies and resources to serve the needs of communities of practice involved in knowledge development.[7]

In the context of the Seventh Framework Programme for research, technological development and demonstration, the development of the D4Science initiative started with the support of D4Science (2008-2009), D4Science-II (2009-2011), ENVRI (2011-2014), EUBrazilOpenBio (2011-2013) and iMarine (2011-2014). In this period the infrastructure was established and developed to serve communities of practice in domains ranging from Earth Science to Marine Science, with worldwide scope.[8]

In the context of the H2020 research and innovation programme, the D4Science infrastructure was mature enough to allow a large and very diverse set of communities of practice to benefit from it and its services and to contribute further to its development. Moreover, the services offered by the infrastructure have been developed to support open science practices.[2] The following projects contributed to D4Science development: BlueBRIDGE (2015-2018), EGI-Engage (2015-2017), ENVRIplus (2015-2019), Parthenos (2015-2019), SoBigData (2015-2019), AGINFRAplus (2017-2019), PerformFish (2017-2022), ARIADNEplus (2019-2022), EOSC-Pillar (2019-2022), DESIRA (2019-2023), RISIS2 (2019-2022), SoBigData++ (2019-2022), MOVING (2020-2024), EcoScope (2021-2025), SNAPSHOT (2020-2022), I-GENE (2021-2025), NAVIGATOR (2020-2023).

The operation and improvement of the D4Science infrastructure facilities are ongoing, while its exploitation is progressively growing. These activities are partly supported by the following Horizon Europe programme projects: BlueCloud2026 (2023-2026) and SoBigData RI PPP (2022-2025).

Supported communities and cases range from Agri-food[9] to Social Data Science,[10] Earth Science[11] and Marine Science.[12]

See also

  • European Open Science Cloud, the European initiative for creating an environment for hosting and processing research data and promoting open science.
  • European Grid Infrastructure, the e-Infrastructure set up to provide advanced computing and data analytics services for research and innovation.
  • OpenAIRE, the European initiative to shift scholarly communication towards openness and transparency and to facilitate innovative ways to communicate and monitor research.

References

  1. Candela, L.; Castelli, D.; Pagano, P. (2023). "The D4Science Experience on Virtual Research Environments Development". Computing in Science & Engineering: 1–9. doi:10.1109/MCSE.2023.3290433. https://openportal.isti.cnr.it/doc?id=people______::469fe67912f64f04ce512316048f79aa. 
  2. Assante, M.; Candela, L.; Castelli, D.; Cirillo, R.; Coro, G.; Frosini, L.; Lelii, L.; Mangiacrapa, F. et al. (2019). "Enacting open science by D4Science". Future Generation Computer Systems 101: 555–563. doi:10.1016/j.future.2019.05.063. https://openportal.isti.cnr.it/doc?id=people______::2fb441d7958c2f2810fd9035e83ed79f. 
  3. Assante, M.; Candela, L.; Castelli, D.; Cirillo, R.; Coro, G.; Dell'Amico, A.; Frosini, L.; Lelii, L. et al. (2019). "Virtual research environments co-creation: The D4Science experience". Concurrency and Computation: Practice and Experience 35 (18). doi:10.1002/cpe.6925. https://openportal.isti.cnr.it/doc?id=people______::2f82ff7a9958905b32a94f2ca2d41367. 
  4. Assante, M.; Candela, L.; Castelli, D.; Cirillo, R.; Coro, G.; Frosini, L.; Lelii, L.; Mangiacrapa, F. et al. (2019). "The gCube system: Delivering Virtual Research Environments as-a-Service". Future Generation Computer Systems 95: 445–453. doi:10.1016/j.future.2018.10.035. https://openportal.isti.cnr.it/doc?id=people______::755293b9721a872a624d5e55716ba16d. 
  5. Coro, G.; Panichi, G.; Scarponi, P.; Pagano, P. (2017). "Cloud computing in a distributed e‐infrastructure using the web processing service standard". Concurrency and Computation: Practice and Experience 29 (18): e4219. doi:10.1002/cpe.4219. https://openportal.isti.cnr.it/doc?id=people______::46cabd130b8d7b68b8bc5229398050e4. 
  6. Candela, L.; Coro, G.; Lelii, L.; Pagano, P.; Panichi, G. (2020). "Data Processing and Analytics for Data-Centric Sciences". Towards Interoperable Research Infrastructures for Environmental and Earth Sciences. Lecture Notes in Computer Science. 12003. pp. 176–191. doi:10.1007/978-3-030-52829-4_10. ISBN 978-3-030-52828-7. https://openportal.isti.cnr.it/doc?id=people______::c24292d5e8d17e82e938b31be3118940. 
  7. Candela, L.; Akal, F.; Avancini, H.; Castelli, D.; Fusco, L.; Guidetti, V.; Langguth, C.; Manzi, A. et al. (2007). "DILIGENT: integrating digital library and Grid technologies for a new Earth observation research infrastructure". International Journal on Digital Libraries 7 (1–2): 59–80. doi:10.1007/s00799-007-0023-8. https://openportal.isti.cnr.it/doc?id=people______::b378bcc5cc8c1806be8943610fb2966f. 
  8. Amaral, R. (2015). "Supporting biodiversity studies with the EUBrazilOpenBio Hybrid Data Infrastructure". Concurrency and Computation: Practice and Experience 27 (2): 376–394. doi:10.1002/cpe.3238. https://openportal.isti.cnr.it/doc?id=people______::494a8fc5f420110f9cb187423fd362a3. 
  9. Assante, M.; Boizet, A.; Candela, L.; Castelli, D.; Cirillo, R.; Coro, G.; Fernández, E.; Filter, M. et al. (2020). "Realizing virtual research environments for the agri‐food community: The AGINFRA PLUS experience". Concurrency and Computation: Practice and Experience n.a. (19): n.a. doi:10.1002/cpe.6087. https://openportal.isti.cnr.it/doc?id=people______::a3e8ec3bad01d7a432cab3cdee473e76. 
  10. Grossi, V.; Giannotti, F.; Pedreschi, D.; Manghi, P.; Pagano, P.; Assante, M. (2021). "Data science: a game changer for science and innovation". International Journal of Data Science and Analytics 11 (4): 263–278. doi:10.1007/s41060-020-00240-2. 
  11. Jeffery, K.; Candela, L.; Glaves, E. (2020). "Virtual Research Environments for Environmental and Earth Sciences: Approaches and Experiences". Towards Interoperable Research Infrastructures for Environmental and Earth Sciences. Lecture Notes in Computer Science. 12003. pp. 272–289. doi:10.1007/978-3-030-52829-4_15. ISBN 978-3-030-52828-7. https://openportal.isti.cnr.it/doc?id=people______::a16d49ec5e12a1399719ef538fdc4f50. 
  12. Coro, G.; Gonzalez Vilas, L.; Magliozzi, C.; Ellenbroek, A.; Scarponi, P.; Pagano, P. (2018). "Forecasting the ongoing invasion of Lagocephalus sceleratus in the Mediterranean Sea". Ecological Modelling 371: 37–49. doi:10.1016/j.ecolmodel.2018.01.007. https://openportal.isti.cnr.it/doc?id=people______::9cdae69f5fc65090d86eeb8b461e9c0c.