This manuscript (permalink) was automatically generated from digital-botanical-gardens-initiative/dbgi-green-paper@71cd348 on December 26, 2022.
✉ — Correspondence possible via GitHub Issues or email to The Digital Botanical Gardens Initiative Consortium <dbgi@protonmail.com>.
The Digital Botanical Gardens Initiative (DBGI) aims to explore innovative solutions for the collection, management and sharing of digital information acquired on living botanical collections. A particular focus will be placed on the large-scale characterization of the chemodiversity of living plant collections through mass spectrometric approaches. The acquired data will be structured, organized and connected with relevant metadata through semantic web technology. The gathered knowledge will then inform ecosystem functioning research and orient biodiversity conservation projects. The DBGI initially aims to take advantage of the readily available living collections of Swiss botanical gardens to establish robust and scalable chemo- and biodiversity digitization workflows. The ultimate goal is to apply these approaches in the field, in wild ecosystems and at the global scale.
Biodiversity is a major determinant of ecosystem stability [1]. Hundreds of studies spanning terrestrial and aquatic ecosystems support the view that higher levels of biodiversity, in all its forms, promote better ecosystem functions, such as carbon sequestration, which underpin human well-being [2,3]. Sadly, Earth is experiencing a major biodiversity crisis: of the estimated nine million species of fungi, plants, and animals [4], about a million are currently at risk of extinction and may go extinct before the end of the century [5]. One major issue is that more than 80% of this estimated biodiversity still awaits description. We are in fact facing what is now called the Anthropocene extinction (sixth mass extinction) [6]. To deviate from these alarming trends, all possible efforts must be made by those responsible (i.e. our species) for the conservation of biodiversity. For this, the characterization and documentation of biodiversity is a fundamental prerequisite. Over 3.5 billion years of evolution, natural selection, the craftsman of biodiversity, has created an overwhelming array of molecular entities. Myriad compounds are produced by all living organisms, from bacteria to whales, forming the backbone of the ever-growing tree of life. Through the lens of chemistry, every species, biotic interaction, and community reveals a unique ensemble of molecular structures: the metabolome. These chemical assemblages are a valuable, yet largely unexplored, reflection of biodiversity and ecosystem functioning. To go beyond the simple quantitative representation of biodiversity provided by species inventories, and to reinforce our understanding of the links between biodiversity and ecosystem functioning, we see chemical diversity as an alternative and highly complementary view of the diversity of our planet’s ecosystems.
With these urgent biodiversity characterization objectives in mind, we are setting up the Digital Botanical Gardens Initiative (DBGI), which aims to develop robust and scalable workflows for the digitization of botanical gardens using several approaches. The central one is the use of analytical chemistry strategies to build information-rich chemical maps that will guide researchers focusing on biodiversity characterization and conservation. Plants can be sampled for chemical characterization in three ways: from living collections (botanical gardens), in the wild, or by growing plants in highly controlled settings (see Figure 1 for an overview of the advantages and drawbacks of each biodiversity source). For this first project, sampling in botanical gardens is the chosen option because, within a very accessible location, thousands of species that are already identified, labeled and organized can be readily sampled.
The main goals of the DBGI are summarized in Figure 2. Some details are given below:
1. Establish chemical extract libraries of Swiss botanical gardens. These chemical libraries can be considered as complementary to herbarium samples. They are easily conserved over time and in a reasonable space. They represent the chemical diversity of a sample. They can be easily aliquoted. They can be screened in bioassays.
2. Digitize, through mass spectrometry, the chemodiversity of Swiss botanical gardens. Here high-resolution mass spectrometry is considered as a digital scanner that captures the chemical fingerprint of the profiled sample. State-of-the-art computational metabolomics solutions are used to organize and annotate the acquired spectra with molecular data.
3. Gather chemical information and relevant sample metadata in a tailored knowledge graph. Chemical information acquired at the previous step (spectra and chemical structures) is connected to relevant sample metadata (taxonomy, phenology, geolocation, time of collection etc.). For this, semantic web technologies (namely the RDF data model) are employed and a tailored knowledge graph is established.
4. Connect to existing ontologies (bio, chemo) and biodiversity digitization projects. Chemical and biological objects of the graph are connected to relevant pre-established ontologies (e.g. ChEBI, Plant Ontology) and data graphs (e.g. Wikidata). Connections with complementary biodiversity digitization efforts (e.g. BiCIKL, DiSSCo) will also be established.
5. Establish web and programmatic interfaces for querying the acquired knowledge. A web interface will allow convenient access to the data acquired within the framework of the project. A dashboard will allow simple visualizations (e.g. pie charts, bar plots, treemaps) to interpret the data. In addition, a SPARQL endpoint and an application programming interface (API) will allow the data to be retrieved programmatically.
6. Illustrate the feasibility and advantages of an end-to-end Open Science project. The DBGI will strictly follow Open Science guidelines by using open-source software and making the acquired data and scripts available under an open license. In addition, the DBGI results will be published as soon as they are acquired (prior to formal publication or even preprints), thus following the Open Notebook Science concepts.
7. Establish robust and scalable workflows for the digitization of wild ecosystems biodiversity. The DBGI, albeit ambitious, is a pilot project. The future objective is to propose digitization workflows for the characterization of the chemodiversity of wild ecosystems on a global scale. These workflows will be central to the future Earth Metabolome Initiative.
8. Provide “molecular arguments” for biodiversity conservation policies. The ultimate goal of the DBGI is to use all the gathered metabolic information to support, extend or implement conservation efforts worldwide. We believe that by providing chemical maps of the landscape it will be possible to contribute to the prioritization of conservation and restoration targets. In other words, by establishing large-scale chemical maps we expect to provide “molecular arguments” to biodiversity conservation endeavors (e.g. “This piece of land has a high content of antibacterial scaffolds.” or “This place might be poor in species diversity but rich in a rare chemodiversity.”).
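To make goal 3 more concrete, the following minimal sketch shows how a sample, its taxon, its collection date and an annotated structure could be written as RDF triples and serialized in the N-Triples syntax. It uses only the Python standard library. The namespace `https://example.org/dbgi/`, the predicate names (`hasTaxon`, `collectedOn`, `hasAnnotation`), the sample identifier and the structure key are all hypothetical placeholders; the actual DBGI vocabulary is not specified here.

```python
from datetime import date
from typing import List, Optional, Tuple

# Hypothetical namespace: the real DBGI vocabulary is not defined in the text.
EX = "https://example.org/dbgi/"
XSD_DATE = "http://www.w3.org/2001/XMLSchema#date"

Triple = Tuple[str, str, str]


def iri(local_name: str) -> str:
    """Return an N-Triples IRI term inside the example namespace."""
    return f"<{EX}{local_name}>"


def literal(value: str, datatype: Optional[str] = None) -> str:
    """Return an N-Triples literal, escaping backslashes and double quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    suffix = f"^^<{datatype}>" if datatype else ""
    return f'"{escaped}"{suffix}'


def sample_triples(sample_id: str, taxon: str, collected_on: date,
                   structure_key: str) -> List[Triple]:
    """Describe one sample with three (subject, predicate, object) triples."""
    subject = iri(f"sample/{sample_id}")
    return [
        (subject, iri("hasTaxon"), literal(taxon)),
        (subject, iri("collectedOn"), literal(collected_on.isoformat(), XSD_DATE)),
        (subject, iri("hasAnnotation"), iri(f"structure/{structure_key}")),
    ]


def to_ntriples(triples: List[Triple]) -> str:
    """Serialize triples, one statement per line, each terminated by ' .'."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


if __name__ == "__main__":
    # Illustrative values only; this is not a real DBGI record.
    example = sample_triples("dbgi_000001", "Arnica montana",
                             date(2022, 6, 1), "EXAMPLE-STRUCTURE-KEY")
    print(to_ntriples(example))
```

In practice a dedicated RDF library and established vocabularies would likely replace these hand-built strings; the sketch only illustrates the subject, predicate, object shape of the data model.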
The aim of the DBGI is to characterize the chemodiversity of all the botanical gardens of Switzerland. This ambitious objective requires us to evaluate and test the entire workflow and methodology. For this, the initiators of the DBGI started gathering preliminary data from two botanical collections: the Jardin Botanique de l’Université de Fribourg (JBUF) and the Jardin Botanique de Neuchâtel (JBN). This choice was made for practical reasons (these are the respective workplaces of the DBGI initiators) but also because each of these two gardens has its own characteristics. On the one hand, the JBUF presents over 5000 species systematically organized to reflect the Angiosperm Phylogeny Group (APG) system. The JBUF researchers are specialized in conservation biology. On the other hand, the JBN presents over 2000 species organized in 8 sub-gardens. The JBN has a focus on ethnomedicinal plants. Below, we briefly outline the research workflow envisioned for the DBGI. The main steps are schematized in Figure 3. The overall workflow can be divided into two main parts: one dealing with physical objects (upper part of Figure 3) and a second one dealing with the data and metadata acquired from these objects (lower part of Figure 3).
Physical objects. Starting from a botanical garden (a collection of living specimens), the aim is to sample each specimen and to build two libraries: a library of dried plant material and a library of chemical extracts. The dried plant material library will serve as a backup for repeating extractions, for further characterization of compounds, or for orthogonal analyses (e.g. genetic sequencing). The chemical extracts library will be the source material for the mass-spectrometry digitization stage. This library will also be available for backups and orthogonal analyses (e.g. NMR profiling or bioassay campaigns). Complementary to herbaria, these two libraries offer an efficient way (reduced space, long-term storage) to capture and conserve the chemistry of the botanical gardens.
Digital objects. For all operations occurring on physical objects (sampling, conservation, extraction), metadata are collected to document the experiment. For each botanical garden, data collection is made at the species level. A species list is collected for each garden, curated and taxonomically resolved (using the Open Tree of Life framework) so that it can be compared across gardens. The Botalista platform will also be used at this step. For each collected sample, data are acquired at a finer granularity (namely at the specimen level). Here we take advantage of the iNaturalist platform and app. Using a smartphone, pictures of the sampled specimen (including any label present in the botanical garden, the sampled organ and the collection label), the collector identity, the date and the geolocation are conveniently captured at the time of collection. These data are then automatically aggregated by the iNaturalist DBGI project and can be accessed programmatically via the iNaturalist API. All species, specimen and experimental metadata will be collected and managed through an SQL database and accessed through a NocoDB instance for convenient tracking of the samples by the DBGI participants.
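The specimen-level flow described above (observations retrieved from the iNaturalist API, then stored in an SQL database) can be sketched as follows. This is a minimal illustration under stated assumptions: SQLite stands in for the unspecified SQL database, the table layout is invented for the example, and the project identifier passed to the functions is a placeholder. The response field names used here (`results`, `taxon.name`, `user.login`, `observed_on`, `geojson.coordinates`) reflect the v1 `observations` endpoint as we understand it and should be checked against the API documentation.

```python
import json
import sqlite3
from urllib.parse import urlencode
from urllib.request import urlopen

API_ROOT = "https://api.inaturalist.org/v1/observations"


def observations_url(project_id: str, page: int = 1, per_page: int = 50) -> str:
    """Build the URL for one page of observations of a given project."""
    query = urlencode({"project_id": project_id, "page": page, "per_page": per_page})
    return f"{API_ROOT}?{query}"


def fetch_page(project_id: str, page: int = 1) -> list:
    """Download one page of observations (network call, not run in this sketch)."""
    with urlopen(observations_url(project_id, page)) as response:
        return json.load(response)["results"]


def flatten(observation: dict) -> tuple:
    """Keep only the specimen-level fields mentioned in the text."""
    taxon = observation.get("taxon") or {}
    user = observation.get("user") or {}
    coordinates = (observation.get("geojson") or {}).get("coordinates") or [None, None]
    longitude, latitude = coordinates  # GeoJSON order is longitude, then latitude
    return (observation["id"], taxon.get("name"), user.get("login"),
            observation.get("observed_on"), latitude, longitude)


def store(connection: sqlite3.Connection, observations: list) -> int:
    """Insert or update observations in a hypothetical 'specimen' table."""
    connection.execute(
        """CREATE TABLE IF NOT EXISTS specimen (
               inat_id INTEGER PRIMARY KEY, taxon TEXT, collector TEXT,
               observed_on TEXT, latitude REAL, longitude REAL)"""
    )
    rows = [flatten(observation) for observation in observations]
    connection.executemany(
        "INSERT OR REPLACE INTO specimen VALUES (?, ?, ?, ?, ?, ?)", rows
    )
    connection.commit()
    return len(rows)
```

A typical use would be `store(sqlite3.connect("dbgi.sqlite"), fetch_page("<project-id>"))`, repeated over pages. Keeping the iNaturalist identifier as primary key means a re-import updates existing rows instead of duplicating them.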
The mass-spectrometry digitization then constitutes the core of the chemical information acquisition process. We use ultra-high-performance liquid chromatography coupled to high-resolution mass spectrometry (UHPLC-HRMS) to acquire fragmentation data in an untargeted fashion. Building on our computational metabolomics expertise, we then organize and annotate the acquired pool of MS data. Here we will take advantage of five central tools (four of which were conceived by us). Molecular networking [7] will serve as the basis for spectral organization. Metabolite annotation will be performed by spectral matching against a theoretical natural products spectral database [8] and via a taxonomically informed scoring process [9] fed by the widest open resource of natural products biological occurrences (LOTUS [10]).
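Spectral matching, on which both spectral organization and annotation rely, compares the fragment peaks of two spectra. The sketch below computes a plain cosine score with a greedy peak assignment. It is a simplified illustration and not the scoring implemented in the tools cited above: production tools typically add refinements such as intensity scaling or matching of peaks shifted by the precursor mass difference. The function names, the 0.01 m/z tolerance and the library layout are assumptions made for the example.

```python
from math import sqrt
from typing import Dict, List, Optional, Tuple

Spectrum = List[Tuple[float, float]]  # list of (m/z, intensity) peaks


def cosine_score(query: Spectrum, reference: Spectrum, tolerance: float = 0.01) -> float:
    """Greedy cosine similarity between two centroided fragmentation spectra."""
    # Every pair of peaks whose m/z values agree within the tolerance.
    candidates = [
        (q_int * r_int, q_idx, r_idx)
        for q_idx, (q_mz, q_int) in enumerate(query)
        for r_idx, (r_mz, r_int) in enumerate(reference)
        if abs(q_mz - r_mz) <= tolerance
    ]
    # Keep the largest intensity products first, using each peak at most once.
    used_query, used_reference, dot = set(), set(), 0.0
    for product, q_idx, r_idx in sorted(candidates, reverse=True):
        if q_idx not in used_query and r_idx not in used_reference:
            used_query.add(q_idx)
            used_reference.add(r_idx)
            dot += product
    norm = (sqrt(sum(intensity ** 2 for _, intensity in query))
            * sqrt(sum(intensity ** 2 for _, intensity in reference)))
    return dot / norm if norm else 0.0


def best_match(query: Spectrum, library: Dict[str, Spectrum]) -> Optional[str]:
    """Return the name of the library spectrum with the highest score, if any."""
    if not library:
        return None
    return max(library, key=lambda name: cosine_score(query, library[name]))
```

The score ranges from 0 (no shared peak) to 1 (identical spectra). Ranking candidates by this score alone ignores biological context, which is the gap a taxonomically informed scoring step is meant to fill.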
The DBGI conducts this project following Open Science principles and Open Notebook Science concepts. This approach makes all research artifacts (e.g. research proposals, drafts, ideas, source code, raw and processed data etc.) produced within the DBGI publicly available immediately, from the moment of their production, and not only after peer-reviewed publication.
To implement the DBGI Open Notebook we employ Dendron, an open-source and lightweight note-taking and knowledge management software.
Dendrons are built from a system of Markdown files that are hierarchically organized based on their filenames.
This allows for extremely efficient note searching and refactoring of the hierarchies.
Dendrons can be conveniently shared across members of the DBGI, versioned via git and automatically published as websites.
The DBGI Dendron can be browsed at http://www.dbgi.org/dendron-dbgi/.
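The filename-based hierarchy used by Dendron relies on dot-delimited note names. The following minimal sketch shows how such names map to a tree and why refactoring a branch amounts to renaming a prefix. The note names are hypothetical and the functions are our own illustration, not part of Dendron.

```python
from typing import Dict, List


def build_hierarchy(filenames: List[str]) -> Dict[str, dict]:
    """Nest dot-delimited note names into a dictionary tree."""
    tree: Dict[str, dict] = {}
    for filename in filenames:
        # Drop the Markdown extension before splitting on dots.
        name = filename[:-3] if filename.endswith(".md") else filename
        node = tree
        for part in name.split("."):
            node = node.setdefault(part, {})
    return tree


def rename_prefix(filenames: List[str], old: str, new: str) -> List[str]:
    """Move a whole branch of the hierarchy by rewriting its filename prefix."""
    renamed = []
    for filename in filenames:
        if filename.startswith(old + "."):
            renamed.append(new + filename[len(old):])
        else:
            renamed.append(filename)
    return renamed


if __name__ == "__main__":
    # Hypothetical note names, for illustration only.
    notes = ["dbgi.md", "dbgi.sampling.md", "dbgi.sampling.protocol.md"]
    print(build_hierarchy(notes))
    print(rename_prefix(notes, "dbgi.sampling", "dbgi.collection"))
```

Because the hierarchy lives entirely in the filenames, searching a branch is a prefix lookup and the files remain plain Markdown that git can version.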
Regarding raw data sharing, specimen-related information is hosted on the iNaturalist DBGI project home page, and pictures are shared under a permissive CC0 license allowing further reuse, for example in Wikidata.
Mass spectrometry profiles and metabolite annotation files will be hosted on the MassIVE data platform where they will benefit from a permanent DOI.
All the code written within the DBGI will be publicly shared and versioned through the DBGI GitHub organization.
Ideas, comments and issues will be collected using GitHub Discussions.
Note: The DBGI welcomes researchers at any academic stage, from bachelor students to emeritus professors. The DBGI is also open to people outside academia who are interested in joining the initiative. Depending on your interests, specific expertise and available time, you can participate at several levels of the DBGI Consortium. These levels are not mutually exclusive. Please contact us (dbgi@protonmail.com) if you are willing to jump in or want to know more.