By Dora Ann Lange Canhos1, Sidnei de Souza1, Alexandre Marino1, Vanderlei Perez Canhos1, and Leonor Costa Maia2
1Centro de Referência em Informação Ambiental – CRIA
2Universidade Federal de Pernambuco, UFPE
- A herbarium, a collection of preserved samples of plants and fungi and associated data, is key documentation of the biodiversity of the past and an important instrument with which to model the biodiversity of the future. If prepared and maintained correctly, these specimens hold their scientific value for centuries.
- Brazil’s individual herbaria have much smaller holdings than the largest herbaria in Europe or the USA. Integrating the data of Brazilian herbaria through a network, enabled by the development of information and communication technology, therefore makes it possible for Brazil to have an on-line herbarium with holdings comparable to those of the largest herbaria worldwide.
- Today, data providers to Brazil’s Virtual Herbarium hold over 7 million samples and share about 5.6 million data records and 1.5 million images of more than 78 thousand distinct species. There is a constant increase in data and data providers and more than 1.7 billion data records were used between October 2012 and May 2017. The year 2016 indicated an average usage of 1.2 million records a day.
- As the Virtual Herbarium has grown, the number of angiosperm species identified in Brazil and described by Brazilian scientists appears to have increased as well. Furthermore, the geographic distribution of participating herbaria and their association with graduate programs has had a great impact on the education and training of future researchers, not only in the use of tools and data, but also in more open sharing of data and knowledge.
- The greatest impact of the OCSDNet support of the Virtual Herbarium has been on the herbaria (data providers) that, through the project, have had the opportunity to reflect upon outcomes derived from their participation in the network and share their thoughts with other herbaria. This reflection has helped to turn what used to be more “one sided” data providers into a robust and collaborative human network.
A herbarium can be defined as a collection of preserved samples of plants and fungi and associated data. In a herbarium, specimens (samples) of plants and fungi collected in the field are normally dried, mounted on sheets of paper, and labelled with essential data, including who collected the sample, when and where it was collected, what was collected (the scientific name of the plant or fungus), and who identified the sample. The identification can be recorded as part of the field data or determined later by experts. As taxonomy evolves with time, scientific names, and sometimes even the names of municipalities and countries, must be updated. A dynamic curation of specimens and associated data must therefore be in place.
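The label data described above can be pictured as a small record structure. The following is a minimal sketch in Python, not the Virtual Herbarium's actual schema: the class and field names are assumptions chosen for illustration (some loosely echo Darwin Core terms), and the specimen values are invented. It shows one possible way to update an identification while keeping earlier determinations traceable.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Determination:
    """One identification of a specimen (hypothetical structure)."""
    scientific_name: str
    identified_by: str
    identified_on: Optional[date] = None


@dataclass
class SpecimenRecord:
    """Label data for one herbarium sheet; field names are illustrative."""
    collected_by: str
    collected_on: date
    country: str
    municipality: str
    determinations: List[Determination] = field(default_factory=list)

    @property
    def current_name(self) -> Optional[str]:
        # The most recent determination is treated as the current one.
        if not self.determinations:
            return None
        return self.determinations[-1].scientific_name

    def redetermine(self, name: str, identified_by: str,
                    on: Optional[date] = None) -> None:
        # Append rather than overwrite, so earlier identifications are kept.
        self.determinations.append(Determination(name, identified_by, on))


# Invented example values, for illustration only.
record = SpecimenRecord(
    collected_by="A. Collector",
    collected_on=date(1950, 3, 14),
    country="Brazil",
    municipality="Recife",
)
record.redetermine("Genus oldname", "A. Collector", date(1950, 3, 14))
record.redetermine("Genus newname", "B. Expert", date(2015, 6, 1))
print(record.current_name)
print(len(record.determinations))
```

Keeping the full list of determinations, rather than a single name field, is one simple way to reflect the point that names change as taxonomy evolves while the physical sample stays the same.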
Through herbarium data one can analyze species’ distribution across both time and space. Studies based on these data are important for a number of applications, such as education, research, and conservation. A herbarium is a center of documentation of the biodiversity of the past and an important instrument to model the biodiversity of the future. If prepared and maintained correctly, these specimens hold their scientific value for centuries. Keeping the sample together with its data makes it possible to reassess its identification and to carry out analyses that may not have existed when the sample was collected. An example is DNA sequencing of historical specimens. This undoubtedly increases the importance of a herbarium.
Herbaria have existed since 1635, with the establishment of the Muséum National d’Histoire Naturelle in Paris, France, but the first herbaria in Brazil were established in the early 19th century. It is instructive to compare Brazil’s largest herbaria with those in Europe and the USA. The largest herbaria in the world, such as the Muséum National d’Histoire Naturelle and the New York Botanical Garden (NY), hold about 8 million specimens, while Brazil’s two largest herbaria, the Botanical Garden of Rio de Janeiro and our National Museum, hold only about 600 thousand specimens each. However, Brazil has about 200 active herbaria that together hold about 8 million specimens. Integrating the data of Brazilian herbaria through a network, enabled by the development of information and communication technology, has made it possible for Brazil to have an on-line herbarium with significant holdings, comparable to large herbaria worldwide.
This e-infrastructure is enabling the consolidation of a collaborative network of experts engaged in improving the quality of the holdings and of on-line data. This network is also involved with 95% of the botany-related graduate courses in the country, influencing research and the education of future botanists.
Brazil’s Virtual Herbarium today – some numbers
The network integrates 193 datasets from 130 Brazilian herbaria and 20 herbaria from Europe and the USA, with holdings of samples collected in Brazil.
Together these data providers hold over 7 million samples and share about 5.6 million data records and 1.5 million images of more than 78 thousand distinct species (all with accepted names). Besides integrating new herbaria to the network, it is still necessary to invest in digitization.
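To put the need for further digitization in proportion, the figures above can be turned into rough ratios. This is only a back-of-the-envelope sketch: the inputs are the rounded numbers quoted in the text, the variable names are illustrative, and the second ratio assumes at most one image per record, which the text does not state.

```python
# Figures quoted in the text (rounded, so the ratios are approximate).
samples_held = 7_000_000      # "over 7 million samples"
records_shared = 5_600_000    # "about 5.6 million data records"
images_shared = 1_500_000     # "1.5 million images"

# Share of held samples that have a data record on-line.
record_coverage = records_shared / samples_held

# Images relative to on-line records (assumes one image per record at most).
image_coverage = images_shared / records_shared

print(f"Records on-line per sample held: {record_coverage:.0%}")
print(f"Images per on-line record: {image_coverage:.0%}")
```

Because the holdings are given as "over 7 million", the first ratio is best read as an upper bound of roughly four records on-line for every five samples held.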
An important indicator is the movement of data (entry and removal) in the network, showing its dynamic nature. Figure 2 presents monthly averages for both total online records and total georeferenced records. The red line shows the number of data providers per month.
One can see the constant increase in data and data providers, with the exception of October 2015, when one of Brazil’s largest herbaria decided to remove its data because it felt that it was losing visibility. This shows that there are still some cultural and institutional barriers to overcome.
As for usage, more than 1.7 billion data records were used between October 2012 and May 2017. The year 2016 indicated an average usage of 1.2 million records a day.
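The two usage figures can be compared with a short calculation. The sketch below assumes the period runs from the first day of October 2012 to the last day of May 2017, since the text gives only months, and it uses the rounded totals quoted above. Under those assumptions, the average over the whole period comes out at roughly 1 million records a day, so the 2016 average of 1.2 million is above it, consistent with growing usage.

```python
from datetime import date

# Assumed period boundaries: the text gives only months, so the first day of
# October 2012 and the last day of May 2017 are used here.
start = date(2012, 10, 1)
end = date(2017, 5, 31)
days = (end - start).days + 1  # count both end points

records_used = 1_700_000_000   # "more than 1.7 billion data records"
average_2016 = 1_200_000       # "1.2 million records a day" in 2016

# Average daily usage over the whole period.
period_average = records_used / days

print(f"{days} days in the period")
print(f"{period_average:,.0f} records/day over the whole period")
print(f"2016 average is {average_2016 / period_average:.2f}x the period average")
```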
Brazil’s Virtual Herbarium and OCSDNet
Within the context of the Open and Collaborative Science in Development Network (OCSDNet), Brazil’s Virtual Herbarium sought to understand the impact of data sharing and open collaboration for both data providers and users. For this purpose, the following activities were carried out:
- Application of a semi-structured questionnaire and SWOT analysis indicating strengths, weaknesses, opportunities, and threats concerning Brazil’s Virtual Herbarium;
- Email sent out to voluntary contributors of two on-line tools – annotation system and geographic distribution modeling workflow (BioGeo);
- Analysis of blocked data, asking data providers why the data was blocked; and,
- Analysis of the users and usage of Brazil’s Virtual Herbarium.
The greatest impact of this OCSDNet project was on the herbaria (data providers) that had the opportunity to reflect on outcomes derived from their participation in the network and share their thoughts with other herbaria. This has turned what were previously “herbaria that share data” into a robust network. The human network established is, by far, the greatest impact of this project and is the center of its innovative character.
It was found that the on-line voluntary contributors to the annotation system and BioGeo (see http://biogeo.inct.florabrasil.net) were motivated to participate by their own research. Even so, these contributors helped to improve data quality. The distribution models were also reported to be used to determine new areas for field collections, and in this way these users also contributed by sharing new data records.
One of the aims of the Virtual Herbarium was to promote e-science. An example of its impact on taxonomy and the identification of new species is shown in Figure 4.
The growth of the number of angiosperm species in Brazil described by Brazilian scientists is clear. The availability of data through Brazil’s Virtual Herbarium and the network of specialists established surely contributed to this change.
An important characteristic of this network, the geographic distribution of participating herbaria and their association with graduate programs, also has a great impact on the education and training of future researchers, not only in the use of tools and data, but also in promoting open sharing of data and knowledge.
The main aim in developing the Virtual Herbarium was to make data available on-line to all interested. This is not an easy task, as it also implies a cultural change. At the beginning of the project, it was not clear what the data providers, in this case the teams responsible for the herbaria, would gain from this. To most, it seemed to involve much more work with very little to gain. Through the reflection carried out in the OCSDNet project, it became clear that a lot is gained from organizing and sharing one’s own data. Besides receiving credit for their work, most data providers also became intensive users of data and were able to share knowledge and benefit from other users’ knowledge to improve their own data.
Our team therefore concludes that providing data is not a one-way road. Besides using and benefiting from feedback mechanisms, most herbaria found that visits to the facilities increased as did new collaborative projects and research. Most benefited from an increase in usage and in awareness of the data’s importance. With time, this also brought an increase in their recognition and support from their own institution.
For more details on the project and project findings, download the final progress report here.