This is the raw data and code repository of the SARS-ANI project on SARS-CoV-2 events in animals.

amel-github/sars-ani

June, 2022

Background
About this repository
Prerequisite
About the data
Contributors
License
Contact
Disclaimer

SARS-ANI: A Global Open Access Dataset of Reported SARS-CoV-2 Events in Animals

Background

The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) most probably emerged from an animal source and subsequently spilled over to the human population in 2019, in China. Despite its zoonotic origin, the current COVID-19 pandemic is being sustained through human-to-human transmission.

Animal infections with SARS-CoV-2 have been reported by several countries. A wide range of animal species have proven to be susceptible to SARS-CoV-2 via natural and/or experimental infection.

Collecting and sharing data on reported SARS-CoV-2 natural infections in animals is of critical importance to assess their epidemiological significance for animal and human health, as well as their implications for biodiversity and conservation.

About this repository

This is the public repository of the SARS-ANI Dataset and related documentation.

This repository contains:

sars_ani_data.csv: This file contains the raw data of the SARS-ANI Dataset, which presents structured information on SARS-CoV-2 events in animals (.csv format, UTF-8 encoded).

sars_ani_validation.R: This file contains the code to validate and curate the dataset. This code enables the users to explore the structure of the dataset, check the different entries for each field, and search for the presence of duplicates.

sars_ani_visualization.Rmd: This Markdown file contains the code to explore, describe, and visualize the dataset. This code is used for the visual validation of the data. To see all the results, knit it to .pdf (default output).

Explanations for the code are displayed in the .R and .Rmd files.

sars_ani_examples.pdf: This PDF file contains three examples illustrating the structure and coding scheme of the SARS-ANI Dataset.

Contributing.md: This file provides guidelines for contributing to the project: suggesting changes to the data or to the code, submitting new data, and contributing to the code.

sars_ani_PDF_archives: For each SARS-CoV-2 event recorded in the dataset, a copy of the report used as primary and secondary information source is saved in this folder. Each report was downloaded as a .pdf file, on which a timestamp was inserted (ProMED-mail reports) or the download date was added to the file name (it was not possible to insert a timestamp on WAHIS reports).

sars_ani_excluded_rep.xlsx: This file contains the list of ProMED-mail and WAHIS reports that were not included in the dataset and reasons for exclusion.

The SARS-ANI Dashboard provides interactive visualizations of the dynamic version of the dataset.

The data in this repository are updated weekly.

Prerequisite

Install LaTeX (TinyTeX) for PDF reports

If you would like to create PDF documents from R Markdown, you will need to have a LaTeX distribution installed.

To install TinyTeX with the R package tinytex, run the following code in the Console: tinytex::install_tinytex()

To uninstall TinyTeX, run: tinytex::uninstall_tinytex()

Environment

The dataset and code were created in an English-language locale, where . is used as the decimal symbol and dates are in the format yyyy-mm-dd.

To learn more about locales, see: https://cran.r-project.org/web/packages/readr/vignettes/locales.html.
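The locale conventions above can be made explicit when loading the data. The following is a minimal sketch in base R, not the project's own loading code: the helper name `read_sars_ani` and the toy rows are illustrative assumptions, and the toy file only stands in for sars_ani_data.csv.

```r
# Minimal sketch: read a SARS-ANI-style CSV with explicit locale settings.
# `read_sars_ani` is a hypothetical helper, not part of the repository.
read_sars_ani <- function(path) {
  dat <- read.csv(
    path,
    fileEncoding = "UTF-8",    # the file is UTF-8 encoded
    dec = ".",                 # "." is the decimal symbol
    colClasses = "character",  # keep every field as text at first
    na.strings = "",           # only empty cells become missing; "NA"/"NS" stay as text
    check.names = FALSE
  )
  # Convert the date fields (yyyy-mm-dd); text codes such as "NS" become missing dates
  for (col in c("date_confirmed", "date_reported", "date_published")) {
    if (col %in% names(dat)) {
      dat[[col]] <- as.Date(dat[[col]], format = "%Y-%m-%d")
    }
  }
  dat
}

# Toy file with made-up rows, standing in for sars_ani_data.csv
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "ID,age,date_confirmed",
  "1,2.5,2021-03-14",
  "2,NS,NS"
), tmp)

toy <- read_sars_ani(tmp)
str(toy)
```

Reading every field as text first is a deliberate choice here: it keeps the literal codes NS and NA distinguishable until you decide how to treat them.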

About the data

Data sources

Information has been collected from two major databases: i) the Program for Monitoring Emerging Diseases ProMED-mail, which is a program of the International Society for Infectious Diseases (ISID); ii) the World Animal Health Information System WAHIS of the World Organisation for Animal Health (WOAH, formerly OIE).

ProMED-mail is the largest publicly available system for reporting global infectious disease outbreaks. It provides reports (called “posts”) on outbreaks and disease emergence. The information flow leading to publication of ProMED-mail reports is as follows: a disease event to be dispatched is selected from daily notifications of outbreaks received via email, from searches of the Internet and traditional media, and from scans of official and unofficial websites. All incoming information is reviewed and filtered by an editor or associate editor, who subsequently sends it to a multidisciplinary global team of subject matter expert moderators. The moderators assess the accountability and accuracy of the information, interpret it, provide commentary, and give references to previous ProMED-mail reports and to the scientific literature.

WAHIS is a Web-based computer system that processes data on animal diseases in real time. WAHIS data reflect the information gathered by the Veterinary Services of WOAH Member and non-Member Countries and Territories on WOAH-listed diseases in domestic animals and wildlife, as well as on emerging and zoonotic diseases. The detection of infection with SARS-CoV-2 in animals meets the criteria for reporting to the World Organisation for Animal Health (WOAH) as an emerging infection in accordance with the WOAH Terrestrial Animal Health Code. Only authorised users, i.e. the Delegates of WOAH Member Countries and their authorised representatives, can enter data into the WAHIS platform to notify the WOAH of relevant animal disease information. All information is publicly accessible on the WAHIS interface.

One ProMED-mail or WAHIS report, identified via a unique report identifier, may describe one or several health events (or outbreaks). ProMED-mail and WAHIS also publish follow-up reports of outbreaks (describing e.g. clinical follow-up, further spread of the virus, treatment outcome, number of newly infected animals and new deaths, newly implemented control measures); these have also been entered in the dataset.

Remarks

The number of reported SARS-CoV-2 events in animals in each country depends on the reporting strategy of the country to the WOAH, the intensity of the research and surveillance strategy in the different animal species (e.g. whether pets from infected households are systematically investigated or not), the media coverage on the diagnosed cases, and the uptake of the reported event by the ProMED-mail team. If an event/outbreak is not published in WAHIS and/or ProMED-mail then it will not be included in the dataset.

Data collection process

Data on each SARS-CoV-2 event in animals was collected and entered manually in a .csv file.

Data records

The dataset is structured as follows:

  • Each row of the dataset represents a SARS-CoV-2 event in animal(s), identified by a unique identifier (field ID). We define an event as one single case, or several epidemiologically related cases, identified by the presence of viral RNA (proof of infection) and/or antibodies (proof of exposure) in an animal. Epidemiologically related cases include e.g. animals belonging to the same farm, captive animals housed together, pets belonging to the same household, or animals sampled within the same (generally cross-sectional) study. To be grouped into one event, the cases must share the same event and patient attributes, i.e. they underwent the same laboratory test(s) and showed the same results (including variant), exhibited the same symptoms and disease outcome, and were confirmed, reported (when applicable), and published on the same date. For example, when pets of the same species sharing the same household showed different symptoms, they are reported as two distinct events. Events include follow-up history reports of outbreaks (e.g. follow-up on the clinical status of the animal, variant identification after case confirmation).

  • Each SARS-CoV-2 event is characterized by 50 quantitative and qualitative event and patient attributes (columns) that structure the dataset.
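The two structural rules above (one unique ID per row, a fixed number of columns) lend themselves to a quick check. The sketch below is an illustration in base R and not the content of sars_ani_validation.R; the function name `check_structure` and the toy data frame are assumptions made for the example.

```r
# Minimal sketch of structural checks on a SARS-ANI-style data frame.
# `check_structure` is a hypothetical helper; the toy rows are made up.
check_structure <- function(dat, expected_cols = 50) {
  other_cols <- setdiff(names(dat), "ID")
  list(
    # Does the table have the expected number of attributes?
    n_columns_ok = ncol(dat) == expected_cols,
    # IDs that appear more than once
    duplicated_ids = dat$ID[duplicated(dat$ID)],
    # IDs of rows that repeat an earlier row on every attribute except ID
    duplicated_rows = dat$ID[duplicated(dat[, other_cols, drop = FALSE])]
  )
}

toy <- data.frame(
  ID = c(1, 2, 3, 3),
  host_com_orig = c("cat", "dog", "dog", "mink"),
  country_iso3 = c("FRA", "USA", "USA", "DNK"),
  stringsAsFactors = FALSE
)

# The toy table has 3 columns, so the expected count is overridden here
res <- check_structure(toy, expected_cols = 3)
print(res)
```

On the real dataset the default `expected_cols = 50` would apply, matching the 50 attributes listed in the field dictionary.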

Field dictionary

ID Unique identifier for each unique event of SARS-CoV-2 infection/exposure in animal(s).

primary_source Primary source of information to document the event. Possible pre-defined string values are: ProMED; WAHIS.

archive_event_number Unique identifier for the report, as provided by the primary source. Also corresponds to the name of the PDF file describing the event in the sars_ani_PDF_archives folder.

link_web Link to the online primary source to document the event.

secondary_source Secondary source of information to document the event. Possible pre-defined string values are: ProMED; WAHIS.

secondary_source_ID Unique identifier for the report, as provided by the secondary source. Also corresponds to the name of the PDF file describing the event in the sars_ani_PDF_archives folder.

secondary_source_web Link to the online secondary source for the event.

host_com_orig Most specific designation of the animal host provided by the source(s), in English.

host_sci_orig Scientific name of the animal host as mentioned in the source(s) (scientific names are harmonized so that only the first letter of the genus is capitalized).

host_com_res Common name of the animal host, harmonized against the National Center for Biotechnology Information (NCBI) taxonomic backbone.

host_sci_res Scientific name of the animal host (resolved to species or subspecies level), harmonized against the National Center for Biotechnology Information (NCBI) taxonomic backbone.

host_colloq The colloquial name of the host, i.e. the name commonly used to identify the animal in non-specialist language (e.g. “tiger” for “Sumatran tiger”).

host_sci_spec_res The scientific name of the host resolved to the species level.

family Animal family of the animal host.

epidemiological_unit The epidemiological unit considered to describe the event. Possible pre-defined string values are: animal = one individual; group = a group of animals housed/living together (excluding farm animals), e.g. zoo animals, pets; survey group = animals that have been sampled in different locations within the same surveillance programme or survey study; farm = a group of animals belonging to the same species and bred for commercial purposes.

number_cases Reported number of animal(s) that tested positive for SARS-CoV-2 in the event.

number_susceptible Reported number of susceptible animal(s) of the same species in the event.

number_tested Reported number of animal(s) of the same species tested in the event.

number_deaths Reported number of direct and indirect death(s) related to the event. If death is not related to SARS-CoV-2 (see field outcome), number_deaths = 0.

age Age of the animal(s) when tested, in years.

sex Sex of the animal(s). Possible pre-defined values are: f = female; m = male.

country_iso3 Three-letter ISO country code (ISO 3166-1 alpha-3) for the country where the SARS-CoV-2 event was reported.

country_name Name of the country where the SARS-CoV-2 event was reported.

subnational_administration The subnational administrative region where the SARS-CoV-2 event was reported.

city The city where the SARS-CoV-2 event was reported.

location_detail Detail of the geographic location that makes it possible to distinguish SARS-CoV-2 events occurring in the same species, on the same date, and at the same geolocation (subnational_administration, city), when the report(s) clearly state(s) that the animal(s) were not located at the same place (e.g. different farms or households).

date_confirmed Date on which the SARS-CoV-2 infection or exposure was laboratory confirmed.

date_reported Date on which the SARS-CoV-2 event was reported by WAHIS.

date_published Date on which the primary source published the SARS-CoV-2 event (date_published = date_reported when WAHIS is the primary source).

related_to_other_entries Relationship with another record (see field related_ID) in the dataset. Possible pre-defined string values are:

  • new = the event is not related to any event previously entered in the dataset and no follow-up event exists. It can, however, be related to an event that was reported on the same day or later with one of the following values: living together, connected, or same study.
  • updated by = the event has a follow-up event in the dataset, which itself has the value update of. A new event therefore gets the value updated by when a related follow-up event is entered.
  • update of = the event is a follow-up of an event previously entered in the dataset.
  • living together = the animal(s) described in the event share(s) the same geolocation (e.g. farm, household, pet store) as one or more other animals previously entered in the dataset.
  • same study = the event reports infection in animal(s) belonging to a study that was previously entered in the dataset.
  • connected = the event is epidemiologically related to a previously reported event in the dataset (e.g. SARS-CoV-2 events in pet hamsters in pet shops in Hong Kong, following a single importation of infected individuals from the Netherlands).

related_ID Unique identifier of the related entry in the dataset.
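The pairing between update of and updated by described for related_to_other_entries can be checked mechanically. The sketch below, in base R, assumes that related_ID holds a single identifier per row (the real dataset may differ) and uses a hypothetical helper `check_updates` with made-up rows.

```r
# Minimal sketch: verify that each "update of" row points to an existing
# event whose own status is "updated by".
# Assumption: related_ID contains exactly one ID per row.
check_updates <- function(dat) {
  rows <- which(dat$related_to_other_entries == "update of")
  upd <- dat[rows, , drop = FALSE]
  # Status of the event each follow-up points to (NA if the ID is unknown)
  target <- dat$related_to_other_entries[match(upd$related_ID, dat$ID)]
  data.frame(
    ID = upd$ID,
    related_ID = upd$related_ID,
    target_exists = !is.na(target),
    target_is_updated_by = !is.na(target) & target == "updated by",
    stringsAsFactors = FALSE
  )
}

# Made-up rows for illustration only
toy <- data.frame(
  ID = c("1", "2", "3", "4"),
  related_to_other_entries = c("updated by", "update of", "new", "update of"),
  related_ID = c("2", "1", "NA", "3"),
  stringsAsFactors = FALSE
)

res <- check_updates(toy)
print(res)
```

In the toy data, event 4 is flagged because it claims to update event 3, which is still marked new.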

test First type of laboratory test performed to detect infection with (presence of the virus is evidenced) or exposure to (presence of antibodies is evidenced) SARS-CoV-2.

sampling_type Type of sample collected to perform the test (test).

test_2 Second type of laboratory test performed to detect infection with (presence of the virus is evidenced) or exposure to (presence of antibodies is evidenced) SARS-CoV-2.

sampling_type_2 Type of sample collected to perform the second test (test_2).

test_3 Third type of laboratory test performed to detect infection with (presence of the virus is evidenced) or exposure to (presence of antibodies is evidenced) SARS-CoV-2.

sampling_type_3 Type of sample collected to perform the third test (test_3).

negative_test First type of laboratory test mentioned in the report whose outcome was negative.

negative_sampling_type Type of sample collected to perform the first test (negative_test) that led to a negative result.

negative_test_2 Second type of laboratory test mentioned in the report whose outcome was negative.

negative_sampling_type_2 Type of sample collected to perform the second test (negative_test_2) that led to a negative result.

reason_for_testing Rationale for testing the animal(s).

symptoms Reported clinical signs presumed to be associated with SARS-CoV-2.

outcome Outcome of the SARS-CoV-2 infection (or exposure).

living_conditions How/where the animal(s) live(s).

source_of_infection Most probable source of SARS-CoV-2 infection.

variant SARS-CoV-2 genetic variant.

control_measures Main intervention(s) implemented to mitigate further spread of the virus.

original_source Information source cited by the primary source.

link_original_source Link to the online source cited by the primary source (when applicable).
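Several fields in the dictionary have pre-defined values, and date_published is stated to equal date_reported when WAHIS is the primary source. The following base R sketch shows one way to test both rules. The helper names, the toy rows, and the decision to accept the codes NS and NA for sex are assumptions made for this example.

```r
# Minimal sketch: look for values outside the pre-defined sets and for
# WAHIS events whose two dates disagree. Helper names are hypothetical.
allowed <- list(
  primary_source = c("ProMED", "WAHIS"),
  epidemiological_unit = c("animal", "group", "survey group", "farm"),
  sex = c("f", "m", "NS", "NA")  # assumption: the NS/NA codes may occur here
)

find_unexpected <- function(dat, allowed) {
  out <- lapply(names(allowed), function(field) {
    setdiff(unique(dat[[field]]), allowed[[field]])
  })
  names(out) <- names(allowed)
  out
}

wahis_date_mismatch <- function(dat) {
  is_wahis <- dat$primary_source == "WAHIS"
  dat$ID[which(is_wahis & dat$date_published != dat$date_reported)]
}

# Made-up rows for illustration only (dates kept as yyyy-mm-dd text)
toy <- data.frame(
  ID = c(1, 2, 3),
  primary_source = c("ProMED", "WAHIS", "WAHIS"),
  epidemiological_unit = c("animal", "farm", "Farm"),
  sex = c("f", "m", "female"),
  date_reported = c("NS", "2021-02-01", "2021-03-05"),
  date_published = c("2021-01-10", "2021-02-01", "2021-03-06"),
  stringsAsFactors = FALSE
)

unexpected <- find_unexpected(toy, allowed)
mismatch <- wahis_date_mismatch(toy)
print(unexpected)
print(mismatch)
```

The comparison is case-sensitive on purpose, so that a variant spelling such as "Farm" surfaces for manual review instead of being silently accepted.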

Note

We have considered the two following values throughout the dataset:

  • NS Not specified: the information would be relevant for the event but the information was not mentioned in the primary or secondary source.
  • NA Not applicable: the field is not applicable in this case.
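Because NS and NA are stored as text, count fields such as number_cases need a conversion step before any arithmetic. The sketch below, in base R, assumes the fields were read as character strings (so that the literal "NA" was not converted to a missing value on import). The helper names `to_count` and `code_status` are illustrative.

```r
# Minimal sketch: separate the NS / NA codes from real numeric values.
# Both helpers are hypothetical and assume character input.
to_count <- function(x) {
  x <- as.character(x)
  coded <- is.na(x) | x %in% c("NS", "NA")
  out <- rep(NA_real_, length(x))
  out[!coded] <- as.numeric(x[!coded])
  out
}

code_status <- function(x) {
  x <- as.character(x)
  ifelse(is.na(x), "empty",
    ifelse(x == "NS", "not specified",
      ifelse(x == "NA", "not applicable", "value")))
}

# Made-up values for illustration only
number_cases <- c("3", "NS", "NA", "12")

counts <- to_count(number_cases)
status <- code_status(number_cases)

print(counts)                     # numeric values, codes become missing
print(table(status))              # how many entries fall in each category
print(sum(counts, na.rm = TRUE))  # total over the entries that have a number
```

Keeping the status in a separate vector preserves the distinction between "not specified" and "not applicable", which is lost once both are turned into missing values.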

Contributors

License

This project is licensed under the CC BY-SA 4.0 License - see the CC BY-SA 4.0 file for details.

Contact

Amélie Desvars-Larrive
Email: [email protected]

Disclaimer

The World Organisation for Animal Health (WOAH) bears no responsibility for the integrity or accuracy of the data contained herein, in particular due, but not limited to, any deletion, manipulation, or reformatting of data that may have occurred beyond its control.
