June, 2022
Background
About this repository
Prerequisite
About the data
Contributors
License
Contact
Disclaimer
The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) most
probably emerged from an animal source and subsequently spilled over to
the human population in 2019, in China. Despite its zoonotic origin, the
current COVID-19 pandemic is being sustained through human-to-human
transmission.
Animal infections with SARS-CoV-2 have been reported by several
countries. A wide range of animal species have proven to be susceptible
to SARS-CoV-2 via natural and/or experimental infection.
Collecting and sharing data on reported SARS-CoV-2 natural infections in
animals is of critical importance to assess their epidemiological
significance for animal and human health, as well as their implications
for biodiversity and conservation.
This is the public repository of the SARS-ANI Dataset and related
documentation.
This repository contains:
- sars_ani_data.csv: This file contains the raw data of the SARS-ANI Dataset, which presents structured information on SARS-CoV-2 events in animals (.csv format, UTF-8 encoded).
- sars_ani_validation.R: This file contains the code to validate and curate the dataset. This code enables the users to explore the structure of the dataset, check the different entries for each field, and search for the presence of duplicates.
- sars_ani_visualization.Rmd: This Markdown file contains the code to explore, describe, and visualize the dataset. This code is used for the visual validation of the data. To see all the results, knit it to .pdf (default output). Explanations for the code are displayed in the .R and .Rmd files.
- sars_ani_examples.pdf: This PDF file contains three examples illustrating the structure and coding scheme of the SARS-ANI Dataset.
- Contributing.md: This file provides guidelines for contributing to the project: suggesting changes to the data or to the code, submitting new data, and contributing to the code.
- sars_ani_PDF_archives: For each SARS-CoV-2 event recorded in the dataset, a copy of the report used as primary and secondary information source is saved in this folder. Each report was downloaded as a .pdf file, on which a timestamp was inserted (ProMED-mail reports) or the download date was added to the file name (it was not possible to insert a timestamp on WAHIS reports).
- sars_ani_excluded_rep.xlsx: This file contains the list of ProMED-mail and WAHIS reports that were not included in the dataset and reasons for exclusion.
The SARS-ANI Dashboard provides
interactive visualizations of the dynamic version of the dataset.
The data in this repository are updated weekly.
If you would like to create PDF documents from R Markdown, you will need
to have a LaTeX distribution installed.
To install TinyTeX with the R package tinytex, run the following code in
the Console: tinytex::install_tinytex()
To uninstall TinyTeX, run: tinytex::uninstall_tinytex()
The dataset and code were created in an English-locale environment,
where . is used as the decimal symbol and dates are in the format
yyyy-mm-dd.
To learn more about locales, see: https://cran.r-project.org/web/packages/readr/vignettes/locales.html.
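As a rough illustration of these conventions, the sketch below parses a few made-up rows with Python's standard library, reading decimals with . and dates as yyyy-mm-dd regardless of the system locale. The sample values and the helper names parse_row and read_events are hypothetical; only the column names (ID, age, date_confirmed) come from the data dictionary.

```python
import csv
import io
from datetime import date

# Made-up sample rows; the values are illustrative only and are
# not taken from the real dataset.
SAMPLE = (
    "ID,age,date_confirmed\n"
    "1,2.5,2021-03-14\n"
    "2,0.75,2020-11-02\n"
)


def parse_row(row):
    """Convert text fields using '.' decimals and yyyy-mm-dd dates."""
    return {
        "ID": row["ID"],
        # float() expects '.' as the decimal symbol, whatever the system locale.
        "age": float(row["age"]),
        # date.fromisoformat() accepts the yyyy-mm-dd format.
        "date_confirmed": date.fromisoformat(row["date_confirmed"]),
    }


def read_events(handle):
    """Read all rows from an open CSV file handle."""
    return [parse_row(row) for row in csv.DictReader(handle)]


events = read_events(io.StringIO(SAMPLE))
print(events[0]["age"], events[0]["date_confirmed"].isoformat())
```

To try this on the real file, one would pass open("sars_ani_data.csv", encoding="utf-8", newline="") to read_events. Fields holding the NS or NA placeholders would need to be handled before conversion, which this sketch does not do.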
Information has been collected from two major databases: i) the Program
for Monitoring Emerging Diseases (ProMED-mail),
which is a program of the International Society for Infectious Diseases
(ISID); ii) the World Animal Health Information
System (WAHIS) of the World Organisation for
Animal Health (WOAH, formerly OIE).
ProMED-mail is the largest publicly available
system for reporting global infectious disease outbreaks. It provides
reports (called “posts”) on outbreaks and disease emergence. The
information flow leading to publication of ProMED-mail reports is as
follows: a disease event to be dispatched is selected from daily
notifications of outbreaks received via email, searches of the
Internet and traditional media, and scanning of official and unofficial
websites. All incoming information is reviewed and filtered by an editor
or associate editor, who subsequently sends it to a multidisciplinary
global team of subject matter expert moderators. The moderators assess the
accountability and accuracy of the information, interpret it, provide
commentary, and give references to previous ProMED-mail reports and to
the scientific literature.
WAHIS is a Web-based computer system that
processes data on animal diseases in real time. WAHIS data reflect the
information gathered by the Veterinary Services from WOAH Member and
non-Member Countries and Territories on WOAH-listed diseases in
domestic animals and wildlife, as well as on emerging and zoonotic
diseases. The detection of infection with SARS-CoV-2 in animals meets
the criteria for reporting to the World Organisation for Animal Health
(WOAH) as an emerging infection in accordance with the WOAH Terrestrial
Animal Health Code. Only authorized users, i.e. the Delegates of WOAH
Member Countries and their authorized representatives, can enter data
into the WAHIS platform to notify the WOAH of relevant animal disease
information. All information is publicly accessible on the
WAHIS interface.
One ProMED-mail or WAHIS report, identified via a unique report
identifier, may describe one or several health events (or
outbreaks). ProMED-mail and WAHIS also publish follow-up reports of
outbreaks (describing e.g. clinical follow-up, further spread of the
virus, treatment outcome, number of newly infected animals and new
deaths, newly implemented control measures); these have also been entered
in the dataset.
The number of reported SARS-CoV-2 events in animals in each country
depends on the reporting strategy of the country to the WOAH, the
intensity of the research and surveillance strategy in the different
animal species (e.g. whether pets from infected households are
systematically investigated or not), the media coverage on the diagnosed
cases, and the uptake of the reported event by the ProMED-mail team. An
event/outbreak that is published in neither WAHIS nor ProMED-mail is not
included in the dataset.
Data on each SARS-CoV-2 event in animals was collected and entered
manually in a .csv file.
The dataset is structured as follows:
- Each row of the dataset represents a SARS-CoV-2 event in animal(s), identified by a unique identifier (field ID). An event is defined as one single case or several epidemiologically related cases identified by the presence of viral RNA (proof of infection) and/or antibodies (proof of exposure) in an animal. Epidemiologically related cases include e.g. animals belonging to the same farm, captive animals housed together, pets belonging to the same household, or animals sampled within the same (generally transversal) study, featuring similar event and patient attributes, i.e. they underwent the same laboratory test(s) and showed the same results (including variant), exhibited the same symptoms and disease outcome, and were confirmed, reported (when applicable), and published on the same date. For example, when pets of the same species sharing the same household showed different symptoms, they are reported as two distinct events. Events include follow-up history reports of outbreaks (e.g. follow-up on the clinical status of the animal, variant identification after case confirmation).
- Each SARS-CoV-2 event is characterized by 50 quantitative and qualitative event and patient attributes (columns) that structure the dataset.
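Because each row is meant to carry a unique ID, a duplicate check is a natural first sanity test. The sketch below is one possible way to do it with Python's standard library; it is not the validation code shipped in sars_ani_validation.R. The sample rows are made up, and duplicated_ids is a hypothetical helper name.

```python
import csv
import io
from collections import Counter

# Made-up rows for illustration; ID 2 is deliberately repeated.
SAMPLE = (
    "ID,primary_source\n"
    "1,ProMED\n"
    "2,WAHIS\n"
    "2,WAHIS\n"
)


def duplicated_ids(rows):
    """Return the sorted list of ID values that occur more than once."""
    counts = Counter(row["ID"] for row in rows)
    return sorted(event_id for event_id, n in counts.items() if n > 1)


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
dups = duplicated_ids(rows)
print(dups)
```

An empty list means every row has a distinct ID. The same DictReader object also exposes the header through its fieldnames attribute, which could be compared against the 50 expected columns.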
ID
Unique identifier for each unique event of SARS-CoV-2
infection/exposure in animal(s).
primary_source
Primary source of information to document
the event. Possible pre-defined string values are: ProMED; WAHIS.
archive_event_number
Unique identifier for the report, as
provided by the primary source. Also corresponds to the name of the PDF
file describing the event in the sars_ani_PDF_archives folder.
link_web
Link to the online primary source to document the
event.
secondary_source
Secondary source of information to
document the event. Possible pre-defined string values are: ProMED;
WAHIS.
secondary_source_ID
Unique identifier for the report, as
provided by the secondary source. Also corresponds to the name of the
PDF file describing the event in the sars_ani_PDF_archives folder.
secondary_source_web
Link to the online secondary source
for the event.
host_com_orig
Most specific designation of the animal host
provided by the source(s), in English.
host_sci_orig
Scientific name of the animal host as
mentioned in the source(s) (scientific names are harmonized so that only
the first letter of the genus is capitalized).
host_com_res
Common name of the animal host, harmonized
against the National Center for Biotechnology Information
(NCBI) taxonomic backbone.
host_sci_res
Scientific name of the animal host (resolved
to species or subspecies level), harmonized against the National Center
for Biotechnology Information (NCBI)
taxonomic backbone.
host_colloq
The colloquial name of the host, i.e. the name
commonly used to identify the animal in non-specialist language
(e.g. “tiger” for “Sumatran tiger”).
host_sci_spec_res
The scientific name of the host resolved
to the species level.
family
Animal family of the animal host.
epidemiological_unit
The epidemiological unit considered to
describe the event. Possible pre-defined string values are: animal =
one individual; group = a group of animals housed/living together
(excluding farm animals), e.g. zoo animals, pets; survey group =
animals that have been sampled in different locations within the same
surveillance programme or survey study; farm = a group of animals
belonging to the same species and bred for commercial purposes.
number_cases
Reported number of animal(s) that tested positive
for SARS-CoV-2 in the event.
number_susceptible
Reported number of susceptible animal(s)
of the same species in the event.
number_tested
Reported number of animal(s) of the same
species tested in the event.
number_deaths
Reported number of direct and indirect
death(s) related to the event. If death is not related to SARS-CoV-2
(see field outcome), number_deaths = 0.
age
Age of the animal(s) when tested, in years.
sex
Sex of the animal(s). Possible pre-defined values are:
f = female; m = male.
country_iso3
Three-letter ISO country code for the country
where the SARS-CoV-2 event was reported.
country_name
Name of the country where the SARS-CoV-2 event
was reported.
subnational_administration
The subnational administrative
region where the SARS-CoV-2 event was reported.
city
The city where the SARS-CoV-2 event was reported.
location_detail
Specification of the geographic location
that distinguishes SARS-CoV-2 events occurring in the same
species, on the same date and at the same geolocation
(subnational_administration, city), when the
report(s) clearly state(s) that the animal(s) were not located at the
same place (e.g. different farms or households).
date_confirmed
When the SARS-CoV-2 infection or exposure
was laboratory confirmed.
date_reported
When the SARS-CoV-2 event was reported by the
WAHIS.
date_published
When the primary source published the
SARS-CoV-2 event (date_published = date_reported
when WAHIS is the primary source).
related_to_other_entries
Relationship with another record
(see field related_ID) in the dataset. Possible pre-defined
string values are:
- new = the event is not related to any event previously entered in the dataset and no follow-up event exists. It can, however, be related to an event that was reported on the same day or later, which then carries one of the following values for related_to_other_entries: living together, connected, or same study.
- updated by = the event has a follow-up event in the dataset, which itself presents the value update of. Therefore, a new event gets the value updated by when a follow-up related event is entered.
- update of = the event is a follow-up of an event previously entered in the dataset.
- living together = the animal(s) described in the event share(s) the same geolocation (e.g. farm, household, pet store) as another (other) animal(s) that has/have been previously entered in the dataset.
- same study = the event reports infection in animal(s) belonging to a study that was previously entered in the dataset.
- connected = the event is epidemiologically related to a previously reported event in the dataset (e.g. SARS-CoV-2 events in pet hamsters in pet shops in Hong Kong, following a single importation of infected individuals from the Netherlands).
related_ID
Unique identifier of the related entry in the
dataset.
test
First type of laboratory test performed to detect
infection with (presence of the virus is evidenced) or exposure to
(presence of antibodies is evidenced) SARS-CoV-2.
sampling_type
Type of sample collected to perform the test
(test).
test_2
Second type of laboratory test performed to detect
infection with (presence of the virus is evidenced) or exposure to
(presence of antibodies is evidenced) SARS-CoV-2.
sampling_type_2
Type of sample collected to perform the
second test (test_2).
test_3
Third type of laboratory test performed to detect
infection with (presence of the virus is evidenced) or exposure to
(presence of antibodies is evidenced) SARS-CoV-2.
sampling_type_3
Type of sample collected to perform the
third test (test_3).
negative_test
First type of laboratory test mentioned in
the report whose outcome was negative.
negative_sampling_type
Type of sample collected to perform
the first test (negative_test) that led to a negative result.
negative_test_2
Second type of laboratory test mentioned in
the report whose outcome was negative.
negative_sampling_type_2
Type of sample collected to
perform the second test (negative_test_2) that led to a
negative result.
reason_for_testing
Rationale for testing the animal(s).
symptoms
Reported clinical signs allegedly associated with
SARS-CoV-2.
outcome
Outcome of the SARS-CoV-2 infection (or exposure).
living_conditions
How/where the animal(s) live(s).
source_of_infection
Most probable source of SARS-CoV-2
infection.
variant
SARS-CoV-2 genetic variant.
control_measures
Main intervention(s) implemented to
mitigate further spread of the virus.
original_source
Information source cited by the primary
source.
link_original_source
Link to the online source cited by the
primary source (when applicable).
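The related_to_other_entries and related_ID fields described above lend themselves to a simple consistency check: an event marked updated by should point to a follow-up event marked update of. The sketch below illustrates this under two assumptions that the documentation does not confirm: related_ID holds exactly one ID, and a follow-up event is not itself marked updated by. The sample rows are made up and unreciprocated_updates is a hypothetical helper name, so flagged IDs are best read as candidates for manual review.

```python
# Made-up rows using the field names from the data dictionary.
SAMPLE_ROWS = [
    {"ID": "1", "related_to_other_entries": "updated by", "related_ID": "2"},
    {"ID": "2", "related_to_other_entries": "update of", "related_ID": "1"},
    {"ID": "3", "related_to_other_entries": "updated by", "related_ID": "9"},
    {"ID": "4", "related_to_other_entries": "new", "related_ID": "NA"},
]


def unreciprocated_updates(rows):
    """Return IDs of 'updated by' events whose follow-up is missing
    or is not marked 'update of'.

    Assumes related_ID contains a single ID (an assumption, not a
    documented guarantee).
    """
    by_id = {row["ID"]: row for row in rows}
    problems = []
    for row in rows:
        if row["related_to_other_entries"] != "updated by":
            continue
        follow_up = by_id.get(row["related_ID"])
        if follow_up is None or follow_up["related_to_other_entries"] != "update of":
            problems.append(row["ID"])
    return problems


flagged = unreciprocated_updates(SAMPLE_ROWS)
print(flagged)
```

In the sample, event 3 is flagged because its related_ID points to an event that is not present.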
Two special values are used throughout the dataset:
- NS (not specified): the information would be relevant for the event but was not mentioned in the primary or secondary source.
- NA (not applicable): the field is not applicable in this case.
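Since NS and NA carry different meanings, it can help to keep them apart when loading the data; many CSV readers treat NA as a generic missing value by default, which would blur the distinction. The sketch below shows one possible way to handle both markers explicitly in Python. The helper names classify and parse_count are hypothetical, the sample values are made up, and the assumption that count fields hold plain integers is not confirmed by the documentation.

```python
from collections import Counter
from typing import Optional

NOT_SPECIFIED = "NS"   # relevant, but not mentioned in the sources
NOT_APPLICABLE = "NA"  # field does not apply to this event


def classify(value: str) -> str:
    """Label a raw field as 'not specified', 'not applicable', or 'value'."""
    text = value.strip()
    if text == NOT_SPECIFIED:
        return "not specified"
    if text == NOT_APPLICABLE:
        return "not applicable"
    return "value"


def parse_count(value: str) -> Optional[int]:
    """Return an int, or None when the field holds NS or NA.

    Assumes counts are plain integers (an assumption for illustration).
    """
    if classify(value) != "value":
        return None
    return int(value.strip())


# Made-up raw values, as they might appear in a count column.
raw_values = ["3", "NS", "NA", "12"]
parsed = [parse_count(v) for v in raw_values]
summary = Counter(classify(v) for v in raw_values)
print(parsed, dict(summary))
```

Keeping the classification separate from the numeric conversion preserves the information about why a value is absent, even after counts are converted to integers.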
- Afra Nerpel, University of Veterinary Medicine Vienna, Austria
- Liuhuaying Yang, Complexity Science Hub Vienna, Austria
- Johannes Sorger, Complexity Science Hub Vienna, Austria
- Annemarie Käsbohrer, University of Veterinary Medicine Vienna, Austria
- Chris Walzer, Wildlife Conservation Society, New York, United States / University of Veterinary Medicine Vienna, Austria
- Amélie Desvars-Larrive, University of Veterinary Medicine Vienna, Austria / Complexity Science Hub Vienna, Austria
This project is licensed under the CC BY-SA 4.0 License - see the CC BY-SA 4.0 file for details.
Amélie Desvars-Larrive
Email:
[email protected]
The World Organisation for Animal Health (WOAH) bears no responsibility for the integrity or accuracy of the data contained herein, in particular due, but not limited to, any deletion, manipulation, or reformatting of data that may have occurred beyond its control.