Gather data from different sources for NPI model #518

Open
hazarane opened this issue Aug 12, 2020 · 1 comment

Description

The data we currently use for the NPI model was compiled by forecasters some time ago and is now outdated.

Acceptance Criteria

Use data from other sources to compile the dataset for the NPI model.
Available datasets:
https://www.bsg.ox.ac.uk/research/research-projects/coronavirus-government-response-tracker
https://www.nature.com/articles/s41562-020-0909-7
https://github.com/amel-github/covid19-interventionmeasures
https://masks4all.co/what-countries-require-masks-in-public/

Investigate the datasets and merge them in whatever way best fits the model.

JanataPavel commented Aug 12, 2020

From the list of datasets:
https://www.bsg.ox.ac.uk/research/research-projects/coronavirus-government-response-tracker - OxCGRT - useful for most of the features
https://www.nature.com/articles/s41562-020-0909-7 - CoronaNet - the data is more ambiguous and less suited to our use case, but some features could be extracted from it to fill the gaps in OxCGRT
https://github.com/amel-github/covid19-interventionmeasures - the data is no longer updated, so it is of no use to us
https://masks4all.co/what-countries-require-masks-in-public/ - sporadically updated; the data contains only the current status

OxCGRT

The definitions of the features can be found in the OxCGRT codebook. Some of the features also have flags marking whether the policy is nationwide or applies only in some parts of the country. As we are interested only in nationwide interventions, we require these flags to be set.
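The flag requirement can be sketched as a small helper. This is only an illustration: the column names `C1_School closing` and `C1_Flag` and the convention that a flag value of 1 means "nationwide" are assumptions about the OxCGRT CSV layout and should be checked against the codebook. `nationwide_level` is a hypothetical name.

```python
def nationwide_level(row, feature, flag):
    """Return the ordinal level of `feature`, counting it only if nationwide.

    `row` is a dict for one country-day (e.g. from csv.DictReader).
    Assumption: a flag value of 1 marks a policy that covers the whole country.
    """
    raw = row.get(feature)
    if raw in (None, ""):
        return None  # no data for this day
    level = int(float(raw))
    if level == 0:
        return 0  # no policy in place, so the flag does not matter
    is_nationwide = str(row.get(flag, "")).strip() in ("1", "1.0")
    # A policy limited to part of the country is treated as absent.
    return level if is_nationwide else 0
```

A row with level 3 and flag 0 therefore yields 0, which is the behaviour described above for policies that are not nationwide.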

Some of the features used in some of our experiments already originated from the OxCGRT dataset

Most of the OxCGRT features are on an ordinal scale: 0 corresponds to no restrictions, and higher values range from mild to tighter restrictions. Usually, the lowest levels are recommendations to the public, which we ignore, as we are only concerned with policies that are enforced in some way.

Mapping:

| Our feature | Our feature description | OxCGRT feature | Threshold, description |
|---|---|---|---|
| Symptomatic Testing | From OxCGRT | H2_Testing policy | >= 2<br>2 - testing of anyone showing Covid-19 symptoms<br>3 - open public testing (eg "drive through" testing available to asymptomatic people)<br>Blank - no data |
| Gatherings <1000 | A country has set a size limit on gatherings. The limit is at most 1000 people (often less), and gatherings above the maximum size are disallowed. For example, a ban on gatherings of 500 people or more would be classified as “gatherings limited to 1000 or less”, but a ban on gatherings of 2000 people or more would not. | C4_Restrictions on gatherings | >= 2<br>2 - restrictions on gatherings between 101-1000 people<br>3 - restrictions on gatherings between 11-100 people<br>4 - restrictions on gatherings of 10 people or less |
| Gatherings <100 | A country has set a size limit on gatherings. The limit is at most 100 people (often less). | C4_Restrictions on gatherings | >= 3<br>3 - restrictions on gatherings between 11-100 people<br>4 - restrictions on gatherings of 10 people or less |
| Gatherings <10 | A country has set a size limit on gatherings. The limit is at most 10 people (often less). | C4_Restrictions on gatherings | >= 4<br>4 - restrictions on gatherings of 10 people or less |
| School Closure | A country has closed most or all schools. | C1_School closing | >= 3<br>3 - require closing all levels |
| Stay Home Order | An order for the general public to stay at home has been issued. This is mandatory, not just a recommendation. Exemptions are usually granted for certain purposes (such as shopping, exercise, or going to work), or, more rarely, for certain times of the day. In practice, a stay-at-home order was often accompanied by other NPIs such as businesses closures. However, a stay-at-home order does not in principle entail these other NPIs, but only the (additional) order to generally stay at home except for exemptions. | C6_Stay at home requirements | >= 2<br>2 - require not leaving house with exceptions for daily exercise, grocery shopping, and 'essential' trips<br>3 - require not leaving house with minimal exceptions (eg allowed to leave once a week, or only one person can leave at a time, etc) |
| Travel Screen/Quarantine | From OxCGRT | C8_International travel controls | >= 1<br>1 - screening arrivals<br>2 - quarantine arrivals from some or all regions<br>3 - ban arrivals from some regions<br>4 - ban on all regions or total border closure |
| Travel Bans | From OxCGRT | C8_International travel controls | >= 3<br>3 - ban arrivals from some regions<br>4 - ban on all regions or total border closure |
| Public Transport Limited | From OxCGRT | C5_Close public transport | >= 1<br>1 - recommend closing (or significantly reduce volume/route/means of transport available)<br>2 - require closing (or prohibit most citizens from using it) |
| Internal Movement Limited | From OxCGRT | C7_Restrictions on internal movement | >= 1<br>1 - recommend not to travel between regions/cities<br>2 - internal movement restrictions in place |
| Public Information Campaigns | From OxCGRT | H1_Public information campaigns | >= 1<br>1 - public officials urging caution about Covid-19<br>2 - coordinated public information campaign (eg across traditional and social media) |
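
The mapping above amounts to a threshold on each ordinal column. A minimal sketch follows, assuming each country-day is available as a dict keyed by the OxCGRT column names used in the table. `THRESHOLDS` and `to_binary_features` are hypothetical names, and the nationwide flags are ignored here for brevity.

```python
# (OxCGRT column, minimum level) for each of our features, copied from the mapping.
THRESHOLDS = {
    "Symptomatic Testing": ("H2_Testing policy", 2),
    "Gatherings <1000": ("C4_Restrictions on gatherings", 2),
    "Gatherings <100": ("C4_Restrictions on gatherings", 3),
    "Gatherings <10": ("C4_Restrictions on gatherings", 4),
    "School Closure": ("C1_School closing", 3),
    "Stay Home Order": ("C6_Stay at home requirements", 2),
    "Travel Screen/Quarantine": ("C8_International travel controls", 1),
    "Travel Bans": ("C8_International travel controls", 3),
    "Public Transport Limited": ("C5_Close public transport", 1),
    "Internal Movement Limited": ("C7_Restrictions on internal movement", 1),
    "Public Information Campaigns": ("H1_Public information campaigns", 1),
}


def to_binary_features(row):
    """Turn one OxCGRT country-day row into on/off values for our features.

    A blank or missing value is kept as None (no data) rather than False.
    """
    features = {}
    for name, (column, minimum) in THRESHOLDS.items():
        value = row.get(column)
        if value in (None, ""):
            features[name] = None
        else:
            features[name] = float(value) >= minimum
    return features
```

Because the three gatherings features share one column, a single C4 value of 3 switches on both "Gatherings <1000" and "Gatherings <100" but not "Gatherings <10".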

Non-mapped features

  • Some/Most Businesses Suspended - our feature is concerned mainly with customer-facing businesses, while the only similar OxCGRT feature is C2_Workplace closing, which overlaps with it but has a wider focus. It could probably still be used if other data is lacking

  • Mask-wearing - OxCGRT has no mask-wearing features

  • Universities closed - OxCGRT doesn't differentiate between school levels and has only the C1_School closing feature

CoronaNet

The CoronaNet dataset is a list of entries, each corresponding to an announced policy that went into effect on some date.
The policies are grouped into categories (type) and subcategories (type_sub_cat), which are described in the CoronaNet codebook. However, the subcategory names in the codebook don't exactly match the names in the data, and I could not find any coherent description of the subcategories.

From all the entries we have to filter the nationwide and mandatory policies (columns init_country_level and compliance).
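
A rough sketch of that filter is below. The column names come from the text above, but the concrete values ("National" for init_country_level, compliance strings containing "Mandatory") are assumptions about how CoronaNet encodes these fields and need to be verified against the actual data. `is_national_mandatory` is a hypothetical helper.

```python
def is_national_mandatory(entry):
    """Keep only nationwide, mandatory policies.

    `entry` is a dict for one CoronaNet row. The matched values are assumed,
    not taken from the codebook: level "National", compliance "Mandatory ...".
    """
    level = (entry.get("init_country_level") or "").strip().lower()
    compliance = (entry.get("compliance") or "").strip().lower()
    return level == "national" and "mandatory" in compliance


def filter_entries(entries):
    """Return the entries that pass the nationwide + mandatory check."""
    return [e for e in entries if is_national_mandatory(e)]
```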

The main goal with this dataset is to fill in the features not contained in the OxCGRT data (i.e. masks, businesses, and universities)

### Universities
Although subcategories for universities are defined, the data contains no entries about closed universities.

### Masks
All the mask-wearing entries should be contained in the Social Distancing category. Usually they fall into one of the subcategories Unspecified Mask Wearing Policy, Wearing masks, or Other Mask Wearing Policy, but not always: they can also appear under All public spaces / everywhere or Inside public or commercial building (e.g. supermarkets) as part of a wider policy. So an entry counts as a mask-wearing entry if it satisfies one of these conditions:
* Has type_sub_cat equal to one of [Unspecified Mask Wearing Policy, Wearing masks, Other Mask Wearing Policy]
* Has type == Social Distancing and contains "mask" in the description of the entry
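
The two conditions can be written as a single predicate. This is a sketch only: `type` and `type_sub_cat` are the columns named above, while the name of the free-text column (`description`) is an assumption, and the match on "mask" is done case-insensitively as a design choice.

```python
MASK_SUBCATEGORIES = {
    "Unspecified Mask Wearing Policy",
    "Wearing masks",
    "Other Mask Wearing Policy",
}


def is_mask_entry(entry):
    """Return True if a CoronaNet entry looks like a mask-wearing policy."""
    # Condition 1: the subcategory is explicitly about masks.
    if entry.get("type_sub_cat") in MASK_SUBCATEGORIES:
        return True
    # Condition 2: a Social Distancing entry whose text mentions masks.
    description = (entry.get("description") or "").lower()
    return entry.get("type") == "Social Distancing" and "mask" in description
```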

### Businesses
All entries related to the closing of businesses have type=="Restriction and Regulation of Businesses"
* Some Businesses Suspended - has any subcategory except ["Construction","Telecommunications", "Information service activities", "Publishing activities", "Warehousing and support activities for transportation", "Mining and quarrying"] (we want only customer-facing)
* Most Businesses Suspended - has one of the subcategories ["All or unspecified non-essential businesses", "All or unspecified essential businesses", "Non-Essential Commercial Businesses", "Other Essential Businesses"]
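
As a sketch, the two rules translate into two set lookups. This reads the second rule as defining "Most Businesses Suspended", which is an assumption, and it treats an entry with no subcategory as matching neither rule, which is also an assumption. `business_features` is a hypothetical helper.

```python
BUSINESS_TYPE = "Restriction and Regulation of Businesses"

# Subcategories excluded from "some" because they are not customer-facing.
NOT_CUSTOMER_FACING = {
    "Construction",
    "Telecommunications",
    "Information service activities",
    "Publishing activities",
    "Warehousing and support activities for transportation",
    "Mining and quarrying",
}

# Subcategories taken to indicate a broad closure ("most", by assumption).
BROAD_CLOSURES = {
    "All or unspecified non-essential businesses",
    "All or unspecified essential businesses",
    "Non-Essential Commercial Businesses",
    "Other Essential Businesses",
}


def business_features(entry):
    """Classify one CoronaNet entry into the some/most business features."""
    sub = entry.get("type_sub_cat")
    if entry.get("type") != BUSINESS_TYPE or not sub:
        return {"some": False, "most": False}
    return {"some": sub not in NOT_CUSTOMER_FACING, "most": sub in BROAD_CLOSURES}
```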

Although CoronaNet contains a lot of data about many countries, the format of the data makes it basically impossible to determine automatically which countermeasures are in effect at a given time. From my exploration of the data, deducing any meaningful information from it would require reading the descriptions of individual entries, because entries with the same category and subcategory often have widely different meanings. In addition, the entries are not entirely consistent.
