Strict flight period for common butterfly species - rejected on the eBMS verification #724

Open
CrisSevilleja opened this issue Sep 18, 2024 · 19 comments

@CrisSevilleja
Collaborator

Hello,

The Czech coordinator wanted to validate records in the verification system and found that the rulesets for flagging butterflies are quite strict for some common species. He pointed out that Maniola jurtina and Coenonympha pamphilus are rejected in months when they usually fly. I checked the rulesets for flagging butterflies in issue #511 and found M. jurtina listed as flying between June and mid-August. I took this from the document flightperiod_95perc.csv:

Species | BGR | Month | Period | nrec
Maniola jurtina | Continental | 7 | II | 35396
Maniola jurtina | Continental | 7 | I | 30685
Maniola jurtina | Continental | 6 | II | 25050
Maniola jurtina | Continental | 8 | I | 22809
Maniola jurtina | Continental | 1 | I | 11454
Maniola jurtina | Continental | 6 | I | 11251

The flight period of M. jurtina in Czechia runs from about 20 May to about 15 September (see https://portal.nature.cz/w/druh-31746#/), and for Coenonympha pamphilus it ranges from mid-April to mid-October (https://portal.nature.cz/w/druh-31751#/).

I think we can correct the flight periods of those two species in Czechia. Still, it would be best to check when other common species, such as Pyronia tithonus, Polyommatus thersites, Vanessa atalanta and Celastrina argiolus, are rejected, and correct them for all countries. Another option is to check which species are rejected most often in the verification system and determine which of them have a longer flight period.

@chrisvanswaay
Collaborator

@CrisSevilleja I can't find the script anymore (it will be somewhere), but I'm quite sure I used the 90% or 95% quantiles. For such abundant species, numbers in peak season are very high, so the tails can still hold substantial numbers of records even though they are small as a percentage. In CZ, M. jurtina before 15 May will be doubtful; the question is whether you want to skip all M. jurtina in May (so also those before 15 May) or the other way around. I fear there is no easy fix.

@chrisvanswaay
Collaborator

PS all was restricted to months, so you have to choose to either include May or not.

@CrisSevilleja
Collaborator Author

@chrisvanswaay I wouldn't accept records of M. jurtina before 15 May in Czechia, but the tail at the end of the flight period is almost a month long: the rulesets end in mid-August, while the species can be seen until mid-September.

I am posting this because other coordinators told me the rulesets did not include common species, like in Austria, and I noticed this in Spain as well. I am just wondering if a new check can be done to improve and include more of those species. We can involve more coordinators to check the flight periods of all their country species.

@CrisSevilleja
Collaborator Author

Ah, I thought it was restricted to periods and not months.

@chrisvanswaay
Collaborator

I'll try to find the script (there are so many, that I sometimes forget where I put them).

@DavidRoy
Collaborator

I noticed this issue with common species too. Hopefully Chris can update the rules as we don't want to be manually adjusting them?

Also, it is worth noting that these automated checks only add flags to records; they do not lead to accepted/rejected status, as that is set only by the human verifiers. We could use these rules to automate the verification, but it would be good to be confident first that they work in all (or most) situations.
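
To make the flag versus status distinction concrete, here is a minimal Python sketch. It is not the eBMS implementation: the record layout, the `outside_flight_period` helper and the split of each month at day 15 are all assumptions for illustration. The allowed bins are the M. jurtina (Continental) rows quoted from flightperiod_95perc.csv at the top of this issue.

```python
from datetime import date

# Half-month bins for Maniola jurtina (Continental) as quoted from
# flightperiod_95perc.csv in this issue (the "1 I" row is included as listed).
ALLOWED = {(7, "II"), (7, "I"), (6, "II"), (8, "I"), (1, "I"), (6, "I")}


def outside_flight_period(d, allowed):
    """Return True if the date falls in a half-month bin not in `allowed`.

    Assumption: period I is days 1-15, period II is the rest of the month.
    """
    half_month = (d.month, "I" if d.day <= 15 else "II")
    return half_month not in allowed


# Hypothetical record layout, for illustration only.
record = {"species": "Maniola jurtina", "date": date(2024, 9, 10), "status": "pending"}

# The automated check only sets a flag; the status stays with the human verifier.
record["flagged"] = outside_flight_period(record["date"], ALLOWED)
print(record["flagged"], record["status"])
```

With the 95% bins, a record from 10 September is flagged even though it falls inside the flight period reported for Czechia, which is the situation described in this issue.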

@chrisvanswaay
Collaborator

@DavidRoy Can you find out when I sent you the file with the flight periods, and what it was called? That would help me to trace the script.

@chrisvanswaay
Collaborator

With the script it would be easy to change the quantiles and run it again.

@DavidRoy
Collaborator

It was captured in issue #511, which also links to an earlier issue. There is some discussion there on the approach and on the file you supplied to us.

@chrisvanswaay
Collaborator

Thanks, that helped, found the script. I used GBIF data for this, so if for some countries data is missing, then these species will be missing. Here is the table for M jurtina in Continental:

flight_period_M_jurtina_Continental.xlsx

The top rows:

species | code | month | period | nrec | cumsum | tot | cumperc
Maniola jurtina | Continental | 7 | II | 35396 | 35396 | 153270 | 23,09388661
Maniola jurtina | Continental | 7 | I | 30685 | 66081 | 153270 | 43,11411235
Maniola jurtina | Continental | 6 | II | 25050 | 91131 | 153270 | 59,45781953
Maniola jurtina | Continental | 8 | I | 22809 | 113940 | 153270 | 74,33940106
Maniola jurtina | Continental | 1 | I | 11454 | 125394 | 153270 | 81,81248777
Maniola jurtina | Continental | 6 | I | 11251 | 136645 | 153270 | 89,15312847
Maniola jurtina | Continental | 8 | II | 11010 | 147655 | 153270 | 96,33653031
Maniola jurtina | Continental | 5 | II | 3114 | 150769 | 153270 | 98,36823906
Maniola jurtina | Continental | 9 | I | 1368 | 152137 | 153270 | 99,26078163
Maniola jurtina | Continental | 5 | I | 616 | 152753 | 153270 | 99,66268676
Maniola jurtina | Continental | 9 | II | 316 | 153069 | 153270 | 99,86885888

The columns should be self-explanatory. I used cumperc, the cumulative percentage of the records (not numbers), with the half-month bins ranked from most to fewest records. I put the border at 95%, but we can change that to any other percentage.
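
The cutoff described here can be sketched in Python as follows. This is a reconstruction from the table, not the original script: the `flight_period` function name and the rule "keep a bin while its cumulative percentage is at or below the threshold" are assumptions, chosen because they reproduce the six M. jurtina bins in flightperiod_95perc.csv.

```python
# Half-month record counts for Maniola jurtina (Continental), copied from the table.
ROWS = [
    (7, "II", 35396),
    (7, "I", 30685),
    (6, "II", 25050),
    (8, "I", 22809),
    (1, "I", 11454),
    (6, "I", 11251),
    (8, "II", 11010),
    (5, "II", 3114),
    (9, "I", 1368),
    (5, "I", 616),
    (9, "II", 316),
]
TOTAL = 153270  # "tot" column; it also covers rows not shown in the excerpt


def flight_period(rows, total, threshold=95.0):
    """Return the (month, period) bins kept at the given cumulative percentage."""
    kept = []
    cumsum = 0
    # Rank bins from most to fewest records, as in the table.
    for month, period, nrec in sorted(rows, key=lambda r: r[2], reverse=True):
        cumsum += nrec
        cumperc = 100.0 * cumsum / total
        if cumperc > threshold:
            break  # assumed rule: stop at the first bin that crosses the border
        kept.append((month, period))
    return kept


for threshold in (95.0, 99.0):
    print(threshold, flight_period(ROWS, TOTAL, threshold))
```

Under this assumed rule, the 95% border keeps 7 II, 7 I, 6 II, 8 I, 1 I and 6 I. Raising the border to 99% adds 8 II and 5 II but still leaves out both September bins, so a higher percentage alone may not cover the tail reported for Czechia.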

@chrisvanswaay
Collaborator

And indeed (old man forgot) I did not do it by month, but by the first and second half of the month.

@larspett
Collaborator

larspett commented Sep 18, 2024 via email

@chrisvanswaay
Collaborator

PS: I just noticed that month 1, period I is also in, probably from records in GBIF dated the first of January. I should have skipped those.

@chrisvanswaay
Collaborator

Let me know if I have to run again (and skip all records on 1 Jan in GBIF) and what cumperc I should choose. Like I said this will always be a rough estimate, the only alternative is people filling in their own borders somewhere.
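
A re-run that skips records dated 1 January could look roughly like the Python sketch below. It is only an illustration: the function names, the split of each month at day 15 and the sample dates are assumptions, and the real script and GBIF download are not shown in this thread.

```python
from collections import Counter
from datetime import date


def half_month_bin(d):
    """Map a date to (month, 'I' or 'II'); the split at day 15 is an assumption."""
    return (d.month, "I" if d.day <= 15 else "II")


def count_bins(dates, skip_jan_first=True):
    """Count records per half-month bin, optionally dropping records dated 1 January."""
    counts = Counter()
    for d in dates:
        if skip_jan_first and d.month == 1 and d.day == 1:
            continue  # probably a placeholder date rather than a real observation
        counts[half_month_bin(d)] += 1
    return counts


# Made-up dates for illustration only, not GBIF data.
sample = [date(2023, 1, 1), date(2023, 6, 10), date(2023, 7, 20), date(2023, 7, 25)]
print(count_bins(sample))
print(count_bins(sample, skip_jan_first=False))
```

The resulting counts per bin are what a cumulative-percentage border would then be applied to.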

@larspett
Collaborator

larspett commented Sep 18, 2024 via email

@chrisvanswaay
Collaborator

On the email...

@CrisSevilleja
Collaborator Author

That new run of the script is more accurate, I think. I saw the month of January before; indeed, it has to be a mistake.
Good that you included periods and not only months. Thanks, Chris.

@zdfric

zdfric commented Sep 18, 2024

Dear all,
I would not expect such a long discussion about this topic! What is wrong with Maniola jurtina from mid-May? Usually, it starts in Czechia in June, depending on altitude and area, however, especially this year we have plenty of records from May. Typical for our country is that it is on the edge between Continental and Atlantic climate and sometimes this can lead to unpredictable strange occurrence patterns. Like this year we had a lot of records of second generation of Coenonympha arcania, Limenitis camilla etc.

@CrisSevilleja
Collaborator Author

Where are we with this issue @chrisvanswaay and @DavidRoy ? I have another coordinator (Switzerland) asking for the flight periods to be corrected before they start verifying the records (many red thumbs at the moment).

Perhaps one option would be to use the results produced by Chris to run the eBMS verification and re-run all pending records and put the final list of periods per country/region somewhere on the eBMS. Or ask the coordinators to check this list, although this will be less efficient.
