Study GKMConsistencyDesign
This page aims to define the GeoKretyMap consistency check job: issue#328
A sync state object defines the current consistency check status.
It is null on the very first run; otherwise it is stored in a temporary cache file on disk.
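The null-or-cached behaviour can be sketched as follows. This is a minimal sketch, not the project's implementation: the JSON file format, the function names `load_sync_state` and `save_sync_state`, and the write-then-replace strategy are all assumptions made for illustration.

```python
import json
import os
import tempfile
from typing import Optional


def load_sync_state(path: str) -> Optional[dict]:
    """Return the cached sync state, or None on the very first run."""
    if not os.path.exists(path):
        return None
    with open(path, "r", encoding="utf-8") as fh:
        return json.load(fh)


def save_sync_state(path: str, state: dict) -> None:
    """Store the sync state for the next job (hypothetical JSON cache file)."""
    directory = os.path.dirname(path) or "."
    # Write to a temporary file first, then replace the cache file in one
    # step, so an interrupted job does not leave a half-written state.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(state, fh)
    os.replace(tmp_path, path)
```

A job would call `load_sync_state` at startup and `save_sync_state` before exiting.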
`SyncState` attributes:
- `rollId`: current roll identifier, from 10_000 to 19_999. The rollId is incremented by 1 when a roll is finished. If the incremented rollId is out of range, the minimum value is used instead.
- `timestamp`: the current geokrety offset to use (maximum geokrety creation date). When a batch select returns 0 results with timestamp as the maximum creation date, the roll is finished, timestamp is set to null, and the next roll restarts from scratch (its first batch starts from the geokrety most recently created as of now).
- `geokrety_count`: number of analyzed geokrety
- `unsync_geokrey_count`: number of unsync geokrety
- (the roll output directory is generated from configuration + rollId)
- array of `finished_roll` history
`finished_roll` attributes:
- `rollId`: roll identifier
- `geokrety_count`
- `unsync_geokrey_count`
- `finished_timestamp`: timestamp value of the last finished roll.
- (the roll output directory is generated from configuration + rollId)
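As an illustration, the two structures and the rollId wrap-around rule could be modelled as below. The attribute names come from this page; the Python types, the default values, the `finished_rolls` field name and the `next_roll_id` helper are assumptions, not a decided design.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional

ROLL_ID_MIN = 10_000
ROLL_ID_MAX = 19_999


def next_roll_id(roll_id: int) -> int:
    """Increment a rollId by 1, falling back to the minimum when out of range."""
    candidate = roll_id + 1
    return candidate if candidate <= ROLL_ID_MAX else ROLL_ID_MIN


@dataclass
class FinishedRoll:
    rollId: int
    geokrety_count: int
    unsync_geokrey_count: int
    finished_timestamp: datetime  # assumed to be a datetime


@dataclass
class SyncState:
    rollId: int = ROLL_ID_MIN
    timestamp: Optional[datetime] = None  # None: start from scratch
    geokrety_count: int = 0
    unsync_geokrey_count: int = 0
    # Hypothetical field name for the "array of finished_roll history".
    finished_rolls: List[FinishedRoll] = field(default_factory=list)
```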
Sync parameters are defined as the set of all config, parameters and attributes used as input to the consistency check business logic.
SyncParameters:
- current gk-geokrety table
- a job startup trigger (a cron entry config)
- `konfig` entries:
  - `gkm_consistency_max_duration_sec`: maximum duration of a job, in seconds
  - `gkm_consistency_max_batch_size`: batch size, used as the geokrety select limit
  - `gkm_consistency_roll_min_days`: minimum number of days between rolls
  - `gkm_consistency_roll_number_to_keep`: number of roll history entries to keep
  - `gkm_consistency_diretory`: output log directory
  - `gkm_api_endpoint`: GeoKretyMap API endpoint
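For illustration only, the konfig entries could be grouped as in the sketch below. The entry names are those listed above; every value shown is a placeholder, and joining the output directory with the rollId as a sub-directory name is an assumption about how "configuration + rollId" is combined.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class SyncKonfig:
    gkm_consistency_max_duration_sec: int
    gkm_consistency_max_batch_size: int
    gkm_consistency_roll_min_days: int
    gkm_consistency_roll_number_to_keep: int
    gkm_consistency_diretory: str
    gkm_api_endpoint: str


def roll_output_directory(konfig: SyncKonfig, roll_id: int) -> str:
    """Build the roll output directory from configuration + rollId (assumed layout)."""
    return os.path.join(konfig.gkm_consistency_diretory, str(roll_id))


# Placeholder values, not recommended settings.
example_konfig = SyncKonfig(
    gkm_consistency_max_duration_sec=600,
    gkm_consistency_max_batch_size=50,
    gkm_consistency_roll_min_days=7,
    gkm_consistency_roll_number_to_keep=5,
    gkm_consistency_diretory="/tmp/gkm-consistency",
    gkm_api_endpoint="https://gkm.example.invalid/api",
)
```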
A job is started by the cron configuration. Its goal is to start a roll and try to finish it within the max duration. In general, more than one job may be needed to complete a roll, which lets us spread the load of a roll over several days.
- If a roll cannot be finished within the limit, a SyncState is produced and stored for the next job.
- If a job is triggered with an ended roll as SyncState, a new roll is started if and only if the minimum interval in days between rolls has been reached. The rollId is then incremented.
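The "start a new roll only after the minimum interval" rule can be sketched as a small predicate. The function name and its parameters are hypothetical. Treating a missing previous roll as "may start" is an assumption for the very first run.

```python
from datetime import datetime, timedelta
from typing import Optional


def may_start_new_roll(
    last_finished: Optional[datetime],
    now: datetime,
    roll_min_days: int,
) -> bool:
    """Return True when a new roll is allowed to start."""
    if last_finished is None:
        # No finished roll yet: nothing to wait for (assumption).
        return True
    return now - last_finished >= timedelta(days=roll_min_days)
```

Here `roll_min_days` would come from `gkm_consistency_roll_min_days`, and `last_finished` from the most recent `finished_timestamp`.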
A roll is defined as the check of all geokrety table entries. This check is done in one or more batches (depending on X):
- We start from the current datetime and select X geokrety ordered by creation date descending.
- X is the max batch size.
- A batch compares these X geokrety with the remote GKM state using the GKM API (cf. dedicated section).
- When a batch is finished, `timestamp` is set to the minimum creation date of the geokrety in the batch (to be used as the max timestamp for the next batch iteration).
- A new batch can start if and only if the job max duration in seconds has not been reached.
We consider a roll ended when a new batch returns no result. We then need to store the roll history and set `timestamp` to null.
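The batch loop described above could look like the following sketch. It is written against injected callables so that it stays independent of the database and of the GKM API. `select_batch`, `count_unsync`, the dict-based state and the `created` key are all hypothetical names. The sketch assumes `select_batch` returns geokrety created strictly before the given date, newest first.

```python
import time
from typing import Any, Callable, Dict, List

Geokret = Dict[str, Any]


def run_job(
    state: Dict[str, Any],
    select_batch: Callable[[Any, int], List[Geokret]],
    count_unsync: Callable[[List[Geokret]], int],
    max_batch_size: int,
    max_duration_sec: float,
    now: Any,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Run batches until the roll ends or the time budget is spent.

    Returns True when the roll is finished, False when it must be
    resumed by a later job using the updated state.
    """
    started = clock()
    if state["timestamp"] is None:
        # Start from scratch: the first batch uses the current datetime.
        state["timestamp"] = now
    while clock() - started < max_duration_sec:
        batch = select_batch(state["timestamp"], max_batch_size)
        if not batch:
            # No result: the roll is finished, timestamp goes back to null.
            state["timestamp"] = None
            return True
        state["geokrety_count"] += len(batch)
        state["unsync_geokrey_count"] += count_unsync(batch)
        # The oldest creation date becomes the upper bound of the next batch.
        state["timestamp"] = min(g["created"] for g in batch)
    return False
```

With a strict "created before" bound, geokrety sharing exactly the same creation date across a batch boundary could be skipped. With an inclusive bound, the same rows would be selected again. The real select would need a tie-breaker (for instance on id) to avoid both.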
The admin page should include:
- an array of the last Y rolls
- for each roll: state, number of handled geokrety, and number of unsync geokrety, with a friendly reminder of the rollId output log directory.
The following geokrety information will be used to compare a `gk-geokrety` entry with the related GKM API call:
- id
- dateMoved
- ownerName
- ownerId
- distanceTraveledKm
- waypointCode
- state
- typeId
- positionLat
- positionLon
- imageSrc
- name
- lastMoveId
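A field-by-field comparison over this list might be sketched as below. It assumes that both the local entry and the GKM API result have already been mapped to plain dicts keyed by the names above. The shape of the actual GKM API response is not described on this page, and `diff_geokret` is a hypothetical helper.

```python
from typing import Any, Dict, List

COMPARED_FIELDS = (
    "id", "dateMoved", "ownerName", "ownerId", "distanceTraveledKm",
    "waypointCode", "state", "typeId", "positionLat", "positionLon",
    "imageSrc", "name", "lastMoveId",
)


def diff_geokret(local: Dict[str, Any], remote: Dict[str, Any]) -> List[str]:
    """Return the names of the compared fields whose values differ."""
    return [f for f in COMPARED_FIELDS if local.get(f) != remote.get(f)]


def is_unsync(local: Dict[str, Any], remote: Dict[str, Any]) -> bool:
    """A geokret counts as unsync when at least one compared field differs."""
    return bool(diff_geokret(local, remote))
```

Returning the list of differing field names, rather than a boolean only, would make it possible to write a useful entry to the roll output log.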