CDR Processing with Spark

1 - MapR Streams + Spark Streaming: here Spark consumes the CDR messages directly from MapR Streams and checks the "state of the towers" based on the CDRs.

If the failure rate over the last 500 messages (or the last x minutes) reaches a given percentage, we should send an alert and/or change the tower state. For example, we can push a new message on an event topic for the UI, more or less what you have done in the racing car event. The idea:

0% to 10% failure on the sliding window: tower in GREEN
above 10% up to 60% failure: tower in ORANGE
above 60% up to 90% failure: tower in RED (+ special alert)
above 90% failure: tower in BLACK (+ special alert)
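
As a rough illustration of the thresholds above, here is a minimal sketch in plain Python (not the Spark Streaming API). It keeps the last N call outcomes for one tower and reports a state change when the failure percentage crosses a band. The names `tower_state` and `TowerWindow` are hypothetical, and the exact boundary handling (10% counts as GREEN, 60% as ORANGE, 90% as RED) is an assumption.

```python
from collections import deque


def tower_state(failure_pct):
    """Map a failure percentage (0-100) to a tower colour.

    Boundary handling (<= on each upper bound) is an assumption.
    """
    if failure_pct <= 10:
        return "GREEN"
    if failure_pct <= 60:
        return "ORANGE"
    if failure_pct <= 90:
        return "RED"
    return "BLACK"


class TowerWindow:
    """Keeps the outcome of the last `size` CDRs for a single tower."""

    def __init__(self, size=500):
        # deque with maxlen drops the oldest outcome automatically
        self.outcomes = deque(maxlen=size)  # True means the call failed
        self.state = "GREEN"

    def add(self, failed):
        """Record one CDR; return the new state if it changed, else None."""
        self.outcomes.append(bool(failed))
        pct = 100.0 * sum(self.outcomes) / len(self.outcomes)
        new_state = tower_state(pct)
        if new_state != self.state:
            self.state = new_state
            return new_state  # caller could publish this on the event topic
        return None
```

A caller would keep one `TowerWindow` per tower id, call `add(...)` for every incoming CDR, and publish an event only when the return value is not `None`. A time-based window (last x minutes) would need timestamps instead of a fixed-size deque.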

2 - Analytical Processing: here we will use Spark to create aggregated views (aggregated documents) based on the CDR and tower data, for example:

stats by caller id (for example, one document in the JSON DB for each caller id) with some aggregated data: number of calls, average duration, min/max duration, and % of failure
stats by tower: number of calls, average duration, min/max duration, and % of failure

We can also aggregate by day, by hour, and so on. Using pre-aggregated documents, this job can run every x minutes and apply incremental updates to keep the stats for the whole dataset up to date.
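
To make the incremental update concrete, here is a minimal sketch in plain Python (not Spark code). It builds one stats document per key from a batch of CDRs and merges it into a previously stored document. The field names (`caller_id`, `duration`, `failed`) and the function names are hypothetical. The one design point is that the stored document keeps the call count and total duration rather than the average, since an average cannot be merged across batches on its own.

```python
def empty_stats():
    """A mergeable stats document for one caller id or one tower."""
    return {"calls": 0, "failures": 0, "total_duration": 0.0,
            "min_duration": None, "max_duration": None}


def _pick(fn, x, y):
    """Apply min/max while tolerating None (no data yet)."""
    if x is None:
        return y
    if y is None:
        return x
    return fn(x, y)


def add_cdr(stats, duration, failed):
    """Fold one CDR into a stats document (mutates and returns it)."""
    stats["calls"] += 1
    stats["failures"] += 1 if failed else 0
    stats["total_duration"] += duration
    stats["min_duration"] = _pick(min, stats["min_duration"], duration)
    stats["max_duration"] = _pick(max, stats["max_duration"], duration)
    return stats


def aggregate_batch(cdrs, key):
    """Group a batch of CDR dicts by `key` (e.g. 'caller_id')."""
    out = {}
    for cdr in cdrs:
        stats = out.setdefault(cdr[key], empty_stats())
        add_cdr(stats, cdr["duration"], cdr["failed"])
    return out


def merge(stored, batch):
    """Combine the stored document with the one from a new batch."""
    return {
        "calls": stored["calls"] + batch["calls"],
        "failures": stored["failures"] + batch["failures"],
        "total_duration": stored["total_duration"] + batch["total_duration"],
        "min_duration": _pick(min, stored["min_duration"], batch["min_duration"]),
        "max_duration": _pick(max, stored["max_duration"], batch["max_duration"]),
    }


def summary(stats):
    """Derive the reported values: average duration and % of failure."""
    calls = stats["calls"]
    return {
        "calls": calls,
        "avg_duration": stats["total_duration"] / calls if calls else None,
        "min_duration": stats["min_duration"],
        "max_duration": stats["max_duration"],
        "failure_pct": 100.0 * stats["failures"] / calls if calls else None,
    }
```

Each run of the job would call `aggregate_batch` on the CDRs received since the last run, `merge` each result into the stored document for that key, and write the merged document back. The same shape works for per-day or per-hour documents by including the day or hour in the key.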

3 - Machine Learning: the idea here is to create a simple model and show how you can use it in applications. Something like:

if the time since the last CDR is greater than the 99th percentile of that time, mark as failed
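
A minimal sketch of that rule in plain Python follows. It rests on assumptions the note does not spell out: that "time" means the gap in seconds between consecutive CDRs for a tower, that the threshold is fitted from historical gaps, and that a nearest-rank percentile is good enough. The function names are hypothetical.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: the smallest value such that at least
    pct% of the data is less than or equal to it."""
    if not values:
        raise ValueError("need at least one value")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def fit_threshold(gaps, pct=99):
    """'Train' the model: the threshold is a percentile of historical gaps."""
    return percentile(gaps, pct)


def is_failed(seconds_since_last_cdr, threshold):
    """Apply the model: flag when the current silence exceeds the threshold."""
    return seconds_since_last_cdr > threshold
```

An application would call `fit_threshold` periodically on recent history and `is_failed` whenever it checks how long a tower has been silent.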
