The client is a joint-industry council founded by bodies that represent Indian broadcasters, advertisers, and advertising and media agencies. It is also the world’s largest television measurement science industry body and the most trusted provider of TV viewership data in India.
The client collects data from households and analyzes it to rate various channels and shows. The council manages the most diverse TV measurement system, which provides granular, minute-by-minute reports of TV viewing across 210 million TV homes in India.
Project Objective – Anomaly detection with machine learning techniques
The client receives viewership data from several television content providers at the end of each day. The data provides key viewership details for each channel, such as which viewers watched during the day, how long they watched, the time of day, and user demographics.
The main objective of the project was to check the data for any sudden change in the viewership pattern, identify the root cause of the changed behavior, and report any observations of non-compliance on a weekly basis. The Niveus team conducted root cause analysis using anomaly detection with machine learning techniques, attributing changes to internal or external events, to verify the authenticity of the viewership data and report accordingly.
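As a rough illustration of what "a sudden change in the viewership pattern" can mean in practice, the sketch below flags days whose value deviates strongly from a trailing window. It is a plain rolling z-score check, not the method the team used. The function name `flag_anomalies`, the 7-day window, the threshold of 3, and the sample figures are all assumptions made for this example.

```python
from statistics import mean, stdev

def flag_anomalies(daily_values, window=7, threshold=3.0):
    """Return indices of days that deviate sharply from the trailing window.

    Hypothetical helper: a rolling z-score check, used here only to
    illustrate the idea of a sudden change in a daily metric.
    """
    flagged = []
    for i in range(window, len(daily_values)):
        history = daily_values[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Flat history: treat any change at all as a deviation.
            if daily_values[i] != mu:
                flagged.append(i)
            continue
        z = (daily_values[i] - mu) / sigma
        if abs(z) > threshold:
            flagged.append(i)
    return flagged

# Made-up daily reach figures for one channel segment (not real data).
reach = [100, 102, 98, 101, 99, 103, 100, 101, 180, 100]
anomalous_days = flag_anomalies(reach)
print(anomalous_days)
```

A flagged day is only a candidate for review. Deciding whether it reflects an internal or external event, or a non-compliance issue, is the root cause step described above.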
- The client receives viewership data from the content providers at the end of the day as batch files
- Data is pre-processed on an Apache Spark cluster for quality checks and transformations before it is provisioned to the YUMI analytics system
- The YUMI system performs various computations to aggregate the viewership data into the following key metrics for each pre-defined segment: Reach, Average Time Spent (ATS) at the segment level, and Average Impression of Audience (AMA). Impressions are captured at the individual level at one-minute granularity
- The output of the YUMI system is processed by a combination of custom application Python scripts, rules, and similar logic to detect anomalies in the viewership pattern
- One-time ingestion of historical data (push mechanism) into Google Cloud Storage
- AutoML Tables to be used for anomaly detection. AutoML capabilities provide automated model/algorithm selection, feature engineering, and hyperparameter tuning. AutoML also applies the best-performing algorithm to the available dataset for anomaly detection
- Vertex AI to help build, deploy, and scale ML models faster, with pre-trained and custom tooling within a unified AI platform. It supports both custom models and AutoML Tables, and provides endpoint deployment of models for further consumption
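The segment-level aggregation mentioned in the list above (Reach, ATS, AMA) can be sketched from minute-level impression records. The source does not give the formulas, so the definitions here are assumptions: Reach is the number of distinct viewers, ATS is viewer-minutes divided by Reach, and AMA is viewer-minutes divided by the length of the reporting period in minutes. The record layout and the name `segment_metrics` are hypothetical and do not describe the YUMI system's actual interface.

```python
from collections import defaultdict

def segment_metrics(records, total_minutes):
    """Aggregate minute-level impressions into per-segment metrics.

    records: iterable of (segment, viewer_id, minute) tuples, one per
             viewer-minute (assumed layout, for illustration only).
    total_minutes: length of the reporting period in minutes.
    """
    viewers = defaultdict(set)
    viewer_minutes = defaultdict(set)
    for segment, viewer_id, minute in records:
        viewers[segment].add(viewer_id)
        # A set de-duplicates repeated rows for the same viewer-minute.
        viewer_minutes[segment].add((viewer_id, minute))

    metrics = {}
    for segment in viewers:
        reach = len(viewers[segment])
        minutes = len(viewer_minutes[segment])
        metrics[segment] = {
            "reach": reach,                   # distinct viewers
            "ats": minutes / reach,           # minutes per reached viewer
            "ama": minutes / total_minutes,   # average viewers per minute
        }
    return metrics

# Made-up records covering a 2-minute period (not real data).
sample = [
    ("urban", "v1", 0),
    ("urban", "v1", 1),
    ("urban", "v2", 1),
    ("rural", "v3", 0),
]
result = segment_metrics(sample, total_minutes=2)
print(result)
```

Per-segment daily values computed this way are the kind of series an anomaly check, whether rule-based or a trained model, would then take as input.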