title: Model Point Clustering
emoji: 🧮
colorFrom: yellow
colorTo: green
sdk: gradio
sdk_version: 5.31.0
app_file: app.py
pinned: false
license: mit
tags:
- actuarial
- clustering
- model-points
- insurance
- gradio
- data-science
- present-values
- policy-attributes
- cashflows
- machine-learning
short_description: Cluster insurance policies into representative model points.
🧮 Model Point Clustering Dashboard
An interactive dashboard for calibrating and evaluating model points using K-Means clustering. Designed for actuaries and data scientists working with large insurance portfolios.
Overview
This application performs cluster-based model point selection by grouping similar policies to represent large portfolios more efficiently.
You can choose from three clustering calibration methods:
- Annual Cashflows
- Policy Attributes
- Present Values
It compares how well each clustering method replicates actual values across base, lapse, and mortality stress scenarios.
Use Cases
- Model point reduction for valuation and projections
- Policy summarization for faster simulations
- Stress testing comparison across representative points
- Actuarial model validation and calibration studies
Features
Calibration Methods
- Cashflows: Captures policy behavior over time.
- Attributes: Uses demographic/product characteristics.
- Present Values: Focuses on total liability or cashflow values.
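K-Means measures similarity with Euclidean distance, so a variable on a large scale (for example a sum assured) can dominate one on a small scale (for example age). The sketch below shows one common way to put calibration variables on a comparable footing before clustering. Whether the app scales its inputs this way is an assumption, and the `standardize` helper and the toy columns are hypothetical.

```python
import numpy as np


def standardize(features):
    """Center each column and scale it to unit standard deviation.

    `features` is a 2-D array with one row per policy and one column per
    calibration variable (hypothetical layout, not taken from app.py).
    """
    features = np.asarray(features, dtype=float)
    std = features.std(axis=0)
    std[std == 0] = 1.0  # constant columns are only centered, not scaled
    return (features - features.mean(axis=0)) / std


# Toy policy attributes: age and sum assured (made-up numbers).
features = np.array([
    [30, 100_000],
    [40, 200_000],
    [50, 300_000],
])
scaled = standardize(features)
print(scaled)
```

After scaling, both columns contribute equally to the distance used by K-Means.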
Interactive Tabs
- Summary: Bar chart of absolute PV Net Cashflow errors.
- Cashflow Calibration: Visual and tabular comparisons based on cashflows.
- Policy Attribute Calibration: Analysis using static policy data.
- Present Value Calibration: PV-based clustering with stress testing.
Scenario Support
- Base Scenario
- Lapse Stress (+50%)
- Mortality Stress (+15%)
Required Inputs
Upload 7 .xlsx files, or use the example files by clicking Load Example Data.
File Type | Description |
---|---|
cashflows_seriatim_10K.xlsx | Base cashflows per policy |
cashflows_seriatim_10K_lapse50.xlsx | Cashflows under lapse stress |
cashflows_seriatim_10K_mort15.xlsx | Cashflows under mortality stress |
model_point_table.xlsx | Policy attributes (age, term, etc.) |
pv_seriatim_10K.xlsx | Present values for base |
pv_seriatim_10K_lapse50.xlsx | PVs under lapse stress |
pv_seriatim_10K_mort15.xlsx | PVs under mortality stress |
Example directory structure:
├── app.py
└── eg_data/
    ├── cashflows_seriatim_10K.xlsx
    ├── cashflows_seriatim_10K_lapse50.xlsx
    ├── cashflows_seriatim_10K_mort15.xlsx
    ├── model_point_table.xlsx
    ├── pv_seriatim_10K.xlsx
    ├── pv_seriatim_10K_lapse50.xlsx
    └── pv_seriatim_10K_mort15.xlsx
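Before running the app locally, it can help to confirm that all seven files are in place. The following is a minimal sketch, assuming the files sit in an `eg_data/` folder as in the layout above. The dictionary keys and the `find_missing` helper are hypothetical names and are not taken from `app.py`.

```python
from pathlib import Path

# File names come from the Required Inputs table; the keys are hypothetical labels.
REQUIRED_FILES = {
    "cashflows_base": "cashflows_seriatim_10K.xlsx",
    "cashflows_lapse": "cashflows_seriatim_10K_lapse50.xlsx",
    "cashflows_mort": "cashflows_seriatim_10K_mort15.xlsx",
    "policy_data": "model_point_table.xlsx",
    "pv_base": "pv_seriatim_10K.xlsx",
    "pv_lapse": "pv_seriatim_10K_lapse50.xlsx",
    "pv_mort": "pv_seriatim_10K_mort15.xlsx",
}


def find_missing(data_dir="eg_data"):
    """Return the required file names that are not present in data_dir."""
    data_dir = Path(data_dir)
    return [
        name
        for name in REQUIRED_FILES.values()
        if not (data_dir / name).is_file()
    ]


missing = find_missing("eg_data")
if missing:
    print("Missing files:", ", ".join(missing))
else:
    print("All 7 input files found.")
```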
How to Use
Launch the App
Click the "Open in Spaces" button or run app.py.
Upload or Load Files
- Upload all 7 required .xlsx files.
- Or click "Load Example Data".
Run Analysis
Click "Analyze Dataset" to generate cluster representatives, plots, and comparisons.
Explore Tabs
- Summary: Calibration errors across scenarios.
- Cashflow Calibration: Clustered vs actual based on cashflows.
- Policy Attribute Calibration: Calibrated via policy data.
- Present Value Calibration: Uses PVs directly.
Behind the Scenes
Core Engine: Clusters Class
Encapsulates K-Means logic for:
- Clustering using selected variables
- Selecting representative policies
- Aggregating actual vs estimated outputs
- Plotting cashflows, PVs, and scatter comparisons
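The clustering and representative-selection steps listed above can be sketched with scikit-learn. This is an illustration under stated assumptions, not the app's actual `Clusters` class: the `select_model_points` function, its parameters, and the toy data are hypothetical. It picks the policy closest to each centroid as the representative and weights it by the number of policies in that cluster.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import pairwise_distances_argmin_min


def select_model_points(features, n_clusters, random_state=0):
    """Cluster policies and pick one representative policy per cluster.

    Returns (rep_idx, weights): rep_idx[k] is the row index of the policy
    nearest to centroid k, and weights[k] is the size of cluster k.
    """
    features = np.asarray(features, dtype=float)
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=random_state)
    labels = kmeans.fit_predict(features)
    # For each centroid, find the index of the closest actual policy.
    rep_idx, _ = pairwise_distances_argmin_min(kmeans.cluster_centers_, features)
    # Count how many policies fall in each cluster.
    weights = np.bincount(labels, minlength=n_clusters)
    return rep_idx, weights


# Toy data: two well-separated groups of 30 and 20 policies.
rng = np.random.default_rng(0)
features = np.vstack([
    rng.normal(0.0, 0.1, size=(30, 2)),
    rng.normal(5.0, 0.1, size=(20, 2)),
])
rep_idx, weights = select_model_points(features, n_clusters=2)
print("representative rows:", rep_idx)
print("cluster sizes:", weights)
```

Using an actual policy rather than the centroid itself keeps each model point a valid record that can be fed back into a projection model.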
Key Libraries
- gradio: UI and file interface
- pandas, numpy: Data manipulation
- scikit-learn: K-Means clustering
- matplotlib, PIL: Visualization
Output Summary
The application generates:
- Cluster vs Actual Comparisons
- Cashflow Time Series Plots
- Per-Cluster Scatter Plots
- Summary Tables
- Mean Absolute Error Bar Charts
All results are based on direct comparison of cluster-aggregated estimates vs original full dataset metrics.
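As a rough illustration of that comparison, the sketch below scales each representative's value by its cluster size, sums the result, and measures the deviation from the seriatim total. The relative-error definition, the helper names, and the numbers are assumptions for illustration only. The app may define its error metric differently.

```python
import numpy as np


def aggregate_estimate(rep_values, weights):
    """Estimate the portfolio total as sum(cluster size * representative value)."""
    return float(np.dot(np.asarray(weights, dtype=float),
                        np.asarray(rep_values, dtype=float)))


def relative_error(estimate, actual):
    """Absolute relative deviation of the estimate from the seriatim total."""
    return abs(estimate / actual - 1.0)


# Toy seriatim present values for five policies (made-up numbers).
pv = np.array([100.0, 110.0, 90.0, 200.0, 220.0])
rep_idx = np.array([0, 3])  # hypothetical representative policies
weights = np.array([3, 2])  # policies per cluster

estimate = aggregate_estimate(pv[rep_idx], weights)  # 3*100 + 2*200 = 700
actual = pv.sum()                                    # 720
error = relative_error(estimate, actual)
print(f"estimate={estimate:.0f}, actual={actual:.0f}, error={error:.2%}")
```

The same calculation can be repeated on the lapse and mortality stress files with the representatives held fixed, which shows how well a calibration carries over to scenarios it was not fitted on.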
Attribution & References
Inspired by the Lifelib open-source project:
lifelib Developers. (2025). Model Point Clustering. In lifelib: Life actuarial models in Python.
https://github.com/lifelib-dev/lifelib
Notebook reference:
Cluster Model Points (Lifelib Notebook)
Local Setup
To run locally:
# Clone the repo
git clone https://github.com/alidenewade/model-point-clustering.git
cd model-point-clustering
# Install dependencies
pip install -r requirements.txt
# Launch app
python app.py
License
This project is open source under the MIT License.