colorFrom: yellow
colorTo: green
description: 'TODO: add a description here'
emoji: 🤑
pinned: false
  version: v3
sdk: gradio
sdk_version: 4.36.0
tags:
  - evaluate
  - metric
title: user-friendly-metrics

How to Use

import evaluate
from seametrics.payload.processor import PayloadProcessor

payload = PayloadProcessor(
    # Dataset and model arguments are omitted in the source.
    # tags=["GT_ID_FUSION"],
)

module = evaluate.load("SEA-AI/user-friendly-metrics")
res = module._compute(payload, max_iou=0.5, recognition_thresholds=[0.3, 0.5, 0.8])
{
    "ahoy_IR_b2_engine_3_6_0_49_gd81d3b63_oversea": {
        "overall": {
            "all": {
                "f1": 0.15967351103175614,
                "fn": 2923.0,
                "fp": 3666.0,
                "num_gt_ids": 10,
                "precision": 0.14585274930102515,
                "recall": 0.1763877148492533,
                "recognition_0.3": 0.1,
                "recognition_0.5": 0.1,
                "recognition_0.8": 0.1,
                "recognized_0.3": 1,
                "recognized_0.5": 1,
                "recognized_0.8": 1,
                "tp": 626.0
            }
        },
        "per_sequence": {
            "Sentry_2023_02_08_PROACT_CELADON_@6m_MOB_2023_02_08_12_51_49": {
                "all": {
                    "f1": 0.15967351103175614,
                    "fn": 2923.0,
                    "fp": 3666.0,
                    "num_gt_ids": 10,
                    "precision": 0.14585274930102515,
                    "recall": 0.1763877148492533,
                    "recognition_0.3": 0.1,
                    "recognition_0.5": 0.1,
                    "recognition_0.8": 0.1,
                    "recognized_0.3": 1,
                    "recognized_0.5": 1,
                    "recognized_0.8": 1,
                    "tp": 626.0
                }
            }
        }
    }
}

Metric Settings

The max_iou parameter controls which pairs of ground-truth and predicted bounding boxes are eligible for association. The default value is 0.5: if the IoU between a ground-truth box and a predicted box is less than 0.5, the predicted box is not considered for association. Raising max_iou makes more predicted boxes eligible. This matches motmetrics, where max_iou is the maximum tolerated IoU distance (1 - IoU) rather than a minimum IoU.
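The gating described above can be sketched as follows. This is a minimal illustration, not the module's code: it assumes boxes in (x, y, width, height) format and assumes that max_iou is compared against the IoU distance 1 - IoU. The helper names iou and is_candidate are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Width and height of the overlap rectangle (zero if the boxes are disjoint)
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def is_candidate(gt_box, pred_box, max_iou=0.5):
    # Assumption: max_iou bounds the IoU distance (1 - IoU),
    # so a larger max_iou admits pairs with less overlap.
    return 1.0 - iou(gt_box, pred_box) <= max_iou


gt = (0, 0, 10, 10)
close_pred = (2, 0, 10, 10)  # IoU = 80 / 120, distance about 0.33
far_pred = (8, 0, 10, 10)    # IoU = 20 / 180, distance about 0.89

print(is_candidate(gt, close_pred))             # True
print(is_candidate(gt, far_pred))               # False
print(is_candidate(gt, far_pred, max_iou=0.9))  # True
```

With the default of 0.5 the two readings (minimum IoU of 0.5, maximum distance of 0.5) coincide; they only diverge for other values.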


The output is a dictionary containing the following metrics:

| Name | Description |
| --- | --- |
| recall | Number of detections over number of objects: tp / (tp + fn) |
| precision | Number of detected objects over the sum of detected objects and false positives: tp / (tp + fp) |
| f1 | F1 score: harmonic mean of precision and recall |
| num_gt_ids | Number of unique objects in the ground truth |
| fn | Number of false negatives |
| fp | Number of false positives |
| tp | Number of true positives |
| recognized_th | Number of unique ground-truth objects that were seen more than th% of the time |
| recognition_th | recognized_th divided by num_gt_ids |
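To make the relationship between the counts and the ratios concrete, the sketch below recomputes precision, recall, F1 and a recognition rate from the values in the example output above. The function names are hypothetical and only illustrate the standard formulas; they are not the module's API.

```python
def detection_metrics(tp, fp, fn):
    """Precision, recall and F1 from raw detection counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


def recognition_rate(recognized, num_gt_ids):
    """Share of ground-truth objects recognized at a given threshold."""
    return recognized / num_gt_ids if num_gt_ids else 0.0


# Counts taken from the "overall" block of the example output
metrics = detection_metrics(tp=626, fp=3666, fn=2923)
print(metrics)

# recognized_0.3 = 1 and num_gt_ids = 10 in the example output
print(recognition_rate(recognized=1, num_gt_ids=10))  # 0.1
```

The printed precision, recall and F1 agree with the example output (about 0.146, 0.176 and 0.160).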

How it Works

We leverage one of the internal variables of the motmetrics MOTAccumulator class, events, which keeps track of detection hits and misses. These values are then processed by the track_ratios function, which computes, for each unique object id, the ratio of assigned appearances to total appearances. We then define the recognition function, which counts how many objects have been seen more often than the desired threshold.
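The two steps can be sketched in plain Python. This is an illustration under stated assumptions, not the module's implementation: the events are modelled as hypothetical (object_id, matched) pairs, one per frame in which the object appears, instead of the motmetrics events table, and the function bodies are a guess at the described behaviour.

```python
from collections import Counter


def track_ratios(events):
    """Ratio of assigned to total appearances per ground-truth object id."""
    total = Counter()
    assigned = Counter()
    for object_id, matched in events:
        total[object_id] += 1
        if matched:
            assigned[object_id] += 1
    return {object_id: assigned[object_id] / total[object_id] for object_id in total}


def recognition(ratios, threshold):
    """Count and share of objects seen more often than the threshold."""
    recognized = sum(1 for ratio in ratios.values() if ratio > threshold)
    rate = recognized / len(ratios) if ratios else 0.0
    return recognized, rate


# Hypothetical events: object 1 is matched in 3 of 4 frames, object 2 in 1 of 4
events = [
    (1, True), (1, True), (1, True), (1, False),
    (2, True), (2, False), (2, False), (2, False),
]

ratios = track_ratios(events)
print(ratios)  # {1: 0.75, 2: 0.25}

for th in (0.3, 0.5, 0.8):
    print(th, recognition(ratios, th))
```

In this toy input, only object 1 passes the 0.3 and 0.5 thresholds and no object passes 0.8.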

W&B logging

Calling module.wandb() logs the user-friendly metrics values to Weights & Biases (W&B). The W&B key is stored as a Secret in this repository.


  • wandb_project - Name of the W&B project (Default: 'user_freindly_metrics')
  • log_plots (bool, optional): Generates categorized bar charts for global metrics. Defaults to True
  • debug (bool, optional): Logs everything to the console and to the W&B Logs page. Defaults to False
import evaluate
import logging
from seametrics.payload.processor import PayloadProcessor


# Configure your dataset and model details
payload = PayloadProcessor(
    # Dataset and model arguments are omitted in the source.
)

# Evaluate using SEA-AI/user-friendly-metrics
module = evaluate.load("SEA-AI/user-friendly-metrics")
res = module._compute(payload, max_iou=0.5, recognition_thresholds=[0.3, 0.5, 0.8])

# Log the results to W&B with plots and debug output enabled
module.wandb(res, log_plots=True, debug=True)
  • If log_plots is True, the W&B logging function generates four bar plots:

    • User_Friendly Metrics (mostly_tracked_score_%), mainly for non-developer users
    • User_Friendly Metrics (mostly_tracked_count_%), for developers
    • Evaluation Metrics (F1, precision, recall)
    • Prediction Summary (false negatives, false positives, true positives)
  • If debug is True, the function logs the global metrics plus the per-sequence evaluation metrics in descending order of F1 score under the Logs section of the run page.

  • If both log_plots and debug are False, the function logs the metrics to the Summary.
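The per-sequence ordering used in debug mode can be sketched as below. This only illustrates sorting by F1 over a result dictionary shaped like the example output; the helper name sequences_by_f1 and the sample values are hypothetical, and no W&B call is made.

```python
def sequences_by_f1(res, model_name, group="all"):
    """Return (sequence_name, f1) pairs sorted by F1, highest first."""
    per_sequence = res[model_name]["per_sequence"]
    rows = [(name, metrics[group]["f1"]) for name, metrics in per_sequence.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Hypothetical result with the same nesting as the example output
res = {
    "model_a": {
        "overall": {"all": {"f1": 0.4}},
        "per_sequence": {
            "sequence_1": {"all": {"f1": 0.2}},
            "sequence_2": {"all": {"f1": 0.7}},
            "sequence_3": {"all": {"f1": 0.3}},
        },
    }
}

for name, f1 in sequences_by_f1(res, "model_a"):
    print(f"{name}: f1={f1:.2f}")
```

Sorting on the returned pairs keeps the original dictionary untouched, so the same result can still be passed on for logging.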



Milan, A., Leal-Taixé, L., Reid, I., Roth, S., and Schindler, K. MOT16: A benchmark for multi-object tracking. arXiv preprint arXiv:1603.00831.

Further References