Heart Disease Prediction Model

Overview

We introduce a Multi-Layer Perceptron (MLP) model for predicting heart disease from patient data. Developed by XeTute Technologies using HANNA (Hamzah's Neural Network Architecture), a custom C++ neural-network library, the model demonstrates the potential of high-performance C++ computing for AI inference and research. It was trained on consumer-grade hardware and achieves efficient performance.


Model Details

  • Framework: HANNA (Custom C++ Library for Neural Networks)
  • Dataset Header: id,age,gender,height,weight,ap_hi,ap_lo,cholesterol,gluc,smoke,alco,active,cardio
  • Sample Data: 13,17668,1,158,71.0,110,70,1,1,0,0,1,0
  • Training Time: ~82s ± 500ms
  • Hardware: AMD Ryzen 6-core CPU @ 3.8 GHz

The model is trained on min-max normalized patient data to predict the likelihood of heart disease (the cardio field), using a sigmoid activation function.


Training Details

  • Activation Function: Sigmoid
  • Learning Rate: 0.05
  • Epochs: 200
  • Batch Size: Full-batch

Files

  • HDM.MLP.zip: Contains the trained MLP model.

Usage

  1. Load the Model:

    MLP::MLP hdm;
    hdm.load("HDM.MLP");
    
  2. Make Predictions:

    hdm.forward(input, sigmoid);
    float prediction = hdm.out()[0];
    std::cout << "Prediction: " << prediction << std::endl;
    

Acknowledgments

This project was built by XeTute Technologies, focusing on high-performance AI development and research in C++.
THIS MODEL IS NOT MEANT FOR PRODUCTION ENVIRONMENTS. DO NOT USE THIS MODEL FOR REAL-WORLD MEDICAL DECISIONS.


License

This project is open-sourced under the Apache 2.0 License.


Code

The code used to train the model is provided below. The implementation includes data preprocessing, normalization, and MLP training:

#include <iostream>
#include <fstream>
#include <sstream>
#include <algorithm>
#include <chrono>
#include <vector>
#include <cmath>

#include <omp.h>

#include "HANNA/HANNA.hpp"

// Activation and derivative
void sigmoid(float& x) { x = 1.f / (1.f + std::exp(-x)); }
void sigmoidDV(float& x) { x *= (1.f - x); }

struct normConf { float min, delta; }; // delta = max - min
void minmaxnorm(std::vector<float>& a, normConf conf)
{
    if (a.empty()) return;
    if (conf.delta == 0.f) // constant column: avoid dividing by zero
    {
        std::fill(a.begin(), a.end(), 0.f);
        return;
    }

    long long int size = a.size();
#pragma omp parallel for
    for (long long int i = 0; i < size; ++i)
        a[i] = (a[i] - conf.min) / conf.delta;
}

std::vector<std::vector<float>> readCSV(std::string path)
{
    std::ifstream r(path, std::ios::in);
    if (!r.is_open()) return {};

    std::vector<std::vector<float>> data(0);
    std::size_t elems = 0;

    std::string buffer("");
    std::string elem("");

    std::getline(r, buffer); // First line is header
    {
        std::stringstream header(buffer);
        while (std::getline(header, elem, ','))
            ++elems;
    }

    while (std::getline(r, buffer))
    {
        std::stringstream row(buffer);
        std::vector<float> rowvec(elems - 1); // 'id' column is dropped below

        std::getline(row, elem, ','); // First elem is 'id'; discard it
        for (std::size_t i = 0; std::getline(row, elem, ','); ++i)
            rowvec[i] = std::stof(elem);
        data.push_back(rowvec);
    }
    return data;
}

int main()
{
    omp_set_num_threads(6);

    std::vector<std::vector<float>> data = readCSV("cardio-hdd.csv");

    if (data.empty())
    {
        std::cerr << "Failed to read cardio-hdd.csv.\n";
        return 1;
    }

    std::size_t rows = data.size();
    std::size_t cols = data[0].size();
    std::size_t inputs = cols - 1;

    std::vector<std::vector<float>> input(rows, std::vector<float>(inputs));
    std::vector<std::vector<float>> output(rows, std::vector<float>(1));

    std::vector<normConf> conf(inputs, { 0.f, 0.f });

    for (std::size_t col = 0; col < inputs; ++col)
    {
        std::vector<float> coldata(rows);
        for (std::size_t row = 0; row < rows; ++row)
            coldata[row] = data[row][col];

        float min = *std::min_element(coldata.begin(), coldata.end());
        conf[col] = { min, *std::max_element(coldata.begin(), coldata.end()) - min };
        minmaxnorm(coldata, conf[col]);

        for (std::size_t row = 0; row < rows; ++row)
            data[row][col] = coldata[row];
    }
    std::cout << "Normalized data.\n";

    for (std::size_t row = 0; row < rows; ++row)
    {
        for (std::size_t col = 0; col < inputs; ++col)
            input[row][col] = data[row][col];
        output[row][0] = data[row][inputs];
    }

    data.clear();
    std::cout << "Formatted data.\n";

    MLP::MLP hdm; // heart disease
    if (!hdm.load("HDM.MLP"))
    {
        long long int _inp = static_cast<long long int>(inputs);
        hdm.birth({ _inp, _inp, _inp, _inp / 2, 1 });
    }

    float lr = 0.05f;
    std::size_t epochs = 200;
    std::chrono::high_resolution_clock::time_point tp[2];

    hdm.enableTraining();
    tp[0] = std::chrono::high_resolution_clock::now();
    hdm.train(input, output, sigmoid, sigmoidDV, lr, epochs);
    tp[1] = std::chrono::high_resolution_clock::now();
    hdm.disableTraining();

    std::cout << "Took " << std::chrono::duration_cast<std::chrono::seconds>(tp[1] - tp[0]).count() << "s.\n";

    hdm.save("HDM.MLP");
    hdm.forward(input[0], sigmoid);
    std::cout << "Output: " << hdm.out()[0] << '.' << std::endl;

    return 0;
}