Rice Classification Model
Overview
This repository contains an XGBoost model trained to classify rice grains using the mltrev23/Rice-classification dataset. The model predicts the type of rice grain from geometric and morphological features measured on each grain. XGBoost (eXtreme Gradient Boosting) is a gradient-boosted decision tree library well suited to structured, tabular data.
Model Details
Algorithm
- XGBoost: A gradient boosting framework that builds an ensemble of decision trees, where each new tree is fitted to correct the errors of the trees before it. It is a popular choice for classification on structured/tabular data because it trains quickly and usually performs well there.
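The boosting idea can be sketched in a few lines of plain Python. The example below is not XGBoost itself: it is plain gradient boosting with squared-error loss and one-split trees (stumps) on a single feature, without XGBoost's regularization or second-order updates. The toy data and all function names are made up for illustration.

```python
def fit_stump(xs, residuals):
    """Return (threshold, left_value, right_value) minimising squared error.

    Assumes xs contains at least two distinct values.
    """
    best = None
    # Skip the largest value so that both sides of every split are non-empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_value = sum(left) / len(left)
        right_value = sum(right) / len(right)
        error = (sum((r - left_value) ** 2 for r in left)
                 + sum((r - right_value) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_value, right_value)
    return best[1:]


def fit_boosted_stumps(xs, ys, n_rounds=20, learning_rate=0.5):
    """Fit stumps one after another, each on the residuals left by the others."""
    base = sum(ys) / len(ys)
    preds = [base] * len(xs)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        threshold, left_value, right_value = fit_stump(xs, residuals)
        stumps.append((threshold, left_value, right_value))
        preds = [p + learning_rate * (left_value if x <= threshold else right_value)
                 for x, p in zip(xs, preds)]
    return {"base": base, "learning_rate": learning_rate, "stumps": stumps}


def predict_one(model, x):
    """Sum the base value and the scaled contribution of every stump."""
    score = model["base"]
    for threshold, left_value, right_value in model["stumps"]:
        score += model["learning_rate"] * (left_value if x <= threshold else right_value)
    return score


# Hypothetical toy data: one feature (an aspect ratio) and a 0/1 label.
toy_x = [1.2, 1.3, 1.4, 1.9, 2.0, 2.1]
toy_y = [0, 0, 0, 1, 1, 1]
toy_model = fit_boosted_stumps(toy_x, toy_y)
toy_scores = [predict_one(toy_model, x) for x in toy_x]
```

Each round shrinks the remaining error by the learning rate, so the scores for the first three grains approach 0 and the scores for the last three approach 1.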
Training Data
- Dataset: The model is trained on the mltrev23/Rice-classification dataset.
- Features: The dataset includes the following features: Area, MajorAxisLength, MinorAxisLength, Eccentricity, ConvexArea, EquivDiameter, Extent, Perimeter, Roundness, and AspectRation.
- Target: The target variable is Class, a binary label indicating the type of rice grain.
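Several of these features are related to one another. The dataset documentation is not quoted here, so the formulas below are an assumption: they are the usual region-property definitions for an ellipse-like shape, and they happen to reproduce the first sample row used in the Making Predictions section. The function name is hypothetical.

```python
import math


def derived_features(area, major_axis, minor_axis, perimeter):
    """Compute shape features from the basic measurements (assumed definitions)."""
    return {
        # Diameter of a circle with the same area as the grain.
        "EquivDiameter": math.sqrt(4 * area / math.pi),
        # Eccentricity of an ellipse with the given axis lengths.
        "Eccentricity": math.sqrt(1 - (minor_axis / major_axis) ** 2),
        # 1.0 for a perfect circle, smaller for elongated or ragged shapes.
        "Roundness": 4 * math.pi * area / perimeter ** 2,
        # The dataset spells this column "AspectRation".
        "AspectRation": major_axis / minor_axis,
    }


sample = derived_features(area=4537, major_axis=92.23, minor_axis=64.01, perimeter=273.08)
for name, value in sample.items():
    print(f"{name}: {value:.2f}")
```

A check like this can help catch rows where the columns were mixed up or scaled inconsistently before they reach the model.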
Model Performance
- Accuracy: [Insert accuracy metric]
- Precision: [Insert precision metric]
- Recall: [Insert recall metric]
- F1-Score: [Insert F1-score]
(Replace the placeholders with actual values after evaluating the model on your test data.)
Requirements
To run the model, you'll need the following Python libraries:
pip install xgboost
pip install pandas
pip install numpy
pip install scikit-learn
pip install matplotlib
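Before running the snippets in this README, it can help to confirm that the libraries are importable. The sketch below uses only the standard library. The module list is an assumption based on the imports used in this document, including matplotlib, which the feature-importance plot needs.

```python
import importlib.util

# Import names differ from pip names in one case: scikit-learn imports as sklearn.
REQUIRED_MODULES = ["xgboost", "pandas", "numpy", "sklearn", "matplotlib"]


def missing_modules(names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
print("Missing:", ", ".join(missing) if missing else "none")
```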
Usage
Loading the Model
You can load the trained model using the following code snippet:
import xgboost as xgb
# Load the trained model
model = xgb.Booster()
model.load_model('rice_classification_xgboost.model')
Making Predictions
To make predictions with the model, use the following code:
import pandas as pd
import xgboost as xgb

# Example input data (replace with your actual data).
# Column names should match the feature names the model was trained on.
data = pd.DataFrame({
    'Area': [4537, 2872],
    'MajorAxisLength': [92.23, 74.69],
    'MinorAxisLength': [64.01, 51.40],
    'Eccentricity': [0.72, 0.73],
    'ConvexArea': [4677, 3015],
    'EquivDiameter': [76.00, 60.47],
    'Extent': [0.66, 0.71],
    'Perimeter': [273.08, 208.32],
    'Roundness': [0.76, 0.83],
    'AspectRation': [1.44, 1.45],
})

# Convert the DataFrame to a DMatrix, the input format a Booster expects
dtest = xgb.DMatrix(data)

# Predict one score per input row
predictions = model.predict(dtest)
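If the model was trained with a binary:logistic objective (an assumption; the training configuration is not given here), the predict call returns one probability per row rather than a class label. A minimal sketch of turning such scores into labels with an explicit threshold follows. The helper name and the example scores are hypothetical.

```python
def to_labels(probabilities, threshold=0.5):
    """Map each probability to 1 if it reaches the threshold, otherwise 0."""
    return [1 if p >= threshold else 0 for p in probabilities]


# Hypothetical scores for illustration, not real model output.
example_probabilities = [0.91, 0.08, 0.5, 0.49]
print(to_labels(example_probabilities))        # [1, 0, 1, 0]
print(to_labels(example_probabilities, 0.95))  # [0, 0, 0, 0]
```

An explicit threshold makes the tie-breaking rule visible: NumPy's round uses round-half-to-even, so a score of exactly 0.5 rounds to 0, whereas this helper assigns it to class 1. Raising the threshold trades recall for precision.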
Evaluation
You can evaluate the model's performance on a test dataset using standard metrics like accuracy, precision, recall, and F1-score:
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
# Assuming you have ground truth labels and predictions
y_true = [1, 0] # Replace with your actual labels
y_pred = predictions.round().astype(int) # Round predicted probabilities to 0/1 labels (assumes a binary:logistic objective)
print("Accuracy:", accuracy_score(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred))
print("Recall:", recall_score(y_true, y_pred))
print("F1 Score:", f1_score(y_true, y_pred))
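To make clear what these four metrics measure, the sketch below computes them by hand from the confusion-matrix counts for a binary problem. It is an illustration in plain Python, not a replacement for scikit-learn. The function name and the example labels are hypothetical.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 with class 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    # Each ratio falls back to 0.0 when its denominator is zero.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels and predictions for illustration only.
labels = [1, 0, 1, 1, 0, 0]
guesses = [1, 0, 0, 1, 1, 0]
metrics = binary_metrics(labels, guesses)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With two true positives, one false positive, one false negative, and two true negatives, all four metrics come out to 2/3 in this example.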
Model Interpretability
To see which features the XGBoost model relies on most, plot the feature importance:
import matplotlib.pyplot as plt
# Plot feature importance
xgb.plot_importance(model)
plt.show()
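The same information is available as numbers. A Booster's get_score method returns a dictionary that maps feature names to importance scores, and passing importance_type="gain" ranks features by how much their splits improve the model, instead of the split count that plot_importance shows by default. The sketch below ranks such a dictionary and converts the scores to shares. The helper name and the example scores are hypothetical and do not come from the trained model.

```python
def rank_importance(scores):
    """Return (feature, share) pairs sorted by share, largest first."""
    total = sum(scores.values())
    if total == 0:
        return [(name, 0.0) for name in scores]
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [(name, value / total) for name, value in ranked]


# Hypothetical scores for illustration only. With the real model you would
# pass something like model.get_score(importance_type="gain") instead.
example_scores = {"Area": 12.0, "Perimeter": 3.0, "Roundness": 5.0}
ranked = rank_importance(example_scores)
for name, share in ranked:
    print(f"{name}: {share:.1%}")
```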
References
If you use this model in your research, please cite the dataset and the following reference for XGBoost:
- Dataset: mltrev23/Rice-classification
- XGBoost: Chen, T., & Guestrin, C. (2016). XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 785-794).