---
title: Wine Quality Prediction
emoji: 🍷
colorFrom: pink
colorTo: gray
sdk: streamlit
sdk_version: 1.42.0
app_file: app.py
pinned: false
---
<h1>Welcome to the Wine Quality Prediction App</h1><br>
This is my very first app on Hugging Face Spaces and my first deployment of a machine learning model.<br>
This app was created using a dataset generated by a deep learning model trained on the <a href = "https://www.kaggle.com/datasets/yasserh/wine-quality-dataset">Wine Quality Dataset</a>. <br>
The dataset was generated for Episode 5 of the third season of Kaggle's Playground Series. It originally consists of the following attributes: <br><br>
<table>
<tr>
<th>Feature</th>
<th>Description</th>
</tr>
<tr>
<td><b>Fixed Acidity</b></td>
<td>Describes the amount of fixed acids within the wine, such as tartaric and malic acid.</td>
</tr>
<tr>
<td><b>Volatile Acidity</b></td>
<td>Describes the amount of volatile acids in the wine, such as acetic acid.</td>
</tr>
<tr>
<td><b>Citric Acid</b></td>
<td>An organic acid found in citrus fruits, which can add a tangy flavor to the wine.</td>
</tr>
<tr>
<td><b>Residual Sugar</b></td>
<td>Describes the amount of unfermented sugar in the wine, which impacts the taste and sweetness of the wine.</td>
</tr>
<tr>
<td><b>Chlorides</b></td>
<td>Describes the amount of salt present in the wine.</td>
</tr>
<tr>
<td><b>Free Sulfur Dioxide</b></td>
<td>Describes the sulfur dioxide that hasn't reacted with other components in the wine.</td>
</tr>
<tr>
<td><b>Total Sulfur Dioxide</b></td>
<td>Describes the total amount of sulfur dioxide, including the free and bound forms.</td>
</tr>
<tr>
<td><b>Density</b></td>
<td>Describes the mass per unit volume of the wine, which depends mainly on its alcohol and sugar content.</td>
</tr>
<tr>
<td><b>pH</b></td>
<td>A measure of the acidity or basicity of the wine.</td>
</tr>
<tr>
<td><b>Sulfates</b></td>
<td>A type of salt used for preservation in wine, which can also affect its taste.</td>
</tr>
<tr>
<td><b>Alcohol</b></td>
<td>Describes the percentage of alcohol in the wine, which impacts its flavor and body.</td>
</tr>
<tr>
<td><b>Quality</b></td>
<td>Target variable.</td>
</tr>
</table>
<br><br>
The evaluation metric for this competition was the Quadratic Weighted Kappa (QWK) score, which evaluates the agreement between two raters: here, the predicted and the actual values. <br>
The formula for Quadratic Weighted Kappa is given as:<br><br>

$$\kappa = 1 - \frac{\sum_{i,j} w_{ij} O_{ij}}{\sum_{i,j} w_{ij} E_{ij}}$$

<br><br>
where:
- $O$ is the actual confusion matrix.<br>
- $E$ is the expected confusion matrix under randomness.<br>
- $w$ is the weight matrix, which can be calculated as $w_{ij} = (i - j)^2$, where `i` and `j` are the ratings.<br><br>
A score equal to 1 suggests a perfect agreement between the raters, while a score equal to 0 indicates that the agreement is no better than what would be expected by random chance. <br>
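
To make the metric concrete, here is a minimal pure-Python sketch of the score, built from the observed confusion matrix, the expected matrix under randomness, and the quadratic weights described above. The function name `quadratic_weighted_kappa` and the assumption that ratings are consecutive integers are illustrative choices, not taken from the competition code.

```python
def quadratic_weighted_kappa(y_true, y_pred):
    """Illustrative QWK for integer ratings (assumes at least two distinct ratings)."""
    lo = min(min(y_true), min(y_pred))
    hi = max(max(y_true), max(y_pred))
    n = hi - lo + 1

    # Observed confusion matrix: rows are actual ratings, columns are predictions.
    observed = [[0.0] * n for _ in range(n)]
    for t, p in zip(y_true, y_pred):
        observed[t - lo][p - lo] += 1

    total = float(len(y_true))
    row_sums = [sum(row) for row in observed]
    col_sums = [sum(observed[i][j] for i in range(n)) for j in range(n)]

    numerator = 0.0
    denominator = 0.0
    for i in range(n):
        for j in range(n):
            weight = (i - j) ** 2
            # Expected count if actual and predicted ratings were independent.
            expected = row_sums[i] * col_sums[j] / total
            numerator += weight * observed[i][j]
            denominator += weight * expected
    return 1.0 - numerator / denominator


print(quadratic_weighted_kappa([3, 4, 5, 6], [3, 4, 5, 6]))  # perfect agreement
print(quadratic_weighted_kappa([0, 0, 1, 1], [0, 1, 0, 1]))  # chance-level agreement
```

In practice, scikit-learn's `cohen_kappa_score(y_true, y_pred, weights="quadratic")` computes the same quantity.<br>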
For this competition, I built a <b>Pipeline</b> in which the input data is cleaned, new features are added and selected, the categorical features are encoded with scikit-learn's <code>OrdinalEncoder</code> and <code>OneHotEncoder</code>, the data is clustered and, finally, a classification model is trained on the result. <br>
The final model in this pipeline consists of a <code>CatBoostClassifier</code> model, fine-tuned with the <code>Optuna</code> library. <br>
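
The overall shape of such a pipeline can be sketched as follows. This is only an illustration under several assumptions: the engineered columns `alcohol_level` and `acidity_type`, their thresholds, the number of clusters, and the synthetic data are all hypothetical, and `GradientBoostingClassifier` stands in for `CatBoostClassifier` so that the example only needs scikit-learn and pandas. Data cleaning, feature selection and Optuna tuning are omitted; the actual steps are in the Kaggle notebook.

```python
import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.cluster import KMeans
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder


class AddFeatures(TransformerMixin, BaseEstimator):
    """Hypothetical feature engineering: adds two categorical columns."""

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        X = X.copy()
        # Ordered category with illustrative thresholds (not the author's).
        X["alcohol_level"] = pd.cut(
            X["alcohol"], bins=[-np.inf, 10, 12, np.inf],
            labels=["low", "medium", "high"],
        ).astype(str)
        # Unordered category, also illustrative.
        X["acidity_type"] = np.where(X["pH"] < 3.3, "sharp", "soft")
        return X


class AddClusterLabel(TransformerMixin, BaseEstimator):
    """Appends a KMeans cluster label as an extra feature column."""

    def __init__(self, n_clusters=3, random_state=0):
        self.n_clusters = n_clusters
        self.random_state = random_state

    def fit(self, X, y=None):
        self.kmeans_ = KMeans(
            n_clusters=self.n_clusters, n_init=10, random_state=self.random_state
        ).fit(X)
        return self

    def transform(self, X):
        return np.column_stack([X, self.kmeans_.predict(X)])


encode = ColumnTransformer(
    [
        ("ordinal", OrdinalEncoder(categories=[["low", "medium", "high"]]),
         ["alcohol_level"]),
        ("onehot", OneHotEncoder(handle_unknown="ignore"), ["acidity_type"]),
    ],
    remainder="passthrough",  # keep the numeric columns unchanged
    sparse_threshold=0,       # always return a dense array
)

pipeline = Pipeline([
    ("features", AddFeatures()),
    ("encode", encode),
    ("cluster", AddClusterLabel(n_clusters=3)),
    # Stand-in for CatBoostClassifier, to avoid an extra dependency here.
    ("model", GradientBoostingClassifier(n_estimators=20, random_state=0)),
])

# Synthetic data for demonstration only; not the competition dataset.
rng = np.random.default_rng(0)
X = pd.DataFrame({
    "alcohol": rng.uniform(8, 14, 200),
    "pH": rng.uniform(2.9, 3.8, 200),
})
y = rng.integers(3, 9, 200)

preds = pipeline.fit(X, y).predict(X)
print(preds[:10])
```

Keeping every step inside one `Pipeline` object means the same transformations are applied at training and prediction time, which is convenient when the fitted model is later loaded by an app.<br>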
The public score of this model for the competition was <code>QWK = 0.53011</code>. <br>
The whole process, including training and fine-tuning, as well as an extensive EDA with Plotly, can be seen in the following Kaggle notebook: <br>
<a href = "https://www.kaggle.com/code/lusfernandotorres/wine-quality-eda-prediction-and-deploy/notebook"><b>🍷 Wine Quality: EDA, Prediction and Deploy</b></a> <br>
I would love to hear from you! Your feedback is the key to my growth!<br>
*Thank you!*
🧑🏻‍💻 **Luis Fernando Torres** 🧑🏻‍💻
Let's connect!<br>
<a href ="https://www.linkedin.com/in/luuisotorres/">LinkedIn</a> • <a href ="https://medium.com/@luuisotorres">Medium</a> • <a href ="https://www.kaggle.com/lusfernandotorres">Kaggle</a>