kevinhug/clientX

kevinhug committed on Dec 10, 2023

Commit ae5a310

1 Parent(s): 3aa2412

explain

Files changed (1)

app.py +74 -81

app.py CHANGED

@@ -224,93 +224,86 @@ With no need for jargon, SSDS delivers tangible value to our fintech operations.
     df=pd.read_csv("./xgb/re.csv")
     gr.Markdown("""
-    Explain by Dataset
-    =============
-    ![summary](file=./xgb/data.png)
-    sorted feature from top(most importance)
-    dist_subway when at low value(green) make big impact to price
-    dist_store doesnt make much impact to price
-    high age lower the price
-    low age raise the price
-    Explain by Feature
-    =============
-    ![partial_dependence](file=./xgb/feature.png)
-    dist lower than 900 spike the price f(x)
-    also highlighted the shap value for record[20] at around 6500
-    Explain by Record
-    =============
-    ![force](file=./xgb/record.png)
-    the largest contribution to positive price is dist_subway
-    second contribution is age
-    Explain by Instance
-    =============
-    ![dependence](file=./xgb/instance.png)
-    at around 500 dist_subway, it possible for positive impact and negative impact for price
-    over all trend is negative that mean, closer to subway is contribute to higher price
-    there is a point at 6500 far from subway and it has negative impact on price, despite is is close to store(dist_stores)
-    ![1st decision tree](file=./xgb/tree.svg)
-    some how the word doesnt show in web...but this is the first decision tree inside xgboost
-    Explain by Top 5 Error Example
-    =============
-    ![](file=./xgb/error_data.png)
-    top feature for top 5 error is age
-    young age has negative impact on price
-    ![](file=./xgb/error_record.png)
-    top 1 error, negative impact for young age in price
-    ![](file=./xgb/error_feature.png)
-    for top 5 error, it is possible that further from subway will have positive in price
-    ![](file=./xgb/error_instance.png)
-    for top 5 error, it is possible young age have negative impact and old age has positive impact in price
-    ML Observability
-    =============
-    Visualization with Context
-    https://public.tableau.com/app/profile/kevin1619/vizzes
-    Data Validation
-    -----------
-      I led data validation for new data source for legacy model using covariate shift, recall methodology
-      Ensure feature transformation are same in dev and prod environment
-    Unit Testing/Acceptance Testing
-    -----------
-      I led unit testing for model, and discover logical error, improve lift by 50% for small business campaign
-    A/B Testing for lift
-    -----------
-      A/B testing for small business model using statistical approach to ensure lift pass criteria
-    File/Log Mining
-    -----------
-      I led server observability to understand why server was brought down with event journey map.
-    Root Cause Analysis
-    -----------
-      With the right metric in place, I can trace back to the root cause with six sigma methodology
     """)

     df=pd.read_csv("./xgb/re.csv")
     gr.Markdown("""
+Explain by Dataset
+===============
+![Summary](file=./xgb/data.png)
+**Key insights:**
+- **dist_subway** has a significant impact on pricing when at low values (green).
+- **dist_store** demonstrates minimal impact on price.
+- Higher age correlates with lower prices while lower age raises prices.
+Explain by Feature
+===============
+![Partial Dependence](file=./xgb/feature.png)
+**Observations:**
+- The predicted price f(x) spikes for **distances below 900**.
+- The plot also highlights the **SHAP value for record[20]**, at around 6500.
+Explain by Record
+===============
+![Force](file=./xgb/record.png)
+**Contribution to Price:**
+- **dist_subway** holds the largest positive contribution to price.
+- **Age** follows as the second significant contributor.
+Explain by Instance
+===============
+![Dependence](file=./xgb/instance.png)
+**Insights:**
+- Around **500 dist_subway**, there's a potential for both positive and negative impacts on price.
+- Overall trend: closer proximity to the subway correlates with higher prices.
+- An outlier at **6500 distance** from subway negatively impacts price, despite proximity to stores (dist_stores).
+![1st Decision Tree](file=./xgb/tree.svg)
+*Note: the text labels in this SVG do not render in the browser. The image shows the first decision tree inside the XGBoost model.*
+Explain by Top 5 Error Example
+===============
+![Top 5 Error Data](file=./xgb/error_data.png)
+**Top Features for Errors:**
+- **Age** is the top feature for the top 5 errors; young age has a negative impact on price.
+![Error Record](file=./xgb/error_record.png)
+**Top 1 Error:**
+- For the top 1 error, young age has a negative impact on price.
+![Error Feature](file=./xgb/error_feature.png)
+**Insight from Errors:**
+- Further distance from the subway might positively impact pricing for the top 5 errors.
+![Error Instance](file=./xgb/error_instance.png)
+**Error Instances:**
+- Younger age negatively impacts price, while older age positively impacts it for the top 5 errors.
+ML Observability
+===============
+**Visualization with Context:**
+[Tableau Visualization](https://public.tableau.com/app/profile/kevin1619/vizzes)
+**Data Validation:**
+- Led data validation of a new data source for a legacy model, using covariate shift and recall methodology.
+- Ensured consistency in feature transformation between dev and prod environments.
+**Unit Testing/Acceptance Testing:**
+- Led unit testing for the model, identified logical errors, and improved lift by 50% for a small business campaign.
+**A/B Testing for Lift:**
+- Utilized statistical approaches in A/B testing for small business models, ensuring lift met criteria.
+**File/Log Mining:**
+- Led server observability, leveraging event journey maps to understand server downtimes.
+**Root Cause Analysis:**
+- Proficient in employing Six Sigma methodology to trace root causes with established metrics.
     """)
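
The "Explain by Record" text in this diff reads per-feature contributions off a force plot. As a rough illustration of what those contributions mean, the sketch below computes exact Shapley values by brute force for one record. Everything in it is an assumption for illustration: `toy_price` is a made-up stand-in (not the XGBoost model behind `./xgb`), the coefficients, `baseline`, and `record` values are invented, and a single baseline record replaces the background dataset that SHAP-style explainers normally average over. Only the feature names come from the text above.

```python
from itertools import permutations


def toy_price(features):
    """Hypothetical price model for illustration; NOT the model in ./xgb."""
    dist_subway, age, dist_store = features
    price = 45.0
    price -= 0.004 * dist_subway    # farther from the subway lowers the price
    price -= 0.25 * age             # higher age lowers the price
    price -= 0.0005 * dist_store    # store distance barely matters
    if dist_subway < 900:           # invented bump for records near the subway
        price += 8.0
    return price


def shapley_values(predict, record, baseline):
    """Exact Shapley values of `record` relative to a single `baseline` record.

    Averages each feature's marginal contribution over every feature ordering,
    so the cost grows factorially and only suits a handful of features.
    """
    n = len(record)
    totals = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(baseline)
        previous = predict(current)
        for i in order:
            current[i] = record[i]          # switch feature i to the record's value
            updated = predict(current)
            totals[i] += updated - previous
            previous = updated
    return [total / len(orders) for total in totals]


FEATURES = ["dist_subway", "age", "dist_store"]
baseline = [1500.0, 20.0, 400.0]   # invented reference record
record = [300.0, 5.0, 350.0]       # invented record close to the subway

phi = shapley_values(toy_price, record, baseline)
for name, value in sorted(zip(FEATURES, phi), key=lambda pair: -abs(pair[1])):
    print(f"{name:12s} {value:+.3f}")
```

The contributions always sum to `toy_price(record) - toy_price(baseline)`, which is the additivity a force plot visualizes. With these invented numbers `dist_subway` contributes most and `age` second, but that ordering is built into the toy model and says nothing about the real one.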