MartialTerran
/

Contract-Enforced_Collaborative_Supply_Chain_Forecasting_Model.py

MartialTerran committed 8 days ago

Commit

8949a60

verified ·

1 Parent(s): c333f60

Create Model_Inputs+Outputs.md

Model_Inputs+Outputs.md +170 -0

	@@ -0,0 +1,170 @@

+Here we define the inputs and outputs of the "black box" Transformer-based forecasting model (Enhanced_Business_Model_for_Collaborative_Predictive_Supply_Chain_model.py) within this collaborative supply chain context.
+We categorize them for clarity and provide details on their format and expected characteristics.
+This breakdown specifies the model's data requirements and expected results, and it is the basis for defining data preprocessing steps, model architecture, and evaluation metrics.
+**I. Inputs**
+The inputs are all the data fed into the Transformer model to generate the forecasts. Since we're aiming for a comprehensive and dynamic system, the inputs are diverse and can be grouped into several categories:
+**A. Historical Sales Data:**
+*   **Description:** Time-series data of past sales, at the most granular level possible (ideally SKU-store-day).
+*   **Format:**
+    *   **Structure:** Typically a tabular format (e.g., CSV, Parquet, database table).  Could also be a tensor if pre-processed for the Transformer.
+    *   **Columns:**
+        *   `timestamp`: Date and time of the sale (e.g., `YYYY-MM-DD HH:MM:SS` or a Unix timestamp).
+        *   `sku`: Stock Keeping Unit (unique product identifier).
+        *   `store_id`: Identifier for the store location.
+        *   `quantity`: Number of units sold.
+        *   `price`: Unit price at the time of sale.
+        *   `discount`: Any discount applied (amount or percentage).
+*   **Characteristics:**
+    *   High frequency (daily or even hourly).
+    *   Potentially millions or billions of rows.
+    *   May exhibit seasonality, trends, and noise.
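To make the SKU-store-day grain concrete, here is a minimal sketch that rolls raw transaction rows up to one total per SKU, store, and calendar day. The sample rows and the `aggregate_daily` helper are hypothetical illustrations; only the column names come from the list above, and a production pipeline would likely use a dataframe or SQL engine rather than plain dictionaries.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical raw transaction rows using the columns listed above.
rows = [
    {"timestamp": "2024-07-04 09:15:00", "sku": "123", "store_id": "A",
     "quantity": 2, "price": 4.99, "discount": 0.0},
    {"timestamp": "2024-07-04 17:40:00", "sku": "123", "store_id": "A",
     "quantity": 3, "price": 4.99, "discount": 0.5},
    {"timestamp": "2024-07-05 10:05:00", "sku": "123", "store_id": "A",
     "quantity": 1, "price": 4.99, "discount": 0.0},
]


def aggregate_daily(rows):
    """Sum units sold per (sku, store_id, calendar day)."""
    daily = defaultdict(int)
    for row in rows:
        day = datetime.strptime(row["timestamp"], "%Y-%m-%d %H:%M:%S").date()
        daily[(row["sku"], row["store_id"], day)] += row["quantity"]
    return dict(daily)


daily_sales = aggregate_daily(rows)
for key, units in sorted(daily_sales.items()):
    print(key, units)
```

Aggregating before modeling keeps the series length manageable when the raw table has millions of rows.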
+**B. Promotional Data:**
+*   **Description:** Information about past, current, and *planned* promotional activities.
+*   **Format:**
+    *   **Structure:** Tabular format.
+    *   **Columns:**
+        *   `promotion_id`: Unique identifier for the promotion.
+        *   `sku`:  SKU(s) included in the promotion.
+        *   `store_id`: Store(s) where the promotion is active.
+        *   `start_date`: Start date of the promotion.
+        *   `end_date`: End date of the promotion.
+        *   `promotion_type`:  Type of promotion (e.g., "BOGO," "percentage discount," "fixed price discount," "coupon").
+        *   `discount_value`:  Value of the discount, interpreted according to `promotion_type` (e.g., 0.2 for a 20% discount, 5.00 for a $5 discount).
+        *   `marketing_spend`:  (Optional) Amount spent on advertising for the promotion.
+*   **Characteristics:**
+    *   Less frequent than sales data.
+    *   Should include *future* planned promotions, which are crucial for forecasting.
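One way planned promotions can reach the model is as a per-day flag that extends past the last observed sales date. The sketch below assumes `end_date` is inclusive and uses a hypothetical `promo_flags` helper with made-up promotion rows; the actual encoding used by the model is not specified in this document.

```python
from datetime import date, timedelta

# Hypothetical promotion rows; P2 stands in for a planned (future) promotion.
promotions = [
    {"promotion_id": "P1", "sku": "123", "store_id": "A",
     "start_date": date(2024, 7, 1), "end_date": date(2024, 7, 4),
     "promotion_type": "percentage discount", "discount_value": 0.2},
    {"promotion_id": "P2", "sku": "123", "store_id": "A",
     "start_date": date(2024, 7, 6), "end_date": date(2024, 7, 7),
     "promotion_type": "BOGO", "discount_value": 0.5},
]


def promo_flags(promotions, sku, store_id, first_day, n_days):
    """Return one 0/1 flag per day: 1 if any promotion covers that day.

    Assumption: end_date is inclusive.
    """
    flags = []
    for offset in range(n_days):
        day = first_day + timedelta(days=offset)
        active = any(
            p["sku"] == sku
            and p["store_id"] == store_id
            and p["start_date"] <= day <= p["end_date"]
            for p in promotions
        )
        flags.append(int(active))
    return flags


# Flags for 2024-07-03 through 2024-07-07.
flags = promo_flags(promotions, "123", "A", date(2024, 7, 3), 5)
print(flags)  # [1, 1, 0, 1, 1]
```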
+**C. Inventory Data:**
+*   **Description:**  Information about current and historical inventory levels.
+*   **Format:**
+    *   **Structure:** Tabular format.
+    *   **Columns:**
+        *   `timestamp`: Date and time of the inventory snapshot.
+        *   `sku`: Stock Keeping Unit.
+        *   `store_id`: Store location (or warehouse ID for wholesalers).
+        *   `quantity_on_hand`: Number of units currently in stock.
+        *   `quantity_on_order`: Number of units ordered but not yet received.
+        *   `reorder_point`:  (Optional) The inventory level at which a new order should be placed.
+        *   `safety_stock`:  (Optional) Minimum stock level to keep on hand.
+*   **Characteristics:**
+    *   Frequency can vary (daily, weekly).
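The columns above map naturally onto a small record type. The following sketch is illustrative only: the `InventorySnapshot` class and `needs_reorder` helper are hypothetical, and comparing on-hand plus on-order stock against `reorder_point` is one common convention that this document does not prescribe.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InventorySnapshot:
    """One inventory row; optional columns default to None."""
    timestamp: str
    sku: str
    store_id: str
    quantity_on_hand: int
    quantity_on_order: int
    reorder_point: Optional[int] = None
    safety_stock: Optional[int] = None


def needs_reorder(snap: InventorySnapshot) -> bool:
    """True when on-hand plus on-order stock is at or below the reorder point.

    Assumption: the reorder check uses on-hand plus on-order units.
    Returns False when no reorder point is recorded.
    """
    if snap.reorder_point is None:
        return False
    position = snap.quantity_on_hand + snap.quantity_on_order
    return position <= snap.reorder_point


low = InventorySnapshot("2024-07-04 00:00:00", "123", "A", 12, 0, reorder_point=20)
covered = InventorySnapshot("2024-07-04 00:00:00", "123", "A", 12, 40, reorder_point=20)
print(needs_reorder(low), needs_reorder(covered))  # True False
```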
+**D. External Factors:**
+*   **Description:** Data that is not directly related to sales or inventory but can influence demand.
+*   **Format:**
+    *   **Structure:** Can be tabular or time-series data from various sources.
+    *   **Examples:**
+        *   **Economic Indicators:**  GDP growth, unemployment rate, consumer confidence index, inflation rate. (Typically time-series data from government sources or financial data providers.)
+        *   **Weather Data:**  Temperature, precipitation, forecasts. (Time-series data from weather APIs.)
+        *   **Holiday/Event Indicators:**  Binary indicators (0 or 1) for holidays, major events, school breaks. (Typically a pre-defined calendar.)
+        *   **Social Media Sentiment:**  Aggregated sentiment scores related to the product or brand. (Requires text processing and sentiment analysis.)
+        *   **Web Traffic Data:**  Website visits, product page views, search queries. (Data from web analytics platforms.)
+        *   **Competitor Data:**  Pricing and promotional activity of competitors (if available, often through web scraping or third-party data providers).
+*   **Characteristics:**
+    *   Varying frequencies and formats depending on the source.
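Because these sources arrive at different frequencies, they usually have to be aligned to the forecasting grain before use. The sketch below forward-fills a monthly indicator onto daily rows and adds a binary holiday flag. The index values, the holiday set, and the `daily_features` helper are hypothetical, and forward-filling is only one possible alignment choice.

```python
from datetime import date, timedelta

# Hypothetical monthly readings keyed by the date they become available.
monthly_index = {date(2024, 6, 1): 100.4, date(2024, 7, 1): 101.3}
# Hypothetical pre-defined holiday calendar.
holidays = {date(2024, 7, 4)}


def daily_features(first_day, n_days, monthly_index, holidays):
    """Build one feature row per day, forward-filling the latest monthly value."""
    published = sorted(monthly_index)
    rows = []
    for offset in range(n_days):
        day = first_day + timedelta(days=offset)
        # Use only readings available on or before this day.
        known = [d for d in published if d <= day]
        value = monthly_index[known[-1]] if known else None
        rows.append({
            "date": day,
            "consumer_confidence": value,
            "is_holiday": int(day in holidays),
        })
    return rows


features = daily_features(date(2024, 6, 29), 7, monthly_index, holidays)
for row in features:
    print(row["date"], row["consumer_confidence"], row["is_holiday"])
```

Restricting each day to readings already available on that day avoids leaking future information into historical rows.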
+**E. Product Metadata:**
+*   **Description:**  Static information about the products.
+*   **Format:**
+    *   **Structure:** Tabular format.
+    *   **Columns:**
+        *   `sku`: Stock Keeping Unit.
+        *   `product_category`:  Category the product belongs to.
+        *   `product_subcategory`:  Subcategory.
+        *   `brand`:  Brand name.
+        *   `product_description`:  Textual description (may be used for embeddings).
+        *   `price_tier`: (Optional) Categorization based on price (e.g., "economy," "mid-range," "premium").
+*   **Characteristics:**
+    *   Relatively static; changes infrequently.
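Categorical metadata such as `product_category` and `brand` is commonly turned into integer ids before being fed to an embedding layer. The sketch below shows that step under stated assumptions: the product rows, the `build_vocab` and `encode` helpers, and the convention of reserving id 0 for unseen values are all hypothetical.

```python
def build_vocab(values):
    """Map each distinct value to an integer id starting at 1.

    Id 0 is left free so that unseen values can be mapped to it.
    """
    return {value: idx for idx, value in enumerate(sorted(set(values)), start=1)}


# Hypothetical product metadata rows.
products = [
    {"sku": "123", "product_category": "beverages", "brand": "Acme"},
    {"sku": "456", "product_category": "snacks", "brand": "Acme"},
    {"sku": "789", "product_category": "beverages", "brand": "Zest"},
]

category_vocab = build_vocab(p["product_category"] for p in products)
brand_vocab = build_vocab(p["brand"] for p in products)


def encode(product):
    """Return (category_id, brand_id), using 0 for values not in the vocabulary."""
    return (
        category_vocab.get(product["product_category"], 0),
        brand_vocab.get(product["brand"], 0),
    )


encoded = {p["sku"]: encode(p) for p in products}
print(encoded)  # {'123': (1, 1), '456': (2, 1), '789': (1, 2)}
```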
+**F. Store Metadata:**
+*   **Description:** Static information about each store.
+*   **Format:**
+    *   **Structure:** Tabular format.
+    *   **Columns:**
+        *   `store_id`: Unique store identifier.
+        *   `location`: City and state.
+        *   `store_type`: Store format (physical, online, or mixed).
+*   **Characteristics:**
+    *   Relatively static; changes infrequently.
+**II. Outputs**
+The outputs are the forecasts generated by the Transformer model.
+**A. Probabilistic Forecasts:**
+*   **Description:**  Instead of a single point forecast (e.g., "we will sell 100 units"), the model provides a *probability distribution* of future demand. This quantifies the uncertainty in the forecast.
+*   **Format:**
+    *   **Structure:**  Typically a set of quantiles (or percentiles) for each SKU-store-future time period.
+    *   **Example:**  For SKU 123, store A, on 2024-07-04, the model might output:
+        *   `p10`: 80 units (10th percentile - there's a 10% chance demand will be 80 units or less)
+        *   `p50`: 105 units (50th percentile - median forecast)
+        *   `p90`: 130 units (90th percentile - there's a 90% chance demand will be 130 units or less)
+        *   ...and other quantiles as needed (e.g., p25, p75, p95, p99).
+*   **Characteristics:**
+    *   Provides a range of possible outcomes, allowing for risk-aware decision-making.
+    *   Allows for calculation of prediction intervals (e.g., p10 to p90 spans a central 80% interval).
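The example quantiles above can be held in a small container that also checks basic consistency. This is a minimal sketch, not the model's actual output type: the `QuantileForecast` class and its methods are hypothetical, and the numbers are the ones from the SKU 123 example.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class QuantileForecast:
    """Quantile forecast for one SKU, store, and future date."""
    sku: str
    store_id: str
    date: str
    quantiles: Dict[float, float]  # quantile level (0..1) -> units

    def is_monotonic(self) -> bool:
        """Quantile values must not decrease as the level increases."""
        values = [self.quantiles[q] for q in sorted(self.quantiles)]
        return all(a <= b for a, b in zip(values, values[1:]))

    def interval(self, lower: float, upper: float):
        """Return (low, high, coverage) for the interval between two levels."""
        return self.quantiles[lower], self.quantiles[upper], upper - lower


forecast = QuantileForecast("123", "A", "2024-07-04", {0.1: 80, 0.5: 105, 0.9: 130})
low, high, coverage = forecast.interval(0.1, 0.9)
print(f"{coverage:.0%} interval: {low} to {high} units")  # 80% interval: 80 to 130 units
```

The monotonicity check is worth running on real outputs, since independently predicted quantiles can cross.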
+**B. Forecast Horizon:**
+*   **Description:** The length of time into the future for which the model generates forecasts.
+*   **Format:**
+    *   Defined by the model configuration and the needs of the business.  Could be days, weeks, or months.
+    *   Typically specified as a number of time steps (e.g., 28 days, 12 weeks).
+*   **Characteristics:**
+    *   Longer horizons generally have greater uncertainty.
+**C. Forecast Granularity:**
+*   **Description:**  The level of detail at which the forecasts are generated (SKU-store-day, SKU-region-week, etc.).
+*   **Format:**
+    *   Determined by the model and the available data.
+    *   Should align with the business needs (e.g., retailers need store-level forecasts, while wholesalers might need regional forecasts).
+**D. Forecast Timestamps:**
+*   **Description:**  The specific dates and times for which the forecasts are generated.
+*   **Format:**
+    *   A list or sequence of timestamps corresponding to the forecast horizon and granularity.
+    *   Example: `[2024-07-04, 2024-07-05, 2024-07-06, ...]`
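Horizon, granularity, and timestamps are linked: given a start date, a number of steps, and a step size, the timestamp list follows directly. The `forecast_dates` helper below is a hypothetical illustration that covers only day-based granularities.

```python
from datetime import date, timedelta


def forecast_dates(start, horizon_steps, step_days=1):
    """Return the dates covered by a forecast of `horizon_steps` steps.

    `step_days` encodes the time granularity: 1 for daily, 7 for weekly.
    """
    return [start + timedelta(days=i * step_days) for i in range(horizon_steps)]


daily = forecast_dates(date(2024, 7, 4), 28)                # 28-day horizon
weekly = forecast_dates(date(2024, 7, 4), 12, step_days=7)  # 12-week horizon

print(daily[0], daily[-1])    # 2024-07-04 2024-07-31
print(weekly[0], weekly[-1])  # 2024-07-04 2024-09-19
```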
+**E. (Optional) Explainability Outputs:**
+*   **Description:**  Outputs that help explain *why* the model made a particular forecast.  This is especially important for building trust and understanding.
+*   **Format:**
+    *   **Attention Weights:**  For Transformer models, the attention weights can be visualized to show which parts of the input sequence the model weighted most heavily for a prediction. High attention suggests importance but does not prove it.
+    *   **Feature Importance Scores:**  Estimates of the relative importance of different input features.
+    *   **SHAP Values:**  Per-feature attributions based on Shapley values, which decompose an individual prediction into additive contributions.
+*   **Characteristics:**
+    *   Can be complex to interpret, but provide valuable insights.
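To illustrate what a Shapley-style attribution looks like, the sketch below computes exact Shapley values against a single baseline input by averaging marginal contributions over every feature ordering. Everything here is an assumption for illustration: `toy_demand` is a hand-written stand-in for the forecasting model, the feature names are made up, and this brute-force approach is factorial in the number of features, so real tooling relies on approximations instead.

```python
from itertools import permutations


def shapley_values(predict, instance, baseline):
    """Exact Shapley values relative to a single baseline input.

    Averages each feature's marginal contribution over all feature orderings.
    """
    names = list(instance)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)
        previous = predict(current)
        for name in order:
            # Switch this feature from its baseline value to the instance value.
            current[name] = instance[name]
            value = predict(current)
            totals[name] += value - previous
            previous = value
    return {name: total / len(orderings) for name, total in totals.items()}


def toy_demand(x):
    """Hypothetical stand-in for the model, with a promo-holiday interaction."""
    return (100 + 30 * x["promo_active"] + 2 * x["temperature_c"]
            + 10 * x["promo_active"] * x["is_holiday"])


instance = {"promo_active": 1, "temperature_c": 25, "is_holiday": 1}
baseline = {"promo_active": 0, "temperature_c": 20, "is_holiday": 0}
contributions = shapley_values(toy_demand, instance, baseline)
print(contributions)
```

The contributions sum to the difference between the prediction for `instance` and the prediction for `baseline`, and the interaction term is split evenly between the two features that create it.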
+**Summary Table:**
+| Category         | Description                                                                      | Format                                    | Characteristics                                                                 |
+| ---------------- | -------------------------------------------------------------------------------- | ------------------------------------------ | ------------------------------------------------------------------------------- |
+| **Inputs**       |                                                                                  |                                            |                                                                                 |
+| Historical Sales | Past sales data (SKU-store-day level)                                           | Tabular (timestamp, sku, store_id, quantity, price, discount) | High frequency, potentially large, may exhibit seasonality/trends/noise.       |
+| Promotional Data | Past, current, and *planned* promotions                                          | Tabular (promotion_id, sku, store_id, start/end dates, type, value, spend) | Less frequent than sales data, includes future promotions.                       |
+| Inventory Data   | Current and historical inventory levels                                         | Tabular (timestamp, sku, store_id/warehouse_id, quantity_on_hand, quantity_on_order, reorder_point, safety_stock) | Frequency varies (daily, weekly).                                                 |
+| External Factors | Economic indicators, weather, holidays, social media, web traffic, competitors | Tabular or time-series (various)          | Varying frequencies and formats.                                                |
+| Product Metadata | Static information about products                                               | Tabular (sku, category, subcategory, brand, description, price_tier)      | Relatively static.                                                               |
+| Store Metadata   | Static information about stores                                                  | Tabular (store_id, location, store_type)   | Relatively static.                                                              |
+| **Outputs**      |                                                                                  |                                            |                                                                                 |
+| Probabilistic Forecasts | Probability distribution of future demand         | Set of quantiles (p10, p50, p90, etc.) for each SKU-store-future time period | Provides a range of outcomes, quantifies uncertainty.              |
+| Forecast Horizon   | Length of time into the future                        | Number of time steps (days, weeks, months)                                  | Longer horizons have greater uncertainty.                             |
+| Forecast Granularity| Level of detail (SKU-store-day, SKU-region-week, etc.) | Determined by model and business needs                                      | Aligns with business requirements.                                   |
+| Forecast Timestamps | Dates/times for which forecasts are generated       | List/sequence of timestamps                                                   | Corresponds to horizon and granularity.                              |
+| Explainability (Optional) | Outputs that explain model predictions            | Attention weights, feature importance scores, SHAP values                   | Complex to interpret, but provide valuable insights.                 |