---
title: Model Evaluator
emoji: π
colorFrom: red
colorTo: red
sdk: streamlit
sdk_version: 1.10.0
app_file: app.py
---
# Model Evaluator

> Submit evaluation jobs to AutoTrain from the Hugging Face Hub

## Supported tasks

The table below shows which tasks are currently supported for evaluation in the AutoTrain backend:
| Task                               | Supported |
|:-----------------------------------|:---------:|
| `binary_classification`            | ✅        |
| `multi_class_classification`       | ✅        |
| `multi_label_classification`       | ✅        |
| `entity_extraction`                | ✅        |
| `extractive_question_answering`    | ✅        |
| `translation`                      | ✅        |
| `summarization`                    | ✅        |
| `image_binary_classification`      | ✅        |
| `image_multi_class_classification` | ✅        |
| `text_zero_shot_evaluation`        | ✅        |
## Installation

To run the application locally, first clone this repository and install the dependencies as follows:

```
pip install -r requirements.txt
```
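
If you prefer to keep the dependencies isolated, a standard virtual environment works; this is optional and not prescribed by the repo:

```
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```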
Next, copy the example file of environment variables:

```
cp .env.template .env
```

and set the `HF_TOKEN` variable with a valid API token from the [`autoevaluator`](https://huggingface.co/autoevaluator) bot user. Finally, spin up the application by running:
```
streamlit run app.py
```
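
For reference, a filled-in `.env` might look like the sketch below. The token value is a placeholder, and the `AUTOTRAIN_BACKEND_API` line is an assumption based on the production configuration described further down; check `.env.template` for the variables the app actually expects:

```
# Sketch of a filled-in .env -- the token below is a placeholder
HF_TOKEN=hf_xxxxxxxxxxxxxxxxxxxxxxxx
AUTOTRAIN_BACKEND_API=https://api.autotrain.huggingface.co
```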
## Usage

Evaluation on the Hub involves two main steps:

1. Submitting an evaluation job via the UI. This creates an AutoTrain project with `N` models for evaluation. At this stage, the dataset is also processed and prepared for evaluation.
2. Triggering the evaluation itself once the dataset is processed.

From the user's perspective, only step (1) is needed, since step (2) is handled by a cron job on GitHub Actions that executes the `run_evaluation_jobs.py` script every 15 minutes (see the workflow sketch below).
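
For concreteness, a scheduled workflow of this shape would do the job. This is a hedged sketch, not the repository's actual workflow file: the workflow and step names are illustrative, and the secret names follow the ones mentioned elsewhere in this README.

```yaml
# .github/workflows/run-evaluation-jobs.yml -- illustrative sketch only
name: Run evaluation jobs

on:
  schedule:
    - cron: "*/15 * * * *"  # every 15 minutes

jobs:
  run-evaluation-jobs:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - run: pip install -r requirements.txt
      - name: Approve and launch ready evaluation jobs
        run: python run_evaluation_jobs.py
        env:
          HF_TOKEN: ${{ secrets.HF_TOKEN }}
          AUTOTRAIN_BACKEND_API: ${{ secrets.AUTOTRAIN_BACKEND_API }}
```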
See below for details on manually triggering evaluation jobs.
### Triggering an evaluation

To evaluate the models in an AutoTrain project, run:

```
python run_evaluation_jobs.py
```
This will download the [`autoevaluate/evaluation-job-logs`](https://huggingface.co/datasets/autoevaluate/evaluation-job-logs) dataset from the Hub and check which projects are ready for evaluation (i.e. those whose dataset has been processed).
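
In case it helps to see the download step in code, it is the kind of thing `huggingface_hub` handles directly. This is a sketch of that step only, not the script's actual implementation:

```python
from huggingface_hub import snapshot_download

# Download the evaluation-job-logs dataset repo from the Hub.
# Assumes a valid token is available (e.g. via `huggingface-cli login`
# or the HF_TOKEN environment variable); pass token=... to be explicit.
local_dir = snapshot_download(
    repo_id="autoevaluate/evaluation-job-logs",
    repo_type="dataset",
)
print(f"Logs downloaded to: {local_dir}")
```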
## AutoTrain configuration details

Models are evaluated by the [`autoevaluator`](https://huggingface.co/autoevaluator) bot user in AutoTrain, with payloads sent to the endpoint defined by the `AUTOTRAIN_BACKEND_API` environment variable. Evaluation projects are created and run in either the `prod` or `staging` environment. You can view the status of projects in the AutoTrain UI by navigating to one of the links below (ask internally for access to the staging UI):
| AutoTrain environment | AutoTrain UI URL                                                                                                | `AUTOTRAIN_BACKEND_API`                      |
|:---------------------:|:---------------------------------------------------------------------------------------------------------------:|:--------------------------------------------:|
| `prod`                | [`https://ui.autotrain.huggingface.co/projects`](https://ui.autotrain.huggingface.co/projects)                 | https://api.autotrain.huggingface.co         |
| `staging`             | [`https://ui-staging.autotrain.huggingface.co/projects`](https://ui-staging.autotrain.huggingface.co/projects) | https://api-staging.autotrain.huggingface.co |
The current configuration for evaluation jobs running on [Spaces](https://huggingface.co/spaces/autoevaluate/model-evaluator) is:

```
AUTOTRAIN_BACKEND_API=https://api.autotrain.huggingface.co
```
To evaluate models with a _local_ instance of AutoTrain, change the environment to:

```
AUTOTRAIN_BACKEND_API=http://localhost:8000
```
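
If you want to test against a local backend without editing `.env`, one option is to set the variable inline for a single run; this assumes the app reads it from the process environment and that an already-set variable takes precedence over `.env`:

```
AUTOTRAIN_BACKEND_API=http://localhost:8000 streamlit run app.py
```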
### Migrating from staging to production (and vice versa)

In general, evaluation jobs should run in AutoTrain's `prod` environment, which is defined by the following environment variable:

```
AUTOTRAIN_BACKEND_API=https://api.autotrain.huggingface.co
```
However, there are times when it is necessary to run evaluation jobs in AutoTrain's `staging` environment (e.g. because a new evaluation pipeline is being deployed). In these cases the corresponding environment variable is:
```
AUTOTRAIN_BACKEND_API=https://api-staging.autotrain.huggingface.co
```
To migrate between these two environments, update the `AUTOTRAIN_BACKEND_API` variable in two places:

* In the [repo secrets](https://huggingface.co/spaces/autoevaluate/model-evaluator/settings) associated with the `model-evaluator` Space. This ensures evaluation projects are created in the desired environment.
* In the [GitHub Actions secrets](https://github.com/huggingface/model-evaluator/settings/secrets/actions) associated with this repo. This ensures that the correct evaluation jobs are approved and launched via the `run_evaluation_jobs.py` script.