btrabucco committed 12 days ago

Commit

4b1aeed

verified ·

1 Parent(s): 703426b

Update README.md

Files changed (1)

README.md CHANGED

@@ -36,3 +36,18 @@ The predominant approach for training web navigation agents gathers human demons
 ![Training Results](https://data-for-agents.github.io/static/images/training-results.png)
 **Training agents with internet-scale data.** In data-limited settings derived from Mind2Web and WebLINX (left plot), we improve Step Accuracy by up to +89.5% and +122.1% respectively for agents trained on mixtures of data from our pipeline, and human data. When training agents with all available human data from these benchmarks (right plot), agents trained on existing human data struggle to generalize to diverse real sites, and adding our data improves their generalization by +149.0% for WebLINX and +156.3% for Mind2Web.

+## Citing Us
+Please cite our work using the following bibtex:
+```
+@misc{Trabucco2025InSTA,
+  title={InSTA: Towards Internet-Scale Training For Agents},
+  author={Brandon Trabucco and Gunnar Sigurdsson and Robinson Piramuthu and Ruslan Salakhutdinov},
+  year={2025},
+  eprint={2502.06776},
+  archivePrefix={arXiv},
+  primaryClass={cs.LG},
+}
+```