btrabucco committed on
Commit 4b1aeed · verified · 1 Parent(s): 703426b

Update README.md

  1. README.md +15 -0
README.md CHANGED
@@ -36,3 +36,18 @@ The predominant approach for training web navigation agents gathers human demons
  ![Training Results](https://data-for-agents.github.io/static/images/training-results.png)
 
  **Training agents with internet-scale data.** In data-limited settings derived from Mind2Web and WebLINX (left plot), we improve Step Accuracy by up to +89.5% and +122.1% respectively for agents trained on mixtures of data from our pipeline, and human data. When training agents with all available human data from these benchmarks (right plot), agents trained on existing human data struggle to generalize to diverse real sites, and adding our data improves their generalization by +149.0% for WebLINX and +156.3% for Mind2Web.
+
+ ## Citing Us
+
+ Please cite our work using the following bibtex:
+
+ ```
+ @misc{Trabucco2025InSTA,
+ title={InSTA: Towards Internet-Scale Training For Agents},
+ author={Brandon Trabucco and Gunnar Sigurdsson and Robinson Piramuthu and Ruslan Salakhutdinov},
+ year={2025},
+ eprint={2502.06776},
+ archivePrefix={arXiv},
+ primaryClass={cs.LG},
+ }
+ ```