
base_model:
  - Qwen/Qwen2.5-Coder-32B-Instruct
datasets:
  - luzimu/WebGen-Bench
language:
  - en
library_name: transformers
license: mit
metrics:
  - accuracy
pipeline_tag: text-generation
tags:
  - code-generation

WebGen-LM

WebGen-LM is trained on Bolt.diy trajectories generated from a subset of the WebGen-Bench training set (🤗 luzimu/WebGen-Bench). It was introduced in the paper WebGen-Bench: Evaluating LLMs on Generating Interactive and Functional Websites from Scratch.

Project page: https://webgen-bench.github.io/

The training data and code can be found at WebGen-Bench (GitHub).

The WebGen-LM family comprises the following models:

Models	HF Links
WebGen-LM-7B	🤗 luzimu/WebGen-LM-7B
WebGen-LM-14B	🤗 luzimu/WebGen-LM-14B
WebGen-LM-32B	🤗 luzimu/WebGen-LM-32B

Sample Usage

You can use this model with the transformers library for instruction-based code generation.

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "luzimu/WebGen-LM-32B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

messages = [
    {"role": "user", "content": "Write HTML, CSS, and JavaScript for a simple to-do list web application. The list should allow users to add and remove items."},
]

chat_input = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

model_inputs = tokenizer([chat_input], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,  # pass input_ids together with the attention mask
    max_new_tokens=2048,
    do_sample=True,
    temperature=0.7,
    top_p=0.95
)

# Decode only the newly generated tokens
output_text = tokenizer.decode(generated_ids[0][model_inputs.input_ids.shape[1]:], skip_special_tokens=True)
print(output_text)
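
The generated text usually mixes explanation with code, so a small post-processing step can help before saving files. The following is a minimal sketch, assuming the model answers a plain prompt like the one above with Markdown-fenced code blocks; that output format is an assumption, not something this model card documents, and `extract_code_blocks` is a hypothetical helper rather than part of transformers or WebGen-Bench.

```python
import re

# Build the fence marker programmatically to keep this snippet easy to embed.
FENCE = "`" * 3

# Matches: opening fence, optional language tag, newline, body, closing fence.
FENCE_RE = re.compile(FENCE + r"([\w+-]*)[ \t]*\n(.*?)" + FENCE, re.DOTALL)


def extract_code_blocks(text):
    """Return a list of (language, code) pairs for each fenced block in `text`.

    Assumes Markdown-style fences; blocks without a language tag are
    labelled "text".
    """
    return [
        (lang.lower() or "text", code.strip("\n"))
        for lang, code in FENCE_RE.findall(text)
    ]


if __name__ == "__main__":
    # Stand-in for `output_text` from the generation example.
    sample = (
        "Here is the page:\n"
        + FENCE + "html\n<ul id=\"todo\"></ul>\n" + FENCE + "\n"
        + "And the script:\n"
        + FENCE + "javascript\nconsole.log('ready');\n" + FENCE + "\n"
    )
    for lang, code in extract_code_blocks(sample):
        print(lang, "->", len(code), "characters")
```

In practice you would call `extract_code_blocks(output_text)` on the decoded output and write each block to a file of your choosing. If the model is prompted in a different format (for example, with a Bolt.diy-style system prompt), the parsing would need to be adapted.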

Performance on WebGen-Bench

Citation

If you find our project useful, please cite:

@misc{lu2025webgenbenchevaluatingllmsgenerating,
      title={WebGen-Bench: Evaluating LLMs on Generating Interactive and Functional Websites from Scratch}, 
      author={Zimu Lu and Yunqiao Yang and Houxing Ren and Haotian Hou and Han Xiao and Ke Wang and Weikang Shi and Aojun Zhou and Mingjie Zhan and Hongsheng Li},
      year={2025},
      eprint={2505.03733},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2505.03733}, 
}