Improve model card: Add pipeline tag, library name, and usage example (#1)
Co-authored-by: Niels Rogge <[email protected]>
README.md
---
datasets:
- michaelyuanqwq/roboseg
license: mit
pipeline_tag: image-to-image
library_name: diffusers
---

<h1> RoboEngine: Plug-and-Play Robot Data Augmentation with Semantic Robot Segmentation and Background Generation </h1>

The BG-Diffusion checkpoints of "RoboEngine: Plug-and-Play Robot Data Augmentation with Semantic Robot Segmentation and Background Generation".

Please check out https://github.com/michaelyuanqwq/roboengine for more details.

## Usage

This model is a ControlNet model compatible with the `diffusers` library, designed for background generation in robot scenes. It works by taking a conditioning image (e.g., the robot on a black background, produced with a segmentation mask) and generating a new background from a text prompt.

```python
from diffusers import ControlNetModel, StableDiffusionXLControlNetPipeline
from PIL import Image, ImageDraw
import torch

# Load the ControlNet model
controlnet = ControlNetModel.from_pretrained(
    "michaelyuanqwq/roboengine-bg-diffusion", torch_dtype=torch.float16
)

# Load a base Stable Diffusion XL pipeline (this ControlNet is designed for SDXL)
pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", controlnet=controlnet, torch_dtype=torch.float16
)
pipe.to("cuda")

# Prepare your input image and segmentation mask.
# In a real application, these would come from your dataset or a segmentation model;
# the 'michaelyuanqwq/roboseg' dataset provides examples.
# For demonstration, create a dummy input image and a dummy robot mask
# (white for robot, black for background).
input_image = Image.new("RGB", (768, 768), color="red")  # placeholder for your robot image
mask = Image.new("L", (768, 768), color="black")         # placeholder for your robot mask
draw = ImageDraw.Draw(mask)
draw.ellipse((100, 100, 668, 668), fill="white")  # white ellipse as a dummy robot mask

# Create the conditioning image for ControlNet: the robot on a black background.
# ControlNet preserves the masked robot region and generates new content for the
# black (background) region.
black_bg = Image.new("RGB", input_image.size, (0, 0, 0))
control_image = Image.composite(input_image, black_bg, mask)

# Define your text prompt for the new background
prompt = "A robot arm working in a futuristic lab with neon lights, high detail, photorealistic"
negative_prompt = "blurry, low quality, bad anatomy, deformed"

# Generate the image
generator = torch.Generator(device="cuda").manual_seed(42)  # for reproducible results
output_image = pipe(
    prompt=prompt,
    image=control_image,  # the conditioning image derived from the mask
    negative_prompt=negative_prompt,
    num_inference_steps=25,
    generator=generator,
    guidance_scale=7.5,
).images[0]

# Save the generated image
output_image.save("generated_robot_scene.png")
print("Generated image saved as generated_robot_scene.png")
```
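
For data augmentation you typically want the robot pixels themselves left untouched. As a minimal sketch (this post-processing step is an assumption, not part of the original card), you can composite the original robot back over the generated background using the same mask:

```python
# Hypothetical post-processing: paste the original robot pixels back over the
# generated background, so only the background is replaced.
background = output_image.resize(input_image.size)  # SDXL outputs 1024x1024 by default
augmented = Image.composite(input_image, background, mask)
augmented.save("augmented_robot_scene.png")
```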

### BibTex
```