Space: prasannareddyp/SPM

prasannareddyp committed on Sep 3

Commit c3d8cc1 (verified) · 1 parent: c0b47d8

Upload 4 files

Files changed (3)

README.md +11 -9
app.py +13 -4
spm.py +161 -37

README.md CHANGED

@@ -1,36 +1,38 @@
 ---
-title: SPM
-emoji: 🚀
-colorFrom: green
-colorTo: pink
 sdk: gradio
 sdk_version: 5.44.1
 app_file: app.py
 pinned: false
 license: mit
-short_description: Shuffle PatchMix Augmentation
 ---
 arxiv.org/abs/2505.24216
 # Shuffle PatchMix (SPM) — Hugging Face Space
-A minimal interactive demo for SPM-style augmentation. Upload an image (or a .zip of images), set **grid size (N×N)**, and download the augmented outputs.
 GitHub repo: https://github.com/PrasannaPulakurthi/SPM
 ## Parameters
 - **Grid (N×N):** Choose one of **2×2, 4×4, 8×8, 16×16**. The image is cropped (top-left) so its width and height are divisible by N.
 - **Mix probability:** Per-patch probability to mix original and a shuffled patch.
-- **Beta α, β:** Shape parameters for a single per-image alpha sampled from Beta(α,β).
 - **Seed:** Optional deterministic seed.
 ## Batch Mode
 Upload a `.zip` containing images (`.png`, `.jpg`, `.jpeg`). The app returns a `.zip` of augmented results with the same folder structure.
 ## Notes
-- This uses a **global** patch permutation and per-patch mixing with a **single** alpha per image (tweak in `spm.py` if you want per-patch alpha or different strategies).
-- If you want parity with a specific paper version, swap in your official implementation but keep `spm_augment(image, num_patches, mix_prob, beta_a, beta_b, seed)`.
 ## Local Development
 ```bash

 ---
+title: Shuffle PatchMix
+colorFrom: purple
+colorTo: red
 sdk: gradio
 sdk_version: 5.44.1
 app_file: app.py
 pinned: false
 license: mit
 ---
 arxiv.org/abs/2505.24216
 # Shuffle PatchMix (SPM) — Hugging Face Space
+A minimal interactive demo for SPM-style augmentation. Now supports **overlap with feathered blending**.
 GitHub repo: https://github.com/PrasannaPulakurthi/SPM
 ## Parameters
 - **Grid (N×N):** Choose one of **2×2, 4×4, 8×8, 16×16**. The image is cropped (top-left) so its width and height are divisible by N.
+- **Enable overlap (feather blend):** When enabled, each base cell expands by ±overlap pixels (2× at borders to keep areas comparable) and patches are blended with a feather mask.
+- **Overlap (px):** Pixel overlap per side. Automatically clipped to `< ½ * patch size`.
 - **Mix probability:** Per-patch probability to mix original and a shuffled patch.
+- **Beta α, β:** Shape parameters for the Beta distribution used for blending weights.
 - **Seed:** Optional deterministic seed.
 ## Batch Mode
 Upload a `.zip` containing images (`.png`, `.jpg`, `.jpeg`). The app returns a `.zip` of augmented results with the same folder structure.
 ## Notes
+- Non-overlap path uses **one alpha per image**; overlap path uses **alpha per patch** to mirror your reference snippet (edit in `spm.py` if you prefer one alpha per image).
+- Feather size equals the overlap (you can decouple by adjusting `create_feather_mask` calls).
+- If you want parity with a specific paper version, swap in your official implementation but keep the signature:
+  `spm_augment(image, num_patches, mix_prob, beta_a, beta_b, overlap_px, seed)`.
 ## Local Development
 ```bash

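The README says the overlap is "automatically clipped to `< ½ * patch size`". The sketch below mirrors the clamp that the new `spm.py` applies (`max_ov = max(0, min(ph, pw) // 2 - 1)`) so the effective overlap can be checked for a given image size and grid. `effective_overlap` is a hypothetical helper written for illustration; it is not part of the Space.

```python
def effective_overlap(width, height, n, overlap_px):
    """Hypothetical helper mirroring the overlap clamp in spm_augment."""
    # Cropping to a multiple of n does not change the integer patch size,
    # so the patch height/width can be taken directly from the input size.
    ph = height // n
    pw = width // n
    # Same bound as spm.py: strictly less than half of the smaller patch side.
    max_ov = max(0, min(ph, pw) // 2 - 1)
    requested = int(overlap_px) if overlap_px is not None else 0
    return max(0, min(requested, max_ov))


# 256x256 image, 4x4 grid: patches are 64 px, so 8 px is used unchanged.
print(effective_overlap(256, 256, 4, 8))
# Same image, 16x16 grid: patches are 16 px, so the overlap is capped.
print(effective_overlap(256, 256, 16, 8))
# 32x32 image, 16x16 grid: patches are 2 px, so the overlap collapses to 0.
print(effective_overlap(32, 32, 16, 8))
```

When the result is 0, `spm_augment` takes its non-overlap path even if the overlap checkbox is enabled, which can happen with fine grids on small images.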
app.py CHANGED

@@ -7,6 +7,7 @@ from spm import spm_augment
 TITLE = "Shuffle PatchMix (SPM) Augmentation"
 DESC = """
 Upload an image, choose **number of patches (N×N)**, and generate SPM-augmented variants.
 For batch processing, upload a .zip of images (PNG/JPG/JPEG), and download a .zip of outputs.
 """
@@ -18,12 +19,13 @@ def _parse_grid(grid_choice: str) -> int:
     except Exception:
         return 4
-def run_single(image, grid_choice, mix_prob, beta_a, beta_b, num_augs, seed):
     if image is None:
         return []
     outs = []
     base_seed = int(seed) if seed is not None else None
     N = _parse_grid(grid_choice)
     for i in range(num_augs):
         s = (base_seed + i) if base_seed is not None else None
         out_img = spm_augment(
@@ -32,12 +34,13 @@ def run_single(image, grid_choice, mix_prob, beta_a, beta_b, num_augs, seed):
             mix_prob=float(mix_prob),
             beta_a=float(beta_a),
             beta_b=float(beta_b),
             seed=s
         )
         outs.append(out_img)
     return outs
-def run_batch(zip_file, grid_choice, mix_prob, beta_a, beta_b, seed):
     if zip_file is None:
         return None, "Please upload a .zip file with images."
     tempdir = tempfile.mkdtemp()
@@ -50,6 +53,7 @@ def run_batch(zip_file, grid_choice, mix_prob, beta_a, beta_b, seed):
     valid_exts = {".png", ".jpg", ".jpeg"}
     count_in, count_out = 0, 0
     N = _parse_grid(grid_choice)
     for root_dir, _, files in os.walk(tempdir):
         for f in files:
             if f.lower().endswith(tuple(valid_exts)):
@@ -65,6 +69,7 @@ def run_batch(zip_file, grid_choice, mix_prob, beta_a, beta_b, seed):
                     mix_prob=float(mix_prob),
                     beta_a=float(beta_a),
                     beta_b=float(beta_b),
                     seed=int(seed) if seed is not None else None
                 )
                 rel = os.path.relpath(in_path, tempdir)
@@ -92,6 +97,8 @@ with gr.Blocks() as demo:
                 with gr.Column(scale=1):
                     inp = gr.Image(label="Input image", type="pil")
                     grid_choice = gr.Radio(choices=["2x2","4x4","8x8","16x16"], value="4x4", label="Grid (N×N)")
                     mix_prob = gr.Slider(0, 1, value=0.5, step=0.05, label="Mix probability (per patch)")
                     with gr.Row():
                         beta_a = gr.Slider(0.1, 8, value=2.0, step=0.1, label="Beta α")
@@ -103,7 +110,7 @@ with gr.Blocks() as demo:
                     gallery = gr.Gallery(label="Augmented outputs", columns=2, height="auto")
             run_btn.click(
                 fn=run_single,
-                inputs=[inp, grid_choice, mix_prob, beta_a, beta_b, num_augs, seed],
                 outputs=[gallery]
             )
         with gr.TabItem("Batch (.zip)"):
@@ -111,6 +118,8 @@ with gr.Blocks() as demo:
                 with gr.Column(scale=1):
                     zip_in = gr.File(label="Upload a .zip of images", file_types=[".zip"])
                     grid_choice_b = gr.Radio(choices=["2x2","4x4","8x8","16x16"], value="4x4", label="Grid (N×N)")
                     mix_prob_b = gr.Slider(0, 1, value=0.5, step=0.05, label="Mix probability (per patch)")
                     with gr.Row():
                         beta_a_b = gr.Slider(0.1, 8, value=2.0, step=0.1, label="Beta α")
@@ -122,7 +131,7 @@ with gr.Blocks() as demo:
                     status = gr.Markdown()
             run_b.click(
                 fn=run_batch,
-                inputs=[zip_in, grid_choice_b, mix_prob_b, beta_a_b, beta_b_b, seed_b],
                 outputs=[zip_out, status]
             )

 TITLE = "Shuffle PatchMix (SPM) Augmentation"
 DESC = """
 Upload an image, choose **number of patches (N×N)**, and generate SPM-augmented variants.
+You can optionally **enable overlap** with feathered blending for smoother seams.
 For batch processing, upload a .zip of images (PNG/JPG/JPEG), and download a .zip of outputs.
 """
     except Exception:
         return 4
+def run_single(image, grid_choice, use_overlap, overlap_px, mix_prob, beta_a, beta_b, num_augs, seed):
     if image is None:
         return []
     outs = []
     base_seed = int(seed) if seed is not None else None
     N = _parse_grid(grid_choice)
+    ov = int(overlap_px) if use_overlap else 0
     for i in range(num_augs):
         s = (base_seed + i) if base_seed is not None else None
         out_img = spm_augment(
             mix_prob=float(mix_prob),
             beta_a=float(beta_a),
             beta_b=float(beta_b),
+            overlap_px=ov,
             seed=s
         )
         outs.append(out_img)
     return outs
+def run_batch(zip_file, grid_choice, use_overlap, overlap_px, mix_prob, beta_a, beta_b, seed):
     if zip_file is None:
         return None, "Please upload a .zip file with images."
     tempdir = tempfile.mkdtemp()
     valid_exts = {".png", ".jpg", ".jpeg"}
     count_in, count_out = 0, 0
     N = _parse_grid(grid_choice)
+    ov = int(overlap_px) if use_overlap else 0
     for root_dir, _, files in os.walk(tempdir):
         for f in files:
             if f.lower().endswith(tuple(valid_exts)):
                     mix_prob=float(mix_prob),
                     beta_a=float(beta_a),
                     beta_b=float(beta_b),
+                    overlap_px=ov,
                     seed=int(seed) if seed is not None else None
                 )
                 rel = os.path.relpath(in_path, tempdir)
                 with gr.Column(scale=1):
                     inp = gr.Image(label="Input image", type="pil")
                     grid_choice = gr.Radio(choices=["2x2","4x4","8x8","16x16"], value="4x4", label="Grid (N×N)")
+                    use_overlap = gr.Checkbox(value=False, label="Enable overlap (feather blend)")
+                    overlap_px = gr.Slider(1, 64, value=8, step=1, label="Overlap (px)")
                     mix_prob = gr.Slider(0, 1, value=0.5, step=0.05, label="Mix probability (per patch)")
                     with gr.Row():
                         beta_a = gr.Slider(0.1, 8, value=2.0, step=0.1, label="Beta α")
                     gallery = gr.Gallery(label="Augmented outputs", columns=2, height="auto")
             run_btn.click(
                 fn=run_single,
+                inputs=[inp, grid_choice, use_overlap, overlap_px, mix_prob, beta_a, beta_b, num_augs, seed],
                 outputs=[gallery]
             )
         with gr.TabItem("Batch (.zip)"):
                 with gr.Column(scale=1):
                     zip_in = gr.File(label="Upload a .zip of images", file_types=[".zip"])
                     grid_choice_b = gr.Radio(choices=["2x2","4x4","8x8","16x16"], value="4x4", label="Grid (N×N)")
+                    use_overlap_b = gr.Checkbox(value=False, label="Enable overlap (feather blend)")
+                    overlap_px_b = gr.Slider(1, 64, value=8, step=1, label="Overlap (px)")
                     mix_prob_b = gr.Slider(0, 1, value=0.5, step=0.05, label="Mix probability (per patch)")
                     with gr.Row():
                         beta_a_b = gr.Slider(0.1, 8, value=2.0, step=0.1, label="Beta α")
                     status = gr.Markdown()
             run_b.click(
                 fn=run_batch,
+                inputs=[zip_in, grid_choice_b, use_overlap_b, overlap_px_b, mix_prob_b, beta_a_b, beta_b_b, seed_b],
                 outputs=[zip_out, status]
             )
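
The seed handling in `run_single` and `run_batch` differs in a way that is easy to miss: `run_single` passes `base_seed + i` for the i-th variant, while `run_batch` passes the same seed for every image. The sketch below illustrates the consequence using only NumPy. `variant_seeds` and `first_permutation` are hypothetical helpers written for this illustration; the only thing taken from the diff is that `spm_augment` creates its generator with `np.random.default_rng(seed)` and draws the patch permutation first.

```python
import numpy as np


def variant_seeds(seed, num_augs):
    """Hypothetical helper: the seeds run_single hands to spm_augment."""
    base = int(seed) if seed is not None else None
    return [(base + i) if base is not None else None for i in range(num_augs)]


def first_permutation(seed, total=16):
    """First draw spm_augment makes from its generator (a 4x4 grid here)."""
    return np.random.default_rng(seed).permutation(total).tolist()


seeds = variant_seeds(42, 3)
for s in seeds:
    # Each variant gets its own seed, so each shuffle is reproducible on its own.
    print(s, first_permutation(s))

# run_batch reuses one seed, so two images with the same grid share a shuffle.
print(first_permutation(42) == first_permutation(42))
```

With a fixed seed, every image in a batch therefore receives the same patch permutation for a given grid; leaving the seed unset gives each image a fresh one.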

spm.py CHANGED

@@ -1,6 +1,23 @@
 from PIL import Image
 import numpy as np
 def _to_divisible_by(img, N):
     """Crop so width and height are divisible by N (top-left anchored)."""
     w, h = img.size
@@ -12,22 +29,64 @@ def _to_divisible_by(img, N):
         img = img.crop((0, 0, W, H))
     return img, W, H
 def spm_augment(
     image,
     num_patches=4,   # N for an N×N grid
     mix_prob=0.5,
     beta_a=2.0,
     beta_b=2.0,
     seed=None
 ):
     """
-    SPM-style augmentation using a global shuffle over an N×N patch grid.
-      1) Divide image into N×N patches (cropping to be divisible by N if needed).
-      2) Globally permute patch indices.
-      3) Per patch, with probability `mix_prob`, replace by a convex blend of
-         original and a shuffled patch using alpha~Beta(beta_a,beta_b) (one alpha per image).
     """
-    # Normalize input
     if isinstance(image, np.ndarray):
         img = Image.fromarray(image).convert("RGB")
     else:
@@ -36,47 +95,112 @@ def spm_augment(
     N = int(num_patches)
     rng = np.random.default_rng(seed)
-    # Ensure divisibility and compute patch size
     img, W, H = _to_divisible_by(img, N)
-    arr = np.array(img, dtype=np.uint8)
     ph = H // N
     pw = W // N
-    # Build patch list (row-major)
     patches = []
     for i in range(N):
         for j in range(N):
-            y0 = i * ph
-            x0 = j * pw
-            patches.append(arr[y0:y0+ph, x0:x0+pw])
-    total = N * N
     perm = rng.permutation(total)
-    # Sample one alpha for the whole image
-    if beta_a > 0 and beta_b > 0:
-        alpha = float(rng.beta(beta_a, beta_b))
-    else:
-        alpha = 1.0
-    # Patchwise mix
-    out = arr.copy()
-    mask = rng.random(total) < float(mix_prob)
-    idx = 0
-    for i in range(N):
-        for j in range(N):
-            y0 = i * ph
-            x0 = j * pw
-            if mask[idx]:
-                src = patches[idx].astype(np.float32)
-                shf = patches[perm[idx]].astype(np.float32)
-                if 0.0 < alpha < 1.0:
-                    mixed = alpha * shf + (1.0 - alpha) * src
-                    out[y0:y0+ph, x0:x0+pw] = np.clip(mixed, 0, 255).astype(np.uint8)
-                else:
-                    out[y0:y0+ph, x0:x0+pw] = patches[perm[idx]]
-            else:
-                out[y0:y0+ph, x0:x0+pw] = patches[idx]
-            idx += 1
     return Image.fromarray(out)

 from PIL import Image
 import numpy as np
+def create_feather_mask(height, width, feather_size=4):
+    """
+    2D mask HxW that smoothly transitions from 1.0 in the interior
+    to 0.0 at the edges over `feather_size` pixels.
+    """
+    mask = np.ones((height, width), dtype=np.float32)
+    if feather_size <= 0:
+        return mask
+    ramp = np.linspace(0.0, 1.0, feather_size, dtype=np.float32)
+    # Top / Bottom
+    mask[:feather_size, :]  *= ramp[:, None]
+    mask[-feather_size:, :] *= ramp[::-1, None]
+    # Left / Right
+    mask[:, :feather_size]  *= ramp[None, :]
+    mask[:, -feather_size:] *= ramp[None, ::-1]
+    return mask
 def _to_divisible_by(img, N):
     """Crop so width and height are divisible by N (top-left anchored)."""
     w, h = img.size
         img = img.crop((0, 0, W, H))
     return img, W, H
+def _edgelogic(i, j, ph, pw, N, overlap):
+    """
+    Base (no-overlap) patch is [i*ph:(i+1)*ph, j*pw:(j+1)*pw].
+    Extend with overlap, biasing inward.
+    Uses 2*overlap for edges to keep patch areas roughly comparable.
+    Returns (start_h, end_h, start_w, end_w) BEFORE clamping to image bounds.
+    """
+    start_h = i * ph
+    start_w = j * pw
+    end_h   = start_h + ph
+    end_w   = start_w + pw
+    if overlap <= 0:
+        return start_h, end_h, start_w, end_w
+    # Vertical
+    if i == 0:
+        end_h += 2 * overlap
+    elif i == N - 1:
+        start_h -= 2 * overlap
+    else:
+        start_h -= overlap
+        end_h   += overlap
+    # Horizontal
+    if j == 0:
+        end_w += 2 * overlap
+    elif j == N - 1:
+        start_w -= 2 * overlap
+    else:
+        start_w -= overlap
+        end_w   += overlap
+    return start_h, end_h, start_w, end_w
 def spm_augment(
     image,
     num_patches=4,   # N for an N×N grid
     mix_prob=0.5,
     beta_a=2.0,
     beta_b=2.0,
+    overlap_px=0,
     seed=None
 ):
     """
+    SPM-style augmentation with optional overlap + feathered blending.
+    When overlap_px <= 0:
+      - Standard global shuffle over N×N patches;
+      - Per-patch mixing with a single alpha ~ Beta(a,b) for the image.
+    When overlap_px > 0:
+      - Each base cell (N×N grid) expands by +/-overlap_px (2*overlap at borders),
+        clipped to the image. Patches are mixed per location and alpha sampled per-patch
+        for a bit more stochasticity (can be changed to per-image alpha by editing below).
+      - Patches are blended into the canvas with a feather mask of size `overlap_px`.
     """
+    # Normalize to PIL and ensure divisibility
     if isinstance(image, np.ndarray):
         img = Image.fromarray(image).convert("RGB")
     else:
     N = int(num_patches)
     rng = np.random.default_rng(seed)
     img, W, H = _to_divisible_by(img, N)
+    arr_u8 = np.array(img, dtype=np.uint8)
     ph = H // N
     pw = W // N
+    # Clamp overlap to < half patch size
+    if overlap_px is None:
+        overlap_px = 0
+    overlap_px = int(overlap_px)
+    max_ov = max(0, min(ph, pw) // 2 - 1)
+    ov = int(np.clip(overlap_px, 0, max_ov))
+    if ov <= 0:
+        # === Non-overlap path ===
+        arr = arr_u8
+        # Build patches (row-major)
+        patches = []
+        for i in range(N):
+            for j in range(N):
+                y0 = i * ph
+                x0 = j * pw
+                patches.append(arr[y0:y0+ph, x0:x0+pw])
+        total = N * N
+        perm = rng.permutation(total)
+        # One alpha per image
+        if beta_a > 0 and beta_b > 0:
+            alpha = float(rng.beta(beta_a, beta_b))
+        else:
+            alpha = 1.0
+        out = arr.copy()
+        mask = rng.random(total) < float(mix_prob)
+        idx = 0
+        for i in range(N):
+            for j in range(N):
+                y0 = i * ph
+                x0 = j * pw
+                if mask[idx]:
+                    src = patches[idx].astype(np.float32)
+                    shf = patches[perm[idx]].astype(np.float32)
+                    if 0.0 < alpha < 1.0:
+                        mixed = alpha * shf + (1.0 - alpha) * src
+                        out[y0:y0+ph, x0:x0+pw] = np.clip(mixed, 0, 255).astype(np.uint8)
+                    else:
+                        out[y0:y0+ph, x0:x0+pw] = patches[perm[idx]]
+                else:
+                    out[y0:y0+ph, x0:x0+pw] = patches[idx]
+                idx += 1
+        return Image.fromarray(out)
+    # === Overlap path with feather blending ===
+    arr = arr_u8.astype(np.float32)
+    # Precompute feather mask for max size patch
+    feather_full = create_feather_mask(ph + 2*ov, pw + 2*ov, feather_size=ov)
     patches = []
+    coords = []
     for i in range(N):
         for j in range(N):
+            sh, eh, sw, ew = _edgelogic(i, j, ph, pw, N, ov)
+            # Clamp to image bounds
+            sh = max(0, sh); sw = max(0, sw)
+            eh = min(H, eh); ew = min(W, ew)
+            patches.append(arr[sh:eh, sw:ew])
+            coords.append((sh, eh, sw, ew))
+    total = len(patches)
     perm = rng.permutation(total)
+    # We'll sample alpha per-patch to echo your overlap snippet
+    def sample_alpha():
+        if beta_a > 0 and beta_b > 0:
+            return float(rng.beta(beta_a, beta_b))
+        return 1.0
+    canvas = np.zeros_like(arr, dtype=np.float32)
+    weight = np.zeros((H, W), dtype=np.float32)
+    for k, (sh, eh, sw, ew) in enumerate(coords):
+        if rng.random() >= float(mix_prob):
+            # keep original content in that region
+            src = patches[k]
+            patch = src
+        else:
+            lam = sample_alpha()
+            src = patches[k].astype(np.float32)
+            shf = patches[int(perm[k])].astype(np.float32)
+            patch = lam * shf + (1.0 - lam) * src
+        ph_k, pw_k, _ = patch.shape
+        # Slice feather mask down if needed (near borders)
+        mask2d = feather_full[:ph_k, :pw_k]
+        if arr.shape[2] == 1:
+            mask3d = mask2d[..., None]
+        else:
+            mask3d = np.repeat(mask2d[..., None], arr.shape[2], axis=2)
+        # Accumulate
+        canvas[sh:eh, sw:ew] += patch * mask3d
+        weight[sh:eh, sw:ew] += mask2d
+    # Normalize
+    weight = np.clip(weight, 1e-8, None)
+    out = (canvas / weight[..., None])
+    out = np.clip(out, 0, 255).astype(np.uint8)
     return Image.fromarray(out)
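
To see how the overlap path distributes blending weight, the sketch below reuses `create_feather_mask` as it appears in the new `spm.py` and sums the masks over all patch windows, the same way the `weight` array is accumulated. `window` is a one-axis restatement of `_edgelogic` and `accumulate_weights` is a hypothetical helper; both exist only for this illustration. Two things become visible under these assumptions: every window has the same size (`patch + 2 * overlap`), and the outermost row and column of the image receive zero weight because the ramp starts at 0.0.

```python
import numpy as np


def create_feather_mask(height, width, feather_size=4):
    # Same logic as in spm.py: 1.0 in the interior, ramping to 0.0 at the edges.
    mask = np.ones((height, width), dtype=np.float32)
    if feather_size <= 0:
        return mask
    ramp = np.linspace(0.0, 1.0, feather_size, dtype=np.float32)
    mask[:feather_size, :] *= ramp[:, None]
    mask[-feather_size:, :] *= ramp[::-1, None]
    mask[:, :feather_size] *= ramp[None, :]
    mask[:, -feather_size:] *= ramp[None, ::-1]
    return mask


def window(i, p, n, ov):
    # One-axis version of _edgelogic: (start, end) of cell i with overlap.
    start, end = i * p, (i + 1) * p
    if i == 0:
        end += 2 * ov
    elif i == n - 1:
        start -= 2 * ov
    else:
        start -= ov
        end += ov
    return start, end


def accumulate_weights(H, W, n, ov):
    # Hypothetical helper: sum the feather masks of all n*n windows.
    ph, pw = H // n, W // n
    feather = create_feather_mask(ph + 2 * ov, pw + 2 * ov, feather_size=ov)
    weight = np.zeros((H, W), dtype=np.float32)
    for i in range(n):
        sh, eh = window(i, ph, n, ov)
        for j in range(n):
            sw, ew = window(j, pw, n, ov)
            weight[sh:eh, sw:ew] += feather[: eh - sh, : ew - sw]
    return weight


sizes = {window(i, 16, 4, 4)[1] - window(i, 16, 4, 4)[0] for i in range(4)}
weight = accumulate_weights(64, 64, n=4, ov=4)
print(sizes)                         # one window size for all cells
print(weight[0].max())               # weight along the top image row
print(weight[1:-1, 1:-1].min() > 0)  # interior pixels are all covered
```

Because the accumulated canvas is also zero wherever the weight is zero, dividing by the clipped weight (`1e-8`) appears to leave a one-pixel black frame around the output in the overlap path. One possible tweak, offered here as an assumption rather than the author's intent, is a ramp that starts above zero so border pixels keep a small positive weight.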