Neil Stoker

nmstoker

nmstoker's activity

replied to Xenova's post 20 days ago

This is a great start, @benjamin-paine. However, won't it trip up on relatively common cases such as:

Hello Dr. Jones.
The answer is 10.45s.
Well... that needs extra attention.
They wanted pets, small animals etc. to be brought in for checks.
The filename is Report2023.xls.
You should look at www.example.com.

I'm AFK right now so can't verify for sure, but given that this seems to use a basic sequence of regexes, it seems likely it won't cater for these sorts of things (unless there's some subtle aspect I'm overlooking).

Something like SaT would be too heavyweight. Maybe a compromise closer to PySBD, but in JS, would work well here?
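
To make the concern concrete, here is a minimal sketch of the kind of rule-based splitter that sits between a bare regex and a model like SaT. It is not the code from the thread and not a port of PySBD: the function name `splitSentences` and the short `ABBREVIATIONS` list are hypothetical, chosen only to cover the six cases listed above.

```javascript
// Hypothetical, deliberately tiny abbreviation list; a real splitter
// would need a much longer, per-language list.
const ABBREVIATIONS = new Set(["dr", "mr", "mrs", "ms", "prof", "jr", "sr", "st", "vs", "etc"]);

function splitSentences(text) {
  const sentences = [];
  // Candidate boundary: terminator(s), optional closing quote/bracket, then whitespace.
  // Requiring whitespace keeps "10.45s", "Report2023.xls" and "www.example.com" intact.
  const boundary = /[.!?]+["')\]]*\s+/g;
  let start = 0;
  let match;
  while ((match = boundary.exec(text)) !== null) {
    const end = match.index + match[0].length;
    const next = text[end];
    // Only split when the following text looks like a sentence start.
    // This keeps "Well... that" and "etc. to" together.
    if (next === undefined || !/[A-Z0-9"'(\[]/.test(next)) continue;
    // Do not split after a known abbreviation such as "Dr.".
    // Caveat: this also suppresses a genuine boundary after "etc." when the
    // next sentence starts with a capital; that ambiguity is not resolved here.
    const lastWord = text.slice(start, match.index).split(/\s+/).pop().toLowerCase();
    if (match[0].startsWith(".") && ABBREVIATIONS.has(lastWord)) continue;
    sentences.push(text.slice(start, end).trim());
    start = end;
  }
  const rest = text.slice(start).trim();
  if (rest) sentences.push(rest);
  return sentences;
}

const sample =
  "Hello Dr. Jones. The answer is 10.45s. Well... that needs extra attention. " +
  "They wanted pets, small animals etc. to be brought in for checks. " +
  "The filename is Report2023.xls. You should look at www.example.com.";
console.log(splitSentences(sample));
```

Run on the examples above, this returns six sentences, one per original line. It is still heuristic, so lowercase sentence starts and abbreviations missing from the list would need further rules.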

reacted to Xenova's post with 👍 about 1 month ago

Post

9477

We did it. Kokoro TTS (v1.0) can now run 100% locally in your browser w/ WebGPU acceleration. Real-time text-to-speech without a server. ⚡️

Generate 10 seconds of speech in ~1 second for $0.

What will you build? 🔥
webml-community/kokoro-webgpu

The most difficult part was getting the model running in the first place, but the next steps are simple:
✂️ Implement sentence splitting, allowing for streamed responses
🌍 Multilingual support (only phonemization left)

Who wants to help?

9 replies

reacted to Xenova's post with 🔥 2 months ago

Post

8360

First project of 2025: Vision Transformer Explorer

I built a web app to interactively explore the self-attention maps produced by ViTs. This explains what the model is focusing on when making predictions, and provides insights into its inner workings! 🤯

Try it out yourself! 👇
webml-community/attention-visualization

Source code: https://github.com/huggingface/transformers.js-examples/tree/main/attention-visualization

replied to Xenova's post 2 months ago

Quick update: I was able to confirm that I can get some other WebGPU demos working in Chrome on this particular laptop.

E.g. I managed to get this running: https://toji.github.io/webgpu-clustered-shading/

I made sure Vulkan was installed and gives sensible output in the terminal from sudo vulkaninfo. Then I made sure it was enabled in Chrome, along with the flag for enable-unsafe-webgpu (the flag options for Vulkan won't appear in Chrome flags without also setting "Temporarily unexpire M130 flags").

Unfortunately no change in the error messages from the Moonshine demo link.

replied to Xenova's post 2 months ago

@Xenova - Happy New Year! 🎉

With the demo link above I'm seeing issues when running on Linux with Chrome on a laptop which make me wonder if the WASM fallback is working in my case.

I get this message:

An error occurred
Requested device not found

The message about the requested device is quickly replaced with:

Uncaught Error: no available backend found. ERR: [webgpu] Error: Failed to get GPU adapter. You may need to enable flag "--enable-unsafe-webgpu" if you are using Chrome.

I've tried setting the Chrome flags, but that doesn't seem to make any difference.

Is there anything you'd recommend I check to try to diagnose the issue? Or a way to force the WASM fallback?

I'd previously had great results on my main PC (but this has a 3090 GPU) and on mobile (Android Pixel 8 Pro which also has GPU support) and it was only when I went to test on some slightly older devices I came upon this issue.

The problem laptop has Chrome Version 131.0.6778.204 (Official Build) (64-bit) and is running Arch Linux. The CPU is an i7-7500U from 2017, which has Intel® HD Graphics 620.
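
For anyone hitting the same "Failed to get GPU adapter" error in their own page, a rough sketch of a manual fallback check follows. The helper name `chooseDevice` is hypothetical, and whether the Moonshine demo exposes any such switch is not known from this thread. The sketch assumes only the standard WebGPU call `navigator.gpu.requestAdapter()`, which resolves to `null` when no adapter is available, and the `device` option that Transformers.js v3 accepts in `pipeline()`.

```javascript
// Returns "webgpu" when an adapter can actually be obtained, otherwise "wasm".
// `gpu` is expected to be `navigator.gpu`, which is undefined in browsers
// without WebGPU support.
async function chooseDevice(gpu) {
  if (!gpu || typeof gpu.requestAdapter !== "function") return "wasm";
  try {
    const adapter = await gpu.requestAdapter(); // null when no adapter is found
    return adapter ? "webgpu" : "wasm";
  } catch {
    // Treat any failure while probing as "no usable GPU".
    return "wasm";
  }
}

// In a page that uses Transformers.js v3, the result could be passed on
// as the `device` option (task and modelId are placeholders):
//   const device = await chooseDevice(navigator.gpu);
//   const pipe = await pipeline(task, modelId, { device });
```

Probing for an adapter up front means the decision is made before model loading starts, rather than relying on the library to recover after the WebGPU backend has already failed.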

New activity in microsoft/OmniParser 5 months ago

No examples in readme

#3 opened 5 months ago by

jadbox

liked a dataset 5 months ago

iamollas/ethos

Updated Jan 18, 2024 • 932 • 16

replied to Xenova's post 7 months ago

Wonderful news!

Any indication on how easily v2 code etc will work under v3 and what the main changes are? (am guessing there'll be a blog post in due course 🙂)

reacted to Xenova's post with 🔥 7 months ago

Post

15082

I'm excited to announce that Transformers.js V3 is finally available on NPM! 🔥 State-of-the-art Machine Learning for the web, now with WebGPU support! 🤯⚡️

Install it from NPM with:
npm i @huggingface/transformers

or via CDN, for example: https://v2.scrimba.com/s0lmm0qh1q

Segment Anything demo: webml-community/segment-anything-webgpu

5 replies

replied to Xenova's post 9 months ago

Any chance of some pointers for how to use the model in plain JavaScript? I had a look at the react/ts and wasn't sure where to begin (I'm not the strongest js coder so I may be missing something obvious!)

replied to Xenova's post 9 months ago

Amazing progress, thank you!

For anyone trying this on Chrome on Linux, you may need to set some flags. When the model loading gets stuck, a comment appears in the Console suggesting a startup flag, but you can just switch those in chrome://flags directly and relaunch. For me, I needed to enable both Vulkan and Unsafe WebGPU, and then it works (seriously fast, I should note!!)

New activity in microsoft/Phi-3-vision-128k-instruct 10 months ago

How to achieve faster inference speed?

#18 opened 10 months ago by

zkDU

liked a model about 1 year ago

152334H/miqu-1-70b-sf

Text Generation • Updated Jul 31, 2024 • 329 • 221

New activity in rhasspy/piper-voices over 1 year ago

How to download single voices?

#4 opened over 1 year ago by

werner1