
Britny Farahdel

britny

AI & ML interests

None yet

Recent Activity

updated a collection about 5 hours ago: Image Editing
updated a collection 5 days ago: Image Editing

Organizations

Hugging Face Discord Community, AI Starter Pack

britny's activity

upvoted an article 1 day ago
LLM Inference on Edge: A Fun and Easy Guide to run LLMs via React Native on your Phone! • 31
reacted to merve's post with 🚀 about 2 months ago
smolagents can see 🔥
we just shipped vision support to smolagents 🤗 agentic computers FTW

you can now:
💻 let the agent get images dynamically (e.g. an agentic web browser)
📑 pass images at the init of the agent (e.g. chatting with documents, filling forms automatically, etc.)
with only a few LoC changed! 🤯
you can use transformers models locally (like Qwen2-VL) OR plug in your favorite multimodal inference provider (gpt-4o, Anthropic & co) 🤠

read our blog: http://hf.co/blog/smolagents-can-see
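
As a quick illustration of the vision support described in this post, here is a minimal sketch. It assumes a smolagents release that accepts images in agent.run(); the OpenAIServerModel backend and the "filled_form.png" file are stand-ins for illustration, and a local TransformersModel (e.g. Qwen2-VL) could be swapped in the same way. Class names may differ across versions.

```python
# Minimal sketch (not from the post) of handing an image to a smolagents agent.
# Assumes a smolagents version with vision support; class names can differ across releases.
from PIL import Image
from smolagents import CodeAgent, OpenAIServerModel

# Any multimodal backend should work here; TransformersModel("Qwen/Qwen2-VL-7B-Instruct")
# would be the local alternative mentioned in the post.
model = OpenAIServerModel(model_id="gpt-4o")
agent = CodeAgent(tools=[], model=model)

# "filled_form.png" is a hypothetical file used only for illustration.
document = Image.open("filled_form.png")

# The image is passed alongside the task so the agent can inspect it while planning.
result = agent.run(
    "Read this form and list the name and date fields it contains.",
    images=[document],
)
print(result)
```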
reacted to merve's post with ❤️ 2 months ago
What a beginning to this year in open ML 🤠
Let's unwrap! merve/jan-10-releases-677fe34177759de0edfc9714

Multimodal 🖼️
> ByteDance released SA2VA: a family of vision LMs that can take image, video, text and visual prompts
> moondream2 is out with new capabilities like outputting structured data and gaze detection!
> Dataset: Alibaba DAMO lab released a multimodal textbook with 22k hours' worth of samples from instruction videos 🤯
> Dataset: the SciCap benchmark dataset for captioning scientific documents has been released along with its challenge!

LLMs 💬
> Microsoft released Phi-4, a SotA open-source 14B language model 🔥
> Dolphin is back with Dolphin 3.0 Llama 3.1 8B 🐬🐬
> Prime-RL released Eurus-2-7B-PRIME, a new language model trained using PRIME alignment
> SmallThinker-3B is a new small reasoning LM based on Qwen2.5-3B-Instruct 💭
> Dataset: QWQ-LONGCOT-500K is the dataset used to train SmallThinker, generated using QwQ-32B-Preview 📕
> Dataset: @cfahlgren1 released React Code Instructions, a dataset of code instruction-code pairs 📕
> Dataset: the Qwen team is on a roll; they just released CodeElo, a dataset of code preferences 👩🏻‍💻

Embeddings 🔖
> @MoritzLaurer released a zero-shot version of ModernBERT large 👏
> KaLM is a new family of performant multilingual embedding models with an MIT license, built using Qwen2-0.5B

Image/Video Generation ⏯️
> NVIDIA released Cosmos, a new family of diffusion/autoregressive World Foundation Models generating worlds from images, videos and texts 🔥
> Adobe released TransPixar: a new text-to-video model that can generate assets with transparent backgrounds (a first!)
> Dataset: fal released cosmos-openvid-1m, a Cosmos-tokenized version of OpenVid-1M

Others
> Prior Labs released TabPFNv2, the best tabular transformer is out for classification and regression
> Metagene-1 is a new RNA language model that can be used for pathogen detection, zero-shot embedding and genome understanding
reacted to merve's post with 🚀 2 months ago
supercharge your LLM apps with smolagents 🔥

however cool your LLM is, without being agentic it can only go so far

enter smolagents: a new agent library by Hugging Face that makes the LLM write code, do analysis, and automate the boring stuff!

Here's our blog to get you started: https://huggingface.co/blog/smolagents
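
To make the pitch concrete, here is a minimal sketch, assuming pip install smolagents plus the web-search extra; HfApiModel and DuckDuckGoSearchTool are the class names used in early smolagents releases and may differ in later versions.

```python
# Minimal sketch (not from the post): a CodeAgent that writes and executes Python
# to answer a question, optionally searching the web with a built-in tool.
from smolagents import CodeAgent, DuckDuckGoSearchTool, HfApiModel

model = HfApiModel()  # defaults to a hosted model on the Hugging Face Inference API
agent = CodeAgent(tools=[DuckDuckGoSearchTool()], model=model)

# At each step, the agent generates a Python snippet, runs it, and inspects the output.
answer = agent.run("How many seconds are there in a leap year?")
print(answer)
```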
reacted to ginipick's post with 🔥 3 months ago
🎬 Revolutionize Your Video Creation
Dokdo Multimodal AI: transform a single image into a stunning video with perfect audio harmony! 🚀

Superior Technology 💫
Advanced Flow Matching: Smoother video transitions surpassing Kling and Sora
Intelligent Sound System: Automatically generates perfect audio by analyzing video mood
Multimodal Framework: Advanced AI integrating image, text, and audio analysis
Outstanding Performance 🎯
Ultra-High Resolution: 4K video quality with bfloat16 acceleration
Real-Time Optimization: 3x faster processing with PyTorch GPU acceleration
Smart Sound Matching: Real-time audio effects based on scene transitions and motion
Exceptional Features ✨
Custom Audio Creation: Natural soundtrack matching video tempo and rhythm
Intelligent Watermarking: Adaptive watermark adjusting to video characteristics
Multilingual Support: Precise translation engine powered by Helsinki-NLP
Versatile Applications 🌟
Social Media Marketing: Create engaging shorts for Instagram and YouTube
Product Promotion: Dynamic promotional videos highlighting product features
Educational Content: Interactive learning materials with enhanced engagement
Portfolio Enhancement: Professional-grade videos showcasing your work
Experience the video revolution with Dokdo Multimodal, where anyone can create professional-quality content from a single image. Elevate your content with perfectly synchronized video and audio that captivates your audience! 🎨

Start creating stunning videos that stand out from the crowd - whether you're a marketer, educator, content creator, or business owner. Join the future of AI-powered video creation today!

ginipick/Dokdo-multimodal

#VideoInnovation #AITechnology #PremiumContent #MarketingSolution

🔊 Please turn on your sound for the best viewing experience!
  • 1 reply