AI & ML interests

Assets for Amazon SageMaker

Recent Activity

amazon-sagemaker's activity

pagezyhfย 
posted an update about 9 hours ago
view post
Post
476
We published https://huggingface.co/blog/deepseek-r1-aws!

If you are using AWS, give a read. It is a running document to showcase how to deploy and fine-tune DeepSeek R1 models with Hugging Face on AWS.

We're working hard to enable all the scenarios, whether you want to deploy to Inference Endpoints, Sagemaker or EC2; with GPUs or with Trainium & Inferentia.

We have full support for the distilled models, DeepSeek-R1 support is coming soon!! I'll keep you posted.

Cheers
florentgbelidjiย 
posted an update 13 days ago
view post
Post
1404
๐—ฃ๐—น๐—ฎ๐—ป๐—ป๐—ถ๐—ป๐—ด ๐—ฌ๐—ผ๐˜‚๐—ฟ ๐—ก๐—ฒ๐˜…๐˜ ๐—ฆ๐—ธ๐—ถ ๐—”๐—ฑ๐˜ƒ๐—ฒ๐—ป๐˜๐˜‚๐—ฟ๐—ฒ ๐—๐˜‚๐˜€๐˜ ๐—š๐—ผ๐˜ ๐—ฆ๐—บ๐—ฎ๐—ฟ๐˜๐—ฒ๐—ฟ: ๐—œ๐—ป๐˜๐—ฟ๐—ผ๐—ฑ๐˜‚๐—ฐ๐—ถ๐—ป๐—ด ๐—”๐—น๐—ฝ๐—ถ๐—ป๐—ฒ ๐—”๐—ด๐—ฒ๐—ป๐˜!๐Ÿ”๏ธโ›ท๏ธ

With the big hype around AI agents these days, I couldnโ€™t stop thinking about how AI agents could truly enhance real-world activities.
What sort of applications could we build with those AI agents: agentic RAG? self-correcting text-to-sql? Nah, boringโ€ฆ

Passionate about outdoors, Iโ€™ve always dreamed of a tool that could simplify planning mountain trips while accounting for all potential risks. Thatโ€™s why I built ๐—”๐—น๐—ฝ๐—ถ๐—ป๐—ฒ ๐—”๐—ด๐—ฒ๐—ป๐˜, a smart assistant designed to help you plan safe and enjoyable itineraries in the French Alps and Pyrenees.

Built using Hugging Face's ๐˜€๐—บ๐—ผ๐—น๐—ฎ๐—ด๐—ฒ๐—ป๐˜๐˜€ library, Alpine Agent combines the power of AI with trusted resources like ๐˜š๐˜ฌ๐˜ช๐˜ต๐˜ฐ๐˜ถ๐˜ณ.๐˜ง๐˜ณ (https://skitour.fr/) and METEO FRANCE. Whether itโ€™s suggesting a route with moderate difficulty or analyzing avalanche risks and weather conditions, this agent dynamically integrates data to deliver personalized recommendations.

In my latest blog post, I share how I developed this projectโ€”from defining tools and integrating APIs to selecting the best LLMs like ๐˜˜๐˜ธ๐˜ฆ๐˜ฏ2.5-๐˜Š๐˜ฐ๐˜ฅ๐˜ฆ๐˜ณ-32๐˜‰-๐˜๐˜ฏ๐˜ด๐˜ต๐˜ณ๐˜ถ๐˜ค๐˜ต, ๐˜“๐˜ญ๐˜ข๐˜ฎ๐˜ข-3.3-70๐˜‰-๐˜๐˜ฏ๐˜ด๐˜ต๐˜ณ๐˜ถ๐˜ค๐˜ต, or ๐˜Ž๐˜—๐˜›-4.

โ›ท๏ธ Curious how AI can enhance adventure planning?โ€จTry the app and share your thoughts: florentgbelidji/alpine-agent

๐Ÿ‘‰ Want to build your own agents? Whether for cooking, sports training, or other passions, the possibilities are endless. Check out the blog post to learn more: https://huggingface.co/blog/florentgbelidji/alpine-agent

Many thanks to @m-ric for helping on building this tool with smolagents!
  • 1 reply
ยท
pagezyhfย 
posted an update 17 days ago
jeffboudierย 
posted an update 24 days ago
view post
Post
571
NVIDIA just announced the Cosmos World Foundation Models, available on the Hub: nvidia/cosmos-6751e884dc10e013a0a0d8e6

Cosmos is a family of pre-trained models purpose-built for generating physics-aware videos and world states to advance physical AI development.
The release includes Tokenizers nvidia/cosmos-tokenizer-672b93023add81b66a8ff8e6

Learn more in this great community article by @mingyuliutw and @PranjaliJoshi https://huggingface.co/blog/mingyuliutw/nvidia-cosmos
  • 1 reply
ยท
pagezyhfย 
posted an update about 2 months ago
pagezyhfย 
posted an update about 2 months ago
view post
Post
974
Itโ€™s 2nd of December , hereโ€™s your Cyber Monday present ๐ŸŽ !

Weโ€™re cutting our price down on Hugging Face Inference Endpoints and Spaces!

Our folks at Google Cloud are treating us with a 40% price cut on GCP Nvidia A100 GPUs for the next 3๏ธโƒฃ months. We have other reductions on all instances ranging from 20 to 50%.

Sounds like the time to give Inference Endpoints a try? Get started today and find in our documentation the full pricing details.
https://ui.endpoints.huggingface.co/
https://huggingface.co/pricing
pagezyhfย 
posted an update 2 months ago
view post
Post
304
Hello Hugging Face Community,

if you use Google Kubernetes Engine to host you ML workloads, I think this series of videos is a great way to kickstart your journey of deploying LLMs, in less than 10 minutes! Thank you @wietse-venema-demo !

To watch in this order:
1. Learn what are Hugging Face Deep Learning Containers
https://youtu.be/aWMp_hUUa0c?si=t-LPRkRNfD3DDNfr

2. Learn how to deploy a LLM with our Deep Learning Container using Text Generation Inference
https://youtu.be/Q3oyTOU1TMc?si=V6Dv-U1jt1SR97fj

3. Learn how to scale your inference endpoint based on traffic
https://youtu.be/QjLZ5eteDds?si=nDIAirh1r6h2dQMD

If you want more of these small tutorials and have any theme in mind, let me know!
jeffboudierย 
posted an update 2 months ago
pagezyhfย 
posted an update 3 months ago
view post
Post
1363
Hello Hugging Face Community,

I'd like to share here a bit more about our Deep Learning Containers (DLCs) we built with Google Cloud, to transform the way you build AI with open models on this platform!

With pre-configured, optimized environments for PyTorch Training (GPU) and Inference (CPU/GPU), Text Generation Inference (GPU), and Text Embeddings Inference (CPU/GPU), the Hugging Face DLCs offer:

โšก Optimized performance on Google Cloud's infrastructure, with TGI, TEI, and PyTorch acceleration.
๐Ÿ› ๏ธ Hassle-free environment setup, no more dependency issues.
๐Ÿ”„ Seamless updates to the latest stable versions.
๐Ÿ’ผ Streamlined workflow, reducing dev and maintenance overheads.
๐Ÿ”’ Robust security features of Google Cloud.
โ˜๏ธ Fine-tuned for optimal performance, integrated with GKE and Vertex AI.
๐Ÿ“ฆ Community examples for easy experimentation and implementation.
๐Ÿ”œ TPU support for PyTorch Training/Inference and Text Generation Inference is coming soon!

Find the documentation at https://huggingface.co/docs/google-cloud/en/index
If you need support, open a conversation on the forum: https://discuss.huggingface.co/c/google-cloud/69
jeffboudierย 
posted an update 4 months ago
jeffboudierย 
posted an update 4 months ago
view post
Post
457
Inference Endpoints got a bunch of cool updates yesterday, this is my top 3
jeffboudierย 
posted an update 5 months ago
view post
Post
4042
Pro Tip - if you're a Firefox user, you can set up Hugging Chat as integrated AI Assistant, with contextual links to summarize or simplify any text - handy!

In this short video I show how to set it up
ยท
jeffboudierย 
posted an update 9 months ago
jeffboudierย 
posted an update 10 months ago
philschmidย 
posted an update 10 months ago
view post
Post
7296
New state-of-the-art open LLM! ๐Ÿš€ Databricks just released DBRX, a 132B MoE trained on 12T tokens. Claiming to surpass OpenAI GPT-3.5 and is competitive with Google Gemini 1.0 Pro. ๐Ÿคฏ

TL;DR
๐Ÿงฎ 132B MoE with 16 experts with 4 active in generation
๐ŸชŸ 32 000 context window
๐Ÿ“ˆ Outperforms open LLMs on common benchmarks, including MMLU
๐Ÿš€ Up to 2x faster inference than Llama 2 70B
๐Ÿ’ป Trained on 12T tokens
๐Ÿ”ก Uses the GPT-4 tokenizer
๐Ÿ“œ Custom License, commercially useable

Collection: databricks/dbrx-6601c0852a0cdd3c59f71962
Demo: https://huggingface.co/spaces/databricks/dbrx-instruct

Kudos to the Team at Databricks and MosaicML for this strong release in the open community! ๐Ÿค—
ยท
philschmidย 
posted an update about 1 year ago
view post
Post
What's the best way to fine-tune open LLMs in 2024? Look no further! ๐Ÿ‘€ย I am excited to share โ€œHow to Fine-Tune LLMs in 2024 with Hugging Faceโ€ using the latest research techniques, including Flash Attention, Q-LoRA, OpenAI dataset formats (messages), ChatML, Packing, all built with Hugging Face TRL. ๐Ÿš€

It is created for consumer-size GPUs (24GB) covering the full end-to-end lifecycle with:
๐Ÿ’กDefine and understand use cases for fine-tuning
๐Ÿง‘๐Ÿปโ€๐Ÿ’ปย Setup of the development environment
๐Ÿงฎย Create and prepare dataset (OpenAI format)
๐Ÿ‹๏ธโ€โ™€๏ธย Fine-tune LLM using TRL and the SFTTrainer
๐Ÿฅ‡ย Test and evaluate the LLM
๐Ÿš€ย Deploy for production with TGI

๐Ÿ‘‰ย  https://www.philschmid.de/fine-tune-llms-in-2024-with-trl

Coming soon: Advanced Guides for multi-GPU/multi-Node full fine-tuning and alignment using DPO & KTO. ๐Ÿ”œ
ยท