If you are using AWS, give this a read. It is a living document showing how to deploy and fine-tune DeepSeek R1 models with Hugging Face on AWS.
We're working hard to enable all the scenarios, whether you want to deploy to Inference Endpoints, SageMaker, or EC2, with GPUs or with Trainium & Inferentia.
We have full support for the distilled models; DeepSeek-R1 support is coming soon. I'll keep you posted!
With the big hype around AI agents these days, I couldn't stop thinking about how AI agents could truly enhance real-world activities. What sort of applications could we build with these agents: agentic RAG? Self-correcting text-to-SQL? Nah, boring...
Passionate about the outdoors, I've always dreamed of a tool that could simplify planning mountain trips while accounting for all potential risks. That's why I built Alpine Agent, a smart assistant designed to help you plan safe and enjoyable itineraries in the French Alps and Pyrenees.
Built using Hugging Face's smolagents library, Alpine Agent combines the power of AI with trusted resources like skitour.fr (https://skitour.fr/) and Météo-France. Whether it's suggesting a route of moderate difficulty or analyzing avalanche risks and weather conditions, this agent dynamically integrates data to deliver personalized recommendations.
In my latest blog post, I share how I developed this project, from defining tools and integrating APIs to selecting the best LLMs, like Qwen2.5-Coder-32B-Instruct, Llama-3.3-70B-Instruct, or GPT-4.
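To make "defining tools" concrete, here is a minimal sketch of the kind of tool an agent like this could call. The function name, the massif names, and the hard-coded bulletin data are all hypothetical stand-ins: the real Alpine Agent pulls live data from Météo-France and skitour.fr, which is not reproduced here.

```python
def get_avalanche_risk(massif: str) -> str:
    """Return the avalanche risk level (1-5) for a given mountain massif.

    Args:
        massif: Name of the massif, e.g. "Chamonix-Mont-Blanc".
    """
    # Hypothetical stand-in data; a real tool would query the
    # Météo-France avalanche bulletin instead of this dict.
    fake_bulletins = {
        "chamonix-mont-blanc": 3,
        "vanoise": 2,
        "haute-bigorre": 4,
    }
    risk = fake_bulletins.get(massif.lower())
    if risk is None:
        return f"No bulletin found for massif '{massif}'."
    return f"Avalanche risk for {massif}: {risk}/5."

# With smolagents installed, a function like this is typically wrapped
# with the @tool decorator and handed to an agent, roughly:
#
#   from smolagents import CodeAgent, tool
#   agent = CodeAgent(tools=[tool(get_avalanche_risk)], model=...)
#   agent.run("Is it safe to ski-tour near Chamonix this weekend?")
```

The agent's LLM then decides when to call the tool based on its docstring and type hints, which is why both are spelled out explicitly.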
⛷️ Curious how AI can enhance adventure planning? Try the app and share your thoughts: florentgbelidji/alpine-agent. Want to build your own agents? Whether for cooking, sports training, or other passions, the possibilities are endless. Check out the blog post to learn more: https://huggingface.co/blog/florentgbelidji/alpine-agent
Many thanks to @m-ric for helping build this tool with smolagents!
Cosmos is a family of pre-trained models purpose-built for generating physics-aware videos and world states to advance physical AI development. The release includes the Cosmos Tokenizers collection: nvidia/cosmos-tokenizer-672b93023add81b66a8ff8e6
It's the 2nd of December, and here's your Cyber Monday present!
We're cutting prices on Hugging Face Inference Endpoints and Spaces!
Our folks at Google Cloud are treating us to a 40% price cut on GCP NVIDIA A100 GPUs for the next 3 months. We have other reductions on all instances, ranging from 20% to 50%.
If you use Google Kubernetes Engine to host your ML workloads, I think this series of videos is a great way to kickstart your journey of deploying LLMs in less than 10 minutes! Thank you @wietse-venema-demo!
I'd like to share a bit more about the Deep Learning Containers (DLCs) we built with Google Cloud to transform the way you build AI with open models on this platform!
With pre-configured, optimized environments for PyTorch Training (GPU) and Inference (CPU/GPU), Text Generation Inference (GPU), and Text Embeddings Inference (CPU/GPU), the Hugging Face DLCs offer:
- Optimized performance on Google Cloud's infrastructure, with TGI, TEI, and PyTorch acceleration
- Hassle-free environment setup, no more dependency issues
- Seamless updates to the latest stable versions
- Streamlined workflow, reducing dev and maintenance overheads
- Robust security features of Google Cloud
- Fine-tuned for optimal performance, integrated with GKE and Vertex AI
- Community examples for easy experimentation and implementation

TPU support for PyTorch Training/Inference and Text Generation Inference is coming soon!
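Once a TGI container is serving a model (on GKE, Vertex AI, or locally), it exposes TGI's standard `/generate` route, which takes a JSON body with `inputs` and `parameters`. A minimal stdlib-only sketch of building and sending such a request; the endpoint URL below is a placeholder you would replace with your own deployment's address:

```python
import json
import urllib.request

TGI_URL = "http://localhost:8080/generate"  # placeholder; point at your endpoint

def build_tgi_request(prompt: str, max_new_tokens: int = 64) -> bytes:
    """Build the JSON body that TGI's /generate route expects."""
    body = {"inputs": prompt, "parameters": {"max_new_tokens": max_new_tokens}}
    return json.dumps(body).encode("utf-8")

def generate(prompt: str) -> str:
    """POST the prompt to a running TGI server and return the generated text."""
    req = urllib.request.Request(
        TGI_URL,
        data=build_tgi_request(prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["generated_text"]
```

The same request shape works whether the server is a DLC on Vertex AI or a local TGI container, which is the point of shipping a pre-configured image.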
Pro tip: if you're a Firefox user, you can set up Hugging Chat as an integrated AI assistant, with contextual links to summarize or simplify any text. Handy!
These 15 open models are available for serverless inference on Cloudflare Workers AI, powered by GPUs distributed in 150 datacenters globally - @rita3ko @mchenco @jtkipp @nkothariCF @philschmid
New state-of-the-art open LLM! Databricks just released DBRX, a 132B MoE trained on 12T tokens, claiming to surpass OpenAI's GPT-3.5 and to be competitive with Google Gemini 1.0 Pro.
TL;DR:
- 132B MoE with 16 experts, 4 active per token
- 32K context window
- Outperforms open LLMs on common benchmarks, including MMLU
- Up to 2x faster inference than Llama 2 70B
- Trained on 12T tokens
- Uses the GPT-4 tokenizer
- Custom license, commercially usable
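A quick back-of-the-envelope check on why a 132B MoE can beat a 70B dense model on inference speed: with 4 of 16 experts active per token, only a fraction of the weights participate in each forward pass. The estimate below naively treats all parameters as expert weights, which understates the true active count since attention and embedding weights are shared and always active (Databricks reports about 36B active parameters):

```python
total_params_b = 132     # total parameters, in billions
experts, active = 16, 4  # experts per MoE layer, experts active per token

# Naive estimate: if every parameter lived inside an expert, activating
# 4 of 16 experts would touch 132 * 4/16 = 33B parameters per token.
naive_active_b = total_params_b * active / experts
print(naive_active_b)  # 33.0
```

Either way, the per-token compute sits far below a dense 70B model's, which is where the "up to 2x faster than Llama 2 70B" claim comes from.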
What's the best way to fine-tune open LLMs in 2024? Look no further! I am excited to share "How to Fine-Tune LLMs in 2024 with Hugging Face", using the latest research techniques, including Flash Attention, Q-LoRA, OpenAI dataset formats (messages), ChatML, and packing, all built with Hugging Face TRL.
It is created for consumer-size GPUs (24GB), covering the full end-to-end lifecycle:
- Define and understand use cases for fine-tuning
- Set up the development environment
- Create and prepare the dataset (OpenAI format)
- Fine-tune the LLM using TRL and the SFTTrainer
- Test and evaluate the LLM
- Deploy for production with TGI
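The "OpenAI format" in the steps above stores each training sample as a list of role-tagged messages, which a chat template then renders into a single training string. A small sketch of one sample and a hand-rolled ChatML renderer; the renderer is illustrative only, since in practice the tokenizer's own `apply_chat_template` does this for you:

```python
# One training sample in the OpenAI "messages" format.
sample = {
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"},
        {"role": "assistant", "content": "The capital of France is Paris."},
    ]
}

def to_chatml(messages: list) -> str:
    """Render messages with the ChatML template (illustrative; in practice
    use tokenizer.apply_chat_template so the template matches the model)."""
    return "".join(
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    )

text = to_chatml(sample["messages"])

# With trl installed, a dataset of such samples is then passed to the
# SFTTrainer, roughly:
#
#   from trl import SFTTrainer
#   trainer = SFTTrainer(model=..., train_dataset=..., ...)
#   trainer.train()
```

Keeping the dataset in the messages format, rather than pre-rendered strings, lets you swap models (and thus chat templates) without rebuilding the data.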