Haystack can now see
The latest release of the Haystack OSS LLM framework adds a long-requested feature: image support!
Notebooks below
This isn't just about passing images to an LLM. We built several features to enable practical multimodal use cases.
What's new?
• Support for multiple LLM providers: OpenAI, Amazon Bedrock, Google Gemini, Mistral, NVIDIA, OpenRouter, Ollama and more (support for Hugging Face API coming soon)
• Prompt template language to handle structured inputs, including images
• PDF and image converters
• Image embedders using CLIP-like models
• LLM-based extractor to pull text from images
• Components to build multimodal RAG pipelines and Agents
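To give a feel for what CLIP-like image embedders make possible, here is a minimal sketch of text-to-image retrieval by cosine similarity. It deliberately does not use the Haystack API: the vectors are made-up 3-d stand-ins for real embeddings, and `image_index` and `retrieve` are hypothetical names chosen only for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Toy vectors standing in for CLIP-like image embeddings (values are invented).
image_index = {
    "cat.jpg": [0.9, 0.1, 0.0],
    "invoice.png": [0.1, 0.9, 0.2],
    "chart.png": [0.0, 0.3, 0.9],
}

def retrieve(query_embedding, index, top_k=1):
    """Return the names of the top_k images most similar to the query."""
    ranked = sorted(
        index,
        key=lambda name: cosine(query_embedding, index[name]),
        reverse=True,
    )
    return ranked[:top_k]

# Pretend this is the embedding of a text query such as "scanned receipt".
query = [0.2, 0.8, 0.1]
print(retrieve(query, image_index, top_k=2))
```

Because text and images share one embedding space in CLIP-like models, the same ranking step works whether the query is a sentence or another image. In a real pipeline the vectors would come from an embedder component and live in a document store.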
I had the chance to lead this effort with @sjrhuschlee (great collab).
Below you can find two notebooks to explore the new features:
• Introduction to Multimodal Text Generation https://haystack.deepset.ai/cookbook/multimodal_intro
• Creating Vision+Text RAG Pipelines https://haystack.deepset.ai/tutorials/46_multimodal_rag
(🖼️ image by @bilgeyucel)