How to Use olmOCR GGUF Model with Ollama?

by koala8104

Hi,

I've downloaded the olmOCR GGUF model and added it to Ollama (running on localhost:11434), but I'm struggling to get it working properly.

Could someone share:

  1. The correct prompt format for olmOCR with Ollama
  2. How to convert PDFs to images and send them to the model
  3. A simple code example showing how to use it

I've read the GitHub repo but still haven't managed to make it work.
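
In case it helps, this is roughly what I have been trying so far. It is only a rough sketch: it assumes pdf2image (with poppler) for rendering the PDF page, Ollama's /api/generate endpoint, a model tag of olmocr, and a placeholder prompt rather than the official olmOCR prompt from the repo.

# Rough sketch: render the first PDF page to a PNG with pdf2image
# (requires poppler), then send it to Ollama as a base64-encoded image.
import base64
import io

import requests
from pdf2image import convert_from_path  # pip install pdf2image

# Render only the first page at 150 DPI.
pages = convert_from_path("document.pdf", dpi=150, first_page=1, last_page=1)

buf = io.BytesIO()
pages[0].save(buf, format="PNG")
image_b64 = base64.b64encode(buf.getvalue()).decode("utf-8")

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "olmocr",  # whatever tag the GGUF was imported under
        "prompt": "Transcribe the text on this page.",  # placeholder, not the official olmOCR prompt
        "images": [image_b64],
        "stream": False,
    },
    timeout=300,
)
print(resp.json()["response"])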

Thanks!

I have the same question. Could someone share how to use olmOCR with Ollama?

I also could not get it to OCR images, and for some reason it did not work for me even as a plain LLM (it returned random text, which indicates a wrong prompt structure).
To fix the chat behavior I ran ollama create olmocr -f Modelfile with the following Modelfile:

FROM olmOCR-7B-0225-preview-Q5_K_M.gguf
TEMPLATE """{{- if .Messages }}
{{- range $i, $_ := .Messages }}
{{- $last := eq (len (slice $.Messages $i)) 1 -}}
<|im_start|>{{ .Role }}
{{ .Content }}
{{- if $last }}
{{- if (ne .Role "assistant") }}<|im_end|>
<|im_start|>assistant
{{ end }}
{{- else }}<|im_end|>
{{ end }}
{{- end }}
{{- else }}
{{- if .System }}<|im_start|>system
{{ .System }}<|im_end|>
{{ end }}{{ if .Prompt }}<|im_start|>user
{{ .Prompt }}<|im_end|>
{{ end }}<|im_start|>assistant
{{ end }}{{ .Response }}{{ if .Response }}<|im_end|>{{ end }}"""

SYSTEM You are a helpful assistant.
PARAMETER temperature 0.1

But it still says that it does not see images.
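
For reference, this is roughly how I am passing an image to the imported model (assuming Ollama's /api/chat endpoint and the olmocr tag created above); the reply still claims there is no image attached.

# How I am sending an image to the model created above; the answer
# still says it cannot see any image.
import base64

import requests

with open("page.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

resp = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "olmocr",
        "messages": [
            {
                "role": "user",
                "content": "Transcribe the text on this page.",
                "images": [image_b64],
            }
        ],
        "stream": False,
    },
    timeout=300,
)
print(resp.json()["message"]["content"])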
I figured out that the vision models in the Ollama repos use a projector model as a second GGUF. I tried using the projector GGUF from Qwen VL 7B, but the Ollama CLI said "Error: invalid file magic".

Could you release the projector separately, publish the model on Ollama, or suggest a better solution?
Thanks for the great OCR model.
