---
language:
- de
tags:
- ColBERT
- PyLate
- sentence-transformers
- sentence-similarity
pipeline_tag: sentence-similarity
library_name: PyLate
datasets:
- samheym/ger-dpr-collection
base_model:
- deepset/gbert-base
---

# Model Overview

GerColBERT is a ColBERT-based retrieval model trained on German text. It is designed for efficient late interaction-based retrieval while maintaining high-quality ranking performance.
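
To make "late interaction" concrete: ColBERT-style models keep one embedding per token and score a query against a document by taking, for each query token, its maximum similarity over all document tokens, then summing those maxima (MaxSim). The following is a minimal pure-Python sketch of that scoring rule, not the model's actual implementation. The 2-dimensional vectors and the names `normalize`, `maxsim_score`, `doc_a`, and `doc_b` are hypothetical and chosen for illustration; the real model emits 128-dimensional token vectors, and cosine similarity is assumed here.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def maxsim_score(query_tokens, doc_tokens):
    """Late-interaction score: sum over query tokens of the best-matching document token."""
    q = [normalize(v) for v in query_tokens]
    d = [normalize(v) for v in doc_tokens]
    total = 0.0
    for qv in q:
        # Best cosine similarity between this query token and any document token
        total += max(sum(a * b for a, b in zip(qv, dv)) for dv in d)
    return total


# Toy token embeddings (hypothetical values, 2 dimensions instead of 128)
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
doc_b = [[1.0, 1.0]]

# Rank the documents by their MaxSim score, best first
ranked = sorted(
    {"doc_a": doc_a, "doc_b": doc_b}.items(),
    key=lambda item: maxsim_score(query, item[1]),
    reverse=True,
)
print([name for name, _ in ranked])
```

Because each query token picks its own best match, `doc_a` (which has a close match for both query tokens) outranks `doc_b` (which only partially matches each one).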

## Training Configuration

- Base Model: [deepset/gbert-base](https://huggingface.co/deepset/gbert-base)
- Training Dataset: samheym/ger-dpr-collection
- Training Subset: a random 10% sample of the triples in the final dataset
- Vector Length: 128
- Maximum Document Length: 256 Tokens 
- Batch Size: 50
- Training Steps: 80,000
- Gradient Accumulation: 1 step
- Learning Rate: 5 × 10⁻⁶
- Optimizer: AdamW
- In-Batch Negatives: Included





## Usage
First install the PyLate library:

```bash
pip install -U pylate
```

### Retrieval 

PyLate provides a streamlined interface to index and retrieve documents using ColBERT models. The index leverages the Voyager HNSW index to efficiently handle document embeddings and enable fast retrieval.

```python
from pylate import indexes, models, retrieve

# Step 1: Load the ColBERT model
model = models.ColBERT(
    model_name_or_path="samheym/GerColBERT",
)
```





<!--
## Citation

### BibTeX

<!--
## Glossary

*Clearly define terms in order to be accessible across audiences.*
-->

<!--
## Model Card Authors

*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
-->

<!--
## Model Card Contact

*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
-->