PyLate model based on prajjwal1/bert-mini

This is a PyLate model finetuned from prajjwal1/bert-mini on the msmarco-train dataset. It maps sentences & paragraphs to sequences of 128-dimensional dense vectors and can be used for semantic textual similarity using the MaxSim operator.

Model Details

Model Description

  • Model Type: PyLate model
  • Base model: prajjwal1/bert-mini
  • Document Length: 256 tokens
  • Query Length: 32 tokens
  • Output Dimensionality: 128 dimensions
  • Similarity Function: MaxSim
  • Training Dataset: msmarco-train

Model Sources

Full Model Architecture

ColBERT(
  (0): Transformer({'max_seq_length': 255, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Dense({'in_features': 256, 'out_features': 128, 'bias': False, 'activation_function': 'torch.nn.modules.linear.Identity'})
)
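
The MaxSim similarity used by this model scores a query against a document by comparing every query token embedding with every document token embedding, keeping the best match per query token, and summing those maxima. Below is a minimal illustrative sketch using random tensors (not PyLate's internal implementation); real ColBERT token embeddings are L2-normalized, so the dot products behave like cosine similarities.

import torch

# One 128-dimensional vector per token (shapes follow this card: 32 query tokens,
# up to 256 document tokens, 128 output dimensions).
query_embeddings = torch.nn.functional.normalize(torch.randn(32, 128), dim=-1)
document_embeddings = torch.nn.functional.normalize(torch.randn(256, 128), dim=-1)

# Token-level similarity matrix of shape (num_query_tokens, num_document_tokens)
token_similarities = query_embeddings @ document_embeddings.T

# MaxSim: best-matching document token per query token, summed over the query
maxsim_score = token_similarities.max(dim=1).values.sum()
print(maxsim_score.item())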

Usage

First install the PyLate library:

pip install -U pylate

Retrieval

PyLate provides a streamlined interface to index and retrieve documents using ColBERT models. The index leverages the Voyager HNSW index to efficiently handle document embeddings and enable fast retrieval.

Indexing documents

First, load the ColBERT model and initialize the Voyager index, then encode and index your documents:

from pylate import indexes, models, retrieve

# Step 1: Load the ColBERT model
model = models.ColBERT(
    model_name_or_path="yosefw/colbert-bert-mini",
)

# Step 2: Initialize the Voyager index
index = indexes.Voyager(
    index_folder="pylate-index",
    index_name="index",
    override=True,  # This overwrites the existing index if any
)

# Step 3: Encode the documents
documents_ids = ["1", "2", "3"]
documents = ["document 1 text", "document 2 text", "document 3 text"]

documents_embeddings = model.encode(
    documents,
    batch_size=32,
    is_query=False,  # Ensure that it is set to False to indicate that these are documents, not queries
    show_progress_bar=True,
)

# Step 4: Add document embeddings to the index by providing embeddings and corresponding ids
index.add_documents(
    documents_ids=documents_ids,
    documents_embeddings=documents_embeddings,
)

Note that you do not have to recreate the index and encode the documents every time. Once you have created an index and added the documents, you can re-use the index later by loading it:

# To load an index, simply instantiate it with the correct folder/name and without overriding it
index = indexes.Voyager(
    index_folder="pylate-index",
    index_name="index",
)

Retrieving top-k documents for queries

Once the documents are indexed, you can retrieve the top-k most relevant documents for a given set of queries. To do so, initialize the ColBERT retriever with the index you want to search, encode the queries, and then retrieve the top-k documents to get the ids and relevance scores of the top matches:

# Step 1: Initialize the ColBERT retriever
retriever = retrieve.ColBERT(index=index)

# Step 2: Encode the queries
queries_embeddings = model.encode(
    ["query for document 3", "query for document 1"],
    batch_size=32,
    is_query=True,  # Ensure that it is set to True to indicate that these are queries
    show_progress_bar=True,
)

# Step 3: Retrieve top-k documents
scores = retriever.retrieve(
    queries_embeddings=queries_embeddings,
    k=10,  # Retrieve the top 10 matches for each query
)
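
The returned scores object holds, for each query, the ids of the retrieved documents together with their relevance scores. As an indicative sketch (the exact structure may differ slightly between PyLate versions), the results can be inspected like this:

# Print the retrieved document ids and scores per query (structure indicative)
queries = ["query for document 3", "query for document 1"]
for query, results in zip(queries, scores):
    print(query)
    for result in results:
        print(f"  id={result['id']}  score={result['score']:.2f}")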

Reranking

If you only want to use the ColBERT model to rerank the results of a first-stage retrieval pipeline without building an index, you can simply use the rank function and pass it the queries and documents to rerank:

from pylate import rank, models

queries = [
    "query A",
    "query B",
]

documents = [
    ["document A", "document B"],
    ["document 1", "document C", "document B"],
]

documents_ids = [
    [1, 2],
    [1, 3, 2],
]

model = models.ColBERT(
    model_name_or_path="yosefw/colbert-bert-mini",
)

queries_embeddings = model.encode(
    queries,
    is_query=True,
)

documents_embeddings = model.encode(
    documents,
    is_query=False,
)

reranked_documents = rank.rerank(
    documents_ids=documents_ids,
    queries_embeddings=queries_embeddings,
    documents_embeddings=documents_embeddings,
)
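
The reranked_documents object contains, per query, the document ids reordered by their MaxSim scores. A hedged example of inspecting it follows (the field names are assumed to match the retrieval output shown above and may vary across PyLate versions):

for query, ranking in zip(queries, reranked_documents):
    print(query)
    for entry in ranking:
        print(f"  id={entry['id']}  score={entry['score']:.2f}")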

Evaluation

Metrics

ColBERTTriplet

  • Evaluated with pylate.evaluation.colbert_triplet.ColBERTTripletEvaluator
  • accuracy: 0.8179
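
For reference, a triplet accuracy of this kind can be computed with the evaluator named above. The snippet below is only a sketch: the triplet data is made up, and the constructor arguments (anchors/positives/negatives) are assumed by analogy with sentence-transformers' TripletEvaluator rather than taken from this card; check the PyLate documentation for the exact API.

from pylate import evaluation, models

model = models.ColBERT(model_name_or_path="yosefw/colbert-bert-mini")

# Hypothetical evaluation triplets: each query paired with a relevant and an
# irrelevant passage. Replace these with your own evaluation data.
anchors = ["are rock band guitars compatible with guitar hero"]
positives = ["Instruments compatible with The Beatles: Rock Band, Guitar Hero 5, ..."]
negatives = ["Temperate deciduous forests are found in eastern North America."]

# Argument names here are an assumption, not confirmed by this model card.
evaluator = evaluation.ColBERTTripletEvaluator(
    anchors=anchors,
    positives=positives,
    negatives=negatives,
)
print(evaluator(model))  # fraction of triplets where the positive outranks the negative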

Training Details

Training Dataset

msmarco-train

  • Dataset: msmarco-train at 6853021
  • Size: 983,844 training samples
  • Columns: query_id, query, positive, negative_1, negative_2, negative_3, and negative_4
  • Approximate statistics based on the first 1000 samples:
    • query_id: int; 500 distinct ids (24716 to 25791), each ~0.20% of the sampled rows
    • query: string; min 6, mean 10.55, max 27 tokens
    • positive: string; min 23, mean 31.97, max 32 tokens
    • negative_1: string; min 18, mean 31.95, max 32 tokens
    • negative_2: string; min 19, mean 31.97, max 32 tokens
    • negative_3: string; min 20, mean 31.97, max 32 tokens
    • negative_4: string; min 21, mean 31.97, max 32 tokens
  • Samples:
    query_id query positive negative_1 negative_2 negative_3 negative_4
    24716 are rock band guitars compatible with guitar hero Instruments compatible with The Beatles: Rock Band, Green Day: Rock Band, Lego Rock Band, Guitar Hero 4, Guitar Hero 5, Guitar Hero: Van Halen, Guitar Hero Smash Hits, Guitar Hero: Metallica, and Band Hero. However, guitar compatibility has become a major question now that Rock Band has joined Guitar Hero in the genre. You can't mix and match controllers with games anymore now that we have two competing game franchises with different guitar designs. Games compatible with the Guitar Hero 4, Band Hero, Rock Band 1, Rock Band 2, and The Beatles: Rock Band mics. Our position is really all about respect for our consumers and for the money that they have spent to get into the game space in the first place. They spend a fortune on games: on consoles, on hardware, and we're sensitive to that, he said. For more on the music party game resurgence, check out IGN's discussion about both Guitar Hero and Rock Band. Cassidee is a freelance writer for various outlets around the web. You can chat with her about all things geeky on Twitter.
    24716 are rock band guitars compatible with guitar hero Instruments compatible with The Beatles: Rock Band, Green Day: Rock Band, Lego Rock Band, Guitar Hero 4, Guitar Hero 5, Guitar Hero: Van Halen, Guitar Hero Smash Hits, Guitar Hero: Metallica, and Band Hero. While the Xbox 360 RedOctane controllers play nicely with Rock Band, we can't say that the Rock Band guitar returns the favor. The Rock Band Fender Stratocaster will not work with Guitar Hero II or Guitar Hero III. The new peripherals for the recently-announced Guitar Hero Live will not work with Rock Band 4, although Sussman hasn't completely ruled it out, saying he won't know for sure until [they] get one in the office.. Note: Rock Band instruments ARE compatible to Guitar Hero 5 and Band Hero for the Wii Edit. Do NOT change those things. Rock Band series Guitars, Drumkits, and Microphones ARE, and I mean ARE compatible fore Guitar Hero 5 and Band Hero for the Wii. THEY FUCKING ARE! Speaking with Business Insider, Rock Band 4 project director Daniel Sussman revealed the upcoming music game will be compatible with last-gen Guitar Hero and Rock Band peripherals. Sussman explains he and his team wanted to see this integration through as a way to cut costs for consumers.
    24720 are rocks found in temperate deciduous forests Since the smaller biomes make up the Deciduous Forests it shows that because it is a forest then naturally trees, animals and rocks are found inside of it. (Riley M) Temperate forests are in cool rainy areas that have trees that lose their leaves in the Fall, and then grow back in the Spring. Plants Animals Climate Northeast Asian Deciduous Forest Deciduous forests can be found in the eastern half of North America, and the middle of Europe. There are many deciduous forests in Asia. Some of the major areas that they are in are southwest Russia, Japan, and eastern China. Northeast Asian Deciduous Forest. Deciduous forests can be found in the eastern half of North America, and the middle of Europe. There are many deciduous forests in Asia. Some of the major areas that they are in are southwest Russia, Japan, and eastern China. The trees found in temperate forest are called hardwoods. Hardwoods are trees that loose their leaves in the winter, also their trunks are also made of a bark that is very hard. (Dimitri T.). Most of the trees in temperate forest are maple, birch, beech, oak, hickory, and sweet gum. Temperate (Deciduous Forest) occupies the eastern part of the United States {Coraima T}. Temperate deciduous forests are found in continents such as North America, South America, Europe, Asia, Austrailia, and Africa. The soil in these temperate forests can be very rocky and sandy.
  • Loss: pylate.losses.contrastive.Contrastive

Evaluation Dataset

msmarco-train

  • Dataset: msmarco-train at 6853021
  • Size: 20,000 evaluation samples
  • Columns: query_id, query, positive, negative_1, negative_2, negative_3, and negative_4
  • Approximate statistics based on the first 1000 samples:
    • query_id: int; 500 distinct ids (3 to 1833), each ~0.20% of the sampled rows
    • query: string; min 5, mean 11.05, max 32 tokens
    • positive: string; min 17, mean 31.92, max 32 tokens
    • negative_1: string; min 25, mean 31.96, max 32 tokens
    • negative_2: string; min 17, mean 31.93, max 32 tokens
    • negative_3: string; min 15, mean 31.9, max 32 tokens
    • negative_4: string; min 18, mean 31.92, max 32 tokens
  • Samples:
    query_id query positive negative_1 negative_2 negative_3 negative_4
    3 Another name for the primary visual cortex is The primary (parts of the cortex that receive sensory inputs from the thalamus) visual cortex is also known as V1, V isual area one, and the striate cortex.The extrastriate areas consist of visual areas two (V2), three (V3), four (V4), and five (V5).he primary visual cortex is the best-studied visual area in the brain. In all mammals studied, it is located in the posterior pole of the occipital cortex (the occipital cortex is responsible for processing visual stimuli). The visual cortex is made up of Brodmann area 17 (the primary visual cortex), and Brodmann areas 18 and 19, the extrastriate cortical areas.he primary visual cortex is the best-studied visual area in the brain. In all mammals studied, it is located in the posterior pole of the occipital cortex (the occipital cortex is responsible for processing visual stimuli). The visual cortex of the brain is the part of the cerebral cortex responsible for processing visual information. This article addresses the ventral/dorsal model of the visual cortex. Another model for the perceptual/conceptual neuropsychological model of the visual cortex was studied by Raftopolous.he primary visual cortex is the best-studied visual area in the brain. In all mammals studied, it is located in the posterior pole of the occipital cortex (the occipital cortex is responsible for processing visual stimuli). The primary visual cortex is the best-studied visual area in the brain. In all mammals studied, it is located in the posterior pole of the occipital cortex (the occipital cortex is responsible for processing visual stimuli).he primary visual cortex is the best-studied visual area in the brain. In all mammals studied, it is located in the posterior pole of the occipital cortex (the occipital cortex is responsible for processing visual stimuli). Damage to the primary visual cortex, which is located on the surface of the posterior occipital lobe, can cause blindness due to the holes in the visual map on the surface of the visual cortex that resulted from the lesions. significant functional aspect of the occipital lobe is that it contains the primary visual cortex. Retinal sensors convey stimuli through the optic tracts to the lateral geniculate bodies, where optic radiations continue to the visual cortex.
    3 Another name for the primary visual cortex is The primary (parts of the cortex that receive sensory inputs from the thalamus) visual cortex is also known as V1, V isual area one, and the striate cortex.The extrastriate areas consist of visual areas two (V2), three (V3), four (V4), and five (V5).he primary visual cortex is the best-studied visual area in the brain. In all mammals studied, it is located in the posterior pole of the occipital cortex (the occipital cortex is responsible for processing visual stimuli). Called also first visual area. visual cortex the area of the occipital lobe of the cerebral cortex concerned with vision; the striate cortex is also called the first visual area, and the adjacent second and third visual areas serve as its association areas.rea 17 (which is also called the striate cortex or area because the line of Gennari is grossly visible on its surface) is the primary visual cortex, receiving the visual radiation from the lateral geniculate body of the thalamus. The primary visual cortex, V1, is the koniocortex (sensory type) located in and around the calcarine fissure in the occipital lobe. It is the one that receives information directly from the lateral geniculate nucleus .To this have been added later as many as thirty interconnected (secondary or tertiary) visual areas.esearch on the primary visual cortex can involve recording action potentials from electrodes within the brain of cats, ferrets, mice, or monkeys, or through recording intrinsic optical signals from animals or fMRI signals from human and monkey V1. However, only in the primary visual cortex (V1) can this band be seen with the naked eye as the line of Gennari (#6282, #8987). In fact, the term striate (striped) cortex is another name for primary visual cortex.Primary visual cortex is also referred to as either calcarine cortex or Brodmann’s area 17.djacent to primary visual cortex (V1) are the visual association cortical areas, including V2 and V3 (= area 18, or 18 & 19, depending on the author) (#4350). Lesions of V1, V2 and V3 produce identical visual field defects. Beyond V3, visual information is processed along two functionally different pathways. Primary visual cortex (V1) Edit. The primary visual cortex is the best studied visual area in the brain. Like that of all mammals studied, it is located in the posterior pole of the occipital cortex (the occipital cortex is responsible for processing visual stimuli).It is the simplest, earliest cortical visual area.esearch on the primary visual cortex can involve recording action potentials from electrodes within the brain of cats, ferrets, mice, or monkeys, or through recording intrinsic optical signals from animals or fMRI signals from human and monkey V1.
    4 Defining alcoholism as a disease is associated with Jellinek The formation of AA – Alcoholics Anonymous – in the 1930s and the publication of noted psychiatrist and Director of the Center of Alcohol Studies at Yale Medical School E. M. Jellinek’s famous book defining the concept of alcoholism as a medical disease facilitated moving alcoholism into a different light.s alcoholism is an addiction, it is considered a disease of the brain. The brain has been physically altered by extended exposure to alcohol, causing it to function differently and therefore creating addictive behavior. Nonetheless, it was Jellinek's Stages of the Alcoholism that led to diagnosing alcoholism as a disease and eventually to the medical acceptance of alcoholism as a disease. Astoundingly, the inception of the disease theory and treatment for substance abuse is based on fraud.n a recent Gallup poll, 90 percent of people surveyed believe that alcoholism is a disease. Most argue that because the American Medical Association (AMA) has proclaimed alcoholism a disease, the idea is without reproach. But, the fact is that the AMA made this determination in the absence of empirical evidence. 1 JELLINEK PHASES: THE PROGRESSIVE SYMPTOMS OF ALCOHOLISM The behavioral characteristics of the alcoholic are progressive as is the person's tolerance to alcohol and as is the course of the disease itself.ypes of alcoholism: Jellinek's species The pattern described above refers to the stages of alcohol addiction. Jellinek continued his study of alcoholism, focusing on alcohol problems in other countries. 1 Jellinek, E. M., Phases in the Drinking History of Alcoholics: Analysis of a Survey Conducted by the Official Organ of Alcoholics Anonymous, Quarterly Journal of Studies on Alcohol, Vol.7, (1946), pp. 2 1–88. 3 Jellinek, E. M., The Disease Concept of Alcoholism, Hillhouse, (New Haven), 1960.uring the 1920s, he conducted research in Sierra Leone and at Tela, Honduras. In the 1930s he returned to the U.S.A. and worked at the Worcester State Hospital, Worcester, Massachusetts, from whence he was commissioned to conduct a study for the Research Council on Problems of Alcohol. ALCOHOLISM: A DISEASE. In 1956 the American Medical Association decided that alcoholism is a disease, however more than 30 years later this is still debated in certain circles.Besides the medical opinion, there are many others (e.g., legal, sociological, religious) which derive from any number of social pressures.LCOHOLISM: A DISEASE. In 1956 the American Medical Association decided that alcoholism is a disease, however more than 30 years later this is still debated in certain circles.
  • Loss: pylate.losses.contrastive.Contrastive

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: epoch
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • learning_rate: 1e-05
  • num_train_epochs: 4
  • lr_scheduler_type: cosine
  • fp16: True
  • push_to_hub: True

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: epoch
  • prediction_loss_only: True
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 1e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 4
  • max_steps: -1
  • lr_scheduler_type: cosine
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: True
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss Validation Loss accuracy
1.0 30746 1.1105 - -
0 0 - - 0.8083
1.0 30746 - 1.1328 -
2.0 61492 0.9818 - -
0 0 - - 0.8146
2.0 61492 - 1.1017 -
3.0 92238 0.939 - -
0 0 - - 0.8172
3.0 92238 - 1.0939 -
4.0 122984 0.9216 - -
0 0 - - 0.8179
4.0 122984 - 1.0912 -
0 0 - - 0.8179

Framework Versions

  • Python: 3.11.13
  • Sentence Transformers: 4.0.2
  • PyLate: 1.2.0
  • Transformers: 4.48.2
  • PyTorch: 2.6.0+cu124
  • Accelerate: 1.8.1
  • Datasets: 3.6.0
  • Tokenizers: 0.21.2

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084"
}

PyLate

@misc{PyLate,
    title={PyLate: Flexible Training and Retrieval for Late Interaction Models},
    author={Chaffin, Antoine and Sourty, Raphaël},
    url={https://github.com/lightonai/pylate},
    year={2024}
}