metadata
language: en
license: apache-2.0
tags:
- semantic-search
- research-papers
- arxiv
- sbert
model_name: Fine-Tuned Semantic Search Model (Arxiv Papers)
base_model: sentence-transformers/all-MiniLM-L6-v2
datasets:
- arxiv_community/arxiv_dataset
arxiv-search
This model is a fine-tuned version of all-MiniLM-L6-v2, trained on Arxiv research papers to perform semantic similarity search.
Model Details
- Base Model: sentence-transformers/all-MiniLM-L6-v2
- Training Data: Arxiv Research Papers (title + abstract)
- Fine-Tuned Task: Semantic Search
- Use Case: Find similar research papers based on a query
- License: Apache 2.0
How to Use
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("Talina06/arxiv-search")
query = "Neural networks in medicine"
query_embedding = model.encode(query)
# Use FAISS or cosine similarity to retrieve similar papers
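The retrieval step mentioned in the comment above can be sketched as follows. This is a minimal, dependency-free illustration of cosine-similarity ranking, not the author's pipeline: the helper names `cosine_similarity` and `top_k` are hypothetical, and the 3-dimensional toy vectors stand in for real embeddings such as the output of `model.encode(...)`.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def top_k(query_embedding, paper_embeddings, titles, k=3):
    """Return the k (title, score) pairs most similar to the query, best first."""
    scored = [
        (title, cosine_similarity(query_embedding, emb))
        for title, emb in zip(titles, paper_embeddings)
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy vectors (assumption): placeholders for real paper embeddings.
titles = ["Paper A", "Paper B", "Paper C"]
paper_embeddings = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.7, 0.7, 0.0]]
results = top_k([1.0, 0.1, 0.0], paper_embeddings, titles, k=2)
for title, score in results:
    print(f"{title}: {score:.3f}")
```

With the real model, the paper vectors would presumably come from encoding each paper's title and abstract, and `query_embedding.tolist()` could be passed as the query. For larger collections, `sentence_transformers.util.cos_sim` or a FAISS index is likely a better fit than this pure-Python loop.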
Training Details
- Training Data: 100k+ Arxiv research papers
- Training Framework: Sentence Transformers
- Hyperparameters:
  - Learning Rate: 2e-5
  - Batch Size: 100
  - Epochs: 10
- Hardware Used: TPU & GPU
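As a rough sense of scale, the hyperparameters above imply the following number of optimizer steps. This is a back-of-the-envelope sketch that assumes exactly 100,000 training examples (the lower bound of "100k+"), one example per paper, and no gradient accumulation; the function name `training_steps` is hypothetical.

```python
import math


def training_steps(num_examples, batch_size, epochs):
    """Return (steps_per_epoch, total_steps) for a plain training loop."""
    steps_per_epoch = math.ceil(num_examples / batch_size)
    return steps_per_epoch, steps_per_epoch * epochs


# Assumption: 100,000 examples; batch size and epochs are taken from the list above.
steps_per_epoch, total_steps = training_steps(100_000, batch_size=100, epochs=10)
print(steps_per_epoch, total_steps)
```

Under these assumptions each epoch takes 1,000 steps, for 10,000 steps in total; a larger dataset would scale both figures proportionally.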
Example Search Results
Query | Top Matching Paper Title | Similarity Score |
---|---|---|
"Neural networks in healthcare" | "Deep Learning for Medical Diagnosis" | 0.89 |
"Quantum cryptography" | "A Survey on Quantum-Safe Encryption" | 0.87 |