Yosef Worku Alemneh

rasyosef

AI & ML interests

Pretraining, Supervised Fine Tuning, Direct Preference Optimization, Retrieval Augmented Generation (RAG), Function Calling

Recent Activity

updated a dataset 4 days ago

rasyosef/amharic-passage-retrieval-dataset

published a dataset 4 days ago

rasyosef/amharic-passage-retrieval-dataset

updated a collection 9 days ago

Amharic Text Embedding Models



rasyosef's activity

New activity in tomaarsen/natural-questions-hard-negatives 12 days ago

Using hard negatives VS query, pos pair to train embedding models

#2 opened 14 days ago by rasyosef

New activity in rasyosef/phi-2-instruct-apo 3 months ago

Adding Evaluation Results

#1 opened 6 months ago by leaderboard-pr-bot

New activity in rasyosef/Mistral-NeMo-Minitron-8B-Chat 3 months ago

Adding Evaluation Results

#3 opened 6 months ago by leaderboard-pr-bot

New activity in ContextualAI/ultrafeedback_clair_32k 3 months ago

Phi-2-Instruct-APO: aligned with Anchored Preference Optimization

#3 opened 6 months ago by rasyosef

New activity in meta-llama/Llama-3.2-1B 4 months ago

[Query-ISSUE] tokenizer.vocab_size is 128000, however len(tokenizer) is 128256, which prevents me from using those other tokens.

#34 opened 4 months ago by HV-Khurdula

What are the start and stop tokens of this model?

#40 opened 4 months ago by aryaash

Is the BOS token id of 128000 hardcoded into the llama 3.2 tokenizer?

#17 opened 5 months ago by rasyosef

New activity in nvidia/Mistral-NeMo-Minitron-8B-Base 6 months ago

Mistral-NeMo-Minitron-8B-Chat

#5 opened 7 months ago by rasyosef

New activity in rasyosef/Phi-1_5-Instruct-v0.1 6 months ago

What is the context window size of this model? I mean, what are the input and output token limits of this model?

#1 opened 6 months ago by naveen237

New activity in ContextualAI/ultrafeedback_clair_32k 6 months ago

APO Trainer in TRL?

#2 opened 6 months ago by rasyosef

New activity in rasyosef/Mistral-NeMo-Minitron-8B-Chat 7 months ago

ChatML template does not work properly

#2 opened 7 months ago by WasamiKirua

New activity in rasyosef/bert-medium-amharic 7 months ago

Collaboration

#1 opened 7 months ago by deleted

New activity in rasyosef/Llama-3.1-Minitron-4B-Chat 7 months ago

Error when trying to run

#1 opened 7 months ago by ctranslate2-4you

New activity in microsoft/Phi-3.5-mini-instruct 7 months ago

What changed for people using this model in english?

#3 opened 7 months ago by migueltalka

New activity in microsoft/phi-2 7 months ago

Phi 2 Instruct: an instruction following Phi 2 SLM that has undergone SFT and DPO

#132 opened 7 months ago by rasyosef

New activity in open-llm-leaderboard/open_llm_leaderboard 7 months ago

What should a finetuned model's license be if the model is MIT but the datasets are Apache 2.0 and cc-by-4.0

#866 opened 7 months ago by rasyosef

New activity in microsoft/phi-1_5 7 months ago

Phi 1.5 Instruct: an instruction following Phi 1.5 model that has undergone SFT and DPO

#89 opened 7 months ago by rasyosef

New activity in rasyosef/amharic-sentences-corpus 8 months ago

Update README.md

#2 opened 8 months ago by seyyaw

New activity in rasyosef/amharic-news-category-classification 10 months ago

Duplicate?

#2 opened 10 months ago by israel

New activity in mistral-community/Mixtral-8x22B-Instruct-v0.1-4bit 11 months ago

Model card is about Mixtral-8x7B instead of Mixtral-8x22B

#3 opened 11 months ago by rasyosef