Nevidu committed
Commit 52a6dfd · verified · Parent(s): 5169989

Update README.md

Files changed (1): README.md (+1 −18)
README.md CHANGED
@@ -18,24 +18,7 @@ Students & Academic Teams: Showcase research on model adaptation, data augmentation
  Industry & Startups: Demonstrate practical performance in real-world pipelines; optimise inference speed, resource usage.
 
  2. Allowed Base Models
- Participants must choose one of the following (or any other fully open-source LLM ≤ 8 B params):
-
- | Model Name | Parameters | Notes |
- | --- | --- | --- |
- | Llama 3 | 1B, 3B, 7B | Meta's Llama series, particularly the smaller versions, is designed for efficiency and multilingual text generation. While the larger Llama models are more widely known, the 1B and 3B models offer a compact solution. Meta has also shown interest in addressing the linguistic diversity gap, which includes support for languages like Sinhala and Tamil. |
- | Gemma | 2B, 4B | Developed by Google DeepMind, Gemma models are known for being lightweight yet powerful, with strong multilingual capabilities. Google has a strong focus on linguistic diversity, and Gemma's architecture makes it a good candidate for adapting to less-resourced languages. |
- | Qwen-2 | 0.5B, 1.5B, 7B | This family of models from Alibaba is designed for efficiency and versatility. Their strong multilingual pretraining makes them good candidates for adaptation to Sinhala and Tamil through fine-tuning. |
- | Microsoft Phi-3-Mini | 3.8B | This model from Microsoft is highlighted for its strong reasoning and code generation capabilities within a compact size. While its primary focus isn't explicitly on a wide range of South Asian languages, its efficient design and good general language understanding could make it a suitable base for fine-tuning with Sinhala and Tamil data. |
-
- Or … any other open-source checkpoint ≤ 8 B params
-
+ Participants must choose one of the following (or any other fully open-source LLM ≤ 8 B params)
 
  Note: Proprietary or closed-license models (e.g., GPT-3 series, Claude) are not allowed.
 
  3. Data Resources and Evaluation