Industry & Startups: Demonstrate practical performance in real-world pipelines; optimise inference speed and resource usage.

2. Allowed Base Models
Participants must choose one of the following (or any other fully open-source LLM ≤ 8 B params):

| Model Name | Parameters | Notes |
| --- | --- | --- |
| Llama 3 | 1B, 3B, 7B | Meta's Llama series, particularly the smaller versions, is designed for efficiency and multilingual text generation. While the larger Llama models are more widely known, the 1B and 3B models offer a compact solution. Meta has also shown interest in addressing the linguistic diversity gap, including support for languages such as Sinhala and Tamil. |
| Gemma | 2B, 4B | Developed by Google DeepMind, Gemma models are known for being lightweight yet powerful, with strong multilingual capabilities. Google has a strong focus on linguistic diversity, and Gemma's architecture makes it a good candidate for adapting to less-resourced languages. |
| Qwen-2 | 0.5B, 1.5B, 7B | This family of models from Alibaba is designed for efficiency and versatility. Their strong multilingual pretraining makes them good candidates for adaptation to Sinhala and Tamil through fine-tuning. |
| Microsoft Phi-3-Mini | 3.8B | This model from Microsoft is highlighted for its strong reasoning and code-generation capabilities within a compact size. While its primary focus is not explicitly on a wide range of South Asian languages, its efficient design and good general language understanding could make it a suitable base for fine-tuning with Sinhala and Tamil data. |
| Or … any other open-source checkpoint ≤ 8 B params | | |
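The eligibility rule above reduces to two checks: the checkpoint must be fully open-source and must not exceed 8 billion parameters. A minimal sketch of that check (the helper name and example counts are illustrative, not part of any official tooling):

```python
# Illustrative eligibility check for the competition's base-model rule:
# fully open-source AND at most 8 billion parameters.
PARAM_CAP = 8_000_000_000


def is_allowed(param_count: int, open_source: bool) -> bool:
    """Return True if a checkpoint satisfies both eligibility conditions."""
    return open_source and param_count <= PARAM_CAP


# Nominal examples: Phi-3-Mini (3.8B, open) qualifies;
# a closed GPT-3-scale model (175B) does not.
print(is_allowed(3_800_000_000, open_source=True))    # True
print(is_allowed(175_000_000_000, open_source=False)) # False
```

Note that quantised or LoRA-adapted variants of a listed model still count against the base model's full parameter count.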
39 |
Note: Proprietary or closed-license models (e.g., GPT-3 series, Claude) are not allowed.
|
40 |
|
41 |
3. Data Resources and Evaluation
|