intelpen committed on
Commit e262bca · verified · 1 Parent(s): f86c090

Update README.md

Files changed (1)
  1. README.md +18 -111
README.md CHANGED
@@ -179,127 +179,34 @@ print(tokenizer.decode(outputs[0]))
  </tbody>
  </table>
 
- ## Training Details
 
- ### Training Data
-
- <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
-
- [More Information Needed]
-
- ### Training Procedure
-
- <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-
- #### Preprocessing [optional]
-
- [More Information Needed]
-
-
- #### Training Hyperparameters
-
- - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
-
- #### Speeds, Sizes, Times [optional]
-
- <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
-
- [More Information Needed]
-
- ## Evaluation
-
- <!-- This section describes the evaluation protocols and provides the results. -->
-
- ### Testing Data, Factors & Metrics
-
- #### Testing Data
-
- <!-- This should link to a Dataset Card if possible. -->
-
- [More Information Needed]
-
- #### Factors
-
- <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
-
- [More Information Needed]
-
- #### Metrics
-
- <!-- These are the evaluation metrics being used, ideally with a description of why. -->
-
- [More Information Needed]
-
- ### Results
-
- [More Information Needed]
-
- #### Summary
-
-
-
- ## Model Examination [optional]
-
- <!-- Relevant interpretability work for the model goes here -->
-
- [More Information Needed]
-
- ## Environmental Impact
-
- <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
-
- Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
-
- - **Hardware Type:** [More Information Needed]
- - **Hours used:** [More Information Needed]
- - **Cloud Provider:** [More Information Needed]
- - **Compute Region:** [More Information Needed]
- - **Carbon Emitted:** [More Information Needed]
-
- ## Technical Specifications [optional]
-
- ### Model Architecture and Objective
-
- [More Information Needed]
-
- ### Compute Infrastructure
-
- [More Information Needed]
 
  #### Hardware
 
- [More Information Needed]
+ Nvidia RTX 4090 16GB, Laptop Version
 
  #### Software
 
  [More Information Needed]
 
- ## Citation [optional]
-
- <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
-
- **BibTeX:**
-
- [More Information Needed]
-
- **APA:**
-
- [More Information Needed]
-
- ## Glossary [optional]
-
- <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
-
- [More Information Needed]
-
- ## More Information [optional]
-
- [More Information Needed]
+ ## RoLlama3 Model Family
 
- ## Model Card Authors [optional]
+ | Model | Link |
+ |--------------------|:--------:|
+ |RoLlama3-8b-Instruct-2024-06-28| [link](https://huggingface.co/OpenLLM-Ro/RoLlama3-8b-Instruct-2024-06-28) |
+ |*RoLlama3-8b-Instruct-2024-10-09*| [link](https://huggingface.co/OpenLLM-Ro/RoLlama3-8b-Instruct-2024-10-09) |
+ |RoLlama3-8b-Instruct-DPO-2024-10-09| [link](https://huggingface.co/OpenLLM-Ro/RoLlama3-8b-Instruct-DPO-2024-10-09) |
 
- [More Information Needed]
 
- ## Model Card Contact
+ ## Citation
 
- [More Information Needed]
+ ```
+ @misc{masala2024vorbecstiromanecsterecipetrain,
+ title={"Vorbe\c{s}ti Rom\^ane\c{s}te?" A Recipe to Train Powerful Romanian LLMs with English Instructions},
+ author={Mihai Masala and Denis C. Ilie-Ablachim and Alexandru Dima and Dragos Corlatescu and Miruna Zavelca and Ovio Olaru and Simina Terian-Dan and Andrei Terian-Dan and Marius Leordeanu and Horia Velicu and Marius Popescu and Mihai Dascalu and Traian Rebedea},
+ year={2024},
+ eprint={2406.18266},
+ archivePrefix={arXiv},
+ primaryClass={cs.CL},
+ url={https://arxiv.org/abs/2406.18266},
+ }