flax-community/code-mt5-base


	[tokenizer](#tokenizer) \| [model](#model) \| [datasets](#datasets) \| [plots](#plots) \| [fine tuning](#fine-tuning)

	## Tokenizer {#tokenizer}

	We trained our tokenizer with [sentencepiece](https://github.com/google/sentencepiece)'s unigram model, then loaded the result as `MT5TokenizerFast`.
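The two steps above can be sketched roughly as follows. This is a minimal illustration, not the exact script used for this model: the corpus path, the model prefix, and the vocabulary size of 32000 are assumptions, and the helper names are hypothetical. The heavy libraries are imported lazily inside the functions so the argument builder can be inspected without them installed.

```python
def unigram_training_args(corpus_path, model_prefix, vocab_size=32000):
    """Build keyword arguments for SentencePiece's trainer.

    The vocab_size default is an assumed value, not taken from the model card.
    """
    return {
        "input": corpus_path,          # plain-text file, one sample per line
        "model_prefix": model_prefix,  # trainer writes <prefix>.model and <prefix>.vocab
        "vocab_size": vocab_size,
        "model_type": "unigram",       # the unigram language-model tokenizer
    }


def train_tokenizer(corpus_path, model_prefix, vocab_size=32000):
    """Train a unigram SentencePiece model and return the path of the .model file."""
    import sentencepiece as spm  # lazy import: only needed when training

    spm.SentencePieceTrainer.train(
        **unigram_training_args(corpus_path, model_prefix, vocab_size)
    )
    return model_prefix + ".model"


def load_fast_tokenizer(model_file):
    """Wrap the trained SentencePiece model as an MT5TokenizerFast."""
    from transformers import MT5TokenizerFast  # lazy import

    return MT5TokenizerFast(vocab_file=model_file)
```

A typical call would be `load_fast_tokenizer(train_tokenizer("corpus.txt", "code_spm"))`, where both file names are placeholders.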

	## Model {#model}

	We used the [MT5-base](https://huggingface.co/google/mt5-base) model.
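As a rough illustration, the base checkpoint linked above could be loaded as shown here. The use of the Flax class is an assumption based on the project living under the flax-community organisation, and it requires a `transformers` release that still ships Flax model classes. The helper name is hypothetical.

```python
# Checkpoint ID taken from the link in the model card.
BASE_CHECKPOINT = "google/mt5-base"


def load_base_model(checkpoint=BASE_CHECKPOINT):
    """Load MT5-base as a Flax sequence-to-sequence model (assumed setup)."""
    # Lazy import so this module can be imported without transformers/flax installed.
    from transformers import FlaxMT5ForConditionalGeneration

    return FlaxMT5ForConditionalGeneration.from_pretrained(checkpoint)
```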

	## Datasets {#datasets}

	We trained the model on the [Code Search Net](https://huggingface.co/datasets/code_search_net) dataset plus some data scraped from the internet. We maintained a list of datasets in which each dataset contained code in a single language.
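One way to build such a per-language list is sketched below. It is an assumption-laden illustration rather than the original pipeline: the record layout for scraped data (dicts with `language` and `code` keys) and both helper names are hypothetical, and loading `code_search_net` by that ID may need extra options depending on the installed `datasets` version.

```python
from collections import defaultdict

# The six per-language configurations published with CodeSearchNet.
CSN_LANGUAGES = ("go", "java", "javascript", "php", "python", "ruby")


def load_code_search_net(languages=CSN_LANGUAGES, split="train"):
    """Return one dataset per language, in the same order as `languages`."""
    from datasets import load_dataset  # lazy import: only needed when loading

    return [load_dataset("code_search_net", lang, split=split) for lang in languages]


def group_by_language(records):
    """Split scraped records into one list of code strings per language.

    `records` is a hypothetical iterable of dicts with 'language' and 'code' keys.
    The result is ordered alphabetically by language name.
    """
    buckets = defaultdict(list)
    for record in records:
        buckets[record["language"]].append(record["code"])
    return [buckets[lang] for lang in sorted(buckets)]
```

Keeping one dataset per language makes it straightforward to sample or weight languages independently during training.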

	## Plots {#plots}

	[train loss](#train_loss) \| [evaluation loss](#eval_loss) \| [evaluation accuracy](#eval_acc) \| [learning rate](#lrs)

	### Train loss {#train_loss}

	![train loss](train_loss.png)

	### Evaluation loss {#eval_loss}

	![eval loss](eval_loss.png)

	### Evaluation accuracy {#eval_acc}

	![eval accuracy](eval_accuracy.png)

	### Learning rate {#lrs}

	![learning rate](learning_rate.png)

	## Fine tuning {#fine-tuning}

	We fine-tuned the model on the [CodeXGLUE code-to-code-trans dataset](https://huggingface.co/datasets/code_x_glue_cc_code_to_code_trans) and on scraped data.
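A minimal sketch of preparing that dataset for text-to-text fine-tuning follows. The `java` and `cs` column names match the CodeXGLUE code-to-code-trans dataset as published on the Hub; the task prefix, the `source`/`target` output keys, and the helper names are assumptions made for illustration and are not taken from the original training script.

```python
def to_text2text(example, src="java", tgt="cs"):
    """Turn one parallel example into a source/target pair for a seq2seq model.

    The "translate <src> to <tgt>: " prefix is an assumed T5-style convention.
    """
    prefix = f"translate {src} to {tgt}: "
    return {"source": prefix + example[src], "target": example[tgt]}


def load_translation_pairs(split="train", src="java", tgt="cs"):
    """Load the CodeXGLUE split and map every row to a source/target pair."""
    from datasets import load_dataset  # lazy import: only needed when loading

    dataset = load_dataset("code_x_glue_cc_code_to_code_trans", split=split)
    return dataset.map(lambda row: to_text2text(row, src, tgt))
```

Swapping `src` and `tgt` gives the reverse translation direction from the same rows.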