added RUN:2::gpt-2-medium

Files changed (9)

README.md +68 -0
added_tokens.json +1 -0
config.json +41 -0
merges.txt +0 -0
pytorch_model.bin +3 -0
special_tokens_map.json +1 -0
tokenizer.json +0 -0
tokenizer_config.json +1 -0
vocab.json +0 -0

README.md ADDED

	@@ -0,0 +1,68 @@

+# PyCoder 🐍
+<img alt="Made With Python" src="http://ForTheBadge.com/images/badges/made-with-python.svg">
+<!-- <img alt="Medium" src="https://img.shields.io/badge/Medium-12100E?style=for-the-badge&logo=medium&logoColor=white" height=35/> -->
+<!-- [![PyPI version fury.io](https://badge.fury.io/py/torchlit.svg)](https://pypi.org/project/torchlit/)  -->
+`PyCoder` is a tool that generates Python code from a few given topics and a description. It uses the GPT-2 language model as its engine. PyCoder poses writing Python code as a conditional causal language modelling (c-CLM) task. It has been trained on millions of lines of Python code written by all of us. At the current stage of training, it produces sensible code from a few lines of description, but there is still considerable room for improvement.
+PyCoder has been developed as a command-line interface (CLI) tool, an API endpoint, and a Python package (not yet published to PyPI). This repository acts as a framework for anyone who wants to build PyCoder from scratch or adapt it into, say, a `CPPCoder` or `JSCoder` 😃. A blog post about the development of the project will be released soon.
+To use `PyCoder` as a CLI utility, clone the repository and install the package with:
+```console
+foo@bar:❯ python setup.py install
+```
+After installation, the package can be verified and accessed either as a Python module or as a native CLI tool with:
+```console
+foo@bar:❯ python -m pycoder --version
+```
+Or directly as:
+```console
+foo@bar:❯ pycoder --version
+```
+The API endpoint is deployed using FastAPI. Once all the requirements have been installed for the project, the API can be accessed with:
+```console
+foo@bar:❯ pycoder --endpoint PORT_NUMBER
+```
+Or
+```console
+foo@bar:❯ pycoder -e PORT_NUMBER
+```
+## Tech Stack
+<p align="center">
+<img alt="Python" src="https://img.shields.io/badge/python-%2314354C.svg?style=for-the-badge&logo=python&logoColor=white" style="display:inline;" />
+<img alt="PyTorch" src="https://img.shields.io/badge/PyTorch-%23EE4C2C.svg?style=for-the-badge&logo=PyTorch&logoColor=white" style="display:inline;" />
+<img alt="Docker" src="https://img.shields.io/badge/docker-%230db7ed.svg?style=for-the-badge&logo=docker&logoColor=white" style="display:inline;" />
+<img src="https://fastapi.tiangolo.com/img/logo-margin/logo-teal.png" alt="FastAPI" style="display:inline; background-color:black; height:28px;" />
+<img src="https://typer.tiangolo.com/img/logo-margin/logo-margin-vector.svg" alt="Typer" style="display:inline; background-color:teal; height:28px;" />
+</p>
+## Tested Platforms
+<p align="center">
+<img alt="Linux" src="https://img.shields.io/badge/Linux-FCC624?style=for-the-badge&logo=linux&logoColor=black" style="display:inline;" />
+<img alt="Windows 10" src="https://img.shields.io/badge/Windows-0078D6?style=for-the-badge&logo=windows&logoColor=white" style="display:inline;" />
+</p>
+## BibTeX
+If you want to cite the framework, feel free to use this:
+```bibtex
+@article{dutta2021pycoder,
+  title={Pycoder},
+  author={Dutta, H},
+  journal={GitHub. Note: https://github.com/himanshu-dutta/pycoder},
+  year={2021}
+}
+```
+<hr />
+<p align="center">
+<img alt="MIT License" src="https://img.shields.io/github/license/himanshu-dutta/pycoder?style=for-the-badge&logo=appveyor" style="display:inline;" />
+<img src="https://img.shields.io/badge/Copyright-Himanshu_Dutta-2ea44f?style=for-the-badge&logo=appveyor" style="display:inline;" />
+</p>

added_tokens.json ADDED

	@@ -0,0 +1 @@


+{"<|PAD|>": 50260, "<|EOS|>": 50258, "<|BOS|>": 50257, "<|SEP|>": 50261, "<|UNK|>": 50259}
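
The five added tokens take the IDs immediately after the stock GPT-2 vocabulary. The sketch below checks that the IDs are contiguous and derives the resulting vocabulary size. It assumes the base GPT-2 vocabulary holds 50257 entries (IDs 0 to 50256); the helper name `check_added_tokens` is hypothetical and not part of this repository.

```python
import json

# Contents of added_tokens.json as shown in this commit.
ADDED_TOKENS_JSON = (
    '{"<|PAD|>": 50260, "<|EOS|>": 50258, "<|BOS|>": 50257, '
    '"<|SEP|>": 50261, "<|UNK|>": 50259}'
)

BASE_VOCAB_SIZE = 50257  # assumption: size of the stock GPT-2 BPE vocabulary


def check_added_tokens(added_json: str, base_size: int) -> int:
    """Return the extended vocabulary size if the added IDs are contiguous.

    Raises ValueError when the IDs do not form the range
    base_size .. base_size + len(added) - 1.
    """
    added = json.loads(added_json)
    ids = sorted(added.values())
    expected = list(range(base_size, base_size + len(ids)))
    if ids != expected:
        raise ValueError(f"non-contiguous added token IDs: {ids}")
    return base_size + len(ids)


vocab_size = check_added_tokens(ADDED_TOKENS_JSON, BASE_VOCAB_SIZE)
print(vocab_size)  # 50262
```

The result agrees with the `vocab_size` value in `config.json`, which suggests the embedding matrix was resized to cover the new tokens.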

config.json ADDED

	@@ -0,0 +1,41 @@

+{
+  "_name_or_path": "gpt2-medium",
+  "activation_function": "gelu_new",
+  "architectures": [
+    "GPT2LMHeadModel"
+  ],
+  "attn_pdrop": 0.1,
+  "bos_token_id": 50257,
+  "embd_pdrop": 0.1,
+  "eos_token_id": 50258,
+  "gradient_checkpointing": false,
+  "initializer_range": 0.02,
+  "layer_norm_epsilon": 1e-05,
+  "model_type": "gpt2",
+  "n_ctx": 1024,
+  "n_embd": 1024,
+  "n_head": 16,
+  "n_inner": null,
+  "n_layer": 24,
+  "n_positions": 1024,
+  "n_special": 0,
+  "pad_token_id": 50260,
+  "predict_special_tokens": true,
+  "resid_pdrop": 0.1,
+  "scale_attn_weights": true,
+  "sep_token_id": 50261,
+  "summary_activation": null,
+  "summary_first_dropout": 0.1,
+  "summary_proj_to_labels": true,
+  "summary_type": "cls_index",
+  "summary_use_proj": true,
+  "task_specific_params": {
+    "text-generation": {
+      "do_sample": true,
+      "max_length": 50
+    }
+  },
+  "transformers_version": "4.6.0",
+  "use_cache": true,
+  "vocab_size": 50262
+}
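
As a rough cross-check of this configuration, the parameter count can be estimated from `vocab_size`, `n_positions`, `n_embd` and `n_layer`. The sketch below assumes the standard GPT-2 block layout, an output head tied to the token embedding, and an inner MLP width of `4 * n_embd` (the usual default when `n_inner` is `null`). The function name `gpt2_param_count` is hypothetical.

```python
def gpt2_param_count(vocab_size: int, n_positions: int, n_embd: int, n_layer: int) -> int:
    """Estimate trainable parameters of a GPT-2 style decoder with a tied LM head."""
    d = n_embd
    embeddings = vocab_size * d + n_positions * d   # token and position tables
    attention = (d * 3 * d + 3 * d) + (d * d + d)   # fused QKV projection + output projection
    mlp = (d * 4 * d + 4 * d) + (4 * d * d + d)     # up-projection + down-projection
    layer_norms = 2 * (2 * d)                       # two LayerNorms per block (weight + bias)
    per_block = attention + mlp + layer_norms
    final_norm = 2 * d                              # LayerNorm after the last block
    return embeddings + n_layer * per_block + final_norm


# Values taken from config.json above.
total = gpt2_param_count(vocab_size=50262, n_positions=1024, n_embd=1024, n_layer=24)
print(f"{total:,} parameters, about {total * 4 / 1e9:.2f} GB in float32")
# 354,828,288 parameters, about 1.42 GB in float32
```

That estimate is slightly below the 1,444,609,955 bytes recorded for `pytorch_model.bin`; the gap is plausibly non-parameter buffers and serialization overhead, though the checkpoint contents are not shown here.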

merges.txt ADDED

The diff for this file is too large to render. See raw diff

pytorch_model.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:e3d167e489a60a092812c52073c743717a988953abc6a55826e6730acd296af2
+size 1444609955
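
The three lines above are a Git LFS pointer rather than the weights themselves: each line is a key followed by a space and a value. A minimal sketch of reading such a pointer follows; the function name `parse_lfs_pointer` and the returned field names are hypothetical.

```python
# Pointer text as shown in this commit (without the leading "+" diff markers).
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:e3d167e489a60a092812c52073c743717a988953abc6a55826e6730acd296af2
size 1444609955
"""


def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its version, hash and size fields."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Each line has the form "<key> <value>".
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid value has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["hash_algo"], len(pointer["digest"]), pointer["size_bytes"])
# sha256 64 1444609955
```

After downloading the real file, its SHA-256 digest and byte size can be compared against these fields to confirm the transfer was complete.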

special_tokens_map.json ADDED

	@@ -0,0 +1 @@


+{"bos_token": "<|BOS|>", "eos_token": "<|EOS|>", "unk_token": "<|UNK|>", "sep_token": "<|SEP|>", "pad_token": "<|PAD|>"}
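
The README describes generation as conditional causal language modelling over topics and a description, and these special tokens are the natural delimiters for such a sequence. The sketch below shows one way the pieces might be joined. The field order, the comma-joined topics, and both function names are assumptions for illustration only; the commit does not show the layout actually used in training.

```python
from typing import Sequence

# Special tokens as declared in special_tokens_map.json.
BOS, EOS, SEP = "<|BOS|>", "<|EOS|>", "<|SEP|>"


def build_prompt(topics: Sequence[str], description: str) -> str:
    """Assemble a conditioning prefix (hypothetical layout, not confirmed by the source)."""
    return f"{BOS}{', '.join(topics)}{SEP}{description}{SEP}"


def build_training_example(topics: Sequence[str], description: str, code: str) -> str:
    """Append the target code and the end-of-sequence token to the prefix."""
    return build_prompt(topics, description) + code + EOS


prompt = build_prompt(["sorting", "lists"], "sort a list in place")
print(prompt)
# <|BOS|>sorting, lists<|SEP|>sort a list in place<|SEP|>
```

Under this assumed layout, the model would be asked to continue the prefix until it emits `<|EOS|>`, and the continuation would be the generated code.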

tokenizer.json ADDED

The diff for this file is too large to render. See raw diff

tokenizer_config.json ADDED

	@@ -0,0 +1 @@


+{"unk_token": "<|endoftext|>", "bos_token": "<|endoftext|>", "eos_token": "<|endoftext|>", "add_prefix_space": false, "model_max_length": 1024, "special_tokens_map_file": null, "name_or_path": "gpt2-medium"}
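
Note that `tokenizer_config.json` still names `<|endoftext|>` for the BOS, EOS and UNK roles, while `special_tokens_map.json` names the custom tokens. A small sketch for surfacing such disagreements is shown below; the helper name `token_mismatches` is hypothetical, and which file takes precedence when the tokenizer is loaded is not established by this commit.

```python
import json

# File contents as shown in this commit.
SPECIAL_TOKENS_MAP = json.loads(
    '{"bos_token": "<|BOS|>", "eos_token": "<|EOS|>", "unk_token": "<|UNK|>", '
    '"sep_token": "<|SEP|>", "pad_token": "<|PAD|>"}'
)
TOKENIZER_CONFIG = json.loads(
    '{"unk_token": "<|endoftext|>", "bos_token": "<|endoftext|>", '
    '"eos_token": "<|endoftext|>", "add_prefix_space": false, '
    '"model_max_length": 1024, "special_tokens_map_file": null, '
    '"name_or_path": "gpt2-medium"}'
)


def token_mismatches(tokens_map: dict, config: dict) -> dict:
    """Return {role: (map_value, config_value)} for roles the two files disagree on."""
    return {
        role: (value, config[role])
        for role, value in tokens_map.items()
        if role in config and config[role] != value
    }


mismatches = token_mismatches(SPECIAL_TOKENS_MAP, TOKENIZER_CONFIG)
for role in sorted(mismatches):
    print(role, mismatches[role])
# bos_token ('<|BOS|>', '<|endoftext|>')
# eos_token ('<|EOS|>', '<|endoftext|>')
# unk_token ('<|UNK|>', '<|endoftext|>')
```

It is worth confirming after loading that the tokenizer's BOS and EOS IDs match `bos_token_id` (50257) and `eos_token_id` (50258) from `config.json`.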

vocab.json ADDED

The diff for this file is too large to render. See raw diff