Commit a59c08c
Parent(s): 8467541
added RUN:2::gpt-2-medium
- README.md +68 -0
- added_tokens.json +1 -0
- config.json +41 -0
- merges.txt +0 -0
- pytorch_model.bin +3 -0
- special_tokens_map.json +1 -0
- tokenizer.json +0 -0
- tokenizer_config.json +1 -0
- vocab.json +0 -0
README.md
ADDED
@@ -0,0 +1,68 @@
# PyCoder 🐍

<img alt="Made With Python" src="http://ForTheBadge.com/images/badges/made-with-python.svg">

<!-- <img alt="Medium" src="https://img.shields.io/badge/Medium-12100E?style=for-the-badge&logo=medium&logoColor=white" height=35/> -->

<!-- [](https://pypi.org/project/torchlit/) -->

`PyCoder` is a tool that generates Python code from a few given topics and a description. It uses the GPT-2 language model as its engine and frames writing Python code as a conditional causal language modelling (c-CLM) task. It has been trained on millions of lines of Python code written by all of us. At the current stage of training, it produces sensible code from a few lines of description, though there is considerable room for the model to improve.

Pycoder is available as a command-line interface (CLI) tool, an API endpoint, and a Python package (not yet published on PyPI). This repository also serves as a framework for anyone who wants to build Pycoder from scratch or adapt it into, say, a `CPPCoder` or `JSCoder` 😃. A blog post about the development of the project will be released soon.

To use `Pycoder` as a CLI utility, clone the repository and install the package with:
```console
foo@bar:❯ python setup.py install
```
After installation, the package can be verified and accessed either as a Python module:
```console
foo@bar:❯ python -m pycoder --version
```
Or directly as a native CLI tool:
```console
foo@bar:❯ pycoder --version
```

The API endpoint is built with FastAPI. Once all the project's requirements have been installed, the API can be served on a chosen port with:
```console
foo@bar:❯ pycoder --endpoint PORT_NUMBER
```
Or, using the short flag:
```console
foo@bar:❯ pycoder -e PORT_NUMBER
```


## Tech Stack
<p align="center">
<img alt="Python" src="https://img.shields.io/badge/python-%2314354C.svg?style=for-the-badge&logo=python&logoColor=white" style="display:inline;" />
<img alt="PyTorch" src="https://img.shields.io/badge/PyTorch-%23EE4C2C.svg?style=for-the-badge&logo=PyTorch&logoColor=white" style="display:inline;" />
<img alt="Docker" src="https://img.shields.io/badge/docker-%230db7ed.svg?style=for-the-badge&logo=docker&logoColor=white" style="display:inline;" />
<img src="https://fastapi.tiangolo.com/img/logo-margin/logo-teal.png" alt="FastAPI" style="display:inline; background-color:black; height:28px;" />
<img src="https://typer.tiangolo.com/img/logo-margin/logo-margin-vector.svg" style="display:inline; background-color:teal; height:28px;" />
</p>

## Tested Platforms
<p align="center">
<img alt="Linux" src="https://img.shields.io/badge/Linux-FCC624?style=for-the-badge&logo=linux&logoColor=black" style="display:inline;" />
<img alt="Windows 10" src="https://img.shields.io/badge/Windows-0078D6?style=for-the-badge&logo=windows&logoColor=white" style="display:inline;" />
</p>


## BibTeX
If you want to cite the framework, feel free to use this:

```bibtex
@article{dutta2021pycoder,
  title={Pycoder},
  author={Dutta, H},
  journal={GitHub. Note: https://github.com/himanshu-dutta/pycoder},
  year={2021}
}
```
<hr />

<p align="center">
<img alt="MIT License" src="https://img.shields.io/github/license/himanshu-dutta/pycoder?style=for-the-badge&logo=appveyor" style="display:inline;" />
<img src="https://img.shields.io/badge/Copyright-Himanshu_Dutta-2ea44f?style=for-the-badge&logo=appveyor" style="display:inline;" />
</p>
added_tokens.json
ADDED
@@ -0,0 +1 @@
{"<|PAD|>": 50260, "<|EOS|>": 50258, "<|BOS|>": 50257, "<|SEP|>": 50261, "<|UNK|>": 50259}
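The ids above continue directly from the stock GPT-2 vocabulary, which has 50,257 entries (ids 0 to 50256), so the five added tokens should account for the `vocab_size` of 50262 recorded in `config.json`. A minimal sketch of that consistency check follows; the constant `GPT2_BASE_VOCAB` and the helper `expected_vocab_size` are illustrative names, not part of the repository.

```python
import json

# Contents of added_tokens.json exactly as committed.
ADDED_TOKENS_JSON = (
    '{"<|PAD|>": 50260, "<|EOS|>": 50258, "<|BOS|>": 50257, '
    '"<|SEP|>": 50261, "<|UNK|>": 50259}'
)

# Size of the stock GPT-2 vocabulary (ids 0..50256).
# This value is not stated in the commit; it is supplied here as background.
GPT2_BASE_VOCAB = 50257


def expected_vocab_size(added_tokens: dict, base_vocab: int) -> int:
    """Return the vocabulary size implied by the added tokens.

    Raises ValueError if the new ids do not form one contiguous block
    that starts right after the base vocabulary.
    """
    ids = sorted(added_tokens.values())
    if ids != list(range(base_vocab, base_vocab + len(ids))):
        raise ValueError(f"added token ids are not contiguous: {ids}")
    return base_vocab + len(ids)


added_tokens = json.loads(ADDED_TOKENS_JSON)
vocab_size = expected_vocab_size(added_tokens, GPT2_BASE_VOCAB)
print(vocab_size)
```

A model whose embedding matrix has fewer rows than this value would fail when it encounters one of the new ids, which is why the two files need to agree.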
config.json
ADDED
@@ -0,0 +1,41 @@
{
  "_name_or_path": "gpt2-medium",
  "activation_function": "gelu_new",
  "architectures": [
    "GPT2LMHeadModel"
  ],
  "attn_pdrop": 0.1,
  "bos_token_id": 50257,
  "embd_pdrop": 0.1,
  "eos_token_id": 50258,
  "gradient_checkpointing": false,
  "initializer_range": 0.02,
  "layer_norm_epsilon": 1e-05,
  "model_type": "gpt2",
  "n_ctx": 1024,
  "n_embd": 1024,
  "n_head": 16,
  "n_inner": null,
  "n_layer": 24,
  "n_positions": 1024,
  "n_special": 0,
  "pad_token_id": 50260,
  "predict_special_tokens": true,
  "resid_pdrop": 0.1,
  "scale_attn_weights": true,
  "sep_token_id": 50261,
  "summary_activation": null,
  "summary_first_dropout": 0.1,
  "summary_proj_to_labels": true,
  "summary_type": "cls_index",
  "summary_use_proj": true,
  "task_specific_params": {
    "text-generation": {
      "do_sample": true,
      "max_length": 50
    }
  },
  "transformers_version": "4.6.0",
  "use_cache": true,
  "vocab_size": 50262
}
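The hyperparameters above determine the model's size. As a rough sketch, assuming the standard GPT-2 layout (input and output embeddings tied, a fused query/key/value projection, and an MLP of width `4 * n_embd` when `n_inner` is null), the number of trainable parameters can be estimated from `n_layer`, `n_embd`, `n_positions`, and `vocab_size`. The function name `gpt2_param_count` is illustrative and not part of any library.

```python
def gpt2_param_count(n_layer: int, n_embd: int, n_positions: int,
                     vocab_size: int, n_inner=None) -> int:
    """Estimate trainable parameters for a GPT-2 style decoder.

    Assumes tied input/output embeddings, so the LM head adds no weights.
    """
    if n_inner is None:
        n_inner = 4 * n_embd  # GPT-2 default when n_inner is null

    # Token embeddings plus learned position embeddings.
    embeddings = vocab_size * n_embd + n_positions * n_embd

    # One LayerNorm holds a scale and a shift vector.
    layer_norm = 2 * n_embd

    # Fused QKV projection and the attention output projection (with biases).
    attention = (n_embd * 3 * n_embd + 3 * n_embd) + (n_embd * n_embd + n_embd)

    # Two-layer MLP (with biases).
    mlp = (n_embd * n_inner + n_inner) + (n_inner * n_embd + n_embd)

    # Each block has two LayerNorms, one attention module, and one MLP.
    per_block = 2 * layer_norm + attention + mlp

    # A final LayerNorm follows the last block.
    return embeddings + n_layer * per_block + layer_norm


# Values taken from the config.json in this commit.
params = gpt2_param_count(n_layer=24, n_embd=1024, n_positions=1024,
                          vocab_size=50262)
print(f"{params:,} parameters, about {params * 4 / 1e9:.2f} GB as float32")
```

If the weights are stored as float32, this estimate is in the same range as the 1,444,609,955-byte `pytorch_model.bin` added in this commit. An exact match is not expected, since a checkpoint may also hold non-trainable buffers and serialization overhead.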
merges.txt
ADDED
The diff for this file is too large to render.
pytorch_model.bin
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e3d167e489a60a092812c52073c743717a988953abc6a55826e6730acd296af2
size 1444609955
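`pytorch_model.bin` is tracked with Git LFS, so the diff shows a three-line pointer file rather than the weights themselves. Each pointer line is a key, a space, and a value. The sketch below parses the pointer shown above; `parse_lfs_pointer` is a hypothetical helper written for illustration, not a function provided by Git LFS or this repository.

```python
# The pointer file content as committed.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:e3d167e489a60a092812c52073c743717a988953abc6a55826e6730acd296af2
size 1444609955
"""


def parse_lfs_pointer(text: str) -> dict:
    """Parse 'key value' lines of a Git LFS pointer into a dictionary."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        key, _, value = line.partition(" ")
        fields[key] = value

    # The oid has the form '<hash algorithm>:<hex digest>'.
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


pointer = parse_lfs_pointer(POINTER)
print(pointer["hash_algorithm"], pointer["size_bytes"])
```

The digest and size let a client confirm that the object it downloads from LFS storage is the one this commit refers to.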
special_tokens_map.json
ADDED
@@ -0,0 +1 @@
{"bos_token": "<|BOS|>", "eos_token": "<|EOS|>", "unk_token": "<|UNK|>", "sep_token": "<|SEP|>", "pad_token": "<|PAD|>"}
tokenizer.json
ADDED
The diff for this file is too large to render.
tokenizer_config.json
ADDED
@@ -0,0 +1 @@
{"unk_token": "<|endoftext|>", "bos_token": "<|endoftext|>", "eos_token": "<|endoftext|>", "add_prefix_space": false, "model_max_length": 1024, "special_tokens_map_file": null, "name_or_path": "gpt2-medium"}
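Note that `tokenizer_config.json` still names `<|endoftext|>` as the unknown, beginning-of-sequence, and end-of-sequence token, whereas `special_tokens_map.json` in the same commit maps those roles to `<|UNK|>`, `<|BOS|>`, and `<|EOS|>`. Which file takes precedence at load time depends on the tokenizer library and its version, so nothing is claimed about that here. The sketch below only lists where the two committed files disagree; `token_mismatches` is an illustrative helper name.

```python
import json

# special_tokens_map.json as committed.
SPECIAL_TOKENS_MAP = json.loads(r"""
{"bos_token": "<|BOS|>", "eos_token": "<|EOS|>", "unk_token": "<|UNK|>",
 "sep_token": "<|SEP|>", "pad_token": "<|PAD|>"}
""")

# tokenizer_config.json as committed.
TOKENIZER_CONFIG = json.loads(r"""
{"unk_token": "<|endoftext|>", "bos_token": "<|endoftext|>",
 "eos_token": "<|endoftext|>", "add_prefix_space": false,
 "model_max_length": 1024, "special_tokens_map_file": null,
 "name_or_path": "gpt2-medium"}
""")


def token_mismatches(special_map: dict, tokenizer_config: dict) -> dict:
    """Return {role: (config value, map value)} for roles that disagree."""
    return {
        role: (tokenizer_config[role], token)
        for role, token in special_map.items()
        if role in tokenizer_config and tokenizer_config[role] != token
    }


mismatches = token_mismatches(SPECIAL_TOKENS_MAP, TOKENIZER_CONFIG)
for role, (in_config, in_map) in sorted(mismatches.items()):
    print(f"{role}: tokenizer_config={in_config!r} special_tokens_map={in_map!r}")
```

The `sep_token` and `pad_token` roles do not appear in `tokenizer_config.json` at all, so they are not reported as mismatches.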
vocab.json
ADDED
The diff for this file is too large to render.