---
license: apache-2.0
tags:
- gpt-j
- llm
datasets:
- EleutherAI/pile
---
# MaryGPT Model Card
MaryGPT is a text generation model and a fine-tuned version of [GPT-J 6B](https://huggingface.co/EleutherAI/gpt-j-6b).
This model is **fine-tuned exclusively on text from Mary Shelley's 1818 novel ["Frankenstein; or, The Modern Prometheus"](https://www.gutenberg.org/ebooks/84)**.
It will be used as a base model for the activities of [**AI Artist Yuma Kishi👤**](https://obake2ai.com/), including art creation and exhibition curation.
![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/63042f1e3926de1f7ec623b9/ejebUDIK3QZosK4j71HAB.jpeg)
<p align="right"><em>Portrait of Mary Shelley (1840, by Richard Rothwell, in the collection of the National Portrait Gallery)</em></p>
## Training Data Sources
All data was obtained ethically and in compliance with the terms and conditions of the source sites.
No copyrighted text was used to train this model without permission.
- GPT-J 6B was trained on [the Pile](https://pile.eleuther.ai), a large-scale curated dataset created by [EleutherAI](https://www.eleuther.ai).
- Frankenstein; or, The Modern Prometheus, Mary Shelley, 1818 (Public domain)
## Training procedure
The base model, GPT-J 6B, was trained for 402 billion tokens over 383,500 steps on a TPU v3-256 pod. It was trained as an autoregressive language model, using cross-entropy loss to maximize the likelihood of predicting the next token correctly.
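
The next-token objective described above can be illustrated with a small sketch. This is not the training code used for GPT-J or MaryGPT (which this card does not include); the function name `next_token_cross_entropy` and the toy inputs are assumptions made purely for illustration.

```python
import math


def next_token_cross_entropy(logits, token_ids):
    """Mean cross-entropy of predicting each next token (illustrative only).

    logits[t] holds unnormalized scores over the vocabulary after the model
    has read token_ids[: t + 1]; the target at position t is token_ids[t + 1].
    """
    if len(token_ids) < 2:
        raise ValueError("need at least two tokens to form a prediction target")

    losses = []
    for t in range(len(token_ids) - 1):
        scores = logits[t]
        target = token_ids[t + 1]
        # Log of the softmax normalizer, shifted by the max for stability.
        shift = max(scores)
        log_z = shift + math.log(sum(math.exp(s - shift) for s in scores))
        # Negative log-probability assigned to the correct next token.
        losses.append(log_z - scores[target])
    return sum(losses) / len(losses)
```

Minimizing this quantity is equivalent to maximizing the likelihood of the observed next tokens: a model that spreads probability uniformly over a vocabulary of size V scores log(V), and the loss approaches zero as the model grows confident in the correct token.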
## How to use
The model can be loaded with the `AutoModelForCausalLM` class:
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("obake2ai/MaryGPT")
model = AutoModelForCausalLM.from_pretrained("obake2ai/MaryGPT")
```
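Once loaded, the model can generate text through the standard `generate` method of `transformers`. The following is a minimal sketch rather than a recommended configuration: the helper names `load_marygpt` and `generate_text` and the sampling values (`max_new_tokens`, `temperature`, `top_p`) are assumptions chosen for illustration, not settings documented for this model.

```python
def load_marygpt(repo_id="obake2ai/MaryGPT"):
    """Load the model and tokenizer (hypothetical helper; downloads weights)."""
    # Imported lazily so the helpers below can be defined without the library.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id)
    return model, tokenizer


def generate_text(model, tokenizer, prompt,
                  max_new_tokens=60, temperature=0.8, top_p=0.95):
    """Sample a continuation of `prompt` from a causal language model."""
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        do_sample=True,                       # sample instead of greedy decoding
        max_new_tokens=max_new_tokens,        # illustrative value
        temperature=temperature,              # illustrative value
        top_p=top_p,                          # illustrative value
        pad_token_id=tokenizer.eos_token_id,  # pad with the EOS token
    )
    # The output contains the prompt followed by the generated tokens.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

For example, `generate_text(*load_marygpt(), "It was on a dreary night of November")` would return the prompt followed by a sampled continuation. Because sampling is random, the output differs between runs.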
## Developed by
MaryGPT
- [Yuma Kishi](https://x.com/obake_ai)
GPT-J
- [Ben Wang](https://github.com/kingoflolz), developer of GPT-J
- [James Bradbury](https://twitter.com/jekbradbury) for valuable assistance with debugging JAX issues.
- [Stella Biderman](https://www.stellabiderman.com), [Eric Hallahan](https://twitter.com/erichallahan), [Kurumuz](https://github.com/kurumuz/), and [Finetune](https://github.com/finetuneanon/) for converting the model to be compatible with the `transformers` package.
- [Leo Gao](https://twitter.com/nabla_theta) for running zero-shot evaluations for the baseline models for the table.
- [Laurence Golding](https://github.com/researcher2/) for adding some features to the web demo.
- [Aran Komatsuzaki](https://twitter.com/arankomatsuzaki) for advice with experiment design and writing the blog posts.
- [Janko Prester](https://github.com/jprester/) for creating the web demo frontend.