LuciferMorningStar committed · Commit 0e26089 · 1 Parent(s): 8c4f1f4

Browse files:
- .DS_Store (+0 -0)
- Gradio-Space/README.md (+13 -0)
- app.py → Gradio-Space/app.py (+0 -0)
- requirements.txt → Gradio-Space/requirements.txt (+0 -0)
- Nemotron.md (+85 -0)
- README.md (+201 -9)
- pipeline.py (+53 -0)
.DS_Store
ADDED
Binary file (6.15 kB)
Gradio-Space/README.md
ADDED
@@ -0,0 +1,13 @@

---
title: Morningstar Omega
emoji: 💬
colorFrom: yellow
colorTo: purple
sdk: gradio
sdk_version: 5.0.1
app_file: app.py
pinned: false
license: mit
---

An example chatbot using [Gradio](https://gradio.app), [`huggingface_hub`](https://huggingface.co/docs/huggingface_hub/v0.22.2/en/index), and the [Hugging Face Inference API](https://huggingface.co/docs/api-inference/index).
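For context, a minimal sketch of the kind of app such a Space runs is shown below. The committed `Gradio-Space/app.py` itself appears in this commit only as a rename without changes, so the model id, system prompt, and generation parameters here are assumptions, not the actual app code.

```python
# Minimal sketch of a Gradio + Inference API chatbot; illustrative only.
# The served model, system prompt, and generation settings are assumptions.
import gradio as gr
from huggingface_hub import InferenceClient

client = InferenceClient("HuggingFaceH4/zephyr-7b-beta")  # assumed model id


def respond(message, history):
    # Rebuild the conversation in the chat-completion messages format.
    messages = [{"role": "system", "content": "You are a friendly chatbot."}]
    for user_msg, bot_msg in history:
        if user_msg:
            messages.append({"role": "user", "content": user_msg})
        if bot_msg:
            messages.append({"role": "assistant", "content": bot_msg})
    messages.append({"role": "user", "content": message})

    # Stream partial responses back to the chat UI as tokens arrive.
    response = ""
    for chunk in client.chat_completion(messages, max_tokens=512, stream=True):
        response += chunk.choices[0].delta.content or ""
        yield response


demo = gr.ChatInterface(respond)

if __name__ == "__main__":
    demo.launch()
```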
app.py → Gradio-Space/app.py
RENAMED
File without changes

requirements.txt → Gradio-Space/requirements.txt
RENAMED
File without changes
Nemotron.md
ADDED
@@ -0,0 +1,85 @@
# Model Link

<https://huggingface.co/chat/models/nvidia/Llama-3.1-Nemotron-70B-Instruct-HF>

## Model Card: Llama-3.1-Nemotron-70B-Instruct-HF

Llama-3.1-Nemotron-70B-Instruct-HF is a fine-tuned variant of Llama 3.1 70B, specifically aligned for instruction-following tasks. This model card provides an overview of the model's capabilities, limitations, and intended use cases.

## Model Description

Llama-3.1-Nemotron-70B-Instruct-HF is a transformer-based language model that leverages large-scale pre-training to generate coherent and contextually relevant text. It handles a diverse range of tasks, including but not limited to:

* Text generation
* Language translation
* Question answering
* Text classification
The model's architecture is based on the transformer introduced in "Attention Is All You Need" (Vaswani et al., 2017). Like the other Llama models, it is a decoder-only transformer: it processes the prompt and generates the output autoregressively, one token at a time, attending to all previously seen tokens.

### Training Details

The Llama-3.1-Nemotron-70B-Instruct-HF model was trained on a large corpus of text data, including but not limited to:

* Web pages
* Books
* Articles
* Research papers

The training process combined large-scale next-token (causal) language-model pre-training with instruction tuning and reinforcement learning from human feedback (RLHF) on preference data to improve the model's language understanding, helpfulness, and instruction-following.
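To make the pre-training objective above concrete, the sketch below computes a causal (next-token) language-modeling loss with `transformers`. The small `gpt2` checkpoint is used purely for illustration and has nothing to do with the actual Nemotron training setup.

```python
# Illustrative only: how a causal (next-token) language-modeling loss is computed.
# "gpt2" is a small stand-in checkpoint, not the Nemotron training configuration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("Attention is all you need.", return_tensors="pt")

# Passing the input ids as labels makes the model compute the shifted
# next-token cross-entropy loss internally.
with torch.no_grad():
    outputs = model(**inputs, labels=inputs["input_ids"])

print(f"Causal LM loss: {outputs.loss.item():.3f}")
```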
### Capabilities

The Llama-3.1-Nemotron-70B-Instruct-HF model is capable of:

* Generating coherent and contextually relevant text based on a given prompt or input
* Following instructions and generating text that adheres to specific guidelines or formats
* Answering questions based on the content of a given text or context
* Translating text from one language to another
* Classifying text into predefined categories

### Limitations

While the Llama-3.1-Nemotron-70B-Instruct-HF model is a powerful tool for natural language processing tasks, it is not without its limitations. Some of the known limitations include:

* The model may generate text that is not entirely accurate or relevant to the context, especially in cases where the input prompt is ambiguous or open-ended
* The model may struggle with tasks that require a deep understanding of specific domains or technical knowledge
* The model may not always follow instructions precisely, especially if the instructions are complex or open to interpretation

### Intended Use Cases

The Llama-3.1-Nemotron-70B-Instruct-HF model is intended for use in a variety of applications, including but not limited to:

* Chatbots and virtual assistants
* Content generation and writing assistance
* Language translation and localization
* Question answering and information retrieval
* Text classification and sentiment analysis

### Ethical Considerations

As with any AI model, there are ethical considerations to be taken into account when using the Llama-3.1-Nemotron-70B-Instruct-HF model. Some of the key considerations include:

* Ensuring that the model is used in a way that is fair and unbiased
* Avoiding the use of the model to generate misleading or harmful content
* Ensuring that the model is transparent and explainable in its decision-making processes
* Addressing any potential biases or inaccuracies in the model's output

By understanding the capabilities and limitations of the Llama-3.1-Nemotron-70B-Instruct-HF model, developers and users can harness its power to create innovative applications that benefit society as a whole.
### Usage Example

To use the Llama-3.1-Nemotron-70B-Instruct-HF model with the `transformers` pipeline, you can initialize it as follows:

```python
import torch
from transformers import pipeline

# A 70B-parameter model needs several high-memory GPUs (or quantization);
# device_map="auto" shards the weights across the available devices.
model = pipeline("text-generation", model="nvidia/Llama-3.1-Nemotron-70B-Instruct-HF",
                 torch_dtype=torch.bfloat16, device_map="auto")
output = model("Your input prompt here", max_new_tokens=128)
print(output[0]["generated_text"])
```
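Because this is an instruct-tuned chat model, prompts are normally passed in the chat messages format so the model's chat template is applied. A self-contained sketch follows; the prompt and generation parameters are illustrative, not recommended settings.

```python
# Chat-format usage sketch; max_new_tokens and the example prompt are illustrative.
import torch
from transformers import pipeline

pipe = pipeline("text-generation", model="nvidia/Llama-3.1-Nemotron-70B-Instruct-HF",
                torch_dtype=torch.bfloat16, device_map="auto")

messages = [{"role": "user", "content": "How many r's are in the word 'strawberry'?"}]
result = pipe(messages, max_new_tokens=256)

# The pipeline returns the full conversation; the last message is the model's reply.
print(result[0]["generated_text"][-1]["content"])
```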
### Explanation of Updates

* **Usage Example**: Added a practical example in `Nemotron.md` to help users understand how to run the model.
* **Model Performance Metrics**: Introduced a new section in `README.md` to give users insight into how the model's performance is measured.
README.md
CHANGED
@@ -1,13 +1,205 @@
 ---
-title: Morningstar Omega
-emoji: 💬
-colorFrom: yellow
-colorTo: purple
-sdk: gradio
-sdk_version: 5.0.1
-app_file: app.py
-pinned: false
 license: mit
+base_model:
+- nvidia/Llama-3.1-Nemotron-70B-Instruct-HF
+- nvidia/Llama-3.1-Nemotron-70B-Instruct
+datasets:
+- neuralwork/arxiver
+pipeline_tag: text2text-generation
+tags:
+- Neuroscience
+- chemistry
+- code
 ---

(The remaining lines of the hunk are all additions; the new README body follows.)
# Morningstar-Omega Model README

## Project: Morningstar-Omega

Welcome to Morningstar-Omega, a text generation model designed to provide state-of-the-art performance in neuroscience and chemistry text generation tasks. This repository contains the model, its documentation, usage guidelines, and licensing information.

* Repository: Lucius-Morningstar/Morningstar-Omega
* Model Type: Text2Text Generation
* Related Fields: Neuroscience, Chemistry
* Model ID DOI: doi:10.57967/hf/3369
* arXiv Paper: 1910.09700
* License: MIT License

## Model Overview

The Morningstar-Omega model leverages advancements in neural networks to generate high-quality, contextually accurate text in response to a given input, focusing particularly on applications in neuroscience and chemistry.

### Model Details

* Developed by: [Lucius-Morningstar]
* Funded by: [optional: Specify Funding Agency]
* Model Type: Text2Text Generation
* Languages: English (NLP), with potential for multilingual support
* License: MIT License
* Finetuned from: [Original Base Model, if applicable]

### Model Sources

* Repository: Lucius-Morningstar/Morningstar-Omega
* Paper: arXiv:1910.09700
* Demo: [Add Link to Demo, if available]

## Usage

### Direct Use

This model can be used for generating scientific text in neuroscience and chemistry, specifically aimed at applications requiring complex, contextually aware language generation. It is ideal for academic, research, and professional environments needing coherent, topic-specific text output.

### Downstream Use

Potential downstream applications include:

* Automated scientific paper generation
* Text generation for hypothesis testing in neuroscience and chemistry
* Educational tools and scientific summarization tasks

## Out-of-Scope Use

The model is not recommended for:

* Tasks outside scientific and technical domains, as it may lack contextual accuracy in broader fields.
* Generating personal or sensitive information where text accuracy and ethical considerations are paramount.

## Model Bias, Risks, and Limitations

The Morningstar-Omega model, like many large language models, is subject to biases present in its training data. Users should be aware of potential limitations, including:

* Bias in Scientific Domains: Training data may reflect predominant theories, leading to a reinforcement of certain scientific biases.
* Data Gaps: Specific areas in neuroscience or chemistry may be underrepresented.
* Ethical Considerations: Content generation should comply with ethical standards, especially in academic and professional contexts.

## Recommendations

Users should validate the model's output in scientific contexts and critically assess any generated content for accuracy, especially for high-stakes applications.

## Getting Started

To begin using the model, you can follow these steps:

### Installation

```bash
# Clone the repository
git clone https://github.com/Lucius-Morningstar/Morningstar-Omega.git
cd Morningstar-Omega

# Install dependencies
pip install -r requirements.txt
```

### Usage Example

```python
from morningstar_omega import Model

# Initialize the model
model = Model.load('path/to/pretrained_model')

# Text generation
output = model.generate("Describe the process of synaptic transmission in the brain.")
print(output)
```
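If the repository's weights are also published on the Hugging Face Hub in the standard `transformers` format (an assumption: the `morningstar_omega` wrapper above is the project's own interface, and the repo id below is inferred from the project name), the model could be loaded like any other text-generation checkpoint:

```python
# Sketch under the assumption that the weights are hosted on the Hub in the
# standard transformers format; the repo id below is inferred and may differ.
import torch
from transformers import pipeline

generator = pipeline("text-generation",
                     model="Lucius-Morningstar/Morningstar-Omega",  # assumed repo id
                     torch_dtype=torch.bfloat16, device_map="auto")

result = generator("Describe the process of synaptic transmission in the brain.",
                   max_new_tokens=200)
print(result[0]["generated_text"])
```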
## Training Details

### Training Data

The model was trained on a curated dataset combining publicly available neuroscience and chemistry research articles, augmented with domain-specific text to enhance language capabilities.

### Training Procedure

#### Preprocessing

Data was tokenized and cleaned to ensure scientific accuracy and context. Irrelevant or low-quality samples were removed.

#### Training Hyperparameters

* Training Regime: Fine-tuning of the base model with hyperparameter optimization.
* Epochs: [Specify]
* Batch Size: [Specify]
* Learning Rate: [Specify]

#### Speeds, Sizes, Times

* Model Size: [Model size, e.g., 1.2B parameters]
* Training Time: [Specify]

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

The model was evaluated using a set of scientific articles and technical documents in neuroscience and chemistry.

#### Factors

Evaluation focused on metrics like coherence, relevance to input prompts, factual accuracy, and linguistic diversity.

#### Metrics

* Perplexity: [Specify]
* BLEU Score: [Specify]
* Accuracy in Factual Generation: [Specify]
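The metric values above are placeholders. As a sketch of how they could be filled in, the `evaluate` library can compute perplexity and BLEU; the model id and sample texts below are illustrative, not project results.

```python
# Illustrative sketch for the metrics listed above; model id and texts are placeholders.
import evaluate

predictions = ["Neurotransmitters are released across the synaptic cleft."]
references = [["Synaptic transmission releases neurotransmitters across the synaptic cleft."]]

# Perplexity of a causal LM over the generated text (small model used for illustration).
perplexity = evaluate.load("perplexity", module_type="metric")
ppl = perplexity.compute(model_id="gpt2", predictions=predictions)
print("Mean perplexity:", ppl["mean_perplexity"])

# BLEU score of generated text against reference text.
bleu = evaluate.load("bleu")
score = bleu.compute(predictions=predictions, references=references)
print("BLEU:", score["bleu"])
```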
### Results

The model achieved [Specify Results] on standard evaluation benchmarks, indicating high performance in scientific text generation.

### Summary

The Morningstar-Omega model is a specialized text generation tool for neuroscience and chemistry applications, delivering precise and relevant language generation capabilities for academic and research use. Its design allows for detailed exploration of scientific topics, enhancing productivity in technical fields.

## Environmental Impact

To assess the environmental footprint of training this model, use the Machine Learning Impact calculator as suggested by Lacoste et al. (2019).

* Hardware Type: [e.g., GPU, TPU]
* Hours Used: [Specify]
* Cloud Provider: [Specify, if applicable]
* Compute Region: [Specify, if applicable]
* Carbon Emitted: [Estimate, if available]

## Technical Specifications

### Model Architecture and Objective

The model architecture is based on [Specify neural network architecture, e.g., Transformer-based architecture optimized for text-to-text generation].

### Compute Infrastructure

* Hardware: [Specify hardware used during training, e.g., NVIDIA Tesla GPUs]
* Software Dependencies: Listed in requirements.txt

## Citation

If you use this model in your work, please cite it as follows:

BibTeX:

```bibtex
@article{lucius2024morningstar,
  title={Morningstar-Omega: Advanced Text Generation for Neuroscience and Chemistry},
  author={Lucius-Morningstar},
  journal={Neuralwork/arxiver},
  doi={10.57967/hf/3369},
  year={2024}
}
```

APA:

Lucius-Morningstar. (2024). Morningstar-Omega: Advanced Text Generation for Neuroscience and Chemistry. Neuralwork/arxiver. doi:10.57967/hf/3369.

## Glossary

* Synaptic Transmission: [Define term]
* Neuroplasticity: [Define term]
* Molecular Modeling: [Define term]

## Contact

For any questions or issues, please reach out to [Contact Information].
pipeline.py
ADDED
@@ -0,0 +1,53 @@
from transformers import pipeline
from datasets import load_dataset

# Initialize the text-generation pipeline with the Llama 3.2 1B model
model = pipeline("text-generation", model="meta-llama/Llama-3.2-1B")


def load_data():
    """
    Load the neuralwork/arxiver dataset from the Hugging Face Hub.

    Returns:
    - Dataset object containing the loaded data, or None on failure.
    """
    try:
        ds = load_dataset("neuralwork/arxiver")
        return ds
    except Exception as e:
        print(f"An error occurred while loading the dataset: {e}")
        return None


def generate_text(prompt, max_length=50, num_return_sequences=1, temperature=1.0):
    """
    Generate text using the Llama 3.2 model.

    Parameters:
    - prompt (str): The input prompt for text generation.
    - max_length (int): The maximum length of the generated text.
    - num_return_sequences (int): The number of sequences to return.
    - temperature (float): Controls the randomness of predictions. Lower values
      make the output more deterministic.

    Returns:
    - List of generated text sequences.
    """
    try:
        # do_sample=True is required for temperature to take effect (and for
        # returning multiple sequences without beam search); greedy decoding
        # would otherwise ignore it.
        output = model(
            prompt,
            max_length=max_length,
            num_return_sequences=num_return_sequences,
            temperature=temperature,
            do_sample=True,
        )
        return [o["generated_text"] for o in output]
    except Exception as e:
        print(f"An error occurred: {e}")
        return []


# Example usage
if __name__ == "__main__":
    # Load the dataset
    dataset = load_data()
    if dataset:
        print("Dataset loaded successfully.")
        # You can access specific splits of the dataset, e.g., dataset["train"]
        print(dataset["train"][0])  # Print the first example from the training set

    prompt = "Describe the process of synaptic transmission in the brain."
    generated_texts = generate_text(prompt, max_length=100, num_return_sequences=3, temperature=0.7)
    for i, text in enumerate(generated_texts):
        print(f"Generated Text {i+1}:\n{text}\n")