---
language:
- en
pipeline_tag: zero-shot-classification
tags:
- finance
- compliance
---

# Model Card for Model ID

## Model Details
Based on the full-weight Llama-2-Hermes from Nous Research.

### Model Description

This model was fine-tuned from the full-weight llama-2-hermes-7B from Nous Research. It is a preemptive V1, assembled quickly to assist with finance and compliance tasks, tuned mostly to the new SEC Marketing and Compliance rules established in 2021. Later iterations will add guidelines and rulings unrelated to the SEC Marketing rule.

https://www.sec.gov/files/rules/final/2020/ia-5653.pdf

The model is intended to help companies, and individuals within compliance and marketing departments, find issues in their marketing or other public-facing documents. Because the new Marketing rule is principles-based, it takes logic, experience, and reasoning to determine whether a statement or advertisement is compliant with the SEC's guidelines, and different reviewers can reasonably reach different conclusions. This release was therefore trained on a small, high-quality dataset to aid in, or provide a second viewpoint on, reviewing a public-facing statement for compliance with the SEC's guidelines. The dataset was crafted by reviewing the SEC Marketing rule and other scenarios, and by providing reasoning within the `### Response` field to guide the model in reasoning tasks. Further versions will be reviewed more thoroughly for accuracy and bias, and will include more data.

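The `### Response` convention above suggests an Alpaca-style instruction format for the training records. The sketch below is a hypothetical illustration of such a record; the field names and example text are assumptions for illustration, not taken from the actual dataset.

```python
# Hypothetical Alpaca-style training record; fields and content are invented
# to illustrate the "### Response" reasoning format the card describes.
example_record = {
    "instruction": (
        "Determine whether the following advertisement is compliant with the "
        "SEC Marketing rule, and explain your reasoning."
    ),
    "input": "Our fund has consistently beaten the market -- guaranteed returns!",
    "output": (
        "### Response\n"
        "Likely non-compliant. The statement implies guaranteed performance, "
        "which the Marketing rule treats as a misleading, unsubstantiated claim."
    ),
}

# For causal-LM training, the record would be flattened into one prompt string.
prompt = (
    f"### Instruction\n{example_record['instruction']}\n\n"
    f"### Input\n{example_record['input']}\n\n"
    f"{example_record['output']}"
)
```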
- **Developed by:** [More Information Needed]
- **Shared by [optional]:** [More Information Needed]
- **Model type:** [More Information Needed]
- **Language(s) (NLP):** [English]
- **License:** [More Information Needed]
- **Finetuned from model [optional]:** [llama 2-hermes-7b]

### Model Sources [optional]

<!-- Provide the basic links for the model. -->

- **Repository:** [More Information Needed]
- **Paper [optional]:** [More Information Needed]
- **Demo [optional]:** [More Information Needed]

## Uses

For use by marketing and compliance teams in finance to assist in the determination and interpretation of the SEC Marketing rule and other SEC interpretations. No output should be treated as guaranteed fact, and review of the model's answers is encouraged. The model is meant simply to assist, helping users recall and interpret aspects of the lengthy SEC Marketing guidelines among other SEC rulings.

### Direct Use

<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->

[More Information Needed]

### Downstream Use [optional]

<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->

[More Information Needed]

### Out-of-Scope Use

<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->

[More Information Needed]

## Bias, Risks, and Limitations

<!-- This section is meant to convey both technical and sociotechnical limitations. -->

[More Information Needed]

### Recommendations

This is the first model iteration, and it has not yet been fully reviewed by professional peers for accuracy, bias, and output variation. Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model.

## How to Get Started with the Model

Use the code below to get started with the model.

[More Information Needed]

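A minimal sketch of querying the model with `transformers`, assuming the checkpoint is published to the Hugging Face Hub; the repo id below is a placeholder, not the model's actual identifier, and the prompt layout follows the Alpaca-style `### Response` convention the description mentions.

```python
def build_prompt(statement: str) -> str:
    """Wrap a public-facing statement in an Alpaca-style compliance-review prompt."""
    return (
        "### Instruction\n"
        "Review the following statement for compliance with the SEC Marketing "
        "rule and explain your reasoning.\n\n"
        f"### Input\n{statement}\n\n"
        "### Response\n"
    )


def review(statement: str, repo_id: str = "your-org/sec-compliance-hermes-7b") -> str:
    """Generate a compliance assessment; repo_id is a placeholder, not confirmed."""
    # Imported lazily so build_prompt stays usable without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id, device_map="auto")
    inputs = tokenizer(build_prompt(statement), return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=256)
    # Decode only the newly generated tokens, not the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

As the Uses section notes, outputs should be reviewed by a human and never treated as a definitive compliance determination.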
## Training Details

### Training Data

[More Information Needed]

### Training Procedure

<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->

#### Training Hyperparameters

- `bnb_4bit_compute_dtype = "float16"` (compute dtype for the 4-bit base model)
- `bnb_4bit_quant_type = "nf4"`
- `use_nested_quant = False`
- `fp16 = False`
- `bf16 = False` (will be `True` for the next training run)
- `per_device_train_batch_size = 4`
- `per_device_eval_batch_size = 4`
- `gradient_accumulation_steps = 1`
- `gradient_checkpointing = True`
- `max_grad_norm = 0.3`
- `learning_rate = 2e-5` (`1e-4` will be applied for a 13B)
- `weight_decay = 0.001`
- `optim = "paged_adamw_32bit"`
- `lr_scheduler_type = "constant"`
- `max_steps = 13000`
- `warmup_ratio = 0.03`
- `group_by_length = True`

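The `bnb_4bit_*` names suggest these settings were passed to `bitsandbytes` and `transformers` for 4-bit (QLoRA-style) fine-tuning. A hedged configuration sketch of how they would map onto those APIs (the `output_dir` is a placeholder; the exact training script is not shown in this card):

```python
import torch
from transformers import BitsAndBytesConfig, TrainingArguments

# 4-bit quantization of the base model, per the listed bnb_4bit_* settings.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,   # bnb_4bit_compute_dtype = "float16"
    bnb_4bit_use_double_quant=False,        # use_nested_quant = False
)

# Trainer settings mirroring the listed hyperparameters.
training_args = TrainingArguments(
    output_dir="./results",                 # placeholder path
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=1,
    gradient_checkpointing=True,
    max_grad_norm=0.3,
    learning_rate=2e-5,                     # 1e-4 planned for a 13B run
    weight_decay=0.001,
    optim="paged_adamw_32bit",
    lr_scheduler_type="constant",
    max_steps=13000,
    warmup_ratio=0.03,
    group_by_length=True,
    fp16=False,
    bf16=False,                             # planned True for the next run
)
```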
## Evaluation

<!-- This section describes the evaluation protocols and provides the results. -->

### Testing Data, Factors & Metrics

#### Testing Data

<!-- This should link to a Data Card if possible. -->

[More Information Needed]

#### Metrics

[More Information Needed]

### Results

[More Information Needed]

#### Summary

## Model Examination [optional]

<!-- Relevant interpretability work for the model goes here -->

[More Information Needed]

### Model Architecture and Objective

[More Information Needed]

### Compute Infrastructure

[Google Colab]

#### Hardware

[1xA100]