prital27 committed on
Commit 71d0e61 · verified · 1 Parent(s): bd575bc

Update README.md

Files changed (1)
  1. README.md +36 -20

README.md CHANGED
@@ -39,49 +39,56 @@ tags:
 
 <!-- Provide the basic links for the model. -->
 
- - **Repository:** [More Information Needed]
- - **Paper [optional]:** [More Information Needed]
 - **Demo [optional]:** [More Information Needed]
 
 ## Uses
 
- <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
 
 ### Direct Use
 
- <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
-
- [More Information Needed]
 
 ### Downstream Use [optional]
 
- <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
 
- [More Information Needed]
 
 ### Out-of-Scope Use
 
- <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
-
- [More Information Needed]
 
 ## Bias, Risks, and Limitations
 
- <!-- This section is meant to convey both technical and sociotechnical limitations. -->
-
- [More Information Needed]
 
 ### Recommendations
 
- <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
 
 Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
 
 ## How to Get Started with the Model
 
- Use the code below to get started with the model.
 
- [More Information Needed]
 
 ## Training Details
@@ -102,8 +109,13 @@ Use the code below to get started with the model.
 
 #### Training Hyperparameters
 
- - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
 
 #### Speeds, Sizes, Times [optional]
 
 <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
@@ -136,11 +148,15 @@ Use the code below to get started with the model.
 
 ### Results
 
- [More Information Needed]
 
- #### Summary
 
 ## Model Examination [optional]
 
 
 <!-- Provide the basic links for the model. -->
 
+ - **Repository:** https://huggingface.co/prital27/tinyllama-lora-cli-utils
+ - **Paper [optional]:** N/A
 - **Demo [optional]:** [More Information Needed]
 
 ## Uses
 
 ### Direct Use
 
+ This model is fine-tuned for answering CLI-related questions. It is best suited to generating shell command suggestions for tasks involving tools such as `git`, `tar`, `ssh`, general Unix commands, and basic `sed` and `grep` usage. It is well suited to AI assistants, terminal copilots, and educational tools.
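The card's usage example formats requests with a `### Question:` / `### Answer:` template; a minimal helper for producing that prompt is sketched below (the function name `build_prompt` is illustrative, not part of the released code):

```python
def build_prompt(question: str) -> str:
    # Template matching the usage example in this card:
    # "### Question:\n...\n\n### Answer:\n"
    return f"### Question:\n{question}\n\n### Answer:\n"

print(build_prompt("How do I list files sorted by size?"))
```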
 
 
 
 ### Downstream Use [optional]
 
+ This adapter can be integrated into a CLI assistant application or chatbot for developers and system administrators.
 
 ### Out-of-Scope Use
 
+ - Not suitable for general conversation or non-technical queries.
+ - Not intended for security-sensitive operations (e.g., altering SSH settings on production systems).
+ - May produce incorrect or unsafe commands if misused.
 
 ## Bias, Risks, and Limitations
 
+ - Does not generalize well to non-trained or very obscure command-line tools.
+ - May hallucinate incorrect or risky commands if given vague instructions.
+ - No safety layer is applied to verify command validity.
 
 ### Recommendations
 
+ - Use with human supervision.
+ - Always validate generated commands before execution.
 
 Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
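The validate-before-execution recommendation can be sketched as a small gate around any model-suggested command (the helper name and the reviewer callback are illustrative, not part of the released code):

```python
def run_if_confirmed(command: str, confirm) -> bool:
    """Execute a model-suggested command only after explicit approval.

    `confirm` is any callable returning True/False (e.g. a prompt shown to
    the user); the actual execution step is left commented out so this
    sketch stays safe to run.
    """
    if not confirm(command):
        return False
    # subprocess.run(shlex.split(command))  # only after human review
    return True

# A reviewer that rejects anything containing "rm -rf":
cautious = lambda cmd: "rm -rf" not in cmd
print(run_if_confirmed("grep -rn TODO .", cautious))    # True
print(run_if_confirmed("rm -rf /tmp/cache", cautious))  # False
```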
 
 ## How to Get Started with the Model
 
+ ```python
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+ from peft import PeftModel
+
+ # Load the base model, then attach the LoRA adapter.
+ tokenizer = AutoTokenizer.from_pretrained("prital27/tinyllama-lora-cli-utils")
+ base = AutoModelForCausalLM.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")
+ model = PeftModel.from_pretrained(base, "prital27/tinyllama-lora-cli-utils")
+
+ prompt = "### Question:\nHow do I search for TODOs recursively?\n\n### Answer:\n"
+ inputs = tokenizer(prompt, return_tensors="pt").to(model.device)  # works on CPU or GPU
+ outputs = model.generate(**inputs, max_new_tokens=50)
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+ ```
 
 ## Training Details
 
 #### Training Hyperparameters
 
+ - **Precision:** fp16 mixed precision
+ - **Epochs:** 3
+ - **Batch size:** 2 (gradient accumulation steps = 2)
+ - **Learning rate:** 2e-4
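With a per-device batch size of 2 and gradient accumulation of 2, gradients are accumulated over two micro-batches before each optimizer step, giving an effective batch size of 4. A minimal sketch collecting the card's stated values (the dict keys mirror `transformers.TrainingArguments` naming for reference only; this is not the author's training script):

```python
# Hyperparameters as stated in this card.
training_config = {
    "fp16": True,                      # fp16 mixed precision
    "num_train_epochs": 3,
    "per_device_train_batch_size": 2,
    "gradient_accumulation_steps": 2,
    "learning_rate": 2e-4,
}

# Effective batch size = micro-batch size x accumulation steps.
effective_batch_size = (training_config["per_device_train_batch_size"]
                        * training_config["gradient_accumulation_steps"])
print(effective_batch_size)  # 4
```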
 
 #### Speeds, Sizes, Times [optional]
 
 <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
 
 
 ### Results
 
+ - Accuracy on direct prompts: ~85%
+ - Basic shell command correctness: high
+ - Multi-line/bash scripting: limited
+
+ #### Summary
+
+ The model reliably suggests shell commands for common CLI tasks. Performance degrades on ambiguous prompts or complex multi-line scripts.
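The accuracy figure presumably reflects checking generated commands against references; a minimal scoring sketch is shown below (the helper and the mini eval set are hypothetical, not the card's actual test data):

```python
import shlex

def command_matches(generated: str, reference: str) -> bool:
    # Token-wise comparison after shell-style splitting, so differences
    # in whitespace do not count as errors.
    return shlex.split(generated.strip()) == shlex.split(reference.strip())

# Hypothetical mini eval set of (generated, reference) pairs.
cases = [
    ("grep -rn TODO .", "grep  -rn TODO ."),         # whitespace-insensitive match
    ("tar -xzf archive.tar.gz", "tar -xzf archive.tar.gz"),
    ("ssh user@host", "ssh -p 22 user@host"),        # near miss, counted wrong
]
accuracy = sum(command_matches(g, r) for g, r in cases) / len(cases)
print(f"{accuracy:.0%}")  # 67%
```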
 
 ## Model Examination [optional]