Model Card for agri_finetuned_model
Model Details
Model Description
This is a transformers-based model fine-tuned for generative AI tasks, particularly in data engineering and AI service applications. It has been optimized for structured text generation, analytics, and AI-assisted workflows. The model supports multi-turn interactions and is designed for business intelligence, data insights, and technical documentation generation.
Developed by: Harshraj Bhoite
Funded by: Self-funded
Shared by: Harshraj Bhoite
Model type: Transformer-based causal language model (GPT-2)
Language(s) (NLP): English
License: Apache 2.0 / MIT / Custom
Finetuned from model: GPT-2
Model Sources
Repository: https://huggingface.co/Harshraj8721/agri_finetuned_model
Uses
Direct Use
AI-assisted data engineering documentation generation
Business intelligence reports and data insights automation
Technical content creation for AI and analytics
Downstream Use
Fine-tuning for agriculture-specific AI applications
Conversational AI in data analytics applications
AI-driven customer support for analytics tools
Out-of-Scope Use
Not intended for real-time conversational AI without further optimization
May not perform well in non-English languages
Bias, Risks, and Limitations
Bias: Outputs reflect the composition of the fine-tuning data (technical blogs, data engineering articles, and structured datasets), so coverage and framing may be skewed toward those domains.
Limitations: The model may generate inaccurate or misleading responses, particularly in highly technical or niche scenarios.
Mitigation: Users should validate outputs before relying on them for critical decision-making.
How to Get Started with the Model
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Harshraj8721/agri_finetuned_model"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

input_text = "Explain Delta Lake architecture"
inputs = tokenizer(input_text, return_tensors="pt")
output = model.generate(**inputs)
print(tokenizer.decode(output[0]))
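For longer or more varied completions, generation parameters can be passed to generate(). The snippet below continues from the example above; the values shown are illustrative defaults, not settings recommended specifically for this model.

output = model.generate(
    **inputs,
    max_new_tokens=200,   # limit on newly generated tokens
    do_sample=True,        # sample instead of greedy decoding
    temperature=0.7,       # soften the next-token distribution
    top_p=0.9,             # nucleus sampling cutoff
)
print(tokenizer.decode(output[0], skip_special_tokens=True))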
Training Details
Training Data
Dataset: Proprietary dataset of technical blogs, data engineering articles, and structured datasets.
Preprocessing: Tokenization with the GPT-2 byte-level Byte Pair Encoding (BPE) tokenizer (a short tokenization sketch follows below).
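As a quick sanity check of the tokenizer, this minimal sketch loads it from the repository and prints the BPE pieces and token ids for an arbitrary sample sentence (the sentence is not from the training data).

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Harshraj8721/agri_finetuned_model")
sample = "Drip irrigation conserves water in arid regions."
print(tokenizer.tokenize(sample))      # sub-word pieces produced by the byte-level BPE merges
print(tokenizer(sample)["input_ids"])  # corresponding token ids fed to the model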
Training Procedure
Hyperparameters
Batch size: 16
Learning rate: 3e-5
Precision: fp16 mixed precision
Optimizer: AdamW
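The hyperparameters above map onto a Hugging Face TrainingArguments configuration roughly as sketched below; the output directory, epoch count, and any value not listed in this card are assumptions for illustration, not the author's actual training script.

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="agri_finetuned_model",  # hypothetical output directory
    per_device_train_batch_size=16,     # batch size: 16 (assumed to be per device)
    learning_rate=3e-5,                 # learning rate: 3e-5
    fp16=True,                          # fp16 mixed precision
    optim="adamw_torch",                # AdamW optimizer
    num_train_epochs=3,                 # assumed; not reported in this card
)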
Compute Infrastructure
Hardware: NVIDIA A100 GPUs (x4)
Cloud Provider: AWS / Azure / GCP
Training Duration: ~36 hours
Evaluation
Testing Data, Factors & Metrics
Testing Data
Synthetic datasets from AI-powered analytics use cases
Real-world structured datasets from data engineering pipelines
Metrics
Perplexity (PPL): Measures how well the model predicts held-out text
BLEU Score: Measures n-gram overlap between generated text and reference text
F1 Score: Harmonic mean of precision and recall
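Perplexity can be reproduced on any held-out text with a short script such as the sketch below; the sample sentence is a stand-in for the actual (proprietary) test data.

import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Harshraj8721/agri_finetuned_model"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

text = "Crop rotation helps preserve soil nutrients across growing seasons."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    loss = model(**inputs, labels=inputs["input_ids"]).loss  # mean token-level cross-entropy
print(math.exp(loss.item()))  # perplexity = exp(average negative log-likelihood)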
Results
Perplexity: 9.7 (lower is better)
BLEU Score: 34.2 (higher is better)
F1 Score: 85.5%
Environmental Impact
Hardware Type: NVIDIA A100 GPUs
Hours used: 36 hours
Carbon Emitted: ~50 kg CO2eq (estimated using ML CO2 Impact Calculator)
Citation
If you use this model, please cite it as follows:
@misc{Harshraj8721/agri_finetuned_model/2025,
  title={agri_finetuned_model},
  author={Harshraj Bhoite},
  year={2025},
  url={https://huggingface.co/Harshraj8721/agri_finetuned_model}
}
Contact
For queries, reach out to:
Email: [email protected]
LinkedIn: linkedin.com/in/harshrajb/