import streamlit as st

st.markdown(
    """ """,
    unsafe_allow_html=True,
)

st.html(
    """
Debopam Chowdhury (Param)

Creator of AiGymBuddy.in | Machine Learning | Deep Learning | Flutter | Math | MLOps | TensorFlow | FastAPI

(He/Him)

About Me

Passionate about Machine Learning, Computer Science and Mathematics. I have a strong grasp of Machine Learning and Deep Learning fundamentals, Computer Networks, Operating Systems and DSA, and have solved 100+ problems on LeetCode. I am keen to keep learning Mathematics, Deep Learning and Statistics, and I like to understand things in depth.

Skills

Technical

  • Programming: Python, Java, Dart, VS Code, Git, GitHub, Jupyter Notebooks, CI/CD
  • Machine Learning & Deep Learning: TensorFlow, PyTorch, Scikit-learn, Keras, Supervised Learning, Unsupervised Learning, Neural Networks, Sequence Modeling, Convolution, Attention Mechanisms, Transformer, GPT, BERT, Hyperparameter Optimization
  • Data Handling: Pandas, NumPy, Data Manipulation, Data Preparation, SQL, PySpark, NoSQL
  • Cloud & DevOps: Docker, AWS, Cloud-AI, FastAPI
  • Mathematics: Linear Algebra, Probability, Statistics, Boosting Methods
  • Frameworks & Tools: Flutter, Firebase, Deep Learning Frameworks, Model Training & Optimization, Version Control, LLM Fine-Tuning
  • Generative AI: Vector Embeddings, Indexing, Chunking, RAG pipelines, LlamaIndex, LangChain, ColPali, Byaldi, Chroma DB

Experience

GENERATIVE AI ENGINEER (Contract - Remote)

Private Client | Sydney, Australia

October - November 2024

Projects

LLM fine-tuning and a SQL agent with automatic query execution on DuckDB, a schema retriever for CSV files, and a manual SQL executor.

The web app you are currently using

Deployed

Deep Learning Based Recommendation System

Initially, I aimed to build a recommendation system from scratch using TensorFlow Recommenders (TFRS) on the MovieLens 1M dataset. This involved creating user and movie embeddings with a candidate generation and ranking model.

However, this TFRS approach proved too resource-intensive and time-consuming for effective training and testing. Crucially, the initial results weren't satisfactory for deployment.

Therefore, for the deployed web application, I switched to pre-trained models (BGE embeddings and re-ranking). This offered:

  • Better Performance: More relevant recommendations.
  • Reduced Resources/Time: Faster training and deployment.

While the TFRS code is also included in the web app, the pre-trained model approach was chosen for its superior results and efficiency in a deployment setting. A future improvement could be fine-tuning the pre-trained models for even better performance.

Technologies used: TensorFlow Recommenders, ScaNN, Vector DB, Distributed GPU Training, LangChain, Streamlit, BAAI BGE Models
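
The two-stage idea described above (embedding-based candidate retrieval followed by re-ranking) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the deployed code: the 3-dimensional vectors stand in for real BGE embeddings, the `RERANK_SCORES` table stands in for a re-ranker model's output, and all names (`retrieve`, `rerank`, `ITEM_VECS`) are hypothetical.

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(query_vec, item_vecs, k):
    # Stage 1: candidate generation by embedding similarity (top-k titles).
    ranked = sorted(
        item_vecs,
        key=lambda title: cosine(query_vec, item_vecs[title]),
        reverse=True,
    )
    return ranked[:k]

def rerank(candidates, score_fn):
    # Stage 2: reorder the short list with a more expensive scorer.
    return sorted(candidates, key=score_fn, reverse=True)

# Hypothetical toy vectors standing in for real BGE embeddings.
ITEM_VECS = {
    "Toy Story": [0.9, 0.1, 0.0],
    "Heat": [0.0, 0.2, 0.9],
    "Aladdin": [0.8, 0.3, 0.1],
    "Casino": [0.1, 0.1, 0.8],
}
# Hypothetical relevance scores standing in for a re-ranker model's output.
RERANK_SCORES = {"Toy Story": 0.6, "Aladdin": 0.9, "Heat": 0.1, "Casino": 0.2}

query = [1.0, 0.2, 0.0]  # placeholder embedding of a user's request
candidates = retrieve(query, ITEM_VECS, k=2)
recommendations = rerank(candidates, RERANK_SCORES.get)
print(recommendations)
```

Retrieval keeps the expensive scorer off the full catalogue: only the `k` nearest items reach the second stage, which is free to reorder them.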

Deployed

IBM EMPLOYEE ATTRITION PREDICTOR (End to End with Deployment to AWS, FastAPI with Proxy Server)

Objective: Predicted employee attrition with 85% AUC to improve employee retention and business performance.

Model Development: Hyperparameter-optimized Multi-Layer Perceptron, XGBoost, and Logistic Regression models, with both training and inference pipelines.

Backend: Developed a FastAPI backend for real-time predictions, using Pydantic to validate the schemas of incoming requests and outgoing responses.

Deployment: Containerized with Docker, deployed on AWS EC2, managed via AWS ECR.

CI/CD: Set up an automated CI/CD pipeline using GitHub Actions for seamless updates.

Web Application: Built a user-friendly interface using Flutter Web for real-time interaction.

Security: Handled HTTPS requests using Caddy as a reverse proxy server.

Technologies used: TensorFlow, AWS, Docker, FastAPI, CI/CD Pipeline, Multi-Layer Perceptron, Neural Network, XGBoost, Logistic Regression, Hyperparameter Tuned Models, GitHub Actions, Pydantic, Flutter Web, Reverse-Proxy-Server: Caddy
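
For readers unfamiliar with the metric, AUC can be read as the probability that a randomly chosen positive case (an employee who left) receives a higher predicted score than a randomly chosen negative case. The sketch below computes it that way on made-up labels and scores; the data and the `roc_auc` helper are illustrative assumptions, not the project's evaluation code.

```python
def roc_auc(labels, scores):
    # Fraction of (positive, negative) pairs in which the positive example
    # gets the higher score; ties count as half a win.
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Made-up example: 1 = employee left, 0 = employee stayed.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.3, 0.4, 0.6, 0.1]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

Here 5 of the 6 positive/negative pairs are ordered correctly, so the AUC is 5/6.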

Deployed

AI GYM BUDDY (LangChain | Flutter | Riverpod | Gemini)

Personalized AI-Driven Workouts with Smart Equipment Detection and Progress Tracking

Features: AI Instrument Detection (Camera or Gallery), Exercises based on Available Equipment, Time, Preferred Muscle Groups & Custom Requests, Dynamic Video Tutorial Finder for each exercise, Highly personalized AI-generated routine, Workout History Tracker, Easy Sign-Up/Login with Google OAuth

Technologies used: Dart, Flutter, Firebase, Gemini 1.5 Flash, Riverpod, LangChain, FastAPI, Google OAuth

Licenses: The code for this app/website was written from scratch, and I hold all distribution rights.

Deployed

Non-Sequential Breast Cancer Classification System

Multi-Modal Cancer Detection: Developed a novel multi-output deep learning model for breast cancer detection, predicting cancer presence, invasiveness, and difficult-negative case status. The model incorporates both mammogram images and tabular clinical data, leveraging a non-sequential architecture to process distinct data modalities.

Fine-Tuned Image Feature Extraction: Used a pre-trained EfficientNetV2B3 model for image feature extraction, fine-tuning the layers from block 6 onwards to adapt the learned representations to this task, with the aim of making the model more robust and accurate.

Distributed Training: Accelerated model training through distributed training using TensorFlow's MirroredStrategy on 2xT4 GPUs for 9 hours on Kaggle, demonstrating proficiency in optimizing model training with limited computational resources.

Technologies used: TensorFlow, Transfer Learning, EfficientNetV2, Fused-MBConv

Deployed

Image Entity Extraction with Qwen2 VL: Large-Scale Inference

Problem Statement: E-commerce and healthcare industries struggle to efficiently extract product details (weight, volume, dimensions) from images at scale.

Action: Developed a large-scale image-to-text inference pipeline using Qwen2-VL 2B, incorporating image preprocessing, regex-based post-processing, and parallel processing. Processed 84,000 of 131,000 test images.

Result: Successfully extracted product values from a significant portion of the dataset. Our team of four ranked 172nd out of ~75,000 in the Amazon ML Challenge with an F1 score of 0.47, demonstrating the solution's potential for automated product information extraction.

Technologies used: Qwen2 VL, Python, Regex, Parallel Processing
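
The regex and parallel-processing steps mentioned above can be sketched as follows. This is an illustrative reconstruction, not the challenge code: the unit list, the sample model outputs, and the names `extract_value` and `extract_all` are assumptions, and the thread pool only demonstrates the fan-out pattern, since the parallelism strategy actually used is not described here.

```python
import re
from concurrent.futures import ThreadPoolExecutor

# A number followed by a unit. Character classes are used instead of
# backslash escapes; the unit list is illustrative only.
VALUE_PATTERN = re.compile(
    "([0-9]+(?:[.][0-9]+)?)[ ]*(kg|g|ml|l|cm|mm)(?![a-z])",
    re.IGNORECASE,
)

def extract_value(text):
    # Return (number, unit) for the first match in a model output, else None.
    match = VALUE_PATTERN.search(text)
    if match is None:
        return None
    return float(match.group(1)), match.group(2).lower()

def extract_all(texts, workers=4):
    # pool.map preserves input order, so results line up with the inputs.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(extract_value, texts))

# Hypothetical model outputs, one per image.
outputs = [
    "Net weight: 500 g",
    "Volume 1.5 L approx",
    "no value found",
    "Height: 12cm",
]
results = extract_all(outputs)
print(results)
```

Returning `None` for unparseable outputs keeps the result list aligned with the input images, so failed extractions can be counted or retried separately.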

LLM-based ATS System using VertexAI Embedding

Technologies used: LangChain, VertexAI Embedding, Streamlit, Postgres Vector

Ongoing

Volunteering

Git/GitHub Instructor (Volunteer)

Carried out sessions to teach juniors the fundamentals of Git and GitHub, covering version control, collaboration, and best practices.

Event Coordinator

Acharya Technical Club - Steigen

Achievements

Certifications

Education

BE in Information Science

Acharya Institute of Technology, Bangalore

2021-2025

CGPA-8.12

Higher Secondary Education

Kalyani Public School, Barasat, Kolkata

2021

77% (Auto Pass Covid Batch)

Secondary Education

Sacred Heart Day High School, Kolkata

2019

90%

""" )