Hi, I'm Tarun Kumar Rathore

Data Science Expert

I specialise in building end-to-end machine learning and generative-AI systems that convert raw data into business insights and automation.

About Me

I’m a Data Scientist and Generative AI Engineer passionate about creating data-driven, autonomous AI solutions that address complex business challenges. My expertise spans finance, e-commerce, and customer analytics, where I design intelligent systems that transform data into meaningful and actionable insights.

I've built and deployed predictive and generative AI models that boost performance, improve decision-making, and enable intelligent automation. With a strong foundation in Machine Learning, Deep Learning, and Agentic AI, I focus on delivering solutions that create measurable business impact.

By applying advanced predictive and generative architectures, I aim to enhance efficiency, speed up decisions, and drive digital transformation. I'm driven to build AI systems that solve today's problems and evolve for tomorrow's innovations.

20+

Projects Completed

10+

Certifications

10+

Technologies Mastered

Featured Projects

RAG-Based Document QA System

A RAG-powered PDF Q&A system built with LangChain, Groq Llama 3.3, HuggingFace embeddings, and ChromaDB. It enables natural-language querying of technical documents through semantic chunking and vector search, delivering accurate, source-backed responses with efficient retrieval.

LangChain HF ChromaDB
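
The retrieval step described above can be sketched in plain Python. This is a minimal illustration, not the project's code: fixed-size word windows stand in for semantic chunking, term-count vectors stand in for HuggingFace embeddings, and an in-memory list stands in for ChromaDB. All function names and the sample sentences are assumptions made for the example.

```python
import math
import re
from collections import Counter


def chunk_text(text, max_words=40):
    """Split text into fixed-size word windows (a simple stand-in for semantic chunking)."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text):
    """Toy embedding: lowercase term counts. A real system would use a dense embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query as (chunk_index, score, text)."""
    q = embed(query)
    scored = [(i, cosine(q, embed(c)), c) for i, c in enumerate(chunks)]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:k]


# Hypothetical sample chunks; in a real pipeline these would come from the PDF.
docs = [
    "The pump must be serviced every six months.",
    "Warranty claims require the original invoice.",
    "Replace the filter when the pressure gauge reads low.",
]
top = retrieve("How often should the pump be serviced?", docs, k=1)
```

Returning the chunk index alongside the text is what allows an answer to cite its source; the retrieved chunks would then be passed to the language model as context.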

Multi-Agent Workflow with LangGraph

Developed a transparent, stateful AI chatbot using LangGraph with a multi-stage workflow: preprocessing, sentiment analysis, Groq-powered response generation, and LangSmith logging. Enables traceable, debuggable conversational AI with context-aware responses and complete end-to-end observability.

LangGraph LangSmith Groq
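
The idea of a stateful, multi-stage workflow can be sketched without any framework. The example below is an assumption-laden illustration, not LangGraph's API: each stage is a plain function that reads and updates a shared state dictionary, a toy word list replaces the sentiment model, a template replaces the Groq-powered response, and a `trace` list replaces LangSmith logging.

```python
def preprocess(state):
    """Normalize whitespace and case before later stages read the text."""
    state["clean_text"] = " ".join(state["user_text"].lower().split())
    return state


def analyze_sentiment(state):
    """Toy sentiment check using a hypothetical word list instead of a model."""
    negative_words = {"bad", "broken", "angry", "slow"}
    tokens = set(state["clean_text"].split())
    state["sentiment"] = "negative" if tokens & negative_words else "neutral"
    return state


def generate_response(state):
    """Placeholder for the LLM call: pick a template based on sentiment."""
    if state["sentiment"] == "negative":
        state["reply"] = "Sorry about the trouble. Could you share more details?"
    else:
        state["reply"] = "Thanks for your message. How can I help?"
    return state


def run_workflow(user_text, stages):
    """Run each stage in order, recording which stages ran in state['trace']."""
    state = {"user_text": user_text, "trace": []}
    for stage in stages:
        state = stage(state)
        state["trace"].append(stage.__name__)
    return state


STAGES = [preprocess, analyze_sentiment, generate_response]
result = run_workflow("  The app is SLOW today ", STAGES)
```

Because every stage writes into the same state and the trace records the order of execution, any intermediate value can be inspected after a run, which is the property that makes this style of workflow debuggable.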

Azure OCR Document Text Extraction

Built an Azure Computer Vision OCR solution using Azure Cognitive Services to extract text from identity and financial documents. Implemented the Read API with asynchronous processing, status polling, and image-based text detection to analyze unstructured document images.

Azure Cognitive Services Python
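
The submit-then-poll pattern behind asynchronous OCR can be sketched as follows. This is a generic illustration, not the Azure SDK: `FakeReadClient`, its `submit` and `get_status` methods, the status strings, and the sample output lines are all hypothetical stand-ins so the example runs without network access.

```python
import time


class FakeReadClient:
    """Hypothetical in-memory stand-in for an asynchronous OCR service."""

    def __init__(self, polls_until_done=3):
        self._remaining = polls_until_done

    def submit(self, image_bytes):
        """Pretend to start an OCR job and return an operation ID."""
        return "operation-1"

    def get_status(self, operation_id):
        """Report 'running' until the configured number of polls has elapsed."""
        self._remaining -= 1
        if self._remaining > 0:
            return {"status": "running"}
        return {"status": "succeeded", "lines": ["SAMPLE ID CARD", "Name: Jane Doe"]}


def extract_text(client, image_bytes, poll_interval=0.01, timeout=5.0):
    """Submit an image, then poll until the job succeeds, fails, or times out."""
    operation_id = client.submit(image_bytes)
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        result = client.get_status(operation_id)
        if result["status"] == "succeeded":
            return result["lines"]
        if result["status"] == "failed":
            raise RuntimeError("OCR operation failed")
        time.sleep(poll_interval)
    raise TimeoutError("OCR operation did not finish in time")


lines = extract_text(FakeReadClient(), b"fake image bytes")
```

The explicit deadline keeps a stalled job from blocking the caller indefinitely; with a real service the poll interval would typically be on the order of a second rather than milliseconds.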

DeepSeek-OCR Text Extraction System

Built an intelligent OCR system using the DeepSeek-OCR model with a Gradio interface to extract text from images and documents. Enables accurate text recognition from complex layouts, solving document digitization challenges with a state-of-the-art deep learning architecture.

DeepSeek-OCR Gradio Python

Real-Time Weather Power BI Dashboard

Developed a real-time Power BI weather dashboard using live API data and advanced DAX formulas, enabling automatic refresh and instant insights.

Power BI API DAX

Sentiment Classification of Restaurant Reviews

Compared 9 machine learning models for sentiment classification; the optimized SVM achieved 97.5% accuracy and 0.9978 AUC, automating review analysis and improving efficiency by 40%.

Python NLP Scikit-learn

3D Bounding Box Detection & Tracking in Videos

Real-time 3D detection of shoe, cup, chair, and camera objects in video using MediaPipe Objectron, generating accurate 3D bounding boxes for object size, orientation, and position tracking.

CV2 MediaPipe Streamlit

Gemma-2B Quote Generator: Fine-Tuning

I fine-tuned Google’s Gemma-2B into a specialized Quote Generator using LoRA and FP16 precision. After hitting resource limits on Colab, I migrated the workflow to Kaggle’s dual Tesla T4 GPUs, successfully resolving library conflicts and implementing secure API management via Kaggle Secrets.

LoRA Gemma-2B FP16 Hugging Face Transformers

Student Performance Tracker Dashboard

A secure, interactive Power BI dashboard for parents to track student performance using RLS and MySQL integration.

Python MySQL Power BI

Data-Driven Insights for the Publishing Industry

Interactive Streamlit app performing EDA, visualizing book sales, author performance, and publishing insights.

Python Streamlit EDA

Multi Language Text Translator

A lightweight AI app that combines NLP and speech synthesis to translate and speak text in 60+ languages, making global communication effortless.

Python mtranslate gTTS

Facial & Hand Landmark Tracking System

Developed a high-performance, real-time application using MediaPipe's Vision Tasks to enable touchless user interaction by simultaneously tracking facial features and hand gestures from a live video stream.

CV2 MediaPipe Streamlit

Face Transformation

Developed a real-time facial analysis system using MediaPipe Face Landmarker, capturing 478 3D landmarks and blendshape expressions to enable precise, low-latency facial feature tracking.

CV2 MediaPipe Streamlit

Skills & Technologies

A comprehensive toolkit for building intelligent solutions

Machine Learning

5
Data Analytics
Machine Learning
Natural Language Processing
Deep Learning
Scikit-learn

Generative AI

8
LangChain
LangGraph
Groq
HuggingFace
ChromaDB
LoRA
Transformers
RAG

Programming & Markup

4
Python
HTML
CSS
MySQL

Tools & Platforms

6
Microsoft Power BI
TensorFlow
Keras
Google Cloud Platform (GCP)
MLOps
CI/CD

Certifications

10+ industry-recognized certifications across Data Science, ML, DL, and Cloud.

Code with Harry Data Science
Deloitte Virtual Program
EY + MICROSOFT
Google Python Coursera
IBM ML0101EN Certificate Cognitive Class
Oracle Data Science Certificate
Power BI Techtip
Power BI Course PE
Slash Mark
Udemy Data Analyst

Resume

Download My Resume

Get a detailed overview of my experience, skills, and achievements in Data Science and Generative AI.

Comprehensive skill breakdown
Project portfolio highlights
Professional experience
Certifications & achievements
Download Resume (PDF)

* Resume will be downloaded as a PDF file

Contact Me

Let's collaborate or talk about Data Science and AI projects!

Let's Connect

I'm always interested in discussing new opportunities, collaborations, or just having a chat about data science and AI.
