Looking for opportunities in Data Science and AI/ML

Chirag Sharma

Physics postgraduate research -> Data Science & AI/ML

I like building intelligent systems. I hold a PG Diploma in Artificial Intelligence from CDAC ACTS Pune and a BS-MS in Physics from IISER Bhopal, and I am learning to apply AI to real-world problems with a focus on data-driven decision making. I am passionate about turning data and research into measurable product improvements.

ML · DL | Data Analysis | LLM Agents | RAG Systems | Time-Series | NLP · CV
+91 9079882590
Pune, Maharashtra
About Me

From Physics to Data Science: Solving Real-World Problems with Data

Hello! I'm Chirag Sharma, an early-career Data Scientist and AI/ML Engineer. I completed a BS-MS in Physics at IISER Bhopal, where I developed an analytical, research-driven approach to complex problems. I worked on cosmology for my master's thesis and then on astronomy as a Project Associate at Ahmedabad University, in collaboration with SAC, ISRO.

Repeated exposure to computational projects and the desire to work on more applied problems led me to pursue a PG Diploma in AI at CDAC ACTS Pune. Since then, I have worked on AI/ML projects in varied domains, including time-series anomaly detection for maritime sensor data and an LLM-powered agentic RAG application for financial queries.

I'm most interested in the real-world applications of AI/ML along with data analysis for data-driven decision making.

Timeline
Mar 2026
Finsense — AI Agent for Financial Queries & Tax Calculation
Personal Project · Deployed on Streamlit
Built a conversational AI agent that answers personal finance queries and performs Indian tax calculations from natural language input, deployed live on Streamlit.
Dec 2025 – Jan 2026
Anomaly Detection Pipeline
Seaker Systems Pvt. Ltd. · PG-DAI Project
Developed an end-to-end anomaly detection pipeline over ~23 million maritime AIS signals, replacing a static rule-based system with a dynamic, AI-driven workflow.
Aug 2025 – Feb 2026
PG Diploma in Artificial Intelligence
CDAC ACTS Pune · 93% · AIR 1
Intensive course with theory and lab sessions on Data Analysis, Machine Learning, Deep Learning, NLP, Computer Vision, and AI Platforms & Trends.
Oct 2024 – Jun 2025
Project Associate
Ahmedabad University & SAC, ISRO
Comparative analysis of astronomical filament detection algorithms. Presented at SAMHITA-2025 at SAC, ISRO.
Aug 2019 – May 2024
BS-MS in Physics
IISER Bhopal · CGPA 7.95
Five-year integrated program. Thesis: Understanding Physics of the Early Universe using Boltzmann Equations.
Skills & Tools

Technologies I work with

💻
Programming
Python · Java · C
🧠
AI / ML
scikit-learn · TensorFlow · PyTorch · Hugging Face · Fine-Tuning · NLTK · OpenCV
🤖
LLM & Agentic
LangChain · LangGraph · RAG · Fine-Tuning · Agentic Systems
⚙️
Dev Tools & Infra
Git · Docker · FastAPI · Flask · Streamlit · Playwright
️📈
ML Domains
NLP · Computer Vision · Anomaly Detection · Time-Series · LLM Apps
️📈
Data Analysis
SQL · Pandas · NumPy · Matplotlib · Seaborn · Plotly
🔬
Science & Math
Numerical Methods · ODE/PDE Solving · Statistical Modeling · Perturbation Theory
Work

Projects

01 Finsense — AI Agent for Financial Queries & Tax Calculation
LLM Agent · RAG · Deployed
Overview

Built a conversational AI agent that answers personal finance queries and performs Indian tax calculations from natural language input, deployed live on Streamlit.

Details
  • RAG-based document retrieval for financial policy Q&A with source grounding, using Indian Budget 2026 highlights and Post Office Savings Scheme data as the knowledge base.
  • Browser-based tax computation via Playwright, returning real-time results from the official calculator.
  • Used MiniLM-L6 (Hugging Face) for semantic embeddings and LLaMA-3.1-8B (Groq API) as the backbone LLM.
  • Exposed functionality via a REST-style API with persistent conversational memory; orchestrated with LangChain and instrumented with Python logging for backend observability.
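The retrieval-with-source-grounding idea in the bullets above can be sketched roughly as follows. This is an illustrative sketch, not the Finsense code: the bag-of-words `embed` function is a toy stand-in for a sentence-embedding model such as MiniLM, and the chunk texts, file names, and function names are hypothetical placeholders.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy bag-of-words vector (stand-in for a real sentence-embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query, best first."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c["text"])), reverse=True)
    return ranked[:k]

def build_prompt(query, hits):
    """Assemble a prompt that asks the LLM to answer only from tagged context."""
    context = "\n".join(f"[{h['source']}] {h['text']}" for h in hits)
    return (
        "Answer using only the context below and cite the source tags.\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )

# Hypothetical placeholder chunks, not real policy text.
CHUNKS = [
    {"source": "budget_highlights.txt", "text": "Placeholder summary of income tax slab changes."},
    {"source": "post_office_schemes.txt", "text": "Placeholder note on savings scheme interest rates."},
    {"source": "post_office_schemes.txt", "text": "Placeholder note on account opening rules."},
]

question = "What are the interest rates on the savings scheme?"
hits = retrieve(question, CHUNKS, k=2)
prompt = build_prompt(question, hits)
```

Keeping the source tag next to each retrieved chunk is what lets the final answer point back to the document it came from.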
2: AI capabilities unified in one agent
Deployed publicly: Streamlit Cloud**
RAG: Document retrieval with source grounding
REST: FastAPI + conversational memory backend
Tech Stack
Python · LangChain · RAG · Playwright · FastAPI · Streamlit · ChromaDB · Conv. Memory · Docker · GitHub
Finsense UI

**Note: Due to free-tier limitations, the app may be inactive. You can wake it up by opening the Streamlit and Hugging Face links and waiting about 30 seconds, or run it locally using the Dockerfile provided in the GitHub repository.

02 Context-Based Auto Anomaly Detection
Unsupervised ML · Time-Series
Overview

Developed an end-to-end anomaly detection pipeline over ~23 million maritime AIS signals, replacing a static rule-based system with a dynamic, AI-driven workflow.

Details
  • End-to-end pipeline over six months of per-minute maritime sensor data (~250,000 rows × 90 features).
  • Implemented and benchmarked three machine learning and deep learning models (Isolation Forest, LSTM Autoencoder, and Transformer Autoencoder) using reconstruction-error-based anomaly scoring, enabling data-driven threshold tuning.
  • Engineered a root-cause analysis module that surfaces the top contributing features for each detected anomaly, giving operations teams quantitative evidence for prioritizing maintenance.
  • Reduced data load latency by 98% (from ~2 min to under 2 sec) via Parquet-based optimization, enabling rapid exploratory analysis and real-time reporting.
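Reconstruction-error scoring, threshold tuning, and per-feature root-cause ranking, as described in the bullets above, might look roughly like this. It is a minimal sketch under stated assumptions, not the project pipeline: the reconstructions are assumed to come from an already trained autoencoder, and the feature names, sample values, and function names are hypothetical.

```python
def feature_errors(row, recon):
    """Per-feature squared reconstruction error for one time step."""
    return [(x - r) ** 2 for x, r in zip(row, recon)]

def anomaly_score(row, recon):
    """Mean squared reconstruction error across features."""
    errs = feature_errors(row, recon)
    return sum(errs) / len(errs)

def quantile_threshold(scores, q=0.99):
    """Simple index-based quantile of reference scores, used as the cutoff."""
    ordered = sorted(scores)
    idx = min(len(ordered) - 1, int(q * len(ordered)))
    return ordered[idx]

def top_contributors(row, recon, names, k=3):
    """Features ranked by their share of the total reconstruction error."""
    errs = feature_errors(row, recon)
    total = sum(errs) or 1.0  # avoid dividing by zero on a perfect reconstruction
    ranked = sorted(zip(names, errs), key=lambda p: p[1], reverse=True)
    return [(name, err / total) for name, err in ranked[:k]]

# Hypothetical feature names and values for illustration only.
names = ["speed", "heading", "engine_temp"]
row = [1.0, 2.0, 9.0]      # observed sensor reading
recon = [1.0, 2.5, 5.0]    # autoencoder reconstruction (assumed given)

score = anomaly_score(row, recon)
cutoff = quantile_threshold([0.1, 0.2, 0.3, 0.4, 5.0], q=0.5)
is_anomaly = score > cutoff
top = top_contributors(row, recon, names, k=2)
```

Deriving the cutoff from a quantile of scores on reference data, rather than hard-coding it, is one way to make the threshold data-driven.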
98%: Load time reduction (2 min → <2 sec)
250K: Rows × 90 features processed
3: Models benchmarked (IF · LSTM · Transformer)
Seaker: Company-guided project (Nov 2025 – Jan 2026)
Tech Stack
Python · PyTorch · scikit-learn · Isolation Forest · LSTM Autoencoder · Transformer Autoencoder · PyArrow · Pandas · Matplotlib · Seaborn · GitHub
03 Astronomical Filament Detection Analysis
Data Analysis · Astrophysics · ISRO
Overview

Project Associate at Ahmedabad University with ISRO's Space Applications Centre. I carried out a comparative study of filament detection algorithms on astronomical datasets to identify which best detects interstellar filamentary structures in observational data. Our findings suggested that DisPerSE and FilFinder tend to be more generous in detecting filaments, while getSF is more conservative and detects fewer, dominant filaments. On the quantitative side, we found that the MSSIM (Mean Structural Similarity) index was not a good metric for comparing the algorithms, as it did not behave as expected (Green et al. 2017). We concluded that new metrics are needed to compare the algorithms in an unbiased, objective manner.

Details
  • Researched and benchmarked multiple interstellar filament detection algorithms (FilFinder, getSF, DisPerSE) across diverse astronomical datasets, contributing to pipeline selection for ISRO's imaging analysis workflows.
  • Quantitatively evaluated algorithm performance using the MSSIM structural similarity index, enabling objective comparison across varied observational conditions.
  • Presented research findings at the SAMHITA-2025 national conference hosted at SAC, ISRO, Ahmedabad.
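For readers unfamiliar with the metric, a mean structural similarity score between two filament maps can be sketched as below. This uses the standard SSIM formula, but it is a simplified illustration rather than the project's evaluation code: it averages over non-overlapping tiles instead of the sliding windows that common implementations use, and the window size, `data_range`, and toy masks are assumptions.

```python
def ssim_window(x, y, data_range=1.0, k1=0.01, k2=0.03):
    """SSIM for two equally sized flat windows of pixel values."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    # Small constants stabilise the ratio when means or variances are near zero.
    c1, c2 = (k1 * data_range) ** 2, (k2 * data_range) ** 2
    num = (2 * mx * my + c1) * (2 * cov + c2)
    den = (mx ** 2 + my ** 2 + c1) * (vx + vy + c2)
    return num / den

def mssim(img_a, img_b, win=2, data_range=1.0):
    """Mean SSIM over non-overlapping win x win tiles of two 2D lists."""
    rows, cols = len(img_a), len(img_a[0])
    scores = []
    for r in range(0, rows - win + 1, win):
        for c in range(0, cols - win + 1, win):
            a = [img_a[i][j] for i in range(r, r + win) for j in range(c, c + win)]
            b = [img_b[i][j] for i in range(r, r + win) for j in range(c, c + win)]
            scores.append(ssim_window(a, b, data_range))
    return sum(scores) / len(scores)

# Toy binary masks: the same vertical "filament", shifted by one pixel.
mask_a = [[0, 1, 0, 0] for _ in range(4)]
mask_b = [[0, 0, 1, 0] for _ in range(4)]

same = mssim(mask_a, mask_a)
shifted = mssim(mask_a, mask_b)
```

In this toy case a one-pixel shift of a thin structure drops the score sharply, which gives a feel for how sensitive the index can be on sparse filament masks.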
3: Algorithms benchmarked (FilFinder · getSF · DisPerSE)
MSSIM: Structural similarity performance metric
ISRO: Presented at SAMHITA-2025, SAC, Ahmedabad
9 mo: Oct 2024 – Jun 2025
Tech Stack
Python · FilFinder · getSF · DisPerSE · Astropy · MSSIM · Matplotlib · FITS Data
04 Numerical Modeling of Cosmological Perturbations
Scientific Computing · Cosmology
Overview

Master's thesis at IISER Bhopal. Developed custom numerical solvers for the coupled PDEs arising in linear cosmological perturbation theory — the mathematical framework for understanding large-scale structure formation in the universe. A good reference is Callin 2006.

Details
  • Solved the full coupled PDE system from linear cosmological perturbation theory
  • Implemented a 4th-order Runge–Kutta (RK4) integrator for time evolution, which proved unstable for this system
  • Implemented adaptive RK5 with Cash–Karp parameters for improved stability
  • Interpreted results in the context of CMB anisotropies and structure formation
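The two integrators named above can be illustrated with a short sketch. This is not the thesis solver: it pairs a fixed-step RK4 update with an adaptive step built on the standard Cash–Karp tableau, and the step-size controller (safety factor 0.9, growth bounded between 0.2 and 5), the tolerance, and the test equation y' = -y are assumptions chosen for illustration.

```python
def rk4_step(f, t, y, h):
    """One classical fixed-step RK4 update for y' = f(t, y), y a list of floats."""
    k1 = f(t, y)
    k2 = f(t + h / 2, [yi + h / 2 * k for yi, k in zip(y, k1)])
    k3 = f(t + h / 2, [yi + h / 2 * k for yi, k in zip(y, k2)])
    k4 = f(t + h, [yi + h * k for yi, k in zip(y, k3)])
    return [yi + h / 6 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]

# Cash-Karp Butcher tableau: nodes, stage coefficients, 5th- and 4th-order weights.
C = [0, 1/5, 3/10, 3/5, 1, 7/8]
A = [
    [],
    [1/5],
    [3/40, 9/40],
    [3/10, -9/10, 6/5],
    [-11/54, 5/2, -70/27, 35/27],
    [1631/55296, 175/512, 575/13824, 44275/110592, 253/4096],
]
B5 = [37/378, 0, 250/621, 125/594, 0, 512/1771]
B4 = [2825/27648, 0, 18575/48384, 13525/55296, 277/14336, 1/4]

def cash_karp_step(f, t, y, h):
    """Return (5th-order solution, max abs gap to the embedded 4th-order one)."""
    n, ks = len(y), []
    for ci, row in zip(C, A):
        yi = [y[j] + h * sum(a * k[j] for a, k in zip(row, ks)) for j in range(n)]
        ks.append(f(t + ci * h, yi))
    y5 = [y[j] + h * sum(b * k[j] for b, k in zip(B5, ks)) for j in range(n)]
    y4 = [y[j] + h * sum(b * k[j] for b, k in zip(B4, ks)) for j in range(n)]
    return y5, max(abs(u - v) for u, v in zip(y5, y4))

def integrate_adaptive(f, t0, y0, t1, h=0.1, tol=1e-8):
    """Integrate from t0 to t1, resizing h from the embedded error estimate."""
    t, y = t0, list(y0)
    while t < t1:
        h = min(h, t1 - t)              # do not step past the end point
        y_new, err = cash_karp_step(f, t, y, h)
        if err <= tol:                  # accept the step; otherwise retry smaller
            t, y = t + h, y_new
        scale = 0.9 * (tol / err) ** 0.2 if err > 0 else 5.0
        h *= min(5.0, max(0.2, scale))
    return y

def decay(t, y):
    """Test problem y' = -y, with exact solution exp(-t)."""
    return [-y[0]]

y_fixed = [1.0]
for step in range(10):                  # ten RK4 steps of h = 0.1 from t = 0 to 1
    y_fixed = rk4_step(decay, step * 0.1, y_fixed, 0.1)
y_adaptive = integrate_adaptive(decay, 0.0, [1.0], 1.0)
```

The adaptive version rejects any step whose estimated error exceeds the tolerance and retries with a smaller h, which is the mechanism that helps when a fixed step size becomes unstable.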
RK4: Classical 4th-order integrator (custom)
RK5: Adaptive Cash–Karp integrator
IISER: Master's thesis, Bhopal 2023–24
PDE: Coupled differential equation systems
Tech Stack
C · Python · NumPy / SciPy · RK4 (Custom) · Adaptive RK5 Cash–Karp · Matplotlib · Gnuplot · LaTeX
Figures: evolution of the gravitational potential as a function of scale factor for different k-modes; linear matter power spectrum as a function of scale.

For a detailed explanation of the above figures, please refer to my thesis.