Posts by Collection

portfolio

AWS-Powered MLOps Workflow

This project builds an MLOps pipeline on AWS services, streamlining the path from model development to production with automated CI/CD, testing, and scalable deployment.

Amazon Product Co-Purchasing Network Analysis

This project applies network analysis and machine learning to metadata for over half a million Amazon products, focusing on understanding product relationships, predicting sales ranks, and providing product recommendations.

Natural Language Processing with Disaster Tweets

This project utilizes Microsoft DeBERTa and data preprocessing techniques to classify tweets as related to real disasters or not, exploring the impact of model choice and data cleaning on prediction accuracy.
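As a sketch of the kind of preprocessing such a pipeline applies before tokenization (the regexes below are illustrative, not the project's exact cleaning rules):

```python
import re

def clean_tweet(text: str) -> str:
    """Normalize a raw tweet before tokenization (illustrative steps only)."""
    text = re.sub(r"https?://\S+", "", text)      # drop URLs
    text = re.sub(r"[@#]\w+", "", text)           # drop mentions and hashtags
    text = re.sub(r"&amp;|&lt;|&gt;", " ", text)  # strip common HTML entities
    text = re.sub(r"\s+", " ", text)              # collapse whitespace
    return text.strip().lower()

print(clean_tweet("Forest fire near La Ronge @CBC #wildfire http://t.co/x"))
# forest fire near la ronge
```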

Analysis of Flight Systems

This project explores the enhancement of flight fare and delay predictions by merging conventional datasets with Twitter data, employing data processing and sentiment analysis to assess social media’s influence on aviation trends.

Customer Segmentation Analysis

This project applies K-means clustering to segment customers by features of their purchasing behavior, focusing on identifying distinct customer groups so that marketing and sales strategies can be tailored effectively.
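The core of K-means can be sketched in a few lines of plain Python (Lloyd's algorithm with a naive first-k initialization; a real analysis would use a library implementation such as scikit-learn):

```python
import math

def kmeans(points, k, iters=20):
    """Minimal Lloyd's algorithm: assign each point to its nearest centroid,
    then recompute each centroid as the mean of its cluster."""
    centroids = points[:k]  # naive init: first k points
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[i].append(p)
        centroids = [
            tuple(sum(v) / len(cl) for v in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
    return centroids, clusters

# Two well-separated groups of (spend, frequency) points
pts = [(1, 1), (1.2, 0.8), (0.9, 1.1), (8, 8), (8.1, 7.9), (7.8, 8.2)]
centroids, clusters = kmeans(pts, k=2)
```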

portfolio1

portfolio2

portfolio3

N-gram Language Models

Understanding N-Gram Models: Perplexity and Smoothing Techniques in Natural Language Processing
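The two ideas in the title can be sketched together: an add-k smoothed bigram model and its perplexity over a token sequence (a toy illustration, not the post's full treatment):

```python
import math
from collections import Counter

def train_bigram(tokens):
    """Count bigrams and unigrams from a token list."""
    return Counter(zip(tokens, tokens[1:])), Counter(tokens)

def perplexity(tokens, bigrams, unigrams, k=1.0):
    """Add-k smoothed bigram perplexity:
    P(w | prev) = (count(prev, w) + k) / (count(prev) + k * V)."""
    vocab_size = len(unigrams)
    log_prob = 0.0
    for prev, w in zip(tokens, tokens[1:]):
        p = (bigrams[(prev, w)] + k) / (unigrams[prev] + k * vocab_size)
        log_prob += math.log(p)
    return math.exp(-log_prob / (len(tokens) - 1))

train = "the cat sat on the mat".split()
bi, uni = train_bigram(train)
print(perplexity(train, bi, uni))  # low: every bigram was seen in training
```

Smoothing is what keeps the unseen-bigram case finite: without the `+ k` terms, any new word pair would contribute a zero probability and an infinite perplexity.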

portfolio4

LLM Serving API

A production-style REST API for serving large language models with FastAPI, featuring token streaming via Server-Sent Events, asyncio request batching, and sliding-window rate limiting, backed by Ollama.
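The sliding-window limiter in that stack can be sketched with a per-client deque of timestamps (a simplified, single-process version; the class and parameter names are illustrative):

```python
import time
from collections import deque

class SlidingWindowLimiter:
    """Allow at most `limit` requests per `window` seconds for each client key."""

    def __init__(self, limit, window):
        self.limit = limit
        self.window = window
        self.hits = {}  # key -> deque of request timestamps

    def allow(self, key, now=None):
        now = time.monotonic() if now is None else now
        q = self.hits.setdefault(key, deque())
        while q and now - q[0] >= self.window:  # evict timestamps that slid out
            q.popleft()
        if len(q) < self.limit:
            q.append(now)
            return True
        return False
```

In the API itself, a check like this could run in a FastAPI dependency before a request ever reaches the model.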

Financial Reasoning with SFT + GRPO

Fine-tuned Gemma-3-270M for structured financial sentiment reasoning using a two-phase pipeline: SFT to teach the output format, followed by GRPO with a multi-component reward function that includes a FinBERT teacher model for sentiment alignment.

VLM Fine-Tuning: SmolVLM-256M on ChartQA

Fine-tuned SmolVLM-256M on the ChartQA chart question-answering dataset using streaming lazy loading and LoRA/DoRA adapters, achieving full training in under 25 minutes on a 16GB GPU with less than 2GB peak VRAM usage.
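Why such small hardware suffices is visible in the LoRA arithmetic: only two low-rank factors are trained, and at inference they can be merged back into the frozen weight. A toy NumPy sketch with made-up dimensions (DoRA additionally renormalizes the merged weight's column magnitudes):

```python
import numpy as np

def lora_merge(W, A, B, alpha):
    """Merge a trained LoRA adapter into a frozen weight: W' = W + (alpha / r) * B @ A."""
    r = A.shape[0]  # adapter rank
    return W + (alpha / r) * (B @ A)

d_out, d_in, r = 768, 768, 8
W = np.zeros((d_out, d_in))          # frozen base weight (stand-in values)
A = np.random.randn(r, d_in) * 0.01  # trainable, initialized small
B = np.zeros((d_out, r))             # trainable, initialized to zero

trainable = A.size + B.size  # 12,288 adapter params
full = W.size                # 589,824 params in the full matrix
```

Because `B` starts at zero, the merged weight initially equals the base weight, so training starts from the pretrained model's behavior.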

GRPO Fine-Tuning with LoRA

Fine-tunes large language models using GRPO (Group Relative Policy Optimization) with LoRA adapters and 4-bit quantization, supporting any HuggingFace model and dataset with automatic field detection and a multi-component reward function.
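The "group relative" part of GRPO is simple to state: sample several completions per prompt, score each with the reward function, and standardize the rewards within the group so they act as advantages without a learned value baseline. A minimal sketch:

```python
def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style advantages: standardize each reward within the group of
    completions sampled for the same prompt (no learned value baseline)."""
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = var ** 0.5
    return [(r - mean) / (std + eps) for r in rewards]

# Four completions for one prompt: two scored well, two scored poorly
advs = group_relative_advantages([1.0, 0.0, 1.0, 0.0])
```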

LLaMA 2: Inference Architecture from Scratch

Implements the LLaMA 2 inference pipeline from scratch in PyTorch, covering rotary positional embeddings, RMSNorm, SwiGLU activations, grouped-query attention, and KV caching: the production techniques that distinguish modern LLMs from research transformers.
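One of those components, RMSNorm, is small enough to sketch in plain Python: unlike LayerNorm it skips mean subtraction and the bias term, normalizing only by the root mean square (illustrative; the project's version is a PyTorch module operating on tensors):

```python
def rms_norm(x, weight, eps=1e-6):
    """RMSNorm: x * weight / sqrt(mean(x^2) + eps) -- no mean subtraction, no bias."""
    rms = (sum(v * v for v in x) / len(x) + eps) ** 0.5
    return [w * v / rms for v, w in zip(x, weight)]

out = rms_norm([3.0, 4.0], [1.0, 1.0])  # output has unit root mean square
```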

GPT-2: Reproducing OpenAI’s Architecture from Scratch

Reproduces the GPT-2 architecture from scratch in PyTorch with BPE tokenization, GELU activations, and flash attention, including a weight-loading pipeline to verify the implementation against OpenAI’s pretrained GPT-2 checkpoints.
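One detail worth reproducing exactly is GPT-2's tanh approximation of GELU, which replaces the exact Gaussian CDF with a cheaper closed form:

```python
import math

def gelu(x):
    """GPT-2's tanh approximation of GELU:
    0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3)))."""
    return 0.5 * x * (1.0 + math.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))
```

Matching this exact formula (rather than the exact-erf GELU) matters when verifying activations against OpenAI's pretrained checkpoints.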

GPT from Scratch

A character-level GPT transformer built entirely from scratch in PyTorch and trained on the Tiny Shakespeare dataset, implementing multi-head self-attention, transformer blocks, and autoregressive generation with every component written by hand.

Makemore - Character-level Language Model

Character-level language modeling built up from a multilayer perceptron, working through activation and gradient diagnostics, BatchNorm, manual backpropagation, and WaveNet-style dilated-convolution experiments to predict the next character in a sequence.

portfolio5

Compliance Report Generator

AI-powered compliance document analysis system with a Streamlit interface — combines vector search (Qdrant), knowledge graphs (Neo4j), and LLM agents to answer regulatory questions from PDFs, with PII filtering, prompt guardrails, and human-in-the-loop approval.

publications

talks

teaching
