Skills

Astroinformatics and Data Science

Python, PyTorch, Statistics, SQL, astroquery, Jupyter Lab, RStudio/RMD, MongoDB, Docker, Kafka and Spark, Keras

Machine Learning

XGBoost/Decision Trees, MLP and MC-Dropout, CNN, LLM

Projects

Distributed ETL Streaming Pipeline for Astroinformatics

A production-style streaming ETL pipeline built for astroinformatics. The system ingests survey CSVs via Kafka, processes them with Spark Streaming, writes normalized tables to PostgreSQL (with pgAdmin), and exposes interactive analysis via Jupyter. The whole stack is reproducible with Docker Swarm and a Makefile for quick deployment.

AWS Cloud-Based Stroke Risk ETL & Analytics Pipeline

A cloud-native, end-to-end data engineering project that collects, transforms, stores, and analyzes healthcare data on stroke risk factors using AWS and Python. It demonstrates how a scalable ETL pipeline can power public-health analytics by identifying trends, patterns, and risk correlations in real-world healthcare data.

CNN to Classify Galaxies

A convolutional neural network (CNN) that classifies galaxy morphologies using the Galaxy10 dataset. The model is built and trained with TensorFlow, and standard preprocessing steps keep data handling efficient and support model performance.

Gametrax

Full-stack development of an Android app (built with Flutter/Dart + Firebase + Figma) that helps gamers search, track, and organise games, view the latest gaming news, and check basic store info, all in one place.

EDHREC to Archidekt Python Script

Converts any EDHREC deck JSON into a clean card list (.txt) for importing into Archidekt (or really any other deck-building platform).
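
A minimal sketch of how such a conversion could look, not the script itself. The JSON layout used here (a top-level `cardlists` array whose entries each hold a `cardviews` array of objects with a `name` field) is an assumption made for illustration, as are the function names `deck_json_to_lines` and `convert` and the `1 <card name>` output format. Adjust the keys to match the actual EDHREC export.

```python
import json


def deck_json_to_lines(deck):
    """Return one '1 <card name>' line per unique card, in first-seen order.

    Assumes a hypothetical layout: deck["cardlists"][i]["cardviews"][j]["name"].
    """
    seen = set()
    lines = []
    for cardlist in deck.get("cardlists", []):
        for card in cardlist.get("cardviews", []):
            name = card.get("name")
            # Skip entries without a name and cards already listed.
            if name and name not in seen:
                seen.add(name)
                lines.append(f"1 {name}")
    return lines


def convert(json_path, txt_path):
    """Read a deck JSON file and write the card list as plain text."""
    with open(json_path, encoding="utf-8") as f:
        deck = json.load(f)
    with open(txt_path, "w", encoding="utf-8") as f:
        f.write("\n".join(deck_json_to_lines(deck)) + "\n")


# Illustrative data only, shaped like the assumed layout above.
sample = {
    "cardlists": [
        {"header": "Artifacts", "cardviews": [{"name": "Sol Ring"}, {"name": "Arcane Signet"}]},
        {"header": "Top Cards", "cardviews": [{"name": "Sol Ring"}]},
    ]
}
print("\n".join(deck_json_to_lines(sample)))
```

Deduplicating by name suits singleton formats; a deck with repeated basic lands would need real quantities instead of the fixed `1`.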