Shaheryar
Data Engineer crafting scalable data pipelines at the intersection of machine learning, cloud infrastructure, and analytics.
Projects & Experience
Biodata Enricher
A flexible, pip-installable geospatial enrichment pipeline that takes user-provided point data and automatically samples environmental predictors from local rasters or Google Earth Engine (GEE), producing clean statistical features, QA outputs, and full provenance for downstream machine-learning workflows.
Master's Thesis (Ongoing)
Is the Best, the Best? Analysing Best-Paper Awardees' Career Trajectories. Investigating how early-career recognition through best paper awards relates to subsequent career trajectories of junior computer scientists using large-scale bibliometric data from OpenAlex.
Send News
A modern news aggregation platform delivering real-time news updates with a clean, responsive interface. Built with Next.js and deployed on Vercel for optimal performance.
Quote Website
An elegant web application showcasing inspirational religious quotes with a minimalist design. Features dynamic content rendering and a smooth user experience.
Pipeline Visualizer
Interactive visualization tool for data engineering pipelines. Helps understand complex data flows and pipeline architectures with intuitive visual representations.
GitHub Stars Prediction Pipeline
Designed and implemented a complete ML pipeline that uses the GitHub API to predict repository stars. Containerized with Docker, automated CI/CD with Ansible, and achieved near-linear speedup when scaling out on the UPPMAX cloud.
Walmart E-commerce Analytics Pipeline
Built an end-to-end data analytics pipeline using dbt and PostgreSQL. Implemented star-schema modeling with fact and dimension tables, enforced data quality tests, and structured models into staging and core layers.
Education & Skills
Let's Connect
Always interested in new opportunities and collaborations in data engineering, machine learning, and cloud infrastructure.