PORTFOLIO

Shaheryar

Data Engineer crafting scalable data pipelines at the intersection of machine learning, cloud infrastructure, and analytics.

Available for work
Uppsala, Sweden
CURRENTLY
Developer
@ ScopeChain
Sustainability Business
M.Sc. Data Science & Engineering
@ Uppsala University
Expected June 2026
FOCUS
Python, Apache Spark, Docker, PostgreSQL, dbt, Machine Learning

Projects & Experience

2023 — PRESENT
2025

Biodata Enricher

EBC (Evolutionary Biology Campus)

A flexible, pip-installable geospatial enrichment pipeline. It takes user-provided point data and automatically samples environmental predictors from local rasters or Google Earth Engine (GEE), producing clean statistical features, QA outputs, and full provenance for downstream machine-learning workflows.

Python, GEE, Geospatial Analysis, Machine Learning, Data Pipeline
2026

Master Thesis (Ongoing)

InfoLabs Uppsala University

"Is the Best, the Best? Analysing Best-Paper Awardees' Career Trajectories." Investigates how early-career recognition through best-paper awards relates to the subsequent career trajectories of junior computer scientists, using large-scale bibliometric data from OpenAlex.

Python, Data Science, Network Analysis, Bibliometrics, Research
2025

Send News

Full-Stack News Web App

A modern news aggregation platform delivering real-time news updates with a clean, responsive interface. Built with Next.js and deployed on Vercel for optimal performance.

Next.js, React, Vercel, API Integration
2025

Quote Website

Religious Quotes Platform

An elegant web application showcasing inspirational religious quotes with a minimalist design. Features dynamic content rendering and a smooth user experience.

React, Next.js, Tailwind CSS, Vercel
2025

Pipeline Visualizer

Data Pipeline Visualization Tool

Interactive visualization tool for data engineering pipelines. Helps understand complex data flows and pipeline architectures with intuitive visual representations.

React, D3.js, Next.js, Data Visualization
2025

GitHub Stars Prediction Pipeline

Uppsala University

Designed and implemented a complete ML pipeline that uses the GitHub API to predict repository stars. Containerized it with Docker, automated CI/CD with Ansible, and achieved near-linear scaling on the UPPMAX cloud.

Python, Scikit-learn, Docker, Ansible, Flask, Ray Tune
2025

Walmart E-commerce Analytics Pipeline

Uppsala University

Built an end-to-end data analytics pipeline using dbt and PostgreSQL. Implemented star-schema modeling with fact and dimension tables, enforced data quality tests, and structured models into staging and core layers.

dbt, PostgreSQL, Power BI, Python

Education & Skills

EDUCATION
M.Sc. Data Science & Engineering
Uppsala University
Expected June 2026 • Uppsala, Sweden
B.S. Computer Science
COMSATS University Islamabad
July 2023 • CGPA: 3.73/4.0
Campus Gold Medal
TECHNICAL SKILLS
Data Engineering
Python, SQL, Apache Spark, Apache Kafka, Airflow, dbt, Docker, Bash, HDFS, ETL/ELT
Databases & Cloud
MongoDB, PostgreSQL, BigQuery, Hadoop, Parquet, Avro, GCP, AWS, Kubernetes
Development Tools
Git, CI/CD, VS Code, Jupyter, ReactJS
Soft Skills
Communication, Problem-solving, Team collaboration, Self-driven
HONORS & AWARDS
CUI Campus Gold Medal
Top of 1200 students at CUI Sahiwal Campus (2023)
IELTS
Overall Band Score: 8.5/9
INTERESTS
Cooking, Gardening, Traveling, Self-improvement, Gaming

Let's Connect

Always interested in new opportunities and collaborations in data engineering, machine learning, and cloud infrastructure.