Afnan Rahman | Data Science Portfolio

About Me

I'm Afnan Rahman, a senior at Michigan State University pursuing dual degrees in Data Science and Criminal Justice, with a concentration in Cyber Investigations. Through internships at Rocket and Loc Performance, I've gained experience in analytics, cybersecurity, cloud security, and compliance auditing. That work involved frameworks such as SOC, SOX, and NIST, and included audit preparation, risk assessments, cloud security reviews, and data-driven security initiatives across AWS and GCP environments. I've used Python, SQL, C++, and machine learning to build analytics pipelines, forecasting models, automation tools, and dashboards that turn complex data into practical insights. What matters most to me is producing work that stakeholders can trust: validating results carefully, challenging weak assumptions, and making sure the final product is both accurate and useful.

Skills

Languages

Python, SQL, C++, Java, R, JavaScript, PowerShell, Bash

Data / ML

Pandas, NumPy, scikit-learn, PyTorch, Prophet, Matplotlib, Seaborn, Tableau, Alteryx, AWS SageMaker

Cloud / DevOps

AWS, GCP, Docker, Kubernetes, Terraform, Jenkins, CI/CD, AWS Lambda, DynamoDB, CircleCI, Azure DevOps

Security / Compliance

SOC 2, SOX, NIST 800-53, Qualys, Tenable, Intune, CrowdStrike, Splunk, ServiceNow, Wireshark, Fortinet

Projects

NYC 311 on AWS (Serverless Data Pipeline)

Built a cloud-based workflow on AWS to ingest, process, and analyze NYC 311 service-request data. The project demonstrates an end-to-end pipeline mindset: reproducible infrastructure setup, automated data handling, and clear documentation of how the system works and how to run it.

AWS, Python, SQL, Serverless, Data pipelines

Problem

Public datasets like NYC 311 are large and continuously updated. The goal was to build a repeatable cloud workflow that can ingest data reliably, support querying/analysis, and be documented well enough that another person can run it without guesswork.

Approach

Implemented a serverless-first pipeline design and documented the full workflow end-to-end (setup, ingestion, processing, and analysis). Focused on clean interfaces between steps, automation where possible, and keeping the repository easy to navigate for reviewers.
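
To illustrate what a "clean interface between steps" can look like in a serverless pipeline, here is a minimal sketch of a single processing step written as a Lambda-style handler. It is not the repository's actual code: the `handler` and `clean_record` functions, the shape of the `event` payload, and the field names (`unique_key`, `created_date`, `complaint_type`, `borough`) are assumptions chosen for illustration.

```python
from datetime import datetime

# Assumed field names for this sketch; adjust them to the real dataset schema.
REQUIRED_FIELDS = ("unique_key", "created_date", "complaint_type")


def clean_record(raw):
    """Return a normalized record, or None if the input is unusable."""
    if any(not raw.get(field) for field in REQUIRED_FIELDS):
        return None
    try:
        created = datetime.fromisoformat(raw["created_date"])
    except (ValueError, TypeError):
        return None
    return {
        "unique_key": str(raw["unique_key"]),
        "created_date": created.date().isoformat(),
        "complaint_type": raw["complaint_type"].strip().title(),
        "borough": (raw.get("borough") or "Unspecified").strip().upper(),
    }


def handler(event, context=None):
    """Lambda-style entry point: clean a batch and report how many rows were dropped."""
    cleaned = [clean_record(r) for r in event.get("records", [])]
    kept = [r for r in cleaned if r is not None]
    return {"kept": kept, "dropped": len(cleaned) - len(kept)}
```

Because the step takes plain dictionaries in and returns plain dictionaries out, it can be tested locally without any AWS resources, and the same function could sit behind whichever trigger the pipeline uses.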

Results & Impact

Produced a working cloud workflow that’s easy to demo and extend. The repository showcases practical cloud data engineering fundamentals (automation, reliability, and clear documentation) in a form that a reviewer or hiring manager can follow quickly.

View Repo

MovieLens Recommender System

Built a movie recommender using the MovieLens dataset, focusing on clean data preparation, baseline modeling, and evaluation. The project is organized to show both the implementation and the reasoning behind model choices.

Python, pandas, NumPy, scikit-learn, Jupyter Notebooks

Problem

Given historical user ratings, how can we recommend movies a user is likely to enjoy? The goal was to implement a recommender workflow that is easy to understand, evaluate, and iterate on.

Approach

Prepared MovieLens ratings data for modeling, implemented recommendation baselines, and evaluated performance with appropriate metrics. Structured the code/notebooks so that data cleaning, modeling, and evaluation are clearly separated and reproducible.
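
As a rough picture of what a recommendation baseline and its evaluation can look like, the sketch below fits an item-mean baseline and scores it with RMSE. It is an illustration under stated assumptions, not the project's implementation: the project uses pandas and scikit-learn, while this sketch sticks to the standard library to stay self-contained, and the rating rows are made-up toy data in a (userId, movieId, rating) layout.

```python
import math
from collections import defaultdict

# Toy (userId, movieId, rating) rows, invented for this example.
train = [
    (1, 10, 4.0), (1, 20, 3.0),
    (2, 10, 5.0), (2, 30, 2.0),
    (3, 20, 4.0), (3, 30, 3.0),
]
test = [(1, 30, 3.0), (2, 20, 4.0), (3, 40, 3.5)]


def fit_item_means(ratings):
    """Return (global_mean, {movieId: mean rating}) from training rows."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for _, movie_id, rating in ratings:
        totals[movie_id] += rating
        counts[movie_id] += 1
    global_mean = sum(totals.values()) / sum(counts.values())
    item_means = {m: totals[m] / counts[m] for m in totals}
    return global_mean, item_means


def predict(movie_id, global_mean, item_means):
    """Item-mean baseline; fall back to the global mean for unseen movies."""
    return item_means.get(movie_id, global_mean)


def rmse(ratings, global_mean, item_means):
    """Root mean squared error of the baseline on held-out rows."""
    errors = [(predict(m, global_mean, item_means) - r) ** 2 for _, m, r in ratings]
    return math.sqrt(sum(errors) / len(errors))


global_mean, item_means = fit_item_means(train)
baseline_rmse = rmse(test, global_mean, item_means)
print(f"item-mean baseline RMSE: {baseline_rmse:.3f}")
```

A baseline like this gives a reference score: a more advanced model is only worth its extra complexity if it beats that number on the same held-out ratings.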

Results & Impact

Delivered a working recommender pipeline that can generate user-level recommendations and provides a strong template for comparing approaches (e.g., baseline vs. more advanced models) as the project evolves.

View Repo

SQL Mini-Project: Sakila Database Analysis

Worked with the Sakila sample database to answer business-style questions using SQL. This project highlights clear, readable queries (joins, aggregation, subqueries) and an analysis mindset: translating a question into a query and interpreting the result.

SQL, Relational databases, Data analysis

Problem

The Sakila database represents a simplified video-rental business. The goal was to use SQL to extract actionable insights from a relational schema, such as customer behavior, inventory performance, and revenue patterns.

Approach

Wrote and iterated on SQL queries to answer a set of structured questions, emphasizing correctness, readability, and efficient use of joins and aggregation. Validated results with sanity checks and clear interpretation.
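
The pattern of a join, an aggregation, and a sanity check can be sketched as follows. This is a hedged example rather than one of the project's queries: it runs against an in-memory SQLite database holding a trimmed-down version of Sakila's `customer` and `payment` tables, and the rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Trimmed-down customer/payment tables with made-up rows (not real Sakila data).
conn.executescript("""
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
CREATE TABLE payment (payment_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
INSERT INTO customer VALUES (1, 'ANA', 'LEE'), (2, 'BEN', 'KIM'), (3, 'CAL', 'DIAZ');
INSERT INTO payment VALUES (1, 1, 2.99), (2, 1, 4.99), (3, 2, 0.99), (4, 2, 5.99), (5, 2, 2.99);
""")

# Revenue per customer; LEFT JOIN keeps customers who have no payments.
REVENUE_BY_CUSTOMER = """
SELECT c.customer_id,
       c.first_name || ' ' || c.last_name   AS customer,
       COUNT(p.payment_id)                  AS payments,
       ROUND(COALESCE(SUM(p.amount), 0), 2) AS revenue
FROM customer AS c
LEFT JOIN payment AS p ON p.customer_id = c.customer_id
GROUP BY c.customer_id
ORDER BY revenue DESC, c.customer_id;
"""
rows = conn.execute(REVENUE_BY_CUSTOMER).fetchall()

# Sanity check: per-customer revenue should add up to the payment table total.
total = conn.execute("SELECT SUM(amount) FROM payment").fetchone()[0]
assert abs(sum(r[3] for r in rows) - total) < 0.005

for customer_id, customer, payments, revenue in rows:
    print(customer_id, customer, payments, revenue)
```

Using a `LEFT JOIN` with `COALESCE` keeps customers with no payments in the output at zero revenue, which makes the reconciliation against the table total a meaningful check rather than a coincidence.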

Results & Impact

Produced a set of queries and outputs that demonstrate practical SQL fluency and the ability to communicate insights from relational data. These skills transfer directly to analytics and data engineering workflows.

View Repo

Contact

Email: rahman44@msu.edu
LinkedIn: https://www.linkedin.com/in/rahmanafnan/
GitHub: https://github.com/afnan-rah