I am a data analyst with a passion for working with large data sets to derive actionable insights that generate effective solutions and greater business value. Here you can view some of my projects, and read about my experience and education. I look forward to connecting with you!
I'm a data analyst who recently earned a certificate in Data Analytics from Columbia University, with a background in Mathematics and Economics. Past work includes data analytics, project management, and building stakeholder relationships. I am experienced in handling large datasets consisting of structured, semi-structured, and raw data, and in collaborating across diverse groups to support creative, detailed, and efficient analysis. Strengths include analytical problem-solving, empathic teamwork, and effective communication. I am passionate about both the technological and social aspects of a problem: data analytics and technological solutions tend to be more effective when combined with people analytics and human psychology.
Python
SQL
RStudio
JavaScript, HTML, CSS
Tableau
Machine Learning
MongoDB
Columbia Engineering Data Analytics Certificate
Data analytics is a high-growth career track, and the Columbia University Data Analytics Boot Camp teaches the specialized skills for it comprehensively. Over the 24 weeks, I learned in-demand front-end and back-end technologies while working on projects with real-world applications. My experience includes data analysis, data visualization, web development, working with databases and big data, and machine learning.
Master of Arts in Anthropology
The Master's in Anthropology at the University of Virginia (UVA) equipped me with in-depth knowledge of the history of social and cultural anthropological theory, language and identity, systems of care, and medical anthropology. It deepened my understanding of why humans act the way they do, how culture affects the choices they make, and what drives their interactions with one another and the artefacts around them.
Bachelor's in Economics and Politics (Minor in Mathematics)
The Bachelor's in Economics and Politics at the Lahore University of Management Sciences (LUMS) provided me with a strong foundation in Economics and Politics, with courses ranging from Advanced Macro- and Microeconomics to Game Theory and Political Theory. The minor in Mathematics gave me a solid mathematical grounding through the following courses: Advanced Statistical Analysis, Calculus I & II, Probability, Linear Algebra with Differential Equations, Ordinary Differential Equations, Formal Mathematics, Introduction to Real Analysis I, and Operations Research I.
Creates an interactive world map of earthquakes, with multiple layers for different features and view modes. Uses JavaScript (D3 library), Leaflet, and Mapbox.
Analyzes whether product reviews written by Amazon Vine members show a positivity bias. The ETL process and analysis use PySpark, Python (Pandas), SQL, Amazon Web Services (AWS), and pgAdmin.
Builds and evaluates supervised machine learning models to predict credit loan risk. Applies resampling and ensemble techniques to the logistic regression classifier models using the Scikit-learn, Imbalanced-learn, Pandas, and NumPy libraries in Python.
Creates a travel itinerary map based on the customer's weather preferences. Uses Python (Pandas, Matplotlib, and SciPy libraries) and APIs.
Uses R to run a multiple linear regression and t-tests and to generate summary statistics, helping an automotive company identify the production problems that are hindering the manufacture of its prototype car.
Applies unsupervised machine learning algorithms (PCA dimensionality reduction and K-means clustering) to report on tradable cryptocurrencies and create a classification system for them. Uses Scikit-learn, Pandas, Plotly, and hvPlot in Python.
Creates a web app that scrapes and displays the most recently published data on Mars and the Mars mission. Uses Python (html5lib and lxml libraries), MongoDB, Flask-PyMongo, Splinter, BeautifulSoup, and Web-Driver Manager.
Builds a deep-learning neural network (binary classifier) to determine which organizations are worth donating to and which are high-risk. Uses Python (TensorFlow, Pandas, and Scikit-learn libraries).
Analyzes and visualizes ride-sharing data to determine how total weekly fares differ by city type. Uses Python (Pandas and Matplotlib libraries).
Visualizes NYC restaurant data and analyzes whether a restaurant located in a high-income area receives a higher health inspection grade. Uses Python (Pandas, Scikit-learn, Imbalanced-learn), PostgreSQL, SQLAlchemy, Tableau, JavaScript (Plotly.js library), HTML, CSS, and Bootstrap.
Determines how many employees at the company will soon be retiring, and how many of them are eligible to mentor new hires. Uses SQL, PostgreSQL, and pgAdmin.
Creates an automated ETL (Extract, Transform, Load) pipeline that extracts data from three files, transforms it, and loads it into a movies database. Uses Python (Pandas), PostgreSQL, and SQL.
Produces district-level and school-level summaries of the math, reading, and overall passing percentages in the schools, then repeats the analysis after the ninth-grade math and reading scores are replaced. Uses Python.
Analyzes climate data to determine whether opening a surf shop at the location would be a viable investment. Uses Python (Pandas), SQLAlchemy, SQLite, and Flask.
Analyzes the performance of 12 stocks across 2017 and 2018 to help the client make informed investment decisions. Uses VBA (Visual Basic for Applications).
Analyzes a dataset of 4,000 crowdfunding projects to uncover hidden trends in campaign performance based on launch dates and funding goals. Uses Excel.
Audits a local congressional election and identifies the county and the candidate with the highest number of votes. Uses Python.