8 Kaggle datasets for your data analytics portfolio. I have included project ideas for each one so you know exactly what to build.
BEGINNER
Coffee Sales (Vending Machine): real transaction data from a coffee vending machine. Daily sales, product types, payment methods, and time of day. Build: a sales trend analysis by time of day using SQL, or a product performance dashboard in Power BI. kaggle.com/datasets/ihe…
Chocolate Sales 2023-2024: retail chocolate sales with salesperson, product, country, and revenue data. Build: a salesperson performance analysis using SQL, or a regional sales dashboard in Power BI. kaggle.com/datasets/sss…
World Happiness Report 2024: country-level happiness scores, GDP, social support, life expectancy, and freedom index. Build: a correlation analysis between GDP and happiness in Python, or a country ranking dashboard. kaggle.com/datasets/jai…
INTERMEDIATE
Zomato Delivery Operations: delivery time, weather, traffic, and customer ratings across thousands of orders. Build: a Python EDA on what drives low ratings, or an operational dashboard tracking average delivery time by city. kaggle.com/datasets/sau…
Global Cost of Living: cost of living data across 4,500+ world cities including rent, groceries, transport, and utilities.
Build: a city comparison analysis using SQL, or a Python EDA on which cities offer the best value relative to average salaries.
kaggle.com/datasets/mvi…
Telecom Churn Dataset: customer-level data including contract type, usage, complaints, and whether the customer churned.
Build: a churn rate analysis by contract type using SQL, or a Python EDA on the key drivers of customer churn, which is directly relevant to analyst roles in telecoms and banking.
kaggle.com/datasets/mna…
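For the churn project, the core calculation is a churn rate per contract type, which is what a SQL `GROUP BY` on contract type would give you. Here is a rough Python equivalent. The customers are made up and the field names (`contract_type`, `churned`) are assumptions, not the dataset's real column names.

```python
from collections import defaultdict

# Made-up customers for illustration only.
# Field names are assumptions; adjust them to the real dataset.
customers = [
    {"contract_type": "Monthly", "churned": True},
    {"contract_type": "Monthly", "churned": False},
    {"contract_type": "Monthly", "churned": True},
    {"contract_type": "Monthly", "churned": False},
    {"contract_type": "Annual", "churned": False},
    {"contract_type": "Annual", "churned": True},
    {"contract_type": "Annual", "churned": False},
    {"contract_type": "Annual", "churned": False},
]


def churn_rate_by_contract(customers):
    """Return {contract_type: share of customers who churned}."""
    totals = defaultdict(int)
    churned = defaultdict(int)
    for c in customers:
        totals[c["contract_type"]] += 1
        churned[c["contract_type"]] += int(c["churned"])
    return {contract: churned[contract] / totals[contract] for contract in totals}


rates = churn_rate_by_contract(customers)
for contract, rate in sorted(rates.items()):
    print(f"{contract}: {rate:.0%}")
```

Once this works, repeat the same grouping for other fields (complaints, usage bands) to see which ones separate churners from non-churners most clearly.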
ADVANCED
Online Retail & E-Commerce: customer behaviour, product performance, and sales trends across thousands of transactions. Build: an RFM customer segmentation in Python, or a cohort analysis of repeat purchasers using SQL. kaggle.com/datasets/ert…
Supply Chain Analysis: product-level supply chain data covering sales, stock levels, shipping times, defect rates, and supplier performance across multiple product categories.
Build: an end-to-end supplier performance dashboard in Power BI, a Python EDA on which product categories have the highest defect rates, or a SQL analysis of shipping time vs. order volume.
kaggle.com/datasets/har…
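If RFM is new to you, the Online Retail project starts with three numbers per customer: Recency (days since the last order), Frequency (number of orders), and Monetary (total spend). Below is a minimal sketch of that first step only. The transactions, the tuple layout, and the snapshot date are all made up for illustration; the real dataset will have its own columns. Scoring customers into segments comes after this and is not shown.

```python
from datetime import date

# Made-up transactions: (customer_id, order_date, amount).
# This layout is an assumption, not the dataset's real schema.
transactions = [
    ("C1", date(2024, 1, 5), 40.0),
    ("C1", date(2024, 3, 20), 60.0),
    ("C2", date(2024, 2, 10), 25.0),
    ("C3", date(2024, 1, 15), 50.0),
    ("C3", date(2024, 3, 28), 120.0),
    ("C3", date(2024, 3, 30), 80.0),
]


def rfm_table(transactions, snapshot):
    """Return recency (days), frequency, and monetary value per customer."""
    table = {}
    for customer, day, amount in transactions:
        rec = table.setdefault(
            customer, {"last": day, "frequency": 0, "monetary": 0.0}
        )
        rec["last"] = max(rec["last"], day)  # keep the most recent order date
        rec["frequency"] += 1
        rec["monetary"] += amount
    return {
        customer: {
            "recency_days": (snapshot - rec["last"]).days,
            "frequency": rec["frequency"],
            "monetary": rec["monetary"],
        }
        for customer, rec in table.items()
    }


# The snapshot date is the "today" that recency is measured against.
rfm = rfm_table(transactions, snapshot=date(2024, 4, 1))
for customer, values in sorted(rfm.items()):
    print(customer, values)
```

From here, a common next step is to rank customers on each of the three values and combine the ranks into segments such as "loyal" or "at risk". How you define those segments is a design choice to explain in your portfolio write-up.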