Free Data Cleaning Projects (The Messy Datasets)
FOR COMPLETE BEGINNERS
• iweld / data_cleaning - practice data cleansing using SQL or Excel with messy datasets github.com/iweld/data_c…
• sunnyr3 / Python-Data-Cleaning - beginner-friendly Python data cleaning walkthrough github.com/sunnyr3/Pyth…
PYTHON DATA CLEANING
• realpython / python-data-cleaning - cleaning messy data with pandas and Jupyter notebooks github.com/realpython/p…
• Jcharis / Data-Cleaning-Practical-Examples - practical cleaning examples for missing values, duplicates, and text cleanup github.com/Jcharis/Data…
• PacktPublishing / Python-Data-Cleaning-Cookbook - advanced recipes for handling messy real-world datasets github.com/PacktPublish…
MULTI-TOOL PROJECTS (Python, SQL, Excel)
• eyowhite / Messy-dataset - dirty datasets for cleaning with Python, Excel, SQL, and Power BI github.com/eyowhite/Mes…
• ragijaireddy27 / Data_cleaning_for_analysis - cross-platform cleaning with SQL, Python, Excel, and Power BI github.com/ragijaireddy…
• RIDGE777 / Data-Cleaning-in-Excel-and-SQL - movie dataset cleaned using SQL and Excel workflows github.com/RIDGE777/Dat…
DIRTY DATA CHALLENGES
• ojalp26 / Cleaning-dirty-data-samples - messy datasets with duplicates, missing values, and inconsistent formatting github.com/ojalp26/Clea…
• joshtemple / pandas-cleanup - sales data cleaning exercise based on a real presentation github.com/joshtemple/p…
ADVANCED CLEANING (text, dates, and more)
• mramshaw / Data-Cleaning - Python cleaning examples focused on dates, text, and complex formats github.com/mramshaw/Dat…
• underthecurve / pandas-data-cleaning-tricks - practical cleaning tricks for real-world messy datasets github.com/underthecurv…