HOUSEHOLD ELECTRIC POWER CONSUMPTION: ANALYSIS, CLUSTERING, AND PREDICTION WITH PYTHON

By Vivian Siahaan and Rismon Hasiholan Sianipar

In this project, you will perform analysis, clustering, and prediction on household electric power consumption with Python. The dataset contains 2,075,259 measurements gathered between December 2006 and November 2010 (47 months). The attributes in the dataset are as follows:

date: date in the format dd/mm/yyyy
time: time in the format hh:mm:ss
global_active_power: household global minute-averaged active power (in kilowatts)
global_reactive_power: household global minute-averaged reactive power (in kilowatts)
voltage: minute-averaged voltage (in volts)
global_intensity: household global minute-averaged current intensity (in amperes)
sub_metering_1: energy sub-metering No. 1 (in watt-hours of active energy). It corresponds to the kitchen, which contains mainly a dishwasher, an oven, and a microwave (the hot plates are gas powered, not electric)
sub_metering_2: energy sub-metering No. 2 (in watt-hours of active energy). It corresponds to the laundry room, which contains a washing machine, a tumble-drier, a refrigerator, and a light
sub_metering_3: energy sub-metering No. 3 (in watt-hours of active energy). It corresponds to an electric water heater and an air conditioner

You will cluster the data with KMeans into 5 clusters. The machine learning models used to perform regression and to predict the clusters as the target variable are K-Nearest Neighbor, Random Forest, Naive Bayes, Logistic Regression, Decision Tree, Support Vector Machine, LGBM, Gradient Boosting, XGB, and MLP.

Finally, you will plot the decision boundaries, distribution of features, feature importance, cross-validation scores, predicted values versus true values, confusion matrix, learning curve, performance of the model, scalability of the model, training loss, and training accuracy.
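As an illustration of the attribute formats described above, the following is a minimal sketch of how one raw record might be parsed using only the Python standard library. It is not taken from the book. The semicolon delimiter, the "?" marker for missing values, the underscore spellings of the field names, the `Measurement` class, the `parse_measurements` helper, and the sample rows are all assumptions made for this example.

```python
import csv
import io
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Iterator, Optional

# Assumed column order, following the attribute list in the description.
FIELDS = [
    "date", "time",
    "global_active_power", "global_reactive_power",
    "voltage", "global_intensity",
    "sub_metering_1", "sub_metering_2", "sub_metering_3",
]


@dataclass
class Measurement:
    """One minute-averaged reading (hypothetical container for this sketch)."""
    timestamp: datetime
    global_active_power: Optional[float]    # kilowatts
    global_reactive_power: Optional[float]  # kilowatts
    voltage: Optional[float]                # volts
    global_intensity: Optional[float]       # amperes
    sub_metering_1: Optional[float]         # watt-hours, kitchen
    sub_metering_2: Optional[float]         # watt-hours, laundry room
    sub_metering_3: Optional[float]         # watt-hours, water heater and air conditioner


def _to_float(text: str) -> Optional[float]:
    """Convert a field to float; empty or '?' (assumed missing marker) becomes None."""
    text = text.strip()
    if text in ("", "?"):
        return None
    return float(text)


def parse_measurements(lines: Iterable[str], delimiter: str = ";") -> Iterator[Measurement]:
    """Yield Measurement objects from header-less, delimiter-separated lines."""
    reader = csv.DictReader(lines, fieldnames=FIELDS, delimiter=delimiter)
    for row in reader:
        # Date is dd/mm/yyyy and time is hh:mm:ss, as stated in the description.
        timestamp = datetime.strptime(
            f"{row['date']} {row['time']}", "%d/%m/%Y %H:%M:%S"
        )
        values = {name: _to_float(row[name]) for name in FIELDS[2:]}
        yield Measurement(timestamp=timestamp, **values)


# Illustrative rows only; these are not actual values from the dataset.
SAMPLE = (
    "16/12/2006;17:24:00;4.2;0.4;234.8;18.4;0.0;1.0;17.0\n"
    "16/12/2006;17:25:00;?;?;?;?;?;?;?\n"
)

records = list(parse_measurements(io.StringIO(SAMPLE)))
```

Combining the date and time into a single `datetime` makes later steps such as resampling or grouping by hour simpler, and mapping missing readings to `None` keeps them distinguishable from genuine zero values.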

Details

Publication Date: Mar 30, 2023
Language: English
Category: Computers & Technology
Copyright: All Rights Reserved - Standard Copyright License
Contributors: Vivian Siahaan (author), Rismon Hasiholan Sianipar (author)

Specifications

Format: PDF
