TechieYan Technologies

Predicting the trend of the house prices with United States housing data and Regression models


Prediction is one of the most important applications of machine learning. In this project we predict house prices using the XGBoost machine learning model. The data is tabular, with features useful for house price prediction such as city, zip code, flat area and location. We first train the model on an existing dataset and then validate it by giving it unseen data as input and checking its predictions.


Code Description & Execution

Algorithm Description

XGBoost

XGBoost (Extreme Gradient Boosting) is a supervised machine learning algorithm. It is an advanced implementation of the gradient boosting algorithm and is used for regression and other predictive modeling tasks.

XGBoost combines gradient boosting with decision trees. Trees are built one after another, and each new tree aims to reduce the error left by the trees before it. The final prediction combines what all the trees have learned.
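The idea of building trees one after another, each one reducing the remaining error, can be sketched in plain Python. This is only a simplified illustration of gradient boosting with one-split trees (stumps) and squared-error loss. It is not XGBoost's actual implementation, which adds regularization and many optimizations. The function names, the `areas` and `prices` lists and the parameter values are made up for this example.

```python
# Simplified gradient boosting for regression (illustration only).
# Each stump is fitted to the residuals (errors) of the model so far.

def fit_stump(xs, residuals):
    """Find the one-split tree that best fits the residuals."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # candidate split points
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    if best is None:  # all x values identical: no split is possible
        mean = sum(residuals) / len(residuals)
        return xs[0], mean, mean
    return best[1], best[2], best[3]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_boosted(xs, ys, n_trees=50, learning_rate=0.3):
    """Start from the mean, then add stumps that correct the remaining error."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * predict_stump(stump, x)
                       for p, x in zip(predictions, xs)]
    return base, stumps, learning_rate


def predict_boosted(model, x):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)


def mean_squared_error(model, xs, ys):
    return sum((y - predict_boosted(model, x)) ** 2
               for x, y in zip(xs, ys)) / len(ys)


# Made-up toy data: flat area and price (not taken from the project dataset)
areas = [50, 60, 80, 100, 120, 150]
prices = [100, 120, 160, 210, 250, 300]

model = fit_boosted(areas, prices)
print("Training error:", mean_squared_error(model, areas, prices))
print("Prediction for an unseen area of 110:", predict_boosted(model, 110))
```

Training with more trees lowers the training error, because every added stump is fitted to whatever error the earlier stumps left behind.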

How to Execute?

Before execution, there are some prerequisites that we need to download and install: the Anaconda environment, Python and a code editor.

Anaconda: Anaconda is a distribution that bundles Python with many commonly used libraries. It lets a data engineer create multiple environments and install the required libraries in each one easily and neatly.

Download link:

Python: Python is one of the most popular interpreted programming languages and is used in almost every field. Its syntax reads much like English, and even children are learning it nowadays thanks to its readability, simple syntax and a large community of users who can help whenever you face an issue.

Download link:

Code editor: A code editor is like a notepad for a programming language. It lets you write and run the programs you have written. Some code editors also include a debugger, which lets you execute the code line by line to see where an error occurs and how to fix it. I personally find Visual Studio Code very good to work with for any programming language, and comfortable to use.

Download links:

Steps to Execute

Note: Make sure you add each program to your PATH while installing it.

Install the prerequisites mentioned above.


Open Anaconda Prompt and create a new environment using the commands given below. Replace env_name with the name you want to give the environment.

  • conda create -n env_name
  • conda activate env_name


Set up jupyter notebook for your environment

  • conda install -c conda-forge jupyterlab
  • conda install -c anaconda ipython


Install necessary libraries from requirements.txt file provided.


Go to the directory where your requirements.txt file is present.

  • cd D:\Chronic-Kidney-Desease-Prediction-main (replace this with the path to your project folder)
  • Run the command pip install -r requirements.txt or conda install --file requirements.txt

requirements.txt is a text file listing all the libraries required to run this Python project. Running the command downloads and installs all of them. If it gives an error while installing the libraries, you might need to install them individually. I got “Requirement already satisfied” messages because I already have them installed.
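If you want to confirm which libraries still need to be installed individually, a small check like the one below may help. It is a sketch that handles only simple requirement lines (a name with an optional version). The contents of `example_lines` are hypothetical, since the project's actual requirements.txt is not reproduced here, and the helper names are made up for this example.

```python
import re
from importlib import metadata


def requirement_names(lines):
    """Extract package names from requirements.txt-style lines."""
    names = []
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        # The name is everything before a version specifier such as == or >=
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(match.group(0))
    return names


def missing_packages(names):
    """Return the names that are not installed in the current environment."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


# Hypothetical file contents, for illustration only
example_lines = [
    "# libraries for the notebook",
    "pandas==1.5.3",
    "xgboost>=1.7",
    "",
    "scikit-learn",
]

names = requirement_names(example_lines)
print("Listed packages:", names)
print("Not installed yet:", missing_packages(names))
```

To check the real file, pass the open file instead of the example list, for instance `requirement_names(open("requirements.txt"))`, and install anything reported as missing with pip.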



To run the code, start Jupyter Notebook by typing “jupyter notebook” in the command prompt. This opens Jupyter Notebook in your default web browser.


Open the folder containing the code, here it is chronic kidney disease prediction. When you run the Kidney_Disease_Prediction.ipynb file, you get the appropriate results.

Data Description

The given dataset is divided into a training dataset and a testing dataset. The training dataset has 81 columns, of which 80 are attributes (independent variables) and the last column is the target (dependent variable). The size of the training dataset is (1460×81) and the size of the testing dataset is (1460×80). The target for the test data is predicted using the model.

Each attribute is a measure of some parameter affecting the price of the house. For example:

SalePrice: The property’s sale price in dollars. This is the target variable that you’re trying to predict.

MSSubClass: The building class

MSZoning: The general zoning classification

LotFrontage: Linear feet of street connected to property

LotArea: Lot size in square feet

Street: Type of road access

Alley: Type of alley access
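The split between the attribute columns and the SalePrice target can be sketched as follows. This example uses only Python's standard library and a tiny made-up sample with a few of the columns listed above. The values are invented for illustration and are not rows from the real dataset, and the `split_features_target` helper is a name chosen for this example.

```python
import csv
import io

# Made-up sample in the same layout as the training file (values are invented)
sample_csv = """Id,MSSubClass,MSZoning,LotFrontage,LotArea,Street,Alley,SalePrice
1,60,RL,65,8000,Pave,NA,200000
2,20,RL,80,9500,Pave,NA,180000
3,70,RM,60,9000,Pave,Grvl,140000
"""


def split_features_target(rows, target="SalePrice"):
    """Separate the target column from the attribute columns."""
    features, targets = [], []
    for row in rows:
        row = dict(row)  # copy so the caller's rows are not modified
        targets.append(float(row.pop(target)))
        features.append(row)
    return features, targets


rows = list(csv.DictReader(io.StringIO(sample_csv)))
features, targets = split_features_target(rows)

print("Rows:", len(rows), "Columns:", len(rows[0]))
print("Attribute columns after removing the target:", len(features[0]))
print("Targets:", targets)
```

On the full training file the same split would leave 80 attribute columns and one target column. The test file has no SalePrice column, which is why it has 80 columns.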




Confusion matrix for KNN classifier

Issues Faced

  1. While evaluating the house price model, make sure you give the correct location of the data, and check the size of the test dataset, as the dataset is small.

