
1. Project Overview:
In this project, we try to predict the outcome of a hotel booking (whether it will be cancelled) from different factors, and also whether a customer will make special requests. The data set contains booking information for a city hotel and a resort hotel, including when the booking was made, the number of adults, children, and/or babies, and the number of available parking spaces, among other things. From it, we can understand customers' behavior, which might help us make better decisions.
The analysis follows these steps: define the business question, understand the data set, prepare and wrangle the data, analyze the data, model the data, and draw conclusions.

2. Business Understanding:
Our goal for this project is to predict which kinds of customers make special requests and to predict the likelihood of a booking cancellation from different features. This will help the hotel booking company make better decisions.


3. Data Understanding:
R libraries used: funModeling, tidyverse, Hmisc, DataExplorer, dplyr, caret, lattice, magrittr, ggplot2, scales, gridExtra, psych, plotly and many more.
The data set contains 119,390 rows and 32 columns.

4. Data Preparation / Wrangling:
We replace missing values in the children column with the value from the corresponding babies column. In the meal column, we replace Undefined with SC, since both mean no meal package. In the market segment and distribution channel columns, we replace Undefined with the mode of the column.
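The cleaning rules above can be sketched in a few lines. The original analysis was done in R; this is only a minimal standard-library Python illustration. The column names (children, babies, meal, market_segment, distribution_channel), the function name clean_bookings, and the toy rows are assumptions made for the example, and the sketch assumes each column has at least one defined value.

```python
from collections import Counter

def clean_bookings(rows):
    """Return cleaned copies of booking records (each record is a dict)."""
    # Mode of each column, computed over the defined values only.
    # Assumes every column contains at least one value other than "Undefined".
    modes = {}
    for col in ("market_segment", "distribution_channel"):
        defined = [r[col] for r in rows if r[col] != "Undefined"]
        modes[col] = Counter(defined).most_common(1)[0][0]

    cleaned = []
    for r in rows:
        r = dict(r)  # copy so the input records are left untouched
        if r["children"] is None:
            r["children"] = r["babies"]  # fill from the babies column
        if r["meal"] == "Undefined":
            r["meal"] = "SC"  # both mean no meal package
        for col, mode in modes.items():
            if r[col] == "Undefined":
                r[col] = mode
        cleaned.append(r)
    return cleaned

# Toy records, invented for illustration only.
rows = [
    {"children": None, "babies": 0, "meal": "Undefined",
     "market_segment": "Online TA", "distribution_channel": "TA/TO"},
    {"children": 1, "babies": 0, "meal": "BB",
     "market_segment": "Undefined", "distribution_channel": "Undefined"},
    {"children": 0, "babies": 1, "meal": "SC",
     "market_segment": "Online TA", "distribution_channel": "TA/TO"},
]
cleaned = clean_bookings(rows)
```

Computing the modes before the loop keeps the replacement value independent of the order in which rows are processed.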
5. Analyzing the data:
• Categorical and continuous data were analyzed. Univariate, bivariate, and multivariate analyses were performed.
• The data set was checked for seasonal trends.

MAJOR OBSERVATIONS FROM EDA
1. The number of bookings was highest in July and August and lowest in January.
2. There were more bookings for the City hotel than for the Resort hotel.
3. 41.7% of City hotel bookings were cancelled, compared with 21.7% of Resort hotel bookings.
4. The lead time (the number of days between the booking date and the arrival date) is shorter for the people who cancelled.
5. As the hotels are in Portugal, most bookings come from European countries. Portugal is the highest, with 48.59k bookings.
6. 77% of the bookings are made with bed and breakfast.
7. Only 3% are repeated guests.
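A figure such as the per-hotel cancellation rate in observation 3 comes from a simple group-and-count. The sketch below shows the idea in standard-library Python rather than the R used in the analysis. The field names hotel and is_canceled, the function name, and the toy records are assumptions for illustration, so the resulting numbers are not the article's results.

```python
from collections import defaultdict

def cancellation_rate_by_hotel(rows):
    """Percentage of cancelled bookings for each hotel type."""
    totals = defaultdict(int)
    cancelled = defaultdict(int)
    for r in rows:
        totals[r["hotel"]] += 1
        cancelled[r["hotel"]] += r["is_canceled"]  # 1 if cancelled, else 0
    return {h: 100.0 * cancelled[h] / totals[h] for h in totals}

# Toy records, invented for illustration only.
toy = [
    {"hotel": "City Hotel", "is_canceled": 1},
    {"hotel": "City Hotel", "is_canceled": 0},
    {"hotel": "Resort Hotel", "is_canceled": 0},
    {"hotel": "Resort Hotel", "is_canceled": 0},
    {"hotel": "Resort Hotel", "is_canceled": 1},
    {"hotel": "Resort Hotel", "is_canceled": 0},
]
rates = cancellation_rate_by_hotel(toy)
```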

6. MODEL BUILDING - FOR BOOKING CANCELLATIONS:
1) USING MULTIPLE LOGISTIC REGRESSION
• The model was built using some variables from the data set as independent variables to predict booking cancellations.
• It reached 80.4% accuracy on the testing data set.
LIMITATIONS
• Execution time is high with large data sets.
• The model assumptions are difficult to satisfy.

2) USING RIDGE AND LASSO REGULARIZATION:
RIDGE REGRESSION
• The model was built using all the independent variables except the reservation status and date columns to predict booking cancellations.
• k-fold cross-validation was performed.
• It reached 80.9% accuracy on the testing data set.
• A confusion matrix was prepared to evaluate the performance of the model.
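The accuracy figures reported for these models are derived from a confusion matrix. As a minimal sketch of that relationship (in standard-library Python, not the R code used in the analysis), the example below counts the four cells for a binary cancelled / not-cancelled label and computes accuracy from them. The function names and the toy label vectors are assumptions for illustration.

```python
def confusion_matrix(actual, predicted):
    """Count the four cells of a binary confusion matrix (1 = cancelled)."""
    cells = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for a, p in zip(actual, predicted):
        if p == 1:
            cells["tp" if a == 1 else "fp"] += 1
        else:
            cells["tn" if a == 0 else "fn"] += 1
    return cells

def accuracy(cells):
    """Share of predictions that match the actual label."""
    correct = cells["tp"] + cells["tn"]
    return correct / sum(cells.values())

# Toy labels, invented for illustration only.
actual    = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]
cells = confusion_matrix(actual, predicted)
acc = accuracy(cells)
```

Looking at the individual cells, not only the accuracy, shows whether a model's errors are mostly missed cancellations (fn) or false alarms (fp).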

LASSO REGRESSION
• It reached 80.57% accuracy on the testing data set.
• A confusion matrix was prepared.

3) USING RANDOM FOREST:
• The model was built using all the independent variables except the reservation status and date columns to predict booking cancellations.
• 80% of the data was used for training and 20% for testing.
• It reached 93.9% accuracy on the testing data set.
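An 80/20 split like the one described above can be sketched as a shuffle followed by a cut. This is a standard-library Python illustration, not the R code used in the analysis. The function name, the fixed seed, and the use of integers in place of booking rows are assumptions for the example.

```python
import random

def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of the rows and cut it into training and testing parts."""
    rng = random.Random(seed)   # fixed seed (an assumption) for reproducibility
    shuffled = list(rows)       # copy so the input order is left untouched
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Integers stand in for booking rows here.
train, test = train_test_split(range(100))
```

Shuffling before the cut matters when the rows are ordered, for example by arrival date, because otherwise the testing part would cover only one period.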

7. MODEL BUILDING - FOR SPECIAL REQUEST:
• The model was built using all the independent variables to predict special requests.
• The number of decision trees is 500 and the number of variables tried at each split (mtry) is 5.
• It reached 83% accuracy on the testing data set.
• The model was tuned by increasing mtry.

8. Conclusion:
• The booking cancellation model helps identify the likelihood of a booking being cancelled, which makes it possible for hotel managers to take measures to avoid potential cancellations, such as offering services, discounts, or other perks.
• The prediction model enables hotel managers to mitigate the revenue loss caused by booking cancellations and the risks associated with overbooking (reallocation costs, cash or service compensation).
• The special request model helps reduce uncertainty in inventory allocation and pricing decisions by predicting the likelihood of a customer making a request.

Hotel Booking Demand Project Using Different Models was originally published in Becoming Human: Artificial Intelligence Magazine on Medium, where people are continuing the conversation by highlighting and responding to this story.
