In [ ]: %matplotlib inline Load the dataset In [ ]: flight_df = pd.read_excel('Airfares.xlsx', sheet_name='data') Prepro
Posted: Sun Jul 03, 2022 10:00 am
In [ ]:
%matplotlib inline
Load the dataset
In [ ]:
import pandas as pd
flight_df = pd.read_excel('Airfares.xlsx', sheet_name='data')
Preprocessing
1. Conduct preprocessing below using FARE as the dependent variable and the following variables as independent variables: COUPON, NEW, VACATION, SW, HI, S_INCOME, E_INCOME, S_POP, E_POP, SLOT, GATE, DISTANCE, PAX. At the end of preprocessing, all independent variables should be in a dataframe named X, and the dependent variable should be in a Series object (i.e., a single-dimension dataframe) named y.
In [ ]:
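A minimal sketch of the preprocessing step. It assumes that VACATION, SW, SLOT and GATE are stored as text categories that need dummy coding; check the dtypes in the real file first. The four-row flight_df below is made up purely so the snippet runs on its own, and is not the real Airfares data.

```python
import pandas as pd

# Made-up stand-in for flight_df (illustrative values only, not the real data).
flight_df = pd.DataFrame({
    "COUPON": [1.00, 1.06, 1.06, 1.20],
    "NEW": [3, 3, 3, 1],
    "VACATION": ["No", "No", "Yes", "No"],
    "SW": ["Yes", "No", "No", "Yes"],
    "HI": [5291.99, 5419.16, 9185.28, 2657.35],
    "S_INCOME": [28637, 26993, 30124, 29260],
    "E_INCOME": [21112, 29838, 29838, 29838],
    "S_POP": [3036732, 3532657, 5787293, 7830332],
    "E_POP": [205711, 7145897, 7145897, 7145897],
    "SLOT": ["Free", "Free", "Controlled", "Free"],
    "GATE": ["Free", "Free", "Constrained", "Free"],
    "DISTANCE": [312, 576, 364, 612],
    "PAX": [7864, 8820, 6452, 25144],
    "FARE": [64.11, 174.47, 207.76, 85.47],
})

predictors = ["COUPON", "NEW", "VACATION", "SW", "HI", "S_INCOME", "E_INCOME",
              "S_POP", "E_POP", "SLOT", "GATE", "DISTANCE", "PAX"]
outcome = "FARE"

# Dummy-code the text columns; drop_first avoids perfectly collinear dummies.
X = pd.get_dummies(flight_df[predictors], drop_first=True, dtype=int)
y = flight_df[outcome]

print(X.columns.tolist())
print(type(y).__name__)
```

Numeric columns pass through get_dummies unchanged, so only the text columns are expanded.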
2. Check for missing values. Drop the affected rows if needed.
In [ ]:
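One possible way to do the missing-value check, sketched on a small made-up frame (the values and the variable name flight_clean are illustrative assumptions, not part of the assignment).

```python
import numpy as np
import pandas as pd

# Made-up rows with two gaps, for illustration only.
flight_df = pd.DataFrame({
    "DISTANCE": [312.0, 576.0, np.nan, 612.0, 1220.0],
    "PAX": [7864.0, np.nan, 6452.0, 25144.0, 9830.0],
    "FARE": [64.11, 174.47, 207.76, 85.47, 152.60],
})

# Count missing values per column.
missing_per_column = flight_df.isna().sum()
print(missing_per_column)

# Drop every row that has at least one missing value.
flight_clean = flight_df.dropna().reset_index(drop=True)
print(len(flight_df), "rows before,", len(flight_clean), "rows after")
```

Dropping rows from the full dataframe before splitting it into X and y keeps the two aligned. If X and y already exist, the same rows have to be removed from both.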
Partition data
In [ ]:
from sklearn.model_selection import train_test_split
train_X, valid_X, train_y, valid_y = train_test_split(X, y, test_size=0.4, random_state=1)
Train the model
In [ ]:
from sklearn.linear_model import LinearRegression
car_lm = LinearRegression().fit(train_X, train_y)
Print coefficients
In [ ]:
print('intercept ', car_lm.intercept_)
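The intercept alone does not show the slope of each predictor. A sketch of listing the coefficients next to the column names follows, using noise-free made-up data with only two predictors so the fitted values are easy to check. The name coef_table is an assumption for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Made-up, noise-free training data: FARE = 40 + 0.08*DISTANCE + 10*COUPON.
rng = np.random.default_rng(1)
train_X = pd.DataFrame({
    "DISTANCE": rng.uniform(100, 2500, 50),
    "COUPON": rng.uniform(1.0, 1.9, 50),
})
train_y = 40 + 0.08 * train_X["DISTANCE"] + 10 * train_X["COUPON"]

car_lm = LinearRegression().fit(train_X, train_y)

print("intercept ", car_lm.intercept_)
# coef_ is ordered like the columns of the training frame.
coef_table = pd.DataFrame({"Predictor": train_X.columns, "coefficient": car_lm.coef_})
print(coef_table)
```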
Test the model using the training set
In [ ]:
Test the model using validation set
In [ ]:
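The two testing steps above could be sketched as follows: predict on the training partition, predict on the validation partition, and line up predictions with actual values. The data is randomly generated stand-in data, and the name valid_results is an illustrative assumption.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Made-up data standing in for the preprocessed X and y.
rng = np.random.default_rng(1)
X = pd.DataFrame({
    "DISTANCE": rng.uniform(100, 2500, 100),
    "COUPON": rng.uniform(1.0, 1.9, 100),
})
y = (40 + 0.08 * X["DISTANCE"] + 10 * X["COUPON"] + rng.normal(0, 5, 100)).rename("FARE")

train_X, valid_X, train_y, valid_y = train_test_split(X, y, test_size=0.4, random_state=1)
car_lm = LinearRegression().fit(train_X, train_y)

# Predictions on data the model has seen, and on data it has not seen.
train_pred = car_lm.predict(train_X)
valid_pred = car_lm.predict(valid_X)

valid_results = pd.DataFrame({
    "Predicted": valid_pred,
    "Actual": valid_y,
    "Residual": valid_y - valid_pred,
})
print(valid_results.head())
```

Training-set errors are usually optimistic, so the validation residuals are the ones to judge the model by.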
Print performance measures
Compute common accuracy measures
In [ ]:
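A sketch of the usual accuracy measures (mean error, RMSE, MAE, mean percentage error, MAPE) written with NumPy only. The helper name accuracy_measures and the three sample values are assumptions for illustration. Residuals are taken as actual minus predicted.

```python
import numpy as np

def accuracy_measures(y_true, y_pred):
    """Return common regression accuracy measures as a dict."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred
    return {
        "ME": residuals.mean(),
        "RMSE": np.sqrt((residuals ** 2).mean()),
        "MAE": np.abs(residuals).mean(),
        "MPE": 100 * (residuals / y_true).mean(),
        "MAPE": 100 * np.abs(residuals / y_true).mean(),
    }

# Small made-up example: actual fares versus predicted fares.
measures = accuracy_measures([100, 200, 400], [110, 190, 400])
for name, value in measures.items():
    print(f"{name:>5}: {value:.4f}")
```

In the notebook the same helper would be called with valid_y and car_lm.predict(valid_X). MPE and MAPE are undefined when an actual value is zero.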
Determine the residuals and create a histogram
In [ ]:
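A possible sketch for the residual histogram, again on made-up data. The bin count of 25 and the axis labels are arbitrary choices, not requirements of the assignment.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for scripts; omit this line in a notebook with %matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Made-up data standing in for the preprocessed X and y.
rng = np.random.default_rng(1)
X = pd.DataFrame({
    "DISTANCE": rng.uniform(100, 2500, 100),
    "COUPON": rng.uniform(1.0, 1.9, 100),
})
y = 40 + 0.08 * X["DISTANCE"] + 10 * X["COUPON"] + rng.normal(0, 5, 100)

train_X, valid_X, train_y, valid_y = train_test_split(X, y, test_size=0.4, random_state=1)
car_lm = LinearRegression().fit(train_X, train_y)

# Residual = actual minus predicted, computed on the validation partition.
residuals = valid_y - car_lm.predict(valid_X)

fig, ax = plt.subplots()
counts, bin_edges, _ = ax.hist(residuals, bins=25)
ax.set_xlabel("Residual")
ax.set_ylabel("Frequency")
```

A roughly symmetric histogram centered near zero suggests the model is not systematically over- or under-predicting.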
Exhaustive search
Run an exhaustive search.
In [ ]:
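One way to sketch an exhaustive search with scikit-learn alone: fit a model for every non-empty subset of predictors and rank the subsets by adjusted R-squared. This is an illustration under assumptions, not necessarily the helper the course intends. The data is made up (standardized columns with borrowed names), and adjusted_r2 is a hypothetical helper defined here.

```python
from itertools import combinations

import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

def adjusted_r2(y_true, y_pred, n_predictors):
    """Adjusted R-squared for a model with n_predictors predictors plus an intercept."""
    y_true = np.asarray(y_true, dtype=float)
    n = len(y_true)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1 - ss_res / ss_tot
    return 1 - (1 - r2) * (n - 1) / (n - n_predictors - 1)

# Made-up data: only DISTANCE and COUPON actually drive the outcome.
rng = np.random.default_rng(0)
train_X = pd.DataFrame(rng.normal(size=(80, 4)),
                       columns=["DISTANCE", "COUPON", "NEW", "PAX"])
train_y = 3 * train_X["DISTANCE"] + 2 * train_X["COUPON"] + rng.normal(0, 0.5, 80)

# Fit every non-empty subset of predictors and record its score.
results = []
for k in range(1, len(train_X.columns) + 1):
    for subset in combinations(train_X.columns, k):
        cols = list(subset)
        model = LinearRegression().fit(train_X[cols], train_y)
        score = adjusted_r2(train_y, model.predict(train_X[cols]), k)
        results.append({"n": k, "variables": subset, "adj_r2": score})

results_df = pd.DataFrame(results)
# Keep the best subset for each model size.
best_by_size = results_df.loc[results_df.groupby("n")["adj_r2"].idxmax()]
print(best_by_size)
```

With p predictors there are 2^p - 1 subsets, so the 13 predictors in this assignment (more after dummy coding) already mean thousands of fits.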
Backward elimination
In [ ]:
In [ ]:
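A hand-rolled sketch of backward elimination: start from all predictors and repeatedly drop the one whose removal lowers AIC the most, stopping when no removal helps. Using AIC as the criterion is an assumption, as are the helper name aic and the made-up data.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

def aic(X, y, variables):
    """AIC (up to an additive constant) of an OLS fit on the given columns."""
    n = len(y)
    if variables:
        pred = LinearRegression().fit(X[variables], y).predict(X[variables])
    else:
        pred = np.full(n, y.mean())  # intercept-only model
    sse = float(np.sum((y.to_numpy() - pred) ** 2))
    return n * np.log(sse / n) + 2 * (len(variables) + 1)

# Made-up data: only DISTANCE and COUPON actually drive the outcome.
rng = np.random.default_rng(0)
train_X = pd.DataFrame(rng.normal(size=(80, 4)),
                       columns=["DISTANCE", "COUPON", "NEW", "PAX"])
train_y = 3 * train_X["DISTANCE"] + 2 * train_X["COUPON"] + rng.normal(0, 0.5, 80)

selected = list(train_X.columns)
best_aic = aic(train_X, train_y, selected)
while selected:
    # Score each candidate model obtained by dropping one predictor.
    cand_aic, drop = min(
        (aic(train_X, train_y, [v for v in selected if v != d]), d) for d in selected
    )
    if cand_aic >= best_aic:
        break  # no removal improves AIC
    selected.remove(drop)
    best_aic = cand_aic

print("kept:", selected)
```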
Forward selection
In [ ]:
In [ ]:
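Forward selection runs the other way: start with no predictors and keep adding the one that lowers AIC the most until nothing improves. The sketch below makes the same assumptions as any hand-rolled version (AIC as criterion, made-up data, a hypothetical aic helper).

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

def aic(X, y, variables):
    """AIC (up to an additive constant) of an OLS fit on the given columns."""
    n = len(y)
    if variables:
        pred = LinearRegression().fit(X[variables], y).predict(X[variables])
    else:
        pred = np.full(n, y.mean())  # intercept-only model
    sse = float(np.sum((y.to_numpy() - pred) ** 2))
    return n * np.log(sse / n) + 2 * (len(variables) + 1)

# Made-up data: only DISTANCE and COUPON actually drive the outcome.
rng = np.random.default_rng(0)
train_X = pd.DataFrame(rng.normal(size=(80, 4)),
                       columns=["DISTANCE", "COUPON", "NEW", "PAX"])
train_y = 3 * train_X["DISTANCE"] + 2 * train_X["COUPON"] + rng.normal(0, 0.5, 80)

selected = []
remaining = list(train_X.columns)
best_aic = aic(train_X, train_y, selected)
while remaining:
    # Score each candidate model obtained by adding one more predictor.
    cand_aic, add = min((aic(train_X, train_y, selected + [v]), v) for v in remaining)
    if cand_aic >= best_aic:
        break  # no addition improves AIC
    selected.append(add)
    remaining.remove(add)
    best_aic = cand_aic

print("selected:", selected)
```

Forward and backward searches can end on different subsets, so it is worth comparing both against the validation-set measures.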