Python Problem 1: estimate regression parameters using grid search
Posted: Sun May 15, 2022 11:49 am
Python
Problem 1: estimate regression parameters using grid search
We're going to start by looking at the relationship between income and house value in our dataset. How well does income predict housing value in each census block?
First, run the code below, which fetches the California Housing Dataset from sklearn and structures it as a pandas dataframe called cal_df. The column encoding income is MedInc and the column encoding house value is MedHouseVal.
Then, in your solution code block, find the best-fitting regression intercept and slope by iterating through candidate values and keeping the pair with the lowest sum of squared errors. For this problem, the best-fitting intercept and slope both lie between 0 and 1, so you only need to search that range for each parameter. Estimate your regression parameters in this range with a precision of 0.01, i.e. try 0.00, 0.01, 0.02, ..., 1.00 for each.
Here's how we recommend doing this:
Code needed:
import pandas as pd
from sklearn.datasets import fetch_california_housing
# Read in the California Housing dataset
cal = fetch_california_housing()
# Convert the dataset to a pandas dataframe (one column per feature)
cal_df = pd.DataFrame(data=cal.data, columns=cal.feature_names)
# Add the target (median house value) as its own column
cal_df['MedHouseVal'] = cal.target
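The recommended steps did not come through in the post, so what follows is only one possible sketch of the grid search described above, not necessarily the approach the assignment has in mind. The function name grid_search_regression and its parameters lo, hi and step are made up for illustration. It loops over every candidate intercept, uses NumPy broadcasting to score every candidate slope at once, and keeps the pair with the smallest sum of squared errors.

```python
import numpy as np

def grid_search_regression(x, y, lo=0.0, hi=1.0, step=0.01):
    """Return (intercept, slope, sse) with the lowest sum of squared errors on the grid.

    Hypothetical helper: the name and the lo/hi/step parameters are
    illustrative assumptions, not part of the original assignment.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Candidate values lo, lo + step, ..., hi (101 values for the defaults).
    # Rounding removes floating-point drift such as 0.30000000000000004.
    n_steps = int(round((hi - lo) / step))
    candidates = np.round(lo + step * np.arange(n_steps + 1), 10)

    best_intercept, best_slope, best_sse = None, None, np.inf
    for intercept in candidates:
        # Predictions for every candidate slope at once:
        # candidates[:, None] has shape (n_slopes, 1), x has shape (n_points,),
        # so the residuals have shape (n_slopes, n_points).
        residuals = y - (intercept + candidates[:, None] * x)
        sse = (residuals ** 2).sum(axis=1)

        # Best slope for this intercept; keep it if it beats the best so far.
        i = int(np.argmin(sse))
        if sse[i] < best_sse:
            best_intercept = float(intercept)
            best_slope = float(candidates[i])
            best_sse = float(sse[i])

    return best_intercept, best_slope, best_sse
```

With the dataframe built above, this would be called as grid_search_regression(cal_df['MedInc'], cal_df['MedHouseVal']). A plain double for loop over intercept and slope gives the same answer; the broadcasting version just avoids the inner Python loop. As a sanity check, the result can be compared against np.polyfit(cal_df['MedInc'], cal_df['MedHouseVal'], 1), which returns the least-squares slope and intercept without the 0.01 grid restriction.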