## Synopsis

The objective of this work is to predict a student's chance of admission from their GRE and TOEFL scores. We start by predicting the admission chance from the GRE score alone, and then extend the model to use both scores together.

We use the "Admission_Predict_Ver1.1" data set obtained from Kaggle, and build our machine learning algorithm (linear regression) in MATLAB. The work focuses on:

- Plot a scatter plot of the data set
- Compute and display the initial cost
- Test the cost function
- Run gradient descent
- Plot the linear fit
- Predict the admission chance for GRE scores of 316 and 340
- Visualize the cost function

**Load the Data**

```matlab
dat = load('grad1.txt');
X = dat(:, 1);   % GRE score (in 1000s)
y = dat(:, 2);   % chance of admission
m = length(y);   % number of training examples
```

**Plot the Data**

```matlab
plot(X, y, 'rx', 'MarkerSize', 10);   % scatter plot of the training data
ylabel('Admission chance in %');
xlabel('GRE score in 1000s');

fprintf('Program paused. Press enter to continue.\n');
pause;
```

## Compute and display initial cost

```matlab
X = [ones(m, 1), X];   % add a column of ones for the intercept term
theta = zeros(2, 1);   % initialize the fitting parameters

iterations = 25000;
alpha = 1.0;           % learning rate

J = computeCost1(X, y, theta);
fprintf('With theta = [0 ; 0]\nCost computed = %f\n', J);
fprintf('Expected cost value (approx) 0.26\n');

% Test the cost function at a second point
J = computeCost1(X, y, [-1 ; 2]);
fprintf('\nWith theta = [-1 ; 2]\nCost computed = %f\n', J);
fprintf('Expected cost value (approx) 0.59\n');

fprintf('Program paused. Press enter to continue.\n');
pause;
```

**Output:**

```
With theta = [0 ; 0]
Cost computed = 0.269432
Expected cost value (approx) 0.26

With theta = [-1 ; 2]
Cost computed = 0.598547
Expected cost value (approx) 0.59
```
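The helper `computeCost1` is not listed here. Assuming it implements the standard squared-error cost J(θ) = (1/2m) Σ (h_θ(x⁽ⁱ⁾) − y⁽ⁱ⁾)², a minimal pure-Python sketch of the same computation on toy data (all names hypothetical):

```python
def compute_cost(X, y, theta):
    """Squared-error cost: J = (1/(2m)) * sum((h_theta(x_i) - y_i)^2).

    X is a list of feature rows (each including the leading 1 for the
    intercept), y the targets, theta the parameter vector.
    """
    m = len(y)
    total = 0.0
    for xi, yi in zip(X, y):
        h = sum(t * x for t, x in zip(theta, xi))  # hypothesis h_theta(x)
        total += (h - yi) ** 2
    return total / (2 * m)

# With theta = [0, 0] the hypothesis is identically zero, so the cost is
# just the mean of y^2 divided by 2: (1 + 4 + 9) / 6 here.
X = [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 2.0, 3.0]
print(compute_cost(X, y, [0.0, 0.0]))
```

With `theta = [0, 1]` the hypothesis matches y exactly on this toy data, so the cost drops to zero.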

## Run gradient descent

```matlab
theta = gradientDescent1(X, y, theta, alpha, iterations);

fprintf('Theta found by gradient descent:\n');
fprintf('%f\n', theta);
fprintf('Expected theta values (approx)\n');
fprintf(' -2.3874\n 9.8184\n\n');
```

**Output:**

```
Theta found by gradient descent:
-2.387424
9.818487
Expected theta values (approx)
-2.3874
9.8184
```
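`gradientDescent1` is likewise not shown. Assuming the usual batch update θ_j := θ_j − (α/m) Σ (h_θ(x⁽ⁱ⁾) − y⁽ⁱ⁾) x_j⁽ⁱ⁾, a small Python sketch on toy data where the answer is known:

```python
def gradient_descent(X, y, theta, alpha, iterations):
    """Batch gradient descent for linear regression.

    Each step updates every theta_j simultaneously:
    theta_j -= (alpha/m) * sum_i (h(x_i) - y_i) * x_ij.
    """
    m = len(y)
    theta = list(theta)
    for _ in range(iterations):
        errors = [sum(t * x for t, x in zip(theta, xi)) - yi
                  for xi, yi in zip(X, y)]
        theta = [t - (alpha / m) * sum(e * xi[j] for e, xi in zip(errors, X))
                 for j, t in enumerate(theta)]
    return theta

# Data drawn from y = 2x, so theta should approach [0, 2].
X = [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [2.0, 4.0, 6.0]
theta = gradient_descent(X, y, [0.0, 0.0], 0.1, 5000)
```

The key detail, matching the document's single `theta = gradientDescent1(...)` call, is that all components of θ are updated from the same error vector rather than one at a time.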

## Plot the linear fit

```matlab
hold on;
plot(X(:,2), X*theta, '-')   % X(:,2) is the GRE column; X*theta is the fitted line
legend('Training data', 'Linear regression')
hold off
```

## Predict admission chance

```matlab
predict1 = [1, 0.316] * theta;   % GRE of 316, scaled to thousands
fprintf('For gre = 316, we predict an admission chance of %f\n', predict1);

predict2 = [1, 0.340] * theta;
fprintf('For gre = 340, we predict an admission chance of %f\n', predict2);
```

**Output:**

```
For gre = 316, we predict an admission chance of 0.715218
For gre = 340, we predict an admission chance of 0.950861
```

So a student with a GRE score of 316 is predicted to have a 71.5% chance of being admitted, and a student with a GRE score of 340 a 95.08% chance.
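These two predictions are just the hypothesis h(x) = θ₀ + θ₁·x evaluated with the fitted θ and the GRE score scaled to thousands (matching the axis scaling above), which can be checked directly:

```python
theta = [-2.387424, 9.818487]  # values reported by gradient descent above

def predict(gre_score, theta):
    """Hypothesis h(x) = theta0 + theta1 * x, with x = GRE score / 1000."""
    x = gre_score / 1000.0
    return theta[0] + theta[1] * x

print(predict(316, theta))  # ~0.7152
print(predict(340, theta))  # ~0.9509
```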

## Visualizing the cost function

```matlab
theta0_vals = linspace(-10, 10, 100);
theta1_vals = linspace(-1, 4, 100);

% Evaluate the cost over the grid of (theta0, theta1) pairs
J_vals = zeros(length(theta0_vals), length(theta1_vals));
for i = 1:length(theta0_vals)
    for j = 1:length(theta1_vals)
        t = [theta0_vals(i); theta1_vals(j)];
        J_vals(i,j) = computeCost1(X, y, t);
    end
end

% Transpose before plotting, since surf expects Z(y_index, x_index)
J_vals = J_vals';
figure;
surf(theta0_vals, theta1_vals, J_vals);
xlabel('\theta_0'); ylabel('\theta_1');
```

**Output:** a surface plot of the cost function J over the (θ0, θ1) grid, with its bowl-shaped minimum near the θ found by gradient descent.
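The nested loop evaluates the cost over a grid of (θ0, θ1) pairs; the minimum of that surface should sit at the θ found by gradient descent. A hypothetical pure-Python version of the same grid computation, checked on toy data where the minimum is known:

```python
def cost_grid(X, y, theta0_vals, theta1_vals):
    """Evaluate the squared-error cost on every (theta0, theta1) grid pair."""
    m = len(y)
    J = [[0.0] * len(theta1_vals) for _ in theta0_vals]
    for i, t0 in enumerate(theta0_vals):
        for j, t1 in enumerate(theta1_vals):
            J[i][j] = sum((t0 + t1 * xi - yi) ** 2
                          for xi, yi in zip(X, y)) / (2 * m)
    return J

# On data drawn from y = 2x, the grid minimum sits at (theta0, theta1) = (0, 2).
X = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0]
t0s = [-1.0, 0.0, 1.0]
t1s = [0.0, 1.0, 2.0, 3.0]
J = cost_grid(X, y, t0s, t1s)
```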

## Considering both GRE and TOEFL for the prediction

**Load the Data**

```matlab
data = load('gag.txt');
X = data(:, 1:2);   % GRE and TOEFL scores
y = data(:, 3);     % chance of admission
m = length(y);
```

**Print out some data points**

```matlab
fprintf('First 10 examples from the dataset: \n');
fprintf(' x = [%.0f %.0f], y = %.0f \n', [X(1:10,:) y(1:10,:)]');
```

## Scale features

```matlab
fprintf('Normalizing Features ...\n');

[X, mu, sigma] = featureNormalize1(X);   % z-score each feature column
X = [ones(m, 1) X];                      % add the intercept column
```
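`featureNormalize1` presumably z-scores each column and returns the per-column means and standard deviations, which are needed later to transform any new query point the same way. A Python sketch under that assumption:

```python
def feature_normalize(X):
    """Z-score each column: subtract the column mean, divide by its std.

    Returns the normalized matrix plus the mu/sigma vectors, so a new
    query point can be normalized with the same training statistics.
    """
    m, n = len(X), len(X[0])
    mu = [sum(row[j] for row in X) / m for j in range(n)]
    sigma = [(sum((row[j] - mu[j]) ** 2 for row in X) / (m - 1)) ** 0.5
             for j in range(n)]  # sample std, as MATLAB's std() computes
    X_norm = [[(row[j] - mu[j]) / sigma[j] for j in range(n)] for row in X]
    return X_norm, mu, sigma

# Tiny example: GRE and TOEFL columns with obvious means and spreads.
X = [[310.0, 100.0], [320.0, 110.0], [330.0, 120.0]]
X_norm, mu, sigma = feature_normalize(X)
```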

## Gradient Descent

```matlab
fprintf('Running gradient descent ...\n');

% Try several learning rates
alpha1 = 0.0001;
alpha2 = 0.001;
alpha3 = 0.015;
alpha4 = 0.01;
num_iters = 1000;

% Re-initialize theta for each run so the learning rates are compared fairly
[theta, J1] = gradientDescentMulti1(X, y, zeros(3, 1), alpha1, num_iters);
[theta, J2] = gradientDescentMulti1(X, y, zeros(3, 1), alpha2, num_iters);
[theta, J3] = gradientDescentMulti1(X, y, zeros(3, 1), alpha3, num_iters);
[theta, J4] = gradientDescentMulti1(X, y, zeros(3, 1), alpha4, num_iters);

% Plot the convergence graph for each learning rate
figure;
subplot(2,2,1); plot(J1, 'b');
xlabel('Number of iterations'); ylabel('Cost J');
title('Convergence graph for alpha = 0.0001');

subplot(2,2,2); plot(J2, 'r');
xlabel('Number of iterations'); ylabel('Cost J');
title('Convergence graph for alpha = 0.001');

subplot(2,2,3); plot(J3, 'k');
xlabel('Number of iterations'); ylabel('Cost J');
title('Convergence graph for alpha = 0.015');

subplot(2,2,4); plot(J4, 'g');
xlabel('Number of iterations'); ylabel('Cost J');
title('Convergence graph for alpha = 0.01');

% Display gradient descent's result
fprintf('Theta computed from gradient descent: \n');
fprintf(' %f \n', theta);
fprintf('\n');
```

**Output:** a 2×2 grid of convergence plots, one per learning rate.
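`gradientDescentMulti1` apparently returns the per-iteration cost history alongside θ, which is what the convergence plots use. A Python sketch of that idea, showing that a larger (still stable) learning rate drives the cost down faster:

```python
def gradient_descent_multi(X, y, theta, alpha, num_iters):
    """Batch gradient descent that also records the cost J each iteration,
    so different learning rates can be compared on a convergence plot."""
    m = len(y)
    theta = list(theta)
    J_history = []
    for _ in range(num_iters):
        errors = [sum(t * x for t, x in zip(theta, xi)) - yi
                  for xi, yi in zip(X, y)]
        theta = [t - (alpha / m) * sum(e * xi[j] for e, xi in zip(errors, X))
                 for j, t in enumerate(theta)]
        J_history.append(sum(e * e for e in errors) / (2 * m))
    return theta, J_history

# Toy data with an intercept plus two (identical) features.
X = [[1.0, -1.0, -1.0], [1.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
y = [1.0, 2.0, 3.0]
_, J_slow = gradient_descent_multi(X, y, [0.0] * 3, 0.01, 200)
_, J_fast = gradient_descent_multi(X, y, [0.0] * 3, 0.1, 200)
# After the same number of iterations, the larger alpha reaches a lower cost.
```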

## Estimate the chance of admission of a student with GRE = 300 and TOEFL = 99 using gradient descent

```matlab
% Normalize the query point with the training mu and sigma before predicting
point = ([300, 99] - mu) ./ sigma;
admitchance = [1, point] * theta;

fprintf(['Chance of admission for a student with GRE = 300 and TOEFL = 99 ' ...
         '(using gradient descent): %f\n'], admitchance);

fprintf('Program paused. Press enter to continue.\n');
pause;
```

**Output:**

```
Theta computed from gradient descent:
 72.174000
 6.921442
 5.457525

Chance of admission for a student with GRE = 300 and TOEFL = 99 (using gradient descent): 0.360870
```

## Estimate the chance of admission of a student with GRE = 300 and TOEFL = 99 using normal equations

```matlab
data = csvread('gag.txt');
X = data(:, 1:2);
y = data(:, 3);
m = length(y);
X = [ones(m, 1) X];

theta = normalEqn1(X, y);   % closed-form solution of the normal equations

fprintf('Theta computed from the normal equations: \n');
fprintf(' %f \n', theta);
fprintf('\n');

% Estimate the chance of admission for GRE = 300, TOEFL = 99.
% No feature scaling here, so the raw scores are used directly.
admitchance = [1, 300, 99] * theta;

fprintf(['Chance of admission for a student with GRE = 300 and TOEFL = 99 ' ...
         '(using normal equations): %f\n'], admitchance);
```

**Output:**

```
Theta computed from the normal equations:
 -218.026755
 0.613504
 0.896000

Chance of admission for a student with GRE = 300 and TOEFL = 99 (using normal equations): 0.360870
```

So a student with a GRE score of 300 and a TOEFL score of 99 is predicted to have a 36.08% chance of being admitted.
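`normalEqn1` presumably solves the normal equations θ = (XᵀX)⁻¹Xᵀy in closed form. A self-contained Python sketch using Gaussian elimination (adequate for a handful of features; all names hypothetical):

```python
def normal_eqn(X, y):
    """Solve the normal equations (X^T X) theta = X^T y by Gaussian
    elimination with partial pivoting."""
    n = len(X[0])
    # Build A = X^T X and b = X^T y.
    A = [[sum(row[i] * row[j] for row in X) for j in range(n)] for i in range(n)]
    b = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(n)]
    # Forward elimination with partial pivoting.
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(A[r][k]))
        A[k], A[p] = A[p], A[k]
        b[k], b[p] = b[p], b[k]
        for r in range(k + 1, n):
            f = A[r][k] / A[k][k]
            for c in range(k, n):
                A[r][c] -= f * A[k][c]
            b[r] -= f * b[k]
    # Back substitution.
    theta = [0.0] * n
    for k in range(n - 1, -1, -1):
        theta[k] = (b[k] - sum(A[k][c] * theta[c]
                               for c in range(k + 1, n))) / A[k][k]
    return theta

# Points lying exactly on y = 1 + x are recovered exactly: theta ~ [1, 1].
X = [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [2.0, 3.0, 4.0]
theta = normal_eqn(X, y)
```

Unlike gradient descent, this needs no learning rate, no iteration count, and no feature scaling, which is why the document can feed the raw GRE and TOEFL scores straight into the prediction.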