


The objective of this work is to predict a student's chance of admission from their GRE and TOEFL scores. We start by predicting the admission chance from the GRE score alone, and then extend the model to use both scores. We use the "Admission_Predict_Ver1.1" dataset obtained from Kaggle and build our machine learning algorithm (linear regression) in MATLAB. We begin by loading and plotting the data:
dat = load('grad1.txt');
X = dat(:, 1); y = dat(:, 2);
m = length(y);                        % number of training examples
plot(X, y, 'rx', 'MarkerSize', 10);   % scatter plot of the raw data
ylabel('Admission chance in %');
xlabel('GRE score in 1000s');
fprintf('Program paused. Press enter to continue.\n');
pause;

X = [ones(m, 1), X];   % add a column of ones for the intercept term
theta = zeros(2, 1);   % initialize the fitting parameters
iterations = 25000;
alpha = 1.0;
J = computeCost1(X, y, theta);
fprintf('With theta = [0 ; 0]\nCost computed = %f\n', J);
fprintf('Expected cost (approx) 0.26\n');
J = computeCost1(X, y, [-1 ; 2]);
fprintf('\nWith theta = [-1 ; 2]\nCost computed = %f\n', J);
fprintf('Expected cost (approx) 0.59\n');
fprintf('Program paused. Press enter to continue.\n');
pause;
Output:
With theta = [0 ; 0]
Cost computed = 0.269432
Expected cost (approx) 0.26
With theta = [-1 ; 2]
Cost computed = 0.598547
Expected cost (approx) 0.59
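The helper computeCost1 is not listed in this post; its name and signature come from the calls above, and the body below is a minimal sketch assuming the standard squared-error cost J(theta) = (1/2m) * sum((h(x) - y).^2):

function J = computeCost1(X, y, theta)
% Squared-error cost for linear regression (vectorized).
m = length(y);               % number of training examples
h = X * theta;               % hypothesis: predictions for all examples
J = sum((h - y).^2) / (2*m); % average squared error, halved
end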
theta = gradientDescent1(X, y, theta, alpha, iterations);
fprintf('Theta found by gradient descent:\n');
fprintf('%f\n', theta);
fprintf('Expected theta values (approx)\n');
fprintf(' -2.3874\n 9.8184\n\n');
Output:
Theta found by gradient descent:
-2.387424
9.818487
Expected theta values (approx)
-2.3874
9.8184
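gradientDescent1 is also not shown in the post; a minimal sketch of batch gradient descent with the usual simultaneous, vectorized update (the function name and signature follow the call above; the body is an assumption):

function [theta, J_history] = gradientDescent1(X, y, theta, alpha, num_iters)
% Batch gradient descent for linear regression.
m = length(y);
J_history = zeros(num_iters, 1);
for iter = 1:num_iters
    % Simultaneous update of all parameters
    theta = theta - (alpha/m) * (X' * (X*theta - y));
    J_history(iter) = computeCost1(X, y, theta);   % track convergence
end
end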
hold on;
plot(X(:,2), X*theta, '-')   % fitted line over the training data
legend('Training data', 'Linear regression')
hold off

predict1 = [1, 0.316] * theta;
fprintf('For gre = 316, we predict an admission chance of %f\n', ...
    predict1);
predict2 = [1, 0.340] * theta;
fprintf('For gre = 340, we predict an admission chance of %f\n', ...
    predict2);
Output:
For gre = 316, we predict an admission chance of 0.715218
For gre = 340, we predict an admission chance of 0.950861
So a student with a GRE score of 316 has about a 71.5% chance of being admitted, while a student with a GRE score of 340 has about a 95.1% chance.
theta0_vals = linspace(-10, 10, 100);
theta1_vals = linspace(-1, 4, 100);
J_vals = zeros(length(theta0_vals), length(theta1_vals));
for i = 1:length(theta0_vals)
    for j = 1:length(theta1_vals)
        t = [theta0_vals(i); theta1_vals(j)];
        J_vals(i,j) = computeCost1(X, y, t);
    end
end
% Plot the cost surface (transpose J_vals so the axes line up with surf)
figure;
surf(theta0_vals, theta1_vals, J_vals');
xlabel('\theta_0'); ylabel('\theta_1');
Output: a bowl-shaped surface plot of the cost function J over (theta0, theta1).
data = load('gag.txt');
X = data(:, 1:2);
y = data(:, 3);
m = length(y);
fprintf('First 10 examples from the dataset: \n');
fprintf(' x = [%.0f %.0f], y = %.0f \n', [X(1:10,:) y(1:10,:)]');
fprintf('Normalizing Features ...\n');
[X, mu, sigma] = featureNormalize1(X);
X = [ones(m, 1) X];
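featureNormalize1 is another helper the post does not list; a minimal sketch of standard z-score feature scaling (name and signature taken from the call above; the body is an assumption):

function [X_norm, mu, sigma] = featureNormalize1(X)
% Scale each feature column to zero mean and unit standard deviation.
mu = mean(X);
sigma = std(X);
X_norm = (X - mu) ./ sigma;   % implicit expansion (R2016b+; use bsxfun on older releases)
end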
fprintf('Running gradient descent ...\n');
% Try several learning rates
alpha1 = 0.0001;
alpha2 = 0.001;
alpha3 = 0.015;
alpha4 = 0.01;
num_iters = 1000;
% Initialize theta and run gradient descent
% (note: each call below continues from the theta returned by the previous one)
theta = zeros(3, 1);
[theta, J1] = gradientDescentMulti1(X, y, theta, alpha1, num_iters);
[theta, J2] = gradientDescentMulti1(X, y, theta, alpha2, num_iters);
[theta, J3] = gradientDescentMulti1(X, y, theta, alpha3, num_iters);
[theta, J4] = gradientDescentMulti1(X, y, theta, alpha4, num_iters);
% Plot the convergence graph for each learning rate
figure;
subplot(2,2,1);
plot(J1, 'b')
xlabel('Number of iterations');
ylabel('Cost J');
title('Convergence for alpha = 0.0001')
subplot(2,2,2);
plot(J2, 'r')
xlabel('Number of iterations');
ylabel('Cost J');
title('Convergence for alpha = 0.001')
subplot(2,2,3);
plot(J3, 'k')
xlabel('Number of iterations');
ylabel('Cost J');
title('Convergence for alpha = 0.015')
subplot(2,2,4);
plot(J4, 'g')
xlabel('Number of iterations');
ylabel('Cost J');
title('Convergence for alpha = 0.01')
% Display gradient descent's result
fprintf('Theta computed from gradient descent: \n');
fprintf(' %f \n', theta);
fprintf('\n');
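gradientDescentMulti1 is not listed in the post either; the vectorized update from the univariate case carries over unchanged to any number of features, so a minimal sketch (signature from the calls above; body assumed) is:

function [theta, J_history] = gradientDescentMulti1(X, y, theta, alpha, num_iters)
% Batch gradient descent; the vectorized update works for any number of features.
m = length(y);
J_history = zeros(num_iters, 1);
for iter = 1:num_iters
    theta = theta - (alpha/m) * (X' * (X*theta - y));
    J_history(iter) = sum((X*theta - y).^2) / (2*m);   % cost after this update
end
end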
Output: a 2-by-2 grid of convergence plots, one per learning rate.
% Predict using the normalized features of [300, 99]
admitchance = [1, ([300, 99] - mu) ./ sigma] * theta;
fprintf(['Chance of admission of a student with a gre score = 300 and toefl score = 99 ' ...
    '(using gradient descent):\n %f\n'], admitchance);
fprintf('Program paused. Press enter to continue.\n');
pause;
Output:
Theta computed from gradient descent:
72.174000
6.921442
5.457525
Chance of admission of a student with a gre score = 300 and toefl score = 99 (using gradient descent):
approximately 54.73
data = csvread('gag.txt');
X = data(:, 1:2);
y = data(:, 3);
m = length(y);
X = [ones(m, 1) X];
theta = normalEqn1(X, y);
fprintf('Theta computed from the normal equations: \n');
fprintf(' %f \n', theta);
fprintf('\n');
% Estimate the chance of admission of a student with a gre score = 300 and toefl score = 99
admitchance = [1, 300, 99] * theta;
fprintf(['Chance of admission of a student with a gre = 300 and toefl = 99 ' ...
    '(using normal equations):\n %f\n'], admitchance);
Output:
Theta computed from the normal equations:
-218.026755
0.613504
0.896000
Chance of admission of a student with a gre = 300 and toefl = 99 (using normal equations):
54.728445
So a student with a GRE score of 300 and a TOEFL score of 99 has roughly a 54.7% chance of being admitted, and the gradient descent and normal-equation solutions agree.
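For completeness, normalEqn1 is the last unlisted helper; a minimal sketch of the closed-form least-squares solution theta = (X'X)^(-1) X'y (signature from the call above; body assumed):

function theta = normalEqn1(X, y)
% Closed-form least-squares solution via the normal equations.
% pinv is used instead of inv for robustness when X'X is near-singular.
theta = pinv(X' * X) * X' * y;
end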
Thanks for joining me!
Good company in a journey makes the way seem shorter. — Izaak Walton
