OPE Prop for the MSE error function and any activation function

This page describes the general method for deriving a formula that computes the new weights and biases of a unit layer. It targets the MSE error function with an arbitrary activation function, so it can serve as a starting template for activation functions that aren't already covered on the website.

If we assume that:

i - the matrix of all input entries, where the first dimension is the data row and the second is the individual input, so i_r is row r and i_rg is input g of row r

y - the vector of answers, where y_r is the answer for row r

n - the number of rows in the dataset

w - the vector of weights

n_w - the number of weights (one per input)

b - the bias

a - the specified activation function

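Then the required parameter (either a weight w_k or the bias b) can be found by solving the general equation below. In it, a' is the derivative of the activation function, and the prime on the last factor denotes the derivative with respect to the parameter being solved for: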

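$$\sum_{r=1}^{n} \left( a\!\left( \sum_{g=1}^{n_w} i_{rg} w_g + b \right) - y_r \right) \times a'\!\left( \sum_{g=1}^{n_w} i_{rg} w_g + b \right) \times \left( \sum_{g=1}^{n_w} i_{rg} w_g + b \right)' = 0$$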

This can be simplified by writing the prediction as a function:

$$p(x) = \sum_{g=1}^{n_w} x_g w_g + b$$

The equation then becomes easier to read:

$$\sum_{r=1}^{n} \left( a(p(i_r)) - y_r \right) \times a'(p(i_r)) \times p'(i_r) = 0$$
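For readers who want to see where this condition comes from, here is a short sketch of the derivation. It assumes the MSE is normalised as a mean over the n rows (the page does not spell out the normalisation, and any constant factor drops out anyway), and it introduces the symbol θ, which is not part of the original notation, to stand for whichever parameter is being solved for.

```latex
E = \frac{1}{n} \sum_{r=1}^{n} \bigl( a(p(i_r)) - y_r \bigr)^2

\frac{\partial E}{\partial \theta}
  = \frac{2}{n} \sum_{r=1}^{n} \bigl( a(p(i_r)) - y_r \bigr) \, a'(p(i_r)) \, \frac{\partial p(i_r)}{\partial \theta}

\frac{\partial p(i_r)}{\partial w_k} = i_{rk},
\qquad
\frac{\partial p(i_r)}{\partial b} = 1
```

Setting the derivative to zero and dropping the constant factor 2/n gives the equation above, with p'(i_r) standing for the derivative of p(i_r) with respect to θ.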

After plugging in a specific activation function, solve the equation for the required parameter. The result is a formula that calculates the best-fitting value of that parameter for the current values of the other parameters, under the MSE error function and your chosen activation function.
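Before solving by hand, it can help to check numerically that the left-hand side of the equation really is the derivative of the MSE, up to a constant factor. The following is a minimal sketch, not a reference implementation: the sigmoid activation, the function names, and the small dataset are all made up for illustration, and the MSE is assumed to be a mean over the rows.

```python
import math

def p(x, w, b):
    """Prediction p(x) = sum_g x_g * w_g + b for one data row x."""
    return sum(x_g * w_g for x_g, w_g in zip(x, w)) + b

def condition_lhs(i, y, w, b, a, a_prime, k=None):
    """Left-hand side of the general equation for w_k (or for b if k is None)."""
    total = 0.0
    for i_r, y_r in zip(i, y):
        z = p(i_r, w, b)
        dp = 1.0 if k is None else i_r[k]  # p'(i_r): 1 for the bias, i_rk for w_k
        total += (a(z) - y_r) * a_prime(z) * dp
    return total

def mse(i, y, w, b, a):
    """Mean squared error of the activated predictions."""
    return sum((a(p(i_r, w, b)) - y_r) ** 2 for i_r, y_r in zip(i, y)) / len(y)

# Hypothetical activation, chosen only for illustration: the logistic sigmoid.
def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_prime(x):
    s = sigmoid(x)
    return s * (1.0 - s)

# Made-up data: 4 rows with 2 inputs each.
i = [[0.5, -1.0], [1.5, 0.3], [-0.7, 0.8], [2.0, 1.0]]
y = [0.2, 0.9, 0.4, 0.7]
w = [0.3, -0.2]
b = 0.1

# Central finite difference of the MSE with respect to w_k.
k, h = 1, 1e-6
w_plus = w[:k] + [w[k] + h] + w[k + 1:]
w_minus = w[:k] + [w[k] - h] + w[k + 1:]
numeric = (mse(i, y, w_plus, b, sigmoid) - mse(i, y, w_minus, b, sigmoid)) / (2 * h)

# The MSE derivative is (2 / n) times the left-hand side of the equation.
analytic = 2.0 / len(y) * condition_lhs(i, y, w, b, sigmoid, sigmoid_prime, k)

print(numeric, analytic)  # the two values should agree closely
```

When the two numbers agree, the parameter value that makes `condition_lhs` return zero is a stationary point of the MSE with respect to that parameter.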

As an example, here is the derivation for the MSE error function with a linear activation function:

a(x) = x

Substituting it into the formula above, and noting that a'(x) = 1, gives:

$$\sum_{r=1}^{n} \left( p(i_r) - y_r \right) \times p'(i_r) = 0$$

Solving this for w_k and b then gives the formulas available on the output layer page.
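The solving step for the linear case can be sketched in code. This is an illustration under stated assumptions rather than a copy of the output layer page: it uses p'(i_r) = i_rk when solving for w_k and p'(i_r) = 1 when solving for b, holds all other parameters fixed, and the resulting expressions may be arranged differently from the formulas published there. The function names and data are hypothetical.

```python
def p(x, w, b):
    """Prediction p(x) = sum_g x_g * w_g + b for one data row x."""
    return sum(x_g * w_g for x_g, w_g in zip(x, w)) + b

def solve_w_k(i, y, w, b, k):
    """w_k satisfying sum_r (p(i_r) - y_r) * i_rk = 0, other parameters held fixed."""
    num = 0.0
    den = 0.0
    for i_r, y_r in zip(i, y):
        rest = p(i_r, w, b) - i_r[k] * w[k]  # prediction without the w_k term
        num += i_r[k] * (y_r - rest)
        den += i_r[k] ** 2
    return num / den  # assumes input column k is not all zeros

def solve_b(i, y, w):
    """b satisfying sum_r (p(i_r) - y_r) = 0, weights held fixed."""
    return sum(y_r - p(i_r, w, 0.0) for i_r, y_r in zip(i, y)) / len(y)

def linear_condition(i, y, w, b, k=None):
    """Left-hand side of the linear-case equation for w_k (or b if k is None)."""
    total = 0.0
    for i_r, y_r in zip(i, y):
        total += (p(i_r, w, b) - y_r) * (1.0 if k is None else i_r[k])
    return total

# Made-up data: 4 rows with 2 inputs each.
i = [[0.5, -1.0], [1.5, 0.3], [-0.7, 0.8], [2.0, 1.0]]
y = [0.2, 0.9, 0.4, 0.7]
w = [0.3, -0.2]
b = 0.1

w[0] = solve_w_k(i, y, w, b, 0)
print(linear_condition(i, y, w, b, 0))  # close to zero, up to rounding

b = solve_b(i, y, w)
print(linear_condition(i, y, w, b))     # close to zero, up to rounding
```

Each solved value satisfies its own condition only while the other parameters stay unchanged, which is why the first check is printed before the bias is updated.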

OPE Prop formulas on this website are licensed under the CC BY-SA 4.0 License.