OPE Prop for the MSE error function and any activation function
This page describes the general method for deriving a formula that computes the new weights and biases of a unit layer. It targets the MSE error function combined with any activation function, and is meant as a starting template for activation functions that aren't already covered on the website.
If we assume that:
i - is the matrix of all input entries, where the first dimension is the data row, and the second is the individual input
y - is a vector of answers
n - is the number of data rows in the dataset
w - is a vector of weights
nw - is the number of weights
b - is the bias
a - is the specified activation function
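Then we can solve for the required parameter (either $w_k$ or b) using this general formula, where a' is the derivative of the activation function and the prime on the last factor is the derivative with respect to the required parameter: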
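$$\sum_{r=1}^{n} \left( a\left( \sum_{g=1}^{\mathit{nw}} i_{rg} w_g + b \right) - y_r \right) \times a'\left( \sum_{g=1}^{\mathit{nw}} i_{rg} w_g + b \right) \times \left( \sum_{g=1}^{\mathit{nw}} i_{rg} w_g + b \right)' = 0$$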
This can be simplified by writing the prediction for one input row x as a function:
$$p(x) = \sum_{g=1}^{\mathit{nw}} x_g w_g + b$$
The equation then becomes easier to read:
$$\sum_{r=1}^{n} \left( a(p(i_r)) - y_r \right) \times a'(p(i_r)) \times p'(i_r) = 0$$
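For reference, the equation above is what one obtains by setting the derivative of the MSE to zero. The short derivation below is a sketch under stated assumptions: the symbol E, the 1/n normalisation of the MSE, and the placeholder θ (standing for either the weight w_k or the bias b) are introduced here for illustration and do not appear elsewhere on this page.

```latex
% Assumed definition of the MSE (E and the 1/n factor are not from the page)
E = \frac{1}{n} \sum_{r=1}^{n} \left( a(p(i_r)) - y_r \right)^2

% Chain rule with respect to a parameter \theta (either w_k or b), set to zero
\frac{\partial E}{\partial \theta}
  = \frac{2}{n} \sum_{r=1}^{n} \left( a(p(i_r)) - y_r \right)
    a'(p(i_r)) \, \frac{\partial p(i_r)}{\partial \theta} = 0

% The last factor, for each kind of parameter
\frac{\partial p(i_r)}{\partial w_k} = i_{rk},
\qquad
\frac{\partial p(i_r)}{\partial b} = 1
```

The constant factor 2/n does not affect where the expression equals zero, which is why it can be dropped from the general formula.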
After plugging in a specific activation function, solve the equation for the required parameter. The result is a formula that calculates the currently best-fitting value of that parameter, for the MSE error function and your chosen activation function.
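The general formula can also be checked numerically before solving it by hand. The following is a minimal Python sketch, not the website's implementation: the function names `predict`, `residuals`, `sigmoid`, and `sigmoid_prime` are hypothetical, and the sigmoid is used only as an example of an activation function one might plug in. It evaluates the left-hand side of the formula once for every weight and once for the bias.

```python
import math


def predict(x, weights, bias):
    """p(x): weighted sum of one input row plus the bias."""
    return sum(x_g * w_g for x_g, w_g in zip(x, weights)) + bias


def residuals(inputs, targets, weights, bias, act, act_prime):
    """Left-hand side of the general formula, once per parameter.

    Returns (per_weight, for_bias). A parameter satisfies the formula
    when its residual is zero.
    """
    per_weight = [0.0] * len(weights)
    for_bias = 0.0
    for row, y in zip(inputs, targets):
        p = predict(row, weights, bias)
        common = (act(p) - y) * act_prime(p)
        for k, x_k in enumerate(row):
            per_weight[k] += common * x_k  # derivative of p w.r.t. w_k is i_rk
        for_bias += common                 # derivative of p w.r.t. b is 1
    return per_weight, for_bias


# Example activation (an assumption for illustration only).
def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_prime(x):
    s = sigmoid(x)
    return s * (1.0 - s)


# Usage: targets generated from known parameters, so the residuals
# vanish when evaluated at exactly those parameters.
inputs = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
weights, bias = [0.5, -0.25], 0.1
targets = [sigmoid(predict(row, weights, bias)) for row in inputs]
print(residuals(inputs, targets, weights, bias, sigmoid, sigmoid_prime))
```

A non-zero residual indicates that the corresponding parameter does not yet satisfy the formula for the current values of the others.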
Example with a linear unit function
Here is an example of deriving the formula for the MSE error function and a linear activation function:
$$a(x) = x$$
Since a'(x) = 1 for this function, substituting it into the formula above gives:
$$\sum_{r=1}^{n} \left( p(i_r) - y_r \right) \times p'(i_r) = 0$$
Solving this for $w_k$ and b gives the formulas available on the output layer page.
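As an illustration of the solving step, the sketch below isolates b and one weight w_k in the linear equation while treating the other parameters as fixed. This is one way to carry out the algebra and may differ in form from the formulas on the output layer page. The function names `solve_bias`, `solve_weight`, and `fit_linear` are hypothetical, and applying the two updates in repeated sweeps is an assumption made here for demonstration.

```python
def solve_bias(inputs, targets, weights):
    """Solve sum_r (p(i_r) - y_r) = 0 for b, with the weights held fixed."""
    n = len(inputs)
    total = 0.0
    for row, y in zip(inputs, targets):
        total += y - sum(x * w for x, w in zip(row, weights))
    return total / n


def solve_weight(inputs, targets, weights, bias, k):
    """Solve sum_r (p(i_r) - y_r) * i_rk = 0 for w_k, others held fixed.

    Assumes input column k contains at least one non-zero value.
    """
    numerator = 0.0
    denominator = 0.0
    for row, y in zip(inputs, targets):
        # Contribution of every weight except w_k to the prediction.
        others = sum(x * w for g, (x, w) in enumerate(zip(row, weights)) if g != k)
        numerator += row[k] * (y - bias - others)
        denominator += row[k] ** 2
    return numerator / denominator


def fit_linear(inputs, targets, sweeps=200):
    """Apply the two closed-form updates in turn (illustrative assumption)."""
    weights = [0.0] * len(inputs[0])
    bias = 0.0
    for _ in range(sweeps):
        for k in range(len(weights)):
            weights[k] = solve_weight(inputs, targets, weights, bias, k)
        bias = solve_bias(inputs, targets, weights)
    return weights, bias


# Usage: data lying on the line y = 2x + 1.
weights, bias = fit_linear([[0.0], [1.0], [2.0], [3.0]], [1.0, 3.0, 5.0, 7.0])
print(weights, bias)
```

Each update sets the residual of its own parameter to zero for the current values of the others, so on this data the sweeps settle close to a weight of 2 and a bias of 1.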
OPE Prop formulas on this website are licensed under the CC BY-SA 4.0 License.