OPE Prop for one weight and bias
The first thing I developed in the history of OPE Prop was an algorithm for calculating the best-suited weight and bias in a model with one weight and one bias. This was the beginning of OPE Prop, and it already resulted in a performance increase over gradient descent.
The Main Formula
If we consider that i₁, i₂, … are the inputs, a₁, a₂, … are the expected answers, n is the number of samples, w is the weight and b is the bias, then we get the following formula for the error function e(w, b):
e(w, b) = (1/n)·((a₁ − w·i₁ − b)² + (a₂ − w·i₂ − b)² + …)
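To make the formula concrete, here is a small Ruby sketch of e(w, b). The sample inputs, answers, and parameter values are made up for illustration (the answers are generated from the line a = 2i + 3):

```ruby
# Mean squared error: e(w, b) = (1/n) * sum of (a_k - w*i_k - b)^2
inputs  = [1.0, 2.0, 3.0]
answers = [5.0, 7.0, 9.0]  # generated from a = 2i + 3
n = inputs.size

e = lambda do |w, b|
  inputs.zip(answers).map { |i, a| (a - w * i - b)**2 }.sum / n
end

puts e.call(2.0, 3.0)  # the true parameters give zero error
puts e.call(1.0, 1.0)  # any other pair gives a positive error
```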
After that, the partial derivative of the error function with respect to the bias, ∂e/∂b, is:
∂e/∂b = −(2/n)·(−w·(i₁ + i₂ + …) − n·b + a₁ + a₂ + …)
And the partial derivative of the error function with respect to the weight, ∂e/∂w, is:
∂e/∂w = −(2/n)·(−w·(i₁² + i₂² + …) − b·(i₁ + i₂ + …) + a₁·i₁ + a₂·i₂ + …)
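As a quick sanity check on these two closed forms, a central finite difference of e should agree with them. The data and the point (w, b) below are arbitrary illustrations:

```ruby
# Compare the analytic partial derivatives with central finite differences.
inputs  = [1.0, 2.0, 3.0]
answers = [5.0, 7.0, 9.0]
n = inputs.size

e = ->(w, b) { inputs.zip(answers).map { |i, a| (a - w * i - b)**2 }.sum / n }

w, b, h = 0.5, 0.2, 1e-5

# Analytic forms from the text
de_db = -2.0 / n * (answers.sum - w * inputs.sum - n * b)
de_dw = -2.0 / n * (answers.zip(inputs).map { |a, i| a * i }.sum -
                    w * inputs.map { |i| i * i }.sum - b * inputs.sum)

# Central finite differences
fd_db = (e.call(w, b + h) - e.call(w, b - h)) / (2 * h)
fd_dw = (e.call(w + h, b) - e.call(w - h, b)) / (2 * h)

puts (de_db - fd_db).abs  # close to zero
puts (de_dw - fd_dw).abs  # close to zero
```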
Now that we have the partial derivatives of the error function, we can find its extrema by finding where the derivatives are equal to zero. ∂e/∂b = 0 when:
w·(i₁ + i₂ + …) + n·b = a₁ + a₂ + …
And ∂e/∂w = 0 when:
w·(i₁² + i₂² + …) + b·(i₁ + i₂ + …) = a₁·i₁ + a₂·i₂ + …
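These two conditions form a linear system in w and b, so in this single-weight case they could also be solved simultaneously rather than iteratively. A sketch using Cramer's rule, with illustrative sample data:

```ruby
# Solve  w*S_i + n*b = S_a  and  w*S_ii + b*S_i = S_ai  directly,
# where S_i, S_ii, S_a, S_ai are the sums appearing in the two conditions.
inputs  = [1.0, 2.0, 3.0]
answers = [5.0, 7.0, 9.0]  # generated from a = 2i + 3
n = inputs.size

s_i  = inputs.sum
s_ii = inputs.map { |i| i * i }.sum
s_a  = answers.sum
s_ai = answers.zip(inputs).map { |a, i| a * i }.sum

det = s_i * s_i - n * s_ii
w = (s_a * s_i - n * s_ai) / det
b = (s_i * s_ai - s_ii * s_a) / det

puts w  # recovers the slope 2
puts b  # recovers the intercept 3
```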
Then, to find the new best-fitting weight and bias we use these formulas:
b = (1/n)·((a₁ + a₂ + …) − w·(i₁ + i₂ + …))
w = ((a₁·i₁ + a₂·i₂ + …) − b·(i₁ + i₂ + …)) / (i₁² + i₂² + …)
The important part: when computing the new weight and bias, you need to use the old values of w and b on the right-hand sides, and repeat the update for several iterations.
Implementation in Code
Here is an implementation of these formulas in Ruby code:
inputs  = (0...1000).map { |k| k * 0.1 - 50 }  # synthetic inputs in [-50, 49.9]
answers = inputs.map { |i| i * 3 + 4 }         # targets from the line a = 3i + 4
epochs = 5
n = inputs.size
w = 1.0  # initial weight
b = 1.0  # initial bias

epochs.times do
  # Both updates use the OLD w and b from the previous iteration.
  new_b = (answers.sum - w * inputs.sum) / n
  new_w = (answers.zip(inputs).map { |a, i| a * i }.sum - b * inputs.sum) /
          inputs.map { |i| i**2 }.sum
  w = new_w
  b = new_b
end
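With this synthetic data (answers generated from a = 3i + 4), five iterations are enough to recover the true parameters. A self-contained check:

```ruby
inputs  = (0...1000).map { |k| k * 0.1 - 50 }
answers = inputs.map { |i| i * 3 + 4 }
n = inputs.size
w, b = 1.0, 1.0

5.times do
  # New values are computed from the old w and b, then assigned together.
  new_b = (answers.sum - w * inputs.sum) / n
  new_w = (answers.zip(inputs).map { |a, i| a * i }.sum - b * inputs.sum) /
          inputs.map { |i| i**2 }.sum
  w, b = new_w, new_b
end

puts w  # approaches the true slope 3
puts b  # approaches the true intercept 4
```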
Once you understand the one-weight algorithm (or even if you don't), you can move on to the multi-parameter algorithm.
OPE Prop formulas on this website are licensed under the CC BY-SA 4.0 License. More details here