hello, andersbll:
Thanks for your code; it has been very useful to me.
While reading it, I have a question about line 68 in layers.py:
self.dW = np.dot(self.last_input.T, output_grad)/n - self.weight_decay*self.W
For L2 regularization, I think this needs to be modified to
self.dW = np.dot(self.last_input.T, output_grad)/n + self.weight_decay*self.W
Could you explain why you use "- self.weight_decay*self.W"?
B.R.
heibanke
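For context, here is a minimal sketch of the usual L2 (weight decay) convention; the function name and variables below are illustrative and are not taken from layers.py:

```python
import numpy as np

def l2_regularized_grad(last_input, output_grad, W, weight_decay):
    """Gradient of data_loss(W) + 0.5 * weight_decay * ||W||^2 with respect to W."""
    n = last_input.shape[0]
    data_grad = np.dot(last_input.T, output_grad) / n
    # The penalty 0.5 * weight_decay * ||W||^2 differentiates to +weight_decay * W,
    # so the regularization term enters with a "+" when dW is the gradient of the
    # loss and the parameter update is W -= learning_rate * dW.
    return data_grad + weight_decay * W
```

The sign only has meaning relative to how dW is applied: assuming the standard update W -= learning_rate * dW, the penalty term must enter with "+"; a "-" there would make the weights grow instead of decay.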
One of the reasons tanh and sigmoid are popular as activation functions is that their derivatives can be computed from the output of the forward pass without evaluating an expensive function again, but that is not done here.
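To illustrate that remark, a small sketch (not taken from this repository) of activation layers whose backward pass reuses the cached forward output:

```python
import numpy as np

class Tanh:
    def forward(self, x):
        self.last_output = np.tanh(x)      # cache the activation
        return self.last_output

    def backward(self, output_grad):
        # d/dx tanh(x) = 1 - tanh(x)^2, computed from the cached output
        return output_grad * (1.0 - self.last_output ** 2)

class Sigmoid:
    def forward(self, x):
        self.last_output = 1.0 / (1.0 + np.exp(-x))   # cache the activation
        return self.last_output

    def backward(self, output_grad):
        # d/dx sigmoid(x) = sigmoid(x) * (1 - sigmoid(x))
        return output_grad * self.last_output * (1.0 - self.last_output)
```

Caching self.last_output in the forward pass means backpropagation needs only cheap elementwise arithmetic, with no second call to np.tanh or np.exp.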