chainer.functions.softmax_cross_entropy

chainer.functions.softmax_cross_entropy(x, t, normalize=True, cache_score=True, class_weight=None, ignore_label=-1, reduce='mean', enable_double_backprop=False)[source]

Computes cross entropy loss for pre-softmax activations.
Parameters:
- x (Variable or numpy.ndarray or cupy.ndarray) – Variable holding a multidimensional array whose elements indicate unnormalized log probabilities: the first axis of the variable represents the number of samples, and the second axis represents the number of classes. This function computes the usual softmax cross entropy if the number of dimensions is equal to 2; if the number of dimensions is greater than 2, it computes the cross entropy of the replicated softmax (a sketch appears after the example below).
- t (Variable or numpy.ndarray or cupy.ndarray) – Variable holding a signed integer vector of ground truth labels. If t[i] == ignore_label, the corresponding x[i] is ignored.
- normalize (bool) – If True, this function normalizes the cross entropy loss across all instances. If False, it normalizes the loss only by the batch size.
- cache_score (bool) – If True, the function stores the result of the forward computation and reuses it in the backward computation, which reduces computational cost at the expense of memory. If the enable_double_backprop option is True, this option is forcibly turned off and the function does not cache the intermediate value.
- class_weight (Variable or numpy.ndarray or cupy.ndarray) – An array of constant weights that are multiplied with the loss values along the second dimension. The shape of this array should be (x.shape[1],). If this is not None, each class weight class_weight[i] is multiplied with y[:, i], the corresponding log-softmax output of x (which has the same shape as x), before the actual loss value is calculated (see the class_weight sketch after the example below).
- ignore_label (int) – Label value you want to ignore. Its default value is -1. See the description of the argument t.
- reduce (str) – A string that determines whether to reduce the loss values. If it is 'mean', this function computes the sum of the individual cross entropies and normalizes it according to the normalize option. If it is 'no', it computes the cross entropy for each instance and does not normalize it (the normalize option is ignored). In this case, the loss value of an ignored instance, i.e. one whose target value is ignore_label, is set to 0 (see the reduce='no' sketch after the example below).
- enable_double_backprop (bool) – If True, this function uses an implementation that supports higher-order differentiation. If False, it uses a single-backprop implementation. The single-backprop version is used by default because it is expected to be faster, so if you need second or higher derivatives, you have to turn this option on explicitly (see the double-backprop sketch after the example below).
Returns: A variable holding an array of the cross entropy loss. If reduce is 'mean', it is a scalar array. If reduce is 'no', the shape is the same as that of t.
Return type: Variable
Note
This function is differentiable only with respect to x.

Example
>>> x = np.array([[-1, 0, 1, 2], [2, 0, 1, -1]]).astype('f')
>>> x
array([[-1.,  0.,  1.,  2.],
       [ 2.,  0.,  1., -1.]], dtype=float32)
>>> t = np.array([3, 0]).astype('i')
>>> t
array([3, 0], dtype=int32)
>>> y = F.softmax_cross_entropy(x, t)
>>> y
variable(0.44018972)
>>> log_softmax = -F.log_softmax(x)
>>> expected_loss = np.mean([log_softmax[row, column].data for row, column in enumerate(t)])
>>> y.array == expected_loss
True