chainer.functions.spatial_pyramid_pooling_2d

chainer.functions.spatial_pyramid_pooling_2d(x, pyramid_height, pooling=None)

Spatial pyramid pooling function.
It outputs a fixed-length vector regardless of the input feature map size.

It applies pooling to the input 4D array x with different kernel sizes and padding sizes, then flattens every dimension of each pooling result except the first, and finally concatenates the results along the second dimension.

At the \(i\)-th pyramid level, the kernel size \((k_h^{(i)}, k_w^{(i)})\) and padding size \((p_h^{(i)}, p_w^{(i)})\) of the pooling operation are calculated as follows:
\[\begin{split}k_h^{(i)} &= \lceil b_h / 2^i \rceil, \\ k_w^{(i)} &= \lceil b_w / 2^i \rceil, \\ p_h^{(i)} &= (2^i k_h^{(i)} - b_h) / 2, \\ p_w^{(i)} &= (2^i k_w^{(i)} - b_w) / 2,\end{split}\]

where \(\lceil \cdot \rceil\) denotes the ceiling function, and \(b_h, b_w\) are the height and width of the input variable x, respectively. Note that the pyramid level index \(i\) is zero-based.

For details, see the paper: Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition.
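To make the kernel and padding formulas concrete, the following is a minimal sketch in plain Python that evaluates them for each pyramid level. The helper name `spp_level_params` is hypothetical and is not part of Chainer. It follows the formulas exactly as written, so the padding can come out as a half-integer; how an actual implementation rounds that value is not stated here.

```python
import math


def spp_level_params(b_h, b_w, pyramid_height):
    """Evaluate the kernel/padding formulas for levels 0..pyramid_height-1.

    Hypothetical helper for illustration only; not a Chainer function.
    Returns a list of (k_h, k_w, p_h, p_w) tuples, one per level.
    """
    params = []
    for i in range(pyramid_height):
        bins = 2 ** i  # the formulas divide each spatial axis by 2^i
        k_h = math.ceil(b_h / bins)
        k_w = math.ceil(b_w / bins)
        # Padding as written in the formulas; may be a half-integer.
        p_h = (bins * k_h - b_h) / 2
        p_w = (bins * k_w - b_w) / 2
        params.append((k_h, k_w, p_h, p_w))
    return params


# An 8x8 feature map divides evenly, so no padding is needed at any level.
even = spp_level_params(8, 8, 3)

# A 13x13 feature map does not divide evenly at levels 1 and 2.
odd = spp_level_params(13, 13, 3)
```

With a 13x13 input, level 1 gives a kernel of 7 and level 2 a kernel of 4, and the padding covers the difference between \(2^i k^{(i)}\) and the input size.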
- Parameters
- Returns
Output variable. The shape of the output variable will be \((batchsize, c \sum_{h=0}^{H-1} 2^{2h}, 1, 1)\), where \(c\) is the number of channels of the input variable x and \(H\) is the number of pyramid levels.
- Return type
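As a worked illustration of the output shape formula given under Returns, the sketch below computes \(c \sum_{h=0}^{H-1} 2^{2h}\) in plain Python. The helper name `spp_output_shape` is hypothetical and is not part of Chainer; it only evaluates the formula and performs no pooling.

```python
def spp_output_shape(batchsize, c, pyramid_height):
    """Evaluate the documented output shape formula.

    Hypothetical helper for illustration only; not a Chainer function.
    """
    # The sum over h = 0..H-1 of 2^(2h), i.e. 1 + 4 + 16 + ...
    n_bins = sum(2 ** (2 * h) for h in range(pyramid_height))
    return (batchsize, c * n_bins, 1, 1)


# 256 channels and 3 pyramid levels: 256 * (1 + 4 + 16) = 5376.
shape = spp_output_shape(2, 256, 3)
```

The result depends only on the batch size, the channel count, and the number of pyramid levels, which is why the output length is fixed regardless of the input height and width.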