This post explains a neural network operation called depthwise separable convolution, which trades a small loss in accuracy for a large reduction in computation and latency.
Depthwise separable convolution factorizes a standard convolution into a depthwise convolution and a pointwise convolution. The depthwise convolution captures spatial information within each feature-map channel, and the pointwise convolution combines this information across all channels.
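To make the factorization concrete, here is a minimal PyTorch sketch (the module name and default hyperparameters are illustrative, not taken from any particular network):

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """Depthwise separable convolution: depthwise conv followed by a 1x1 pointwise conv."""
    def __init__(self, in_channels, out_channels, kernel_size=3, stride=1, padding=1):
        super().__init__()
        # Depthwise: one filter per input channel (groups=in_channels), spatial filtering only.
        self.depthwise = nn.Conv2d(in_channels, in_channels, kernel_size,
                                   stride=stride, padding=padding, groups=in_channels)
        # Pointwise: 1x1 convolution that mixes information across channels.
        self.pointwise = nn.Conv2d(in_channels, out_channels, kernel_size=1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

x = torch.randn(1, 32, 56, 56)          # (batch, channels, height, width)
y = DepthwiseSeparableConv(32, 64)(x)   # -> shape (1, 64, 56, 56)
```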
This post introduces the paper ‘Gradient Harmonized Single-stage Detector’, which tries to tackle the hard example mining problem. At the end, we compare GHM with Focal Loss.
This paper tries to handle the long-standing and well-known problems of one-stage detectors: the imbalance between the numbers of positive and negative examples, as well as that between easy and hard examples.
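For context, Focal Loss addresses this imbalance by down-weighting well-classified (easy) examples with a modulating factor (1 - p_t)^γ. A minimal PyTorch sketch of the binary form, with illustrative parameter values, is shown below; it is not the GHM loss itself:

```python
import torch
import torch.nn.functional as F

def binary_focal_loss(logits, targets, gamma=2.0, alpha=0.25):
    """Binary focal loss: cross-entropy scaled by (1 - p_t)**gamma to down-weight easy examples."""
    ce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p = torch.sigmoid(logits)
    p_t = p * targets + (1 - p) * (1 - targets)              # probability of the true class
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)  # class-balancing weight
    return (alpha_t * (1 - p_t) ** gamma * ce).mean()

logits = torch.randn(8)
targets = torch.randint(0, 2, (8,)).float()
loss = binary_focal_loss(logits, targets)
```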
This post introduces conditional log-likelihood and mean squared error and shows the relationship between these two performance measures.
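The core relationship can be stated up front: if the model's prediction \hat{y} is taken to be the mean of a Gaussian conditional distribution with fixed variance, then maximizing the conditional log-likelihood over the training set is equivalent to minimizing the mean squared error. A standard sketch of this result, with n denoting the number of training examples:

$$
p(y \mid \bold{x}; \theta) = \mathcal{N}\big(y;\ \hat{y}(\bold{x}, \theta),\ \sigma^2\big)
$$

$$
\sum_{i=1}^{n} \log p\big(y^{(i)} \mid \bold{x}^{(i)}; \theta\big)
= -n\log\sigma - \frac{n}{2}\log(2\pi) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}\big(\hat{y}^{(i)} - y^{(i)}\big)^2
$$

Since \sigma is fixed, the only term that depends on \theta is the sum of squared errors, so maximizing the conditional log-likelihood is the same as minimizing \mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\big(\hat{y}^{(i)} - y^{(i)}\big)^2.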
For many machine learning problems, especially supervised learning, the goal is to build a system that can take a vector \bold{x}\in{\mathbb{R}^m} as input and predict the value of a scalar y\in{\mathbb{R}} as its output. To solve such problems, we can follow these steps (a small code sketch follows the list):
choose a model that predicts \hat{y} when given an input \bold{x}
estimate the weights \theta such that \hat{y} = f(\theta, \bold{x})
define the performance measure (also known as the loss function or cost function)
design an algorithm that improves the weights \theta in a way that reduces the loss function
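As a concrete illustration of these steps, here is a minimal sketch using a linear model, MSE as the performance measure, and plain gradient descent; all names, data, and hyperparameters are illustrative, not from the post:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: x in R^3, y is a noisy linear function of x.
X = rng.normal(size=(100, 3))
true_theta = np.array([2.0, -1.0, 0.5])
y = X @ true_theta + 0.1 * rng.normal(size=100)

# Steps 1-2: choose a model y_hat = f(theta, x) = x . theta and initialize the weights.
theta = np.zeros(3)

# Step 3: performance measure, here the mean squared error.
def mse(theta, X, y):
    return np.mean((X @ theta - y) ** 2)

# Step 4: an algorithm that improves theta by reducing the loss (gradient descent).
lr = 0.1
for _ in range(200):
    grad = 2.0 / len(y) * X.T @ (X @ theta - y)   # gradient of the MSE w.r.t. theta
    theta -= lr * grad

print(mse(theta, X, y), theta)   # loss close to the noise level, theta close to true_theta
```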