
This paper proposes a supervision signal, the additive angular margin (ArcFace), which has a clearer geometric interpretation than the supervision signals proposed before it.
Moreover, the paper surveys many recent face recognition models and the loss functions they use, giving a clear overview of the area.

- Three primary attributes make the learned embeddings differ:

  1. Training data
  2. Network architecture
  3. Loss functions

- Loss Functions -- From Softmax to ArcFace
Softmax (only aims to classify well):

It does not explicitly optimise the embeddings to give higher similarity scores to positive pairs and lower scores to negative pairs, which leaves a performance gap.
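As a minimal sketch of this point (the names `W`, `x` and all numbers here are illustrative, not taken from the paper), plain softmax cross-entropy operates directly on the logits `W @ x`, with no term that pulls positive pairs together or pushes negative pairs apart:

```python
import numpy as np

def softmax_cross_entropy(logits, label):
    # Numerically stable log-softmax followed by negative log-likelihood.
    z = logits - logits.max()
    log_probs = z - np.log(np.exp(z).sum())
    return -log_probs[label]

x = np.array([0.6, 0.8])             # an embedding (illustrative values)
W = np.array([[1.0, 0.0],            # class-0 weight vector
              [0.0, 1.0]])           # class-1 weight vector
logits = W @ x                       # no normalisation, no margin of any kind
loss = softmax_cross_entropy(logits, label=1)
```

Nothing in this objective constrains how close same-class embeddings end up to each other, only that the correct class gets the largest logit.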

Weights Normalisation:

L2 weight normalisation only improves little on performance.

Multiplicative Angular Margin:

The additional dynamic hyper-parameter λ will make the training of SphereFace relatively tricky.
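The multiplicative margin can be sketched as follows (a minimal illustration, not the full SphereFace implementation, which also needs the λ annealing schedule mentioned above):

```python
import numpy as np

# SphereFace-style multiplicative angular margin (illustrative sketch):
# the target logit cos(theta) is replaced by cos(m * theta), with integer m.
def multiplicative_margin_logit(cos_theta, m=4):
    theta = np.arccos(np.clip(cos_theta, -1.0, 1.0))
    return np.cos(m * theta)
```

Note that cos(m·θ) is not monotonic in θ over [0, π] when m > 1, which is part of why training needs the extra tricks and makes the method comparatively hard to tune.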

Feature Normalisation:

L2 normalisation on features and weights is an important step for hypersphere metric learning. The intuitive insight behind feature and weight normalisation is to remove the radial variation and push every feature to distribute on a hypersphere manifold.
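A hedged sketch of this normalisation step (the scale value `s = 64.0` and the toy vectors are illustrative assumptions, not the paper's experiments): after normalising both sides, the logit is purely the cosine of the angle between the feature and the class weight, re-scaled by s.

```python
import numpy as np

def normalised_logits(x, W, s=64.0):
    x = x / np.linalg.norm(x)                         # push feature onto the unit hypersphere
    W = W / np.linalg.norm(W, axis=1, keepdims=True)  # each class weight to unit norm
    return s * (W @ x)                                # logits become s * cos(theta_j)

x = np.array([3.0, 4.0])                # feature magnitude no longer matters
W = np.array([[1.0, 0.0],
              [0.6, 0.8]])
logits = normalised_logits(x, W)
```

Because the radial component is removed, the optimisation pressure falls entirely on the angles, which is what makes the margin-based losses below well defined.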

Additive Cosine Margin:

(1) Extremely easy to implement, with no tricky hyper-parameters
(2) Clearer, and able to converge without Softmax supervision
(3) Obvious performance improvement
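The simplicity claimed above can be seen in a sketch of the loss (CosFace-style; this assumes features and weights are already normalised so the logits are cosines, and the `s`, `m` values are common defaults stated here as assumptions):

```python
import numpy as np

def additive_cosine_margin_loss(cosines, label, s=64.0, m=0.35):
    logits = s * cosines
    logits[label] = s * (cosines[label] - m)   # subtract margin from the target cosine only
    z = logits - logits.max()                  # stable log-sum-exp
    return -(z[label] - np.log(np.exp(z).sum()))
```

The margin is a single subtraction in cosine space, so there is no need for the integer-valued m or annealing schedule that the multiplicative margin requires.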

Additive Angular Margin:
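ArcFace adds the margin to the angle itself rather than to the cosine: the target logit becomes cos(θ_yi + m). A minimal sketch (again assuming L2-normalised features and weights; `m = 0.5` is the commonly used default, stated here as an assumption):

```python
import numpy as np

# ArcFace-style additive angular margin (sketch): m is added to the angle,
# so the penalty corresponds to a constant geodesic margin on the hypersphere.
def arcface_logit(cos_theta, m=0.5):
    theta = np.arccos(np.clip(cos_theta, -1.0, 1.0))
    return np.cos(theta + m)
```

Because the margin lives directly in angle (geodesic distance) space, it is constant over the whole interval, which is the cleaner geometric interpretation the paper argues for.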
