Abstract
While most deep learning architectures are built on convolution, alternative foundations such as mathematical morphology are being explored for their interpretability and their connection to the analysis and processing of geometric structures. The morphological hit-or-miss operation has the advantage that it takes into account both foreground and background information when evaluating a target shape in an image. Herein, we identify limitations in existing neural definitions of the hit-or-miss transform, and we formulate an optimization problem for learning the transform within deeper architectures. To this end, we model the semantically important condition that the intersection of the hit and miss structuring elements (SEs) be empty, and we present a way to express Don't Care (DNC), which denotes regions of an SE that are not relevant to detecting a target pattern. Our analysis shows that convolution in fact acts like a hit-or-miss transform under a semantic interpretation of its filter differences. Building on this observation, we introduce an extension that outperforms conventional convolution on benchmark data. Quantitative experiments on synthetic and benchmark data show that the direct-encoding hit-or-miss transform yields more interpretable learned shapes, consistent with target objects, whereas our morphologically inspired generalized convolution achieves higher classification accuracy. Finally, qualitative visualizations of the learned hit and miss filters are provided for a single morphological layer.
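For context, the classical binary hit-or-miss transform underlying these ideas can be written as follows; this is the standard morphological definition, given here as a reference sketch rather than the specific learnable formulation introduced in this work, with the SE pair notation $(B_h, B_m)$ chosen for illustration:
\[
X \circledast (B_h, B_m) \;=\; (X \ominus B_h) \cap (X^{c} \ominus B_m), \qquad B_h \cap B_m = \emptyset,
\]
where $\ominus$ denotes erosion, $X^{c}$ is the complement (background) of $X$, and $B_h \cap B_m = \emptyset$ is the empty-intersection condition on the hit and miss SEs referenced above; pixels belonging to neither $B_h$ nor $B_m$ play the role of DNC.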