By passing millions of ImageNet images through InceptionV1 (state-of-the-art deep convolutional neural network) we can extract the image patches that make specific neurons from various convolutional layers to activate mostly.

By projecting the image patches to 2D using UMAP we can see what the neural network “sees” at the various layers.

This is a great way for explaining how a computer vision model makes its classification decision.

However, the following part of the article was the reason for my post:

“….There is another phenomenon worth noting: not only are concepts being refined as you move from layer to layer, but new concepts seem to be appearing out of combinations of old ones….”

This is how a world of complexity works.

We know that deep neural networks perform hierarchical feature learning and combine simpler features to learn more complex ones. This is one of the reasons why we use deep learning for audio, visual and textual data.

Deep learning can decompose the complexity of data!