Tag Archive: computer vision


By passing millions of ImageNet images through InceptionV1 (a deep convolutional neural network), we can extract the image patches that most strongly activate specific neurons in various convolutional layers.
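As a rough illustration of the "most strongly activating patches" idea, here is a minimal sketch in plain Python. It assumes the activations have already been computed and stored per patch. The dictionary layout, the patch ids and the function name `top_activating_patches` are hypothetical choices for this example, not something taken from the article.

```python
import heapq

def top_activating_patches(activations, neuron, k=3):
    """Return the ids of the k patches with the largest activation for one neuron.

    `activations` maps a patch id to a list of per-neuron activation values
    (a hypothetical layout chosen only for this sketch).
    """
    return heapq.nlargest(k, activations, key=lambda pid: activations[pid][neuron])

# Toy data: 4 patches, 2 neurons (made-up numbers, not real network outputs).
activations = {
    "patch_a": [0.1, 0.9],
    "patch_b": [0.7, 0.2],
    "patch_c": [0.4, 0.8],
    "patch_d": [0.9, 0.1],
}

top_neuron0 = top_activating_patches(activations, neuron=0, k=2)
top_neuron1 = top_activating_patches(activations, neuron=1, k=2)
```

In a real setting the activation values would come from a forward pass through the network, and the selection would be repeated for each layer of interest.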

By projecting the image patches to 2D using UMAP, we can see what the neural network “sees” at the various layers.

This is a great way to explain how a computer vision model makes its classification decisions.

However, the following part of the article was the reason for my post:

“….There is another phenomenon worth noting: not only are concepts being refined as you move from layer to layer, but new concepts seem to be appearing out of combinations of old ones….”

This is how a world of complexity works.

We know that deep neural networks perform hierarchical feature learning and combine simpler features to learn more complex ones. This is one of the reasons why we use deep learning for audio, visual and textual data.

Deep learning can decompose the complexity of data!

Conditional Language Models are not used only in Text Summarization and Machine Translation. They can also be used for Image Captioning!

Here is a great example from Machine Learning Mastery of how we can connect the Feature Extraction component of a SOTA Computer Vision model (e.g., VGG, ResNet, Inception, Xception, etc.) to the input of a Language Model in order to generate the caption of an image.

The whole deep learning architecture can be trained end-to-end. It is a simple encoder-decoder architecture but it can be extended and improved using an attention interface between encoder and decoder, or even using Transformer layers!
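To make the encoder-decoder idea a bit more concrete, here is a minimal sketch of greedy caption decoding in plain Python. It only shows the control flow: the language model is asked for the next word, conditioned on the image features and on the words generated so far. The function `toy_model` is a hypothetical, hard-coded stand-in for a trained decoder, and the `<start>`/`<end>` tokens and the tiny vocabulary are assumptions made for this example.

```python
def greedy_caption(image_features, next_word_scores, max_len=10):
    """Generate a caption word by word, always picking the highest-scoring word.

    `next_word_scores(image_features, words)` must return a dict that maps
    each candidate word to a score (hypothetical interface for this sketch).
    """
    words = ["<start>"]
    for _ in range(max_len):
        scores = next_word_scores(image_features, words)
        best = max(scores, key=scores.get)
        if best == "<end>":
            break
        words.append(best)
    return words[1:]  # drop the <start> token

def toy_model(image_features, words):
    """Hard-coded stand-in for a trained decoder (NOT a real model)."""
    vocab = ["a", "dog", "cat", "runs", "sits", "<end>"]
    # Pretend the first feature tells us whether the image shows a dog.
    if image_features[0] > 0.5:
        script = ["a", "dog", "runs", "<end>"]
    else:
        script = ["a", "cat", "sits", "<end>"]
    target = script[len(words) - 1]
    return {w: (1.0 if w == target else 0.0) for w in vocab}

caption = greedy_caption([0.9, 0.1], toy_model)
```

In a trained system the image features would come from the CNN encoder and the scores from the language model's output layer. Greedy selection is used here only because it is the simplest decoding strategy.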

Adding attention not only enables the model to attend to different parts of the input image but also helps explain its decisions. For each generated word in the output caption, we can visualize the part of the input image that the model attended to.
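The attention step can be sketched as follows, assuming a simple soft-attention scheme: each image region gets a relevance score for the current word, the scores are normalized with a softmax, and the resulting weights are used both to build a weighted average of the region features and as the values one would plot as a heatmap over the image. The region features and scores below are made-up numbers, and the function names are hypothetical.

```python
import math

def softmax(scores):
    """Turn arbitrary scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(region_features, scores):
    """Return (attention weights, context vector) for one decoding step."""
    weights = softmax(scores)
    dim = len(region_features[0])
    context = [
        sum(w * feat[d] for w, feat in zip(weights, region_features))
        for d in range(dim)
    ]
    return weights, context

# Three image regions with 2-dimensional features (toy values).
region_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Relevance of each region for the word being generated (toy values).
scores = [0.1, 2.0, 0.3]

weights, context = attend(region_features, scores)
most_attended_region = max(range(len(weights)), key=weights.__getitem__)
```

The `weights` list is what would be visualized for each generated word: the region with the largest weight is the part of the image the model relied on most at that step.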

PAPER

E. Chatzikyriakidis, C. Papaioannidis and I. Pitas, “Adversarial Face De-Identification,” 2019 IEEE International Conference on Image Processing (ICIP), Taipei, Taiwan, 2019, pp. 684-688

PRESENTER

Anastasios Tefas

PDF

https://bit.ly/2WtJAmx

Presentation topic: “Content-based Image Retrieval”

Presenter: Efstathios Chatzikyriakidis

PDF presentation: https://bit.ly/39FVpNg

Source code: https://bit.ly/38Q5XKv

Presentation topic: “Adversarial Face De-identification”

Presenter: Efstathios Chatzikyriakidis

PDF presentation: https://bit.ly/3ik1nHC

Experiments (exported files): https://bit.ly/3qrzdx7

Presentation topic: “Adversarial Examples and Generative Adversarial Networks”

Presenter: Efstathios Chatzikyriakidis

Contributor: Christos Papaioannidis

PDF presentation: https://bit.ly/2Qw7QBB