0

I'm trying to understand a basic conceptual difference between CNN architecture and an RCNN architecture when talking in reference to Image/ computer vision.

Correct me if I'm wrong but to my understand CNN provides spatial invariance(location) and RNN provides temporal invariance(time).

Shiva Reddy
  • 39
  • 1
  • 5
  • invariance? what is this terms referring to? – Gordon Jan 18 '18 at 10:02
  • 1
    @Gordon You can intuitively view "spatial invariance" as something like "not caring about where in space it is, just that it's somewhere". Different filters in a CNN can recognize different kinds of shapes (e.g. dogs) in images regardless of where those shapes/dogs are in the image. This is because the same filter (with the same set of learned weights) "slides" over the entire image, looking for the same features in all kinds of different small sections of the image. – Dennis Soemers Jan 18 '18 at 11:21
  • Possible duplicate of [What's the difference between convolutional and recurrent neural networks?](https://stackoverflow.com/questions/20923574/whats-the-difference-between-convolutional-and-recurrent-neural-networks) – Maxim Jan 18 '18 at 12:37

1 Answers1

8

You are mixing up different concepts. A RNN is not the same as a R-CNN.

A RNN is a Recurrent Neural Network, which is a class of artificial neural network where connections between units form a directed cycle. This allows it to exhibit dynamic temporal behavior. The following image shows a simple representation of a RNN Cell.

This is a RNN Cell

A R-CNN is a Region-based Convolutional Neural Network. It is a visual object detection system that combines bottom-up region proposals with rich features computed by a convolutional neural network. Casually said the R-CNN proposes a bunch of boxes in the image and see if any of them actually correspond to an object. It computes these proposal regions with a selective search algorithm. The following image shows the architecture of a R-CNN:

enter image description here

So, to answer your question: A R-CNN is simply an extension of a CNN with a focus on object detection, while "normal" CNNs are usually used for image classification.

ITiger
  • 1,056
  • 3
  • 11
  • 24