Images and Videos

As data types, images and videos are digital representations of visual perception. An image has two spatial dimensions and one or more spectral bands; for example, an image taken by a digital camera usually has three color channels: red, green, and blue. Disciplines such as computer vision and photogrammetry aim to extract information from images, which gives rise to several image-related machine learning tasks. Image classification assigns a class to an entire image, e.g. does the image show a cat or a dog? Pixel-wise classification, also referred to as scene parsing or scene labeling, assigns a class to every pixel. More sophisticated tasks include object detection, which detects multiple, possibly overlapping objects, and instance segmentation, which is an object-aware form of segmentation.

Image analysis based on machine learning currently receives a great deal of attention, for two main reasons. On the one hand, the amount of image data is tremendous and continuously growing; on the other hand, images can be a powerful interface between computers, humans, and the real world.

Videos are essentially time series of images. The same tasks apply to them, with the additional aspect of temporal consistency. In addition, new video-specific tasks such as motion or gesture detection are attracting researchers' attention.
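The data layout described above can be illustrated with a minimal sketch. It uses plain nested Python lists so that it runs without any dependencies; in practice an array library would normally be used instead. The helper names `make_image` and `shape`, the 4 x 6 image size, and the label strings are assumptions chosen purely for illustration.

```python
# Illustrative only: nested lists stand in for the array types used in practice.

def make_image(height, width, channels=3, fill=0):
    """Return an image as nested lists indexed [row][column][channel]."""
    return [[[fill] * channels for _ in range(width)] for _ in range(height)]

def shape(data):
    """Return the dimensions of a regularly nested, non-empty list as a tuple."""
    dims = []
    while isinstance(data, list):
        dims.append(len(data))
        data = data[0]
    return tuple(dims)

# An RGB image: 4 rows, 6 columns, 3 color channels (red, green, blue).
image = make_image(height=4, width=6)
image[1][2] = [255, 0, 0]  # set the pixel at row 1, column 2 to pure red

# Image classification: a single label for the whole image.
image_label = "cat"

# Pixel-wise classification: one label per pixel, so the label map
# has the same height and width as the image but no channel dimension.
label_map = [["background"] * 6 for _ in range(4)]
label_map[1][2] = "cat"

# Video: a time series of images, which adds a leading time dimension.
video = [make_image(4, 6) for _ in range(5)]

print(shape(image))      # (4, 6, 3)
print(shape(label_map))  # (4, 6)
print(shape(video))      # (5, 4, 6, 3)
```

The shapes make the difference between the tasks visible: image classification produces one value per image, pixel-wise classification produces one value per pixel, and a video simply stacks images along a time axis.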



Online Resources