The present invention is a method and system for selectively executing content on a display based on the automatic recognition of predefined characteristics, including visually perceptible attributes such as the demographic profile of people, who are identified automatically from a sequence of image frames in a video stream. The present invention detects the images of an individual or a group of people in the captured images. It then automatically extracts, in real time, the visually perceptible attributes of the individual or the people from those images, including demographic information, local behavior analysis, and emotional status. The visually perceptible attributes further comprise height, skin color, hair color, the number of people in the scene, the time spent by the people, and whether a person looked at the display. Targeted media is selected from a set of media pools according to the automatically extracted visually perceptible attributes and the feedback from the people.
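The selection step described above can be illustrated with a minimal sketch. It is not the claimed implementation: the names `AudienceAttributes`, `MediaItem`, `match_score`, and `select_media`, the attribute fields, and the scoring rule are all hypothetical. The sketch assumes the attributes have already been extracted from the image frames, and it does not model the feedback from the people.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AudienceAttributes:
    """Hypothetical container for attributes already extracted from frames."""
    gender: str              # e.g. "female", "male"
    age_group: str           # e.g. "child", "adult", "senior"
    num_people: int          # number of people in the scene
    looked_at_display: bool  # whether a person looked at the display


@dataclass
class MediaItem:
    """Hypothetical media entry; an empty `targets` dict targets no one in particular."""
    name: str
    targets: dict = field(default_factory=dict)


def match_score(item: MediaItem, attrs: AudienceAttributes) -> int:
    """Return the number of matching target attributes, or -1 on any conflict."""
    score = 0
    for key, wanted in item.targets.items():
        if getattr(attrs, key) != wanted:
            return -1
        score += 1
    return score


def select_media(pools: dict, attrs: AudienceAttributes, default: MediaItem) -> MediaItem:
    """Pick the item with the most matching targets across all pools.

    The default item is kept unless some item matches at least one target.
    """
    best, best_score = default, 0
    for items in pools.values():
        for item in items:
            score = match_score(item, attrs)
            if score > best_score:
                best, best_score = item, score
    return best


if __name__ == "__main__":
    pools = {
        "retail": [
            MediaItem("toy_ad", {"age_group": "child"}),
            MediaItem("cosmetics_ad", {"gender": "female", "age_group": "adult"}),
        ],
        "general": [MediaItem("store_promo")],
    }
    audience = AudienceAttributes("female", "adult", 2, True)
    print(select_media(pools, audience, MediaItem("house_loop")).name)
```

In this sketch a media item is rejected as soon as one of its target attributes conflicts with the audience, and the most specific remaining match wins. A weighting based on viewer feedback could be added to the score, but that is left out here because the text does not specify how feedback is combined with the attributes.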