Sidewalk Video

Original Video	“Difference Image” Video - shows motion	Result Video
sidewalk.avi	sidewalk_absdiff.avi	sidewalk_output.avi

(Yes, the “Difference Image” video may run fast, at 29.97 FPS. The code now corrects this by dynamically matching the destination video's FPS to the source's. I leave this video here only for illustration, since it is half the size of the other video.)

Theory

Mask out the sidewalk region of the image.
For each frame, take a difference image against the starting “background” scene.
When a person enters the scene from the left or right side of the screen, they break the background. Once the background becomes visible again “behind” the person (i.e. the person is bounded and is at least 1 pixel “on-screen”), the person is completely represented.
For the extra credit (differentiating between people and the motorcycle rider): people are limited to two main speeds, “walk” and “run”. Any change in position between frames that is faster than these must be a “ride” action.
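
The first three steps above can be sketched without OpenCV at all. This is a minimal illustration, not the homework code: the `GrayImage` and `Roi` structs and both function names are hypothetical stand-ins for `IplImage`, `cvRect`, `cvAbsDiff` and `cvMinMaxLoc`, and the threshold is whatever the caller passes in (the program further down uses 100).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical minimal grayscale image: row-major, one byte per pixel.
struct GrayImage {
    int width;
    int height;
    std::vector<std::uint8_t> pixels;  // size == width * height
};

// Hypothetical axis-aligned region of interest (the sidewalk mask).
struct Roi {
    int x, y, width, height;
};

// Largest absolute difference between the current frame and the background
// frame, looking only at pixels inside the ROI.
int maxDifferenceInRoi(const GrayImage& current, const GrayImage& background,
                       const Roi& roi) {
    assert(current.width == background.width &&
           current.height == background.height);
    assert(roi.x >= 0 && roi.y >= 0 &&
           roi.x + roi.width <= current.width &&
           roi.y + roi.height <= current.height);
    int maxDiff = 0;
    for (int y = roi.y; y < roi.y + roi.height; ++y) {
        for (int x = roi.x; x < roi.x + roi.width; ++x) {
            const std::size_t idx =
                static_cast<std::size_t>(y) * current.width + x;
            const int diff = std::abs(static_cast<int>(current.pixels[idx]) -
                                      static_cast<int>(background.pixels[idx]));
            if (diff > maxDiff) maxDiff = diff;
        }
    }
    return maxDiff;
}

// The background counts as "broken" when the peak difference exceeds the
// caller-supplied threshold.
bool motionInRoi(const GrayImage& current, const GrayImage& background,
                 const Roi& roi, int threshold) {
    return maxDifferenceInRoi(current, background, roi) > threshold;
}
```

Changes outside the ROI are ignored by construction, which is the whole point of masking out the sidewalk first.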

Messing Around (Feel free to grab the modified code)

I decided to see what would happen using the HMI example as an overlay. Now, although I suppose I could just grab this wholesale, I’ll stick with what I’ve got and go from there (ROI, HMI detection, figure detection).

(Although I’ll tell you what I will use: the realization that a center-of-mass measurement can be taken over time via HMI. And by its motion, I ought to be able to tell if it’s moving at a “riding”, “running”, or “walking” speed. Also, I can maybe get an area (e.g. BWAREA) and make blobs over a certain size “cars” and those under the threshold “people”. And “people” moving at a “riding” speed will yield “motorcycle”.)

So goes the theory, anyway.
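
As a rough sketch of that theory: take the blob's center of mass in two consecutive frames, turn the displacement into a speed, and combine it with the blob's area. Everything here is an assumption for illustration. The struct and function names are made up, and the thresholds are placeholders that would have to be measured from sidewalk.avi.

```cpp
#include <cassert>
#include <cmath>
#include <string>

// Hypothetical center-of-mass position in pixels.
struct Point2 {
    double x, y;
};

// Placeholder thresholds; none of these values come from the real video.
struct Thresholds {
    double vehicleAreaPx;      // blobs at or above this area are vehicles
    double runSpeedPxPerSec;   // at or above this speed: running
    double rideSpeedPxPerSec;  // at or above this speed: riding
};

// Speed of a centroid that moved from prev to curr in one frame.
double speedPxPerSec(Point2 prev, Point2 curr, double fps) {
    assert(fps > 0.0);
    return std::hypot(curr.x - prev.x, curr.y - prev.y) * fps;
}

// Size decides vehicle vs. person; speed splits the person-sized blobs.
std::string classifyBlob(double areaPx, double speed, const Thresholds& t) {
    if (areaPx >= t.vehicleAreaPx) return "vehicle";
    if (speed >= t.rideSpeedPxPerSec) return "motorcycle";
    if (speed >= t.runSpeedPxPerSec) return "running person";
    return "walking person";
}
```

For example, with thresholds of 2000 px area, 60 px/s for running and 150 px/s for riding, a 500 px blob moving at 200 px/s would come out as "motorcycle".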

OpenCV Code that Counts People Walking on the Sidewalk

Note: I tried to use contours and then do SeqPops off the sequence stack, but I just couldn’t get contours to work. So here we have a wildly inaccurate algorithm which finds when there is motion within the scene and then plots a “tracer” accordingly.

/*
 Author:  Chris Pilson, cpilson@iastate.edu
 Program Name: hw3_sidewalk
 Description: This program will read a visual scene (devoid of motion at first)
     and then use a region within this scene to detect people walking or 
     running through the scene.  The people will be counted, and this 
     count presented on-screen.
 High-Level Analysis:
     Tasks:
      DONE - (1) Use background scene as a "baseline"; display rest of video
      against the difference image created by ABSDIFFing(currentFrame,baseFrame)
      to get movement through the video.
      (2) Define a region of interest that covers the sidewalk; just look
      within this ROI poly to find motion.
      (3) For each blob in the difference image (e.g. each thing "different"
      from the baseline frame), see if it fits with the size and speed data
      that would be a:
       [optional] Vehicle (car/bus)
       Motorcycle/Bike (human, but moving quickly)
       Humanoid (running)
       Humanoid (walking)
      (4) Keep a running count of each category
*/
// MSVC++ .NET include.
#include "stdafx.h" 
 
// C++ includes.
#include <cstdio>
#include <cstdlib>
#include <iostream>
 
// OpenCV includes.
#include <cv.h>
#include <highgui.h>
 
#define DEBUG 1
 
// Set up the AVI Writer object
typedef struct CvAVIWriter CvAVIWriter;
 
int main(int argc, char* argv[])
{
  // STEP 1:
 // Bring the video file (AVI) in.
 CvCapture* VideoFile = cvCaptureFromFile("sidewalk.avi");
 if (VideoFile == NULL)
 {
  std::cout << "Uh-oh.  Either the input file doesn't exist, or OpenCV cannot read it." << std::endl;
  return 1;
 }
 
 // Now let's set up the frame size so that we can vomit out a video...
 CvSize frame_size;
 frame_size.height = cvGetCaptureProperty(VideoFile, CV_CAP_PROP_FRAME_HEIGHT);
 frame_size.width = cvGetCaptureProperty(VideoFile, CV_CAP_PROP_FRAME_WIDTH);
 // We'll go ahead and say that the AVI file is loaded now:
 if(DEBUG)
 {std::cout << "Brought in AVI file." << std::endl;}
 
 // Figure out what our incoming movie file looks like
 double FPS = cvGetCaptureProperty(VideoFile, CV_CAP_PROP_FPS);
 double FOURCC = cvGetCaptureProperty(VideoFile, CV_CAP_PROP_FOURCC);
 if(DEBUG)
 {
  std::cout << "FPS:  " << FPS << std::endl;
  std::cout << "FOURCC:  " << FOURCC << std::endl;
 }
 
 // Create a CvVideoWriter.  The arguments are the name of the output file (must be .avi), 
 // a macro for a four-character video codec installed on your system, the desired frame 
 // rate of the video, and the video dimensions.
 CvVideoWriter* videoWriter = cvCreateVideoWriter("sidewalk_output.avi",CV_FOURCC('D', 'I', 'V', 'X'), FPS, cvSize(frame_size.width, frame_size.height));
 // Now we can say that the VideoWriter is created:
 if(DEBUG)
 {std::cout << "videoWriter is made." << std::endl;}
 
 // Make display windows
 cvNamedWindow("background Frame", CV_WINDOW_AUTOSIZE);
 cvNamedWindow("current Frame", CV_WINDOW_AUTOSIZE);
 cvNamedWindow("diff Frame", CV_WINDOW_AUTOSIZE);
 cvNamedWindow("output Frame", CV_WINDOW_AUTOSIZE);
 cvNamedWindow("ROI Frame", CV_WINDOW_AUTOSIZE);
 cvNamedWindow("ROI Frame (Color)", CV_WINDOW_AUTOSIZE);
 
 // Keep track of frames
 static int imageCount = 0;
 
 // Set up images.
 IplImage* diffFrame = cvCreateImage(cvSize(frame_size.width, frame_size.height), IPL_DEPTH_8U, 1);
 IplImage* backgroundFrame, *eig_image, *temp_image;
 IplImage* currentFrame = cvCreateImage(cvSize(frame_size.width, frame_size.height), IPL_DEPTH_8U, 1);
 IplImage* outFrame = cvCreateImage(cvSize(frame_size.width, frame_size.height), IPL_DEPTH_8U, 3);
 IplImage* tempFrameBGR = cvCreateImage(cvSize(frame_size.width, frame_size.height), IPL_DEPTH_8U, 3);
 IplImage* ROIFrame = cvCreateImage(cvSize((265-72), (214-148)), IPL_DEPTH_8U, 1);
 IplImage* ROIFrame2 = cvCreateImage(cvSize((265-72), (214-148)), IPL_DEPTH_8U, 1);
 IplImage* ROIFrameBGR = cvCreateImage(cvSize((265-72), (214-148)), IPL_DEPTH_8U, 3);
 IplImage* ROIFrameBGRPrior = cvCreateImage(cvSize((265-72), (214-148)), IPL_DEPTH_8U, 3);
 
 // And now set up the data for MinMaxLoc (for ROI image)
 double minVal, maxVal;
 CvPoint minLoc, maxLoc, outPoint;
 
 // Initialize our contour information...
 int contours=0;
 CvMemStorage* storage = cvCreateMemStorage(0);
 CvSeq** firstContour;
 int headerSize;
 CvSeq* contour = 0;
 int color = (0, 0, 0);
 CvContourScanner ContourScanner=0;
 
 // Zero out the people-counting image...
 cvZero(ROIFrameBGR);
 
 int people=0;
 int MOVEMENT=0;
 
 // There's gotta be a better way to do this... like with threading?
 while(1)
 {
  // Let's try to threshold this at 15FPS - the input rate.
  // 66 is used as it's 1/15 * 1000...  
  // Wait a second!  I have the FPS here.  *sigh*  Let's do this dynamically:
  //Sleep((1000/FPS)-10);
  // Awesome.  The video runs in actual time now, after subtracting out the 10ms from WaitKey().
 
  IplImage* tempFrame = cvQueryFrame(VideoFile);
  // If the video HAS a current frame...
  if (tempFrame != NULL)
  {
   // The video is BGR-space.  I wish there were a cvGetColorSpace command or something...
   cvCvtColor(tempFrame, currentFrame, CV_BGR2GRAY);
   // Grrr ... flipped.
   cvFlip(currentFrame);
   // Get initial "background" image...
   if (imageCount==0)
   {
    //IplImage* backgroundFrame = cvCloneImage(currentFrame);
    backgroundFrame = cvCloneImage(currentFrame);
   }
   cvShowImage("background Frame", backgroundFrame);
   cvShowImage("current Frame", currentFrame);
   cvAbsDiff(currentFrame,backgroundFrame,diffFrame);
   if(DEBUG)
   {std::cout << "Pulled in video grab of frame " << imageCount << "." << std::endl;}
 
   // Back to color ...
   cvCvtColor(diffFrame, outFrame, CV_GRAY2BGR);
   // Now let's go ahead and put up a box (rect, actually) for our ROI.
   // (72, 148)+-----------------------+(265, 148)
   //   |      |
   // (72, 214)+-----------------------+(265, 214)
   //MotionRegion cvRect(72, 148, (265-72), (214-148));
   //cvRectangle(outFrame, cvPoint(72, 148), cvPoint(265, 214), CV_RGB(255, 0, 255), 1);
   cvRectangle(diffFrame, cvPoint(72, 148), cvPoint(265, 214), CV_RGB(255, 0, 255), 1);
   cvShowImage("diff Frame", diffFrame);
   cvCvtColor(backgroundFrame, tempFrameBGR, CV_GRAY2BGR);
   cvFlip(tempFrame);
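   // ROIFrame is a copy of the (3-channel) difference image.
   // Free the previous frame's copy first so we don't leak one image per frame.
   cvReleaseImage(&ROIFrame);
   ROIFrame = cvCloneImage(outFrame);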
   cvSetImageROI(ROIFrame, cvRect(72, 148, (265-72), (214-148)));
   //cvOr(outFrame, tempFrame, outFrame);
   cvShowImage("ROI Frame", ROIFrame);
   // Great.  The ROI Frame works, almost as an "inset".
   // Now let's find when motion exists within the ROI.
   // First:  the cumbersome way...
   
   cvSetImageCOI(ROIFrame, 1);
   cvMinMaxLoc(ROIFrame, &minVal, &maxVal, &minLoc, &maxLoc, NULL);
   if (maxVal < 100)
   {
    // Zero out the LAST people-counting image...
    cvZero(ROIFrameBGRPrior);
    MOVEMENT=0;
   }
   if(maxVal > 100)
   {
    cvSetImageCOI(ROIFrameBGRPrior, 1);
    // We are starting a motion sequence...
    if( (MOVEMENT==0) && (cvCountNonZero(ROIFrameBGRPrior)==0) )
    {
     // Zero out the people-counting image...
     cvZero(ROIFrameBGR);
     MOVEMENT=1;
     people++;
     if(DEBUG)
     {std::cout << "ROI has counted " << people << " people." << std::endl;}
    }
 
    if(DEBUG)
    {std::cout << "We have motion in the ROI!  maxVal: " << maxVal << " minVal: " << minVal << std::endl;}
    // Phew.  Okay, we can figure out when there's motion within the ROI.  Good.
    // Now let's see what we can do with contours.
    //contours = cvFindContours(ROIFrame, storage, firstContour, headerSize=sizeof(CvContour), CV_RETR_LIST, CV_CHAIN_APPROX_SIMPLE);
    //contours = cvFindContours(ROIFrame, storage, &contour, sizeof(CvContour), CV_RETR_CCOMP, CV_CHAIN_APPROX_SIMPLE);
    // Bah.  Couldn't do anything with them.  :(
    // Let's instead try to put a dot on people that are moving...
    //cvCircle( CvArr* img, CvPoint center, int radius, double color, int thickness=1 )
    cvCircle(ROIFrameBGR, maxLoc, 1, CV_RGB(255, 0, 0), 1);
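    // Free the old "prior" image before replacing it with a fresh copy.
    cvReleaseImage(&ROIFrameBGRPrior);
    ROIFrameBGRPrior = cvCloneImage(ROIFrameBGR);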
   }
   cvShowImage("ROI Frame (Color)", ROIFrameBGR);
   /*
   // Now:  a better way - we'll know there's motion if contours>0.
   ROIFrameBGR = cvCloneImage(ROIFrame);
   cvCvtColor(ROIFrame, ROIFrameBGR, CV_GRAY2BGR);
   cvSetImageCOI(ROIFrame, 1);
 
   contours = cvFindContours(ROIFrame, storage, &contour, sizeof(CvContour), CV_RETR_CCOMP, CV_CHAIN_APPROX_SIMPLE);
   if (contours > 0)
   {
    if(DEBUG)
    {std::cout << "We have motion in the ROI!" << std::endl;}
   }
   
   // Draw out the contours
   for( ; contour != 0; contour = contour->h_next )
   {
    // replace CV_FILLED with 1 to see the outlines
    cvDrawContours( ROIFrameBGR, contour, CV_RGB( rand(), rand(), rand() ), CV_RGB( rand(), rand(), rand() ), -1, CV_FILLED, 8 );
   }
   cvShowImage("ROI Frame (Color)", ROIFrameBGR);
   */
 
   // Write the current frame to an output movie.
   //cvWriteFrame(videoWriter, diffFrame);
   // Build up the output ...
   cvOr(outFrame, tempFrame, outFrame);
   // ... and draw the ROI rectangle.
   cvRectangle(outFrame, cvPoint(72, 148), cvPoint(265, 214), CV_RGB(255, 0, 255), 1);
   char peopleCount[32];
   if (people==1)
   {sprintf(peopleCount, "%d person", people);}
   else
   {sprintf(peopleCount, "%d people", people);}
   CvFont font;
   cvInitFont(&font, CV_FONT_HERSHEY_SIMPLEX, 0.8, 0.8, 0, 2);
   // Red text (BGR order); channel values top out at 255.
   cvPutText(outFrame, peopleCount, cvPoint(0, 25), &font, cvScalar(0, 0, 255));
   cvShowImage("output Frame", outFrame);
   cvWriteFrame(videoWriter, outFrame);
   if(DEBUG)
   {std::cout << "Wrote frame to output AVI file." << std::endl;}
   imageCount++;
   } // end if (image != NULL) loop
  
  // This will return the code of the pressed key or -1 if
  // nothing was pressed before 10 ms elapsed.
  int keyCode = cvWaitKey(10);
  if ( (keyCode == 's') || (keyCode == 'S') )
  {
   while(1)
   {
    keyCode = cvWaitKey(10);
    if ( (keyCode == 's') || (keyCode == 'S') )
    {
     keyCode = 999;
     break;
    }
   }
  }
 
  // But the video may have ended...
  if( (tempFrame == NULL) || (keyCode >= 0) && (keyCode != 999) )
  {
   // Either the video is over or a key was pressed.
   // Dump the video file.
   cvReleaseCapture(&VideoFile);
   // Release the videoWriter from memory.
   cvReleaseVideoWriter(&videoWriter);
   // Release images from memory...
   cvReleaseImage(&currentFrame);
   //cvReleaseImage(&diffFrame);
   // ... And destroy the windows.
   cvDestroyAllWindows();
   std::cout << "Released VideoFile and VideoWriter." << std::endl;
   return 0;
  }
 }// end while loop
 return 0;
}
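
Stripped of the OpenCV calls, the counting in the loop above is close to rising-edge detection on the per-frame peak difference: a person is counted when the ROI goes from "quiet" to "moving". Here is a minimal sketch of just that idea. The class name is hypothetical, and it leaves out the extra check on the prior tracer image that the real code makes.

```cpp
#include <cassert>

// Counts transitions from "no motion" to "motion" in a stream of per-frame
// peak difference values (the maxVal from cvMinMaxLoc in the program above).
class MotionEventCounter {
public:
    explicit MotionEventCounter(double threshold) : threshold_(threshold) {
        assert(threshold >= 0.0);
    }

    // Feed one frame's peak difference; returns the running count.
    int update(double maxVal) {
        const bool moving = maxVal > threshold_;
        if (moving && !wasMoving_) ++count_;  // rising edge: a new event
        wasMoving_ = moving;
        return count_;
    }

    int count() const { return count_; }

private:
    double threshold_;
    bool wasMoving_ = false;
    int count_ = 0;
};
```

This also shows why the approach is inaccurate: two people who overlap in the ROI register as one event, and a single person whose peak difference dips below the threshold for one frame registers as two.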

Hands Video

Original Video	Output Video
hands.avi	hands_output.avi

Theory

The 3 items are solidly colored and likely have “hard” edges.
Each item has a shadow below it.
Each item has a predominance of color associated with it - it’s not a completely single-colored object, but it’s “good enough”. I’ll likely want a Sobel edge detector on this.
One item is “counted” when it becomes occluded by a flesh-colored object (the hand) that rests on the object for a thresholded period of time (0.5 second?). What this means is that the hand enters a ROI (e.g. the “book” area) and then leaves. Upon entry, a “count” is registered.
We can tell which item was occluded when, in a frame, the object’s area is less than its baseline area.
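
The "thresholded period of time" idea can be sketched as a small per-book state machine: only count a touch once the book has been occluded for enough consecutive frames, and don't count again until the hand has left. This is an illustration under assumptions, not what the program below does. The class name is hypothetical, and the 0.5 second dwell is the guess from the list above.

```cpp
#include <cassert>
#include <cmath>

// Counts one touch per continuous occlusion that lasts at least dwellSeconds.
class DwellTouchCounter {
public:
    DwellTouchCounter(double fps, double dwellSeconds)
        : requiredFrames_(static_cast<int>(std::ceil(fps * dwellSeconds))) {
        assert(fps > 0.0 && dwellSeconds >= 0.0);
        if (requiredFrames_ < 1) requiredFrames_ = 1;
    }

    // Feed one frame's "is the book occluded?" result; returns the count.
    int update(bool occluded) {
        if (!occluded) {
            // Hand has left: reset so the next entry can be counted.
            run_ = 0;
            counted_ = false;
            return count_;
        }
        ++run_;
        if (!counted_ && run_ >= requiredFrames_) {
            ++count_;
            counted_ = true;  // don't count this same touch again
        }
        return count_;
    }

    int count() const { return count_; }

private:
    int requiredFrames_;
    int run_ = 0;
    bool counted_ = false;
    int count_ = 0;
};
```

At 15 FPS with a 0.5 second dwell, a touch registers on the 8th consecutive occluded frame, so a hand that merely passes over a book for a few frames is ignored.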

How Many Times Was Each Book Touched? Code in OpenCV

Notes: An earlier version of this code had an issue with pointers. This is the new code, and I believe it to be largely working. Concessions on the “multi-add problem” are made inline (e.g. I need a skin detector here).
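
For what the missing skin detector might look like, here is a minimal sketch using a commonly quoted RGB rule of thumb for skin under daylight. It is an assumption, not part of the program below: the thresholds are the usual textbook values and have not been tuned against hands.avi, and both function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Rule-of-thumb skin test on one pixel, arguments in OpenCV's B, G, R order.
bool looksLikeSkin(std::uint8_t b, std::uint8_t g, std::uint8_t r) {
    const int R = r, G = g, B = b;
    const int maxC = std::max({R, G, B});
    const int minC = std::min({R, G, B});
    return R > 95 && G > 40 && B > 20 &&
           (maxC - minC) > 15 &&
           std::abs(R - G) > 15 &&
           R > G && R > B;
}

// Fraction of pixels in an interleaved BGR buffer that pass the rule.
double skinFraction(const std::uint8_t* bgr, std::size_t pixelCount) {
    assert(bgr != nullptr && pixelCount > 0);
    std::size_t hits = 0;
    for (std::size_t i = 0; i < pixelCount; ++i) {
        if (looksLikeSkin(bgr[3 * i], bgr[3 * i + 1], bgr[3 * i + 2])) ++hits;
    }
    return static_cast<double>(hits) / static_cast<double>(pixelCount);
}
```

The idea would be to run something like `skinFraction` over just the pixels inside each book's rectangle and treat the book as touched when the fraction passes some cutoff, instead of testing a single color channel.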

/*
 Author:  Chris Pilson, cpilson@iastate.edu
 Program Name: hw3_hands
 Description: This program will read a visual scene and then 
     use a region within this sene to detect when an 
     onscreen object is being touched (occluded).  The 
     object will be counted, and this count presented 
     on-screen.
*/
// MSVC++ .NET include.
#include "stdafx.h" 
 
// C++ includes.
#include <cstdio>
#include <cstdlib>
#include <iostream>
 
// OpenCV includes.
#include <cv.h>
#include <highgui.h>
 
// Do we want EXTREMELY verbose CLI output?
#define DEBUG 0
 
// Set up the AVI Writer object
typedef struct CvAVIWriter CvAVIWriter;
 
// Set up our 3 rectangles
  // BLUE BOOK:
  // (65, 62) +-----------------------+(80, 62)
  //   |      |
  // (65, 67) +-----------------------+(80, 67)
  // RED BOOK:
  // (47, 135)+-----------------------+(55, 135)
  //   |      |
  // (47, 140)+-----------------------+(55, 140)
  // YELLOW BOOK:
  // (148, 185)+----------------------+(153, 185)
  //    |      |
  // (148, 190)+----------------------+(153, 190)
#define BLUE_RECTANGLE cvRectangle(outFrame, cvPoint(65,62), cvPoint(80,67), CV_RGB(0, 0, 255), 1);
#define BLUE_RECTANGLE_FILLED cvRectangle(outFrame, cvPoint(65,62), cvPoint(80,67), CV_RGB(0, 0, 255), CV_FILLED);
int BLUE_RECT_WIDTH = (80-65);
int BLUE_RECT_HEIGHT = (67-62);
#define RED_RECTANGLE cvRectangle(outFrame, cvPoint(47, 135), cvPoint(55, 140), CV_RGB(255, 0, 0), 1);
#define RED_RECTANGLE_FILLED cvRectangle(outFrame, cvPoint(47, 135), cvPoint(55, 140), CV_RGB(255, 0, 0), CV_FILLED);
int RED_RECT_WIDTH = (55-47);
int RED_RECT_HEIGHT = (140-135);
#define YELLOW_RECTANGLE cvRectangle(outFrame, cvPoint(148, 185), cvPoint(153, 190), CV_RGB(255, 255, 128), 1);
#define YELLOW_RECTANGLE_FILLED cvRectangle(outFrame, cvPoint(148, 185), cvPoint(153, 190), CV_RGB(255, 255, 128), CV_FILLED);
int YELLOW_RECT_WIDTH = (153-148);
int YELLOW_RECT_HEIGHT = (190-185);
 
// Flag and count trackers...
// D'oh.  They're already globals.  *sigh*
static bool BLUEFLAG=false;
static bool REDFLAG=false;
static bool YELLOWFLAG=false;
static int BlueCount=0;
static int RedCount=0;
static int YellowCount=0;
 
 
// Function to detect when a hand is in the Blue Rectangle.  
// Returns BlueCount value (int), BLUEFLAG (bool - cast to int).
//std::string DetectBlue(IplImage* tempFrame, IplImage* outFrame, bool BLUEFLAG, int BlueCount)
int DetectBlue(IplImage* tempFrame, IplImage* outFrame, int BlueCount)
{
 uchar* temp_ptr;
 for (int i = 65; i < 65+BLUE_RECT_WIDTH; i++)
 {
  for (int j = 62; j < 62+BLUE_RECT_HEIGHT; j++)
  {
   // If the value at any (i,j) within the rectangles is flesh-colored, then set a flag...
   // This'll do if the Red value exceeds 200 at any pixel.
   temp_ptr = &((uchar*)(tempFrame->imageData + tempFrame->widthStep*j))[i*3];
   if (DEBUG)
   {std::cout << "Color values [(B, G, R)]: (" << (int)temp_ptr[0] << ", " << (int)temp_ptr[1] << ", " << (int)temp_ptr[2] << ")." << std::endl;}
   if ( (int)temp_ptr[2] > 200)
   {
    if (DEBUG)
    {std::cout << "BBBBBBBBBBBBBB Blue Book is being touched BBBBBBBBBBBB." << std::endl;}
    BLUE_RECTANGLE_FILLED;
    RED_RECTANGLE;
    //*REDFLAG=false;
    YELLOW_RECTANGLE;
    //*YELLOWFLAG=false;
    if (BLUEFLAG==false)
    {
     // Shove 'true' into BLUEFLAG and 'false' into other flags.
     BLUEFLAG = true;
     REDFLAG = false;
     YELLOWFLAG = false;
     BlueCount++;
     if (DEBUG)
     {std::cout << "--------- Blue Count: " << BlueCount << "." << std::endl;}
     return(BlueCount);
    }
   }
  }
 }
 return(BlueCount);
}
 
// Function to detect when a hand is in the Red Rectangle.  
// Returns RedCount value (int), REDFLAG (bool).
int DetectRed(IplImage* tempFrame, IplImage* outFrame, int RedCount)
{
 uchar* temp_ptr;
 for (int i = 47; i < 47+RED_RECT_WIDTH; i++)
 {
  for (int j = 135; j < 135+RED_RECT_HEIGHT; j++)
  {
   // If the value at any (i,j) within the rectangles is flesh-colored, then set a flag...
   // This'll do if the Blue value exceeds 100 at any pixel.
   temp_ptr = &((uchar*)(tempFrame->imageData + tempFrame->widthStep*j))[i*3];
   if (DEBUG)
   {std::cout << "Color values [(B, G, R)]: (" << (int)temp_ptr[0] << ", " << (int)temp_ptr[1] << ", " << (int)temp_ptr[2] << ")." << std::endl;}
   if ( (int)temp_ptr[0] > 100)
   {
    if (DEBUG)
    {std::cout << "RRRRRRRRRRRRR Red Book is being touched RRRRRRRRRRR." << std::endl;}
    BLUE_RECTANGLE;
    //FlagPointer = &BLUEFLAG;
    //*FlagPointer = false;
    RED_RECTANGLE_FILLED;
    YELLOW_RECTANGLE;
    //FlagPointer = &YELLOWFLAG;
    //*FlagPointer = false;
    if (REDFLAG==false)
    {
     // Shove 'true' into REDFLAG
     REDFLAG = true;
     BLUEFLAG=false;
     YELLOWFLAG=false;
     RedCount++;
     if (DEBUG)
     {std::cout << "--------- Red Count: " << RedCount << "." << std::endl;}
     return(RedCount);
    }
   }
  }
 }
 return(RedCount);
}
 
// Function to detect when a hand is in the Yellow Rectangle.  
// Returns YellowCount value (int), YELLOWFLAG (bool).
int DetectYellow(IplImage* tempFrame, IplImage* outFrame, int YellowCount)
{
 if(DEBUG)
 {std::cout << "YELLOWFLAG: " << YELLOWFLAG << std::endl;}
 
 uchar* temp_ptr;
 for (int i = 148; i < 148+YELLOW_RECT_WIDTH; i++)
 {
  for (int j = 185; j < 185+YELLOW_RECT_HEIGHT; j++)
  {
   // If the value at any (i,j) within the rectangles is flesh-colored, then set a flag...
   // This'll do if the Blue value exceeds 100 at any pixel.
   temp_ptr = &((uchar*)(tempFrame->imageData + tempFrame->widthStep*j))[i*3];
   if (DEBUG)
   {std::cout << "Color values [(B, G, R)]: (" << (int)temp_ptr[0] << ", " << (int)temp_ptr[1] << ", " << (int)temp_ptr[2] << ")." << std::endl;}
   if ( (int)temp_ptr[0] > 100)
   {
    if (DEBUG)
    {std::cout << "YYYYYYYYY Yellow Book is being touched YYYYYYYYYY." << std::endl;}
    BLUE_RECTANGLE;
    //*BLUEFLAG=false;
    RED_RECTANGLE;
    //*REDFLAG=false;
    YELLOW_RECTANGLE_FILLED;
    // If the flag is false, meaning that this isn't a continuation of contact...
    //if (YELLOWFLAG==false)
    if (!YELLOWFLAG)
    {
     // Shove 'true' into YELLOWFLAG
     YELLOWFLAG = true;
     REDFLAG=false;
     BLUEFLAG=false;
     YellowCount++;
     if (DEBUG)
     {
      std::cout << "YELLOWFLAG (YellowCount++): " << YELLOWFLAG << std::endl;
      std::cout << "--------- Yellow Count: " << YellowCount << "." << std::endl;
     }
     return(YellowCount);
    }
   }
  }
 }
 return(YellowCount);
}
int main(int argc, char* argv[])
{
  // STEP 1:
 // Bring the video file (AVI) in.
 CvCapture* VideoFile = cvCaptureFromFile("hands.avi");
 if (VideoFile == NULL)
 {
  std::cout << "Uh-oh.  Either the input file doesn't exist, or OpenCV cannot read it." << std::endl;
  return 1;
 }
 
 // Now let's set up the frame size so that we can vomit out a video...
 CvSize frame_size;
 frame_size.height = cvGetCaptureProperty(VideoFile, CV_CAP_PROP_FRAME_HEIGHT);
 frame_size.width = cvGetCaptureProperty(VideoFile, CV_CAP_PROP_FRAME_WIDTH);
 // We'll go ahead and say that the AVI file is loaded now:
 if(DEBUG)
 {std::cout << "Brought in AVI file." << std::endl;}
 
 // Figure out what our incoming movie file looks like
 double FPS = cvGetCaptureProperty(VideoFile, CV_CAP_PROP_FPS);
 double FOURCC = cvGetCaptureProperty(VideoFile, CV_CAP_PROP_FOURCC);
 if(DEBUG)
 {
  std::cout << "FPS:  " << FPS << std::endl;
  std::cout << "FOURCC:  " << FOURCC << std::endl;
 }
 
 // Create a CvVideoWriter.  The arguments are the name of the output file (must be .avi), 
 // a macro for a four-character video codec installed on your system, the desired frame 
 // rate of the video, and the video dimensions.
 CvVideoWriter* videoWriter = cvCreateVideoWriter("hands_output.avi",CV_FOURCC('D', 'I', 'V', 'X'), FPS, cvSize(frame_size.width, frame_size.height));
 // Now we can say that the VideoWriter is created:
 if(DEBUG)
 {std::cout << "videoWriter is made." << std::endl;}
 
 // Make display windows
 cvNamedWindow("current Frame", CV_WINDOW_AUTOSIZE);
 //cvNamedWindow("canny Frame", CV_WINDOW_AUTOSIZE);
 cvNamedWindow("output Frame", CV_WINDOW_AUTOSIZE);
 
 // Keep track of frames
 static int imageCount = 0;
 
 // Set up images.
 //IplImage* cannyFrame = cvCreateImage(cvSize(frame_size.width, frame_size.height), IPL_DEPTH_8U, 1);
 IplImage* currentFrame = cvCreateImage(cvSize(frame_size.width, frame_size.height), IPL_DEPTH_8U, 1);
 IplImage* outFrame = cvCreateImage(cvSize(frame_size.width, frame_size.height), IPL_DEPTH_8U, 3);
 
 // There's gotta be a better way to do this... like with threading?
 while(1)
 {
  // Let's try to threshold this at 15FPS - the input rate.
  // 66 is used as it's 1/15 * 1000...  
  // Wait a second!  I have the FPS here.  *sigh*  Let's do this dynamically:
//  Sleep((1000/FPS)-10);
  // Awesome.  The video runs in actual time now, after subtracting out the 10ms from WaitKey().
 
  IplImage* tempFrame = cvQueryFrame(VideoFile);
  // If the video HAS a current frame...
  if (tempFrame != NULL)
  {
   // The video is BGR-space.  I wish there were a cvGetColorSpace command or something...
   cvCvtColor(tempFrame, currentFrame, CV_BGR2GRAY);
   // Grrr ... flipped.
   cvFlip(currentFrame);
   // Grr... if I could detect skin, then AND the regions within the rectangles with the skin-highlighted image, then I could tell with certainty if a hand was there, rather than getting false counts.
   // cvThreshold is almost good enough, though.
   // AHA!  But I'd not have to check for skin at every pixel, just within the rectangles within DetectXXX function calls.
   // ... and if the area within the box has white elements (e.g. skin), then the hand is on the book.
   // this would be MUCH more reliable than using RGB colorspace detection.
   cvThreshold(currentFrame, currentFrame, 200, 255, CV_THRESH_BINARY);
   cvShowImage("current Frame", currentFrame);
   if(DEBUG)
   {std::cout << "Pulled in video grab of frame " << imageCount << "." << std::endl;}
 
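   // Set up the output composite image...
   // (release the previous frame's copy first so we don't leak one image per frame)
   cvReleaseImage(&outFrame);
   outFrame = cvCloneImage(tempFrame);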
   // Reference the TopLeft as (0,0).
   tempFrame->origin=IPL_ORIGIN_TL;
   cvFlip(tempFrame);
   outFrame->origin=IPL_ORIGIN_TL;
   cvFlip(outFrame);
 
   // If a hand isn't in any of the rectangles...
   // Draw some rectangle overlays on the image.
   BLUE_RECTANGLE;
   RED_RECTANGLE;
   YELLOW_RECTANGLE;
 
   // But if a hand is in a rectangle (e.g. counting a book)...
   // we should be able to pick this up at the pixel level...
   // This is downright odd ... the pixel values here (the blue rectangle) change whenever ANY book is touched.
   // Erm ... yes, it would do this at (i, j)=0.
   // (Calls moved to DetectBlue, DetectRed, DetectYellow)
   if (DEBUG)
   {
    std::cout << "+ BlueCount: " << BlueCount << std::endl;
    std::cout << "+ BLUEFLAG: " << BLUEFLAG << std::endl;
    std::cout << "+ RedCount: " << RedCount << std::endl;
    std::cout << "+ REDFLAG: " << REDFLAG << std::endl;
    std::cout << "+ YellowCount: " << YellowCount << std::endl;
    std::cout << "+ YELLOWFLAG: " << YELLOWFLAG << std::endl;
   }
 
   BlueCount = DetectBlue(tempFrame, outFrame, BlueCount);
   RedCount = DetectRed(tempFrame, outFrame, RedCount);
   YellowCount = DetectYellow(tempFrame, outFrame, YellowCount);
 
   // Let's get some output on the screen...
   CvFont font;
   cvInitFont(&font, CV_FONT_HERSHEY_SIMPLEX, 0.8, 0.8, 0, 2);
   cvPutText(outFrame, "Counting Book-Touches", cvPoint(0, 20), &font, cvScalar(255, 255, 255));
   // Blue Counts (buffers are big enough for any int plus the terminator)
   char CountB[16];
   sprintf(CountB, "%d", BlueCount);
   cvPutText(outFrame, CountB, cvPoint(270, 50), &font, cvScalar(255, 0, 0));
   // Red Counts
   char CountR[16];
   sprintf(CountR, "%d", RedCount);
   cvPutText(outFrame, CountR, cvPoint(270, 125), &font, cvScalar(0, 0, 255));
   // Yellow Counts
   char CountY[16];
   sprintf(CountY, "%d", YellowCount);
   cvPutText(outFrame, CountY, cvPoint(270, 175), &font, cvScalar(128, 255, 255));
   
   // Holy freakin' crap, as Peter Griffin might say.  I've never been happier to count book touches.
 
   cvShowImage("output Frame", outFrame);
   
   // Write the current frame to an output movie.
   cvWriteFrame(videoWriter, outFrame);
   if(DEBUG)
   {std::cout << "Wrote frame to output AVI file." << std::endl;}
   imageCount++;
   } // end if (image != NULL) loop
  
  // This will return the code of the pressed key or -1 if
  // nothing was pressed before 10 ms elapsed.
  int keyCode = cvWaitKey(10);
  // "S" or "s" will pause playback.  
  if ( (keyCode == 's') || (keyCode == 'S') )
  {
   while(1)
   {
    keyCode = cvWaitKey(10);
    if ( (keyCode == 's') || (keyCode == 'S') )
    {
     keyCode = 999;
     break;
    }
   }
  }
 
  // But the video may have ended...
  if( ((tempFrame == NULL) || (keyCode >= 0)) && (keyCode != 999) )
  {
   // Either the video is over or a key was pressed.
   // Dump the video file.
   cvReleaseCapture(&VideoFile);
   // Release the videoWriter from memory.
   cvReleaseVideoWriter(&videoWriter);
   // Release images from memory...
   // (tempFrame belongs to the capture, so it must not be released here.)
   cvReleaseImage(&currentFrame);
   cvReleaseImage(&outFrame);
   // ... And destroy the windows.
   cvDestroyWindow("current Frame");
   cvDestroyWindow("output Frame");
   std::cout << "Released VideoFile and VideoWriter." << std::endl;
   exit(0);
  }
 }// end while loop
 return 0;
}
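
DetectBlue, DetectRed and DetectYellow all do the same pixel lookup: step `widthStep` bytes per row, 3 bytes per pixel, then pick a channel. A single helper could cover all three. This is a sketch only: `BgrImage` is a hypothetical stand-in for the `IplImage` fields used above, and the function names are made up.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the IplImage fields the Detect functions touch.
struct BgrImage {
    int width;
    int height;
    int widthStep;  // bytes per row; may exceed width * 3 because of padding
    std::vector<std::uint8_t> data;
};

// Pointer to the B, G, R bytes of pixel (x, y): row offset uses widthStep,
// column offset uses 3 bytes per pixel.
const std::uint8_t* pixelAt(const BgrImage& img, int x, int y) {
    assert(x >= 0 && x < img.width && y >= 0 && y < img.height);
    return &img.data[static_cast<std::size_t>(y) * img.widthStep +
                     static_cast<std::size_t>(x) * 3];
}

// True if any pixel in [x0, x0+w) x [y0, y0+h) has the given channel
// (0 = B, 1 = G, 2 = R) strictly above the threshold.
bool regionExceeds(const BgrImage& img, int x0, int y0, int w, int h,
                   int channel, int threshold) {
    assert(channel >= 0 && channel <= 2);
    for (int y = y0; y < y0 + h; ++y) {
        for (int x = x0; x < x0 + w; ++x) {
            if (pixelAt(img, x, y)[channel] > threshold) return true;
        }
    }
    return false;
}
```

With this, the blue-book test would read roughly `regionExceeds(img, 65, 62, 15, 5, 2, 200)`, and the red and yellow tests would differ only in their rectangle, channel and threshold.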

Vision of the Machine

6/26/2013
