Java Real-Time Video Kit

Early Access Version 0.8

Simon M Lucas

 

Overview

The aim of this kit is to simplify writing real-time video applications in Java.  All you need to do is provide an implementation of a very simple interface, described below.  I've found this simplification very useful: it saves having to spend time learning the powerful but rather complex API of the Java Media Framework (JMF).

Interface

The current interface is called ShortFrameProcessor, which reflects the fact that the early access version is based on each frame being delivered as a 1-d array of type short (a Java short is a 16-bit signed integer).  The most significant bit in this representation is unused; the remaining fifteen bits are used in sequence for the red, green and blue channels respectively, five bits each.  The FrameDifference implementation illustrates how to extract each channel (though some channels may be commented out).

There are just two methods.  The init method is called when the application starts up, and specifies the size of the image (width and height).  The process method is called each time a frame is received from the camera driver.

public interface ShortFrameProcessor {

    /** init method should be called before any calls to process */
    public void init(int width, int height);

    public void process(short[] frame);

}
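As a rough illustration of how little is needed, here is a minimal sketch of an implementation.  FrameCounter is a hypothetical class invented for this example (it is not part of the kit), and the interface is repeated only so that the sketch compiles on its own.  It assumes blue occupies the lowest five bits of each short, as described above.

```java
// The kit's interface, repeated here so the sketch compiles on its own.
interface ShortFrameProcessor {
    void init(int width, int height);
    void process(short[] frame);
}

// Hypothetical example class (not part of the kit): counts frames and
// records the mean blue level of the most recent frame.
class FrameCounter implements ShortFrameProcessor {
    int width, height;
    int frames;        // number of frames seen so far
    double meanBlue;   // mean 5-bit blue value (0..31) of the last frame

    public void init(int width, int height) {
        this.width = width;
        this.height = height;
        frames = 0;
    }

    public void process(short[] frame) {
        long sum = 0;
        for (int i = 0; i < frame.length; i++) {
            sum += frame[i] & 0x1F;   // assumed: lowest five bits hold blue
        }
        meanBlue = frame.length == 0 ? 0.0 : (double) sum / frame.length;
        frames++;
    }
}
```

Anything that implements the two methods in this way can be plugged in wherever a ShortFrameProcessor is expected.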
 

Frame Difference

The frame difference example illustrates how to extract the RGB channels and then compute the differences between the current frame and the previous one.  You can build simple object tracking algorithms on the basis of this.

We first declare some useful class constants: the bit masks and shifts for extracting the color channels from each short.

    final static int RED_MASK = 0x00007C00;   // five red bits (10-14); 0x7F00 would also pick up two green bits
    final static int RED_SHIFT = 7;     // 7 right
    final static int GREEN_MASK = 0x000003E0;
    final static int GREEN_SHIFT = 2;   // 2 right
    final static int BLUE_MASK = 0x0000001F;
    final static int BLUE_SHIFT = 3;    // 3 left
    final static int ALPHA_MASK = 0xFF000000;
    final static int MID_GREY = 128;

Next we declare the instance variables:

    int width, height;
    Array2dComponent a;
    int[] pixels;
    int[] buf;
    TextField text;
    int count;
    ElapsedTimer t;
    StatisticalSummary ss;

The init method initialises a few variables and opens a new CloseableFrame, which is used to display the frame difference images.  The Array2dComponent is a simple way of painting the difference pixels on the screen.  ElapsedTimer offers a convenient way of measuring the elapsed time between chosen points in the code.  StatisticalSummary gives us an easy way to compute various statistics, though in this case we're only using it to calculate the mean.

    public void init(int width, int height) {
        this.width = width;
        this.height = height;
        pixels = new int[width * height];
        buf = new int[width * height];
        a = new Array2dComponent(pixels, width, height, 1);
        Panel p = new Panel();
        p.setLayout(new BorderLayout());
        text = new TextField();
        p.add(a, BorderLayout.CENTER);
        p.add(text, BorderLayout.SOUTH);
        new CloseableFrame(p, "Difference frame", true);
        t = new ElapsedTimer();
        ss = new StatisticalSummary();
    }
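The sources of ElapsedTimer and StatisticalSummary are not listed on this page.  If you just want the same kind of timing in your own code, something along these lines may do.  FrameTimer is a hypothetical stand-in written for this example, not the kit's class, and it only keeps a running mean.

```java
// Hypothetical stand-in for the kit's timing helpers: measures elapsed
// time and keeps a running mean without storing each sample.
class FrameTimer {
    private long start = System.nanoTime();
    private long n;       // number of samples added
    private double sum;   // sum of samples, in milliseconds

    // restart the clock
    void reset() { start = System.nanoTime(); }

    // milliseconds since construction or the last reset
    double elapsedMillis() { return (System.nanoTime() - start) / 1e6; }

    // record one sample
    void add(double millis) { sum += millis; n++; }

    // mean of all samples recorded so far (0 if there are none)
    double mean() { return n == 0 ? 0.0 : sum / n; }
}
```

Typical use would be to call reset() at the top of process and add(elapsedMillis()) at the bottom.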

Finally, we have the implementation of the process method - the comments explain this:

    public void process(short[] frame) {
        // reset the timer
        t.reset();
        for (int i = 0; i < Math.min(frame.length, pixels.length); i++) {
            // red, green and blue
            int r = ( (int) frame[i]) & RED_MASK ;
            r >>= RED_SHIFT;

            int g = ( (int) frame[i]) & GREEN_MASK ;
            g = g >> GREEN_SHIFT;

            int b = ( (int) frame[i]) & BLUE_MASK ;
            b = b << BLUE_SHIFT;

            // grey level - average of r, g and b
            int x = (r + g + b) / 3;
            int diff = x - buf[i];
            int p = (diff / 2) + MID_GREY;
            // set image pixel to difference in grey levels
            pixels[i] = ALPHA_MASK + p + (p << 8) + (p << 16);
            // set buffer pixel to current grey level
            buf[i] = x;
        }
        // update the difference image and repaint it
        // costs about 30ms - comment this out if not needed
        a.setPixels(pixels);
        a.repaint();
        // compute some timing statistics
        ss.add( t.elapsed() );
        text.setText( "Set: " + count++ + " frames : " + (int) ss.mean() );
    }
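One simple way to turn frame differences into a crude tracker is to take the centroid of the pixels that changed by more than some threshold.  The sketch below is only an illustration of that idea: MotionCentroid and its parameters are hypothetical names, and it works on arrays of grey levels like buf above, not on the kit's classes.

```java
// Hypothetical helper (not part of the kit): centre of the moving region.
class MotionCentroid {
    /**
     * Returns {x, y}, the centroid of the pixels whose grey level changed
     * by more than threshold between prev and cur, or null if none did.
     * Both arrays hold one grey level per pixel in row-major order.
     */
    static int[] centroid(int[] prev, int[] cur, int width, int threshold) {
        long sumX = 0, sumY = 0;
        int n = 0;
        int len = Math.min(prev.length, cur.length);
        for (int i = 0; i < len; i++) {
            if (Math.abs(cur[i] - prev[i]) > threshold) {
                sumX += i % width;   // column of pixel i
                sumY += i / width;   // row of pixel i
                n++;
            }
        }
        return n == 0 ? null : new int[] { (int) (sumX / n), (int) (sumY / n) };
    }
}
```

The threshold is a tuning parameter: too low and camera noise dominates, too high and slow movement is missed.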

Interfacing with the JMF (FrameAccess.java)

To interface with the JMF, I've taken example code from Sun called FrameAccess.java, and worked in calls to the implementation of ShortFrameProcessor.  At the beginning of this file I specify the class to be used - you'll see some of the other possibilities commented out.  Note that it would be very straightforward to pass the class name as a command line argument instead.

// set the kind of processing you want on each frame
// static ShortFrameProcessor proc = new ObjectTracker();

// some other alternatives...
static ShortFrameProcessor proc = new FrameDifference();
// static ShortFrameProcessor proc = new FrameSaver();
// static ShortFrameProcessor proc = new FrameThreshold();

The setup and calls to the ShortFrameProcessor are in PostAccessCodec.  Note that the resolution in this example is hard-wired into the code, which is very naughty!  As the comment says, this should be taken from the video format information.

public class PostAccessCodec extends PreAccessCodec {
  // We'll advertize as supporting all video formats.
  ShortFrameProcessor proc;

  public PostAccessCodec() {
    supportedIns = new Format[]{
      new RGBFormat()
    };
    // set up the ShortFrameProcessor
    proc = FrameAccess.proc;
    proc.init( 320 , 240 ); // should get this from the video format...

  }

  /**
    * Callback to access individual video frames.
    */
  void accessFrame(Buffer frame) {
    ....
    proc.process( (short[]) frame.getData() );
  } 
  ....

 

Running the Example

You will need a Java SDK and an installed copy of the JMF.

Extract the RTVK zip file, then test it by running (on Windows):

java -classpath .;\jdk\jmf\lib\jmf.jar problems.video.FrameAccess

I've tested the code on a Pentium 4 running Windows XP.  It typically runs for days without problems, but can sometimes crash after a few days.  I've found this with JMF applications in general.

Note that the JMF must be installed on your computer; unfortunately, it is not enough to just refer to the jmf.jar file.

Here's the frame difference example running.  [Screenshot: the Frame Difference image]

The text at the bottom indicates the number of frames processed, and the average time in milliseconds to process each frame.  Note that in this example, most of the time is taken repainting the component, so for real vision applications you may want to disable the display to get better speed - but the display is of course very useful for debugging purposes.

[Screenshot: the video capture window that FrameAccess creates]

Future Directions

I've found this to be very useful already, but the fact that it only handles one format is a real limitation.  It would be a useful exercise to refactor this example and allow new format decoders to be plugged in, in a clean yet efficient manner.  This involves deciding on a standard format that each decoder should output - perhaps an array of 32-bit int of the kind produced by the FrameDifference example would be a good choice.  However, if you then have to unpack for your own requirements this may be unacceptably wasteful!  A temporary remedy is given below in Alternative Formats.
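To make the idea concrete, here is one possible shape for such a plug-in.  This is purely a sketch of a design that has not been decided: FrameDecoder and Rgb555Decoder are hypothetical names, and the decoder assumes five bits per channel (masks 0x7C00, 0x03E0 and 0x001F) with packed ARGB ints as the common output.

```java
// Hypothetical plug-in point: each decoder turns one raw frame of type T
// into packed ARGB ints, writing into a caller-supplied array.
interface FrameDecoder<T> {
    void decode(T raw, int[] argbOut);
}

// Sketch of a decoder for the 15-bit short format described above.
class Rgb555Decoder implements FrameDecoder<short[]> {
    public void decode(short[] raw, int[] argbOut) {
        int n = Math.min(raw.length, argbOut.length);
        for (int i = 0; i < n; i++) {
            int v = raw[i];
            int r = (v & 0x7C00) >> 7;   // five red bits moved to bits 3..7
            int g = (v & 0x03E0) >> 2;   // five green bits moved to bits 3..7
            int b = (v & 0x001F) << 3;   // five blue bits moved to bits 3..7
            argbOut[i] = 0xFF000000 | (r << 16) | (g << 8) | b;
        }
    }
}
```

Writing into a caller-supplied array avoids allocating a new int[] for every frame, which matters at video rates.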

IAPR TC5

If you found this useful you may want to join IAPR TC5 - the technical committee on benchmarking and software in pattern recognition.  The remit of TC5 is to promote the use of common software, standards and benchmarking practices in the area.  As a member you'll be kept updated with new releases of datasets and software.

Download

You can download the real-time vision kit from here.

If you find any source files missing, you can download these from this server, using the package as the path and then just giving the name of the java file.  For example, the main method of StatisticalSummary refers to a class called VisualSummary.  This is in package utilities, and is available from http://algoval.essex.ac.uk/utilities/VisualSummary.java

Alternatively, note that any missing source files are probably not needed, and references to the corresponding classes can be safely commented out.

Alternative Formats

The sample code above assumes the image is provided as an array of type short.  Many web-cam drivers provide the image as an array of byte, of the form byte[] image = {r0, g0, b0, r1, g1, b1, ...}.

If yours is one of these, then you can use this version of FrameAccess and of ByteFrameDifference for now - in a future release this will be tidied up, and if the computational overhead is not too high, we'll transform all incoming formats into int[] image={argb0, argb1, ...}.
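The transformation from bytes to packed ints could be sketched as follows.  ByteFrames is a hypothetical helper, not the kit's ByteFrameDifference, and it assumes three bytes per pixel in red, green, blue order; some drivers deliver the channels in a different order, in which case the indices need swapping.

```java
// Hypothetical helper: converts a byte-per-channel frame to packed ARGB.
class ByteFrames {
    // Assumes three bytes per pixel in r, g, b order; any trailing bytes
    // that do not make up a whole pixel are ignored.
    static int[] toArgb(byte[] rgb) {
        int[] out = new int[rgb.length / 3];
        for (int i = 0; i < out.length; i++) {
            int r = rgb[3 * i] & 0xFF;       // mask because Java bytes are signed
            int g = rgb[3 * i + 1] & 0xFF;
            int b = rgb[3 * i + 2] & 0xFF;
            out[i] = 0xFF000000 | (r << 16) | (g << 8) | b;
        }
        return out;
    }
}
```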