Projects:2018s1-100 Automated Person Identification with Multiple Sensors

Revision as of 15:36, 16 October 2018 by A1670112

Project Team

Students

Maxwell Standen

Archana Vadakattu

Michael Vincent

Supervisors

Dr Brian Ng

Dr David Booth (DST Group)

Sau-Yee Yiu (DST Group)

Abstract

This project seeks to develop the capability to identify individuals based on their whole body shape and 3D facial characteristics. This will involve:

  1. Determining the most suitable sensor type(s) for the task.
  2. Data collection, using the most appropriate sensor(s) selected in (1).
  3. Data alignment, and extraction of 3D face and body shape features.
  4. The development of a recognition and fusion capability using the extracted 3D face and body shape features.
  5. Performance assessment.

The project will involve elements of literature survey (both sensor hardware and algorithmic techniques), software development (ideally in Matlab) and performance assessment, possibly by means of a small trial. This project is sponsored by DST Group.

Introduction

Automated person identification is a vital capability for modern security systems. Typical security environments involve uncooperative subjects, so robust methods are required. The current challenge is to identify uncooperative or semi-uncooperative subjects, such as those who don't interact with the identification system directly. Soft biometrics are identifying characteristics that can be gathered without the cooperation of a subject, such as the way a person walks, the lengths of their limbs or the shape of their face. Hard biometrics, by contrast, require a cooperative subject and involve data such as fingerprints or DNA.

The overall project will develop the capability to identify individuals based on their whole body shape and 3D facial characteristics. Our aims are:

  • Create an automated person identification system using 3D data
  • Use realistic data that is extendable to real-world applications
  • Fuse different methods to improve reliability of identification

The main objective is to implement a system that can recognise a person using soft biometric data, given image sequences captured using a Microsoft Kinect, which includes both depth and RGB cameras. To meet this objective, the project is divided into three main stages: investigating preprocessing methods for the Kinect data, implementing face and body recognition techniques, and fusing these techniques to improve the robustness of the system.

Background

This project is part of a larger collaboration between the University of Adelaide and DST Group on biometric data fusion.

Microsoft Kinect Sensor

One of the sensors is a Microsoft Kinect, which is used to acquire depth videos from a frontal perspective. The Kinect sensor is a motion sensing device produced by Microsoft Corporation in 2010. It was developed as an input device for the Xbox 360 gaming console and as such, has capabilities such as face and body detection and tracking [1]. The device also utilises cooperative person recognition for its gaming applications (e.g. user authentication and game controls).

Depth Image example

A Kinect device features a depth sensor and an RGB camera, which it uses to recognise the position in space of different body parts for up to 6 people in a single frame. The depth sensor uses a CMOS camera to capture an image of a projected infrared (IR) dot pattern [2]. The image processor in the Kinect device calculates the depth at each pixel from the relative positions of the dots in the pattern. Each video frame captured by the Kinect can be viewed either as a "depth image" or as a point cloud, which shows the Kinect depth data after conversion to real-world (XYZ) coordinates.

Point Cloud example
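
As a rough illustration of the conversion from a depth image to real-world coordinates described above, the sketch below back-projects each pixel with a pinhole camera model. The function name, the intrinsic parameters (FX, FY, CX, CY) and the convention that a depth of 0 marks an invalid pixel are assumptions made for this example, not values taken from the project or from a calibrated Kinect.

```python
# Back-project a depth image into a point cloud with a pinhole camera model.
# FX, FY, CX, CY are placeholder intrinsics (assumed), not calibrated values.
FX, FY = 580.0, 580.0   # focal lengths in pixels (assumed)
CX, CY = 2.0, 1.5       # principal point in pixels (assumed, for the tiny demo image)

def depth_to_point_cloud(depth_mm, fx=FX, fy=FY, cx=CX, cy=CY):
    """Convert a 2D list of depths in millimetres to (X, Y, Z) points in metres.

    Pixels with a depth of 0 are treated as invalid and skipped (assumption).
    """
    points = []
    for v, row in enumerate(depth_mm):        # v: pixel row
        for u, d in enumerate(row):           # u: pixel column
            if d <= 0:
                continue
            z = d / 1000.0                    # millimetres to metres
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points

# Tiny synthetic depth image: 0 marks pixels with no depth reading.
depth = [
    [0, 2000, 2000, 0],
    [0, 1500, 1500, 0],
    [0, 1500, 1500, 0],
]
cloud = depth_to_point_cloud(depth)
print(len(cloud))  # 6 valid pixels become 6 points
```

Each valid pixel yields one XYZ point, so a full-resolution frame produces a point cloud with at most as many points as the image has pixels.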


Recognition is performed using a machine learning pipeline which involves:

  • Acquiring data from a subject
  • Feature extraction to find identifying characteristics
  • Classification to identify the subject

TBC Background about Fusion

Methodology

System Pipeline
  1. Acquire data
  2. Preprocess data to improve quality
  3. Extract features from preprocessed data
  4. Use extracted features to create a classifier

Data Acquisition

Data is acquired using the Microsoft Kinect, which captures RGB, depth and skeletal data.

Preprocessing

Preprocessing is used to enhance the quality of the captured data and to facilitate more accurate extraction of the features present in the data. Depth videos are segmented into sequences of frames by extracting gait cycles, which allows features to be averaged over each cycle. This provides frame sequences which are suitable for extracting the features used in the project. Faces are also cropped for use in facial recognition algorithms.
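
The page does not state how gait cycles are detected. One plausible approach, sketched below purely as an assumption, is to treat each peak in the distance between the two ankles as a step and to take two consecutive steps as one gait cycle. The function names and the synthetic signal are hypothetical, and real skeletal data would likely need smoothing before peak detection.

```python
import math

def find_peaks(signal):
    """Return the indices of local maxima in a 1D list."""
    return [i for i in range(1, len(signal) - 1)
            if signal[i - 1] < signal[i] >= signal[i + 1]]

def gait_cycles(ankle_separation):
    """Split a walk into (start_frame, end_frame) gait cycles.

    Each peak in ankle separation is treated as one step, and a full
    cycle (one left step plus one right step) spans two consecutive steps.
    """
    peaks = find_peaks(ankle_separation)
    return [(peaks[i], peaks[i + 2]) for i in range(0, len(peaks) - 2, 2)]

def cycle_average(per_frame_feature, cycle):
    """Average a per-frame feature over one gait cycle (end frame excluded)."""
    start, end = cycle
    window = per_frame_feature[start:end]
    return sum(window) / len(window)

# Synthetic ankle separation in metres: one step every 10 frames.
separation = [0.5 * abs(math.sin(math.pi * t / 10)) for t in range(50)]
cycles = gait_cycles(separation)
print(cycles)
```

Averaging a feature over whole cycles, as `cycle_average` does, avoids biasing it towards whichever phase of the walk happens to dominate a clip.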

Facial Feature Extraction

Facial features are extracted from the depth data.

Local Feature Extraction

Local feature extraction analyses local patterns and textures to determine the similarity between images. Rather than treating the face as a whole, this method extracts textures and patterns from local regions of a subject’s face. The technique is resilient to changes in rotation and resolution.
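
The page does not name the local descriptor that was used. As an illustration only, the sketch below uses basic local binary patterns (LBP), a common local texture descriptor, and compares images by the chi-square distance between their LBP histograms. The choice of LBP, the distance measure and all function names are assumptions. This basic variant is not rotation invariant by itself; rotation-invariant LBP variants exist.

```python
# Offsets of the 8 neighbours, clockwise from the top-left pixel.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_code(img, r, c):
    """8-bit LBP code: one bit per neighbour that is >= the centre pixel."""
    centre = img[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(NEIGHBOURS):
        if img[r + dr][c + dc] >= centre:
            code |= 1 << bit
    return code

def lbp_histogram(img):
    """Normalised 256-bin histogram of LBP codes over the interior pixels."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    total = sum(hist)
    return [h / total for h in hist]

def chi_square(h1, h2):
    """Chi-square distance between two histograms (0 means identical)."""
    return sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)

# Synthetic 4x4 "depth patch" and two variants of it.
patch = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
shifted = [[v + 10 for v in row] for row in patch]   # constant depth offset
inverted = [[-v for v in row] for row in patch]      # reversed surface gradient

same = chi_square(lbp_histogram(patch), lbp_histogram(shifted))
different = chi_square(lbp_histogram(patch), lbp_histogram(inverted))
print(same, different)
```

Because each code depends only on comparisons with the centre pixel, adding a constant offset to the patch leaves the histogram unchanged, while reversing the surface gradient changes it.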

Eigenfaces

Eigenfaces is a facial recognition method which captures the variation in a collection of face images and uses that information to compare faces. The captured variation is used to encode subjects’ faces for comparison-based identification.
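
The idea can be sketched with a heavily simplified example: faces are flattened to vectors, the mean face is subtracted, the main direction of variation is found, and a probe face is matched to the gallery face with the closest encoding. The sketch below keeps only one principal direction and finds it by power iteration; practical Eigenfaces systems keep several. The four-pixel "faces", the function names and the fixed start vector are all invented for illustration.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def subtract(a, b):
    return [x - y for x, y in zip(a, b)]

def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors (the mean face)."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def top_component(centred, iterations=50):
    """Leading direction of variation of mean-centred data, by power iteration."""
    dim = len(centred[0])
    v = [float(j + 1) for j in range(dim)]   # arbitrary non-uniform start (assumed)
    for _ in range(iterations):
        # Apply the (unnormalised) covariance matrix: w = sum_i (x_i . v) x_i
        w = [0.0] * dim
        for x in centred:
            c = dot(x, v)
            for j in range(dim):
                w[j] += c * x[j]
        norm = math.sqrt(dot(w, w))
        if norm == 0.0:
            raise ValueError("data has no variation along the start vector")
        v = [wj / norm for wj in w]
    return v

def encode(face, mean, component):
    """Encode a face as its coordinate along the principal direction."""
    return dot(subtract(face, mean), component)

def identify(probe, gallery, mean, component):
    """Return the gallery name whose encoding is closest to the probe's."""
    p = encode(probe, mean, component)
    return min(gallery, key=lambda name: abs(encode(gallery[name], mean, component) - p))

# Invented four-pixel "faces" for three enrolled subjects.
gallery = {"A": [10, 20, 30, 40], "B": [40, 30, 20, 10], "C": [25, 25, 25, 25]}
mean = mean_vector(list(gallery.values()))
component = top_component([subtract(f, mean) for f in gallery.values()])
match = identify([12, 19, 31, 38], gallery, mean, component)
print(match)
```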

Body Feature Extraction

Body features are extracted from the skeletal data that is captured by the Kinect.

Anthropometrics

Anthropometrics is the analysis of body measurements and is used to identify subjects. This is done by calculating body part lengths based on the positions of skeletal joints relative to each other. Static features such as height, arm length, leg length, and shoulder width are calculated using joints extracted from the Kinect.
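
A minimal sketch of this calculation is given below. The joint names, the coordinates (in metres, with the Y axis pointing up) and the particular definition of each feature are assumptions for illustration; they are not taken from the project code or from the Kinect SDK.

```python
import math

def segment_length(joints, a, b):
    """Euclidean distance between two named joints."""
    return math.dist(joints[a], joints[b])

def anthropometric_features(joints):
    """Static body measurements derived from one skeleton frame."""
    return {
        "arm_length": segment_length(joints, "shoulder_left", "elbow_left")
                      + segment_length(joints, "elbow_left", "wrist_left"),
        "leg_length": segment_length(joints, "hip_left", "knee_left")
                      + segment_length(joints, "knee_left", "ankle_left"),
        "shoulder_width": segment_length(joints, "shoulder_left", "shoulder_right"),
        # Head joint to ankle joint along Y: a proxy that underestimates stature.
        "height": joints["head"][1] - joints["ankle_left"][1],
    }

# Invented skeleton frame: joint name -> (x, y, z) in metres.
skeleton = {
    "head": (0.0, 1.70, 2.0),
    "shoulder_left": (-0.20, 1.45, 2.0),
    "shoulder_right": (0.20, 1.45, 2.0),
    "elbow_left": (-0.20, 1.15, 2.0),
    "wrist_left": (-0.20, 0.90, 2.0),
    "hip_left": (-0.10, 0.95, 2.0),
    "knee_left": (-0.10, 0.50, 2.0),
    "ankle_left": (-0.10, 0.08, 2.0),
}
features = anthropometric_features(skeleton)
print(features)
```

Summing segment lengths, rather than measuring shoulder to wrist directly, keeps the arm and leg measurements independent of how far the limb is bent in a given frame.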

Gait Analysis

Gait analysis involves extracting features related to the movement of individuals. Skeletal joints are tracked as a person walks, as in anthropometrics, but here the features depend on the subject’s motion. They include stride length, speed, degree of arm swing, elevation angles of the arms and legs, and average angles of the elbows and knees. These features tend to be distinctive to an individual and so are useful for classification.
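
Some of these motion features can be sketched from tracked joint positions as follows. The feature definitions, the function names, the frame rate of 30 frames per second and the synthetic tracks are assumptions made for the example.

```python
import math

def joint_angle(a, b, c):
    """Angle in degrees at joint b, between the segments b->a and b->c."""
    ba = [p - q for p, q in zip(a, b)]
    bc = [p - q for p, q in zip(c, b)]
    cos_theta = sum(x * y for x, y in zip(ba, bc)) / (math.dist(a, b) * math.dist(c, b))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))

def stride_length(ankle_track, start, end):
    """Distance covered by one ankle between the first and last frame of a gait cycle."""
    return math.dist(ankle_track[start], ankle_track[end])

def walking_speed(hip_track, start, end, fps=30.0):
    """Average speed in metres per second between two frames (fps is assumed)."""
    return math.dist(hip_track[start], hip_track[end]) / ((end - start) / fps)

# Synthetic tracks: the subject walks 0.04 m closer to the sensor each frame.
hip_track = [(0.0, 1.0, 3.0 - 0.04 * i) for i in range(31)]
ankle_track = [(-0.1, 0.08, z) for (_, _, z) in hip_track]

speed = walking_speed(hip_track, 0, 30)
stride = stride_length(ankle_track, 0, 30)
knee = joint_angle((0.0, 1.0, 0.0), (0.0, 0.5, 0.0), (0.0, 0.5, 0.4))  # hip, knee, ankle
print(speed, stride, knee)
```

In practice the start and end frames would come from the gait cycles found during preprocessing, and the angles would be averaged over each cycle.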

Fusion

Fusion combines the face and body recognition results in a structured manner to improve the robustness of the person identification system. This stage merges the face and body features to create a unified person identity classifier, combining the outputs of the individual methods to arrive at a final decision.
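
The page does not specify the fusion rule. One simple possibility, shown here only as an assumption, is score-level fusion: each method produces a similarity score per enrolled subject, the scores are min-max normalised, and a weighted sum decides the identity. The scores, the weight and the function names below are invented for illustration.

```python
def normalise(scores):
    """Min-max normalise a dict of subject -> score into the range [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {subject: 0.5 for subject in scores}
    return {subject: (s - lo) / (hi - lo) for subject, s in scores.items()}

def fuse(face_scores, body_scores, face_weight=0.4):
    """Weighted-sum fusion of two score sets; returns (decision, fused scores)."""
    face = normalise(face_scores)
    body = normalise(body_scores)
    fused = {s: face_weight * face[s] + (1.0 - face_weight) * body[s] for s in face}
    return max(fused, key=fused.get), fused

# Invented similarity scores (higher means more similar) for three subjects.
face_scores = {"A": 0.62, "B": 0.65, "C": 0.20}   # face alone narrowly prefers B
body_scores = {"A": 0.90, "B": 0.40, "C": 0.30}   # body clearly prefers A

decision, fused = fuse(face_scores, body_scores)
print(decision)
```

Here the face scores alone would narrowly pick subject B, but the stronger body evidence outweighs them once the scores are fused. Weighting the body scores more heavily is a design choice for the example, motivated by the page's finding that body features performed better than face features.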

Results

Classification Accuracy vs Number of Subjects per Test

The results show that the best method was gait analysis, which performed better than anthropometrics. Methods based on body features consistently performed better than those based on facial data, of which local feature extraction performed better than Eigenfaces. Despite the large differences in identification accuracy between the methods, fusion improved the overall performance of the system.

Body Features

Anthropometrics

Preliminary testing on a dataset of three subjects showed a 98% classification rate. This was obtained by passing body part lengths to a k-nearest-neighbour (KNN) classifier with k = 1.
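
With k = 1, a KNN classifier assigns a query the label of the single closest training example. The sketch below shows this for body part lengths; the measurements, the subject labels and the function name are invented for illustration and are not the project's data.

```python
import math

def nearest_neighbour(train, query):
    """1-NN: return the label of the training vector closest to the query.

    train is a list of (feature_vector, label) pairs.
    """
    return min(train, key=lambda item: math.dist(item[0], query))[1]

# Invented (arm length, leg length, shoulder width) in metres per subject.
train = [
    ((0.55, 0.87, 0.40), "subject_1"),
    ((0.60, 0.95, 0.44), "subject_2"),
    ((0.50, 0.80, 0.36), "subject_3"),
]
predicted = nearest_neighbour(train, (0.59, 0.94, 0.43))
print(predicted)  # subject_2
```

Because all features here share the same unit and a similar scale, plain Euclidean distance is reasonable; features on different scales would usually be standardised first.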

Conclusion

The aims of the project were achieved, as shown by the following key findings:

  • 3D data can be used to create an automated person identification system
  • Realistic data can be used to accurately identify people
  • Fusion of different methods improves performance

References

  1. J. Han, L. Shao, D. Xu, J. Shotton, “Enhanced computer vision with Microsoft Kinect sensor: A review,” IEEE Transactions on Cybernetics, vol. 43, no. 5, pp. 1318-1334, Oct. 2013
  2. M.R. Andersen, T. Jensen, P. Lisouski, A.K. Mortensen, T. Gregersen, P. Ahrendt, “Kinect depth sensor evaluation for computer vision applications,” Electrical and Computer Engineering Technical Report ECE-TR-6, 2012