Projects:2018s1-100 Automated Person Identification with Multiple Sensors
Project Team
Students
Maxwell Standen
Archana Vadakattu
Michael Vincent
Supervisors
Dr Brian Ng
Dr David Booth (DST Group)
Sau-Yee Yiu (DST Group)
Abstract
This project seeks to develop the capability to identify individuals based on their whole body shape and 3D facial characteristics. This will involve:
1. Determining the most suitable sensor type(s) for the task.
2. Data collection, using the most appropriate sensor(s) selected in (1).
3. Data alignment, and extraction of 3D face and body shape features.
4. The development of a recognition and fusion capability using the extracted 3D face and body shape features.
5. Performance assessment.
The project will involve a literature survey (covering both sensor hardware and algorithmic techniques), software development (ideally in Matlab) and performance assessment, possibly by means of a small trial. This project is sponsored by DST Group.
Introduction
Automated person identification is a vital capability for modern security systems. Typical security environments feature uncooperative subjects, so robust methods are required. The current challenge in person identification is to identify uncooperative or semi-cooperative subjects, such as those who don't interact with the identification system directly. Soft biometrics are identifying characteristics that can be gathered without the cooperation of the subject, such as the way a person walks, the lengths of their limbs or the shape of their face. This differs from hard biometrics, which requires a cooperative subject and involves data such as fingerprints or DNA.
The overall project will develop the capability to identify individuals based on their whole body shape and 3D facial characteristics. Our aims are:
- Create an automated person identification system using 3D data
- Use realistic data that is extendable to real-world applications
- Fuse different methods to improve reliability of identification
The main objective is to implement a system that can recognise a person using soft biometric data, given image sequences captured with a Microsoft Kinect, which includes both depth and RGB cameras. From this objective, the project is divided into three main stages: investigating preprocessing methods for the Kinect data, implementing face and body recognition techniques, and fusing the techniques to improve the robustness of the system.
Background
This project is part of a larger collaboration between the University of Adelaide and DST Group on biometric data fusion.
One of the sensors is a Microsoft Kinect, which is used to acquire depth videos from a frontal perspective. The Kinect is a motion sensing device released by Microsoft Corporation in 2010. It was developed as an input device for the Xbox 360 gaming console and consequently provides capabilities such as face and body detection and tracking [1]. The device also performs cooperative person recognition for its gaming applications (e.g. user authentication and game controls).
A Kinect device features a depth sensor and an RGB camera, which it uses to recognise the positions in space of different body parts for up to six people in a single frame. The depth sensor uses a CMOS camera to capture an image of a projected IR dot pattern [2]. The depth at each pixel is calculated by the Kinect's image processor from the relative positions of dots in the pattern. The individual video frames captured by the Kinect can be viewed either as a "depth image" or as a point cloud, which shows the Kinect depth data after conversion to real-world (XYZ) coordinates.
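As an illustration of the depth-image-to-point-cloud conversion, here is a minimal sketch using the standard pinhole back-projection (in Python/NumPy rather than the project's Matlab environment; the camera intrinsics are typical placeholder values, not calibrated values from the project's sensor):

```python
import numpy as np

def depth_to_point_cloud(depth_mm, fx=525.0, fy=525.0, cx=319.5, cy=239.5):
    """Back-project a depth image (in millimetres) to XYZ points (in metres)
    using the pinhole camera model. Intrinsics are illustrative placeholders."""
    h, w = depth_mm.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))   # pixel coordinates
    z = depth_mm.astype(np.float64) / 1000.0         # mm -> metres
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]                  # drop zero-depth (invalid) pixels

# Example: a synthetic 480x640 frame at a constant depth of 2 m.
cloud = depth_to_point_cloud(np.full((480, 640), 2000, dtype=np.uint16))
```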
Recognition is performed using a machine learning pipeline, sketched after this list, which involves:
- Acquiring data from a subject
- Feature extraction to find identifying characteristics
- Classification to identify the subject
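A minimal sketch of this pipeline, assuming per-frame feature averaging and nearest-neighbour matching (the classifier reported later in the Results section); the function bodies are placeholders for the concrete techniques described under Methodology:

```python
import numpy as np

def extract_features(frames):
    """Placeholder: reduce a sequence of per-frame measurements to a single
    feature vector, e.g. by averaging across the frames of a gait cycle."""
    return np.mean(frames, axis=0).ravel()

def classify(feature, gallery_features, gallery_labels):
    """Placeholder: nearest-neighbour matching against enrolled subjects."""
    distances = np.linalg.norm(gallery_features - feature, axis=1)
    return gallery_labels[int(np.argmin(distances))]
```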
Fusion combines the outputs of several biometric recognisers into a single identification decision. In biometric systems, fusion can be performed at the feature, score or decision level; combining complementary modalities such as face and body shape can improve robustness when any single modality is unreliable.
Methodology
1. Acquire data
2. Preprocess data to improve quality
3. Extract features from preprocessed data
4. Use extracted features to create a classifier
Data Acquisition
Data is acquired using the Microsoft Kinect; RGB, depth and skeletal data are all captured.
Preprocessing
Preprocessing is used to enhance the quality of the captured data and to facilitate more accurate extraction of the features present in it. Depth videos are segmented into sequences of frames by extracting gait cycles, which allows for feature averaging and provides frame sequences suitable for extracting the features used in the project. Faces are also cropped for use in the facial recognition algorithms.
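The report does not spell out the segmentation rule, but one common approach is to track the separation between the two ankle joints, whose local maxima mark double-support instants; a hedged sketch under that assumption:

```python
import numpy as np

def gait_cycle_boundaries(left_ankle, right_ankle):
    """left_ankle, right_ankle: (n_frames, 3) joint positions in metres.
    Ankle separation peaks at double support; alternate peaks delimit one
    full gait cycle (same foot leading)."""
    sep = np.linalg.norm(left_ankle - right_ankle, axis=1)
    peaks = [i for i in range(1, len(sep) - 1)
             if sep[i] >= sep[i - 1] and sep[i] > sep[i + 1]]
    return peaks[::2]   # every second double-support peak starts a new cycle
```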
Facial Recognition
Facial features are extracted from the depth data.
Local Feature Extraction
Local feature extraction analyses local patterns and textures to determine the similarity between images. Instead of taking a global approach, this method extracts textures and patterns from small regions of a subject's face. The technique is resilient to changes in rotation and resolution.
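The section does not name the specific descriptor used, so purely as an illustration, here is a basic 3x3 local binary pattern (LBP), one widely used local texture feature; faces are then compared via histograms of the resulting codes:

```python
import numpy as np

def lbp_image(gray):
    """gray: 2-D intensity array. Returns 8-bit LBP codes for interior pixels:
    each pixel's 8 neighbours are thresholded against the centre value."""
    c = gray[1:-1, 1:-1]
    neighbours = [gray[:-2, :-2], gray[:-2, 1:-1], gray[:-2, 2:],
                  gray[1:-1, 2:], gray[2:, 2:], gray[2:, 1:-1],
                  gray[2:, :-2], gray[1:-1, :-2]]
    codes = np.zeros_like(c, dtype=np.uint8)
    for bit, n in enumerate(neighbours):
        codes |= (n >= c).astype(np.uint8) << bit
    return codes

def lbp_histogram(gray):
    """Normalised 256-bin histogram of LBP codes, used as the face feature."""
    hist, _ = np.histogram(lbp_image(gray), bins=256, range=(0, 256))
    return hist / hist.sum()
```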
Eigenfaces
Eigenfaces is a facial recognition method that captures the principal modes of variation in a collection of face images and uses them to encode each subject's face as a small set of coefficients, enabling comparison-based identification.
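A minimal Eigenfaces sketch, assuming faces have been cropped to a common size and flattened into row vectors; PCA is computed via an SVD of the mean-centred training set, and identification is nearest neighbour in the projected subspace (the number of components is illustrative):

```python
import numpy as np

def train_eigenfaces(faces, n_components=20):
    """faces: (n_samples, n_pixels). Returns (mean face, eigenface basis)."""
    mean = faces.mean(axis=0)
    _, _, vt = np.linalg.svd(faces - mean, full_matrices=False)
    return mean, vt[:n_components]           # rows of vt are the eigenfaces

def project(face, mean, basis):
    return basis @ (face - mean)              # coefficients in face space

def identify(face, mean, basis, gallery, labels):
    """gallery: projected training faces; labels: their subject IDs."""
    coeffs = project(face, mean, basis)
    distances = np.linalg.norm(gallery - coeffs, axis=1)
    return labels[int(np.argmin(distances))]
```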
Body Feature Extraction
Body features are extracted from the skeletal data that is captured by the Kinect.
Anthropometrics
Anthropometrics is the analysis of body measurements, which can be used to identify subjects. Body part lengths are calculated from the positions of skeletal joints relative to one another. Static features such as height, arm length, leg length and shoulder width are computed from the joints extracted by the Kinect.
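An illustrative computation of such features from a Kinect skeleton; the joint names follow the Kinect SDK convention, and the exact feature set is an assumption based on the prose above (e.g. height is approximated here as the straight-line head-to-foot distance):

```python
import numpy as np

def limb_length(joints, a, b):
    """joints: dict mapping joint name -> (x, y, z) in metres; returns |a - b|."""
    return float(np.linalg.norm(np.subtract(joints[a], joints[b])))

def anthropometric_features(joints):
    return {
        "height":         limb_length(joints, "Head", "FootLeft"),
        "arm_length":     limb_length(joints, "ShoulderLeft", "ElbowLeft")
                          + limb_length(joints, "ElbowLeft", "WristLeft"),
        "leg_length":     limb_length(joints, "HipLeft", "KneeLeft")
                          + limb_length(joints, "KneeLeft", "AnkleLeft"),
        "shoulder_width": limb_length(joints, "ShoulderLeft", "ShoulderRight"),
    }
```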
Gait Analysis
Gait analysis involves extracting features related to the movement of individuals. Skeletal joints are tracked as a person walks; unlike the static anthropometric features, these features depend on the subject's motion. They include stride length, walking speed, elevation angles of the arms and legs, and the average angles of the elbows and knees. Such features tend to be highly individual and are therefore useful for classification.
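An illustrative computation of a few of these motion features over one segmented gait cycle, assuming per-frame joint positions in metres and a known frame rate (the 30 fps default is an assumption):

```python
import numpy as np

def stride_length(ankle_positions):
    """ankle_positions: (n_frames, 3) for one ankle over one gait cycle."""
    return float(np.linalg.norm(ankle_positions[-1] - ankle_positions[0]))

def walking_speed(hip_positions, fps=30.0):
    """Average speed: distance travelled by the hip divided by elapsed time."""
    dist = float(np.linalg.norm(hip_positions[-1] - hip_positions[0]))
    return dist * fps / (len(hip_positions) - 1)

def mean_knee_angle(hips, knees, ankles):
    """Mean angle (degrees) at the knee, from hip/knee/ankle trajectories."""
    thigh = hips - knees
    shank = ankles - knees
    cosang = np.sum(thigh * shank, axis=1) / (
        np.linalg.norm(thigh, axis=1) * np.linalg.norm(shank, axis=1))
    return float(np.degrees(np.arccos(np.clip(cosang, -1.0, 1.0))).mean())
```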
Fusion
The face and body recognition results are combined in a structured manner to improve the robustness of the person identification system. This stage fuses the outputs of the individual face and body classifiers to arrive at a single, unified identification decision.
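The fusion rule itself is not specified here, so the sketch below shows one standard option: weighted score-level fusion, in which each classifier's per-subject scores are min-max normalised and combined before taking the final decision (the weights and scores are placeholders):

```python
import numpy as np

def fuse_scores(score_lists, weights):
    """score_lists: list of (n_subjects,) score arrays, one per classifier
    (higher = more likely match). Returns the index of the fused best match."""
    fused = np.zeros_like(score_lists[0], dtype=np.float64)
    for scores, w in zip(score_lists, weights):
        s = np.asarray(scores, dtype=np.float64)
        rng = s.max() - s.min()
        fused += w * ((s - s.min()) / rng if rng > 0 else s * 0)   # min-max normalise
    return int(np.argmax(fused))

# Example: gait scores weighted more heavily than face scores.
best = fuse_scores([np.array([0.2, 0.9, 0.4]),    # gait classifier
                    np.array([0.5, 0.6, 0.3])],   # face classifier
                   weights=[0.7, 0.3])
```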
Results
The results show that gait analysis was the best-performing method, outperforming anthropometrics. Body-based methods consistently performed better than those using facial data, among which local feature extraction outperformed Eigenfaces. Despite the large differences in identification accuracy between the individual methods, fusion improved the overall performance of the system.
Body Features
Anthropometrics
Preliminary testing with a dataset of three subjects showed a 98% classification rate. This was achieved by inputting body part lengths into a k-nearest-neighbour (KNN) classifier with k = 1.
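A sketch of this classification step using scikit-learn; the feature values are placeholders, where in the project each row would hold one subject's computed body-part lengths:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Placeholder anthropometric features (metres): height, arm, leg, shoulder width.
X_train = np.array([[1.75, 0.60, 0.85, 0.42],    # subject A
                    [1.62, 0.55, 0.78, 0.38],    # subject B
                    [1.81, 0.63, 0.90, 0.45]])   # subject C
y_train = ["A", "B", "C"]

clf = KNeighborsClassifier(n_neighbors=1).fit(X_train, y_train)
print(clf.predict([[1.74, 0.59, 0.86, 0.43]]))   # -> ['A']
```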
Conclusion
The aims of the project were achieved, as shown by the following key findings:
- 3D data can be used to create an automated person identification system
- Realistic data can be used to accurately identify people
- Fusion of different methods improves performance
References
- J. Han, L. Shao, D. Xu and J. Shotton, "Enhanced computer vision with Microsoft Kinect sensor: A review," IEEE Transactions on Cybernetics, vol. 43, no. 5, pp. 1318-1334, Oct. 2013.
- M. R. Andersen, T. Jensen, P. Lisouski, A. K. Mortensen, T. Gregersen and P. Ahrendt, "Kinect depth sensor evaluation for computer vision applications," Electrical and Computer Engineering Technical Report ECE-TR-6, 2012.