Face Recognition Based on PCA, LBP and SVM Techniques

Although many methods have achieved good success in face recognition systems

Face recognition has attracted tremendous attention over the past few decades, and many well-known techniques have been developed in that time [1]. The face is the primary focus of attention in society and plays a major role in conveying identity and emotion [2]. Face recognition methods fall into three main groups: structural matching methods based on facial characteristics, whole matching methods, and combination methods.
Structural matching methods represent the face by its geometric characteristics, such as the location, size, and relations of the eyes, nose, chin and so on. Whole matching methods use the gray image of the whole face as input to train and test the classifier; examples include wavelet-based Elastic Matching and principal component analysis (PCA). Combination methods combine the two former approaches: the overall characteristics are usually used for a preliminary identification, and local features are then used for further identification [3].
The single sample per person problem is a major challenge and an obstacle to many real-world applications, such as e-passports and watch-list screening, because in these scenarios it is generally impossible to collect more than one sample per person. Many methods have recently been developed to solve the single sample per person problem [4]. The most representative techniques used for face recognition are PCA, Independent Component Analysis (ICA), and Fisher's Linear Discriminant Analysis (LDA) [5]. Among face recognition techniques, PCA is an effective feature extraction method that treats the face as a global feature. It reduces the dimension of the image effectively while retaining the primary information [6]. However, PCA is not ideal for classification, mainly because it retains unwanted variations caused by lighting and facial expression.
LBP-based facial image analysis has been one of the most popular and successful applications of LBP in recent years [7]. The Local Binary Pattern (LBP) was originally proposed by Ojala for texture classification and was later extended to various fields, including face recognition, face detection, and facial expression recognition. The LBP method is computationally simple, and its rotation-invariant variants are robust to image rotation. Adaptive smoothing has also been presented for face image normalization under varying illumination: the illumination is estimated by iteratively convolving the input image with a 3-by-3 averaging kernel weighted by a simple measure of the illumination discontinuity at each pixel [8].
SVMs provide efficient and powerful classification algorithms that can deal with high-dimensional input features, with theoretical bounds on the generalization error and on the sparseness of the solution provided by statistical learning theory [9]. The SVM method is based on the principle of the maximal margin. Intuitively, given two classes of vectors, the aim of an SVM is to find a hyper-plane that separates the two classes so that the distance from the hyper-plane to the closest vectors of both classes is maximized. This hyper-plane is known as the optimal separating hyper-plane. SVMs excel at two-class recognition problems and outperform many other linear and nonlinear classifiers [10].
This paper is organized as follows. Section II describes the structure of the face recognition system and its parts. Section III covers the experimental results and analysis. Finally, concluding comments are presented in Section IV.

Structure of the Face Recognition System:
To clarify how a local feature extraction method such as LBP and a dimensionality reduction technique such as PCA are employed in facial expression recognition tasks, Fig. 1 shows the basic structure of the proposed facial expression recognition system, which is based on local feature extraction and dimensionality reduction. As shown in Fig. 1, the proposed system consists of four main components: image preprocessing, feature extraction, dimensionality reduction, and facial expression recognition. In the feature extraction stage, the original facial images from the Yale facial expression database are divided into two parts: training data and testing data.
The corresponding LBP features are extracted for the training data and the testing data. The result of this stage is the extracted facial feature data, represented by a set of high-dimensional LBP features. The second stage reduces the size of the LBP features and generates new low-dimensional embedded features using PCA. To map the testing data, the low-dimensional embedding must first be learnt from the training data; the testing data are then mapped with the out-of-sample extension of the dimensionality reduction method. Because methods such as PCA are linear mappings, their out-of-sample extension is straightforward: the testing data are multiplied by the linear mapping matrix. In the last stage, the trained SVM classifier operates in the low-dimensional embedded feature space to predict the facial expression categories of the testing data, and the recognition results are reported. The stages of the system are elaborated as follows.
Figure 1: Basic structure of the proposed system.

Image Preprocessing:
The basic goal of image preprocessing is to process the input image so that its visual information can be viewed and assessed with greater clarity, through noise removal, sharpening of image edges, and so on. In this stage, contrast stretching and median filtering are used to achieve this goal.
• Contrast stretching: Also known as normalization. It determines the minimum and maximum pixel values of the input image and stretches the range of pixel intensities so that the output image occupies a larger dynamic range [11].
• Median filtering: One of the main uses of filtering in image preprocessing is noise removal. Median filters are used to eliminate noise, especially isolated noise spikes (such as 'salt and pepper' noise), whilst preserving sharp high-frequency detail [11].
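The two preprocessing steps can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names `contrast_stretch` and `median_filter3`, the 3-by-3 window, the 0 to 255 output range, the border handling, and the order of the two steps (filtering first, so that a noise spike does not dominate the stretch) are all assumptions, since the text does not specify them.

```python
from statistics import median

def contrast_stretch(img, out_min=0, out_max=255):
    """Linearly map [min(img), max(img)] onto [out_min, out_max]."""
    lo = min(min(row) for row in img)
    hi = max(max(row) for row in img)
    if hi == lo:                      # flat image: nothing to stretch
        return [row[:] for row in img]
    scale = (out_max - out_min) / (hi - lo)
    return [[round((p - lo) * scale + out_min) for p in row] for row in img]

def median_filter3(img):
    """3x3 median filter; border pixels use only the neighbours that exist
    (an assumed border policy)."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = int(median(window))
    return out

# Toy 4x4 gray image with one isolated "salt" spike (255) at row 1, column 1.
img = [[100, 102, 101, 103],
       [101, 255, 102, 100],
       [102, 101, 100, 102],
       [103, 100, 101, 104]]

filtered = median_filter3(img)          # the spike is replaced by a local median
stretched = contrast_stretch(filtered)  # remaining range is spread over 0..255
```

In this toy image the median filter replaces the spike with a value close to its neighbours, after which the stretch can use the full output range for the genuine image content.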

Feature Extraction:
A combination of LBP and PCA is used to extract distinctive information from a face image and to reduce its dimension. LBP is one of the most popular and successful methods; its most important properties are its tolerance to monotonic illumination changes and its computational simplicity. PCA is a classical method that is widely used in face recognition to extract features and reduce dimensionality. However, PCA seeks the directions of greatest variation in the face image, so it is sensitive to changes in illumination, and its features may not contain the information that is most effective for classification. PCA is therefore combined with LBP to avoid this problem.
• Local Binary Pattern: The local binary pattern operator compares each pixel with its eight surrounding pixels to generate a binary code. Each neighbor is encoded as "1" if its value is greater than or equal to the center pixel value, and as "0" otherwise. A binary number for each pixel is formed by concatenating these binary values in a clockwise direction, and the corresponding decimal value is used to label the pixel. For a given pixel position $(x_c, y_c)$, the decimal value of the resulting LBP code is given by equation (1):
$$LBP_{P,R}(x_c, y_c) = \sum_{p=0}^{P-1} s(i_p - i_c)\, 2^p \qquad (1)$$
where $i_c$ is the gray-level value of the central pixel $(x_c, y_c)$ and $i_p$ ($p = 0, \dots, P-1$) are the gray values of the $P$ surrounding pixels in the circular neighborhood of radius $R$. The function $s(x)$ is defined as follows:
$$s(x) = \begin{cases} 1, & x \ge 0 \\ 0, & x < 0 \end{cases} \qquad (2)$$

Finally, the concept of uniform patterns is used to reduce the number of possible histogram bins. According to this concept, an LBP pattern is called uniform if its binary pattern, traversed circularly, contains at most two bitwise transitions from "0" to "1" or vice versa [12].
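The LBP operator and the uniform-pattern test can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: it uses the basic 3-by-3 neighbourhood (P = 8, R = 1), starts the clockwise traversal at the top-left neighbour (the starting point is not given in the text), and collects all non-uniform codes in a single extra histogram bin. The names `lbp_code`, `is_uniform`, and `uniform_lbp_histogram` are hypothetical.

```python
def lbp_code(img, y, x):
    """8-neighbour LBP code of the interior pixel at row y, column x."""
    center = img[y][x]
    # Neighbours visited clockwise, starting at the top-left corner (assumed).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for p, (dy, dx) in enumerate(offsets):
        if img[y + dy][x + dx] >= center:   # s(i_p - i_c) = 1
            code |= 1 << p                  # weight 2^p
    return code

def is_uniform(code, bits=8):
    """True if the circular bit string has at most two 0/1 transitions."""
    rotated = (code >> 1) | ((code & 1) << (bits - 1))  # rotate right by one
    transitions = bin(code ^ rotated).count("1")
    return transitions <= 2

def uniform_lbp_histogram(img):
    """Histogram with one bin per uniform code plus one shared bin
    for all non-uniform codes."""
    uniform_codes = [c for c in range(256) if is_uniform(c)]
    bin_of = {c: k for k, c in enumerate(uniform_codes)}
    hist = [0] * (len(uniform_codes) + 1)
    for y in range(1, len(img) - 1):
        for x in range(1, len(img[0]) - 1):
            hist[bin_of.get(lbp_code(img, y, x), len(uniform_codes))] += 1
    return hist

# Worked example: centre value 6 in a 3x3 patch.
patch = [[6, 5, 2],
         [7, 6, 1],
         [9, 8, 7]]
code = lbp_code(patch, 1, 1)   # bits (p = 0..7): 1,0,0,0,1,1,1,1
hist = uniform_lbp_histogram(patch)
```

For 8 neighbours there are 58 uniform codes, so the histogram has 59 bins instead of 256, which is the bin reduction the uniform-pattern concept provides.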

• Principal Component Analysis:
Let a face image $I(x, y)$, represented as a two-dimensional $N \times N$ array of intensity values, be converted to a vector of length $N^2$. The training set for the PCA algorithm can then be created. There are $M$ face images, each image has $N^2$ pixels, and each image is treated as a vector. The mean of the face images is calculated according to equation (3):
$$\Psi = \frac{1}{M} \sum_{i=1}^{M} \Gamma_i \qquad (3)$$
where $\Gamma_i$ is the $i$-th face image in the training set. The training set is normalized to zero mean, which gives the difference between each original face image and the mean face image, by equation (4):
$$\Phi_i = \Gamma_i - \Psi \qquad (4)$$
The covariance matrix $C$ of the training set is calculated by equation (5):
$$C = \frac{1}{M} \sum_{i=1}^{M} \Phi_i \Phi_i^T \qquad (5)$$
A non-zero vector $u_k$ is an eigenvector of the covariance matrix $C$ if it satisfies the condition:
$$C u_k = \lambda_k u_k \qquad (6)$$
where $\lambda_k$ are the corresponding eigenvalues. The matrix $C$ has dimension $N^2 \times N^2$, which is very large and demands considerable computation time and memory; a computationally smaller matrix $L$ is therefore used instead of the original covariance matrix $C$. The matrix $C$ can be decomposed according to equation (7):
$$C = \frac{1}{M} A A^T \qquad (7)$$
where $A = [\Phi_1, \Phi_2, \dots, \Phi_M]$ is the matrix of zero-mean images. In the same way, the eigenvectors $v_i$ of the smaller $M \times M$ matrix $L = A^T A$, with eigenvalues $\mu_i$, satisfy:
$$A^T A\, v_i = \mu_i v_i \qquad (8)$$

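Multiplying both sides of equation (8), the eigenvalue problem $A^T A\, v_i = \mu_i v_i$ of the smaller matrix, by $A$ gives
$$A A^T (A v_i) = \mu_i (A v_i)$$
so the vectors $u_i = A v_i$ are eigenvectors of $A A^T$ and hence of the covariance matrix $C$. These eigenvectors $u_i$ are also known as eigenfaces, because when each one is converted from a vector of length $N^2$ back to a two-dimensional $N \times N$ array and displayed, the resulting image looks like a face [13].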
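The small-matrix computation of the eigenfaces, together with the projection of new (out-of-sample) data, can be sketched with NumPy as follows. This is a minimal sketch under stated assumptions rather than the authors' implementation: the names `fit_eigenfaces` and `project` are hypothetical, the random array stands in for real flattened face images (or LBP feature vectors), and the eigenfaces are normalized to unit length, a common convention that the text does not state.

```python
import numpy as np

def fit_eigenfaces(X, k):
    """X: (M, D) array with one flattened training image per row.
    Returns the mean face and a (D, k) matrix whose columns are eigenfaces."""
    mean = X.mean(axis=0)               # mean face
    A = (X - mean).T                    # (D, M): columns are zero-mean faces
    L = A.T @ A                         # small (M, M) matrix instead of (D, D)
    mu, V = np.linalg.eigh(L)           # symmetric solver, ascending eigenvalues
    order = np.argsort(mu)[::-1][:k]    # indices of the k largest eigenvalues
    U = A @ V[:, order]                 # u_i = A v_i
    U /= np.linalg.norm(U, axis=0)      # unit-length eigenfaces (assumed convention)
    return mean, U

def project(X, mean, U):
    """Map images into the k-dimensional eigenface space; the same linear
    mapping is applied to training and testing data."""
    return (X - mean) @ U

# Stand-in data: M = 6 "images" of 8x8 = 64 pixels each (random, for illustration).
rng = np.random.default_rng(0)
X_train = rng.random((6, 64))
mean, U = fit_eigenfaces(X_train, k=3)
W_train = project(X_train, mean, U)             # (6, 3) training features
W_test = project(rng.random((2, 64)), mean, U)  # (2, 3) out-of-sample features
```

With M much smaller than the number of pixels, the eigen-decomposition is carried out on a 6-by-6 matrix here instead of a 64-by-64 one; for real images the saving is far larger. At most M - 1 eigenfaces carry non-zero variance, so k should not exceed that.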

Classification
Classification is one of the most important stages of any recognition system. It assigns each sample to its class. SVM is a classification method that has proved its robustness in many applications. It aims to find the best separating hyper-plane between features that belong to different classes.
• Support Vector Machine: This section introduces the basic idea of the SVM. Consider $N$ points that belong to two different classes, as expressed in equation (12):
$$\{(x_i, y_i)\}_{i=1}^{N}, \quad x_i \in \mathbb{R}^n, \quad y_i \in \{-1, +1\} \qquad (12)$$
where $x_i$ is an $n$-dimensional vector and $y_i$ is the class label of that vector. The SVM separates the points of the two classes with a hyper-plane. The linear discriminant function $f(x)$ is defined as follows:
$$f(x) = w \cdot x + b \qquad (13)$$
where $w$ is the weight vector normal to the hyper-plane and $b$ is the bias.

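The SVM seeks the optimal separating hyper-plane between the two classes. The decision function of the optimal separating hyper-plane, according to the Lagrangian formulation, is given by equation (14):
$$f(x) = \sum_{i=1}^{m} \alpha_i y_i \,(x_{si} \cdot x) + b \qquad (14)$$
where $x_{si}$ are the support vectors, $m$ is the number of support vectors, $y_i$ is here the class label of the $i$-th support vector, $b$ is the bias, and $\alpha_i$ are the corresponding Lagrange multipliers. Any vector $x$ can now be classified by the sign of $f(x)$.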
The sign of $f(x)$ can be used for linear classification. To extend this function to nonlinear problems, the input space is mapped into a high-dimensional space, $x \to \Phi(x)$, and the dot product of the two mapped vectors is rewritten as a kernel function, $K(x, y) = \Phi(x) \cdot \Phi(y)$. The decision function for nonlinear classification is then written as follows:
$$f(x) = \sum_{i=1}^{m} \alpha_i y_i \, K(x_{si}, x) + b$$
Many types of kernel function can give good separation, such as the linear, polynomial, Multi-Layer Perceptron (MLP) and Radial Basis Function (RBF) kernels [14].
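The kernel decision function can be illustrated with a short Python sketch. The kernel definitions below are the commonly used textbook forms and are an assumption: the exact parameterizations used in the paper did not survive in the text. The parameter names (`degree`, `coef0`, `gamma`, `kappa`), their default values, and the hand-picked two-point model are hypothetical and serve only to show how the sign of f(x) classifies a vector; no training is performed.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

# Common textbook kernel forms (assumed; parameter values are illustrative).
def linear_kernel(a, b):
    return dot(a, b)

def polynomial_kernel(a, b, degree=2, coef0=1.0):
    return (dot(a, b) + coef0) ** degree

def rbf_kernel(a, b, gamma=0.5):
    return math.exp(-gamma * sum((p - q) ** 2 for p, q in zip(a, b)))

def mlp_kernel(a, b, kappa=1.0, coef0=-1.0):
    return math.tanh(kappa * dot(a, b) + coef0)

def decision(x, support_vectors, labels, alphas, bias, kernel):
    """f(x) = sum_i alpha_i * y_i * K(x_si, x) + b"""
    return sum(a * y * kernel(sv, x)
               for a, y, sv in zip(alphas, labels, support_vectors)) + bias

def classify(x, support_vectors, labels, alphas, bias, kernel=linear_kernel):
    """Class label given by the sign of f(x)."""
    return 1 if decision(x, support_vectors, labels, alphas, bias, kernel) >= 0 else -1

# Hand-picked toy model: two support vectors, one per class. With the linear
# kernel these multipliers give w = (0.5, 0.5) and b = 0, so f = +1 and -1
# exactly at the two support vectors.
svs = [(1.0, 1.0), (-1.0, -1.0)]
labels = [1, -1]
alphas = [0.25, 0.25]
bias = 0.0

label_a = classify((2.0, 0.5), svs, labels, alphas, bias)                      # linear kernel
label_b = classify((-3.0, 0.0), svs, labels, alphas, bias, kernel=rbf_kernel)  # RBF kernel
```

Swapping the `kernel` argument changes only how similarity to the support vectors is measured; the form of the decision function stays the same, which is what makes it straightforward to compare kernels experimentally.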

CONCLUSION
Traditional PCA cannot deal well with the single sample per person problem or with monotonic illumination changes, and a new method is proposed to solve these problems. In the proposed method, LBP and PCA are used for feature extraction and dimensionality reduction, and an SVM classifier is used for classification. Two experiments on the Yale face database are used to choose the kernel function of the SVM classifier and to validate the proposed method. The experimental results demonstrate the ability of the proposed method to overcome these problems: its accuracy rate reaches 98%, in comparison with traditional PCA.