Image Classification Based on Hybrid Compression System

Due to the fast development of internet technologies, multimedia archives are growing rapidly; digital image libraries in particular represent an increasingly important volume of information. It is therefore judicious to develop powerful computer systems to browse, handle, index, classify, and recognize images in a database. In this paper a new image classification algorithm is proposed. The paper presents an efficient content-based image indexing technique for searching similar images using the Daubechies wavelet combined with the discrete cosine transform. The aim of this work is to realize image classification using a hybrid compression system. The images are classified into 10 classes.


INTRODUCTION
The current improvement in digital storage media, image-capturing devices such as scanners, web cameras, and digital cameras, and the rapid development of the internet provide a huge collection of images. This leads to the need to retrieve visual information from these images efficiently and effectively in different fields such as medicine, art, architecture, education, and crime prevention. To achieve this purpose many image retrieval systems have been developed [2]. A successful image classification improves the performance of such retrieval systems.

Daubechies Wavelets Transform (DAWT)
Ingrid Daubechies, one of the brightest stars in the world of wavelet research, invented what are called compactly supported orthonormal wavelets, thus making discrete wavelet analysis practicable. The names of the Daubechies family wavelets are written dbN, where N is the order and db the "surname" of the wavelet [11].

Image Decomposition
The wavelet filters for sub-band decomposition derived from Daubechies wavelets have non-linear phase; for this reason they are rarely used in image processing applications such as denoising and compression. However, the high-pass and low-pass filters derived from the Daubechies mother wavelet can be used in dyadic sub-band image decomposition. The coefficients of the lowest frequency band are grouped in the upper left corner, while the coefficients of the higher frequency bands occupy the other three corners of the image. To obtain the information contained in the images, sub-level signal decomposition is performed to separate the signal characteristics and analyze them independently [13]. Figure (1) shows the transformation of the image into sub-images. The approximation image is the sub-image composed of the low-frequency parts in both the row and column directions (LL), and the detail images are the remaining three images, containing high-frequency components (LH, HL, and HH).
The sub-band LL represents the approximation of the original signal, while the sub-bands LH, HL, and HH represent the details [14].
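As an illustration, the sketch below performs one level of this decomposition. It assumes Python with NumPy and the PyWavelets package (pywt.dwt2 and the 'db2' wavelet name are that library's API; the rest is illustrative):

```python
import numpy as np
import pywt  # PyWavelets

# Placeholder for a loaded 256x256 grayscale image.
image = np.random.rand(256, 256)

# One level of 2-D Daubechies decomposition. PyWavelets returns the
# approximation and the (horizontal, vertical, diagonal) details,
# corresponding to LL and (LH, HL, HH) here.
LL, (LH, HL, HH) = pywt.dwt2(image, 'db2')

print(LL.shape, LH.shape, HL.shape, HH.shape)  # each roughly half the size
```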

Figure (1): Decomposition of an image into the LL, LH, HL, and HH sub-images.

Discrete Cosine Transform
The discrete cosine transform is a lossy compression algorithm that discards those frequencies which do not affect the image as the human eye perceives it [8]. It minimizes the amount of visible blocking artifacts compared to other transforms and provides a good compromise between information-packing ability and computational complexity [1]. The two-dimensional DCT can be written in terms of the pixel values f(i, j), for i, j = 0, 1, ..., N−1, and the frequency-domain transform coefficients F(u, v) [3]:

$$F(u,v) = C(u)\,C(v)\sum_{i=0}^{N-1}\sum_{j=0}^{N-1} f(i,j)\cos\!\left[\frac{(2i+1)u\pi}{2N}\right]\cos\!\left[\frac{(2j+1)v\pi}{2N}\right] \qquad (1)$$

for u, v = 0, 1, 2, ..., N−1. Similarly, the inverse transformation is defined as

$$f(i,j) = \sum_{u=0}^{N-1}\sum_{v=0}^{N-1} C(u)\,C(v)\,F(u,v)\cos\!\left[\frac{(2i+1)u\pi}{2N}\right]\cos\!\left[\frac{(2j+1)v\pi}{2N}\right] \qquad (2)$$

for i, j = 0, 1, 2, ..., N−1. In both equations (1) and (2), C(x) is defined as

$$C(x) = \begin{cases}\sqrt{1/N} & x = 0 \\ \sqrt{2/N} & x > 0.\end{cases}$$
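As a sanity check on equations (1) and (2), here is a minimal sketch (assuming Python with NumPy and SciPy; scipy.fft.dctn is SciPy's n-dimensional DCT, everything else is illustrative) that compares a direct implementation of equation (1) against the library routine:

```python
import numpy as np
from scipy.fft import dctn

def dct2_direct(f):
    """2-D DCT computed directly from equation (1)."""
    N = f.shape[0]
    i = np.arange(N)
    # C(x): sqrt(1/N) for x = 0, sqrt(2/N) otherwise.
    C = np.full(N, np.sqrt(2.0 / N))
    C[0] = np.sqrt(1.0 / N)
    # Cosine basis B[u, n] = cos[(2n+1) u pi / (2N)].
    basis = np.cos((2 * i[None, :] + 1) * i[:, None] * np.pi / (2 * N))
    return np.outer(C, C) * (basis @ f @ basis.T)

f = np.random.rand(8, 8)                      # one 8x8 image block
F = dct2_direct(f)
assert np.allclose(F, dctn(f, norm='ortho'))  # matches SciPy's orthonormal DCT-II
```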

Image Feature Extraction
Feature extraction involves simplifying the amount of resources required to describe a large set of data accurately [23]. Every image is characterized by a set of features such as texture, color, and shape; these features are extracted when a new image is added to the image database. They are then summarized in a reduced set of k indexes and stored in an image feature database. The query image is processed in the same way as the images in the database, and matching is carried out on the feature database [17]. Nine features are exploited in our proposed algorithm.

Color Feature
The color feature is the most significant one when searching collections of color images of arbitrary subject matter. Color plays a very important role in the human visual perception mechanism; besides that, image color is easy to analyze, and it is invariant with respect to the size of the image and the orientation of objects in it. The simplest and most frequently used way to represent color is the color histogram [17].

Mean
The mean tells something about the general brightness of the image: a bright image has a high mean while a dark image has a low mean. It provides the average gray value of the image [19]. It is calculated using the following statistic:

$$\bar{g} = \sum_{g=0}^{L-1} g\,P(g) = \frac{1}{M \times N}\sum_{r}\sum_{c} I(r,c) \qquad (12)$$

where L is the gray-level range, such as [0, 1], [0, 7], or [0, 255], P(g) is the histogram probability of gray level g, M × N is the image size, r is the image row, c is the image column, and I(r, c) is the pixel at row r and column c.

Standard Deviation
Standard deviation is the square root of the variance of the distribution [19]. It is calculated using the following statistic:

$$\sigma_g = \sqrt{\sum_{g=0}^{L-1}\left(g - \bar{g}\right)^2 P(g)} \qquad (13)$$
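As a running example, here is a minimal sketch of these two histogram statistics (assuming Python with NumPy; P(g) is the normalized histogram from equation (12)):

```python
import numpy as np

def histogram_mean_std(image, L=256):
    """Mean (equation (12)) and standard deviation (equation (13))."""
    hist, _ = np.histogram(image, bins=L, range=(0, L))
    P = hist / image.size              # P(g): probability of gray level g
    g = np.arange(L)
    mean = np.sum(g * P)
    std = np.sqrt(np.sum((g - mean) ** 2 * P))
    return mean, std

image = np.random.randint(0, 256, (256, 256))
print(histogram_mean_std(image))
```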

Skewness
Skewness gives a measure of the degree of asymmetry in the distribution [19]. It is calculated using the following statistic:

$$\text{Skew} = \frac{1}{\sigma_g^{3}}\sum_{g=0}^{L-1}\left(g - \bar{g}\right)^{3} P(g) \qquad (14)$$

Texture Feature
Textures are homogeneous patterns or spatial arrangements of pixels that cannot be sufficiently described by regional intensity or color features [18]. Six texture features are used: entropy, energy, homogeneity, variance, 3rd moment, and step.

Entropy
The entropy is a measure that tells how many bits are needed to code the image data. Entropy is a measure of information content; it measures the randomness of the intensity distribution [16]. It can be calculated using the following equation:

$$\text{Entropy} = -\sum_{g=0}^{L-1} P(g)\log_2 P(g) \qquad (15)$$

Energy
The energy measure tells something about how the gray levels are distributed. Energy measures the uniformity of intensity in the histogram [32]. It is defined as follows:

$$\text{Energy} = \sum_{g=0}^{L-1}\left[P(g)\right]^2 \qquad (16)$$

Homogeneity
Homogeneity returns a value that measures the closeness of the distribution of elements in the image [20]. Homogeneity takes high values for low-contrast images [24]. A standard formulation, based on the normalized gray-level co-occurrence matrix p(i, j), is:

$$\text{Homogeneity} = \sum_{i}\sum_{j}\frac{p(i,j)}{1 + |i-j|} \qquad (17)$$
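A sketch of this measure using scikit-image (graycomatrix and graycoprops are that library's API; the distance and angle choices here are illustrative, not specified by the paper):

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

image = np.random.randint(0, 256, (256, 256), dtype=np.uint8)

# Co-occurrence of gray levels at distance 1, horizontal direction.
glcm = graycomatrix(image, distances=[1], angles=[0],
                    levels=256, symmetric=True, normed=True)
homogeneity = graycoprops(glcm, 'homogeneity')[0, 0]
print(homogeneity)  # approaches 1.0 for low-contrast images
```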

Variance
The variance of the gray levels in a region in the neighborhood of a pixel is a measure of the texture [25]:

$$\sigma_g^2 = \sum_{g=0}^{L-1}\left(g - \bar{g}\right)^2 P(g) \qquad (18)$$

3rd Moment
The third moment is the unnormalized counterpart of the skewness in equation (14):

$$\mu_3 = \sum_{g=0}^{L-1}\left(g - \bar{g}\right)^{3} P(g) \qquad (19)$$
Step
Step measures the distribution of gray levels [33].
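The skewness and the remaining histogram-based texture measures can all be computed from the same P(g). A sketch in the same Python/NumPy setting as above; equation numbers refer to (14)-(19), and since no formula for the step feature is reproduced in the text, it is omitted here:

```python
import numpy as np

def histogram_texture_stats(image, L=256):
    """Skewness, entropy, energy, variance, and third moment from the histogram."""
    hist, _ = np.histogram(image, bins=L, range=(0, L))
    P = hist / image.size
    g = np.arange(L)
    mean = np.sum(g * P)
    std = np.sqrt(np.sum((g - mean) ** 2 * P))
    nz = P > 0                                    # avoid log2(0) for empty bins
    entropy = -np.sum(P[nz] * np.log2(P[nz]))     # equation (15)
    energy = np.sum(P ** 2)                       # equation (16)
    variance = np.sum((g - mean) ** 2 * P)        # equation (18)
    third_moment = np.sum((g - mean) ** 3 * P)    # equation (19)
    skewness = third_moment / std ** 3            # equation (14)
    return skewness, entropy, energy, variance, third_moment

image = np.random.randint(0, 256, (256, 256))
print(histogram_texture_stats(image))
```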

Proposed Hybrid DAWT-DCT Based Algorithm
In this algorithm, hybrid compression is performed on the query image and the database images. The first compression step is a 2-D DAWT, followed by the DCT. For the DCT, the image data are divided into 8×8 blocks.
The DCT converts the spatial image representation into a frequency map: the average value in the block is represented by the low-order term, while the strength of more rapid changes across the width or height of the block is represented by the higher-order terms. The DCT is applied to the DAWT low-frequency components, which generally have zero mean and small variance, and accordingly it yields a much higher compression ratio (CR) while retaining the important diagnostic information.

First Level of DAWT Decomposing
The 2-D DAWT decomposes each 2×2 block of Image_q and Image_d (the query and database images to be classified); the low-frequency coefficients form LL and the high-frequency coefficients form HL, LH, and HH. The approximation image is the sub-image composed of the low-frequency parts in both row and column directions (LL). It is obtained by convolving Image_q and Image_d with the filter bank (−0.48296, −0.8365, −0.2241, 0.1294), taking the average of four pixels, and dividing the result by 4; the new value is saved in an array called LL_array(), to be used in the next step. Like the LL decomposition, the high-frequency coefficients HL are obtained by convolving Image_q and Image_d with the same filter bank (−0.48296, −0.8365, −0.2241, 0.1294), taking the average of four pixels, and dividing the result by 4; the new value is saved in an array called HL_array(), to be discarded. The high-frequency coefficients LH are obtained by convolving Image_q and Image_d with the filter bank (0.1294, −0.2241, −0.8365, 0.48296), taking the average of four pixels, and dividing the result by 4; the new value is saved in an array, to be discarded. The following algorithm (1) illustrates these steps.
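A minimal sketch of this first decomposition level (assuming Python with PyWavelets; the quoted filter bank matches, up to sign, the db2 analysis filters, so 'db2' is used here, and the variable names mirror those in the text):

```python
import numpy as np
import pywt

def first_level(image):
    """Level-1 db2 decomposition; only the approximation (LL) is kept."""
    LL, (LH, HL, HH) = pywt.dwt2(image, 'db2')
    # The detail sub-bands LH, HL, HH are discarded, as in the text.
    return LL

Image_q = np.random.rand(256, 256)   # placeholder query image
LL_array = first_level(Image_q)
print(LL_array.shape)                # roughly half the original size
```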

Second Level of DAWT Decomposing
The passed LL components are further decomposed using another 2-D DAWT with the same filter bank, and the detail coefficients (HL, LH, and HH) are discarded. The following algorithm (2) illustrates these steps.

Hybrid DAWT-DCT Compression
The presented hybrid DAWT-DCT algorithm for image compression exploits the properties of both the DAWT and the DCT. After the LL decomposition, an 8×8 block DCT is applied to the remaining approximate DAWT coefficients (LL), which achieves a high CR. The DCT converts the spatial image representation into a frequency map: the average value in the block is represented by the low-order term, while the strength of more rapid changes across the width or height of the block is represented by the higher-order terms. The DCT has been shown to be near optimal for a large class of images in energy concentration and decorrelation. The following algorithm (3) illustrates these steps.
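Putting the pieces together, a sketch of the hybrid pipeline (Python with PyWavelets and SciPy; the two-level 'db2' decomposition and the 8×8 block size follow the text, everything else, including the trimming step, is illustrative):

```python
import numpy as np
import pywt
from scipy.fft import dctn

def hybrid_dawt_dct(image, levels=2, block=8):
    """Two-level db2 decomposition, then a blockwise DCT on the final LL band."""
    LL = image
    for _ in range(levels):                  # algorithms (1) and (2): keep only LL
        LL, _details = pywt.dwt2(LL, 'db2')
    h = (LL.shape[0] // block) * block
    w = (LL.shape[1] // block) * block
    LL = LL[:h, :w]                          # trim so 8x8 blocks tile exactly
    coeffs = np.empty_like(LL)
    for r in range(0, h, block):             # algorithm (3): 8x8 DCT per block
        for c in range(0, w, block):
            coeffs[r:r+block, c:c+block] = dctn(LL[r:r+block, c:c+block],
                                                norm='ortho')
    return coeffs

compressed = hybrid_dawt_dct(np.random.rand(256, 256))
print(compressed.shape)
```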

Features Based Algorithm
Feature extraction is a process that begins with feature selection. In this algorithm, the image classification process deals with the compressed image from the previous section to generate feature vectors; training the classification system with these features could increase the accuracy rate. The extracted features are: mean, standard deviation, entropy, energy, homogeneity, step, 3rd moment, skewness, and variance. After trying a number of features for the query and database images, these nine features were chosen. Feature extraction is divided into the following steps (see the sketch after this list):
1. Compute the probability of each pixel in the image and save it in prob_array; then calculate the mean feature. Each element of prob_array is saved in a variable called pro, and prob_array is reset to 0 to be used for another iteration. pro is divided by the size of the image (the width and height are called s and q).
2. Extract the standard deviation feature.
3. Extract the entropy feature.
4. Extract the energy feature.
5. Extract the homogeneity feature.
6. Extract the step feature.
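A sketch of the full feature-vector computation (Python/NumPy, reusing the histogram formulations above; the GLCM homogeneity uses scikit-image as earlier, and since no formula for the step feature appears in the text, a placeholder is marked):

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def feature_vector(image, L=256):
    """Nine-element feature vector for an image with gray levels in [0, L)."""
    hist, _ = np.histogram(image, bins=L, range=(0, L))
    P = hist / image.size                    # prob_array in the text
    g = np.arange(L)
    mean = np.sum(g * P)
    std = np.sqrt(np.sum((g - mean) ** 2 * P))
    nz = P > 0
    entropy = -np.sum(P[nz] * np.log2(P[nz]))
    energy = np.sum(P ** 2)
    variance = np.sum((g - mean) ** 2 * P)
    third_moment = np.sum((g - mean) ** 3 * P)
    skewness = third_moment / std ** 3
    glcm = graycomatrix(image.astype(np.uint8), [1], [0],
                        levels=L, symmetric=True, normed=True)
    homogeneity = graycoprops(glcm, 'homogeneity')[0, 0]
    step = 0.0                               # formula for "step" not given in [33]
    return np.array([mean, std, entropy, energy, homogeneity,
                     step, third_moment, skewness, variance])
```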

Sum-of-Absolute Differences (SAD)
The similarity measurement is done using the sum-of-absolute-differences distance, and then the top closest images are retrieved. Let D be the distance between the feature vectors Q_f and D_f, and let N be the number of color and texture features. D is calculated using the following equation [21]:

$$D = \sum_{i=1}^{N}\left|Q_f(i) - D_f(i)\right| \qquad (21)$$

Subsequently, the values of D are sorted in ascending order and the top-ranked images are retrieved. The query and database images are identical when D = 0, and a small value of D indicates an image relevant to the query image [21].
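A sketch of the matching step (Python/NumPy; the database is assumed to be a stack of precomputed feature vectors, and top_k is an illustrative parameter):

```python
import numpy as np

def rank_by_sad(query_features, db_features, top_k=10):
    """Rank database images by sum-of-absolute differences (equation (21))."""
    D = np.sum(np.abs(db_features - query_features), axis=1)  # one SAD per image
    order = np.argsort(D)                                     # ascending: best first
    return order[:top_k], D[order[:top_k]]

query = np.random.rand(9)            # nine features, as above
database = np.random.rand(100, 9)    # feature vectors for 100 database images
indices, distances = rank_by_sad(query, database)
print(indices, distances)
```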

Results and Discussion
The DCT and Daubechies transform algorithms are implemented on a database of color images of size 256×256 pixels. These images are arranged in 10 semantic groups: People, Beaches, Buildings, Buses, Dinosaurs, Elephants, Roses, Horses, Mountains, and Food, with 100 images from each semantic group. The images are in JPEG format. The algorithm is executed with 10 query images selected from each category in the image database. The results obtained are shown in the following tables.

Conclusions
Classification is the process of finding a model or function that describes and distinguishes data classes. In this paper, a new algorithm is proposed to classify images; the basic idea depends on using a hybrid compression method (DAWT and DCT). To evaluate this algorithm, a heterogeneous image database was used, downloaded from http://wang.ist.psu.edu/iwang/test1.tar. Hybrid compression produced better results (precision and recall values are improved by 80%) compared with using DAWT or DCT alone, and by 90% compared with the method presented in [22]. The proposed compression also produced better results (precision and recall values improved by 80%) compared with the method presented in [23]. Figure (3) shows the classification of images using the proposed compression algorithm.