The Evaluation of Rental Amount of Religious Endowments by Using Geomatic Techniques and Machine Learning Algorithms Hilla/ Iraq

Religious endowments are an important resource with historical, cultural, and economic significance in many countries. In Iraq, a religious endowment typically comprises numerous scattered real estate properties and land parcels that require efficient management. One of the most important factors in managing these properties is the rental amount of each property, and reliable rental estimates can support the future development of religious endowments. Governmental agencies face challenges in rental pricing because many economic and geographic factors are involved. Advances in artificial intelligence and geomatic techniques offer a framework for estimating rental amounts from spatial and non-spatial factors. In this study, a machine learning algorithm (Support Vector Regression) is combined with a Geographic Information System (GIS) to predict and evaluate the rental amount of real estate belonging to a religious institution in Iraq (the Shiite endowment in Hillah city). The results indicate that the proposed method achieved an overall accuracy of 71%, a root mean square error of 0.2257 million Iraqi Dinar (IQD), and a correlation coefficient of 0.9272. This study can serve as an effective tool for decision-makers who plan and manage religious endowments in developing countries.



Introduction
In planning, rent is defined as the amount of money that tenants are willing to pay for a specific real estate property, given its geographic location and other characteristics such as land use and structural condition [1]. In urban theory, rent is an important indicator of housing cost and investment for both individuals and governmental agencies [2]. Recently, spatial modeling has attracted experts in real estate development because it combines geographical and statistical factors. Spatial modeling can represent the geographical distribution of real estate values and can evaluate existing rents in terms of the structural and geographical factors that influence them [3]. Furthermore, the spatial modeling of real estate rent expresses the spatial variation of real estate cost, which is important for stakeholders and for the legislation that governs future development. Real estate belonging to religious endowments is an important resource of economic significance in many countries, including Iraq. Endowments in Iraq contain many properties that need efficient management systems, and one of the most important factors in managing these properties is the rental amount of each property. Estimating the rental amount can therefore support the future development of religious endowments. Many countries, including Iraq, have started to develop digital databases of religious endowment information (lands and real estate). Many techniques have been applied to evaluate real estate belonging to endowments, but spatial modeling has not yet been implemented in the field of endowment management.
This study integrates a machine learning algorithm with GIS modeling to estimate and evaluate the rental amount of real estate belonging to a religious institution in Iraq (the Shiite endowment in Hilla city).

Related works
Several studies have integrated machine learning and GIS tools in environmental and urban applications [4,5], and several have addressed the prediction and evaluation of real estate (i.e. houses and lands). The basic approach is the traditional one based on statistical analysis. Zhou et al. [6] used statistical analysis to evaluate house prices in Las Vegas, USA. With the development of computing technology, experts began to apply machine learning algorithms to rental prediction. Li et al. [7] developed a methodology based on the Support Vector Regression (SVR) model to predict real estate prices and rents in China; their results showed that SVR outperformed traditional models. Ma et al. [8] proposed a methodology based on several machine learning algorithms (Gradient Boosting Regression Trees, Random Forest Regression, Linear Regression, and Regression Tree) to estimate warehouse rental values in China. Their results showed that the distance to the city center, warehouse size, and land prices were the factors that most affected warehouse rental values. In a separate study, Raftered et al. [9] proposed a methodology based on machine learning algorithms and real-world data to estimate the rental prices of houses in the USA. Their results showed a high correlation between the rental price and the property type and locality. More recently, spatial modeling and mapping have attracted experts because they can represent both the spatial and the statistical variation of rental values at different scales [10-12]. For example, Chen et al. [13] presented a prediction map of rental prices within an urban area in China, using a methodology that integrates online data and prediction algorithms. Their maps showed spatial and statistical variation in housing rents. Deep learning techniques have also shown significant accuracy in prediction tasks. Zhou et al. [14] developed an approach that integrates a Convolutional Neural Network with spatial data to predict real estate rental values. The main limitation of deep learning methods is that they require large datasets. In this study, we integrated GIS tools and a machine learning algorithm (Support Vector Regression) to predict the rental values of real estate belonging to religious endowments in Iraq.

Study area
The present study was conducted in Babylon province, Iraq, which is located south of Baghdad. The province lies between latitudes 32° 13' 23" and 32° 47' 12" N and longitudes 44° 16' 55" and 44° 40' 33" E. Its capital is Al-Hillah city, which is 89 km from Baghdad city. Al-Hillah is considered a large city, with an area of about 161 km² and a population of two million people in 2018. The city hosts several human activities and land use types, such as residential, religious, and commercial uses. Babylon province is characterized by rapid urbanization, historical and tourist sites, and agricultural lands that cover wide areas. The climate in this area is generally hot and dry in summer and cold in winter [15], which is considered favorable for future development (see Figure 1).

The Overall methodology
The proposed methodology consists of sequential steps. First, the tabular and spatial data on endowment real estate in Babylon province were collected. Second, spatial analysis and modeling were applied to extract the distances to important features (distance to main roads and distance to the city center). Third, a machine learning algorithm (Support Vector Regression) was applied, using the Weka machine learning software, to estimate the annual rental amount of endowments in the study area. The final step is the evaluation and accuracy assessment of the proposed model. The overall methodology is presented in Figure 2.

The data collection
Field work was conducted to collect the information related to the endowments by using a purpose-designed form. The survey covered all rented units within real estate belonging to the Shiite endowment in Babil province: 159 units within 38 real estate locations (Appendix 1). The form was designed on the basis of related works and expert input, and it records quantitative and qualitative information (i.e. structural condition, commercial use, contract period, annual rental amount, water service, electricity, and sewer services) as well as the geographic location of each endowment property. Geographic locations were observed using Global Positioning System (GPS) equipment [16]. Structural condition is recorded as either bad or good; commercial use indicates whether or not the property is used for commercial purposes; contract period is the duration of the contract in years; rental amount is the annual rent in Iraqi dinar (IQD); and the availability of services (water, electricity, and sewer) was also recorded. The area of each property and the distances to main roads and to the city center were extracted using GIS techniques and recorded in the same form. Because textual information cannot be used directly in the proposed model, textual entries such as structural condition were replaced by the values 0 or 1 to facilitate the modeling process. Table 1 shows the form used to collect information about the real estate of the Shiite endowment in Babil province.
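The 0/1 coding of textual entries described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the field names, the example unit, and the choice of which answer maps to 0 and which to 1 are all assumptions, since the paper states only that textual values were replaced by 0 or 1.

```python
# Assumed coding table: the paper does not say which answer is 0 and which is 1.
CODES = {
    "structural_condition": {"bad": 0, "good": 1},
    "commercial_use": {"no": 0, "yes": 1},
    "water_service": {"no": 0, "yes": 1},
    "electricity": {"no": 0, "yes": 1},
    "sewer_services": {"no": 0, "yes": 1},
}


def encode_unit(unit):
    """Replace textual answers by 0/1 and cast the remaining fields to float."""
    encoded = {}
    for field, value in unit.items():
        if field in CODES:
            encoded[field] = CODES[field][value.strip().lower()]
        else:
            encoded[field] = float(value)
    return encoded


# Hypothetical survey record (values invented for illustration only).
example_unit = {
    "structural_condition": "Good",
    "commercial_use": "yes",
    "contract_period": 3,
    "water_service": "no",
    "electricity": "yes",
    "sewer_services": "no",
    "area": 18.5,
}
encoded_unit = encode_unit(example_unit)
```

Keeping the coding in one table makes the chosen convention explicit, which matters when interpreting the sign of the fitted weights later on.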

GIS works
The collected data were converted into GIS data (shapefile format) using ArcGIS tools in order to complete the data preparation stage. In addition, satellite images were obtained from two satellites to extract the total area of each property using GIS applications [17]. The first is WorldView-3 (8 bands), with a spatial resolution of 0.31 m (panchromatic band) and 1.24 m (multispectral bands); the second is WorldView-2 (8 bands), with a spatial resolution of 0.46 m (panchromatic band) and 1.84 m (multispectral bands). Furthermore, we used a spatial analysis technique (the Euclidean distance algorithm) to extract the distance from each property to the major roads and to the city center. The Euclidean distance tool describes each cell's relationship to a source or a set of sources based on the straight-line distance, giving the distance from each cell in the raster to the closest source. The distance values were then extracted to the real estate geodatabase using spatial analysis tools. Finally, the collected data were compiled into a single table in Microsoft Excel and exported in CSV format for use as input to the machine learning stage (see Figure 3).
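To make the Euclidean distance step concrete, the following sketch computes, for every cell of a small raster, the straight-line distance to the closest source cell. It is a brute-force illustration of the idea only, not the ArcGIS implementation; the function name, the grid size, the cell size, and the source location are hypothetical.

```python
import math


def euclidean_distance_raster(n_rows, n_cols, source_cells, cell_size=1.0):
    """Return a grid whose cells hold the distance to the nearest source cell.

    source_cells is a list of (row, col) positions, for example the cells
    crossed by a main road or the cell containing the city center.
    """
    raster = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            nearest = min(math.hypot(r - sr, c - sc) for sr, sc in source_cells)
            row.append(nearest * cell_size)  # convert cell units to map units
        raster.append(row)
    return raster


# Hypothetical 3 x 4 raster with 10 m cells and a single source in the corner.
dist = euclidean_distance_raster(3, 4, source_cells=[(0, 0)], cell_size=10.0)

# Reading the value at the cell that contains a property mirrors the
# "extract values to the geodatabase" step described above.
distance_at_property = dist[1][1]
```

The sketch assumes projected coordinates (metres); distances computed directly on latitude and longitude would not be straight-line distances on the ground.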

Regression modeling
Regression analysis is a supervised machine learning technique in which the predicted values are continuous; in linear regression, the relationship has a constant slope [18]. It is applied to estimate values on a continuous range, such as prices, rather than to assign observations to categories (e.g. cats or dogs). Regression analysis is classified into two types: simple regression and multivariate regression. In this study, we used multivariate regression analysis, which is more complex than simple regression. Equation 1 shows the multivariate regression calculation [19].

$$f(x, y, z) = w_1 x + w_2 y + w_3 z + b \tag{1}$$

where w_1, w_2, and w_3 are the weights or coefficients [19] that the model tries to learn, and x, y, and z are the attributes, or distinct pieces of information, available for each observation. The term b is the bias (intercept), a constant offset added to the weighted sum. In this study, the attributes for rent prediction include structural condition, contract period, and so on. Equation 2 is the proposed regression equation for rental prediction (developed by the author).

$$\begin{aligned}\text{Rental amount} = {} & w_1 \times \text{Structural condition} + w_2 \times \text{Commercial use} + w_3 \times \text{Contract period} \\ & + w_4 \times \text{Water service} + w_5 \times \text{Electricity} + w_6 \times \text{Sewer services} + w_7 \times \text{Area} \\ & + w_8 \times \text{Distance to main roads} + w_9 \times \text{Distance to city center} + b\end{aligned} \tag{2}$$

The adaptation of the Support Vector Machine (SVM) to regression is called Support Vector Regression (SVR). SVM was developed for numerical input variables, although the software automatically converts nominal values to numerical values. Input data are also normalized before being used, which removes a number of anomalies that can complicate the analysis. Unlike SVM, which finds a line that best separates the training data into classes, SVR finds a line of best fit that minimizes the error of a cost function. This is done through an optimization process that considers only those instances in the training dataset that are closest to the line with the minimum cost. These instances are called support vectors, hence the name of the technique. In almost all problems of interest, no line fits the data exactly, so a margin is added around the line to relax the constraint; this tolerates some bad predictions but gives a better result overall. In this study, we trained an SVR model on 111 samples, which represent 70% of the total data, while the remaining 48 samples (30% of the total data) were used for testing and accuracy assessment (Appendix 2). The Waikato Environment for Knowledge Analysis (Weka) software, version 3.9.2, was used to implement the modeling and prediction task based on the input variables and the SVR algorithm. Weka can automatically produce a regression equation that predicts the dependent variable from the independent variables.
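The preparation steps described above (a 70/30 split, input normalization, and a margin that tolerates small errors) can be sketched in plain Python. This is an illustrative sketch only and does not reproduce Weka's internals: the random shuffling, the seed, the min-max form of normalization, and the epsilon value are assumptions, because the paper does not state how the split was drawn or which settings were used.

```python
import random


def split_70_30(samples, train_fraction=0.7, seed=0):
    """Shuffle the samples and split them into training and testing sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # assumed: the split is random
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


def min_max_normalize(values):
    """Rescale values to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Sum of errors that fall outside the margin of half-width epsilon."""
    return sum(max(0.0, abs(t - p) - epsilon) for t, p in zip(y_true, y_pred))


# 159 surveyed units, identified here only by their index.
train_ids, test_ids = split_70_30(range(159))
print(len(train_ids), len(test_ids))
```

With 159 units, a 70% training share rounds to 111 samples and leaves 48 for testing, which matches the sample counts reported in the text.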

Accuracy Assessment
The model was validated and verified using the Root Mean Square Error (RMSE) [20]. The RMSE (also called the root mean square deviation, RMSD) is a frequently used measure of the difference between the values predicted by a model and the values actually observed in the environment being modeled. These individual differences are called residuals, and the RMSE aggregates them into a single measure of predictive power. The RMSE of a model prediction with respect to the estimated variable X_model is defined as the square root of the mean squared error, as shown in Equation 3 [20].

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} \left(X_{\mathrm{obs},i} - X_{\mathrm{model},i}\right)^{2}}{n}} \tag{3}$$

where X_obs,i is the observed value and X_model,i is the modeled value at time/place i, and n is the number of samples. In addition, the Relative Absolute Error is defined as the absolute error relative to the size of the measurement, so it depends on both the absolute error and the measured value. The relative error is large when the measured value is small or when the absolute error is large.
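The two error measures can be written as short functions. This is a sketch that follows the definitions given in the text (the relative error is taken per sample as the absolute error divided by the observed value); the function names are illustrative, and the only data used are the single observed and predicted rent reported later in the paper for one shop (1 and 0.937 million IQD).

```python
import math


def rmse(observed, predicted):
    """Square root of the mean squared difference between the two series."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and equal in length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def relative_errors(observed, predicted):
    """Absolute error of each sample relative to its observed value."""
    return [abs(o - p) / abs(o) for o, p in zip(observed, predicted)]


# Single example taken from the paper: observed 1 million IQD, predicted 0.937.
shop_observed = [1.0]
shop_predicted = [0.937]
print(rmse(shop_observed, shop_predicted))
print(relative_errors(shop_observed, shop_predicted))
```

For a single sample the RMSE reduces to the absolute error, so both calls print a value of about 0.063 for this shop.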

Result of regression modeling
The proposed model was trained and validated using 9 independent variables (structural condition, commercial use, contract period, water service, electricity, sewer services, area, distance to city center, and distance to main roads) and one dependent variable (the annual rental amount). A basic correlation analysis was first conducted between the dependent variable and each independent variable. The lowest correlation was found between the rental amount and the structural condition (0.0001), while the highest was found between the rental amount and the contract period, i.e. the duration of the contract in years (0.4069). Support Vector Regression generates a regression equation from input data that are separated into training data (70%) and testing data (30%) [21]; the training data are used for the modeling process, while the testing data are used for validation and verification. Equation 4 shows the final regression equation (developed by the author).

$$\begin{aligned}\text{Predicted annual rental amount} = {} & -0.1006 \times \text{Structural condition} + 0.0005 \times \text{Commercial use} + 0.0784 \times \text{Contract period} \\ & - 0 \times \text{Water service} - 0 \times \text{Electricity} - 0 \times \text{Sewer services} + 0.0002 \times \text{Area} \\ & - 0.0002 \times \text{Distance to city center} + 0.0071 \times \text{Distance to main roads} + 0.2458\end{aligned} \tag{4}$$

The final regression equation expresses the mathematical relationship, obtained by the support vector regression algorithm, between the dependent variable (the annual rental amount) and the independent variables listed in the equation. Each independent variable is multiplied by a weight, and the weighted terms are combined with plus and minus signs. In addition, the equation ends with a bias (intercept) value, which is used to minimize the prediction error. The largest weight in absolute value, 0.1006, was associated with the structural condition. The resulting regression function gives the rental amount of each property belonging to the endowments in Babil province based on the variables used and the property's geographic position in the study area, so it can be used to predict the rental amount, in Iraqi Dinar (IQD), of any endowment property within the study area. For example, to estimate the annual rental amount of a shop located within the Ali Soq-Al Asri group of shops, the developed regression function can be applied directly, as shown in Equation 5. The predicted annual rental amount is 0.937 million IQD, whereas the actual annual rental amount collected during the field survey is one million IQD.
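Equation 4 can be applied programmatically as a weighted sum. The sketch below uses the coefficients exactly as reported; the dictionary keys and the function name are illustrative. The paper does not state the units or scaling expected for each input (Weka normalizes inputs, so the weights may apply to normalized attributes), and the inputs for the Ali Soq-Al Asri shop are not given, so the example below checks only the structure of the equation rather than reproducing the 0.937 million IQD result.

```python
# Coefficients as reported in Equation 4; the three service weights are zero.
WEIGHTS = {
    "structural_condition": -0.1006,
    "commercial_use": 0.0005,
    "contract_period": 0.0784,
    "water_service": 0.0,
    "electricity": 0.0,
    "sewer_services": 0.0,
    "area": 0.0002,
    "distance_to_city_center": -0.0002,
    "distance_to_main_roads": 0.0071,
}
BIAS = 0.2458


def predict_annual_rent(features):
    """Evaluate Equation 4 for one property; missing features count as zero."""
    return BIAS + sum(w * features.get(name, 0.0) for name, w in WEIGHTS.items())


# With every input at zero the prediction equals the bias term.
baseline = predict_annual_rent({})

# Raising the contract period by one unit adds its weight, 0.0784.
one_unit_longer = predict_annual_rent({"contract_period": 1.0})
```

Because the three service variables carry zero weight, changing them has no effect on the prediction from this equation.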

Result of accuracy assessment
The accuracy assessment was based on the root mean square error computed over the 48 testing samples, which represent 30% of the total data. The RMSE was calculated from the observed and the predicted rental amounts. Table 2 shows the observed and predicted rental values and the root mean square results for the testing samples, and Figure 4 shows the scatter plot based on the regression analysis and the testing samples. The overall accuracy of the Support Vector Regression algorithm was 70%, which is considered relatively acceptable given the simplicity of the dataset in terms of the number of variables, the quantitative variation, and the spatial variables. The proposed model could be developed further and generalized by using more samples and additional variables, such as socio-economic data (i.e. population growth, culture, and land price) and environmental variables (e.g. climate suitability and proximity to landfill locations). The correlation coefficient obtained with the SVR algorithm was 0.9272. The highest RMSE was 1.34 million IQD, while the lowest was 0.03 million IQD, and the overall root mean squared error was 0.28 million IQD. According to the modeling results, many factors can affect rental pricing within the real estate of Islamic endowments in Babil province. The proposed model might be used as a tool for evaluating rental pricing within Babil province, Iraq, and can therefore support management and real estate development in the Shiite endowment in the province. Generalizing the model to other endowments in Iraq would require collecting all data related to Islamic endowments, and additional geographic and environmental factors could be applied to improve it. Figure 5 shows the predicted rental amounts for Islamic endowments in Babil province.
Figure 5: The predicted rental amount map
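The correlation coefficient reported in this section compares observed and predicted rents. A minimal sketch of the Pearson correlation coefficient is given below, on the assumption that this is the coefficient the software reports; the function name and the sample numbers in the usage lines are made up for illustration and are not taken from Table 2.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equally long series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Illustrative values only (million IQD); not the paper's testing samples.
observed_rent = [1.0, 2.0, 3.0, 4.0]
predicted_rent = [1.5, 2.5, 3.5, 4.5]
r = pearson_correlation(observed_rent, predicted_rent)
```

In this toy case the predictions are offset from the observations by a constant, so the correlation is perfect even though every prediction is wrong by 0.5; this is why the correlation coefficient is reported alongside the RMSE rather than instead of it.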

Conclusion
Religious endowments (lands or real estate) are an important resource of economic significance in many countries, including Iraq. Endowments in Iraq contain many properties that need efficient management systems, and one of the most important factors affecting the management of these properties is the rental amount of each property. Estimating the rental amount can effectively support the future development of religious endowments. In this study, a machine learning algorithm was combined with a Geographic Information System to predict the rental amount of real estate belonging to a religious institution in Iraq (the Shiite endowment in Hilla city). The results showed that the overall accuracy of the support vector regression algorithm was 70%, which is considered relatively acceptable for the prediction process. Furthermore, the correlation coefficient obtained with the SVR algorithm was 0.9272, and the overall root mean squared error was 0.28 million IQD. Regression analysis can therefore be a suitable method for predicting the rental amount. In addition, this study is one of the first to combine GIS and machine learning to predict the rental amount of religious endowments in Iraq, and it can serve as an effective tool for decision-makers who plan and manage religious endowments in developing countries.

Funding
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.

Data availability statement
The data that support the findings of this study are available on request from the corresponding author.

Conflicts of interest
The authors declare that there is no conflict of interest.