Revella Eshaya Armya a, Maiwan Bahjat Abdulrazzaq b

a Technical College of Informatics, Akre, Kurdistan Region, Iraq – revella.eshaya@dpu.edu.krd

b Faculty of Science, University of Zakho, Zakho, Kurdistan Region, Iraq – maiwan.abdulrazzaq@uoz.edu.krd


Received: 14 Aug., 2023 / Accepted: 1 Nov., 2023 / Published: 28 Mar., 2024.                https://doi.org/10.25271/sjuoz.2024.12.1.1189


Biometric handwriting recognition using deep learning has attracted close attention from academics and researchers worldwide, and many methods have been proposed in recent years to improve it. Character recognition systems have been developed for various languages, including Chinese, English, Japanese, Arabic, and Kurdish. The Assyrian language, however, has seen minimal progress, and there is still little research on Assyrian handwriting. In this paper, a new Assyrian language dataset was created by distributing 500 forms, each consisting of 36 Assyrian characters, to people of both genders between the ages of 13 and 60. The preprocessing operation includes cleaning the noisy data and segmenting each image to 224x224 pixels. This effort resulted in a collection of 18,000 character images, of which 70% were used for training and 30% for testing in four CNN models, VGG16, VGG19, MobileNet-V2, and ResNet-50, over 30 epochs, giving accuracy rates of 90.97%, 92.06%, 95.70%, and 94.97%, respectively.

KEYWORDS: Deep Learning, Convolutional Neural Network, Handwritten Character Recognition, Assyrian Language.


        Due to recent advances in deep learning models, handwritten character recognition is virtually a solved problem for many popular languages. For many other languages, however, the recognition of handwritten characters remains challenging due to a lack of the sufficiently large labelled datasets required for training deep learning models (Jayasundara et al., 2019).

        Handwritten character recognition systems have been proposed for a variety of languages, including Chinese, English, Japanese, Arabic, and Kurdish. The Assyrian language, on the other hand, has seen no significant progress. As such, the recognition of Assyrian handwritten characters remains a current and relatively unaddressed research problem. Convolutional Neural Networks (CNNs) and other deep learning algorithms can learn the distinctive features of images on their own, without relying on human intervention. The CNN architecture is a more sophisticated iteration of the multi-layer perceptron (MLP). The functionality of the CNN framework has similarities to that of the human brain (Putri, Pratomo, & Azhari, 2023). Humans learn to discover and distinguish items with the naked eye by viewing hundreds of objects (Albahli, Nawaz, Javed, & Irtaza), and a CNN relies on the same kind of learned patterns to see and recognize things. GoogleNet, AlexNet, VGG, and ResNet are some notable CNN examples. CNNs combine key point detection and classification with little preprocessing and computation (Parikh & Desai, 2022).

        In addition, CNNs have achieved state-of-the-art results thanks to their ability to encode in-depth information and spatial awareness. CNNs are adept at capturing both fine and coarse details in images. However, critical information is lost as layers are pooled, and CNNs require a huge number of training samples (usually thousands or tens of thousands per class) to train and classify images successfully. Thus, there is considerable interest in training CNNs with fewer training samples (Jayasundara et al., 2019).

        For all these reasons, the present research aimed to create a new dataset of handwritten Assyrian characters from 500 writers, covering 36 characters; to classify Assyrian handwriting using different Convolutional Neural Network models in order to achieve the highest accuracy rate; and to compare the performance of the Assyrian handwriting recognition models with state-of-the-art models in other languages.

        The original zone of the Assyrian language is Upper Mesopotamia, southeastern Anatolia, northwestern Iran, and the northeastern Levant. This vast region stretches from the plain of Urmia in western Iran to the Nineveh Plains, Kirkuk, Erbil, and Duhok regions in northern Iraq, as well as the northern regions of Syria and southcentral and southern Turkey (Frederick Mario Fales, 2023), (De Ridder, 2018).

        Instability in the Middle East over the course of the previous century has resulted in a global diaspora of Assyrian speakers, the majority of whom now live in regions such as North and South America, Australia, Europe, or Russia. Assyrians living in diaspora groups may use loanwords borrowed from the languages spoken in those communities (Frederick M Fales, 2021). Those who speak Assyrian are of Assyrian ethnicity and are descended from the first people who inhabited Mesopotamia (Benjamen, 2022).

        The Assyrian language uses the Madnḥāyā Syriac alphabet and is written from right to left. Along with other current Aramaic languages, it is believed that the Assyrian language is endangered because younger Assyrians do not learn the entire language. This is due to the fact that many of them have moved to other nations and adapted to their culture (Muhammad, 2019).

This study aimed to create a new dataset of 36 Assyrian language characters, to train and test the VGG16, VGG19, MobileNet-V2, and ResNet-50 models on the new dataset, and to report the accuracy and loss of each model.

Table 1: Assyrian Characters, Their Names, and Their Pronunciation (Sada, 2021)

Name of Character | Assyrian Character | Sound of Character
First/ kāp | |
Connected End/ kāp | |
Free End/ kāp | |
First/ mim | |
End/ mim | |
First/ nun | |
Connected End/ nun | |
Free End/ nun | |
fē - wē | | /f - w/

[Character glyphs and the remaining rows of Table 1 are not recoverable from the source text.]











        Convolutional neural networks integrate artificial neural networks with contemporary deep learning methods. They have been used for many years in image recognition tasks such as handwritten character recognition, the topic of this research. CNNs are regarded as the first deep learning approach to train multilayer hierarchical networks successfully and robustly. By reducing the number of trainable network parameters, CNNs mitigate the shortcomings of the backpropagation algorithm in feed-forward networks (Seng, Chiang, Salam, Tan, & Chai, 2021).

        The models used for handwriting recognition in this research are described below:

2.1     VGG16

        VGG, named after the Visual Geometry Group, is also known as OxfordNet. VGG16 is a convolutional neural network that was trained on more than one million images from the ImageNet database. The network has sixteen weight layers and can classify images into one thousand categories; as a result, it has learned rich feature representations for a wide range of images. The network takes input images of 224 by 224 pixels. VGG16 has thirteen convolutional layers, which are separated by pooling layers. The final two layers of the pre-trained network are the last fully connected layer and the output layer. For a new task, these two layers are replaced with layers that match the new classes; in this setting, a Loss 3 classifier and an output layer comprise the final two layers of VGG16. VGG16 is a well-known baseline for feature extraction. Each layer is made of filters that extract features. As the network gets deeper, the number of filters increases, allowing additional features to be extracted, while the spatial size of the feature maps decreases (Pragathi, Priyadarshini, Saveetha, Banu, & Aarif), (Korichi, Slatnia, Aiadi, Tagougui, & Kherallah, 2020). Figure (1)(a) shows the VGG16 architecture.

Figure 1: (a) VGG16, (b) VGG19 Architecture (Jraba, Elleuch, & Kherallah, 2021)

2.2     VGG19

       VGG19 is a pre-trained deep learning model for image classification. The network has 19 weight layers and was trained on one million images in 1000 categories from the ImageNet database. These 19 layers consist of 16 convolutional layers with stride and padding of 1 and three fully connected layers, interleaved with 2x2 max pooling layers (Almisreb, Turaev, Saleh, & Al Junid, 2022). The network uses only 3x3 convolutional layers stacked on top of one another to increase depth, with max pooling over a 2x2 pixel window to reduce the volume size. After the final max pooling layer come three fully connected (FC) layers: the first two have 4,096 nodes each, while the third performs 1000-way ILSVRC classification and hence has 1000 channels (one for each class). A soft-max layer follows. All hidden layers in VGG19 use the rectified linear unit (ReLU) activation (Khari, Garg, Crespo, & Verdú, 2019). Figure (1)(b) shows the VGG19 architecture.
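The layer counts described above can be checked with a short sketch. The configuration lists below follow the standard published VGG layouts (numbers are output channels of 3x3 convolutions, "M" marks a 2x2 max pooling layer); the names `VGG16_CFG`, `VGG19_CFG`, and `summarize` are hypothetical and are not taken from the authors' MATLAB implementation.

```python
# Standard VGG layouts: integers are conv output channels, "M" is 2x2 max pooling.
VGG16_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
             512, 512, 512, "M", 512, 512, 512, "M"]
VGG19_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, 256, "M",
             512, 512, 512, 512, "M", 512, 512, 512, 512, "M"]


def summarize(cfg, input_size=224, num_fc_layers=3):
    """Count weight layers and track the feature-map size through the network."""
    conv_layers = sum(1 for item in cfg if item != "M")
    size = input_size
    channels = None
    for item in cfg:
        if item == "M":
            size //= 2          # each 2x2 pooling halves height and width
        else:
            channels = item     # 3x3 conv with padding 1 keeps the spatial size
    return {
        "conv_layers": conv_layers,
        "weight_layers": conv_layers + num_fc_layers,
        "final_feature_map": (size, size, channels),
    }


vgg16 = summarize(VGG16_CFG)
vgg19 = summarize(VGG19_CFG)
print(vgg16)
print(vgg19)
```

Both networks reduce a 224x224 input through five pooling stages before the fully connected layers; the only difference between them is the number of convolutional layers per stage.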

2.3     MobileNet-v2

        MobileNet-V2 is a lightweight deep neural network model that uses depth-wise separable convolutions to extract spatial and channel features efficiently. This is achieved by decomposing the standard convolution operation into two distinct convolutions, a depth-wise convolution and a point-wise convolution (Ghosh et al., 2020). MobileNet-V2 also provides two global hyperparameters, the width multiplier and the resolution multiplier, to strike a balance between latency and accuracy (Hamida, El Gannour, Cherradi, Ouajji, & Raihani, 2022). Figure (2) shows the MobileNet-v2 architecture.



Figure 2: MobileNet-v2 Architecture (Jin et al., 2023)

        Depth-wise separable convolution is the basic unit of MobileNet-V2. The inverted residual structure and linear bottlenecks are MobileNet-V2's most significant enhancements. The inverted residual structure is built from modules joined by residual connections. Within each module, an expansion convolution first increases the dimensionality, a depth-wise convolution is then applied, and finally a projection convolution reduces the dimensionality (Jin et al., 2023), (Srinivasu et al., 2021).
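To illustrate why the depth-wise separable decomposition is cheaper, the following minimal sketch compares weight counts (biases ignored) for a standard convolution and its separable counterpart, and lists the channel flow through one inverted residual module. The function names are hypothetical, and the expansion factor of 6 is an assumed default, not a value reported in this paper.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weights in a depth-wise k x k conv followed by a 1x1 point-wise conv."""
    depthwise = k * k * c_in      # one k x k filter per input channel
    pointwise = c_in * c_out      # 1x1 conv mixes channels
    return depthwise + pointwise


def inverted_residual_channels(c_in, c_out, expansion=6):
    """Channel flow through one inverted residual module (assumed expansion factor)."""
    hidden = c_in * expansion
    return [
        ("expand 1x1", c_in, hidden),        # increase dimensionality
        ("depthwise 3x3", hidden, hidden),   # spatial filtering per channel
        ("project 1x1", hidden, c_out),      # linear bottleneck, reduce dimensionality
    ]


standard = standard_conv_params(3, 32, 64)
separable = depthwise_separable_params(3, 32, 64)
print(standard, separable, round(standard / separable, 2))
print(inverted_residual_channels(24, 24))
```

For a 3x3 layer with 32 input and 64 output channels, the separable form needs roughly an eighth of the weights of the standard convolution.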

2.4     ResNet-50

        This model is a pre-trained ResNet-50 deep convolutional neural network (CNN). ResNet was selected because of its extensive use of Batch Normalization and Dropout techniques. These two strategies regularize the model and mitigate the risk of overfitting. In addition, the identity mappings, or skip connections, in Residual Neural Networks (ResNets) address the issue of vanishing gradients, thereby facilitating the training of a deeper and more complex model and leading to improved performance. The model was pre-trained using weights from the ImageNet Large Scale Visual Recognition Competition (ILSVRC) dataset. Thus, the initial model accepts images with a resolution of 224 by 224 pixels (Chatterjee, Dutta, Ganguly, Chatterjee, & Roy, 2019), (Chatterjee, Dutta, Ganguly, Chatterjee, & Roy, 2020). Figure (3) shows the ResNet-50 architecture.
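The idea of a skip connection can be shown with a toy example. The sketch below uses small dense matrices in place of the convolution and batch normalization layers of a real ResNet block, so it is only an illustration of the identity mapping, not the ResNet-50 block itself; `residual_block`, `w1`, and `w2` are hypothetical names.

```python
import numpy as np


def relu(x):
    return np.maximum(0.0, x)


def residual_block(x, w1, w2):
    """Toy residual block: output = ReLU(F(x) + x), with F a two-layer transform."""
    fx = w2 @ relu(w1 @ x)   # residual branch F(x)
    return relu(fx + x)      # skip connection adds the input back


x = np.array([1.0, 2.0, 3.0])
zeros = np.zeros((3, 3))

# With all-zero weights the residual branch contributes nothing, so the block
# passes its (non-negative) input through unchanged.
out = residual_block(x, zeros, zeros)
print(out)
```

Because the input is always added back, the block can fall back to the identity mapping, which is what lets gradients flow through very deep stacks of such blocks.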


Figure 3: ResNet-50 Architecture (Chatterjee & Roy, 2020)


        C. Chandankhede and R. Sachdeo, 2023 (Chandankhede & Sachdeo, 2023) created the handwritten Modi barakhadi dataset by collecting samples from roughly 25 individuals, and evaluated it with 7,721 items. The Otsu binarization technique was used to trim and pre-process all individual characters. The performance of the pre-processed data on a real-world handwritten character database generated by multiple individuals was evaluated using both methodologies. The testing accuracy of ResNet-50 was 94.552%, and the model precision was 0.86.

        M. Halder, et al., 2023 (Halder, Kundu, & Hasan, 2023) created a new convolutional neural network model and evaluated it using the CMATERdb 3.1.2 dataset. The model surpasses previous methods reported in the literature for the alphabets of the dataset, with an average training accuracy of 98.78%, an average validation accuracy of 98.33%, and an average testing accuracy of 98.21%. Their study established a comprehensive framework for the identification and classification of individual Bangla characters, which serves as a crucial stepping-stone towards further progress in Bangla handwriting recognition.

       S. D. Pande, et al., 2022 (Pande et al., 2022) used the most effective techniques for improving the recognition rate and configured a CNN for Devanagari handwritten character recognition using the DHCD dataset, a strong open dataset with 46 classes of Devanagari characters and 2,000 unique images for each class. After recognition, conflict resolution was crucial for effective recognition, and the method enabled the user to resolve disagreements. In terms of precision and training time, this strategy produced favorable outcomes.

        A. A. A. Ali and S. Mallaiah, 2022 (Ali & Mallaiah, 2022) used CNN and SVM classifiers for Arabic handwriting identification, with two distinct deep neural network models to obtain accurate features. In addition, they investigated the application of dropout in the suggested model for text recognition in images of handwritten documents and demonstrated the system's efficiency for handwriting recognition in several Arabic scripts tested on diverse datasets. Simulation findings indicate that the suggested CNN-based-SVM with dropout model outperforms the conventional CNN classifier and the dropout CNN-based-SVM model.

        M. Elleuch, et al., 2021 (Elleuch, Jraba, & Kherallah, 2021) investigated the application of transfer learning (TL) in their suggested models (Inception-v3, ResNet, and VGG16) for Arabic handwriting recognition and demonstrated the effectiveness of the system using the IFN/ENIT database. Deep CNN-based models trained from scratch were compared with transfer learning techniques. When applied to a set of images of handwritten Arabic words from IFN/ENIT, the ResNet and VGG models with TL yielded encouraging results, with 98.99% and 98.10% accuracy, respectively.

        H. M. Balaha, et al., 2021 (Balaha, Ali, Saraya, & Badawy, 2021) introduced a novel deep learning (DL) framework that uses two distinct Convolutional Neural Network architectures, namely HMB1 and HMB2. Additionally, they provided several optimization strategies, regularization approaches, and dropout mechanisms. Their approach could serve as a foundation for further investigation into handwritten Arabic text. The performance metrics computed were accuracy, recall, precision, and F1. The uniform weight initializer and the AdaDelta optimizer achieved the highest accuracy, and data augmentation resulted in a notable improvement in accuracy. HMB1 reported a testing accuracy of 98.4% using augmentation on the HMBD dataset, which consisted of 865,840 data points.

        M. M. Yapıcı, et al., 2021 (Yapıcı, Tekerek, & Topaloğlu, 2021) presented Cycle-GAN as a new data augmentation strategy to address the issue of insufficient data in signature verification. In addition, a signature verification system based on Caps-Net was introduced. Four commonly used convolutional neural network (CNN) approaches were used to test the proposed data augmentation technique: VGG16, VGG19, ResNet-50, and DenseNet-121. The technique contributed substantially to the success of all four CNN methods and provided the greatest benefit on DenseNet-121. The authors evaluated the data augmentation technique with the suggested signature verification system on two well-known databases, GPDS and MCYT. In comparison to other studies, their verification approach produced the best results on the MCYT database and the second-best results on the GPDS database.

        M. Shams, et al., 2020 (Shams, Elsonbaty, & ElSawy, 2020) described an effective deep convolutional neural network architecture for feature extraction and classification on the Arabic Handwritten Characters Dataset (AHCD). The researchers employed a dropout support vector machine (SVM) to categorize and identify missing attributes that were not accurately detected by the deep convolutional neural network (DCNN), in order to enhance the dependability and effectiveness of the proposed framework. In addition, the proposed approach employed K-means clustering to divide the multi-stroke Arabic characters into 13 distinct groups of similar characters. The system achieved a classification accuracy of 95.07% and a classification error rate of 4.93%.

        S. Jraba, et al., 2020 (Jraba, Elleuch, & Kherallah, 2020) proposed a novel technique for the identification of Arabic handwritten words. Arabic handwriting identification is a recent area of focus within computer vision, with several promising applications including intelligent systems, video conferencing, and real-time applications. The authors proposed Deep Convolutional Neural Networks (DCNN) as a tailored approach to the classification task, using the prominent ResNet and VGG16 architectures. These models were trained on an augmented dataset comprising images sourced from the IFN/ENIT database. The approach demonstrated commendable performance in conventional pattern recognition tasks, and the resulting system achieved very high identification rates.

        M. Elleuch and M. Kherallah, 2020 (Elleuch & Kherallah, 2020) addressed the problem of overfitting by adding regularization techniques to their Convolutional Deep Belief Network (CDBN) model and used the IFN/ENIT datasets with data augmentation to test the proposed model on low and high-level dimensions in Arabic textual (character/word) images. CDBNs, which combine the advantages of deep belief networks and convolutional neural networks, were used to automatically learn the most discriminative characteristics from Arabic handwritten script (AHS) textual image data. The experimental results demonstrate that the proposed CDBN architectures outperform convolutional networks for categorizing textual image data.


4.1     Convolution Neural Network Operations

        Most often, CNNs consist of a number of sequentially linked convolutional layers followed by a number of fully connected layers. In each successive convolutional layer, the inputs from the layer below are convolved with trained filters. After the convolution step, a pooling operation is carried out on the output of the current layer in order to reduce the size of the data and limit overfitting in the network (Safarzadeh & Jafarzadeh, 2020).

        Convolution refers to the mathematical combination of two functions to form a third function. In the context of CNNs, a convolutional layer applies a filter (also called a kernel) to the input data (image) to produce a feature map. Figure (4) shows the dot product operation between a 3x3 filter matrix and a 3x3 region of the input image matrix: the two matrices are multiplied element by element, and the products are summed to give the output value (destination pixel) on the feature map. The filter then slides over the input matrix, repeats the dot product with each remaining 3x3 region, and completes the feature map. Multiple filters are used for a single input, and the resulting feature maps are stacked together to form the final output of a single convolutional layer (Dao, 2020).

Figure 4: The process of wrapping the filter with the image in a single layer (Dao, 2020)

4.1.1   Convolutional Layer: Convolutional layers are the first to extract features from the image. Because each pixel is related mainly to nearby pixels, convolution preserves the relationships between different parts of a picture (Lamsaf, Ait Kerroum, Boulaknadel, & Fakhri, 2022). Convolution is the first and most crucial stage: the picture is filtered with a smaller filter, which reduces its size while keeping the relationships between pixels (Boutounte & Ouadid, 2021). For example, convolving a 5x5 picture with a 3x3 filter and a 1x1 stride (a 1-pixel shift at each step) yields a 3x3 output, a 64% reduction in the number of values (Hossain & Ali, 2019).

        The convolution layer is the most important part of a CNN. It convolves, or multiplies, the input pixel matrix with its filters to create an activation map for the image (Niharmine, Outtaj, & Azouaoui, 2022). Convolution in CNNs is defined slightly differently from convolution in mathematics or engineering. A convolutional layer contains K filters (also called kernels), which identify features such as corners, edges, and endpoints. Filters are N*N*R grids, where N is the filter's height and width and R is the number of picture channels. Each filter slides over the input grid, multiplying each pixel by the corresponding filter value; the products are then summed (Altwaijry & Al-Turaiki, 2021).

        In summary, convolution combines two functions to create a third, and CNNs create feature maps from image data by applying the filters (kernels) of a convolutional layer (Adebayo, Oluwatobi Aworinde, Akinwunmi, Ayandiji, & Olalekan Monsir, 2022). Summing the element-wise products of the filter and an image region yields one destination pixel of the feature map. The filter then travels over the input matrix, repeats the dot product with each set of 3 x 3 regions, and finishes the feature map (Truong Quang, Duy, & Nhan, 2020). The output of a single convolutional layer is formed by stacking the feature maps produced by the different filters for a single input (Dao, 2020).
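The sliding dot product described in this section can be sketched in a few lines of NumPy. This is a minimal single-channel illustration with no padding, written for clarity rather than speed; the function name `conv2d_valid` and the example values are hypothetical.

```python
import numpy as np


def conv2d_valid(image, kernel, stride=1):
    """Slide the kernel over the image and sum the element-wise products."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    oh = (ih - kh) // stride + 1
    ow = (iw - kw) // stride + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            region = image[i * stride:i * stride + kh,
                           j * stride:j * stride + kw]
            out[i, j] = np.sum(region * kernel)   # one destination pixel
    return out


image = np.arange(25, dtype=float).reshape(5, 5)   # 5x5 input
kernel = np.ones((3, 3))                           # 3x3 filter
feature_map = conv2d_valid(image, kernel)
print(feature_map.shape)   # (3, 3)
print(feature_map)
```

A 5x5 input and a 3x3 filter with stride 1 give a 3x3 feature map, matching the size reduction mentioned in Section 4.1.1.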

Figure 5: General architecture of CNN (Siddique, Sakib, & Siddique, 2019)

        Mathematically, if a two-dimensional image (I) is used as input together with a two-dimensional filter (K) of size (m*n), the feature map (S) is obtained according to equation (1) (Yao & Zheng, 2023):

$$S(i,j) = (I * K)(i,j) = \sum_{a=0}^{m-1}\sum_{b=0}^{n-1} I(i+a,\, j+b)\, K(a,b) \quad (1)$$

        Additionally, accuracy is the number of correctly classified instances divided by the total number of instances, as given by equation (2) (Vakili, Ghamsari, & Rezaei, 2020):

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \quad (2)$$

4.1.2   Activation Function: An activation function is a node (placed at the end of or between the layers of a neural network) that helps determine whether a neuron will fire or not (Wang, Li, Song, & Rong, 2020). This study employed the Rectified Linear Unit (ReLU) function, which produces an output of zero when the input value is less than or equal to zero; otherwise, the output equals the input value. Equation (3) defines this function (Kandel & Castelli, 2020):

$$f(x) = \max(0, x) \quad (3)$$

4.1.3   Pooling or Subsampling: Pooling is an essential step in convolution-based systems. It reduces the dimensions of the feature maps by combining each group of neighbouring values into a single value (Gholamalinezhad & Khosravi, 2020).
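As an illustration, the sketch below implements max pooling, the variant used in the VGG models, over non-overlapping 2x2 windows. The window size, stride, function name, and example values are assumptions chosen for the example.

```python
import numpy as np


def max_pool2d(feature_map, size=2, stride=2):
    """Replace each size x size window with its maximum value."""
    h, w = feature_map.shape
    oh = (h - size) // stride + 1
    ow = (w - size) // stride + 1
    out = np.empty((oh, ow), dtype=feature_map.dtype)
    for i in range(oh):
        for j in range(ow):
            window = feature_map[i * stride:i * stride + size,
                                 j * stride:j * stride + size]
            out[i, j] = window.max()
    return out


fm = np.array([[1, 3, 2, 4],
               [5, 6, 1, 0],
               [7, 2, 9, 8],
               [3, 4, 6, 5]])
pooled = max_pool2d(fm)
print(pooled)   # [[6 4]
                #  [7 9]]
```

The 4x4 feature map becomes 2x2: each output value summarizes four input values, so the number of values drops by 75%.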

4.1.4   Flattening Layer: Flattening collapses the spatial and channel dimensions of the pooled feature map into a single dimension while preserving the batch dimension. If the inputs are shaped without a channel dimension, the flattening layer adds an extra one. After the flattening operation, the feature matrix becomes a vector that can be fed into Keras' dense layer, a fully connected neural network layer (Mishra, Sachan, & Rajpal, 2020).
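A minimal sketch of flattening, using NumPy in place of a deep learning framework: the batch axis is kept and all remaining axes are collapsed into one. The batch size of 4 and the 7x7x512 feature-map shape (the size of the last pooled feature map in VGG for a 224x224 input) are assumptions for illustration.

```python
import numpy as np

# A batch of 4 pooled feature maps, each 7 x 7 with 512 channels (assumed shape).
batch = np.zeros((4, 7, 7, 512))

# Keep the batch axis, collapse height, width, and channels into one vector.
flat = batch.reshape(batch.shape[0], -1)
print(flat.shape)   # (4, 25088)
```

Each image is now a vector of 7 * 7 * 512 = 25,088 values, which is the form a fully connected layer expects.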

4.1.5   Fully Connected Layer: The fully connected layers come after the convolution and pooling layers and form the classification part of the network, which consists of a few such layers. These layers accept only one-dimensional data. In a fully connected layer, every neuron is connected to every neuron of the next layer, so the layer serves as the interface between the neurons of adjacent layers (Dubey & Jain, 2019).

        The SoftMax activation function is applied after the last fully connected layer. That layer produces real-valued scores that are not appropriately scaled and may be difficult to work with; SoftMax converts this vector of numbers into a vector of probabilities between 0 and 1. The SoftMax activation function ($\sigma$) can be defined by equation (4) (Bhatnagar, Gill, & Ghosh, 2020):

$$\sigma(x_i) = \frac{e^{x_i}}{\sum_{j=1}^{n} e^{x_j}} \quad (4)$$

x: Values from the neurons of the output layer.

n: The number of neurons.

$\sum_{j=1}^{n} e^{x_j}$: The sum of the exponential values (e) of the output cells.

Finally, the fully connected layers connect every output of the last max pooling layer to the output neurons (Chauhan, Ghanshala, & Joshi, 2018).
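The two activation functions described in Sections 4.1.2 and 4.1.5 can be sketched as follows. The subtraction of the maximum inside `softmax` is a common numerical-stability trick that does not change the result; the example scores are made up for illustration.

```python
import numpy as np


def relu(x):
    """Zero for inputs <= 0, identity otherwise."""
    return np.maximum(0.0, x)


def softmax(x):
    """Convert a vector of real-valued scores into probabilities."""
    shifted = x - np.max(x)      # stability shift; leaves the ratios unchanged
    exps = np.exp(shifted)
    return exps / np.sum(exps)


scores = np.array([2.0, 1.0, -1.0])
probs = softmax(scores)
print(relu(np.array([-2.0, 0.0, 3.0])))   # [0. 0. 3.]
print(probs, probs.sum())
```

The largest score receives the largest probability, and the probabilities always sum to 1, which is what allows the output layer to be read as a distribution over the character classes.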

4.2     Dataset

        The research uses a dataset created for handwritten characters of the Assyrian language. The dataset covers 36 Assyrian characters with 500 samples of each character, for a total of 18,000 samples. Initially, the form for the Assyrian language was distributed to people of both genders between the ages of 13 and 60 to fill out.

The researcher selected a sample of 500 people. The forms were distributed and collected over 6 months across 4 governorates, as shown in Table (2):

Table 2: Form Distribution details and Numbers



The number of forms










Kori Gavana





















        The respondents were then asked to write the characters in the empty squares, each 224 x 224 pixels in size, provided in the table on the form, as shown in Figure (6).

Figure 6: Sample of Assyrian Language full Forms

        All cropped character images were saved with a new label from 1 to 1000, preceded by the folder number (class), which runs from 0 to 34. Each character is stored in a separate folder. Figure (7) shows a sample of characters after cropping.

Figure 7: Sample of Cropping and Labelling Characters


         Because the VGG16, VGG19, MobileNet-V2, and ResNet-50 models were used, every image was resized to 224 by 224 pixels; the original dimensions differed from image to image. During the training procedure, images in logical form were converted to unsigned integer form to obtain relevant features. Finally, the bmp images were converted to greyscale. These steps were carried out in MATLAB.

Some ambiguous images in the dataset affect classifier accuracy because of the peculiarities and characteristics of individual handwriting. Scanning noise is another issue. Figure 1 (a) shows examples of chosen characters used in the study, and (b) shows modified instances of characters that cause confusion during the process. All of the preceding stages were completed as part of the preprocessing phase.

4.3     Split the Dataset

        The new dataset consists of 18,000 images. The CNN models were trained on 70% of the images, giving a training set of 12,600 images, and tested on the remaining 30%, giving a testing set of 5,400 images. Table (3) describes the training and validation partition and the number of character images:





Table 3: Dataset Partition


Number of Characters | 36
Training images (70%) | 12,600
Validation images (30%) | 5,400
Total of images | 18,000
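The partition sizes can be reproduced with a short sketch. It assumes the 70/30 split is applied separately within each of the 36 classes (350 training and 150 testing images per character); the paper does not state whether the split was stratified, and the function name and random seed are hypothetical.

```python
import random


def split_per_class(samples_per_class=500, num_classes=36,
                    train_fraction=0.7, seed=0):
    """Shuffle the sample indices of each class and split them 70/30."""
    rng = random.Random(seed)
    n_train = round(samples_per_class * train_fraction)
    train, test = [], []
    for label in range(num_classes):
        indices = list(range(samples_per_class))
        rng.shuffle(indices)
        train += [(label, i) for i in indices[:n_train]]
        test += [(label, i) for i in indices[n_train:]]
    return train, test


train_set, test_set = split_per_class()
print(len(train_set), len(test_set))   # 12600 5400
```

Splitting within each class keeps every character equally represented in both partitions, which matters when per-class accuracy is compared later.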


4.4     Models Training

        The training step is very important because it is the stage at which a model is built from the data given to it. The model is trained on the training dataset to find the correct weights, which are adjusted automatically by the specified algorithm so as to reduce the prediction error.

At this stage, the compile and fit functions were implemented in MATLAB. The compile function specifies the network optimizer (Adaptive Moment Estimation, ADAM), the loss function (categorical_crossentropy), and the metric (accuracy). The fit function specifies the training and validation data and the number of training cycles (epochs). Experiments showed that 30 epochs are sufficient to tune the weights, with a learning rate of 0.0001 for the weight updates and a batch size of 100.
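To make these settings concrete, the sketch below computes the number of weight updates implied by the reported hyperparameters, evaluates the categorical cross-entropy loss on a made-up batch, and performs one Adam update. It is a NumPy illustration, not the authors' MATLAB code; the Adam constants `beta1`, `beta2`, and `eps` are the commonly used defaults and are assumptions, as are all function names and example values.

```python
import math
import numpy as np

# Hyperparameters as reported in the text.
TRAIN_IMAGES = 12_600
BATCH_SIZE = 100
EPOCHS = 30
LEARNING_RATE = 1e-4

iterations_per_epoch = math.ceil(TRAIN_IMAGES / BATCH_SIZE)
total_iterations = iterations_per_epoch * EPOCHS


def categorical_crossentropy(y_true, y_pred, eps=1e-12):
    """Mean cross-entropy between one-hot labels and predicted probabilities."""
    y_pred = np.clip(y_pred, eps, 1.0)
    return float(-np.mean(np.sum(y_true * np.log(y_pred), axis=1)))


def adam_step(w, grad, m, v, t, lr=LEARNING_RATE,
              beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for weights w at step t (t starts at 1)."""
    m = beta1 * m + (1 - beta1) * grad          # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2     # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                # bias correction
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v


y_true = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
y_pred = np.array([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]])
loss = categorical_crossentropy(y_true, y_pred)

w, m, v = adam_step(np.array([1.0]), np.array([0.5]),
                    np.zeros(1), np.zeros(1), t=1)
print(iterations_per_epoch, total_iterations, round(loss, 4), w)
```

With 12,600 training images and a batch size of 100, each epoch consists of 126 updates, so 30 epochs correspond to 3,780 updates in total.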

4.5     Models Testing

        In this step, the models were tested on the data allocated for validation to verify whether they recognized the Assyrian characters correctly.

A confusion matrix was used to evaluate the predictions of each model on the validation data. The confusion matrix function yields the values of TP, FP, FN, and TN, from which the precision and recall of the model and their combined measure were calculated using the equations mentioned above.
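The following sketch shows how TP, FP, FN, and TN can be read off a multi-class confusion matrix and turned into precision, recall, their combined measure (F1), and overall accuracy. It assumes rows hold the true classes and columns the predicted classes; the 3-class matrix is a made-up example, not data from this study.

```python
import numpy as np


def per_class_counts(cm):
    """TP, FP, FN, TN for each class (rows = true class, columns = predicted)."""
    tp = np.diag(cm)
    fp = cm.sum(axis=0) - tp     # predicted as the class but actually another
    fn = cm.sum(axis=1) - tp     # belongs to the class but predicted as another
    tn = cm.sum() - (tp + fp + fn)
    return tp, fp, fn, tn


def metrics(cm):
    tp, fp, fn, tn = per_class_counts(cm)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = tp.sum() / cm.sum()
    return precision, recall, f1, accuracy


cm = np.array([[8, 1, 1],
               [2, 7, 1],
               [0, 1, 9]])
precision, recall, f1, accuracy = metrics(cm)
print(precision, recall, f1, accuracy)
```

Note that a class that is never predicted would give a zero denominator for precision; a production implementation would need to handle that case explicitly.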


4.6     Dimensional Reduction

        Principal Components Analysis (PCA) is a popular approach for both transforming data and reducing dimensionality (Mahmood & Abdulrazzaq, 2022). PCA computes the features that explain the majority of the variation in the data: it locates a subspace that accounts for most of the variance and then eliminates the dimensions that have low variance (Yang et al., 2020). This is achieved by transforming the data space, introducing new features that are linearly uncorrelated with one another. PCA has been used successfully in classification and regression problems as a way to transform the data and reduce the number of dimensions (Valls, Aler, Galván, & Camacho, 2021).

        In addition to being a statistical tool, principal component analysis makes use of orthogonal transformations. With PCA, a set of correlated variables can be transformed into a set of uncorrelated variables. PCA is a technique for exploratory data analysis and can also be used to study the relationships between variables. As a result, it can be used to reduce dimensionality (Ma & Yuan, 2019), (Reddy et al., 2020).
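A minimal PCA sketch using the singular value decomposition is given below. It is an illustration on synthetic data, under the assumption that PCA is applied to a matrix with one row per sample; how PCA was applied to the Assyrian dataset is not specified in the text, and the function name and synthetic data are hypothetical.

```python
import numpy as np


def pca(X, n_components):
    """Project X (samples x features) onto its top principal components."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the orthogonal principal directions, ordered by variance.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    explained_variance = (S ** 2) / (X.shape[0] - 1)
    ratio = explained_variance / explained_variance.sum()
    return X_centered @ components.T, ratio[:n_components]


# Synthetic data: two strongly correlated features plus one low-variance feature.
rng = np.random.default_rng(0)
t = rng.normal(size=200)
X = np.column_stack([t,
                     2 * t + 0.01 * rng.normal(size=200),
                     0.01 * rng.normal(size=200)])

Z, ratio = pca(X, 2)
print(Z.shape, ratio)
print(np.cov(Z.T))   # off-diagonal entries are numerically zero
```

The first component captures almost all of the variance of the two correlated features, and the projected variables are uncorrelated, which is the property the section describes.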


        This work was performed using MATLAB R2022b on a machine with a 64-bit Intel® Core (TM) i7-1165G7 processor at 2.80 GHz, and four convolutional neural network models were used. Accuracy results were obtained on the database of Assyrian characters, collected from 500 people aged 13 to 60 who wrote 36 handwritten letters of the Assyrian language. The highest accuracies were as follows: 95.70% for MobileNet-V2, 94.97% for ResNet-50, 92.06% for VGG19, and finally 90.97% for VGG16.

Table 4: Confusion Matrix parameters for Four Models

CNN Models




















When the results were compared with those of other researchers who used other languages and datasets, they were better in some cases and worse in others, owing to differences in the datasets used in terms of language, number of images, and image size.

Table 5: Accuracy of CNN Methods Used in Handwritten Recognition

Reference | CNN Methods | Dataset | No. of Training Samples | Feature Selection / Dimensionality Reduction
(Chandankhede & Sachdeo, 2023) | | Modi barakhadi | 7,721 characters |
(Halder et al., 2023) | CNN New Method | CMATERdb 3.1.2 | 37,858 images |
(Pande et al., 2022) | CNN including Dropout Layer | | 46 classes |
(Ali & Mallaiah, 2022) | | | 13,311 words; 16,800 characters; 26,459 words; 6,600 shapes |
(Elleuch et al., 2021) | | | 10 classes, 450 images |
(Balaha et al., 2021) | CNN with HMB1 & HMB2 | | Seven-page dataset |
(Yapıcı et al., 2021) | VGG16, VGG19, ResNet50, DenseNet121 | | 4,000 signatures |
(Shams et al., 2020) | | | 840 images |
(Jraba et al., 2020) | ResNet, VGG16 | | AHS images |
(Elleuch & Kherallah, 2020) | | | Augmented AHS images |
(Ahlawat & Choudhary, 2020) | Hybrid CNN & SVM | | 70,000 data elements |
(Das & Mohanty, 2020) | | Javanese Script | 120 characters |
(Ashiquzzaman, Tushar, Rahman, & Mohsin, 2019) | | | 3,000 digits |
This Research | | | 18,000 images for 36 classes of characters |

[The remaining cells of Table 5, including the accuracy values, are not recoverable from the source text.]









        Therefore, while the subject of handwritten character and digit recognition requires major attention, few academic efforts are expended by Assyrian technicians to design a functioning system in this regard. At the same time, logically, the topic has not been addressed sufficiently in academic publications and essays, to the extent that the subjects are concerned with the Assyrian language.

         It can also be noted from the previous table that the researchers who used ResNet and VGG16 with a DCNN (Elleuch et al., 2021) obtained better results than those we obtained for the same models using CNN. Their dataset had fewer classes, and their training and testing images were likewise fewer in number.

In (Chandankhede & Sachdeo, 2023), the ResNet-50 result was slightly lower than the result of our research; they used 7,721 character images, while we used 18,000 images.




Figure 8. Performance of model Training and Validation for VGG16: (a) Accuracy, and (b) Loss




Figure 9. Performance of model Training and Validation for VGG19: (a) Accuracy, and (b) Loss



Figure 10. Performance of model Training and Validation for MobileNet-V2: (a) Accuracy, and (b) Loss



Figure 11. Performance of model Training and Validation for ResNet-50: (a) Accuracy, and (b) Loss


        Initially, a dataset of 36 Assyrian characters was created. It was collected by distributing 500 forms to people of both genders between the ages of 13 and 60. The images were then resized to 224 * 224 pixels. Four convolutional neural network models, namely VGG16, VGG19, MobileNet-V2, and ResNet-50, were trained to recognize the handwritten Assyrian characters, and the highest accuracy rate was achieved by MobileNet-V2 at 95.70%. Finally, a comparative analysis was conducted to evaluate the Assyrian handwriting recognition models against contemporary models employed in other languages.


