Handwriting Recognition using AI Network and Image Processing



Deep learning (DL) is a hot topic in current pattern recognition and machine learning. DL has unprecedented potential to solve many complex machine learning problems and is clearly attractive in the framework of mobile devices. The availability of powerful pattern recognition tools creates tremendous opportunities for next-generation smart applications. A convolutional neural network (CNN) enables data-driven learning and extraction of highly representative, hierarchical image features from appropriate training data. However, for some data sets, the CNN classification method needs adjustments to its structure and parameters. Mobile computing also imposes constraints on a neural network's running time and weight storage. In this paper, we first design an image processing module for a mobile device based on the characteristics of a CNN. Then, we describe how to use the mobile device to collect data, process the data, and construct the data set. Finally, considering the computing environment and data characteristics of mobile devices, we propose a lightweight network structure for optical character recognition (OCR) on specific data sets. The proposed CNN-based method is validated by comparison with the results of existing optical character recognition methods.
Optical character recognition technology refers to the process of using electronic devices to scan printed characters, determine their shape by detecting edge information, and then translate the shapes into computer characters through character recognition [1]. OCR technology combines digital image processing, computer graphics and artificial intelligence and is one of the most active research topics in the field of pattern recognition. In China, there is an urgent need to use OCR technology for the digital preservation of Shui characters. Shui characters are hieroglyphic; along with the Dongba script, they are among the few surviving pictographic writing systems, and their shapes resemble those of oracle bone and bronze inscriptions. Cultural inheritance of this language currently depends on oral communication and the handwriting of a small number of people; thus, most of the existing Shui characters are illegible and the books are unreadable. Therefore, by using advanced information processing methods, such as machine learning together with big data collection and analysis, we can move beyond traditional document preservation methods and establish the digital preservation that is urgently needed.

With the rapid development of mobile Internet services and the popularity of various intelligent devices, more and more users receive and transmit information through mobile devices, which brings great convenience in terms of data collection, storage and use. Advances in mobile service computing and embedded devices have led to the development of the Internet of Things (IoT), which increasingly links physical objects in everyday environments, dramatically changing the interaction between people and things. Especially in mobile phones, artificial intelligence technology represented by deep learning makes the devices capable of machine learning. There are many potential applications, such as object detection and recognition; speech-to-text translation; media information retrieval; and multimodal data analysis. Deep learning brings tremendous opportunities for mobile computing and can greatly improve the performance of various applications. In addition, due to the rapid spread of smart portable devices and the development of mobile service technology [2, 3], the possibility of introducing smart applications in mobile environments is receiving increased attention. Therefore, people are increasingly concerned with the possibility of applying deep neural networks (DNNs) in mobile environments [4]. Deep learning not only improves the performance of existing mobile multimedia applications, but also paves the way for more complex applications on mobile devices. Many of these devices (including smart watches, smartphones, and smart cameras) can perform some sensing and processing, making them smart objects that can learn. Despite the large potential of mobile DNNs, the design of neural network architectures for mobile environments is not well developed. From basic deep learning techniques to network architectures, training and inference algorithms, there are still many important issues that need to be addressed.

The most widely used neural network for deep learning work is the convolutional neural network, which converts unstructured image data into structured object tag data [5]. Generally, the working principle of a CNN is as follows: first, the convolution layer scans the input image to generate a feature vector; second, the activation layer determines which features are activated for the input under inference; third, the pooling layer reduces the size of the feature vector; and finally, a fully connected layer connects each potential tag to all outputs of the pooling layer. Although current deep learning technology has made a major breakthrough in the field of OCR, the computing and storage resources of mobile intelligent devices are limited, and a convolutional neural network model usually has hundreds of megabytes of parameters, which makes it difficult to deploy on mobile devices. It is a challenge to unify the interconnection and interoperability between objects and the cloud, and to guide IoT application systems that support horizontal interconnection, heterogeneous integration, resource sharing and dynamic maintenance [6, 7]. How to abstract the capabilities provided by the “object” and “cloud” resources into a unified system of software components according to the application requirements, and how to define the interaction topology between these components to establish an IoT-based software architecture, is another open challenge.
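To make the four-stage pipeline above concrete, here is a minimal sketch in PyTorch. The layer sizes, the 28x28 grayscale input and the 10-class output are illustrative assumptions, not values from the paper.

```python
import torch
import torch.nn as nn

class TinyOCRNet(nn.Module):
    """Toy CNN following the conv -> activation -> pool -> fully connected pipeline."""
    def __init__(self, num_classes=10):          # num_classes is an assumed value
        super().__init__()
        self.conv = nn.Conv2d(1, 16, kernel_size=3, padding=1)  # scan input, produce feature maps
        self.act = nn.ReLU()                                    # decide which features activate
        self.pool = nn.MaxPool2d(2)                             # shrink the feature maps
        self.fc = nn.Linear(16 * 14 * 14, num_classes)          # connect features to class tags

    def forward(self, x):                 # x: (batch, 1, 28, 28) grayscale characters
        x = self.pool(self.act(self.conv(x)))
        return self.fc(x.flatten(1))      # raw class scores (logits)

logits = TinyOCRNet()(torch.randn(4, 1, 28, 28))  # -> shape (4, 10)
```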

The Internet of Things is an emerging concept that is likely to change our lives as much as the Internet has. In the Internet of Things, IoT devices or objects with smart features can communicate with each other. Each device is given a unique identifier that allows it to access the Internet and make decisions through computation, thus creating a new world of interconnected devices. These IoT applications are sure to make a large difference in our lives. For example, in the medical industry, HIT (Health Internet of Things) sensors can be used to monitor important patient parameters (blood pressure, heart rate, etc.), to record data to better respond to emergencies, and to provide a better quality of life for people with disabilities [10]. Another area of research based on the Internet of Things is the IoUT (Internet of Underwater Things), whose goal is to monitor a wide range of inaccessible waters in order to explore and protect these areas [11]. The application of interest in this article is OCR.

Optical character recognition is one of the litmus tests of pattern recognition algorithms. In addition to pattern recognition, optical character recognition involves other fields of knowledge, such as image processing, artificial intelligence and linguistics [12,13,14]. Optical character recognition lets computers handle real-world texts directly. The research results can be applied to a large number of practical problems, such as mail sorting, check recognition, image retrieval, intelligent robots and handwriting input. Commonly used methods in character recognition fall into three categories: template matching; feature extraction and classification; and deep learning methods. In the template matching method, a standard template set is first established for the characters to be recognized, and the preprocessed image of each character is then matched against the templates in turn; the best-matching template corresponds to the character being recognized. Feature extraction and classification is the most commonly used approach in optical character recognition. The process consists of two parts. The first step uses an algorithm to extract the features of the character image; commonly used feature extraction algorithms include SIFT, SURF, and HOG. The second step uses a classifier to classify the extracted features; common classifiers include the Bayesian classifier, the support vector machine, the K-nearest-neighbor algorithm, and artificial neural network algorithms [15, 16]. Deep learning has become the most commonly used method since its appearance, and the most commonly used deep learning model in the field of character recognition is the convolutional neural network.
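As an illustration of the two-step "feature extraction + classifier" approach, the following is a minimal sketch pairing OpenCV's HOG descriptor with a support vector machine from scikit-learn. The window and cell sizes and the random stand-in images are illustrative assumptions; a real system would train on preprocessed, labeled character images.

```python
import cv2
import numpy as np
from sklearn.svm import SVC

# Step 1: a HOG extractor for 28x28 grayscale character images
# (window, block, stride and cell sizes are assumed values).
hog = cv2.HOGDescriptor((28, 28), (14, 14), (7, 7), (7, 7), 9)

def extract(img):
    return hog.compute(img).ravel()  # fixed-length HOG feature vector

# Toy training set: random images standing in for preprocessed characters.
X = np.stack([extract(np.random.randint(0, 256, (28, 28), dtype=np.uint8))
              for _ in range(20)])
y = np.arange(20) % 2                # two dummy character classes

# Step 2: classify the extracted features with an SVM.
clf = SVC(kernel="rbf").fit(X, y)
print(clf.predict(X[:3]))
```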

In pattern recognition tasks, traditional machine learning tools often provide limited precision, and deep learning has been shown to be one of the most promising methods to overcome these limitations and achieve more powerful and reliable inference. However, although deep learning technology has existed for some time and has been successfully applied in many fields, only a few DL-based mobile applications have been produced, mainly due to the limited computing resources of mobile devices. There are two main models for deploying DL applications on mobile devices: the client-server model and the client-only model. The former uses a mobile device as a sensor (without data preprocessing) or a smart sensor (with some data preprocessing), whose data are sent to a server or a simple cloud that runs the DL engine and sends the results back to the client. In this framework, maintaining good connectivity is the most important requirement for efficient operation of the entire system. The latter model runs DL applications directly on the mobile device. This solution can produce faster results without needing an operational network connection but must cope with the lack of device resources and therefore requires the right software and hardware solutions. In terms of mobile services, Honghao Gao, Wanqiu Huang and others have researched and innovated service patterns and workflows. In the development of deep learning, researchers have optimized model compression and neural network structures, especially for the CNN. The SqueezeNet [33] structure proposed in 2016 is not designed to achieve higher precision, but to simplify network complexity, reducing the size and computation time of the model while achieving performance similar to that of some typical large networks (such as AlexNet and VGG). In 2017, Google proposed a small network structure, MobileNet, for embedded devices such as mobile phones, which can be applied to tasks such as target detection, face attribute analysis and scene recognition [34]. Similarly, ShuffleNet is an efficient network designed for mobile devices by Megvii Technology.
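The core idea behind MobileNet's efficiency can be shown briefly: a standard convolution is factored into a depthwise convolution (one filter per channel) followed by a 1x1 pointwise convolution, sharply cutting the number of weights and multiplications. Below is a minimal PyTorch sketch of such a block; the channel counts are illustrative assumptions, not MobileNet's published configuration.

```python
import torch
import torch.nn as nn

def separable_conv(in_ch, out_ch, stride=1):
    """Depthwise separable convolution block in the style of MobileNet."""
    return nn.Sequential(
        # Depthwise: groups=in_ch applies one 3x3 filter per input channel.
        nn.Conv2d(in_ch, in_ch, 3, stride, 1, groups=in_ch, bias=False),
        nn.BatchNorm2d(in_ch),
        nn.ReLU(inplace=True),
        # Pointwise: a 1x1 convolution mixes information across channels.
        nn.Conv2d(in_ch, out_ch, 1, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

# A plain 3x3 conv from 64 to 128 channels needs 3*3*64*128 = 73,728 weights;
# the separable version needs 3*3*64 + 64*128 = 8,768, roughly 8x fewer.
block = separable_conv(64, 128)
out = block(torch.randn(1, 64, 32, 32))   # -> shape (1, 128, 32, 32)
```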

Different input data and computing environments impose different structural requirements on neural networks. In this paper, we design a mobile computing framework and a convolutional neural network structure, based on the data characteristics and mobile computing requirements, to achieve Shui character recognition.

Method
In this section, we design a mobile service computing framework divided into three modules. We then detail the main content of Shui character recognition in two parts: first, we describe the construction of the data set used in the study; then, we build the CNN model and explain its details.

Mobile computing framework
Deep learning on mobile devices such as mobile phones, watches and even embedded sensors is becoming a hot topic. This interest is driven by a growing community of academic and industrial researchers who are connecting the worlds of machine learning, mobile systems and hardware architecture.

On the smart mobile device, the image library is constructed according to the user’s requirements, and image retrieval is realized. Most existing identification systems use traditional recognition technology, mainly based on image-level matching, and need to manually extract features first, which makes the model susceptible to external factors such as illumination and deformation. Our system instead extracts image features with a convolutional neural network, which replaces the traditional hand-crafted feature extraction algorithms and can obtain better recognition performance because it is more robust to illumination and deformation effects. As shown in Fig. 1, the system separates training from classification. The module on the left is the data acquisition module, which collects the original data through the mobile device; the middle part is the calculation module, which performs data storage and model training in the cloud; the right part is the identification module, which completes the character recognition task. The amount of training data and the computational requirements of a convolutional neural network are extremely large, and mobile devices would be overburdened by the training computation. Therefore, it is unrealistic to train the model on the intelligent device. The system first uses GPUs offline to train the lightweight image recognition CNN in the cloud; it then loads the trained network parameters onto the mobile intelligent device, runs the network’s forward propagation to extract features, and finally searches the image library according to cosine similarity.
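The retrieval step at the end of this pipeline can be sketched as follows: the trained network's forward pass turns each image into a feature vector, and the on-device library is ranked by cosine similarity. Feature extraction is stubbed out here with random vectors, and the 128-dimensional feature size is an illustrative assumption.

```python
import numpy as np

def cosine_search(query_feat, library_feats, top_k=5):
    """Return indices of the top_k library entries most similar to the query."""
    q = query_feat / np.linalg.norm(query_feat)
    lib = library_feats / np.linalg.norm(library_feats, axis=1, keepdims=True)
    sims = lib @ q                    # cosine similarity against every entry
    return np.argsort(-sims)[:top_k]  # best matches first

# Stand-ins for CNN features of the image library and of a query character.
library = np.random.randn(1000, 128)
query = np.random.randn(128)
print(cosine_search(query, library))
```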

Deep convolutional neural networks have made great breakthroughs in machine learning, but some research challenges remain. The first is determining the number of CNN layers and neurons, which can currently be done only through extensive experimentation. Second, efficient deep learning algorithms still rely on large-scale data sets; to improve the accuracy of Shui character recognition, we likewise need to provide many training samples. In addition, a CNN has many parameters, and determining the optimal parameters is itself a research problem.

Furthermore, deep learning technology, which has developed rapidly, is now widely used in the fields of virtual reality, augmented reality and mobile devices. Now that mobile devices have fully entered the “AI era”, and CNN models keep getting deeper, larger and more computationally intensive, it is especially important to design more efficient CNN models and compression techniques that reduce the required resources (memory, storage, power, computation, bandwidth, etc.). At present, deep learning model compression technology for mobile devices mostly targets a single network structure. Therefore, designing a complete deep learning model for mobile device deployment, from both a software and a hardware perspective, is still worth exploring.


Read more: https://intellifactsai.weebly.com/blog/handwriting-recognition-using-ai-network-and-image-processing
