Comparison of approaches for mobile document image analysis using server supported smartphones

Ozarslan S., EREN P. E.

Conference on Digital Photography X, San-Francisco, Costa Rica, 3 - 05 February 2014, vol.9023 identifier identifier

  • Publication Type: Conference Paper / Full Text
  • Volume: 9023
  • Doi Number: 10.1117/12.2040687
  • City: San-Francisco
  • Country: Costa Rica
  • Middle East Technical University Affiliated: Yes


With the recent advances in mobile technologies, new capabilities are emerging, such as mobile document image analysis. However, mobile phones are still less powerful than servers, and they have some resource limitations. One approach to overcome these limitations is performing resource-intensive processes of the application on remote servers. In mobile document image analysis, the most resource consuming process is the Optical Character Recognition (OCR) process, which is used to extract text in mobile phone captured images. In this study, our goal is to compare the in-phone and the remote server processing approaches for mobile document image analysis in order to explore their trade-offs. For the in-phone approach, all processes required for mobile document image analysis run on the mobile phone. On the other hand, in the remote-server approach, core OCR process runs on the remote server and other processes run on the mobile phone. Results of the experiments show that the remote server approach is considerably faster than the in-phone approach in terms of OCR time, but adds extra delays such as network delay. Since compression and downscaling of images significantly reduce file sizes and extra delays, the remote server approach overall outperforms the in-phone approach in terms of selected speed and correct recognition metrics, if the gain in OCR time compensates for the extra delays. According to the results of the experiments, using the most preferable settings, the remote server approach performs better than the in-phone approach in terms of speed and acceptable correct recognition metrics.