However, when I cut the pages in half (so that OCR can process them), the document size drops by half and each cut page loses a significant amount of quality. Visual artifacts appear around all high-contrast elements (tables, graphs, text).
I saved the page image to disk in JPEG format both before and after cutting. IrfanView displays the following in the properties of these files:
original:
JPEG, progressive, quality: 70, subsampling OFF
Number of unique colors: 187417
after splitting:
JPEG, progressive, quality: 70, subsampling OFF
Number of unique colors: 131079
I assume that either the split image is saved with the same JPEG compression settings as the original, or all (?) images saved from the document are re-encoded at 70% quality.
I did not find a way in the application settings to specify at what quality the split images should be encoded. Can you help me find this setting, or suggest another way to split pages that are too long for recognition without degrading the quality of images (which, among other things, leads to deterioration in the quality of OCR)?
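For reference, this is roughly the kind of workaround I have in mind if no such setting exists. It is only a sketch, assuming Python with the third-party Pillow library and assuming the page has already been exported as an image file. The names `split_boxes` and `split_page` are hypothetical helpers of my own, the cut is assumed to run across the page (top half and bottom half), and the `overlap` parameter is my own addition so that a text line sitting on the cut appears whole in at least one part. Saving the parts as PNG avoids a second lossy JPEG pass, although it cannot undo compression already present in the source image.

```python
from pathlib import Path


def split_boxes(width, height, parts=2, overlap=0):
    """Return (left, upper, right, lower) crop boxes that cut a page into
    `parts` horizontal strips, each extended by `overlap` pixels (clamped
    to the page) so a text line on a cut is whole in at least one strip."""
    if parts < 1:
        raise ValueError("parts must be at least 1")
    boxes = []
    for i in range(parts):
        upper = max(0, i * height // parts - overlap)
        lower = min(height, (i + 1) * height // parts + overlap)
        boxes.append((0, upper, width, lower))
    return boxes


def split_page(src, out_dir, parts=2, overlap=0):
    """Crop `src` into strips and write each one as a PNG (lossless).
    Hypothetical helper; returns the list of written file paths."""
    from PIL import Image  # Pillow, imported lazily; assumed to be installed

    src = Path(src)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with Image.open(src) as img:
        boxes = split_boxes(img.width, img.height, parts, overlap)
        for n, box in enumerate(boxes, start=1):
            target = out_dir / f"{src.stem}_part{n}.png"
            # crop() works on decoded pixels; PNG stores them unchanged
            img.crop(box).save(target, format="PNG")
            written.append(target)
    return written
```

For example, `split_page("page_012.jpg", "split", parts=2, overlap=20)` would write two PNG files into a `split` folder (the file and folder names here are made up). The resulting files are larger than JPEG, but the pixels passed to OCR are the same as in the exported page.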