When I use the DLLs to extract text, for example with PXCp_ET_GetPageContentAsTextW, spaces and CR/LF characters are not included in the output. Is there a parameter that includes them?
There are two methods to resolve this issue:
A) Fill in the PXP_TETextComposeOptions structure, specifically the AddSpaces parameter. See here for further information.
B) Use the PXCp_ET_GetElementCount and PXCp_ET_GetElement functions to retrieve each text element together with its position, then compose the text yourself. This requires implementing a text composition algorithm, but it gives much better results, and the text can then be extracted with spaces intact.
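As a rough illustration of method B, the sketch below shows one possible composition algorithm. It does not call the DLL: the TextElement type, its fields, and the two threshold parameters are hypothetical stand-ins for whatever position data you collect from PXCp_ET_GetElement, and it assumes PDF-style coordinates in which y increases up the page. Elements are grouped into lines by baseline, sorted left to right, joined with a space wherever the horizontal gap is large enough, and the lines are joined with CR/LF.

```python
from dataclasses import dataclass


@dataclass
class TextElement:
    """Hypothetical container for one extracted element (not the SDK's structure)."""
    text: str
    x: float       # left edge of the element
    y: float       # baseline; assumed to increase towards the top of the page
    width: float   # horizontal extent of the element
    height: float  # approximate font height


def compose_text(elements, space_ratio=0.25, line_ratio=0.5):
    """Rebuild text with spaces and CR/LF from positioned elements.

    space_ratio and line_ratio are illustrative thresholds expressed as
    fractions of the font height; tune them for your documents.
    """
    if not elements:
        return ""

    # Read top to bottom (descending y), then left to right.
    ordered = sorted(elements, key=lambda e: (-e.y, e.x))

    # Group elements whose baselines are close enough to share a line.
    lines = []
    current = [ordered[0]]
    for elem in ordered[1:]:
        ref = current[0]
        if abs(elem.y - ref.y) <= line_ratio * max(elem.height, ref.height):
            current.append(elem)
        else:
            lines.append(current)
            current = [elem]
    lines.append(current)

    composed = []
    for line in lines:
        line.sort(key=lambda e: e.x)
        parts = [line[0].text]
        for prev, cur in zip(line, line[1:]):
            # Insert a space when the gap between neighbours is wide enough.
            gap = cur.x - (prev.x + prev.width)
            if gap > space_ratio * max(prev.height, cur.height):
                parts.append(" ")
            parts.append(cur.text)
        composed.append("".join(parts))

    return "\r\n".join(composed)


sample = [
    TextElement("World", 45.0, 700.5, 30.0, 12.0),
    TextElement("Hello", 10.0, 700.0, 30.0, 12.0),
    TextElement("Foo", 10.0, 680.0, 15.0, 12.0),
    TextElement("bar", 25.0, 680.0, 15.0, 12.0),
]
result = compose_text(sample)
print(repr(result))  # 'Hello World\r\nFoobar'
```

Expressing the thresholds relative to font height keeps the same settings usable across different text sizes. Rotated text, multi-column layouts and right-to-left scripts are not handled by this sketch and would need additional logic.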