Extracting pdf comments using VBA and javascript

Posted: Sun Mar 19, 2017 9:20 pm
by Fenrir
Hello everyone,

I am looking for help with extracting comments from PDF files.

I want to use Excel to batch-process several files, but I am unable to pass the comments from X-Change to Excel.

My idea was to use something like this:

Code:

Const bolWaitOnReturn As Boolean = True
Const intWindowStyle As Integer = 0
Const strJScode As String = "C:\...\JavaScriptExample.txt"
Const strFullPathToProgram As String = "C:\Program Files\Tracker Software\PDF Viewer\PDFXCview.exe"

Dim objWsh As Object

Dim i As Integer
Dim strCmd As String

Set objWsh = VBA.CreateObject("WScript.Shell")

For i = 0 To frmPDFCommentExtract.lsbFilesList.ListCount - 1
    ' Build the command line from the X-Change command-line options.
    ' Any path that may contain spaces is wrapped in double quotes ("""").
    strCmd = """" & strFullPathToProgram & """" & " " & _
             "/runjs:showui" & " " & _
             """" & strJScode & """" & " " & _
             """" & frmPDFCommentExtract.lsbFilesList.List(i) & """"

    ' Run the viewer hidden and wait for it to exit before moving to the next file.
    objWsh.Run strCmd, intWindowStyle, bolWaitOnReturn
Next i

And in JavaScriptExample.txt:
(Sorry for the poor JavaScript; this is the first time I have ever used this language.)

Code:

// Escape single quotes with a backslash and collapse runs of spaces.
function addslashes(ch) {
    ch = ch.replace(/'/g, "\\'");
    ch = ch.replace(/ {2,}/g, " ");
    return ch;
}

var CommentsList = "";
var annots = this.getAnnots({nSortBy: ANSB_Page});
if (annots != null) {
    // One "%s\n" per value: six markers, each followed by its field.
    var msg = "%s\n%s\n%s\n%s\n%s\n%s\n%s\n%s\n%s\n%s\n%s\n%s\n";
    for (var i = 0; i < annots.length; i++) {
        CommentsList += util.printf(msg,
            "!!PAGE!!", annots[i].page,
            "!!ID!!", annots[i].name,
            "!!ANSWER TO!!", annots[i].inReplyTo,
            "!!AUTHOR!!", annots[i].author,
            "!!CREATION DATE!!", annots[i].creationDate,
            "!!COMMENT!!", addslashes(annots[i].contents));
    }
} else {
    CommentsList = " No annotations in this document.";
}
// Attach the summary to the PDF, then export the attachment.
var OutputFilename = this.documentFileName + "CommentsSummary.txt";
this.createDataObject(OutputFilename, CommentsList);
this.exportDataObject({cName: OutputFilename, nLaunch: 2});
//this.closeDoc(true)

My problem is that this.exportDataObject creates the text file as an attachment to the PDF file, and I am unable to pass its content to Excel.
In fact, the macro above launches the script, but this leaves a text file open, and I have to close it manually before Excel can increment i and proceed with the next file.

For now I can't figure out how to solve this issue.

Do you have any ideas?

Thanks in advance!

Re: Extracting pdf comments using VBA and javascript

Posted: Mon Mar 20, 2017 1:25 pm
by Will - Tracker Supp
Hi Fenrir,

Thanks for the post. I'm not sure that this is possible with our end-user application at all; it might only be possible using our SDK. It may also just be that the Viewer doesn't fully support this operation, so please download and try the Editor:
https://www.pdf-xchange.com/PDFXVE6.zip

Thanks,

Re: Extracting pdf comments using VBA and javascript

Posted: Tue Mar 21, 2017 7:33 am
by Fenrir
Hi Will,

I think that I was not clear enough: my purpose is to export the comments from the PDF file to Excel, so that I can run some VBA code on them (I want to sort them).
As I may have a lot of files to analyse, I want to do it automatically.
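
Once the summary text is available outside the PDF, the parsing and sorting step is simple. The following is only a minimal sketch, assuming the marker layout written by the script in the first post (`!!PAGE!!`, `!!ID!!`, and so on). The names `MARKERS`, `parseSummary` and `sortByPage` are hypothetical, and the sample data is made up. It is shown in JavaScript so that it can be run on its own; the same logic could be ported to VBA.

```javascript
// Markers written by the extraction script, mapped to record field names.
var MARKERS = {
  "!!PAGE!!": "page",
  "!!ID!!": "id",
  "!!ANSWER TO!!": "inReplyTo",
  "!!AUTHOR!!": "author",
  "!!CREATION DATE!!": "creationDate",
  "!!COMMENT!!": "comment"
};

// Hypothetical helper: turn the summary text into an array of records.
// A line that is not a marker is appended to the field named by the last
// marker, so multi-line comments stay together.
function parseSummary(text) {
  var records = [];
  var current = null;
  var field = null;
  // Drop trailing newlines so the last field does not gain an empty line.
  var lines = text.replace(/(\r?\n)+$/, "").split(/\r?\n/);
  for (var i = 0; i < lines.length; i++) {
    var line = lines[i];
    if (Object.prototype.hasOwnProperty.call(MARKERS, line)) {
      field = MARKERS[line];
      if (field === "page") {        // "!!PAGE!!" starts a new record
        current = {};
        records.push(current);
      }
      if (current) current[field] = "";
    } else if (current && field) {
      current[field] += (current[field] ? "\n" : "") + line;
    }
  }
  return records;
}

// Return a copy of the records sorted by page number.
function sortByPage(records) {
  return records.slice().sort(function (a, b) {
    return Number(a.page) - Number(b.page);
  });
}

// Made-up sample in the same layout as the script's output.
var sample =
  "!!PAGE!!\n2\n!!ID!!\nb2\n!!ANSWER TO!!\n\n!!AUTHOR!!\nBob\n" +
  "!!CREATION DATE!!\n2017-03-19\n!!COMMENT!!\nCheck this figure\n" +
  "!!PAGE!!\n0\n!!ID!!\na1\n!!ANSWER TO!!\n\n!!AUTHOR!!\nAlice\n" +
  "!!CREATION DATE!!\n2017-03-18\n!!COMMENT!!\nFix typo\nin the title\n";

var sorted = sortByPage(parseSummary(sample));
console.log(sorted.map(function (r) { return r.page + ": " + r.author; }));
```

One limitation of this layout: a comment line that consists of exactly one of the marker strings would be read as a marker.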

I am pretty sure that the problem comes from my JavaScript code: as long as I cannot get the text file out of the PDF attachment and onto disk, Excel will not be able to work on it.
So I do not think that the X-Change Editor will help me, unless it includes some useful tool that I can drive from VBA.

Could you please give me some pointers about the Editor?

Regards,

Fenrir

Re: Extracting pdf comments using VBA and javascript

Posted: Tue Mar 21, 2017 9:25 am
by Will - Tracker Supp
Hi Fenrir,

I believe that the issue here is that using JavaScript to create a text file on disk, rather than as an attachment, may not be possible. Typically in PDF readers/editors, scripts cannot work beyond the scope of the application or document. This is for security reasons, as it would otherwise be extremely easy to execute malicious scripts. This is why I say that this may be beyond the scope of our end-user applications. I'll double-check this with the development team, but I'm not sure that there is an easy way to automate this with the end-user application.

Thanks,

Re: Extracting pdf comments using VBA and javascript

Posted: Tue Mar 21, 2017 3:22 pm
by Will - Tracker Supp
Hi Fenrir,

Does this help with your issue?
http://help.adobe.com/en_US/acrobat/acr ... DataObject

Cheers,

Re: Extracting pdf comments using VBA and javascript

Posted: Tue Mar 21, 2017 9:46 pm
by Fenrir
Hi Will,

Thanks for your quick answers!
"I believe that the issue here is that using JavaScript to create a text file on disk, rather than as an attachment, may not be possible. Typically in PDF readers/editors, scripts cannot work beyond the scope of the application or document. This is for security reasons, as it would otherwise be extremely easy to execute malicious scripts."
Yes, I am aware of that, and I am afraid that it will make my goal impossible to achieve.

However, your answer reminds me that I have not yet explored all the exportDataObject options.
I did a few tests, but it seems that whatever parameters I pass, user confirmation is still requested.
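
For reference, Adobe's Acrobat JavaScript API reference documents three values for `nLaunch`: 0 saves the file without opening it, 1 saves it and then opens it, and 2 saves it to a temporary path and then opens it. As documented there, 0 and 1 prompt the user for a save path, which is consistent with the prompts described above. Whether X-Change follows exactly the same semantics is an assumption. The sketch below wraps the two calls from the original script so that different `nLaunch` values are easy to try; `exportSummary` and `fakeDoc` are hypothetical names, and `fakeDoc` is only a stand-in that lets the sketch run outside a PDF viewer.

```javascript
// Hypothetical wrapper: attach `text` to `doc` under `name`, then export it
// with the given nLaunch value. Inside a document script, pass `this` as `doc`.
function exportSummary(doc, name, text, launchMode) {
  doc.createDataObject(name, text);
  doc.exportDataObject({ cName: name, nLaunch: launchMode });
}

// Stand-in for a Doc object: it only records the calls it receives.
var calls = [];
var fakeDoc = {
  createDataObject: function (cName, cValue) {
    calls.push({ method: "createDataObject", cName: cName, cValue: cValue });
  },
  exportDataObject: function (params) {
    calls.push({
      method: "exportDataObject",
      cName: params.cName,
      nLaunch: params.nLaunch
    });
  }
};

// Try nLaunch 0 (save without opening) instead of 2.
exportSummary(fakeDoc, "CommentsSummary.txt", "No annotations in this document.", 0);
console.log(calls);
```

In the real script the call would be `exportSummary(this, OutputFilename, CommentsList, 0)`. Note that `nLaunch` is passed as a number here, not as the string `'2'` used originally.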

Anyway, I still have a few things to try.

Thanks for your support.

Re: Extracting pdf comments using VBA and javascript

Posted: Wed Mar 22, 2017 9:37 am
by Will - Tracker Supp
No worries Fenrir! :D