Extract Pages Between Two Search Strings

This Forum is for the use of End Users requiring help and assistance for Tracker Software's PDF-Tools.

Moderators: TrackerSupp-Daniel, Tracker Support, Vasyl-Tracker Dev Team, Chris - Tracker Supp, Sean - Tracker, Tracker Supp-Stefan

alexschomb
User
Posts: 1
Joined: Thu Nov 16, 2023 2:37 pm

Extract Pages Between Two Search Strings

Post by alexschomb »

I need to extract certain pages from a large number of PDF documents that all share the same layout. Every PDF document contains multiple page dividers, each with a specific string (and background color) that describes the type of content on the pages that follow, up to the next page divider.

Let's assume there is a PDF file with 10 pages:
  • first there is a page divider with the word "DIVIDER1" on it
  • following this page divider are 3 pages (number can vary between documents) with random content that should be extracted to a separate file called originalfilename_divider1.pdf
  • then there is another page divider with the word "DIVIDER2" on it
  • following this page divider are 5 pages (number can vary between documents) with random content that should be extracted to a separate file called originalfilename_divider2.pdf

I found that PDF-Tools can filter pages by a given string (e.g. "DIVIDER1") and that a second action can then extract the filtered pages. Unfortunately, there doesn't seem to be an option to extract all pages between two defined strings ("DIVIDER1" and "DIVIDER2"). Is there any other solution? I also looked at the Adobe Acrobat Action Wizard and found a possible approach, but it requires me to write JavaScript code to determine the page numbers and extract the pages between them. I would prefer a ready-made solution, ideally in PDF-Tools.
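
To make the requirement concrete, here is a minimal Python sketch of the page-range logic described above. It is not a PDF-Tools or Acrobat feature. It assumes the text of each page has already been extracted into a list of strings by some PDF library (not shown here), and the names find_ranges and output_name are hypothetical helpers made up for this illustration.

```python
import os
from typing import Dict, List, Tuple


def find_ranges(page_texts: List[str], markers: List[str]) -> Dict[str, Tuple[int, int]]:
    """Map each marker to the (first, last) 0-based page indices following its divider.

    Assumption: a page counts as a divider if its text contains one of the
    markers, and each marker appears on at most one divider page.
    """
    # Collect (page_index, marker) for every divider page, in document order.
    dividers = []
    for index, text in enumerate(page_texts):
        for marker in markers:
            if marker in text:
                dividers.append((index, marker))
                break

    ranges = {}
    for pos, (index, marker) in enumerate(dividers):
        start = index + 1
        # The section ends just before the next divider, or at the last page.
        if pos + 1 < len(dividers):
            end = dividers[pos + 1][0] - 1
        else:
            end = len(page_texts) - 1
        if start <= end:  # skip dividers that have no content pages after them
            ranges[marker] = (start, end)
    return ranges


def output_name(original: str, marker: str) -> str:
    """Build e.g. 'report_divider1.pdf' from 'report.pdf' and 'DIVIDER1'."""
    stem, ext = os.path.splitext(original)
    return f"{stem}_{marker.lower()}{ext}"
```

For the 10-page example, find_ranges would return pages 1 to 3 for "DIVIDER1" and pages 5 to 9 for "DIVIDER2" (0-based), and each range would then be written to the file named by output_name. Note that plain substring matching would also treat a content page that happens to contain "DIVIDER1" as a divider, so a real solution might need a stricter test.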
Jordan - Tracker Supp
Site Admin
Posts: 91
Joined: Mon Jul 03, 2023 3:10 pm

Re: Extract Pages Between Two Search Strings

Post by Jordan - Tracker Supp »

Hello alexschomb,

Welcome to the forum.

The PDFs you mentioned in your post were not attached to it.

Please send them to support@pdf-xchange.com.

Best regards,
Jordan