Split a PDF document using OL Connect
In this tutorial, we explore how to use standard OL Connect techniques to split print-ready files, such as PDFs and PostScript files. The general steps, described in detail below, are:
- A Data Mapper configuration defines document boundaries.
- The Data Mapper configuration extracts relevant data.
- A Job Preset adds the extracted data to the document meta data.
- The meta data allows the Output Preset to name and store the individual files, and handle document separation.
- Deploy the resources.
Step 1: Define Document Boundaries
To split a document, we need to define the boundaries for each document within the file. This can be based on a fixed page count or triggered by specific text changes, such as a unique invoice number or page numbers. In this tutorial, we’ll work with a PDF containing multiple invoices as our example.
Create a Data Mapping configuration
- Start OL Connect Designer and, from the Welcome Screen, click New.
- Go to the Data tab and select PDF. The New Data Mapping configuration dialog appears.
- Browse for the PDF file and click Finish.
This creates a new Data Mapping configuration and adds the selected PDF file to the Data Samples.
Set the Document Boundaries
- In the Settings tab, set the Trigger to:
- On Page: If the documents always have a fixed number of pages.
- On Text: If document boundaries are based on a specific changing value (e.g., invoice number, customer ID).
- In this example, we select On Text and configure it to recognize the text "PAGE 1 OF" in a specific area as the trigger. This ensures that the PDF is split correctly, even when the documents have a variable number of pages.
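The two trigger types can be illustrated with a small conceptual sketch. This is not how the Data Mapper is implemented; it only mimics the boundary logic on a list of page texts. The function names and the sample page contents are hypothetical.

```python
from typing import List


def split_on_text(pages: List[str], trigger: str = "PAGE 1 OF") -> List[List[int]]:
    """Group page indices into documents.

    A new document starts on every page whose text contains the trigger.
    Pages before the first trigger are collected into a leading document.
    """
    documents: List[List[int]] = []
    for index, text in enumerate(pages):
        if trigger in text or not documents:
            documents.append([])
        documents[-1].append(index)
    return documents


def split_on_page(page_count: int, pages_per_document: int) -> List[List[int]]:
    """Group page indices into documents with a fixed number of pages each."""
    return [
        list(range(start, min(start + pages_per_document, page_count)))
        for start in range(0, page_count, pages_per_document)
    ]


# Hypothetical page texts standing in for the sample PDF.
sample_pages = [
    "INVOICE PAGE 1 OF 2",
    "INVOICE PAGE 2 OF 2",
    "INVOICE PAGE 1 OF 1",
    "INVOICE PAGE 1 OF 3",
    "INVOICE PAGE 2 OF 3",
    "INVOICE PAGE 3 OF 3",
]

text_groups = split_on_text(sample_pages)
fixed_groups = split_on_page(page_count=5, pages_per_document=2)
print(text_groups)
print(fixed_groups)
```

The text-based variant keeps working when invoices differ in length, which is why the tutorial uses On Text rather than On Page.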

Step 2: Extract Data for the file names
Once document boundaries are defined, we extract data from the document for naming the output files.
Extract data
- Navigate to the Steps view.
- Select the text you want to extract and use for the file name. In our example, we’ve selected the Order Number.
- Right-click the highlighted text and choose Add Extraction.
- This creates a new data field that appears in the Data Model view, showing the value of the current record.

Rename the extracted field
- In the Data Model view, locate the newly created field.
- Right-click on the field and choose Rename.
- Enter a descriptive name (e.g., OrderNo) that clearly represents the extracted data.
- Save the Data Mapping configuration to disk to ensure all changes are preserved.

Step 3: Add extracted data to meta data
A Job Preset is used to include the extracted data in the meta data of the documents, ensuring it can be used in the Output Preset for naming the output files.
Create a Job Preset
- Open the Welcome Screen and click on New.
- Navigate to the Presets tab and select Job Preset.
- Choose the previously created Data Mapping configuration file.
- Check the Include meta data option.

- Click Next until you reach the Meta Data Options page.
- In the Meta Data Options step, go to the Document Tags tab and click the Add meta data icon, then select Add field meta data.
- Choose the OrderNo field and click OK. The OrderNo field now appears in the Document Tags overview and will be included in the document’s meta data.
- Finally, click Finish to save the Job Preset.

Step 4: Name files and split documents
The Output Preset handles file naming and document separation.
Set the file name
- Open the Welcome Screen and click on New.
- Navigate to the Presets tab and select Output Preset.
- Choose the previously created Job Preset file.
- Enable Separation to generate individual output files per document.
- Click Next until you reach the Print Options page.
- On the Print Options page, set the Output Type to Directory.
- Define a Job Output Mask using the meta data variables:
${document.metadata.OrderNo}.pdf
This dynamically generates filenames using the order number.
Tip! Instead of setting the Separation option on the Separation Options page and entering the mask manually, you can do both through the Job Output Mask dialog, which is opened via the Pencil icon of the Job Output Mask field.
Note! The Job Output Folder can be set in the preset, but can be overridden using OL Connect Workflow job infos or from within the All In One or Paginated Output nodes of OL Connect Automate.
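To make the effect of the mask concrete, here is a minimal sketch of how a placeholder such as ${document.metadata.OrderNo} could resolve to one file name per document. It only imitates the substitution for illustration; OL Connect performs this internally, and the function name and the order numbers below are made up.

```python
import re
from typing import Dict

# Matches placeholders of the form ${document.metadata.<FieldName>}.
PLACEHOLDER = re.compile(r"\$\{document\.metadata\.(\w+)\}")


def expand_mask(mask: str, metadata: Dict[str, str]) -> str:
    """Replace each meta data placeholder in the mask with its value.

    Raises KeyError if the mask refers to a field that is not present
    in the document's meta data.
    """
    return PLACEHOLDER.sub(lambda match: metadata[match.group(1)], mask)


# Hypothetical meta data for two documents after the Job Preset has run.
documents = [{"OrderNo": "100234"}, {"OrderNo": "100235"}]

file_names = [
    expand_mask("${document.metadata.OrderNo}.pdf", metadata)
    for metadata in documents
]
print(file_names)
```

The sketch also shows why the Job Preset step matters: a field that was not added to the document's meta data cannot be resolved by the mask.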
Enable separations
- Click Next until you reach the Separation Options page.
- Set the Separation Settings to Document.
- Click Finish to save the Output Preset.
Step 5: Deploy resources
At this stage, you have a Data Mapping configuration that defines document boundaries and extracts the order number for each document. A Job Preset stores the order number in the document’s meta data, while an Output Preset generates a separate file for each document, using the extracted order number as the file name.
Before these files can be used, they must be deployed to OL Connect Workflow, or to OL Connect Server when using OL Connect Automate (or sent directly to OL Connect Server via the REST API).
In this tutorial, we discussed the process of splitting a print-ready file. In scenarios where print-ready files are modified, such as when splitting, adding OMR marks, postal sorting, or performing similar tasks, there is typically no need to merge data with OL Connect templates. In OL Connect Workflow, this involves the Execute Data Mapping task with the Bypass content creation option enabled, as shown in the following image. This option allows the job to bypass Content Creation, which enhances overall performance.

The following image illustrates an OL Connect Automate flow, where the Data Mapping configuration is applied in the Document Mapping node (which applies Bypass content creation on the fly), and the Job and Output Presets are used in their respective steps.

Conclusion
This method provides efficient, automated processing of print-ready documents with data-driven output file names. The Output Preset can also add extra content, such as text and barcodes, and the Additional Content options can prepend a banner page to the job.