pdfDX (TM)

PDF File Text Data Extractor

How to use pdfDX in Windows mode:

INITIAL SPLASH SCREEN -- pdfDX begins with this splash screen. In the registered version, the "OK" button is available immediately; the screen also disappears on its own after a few seconds and is not shown again while the Program is running.

In the un-registered version, the "OK" button appears after a few seconds and MUST be pressed to continue. The initial splash screen appears each time you begin to format files in the un-registered version.

pdfDX is now ready to select files to format. With general MS Windows operation, there are three ways to use the Program:

When you select "About", pdfDX displays the "About" Dialog for basic information about the Program.

The un-registered version of pdfDX displays a "BUY" Button on the far-right of the toolbar.

You can use the previously mentioned Keyboard Shortcut (ALT first, then CTRL-R), or just click the "BUY" Button, to display the "Registration Dialog".

To purchase your Registration Code, click on the "BUY" Button on the Registration Dialog and your Web Browser will open a Web Page from regNow (Digital River). Enter the necessary information and your Code will be emailed to you immediately.

Once you've acquired your Registration Code, enter the 16-character Code in the Program to unlock its full capability.

IMPORTANT NOTE!: When you enter your Registration Code, some Anti-Virus/Firewall Programs will display a notification of a Registry Change. This is normal when installing software. Just accept the change and continue.

To select Files to format, use the Open Files Dialog. As with normal MS Windows operation, you can select multiple files by holding down the "Shift" Key and clicking on the beginning, then the ending filename for a block of names. You can also select individual File names by pressing the CTRL key and clicking on a single File name. All the selected File names are highlighted.

You can select as many File names as desired from one folder. After you click the Open Button, you can use the Open Files Dialog again to add more File names from a different folder each time.

To start over, clear the existing File names with the File Menu or the CTRL-C shortcut Key (remember to press the ALT Key first).

If desired, you can set selected Options before formatting Files.

Choose a Folder to place formatted files.

Formatted files Folder chosen.

Options displayed and selected Files ready to format.

Files currently being formatted. Again, you can click the "Stop" Button to interrupt File formatting.

File formatting completed. You can now exit the Program or format more Files. You can view or edit the formatted Files in Notepad, WordPad, or your favorite text editor. Notice that each output File Name is the complete original name with the ".txt" extension appended.

How to use pdfDX in Command Line mode:

NOTE: If you need more information and instruction on how to use the Command Window or Command Line, just use your favorite Search Engine to find "MS DOS Commands". You will get a wealth of education at all levels.

The Command Line mode of pdfDX allows the program to be executed from a Command Window (which some people call MS DOS mode). It is a powerful way to format large numbers of files. You can specify any of the program options in the Command Line. This mode has an additional feature that's not available in Windows Mode: if you specify an Input Folder, but don't include any individual File Names, ALL PDF Files in that Folder are formatted.

NOTE: If you use Shortcuts to Files and don't specify an output Folder, the formatted Files will be placed in the Folder where the original Files are, NOT the Shortcut Folder.

So, you can copy all the Files (or Shortcuts to the Files) into a single Folder and maintain strict control and organization of the process. Also, you can automate your multiple procedures by creating and running a Shell Script to perform multiple operations on the formatted PDF Files.
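
As a rough sketch of this kind of automation, the Python script below runs pdfDX on one Folder and then reports how many lines each formatted Text File contains. Python stands in here for a Shell Script. The sketch assumes that pdfdx can be found on the PATH (or in the current Folder) and that the Text Files can be read as UTF-8; the function names "format_folder" and "count_lines" are hypothetical and not part of pdfDX.

```python
import subprocess
from pathlib import Path


def format_folder(input_folder, program="pdfdx"):
    """Run pdfDX on every PDF File in input_folder (assumes pdfdx is on the PATH)."""
    # With no /O option, the Text Files are placed next to the original PDF Files.
    # The meaning of pdfDX exit codes is not documented here, so the value
    # returned is informational only.
    return subprocess.run([program, "/i", str(input_folder)], check=False).returncode


def count_lines(folder):
    """Return {file name: number of lines} for every formatted '.pdf.txt' File."""
    counts = {}
    for path in sorted(Path(folder).glob("*.pdf.txt")):
        # The output encoding is an assumption; undecodable bytes are replaced.
        text = path.read_text(encoding="utf-8", errors="replace")
        counts[path.name] = len(text.splitlines())
    return counts


# Possible usage on a machine where pdfDX is installed:
#   format_folder(r"C:\Temp\Unformatted PDF Files")
#   print(count_lines(r"C:\Temp\Unformatted PDF Files"))
```

Keeping the formatting step and the follow-up step in separate functions lets you re-run the follow-up operations without formatting the PDF Files again.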

To get Command Line Help, go to the path where the program is located (usually "C:\Program Files\pdfDX") and enter: pdfdx /?. The program will display:

When pdfDX Help is displayed:

          Usage: pdfDX /I Folder [Options] [PDF File Names]

          /I Input Folder -- REQUIRED: Location of PDF Files.

          /O Output Folder -- Location of created Text Files.
          /F N -- DO NOT Format Files.
          /S Page Number -- Starting Page Number.
          /N 1+ -- Number of Pages to Format.
          /M 0-10 -- Maximum Number of Blank Lines.
          /? -- Show this screen.

Usage: pdfDX /I Folder [Options] [PDF File Names]

When you run pdfDX from the Command Line, you MUST place the program name first, then the required Input Folder. The program cannot work without knowing where the PDF Files are. If no individual File Names are specified in the optional arguments section, then ALL of the PDF Files in the Input Folder are formatted. This option is ONLY available in Command Line mode.

The optional argument options follow the program name and the Input Folder location.
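
If you start pdfDX from a script, the ordering rule above can be captured in a small helper. The following Python sketch assembles the arguments in the order shown on the Help screen and uses only the options listed there. The helper name "build_pdfdx_command" and its parameter names are hypothetical; the range checks simply mirror the "1+" and "0-10" hints on the Help screen.

```python
import subprocess


def build_pdfdx_command(input_folder, output_folder=None, no_format=False,
                        start_page=None, num_pages=None, max_blank_lines=None,
                        file_names=(), program="pdfdx"):
    """Assemble a pdfDX argument list in the order shown on the Help screen."""
    if not input_folder:
        raise ValueError("the Input Folder (/I) is required")
    args = [program, "/i", input_folder]      # program name first, then /I Folder
    if output_folder is not None:
        args += ["/o", output_folder]
    if no_format:
        args += ["/f", "n"]
    if start_page is not None:
        args += ["/s", str(start_page)]
    if num_pages is not None:
        if num_pages < 1:
            raise ValueError("/N expects 1 or more pages")
        args += ["/n", str(num_pages)]
    if max_blank_lines is not None:
        if not 0 <= max_blank_lines <= 10:
            raise ValueError("/M expects a value from 0 to 10")
        args += ["/m", str(max_blank_lines)]
    args += list(file_names)                  # optional PDF File Names go last
    return args


command = build_pdfdx_command(r"C:\Temp\Unformatted PDF Files",
                              start_page=4, file_names=["MyPdfFile.pdf"])
# list2cmdline applies the Windows quoting rules, so the Folder name that
# contains spaces is wrapped in quotation marks.
print(subprocess.list2cmdline(command))
```

The list form can be passed directly to subprocess.run(command) on a machine where pdfDX is installed, which avoids having to quote Folder names by hand.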

/I Input Folder -- REQUIRED: Location of PDF Files.

Again, the Input Folder name MUST be included in the program's arguments when using Command Line mode.

To avoid ambiguity, the Input Folder name should contain the entire location, including the Drive Letter, enclosed in quotation marks. This Command will format ALL of the PDF Files in the Input Folder:

     pdfdx /i "C:\Temp\Unformatted PDF Files"

/O Output Folder -- Location of created Text Files.

The Output Folder argument is optional. If it is omitted, the formatted files are placed in the same Folder as the original PDF Files. Output File Names are the PDF File Name with the ".txt" (text file) Extension appended to the end of the File Name. Example:

     MyFileName.pdf is formatted into: MyFileName.pdf.txt
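
A script that needs to locate the formatted File afterwards can apply the same naming rule. This is a minimal Python sketch of that rule, not code taken from pdfDX; the function name "expected_output_path" is hypothetical. It uses the standard ntpath module so that Windows-style paths are handled the same way on any system.

```python
import ntpath  # Windows path rules, usable on any operating system


def expected_output_path(pdf_path, output_folder=None):
    """Return where the Text File for pdf_path should appear, per the naming rule."""
    # Without an Output Folder, the Text File sits beside the original PDF File.
    folder = output_folder if output_folder is not None else ntpath.dirname(pdf_path)
    # ".txt" is appended to the complete name; the ".pdf" part is not replaced.
    return ntpath.join(folder, ntpath.basename(pdf_path) + ".txt")


print(expected_output_path(r"C:\Temp\Unformatted PDF Files\MyFileName.pdf"))
print(expected_output_path(r"C:\Temp\Unformatted PDF Files\MyFileName.pdf",
                           r"C:\Temp\Formatted PDF Files"))
```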

This Command will format ALL of the PDF Files in the Input Folder and place the formatted Files in the "C:\Temp\Formatted PDF Files" Folder:

     pdfdx /i "C:\Temp\Unformatted PDF Files" /o "C:\Temp\Formatted PDF Files"

/F N -- DO NOT Format Files.

Don't let the "DO NOT Format Files" mislead you. All this argument means is that the output will be left-justified with no added blank spaces. With this option, the formatted Text Document will NOT resemble the layout of the original PDF File.

/S Page Number -- Starting Page Number.

Select the page in the original PDF File to begin your formatted output. If you select a page that doesn't exist, the output will begin with the first page. This example formats the PDF File, "MyPdfFile.pdf" beginning with page four:

     pdfdx /i "C:\Temp\Unformatted PDF Files" /s 4 "MyPdfFile.pdf"
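
The fallback rule for the starting page can be written out as a short Python sketch. It only illustrates the behavior described above and is not pdfDX's own code; the function name "effective_start_page" is hypothetical. How pdfDX treats zero or negative page numbers is not stated, so the sketch assumes they count as pages that don't exist.

```python
def effective_start_page(requested_page, page_count):
    """Return the page where output begins, following the documented fallback."""
    if 1 <= requested_page <= page_count:
        return requested_page
    # A page that doesn't exist falls back to the first page.
    # Assumption: zero and negative numbers are treated the same way.
    return 1


print(effective_start_page(4, 10))   # page four exists in a 10-page File
print(effective_start_page(40, 10))  # page 40 does not, so output starts at page one
```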

/N 1+ -- Number of Pages to Format.

Select the number of pages you want formatted. This example formats 10 pages of EVERY file in the Input Folder:

     pdfdx /i "C:\Temp\Unformatted PDF Files" /n 10

/M 0-10 -- Maximum Number of Blank Lines.

Select the maximum number of blank lines to be output in formatting. You can enter any number from Zero to Ten. By restricting the number of blank lines, you can keep file sizes down and avoid excessive whitespace. This example formats the file, "MyPdfFile.pdf", left-justified with no added spaces and no extra blank lines:

     pdfdx /i "C:\Temp\Unformatted PDF Files" /f n /m 0 "MyPdfFile.pdf"
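
To picture what a blank-line limit does to a Text File, here is a small Python sketch that applies the same idea to a piece of text. It is an illustration only, not pdfDX's implementation. It assumes the limit applies to each run of consecutive blank lines and that a line holding only spaces counts as blank; the function name "limit_blank_lines" is hypothetical.

```python
def limit_blank_lines(text, max_blank):
    """Keep at most max_blank consecutive blank lines in text."""
    if not 0 <= max_blank <= 10:
        raise ValueError("the limit must be from 0 to 10")
    kept = []
    blank_run = 0
    for line in text.splitlines():
        if line.strip() == "":
            blank_run += 1
            if blank_run <= max_blank:   # drop blank lines beyond the limit
                kept.append("")
        else:
            blank_run = 0                # a text line ends the current run
            kept.append(line)
    return "\n".join(kept)


sample = "Title\n\n\n\nFirst paragraph\n\nSecond paragraph"
print(limit_blank_lines(sample, 1))
print(limit_blank_lines(sample, 0))
```

With a limit of 1, the three blank lines after "Title" shrink to one; with a limit of 0, every blank line is removed.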

/? -- Shows the Command Line Help Screen.

When the pdfDX Command Line mode is running:

When running in the Command Line mode, most of the Buttons are disabled. You can see the About Dialog or view the Program Help, but mainly the "Stop" Button is available to interrupt the processing and formatting, which can be necessary when formatting a large number of files.