How to extract hyperlinks from a PDF using iTextSharp?

PDFs are a great way to maintain your document’s formatting regardless of screen size or device when sharing it with others. That said, this also makes editing PDFs a hassle, especially if you’re trying to do it in code. 

Thankfully, libraries like iTextSharp allow for PDF manipulation using a series of simple methods. In this article, we’re talking about how to extract hyperlinks from PDFs using iTextSharp.

Extracting hyperlinks from PDFs

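iTextSharp has a lot of different functions for carrying out individual tasks. Hyperlinks in a PDF are stored as link annotations, so to extract them we’ll use the getAnnotations() method, which returns every annotation on a single page as one List object. The snippets here use the Java naming of the iText 7 API; in the .NET port the same methods are capitalised, for example GetAnnotations().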

The basic syntax looks something like this.

List<PdfAnnotation> annots = pdfPage.getAnnotations();

Of course, the pdfPage object mentioned above has to come from a PDF document you’ve already loaded. Additionally, not every annotation is a link and not every link points to an actual URL, so we also need to perform a few sanity checks on each annotation.

Here’s what a basic script would look like.

//Classes used below come from com.itextpdf.kernel.pdf (PdfPage, PdfName,
//PdfDictionary, PdfString) and com.itextpdf.kernel.pdf.annot (PdfAnnotation,
//PdfLinkAnnotation), plus java.util.List
//Get the current PDF page
PdfPage pdfPage = pdfDoc.getPage(page);
//Get all of the annotations (hyperlinks included) from the current page
List<PdfAnnotation> annots = pdfPage.getAnnotations();
//Check if there were any annotations at all
if ((annots == null) || (annots.size() == 0)) {
    System.out.println("No hyperlinks on this page");
}
//Loop through each annotation
else {
    for (PdfAnnotation a : annots) {
        //Skip anything that is not a link annotation
        if (!(a instanceof PdfLinkAnnotation))
            continue;
        //Get the ACTION for the current link; in iText 7 getAction()
        //is defined on PdfLinkAnnotation, hence the cast
        PdfDictionary annotAction = ((PdfLinkAnnotation) a).getAction();
        //Make sure this hyperlink has an ACTION
        if (annotAction == null)
            continue;
        //The /S entry holds the action type
        PdfName actionType = annotAction.getAsName(PdfName.S);
        //Test if the found hyperlink is actually a URL
        if (PdfName.URI.equals(actionType)) {
            //Saving external links
            PdfString destination = annotAction.getAsString(PdfName.URI);
            if (destination != null) {
                String url1 = destination.toUnicodeString();
            }
        }
        else if (PdfName.GoToR.equals(actionType)) {
            //Link to another PDF file; the target is in the /F entry,
            //there is no /URI entry to read here
        }
        else if (PdfName.GoTo.equals(actionType) ||
            PdfName.GoToE.equals(actionType)) {
            //do something with internal links
        }
    }
}
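
The sanity checks don’t have to stop at the action type. A URI action can hold any string, including mailto: addresses or malformed text, so it can help to filter the extracted values before using them. Below is a minimal sketch of such a filter that relies only on java.net.URI from the standard library. The class and method names (LinkFilter, isWebUrl, keepWebUrls) are hypothetical and not part of iText, and the sample strings simply stand in for values like url1 above.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of iText: filters strings pulled out of
// URI actions so that only absolute http/https URLs remain.
final class LinkFilter {

    // Returns true only for absolute http or https URIs that have a host.
    static boolean isWebUrl(String candidate) {
        if (candidate == null || candidate.trim().isEmpty()) {
            return false;
        }
        try {
            URI uri = new URI(candidate.trim());
            String scheme = uri.getScheme();
            // Opaque URIs such as mailto: have no host and are rejected here
            if (scheme == null || uri.getHost() == null) {
                return false;
            }
            scheme = scheme.toLowerCase(Locale.ROOT);
            return scheme.equals("http") || scheme.equals("https");
        } catch (URISyntaxException e) {
            // Strings that are not valid URIs at all (e.g. containing spaces)
            return false;
        }
    }

    // Keeps the candidates that pass isWebUrl, trimmed, in their original order.
    static List<String> keepWebUrls(List<String> candidates) {
        List<String> result = new ArrayList<>();
        for (String candidate : candidates) {
            if (isWebUrl(candidate)) {
                result.add(candidate.trim());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Sample values standing in for strings extracted from a PDF
        List<String> extracted = Arrays.asList(
                "https://example.com/docs",
                "mailto:someone@example.com",
                "page 3");
        System.out.println(keepWebUrls(extracted));
    }
}
```

Whether to drop mailto: or ftp: links depends on what you plan to do with the results; the scheme check is the one line to adjust if you want to keep them.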
