Broken Links: Why They Matter and How to Find Them

In this digital age, where websites are the public face of businesses and personal brands, maintaining a functional and user-friendly website is important. It is easy to overlook how many links on a page no longer work, yet broken links frustrate users and can hurt a website's SEO performance. In this article I will discuss why checking for broken links is important and how to do it using Selenium with Java.

Importance of Checking Broken Links

  1. Bad User Experience : Users who run into broken links may lose interest in the website, which can result in lower engagement.

  2. Trust and Credibility : A site with multiple broken links appears poorly maintained, which can lead to trust issues and lost business. For example, if the links to add products to the cart or to reach the payment page are broken on an e-commerce site, the company loses sales directly.

  3. SEO (Search Engine Optimization) : Broken links can negatively impact a site's search performance and lower its visibility in search results, because search engines like Google consider the quality of a website's internal and external links when determining search ranking.

  4. Content Accessibility : Broken links can prevent users from accessing important information, making the site less useful.

A site without broken links appears more professional and well maintained, and it improves the overall impression for visitors. So it's important to check how many of the links present on a web page actually work.

Using Selenium with Java to Find Broken Links

Below I have used Selenium with Java to find all the broken links on a web page:

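import java.net.HttpURLConnection;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.firefox.FirefoxDriver;
import org.openqa.selenium.firefox.FirefoxOptions;
import org.openqa.selenium.support.FindBy;
import org.openqa.selenium.support.PageFactory;
import org.testng.annotations.Test; // assumption: TestNG; swap in your JUnit @Test import if you use JUnit

public class CheckAllBrokenLinks {
    // Resolved by PageFactory to every anchor (<a>) element on the page
    @FindBy(tagName = "a")
    List<WebElement> elementForAllWebLink;

    @Test
    public void checkAllBrokenLinks() {
        // Run Firefox without a visible window
        FirefoxOptions options = new FirefoxOptions();
        options.addArguments("--headless");
        options.addArguments("--disable-notifications");
        WebDriver driver = new FirefoxDriver(options);

        // Wire up the @FindBy field, then load the page under test
        PageFactory.initElements(driver, this);
        driver.manage().window().maximize();
        driver.get("https://www.amazon.in/");

        // Collect the href of every anchor element
        List<String> urlList = new ArrayList<>();
        for (WebElement element : elementForAllWebLink) {
            String url = element.getAttribute("href");
            urlList.add(url);
        }

        // Thread-safe counters, because the checks below run in parallel
        AtomicInteger notWorking = new AtomicInteger();
        AtomicInteger working = new AtomicInteger();
        urlList.parallelStream().forEach(links -> {
            boolean check = isLinkBroken(links);
            if (check) {
                System.out.println("This link is broken : " + links);
                notWorking.getAndIncrement();
            } else {
                System.out.println("This link is not broken : " + links);
                working.getAndIncrement();
            }
        });

        System.out.println("Total Links : " + urlList.size() + ", Broken Links : " + notWorking + ", Working Links : " + working);
        driver.quit();
    }

    // Returns true when the URL responds with an HTTP status of 400 or above
    public boolean isLinkBroken(String url) {
        boolean brokenLink = false;
        HttpURLConnection connection = null;
        try {
            URL urlClassObject = new URL(url);
            connection = (HttpURLConnection) urlClassObject.openConnection();
            int responseCode = connection.getResponseCode();
            if (responseCode >= 400) brokenLink = true;
        } catch (Exception e) {
            // Note: a link that throws here is only logged and is still reported as not broken
            System.out.println("Exception : " + e);
        } finally {
            if (connection != null) {
                connection.disconnect();
            }
        }
        return brokenLink;
    }
}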

Breakdown and Explanation of the Code

  1. WebElement Declaration
@FindBy(tagName = "a")
List<WebElement> elementForAllWebLink;

@FindBy Annotation: This annotation locates all the anchor (<a>) elements on the web page. It is part of the Page Factory pattern in Selenium WebDriver, which initializes the annotated fields for us automatically.

  2. Test Method
@Test
public void checkAllBrokenLinks() {
    FirefoxOptions options = new FirefoxOptions();
    options.addArguments("--headless");
    options.addArguments("--disable-notifications");
    WebDriver driver = new FirefoxDriver(options);

    PageFactory.initElements(driver, this);
    driver.manage().window().maximize();
    driver.get("https://www.amazon.in/");
  • FirefoxOptions: Adds the arguments --headless (run Firefox without a GUI) and --disable-notifications (intended to suppress notification prompts).

  • WebDriver Initialization: Initializes the Firefox driver with the specified settings.

  • PageFactory.initElements: Initializes the WebElement fields in the class. Without this call the annotated field stays null and the test cannot use it.

  • driver.get(): Opens the URL (https://www.amazon.in/) in the browser.

  3. Collecting All the Links
List<String> urlList = new ArrayList<>();
for (WebElement element : elementForAllWebLink) {
    String url = element.getAttribute("href");
    urlList.add(url);
}

Iterates through all the anchor elements, reads each one's href attribute, and adds it to an ArrayList.
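Not every href is worth sending an HTTP request to. An anchor without an href can yield null from getAttribute, and schemes such as mailto: or javascript: cannot be opened as HTTP connections. Below is a minimal sketch of a pre-filter, assuming a hypothetical helper class LinkFilter with a method checkableUrls (neither is part of the code above):

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper for illustration, not part of the original test class.
class LinkFilter {

    // Keeps only non-empty http/https URLs and drops duplicates,
    // preserving the order in which the links appeared on the page.
    static List<String> checkableUrls(List<String> rawHrefs) {
        Set<String> unique = new LinkedHashSet<>();
        for (String href : rawHrefs) {
            if (href == null) {
                continue; // anchor without an href attribute
            }
            String trimmed = href.trim();
            String lower = trimmed.toLowerCase(Locale.ROOT);
            if (lower.startsWith("http://") || lower.startsWith("https://")) {
                unique.add(trimmed);
            }
        }
        return new ArrayList<>(unique);
    }
}
```

Calling LinkFilter.checkableUrls(urlList) before the check would mean the totals count unique, checkable links rather than every anchor on the page, so decide which number you actually want to report.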

  4. Checking Links
AtomicInteger notWorking = new AtomicInteger();
AtomicInteger working = new AtomicInteger();
urlList.parallelStream().forEach(links -> {
    boolean check = isLinkBroken(links);
    if (check) {
        System.out.println("This link is broken : " + links);
        notWorking.getAndIncrement();
    } else {
        System.out.println("This link is not broken : " + links);
        working.getAndIncrement();
    }
});

System.out.println("Total Links : " + urlList.size() + ", Broken Links : " + notWorking + ", Working Links : " + working);
  • AtomicInteger: Counts broken and working links. Because parallelStream() runs the lambda on several threads, AtomicInteger keeps the counts thread-safe.

  • Parallel Stream: Processes the entries of urlList concurrently for efficiency.

  • Link Check: Calls the isLinkBroken method for each link and updates the notWorking and working counts accordingly.
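If you would rather not share mutable counters between threads at all, one alternative is to let the stream do the grouping. The sketch below is only an illustration of that idea, not the approach used in this article: LinkTally, partition, and the stand-in checker are hypothetical names, and the checker is passed in as a Predicate so the example runs without any network access.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical helper for illustration, not part of the original test class.
class LinkTally {

    // Splits URLs into broken (key true) and working (key false) lists.
    static Map<Boolean, List<String>> partition(List<String> urls, Predicate<String> isBroken) {
        return urls.parallelStream()
                .collect(Collectors.partitioningBy(isBroken));
    }

    public static void main(String[] args) {
        // Stand-in checker: pretends that URLs containing "missing" are broken.
        Predicate<String> fakeChecker = url -> url.contains("missing");
        List<String> urls = Arrays.asList(
                "https://example.com/home",
                "https://example.com/missing-page",
                "https://example.com/cart");

        Map<Boolean, List<String>> result = partition(urls, fakeChecker);
        System.out.println("Total Links : " + urls.size()
                + ", Broken Links : " + result.get(true).size()
                + ", Working Links : " + result.get(false).size());
    }
}
```

In the real test you could pass this::isLinkBroken as the predicate. A side benefit is that you end up with the actual list of broken URLs, not just a count.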

  5. Method to Check Broken Link
public boolean isLinkBroken(String url) {
    boolean brokenLink = false;
    HttpURLConnection connection = null;
    try {
        URL urlClassObject = new URL(url);
        connection = (HttpURLConnection) urlClassObject.openConnection();
        int responseCode = connection.getResponseCode();
        if (responseCode >= 400) brokenLink = true;
    } catch (Exception e) {
        System.out.println("Exception : " + e);
    } finally {
        if (connection != null) {
            connection.disconnect();
        }
    }
    return brokenLink;
}
  • URL Object Creation: Creates an object of the URL class, passing the String url as the constructor argument.

  • Connection Establishment: Opens a connection to the URL using the URL object and casts it to HttpURLConnection.

    HttpURLConnection provides additional methods specific to HTTP connections, such as retrieving the response code and headers.

  • Response Code Check: Retrieves the response code and sets brokenLink to true if it's greater than or equal to 400 (indicating a broken link).

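  • Exception Handling: If anything throws (for example a malformed or null URL), the exception is only printed and the method still returns false, so that link is counted as working.

  • Connection Cleanup: The finally block makes sure the connection is properly closed to avoid resource leaks.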
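The method above can wait a long time on a slow server and treats links that throw as working. Here is a hedged sketch of a stricter variant, assuming a hypothetical class LinkChecker and an arbitrary 5 second timeout (both are my assumptions, not part of the original code). It uses a HEAD request so that only headers are fetched, and it reports a link as broken when the request cannot be made at all.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical variant of isLinkBroken for illustration, not the original method.
class LinkChecker {

    // Assumed timeout; tune it for the site under test.
    private static final int TIMEOUT_MILLIS = 5_000;

    // Status codes of 400 and above indicate a client or server error.
    static boolean isBrokenStatus(int responseCode) {
        return responseCode >= 400;
    }

    // Returns true when the URL cannot be requested or answers with an error status.
    static boolean isLinkBroken(String url) {
        if (url == null || url.trim().isEmpty()) {
            return true; // nothing to request
        }
        HttpURLConnection connection = null;
        try {
            connection = (HttpURLConnection) new URL(url).openConnection();
            connection.setRequestMethod("HEAD"); // headers only, no body download
            connection.setConnectTimeout(TIMEOUT_MILLIS);
            connection.setReadTimeout(TIMEOUT_MILLIS);
            return isBrokenStatus(connection.getResponseCode());
        } catch (IOException | ClassCastException e) {
            // Malformed URL, unreachable host, timeout, or a non-HTTP scheme
            System.out.println("Exception : " + e);
            return true;
        } finally {
            if (connection != null) {
                connection.disconnect();
            }
        }
    }
}
```

One trade-off to keep in mind: some servers reject HEAD requests (for example with 405) even though the page works in a browser, so you may want to retry such links with GET before reporting them as broken.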

Conclusion

Checking for broken links is an important part of software testing because it directly impacts user experience, SEO performance, and your site's overall professionalism. By using Selenium with Java we can easily automate the process of checking all the links on a web page. Regular monitoring and timely fixes make sure that your website remains user-friendly, trustworthy, and optimized for search engines, ultimately supporting your business goals and enhancing your online presence.
