In this digital age, where websites are the public face of businesses and personal brands, keeping a website functional and user-friendly is essential. Broken links are easy to overlook, yet they frustrate users and can hurt a website's SEO performance. In this article I will discuss why checking for broken links is important and how to do it using Selenium with Java.
Importance of Checking Broken Links
Bad User Experience: Users who run into broken links may lose interest in the website, which can result in lower engagement.
Trust and Credibility: A site with multiple broken links appears poorly maintained, which can lead to trust issues and lost business. For example, if the add-to-cart or payment page links on an e-commerce site are broken, customers cannot complete their purchases and the company loses sales.
SEO (Search Engine Optimization): Broken links can hurt a site's search performance and lower its visibility in search results, because search engines such as Google consider the quality of a website's internal and external links when determining search ranking.
Content Accessibility: Broken links can prevent users from accessing important information, making the site less useful.
A site without broken links appears more professional and well maintained, which improves the overall impression on visitors. So it is important to check how many of the links on a web page are actually working.
Using Selenium with Java to Find Broken Links
Below, I use Selenium with Java to find all the broken links on a web page:
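import java.net.HttpURLConnection;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.firefox.FirefoxDriver;
import org.openqa.selenium.firefox.FirefoxOptions;
import org.openqa.selenium.support.FindBy;
import org.openqa.selenium.support.PageFactory;
// Assumption: TestNG is the test framework; use the JUnit @Test import if you run JUnit
import org.testng.annotations.Test;

public class CheckAllBrokenLinks {

    // Page Factory locates every anchor (<a>) element on the page
    @FindBy(tagName = "a")
    List<WebElement> elementForAllWebLink;

    @Test
    public void checkAllBrokenLinks() {
        // Configure and start Firefox
        FirefoxOptions options = new FirefoxOptions();
        options.addArguments("--headless");
        options.addArguments("--disable-notifications");
        WebDriver driver = new FirefoxDriver(options);
        PageFactory.initElements(driver, this);
        driver.manage().window().maximize();
        driver.get("https://www.amazon.in/");

        // Collect the href of every anchor element
        List<String> urlList = new ArrayList<>();
        for (WebElement element : elementForAllWebLink) {
            String url = element.getAttribute("href");
            urlList.add(url);
        }

        // Check the links concurrently; AtomicInteger keeps the counts thread-safe
        AtomicInteger notWorking = new AtomicInteger();
        AtomicInteger working = new AtomicInteger();
        urlList.parallelStream().forEach(links -> {
            boolean check = isLinkBroken(links);
            if (check) {
                System.out.println("This link is broken : " + links);
                notWorking.getAndIncrement();
            } else {
                System.out.println("This link is not broken : " + links);
                working.getAndIncrement();
            }
        });
        System.out.println("Total Links : " + urlList.size() + ", Broken Links : " + notWorking + ", Working Links : " + working);
        driver.quit();
    }

    // Returns true when the URL responds with an HTTP status code of 400 or above
    public boolean isLinkBroken(String url) {
        boolean brokenLink = false;
        HttpURLConnection connection = null;
        try {
            URL urlClassObject = new URL(url);
            connection = (HttpURLConnection) urlClassObject.openConnection();
            int responseCode = connection.getResponseCode();
            if (responseCode >= 400) brokenLink = true;
        } catch (Exception e) {
            // Exceptions are only logged, so the method still returns false here
            System.out.println("Exception : " + e);
        } finally {
            if (connection != null) {
                connection.disconnect();
            }
        }
        return brokenLink;
    }
}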
Breakdown and Explanation of the Code
- WebElement Declaration
    @FindBy(tagName = "a")
    List<WebElement> elementForAllWebLink;
@FindBy Annotation: Locates all the anchor (<a>) elements on the web page. It is part of the Page Factory pattern in Selenium WebDriver, which initializes the annotated elements for us automatically.
- Test Method
    @Test
    public void checkAllBrokenLinks() {
        FirefoxOptions options = new FirefoxOptions();
        options.addArguments("--headless");
        options.addArguments("--disable-notifications");
        WebDriver driver = new FirefoxDriver(options);
        PageFactory.initElements(driver, this);
        driver.manage().window().maximize();
        driver.get("https://www.amazon.in/");
FirefoxOptions: Configures Firefox to run in headless mode (no GUI) and to disable notifications by adding the --headless and --disable-notifications arguments, respectively. Note that --disable-notifications is a Chromium-style switch; Firefox controls web notifications through preferences such as dom.webnotifications.enabled, so this argument may have no effect there.
WebDriver Initialization: Initializes the Firefox driver with the specified options.
PageFactory.initElements: Initializes the WebElement fields in the class. Without this call, the annotated field stays null and the test cannot interact with the elements.
driver.get(): Opens the URL (https://www.amazon.in/) in the browser.
- Collecting All the Links
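    List<String> urlList = new ArrayList<>();
    for (WebElement element : elementForAllWebLink) {
        String url = element.getAttribute("href");
        urlList.add(url);
    }
Collecting the Links: Iterates through all the anchor elements, reads each one's href attribute, and adds it to an ArrayList. Anchors without an href yield null, and some hrefs use schemes such as mailto: or javascript:, so the list can contain entries that are not HTTP URLs.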
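Since the collected list can include null values, duplicates, and non-HTTP schemes, it may help to clean it before checking. The following is a minimal sketch of one way to do that using only the standard library; the class name HrefFilterSketch and the method checkableLinks are hypothetical and not part of the original code, and the example.com URLs are placeholders.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of the article's test class
class HrefFilterSketch {

    // Keeps only http/https URLs, dropping nulls, blanks, other schemes and duplicates.
    // A LinkedHashSet preserves the order in which links were first seen.
    static List<String> checkableLinks(List<String> hrefs) {
        Set<String> unique = new LinkedHashSet<>();
        for (String href : hrefs) {
            if (href == null) {
                continue; // anchor without an href attribute
            }
            String trimmed = href.trim();
            String lower = trimmed.toLowerCase(Locale.ROOT);
            if (lower.startsWith("http://") || lower.startsWith("https://")) {
                unique.add(trimmed);
            }
        }
        return new ArrayList<>(unique);
    }

    public static void main(String[] args) {
        List<String> hrefs = Arrays.asList(
                "https://example.com/a", null, "", "mailto:team@example.com",
                "javascript:void(0)", "https://example.com/a", "http://example.com/b");
        // prints [https://example.com/a, http://example.com/b]
        System.out.println(checkableLinks(hrefs));
    }
}
```

In the test above, the filtered list could be used in place of urlList, so that the reported totals cover only links that can actually be requested over HTTP.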
- Checking Links
    AtomicInteger notWorking = new AtomicInteger();
    AtomicInteger working = new AtomicInteger();
    urlList.parallelStream().forEach(links -> {
        boolean check = isLinkBroken(links);
        if (check) {
            System.out.println("This link is broken : " + links);
            notWorking.getAndIncrement();
        } else {
            System.out.println("This link is not broken : " + links);
            working.getAndIncrement();
        }
    });
    System.out.println("Total Links : " + urlList.size() + ", Broken Links : " + notWorking + ", Working Links : " + working);
AtomicInteger: Counts the broken and working links and keeps those counts thread-safe while the checks run concurrently through parallelStream().
Parallel Stream: Processes the entries of urlList concurrently for efficiency.
Link Check: Calls the isLinkBroken method for each link and updates the notWorking and working counts accordingly.
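To illustrate the counting step in isolation, here is a small sketch that runs without a browser. It shows the AtomicInteger approach next to an alternative based on Collectors.partitioningBy, which avoids shared mutable state and also keeps the links themselves. The class ParallelCountSketch, its methods, and the fakeCheck predicate (a stand-in for isLinkBroken) are assumptions made for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical sketch, not part of the article's test class
class ParallelCountSketch {

    // Counts broken links from multiple threads; AtomicInteger makes each increment atomic
    static int countBroken(List<String> urls, Predicate<String> isBroken) {
        AtomicInteger broken = new AtomicInteger();
        urls.parallelStream().forEach(url -> {
            if (isBroken.test(url)) {
                broken.incrementAndGet();
            }
        });
        return broken.get();
    }

    // Alternative: split the links into broken (true) and working (false) lists
    static Map<Boolean, List<String>> partition(List<String> urls, Predicate<String> isBroken) {
        return urls.parallelStream().collect(Collectors.partitioningBy(isBroken));
    }

    public static void main(String[] args) {
        // Stand-in for a real HTTP check: any URL containing "missing" counts as broken
        Predicate<String> fakeCheck = url -> url.contains("missing");
        List<String> urls = Arrays.asList(
                "https://example.com/ok",
                "https://example.com/missing",
                "https://example.com/also-missing");

        System.out.println("Broken count : " + countBroken(urls, fakeCheck)); // Broken count : 2
        Map<Boolean, List<String>> result = partition(urls, fakeCheck);
        System.out.println("Working : " + result.get(false)); // Working : [https://example.com/ok]
    }
}
```

A plain int counter would not be safe here, because several threads could read and write it at the same time and lose increments.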
- Method to Check Broken Link
    public boolean isLinkBroken(String url) {
        boolean brokenLink = false;
        HttpURLConnection connection = null;
        try {
            URL urlClassObject = new URL(url);
            connection = (HttpURLConnection) urlClassObject.openConnection();
            int responseCode = connection.getResponseCode();
            if (responseCode >= 400) brokenLink = true;
        } catch (Exception e) {
            System.out.println("Exception : " + e);
        } finally {
            if (connection != null) {
                connection.disconnect();
            }
        }
        return brokenLink;
    }
URL Object Creation: Creates an object of the URL class, passing the String url as the constructor argument.
Connection Establishment: Opens a connection to the url using the URL object and casts it to HttpURLConnection. HttpURLConnection provides additional methods specific to HTTP connections, such as retrieving the response code and headers.
Response Code Check: Retrieves the response code and sets brokenLink to true if it is greater than or equal to 400 (indicating a broken link).
Exception Handling: If an exception is thrown (for example, for a malformed URL or an unreachable host), it is printed and the method returns false, so that link is reported as not broken.
Connection Cleanup: The finally block makes sure the connection is properly closed to avoid resource leaks.
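As a possible variation on this method, the sketch below sends a HEAD request instead of downloading the page body, sets timeouts so a slow server cannot stall the run, and treats any failure to check a link as broken. These are design assumptions rather than the original behavior: the class name RobustLinkCheck, the helper isBrokenStatus, and the 5-second timeouts are made up for the example, and some servers answer HEAD differently from GET, so results can vary.

```java
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical variation on isLinkBroken, not the article's original method
class RobustLinkCheck {

    // Same rule as the article: 4xx and 5xx status codes mean the link is broken
    static boolean isBrokenStatus(int responseCode) {
        return responseCode >= 400;
    }

    // Assumption: a link that cannot be checked at all (malformed URL,
    // unknown host, timeout, non-HTTP scheme) is reported as broken
    static boolean isLinkBroken(String url) {
        HttpURLConnection connection = null;
        try {
            connection = (HttpURLConnection) new URL(url).openConnection();
            connection.setRequestMethod("HEAD"); // headers only, no response body
            connection.setConnectTimeout(5000);  // milliseconds
            connection.setReadTimeout(5000);     // milliseconds
            return isBrokenStatus(connection.getResponseCode());
        } catch (Exception e) {
            System.out.println("Could not check " + url + " : " + e);
            return true;
        } finally {
            if (connection != null) {
                connection.disconnect();
            }
        }
    }

    public static void main(String[] args) {
        // The URL is malformed, so the check fails and the link is reported as broken
        System.out.println(isLinkBroken("not-a-valid-url"));
        System.out.println(isBrokenStatus(404)); // true
        System.out.println(isBrokenStatus(200)); // false
    }
}
```

Whether an unreachable link should count as broken or be reported separately is a judgment call; a third "unknown" category might suit some test reports better.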
Conclusion
Checking for broken links is an important part of software testing because it directly impacts user experience, SEO performance, and your site's overall professionalism. By using Selenium with Java, we can easily automate the process of checking all the links on a web page. Regular monitoring and timely fixes help keep your website user-friendly, trustworthy, and optimized for search engines, ultimately supporting your business goals and enhancing your online presence.