Backlink Data Collection Processes and Tools

Once you have set clear goals and KPIs for your large-scale link building campaign, it's time to begin prospecting. Large-scale link building works best with thousands, or better yet hundreds of thousands, of link prospect URLs. Gathering that many URLs requires tools. If you're conducting large-scale link building without developing your own crawler and scraper, we recommend using free, web-based link data gathering tools.

When evaluating a free, web-based link building research tool, the main thing to look for is the ability to do at least one of the following:

- Export to a CSV/TSV file. This lets you open the file in MS Excel/OpenOffice Calc so that you can aggregate all of the data easily.
- Let you select and copy the data from the web page, so that you can easily paste it into MS Excel/OpenOffice Calc.

Large-scale link building works only when you can aggregate and sort thousands or even hundreds of thousands of URLs and their related link information. If you have to copy and paste individual pieces of data into a spreadsheet for analysis, the tool is of negligible use for large-scale link building.

Here's what you'll find in this article:

- Large Scale Link Building Process
- Analyzing URL Link Value
- Downloading Bulk Search Engine Results
- Competitor Backlink Gathering Tools
- Co-Citation Analysis Tools
- Bulk PageRank Checkers

For link builders without access to custom scraping and crawling tools, we recommend the following link prospecting process.

- Set SEO for FireFox settings to display PageRank, DMOZ status, and Yahoo! Backlinks
- Set a 5 second delay for each setting in SEO for FireFox to prevent banning by the search engines. It also doesn't hurt to be polite.
- Set your Google search preferences to display 100 results.

Get Top Ranking URLs
1. For each keyword...

a. Search Google
b. Wait until DMOZ, PageRank, and Y! Backlinks are acquired for each of the top 100 results.
c. Click the "CSV" link at the top of the web page, underneath the Google search box, to export all results.

2. Aggregate each keyword report into a single spreadsheet. (This is your "competitor list".)
3. Sort the final spreadsheet by PageRank, then by Yahoo! Backlink totals, then by DMOZ status.
4. Manually review URLs, from top to bottom, to determine if the page is from a competitor or complementary website. Prospect backlinks from the competitor URLs most relevant to your website.
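
The aggregation and sort in steps 2 and 3 can also be scripted once the per-keyword CSV files start to pile up. The following is a minimal Python sketch, not a feature of SEO for FireFox itself. The column names (`url`, `pagerank`, `yahoo_backlinks`, `in_dmoz`) and the yes/no DMOZ value are assumptions made for illustration, so adjust them to match the headers in your actual export.

```python
import csv
import io


def load_report(csv_text, keyword):
    """Parse one exported keyword report and tag each row with its keyword."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        rows.append({
            "keyword": keyword,
            "url": row["url"].strip(),
            # Empty cells are treated as 0 (assumption for this sketch).
            "pagerank": int(row["pagerank"] or 0),
            "yahoo_backlinks": int(row["yahoo_backlinks"] or 0),
            "in_dmoz": row["in_dmoz"].strip().lower() == "yes",
        })
    return rows


def build_competitor_list(reports):
    """Merge reports (keyword -> CSV text) into one list.

    Sort order: PageRank, then Yahoo! backlink total, then DMOZ status,
    all from highest to lowest.
    """
    merged = []
    for keyword, csv_text in reports.items():
        merged.extend(load_report(csv_text, keyword))
    merged.sort(
        key=lambda r: (r["pagerank"], r["yahoo_backlinks"], r["in_dmoz"]),
        reverse=True,
    )
    return merged
```

In practice you would read each exported file from disk and pass its text in. The manual review in step 4 still has to be done by a person.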

Aggregate Top Backlink Opportunities
1. For each URL identified above, export its backlinks with Yahoo! Site Explorer.
2. Aggregate all of the backlink URLs, as you did with the CSV files above.
3. The combined list will contain duplicates. Remove them to create a final list of Link Prospect URLs.
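
If the combined list is too large to de-duplicate comfortably in a spreadsheet, the same step can be sketched in a few lines of Python. The normalization rules here (lowercasing the scheme and host, dropping a trailing slash and any fragment) are assumptions about what should count as "the same URL", and you may want stricter or looser rules.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize(url):
    """Reduce trivial variations so near-identical URLs compare equal."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/") or "/"
    # Keep the query string, drop the fragment.
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, parts.query, "")
    )


def dedupe_prospects(backlink_lists):
    """Merge several backlink exports into one list without duplicates.

    backlink_lists: an iterable of lists of URL strings, one per competitor.
    The first occurrence of each URL wins, so the original order is kept.
    """
    seen = set()
    prospects = []
    for backlinks in backlink_lists:
        for url in backlinks:
            if not url.strip():
                continue  # skip blank lines from the export
            key = normalize(url)
            if key not in seen:
                seen.add(key)
                prospects.append(key)
    return prospects
```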

Create a Prioritized List of Link Prospects
1. Copy URLs from the Link Prospect URLs list, 100 at a time, and paste them into a bulk PageRank checker.
2. Once the PageRank has been calculated for each URL, copy the URL and PageRank back into the Link Prospect URLs list.
3. With the final Link Prospect URLs list, sort it by PageRank.
4. Repeat this process for other link opportunity qualifiers you believe may impact your goals.
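
The batching and re-sorting above is simple enough to sketch in Python. This example does not call any PageRank checker. It assumes you have already pasted the checker's results back into a URL-to-PageRank mapping (here the hypothetical `pagerank_by_url`), and it treats any URL without a score as 0.

```python
def batches(urls, size=100):
    """Yield the URL list in chunks small enough to paste into a bulk checker."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def prioritize(urls, pagerank_by_url):
    """Pair each URL with its PageRank and sort from highest to lowest.

    URLs missing from pagerank_by_url get 0 (an assumption for this sketch).
    """
    scored = [(url, pagerank_by_url.get(url, 0)) for url in urls]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored
```

The same `prioritize` pattern works for step 4: swap the PageRank mapping for any other qualifier you collect, such as backlink totals.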

Manually Review URLs
1. From top to bottom, go through each URL in the Link Prospect URLs list.
2. For each URL, determine whether it is a viable link opportunity. If so, mark it in a third column in your spreadsheet.

Note: We usually find that URLs with a PageRank of 3-5 are the ones most open to giving a link back to your website. If you are looking to be effective with as little time invested as possible, we suggest quickly skimming URLs that have a PageRank of 6 or greater, and PageRank of 2 or lower. Spend the most time with PageRank 3-5 URLs.
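
To make the note above easier to apply to a long list, each URL can be tagged with a review hint before you start. This is a small illustrative sketch. The labels "review closely" and "skim" are our own shorthand for the PageRank bands described in the note, not output from any tool.

```python
def review_tier(pagerank):
    """Return a review hint based on the PageRank bands in the note above."""
    if 3 <= pagerank <= 5:
        return "review closely"
    # PageRank 6 or greater, or 2 or lower.
    return "skim"


def tag_for_review(scored):
    """scored: list of (url, pagerank) pairs. Adds a review hint to each pair."""
    return [(url, pagerank, review_tier(pagerank)) for url, pagerank in scored]
```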

Key Metrics for Analyzing the Link Value of URLs
When performing link building research, specifically to determine the value of a link (as opposed to its relevance), look for URLs that have the qualities and qualifications of a rank-influencing link.

Some of the qualifiers and metrics that readily available tools let you analyze are:

- Primary Metrics:
  - PageRank
  - Total Backlinks in Google and Yahoo!
  - If the URL is in DMOZ
  - Co-Citation Analysis

- Secondary Metrics:
  - Domain Age
  - Alexa Ranking
  - Delicious Submissions
  - EDU Links
  - GOV Links
  - Wikipedia Links
  - Yahoo! Directory Links

Downloading Bulk Search Engine Results
The best free tool for downloading search results, hands down, is the SEO for FireFox plugin.

This is a free plugin that integrates with FireFox, a free web browser, and allows you to easily export search results to a CSV file.

The great thing about the SEO for FireFox tool is that it also lets you research and collect important information for each URL in the search results.

With it, you can collect the PageRank, Yahoo! Backlinks, EDU Backlinks, GOV Backlinks, Alexa Ranking, etc. You can quickly see which top-ranking websites to target for backlink analysis.

Competitor Backlink Gathering Tools
One of the greatest opportunities you have for quickly finding large scale link opportunities is to look at what websites link to your competitors.

A number of tools do exactly this.

Perhaps the best tool for this job is Yahoo! Site Explorer.

With this tool, simply search "link:domain.com" to get a full list of backlinks. Export the list by clicking on the "TSV" link at the top.

To exclude backlinks that come from the same domain, choose the "Except from this domain" option from the "Show Inlinks" dropdown.
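
If you are working from an export that still contains same-domain links, the equivalent filter can be approximated after the fact. This Python sketch is an assumption about how such a filter might work, not a description of how the Yahoo! tool does it. It treats a URL as internal when its host is the domain itself or any subdomain of it.

```python
from urllib.parse import urlsplit


def is_same_domain(url, domain):
    """True if the URL's host is `domain` or a subdomain of it."""
    host = (urlsplit(url).hostname or "").lower()
    domain = domain.lower()
    return host == domain or host.endswith("." + domain)


def external_backlinks(urls, domain):
    """Keep only backlinks that come from outside the given domain."""
    return [url for url in urls if not is_same_domain(url, domain)]
```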

Co-Citation Analysis Tools: Measuring URL Weight
Co-citation occurs when a single URL links to two other URLs. Those two URLs are then said to be co-cited.

This URL explains the co-citation relationship and importance well, saying:

"Bibliographic Co-Citation is a popular similarity measure used to establish a subject similarity between two items. If A and B are both cited by C, they may be said to be related to one another, even though they don't directly reference each other. If A and B are both cited by many other items, they have a stronger relationship. The more items they are cited by, the stronger their relationship is."

What this means for you is that when you find a potential link opportunity that also links to other strong websites, your link will be valued, in a certain respect, alongside those websites. In other words, you want links from the pages that already link to other valuable websites and competitors.
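
The co-citation idea can be illustrated with a short Python sketch. It assumes you already have, for each linking page, the set of sites that page links to (the hypothetical `outlinks` mapping below), for example assembled from your backlink exports. It is not how Hubfinder or any other tool is implemented.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(outlinks):
    """Count how many linking pages cite each pair of sites together.

    outlinks: mapping of linking page -> collection of sites it links to.
    Returns a Counter keyed by (site_a, site_b) with site_a < site_b.
    """
    pairs = Counter()
    for targets in outlinks.values():
        for a, b in combinations(sorted(set(targets)), 2):
            pairs[(a, b)] += 1
    return pairs


def rank_hubs(outlinks, strong_sites, minimum=2):
    """Rank linking pages by how many known strong sites they link to.

    Pages that link to fewer than `minimum` strong sites are dropped.
    """
    strong = set(strong_sites)
    scored = [(page, len(strong & set(targets))) for page, targets in outlinks.items()]
    scored = [item for item in scored if item[1] >= minimum]
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored
```

A high pair count suggests two sites are closely related, and pages near the top of `rank_hubs` are the ones that already link to several of your competitors.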

SEOBook's Hubfinder is a subscription-based co-citation analysis tool.

Bulk PageRank Checkers
PageRank is the 0-10 score that Google gives to a website, based on all websites on the internet. In general, a higher PageRank gives a website the opportunity to rank for more total keywords and more competitive phrases.

Bulk PageRank Checkers:
- http://www.seochat.com/seo-tools/pagerank-lookup/
- http://checkbulkpagerank.com/
- http://www.bulkpagerankchecker.com/page-rank-checker.php

Though PageRank should not be the sole metric you use for analyzing the value of a page to your link building efforts, it can serve as a good starting point for prioritizing which links to pursue.

