Link Analysis and Searching with Webometric Analyst via the Bing API
Using the Bing API for direct link searches
Key points:
- Hyperlink searches are not possible in Bing and so this section refers to URL citation or title mention direct link searches.
- Weighted direct link networks of up to 16 sites can be calculated. Because of the host collapsing issue described below, in theory it is not possible to count how many direct links there are from site A to site B, only whether there is at least one such link. In practice, weighted networks can sometimes still be calculated because Bing sometimes disables host collapsing automatically, although this may not be reliable. To construct a weighted direct link network for n sites, assuming a maximum of 1000 results per query, up to 20 * (n^2-n) queries are needed. The largest n such that 20 * (n^2-n) < 5000, so that all the searches can be submitted within a month, is 16 (n = 16 needs 4800 queries, whereas n = 17 would need 5440).
- Unweighted direct link networks of up to 71 sites can be calculated. To construct a network diagram for n web sites, n^2-n searches are needed. These can be configured to return only 1 page of results. With this, the largest n such that n^2-n < 5000 and all the searches can be submitted within a month is 71.
To construct a weighted direct link network, a direct link network (weighted or unweighted) for more than 71 sites, or a hyperlink direct link network, use SocSciBot instead. Note that SocSciBot is slower and cannot cope with large web sites.
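The size limits above follow from simple arithmetic on the monthly allowance. The following is a minimal sketch of that arithmetic only, not part of Webometric Analyst; the function name and parameters are hypothetical. It finds the largest n for which queries_per_search * (n^2-n) stays below the allowance. Under the inequalities as written, it gives 71 sites at one query per search and 16 sites at 20 queries per search.

```python
def max_directed_sites(monthly_queries: int = 5000, queries_per_search: int = 1) -> int:
    """Largest n with queries_per_search * (n^2 - n) < monthly_queries.

    A directed network needs one search per ordered pair of distinct
    sites, i.e. n^2 - n searches. Hypothetical helper for illustration.
    """
    n = 1
    # Keep growing n while the next size would still fit in the allowance.
    while queries_per_search * ((n + 1) ** 2 - (n + 1)) < monthly_queries:
        n += 1
    return n


if __name__ == "__main__":
    # Unweighted: one page of results per search.
    print(max_directed_sites(5000, 1))
    # Weighted: up to 20 pages of results per search.
    print(max_directed_sites(5000, 20))
```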
Using the Bing API for co-inlink searches
Key points:
- Hyperlink searches are not possible in Bing and so this section refers to URL citation or title mention co-inlink searches.
- Because of the absence of hit count estimates discussed below,
- Weighted co-inlink networks of up to 22 sites can be calculated. To construct a network diagram for n web sites, (n^2-n)/2 searches are needed. Each search may take up to 20 queries. With this, the largest n such that 20*(n^2-n)/2 < 5000 and all the searches can be submitted within a month is 22.
- Unweighted co-inlink networks of up to 100 sites can be calculated. The calculation is as above, but with the maximum number of pages of results per query set to 1, so the condition becomes (n^2-n)/2 < 5000.
To construct larger networks than those above, additional queries could be bought from the Bing Azure service. Note that SocSciBot cannot normally be used for co-inlink networks because the co-inlinks could originate from anywhere on the web, and SocSciBot can only crawl a limited number of web sites rather than the whole web.
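Because co-inlinks are symmetric, only one search per unordered pair of sites is needed, which is where the (n^2-n)/2 figure comes from. As a rough sketch, the pairs and the resulting query cost can be enumerated as follows; the function name and the example site names are hypothetical and only illustrate the counting, not how Webometric Analyst builds its queries.

```python
from itertools import combinations


def coinlink_query_cost(sites, pages_per_search=20):
    """Return (number of site pairs, worst-case number of queries).

    Each unordered pair of sites needs one search, and each search may
    need up to pages_per_search queries (one per page of results).
    """
    pairs = list(combinations(sites, 2))  # (n^2 - n) / 2 unordered pairs
    return len(pairs), len(pairs) * pages_per_search


if __name__ == "__main__":
    # Hypothetical site names, for illustration only.
    weighted_sites = [f"site{i}.example" for i in range(22)]
    print(coinlink_query_cost(weighted_sites, pages_per_search=20))

    unweighted_sites = [f"site{i}.example" for i in range(100)]
    print(coinlink_query_cost(unweighted_sites, pages_per_search=1))
```

With 22 sites and 20 pages per search the worst case is 4620 queries, and with 100 sites and 1 page per search it is 4950, both within a 5000-query month.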
Using the Bing API for Webometric research after July 2012
Key points:
- Each user gets a maximum of 5,000 queries per month and must sign up to get these. Since each search can use up to 20 separate queries, one per page of results, this may limit the monthly number of searches to as few as 250.
- The results are not the same as those from the main Bing web interface and there seem to be far fewer matches returned.
- Searches do not return hit count estimates and are therefore no longer useful for webometric purposes that involve calculating or comparing hit counts when some of the results are likely to exceed 1,000. Query splitting can extend this limit: to 2,000 by trebling the number of searches (level 1), to 4,000 by multiplying the number of searches by 7 (level 2), and to 8,000 by multiplying the number of searches by 15 (level 3). However, this quickly uses up the 5,000 queries allowed per month. For example, with level 3 query splitting the number of searches per month could be as few as 16 (i.e., 250/15, rounded down).
- It is not possible to get more than 50 results if the "disable host collapsing" option is selected. Hence it is no longer possible to get the number of matches for queries that have multiple matches (>2) on one or more web sites. It is also no longer possible to estimate or count the number of matches that a search gets on a single web site.
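The trade-off between query splitting and the monthly allowance can be tabulated with a short sketch. It assumes, as a generalisation of the three figures quoted above (3, 7 and 15), that level k multiplies the number of searches by 2^(k+1) - 1 and doubles the reachable result count at each level; the function name and parameters are hypothetical.

```python
def splitting_budget(level, monthly_queries=5000, pages_per_search=20,
                     results_per_search=1000):
    """Return (search multiplier, max results, searches per month).

    Assumption: level k query splitting multiplies the number of
    searches by 2^(k+1) - 1, matching the quoted 3, 7 and 15.
    """
    multiplier = 2 ** (level + 1) - 1
    max_results = results_per_search * 2 ** level
    # Worst case: every search uses all of its result pages.
    searches_per_month = monthly_queries // (pages_per_search * multiplier)
    return multiplier, max_results, searches_per_month


if __name__ == "__main__":
    for level in range(4):
        print(level, splitting_budget(level))
```

Under these assumptions, level 0 allows 250 searches per month and level 3 allows 16, in line with the figures above.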