Majestic SEO for Webometric Research
Our Ambassador in Ukraine has just returned from presenting some university research in Seoul. We thought it would be useful for blog readers to see some of the “bigger picture” things that Majestic SEO is beginning to get involved in. This is Dmytro’s very different conference report.
Dmytro Filchenko holds a Doctorate in Mathematical Modeling and Computational Methods and is currently the Head of the Centre for Webometrics and Web Marketing at Sumy State University, Ukraine. See http://www.linkedin.com/in/filchenko for further details.
Hyperlinks are of interest not only to SEOs and web marketing specialists but also to researchers in web mining, information retrieval and webometrics. The term ‘webometrics’ was coined more than 15 years ago, and the field it names has grown steadily since. It assesses indicators of web presence, web usage and web impact, and looks for new patterns on the Web.
This subject area was discussed in depth at one of the most comprehensive events in the field, the 8th International Conference on Webometrics, Informetrics & Scientometrics, organized by the Global Interdisciplinary Research Network COLLNET and held recently in Seoul, South Korea, where I was invited to present my paper.
Every day, all of us use various web metrics: Google PageRank when searching the web, Alexa Traffic Rank when looking for popular sites, or MajesticSEO’s Flow Metrics when estimating the impact of a given URL. However, have you ever considered the scale of the samples examined while computing these metrics (thousands of billions of URLs) and how difficult it is to achieve precision on that scale (the public Google PageRank has only 11 integer values, from 0 to 10)?
If one were to compute such metrics for a specific subject area, size and precision would be less of a problem and, more importantly, the results would give better insight into that subject area.
For instance, if you want to estimate the authoritativeness of your company’s web domain alongside those of your competitors, you need to calculate your own metrics on your own sample of web domains, not on the whole web as most major metrics do. However, this requires a database of backlinks, which is the core data for any modern metric. This is where MajesticSEO comes in.
My team was given a project to evaluate the level of mutual impact between universities, and I thought: why not approach it from the webometric point of view? Placing a hyperlink to another university’s website takes deliberate effort, which raises the question: why would one university pay such a tribute to another? If a university decides to link to another from its own site, this implies that it values what it is linking to.
More than ten years ago, there was a similar project named ‘G-Factor’, which was aimed at evaluating hyperlinks to universities. Using the Google search engine, it counted the number of hyperlinks to each university’s website from all other university websites. The more hyperlinks a university website received from other universities’ websites, the higher it was ranked.
Despite its obvious advantages, G-Factor in its classical form had some drawbacks. First and foremost, G-Factor used Google as its backlink provider (hence the letter ‘G’ in the name), and Google can no longer be regarded as a hyperlink provider. Secondly, G-Factor did not take into account the authoritativeness of a hyperlink’s source: it counted every link equally, regardless of which university website it came from.
Back to our project: a new data provider was required, as well as a new algorithm that took into account the mutual authoritativeness of university websites. As for the backlinks, we tried different sources, but they were either poor in quality, extremely expensive, or had very primitive APIs. After months of evaluation, we finally found Majestic SEO!
Once we started using the Majestic SEO database, we worked closely with the API Support Team. Webinars organized by Dixon Jones were exceptionally useful for us. All of this enabled us to create a special GUI tool for exporting the list of hyperlinks from the Majestic SEO database for each university web domain. Another tool then parses that list, extracting only those hyperlinks that appear on university web domains included in the sample. This analyzer produces a so-called Backlinks Matrix.
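To make the idea of a Backlinks Matrix more concrete, here is a minimal sketch in Python. It is not our actual tool: the function names, the (source URL, target URL) input format and the example domains are all assumptions made for illustration, and it does not call the Majestic SEO API at all. It simply takes an already exported list of links, keeps only those where both ends belong to a domain in the sample, and counts them.

```python
from urllib.parse import urlparse


def match_domain(url, domains):
    """Return the sample domain that a URL belongs to, or None.

    A URL belongs to a domain if its host equals the domain or is a
    subdomain of it (www.alpha.edu.ua belongs to alpha.edu.ua).
    URLs without a scheme have no parsable host and are skipped.
    """
    host = (urlparse(url).hostname or "").lower()
    for domain in domains:
        if host == domain or host.endswith("." + domain):
            return domain
    return None


def build_backlinks_matrix(links, domains):
    """Count links between sample domains.

    links   -- iterable of (source_url, target_url) pairs (assumed format)
    domains -- list of university domains in the sample
    Returns a square list of lists where matrix[i][j] is the number of
    links found on domains[i] that point to domains[j].
    """
    index = {domain: i for i, domain in enumerate(domains)}
    n = len(domains)
    matrix = [[0] * n for _ in range(n)]
    for source_url, target_url in links:
        src = match_domain(source_url, domains)
        dst = match_domain(target_url, domains)
        # Drop links from outside the sample and links a site gives itself.
        if src is None or dst is None or src == dst:
            continue
        matrix[index[src]][index[dst]] += 1
    return matrix


if __name__ == "__main__":
    # Hypothetical domains and links, for illustration only.
    sample = ["alpha.edu.ua", "beta.edu.ua"]
    exported = [
        ("http://www.alpha.edu.ua/partners", "http://beta.edu.ua/"),
        ("http://blog.example.com/post", "http://beta.edu.ua/"),
        ("http://beta.edu.ua/news", "http://lib.alpha.edu.ua/catalog"),
    ]
    for row in build_backlinks_matrix(exported, sample):
        print(row)
```

Summing a column of such a matrix gives the number of links a university receives from the others in the sample, which is essentially what the classical G-Factor counted.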
Afterwards, we concentrated on strengthening the algorithm of the original G-Factor. As the first step of the research, we proposed using the original Brin & Page model for PageRank, which uses the hyperlink structure of the Web to build a Markov chain with a transition probability matrix. Without getting too technical, I’ll just say that we used the power iteration technique to solve the resulting system of equations, which finally gave us PageRank values.
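For readers who do want a little of the technical side, the following is a minimal sketch of PageRank computed by power iteration over a small link-count matrix. It illustrates the general technique rather than our exact model: the function name, the damping factor of 0.85 (the value suggested in the original Brin & Page paper), the tolerance, and the choice to treat a site with no outgoing links as linking to everyone equally are all assumptions.

```python
def pagerank(matrix, damping=0.85, tol=1e-10, max_iter=1000):
    """Approximate PageRank by power iteration.

    matrix[i][j] is the number of links from site i to site j.
    damping, tol and max_iter are illustrative defaults, not values
    taken from the project.
    """
    n = len(matrix)
    if n == 0:
        return []

    # Build the transition probability matrix: each row becomes the
    # probability of moving from site i to every other site.
    transition = []
    for row in matrix:
        total = sum(row)
        if total == 0:
            # Assumption: a site with no outgoing links jumps anywhere.
            transition.append([1.0 / n] * n)
        else:
            transition.append([value / total for value in row])

    rank = [1.0 / n] * n  # start from a uniform distribution
    for _ in range(max_iter):
        new_rank = [
            (1.0 - damping) / n
            + damping * sum(rank[i] * transition[i][j] for i in range(n))
            for j in range(n)
        ]
        change = sum(abs(a - b) for a, b in zip(new_rank, rank))
        rank = new_rank
        if change < tol:  # stop once the values have settled
            break
    return rank


if __name__ == "__main__":
    # Hypothetical three-site example: sites 0 and 1 both link to site 2.
    links = [
        [0, 0, 1],
        [0, 0, 1],
        [0, 0, 0],
    ]
    for site, score in enumerate(pagerank(links)):
        print(site, round(score, 4))
```

In this toy example the site that receives links from both of the others ends up with the highest score, and the scores always sum to one because each row of the transition matrix is a probability distribution.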
All in all, for our project, which we named ‘Extended G-Factor’, we used the same concept as Google PageRank and G-Factor (estimating the universities’ mutual impact as a function of mutual hyperlinks from their websites), although our model was much more intricate. We took 324 Ukrainian university web domains from the directory of ‘Ranking Web of World Universities’ and calculated our Extended G-Factor for them.
The project revealed a number of findings. Firstly, it was discovered that most Ukrainian universities are not fond of citing other universities on their websites. This may well be down to competition: universities may want to prevent students and other categories of customers from being exposed to rival sites.
Moreover, those universities that do cite other universities on their websites do not actually link to papers and articles. Again, this could be a way of giving less coverage to opponents’ work. It was found, though, that academic institutions with more in common in terms of specialty had more mutual hyperlinks. It also appeared that small schools ranked very high when ‘hyperlinking’ with major universities. This is most likely because they are not regarded as ‘big competition’. As a result, the major universities, which usually obtain high positions in Webometrics Rankings, are not well represented in the Extended G-Factor Ranking.
It was also discovered that universities in the same region are more likely to link to each other.
In conclusion, our work showed that the hyperlink is a powerful tool that provides a good metric of mutual impact. Our project has been published at http://ranking.sumdu.edu.ua.