Fresh Index hits 100 Billion URLs

By Dixon Jones November 6, 2011
Link index hits 100 Billion URLs

Screenshot taken on 6th November 2011

I noticed this little milestone just now. It’s a Sunday, so I really should not be looking at the business too closely, but last week was a busy one, with us winning the best SEO Technology award at the Search Awards, and we are gearing up for what we hope will be a massive week for Majestic as we go to Pubcon in Las Vegas.

In preparation for that, the Historic index was updated yesterday – but the Fresh Index now updates so often automatically that we forget to look at the numbers, even though they are listed on the home page.

Today, though, the number stood out for me, at 100,272,695,204 URLs seen by our crawlers within a 30 day period. This does not mean NEW URLs; it means that the links were strong enough to get re-crawled or re-seen in the last 30 days by our crawlers. This is why it makes sense to use the FRESH index for normal day-to-day analysis of link data. Frankly, a blog post that drops off the home page of a blog without itself getting any external links becomes largely lost on the Internet. We’ll still have the URL in our historic index, but neither we nor the main search engines will pay much attention to it, because other websites, and therefore presumably people, pay limited attention to it.

We continue onwards and upwards.

Posted In: Updates

9 Responses to “Fresh Index hits 100 Billion URLs”

  1. R2D2 said:

    November 06, 2011 at 12:06 pm

    Wow, such a massive amount of traffic must cost a lot of money! Perhaps Majesticseo could start its own search engine with its own ranking algorithms?

  2. razvypp said:

    November 06, 2011 at 2:44 pm

    Wow, congratulations

  3. Alex said:

    November 06, 2011 at 5:01 pm

    Wow, congratulations!

    Do you have, or are you planning on, some architectural posts? It would be very interesting to get some insights into how you crawl and store such a massive amount of data.

    Thank you for a great service!

    • Dixon said:

      November 06, 2011 at 5:30 pm

      Like this? :)

      • Alex said:

        November 06, 2011 at 8:03 pm

        Yes, that is a very good read. I (and others hopefully) am also interested in the actual software architecture. How do you store all links and their relations? Graph databases perhaps? How are your crawlers built, etc. Thanks!

  4. Pablo Alberto said:

    November 09, 2011 at 9:02 pm

    congratulations! and nice Job Majestic

  5. Small Business Website said:

    November 11, 2011 at 6:32 pm

    With an index that large, will we see any performance hits? It is a great milestone and one to be celebrated. Thanks for the update.

  6. bolasonic said:

    November 21, 2011 at 9:57 am

    woahhhh that’s awesome…congratulations for your achievement Majestic, can’t wait for another milestone to be achieved :)

  7. George Eighmie said:

    November 21, 2011 at 8:31 pm

    This is a very good service you provide. I will be upgrading.