A Word about Scale

By Dixon Jones February 26, 2014

A short while ago, I found myself speaking immediately before a speaker from Twitter. When he started talking scale, I realized that Majestic had been understating the level of scale it has achieved. I think some of these stats will surprise a few people…

Just think about that for a second… Majestic crawls four times faster than the entire Twitter population tweets.

That’s “big” data.

 

Posted In: General

17 Responses to “A Word about Scale”

  1. James said:

    February 27, 2014 at 10:26 am

    This is why I prefer Majestic data over any other source. Great graphic, Dixon – it really puts things in perspective!

    • Dixon Jones said:

      February 27, 2014 at 11:41 am

      Thanks James. I hear it isn’t rendering very well on some Android phones… but I think it tells a cool story doesn’t it?

  2. Jack Britton said:

    February 27, 2014 at 10:47 am

    It's incredible to think how much is covered per day – what surprised me the most was that the UK is responsible for nearly twice the amount when compared to the US over the last 30 days.

    I'm not even going to mention Finland, that's a crazy amount of URLs crawled within 30 days.

    • Dixon Jones said:

      February 27, 2014 at 11:38 am

      To be fair – being a British company – the UK was always going to be a stronghold for crawling. However – the Finns are impressive, aren’t they?

  3. MIchael said:

    February 27, 2014 at 11:27 am

    That is impressive. It would be interesting to know how many servers Majestic deploys to crawl at such a high volume.

    And the numbers are believable – how else could Majestic show links only 2-3 days after we build them, even with minor websites? They regularly show up faster here than on Moz or Ahrefs.

    • Dixon Jones said:

      February 27, 2014 at 11:40 am

      Well – rumour has it that we started with an Apple Macintosh… ;)

      OK – that’s not true. The only Macs in the office are mine for travelling and one for UX testing. See how I cleverly avoided that question?

  4. Bradley Griffin said:

    February 27, 2014 at 12:11 pm

    That truly is ‘BIG DATA!’
    The speed at which links appear is what swayed me to sign up with you instead of Moz.
    Great graphic, Dixon

    • Dixon Jones said:

      February 27, 2014 at 12:35 pm

      Thanks. The Development team at Majestic sent me on a one-day infographics course at the Guardian. Thought I had better make use of the lesson!

  5. Laurent Bourrelly said:

    February 27, 2014 at 1:02 pm

    Hi Dixon,

    Hope all is well.
    This is not Big Data, it’s Humongous Data !!!
    btw I have to show you something…

    • Dixon Jones said:

      February 27, 2014 at 1:04 pm

      Laurent… put it away. :)
      I’ll be in Paris in two weeks I think. Look forward to seeing you then?

  6. Club Mate said:

    February 27, 2014 at 10:32 pm

    Oh, Yeah ! That's really BIG! Data ;) Love it!

  7. Kevin said:

    February 28, 2014 at 4:26 pm

    This is just WOW..

  8. paul said:

    February 28, 2014 at 5:24 pm

    Yes, this is clearly wow !!!

    the Milky Way comparison just killed me ;-)

    Just one question,

    what does “Crawled from: Top 30 countries in 30 days” reflect?
    Majestic's interest in that country?
    The country's total web pages?

    I am French, so be careful with the answer ;-)
    (in other words, why does France have fewer pages crawled than Belgium?)

    • Dixon Jones said:

      February 28, 2014 at 5:31 pm

      It is OK… we are not ignoring France (or anywhere else). This is where we crawl FROM (where the crawlers are physically located). We have more bandwidth in Belgium than France, it would seem. But we will of course still crawl more pages on servers hosted in France than in Belgium. I only listed the first 30 countries because that data was easy for me to get, that is all.

  9. chris vos said:

    March 02, 2014 at 12:33 am

    It looks like you crawl about 2x ahrefs and 4x moz. Any idea why this is?

    Keep doing infographics, they are such a nice break from regular text blog posts!

    • Dixon said:

      March 03, 2014 at 10:54 am

      There are many reasons. We are not always bigger than both. Moz has a more expensive (but ultimately more versatile) crawl infrastructure. Ahrefs maintain their index differently… Building it up over time, then rebuilding from scratch (I think) so when they start a rebuild they are small and accurate… At the end they are large and have more dead wood. We always maintain 90 days of crawl data in Fresh and leave the dead wood in Historic.

  10. Jonathan Cross said:

    March 02, 2014 at 11:58 pm

    Now I understand exactly why we have been so pleased with the data accuracy we receive from Majestic. Impressive work, Dixon. Very impressive…