Sunday, November 4, 2007

See who is linking to you

Google's Webmaster Tools site provides a reporting feature that lets you see who is linking to you. Unfortunately, the report is organized the opposite way from how I want to read it: it lists the remote links for each of your local pages. I want to see all of the local pages linked from a remote site grouped together. That helps me recognize trends and identify people who might be blogging about what I write here.

Luckily, in addition to the interactive report on the tools web site, you can download the data as a CSV file and manipulate it any way you want. I put together a little script to produce an HTML file that shows which sites link to me and which pages on my site they link to. For example, I was a bit surprised to discover several links on an Israeli site. Here's a segment of the output of LinkingToMe:



There are a lot of links on del.icio.us to the PyMOTW articles, and while that's cool it isn't very useful in this report. I included an option to filter out sites by their hostname, and included several bookmarking sites as defaults.
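
The regrouping and hostname filtering described above can be sketched roughly as follows. This is not the LinkingToMe source, only a minimal illustration in Python 3. It assumes the downloaded CSV has the local page in the first column and the linking remote page in the second; the function name, the sample URLs, and the one-entry ignore list are all made up for the example.

```python
import csv
import io
from collections import defaultdict
from urllib.parse import urlparse

# Assumed default: the post names del.icio.us; other bookmarking
# sites would be added here.
IGNORED_HOSTS = frozenset({"del.icio.us"})


def group_by_remote_host(csv_file, ignored_hosts=IGNORED_HOSTS):
    """Map each remote hostname to the sorted local pages it links to.

    Assumes column 0 is the local page and column 1 is the remote page.
    """
    grouped = defaultdict(set)
    for row in csv.reader(csv_file):
        if len(row) < 2:
            continue
        local_url, remote_url = row[0].strip(), row[1].strip()
        host = urlparse(remote_url).netloc.lower()
        # Rows without a hostname (such as a header row) are skipped,
        # along with any host on the ignore list.
        if not host or host in ignored_hosts:
            continue
        grouped[host].add(local_url)
    return {host: sorted(pages) for host, pages in sorted(grouped.items())}


# Hypothetical sample data standing in for the downloaded report.
SAMPLE = io.StringIO(
    "local,remote\n"
    "http://example.com/getopt.html,http://forum.example.org/thread/1\n"
    "http://example.com/optparse.html,http://forum.example.org/thread/1\n"
    "http://example.com/getopt.html,http://del.icio.us/someone\n"
)

for host, pages in group_by_remote_host(SAMPLE).items():
    print(host)
    for page in pages:
        print("   ", page)
```

Grouping into a set per host drops duplicate rows, so a remote site that links to the same local page from many of its own pages shows that page only once.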

So far I'm not doing any other processing on the data (such as downloading the titles on those remote pages). Perhaps when I have a little more time I'll enhance the script.

[Corrected country of origin for whatsup.co.il.]

7 comments:

Anonymous said...

just so you know,

whatsup.org.il (which is also accessible through whatsup.co.il) is an Israeli forum dedicated to Free Software in general and Linux in particular.
There is a programming section in the forum, and python is gaining popularity due to the work of some language advocates and experts.

BTW - thanks for the PyMOTW articles, they are a great read :)

Doug Hellmann said...

Thanks for the tip! From the Tux logo, I thought whatsup.org.il probably had something to do with Linux, but I don't read Hebrew so I couldn't tell what was actually being said on any of the pages.

mksoft said...

Hi Doug,

I'm one of the owners and maintainers of whatsup.org.il (or co.il) and a reader of your blog (valuable information in PyMOTW, 10x :-)

In that discussion (full link at
http://whatsup.co.il/forum/42547 ) the guy hit a wall trying to implement getopt/optparse on his own, instead of using the libraries (hence the links to pymotw).

After this was pointed out to him, he wanted to implement something simple with a limited scope (for his needs, 3 required options) as a learning exercise, so I posted a sample towards the end of the discussion.

Doug Hellmann said...

@mksoft - That's a clever use of zip() and slicing to assemble the option pairs from sys.argv. I'll have to remember that trick. :-)
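
For readers who have not seen the trick, here is a minimal sketch of pairing option names with their values using zip() and slicing. It is not the code posted in the forum thread; the function name and the sample arguments are invented for illustration, and it assumes every option is followed by exactly one value.

```python
def pair_options(argv):
    """Pair up alternating option names and values from an argv-style list."""
    args = argv[1:]  # drop the program name
    # args[::2] takes the names (positions 0, 2, 4, ...) and
    # args[1::2] takes the values (positions 1, 3, 5, ...).
    return list(zip(args[::2], args[1::2]))


print(pair_options(["prog", "-a", "1", "-b", "2", "-c", "3"]))
# [('-a', '1'), ('-b', '2'), ('-c', '3')]
```

Because zip() stops at the shorter input, a trailing option with no value is silently dropped, so a real script would want to check for an odd number of arguments first.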

mksoft said...

10x Doug, It's the best I could come up with. It's not too readable though. If you'll come up with something more elegant, let me know :-)

BTW, the sample code posted in the link above has one bug (which I've mentioned in the following comment there): it won't catch a case where the same param is passed more than once. So the line:

if not opt in OPTS.keys():

should be:

if not opt in OPTS.keys() or OPTS[opt].get('value', None):

I'll update it.
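
A rough sketch of how the corrected check behaves inside a parsing loop is below. Only the condition itself comes from the comment above; the collect() function, the option names, and the shape of the OPTS-style dictionary are assumptions made for the example.

```python
def collect(pairs, known=("-a", "-b", "-c")):
    """Store each option's value, rejecting unknown or repeated options."""
    # Hypothetical stand-in for the OPTS dictionary: one empty dict per option.
    opts = {name: {} for name in known}
    for opt, value in pairs:
        # Same logic as the corrected line: fail when the option is unknown
        # or when a value was already stored for it. The truthiness test
        # means an empty-string value would not count as already set.
        if opt not in opts or opts[opt].get("value"):
            raise ValueError("unknown or repeated option: %s" % opt)
        opts[opt]["value"] = value
    return opts


print(collect([("-a", "1"), ("-b", "2")]))

try:
    collect([("-a", "1"), ("-a", "2")])
except ValueError as err:
    print(err)
```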

Anonymous said...

great trick by the way, and thanks for the help. (yeah i'm the guy who asked the question).

MetalloUrlante said...

It would also be useful to see the anchor text of the backlinks.
As you know, anchor text has a big value for search engines.