Feb
09
2005

WordPress plugins

Yesterday I wrote my first couple of WordPress plugins. You can already find one at the bottom of every post. If you click ‘Show wikified noun phrases’, a select box appears with all the noun phrases my little script extracted. Clicking on a noun phrase takes you to Wikipedia. The script is pretty simple: it uses regular expressions to pull out nouns and noun phrases (yeah, I know, it’s not optimal yet). Those nouns are compared to a (for now outdated) local database of Wikipedia entries. If a matching entry is found, the noun (or noun phrase) is turned into a link and placed in the select box. Pretty cool, huh? I think something like this can give a user more information about topics related to the post. Any ideas and suggestions are more than welcome. The source will be made available soon, after some more tweaking.
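To make the idea concrete, here is a minimal sketch of the extract-then-look-up flow described above. It is written in Python purely for illustration and is not the plugin's actual source (a WordPress plugin would be PHP). The regular expression, the `KNOWN_TITLES` set standing in for the local database, and the function names are all assumptions.

```python
import re
from urllib.parse import quote

# Hypothetical stand-in for the local database of Wikipedia titles.
KNOWN_TITLES = {"Firefox", "Mozilla Firefox", "Wikipedia"}

# Assumed pattern: one capitalized word, optionally followed by more
# capitalized words. This is a crude approximation of "noun phrase".
PHRASE_RE = re.compile(r"\b[A-Z][a-zA-Z]+(?:\s[A-Z][-a-zA-Z]+)*")


def extract_candidates(text):
    """Return candidate phrases in order of first appearance, de-duplicated."""
    seen = []
    for match in PHRASE_RE.finditer(text):
        phrase = match.group(0)
        if phrase not in seen:
            seen.append(phrase)
    return seen


def wikify(text, known_titles=KNOWN_TITLES):
    """Map every candidate that has a known title to a Wikipedia URL."""
    links = {}
    for phrase in extract_candidates(text):
        if phrase in known_titles:
            slug = quote(phrase.replace(" ", "_"))
            links[phrase] = "http://en.wikipedia.org/wiki/" + slug
    return links


if __name__ == "__main__":
    post = "I tested Mozilla Firefox and plain Firefox against Wikipedia today."
    for phrase, url in wikify(post).items():
        print(phrase, "->", url)
```

The resulting phrase-to-URL pairs are what would fill the select box. Because only phrases with a known title survive, capitalized sentence openers that are not article titles drop out on their own.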

As a second plugin I made a simple script that builds an array of the words in a post and the number of times each one appears. The idea is that recurring words might be good tags to classify the post under. I’m thinking about how to let my WordPress make good use of tags. Maybe tags could replace categories and, through something like TouchGraph, make it pretty easy to find related information on our blog. This script could also be used to insert tag statements for Technorati, to link posts to del.icio.us … The possibilities are endless :-) Again, any ideas are welcome!
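The word-counting idea can be sketched in a few lines. Again this is Python for illustration only, not the plugin itself; the function names and the `min_count` and `top_n` thresholds are assumptions.

```python
import re
from collections import Counter


def count_words(text):
    """Count how often each lowercase word appears in the text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def suggest_tags(text, min_count=2, top_n=5):
    """Return the most frequent words that occur at least min_count times."""
    counts = count_words(text)
    return [word for word, n in counts.most_common(top_n) if n >= min_count]


if __name__ == "__main__":
    print(suggest_tags("Tags tags TAGS wordpress WordPress plugin"))
```

On a real post, very common words would dominate the counts, so some kind of stoplist would be needed before the result is usable as tags.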

Written by Erik in: projects | Tags:

18 Comments »

  • erik

    grmbl, the showHide script doesn’t seem to work on firefox windows. It does work on firefox mac and linux … Strange …

    Comment | 9/2/2005
  • jaap

    Dude, really cool! I’m thinking it might be an idea to remove common nouns from the words that will be wikified. I like your solution though because it solves the problem I raised before that it might drown relevant links in wikipedia links.

    Also, I’m not sure what you mean by “the showHide script doesn’t seem to work on firefox windows” … it seems to be working for me, unless I am missing something.

    Anyway, let’s talk about it l8r ;)

    Comment | 9/2/2005
  • jaap

    http://esl.about.com/library/vocabulary/bl1000_list1.htm

    Comment | 9/2/2005
  • erik

    well, i thought it didn’t work on win firefox but it was just on http://www.justlol.org where the javascript wasn’t included. It is solved now.

    I was indeed planning to filter out words from standard stoplists, and maybe also the most common words in English. I should also have noun phrases extracted by a system trained with a part-of-speech tagger, so it finds more noun phrases than it currently does (right now it only catches more than one noun if two consecutive nouns are capitalized).

    Comment | 9/2/2005
  • jaap

    Another thing you could do is compare noun phrases to a list of existing wikipedia articles. Check out:

    http://en.wikipedia.org/wiki/Wikipedia:Quick_index
    Other ways to browse:
    http://en.wikipedia.org/wiki/Wikipedia:Browse (for example if you only want to access certain topics)

    Comment | 9/2/2005
  • erik

    dude, that’s exactly what i’m doing …

    Comment | 9/2/2005
  • jaap

    ok, and i’m sure you looked at http://www.antisleep.com/wikipedizer/api/wikipedizer.php.txt as well:

    // Match proper noun phrases.
    preg_match_all("/[A-Z][a-zA-Z]+(\s[A-Z][-a-zA-Z]+)+/ms", $result, $propernounphrases);
    // Match acronyms. (performance seems to go through the floor if we do these in one pass.)
    preg_match_all("/[A-Z][A-Z][A-Z]+/ms", $result, $acronyms);

    // Merge and de-duplicate.
    $phrases = array_unique($propernounphrases[0] + $acronyms[0]);

    // Open up a db connection and whittle our list down against the real titles.
    $connection = mysql_connect ("localhost", "wikipedia");
    if ($connection == null)

    Comment | 9/2/2005
  • erik

    that’s where i got the idea ;-)

    i just grabbed a list of common words from the net and let the script filter them out of the noun phrases. As you wished. Good idea ;-)

    Comment | 9/2/2005
  • jaap

    dude, it’s listing all the words above the article now, that can’t be the idea…

    Comment | 9/2/2005
  • jaap

    hehe now there’s an error, i guess you’re working on it :)

    Comment | 9/2/2005
  • erik

    yep, working on it, should have some development wordpress up somewhere.
    The script works pretty neat now but for some strange reason it doesn’t get rid of ‘the’ although it is the most common word … mmhhh … dinner first :-)

    Comment | 9/2/2005
  • erik

    also it should actually stem the words or something like that. e.g. ‘thing’ is in the most common word list but ‘things’ isn’t …

    Comment | 9/2/2005
  • How about a plugin that rejects texas-holdem posts :))

    Comment | 10/2/2005
  • erik

    Well, i had to change ip, i got plugged out, now i am plugged in ;-) No more texas sh*t again. Oh, there is this guy called bush …

    Comment | 10/2/2005
  • jaap

    since you’re thinking about a redesign, i kind of like this design:

    http://www.stopdesign.com/log/

    Comment | 10/2/2005
  • erik

    not bad at all

    Comment | 10/2/2005
  • jaap

    Instead of writing a website where people could post urls and wikify them, i think it would be awesome if you wrote a firefox extension that would take the current page and wikify it. Of course if you did this and many people start using it then that could become a real bandwidth bitch and mess up your server.

    maybe if you write the plugin and it works well the people at mozilla can help you out.

    of course if i can help you with this i’d be more than willing :)

    Comment | 10/2/2005
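The common-word filtering and the stemming problem raised in the comments could be sketched roughly like this. It is a Python illustration under stated assumptions, not the plugin's code: `COMMON_WORDS` is a hypothetical stand-in for the list grabbed from the net, and `naive_stem` only strips a plural "s". One plausible reason ‘the’ slipped through, which the thread does not confirm, is a case-sensitive comparison ("The" at the start of a sentence never equals "the"), so the sketch lowercases before comparing.

```python
# Hypothetical stand-in for the list of common words grabbed from the net.
COMMON_WORDS = {"the", "thing", "idea", "way"}


def naive_stem(word):
    """Crude plural stripping; a real stemmer (e.g. Porter) does far more."""
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def is_common(phrase, common_words=COMMON_WORDS):
    """True if the phrase is a common word, ignoring case and a plural 's'."""
    word = phrase.lower()
    return word in common_words or naive_stem(word) in common_words


def filter_phrases(phrases, common_words=COMMON_WORDS):
    """Drop common words but keep everything else in its original order."""
    return [p for p in phrases if not is_common(p, common_words)]


if __name__ == "__main__":
    print(filter_phrases(["The", "Things", "Firefox", "Mozilla Firefox"]))
```

Multi-word phrases are compared as a whole, so "Mozilla Firefox" is kept even if one of its words were on the common list.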


Copyright © by the contributing authors. All material on this collaboration platform is the property of the contributing authors.