Topic: Plugin to pull tag info from a webpage
A number of websites out there could be used to infer tag info such as genre, artist, album, and either track number or track name (if one of the two is already present).
For instance, take a look at this page:
http://archive.org/details/1-001
At some point a few years ago, I downloaded the VBR zip file, which came through without track numbers. I could easily write a script to parse archive.org pages and return a JSON object containing whatever info can be sniffed from the page source. Track numbers, for instance, could be inferred from the order in which track names are presented.
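To illustrate the track-number idea, here is a minimal sketch in Python (chosen only because it is compact; the same logic would carry over to PHP). It assumes the track names have already been scraped from the page in display order. The function name and the JSON field names are made up for this example and are not an agreed format.

```python
import json


def tags_from_track_order(track_names, album=None, artist=None, genre=None):
    """Build a JSON string of tags, numbering tracks by their order on the page.

    The field names ("album", "tracks", "number", ...) are hypothetical;
    a real plugin would define its own format.
    """
    tracks = [
        {"number": position, "title": title.strip()}
        for position, title in enumerate(track_names, start=1)
    ]
    return json.dumps(
        {"album": album, "artist": artist, "genre": genre, "tracks": tracks},
        indent=2,
    )


if __name__ == "__main__":
    # Placeholder titles, not taken from any real page.
    print(tags_from_track_order(["First Track", "Second Track "], album="Example"))
```

Anything the page does not reveal simply stays null, so the caller can tell "not found" apart from an empty value.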
My only problem is that I would be doing it in PHP (I know -- I should have learned Perl). So I'm wondering whether any plugin authors would be willing to collaborate on this (I do the PHP side and you do the Perl side). I'm thinking it could be written as a general plugin that others could extend to other sites, such as last.fm, simply by writing a script in Perl, PHP, or any other scripting language that can be executed from Perl.
In pseudocode, the plugin would look something like this:
Create right-click option to get tags from URL
Capture URL from user
Parse URL for domain name (e.g. archive.org)
For each script type and while found flag is false
--check script directory for parsing script (e.g. archivedotorg.php)
--if found
----execute script
----compare values from returned JSON object to existing tags and generate preview for user
----prompt user to apply changes or cancel
----set found flag to TRUE
If found flag is false
--inform user that no script was found for the domain of the provided URL
Each domain script would be expected to return a JSON object per specifications from the plugin author.
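The lookup-and-run part of the steps above could be sketched roughly as follows, again in Python purely for illustration. Scripts are keyed on the host name, as the archivedotorg.php example suggests. Everything here is an assumption rather than a spec: the extension-to-interpreter table, the function names, passing the URL as the only command-line argument, and the script printing a flat JSON object of tag names to stdout.

```python
import json
import subprocess
import sys
from pathlib import Path
from urllib.parse import urlparse

# Hypothetical mapping from script extension to the command that runs it.
INTERPRETERS = {".php": ["php"], ".pl": ["perl"], ".py": [sys.executable]}


def script_stem(url):
    """Turn 'http://archive.org/details/1-001' into 'archivedotorg'."""
    host = (urlparse(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[len("www."):]
    return host.replace(".", "dot")


def find_script(url, script_dir):
    """Return (command, path) for the first matching script, or None."""
    stem = script_stem(url)
    if not stem:
        return None
    for ext, command in INTERPRETERS.items():
        candidate = Path(script_dir) / (stem + ext)
        if candidate.is_file():
            return command, candidate
    return None


def fetch_tags(url, script_dir):
    """Run the site script with the URL as its argument and parse its JSON output."""
    found = find_script(url, script_dir)
    if found is None:
        return None  # caller tells the user that no script exists for this site
    command, path = found
    result = subprocess.run(
        command + [str(path), url],
        capture_output=True, text=True, check=True, timeout=60,
    )
    return json.loads(result.stdout)


def preview_changes(existing, fetched):
    """Map each tag that would change to an (old value, new value) pair."""
    return {
        key: (existing.get(key), value)
        for key, value in fetched.items()
        if existing.get(key) != value
    }
```

The preview dictionary is what would be shown to the user before the apply/cancel prompt; an empty result means the existing tags already match the page.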
I realize that I'm new here and therefore not a known quantity, so I would be willing to write the PHP side first.
EDIT: I just want to point out that if this functionality were combined with the autoclustering that I have suggested/asked for here:
http://forums.musicbrainz.org/viewtopic.php?id=3632
the script to pull tags from archive.org could potentially run without prompting the user, since there is a good chance that the name of the directory the files live in can be used to find the associated page on archive.org (see an archive.org URL for what I mean: http://archive.org/details/corpid005).
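As a small sketch of that directory-name guess (Python again, with a hypothetical function name), assuming the download folder is still named after the archive.org identifier:

```python
from pathlib import Path
from urllib.parse import quote


def guess_archive_url(directory):
    """Guess the archive.org details page from the album directory's name."""
    identifier = Path(directory).name
    if not identifier:
        return None  # nothing usable, e.g. the filesystem root
    return "http://archive.org/details/" + quote(identifier)
```

The guess would still need to be checked against the page actually returned, since a renamed folder would point at the wrong item or at nothing at all.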