The barrier for merging during migration was quite high, because of the problems with release events not matching on one or more of the discs of a set, plus the added complexity from bonus discs. If I recall correctly, the rule was:
merge to NGS release IF
1) all discs conform to old DiscNumberStyle and/or BonusDiscStyle
2) EXACTLY matching release events can be found on ALL discs (country, barcode, catalog number etc)
3) there are "part of set" relationships linking all of the discs
This has meant that things can be orphaned if (among other reasons)
a) they missed some of the "part of set" links (as this relationship had only been around a couple of years in MB, a lot of discs, especially VA ones, missed these links)
b) one or more of the discs had a slightly wrong release event
c) discs missing from DB completely
d) discs in a big set missing the middle of the set
e) bonus disc links where it was indeterminate how to merge them
Remember that the migration was not just trying to merge. It was also trying to split pre-NGS releases by release event into what a release is in NGS.
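To make the three conditions above concrete, here is a rough sketch of the check as I remember it. This is not the actual migration script: the `Disc` and `ReleaseEvent` classes, the `can_merge` function and the `DISC_STYLE` regex are all made up for illustration, the regex only approximates the old DiscNumberStyle/BonusDiscStyle titles, and I am assuming "exactly matching release events on all discs" means at least one event shared by every disc.

```python
import re
from dataclasses import dataclass

# Hypothetical approximation of the old title styles, e.g.
# "Album (disc 1)", "Album (disc 2: Subtitle)", "Album (bonus disc)".
DISC_STYLE = re.compile(r"^.+ \((?:disc \d+|bonus disc)(?:: .+)?\)$")


@dataclass(frozen=True)
class ReleaseEvent:
    country: str
    barcode: str
    catalog_number: str


@dataclass(frozen=True)
class Disc:
    id: str
    title: str
    events: frozenset  # frozenset of ReleaseEvent


def can_merge(discs, links):
    """Return True if the discs pass all three conditions.

    discs: list of Disc, in set order.
    links: set of (earlier_id, later_id) "part of set" pairs.
    """
    if len(discs) < 2:
        return False
    # 1) every title conforms to the old disc/bonus disc style
    if not all(DISC_STYLE.match(d.title) for d in discs):
        return False
    # 2) at least one release event is exactly the same on ALL discs
    shared = frozenset.intersection(*(d.events for d in discs))
    if not shared:
        return False
    # 3) "part of set" relationships link each disc to the next one
    return all((a.id, b.id) in links for a, b in zip(discs, discs[1:]))
```

With this sketch, a single missing link or a single differing barcode is enough to leave the whole set unmerged, which is where the orphaned discs came from.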
Lixobix wrote: Ah, so most of the releases on that list are lone discs, whereas those that were complete were merged during NGS, correct? But if that's the case then how come there are still release groups with disc 1 and disc 2 in? e.g: http://musicbrainz.org/release-group/51 … b49fe11724
Well, in that case it would have been incorrect to merge, because they look like UK CD singles. So the script did exactly what it should, in noticing the lack of "part of set" relationships as well as the differences in cat #/barcode. Those singles should have "disc 1" and "disc 2" removed from their names, as discussed above, I guess :)
Lixobix wrote: would have thought a bot could see "title: disc 1" and "title: disc2" with matching catalog and barcode, then merge. Why is it not possible?
Well, this is kinda what it did, but a title and release event match was not deemed enough to automatically merge something. The "part of set" relationship barrier was added to ensure someone had actually looked at this data and said that it was correct. Please don't underestimate the amount of wrong data and mess in parts of the "Various Artists" landscape, especially where multi-disc releases were released slightly differently across different countries. I've come across all types of mess there.
Bots tend to make changes based on confirming data with other sources. Merging is a destructive edit: it is difficult to undo, and the data is quite tough to confirm against other sources. Someone could definitely write something to suggest merges and improve the reports we have, to make it easier to get through the backlog, but I would be quite against a bot actually entering the edits.
Our edit queue is already completely unmanageable: most edits go through unnoticed and un-voted, including human-proposed merges.
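As an illustration of the "suggest, don't edit" idea, something along these lines could feed a report. It is only a sketch under my own assumptions: `suggest_merges` and `DISC_SUFFIX` are hypothetical names, the input is a plain list of (title, catalog number, barcode) tuples rather than anything from the real database, and it only returns candidate groups for a human to check.

```python
import re
from collections import defaultdict

# Hypothetical pattern for titles such as "Album (disc 1)" or "Album (disc2: Extras)".
DISC_SUFFIX = re.compile(r"^(.+?)\s*\(disc\s*(\d+)(?::.*)?\)$", re.IGNORECASE)


def suggest_merges(releases):
    """Group disc-numbered releases that share title, catalog number and barcode.

    releases: iterable of (title, catalog_number, barcode) tuples.
    Returns a list of candidate groups (titles in disc order). Nothing is
    edited or merged here; the output is meant for human review.
    """
    groups = defaultdict(dict)
    for title, catalog_number, barcode in releases:
        match = DISC_SUFFIX.match(title)
        if match:
            base, number = match.group(1), int(match.group(2))
            groups[(base.lower(), catalog_number, barcode)][number] = title
    # Only groups with at least two different disc numbers are worth a look.
    return [
        [discs[n] for n in sorted(discs)]
        for discs in groups.values()
        if len(discs) >= 2
    ]


if __name__ == "__main__":
    example = [
        ("Hits (disc 1)", "CAT1", "123"),
        ("Hits (disc 2)", "CAT1", "123"),
        ("Single (disc 1)", "CDS1", "111"),  # CD singles: different cat #/barcode
        ("Single (disc 2)", "CDS2", "222"),
    ]
    for group in suggest_merges(example):
        print(" + ".join(group))
```

In the example data the two "Single" discs are not suggested, because their catalog numbers and barcodes differ, which matches the CD singles case above. Anything that is suggested would still need someone to confirm it before a merge edit is entered.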