We are using SharePoint 2010 Enterprise.
We are trying to remove duplicate documents from our document library. Duplicates will be identified by a few column values that match across documents in the library, such as:
- Customer ID
- Document Date
- Document Description
This removal process needs to be done carefully and, unfortunately, the actual deletion of each document has to be done manually, once a user determines it's a true duplicate.
So I need to generate a list of all the potential duplicate documents based on the criteria above. A user can then go through that list (sorted by the same criteria), decide whether the documents are in fact duplicates, and delete the extra copies, leaving just one.
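To make the matching rule concrete, here is a minimal Python sketch of the grouping I'm after. It is only an illustration of the logic, not something that runs against SharePoint: the field names (`customer_id`, `document_date`, `description`, `name`) and the sample records are hypothetical, and it assumes exact matches on all three columns.

```python
from collections import defaultdict

def find_potential_duplicates(documents):
    """Return documents that share the same (customer ID, date, description)
    with at least one other document, sorted by those three values."""
    groups = defaultdict(list)
    for doc in documents:
        # Hypothetical field names standing in for the library columns.
        key = (doc["customer_id"], doc["document_date"], doc["description"])
        groups[key].append(doc)

    potential_duplicates = []
    for key in sorted(groups):
        # Only keys shared by two or more documents are candidates.
        if len(groups[key]) > 1:
            potential_duplicates.extend(groups[key])
    return potential_duplicates

# Made-up sample records, for illustration only.
sample = [
    {"name": "a.pdf", "customer_id": "C001", "document_date": "2012-03-01", "description": "Invoice"},
    {"name": "b.pdf", "customer_id": "C001", "document_date": "2012-03-01", "description": "Invoice"},
    {"name": "c.pdf", "customer_id": "C002", "document_date": "2012-03-05", "description": "Contract"},
]

for doc in find_potential_duplicates(sample):
    print(doc["name"], doc["customer_id"], doc["document_date"], doc["description"])
```

The user would then review each group in that output and delete all but one copy by hand.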
So far:
I have a SQL query that returns the duplicates from our document library, but I haven't found a way to apply it within our SharePoint site. I tried an external list, but since the results are returned as an actual SharePoint list, the user can't view or delete any documents from the document library while going through the list. I also looked at the Content Query Web Part, but from what I could tell I couldn't set up the query to compare column values among documents; I could only add a filter value for a column.
Any thoughts on how I can accomplish this? Hopefully using an OOB web part!
Thanks in advance!
zc26