Andro, that is much easier... I like to do things the hard way, I guess...
It boils down to whether the OP needs the data to stay in its original order or not... sorting would rearrange it. It also depends on whether he/she is using Linux or Windows.
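To illustrate the ordering point, here is a rough sketch assuming a Unix-like shell with sort and awk available. The file name and sample words are made up for the example. sort -u removes duplicates but reorders the lines, while the awk one-liner keeps the first occurrence of each line in its original position.

```shell
# Hypothetical sample data; the file name and contents are made up.
tmpdir=$(mktemp -d)
printf '%s\n' pear apple pear banana apple > "$tmpdir/list.txt"

# sort -u drops duplicates, but the result is in sorted order.
sorted=$(sort -u "$tmpdir/list.txt")

# awk prints a line only the first time it is seen, so the
# original order of the remaining lines is preserved.
in_place=$(awk '!seen[$0]++' "$tmpdir/list.txt")

printf 'sort -u:\n%s\n\nawk:\n%s\n' "$sorted" "$in_place"
rm -r "$tmpdir"
```

On this sample, the sort -u result starts with apple, while the awk result still starts with pear, as in the input.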
Then there is the newly raised issue of needing to know whether duplicates existed before they were removed... to do this with Andro's suggestion, you could use uniq's -d, -c and -u flags... see here:
http://www-128.ibm.com/developerwork...l-tiptex6.html
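As a quick sketch of those three flags (again assuming a Unix-like shell; the file name and sample words are made up), something along these lines would report duplicates before you remove them. Note that uniq only compares adjacent lines, so the input is sorted first.

```shell
# Hypothetical sample data; the file name and contents are made up.
tmpdir=$(mktemp -d)
printf '%s\n' pear apple pear banana apple > "$tmpdir/list.txt"

# -c prefixes each distinct line with the number of times it occurs.
counts=$(sort "$tmpdir/list.txt" | uniq -c)

# -d prints only the lines that occur more than once.
dups=$(sort "$tmpdir/list.txt" | uniq -d)

# -u prints only the lines that occur exactly once.
singles=$(sort "$tmpdir/list.txt" | uniq -u)

# A non-empty -d result means duplicates existed in the input.
if [ -n "$dups" ]; then
    echo "duplicates found:"
    echo "$dups"
fi
rm -r "$tmpdir"
```

Checking the -d output before running the de-duplication step is one way to record whether anything was actually duplicated.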
I saw the code below on
http://www.perl.com/doc/FMTEYEWTK/regexps.html
It may help if you are entertaining the idea of using Perl.
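#!/usr/bin/perl -00 -n
# -00 reads the input one paragraph at a time; -n loops over each paragraph.
# The regex matches a word followed by one or more repeats of itself, ignoring case.
while ( /\b(\w+)(\s+\1)+\b/gi ) {
    print "dup $1 at paragraph $.\n";
}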
This now yields:
dup at paragraph 10
dup at paragraph 33