Removing Duplicate Records from Dataset/DataTable
Posted by Viral Sarvaiya on September 27, 2010
Hello friends….
Here I demonstrate how to remove duplicate rows from a DataTable (or from a table in a DataSet).
A DataTable containing duplicates looks like this:
Name       Company
Viral      BNF Tech
Sandeep    Gateway Tech
Dharmik    Om info
Malhar     BNF Tech
Viral      BNF Tech
Dharmik    Om info
Viral      BNF Tech
Sandeep    Gateway Tech
The following function removes the duplicate rows from the DataTable. It compares the values in the column named by Col and keeps the first row found for each distinct value:
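// Requires: using System.Collections; using System.Data;
public DataTable DuplicateRowRemove(DataTable dt, string Col)
{
    Hashtable hTable = new Hashtable();
    ArrayList ArrDupli = new ArrayList();

    // Collect every row whose value in column Col has already been seen.
    foreach (DataRow r in dt.Rows)
    {
        if (hTable.Contains(r[Col]))
            ArrDupli.Add(r);
        else
            hTable.Add(r[Col], string.Empty);
    }

    // Remove the duplicates in a second pass, because rows cannot be
    // removed from dt.Rows while it is being enumerated.
    foreach (DataRow R in ArrDupli)
        dt.Rows.Remove(R);

    return dt;
}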
After removing the duplicate rows (here with Col set to "Name"), the DataTable looks like this:
Name       Company
Viral      BNF Tech
Sandeep    Gateway Tech
Dharmik    Om info
Malhar     BNF Tech
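For readers who want to try this out, here is a minimal, self-contained sketch of the same idea applied to the sample data above. It is a variation, not the function from this post: it assumes a generic HashSet<object> in place of the Hashtable, and the names DedupDemo and RemoveDuplicateRows are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class DedupDemo
{
    // Hypothetical helper: removes every row whose value in column `col`
    // was already seen, so the first occurrence of each value is kept.
    public static DataTable RemoveDuplicateRows(DataTable dt, string col)
    {
        var seen = new HashSet<object>();

        // dt.Select() returns an array snapshot of the rows, so rows can be
        // removed from dt.Rows while we loop over the snapshot.
        foreach (DataRow row in dt.Select())
        {
            // HashSet.Add returns false when the value is already present.
            if (!seen.Add(row[col]))
                dt.Rows.Remove(row);
        }
        return dt;
    }

    public static void Main()
    {
        var dt = new DataTable();
        dt.Columns.Add("Name", typeof(string));
        dt.Columns.Add("Company", typeof(string));
        dt.Rows.Add("Viral", "BNF Tech");
        dt.Rows.Add("Sandeep", "Gateway Tech");
        dt.Rows.Add("Dharmik", "Om info");
        dt.Rows.Add("Malhar", "BNF Tech");
        dt.Rows.Add("Viral", "BNF Tech");
        dt.Rows.Add("Dharmik", "Om info");
        dt.Rows.Add("Viral", "BNF Tech");
        dt.Rows.Add("Sandeep", "Gateway Tech");

        RemoveDuplicateRows(dt, "Name");

        // Four rows remain: Viral, Sandeep, Dharmik, Malhar.
        foreach (DataRow r in dt.Rows)
            Console.WriteLine($"{r["Name"]}\t{r["Company"]}");
    }
}
```

Note that the comparison is on a single column only. Deduplicating on "Company" instead of "Name" would also drop the Malhar row, because "BNF Tech" has already been seen by then.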
Thank you.



Ramani Sandeep said
good one… useful in many cases….
Parag Shukla said
Nice, really useful.
This is also possible in SQL Server using a Common Table Expression and the ROW_NUMBER() function.
Good one.
Rushit Shukla said
Good one dear …..
sushil jain said
hi
I read data from an xls file into DataTable 1.
Now I have to check whether each row of DataTable 1 is already present in the database or not.
If yes, the row should be added to another table, DataTable 2,
and if no, it should be saved to the database.
All duplicate values that are already present in the database should end up in DataTable 2.
“If you have any idea, please tell me exactly at my email id”
viralsarvaiya said
Hi sushil jain,
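you can use the following code to compare two DataTables:
public static DataTable CompareTwoDataTable(DataTable dtXML, DataTable dtDatabase)
{
    // Merge the database rows into the imported table.
    dtXML.Merge(dtDatabase);
    // Note: GetChanges() returns only rows whose RowState is not Unchanged,
    // and returns null when there are none, so check the result before use.
    DataTable dtDuplicate = dtDatabase.GetChanges();
    return dtDuplicate;
}
Hope this will be useful to you.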
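As an alternative to the Merge approach, the question above can also be handled with an explicit key lookup. The sketch below is only an illustration under stated assumptions: both tables share a column that identifies a record (here a made-up "Name" column), and the names SplitDemo and FindExisting are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class SplitDemo
{
    // Hypothetical helper: returns a new table holding the rows of `imported`
    // whose value in `keyCol` already occurs in `existing`.
    public static DataTable FindExisting(DataTable imported, DataTable existing, string keyCol)
    {
        // Remember every key that is already in the database table.
        var known = new HashSet<object>();
        foreach (DataRow row in existing.Rows)
            known.Add(row[keyCol]);

        // Clone() copies the schema only, not the rows.
        DataTable duplicates = imported.Clone();
        foreach (DataRow row in imported.Rows)
        {
            if (known.Contains(row[keyCol]))
                duplicates.ImportRow(row);
            // Rows that are not known would be saved to the database here.
        }
        return duplicates;
    }

    public static void Main()
    {
        var fromXls = new DataTable();
        fromXls.Columns.Add("Name", typeof(string));
        fromXls.Rows.Add("Viral");
        fromXls.Rows.Add("Sandeep");
        fromXls.Rows.Add("Malhar");

        var fromDb = new DataTable();
        fromDb.Columns.Add("Name", typeof(string));
        fromDb.Rows.Add("Sandeep");

        DataTable dup = FindExisting(fromXls, fromDb, "Name");
        foreach (DataRow r in dup.Rows)
            Console.WriteLine(r["Name"]); // only Sandeep is in both tables
    }
}
```

The key values must have the same .NET type in both tables for the lookup to match. A number read from a spreadsheet as double, for example, will not equal the same number stored as int.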
Nick Hanshaw said
That is a nice code example. However, fewer passes over the data are generally cheaper, and that matters in this scenario since there could be a good number of records. By starting at the end of the table and iterating backwards, duplicate rows can be removed right away without the need for a second loop.
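public DataTable DuplicateRowRemove(DataTable dt, string Col)
{
    Hashtable hTable = new Hashtable();
    // Walk backwards so RemoveAt(i) does not shift the rows still to be visited.
    // Note: this keeps the last occurrence of each value, not the first.
    for (int i = dt.Rows.Count - 1; i >= 0; i--)
    {
        DataRow r = dt.Rows[i];
        if (hTable.Contains(r[Col]))
            dt.Rows.RemoveAt(i);
        else
            hTable.Add(r[Col], string.Empty);
    }
    return dt;
}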
Viral Sarvaiya said
Hi,
thanks for the comment, but I have a doubt.
You declare
Hashtable hTable = new Hashtable();
which is right, but the if condition calls
hTable.Contains(r[Col])
before any values have been added to the table. Since it starts out empty, how can .Contains() find anything?
Nick Hanshaw said
The code was a bit hard to read as posted, so here it is again. As you can see, there is an else branch: if the hash table does not contain the value, the value is added. So if the loop comes across that value again, the row will be removed.
public DataTable DuplicateRowRemove(DataTable dt, string Col)
{
    Hashtable hTable = new Hashtable();
    for (int i = dt.Rows.Count - 1; i >= 0; i--)
    {
        DataRow r = dt.Rows[i];
        if (hTable.Contains(r[Col]))
            dt.Rows.RemoveAt(i);                 // value seen before: drop this row
        else
            hTable.Add(r[Col], string.Empty);    // first time seen: remember it
    }
    return dt;
}
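One practical difference between the two-loop version in the post and the backward loop is which copy of a duplicate survives. The small sketch below illustrates this with made-up data: the "Id" column and the names BackwardDemo and RemoveBackward are assumptions added only to tell the rows apart.

```csharp
using System;
using System.Collections;
using System.Data;

public static class BackwardDemo
{
    // Same backward loop as in the comment above, under a hypothetical name.
    public static DataTable RemoveBackward(DataTable dt, string col)
    {
        Hashtable seen = new Hashtable();
        for (int i = dt.Rows.Count - 1; i >= 0; i--)
        {
            object key = dt.Rows[i][col];
            if (seen.Contains(key))
                dt.Rows.RemoveAt(i);          // an equal value exists further down
            else
                seen.Add(key, string.Empty);  // first time seen from the end
        }
        return dt;
    }

    public static void Main()
    {
        var dt = new DataTable();
        dt.Columns.Add("Id", typeof(int));
        dt.Columns.Add("Name", typeof(string));
        dt.Rows.Add(1, "Viral");
        dt.Rows.Add(2, "Sandeep");
        dt.Rows.Add(3, "Viral");

        RemoveBackward(dt, "Name");

        // Row 1 is removed and row 3 is kept, so this prints
        // "2 Sandeep" and then "3 Viral".
        foreach (DataRow r in dt.Rows)
            Console.WriteLine($"{r["Id"]} {r["Name"]}");
    }
}
```

If the rows are complete duplicates, as in the sample table of the post, the remaining rows hold the same values either way, although their order can differ. The choice of which copy survives matters only when the other columns differ.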