We maintain a user database in our environment. Every week, a CSV file is automatically placed in a specific folder. An automated script then reads the file and uploads the user information to the database. The problem is that, every once in a while, duplicate rows end up in the CSV file, and each duplicate creates an extra entry for the same user in the database.
To overcome this, we decided to write a script that removes duplicate rows (rows whose values match another row in the file). This can be achieved by using the Import-Csv and Sort-Object cmdlets to remove duplicates from a CSV file.
After the contents of the CSV file are sorted with Sort-Object, you can use the -Unique switch to return only the unique rows from the file.
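To see the effect of -Unique in isolation, here is a minimal sketch that works on in-memory sample data instead of a file. The names and e-mail addresses are made up for illustration, and ConvertFrom-Csv stands in for Import-Csv so that nothing has to exist on disk.

```powershell
# Hypothetical sample data: the third row repeats the first one.
$csvText = @'
Name,Email
Alice,alice@example.com
Bob,bob@example.com
Alice,alice@example.com
'@

# ConvertFrom-Csv builds the same kind of objects that Import-Csv would.
$rows = @($csvText | ConvertFrom-Csv)

# Sort on every property (*) and keep one copy of each distinct row.
$uniqueRows = @($rows | Sort-Object -Property * -Unique)

"Input rows: $($rows.Count), unique rows: $($uniqueRows.Count)"
$uniqueRows | Format-Table
```

Two things are worth keeping in mind: the output comes back sorted, so the original row order is not preserved, and Sort-Object compares strings case-insensitively unless you add -CaseSensitive.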
Example
#######################################################################
#Project : How to remove duplicate rows in a CSV using PowerShell
#Developer : Thiyagu S (dotnet-helpers.com)
#Tools : PowerShell 5.1.15063.1155
#E-Mail : mail2thiyaguji@gmail.com
######################################################################

#Getting the path of the CSV file
$inputCSVPath = 'C:\BLOG_2020\DEMO\UserDetails.csv'
#The Import-Csv cmdlet creates table-like custom objects from the items in CSV files
$inputCsv = Import-Csv $inputCSVPath | Sort-Object * -Unique
#The Export-Csv cmdlet creates a CSV file of the objects that you submit.
#Each object is a row that includes a comma-separated list of the object's property values.
$inputCsv | Export-Csv "C:\BLOG_2020\DEMO\UserDetails_Final.csv" -NoTypeInformation
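If the script runs unattended every week, it can help to know how many rows were dropped. The following is only a sketch of one way to wrap the same Import-Csv / Sort-Object / Export-Csv steps: the function name Remove-CsvDuplicateRow, its parameters, and the temporary demo files are all hypothetical and not part of the original script.

```powershell
# Hypothetical helper: the name and parameters are illustrative only.
function Remove-CsvDuplicateRow {
    param(
        [Parameter(Mandatory)] [string] $Path,
        [Parameter(Mandatory)] [string] $Destination
    )

    if (-not (Test-Path -LiteralPath $Path)) {
        throw "Input file not found: $Path"
    }

    # @() keeps the results as arrays even when there is only one row.
    $allRows    = @(Import-Csv -LiteralPath $Path)
    $uniqueRows = @($allRows | Sort-Object -Property * -Unique)

    $uniqueRows | Export-Csv -LiteralPath $Destination -NoTypeInformation

    # Return the number of duplicate rows that were removed.
    return $allRows.Count - $uniqueRows.Count
}

# Demo with throwaway files in the temp folder (assumed paths, made-up data).
$tempDir = [System.IO.Path]::GetTempPath()
$demoIn  = Join-Path $tempDir 'UserDetails_demo.csv'
$demoOut = Join-Path $tempDir 'UserDetails_demo_final.csv'

@'
Name,Email
Alice,alice@example.com
Bob,bob@example.com
Alice,alice@example.com
'@ | Set-Content -LiteralPath $demoIn

$removed = Remove-CsvDuplicateRow -Path $demoIn -Destination $demoOut
"Removed $removed duplicate row(s)"
```

Writing the result to a separate file, as the original script does, leaves the input untouched so the upload can be re-run if something goes wrong.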