Merge Delimited Files on a Common Key

Many times I need to merge two separate datasets that have a common key column – for example:

dataset 1:

ID, Name, Weight, Height

dataset 2:

ID, Address1, Address2, Phone1, Phone2

I can’t just concat the data together, instead I need the merge to happen around the ID key, so that each ID only has a single row in the result.  So, of course I wrote a powershell script to do this for me.  Take a look at it below.  I called it “Merge-TsvFiles”, but it takes a delimiter as one of its paramters, so you could use this for CSV or whatever.

If you have any thoughts as to how this can be further improved code-wise or speed-wise, please comment!

function Merge-TsvFiles {
  param(
    [string] $file1 = $(throw "file1 required."),
    [string] $file2 = $(throw "file2 required."),
    [string] $delimiter = "`t",
    [string] $key = "ProfileID"
  )

  $data1 = import-csv -path $file1 -delimiter $delimiter
  $data2 = import-csv -path $file2 -delimiter $delimiter

  #write-host "$data1.Length total data rows, and $data2.Length total merge data rows."
  $i = 0

  #grab the new column names
  $newcols = $data2 | gm -MemberType NoteProperty

  Foreach ($datarow in $data1)
  {
    $i++
    $data2 | where-object {$_.$($key) -eq $data1.$($key)} | %{
      Foreach ($col in $newcols) {
        if ($col.Name -ne $key) {
          Add-Member -inputObject $datarow -name $col.Name -value $_.$($col.Name) -MemberType NoteProperty;
        }
      }
    }
  Write-Progress -activity "Merging data" -status "Merged $i of $($data1.Length) rows..." -percentComplete (($i / $data1.length) * 100) 
}

write-host "Writing output..."
$data1 | export-csv "output.txt" -delimiter "`t"
}

Technorati Tags: ,,,

Print | posted on Tuesday, October 18, 2011 1:17 PM

Feedback

# re: Merge Delimited Files on a Common Key

Left by Service Κινητων at 2/20/2012 8:30 AM
Gravatar Nice post, always something interesting on here :)

# re: Merge Delimited Files on a Common Key

Left by rental sewa mobil bogor at 2/29/2012 6:21 PM
Gravatar
prosesor dual-core 1.5 Ghz Snapdragon S3 CPU dengan layar SAMOLED ukuran 4 inchi, kamera belakang 5MP dengan kemampuan merekam video 720P, kamera depan 1.3MP, dan akan beroperasi dengan Android 2.3 Gingerbread. Mengenai konektivitas, ponsel rental sewa mobil bogor ini sudah mendukung 42Mbps HSPA+. Ponsel ini juga akan diberikan aplikasi bawaan seperti DropBox, Evernote, Square, TripIt, Camscanner dan LinkedIn untuk membantu produktivitas pengguna. Sayangnya belum ada informasi mengenai ketersediaan Galaxy S Blaze 4G ini di Indonesia

# re: Merge Delimited Files on a Common Key

Left by event organizer di jakarta bogor at 3/3/2012 12:14 AM
Gravatar Padahal sebenarnya dalam SE-97/PJ/2011 hanyalah mengatur mengenai perlakuan mengenai pembentukan dan pemupukan dana cadangan premi bagi Wajib Pajak Asuransi Jiwa yang dapat dikurangkan dari penghasilan bruto dalam menghitung PPh Terutang. Dalam SE-97/PJ/2011 ini hanya menegaskan kembali bahwa cadangan premi asuransi jiwa dalam bentuk investasi Unit Link event organizer di jakarta bogor depok bekasi, adalah tidak dapat dikurangkan dari penghasilan bruto dalam menghitung PPh terutang karena karena Penghasilan yang diterima oleh jenis asuransi jiwa Unit Link ini telah dikenakan PPh yang bersifat final dan/atau yang bukan merupakan objek pajak.

# re: Merge Delimited Files on a Common Key

Left by Budy at 3/16/2012 5:24 AM
Gravatar mungkin hal diatas tidak saya pahami. If you search a notebook, you should visit my blog.

Notebooks Info

# re: Merge Delimited Files on a Common Key

Left by maquinas de coser at 3/20/2012 7:59 AM
Gravatar good post, and interesting for me!

Your comment:





 
 

Copyright © Lance Robinson

Design by Bartosz Brzezinski

Design by Phil Haack Based On A Design By Bartosz Brzezinski