Monday, May 13, 2019

.Net DataTable.Merge

The DataTable class has a Merge property that lets you merge two data tables. It sounds like an useful method, but how exactly does it resolve row state conflicts? Does it do what you want?

It seems the algorithm is asymmetrical - it matters which data table is being merged into which.

If the two data tables have the same schema then there is only one parameter that matters. Its cryptically called "preserve changes" and the Microsoft documentation doesn't explain what it does very well.

One thing that's important is to define primary keys on your data tables. If you don't do that then the merge simply appends all the rows from one data table to the other data table.

I've created a demo that shows how it works. You can alter the value of "preserve changes", optionally specify primary keys, and switch the two data tables. The only tricky part is you can't display deleted row in a data grid so each of the data tables has a display version which is suitable to be bound to a data grid.

Start a new WPF project called DatatableMerge in C#. I'm using Visual Studio 2019 and targeting framework 4.7.

There are three checkboxes that control options, then two data grids side-by-side, and a results grid below. The code will merge the right hand data table into the left hand data table and display the results below.

The XAML looks like this.


<Window x:Class="DatatableMerge.MainWindow"
        xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        xmlns:d="http://schemas.microsoft.com/expression/blend/2008"
        xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
        xmlns:local="clr-namespace:DatatableMerge"
        mc:Ignorable="d"
        DataContext="{Binding RelativeSource={RelativeSource Self}}"
        Title="Merge right datatable into left datatable" Height="450" Width="800">
    <Grid>
        <Grid.RowDefinitions>
            <RowDefinition Height="auto"/>
            <RowDefinition Height="*"/>
            <RowDefinition Height="*"/>
        </Grid.RowDefinitions>
        <Grid.ColumnDefinitions>
            <ColumnDefinition Width="*"/>
            <ColumnDefinition Width="*"/>
            <ColumnDefinition Width="2*"/>
        </Grid.ColumnDefinitions>

        <CheckBox Grid.Row="0" Grid.Column="0" Content="Preserve Changes" IsChecked="{Binding preserveChanges}"/>
        <CheckBox Grid.Row="0" Grid.Column="1" Content="With Primary Key" IsChecked="{Binding withPrimaryKey}"/>
        <CheckBox Grid.Row="0" Grid.Column="2" Content="Switch tables" IsChecked="{Binding switchTables}"/>
        <DataGrid Grid.Row="1" Grid.Column="0" Grid.ColumnSpan="2" ItemsSource="{Binding dtDisplay1}" AutoGenerateColumns="False" IsReadOnly="True">
            <DataGrid.Columns>
                <DataGridTextColumn Binding="{Binding RowState}"/>
                <DataGridTextColumn Binding="{Binding ID}"/>
                <DataGridTextColumn Binding="{Binding Code}"/>
                <DataGridTextColumn Binding="{Binding Description}" Width="*"/>
            </DataGrid.Columns>
        </DataGrid>
        <DataGrid Grid.Row="1" Grid.Column="2" ItemsSource="{Binding dtDisplay2}" AutoGenerateColumns="False" IsReadOnly="True">
            <DataGrid.Columns>
                <DataGridTextColumn Binding="{Binding RowState}"/>
                <DataGridTextColumn Binding="{Binding ID}"/>
                <DataGridTextColumn Binding="{Binding Code}"/>
                <DataGridTextColumn Binding="{Binding Description}" Width="*"/>
            </DataGrid.Columns>
        </DataGrid>
        <DataGrid Grid.Row="2" Grid.Column="0" Grid.ColumnSpan="3" ItemsSource="{Binding dtDisplayResult}" AutoGenerateColumns="False" IsReadOnly="True">
            <DataGrid.Columns>
                <DataGridTextColumn Binding="{Binding RowState}"/>
                <DataGridTextColumn Binding="{Binding ID}"/>
                <DataGridTextColumn Binding="{Binding Code}"/>
                <DataGridTextColumn Binding="{Binding Description}" Width="*"/>
            </DataGrid.Columns>
        </DataGrid>
    </Grid>
</Window>

The only interesting thing in the code is that I've overridden the default DataViews to include deleted records. Normally they are filtered out.

using System.ComponentModel;
using System.Data;
using System.Windows;


namespace DatatableMerge
{
 
    public partial class MainWindow : Window, INotifyPropertyChanged
    {
        public DataTable dt1;
        public DataTable dt2;
        public DataTable dtResult;

        public DataView dvDisplay1 { get; set; }
        public DataView dvDisplay2 { get; set; }
        public DataView dvDisplayResult { get; set; }

        private bool _preserveChanges;
        public bool preserveChanges
        {
            get { return _preserveChanges; }
            set
            {
                _preserveChanges = value;
                PropChanged("preserveChanges");
                DoMerge();
            }
        }

        private bool _switchTables;
        public bool switchTables
        {
            get { return _switchTables; }
            set
            {
                _switchTables = value;
                PropChanged("switchTables");
                Refresh();
            }
        }

        private bool _withPrimaryKey;
        public bool withPrimaryKey
        {
            get { return _withPrimaryKey; }
            set
            {
                _withPrimaryKey = value;
                PropChanged("withPrimaryKey");
                Refresh();
            }
        }
        public MainWindow()
        {
            Refresh();
            InitializeComponent();
        }

        private void Refresh()
        {
            dt1 = new DataTable();
            dt2 = new DataTable();
            dtResult = new DataTable();

            dt1.Columns.Add(new DataColumn("ID", typeof(int)));
            dt1.Columns.Add(new DataColumn("Code", typeof(string)));
            dt1.Columns.Add(new DataColumn("Description", typeof(string)));
            dt1.TableName = "dt1";

            dt2 = dt1.Clone();
            dt2.TableName = "dt2";

            dtResult = dt1.Clone();
            dtResult.TableName = "dtResult";

            if (!switchTables)
            {
                dt1 = PopulateSet1();
                dt2 = PopulateSet2();
            }
            else
            {
                dt1 = PopulateSet2();
                dt2 = PopulateSet1();
            }

            dvDisplay1 = new DataView(dt1, "", "", DataViewRowState.CurrentRows | DataViewRowState.Deleted);
            dvDisplay2 = new DataView(dt2, "", "", DataViewRowState.CurrentRows | DataViewRowState.Deleted);

            if (withPrimaryKey)
            {
                dt1.PrimaryKey = new DataColumn[] { dt1.Columns[0] };
                dt2.PrimaryKey = new DataColumn[] { dt2.Columns[0] };
                dtResult.PrimaryKey = new DataColumn[] { dtResult.Columns[0] };
            }

            PropChanged("dvDisplay1");
            PropChanged("dvDisplay2");
            DoMerge();
        }

        private DataTable PopulateSet1()
        {
            DataTable dt = dt1.Clone();
            dt.Rows.Add(1, "Apples", "Fresh rosy apples just off the tree");
            dt.Rows.Add(2, "Bananas", "Get your hands on my loverly bananas");
            dt.AcceptChanges();
            dt.Rows[1].Delete();
            dt.Rows.Add(3, "Cherries", "I've got the sweetest bunches of cherries ever");
            return dt;
        }

        private DataTable PopulateSet2()
        {
            DataTable dt = dt2.Clone();
            dt.Rows.Add(1, "Apples", "Fresh rosy apples just off the tree");
            dt.Rows.Add(2, "Bananas", "Get your hands on my loverly bananas");
            dt.AcceptChanges();
            dt.Rows[1]["Description"] = "Get your hands on my lovely bananas";
            dt.Rows.Add(4, "Dates", "Date shakes, date candy, dates for every day");
            return dt;
        }

        private void DoMerge()
        {
            dtResult = dt1.Copy();
            dtResult.Merge(dt2, preserveChanges);
            dvDisplayResult = new DataView(dtResult, "", "", DataViewRowState.CurrentRows | DataViewRowState.Deleted);
            PropChanged("dvDisplayResult");
        }

        public event PropertyChangedEventHandler PropertyChanged;
        public void PropChanged(string name)
        {
            if (PropertyChanged != null)
                PropertyChanged(this, new PropertyChangedEventArgs(name));
        }
    }
}


If we run the application we see the default behavior which is to return all rows in both tables.

No primary key - dumb merge
Now check "With Primary Key" and see an intelligent merge. Without "Preserve Changes" the data table on the right has precedence ie. dt1.Merge(dt2, false)

Bananas comes from the right
Now check "Preserve Changes" and see how the row state of Bananas changes. You can see the data table on the left now has precedence ie. dt1.Merge(dt2, true). I don't know why the row state of Apples changes.

With "Preserve Changes" the data table on the left has precedence