Friday, July 16, 2010

WPF Datagrid – Load and Performance

This post is not about performance numbers for the WPF DataGrid; it is about what you should be aware of to make it perform well. I was not motivated enough to use a profiler to show realistic numbers, so I used the Stopwatch class wherever applicable. The post does not cover techniques for handling large amounts of data, such as paging and how to implement it. It focuses on making the DataGrid itself work with large data.

Here are the C# classes that generate the data I want to load the DataGrid with.

using System;
using System.Collections.Generic;
using System.Text;

public class DataItem
{
    public long Id { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public long Age { get; set; }
    public string City { get; set; }
    public string Designation { get; set; }
    public string Department { get; set; }
}

public static class DataGenerator
{
    private static readonly Random rand = new Random();
    private static int _next = 1;

    // Lazily yields 'count' items; all string fields of one item share a single random string.
    public static IEnumerable<DataItem> GetData(int count)
    {
        for (var i = 0; i < count; i++)
        {
            string nextRandomString = NextRandomString(30);
            yield return new DataItem
            {
                Age = rand.Next(100),
                City = nextRandomString,
                Department = nextRandomString,
                Designation = nextRandomString,
                FirstName = nextRandomString,
                LastName = nextRandomString,
                Id = _next++
            };
        }
    }

    // Decodes random bytes as UTF-8, so the result is filler text rather than readable words.
    private static string NextRandomString(int size)
    {
        var bytes = new byte[size];
        rand.NextBytes(bytes);
        return Encoding.UTF8.GetString(bytes);
    }
}

My ViewModel has been defined as shown below.

public class MainWindowViewModel : INotifyPropertyChanged
{
    public event PropertyChangedEventHandler PropertyChanged;

    private void Notify(string propName)
    {
        if (PropertyChanged != null)
            PropertyChanged(this, new PropertyChangedEventArgs(propName));
    }

    private Dispatcher _current;

    public MainWindowViewModel()
    {
        _current = Dispatcher.CurrentDispatcher;
        // Create the collection before DataSize queues the first load.
        _data = new ObservableCollection<DataItem>();
        DataSize = 50;
        EnableGrid = true;
    }

    private int _dataSize;
    public int DataSize
    {
        get { return _dataSize; }
        set
        {
            // Load or remove only the difference from the current size.
            LoadData(value - _dataSize);
            _dataSize = value;
            Notify("DataSize");
        }
    }

    private ObservableCollection<DataItem> _data;
    public ObservableCollection<DataItem> Data
    {
        get { return _data; }
        set
        {
            _data = value;
            Notify("Data");
        }
    }

    private bool _enableGrid;
    public bool EnableGrid
    {
        get { return _enableGrid; }
        set { _enableGrid = value; Notify("EnableGrid"); }
    }

    private void LoadData(int more)
    {
        Action act = () =>
        {
            EnableGrid = false;
            if (more > 0)
            {
                foreach (var item in DataGenerator.GetData(more))
                    _data.Add(item);
            }
            else
            {
                int itemsToRemove = -1 * more;
                // Always remove the current last item; the count shrinks on each pass.
                for (var i = 0; i < itemsToRemove; i++)
                    _data.RemoveAt(_data.Count - 1);
            }
            EnableGrid = true;
        };
        //act.BeginInvoke(null, null);
        _current.BeginInvoke(act, DispatcherPriority.ApplicationIdle);
    }
}

As you can see, changing DataSize loads (or removes) data to match. I currently use a slider to change the load size. This part is easy; the fun stuff starts in the XAML.


To apply this "Data" to my WPF DataGrid, I set the view model instance as the DataContext of my window. The code-behind for the window is shown below.

public partial class MainWindow : Window
{
    private MainWindowViewModel vm;

    public MainWindow()
    {
        InitializeComponent();
        vm = new MainWindowViewModel();
        this.Loaded += (s, e) => DataContext = vm;
    }
}

Let's start with the following XAML.


<StackPanel>
    <Slider Maximum="100" Minimum="50" Value="{Binding DataSize}" />
    <Label Grid.Row="1" Content="{Binding DataSize}" />
    <DataGrid Grid.Row="2" IsEnabled="{Binding EnableGrid}" ItemsSource="{Binding Data}">
    </DataGrid>
</StackPanel>

Now build the application and run it. The result appears as shown below.


image


As you can see above, I loaded 100 items, yet there is no scrollbar. Let's change the slider's Maximum property from 100 to 1000 and rerun the application, then drag the slider to 1000 in one go. Even with just 1000 items, the grid does not respond well.


image


Let us look at the memory usage.


image


This is pretty heavy for an application with just 1000 items loaded. So what is using all this memory? You could hook up a memory profiler or use WinDbg to look at the memory contents, but since I already know what is causing the issue, I am not going to do that.


The issue is that the DataGrid has been placed inside a StackPanel. When stacking vertically, a StackPanel measures its children with unbounded height and gives them all the space they ask for. So the DataGrid creates all 1000 rows (every UI element needed for each column of each row!) and renders them. The DataGrid's virtualization never comes into play.


So let us make a simple change and put the DataGrid inside a Grid. The XAML is shown below.

<Grid>
    <Grid.RowDefinitions>
        <RowDefinition Height="30"/>
        <RowDefinition Height="30"/>
        <RowDefinition Height="*"/>
    </Grid.RowDefinitions>
    <Slider Value="{Binding DataSize}" Minimum="50" Maximum="1000"/>
    <Label Content="{Binding DataSize}" Grid.Row="1"/>
    <DataGrid ItemsSource="{Binding Data}" Grid.Row="2" IsEnabled="{Binding EnableGrid}">
    </DataGrid>
</Grid>

When I run the application now and load 1000 items, the performance of the same application (no code changes, apart from the XAML change I just described) is a lot better than it was. I also get nice scrollbars.


image

Let us look at the memory usage.


image


Wow! A tenfold difference. So far this is a rehash of my previous post on WPF virtualization; the same rules apply to the DataGrid as well. Read that post if you are interested.


So what else am I talking about here?



  • If you look at the ViewModel code, you will see that I disable the grid while I load data and enable it again once I am done. I have not really tested whether this helps here, but I did use the same technique in HTML pages where a large number of items in a listbox all had to be selected, and it was very useful there.
  • In all the screenshots I showed, the grid is sorted. So as the data changes, the grid has to keep re-sorting the data and displaying it in the order you chose. This, I believe, is a big overhead. Consider removing the sort from the DataGrid before you change the data, if that is a viable option and does not impact the end user. I have not tested this, but the same should apply to groupings as well (which most of the time cannot simply be removed).
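
The second point, removing the sort before changing the data, could look roughly like the sketch below. It is not code from this application: SortSuspender and RunWithoutSorting are hypothetical names, and the sketch assumes the grid's sorting lives in the SortDescriptions of an ICollectionView (for a DataGrid, dg.Items is one such view). The sort is restored once the change is done, so the data is re-sorted once instead of after every added item.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Linq;

public static class SortSuspender
{
    // Hypothetical helper, not part of WPF: removes the view's sort,
    // runs the change, then restores the sort.
    public static void RunWithoutSorting(ICollectionView view, Action change)
    {
        // Remember the current sort so it can be restored afterwards.
        List<SortDescription> saved = view.SortDescriptions.ToList();
        view.SortDescriptions.Clear();
        try
        {
            change();
        }
        finally
        {
            // DeferRefresh postpones the refresh until the using block ends,
            // so restoring several sort descriptions re-sorts only once.
            using (view.DeferRefresh())
            {
                foreach (SortDescription sd in saved)
                    view.SortDescriptions.Add(sd);
            }
        }
    }
}
```

The delegate has to make its changes synchronously. In this post LoadData queues its work on the dispatcher, so a call like this would belong inside that queued action rather than around the DataSize setter.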

The simple step of placing the DataGrid in a panel such as a Grid, instead of a StackPanel, makes a big difference. The WPF DataGrid performs just fine as long as you keep the viewable region of the grid small.


Shown below is my grid with almost 1 million data items loaded. The footprint is pretty small compared to the amount of data loaded. This means that either WPF controls are memory intensive or WPF UI virtualization is a boon.


Impact of sorting on the DataGrid



  • With no sorting applied on the DataGrid, it took almost 20 seconds to load 1 million items into my collection.
  • With sorting enabled, loading half of those items took over 2 minutes. Loading all of them had taken over 5 minutes when I killed the application because it was a pain. This matters because the application keeps the CPU busy re-sorting as the data changes. Since I add items directly to an observable collection, a sort may be triggered for every item added.
  • Instead, consider sorting on the backend rather than in the DataGrid.
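
Sorting on the backend can be as simple as ordering the items once, before they ever reach the collection the grid binds to. The sketch below is only an illustration: Person is a hypothetical stand-in for DataItem, PreSortedLoader is a hypothetical helper, and the sort key (LastName, then Id) is just an example.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.Linq;

// Hypothetical stand-in for the post's DataItem class.
public class Person
{
    public long Id { get; set; }
    public string LastName { get; set; }
}

public static class PreSortedLoader
{
    // Sorts once in plain code and returns an already ordered collection
    // that can be assigned to the property the grid binds to.
    public static ObservableCollection<Person> Load(IEnumerable<Person> source)
    {
        // OrderBy sorts by last name; ThenBy breaks ties by Id.
        List<Person> ordered = source
            .OrderBy(p => p.LastName, StringComparer.Ordinal)
            .ThenBy(p => p.Id)
            .ToList();
        return new ObservableCollection<Person>(ordered);
    }
}
```

With no sort applied on the grid itself, the rows appear in the order of the collection, so adding items does not trigger a re-sort. Items added later have to be inserted at the right position by the caller.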

image


I can still scroll the application, in spite of the grid being bound to 1 million items, as long as virtualization is properly utilized.


Using BeginInit() and EndInit() on the DataGrid


I changed the ViewModel's LoadData() so that BeginInit() is called on the DataGrid when it starts loading the data and EndInit() when it is done. This helped quite a lot. Loading 1 million items (without any sort applied on the grid) took only around 8 seconds, compared to the 18 seconds it took earlier. Unfortunately I did not spend enough time with a profiler to show real numbers.


The changed code-behind for the window is shown below. It refers to the DataGrid as dg, so the DataGrid element in the XAML needs x:Name="dg".

public partial class MainWindow : Window
{
    private MainWindowViewModel vm;

    public MainWindow()
    {
        InitializeComponent();
        vm = new MainWindowViewModel();
        this.Loaded += (s, e) => DataContext = vm;
        vm.DataChangeStarted += () => dg.BeginInit();
        vm.DataChangeCompleted += () => dg.EndInit();
    }
}

I also had to add the DataChangeStarted and DataChangeCompleted events to the ViewModel class. The changed portion of the ViewModel class is shown below.

public event Action DataChangeStarted;
public event Action DataChangeCompleted;

private void LoadData(int more)
{
    Action act = () =>
    {
        // Before the data starts to change, raise DataChangeStarted.
        if (DataChangeStarted != null) DataChangeStarted();
        var sw = Stopwatch.StartNew();
        EnableGrid = false;
        if (more > 0)
        {
            foreach (var item in DataGenerator.GetData(more))
                _data.Add(item);
        }
        else
        {
            int itemsToRemove = -1 * more;
            // Always remove the current last item; the count shrinks on each pass.
            for (var i = 0; i < itemsToRemove; i++)
                _data.RemoveAt(_data.Count - 1);
        }
        EnableGrid = true;
        sw.Stop();
        Debug.WriteLine(sw.ElapsedMilliseconds);
        // Once the data is done changing, raise DataChangeCompleted.
        if (DataChangeCompleted != null) DataChangeCompleted();
    };
    //act.BeginInvoke(null, null);
    _current.BeginInvoke(act, DispatcherPriority.ApplicationIdle);
}

You can try this out and notice the performance difference yourself.


If sorting is applied on the DataGrid, performance still suffers in spite of the trick above. The overhead of sorting outweighs the gain we get from calling BeginInit and EndInit. Then again, maybe having 1 million records is not all that realistic.

2 comments:

Anonymous said...

I got the following error message in Step 3b:

not enough arguments for constructor TemplateEngine: (sourceDirectories: Traversable[java.io.File], mode: String)org.fusesource.scalate.TemplateEngine

please help me solving this problem. Thanks.

Abbas said...

Hi, good article. I would like to apply your approach to my DataGrid to improve performance. By disabling sorting, do you mean setting CanUserSortColumns to false? Thanks for your reply.