Introduction
Around the time .NET Framework 1.0 was released, I won a project that largely consisted of parsing and processing large amounts of text. The text came in the form of huge, semi-structured files (representing a kind of legislative document). The application's main jobs were to:
- Parse the files and display the parsed tree-like structure to the user.
- Let the user edit the hierarchical structure and the contents of the individual nodes.
- Save the (potentially modified) hierarchical structure to an SQL database.
I analyzed the requirements and then made probably the riskiest decision of my professional career: I would do it in .NET! I was a .NET "greenhorn" at the time and had NO practical experience programming in .NET. Nevertheless, I took that route and now I have to say that... Well, I'm getting ahead of myself :-). Let's explore things in order.
I designed the application to consist of three major parts:
- The parsing engine, which parses an input file and produces an in-memory tree of Block objects (I used the System.Text.RegularExpressions classes for this). The Block tree was wrapped in a BlockSet object (more on that later).
- The user interface controls and windows for displaying and modifying the Block tree and the individual Block objects.
- The DB class, responsible for saving BlockSets to and restoring them from the database.
The application needed to provide a very rich user interface with various editors for the tree structure as well as for the individual Block objects and their properties (screenshot - 278KB!). To support this seamlessly and in an extensible manner, I employed a variant of the Observer design pattern:
When a Block property is modified, the object raises two events: one before and one after the modification. For example, the Block.Title property has the TitleChanging and TitleChanged events.
In addition, the BlockSet wrapper handles the events of each of the underlying Block objects and "forwards" them by raising its own events, with the Block passed as the event argument.
This way, the UI windows subscribe only to the BlockSet events, so they don't have to subscribe to the events of the individual Blocks (and they don't have to monitor the tree for added or deleted Blocks; the BlockSet class handles that as well).
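The forwarding idea can be sketched in a few lines. This is only an illustration under simplifying assumptions, not the original application's code: Block is reduced to a single Title property, and the names TitleEventArgs, BlockTitleChanging, BlockTitleChanged, Add, Remove and the Demo module are hypothetical.

```vb
Imports System
Imports System.Collections

' Hypothetical, simplified Block with one observable property.
Public Class Block
    Public Class TitleEventArgs
        Inherits EventArgs
        Public ReadOnly OldTitle As String
        Public ReadOnly NewTitle As String
        Public Cancel As Boolean

        Public Sub New(ByVal oldValue As String, ByVal newValue As String)
            OldTitle = oldValue
            NewTitle = newValue
        End Sub
    End Class

    Public Event TitleChanging(ByVal sender As Object, ByVal e As TitleEventArgs)
    Public Event TitleChanged(ByVal sender As Object, ByVal e As TitleEventArgs)

    Private _title As String = String.Empty

    Public Property Title() As String
        Get
            Return _title
        End Get
        Set(ByVal Value As String)
            Dim args As New TitleEventArgs(_title, Value)
            RaiseEvent TitleChanging(Me, args)
            If Not args.Cancel Then
                _title = Value
                RaiseEvent TitleChanged(Me, args)
            End If
        End Set
    End Property
End Class

' Hypothetical wrapper: subscribes to every Block it owns and re-raises
' the Block's events as its own, passing the Block along.
Public Class BlockSet
    Private _blocks As New ArrayList()

    Public Event BlockTitleChanging(ByVal source As Block, ByVal e As Block.TitleEventArgs)
    Public Event BlockTitleChanged(ByVal source As Block, ByVal e As Block.TitleEventArgs)

    Public Sub Add(ByVal item As Block)
        _blocks.Add(item)
        AddHandler item.TitleChanging, AddressOf ForwardTitleChanging
        AddHandler item.TitleChanged, AddressOf ForwardTitleChanged
    End Sub

    Public Sub Remove(ByVal item As Block)
        RemoveHandler item.TitleChanging, AddressOf ForwardTitleChanging
        RemoveHandler item.TitleChanged, AddressOf ForwardTitleChanged
        _blocks.Remove(item)
    End Sub

    Private Sub ForwardTitleChanging(ByVal sender As Object, ByVal e As Block.TitleEventArgs)
        RaiseEvent BlockTitleChanging(CType(sender, Block), e)
    End Sub

    Private Sub ForwardTitleChanged(ByVal sender As Object, ByVal e As Block.TitleEventArgs)
        RaiseEvent BlockTitleChanged(CType(sender, Block), e)
    End Sub
End Class

Module Demo
    ' Counts the notifications that reached the "UI" handler below.
    Public NotificationCount As Integer

    Sub Main()
        Dim tree As New BlockSet()
        ' The "UI" subscribes once, to the BlockSet only.
        AddHandler tree.BlockTitleChanged, AddressOf OnBlockTitleChanged

        Dim b As New Block()
        tree.Add(b)
        b.Title = "Article 1"   ' forwarded by the BlockSet

        tree.Remove(b)
        b.Title = "Article 2"   ' no longer forwarded
    End Sub

    Private Sub OnBlockTitleChanged(ByVal source As Block, ByVal e As Block.TitleEventArgs)
        NotificationCount += 1
        Console.WriteLine("Title changed: '{0}' -> '{1}'", e.OldTitle, e.NewTitle)
    End Sub
End Module
```

A subscriber written this way never touches an individual Block's events, so adding or removing Blocks requires no changes on the UI side. The price is one delegate per event per Block inside the BlockSet, which is exactly the kind of object count discussed next.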
The design looked nice, but what about the huge input files, which would result in LOTS of objects in memory? I was lucky enough to have all the input files at my disposal, so I analyzed the biggest file and... well, I became a bit nervous. Processing that file would result in more than 8000 Block objects.
In addition, because each Block object declares 32 distinct events, that means an additional 32 event delegate objects per Block instance (256000).
In addition, 6 of the 16 Block properties were objects (reference types), so that's an additional 6 objects per Block in memory (48000).
In addition, when the Block tree is displayed in a TreeView control, that means an additional 8000 TreeNode objects in memory...
At this point I stopped counting the objects (I was well over 300000 already) and decided that a trial run was definitely in order.
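The tally above is easy to reproduce. The sketch below just multiplies out the figures quoted in the text; the module, function and parameter names are made up for illustration.

```vb
Imports System

Module ObjectCountEstimate
    ' Rough object count for a Block tree shown in a TreeView:
    ' the blocks themselves, one delegate per event, one object per
    ' reference-type property, and one TreeNode per block.
    Public Function EstimateObjectCount(ByVal blockCount As Integer, _
                                        ByVal eventsPerBlock As Integer, _
                                        ByVal referencePropertiesPerBlock As Integer) As Integer
        Dim delegateCount As Integer = blockCount * eventsPerBlock
        Dim propertyObjectCount As Integer = blockCount * referencePropertiesPerBlock
        Dim treeNodeCount As Integer = blockCount
        Return blockCount + delegateCount + propertyObjectCount + treeNodeCount
    End Function

    Sub Main()
        ' Figures from the text: 8000 blocks, 32 events, 6 reference-type properties.
        Console.WriteLine(EstimateObjectCount(8000, 32, 6))  ' prints 320000
    End Sub
End Module
```

That is 8000 + 256000 + 48000 + 8000 = 320000 objects, before counting anything else the application allocates.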
I created a small console application that allocated a user-specified number of objects, stored them in an ArrayList, added two event handlers to each of the allocated objects and, finally, forced the whole mess to be garbage collected.
Imports System
Imports System.Collections

Public Class BlockClient
    ' Allocates a user-specified number of Block objects, hooks two event
    ' handlers to each of them and then releases everything to the GC.
    Public Shared Sub Test()
        Dim Count As Integer = 1
        Do While Count > 0
            Console.Write("Number of blocks (0 to exit): ")
            Count = System.Convert.ToInt32(Console.ReadLine())
            ' Zero or a negative number ends the test.
            If Count <= 0 Then
                Exit Do
            End If

            Console.WriteLine("Creating ArrayList of {0:d} blocks {1:mm:ss}...", _
                Count, _
                DateTime.Now)
            Dim Blocks As New ArrayList(Count)
            Dim i As Integer

            Console.WriteLine("Creating {0:d} blocks {1:mm:ss}...", _
                Count, _
                DateTime.Now)
            For i = 0 To Count - 1
                Blocks.Add(New Block(String.Format("Block #{0:d}", i + 1)))
            Next

            Console.WriteLine("Adding {0:d} event handlers {1:mm:ss}...", _
                Count * 2, _
                DateTime.Now)
            For i = 0 To Count - 1
                AddHandler CType(Blocks(i), Block).OnCaptionChanging, _
                    AddressOf OnCaptionChanging
                AddHandler CType(Blocks(i), Block).OnCaptionChanged, _
                    AddressOf OnCaptionChanged
            Next

            ' Change a single block to show that the events really fire.
            Console.WriteLine("Changing block #{0:d} {1:mm:ss}...", _
                Count \ 2, _
                DateTime.Now)
            CType(Blocks(Count \ 2), Block).Caption = "New value"

            Console.WriteLine("Finished, press ENTER to GC.Collect()")
            Console.ReadLine()

            ' Drop the only reference to the blocks and force a collection.
            Blocks = Nothing
            GC.Collect()
            GC.WaitForPendingFinalizers()
        Loop
    End Sub

    Protected Shared Sub OnCaptionChanging( _
        ByVal sender As Object, ByVal e As Block.CaptionEventArgs)
        Console.WriteLine("Changing: {0} -> {1}", e.OldCaption, e.NewCaption)
    End Sub

    Protected Shared Sub OnCaptionChanged( _
        ByVal sender As Object, ByVal e As Block.CaptionEventArgs)
        Console.WriteLine("Changed: {0} -> {1}", e.OldCaption, e.NewCaption)
    End Sub
End Class
Public Class Block
    Private _caption As String
    Private _id As Integer
    Private Shared _maxId As Integer

    Public Sub New()
        Me.New(String.Empty)
    End Sub

    Public Sub New(ByVal caption As String)
        _caption = caption
        ' Give every block a unique, increasing id.
        _maxId += 1
        _id = _maxId
    End Sub

    ' Event data shared by the pre- and post-modification events.
    Public Class CaptionEventArgs
        Inherits EventArgs

        Private _oldCaption As String
        Private _newCaption As String
        Private _canceled As Boolean

        Public Sub New(ByVal oldCaption As String, ByVal newCaption As String)
            _oldCaption = oldCaption
            _newCaption = newCaption
        End Sub

        Public ReadOnly Property OldCaption() As String
            Get
                Return _oldCaption
            End Get
        End Property

        Public ReadOnly Property NewCaption() As String
            Get
                Return _newCaption
            End Get
        End Property

        ' A handler of the pre-modification event sets this to veto the change.
        Public Property Cancel() As Boolean
            Get
                Return _canceled
            End Get
            Set(ByVal Value As Boolean)
                _canceled = Value
            End Set
        End Property
    End Class

    Public Event OnCaptionChanging( _
        ByVal sender As Object, ByVal e As Block.CaptionEventArgs)
    Public Event OnCaptionChanged( _
        ByVal sender As Object, ByVal e As Block.CaptionEventArgs)

    Public Property Caption() As String
        Get
            Return _caption
        End Get
        Set(ByVal Value As String)
            Dim Args As New Block.CaptionEventArgs(_caption, Value)
            RaiseEvent OnCaptionChanging(Me, Args)
            ' Apply the change only if no handler canceled it.
            If Not Args.Cancel Then
                _caption = Value
                RaiseEvent OnCaptionChanged(Me, Args)
            End If
        End Set
    End Property
End Class
I started the application and, at the console prompt, entered 100000! On my machine (~850 MHz PII, ~500 MB RAM) the allocation took about a second, the event hookup another second, and the working set rose from 7 MB to 24 MB. The garbage collection was almost instantaneous.
I entered 500000: the allocation took 7 seconds, the event hookup 4 seconds, and the working set went to 81 MB. Once again, the collection was a snap.
Of course, the test application was very simplistic and didn't take into account the dynamics of a long-running application and their effect on memory management, but... for me it was sufficient!
I implemented the application according to the design and delivered it to the customer. Fifteen editors have now (June 2003) been using the application for more than a year, and only 7 non-critical bugs have been reported!
So what's the conclusion?
Here it is (IMHO):
.NET is the first environment where I can design truly object-oriented systems without worrying much about the physical characteristics of the execution environment. I can design elegant and easy-to-use object models without having to worry about circular references and reference counting (I can afford the cost of IDisposable in this case). I can design dynamic systems that are easily modifiable and extensible. I can design anything!
Well, I don't want to sound overly dramatic; I just want to tell you that I like .NET and that, for me, programming is fun again (unless I have to touch some old VB6 code :-)).
What about you?
© Palo Mraz, Sunday, June 29, 2003
PS: I've been preparing an article on a VB6 project whose design ended up with lots of objects in memory. I was the project lead and the project failed. I'm going to tell you honestly what happened; it should make for interesting reading, I hope. The article will be available at the end of this year (2003), to subscribers only. Not a subscriber yet? Never mind, it's easy and it's FREE!...