Having finished up a pile of work-work, I can now return to the interesting suggestions raised by my recent word frequency entry. Simon suggested using a custom class that implements IComparable. That was new to me, so I gave it a try. It wasn't immediately obvious to me what to do, but with a little poking around, I found a number of examples, including some in the .NET QuickStarts. Who knew?
To jump ahead a moment, you can see the result of the effort first. I now have two word frequency pages, the original that uses a DataTable and a new one that uses the custom class and array sorting. Try them out:
Word frequency with data table
Word frequency with custom class implementing IComparable
The version with a data table is no longer interesting from an implementation POV, but I was curious about timings.[1]
The custom class is a simple class with a textbook implementation of CompareTo. The mildly fun twist is that I coded the CompareTo method to sort by two values (first frequency, descending, then word, ascending).
Eric had some other suggestions. One was to use a Hashtable, but I couldn't figure out how to make that work: putting instances of the custom class into a Hashtable worked, but Hashtable does not seem to support a Sort method. He's also got an implementation using generics, which are new in Whidbey and thus not possible yet in 1.1. If you're curious, though, have a look at his second comment.
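For what it's worth, one way the Hashtable idea could work in 1.1 is to use the Hashtable only for counting, then copy its entries into an ArrayList, which does have a Sort method. The following is a minimal sketch under that assumption, not necessarily what Eric had in mind. The names WordCount and CountWords and the sample sentence are made up for illustration, and WordCount is just a pared-down stand-in for the custom class.

```vb
Imports System
Imports System.Collections

Module HashtableSketch

    ' Hypothetical, pared-down stand-in for the custom class
    Class WordCount
        Implements IComparable

        Public Word As String
        Public Frequency As Integer

        Public Sub New(ByVal word As String, ByVal freq As Integer)
            Me.Word = word
            Me.Frequency = freq
        End Sub

        Public Function CompareTo(ByVal obj As Object) As Integer _
            Implements IComparable.CompareTo
            Dim other As WordCount = CType(obj, WordCount)
            ' Higher frequency first, then alphabetical by word
            Dim result As Integer = other.Frequency.CompareTo(Me.Frequency)
            If result = 0 Then
                result = String.CompareOrdinal(Me.Word, other.Word)
            End If
            Return result
        End Function
    End Class

    ' Hypothetical helper: count with a Hashtable, sort with an ArrayList
    Function CountWords(ByVal words() As String) As ArrayList
        Dim counts As New Hashtable()
        For Each w As String In words
            If counts.ContainsKey(w) Then
                counts(w) = CInt(counts(w)) + 1
            Else
                counts(w) = 1
            End If
        Next

        ' Hashtable has no Sort, so copy each key/value pair into an ArrayList
        Dim result As New ArrayList()
        For Each entry As DictionaryEntry In counts
            result.Add(New WordCount(CStr(entry.Key), CInt(entry.Value)))
        Next
        result.Sort() ' uses WordCount.CompareTo
        Return result
    End Function

    Sub Main()
        Dim words() As String = "the cat and the hat and the bat".Split(" "c)
        For Each wc As WordCount In CountWords(words)
            Console.WriteLine(wc.Frequency & " " & wc.Word)
        Next
    End Sub

End Module
```

Run as a console program, this should list "the" with 3, "and" with 2, then "bat", "cat" and "hat" with 1 each. The appeal of this approach is that the word array would not need to be sorted before counting. Whether that makes it faster on real text is something I haven't measured.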
Anyway, here's the code:
Sub Button1_Click(sender As Object, e As EventArgs)
Dim startTime As DateTime = DateTime.Now
Dim endTime As DateTime
Dim i As Integer
Dim s As String
Dim punctuation() As Char = {".", ",", "!", "=", "-", _
"_", ";", ":", "(", ")", "[", "]", """", "?", "/", "\", _
"@", "#", "$", "%", "&", "*", "=", "<", ">", "|", _
"~", "‘", "`"}
Dim t As String = TextBox1.Text
t = t.ToLower()
t = t.Trim()
For i = 0 to punctuation.Length - 1
t = t.Replace(punctuation(i), " ")
Next i
t = t.Replace(vbcrlf, " ")
t = t.Replace(vbtab, " ")
' Dumb old so-called smart quotes, grrr
t = t.Replace(Chr(145), " ")
t = t.Replace(Chr(146), "'") ' smart apostrophe
t = t.Replace(Chr(147), " ")
t = t.Replace(Chr(148), " ")
t = t.Replace(Chr(151), " ")
t = t.Replace(vbcrlf, " ")
t = t.Replace(vbtab, " ")
While t.IndexOf("  ") > -1
t = t.Replace("  ", " ") ' collapse double spaces into single spaces
End While
' Create array of all words
Dim wordArray() As String
wordArray = t.split
Array.Sort(wordArray)
Dim WordsByCount As New ArrayList()
' Walk through word array, accumulating count of (sorted)
' words. When we run out of words, write word and accumulator
' to new array of custom WordFrequency objects.
Dim arrayLength AS Integer = wordArray.Length - 1
Dim accumulator As Integer = 0
Dim nextWord As String = ""
Dim currentWord As String = ""
For i = 0 to arrayLength
nextWord = wordArray(i)
If nextWord = currentWord Then
accumulator += 1
Else
If i > 0 Then
WordsByCount.Add(New WordFrequency(currentWord, accumulator))
End If
currentWord = nextWord
accumulator = 1
End If
Next
WordsByCount.Add(New WordFrequency(currentWord, accumulator))
' Sort method invokes custom comparison method of objects in array
WordsByCount.Sort()
' Display results
s = "<table cellpadding=4>"
For Each wf As WordFrequency in WordsByCount
s &= "<tr>"
s &= "<td>" & wf.Frequency & "</td>"
s &= "<td>" & wf.Word & "</td>"
s &= "</" & "tr>"
Next
s &= "</table>"
Literal1.Text = s
labelWordCount.Text = wordarray.length
endTime = DateTime.Now
Dim timeDiff As TimeSpan = endTime.Subtract(startTime)
Dim totalSeconds As Double = (timeDiff.TotalMilliSeconds / 1000)
labelTime.text = totalSeconds.ToString("g")
End Sub
Class WordFrequency: Implements IComparable
Dim WordValue As String
Dim FrequencyValue As Integer
Public Sub New()
End Sub
Public Sub New(word As String, freq As Integer)
Me.Word = word
Me.Frequency = freq
End Sub
Public Property Word As String
Get
Return WordValue
End Get
Set (value As String)
WordValue = value
End Set
End Property
Public Property Frequency As Integer
Get
Return FrequencyValue
End Get
Set (value As Integer)
FrequencyValue = value
End Set
End Property
Public Function CompareTo (ByVal ObjectToCompare as Object) As Integer _
Implements IComparable.CompareTo
Dim WordFrequencyObject As WordFrequency = _
CType(ObjectToCompare, WordFrequency)
CompareTo = WordFrequencyObject.Frequency - Me.Frequency
If CompareTo = 0 Then
' Word frequencies are the same, so now compare words
If WordFrequencyObject.Word < Me.Word Then
CompareTo = 1
ElseIf WordFrequencyObject.Word > Me.Word Then
CompareTo = -1
Else
CompareTo = 0
End If
End If
End Function
End Class