How to create an index of an entire text, i.e. of all words in a Word (2013) document?

In order to find (missing) keywords, count frequencies, etc., I need to make an index in a Word document of that document's entire text, covering every word from one letter upward.

At first I tried to work with a concordance file (using search and replace to get every word on a separate line), but that is very impractical because of all the duplicate entries and the fact that the entries are case-sensitive, meaning they all have to be copied again with a capital letter.

Here and there on the internet I found some references to macros, but they do not work (at least not in Word 2013). I also found a reference on this site saying that no macro would be needed (in Word 2007), but following the link I found no answer.

I hope someone can help.

A macro containing the following code will work in all versions of Word, including 2013:

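Sub WordFrequency()

         Dim SingleWord As String           'Raw word pulled from doc
         Const maxwords = 9000              'Maximum unique words allowed
         Dim Words(maxwords) As String      'Array to hold unique words
         Dim Freq(maxwords) As Long         'Frequency counter for unique words
         Dim WordNum As Integer             'Number of unique words
         Dim ByFreq As Boolean              'Flag for sorting order
         Dim ttlwds As Long                 'Words still to be processed
         Dim Totalwords As Long             'Total words in the document
         Dim Excludes As String             'Words to be excluded
         Dim Ans As String                  'Answer to the sort-order prompt
         Dim Found As Boolean               'Temporary flag
         Dim j As Long, k As Long, l As Long 'Loop counters
         Dim Temp As Long                   'Swap variable for frequencies
         Dim tword As String                'Swap variable for words
         Dim aword As Range                 'Current word in the document
         Dim tmpName As String              'Full name of the attached template
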

         ' Set up excluded words
'         Excludes = "[the][a][of][is][to][for][this][that][by][be][and][are]"
         Excludes = InputBox$("Enter words that you wish to exclude, surrounding each word with [ ].", "Excluded Words", "")

         ' Find out how to sort
         ByFreq = True
         Ans = InputBox$("Sort by WORD or by FREQ?", "Sort order", "FREQ")
         If Ans = "" Then Exit Sub          'Cancelled: leave without touching the document
         If UCase(Ans) = "WORD" Then
             ByFreq = False
         End If

         Selection.HomeKey Unit:=wdStory
         System.Cursor = wdCursorWait
         WordNum = 0
         ttlwds = ActiveDocument.Words.Count     'Countdown for the status bar (also counts punctuation)
         Totalwords = ActiveDocument.BuiltInDocumentProperties(wdPropertyWords)  'Word's own word count

         ' Control the repeat
         For Each aword In ActiveDocument.Words
             SingleWord = Trim(aword)
             If Not (Left$(SingleWord, 1) Like "[A-Za-z]") Then SingleWord = ""  'Skip tokens not starting with a letter (a plain > "z" test would also drop words such as "zebra")
             If InStr(Excludes, "[" & SingleWord & "]") Then SingleWord = "" 'On exclude list?
             If Len(SingleWord) > 0 Then
                 Found = False
                 For j = 1 To WordNum
                     If Words(j) = SingleWord Then
                         Freq(j) = Freq(j) + 1
                         Found = True
                         Exit For
                     End If
                 Next j
                 If Not Found Then
                     WordNum = WordNum + 1
                     Words(WordNum) = SingleWord
                     Freq(WordNum) = 1
                 End If
                 If WordNum > maxwords - 1 Then
                     j = MsgBox("The maximum array size has been exceeded. Increase maxwords.", vbOKOnly)
                     Exit For
                 End If
             End If
             ttlwds = ttlwds - 1
             StatusBar = "Remaining: " & ttlwds & "     Unique: " & WordNum
         Next aword

 

         ' Now sort it by word or by frequency, as chosen above
         For j = 1 To WordNum - 1
             k = j
             For l = j + 1 To WordNum
                 If (Not ByFreq And Words(l) < Words(k)) Or (ByFreq And Freq(l) > Freq(k)) Then k = l
             Next l
             If k <> j Then
                 tword = Words(j)
                 Words(j) = Words(k)
                 Words(k) = tword
                 Temp = Freq(j)
                 Freq(j) = Freq(k)
                 Freq(k) = Temp
             End If
             StatusBar = "Sorting: " & WordNum - j
         Next j

 

         ' Now write out the results
         tmpName = ActiveDocument.AttachedTemplate.FullName
         Documents.Add Template:=tmpName, NewTemplate:=False
         Selection.ParagraphFormat.TabStops.ClearAll
         With Selection
             For j = 1 To WordNum
                 .TypeText Text:=Words(j) & vbTab & Trim(Str(Freq(j))) & vbCrLf
             Next j
         End With
         ActiveDocument.Range.Select
         Selection.ConvertToTable
         Selection.Collapse wdCollapseStart
         With ActiveDocument.Tables(1)
             .Rows.Add BeforeRow:=Selection.Rows(1)           'Header row
             .Cell(1, 1).Range.InsertBefore "Word"
             .Cell(1, 2).Range.InsertBefore "Occurrences"
             .Range.ParagraphFormat.Alignment = wdAlignParagraphCenter
             .Rows.Add
             .Cell(.Rows.Count, 1).Range.InsertBefore "Total words in Document"
             .Cell(.Rows.Count, 2).Range.InsertBefore Totalwords
             .Rows.Add
             .Cell(.Rows.Count, 1).Range.InsertBefore "Number of different words in Document"
             .Cell(.Rows.Count, 2).Range.InsertBefore Trim(Str(WordNum))
         End With
         System.Cursor = wdCursorNormal
         ' j = MsgBox("There were " & Trim(Str(WordNum)) & " different words ", vbOKOnly, "Finished")
         Selection.HomeKey wdStory

 

End Sub
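
Note that the macro compares words exactly as they appear in the text, so "The" and "the" end up on separate rows. If case-insensitive counts are wanted, a small helper along the following lines could be used. This is only a sketch: NormalizeWord is a hypothetical name, not part of the macro above, and it assumes the default Option Compare Binary setting.

```vba
' Hypothetical helper (an assumption, not part of the original macro):
' trims and lower-cases a raw word so that "The" and "the" share one entry.
Function NormalizeWord(ByVal RawWord As String) As String
    Dim Cleaned As String
    Cleaned = LCase$(Trim$(RawWord))
    ' Discard tokens that do not start with a letter a-z
    If Not (Cleaned Like "[a-z]*") Then Cleaned = ""
    NormalizeWord = Cleaned
End Function
```

To try it, the line `SingleWord = Trim(aword)` in the loop would become `SingleWord = NormalizeWord(aword.Text)`. Words typed into the exclusion prompt would then need to be entered in lower case, since the lookup in Excludes is still case-sensitive.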

Hope this helps,
Doug Robbins - MVP Office Apps & Services (Word)
dougrobbinsmvp@gmail.com

AMAZING!

Thank you so much, Doug. I couldn't believe it could not be done, and now it can.

Never would have guessed it would be called Word Frequency. Hope this reply solves that.

One thing would interest me though: could I have found this solution somewhere on the internet? I really DID search.

Thanks again,

JMTB.

Cheers
Paul Edstein
(Fmr MS MVP - Word)

I'm afraid I was a bit too enthusiastic yesterday, because I overlooked that the macro, beautiful as it is, only lists the frequencies and not the page numbers or line numbers. These requirements seem to have disappeared from my original post, but I'm afraid I do need them as well, if only because so many words can have different meanings in various contexts.

Again, I will greatly appreciate help.

Among the links I posted, this one includes the page references: http://www.msofficeforums.com/word/11875-how-can-i-count-multiple-usage-same.html#post31360

Do you really need line #s?
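
For readers who only want to see the general idea, here is a minimal sketch of how page numbers might be collected per word. It is not the macro from the linked post. The names WordPageIndex and PageList are made up for illustration, and it assumes Word on Windows, because it relies on the Scripting.Dictionary object.

```vba
Sub WordPageIndex()
    ' Sketch only: lists each distinct word with the pages it appears on.
    Dim PageList As Object       'Scripting.Dictionary: word -> "1,3,7"
    Dim aWord As Range           'Current word in the document
    Dim Key As String            'Trimmed word text
    Dim PageNo As String         'Page number of the current word
    Dim Entry As Variant         'Loop variable for the output

    Set PageList = CreateObject("Scripting.Dictionary")
    PageList.CompareMode = vbTextCompare    'Treat "The" and "the" as one key

    For Each aWord In ActiveDocument.Words
        Key = Trim$(aWord.Text)
        If Key Like "[A-Za-z]*" Then        'Skip punctuation and numbers
            PageNo = CStr(aWord.Information(wdActiveEndPageNumber))
            If Not PageList.Exists(Key) Then
                PageList.Add Key, PageNo
            ElseIf InStr("," & PageList(Key) & ",", "," & PageNo & ",") = 0 Then
                'Append the page only if it is not already listed
                PageList(Key) = PageList(Key) & "," & PageNo
            End If
        End If
    Next aWord

    ' Write one "word <tab> pages" line per entry into a new document
    Documents.Add
    For Each Entry In PageList.Keys
        Selection.TypeText Entry & vbTab & PageList(Entry) & vbCr
    Next Entry
End Sub
```

If line numbers are needed as well, `aWord.Information(wdFirstCharacterLineNumber)` could be recorded in the same way. Calling Information for every word is slow, so expect a long run on large documents.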

Cheers
Paul Edstein
(Fmr MS MVP - Word)

Question Info

Last updated January 21, 2023