Macro to extract sentences from multiple files and arrange them in order

Hi

 

I would like to extract every sentence from each doc and place them in a new doc in order. By "order" I mean the following.

For example, these are the sentences in each doc:

 

Doc 1

 Sentence A. Sentence B. Sentence C ….

 

Doc 2

Sentence 1. Sentence 2. Sentence 3

 

Doc 3

Sentence !. Sentence @. Sentence # ..

 

And I would like to arrange them as below using a macro:

 

Doc New

Sentence A

Sentence 1

Sentence !

 

Sentence B

Sentence 2

Sentence @

 

Sentence C

Sentence 3

Sentence #

 

For every sentence in doc 1 there is a sentence in doc 2 to be placed right below it,

and for every sentence in doc 2 there is a sentence in doc 3 to be placed right below that,

and so on.

 

Here is how I worked, to help you better understand my intention.

It was originally a single doc "1" and I replicated it into multiple docs "2, 3, ...".

I wrote an explanatory sentence for each sentence of doc 1, not directly below the sentence in doc 1 but in its place in doc 2, and I deleted the actual sentence from doc 2 once its explanation was written, so only explanations are left in doc 2. The original content in doc 1 and the explanations in doc 2 are therefore completely separated but kept in perfect order. Now I want to match each sentence to its explanation.

 

I did this because it seemed easier to keep them separated and combine them later than to separate them after doing everything in doc 1.

 

By the way, all the sentences end with a period and they are kept in perfect order.

 

I have looked at several VBA extraction macros, but none of them could do the job, and modifying them is far beyond my ability.

 

Any help will be appreciated

Thanks!




How many documents are involved and how is their order identified?
How many sentences are involved?
When you say 'sentences' do you actually mean 'paragraphs'?
Graham Mayor (Microsoft Word MVP 2002-2019)
For more Word tips and downloads visit my web site
https://www.gmayor.com/Word_pages.htm


Q. How many documents are involved and how is their order identified?

It will be 3 docs for now, but it might become 4 at most.

The following documents need to be combined into a single new doc:

1. Doc1 containing the original sentences
2. Doc2 containing explanations for Doc 1
3. Doc3 containing explanations for Doc 1 (from a different author, with the whole sentence in a different color)
4. Doc4 containing explanations for Doc 1 (from a different author, with the whole sentence in a different color) - undecided.


Q. How many sentences are involved?

I am working on books: novels, self-development titles and so on,

so there are as many sentences as in a novel.

Say I am doing this for 1984 by George Orwell.

I couldn't count the number of sentences, but the program tells me that there are 103,990 words, so there might be about 10,000 sentences, assuming each sentence has 10 words or so.

Q. When you say 'sentences' do you actually mean 'paragraphs'?

I meant sentences. I tried it with paragraphs, but then it was really hard to match the content with the explanations and the other material that follows, so I decided to do it with sentences.

For each original sentence there will follow 2 to 3 sentences, each starting on a new line right below the original sentence (as shown above and below). It would be nice to have a blank line between each sentence group.

Doc New

Sentence A

Sentence 1

Sentence !

 

Sentence B

Sentence 2

Sentence @

 

Sentence C

Sentence 3

Sentence #


thank you for asking!!!  



I think that the following may work for you. I have only included three docs, but it would be simple enough to add a fourth. Change the paths and filenames as appropriate.

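Sub CombineDocs()
' Interleaves the sentences of three documents into a new document:
' sentence i of doc 1, then sentence i of doc 2, then sentence i of doc 3.
' Only the text is transferred, not the character formatting.
Const strDoc1 As String = "C:\path\filename1.docx"
Const strDoc2 As String = "C:\path\filename2.docx"
Const strDoc3 As String = "C:\path\filename3.docx"
Dim oDoc1 As Document
Dim oDoc2 As Document
Dim oDoc3 As Document
Dim oTarget As Document
Dim i As Long
Dim lngCount As Long
    Set oTarget = Documents.Add
    ' Open the sources hidden and keep them out of the recent files list
    Set oDoc1 = Documents.Open(FileName:=strDoc1, AddToRecentFiles:=False, Visible:=False)
    Set oDoc2 = Documents.Open(FileName:=strDoc2, AddToRecentFiles:=False, Visible:=False)
    Set oDoc3 = Documents.Open(FileName:=strDoc3, AddToRecentFiles:=False, Visible:=False)
    ' Stop at the shortest document so a missing sentence cannot raise an error
    lngCount = oDoc1.Sentences.Count
    If oDoc2.Sentences.Count < lngCount Then lngCount = oDoc2.Sentences.Count
    If oDoc3.Sentences.Count < lngCount Then lngCount = oDoc3.Sentences.Count
    For i = 1 To lngCount
        ' Strip paragraph marks so that each sentence ends up on its own line
        oTarget.Range.InsertAfter Replace(oDoc1.Sentences(i).Text, Chr(13), "") & vbCr
        oTarget.Range.InsertAfter Replace(oDoc2.Sentences(i).Text, Chr(13), "") & vbCr
        oTarget.Range.InsertAfter Replace(oDoc3.Sentences(i).Text, Chr(13), "") & vbCr
    Next i
    ' Close the sources without saving changes
    oDoc1.Close wdDoNotSaveChanges
    oDoc2.Close wdDoNotSaveChanges
    oDoc3.Close wdDoNotSaveChanges
    Set oDoc1 = Nothing
    Set oDoc2 = Nothing
    Set oDoc3 = Nothing
    Set oTarget = Nothing
End Sub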


http://www.gmayor.com/installing_macro.htm   
Graham Mayor (Microsoft Word MVP 2002-2019)
For more Word tips and downloads visit my web site
https://www.gmayor.com/Word_pages.htm


Many thanks!

It works!

I really appreciate your help!

But when the macro brings the sentences into the new file, it removes all the formatting of the sentences (such as colors and the black block thing; I don't know what to call it).

I forgot to mention that there will be formatting: some words highlighted, some words subscripted, and some words underlined in color.

Is there any way to bring the sentences over from each file exactly as they were in that file?

Thank you!


Word will have real difficulty doing this through VBA, as the VBA object model does not have a conventional understanding of what a sentence is. For example:

I went to Mr. Hill, who was speaking with Mrs. Smith about Christmas, New Year, etc., etc., ad nauseum.

would count as five sentences in VBA, though you and I might only regard it as one.

Cheers
Paul Edstein
(Fmr MS MVP - Word)
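
A quick way to see this for yourself is to list what Word's object model returns as sentences. The following is only a minimal sketch: it assumes the example text has been typed into the active document, and the macro name is made up for illustration. It prints each VBA 'sentence' to the Immediate window (Ctrl+G in the VBA editor) so you can compare the split with what you would regard as a sentence.

```vba
Sub ListVbaSentences()
    ' Prints every range that Word's object model treats as a sentence
    ' in the active document, one per line, numbered.
    Dim i As Long
    For i = 1 To ActiveDocument.Sentences.Count
        ' Remove paragraph marks so each item stays on one output line
        Debug.Print i & ": " & Replace(ActiveDocument.Sentences(i).Text, vbCr, "")
    Next i
End Sub
```

Running it on a page or two of the real book should show how often abbreviations and similar punctuation produce extra 'sentences'.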


I have to agree with Paul - and this was the main reason I asked if your 'sentences' were in fact 'paragraphs'. With paragraphs, what you asked would have been much easier to achieve, but with sentences, which are not so strictly defined, you would have to allow for every difference between what you perceive as a sentence and what VBA indicates that it is. This is barely practical in a short example. In a book-length document it would be impossible.
Graham Mayor (Microsoft Word MVP 2002-2019)
For more Word tips and downloads visit my web site
https://www.gmayor.com/Word_pages.htm
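
For comparison, a paragraph-based version might look like the sketch below. It is untested against real book-length files and rests on assumptions that are not confirmed in this thread: every document holds exactly one sentence (or explanation) per paragraph, in the same order, and the paths are placeholders. It copies each paragraph through Range.FormattedText rather than as plain text, so colors, highlighting, subscripts and underlining should come across, and it adds an empty paragraph after each group.

```vba
Sub CombineParagraphsKeepFormatting()
    ' Placeholder paths: change them to the real files
    Const strDoc1 As String = "C:\path\filename1.docx"
    Const strDoc2 As String = "C:\path\filename2.docx"
    Const strDoc3 As String = "C:\path\filename3.docx"
    Dim oDocs(1 To 3) As Document
    Dim oTarget As Document
    Dim oDest As Range
    Dim i As Long, j As Long, lngCount As Long

    Set oTarget = Documents.Add
    Set oDocs(1) = Documents.Open(FileName:=strDoc1, AddToRecentFiles:=False, Visible:=False)
    Set oDocs(2) = Documents.Open(FileName:=strDoc2, AddToRecentFiles:=False, Visible:=False)
    Set oDocs(3) = Documents.Open(FileName:=strDoc3, AddToRecentFiles:=False, Visible:=False)

    ' Use the smallest paragraph count so the loop cannot run past a shorter document
    lngCount = oDocs(1).Paragraphs.Count
    For j = 2 To 3
        If oDocs(j).Paragraphs.Count < lngCount Then lngCount = oDocs(j).Paragraphs.Count
    Next j

    For i = 1 To lngCount
        For j = 1 To 3
            ' Point at the end of the target and copy the paragraph with its formatting
            Set oDest = oTarget.Range
            oDest.Collapse wdCollapseEnd
            oDest.FormattedText = oDocs(j).Paragraphs(i).Range.FormattedText
        Next j
        ' Empty paragraph between groups
        oTarget.Range.InsertParagraphAfter
    Next i

    For j = 1 To 3
        oDocs(j).Close wdDoNotSaveChanges
        Set oDocs(j) = Nothing
    Next j
    Set oTarget = Nothing
End Sub
```

Indexing Paragraphs(i) repeatedly gets slow in long documents, so for a whole novel it may be worth trying this on one chapter first.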


Thank you for pointing that out.

That is a good point.

But the way I worked with the docs means I do not have to consider that.

Although there are 3 to 4 files, all of them derive from one original file, so consistency will be kept even where the system misidentifies a sentence.

It would not matter whether VBA misidentifies the sentences (as you point out), since all 4 docs will have the same punctuation in the same positions, so however it splits them the consistency will be kept. I will be able to edit the result manually so that it makes sense, and I think that would be really minor compared to the manual labor needed without the code.

For your example, it would look like below, and that would be OK for me.


I went to Mr.
I went to Mr. (explanation of it)
I went to Mr. (other explanation of it, in a different color)
Hill, who was speaking with Mrs.
Hill, who was speaking with Mrs. (explanation of it)
Hill, who was speaking with Mrs. (other explanation of it, in a different color)
Smith about Christmas, New Year, etc.
Smith about Christmas, New Year, etc. (explanation of it)
Smith about Christmas, New Year, etc. (other explanation of it, in a different color)
, etc.
, etc. (explanation of it)
, etc. (other explanation of it, in a different color)
, ad nauseum.
, ad nauseum. (explanation of it)
, ad nauseum. (other explanation of it, in a different color)


I would appreciate any help with this! I can't imagine manually editing 10,000 sentences one by one.

Thank you for asking!



I seriously doubt that would work.

 

Unless your explanations are organised so that they match up with the periods following abbreviations and with colons & semi-colons, etc. (indeed, anything that Word VBA treats as a sentence delimiter) they'll immediately lose synchronisation. In your posted scenario, the explanations for

Hill, who was speaking with Mrs.

Smith about Christmas, New Year, etc.

, etc.

, ad nauseum.

would all come from different VBA 'sentences', not the same VBA 'sentence', because there's no way for the VBA code to differentiate a grammatical sentence from a VBA 'sentence'.

 

Equally problematic is that the explanations will lose synchronisation with the source immediately any of the 'explanation' files contains more than one VBA 'sentence' for a given source VBA 'sentence'.

Cheers
Paul Edstein
(Fmr MS MVP - Word)
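
One cheap check before combining anything is to compare how many VBA 'sentences' and paragraphs each file contains. The sketch below assumes the same placeholder paths as the macro posted earlier in the thread, and the macro name is invented for illustration. Matching counts do not prove that the files line up, but a mismatch shows straight away that they have lost synchronisation somewhere.

```vba
Sub CompareSentenceCounts()
    ' Placeholder paths: change them to the real files
    Dim vPaths As Variant
    Dim oDoc As Document
    Dim i As Long

    vPaths = Array("C:\path\filename1.docx", _
                   "C:\path\filename2.docx", _
                   "C:\path\filename3.docx")

    For i = LBound(vPaths) To UBound(vPaths)
        ' Open read-only and hidden; nothing in the file is changed
        Set oDoc = Documents.Open(FileName:=vPaths(i), ReadOnly:=True, _
                                  AddToRecentFiles:=False, Visible:=False)
        ' Results appear in the Immediate window (Ctrl+G in the VBA editor)
        Debug.Print vPaths(i) & ": " & oDoc.Sentences.Count & _
                    " VBA sentences, " & oDoc.Paragraphs.Count & " paragraphs"
        oDoc.Close wdDoNotSaveChanges
    Next i
    Set oDoc = Nothing
End Sub
```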


Now I understand the point!

It took me several readings, and though I do not think I fully understood you, I think I came up with a solution for the biggest problem at hand.

Here is a way to define sentences strictly, I think.

I did the find/replace below on the docs I have, and I believe each sentence is now defined as a paragraph.

find: . (period + space)
replace: .^p^p (period + ^p^p)

With the above replacement, every sentence is separated by a paragraph break.

Given that the period synchronization is perfect, I believe the above separation would let me achieve my objective.

I would appreciate any code that does something even remotely close to accomplishing my goal, even if some manual labor is required after the code is run.

Thanks!



No, what that will give you is:

I went to Mr.

 

Hill, who was speaking with Mrs.

 

Smith about Christmas, New Year, etc.

 

, etc.

 

, ad nauseum.

Cheers
Paul Edstein
(Fmr MS MVP - Word)
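
For anyone who wants to run that find/replace from a macro rather than from the dialog, a minimal sketch is below. It works on the active document and the macro name is hypothetical. As the output above shows, it breaks after every period followed by a space, including the ones after abbreviations such as "Mr." and "etc.", so it does not by itself turn grammatical sentences into paragraphs.

```vba
Sub SplitAtPeriodSpace()
    ' Replaces every "period + space" in the active document with
    ' "period + two paragraph marks", as in the find/replace described above.
    With ActiveDocument.Range.Find
        .ClearFormatting
        .Replacement.ClearFormatting
        .Text = ". "
        .Replacement.Text = ".^p^p"
        .Forward = True
        .Wrap = wdFindStop
        .Format = False
        .MatchWildcards = False
        .Execute Replace:=wdReplaceAll
    End With
End Sub
```

Run it on copies of the files, since the change applies to the whole document at once.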



 
 

Question Info


Last updated October 5, 2021 Views 765 Applies to: