Saving multiple embedded files in an Excel spreadsheet

I have a large Excel spreadsheet that has about 50 embedded documents in it (The files are actually embedded in the spreadsheet - they are not a link to a file stored somewhere else).  Would someone know of a way to easily save all those embedded documents as individual files outside of the spreadsheet (some are excel spreadsheets, some are in Adobe format, some are WORD documents, etc)?  In other words, I would like to "strip" all the embedded documents out of the spreadsheet and save them to my harddrive as individual files.  I am wondering if there is an easy way of doing this (e.g.  highlight all the embedded documents, right click and say "save").  Currently, the only way I know to do it is to individually open each document and then save it.  I am willing to consider some kind of VBA code to do it, if someone has programmed it.

Thanks

Here's a partial solution that doesn't use programming. You can get the Word documents, but this will most likely not work with the PDFs:

  1. Make a copy of the Excel file to a new folder.
  2. Edit the file name, adding .zip on the end.
  3. Click OK when Windows warns you about changing file endings.
  4. Right-click and expand the file.
  5. Open the xl folder, then the embeddings folder.

The Word files will be there with their original file endings. The PDFs will probably be saved as something like oleObject1.bin. While some other file types can be rescued by changing the file ending back to the original, this doesn't seem to work with PDFs. In my trials, Acrobat can't open them.

Here's a discussion of programmatic extraction from Excel. There's a macro about 2/3 the way down that one user says worked for them: Extract PDF from Excel

Author of "OOXML Hacking - Unlocking Microsoft Office's Secrets", ebook now out
John Korchok, Production Manager
production@brandwares.com

19 people found this reply helpful

·

Was this reply helpful?

Sorry this didn't help.

Great! Thanks for your feedback.

How satisfied are you with this reply?

Thanks for your feedback, it helps us improve the site.

How satisfied are you with this reply?

Thanks for your feedback.

Thank You.  You are right, this method works great for Word documents.  However, it is hit and miss for embedded Excel spreadsheets and won't do anything with PDF files.  Unfortunately, of the 50 embedded documents only 6 are Word documents.

Oddly, it will open up some of the Embedded Excel spreadsheets, but others just open to a blank spreadsheet.  I can't tell the difference between the ones it opens fine and the ones that are blank.

1 person found this reply helpful

·

Was this reply helpful?

Sorry this didn't help.

Great! Thanks for your feedback.

How satisfied are you with this reply?

Thanks for your feedback, it helps us improve the site.

How satisfied are you with this reply?

Thanks for your feedback.

There's a way to extract the PDF documents similarly.  They'll have .BIN extensions once you crack open the file as John's explained.

You can open the extracted PDF files in a text editor (NOT Word).  Locate the first instance of this:

%PDF-1.5

(it might be 1.4 or some other number).

Select and delete everything from the beginning of the file up to (but not including %PDF-) then save to a new file name with a PDF extension.

You should now be able to doubleclick the file and view the PDF.

You can probably do something similar with the other files in the same folder, assuming you can work out what they should have been originally.  Most of the files from any modern version of Office will actually be ZIP files, so instead of the %PDF, you're looking for PK followed by 03 04 14 00 bytes.  

If you extract that much, save the file, then doubleclick it, it should open as a ZIP, and by looking at the innards thereof, it should be fairly obvious whether it's a PowerPoint, Excel or other XML-based file.

PowerPoint Help: https://www.pptfaq.com/
PPTools: https://www.pptools.com
Presentation Guild: https://presentationguild.org

3 people found this reply helpful

·

Was this reply helpful?

Sorry this didn't help.

Great! Thanks for your feedback.

How satisfied are you with this reply?

Thanks for your feedback, it helps us improve the site.

How satisfied are you with this reply?

Thanks for your feedback.

Thank you for the help.  However, since there are about 15 embedded PDF files and about 20 embedded excel files, your method sounds like it would take longer than just opening each file and saving.  My main purpose of wanting to do this is because the "parent" spreadsheet with all the embedded "children" is about 300meg in size.  I was wanting a way of stripping out the embedded files, saving them, leaving the parent spreadsheet a more manageable size - maybe 10meg.

Was this reply helpful?

Sorry this didn't help.

Great! Thanks for your feedback.

How satisfied are you with this reply?

Thanks for your feedback, it helps us improve the site.

How satisfied are you with this reply?

Thanks for your feedback.

When I do a "Record Macro" and click on one of the attachments, then delete it, here is the code that is written. 

Sub Get_embed()
'
' Get_embed Macro
' Open all embedded files and delete from spreadsheet
'
' Keyboard Shortcut: Ctrl+Shift+A
'
    ActiveSheet.Shapes.Range(Array("Object 20")).Select
    Selection.Verb Verb:=xlPrimary

    Selection.Delete
End Sub

I am thinking if I can put that in a loop where the "ObjectXX" Array name is incremented each time, I can at least get all the files opened.  However, there are a couple of things I am not good enough to code.

1.  How do I increment the array name when it is between quotation marks (e.g. Object 1, Object 2, Object, 3, etc).

2.  I would need an error handling routine if it cant find an Object.  For instance, there is NO "Object 3".  Also the loop would have to automatically stop when it couldn't find anymore Objects to open.

3.  As you can see, it is using an "ActiveSheet" command.  I would need it to work on ALL the tabs in the spreadsheet (there are embedded files on several different tabs)

4.  I would like a way of canceling/Interruping the Macro mid-stream so that I don't have to wait until it opens 50 files, which would be unmanageable memory-wise and logistically.

Thanks

1 person found this reply helpful

·

Was this reply helpful?

Sorry this didn't help.

Great! Thanks for your feedback.

How satisfied are you with this reply?

Thanks for your feedback, it helps us improve the site.

How satisfied are you with this reply?

Thanks for your feedback.

 
 

Question Info


Last updated December 18, 2024 Views 25,306 Applies to: