I've been having trouble as of late with my OneNote search function. It was working great up until a month or two ago when it started highlighting the wrong stuff (or not highlighting anything at all even though it came up with search results). I love this function and would love to have it back.
I'm running OneNote 2010 on Windows 7 (x64). Any idea what's going on?
studpitcher21 - First of all thank you for posting this and sorry for the trouble.
Here is what is going on: we have a bug in OneNote 2010 (our apologies). When you print into OneNote, we use an XPS technology to know what all of the text in a printout is. This is a mix of OCR and actually knowing what is on the page from the source file, and it gives us much more accurate search results when you print into OneNote. We still have the OCR engine for images inserted into OneNote, but for printouts we use this new XPS technology.
We found that under certain conditions, when you insert a printout, we have all of the text in our index, but when you search for text we don't highlight the results in the XPS (the printout you see on the page). So, confusingly, OneNote says there is a result on the page and moves you to that page in the printout, but you don't see any yellow!
What is happening is our x,y coordinate for where the text appears in the XPS is messed up, so we are highlighting something but it is just off the page and not in view in OneNote. This is our bug. Thankfully we have this logged (I think found from customers on this answers forum, yes we listen!) and the dev understands the problem and we are testing out a fix. We hope to have this in a future service pack.
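To picture the coordinate problem described above, here is a minimal Python sketch. It is not OneNote's actual code: the `Rect` class, the `place_highlight` function, the page size, and the offset values are all hypothetical and chosen only to illustrate how a wrong x,y offset can move a highlight rectangle completely outside the visible page, so the word is found but nothing yellow shows up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle: top-left corner plus size (hypothetical units)."""
    x: float
    y: float
    width: float
    height: float

    def intersects(self, other: "Rect") -> bool:
        """Return True if the two rectangles overlap at all."""
        return (
            self.x < other.x + other.width
            and other.x < self.x + self.width
            and self.y < other.y + other.height
            and other.y < self.y + self.height
        )


def place_highlight(word_box: Rect, offset_x: float, offset_y: float) -> Rect:
    """Shift the word's bounding box by an offset to get the highlight box."""
    return Rect(word_box.x + offset_x, word_box.y + offset_y,
                word_box.width, word_box.height)


# Assumed page size and word position, for illustration only.
page = Rect(0, 0, 612, 792)
word = Rect(100, 200, 60, 12)

correct = place_highlight(word, 0, 0)    # offset is right: highlight sits on the word
shifted = place_highlight(word, 700, 0)  # offset is wrong: highlight lands past the page edge

print(correct.intersects(page))
print(shifted.intersects(page))
```

In this sketch the search still "finds" the word in both cases; only the second highlight is drawn where nobody can see it, which matches the symptom reported in this thread.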
But what about in the meantime? Thankfully Benoit and some other devs found a workaround which should work well for you or anyone else seeing this problem.
Now search should work for you on all of the pages in the printout. Of course the quality of the printout will help determine how well OCR works but in most cases for printed content it works quite well.
Please let us know if this works for you and again our apologies.
Today we released a non-security update for OneNote 2010, here are the details from the Microsoft Office SE blog:
A non-security update for OneNote 2010 32-bit/64-bit Editions was also released. This update provides fixes associated with displaying search results, fixes to optical character recognition (OCR), indexing, and displaying of inserted documents. Additional information can be found in the Microsoft Knowledge Base article Update for Microsoft OneNote 2010 (KB2493983).
Thanks all for your patience - apologies for the delay. By doing this through a public update we can reach a lot more users. Please let us know if you have other concerns/issues!
Check whether the search feature works correctly in other Office applications. If you’re facing the same issue with other Office applications, then try the steps below:
1) Close all Office applications
2) Rebuild your index by clicking on ‘Start > Programs > Windows Desktop Search > click the dropdown button on the right-hand side near the help button > Desktop Search Options > Advanced > Rebuild’.
3) Now restart OneNote
Thanks for your suggestion! I tried that, but it didn't work. However, I tinkered around with some settings and I may have taken a step in the right direction.
For some reason, when I print to OneNote 2010, it inserts the printout with the search language set to 'disabled'. Once I right click on the page, then select 'Make Text in Image Searchable' and select 'English U.S', the search works perfectly.
However, it's requiring me to do this on every single page (not printout). I have printouts that are several hundred pages long... so this method is very inefficient.
Is there any way for me to get printouts printed to OneNote 2010 with the default searchable text language set to English?
I think this is a step in the right direction.
What is the language of the content present within the picture?
If the content is in different language or if the font style is not recognized, then the only option that is taken for ‘Make Text in Image Searchable’ is ‘Disabled’.
Also ensure to have the box ‘Disable text recognition in pictures’ unchecked. The option is found under ‘File > Options > Advanced > Text recognition in pictures’.
The ‘Disable text recognition in pictures’ box is unchecked. The language of the printouts is in English U.S.
I know I can select individual pages, right click, select "Make text in image searchable" and then select "English U.S." This works perfectly! However, when I select more than one page (using Shift + Click on another page) and right click, it un-selects all the other pages except the one I right clicked on.
Is there a way for me to keep all those pages selected so I can right click and select "Make text in image searchable" and have it OCR all those pages at once?
Thank you for the help!
Also, this work around doesn't seem to work for the bug where it finds the word, but highlights the wrong word or nothing at all (despite finding what I'm looking for in the printouts).
It highlights the wrong word or nothing at all (takes me to a page that doesn't even contain the word, or highlights blank space off the page) in all types of documents inserted into OneNote... Word, Excel, PowerPoint, etc...
The other problem where it doesn't recognize text when I have to right click each page of a printout for it to be OCR'd occurs with PDFs.
There is no way to click and select objects from multiple pages together. You’ll be able to work with the contents of one page at a time, just like how we work in a book.
Please provide more information about the search and the OCR that you’re trying to do.
I think you're referring to this:
When I print PDFs to OneNote, they are not OCR'd (and in turn are not searchable). The only way for me to OCR them is to go through them page by page, right click, select "Make Text in Image Searchable" and then select "English U.S."
That is the only way PDFs are searchable within OneNote for me.
My other big issue is that when I print Word, Excel and PowerPoint files to OneNote, the search function doesn't work properly. It recognizes some, but not all, of the text. And when a search does find the word I'm looking for, either blank space on the side of a page (the incorrect page) is highlighted or nothing is highlighted at all, despite the OCR software recognizing the word. =\
I also tried repairing my installation of OneNote which didn't work. Also, the search function works perfectly for all my other Office 2010 programs.
Here's a screen shot of my search for 'gingivitis'.
As you can see, for some reason it highlights stuff off the page, and in some instances it doesn't highlight at all despite finding the word in the printout! This only happens in OneNote; as I said before, the searches work perfectly in my other office applications.
P.S. I was lucky that gingivitis even appeared on the screen in the search. Usually the searched word isn't on the screen and OneNote randomly highlights white space on the side of the printout or nothing at all. I think that's only because it came up 17 times in that particular PowerPoint.
I have also rebuilt the index several times and cleared my OneNote cache as well. My next step is to uninstall Office and try installing it again.
When you print to OneNote, the OCR function will not start recognizing the characters in pictures by itself. The only way to recognize the characters in the documents is to enable the ‘Make text in image searchable’ option for the pictures, one at a time and page by page.
Regarding the OCR scan there is nothing much that could be done because there is no way to control the way OCR works. The only thing that you can do is to look for a better OCR tool.
Actually, it should be automatic. The OCR should work in the background. It can take anywhere from a few seconds to a few minutes, but it should catch up and eventually OCR everything you insert.
Could you try inserting a picture that contains text directly in OneNote and then doing a search? I'd like to know if it is specific to printouts or if it happens in all cases of OCR.
I will ask a dev to take a look later today and see what could be going on.