Help setting new margins on scanned text

Warren Rogers
Posts: 5
Joined: 2012-12-27 15:53:07

Help setting new margins on scanned text

Post by Warren Rogers »

Hi everyone,
I'm fairly new to this forum. I've posted one topic on the Macro list and now this. Anyway, my problem is: I'm scanning some books for my iPad. I've got ABBYY FineReader Express which is doing a fairly good job (compared to TesseractOCR which was fairly abysmal). I've got the FineReader to do huge blocks of text automatically which is great. But it formats them in the original (paperback) narrow format and I can't get rid of it! How do I do that in Nisus?

thanks,

Warren Rogers
xiamenese
Posts: 547
Joined: 2006-12-08 00:46:44
Location: London or Exeter, UK

Re: Help setting new margins on scanned text

Post by xiamenese »

If it's anything like the result of doing OCR in PDFPen Pro, each line will end with a carriage return. Turn on "Show Invisibles", which will show you if I'm right. If so, you'll need to get rid of the excess ones. If you have a blank line between paragraphs, it's quite easy:

1. Search and replace two carriage returns with a glyph that doesn't appear in the text (I usually use @).
2. Search and replace one carriage return with a space.
3. Search and replace @-or-whatever with a carriage return.
4. Search and replace two spaces with one space (just in case there were space characters floating just before the carriage returns, which would give you two when you run the above).

Macroize that and it should go swimmingly. Erm ... there may already be a macro set up to do this ... I haven't looked.
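For anyone who would rather do this outside Nisus, here is a minimal Python sketch of the same four search-and-replace passes, run on plain text exported from the OCR program. The function name `unwrap_lines` and the `@` placeholder are only illustrative, and the sketch assumes paragraphs really are separated by exactly one blank line; it is not a Nisus macro.

```python
import re

def unwrap_lines(text: str, placeholder: str = "@") -> str:
    """Join hard-wrapped OCR lines into paragraphs (hypothetical helper)."""
    # The placeholder must not already occur in the text, or real "@"
    # characters would be turned into paragraph breaks in pass 3.
    if placeholder in text:
        raise ValueError("placeholder already occurs in the text")

    # Normalise Windows and old Mac line endings to "\n" first.
    text = text.replace("\r\n", "\n").replace("\r", "\n")

    # Pass 1: protect paragraph breaks (two returns in a row).
    text = text.replace("\n\n", placeholder)
    # Pass 2: every remaining return is a line wrap, so make it a space.
    text = text.replace("\n", " ")
    # Pass 3: turn the placeholder back into a single return.
    text = text.replace(placeholder, "\n")
    # Pass 4: collapse runs of two or more spaces left over from pass 2.
    return re.sub(r" {2,}", " ", text)


if __name__ == "__main__":
    sample = "The quick \nbrown fox\n\njumps over\nthe dog"
    print(unwrap_lines(sample))
```

Runs of more than one blank line would leave a stray space at the start of the following paragraph, so check the output before relying on it.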

If there isn't a blank line between paragraphs, it's going to be more time-consuming and frustrating: you'll have to replace the carriage returns that mark the ends of paragraphs with the @-or-whatever by hand.

Others may have even better ideas.

Mark
Warren Rogers
Posts: 5
Joined: 2012-12-27 15:53:07

Re: Help setting new margins on scanned text

Post by Warren Rogers »

Hi Mark.
Thanks for answering so quickly. I didn't make myself very clear. What you suggest is quite good and I wish I had your response a few weeks ago when I was still struggling with TesseractOCR!
But my problem is more complicated now. The new ABBYY scans have the page width set at the paperback width. It's at about 2 and 1/2 inches on the left and 5 and 3/4 inches on the right. I can change that on any given page. I actually tried selecting the entire document and then setting the margins on the first page. Didn't work. It only changed the first page. I know I can do the whole document because I accidentally did it on another paperback scan. Unfortunately, I didn't realize then that it was important! Story of my life.
So, does anyone know how to change the page width in these circumstances? 300-plus pages of scanned text that are set at 2 and 1/2 and 5 and 3/4, which I want to reset to 1 inch and 7 and 1/2 inches.

thanks,

Warren
phspaelti
Posts: 1319
Joined: 2007-02-07 00:58:12
Location: Japan

Re: Help setting new margins on scanned text

Post by phspaelti »

Is it possible that the OCR software is using section breaks instead of page breaks? My OCR software does this. My first step with OCRed pages is to do a find/replace "Any break" -> "page break".
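Outside the word processor, the same break swap can be sketched in Python, assuming the document is saved as RTF, where `\sect` marks a section break and `\page` a page break. The function name is made up for illustration, and whether a given OCR export encodes its breaks this way is an assumption to check on a copy of the file first.

```python
import re

# Match the RTF control word \sect, but not longer control words that
# merely start with it, such as \sectd (reset section properties).
SECTION_BREAK = re.compile(r"\\sect(?![a-z])")

def section_breaks_to_page_breaks(rtf: str) -> str:
    """Replace every RTF section break with a page break (hypothetical helper)."""
    # A function is used as the replacement so the backslash in "\page"
    # is inserted literally instead of being parsed as a regex escape.
    return SECTION_BREAK.sub(lambda match: "\\page", rtf)


if __name__ == "__main__":
    sample = r"one\par\sect\sectd two\par\sect three"
    print(section_breaks_to_page_breaks(sample))
```

This leaves any per-section formatting codes in place, so the Find and Replace inside the word processor is still the safer route.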
Other than that, I try to work with styles. Define a style (perhaps using the settings from one of the paragraphs), then apply it to all paragraphs. Then you can adjust the width by changing the style definition. How well this works depends on how complicated the formatting of the original is, and how much of it you want to save. My OCR software has an option to turn off its attempt to copy the formatting, and I often use that instead, and just reformat the outcome to my liking from scratch.
philip
Warren Rogers
Posts: 5
Joined: 2012-12-27 15:53:07

Re: Help setting new margins on scanned text

Post by Warren Rogers »

Thanks, Philip. I just came here to post my solution and I read yours. It's the same! Yup! I figured out that it was putting in Section Breaks after each page. So I did a Find and Replace switching the Section Break to a Page Break. And voilà! It got rid of the narrow margins as well. Now I just have to find and replace all the spelling errors my OCR makes. But I look on the bright side. I got rid of all those nasty narrow margins! A step at a time.