Scanning Books With A Digital Camera


dsys

I have some software sitting on my PC that can accept RAW/TIFF files from digital cameras and convert the images to text/PDF (given that the image is a photograph of a document, that is!).

Now to get this up and running I need to go and get hold of a digital camera with the following specs:

1. Ability to save as either RAW or TIFF format

2. Good macro capability - (this means nothing to me)

3. 8 megapixels or higher

Personally I would prefer to have the camera as small as possible. The camera will only be used for scanning books.

Could somebody suggest a good camera that fits these specs, and give an idea of the price? If worse comes to worst I would go for a DSLR, but I'd prefer not to.

edit (Read RAW above as Adobe RAW, open format)

Edited by dsys

Buy a scanner; it saves you A LOT OF WORK. How do you plan to keep the pages straight enough to get a clean picture that the OCR program can read?

Scanning books with a flatbed scanner is not very efficient.

The page orientation/curvature is sorted in software. Also scanners are not very portable. Not easy to pull one out, scan a doc and stow it in a laptop case.

Here's an example of a basic program that does this. Note that this is not the best solution out there.

http://snapter.atiz.com/

Edited by dsys

And it is efficient to use a camera? You are talking about a book in your post; either you are a troll or you have absolutely no idea what you are talking about.

OCR software has existed since the early 90s, so I really don't understand your issue.

Most scanners are pretty slow. For each new page, you have to lift the book out, turn the page, position it on the scanner, close the lid, wait for the scanner to scan, etc.

With a camera on a tripod, he could just turn to the next page and snap a picture. Yup, sounds pretty efficient to me compared to a scanner.

Try the higher-end Canon cameras; they use RAW as far as I know.

@person calling me a troll

Actually it is a lot more efficient. Ask any archivist, or Google, who spent a couple of million dollars on the technology. Check this out: http://www.bookscanbureau.co.uk/about-book-scanning.htm

The issue is exactly what Gimbo brought up.

@Gimbo. Thanks for that. By high end, are you talking DSLR, or will an (ultra)compact fit the bill as well?

There are some pretty good, high-res flatbed scanners on the market (under $200 U.S.) out there that include good quality OCR software.

One advantage to using a scanner is that the scanning and OCR process can be a unified one step automated process, all handled by the scanner software with each two-page scan. It also benefits from you having constant and consistent exposures for your scanned image files, instead of having to struggle with camera distances and lighting variations.

From my years in government, whenever anybody needed to generate digital files of printed material, they always went the scanner route. I can't recall anyone I know ever using a digital camera to generate OCR output for large-scale document work.

How many books are you intending to scan?

Are these books a collection of yours that you wish to preserve or are you borrowing them?

What's your subsequent intention having digitised them?

The answers to the above may enable us to advise more comprehensively.

Most, if not all, cheaper cameras save as JPG as their only option. Is it really necessary to have 8 megapixel or better images? OCR with a scanner is often done with fairly low-res scans. Why not just try snapping away with a cheap 'point and shoot' camera in macro mode, then do a bulk resave of the JPG images as TIFFs and see how that goes? A small tripod and some additional lighting would definitely help. If the solution needs to be portable, you could try strapping a wide-beam flashlight to the tripod too for extra lighting.

@the vulcan

The books are mainly textbooks, owned. Reasons: archive, make portable and searchable.

How many: about 150. Average size 400 pages. Some are 1000 pages.

Occasional scans of one-page documents when out and about (instead of photocopying).

@jchandler

I already have a scanner that I use for documents with OmniPage Pro. I don't fancy cutting the books up to get them into the auto feeder, or sitting for ages flipping and turning pages. Thanks for the info on macro.

Specs I was given suggested 8MP or greater. Tried it already with a 2MP camera I have here and didn't work out too well. Not sure that I tried it in macro mode though. Will try again with the 2MP in macro mode see if it makes any difference. Also try the flashlight, don't have a tripod at the moment.
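
For what it's worth, a rough way to see why the 2MP attempt struggled is to estimate how many dots per inch actually land on the page. The sketch below is only an illustration: the 1600 x 1200 frame (a typical 2MP size), the 8 x 10 inch page, and the function name effective_dpi are all assumptions, not anything taken from the OCR software.

```python
def effective_dpi(sensor_px, page_in):
    """Estimate the capture resolution when a page just fits inside the frame.

    sensor_px: (width, height) of the image in pixels (assumed values).
    page_in:   (width, height) of the page in inches (assumed values).
    """
    long_px, short_px = max(sensor_px), min(sensor_px)
    long_in, short_in = max(page_in), min(page_in)
    # The page has to fit in both directions, so the tighter axis sets the scale.
    return min(long_px / long_in, short_px / short_in)


# Hypothetical 2MP frame photographing an 8 x 10 inch page.
two_mp = effective_dpi((1600, 1200), (8, 10))
print(f"{two_mp:.0f} DPI")
```

Under these assumptions a 2MP frame gives about 150 DPI, roughly half the 300 DPI commonly recommended for OCR. Macro mode helps with close focus, but it would not change that figure if the whole page still has to fit in the frame.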

The Canon Powershot series. Many of them have RAW. I am not sure which ones are more than 8 megapixels, though.

Here is a link comparing models:

http://www.usa.canon.com/consumer/controll...p;modelid=15669

And here is one of the high end models:

http://www.usa.canon.com/consumer/controll...p;modelid=15669

HELLO?

You guys seem to like to surf the net. The link you provided is for a scanner, and a scanner is nothing other than a kind of camera plus a system. Regarding your comments, I never suggested a flatbed scanner; there are numerous solutions, which you will certainly find if you keep googling.

And why the need for 8 megapixels? We used 1.4 megapixel cameras (Olympus C-1400L) in 1997 to OCR single documents, which worked perfectly.

@Gimbo

Thanks for that. I can't get the links working at the moment, but the title suggests it's the Powershot G9. Is that the model you suggest?

Yes, the Powershot G9. Has RAW and is 12.1 MP. Just google for the Canon Powershot series.

Yes, book scanners that are fast to use do exist. But since he specified that size is a concern (wanting a compact camera), and also a camera above 8MP, that's what I helped him find.

If he wanted a scanner, I think he would ask for that.

A few people have questioned why I need an 8MP camera. Well, the answer was that I didn't know, so I mailed the tech support guys to ask why. The answer I got is this:

"To get reliable OCR it is best to scan at 300 DPI. If you consider a one inch square at 300 DPI it will contain 300 x 300 dots, or 90,000 dots, or 0.09 megapixels. We consider an average scan size to be 8 inches by 10 inches, giving 80 x 0.09 = 7.2 megapixels. We rounded this up to 8 to be on the safe side."

Kind of makes sense to me.
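
The tech support arithmetic can be checked with a few lines of Python. This is just a sketch: the function name megapixels_needed is made up for illustration, and it assumes the page fills the frame exactly, which a real 4:3 sensor shooting an 8 x 10 inch page will not quite do.

```python
def megapixels_needed(width_in, height_in, dpi=300):
    """Pixels required to cover a page at a given resolution, in megapixels."""
    pixels = (width_in * dpi) * (height_in * dpi)
    return pixels / 1_000_000


# The 8 x 10 inch "average scan size" quoted by tech support, at 300 DPI.
mp = megapixels_needed(8, 10)
print(f"{mp:.1f} megapixels")
```

Because the sensor and page shapes differ, some pixels are wasted at the edges, which is one more reason rounding up to 8 (or going higher) seems sensible.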

Edited by dsys

Use a piece of glass to press down each page as you go; this will help you avoid curves in your text lines and keep everything in the same focal plane. Put equal lights at 45 degrees to your book, and try not to cast a shadow. If you shoot through a hole in a dark cloth you will avoid some of the reflection of the camera in the glass. If you have someone turn the pages for you, you can rip through a book in no time.

Don't worry about shooting RAW or TIFF. There are many programs that can convert a JPEG to a TIFF, but I can't imagine that is necessary. Text reading programs work fine with JPGs. What you need most is a close-focusing camera and a tripod that lets you shoot straight down.

@canuckamuck

Thanks for those tips. As you can probably guess I'm not a photographer.

The glass idea is a great one, hole in a dark cloth never would have thought of that; How big would you suggest the hole be? Or is it best to experiment - say start with half the lens diameter and work from there? Or is there some general rule to work this out? Or does the size of hole not matter?

Shoot your normal .jpegs then use a program like Irfanview for a batch conversion to TIFFs. You'll have to do some experiments to get the right settings for the conversion.

Make sure you've got even lighting across the page; the TIFF conversion is fairly sensitive and you'll get black splotches if any areas of a page are less bright than others.

If you're only looking to archive the books and don't need text-search capabilities, then just convert the jpegs to black-and-white .gifs, which -- unlike TIFFs -- don't take up much disk space. You'll save loads of time if you can avoid dealing with OCR.
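
To illustrate where the black splotches mentioned above can come from, here is a toy sketch in plain Python. It is not what Irfanview or any particular converter does; the tiny 3 x 6 "page", the cutoff of 128, and both function names are assumptions chosen for the example. A single global cutoff turns the dimmer side of the page black, while comparing each pixel with the average of its neighbours keeps the paper white and the ink black.

```python
def global_threshold(img, cutoff=128):
    """Convert grayscale (0-255) to black (0) / white (1) with one fixed cutoff."""
    return [[1 if px >= cutoff else 0 for px in row] for row in img]


def local_threshold(img, radius=1, offset=10):
    """Compare each pixel with the mean of its neighbourhood instead."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            neigh = [img[j][i] for j in ys for i in xs]
            mean = sum(neigh) / len(neigh)
            # A pixel counts as ink only if it is clearly darker than its surroundings.
            row.append(1 if img[y][x] >= mean - offset else 0)
        out.append(row)
    return out


# Made-up page: lighting falls off from left (200) to right (100);
# two ink dots (54 and 36) sit in the middle row.
page = [
    [200, 180, 160, 140, 120, 100],
    [200,  54, 160, 140,  36, 100],
    [200, 180, 160, 140, 120, 100],
]

for name, result in (("global", global_threshold(page)),
                     ("local", local_threshold(page))):
    print(name)
    for row in result:
        print("".join("." if v else "#" for v in row))
```

In the global version the two right-hand columns come out solid black, which is the splotch effect; the local version leaves only the two ink dots black. Even so, fixing the lighting at the source is simpler than fixing it in software.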

Poke the lens through the hole, or have the hole big enough so that it doesn't interfere with the line of sight between the lens and the book.

What a lot of rubbish. Ask them how we could possibly do OCR 10-15 years ago? Maybe time to change software? Lots of free stuff that does the job.

Hi DSYS,

I don't know what type of mobile phone you have, but it's worth checking the link below because I think this might be a solution to your problem.

Photo link

Cheers, Rick

That is a link to the Nokia N95 phone. Does this fit his needs? Let's see......

Point 1. No

Point 2. No.

Point 3. No

Nice phone, but what this guy needs is a CAMERA.

OK, I bow to your superior knowledge in this. My understanding was that the first scanner came out in the 50s and had a 30(ish) dpi resolution, black and white only. I remember being in university and they had scanners that did 1200 dpi; that was more than 15 years ago.

I'd be interested in looking at the software that you suggest, especially since it is free. Could you provide a link(s)?

It would also be useful for me if you could point out where the calculations quoted above have gone astray; if they are wrong I'll hand back my MSc in maths. They could be wrong because the basic premise is wrong. Could you point out the flaw? As they say, "we live and learn", and I'm always open to learning from other people.

Oh wait a sec, I was rather slow on this one, because you accused me of being a troll.

Another added to the ignore list

Just wanted to say thanks for all of your suggestions and input. I'm going to go with the model that was suggested. I have a flight to the US next month, so I will try to get it over there; it appears to be the cheaper option.

Thanks again.

A slightly hasty reply. The Nokia has a 5 megapixel camera, and the link is to a service for scanning documents via the camera. You should read before you reply.

Rick

OK, I bow to your superior knowledge in this. My understanding was that the first scanner came out in the 50s and had a 30(ish) dpi resolution, black and white only. I remember being in university and they had scanners that did 1200 dpi; that was more than 15 years ago.

So? Are you able to read English? Able to distinguish between digital camera, resolution, scanner and the various CCD technologies used here?

I'd be interested in looking at the software that you suggest, especially since it is free. Could you provide a link(s)?

download.com, tucows.com: lots of freebies.

It would also be useful for me if you could point out where the calculations quoted above have gone astray; if they are wrong I'll hand back my MSc in maths. They could be wrong because the basic premise is wrong. Could you point out the flaw? As they say, "we live and learn", and I'm always open to learning from other people.

Gee, you are a nut, ain't you? Read and learn: http://en.wikipedia.org/wiki/Optical_character_recognition

Edited by kash

In reply to dsys's original post of 2008-02-03 21:36:21:

Here is an example of a document photographed with my Nokia N95 camera phone and then scanned using the Qipit online service. I think the quality is pretty good.

qipit.pdf

Cheers, Rick
