Are there any tools I can use for translating a ~400-page scanned book?

morto@piefed.social · 1 day ago

Are there any tools I can use for translating a ~400-page scanned book?

candyman337@lemmy.world · 3 hours ago

Locked for rule 5

starlinguk@lemmy.world · 11 hours ago

Yes. Pay a translator.

andrew0@lemmy.dbzer0.com · 21 hours ago

If you find that OCR doesn’t get you very far, maybe try a small VLM to parse PNGs of the pages. For example, Nanonets OCR will do this, although it is quite slow if you don’t have a GPU. It will give you a Markdown version of each page, which you can then translate with another tool.

PaddleOCR might also be useful, since it focuses on Chinese, but it’s more difficult to set up. Some other options are MinerU and Mistral OCR (this one is paid, but you can test it for free if you upload your document to Mistral’s library).

morto@piefed.social · 18 hours ago

That PaddleOCR looks very interesting. It will even extract images and formulas and somewhat preserve formatting in the output! I will try this one, even if it takes more than a day to process with my low-end CPU. Thank you for the suggestion!

andrew0@lemmy.dbzer0.com · 14 hours ago

Be wary that their docs are so-so. Nanonets OCR, Mistral OCR and MinerU will also extract formulas and images.

One other tool I forgot to mention is Docling. It is quite quick to set up in a Docker container, and it comes with a web interface where you can upload documents. It sort of follows the PaddleOCR pipeline, but also allows you to use VLMs.

Good luck!

BurgerBaron@piefed.social · 23 hours ago

These are intended more for real-time use, but might work for you:

https://github.com/Artikash/Textractor

https://github.com/Crivella/ocr_translate

I watch Macaw45 play full-fledged Japanese retro RPGs using Textractor; it’d probably be good for books too.

morto@piefed.social · 18 hours ago

Thanks for the suggestions. That OCR_translate looks interesting. I will prioritize other recommended tools that seem to be more focused on books, but I bookmarked it for future needs.

BlameThePeacock@lemmy.ca · 1 day ago

You can literally just feed the images into ChatGPT at this point.

morto@piefed.social · 18 hours ago

I’m giving preference to open source tools, but that’s a good thing to know, thanks

mesa@piefed.social · 20 hours ago

Every time I’ve done it, it’s been pretty bad. OCR is much better.

thebestaquaman@lemmy.world · 24 hours ago

This doesn’t work once the PDF reaches a certain max size.

BlameThePeacock@lemmy.ca · 22 hours ago

You could just break it up into chapters or something; it’s pretty easy to split a PDF.
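For anyone who wants to script the split, here is a minimal sketch. It assumes the third-party `pypdf` package is installed; the function names, the 50-page chunk size, and the `part_XXXX-YYYY.pdf` naming scheme are made up for illustration. It cuts by fixed page count rather than by chapter, since chapter boundaries would have to be supplied by hand.

```python
from pathlib import Path


def chunk_ranges(total_pages: int, chunk_size: int) -> list[tuple[int, int]]:
    """Return half-open (start, end) page ranges that cover total_pages."""
    if total_pages < 0 or chunk_size <= 0:
        raise ValueError("total_pages must be >= 0 and chunk_size must be > 0")
    return [
        (start, min(start + chunk_size, total_pages))
        for start in range(0, total_pages, chunk_size)
    ]


def split_pdf(src: str, out_dir: str, chunk_size: int = 50) -> list[Path]:
    """Write src out as several smaller PDFs of at most chunk_size pages each."""
    # pypdf is a third-party package (pip install pypdf). It is imported here
    # so that chunk_ranges stays usable without it.
    from pypdf import PdfReader, PdfWriter

    reader = PdfReader(src)
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    written = []
    for start, end in chunk_ranges(len(reader.pages), chunk_size):
        writer = PdfWriter()
        for i in range(start, end):
            writer.add_page(reader.pages[i])
        # Hypothetical naming scheme: 1-based page numbers, zero-padded.
        target = out / f"part_{start + 1:04d}-{end:04d}.pdf"
        with target.open("wb") as fh:
            writer.write(fh)
        written.append(target)
    return written
```

Something like `split_pdf("book.pdf", "parts", chunk_size=50)` would turn a 400-page scan into eight files, which can then be uploaded one at a time.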

gramie@lemmy.ca · 22 hours ago

Would Google Lens work? You could take a picture of each page and feed it to the Google Translate engine. It might be the easiest way.

morto@piefed.social · 18 hours ago

I’m not sure if it would be viable for a long book, and I’m also avoiding Google, but thanks for helping. I got some nice suggestions in this thread.

lemmyuser68@sopuli.xyz · 1 day ago

NotebookLM (Google)

morto@piefed.social · 18 hours ago

Well, I’m avoiding Google, but I will keep it in mind as a last, last resort, thanks.