I am a teacher and have students who speak many different languages. The most common ones are Chinese, Spanish and Portuguese, but we have other folks speaking other languages as well.
I wish to translate all my notes, lecture subtitles, and topic-exercise documents to other languages. Moreover every year, I have to update my teaching material and include new stuff. So, muddling though everything manually is not much of an option as it might take up to two months just for this task.
Are there nice self hosted and libre/open source solutions out there for this task?
You can self-host libretranslate: https://libretranslate.com/
Looks like the engine behind it is https://opennmt.net/
Which can use a Tensor Flow backend, which can potentially be accelerated by a rather cheap Coral TPU. Neat!
Yes, just grab any recent LLM like Mistral-7B and ask it to translate for you. A local client is here https://github.com/LostRuins/koboldcpp but you might need a good GPU to get quick answers.
Alternatively use https://lite.koboldai.net to use someone else’s computer.
How trustworthy are LLM translations? Normal machine translation may lose context but I imagine LLM could make up shit?
All translations are LLM translations by this point I believe.
Translator here. They do make up stuff or omit stuff they don’t like. Machine translation is fine for tourists or to translate a ikea manual in the wrong language. If there are stakes, risky. They got good enough to make sentences that look right so it can be tricky to spot the errors if you don’t pay attention.
Numbers are typical errors. Sometimes it’s there but the number has changed. Sometimes it’s not there at all. Oh and if you have currencies a translators knows a document from the UK in pounds that is adapted for France will have to be converted in euros. Machines don’t.
Generally speaking when a client wants to use machine translation, it costs them more money in the end because of the extra time needed to correct everything to a high human grade standard.
From my experience: They’re pretty good
A simpler answer might be llamafile if you’re using Mac or Linux.
If you’re on windows you’re limited to some smaller LLMs without some work. In my experience the smaller LLMs are still pretty good as chat bots so they might translate well.
Please add a disclaimer to the documents stating it was machine translated. Machine translation can get it wrong or take liberties, make up stuff. Please inform your readers so they can be on the lookout.
Keep in mind the translated stuff by machine translation won’t be 100% what you say in your native language to other students. Be careful not to spread wrong information or knowledge.
deleted by creator