therepo90
Posts: 6
Joined: Wed Apr 24, 2024 8:11 pm

Subtitles translation with AI

Wed Apr 24, 2024 8:18 pm

Hey, I just created an AI-powered subtitle translation tool. Feel free to use it. I hope it doesn't break the forum rules (it's free).
https://translatesubtitles.org

oss
Site Admin
Posts: 5918
Joined: Sat Feb 25, 2006 11:26 pm

Re: Subtitles translation with AI

Fri Apr 26, 2024 6:29 am

hi

You could describe the project: how it works, whether the whole file needs to be uploaded, what happens if I have a 1 GB movie, which AI models are used, etc.

therepo90
Posts: 6
Joined: Wed Apr 24, 2024 8:11 pm

Re: Subtitles translation with AI

Tue May 21, 2024 8:51 pm

Well, it uses a custom compression method plus a publicly available (yet paid) AI model.
Everything is on the page; you can upload whatever size you want. For smaller files it's free if quota is available.

OhItsStefan
Posts: 12
Joined: Sat Jan 06, 2024 7:36 pm

Re: Subtitles translation with AI

Wed May 22, 2024 11:35 am

Just tried it out, it gets fairly accurate results. That being said, it does not produce subtitles that are of acceptable quality, even if the source file is.

I found that it's mostly the things all AI translations suffer from:
- Lack of consistent CPS
- Translating per sentence (ignoring the line breaks)
- Breaking the line in odd places (3 words top, 6 long words on the second line?)
- More characters on one line than the recommended 43
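To make the line-break points in the list above concrete, here is a minimal Python sketch of a balanced two-line break. The function name `break_balanced` and the default limit of 42 characters are assumptions for illustration (the thread mentions both 42 and 43). It only balances line lengths; it does not know about grammar, so it can still break after an article or in the middle of a name.

```python
def break_balanced(text, max_chars=42):
    """Split text into at most two lines, picking the space that gives the most even lengths."""
    text = " ".join(text.split())  # collapse existing line breaks and extra spaces
    if len(text) <= max_chars:
        return [text]
    best = None
    for i, ch in enumerate(text):
        if ch != " ":
            continue
        top, bottom = text[:i], text[i + 1:]
        if len(top) > max_chars or len(bottom) > max_chars:
            continue
        diff = abs(len(top) - len(bottom))
        if best is None or diff < best[0]:
            best = (diff, [top, bottom])
    # None means the text does not fit in two lines; the cue would need
    # to be shortened or split into two cues instead.
    return best[1] if best else None


lines = break_balanced("Why should I care about people who have always looked down on us?")
print("\n".join(lines))
```

A real tool would probably add a penalty for breaking inside a phrase, but even this simple rule avoids the "3 words on top, 6 long words below" pattern.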

This is my main complaint with AI translations and why I think they should not be allowed on the site. It's fine for personal use but uploading it online for others to use just creates more issues since they're not of sufficient quality.

therepo90
Posts: 6
Joined: Wed Apr 24, 2024 8:11 pm

Re: Subtitles translation with AI

Thu May 23, 2024 2:58 pm

Thank you for testing that.
Yeah, it's in the testing phase right now. It depends on the source quality of the subtitles as well.

Can you paste the subtitle file so I can work on the line breaks some more?
What do you mean by consistent CPS?

OhItsStefan
Posts: 12
Joined: Sat Jan 06, 2024 7:36 pm

Re: Subtitles translation with AI

Thu Jun 06, 2024 2:28 pm

Yeah, I bet the original file does have quite the effect on the final result. As for the ones I've tested, they are mainly my own subtitles (Attack on Titan S02E01, Attack on Titan S02E02), that way I had something familiar to compare it to, because I made them from scratch. As for the quality of those, they are consistent with the guidelines for the subtitles on this website, max CPS of 18, consistent 150ms gap, max 42 characters per line, etc.
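As a rough illustration of those guidelines, here is a minimal Python sketch that checks a single cue against the three numbers mentioned above (max CPS of 18, 150 ms gap, max 42 characters per line). The names `to_ms` and `check_cue` are hypothetical, and counting spaces towards CPS is an assumption; the site's own checker may count differently.

```python
import re

MAX_CPS = 18          # characters per second, as quoted in the post
MAX_LINE_CHARS = 42   # characters per line, as quoted in the post
MIN_GAP_MS = 150      # gap between consecutive cues, as quoted in the post

TIME_RE = re.compile(r"(\d+):(\d+):(\d+),(\d+)")


def to_ms(timestamp):
    """Convert an SRT timestamp such as '00:01:12,758' to milliseconds."""
    h, m, s, ms = map(int, TIME_RE.fullmatch(timestamp).groups())
    return ((h * 60 + m) * 60 + s) * 1000 + ms


def check_cue(start, end, lines, prev_end=None):
    """Return a list of guideline problems for one cue (empty if none)."""
    problems = []
    duration_s = (to_ms(end) - to_ms(start)) / 1000
    # Assumption: every character counts, including spaces, but not the line break.
    chars = sum(len(line) for line in lines)
    if duration_s <= 0:
        problems.append("non-positive duration")
    elif chars / duration_s > MAX_CPS:
        problems.append(f"CPS {chars / duration_s:.1f} exceeds {MAX_CPS}")
    for line in lines:
        if len(line) > MAX_LINE_CHARS:
            problems.append(f"line has {len(line)} characters")
    if prev_end is not None and to_ms(start) - to_ms(prev_end) < MIN_GAP_MS:
        problems.append("gap to previous cue is too short")
    return problems


print(check_cue("00:23:44,380", "00:23:48,562",
                ["That's what Conny is thinking to reduce the shock",
                 "of what he has seen."]))
```

Running a translated file through a check like this would flag the technical issues separately from the translation quality itself.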

Still, despite the source being good quality, some issues persist, namely those I've mentioned, especially the line breaking. Here are a couple of examples from the subtitle I used for reference:
11
00:01:12,758 --> 00:01:17,185
managed to capture one of these creatures
captured alive.

59
00:06:21,975 --> 00:06:27,147
They would probably
know more than you know anyway.

103
00:09:53,619 --> 00:09:57,744
Why should I care
about people who have always looked down on us?

231
00:23:44,380 --> 00:23:48,562
That's what Conny is thinking to reduce the shock
of what he has seen.

To a minor extent, the translations are also lacking, likely due to the line breaks.

3
00:00:25,839 --> 00:00:29,662
People got used to
to living like cattle.

22
00:02:20,973 --> 00:02:23,710
Oh no, this doesn't mean anyway....

30
00:04:28,126 --> 00:04:31,354
Apologies, but what are
we all think of this?

49
00:05:48,014 --> 00:05:50,205
Is straight down right?

12
00:00:58,051 --> 00:01:01,634
Their origins and how they got here
have come remain a mystery.

122
00:11:19,407 --> 00:11:24,145
The heritage of your ancestors
Swap for civilization?

209
00:20:29,759 --> 00:20:31,423
Conny, drive softer.

236
00:24:05,726 --> 00:24:10,117
Next installment:
"Southwest"

These are just a few examples I managed to pick out in a short amount of time. Like I said before, AI translations are cool in theory, but in practice they just can't replace the manual process because they contain too many mistakes, both technical and grammatical.

therepo90
Posts: 6
Joined: Wed Apr 24, 2024 8:11 pm

Re: Subtitles translation with AI

Mon Jun 10, 2024 4:26 pm

Thank you very much, I'll work on that. I already have an idea for how to make it much better so the problems you mentioned disappear.

therepo90
Posts: 6
Joined: Wed Apr 24, 2024 8:11 pm

Re: Subtitles translation with AI

Wed Jun 12, 2024 12:05 pm


Regarding this:

11
00:01:12,758 --> 00:01:17,185
managed to capture one of these creatures
captured alive.

59
00:06:21,975 --> 00:06:27,147
They would probably
know more than you know anyway.

103
00:09:53,619 --> 00:09:57,744
Why should I care
about people who have always looked down on us?

231
00:23:44,380 --> 00:23:48,562
That's what Conny is thinking to reduce the shock
of what he has seen.

I assume the translation is good, so it's just a matter of SRT consistency in terms of timing for better readability, etc. So you'd expect e.g. 11 to be:

11
00:01:12,758 --> 00:01:15,000
managed to capture one of these

12
00:01:15,150 --> 00:01:17,185
creatures captured alive.

Is that right? This is a minor issue, isn't it, since the same text appears for the same screen time?
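A minimal Python sketch of that kind of split, assuming the screen time is divided in proportion to the text length and a 150 ms gap is kept between the two new cues. The helper names `to_ms`, `to_ts` and `split_cue` are made up for illustration and are not part of the tool.

```python
import re

TIME_RE = re.compile(r"(\d+):(\d+):(\d+),(\d+)")


def to_ms(timestamp):
    """Convert an SRT timestamp such as '00:01:12,758' to milliseconds."""
    h, m, s, ms = map(int, TIME_RE.fullmatch(timestamp).groups())
    return ((h * 60 + m) * 60 + s) * 1000 + ms


def to_ts(total_ms):
    """Convert milliseconds back to an SRT timestamp."""
    h, rem = divmod(total_ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02}:{m:02}:{s:02},{ms:03}"


def split_cue(start, end, first_text, second_text, gap_ms=150):
    """Split one cue into two, sharing the time in proportion to text length."""
    start_ms, end_ms = to_ms(start), to_ms(end)
    available = end_ms - start_ms - gap_ms
    if available <= 0:
        raise ValueError("cue is too short to split")
    share = len(first_text) / (len(first_text) + len(second_text))
    first_end = start_ms + round(available * share)
    return [
        (start, to_ts(first_end), first_text),
        (to_ts(first_end + gap_ms), end, second_text),
    ]


for cue in split_cue("00:01:12,758", "00:01:17,185",
                     "managed to capture one of these",
                     "creatures captured alive."):
    print(cue)
```

Note that each half is on screen for a shorter time than the original cue, so the CPS of both new cues would still need checking.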

Another thing:
Can you elaborate on the ones marked '?' below? I'm not sure what the issue is there.

3 - double 'to'
22 - ?
30 - ?
49 - ?
12 - ?
122 - bad translation because the 2 lines are the same sentence
209 - ?
236 - ?

I'd love to hear your response, I'll add you to some help credits if you wish.

OhItsStefan
Posts: 12
Joined: Sat Jan 06, 2024 7:36 pm

Re: Subtitles translation with AI

Wed Jun 12, 2024 6:39 pm

The first column of examples was for the inconsistent line breaks. Translations in those examples are good enough, except for 11, where it mentions both "capture" and "captured" in the same sentence. Even if you remove the line break, the sentence makes no sense. While I agree that you could solve inconsistent line breaks in the way you're suggesting, I prefer keeping them together. Short display times make the subtitle flash; combining as much text as possible in a sentence with a longer display time is far easier on the eyes and thus ups the likelihood of the viewer comprehending what is said.

I should've elaborated on my examples in the initial post, that's on me. Here they are:
  • 3 - 'to' twice.
  • 22 - A good translation should be closer to "Oh no, this doesn't mean...?". It messed it up because the original said "Oh nee, dit betekent toch niet…", likely taking 'toch' and translating it without properly analysing the context it's used in.
  • 30 - The right translation should be "Excuse me, what are we supposed to think of all this?" or "Apologies, what are we supposed to make of all this?". Again, failing to translate certain words depending on the context. In this sentence, the word "all" refers to the whole situation, not who or how many people should think what about the situation (if that makes sense). On top of that, the line break has also messed up the order of the sentence, making it even harder to decipher what it was trying to convey.
  • 49 - "Right" in this sentence is a wrong translation of the word "Goed", which again, depending on the context, can have a different meaning. In this case, swapping "right" with "okay" or even the very direct translation, "good", would be more accurate.
  • 12 - "How they got here" and "have come" cover the same meaning and are basically duplicates. The correct way to translate it would be "Their origins and how they got here remain a mystery."
  • 122 - It's both a wonky translation in English and does not cover the meaning of the original sentence. In this sentence, someone asks whether the other person would sacrifice or swap the way of life and the home of their ancestors for civilisation. It's preceded by the question, "Do you want to take that step with me?". Based on the context of the question, a correct translation would be "To exchange your ancestral heritage for civilisation?"
  • 209 - This is a very direct translation of "Conny, rijd zachter." which would be better translated as "Conny, slow down."
  • 236 - "Installment" is a very odd choice for translating the word "Aflevering", which would be closer to "episode".
It seems the translation process does not take context into account when choosing certain words. On top of that, some languages just contain words or combinations of words that need more characters once translated in order to make sense. What can be said in 5 words in English could take 10 in Dutch, and vice versa. In my opinion, a manual process is required for this because AI models simply lack the ability to make those advanced decisions in regard to translations. This would require some major developments in the AI models that are being used or a manual check to ensure the quality is up to par, especially when it's a paid service.

therepo90
Posts: 6
Joined: Wed Apr 24, 2024 8:11 pm

Re: Subtitles translation with AI

Wed Jun 12, 2024 7:47 pm

Thank you very much. Still, it's possible for AI to understand the context and be aware of the properties of different languages. It just requires a bit more work and more such cases to check. I'll keep you posted when I release an update.
Also: the tool is free if there is free quota and the subtitle file is not too big.
