Zeb-SLP
Posts: 1
Joined: Tue Jun 30, 2020 3:51 pm

What constitutes a "bad subtitle file"?

Tue Jun 30, 2020 4:01 pm

I've been going through my movie collection and downloading the subtitles for the versions of the films I've got.

Because I'm encoding the subtitles into the video stream as an optional extra, I really want to make sure the subtitle file is correct.

On my travels I've noticed a LOT of subtitles have a random mix-up of capital I and lower-case l. In some fonts they look the same, but change the font (e.g. to one with serifs) and the subtitles now look trashy. When this is the case, I've also noticed it's usually accompanied by random missing spaces (words stuck to each other) or random spaces between numbers, so 007 may appear as 00 7.

In some instances where no faultless subtitle file can be found, I've corrected all of the mistakes and uploaded a clean and correct copy.

Do I report these erroneous subtitles as bad?

One such example is "Star Trek V - The Final Frontier (1989)" - it took me absolutely AGES to find the correct one, as most of them had the I and l swapped around all over the place. This became very obvious when searching the text for "Nimbus III", because it actually appears as "Nimbus Ill" (you will need to copy and paste this into a text editor with a different font to see exactly what I mean).
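
For anyone who wants to scan a file for these symptoms before reading it line by line, here is a minimal sketch in Python. The pattern names and the `flag_line` helper are made up for illustration, and the heuristics are only assumptions about what OCR damage tends to look like: they will miss some errors and flag some legitimate text (the real word "Ill", for instance), so treat the output as a list of places to check by eye, not as a verdict.

```python
import re

# Hypothetical heuristics for common OCR damage in subtitle text.
PATTERNS = {
    # A capital I sandwiched between lowercase letters, e.g. "wiIl".
    "I inside lowercase word": re.compile(r"[a-z]I[a-z]"),
    # A token made only of I and l that contains both, e.g. "Ill" for "III".
    "roman numeral look-alike": re.compile(r"\b(?=[Il]*l)(?=[Il]*I)[Il]{2,}\b"),
    # Two digit groups separated by a single space, e.g. "00 7".
    "split number": re.compile(r"\b\d+ \d+\b"),
}


def flag_line(text):
    """Return the names of all heuristics that match the given subtitle text."""
    return [name for name, pattern in PATTERNS.items() if pattern.search(text)]


if __name__ == "__main__":
    for sample in ["Nimbus Ill", "Nimbus III", "00 7 reporting", "I wiIl go"]:
        print(repr(sample), flag_line(sample))
```

Only pass the text lines of each subtitle to `flag_line`, not the timestamp lines, otherwise the number check has more to trip over.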

scooby007
Site Admin
Posts: 584
Joined: Thu Mar 05, 2009 10:49 pm
Location: Scandalous

Re: What constitutes a "bad subtitle file"?

Wed Jul 01, 2020 9:34 pm

There are two ways of looking at this: the "generally good" way and the "technically freakish" way.
I'll first explore generally good subtitles and what that means to me.

GENERALLY GOOD

1: Not a machine translation (google translator).
2: Verbatim. Needs to be word for word. Not different words with a similar meaning.
3: Grammatically correct, or very close to it. Like having the correct comma placement, full stops, capital letters at the start of sentences, etc.
4: No more than 3 lines and around 42-45 characters per line.
5: Timing of on-screen text synced to the spoken dialogue.
6: Spellchecked.
7: Hyphens to separate the speech of two characters.
8: Correct spacing for readability.

Example below:

1 (MEH)

00:00:11,001 --> 00:00:13,134
-You see where you're going?
-Mm-hmm.

Should be:

2 (GOOD)

00:00:11,001 --> 00:00:13,134
- You see where you're going?
- Mm-hmm.

Notice the space between the hyphen and the first character of the word.
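
If you'd rather not fix these by hand, the change from example 1 to example 2 can be sketched in a few lines of Python. This is only an illustration: `space_dialogue_hyphens` is a made-up helper name, and it assumes the cue text is passed in as one string with `\n` between lines. It only touches a single hyphen at the very start of a line, so hyphens inside words like "Mm-hmm" are left alone.

```python
import re

# A single leading hyphen (not "--"), optional whitespace, then real text.
DIALOGUE_HYPHEN = re.compile(r"^-(?!-)\s*(?=\S)")


def space_dialogue_hyphens(cue_text):
    """Put exactly one space after a dialogue hyphen at the start of each line."""
    fixed_lines = [DIALOGUE_HYPHEN.sub("- ", line) for line in cue_text.split("\n")]
    return "\n".join(fixed_lines)


if __name__ == "__main__":
    print(space_dialogue_hyphens("-You see where you're going?\n-Mm-hmm."))
```

Running it on text that is already spaced correctly changes nothing, so it is safe to apply to a whole file more than once.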

I randomly checked one of your subtitles, and it had 3-liners in there. I personally hate that and find it irritating on screen. Many users do, unless we're desperate for a subtitle.

Mix-ups of I (capital i) and l (lower-case L) are OCR errors, and as long as they're not noticeable, I don't mind them and wouldn't personally mark the subtitle as bad. I'd probably give it a 7 out of 10 if it met the other criteria. I would definitely want to correct them if I wanted to add the subtitle to my personal collection.

In a subtitle editor, when you come across a word that's wrong and you correct it, it usually offers to automatically correct every other occurrence of the same mistake in the file.
Fixing 0 07 to 007 will also be done automatically by a subtitle editor. Just don't forget to upload your work with the correct release name for the sub-file on the site. :D


TECHNICALLY FREAKISHLY GOOD

Often forgotten are the technical aspects of a subtitle, such as line length, line breaks and minimum/maximum duration. Most forgotten, least 'visible', but maybe most important is the CPS ratio (Characters Per Second). Most subtitle software has an error check feature, but I would recommend Subtitle Edit:

- Check for errors: Tools > Fix common errors... (Ctrl+Shift+F)
- Checking/changing the error detection settings: Options > Settings > TAB General

Values may vary slightly, depending on language and personal preferences. But as a rough guideline, you should always stay within these values:

- Maximum line length: 45 characters
- Minimum duration: 1200 ms
- Maximum duration: 6000 ms
- Maximum CPS ratio: 24 CPS
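
To make the four limits concrete, here is a rough Python sketch that checks a single subtitle against them. The `Cue` class and `check_cue` function are invented for this example, and it is not how Subtitle Edit does it internally. In particular, counting every character including spaces (but not line breaks) for the CPS ratio is an assumption; editors differ on what they count.

```python
from dataclasses import dataclass

# Limits taken from the list above.
MAX_LINE_LENGTH = 45     # characters per line
MIN_DURATION_MS = 1200
MAX_DURATION_MS = 6000
MAX_CPS = 24             # characters per second


@dataclass
class Cue:
    start_ms: int
    end_ms: int
    text: str  # lines separated by "\n"


def check_cue(cue):
    """Return a list of problems found in one cue (empty if it passes)."""
    problems = []
    lines = cue.text.split("\n")
    duration_ms = cue.end_ms - cue.start_ms

    if any(len(line) > MAX_LINE_LENGTH for line in lines):
        problems.append("line too long")
    if duration_ms < MIN_DURATION_MS:
        problems.append("too short")
    if duration_ms > MAX_DURATION_MS:
        problems.append("too long")

    # Assumption: CPS counts all characters, spaces included, line breaks excluded.
    char_count = sum(len(line) for line in lines)
    if duration_ms > 0 and char_count / (duration_ms / 1000) > MAX_CPS:
        problems.append("CPS too high")
    return problems


if __name__ == "__main__":
    good = Cue(11001, 13134, "- You see where you're going?\n- Mm-hmm.")
    print(check_cue(good))
```

The (GOOD) example from earlier in the thread runs for 2133 ms and has 38 characters, which is roughly 17.8 CPS, so it passes all four checks.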

Synchronisation with the video file should also be checked. In Subtitle Edit, add the video pane (Video > Show/hide video) and visualize the synchronisation by adding the waveform pane (Video > Show/hide waveform). As a rough guideline (don't forget the CPS ratio), use these values for synchronisation:

- In-cue: -150 ms (fixed)
- Out-cue: +400 ms (flexible, depending on space, CPS, cam changes etc.)
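
As a worked example of the arithmetic, here is a small Python sketch. It assumes the two offsets are measured from the start and end of the spoken line as seen in the waveform, which is my reading of the values above and not something spelled out there. The `cue_times` function and its parameter names are made up for illustration, and the "flexible" out-cue is modelled in just one simple way: it is cut short if it would run into the in-cue of the next line.

```python
IN_CUE_OFFSET_MS = -150   # fixed: show the subtitle 150 ms before speech starts
OUT_CUE_OFFSET_MS = 400   # flexible: keep it up to 400 ms after speech ends


def cue_times(speech_start_ms, speech_end_ms, next_speech_start_ms=None):
    """Return (in_cue_ms, out_cue_ms) for one spoken line."""
    in_cue = max(0, speech_start_ms + IN_CUE_OFFSET_MS)
    out_cue = speech_end_ms + OUT_CUE_OFFSET_MS
    if next_speech_start_ms is not None:
        # Don't overlap the in-cue of the following subtitle.
        out_cue = min(out_cue, next_speech_start_ms + IN_CUE_OFFSET_MS)
    return in_cue, out_cue


if __name__ == "__main__":
    print(cue_times(11151, 12734))
```

Shortening the out-cue also raises the CPS ratio of that subtitle, so it is worth re-checking CPS after any timing change.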

The above technically freakish methods are general official retail standards, or close to them.
Info provided by SmallBrother for a user to obtain "Trusted" status on the site.

More info about Trusted/SubTranslator users here: viewtopic.php?f=1&t=14224

Programs that you can use to make your life easier for subtitle correction here: viewtopic.php?f=1&t=2881#p9206

Hope that helps.

SmallBrother
Site Admin
Posts: 3167
Joined: Sun Mar 04, 2012 12:59 pm
Location: Somewhere on this globe

Re: What constitutes a "bad subtitle file"?

Fri Jul 03, 2020 7:09 am

Keep in mind that subtitling guidelines differ to some extent for each language.
For example, in Dutch, compared to what Scooby007 mentioned for English subs:
- Three lines should not be done ever.
- A space after the dialogue hyphen is not mandatory. On the contrary, retail subs usually DON'T use a space.
- Only the SECOND line of a dialogue is hyphened.
- Subtitling of sounds etc. (like "Mm-hmm") is not needed (and considered very amateurish).

As far as marking/reporting subs as "bad" goes, this may vary per language section, depending on the admin(s). For the Dutch section, subs should be marked as bad only if they are useless. "Useless" in the sense of very bad sync, countless spelling errors, twisted machine translations, syntax or similar errors possibly causing a crash, missing parts, etc. Only 'some errors' don't make a subtitle useless.
