arcchancellor
Moderator
Posts: 202
Joined: Sat Apr 03, 2010 12:56 pm
Location: Ankh-Morpork

Re: guidelines for subs

Wed May 20, 2015 9:33 am

I like the three dots at the end to show the sentence continues. Theoretically not necessary, but it is more clear, more explicit.
That's exactly my point.
I spoke once more with some friends yesterday and they all see it like this, therefore I will continue with it.

The three dots at the beginning of the following part are not necessary though (clarity was already made right before).
I think you're right.

With one exception: for example, a scene in which someone turns on a TV and the speaker is in mid-sentence (or something like that).
"I don't believe in God. I just believe in Billy Wilder" - Fernando Trueba

SmallBrother
Site Admin
Posts: 3724
Joined: Sun Mar 04, 2012 12:59 pm
Location: Somewhere on this globe

Re: guidelines for subs

Wed May 20, 2015 9:48 am

The three dots at the beginning of the following part are not necessary though (clarity was already made right before).
With one exception: for example, a scene in which someone turns on a TV and the speaker is in mid-sentence (or something like that).
Exactly. I mean, I do that too.
In such cases, the previous phrase (spoken by someone else) has ended (with a dot) and there will NOT be a capital as first letter, so one could 'conclude' it is a half sentence. But also here, I prefer the explicit clarity.
Nowadays a VPN is a must for everyone. A VPN allows you safe surfing and protects you against spying governments and companies.
I advise AirVPN - from € 2,75 per month. Click the below banner for more info.


Image

SmallBrother
Site Admin
Posts: 3724
Joined: Sun Mar 04, 2012 12:59 pm
Location: Somewhere on this globe

Re: guidelines for subs

Wed May 20, 2015 10:27 am

... OCR ...
First of all, I never ever OCR'ed a disk - but I understand kerremelk has developed a pretty sophisticated way to get good results.

But... the very best result of an OCR rip would be exactly the same as the original on disk. The problem is that a long, long time ago, retail subs were gooood. Nowadays that's not so much the case anymore, since things must be fast and cheap. So retail subs may compromise on many 'details', like smart line breaks, spotting and even the actual translation. The only way to get a 'perfect' result is to do a final check by reading the subs line by line, and ideally to watch the movie with those subs.

hector
Posts: 370
Joined: Wed Jan 01, 2014 12:27 pm
Location: Spain

Re: guidelines for subs

Wed May 20, 2015 11:25 am

Well, I still don't like the final three dots. But it's just my opinion. By the way, do you use "..." or "…" (U+2026)?

That's why I think it's a good idea to have configurable software (in this case the media player). You could turn dots on and off with a checkbox.

And that's why I think we need something more than the SubRip (SRT) format can offer. It would be a good thing to keep content structure and presentation separate. The beauty of SRT is its simplicity, but it does not separate these two things well. I'd like to have something more. Besides, SRT is not well defined. The <i> and <b> tags are generally supported, but what about <font>? There is no well-defined standard, and that's not good for information interchange.

Now I'm using a customised XML application (something resembling TTML). The problem now is that this format is not supported by any software. But I use XSLT to generate the SRT automatically. I guess this "final dots switch" could be done in transformation.
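
To make the "final dots switch" idea concrete, here is a minimal sketch in Python rather than XSLT. The XML layout (a `cue` element with `begin`, `end` and `continues` attributes) is invented for this example; it is neither TTML nor the format described above, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML layout, invented for this sketch (not TTML).
SAMPLE = """<subs>
  <cue begin="00:00:10,000" end="00:00:12,000" continues="yes">I was thinking that</cue>
  <cue begin="00:00:12,100" end="00:00:14,000">we could leave early.</cue>
</subs>"""


def xml_to_srt(xml_text, final_dots=True):
    """Render cues as SRT text; optionally append "..." to unfinished sentences."""
    blocks = []
    root = ET.fromstring(xml_text)
    for number, cue in enumerate(root.iter("cue"), start=1):
        text = (cue.text or "").strip()
        # The "switch": the dots are presentation, so they are added only here.
        if final_dots and cue.get("continues") == "yes":
            text += "..."
        blocks.append(f"{number}\n{cue.get('begin')} --> {cue.get('end')}\n{text}\n")
    return "\n".join(blocks)


if __name__ == "__main__":
    print(xml_to_srt(SAMPLE, final_dots=True))
```

The point of the sketch is only that the stored content says "this sentence continues", and the decision to show dots is taken at conversion time.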

Do you know any media player that supports TTML?

hector
Posts: 370
Joined: Wed Jan 01, 2014 12:27 pm
Location: Spain

Re: guidelines for subs

Fri Jun 19, 2015 6:23 pm

Guidelines... that's what I'm looking for. Now there are some things I'm wondering about, some details. All this is about low-level matters. Perhaps some day I'll start wondering about semantics and all that.

I think character encoding is not an issue any more. I decided to use UTF-8. My subtitles use two or three European languages, so I could easily choose ISO 8859-1, but I think Unicode is a good thing and should be used. Now, what about the BOM (byte order mark)? I see some files have it, many don't. It could facilitate automatic encoding detection, I guess. So, with or without BOM?
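
For what it's worth, handling both cases is not much work. A minimal Python sketch, assuming the file really is UTF-8 (the function names are made up for this example):

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """True if the raw bytes start with the UTF-8 byte order mark (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


def decode_subtitle(data: bytes) -> str:
    """Decode UTF-8 bytes; the "utf-8-sig" codec drops one leading BOM if present."""
    return data.decode("utf-8-sig")


def encode_subtitle(text: str, with_bom: bool) -> bytes:
    """Encode as UTF-8, writing a leading BOM only when asked to."""
    return text.encode("utf-8-sig" if with_bom else "utf-8")
```

So a reader can accept files with or without BOM, and the writer makes the choice explicit.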

About the famous three dots... there is a character in Unicode named "HORIZONTAL ELLIPSIS" with code point U+2026. I think it is a better option than three ASCII dots "...".
I think this question is on the same domain as keeping apart content and presentation. It's some sort of abstraction. If you write "..." it is formatting or presentation, those dots have no implicit meaning. If you write "…" (U+2026) you are saying that there are some missing words (ellipsis). It has more content information and less formatting. The drawback is that there could be some fonts that don't have this glyph and software that doesn't understand it. So, should I write ASCII ("...") or Unicode horizontal ellipsis ("…")?
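
Either way, switching between the two is mechanical, so the choice does not have to be final. A small Python sketch (function names are hypothetical; it only handles runs of exactly three dots):

```python
ELLIPSIS = "\u2026"  # HORIZONTAL ELLIPSIS


def to_unicode_ellipsis(text: str) -> str:
    """Replace each run of three ASCII dots with U+2026."""
    return text.replace("...", ELLIPSIS)


def to_ascii_dots(text: str) -> str:
    """Replace each U+2026 with three ASCII dots, for fonts or players lacking the glyph."""
    return text.replace(ELLIPSIS, "...")
```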

There's another detail that I think is more a question of personal preference. When you write a dialog:
- What time is it?
- It's 3 o'clock.
sometimes they write one space after the "-", sometimes they don't.
So, "-What" or "- What"?
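
Whichever style wins, it can be enforced automatically. A rough Python sketch, assuming it is applied to the cue text only (not to index or timing lines) and that a line starting with a single "-" is always a dialog line; the function name is made up:

```python
import re

# A single leading hyphen plus optional spaces/tabs, followed by real text.
_DIALOG_DASH = re.compile(r"^-[ \t]*(?=[^\s-])", re.MULTILINE)


def normalize_dialog_dash(text: str, space: bool = True) -> str:
    """Rewrite leading dialog dashes as "- What" (space=True) or "-What" (space=False)."""
    return _DIALOG_DASH.sub("- " if space else "-", text)
```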

But the whole big question is: should we stick to SRT? Yes, it is simple, and that's its strength, and I like its simplicity. But sometimes I feel I need something more. I'd like to be able to set subtitle position. It's something that you don't change every 10 seconds. But it's really difficult to read the subtitles when there are credit lines in the image.

Formatting, I don't think it's so important in this field. It's more about content. I'd like to store some more information. For example, if it is some kind of aural information I'd like to mark it somehow. Something more than […] (another example of content against presentation). It would be nice to keep all this information logically separated from dialog. Then you could show it in another box, for example in the upper left corner of the image. So, I think SRT is good because it adheres to the KISS principle but it lacks many features, most notably those for hearing impaired people. And then you start seeing those <i> and <b> in SRTs, without any guarantee that they are effective. And then you start wondering what can be put in an SRT: What about "<font>" or "<u>"? And the answer is that you put whatever you want and the software interprets them as it wants. Not a very good system. Most of the time what they mean with "<i>" is "off-screen". Then, why don't you say it? Why has it to be in italics?

The truth is that very few of all subs are in SSA format. I think SSA has the same problems. Perhaps it has more formatting support and is better defined, but it still has content and presentation intermixed.

I think a lot of effort has been put on XML development and many applications are migrating to XML-based formats. So, SRT is good but it's not the future.

SmallBrother
Site Admin
Posts: 3724
Joined: Sun Mar 04, 2012 12:59 pm
Location: Somewhere on this globe

Re: guidelines for subs

Thu Jun 25, 2015 6:53 pm

Who am I, but okay, my personal answers and ideas...
So, with or without BOM
I see advantages in having a BOM and in not having it. Answer: do as you like.
Just make sure the BOM is not present at the start of every line, as happened in one upload a couple of days ago. It is difficult to detect, almost or totally invisible in text or subtitle editors/viewers, but it causes trouble in VLC (and probably a lot of other players). You will get this:

Image
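
A file damaged like that can be repaired with a few lines of Python. This is only a sketch: it assumes the text has already been decoded to a string and that U+FEFF never appears on purpose inside the subtitle text. The function name is made up.

```python
BOM = "\ufeff"  # what the UTF-8 bytes EF BB BF look like after decoding


def strip_stray_boms(text: str, keep_leading: bool = False) -> str:
    """Remove every U+FEFF; optionally keep one at the very start of the file."""
    had_leading = text.startswith(BOM)
    cleaned = text.replace(BOM, "")
    return BOM + cleaned if (keep_leading and had_leading) else cleaned
```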
So, "..." or "…"
Same same, I think you already said it yourself: both have advantages and disadvantages. I would say do as you like, but personally I always use three dots, mainly because I prefer simplicity (easy, reliable) over fancy stuff, especially when the fancy stuff is not necessary. Hallelujah for smoke signals, they even work when it's windy.
So, "-What" or "- What"
Same same, do as you like. Personally I think - followed by a space is more beautiful, so this is what I always use. Some say it MUST be without space (including the very professional Dutch Hoek & Sonépouse), others say it MUST include a space. So...? I think the most important thing is consistency. Do as you like, but stick to one method throughout the whole subtitle (or even all your subtitles).
Btw, different languages have their own guidelines. In Dutch, in dialogs, only the SECOND line has a -, NOT the first line. In English both lines start with a -. Here too, I think it matters most to stick to one method.
But the whole big question is: should we stick to SRT? Yes, it is simple, and that's its strength, and I like its simplicity. But sometimes I feel I need something more. I'd like to be able to set subtitle position. It's something that you don't change every 10 seconds. But it's really difficult to read the subtitles when there are credit lines in the image.
I think: Subtitles is text and timing and SRT supports that, that's enough. I have my VLC player set up to use a nice font, nice size, nice position and enough shade to be visible on any background.
And then you start seeing those <i> and <b> in SRTs, without any guarantee that they are effective. And then you start wondering what can be put in an SRT: What about "<font>" or "<u>"? And the answer is that you put whatever you want and the software interprets them as it wants. Not a very good system. Most of the time what they mean with "<i>" is "out of picture". Then, why don't you say it? Why has it to be in italics?
I use italics for 'out of view', spoken text on tv, incidental foreign or alien words, or not at all. It involves a risk, but so be it, I think the risk is small enough to pay for the added value.
I avoid <font> and <u> and that kind of formatting, even though the risk might be as small as with <i>. Underlining decreases readability, and emphasis and meaning could (should) be achieved by writing good text. Coloring fonts could be helpful for HI subs (Zed's text is yellow, Harry's text is green), but given the risk, maybe this is the wrong way to make HI subs.

hector
Posts: 370
Joined: Wed Jan 01, 2014 12:27 pm
Location: Spain

Re: guidelines for subs

Fri Jun 26, 2015 12:41 am

I know VLC supports "<font color="*">" but MPlayer does not. At least the versions I know.
In Dutch, in dialogs, only the SECOND line has a -, NOT the first line
Yes, the first time I saw this it looked very strange to me. I always use "-" in both lines.

hector
Posts: 370
Joined: Wed Jan 01, 2014 12:27 pm
Location: Spain

Re: guidelines for subs

Fri Jun 26, 2015 10:55 am

I was expecting something like that: no clear standard way. There's another low-level issue concerning end of line.
I always use DOS style, that is, "\r\n" (ASCII codes 13 10). I once read (I think it was on Wikipedia) that it was more standard: most software understands "\n", but "\r\n" is more generally accepted. And I see that most (notice I say MOST and not ALL) SRT files use this EOL convention.

My system is UNIX-like, so all my text files use the "\n" end-of-line. I do the conversion manually before uploading, so it is possible that sometime I forget. I hope not.
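
The manual step could be scripted so that it cannot be forgotten. A minimal Python sketch (the function name is made up), which also repairs files with mixed line endings:

```python
import re


def to_crlf(text: str) -> str:
    """Convert every line ending ("\r\n", lone "\r" or lone "\n") to "\r\n"."""
    return re.sub(r"\r\n|\r|\n", "\r\n", text)
```

When saving the result from Python, it is safest to write bytes, for example `path.write_bytes(to_crlf(text).encode("utf-8"))`, so that the platform does not translate the line endings a second time.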

SmallBrother
Site Admin
Posts: 3724
Joined: Sun Mar 04, 2012 12:59 pm
Location: Somewhere on this globe

Re: guidelines for subs

Fri Jun 26, 2015 11:20 am

There's another low-level issue concerning end of line.
I think in general "\r\n" is better. I have seen subtitle files that, when opened in Windows Notepad, show no line breaks at all (probably only "\n" was used), and files that show double line feeds in the preview on the OpenSubtitles web site (I am speculating that this was a conversion mistake that ended up as \r\n\r\n or \n\n or so). However, I think (but I am not sure) that both would work okay anyway when actually used as subtitles.

biman
Posts: 1
Joined: Mon Jun 29, 2015 10:46 am
Contact: Website

Re: guidelines for subs

Mon Jun 29, 2015 10:56 am

Which software is best for subtitle work?
my bengali subtitle site is [link removed by SmallBrother]

SmallBrother
Site Admin
Posts: 3724
Joined: Sun Mar 04, 2012 12:59 pm
Location: Somewhere on this globe

Re: guidelines for subs

Mon Jun 29, 2015 11:37 am

Which software is best for subtitle work?
Slightly off-topic ;-) but have you tried looking around on this forum?
For example, the first topic in the general section:
USEFUL SUBTITLE SOFTWARE
my bengali subtitle site is [link removed by SmallBrother]
Please do not post links to other subtitle websites.

hector
Posts: 370
Joined: Wed Jan 01, 2014 12:27 pm
Location: Spain

Re: guidelines for subs

Sat Jul 25, 2015 1:06 pm

Hi!
How is the summer going? For me it means more time to do what I like: subtitles :-)

I've been developing an XML application (don't misunderstand the term, it is not a program) to store subtitles. You can see it here: viewtopic.php?f=8&t=15203. I didn't get much feedback :-(

Well, the question is related to sub style. Earlier in this thread somebody talked about an algorithm to break lines, that is, not breaking them by hand but letting the computer do it based on some rules. And I was wondering whether the same could be applied to time breaks. Take dialogs, for instance. You break the whole script into small time slices. But when these are too short, it is better to group them and show them in the same subtitle:
--------------------------------------------
1
00:00:10,000 --> 00:00:11,000
Did you get the keys?

2
00:00:11,050 --> 00:00:11,400
Yes.
-------------------------------------------
becomes:
-------------------------------------------
1
00:00:10,000 --> 00:00:11,400
- Did you get the keys?
- Yes.
-------------------------------------------

The question is when they should be grouped and when not. What is the minimum reasonable duration for a line? Could the break be chosen algorithmically? There are clear cases but computers are not very good with fuzzy logic. You must set some fixed limit. What could this be? One must also take into account the time between the utterances.
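
Just to have something to argue about, here is one possible rule as a Python sketch. The two limits (1000 ms minimum duration, 200 ms maximum gap) are arbitrary assumptions, not values from any guideline, and the `Cue` class and function name are made up for this example. A cue is merged into the previous one only if it is too short, starts soon after the previous one ends, and the previous one is still a single line.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    start_ms: int
    end_ms: int
    text: str


MIN_DURATION_MS = 1000  # assumed limit: shorter cues are candidates for grouping
MAX_GAP_MS = 200        # assumed limit: largest pause that still counts as one exchange


def merge_short_cues(cues, min_duration_ms=MIN_DURATION_MS, max_gap_ms=MAX_GAP_MS):
    """Group a too-short cue with the cue before it, as a two-line dialog."""
    merged = []
    for cue in cues:
        prev = merged[-1] if merged else None
        too_short = cue.end_ms - cue.start_ms < min_duration_ms
        close = prev is not None and cue.start_ms - prev.end_ms <= max_gap_ms
        single_line = prev is not None and "\n" not in prev.text
        if too_short and close and single_line:
            # Keep the first start time, take the second end time, add the dashes.
            merged[-1] = Cue(prev.start_ms, cue.end_ms, f"- {prev.text}\n- {cue.text}")
        else:
            merged.append(cue)
    return merged


if __name__ == "__main__":
    cues = [Cue(10_000, 11_000, "Did you get the keys?"), Cue(11_050, 11_400, "Yes.")]
    print(merge_short_cues(cues))
```

With the example above, the 350 ms "Yes." follows the question after 50 ms, so the two are grouped into one cue from 00:00:10,000 to 00:00:11,400. The single-line check keeps a grouped cue from growing beyond two lines.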

hector
Posts: 370
Joined: Wed Jan 01, 2014 12:27 pm
Location: Spain

Re: guidelines for subs

Sat Jul 25, 2015 1:45 pm

I use italics for 'out of view', spoken text on tv, incidental foreign or alien words, or not at all. It involves a risk, but so be it, I think the risk is small enough to pay for the added value.
So, you are admitting formatting and styles are useful ;-)

I've been reading about SSA and ASS and I think it's promising. I agree that text and timings is enough. But I think styles are very useful for hearing impaired. Sometimes I do an experiment: I watch the film only with subtitles, without sound. It is not a very well-controlled experiment because I unconsciously remember some information, since it's not the first time I have watched the film. But you can see, more or less, what information a deaf person is lacking (who the hell said that? I'm lost).

You could argue that SSA has a lot of redundant information and you'd be right. I think it's a bit of a waste of space. If you look at some SSA files, you can find the same parameters (0000,0000,0000) repeated in each and every line. But it is only about 1.5 to 2 times the size of an SRT. That's not so much. And you can always compress it.

arcchancellor
Moderator
Posts: 202
Joined: Sat Apr 03, 2010 12:56 pm
Location: Ankh-Morpork

Re: guidelines for subs

Sun Jul 26, 2015 8:52 am

My system is UNIX-like, so all my text files use the "\n" end-of-line. I do the conversion manually before uploading, so it is possible that sometime I forget. I hope not.
If you use the SubtitleEditor, you can choose Save As and, in the dialog that opens, select Windows instead of Unix under New Line. Problem solved. And the subs will show up correctly, whether on Windows or UNIX.
I know VLC supports "<font color="*">" but MPlayer does not. At least the versions I know.
Use SMPlayer instead of MPlayer. SMPlayer is a free, open source media player that uses the playback engine of MPlayer. It runs under Unix and you find it here:
http://www.fosshub.com/SMPlayer.html

Under Preferences/Options/Subtitles you can choose the font, fontsize, fontcolor etc.
I've been reading about SSA and ASS and I think it's promising. I agree that text and timings is enough. But I think styles are very useful for hearing impaired.
Some years ago I experimented for a while with ASS and SSA, because I was hardcoding nice-looking subs for friends with standalone players that have little support for soft subtitles. See here: viewtopic.php?p=26190#p26190
These formats offer many possibilities and can be styled in any way, but many players have trouble with them. So I hardcoded the subs for my friends.
But if one wants to reach a lot of people with his work, then I say: Keep it simple = srt.
(And by the way: hardcoding subs is not good. It should be avoided.)
I think it's a bit of a waste of space.
Size doesn't matter. :mrgreen:

Furthermore, I agree 100% with SmallBrother's statements about italics, underlines, bold fonts etc.
"I don't believe in God. I just believe in Billy Wilder" - Fernando Trueba

hector
Posts: 370
Joined: Wed Jan 01, 2014 12:27 pm
Location: Spain

Re: guidelines for subs

Sun Jul 26, 2015 12:12 pm

But if one wants to reach a lot of people with his work, then I say: Keep it simple = srt.
Yes, somebody said "KISSes for me, save all your KISSes for me"? That is: "Keep It Simple with SRT for me" :-) This is a recurring issue in IT. Sometimes you get seduced by all the bells and whistles but you lose portability. I'll stick to SRT. But I still think hearing impaired people can greatly benefit from the features SSA provides. And for the subtitle author it means much more fun (and work).
Size doesn't matter
Even when you have 4,000,000 files and more to come? I disagree. Yes, I know nowadays storage is very cheap. I still remember those 5 1/4'' floppies with 360 kbytes and those 100 Mbytes hard drives. It was not so long ago. That's what makes us think size doesn't matter. But I think it always does. I like them small. :-D
And by the way: hardcoding subs is not good. It should be avoided.
I totally agree. Then OS wouldn't exist. But I would go further: "raster based subtitles are not good". I think the DVD developers' choice was a mistake. Well, I know they did it on purpose but it's "defective by design".
Last edited by hector on Sun Jul 26, 2015 12:26 pm, edited 2 times in total.
