oss
Site Admin
Posts: 5890
Joined: Sat Feb 25, 2006 11:26 pm
Contact: Website

Re: Problems with the addition of new lines in the srt files

Tue Aug 07, 2012 5:06 pm

I understand. I will add support for UTF-16 subtitles to the to-do list.

jcdr
Posts: 540
Joined: Sun Apr 08, 2012 9:49 am

Re: Problems with the addition of new lines in the srt files

Tue Aug 07, 2012 10:37 pm

Very good. That leaves the question of adding/replacing lines in the file.
Since inserting a line at the very beginning of the file seems to be a problem, maybe for now it would be enough to add just one, as the last line?

Anyway, I personally think that this does more harm than good to OS. Many users were put off from allsubs because of this. Like many others, the first thing I always do is remove "alien" lines from the sub file. At least if the line is added only at the end, it won't be the first thing you see and it will be less obtrusive.

arcchancellor
Moderator
Posts: 202
Joined: Sat Apr 03, 2010 12:56 pm
Location: Ankh-Morpork

Re: Problems with the addition of new lines in the srt files

Wed Aug 08, 2012 6:50 am

Very good. That leaves the question of adding/replacing lines in the file.
Since inserting a line at the very beginning of the file seems to be a problem, maybe for now it would be enough to add just one, as the last line?
I agree.
Many of my friends have told me that they are annoyed by this first item and delete it right away, while the last item does not bother them; they see it as a credit and a thank-you to OS.
Honestly, I don't know anybody who likes this first inserted item.
This first item is a way of permanently annoying users, and that's not good.
"I don't believe in God. I just believe in Billy Wilder" - Fernando Trueba

srtpal
Posts: 59
Joined: Sun Jun 21, 2009 5:28 pm

Re: Problems with the addition of new lines in the srt files

Wed Aug 08, 2012 9:56 am

Oss, I too agree 200% on the need to accept UTF-16 for OpenSubtitles. After all, it is a variable length code so it should not take a lot more space, if this is the reason for not accepting it ?
While I agree that UTF-16 is desirable, I have to say that it is not a variable length code. UTF-16 is the same as UCS-2. It uses exactly two bytes for each character.

UTF-8 is the one with a variable length. Because of its variable length, most Roman-based alphabets produce a considerably smaller file in UTF-8 than in UTF-16. On the other hand, non-Roman characters tend to produce a bloated UTF-8 file and use up less space in UTF-16. For example Chinese characters will convert into about 4 bytes per character in UTF-8 but, as all others, only 2 bytes in UTF-16.

So, both have their advantages and disadvantages when it comes to the size of the file. Plus, as I mentioned Theatre 5 does not understand UTF-8 but does understand UCS-2 (aka UTF-16).
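
To put numbers on the size question, here is a minimal Python sketch that encodes the same text both ways and counts the bytes. The sample strings are made up for illustration and are not taken from any real subtitle. Note that a character outside the Basic Multilingual Plane takes four bytes in UTF-16 as well, which is where UTF-16 differs from the fixed-width UCS-2.

```python
# Compare how many bytes the same text needs in UTF-8 and in UTF-16.
# The sample strings below are illustrative assumptions.

def encoded_sizes(text: str) -> dict:
    """Return the character count and the byte length in UTF-8 and UTF-16 (no BOM)."""
    return {
        "chars": len(text),
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
    }

samples = {
    "English": "Where are you going?",
    "Chinese": "你要去哪里？",
    "Outside the BMP": "😀",
}

for name, text in samples.items():
    print(name, encoded_sizes(text))
```

For plain English text UTF-16 doubles the size, for the Chinese sample it is smaller than UTF-8, and for the last sample both need four bytes.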

jcdr
Posts: 540
Joined: Sun Apr 08, 2012 9:49 am

Re: Problems with the addition of new lines in the srt files

Wed Aug 08, 2012 2:11 pm

Oss, I too agree 200% on the need to accept UTF-16 for OpenSubtitles. After all, it is a variable length code so it should not take a lot more space, if this is the reason for not accepting it ?
While I agree that UTF-16 is desirable, I have to say that it is not a variable length code. UTF-16 is the same as UCS-2. It uses exactly two bytes for each character.
Ah, my apologies. So if it takes twice the space for the 90% of files that contain no non-Roman letters, then UTF-16 compatibility does not seem that important after all. Except for compatibility with a few programs such as Theatre 5; but even then, the majority of files, which will remain ANSI or UTF-8 encoded, will need conversion in Notepad.

EDIT: What about having the preview feature in UTF-8 instead of ANSI ? That would avoid having previews full of ��� for non-English files.

NomadaPT
Posts: 20
Joined: Mon Dec 22, 2008 3:12 am

Re: Problems with the addition of new lines in the srt files

Wed Aug 08, 2012 2:46 pm

So if it takes twice the space for the 90% of files that contain no non-Roman letters, then UTF-16 compatibility does not seem that important after all.
100% correct, which is why UTF-8 is preferred on the net. Furthermore, if in the future the majority of software migrates to UTF-16, maybe that encoding will be needed. But the more I think about that particular detail (use of space), the less sure I am that O.S. needs to support UTF-16. After all, conversion from UTF-8 to UTF-16 (little-endian or big-endian), or to ANSI (when the Latin alphabet is the issue), is fast (almost automatic in some software, even built in) and lossless (srtpal can confirm this).

The real problem will be the non-Latin scripts, mainly the major ones of the Far East (Chinese, Japanese and Hangul), covered (partially) in the Supplementary Ideographic Plane. To be honest, I'm not so sure that subtitles in scripts covered in the Supplementary Multilingual Plane will arise suddenly, or even in many of those in the Basic Multilingual Plane... I mean, how many subtitles do you believe O.S. is going to have in Cherokee (for instance)?
Last edited by NomadaPT on Wed Aug 08, 2012 3:05 pm, edited 1 time in total.

srtpal
Posts: 59
Joined: Sun Jun 21, 2009 5:28 pm

Re: Problems with the addition of new lines in the srt files

Wed Aug 08, 2012 3:03 pm

Except for compatibility with a few programs such as Theatre 5; but even then, the majority of files, which will remain ANSI or UTF-8 encoded, will need conversion in Notepad.
Not just software; languages, too. For Chinese, UTF-16 is much better because the file is much smaller than in UTF-8. And not just Chinese: Indic languages, too. So, if you consider the size of China and India, UTF-16 would be a very useful addition.

Simply put, straight 7-bit ASCII will take one byte, anything above it takes more than one byte. That means at least two bytes but the higher on the Unicode numbering scheme a character is, the more bytes it takes.

Please see this table. Only the first 128 characters (numbered 0-127, including control characters) are expressed with one byte. Characters 128-2047 take two bytes. Anything above that takes three or more bytes (up to six in theory, though in practice three, not the four I mistakenly mentioned earlier).

So, for anything numbered from 2048 up to 65535 (the rest of the Basic Multilingual Plane), UTF-16 is more compact than UTF-8. We are talking about billions of people who are better off with UTF-16: pretty much most of Asia, with the exception of the Middle East and the Asian part of the former Soviet Union, plus Vietnam (which uses the Roman alphabet) and Mongolia (which uses Cyrillic).
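
The thresholds can be written down as a small Python sketch and checked against Python's own encoder. The function names are my own, and the four-byte ceiling assumes UTF-8 as currently standardised (RFC 3629), which does not use the longer sequences of the original design.

```python
def utf8_length(code_point: int) -> int:
    """Bytes needed for one Unicode code point in UTF-8 (RFC 3629 limits)."""
    if code_point < 0x80:       # 0-127: ASCII
        return 1
    if code_point < 0x800:      # 128-2047
        return 2
    if code_point < 0x10000:    # 2048-65535: rest of the BMP
        return 3
    return 4                    # 65536 and above: supplementary planes

def utf16_length(code_point: int) -> int:
    """Bytes needed for one Unicode code point in UTF-16."""
    return 2 if code_point < 0x10000 else 4  # surrogate pair above the BMP

# Cross-check the two functions against the built-in codecs at each boundary.
for cp in (127, 128, 2047, 2048, 0x4E2D, 0x1F600):
    ch = chr(cp)
    assert utf8_length(cp) == len(ch.encode("utf-8"))
    assert utf16_length(cp) == len(ch.encode("utf-16-le"))
    print(f"U+{cp:04X}: UTF-8 {utf8_length(cp)} bytes, UTF-16 {utf16_length(cp)} bytes")
```

UTF-16 comes out ahead only in the 2048-65535 range; below 128 UTF-8 wins, from 128 to 2047 they tie, and above 65535 both need four bytes.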

NomadaPT
Posts: 20
Joined: Mon Dec 22, 2008 3:12 am

Re: Problems with the addition of new lines in the srt files

Wed Aug 08, 2012 9:29 pm

Just for information, because it may be useful: Sourceforge developed a small program to convert between code pages (ANSI to UTF8/16/32, UTF* to ANSI/MAC/DOS, and so on).

I've tried it and it works well.

It's open source and I wonder (please don't curse me) if the code could be included in O.S., allowing the downloader to choose the final format... It's just a thought, but it would be a way to end the discussion about the encoding(s) of the files and the formats to support.

Here's the link: http://sourceforge.net/projects/cp-converter/
PHP, which OpenSubtitles uses, has this functionality built-in. It could be made into an option for downloads. Store the best format internally (UTF-8 or UTF-16), convert on the fly to preferred format for user. I don't feel it makes sense to have multiple subs with multiple encodings for movies.

cp-converter wasn't developed by sourceforge, by the way. It's hosted in Sourceforge but it's from sleeveroller. For every unix machine out there "iconv" does the same thing and it's easily scriptable (as well as the built-in editors in all Operating Systems, as far as I know).
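
To illustrate the "store once, convert on the fly" idea, here is a minimal sketch. It is in Python rather than PHP, and the function name and the choice to replace unrepresentable characters with "?" are my own assumptions, not how OpenSubtitles actually works.

```python
import codecs

def convert_subtitle(stored: bytes, target_encoding: str,
                     stored_encoding: str = "utf-8") -> bytes:
    """Decode bytes kept in one encoding and re-encode them for the downloader."""
    codecs.lookup(target_encoding)  # raises LookupError for an unknown encoding name
    text = stored.decode(stored_encoding)
    # Characters the target cannot represent become "?" instead of raising an error.
    return text.encode(target_encoding, errors="replace")

stored = "Olá, até já".encode("utf-8")            # what the site would keep internally
print(convert_subtitle(stored, "cp1252"))         # legacy "ANSI" download
print(convert_subtitle(stored, "utf-16"))         # UTF-16 download with a BOM
```

Going from UTF-8 to UTF-16 is always lossless; going to a legacy code page is lossless only when every character exists in that code page.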

subshare
Posts: 9
Joined: Fri Jan 07, 2011 4:57 am

Re: Problems with the addition of new lines in the srt files

Thu Aug 09, 2012 7:46 am

Hello, it's me again.

Meanwhile, it seems the problem got fixed. The first line isn't replaced anymore; I tried it today. However, if your first line is early in the file, it overlaps with the line added by opensubtitles.org. This may lead to problems with some players: some will show the following line nevertheless, others give priority to the very first line. Anyway, I know for sure it leads to problems when trying to mux it into an MKV file. Before doing so, you have to fix the overlap or MKVmerge will give an error.
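
If you want to check a file for this before muxing, a rough Python sketch could look like the following. It assumes standard SRT timing lines (HH:MM:SS,mmm --> HH:MM:SS,mmm) and only compares each cue with the one right after it; the sample text, including the inserted first cue, is made up.

```python
import re
from datetime import timedelta

# Matches one SRT timing line, e.g. "00:00:01,500 --> 00:00:04,000".
TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3})\s*-->\s*(\d{2}):(\d{2}):(\d{2}),(\d{3})"
)

def parse_timings(srt_text: str) -> list:
    """Return a (start, end) pair for every timing line found in the text."""
    cues = []
    for match in TIMING.finditer(srt_text):
        h1, m1, s1, ms1, h2, m2, s2, ms2 = map(int, match.groups())
        start = timedelta(hours=h1, minutes=m1, seconds=s1, milliseconds=ms1)
        end = timedelta(hours=h2, minutes=m2, seconds=s2, milliseconds=ms2)
        cues.append((start, end))
    return cues

def find_overlaps(srt_text: str) -> list:
    """Return 1-based (cue, next cue) pairs where the next cue starts too early."""
    cues = parse_timings(srt_text)
    overlaps = []
    for i in range(len(cues) - 1):
        if cues[i + 1][0] < cues[i][1]:  # next start is before current end
            overlaps.append((i + 1, i + 2))
    return overlaps

sample = """1
00:00:00,000 --> 00:00:06,000
(inserted line, hypothetical text)

2
00:00:01,500 --> 00:00:04,000
First line of dialogue

3
00:00:07,000 --> 00:00:09,000
Second line of dialogue
"""

print(find_overlaps(sample))  # [(1, 2)]
```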

As for the encoding, I think UTF-8 is the way to go. It's universal and covers all types of languages. In order to convert between different text encodings, you can simply use a text editor that is capable of doing this.

As an Ubuntu user, I simply use gedit, which is preinstalled and handles everything. It can save in all sorts of codepages.
http://projects.gnome.org/gedit/

For Windows, I used Editpad Lite (freeware).
http://www.editpadlite.com/
It's the best Notepad replacement I know. It also handles all encodings and is able to convert between them. On top of that, you can also force Editpad Lite to interpret a text file with another codepage. This is very useful, if you extract srt files from an mkv, for instance, and then find out that it was saved with a wrong codepage.
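
What "interpreting with another codepage" does can be sketched in a few lines of Python. This is only an illustration of the idea, not how Editpad Lite works internally; the function name and the sample word are my own.

```python
def reinterpret(garbled: str, assumed: str, actual: str) -> str:
    """Undo a wrong decode: get the original bytes back, then decode them correctly."""
    return garbled.encode(assumed).decode(actual)

original = "Καλημέρα"
raw = original.encode("cp1253")      # bytes as saved in the Greek Windows codepage
garbled = raw.decode("latin-1")      # what you see if the file is opened as Latin-1

print(garbled)                                      # unreadable accented Latin letters
print(reinterpret(garbled, "latin-1", "cp1253"))    # readable Greek again
```

This only works as long as the wrong codepage maps every byte to some character, so that no information was lost when the file was misread.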

I'm glad the first line issue got fixed. I can live with an extra line that promotes Opensubtitles. After all, it's still the best subtitles website. I found that Subscene is quite a mess sometimes, because the film data is not directly retrieved from IMDb, but titles are manually added by users. Thus, quite a few movies have wrong data (wrong IMDb number or wrong year).

Opensubtitles still has the best way to organize the movies/subtitles. Keep up the good work. I'll gladly continue to upload.

NomadaPT
Posts: 20
Joined: Mon Dec 22, 2008 3:12 am

Re: Problems with the addition of new lines in the srt files

Thu Aug 09, 2012 4:10 pm

.... it seems the problem got fixed. The first line isn't replaced anymore. Tried it today....
Sorry, but I've also tried and the problem remains: when the subtitle has a line in or before the first second, it continues to be replaced by the ads.

I can't help asking: is it really necessary? Right now, non-logged-in users already get pop-up windows, several ads and a redirect window for the download that is fully dedicated to the Open Subtitles MKV Player. There seems to be a general consensus that this first line is annoying, highly undesirable, and more likely to drive users away (translators/uploaders and downloaders) than to attract them to the site. So why insist on it?

So far, all opinions expressed here show that no one would mind if the additions were made at the end of the subtitles. So, if really necessary, why not put both lines there?

eduo
Posts: 716
Joined: Sat Feb 10, 2007 1:40 am
Location: Information Technology
Contact: ICQ Website Yahoo Messenger

Re: Problems with the addition of new lines in the srt files

Thu Aug 09, 2012 4:41 pm

.... it seems the problem got fixed. The first line isn't replaced anymore. Tried it today....
Sorry, but I've also tried and the problem remains: when the subtitle has a line in or before the first second, it continues to be replaced by the ads.

I can't help asking: is it really necessary? Right now, non-logged-in users already get pop-up windows, several ads and a redirect window for the download that is fully dedicated to the Open Subtitles MKV Player. There seems to be a general consensus that this first line is annoying, highly undesirable, and more likely to drive users away (translators/uploaders and downloaders) than to attract them to the site. So why insist on it?

So far, all opinions expressed here show that no one would mind if the additions were made at the end of the subtitles. So, if really necessary, why not put both lines there?
API users don't see the website at all, so they don't get any ad or banner from it. The line may be aimed at those.
http://eduo.info/
OpenSubtitles from your desktop: SolEol for Mac/Windows/Linux (http://eduo.info/soleol/)
My current episode processing work flow (http://forums.plexapp.com/index.php?showtopic=325&st=0&p=2480&#entry2480)

arcchancellor
Moderator
Posts: 202
Joined: Sat Apr 03, 2010 12:56 pm
Location: Ankh-Morpork

Re: Problems with the addition of new lines in the srt files

Fri Aug 10, 2012 6:53 am

As an Ubuntu user, I simply use gedit, which is preinstalled and handles everything
Me too. Simple but extensive.
For Windows, I used Editpad Lite (freeware).
http://www.editpadlite.com/
It's the best Notepad replacement I know.
Or Notepad++
http://en.wikipedia.org/wiki/Notepad%2B%2B
I used it under Windows a few years ago and it leaves nothing to be desired.

srtpal
Posts: 59
Joined: Sun Jun 21, 2009 5:28 pm

Re: Problems with the addition of new lines in the srt files

Mon Aug 13, 2012 6:34 am

By the way, shouldn’t http://trac.opensubtitles.org/projects/opensubtitles be edited now? It still contains this line:

no signature or advertisment is added to subtitles

oss
Site Admin
Posts: 5890
Joined: Sat Feb 25, 2006 11:26 pm
Contact: Website

Re: Problems with the addition of new lines in the srt files

Mon Aug 13, 2012 6:51 am

Done. There is surely more outdated information floating around there.

