hector
Posts: 370
Joined: Wed Jan 01, 2014 12:27 pm
Location: Spain

inconsistent language codes

Wed Jan 25, 2017 1:50 pm

Hi.
I am using language codes to manage subtitle files from OS, and I've noticed an inconsistency.

Surprisingly, you are using ISO 639-2T codes in some cases and ISO 639-2B codes in others. More specifically, you use 2B codes:
  • dut - Dutch
  • fre - French
  • ger - German
but a 2T code:
  • ell - Greek
I think you usually use 2B, with Greek being an exception. It would be great if you could be consistent and use either 2T for all (nld, fra, deu), which I think would be preferable, or 2B for all (gre for Greek).
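
To make the difference concrete, here is a minimal Python sketch that converts between the two code sets. The function names (to_terminology, to_bibliographic) are made up for illustration, and the table lists the B/T pairs as I understand ISO 639-2 to define them, so check it against the official list before relying on it. Codes that are the same in both sets pass through unchanged.

```python
# ISO 639-2 languages whose bibliographic (B) code differs from the
# terminology (T) code. Assumed complete; verify against the official list.
B_TO_T = {
    "alb": "sqi", "arm": "hye", "baq": "eus", "bur": "mya", "chi": "zho",
    "cze": "ces", "dut": "nld", "fre": "fra", "geo": "kat", "ger": "deu",
    "gre": "ell", "ice": "isl", "mac": "mkd", "mao": "mri", "may": "msa",
    "per": "fas", "rum": "ron", "slo": "slk", "tib": "bod", "wel": "cym",
}
# Reverse lookup, built from the same table so the two never disagree.
T_TO_B = {t: b for b, t in B_TO_T.items()}


def to_terminology(code: str) -> str:
    """Return the 2T form of a three-letter code (hypothetical helper)."""
    code = code.lower()
    return B_TO_T.get(code, code)


def to_bibliographic(code: str) -> str:
    """Return the 2B form of a three-letter code (hypothetical helper)."""
    code = code.lower()
    return T_TO_B.get(code, code)
```

With this, to_terminology("dut") gives "nld" and to_bibliographic("ell") gives "gre", so a mixed list can be normalised to either set on the client side.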

You could say I'm too picky, but with some experience in programming I know inconsistencies can cause a lot of trouble in the long run.

hector

Re: inconsistent language codes

Thu Jan 26, 2017 12:07 pm

See the Wikipedia article on ISO 639-2.
It says code "scc" (which OS is using too) is deprecated. I don't know how difficult it would be to change this now, but I think it would be a good thing to switch to 639-2T or even 639-3 instead of using 639-2B only partially.

oss
Site Admin
Posts: 5879
Joined: Sat Feb 25, 2006 11:26 pm

Re: inconsistent language codes

Mon Jan 30, 2017 7:19 am

Well, changing them would again cause problems in programs that already have our language code list.

Those codes change over time. There was one member who said that for Greek we should use ell, etc. In the beginning we used gre, I think (or gr).

hector

Re: inconsistent language codes

Mon Jan 30, 2017 10:07 pm

If you were using "gre" for Greek, why did you change it?

Again, you can use the codes you like. You could even forget about standards and use your own. But then the problem comes when you interact with the outside world. That's why standards were invented.

I extract the language code from the filename. If I use 639-2B, Greek is not recognised. And if I choose 639-2T, then the languages you list under 2B codes (Dutch, French, German) are not recognised. "scc" is not recognised because it is deprecated. You should use "srp" instead.

Well, I can work around this, but it would be much easier and simpler if you followed the standard.

The same for "pob", "che" and some other non-standard codes. But those are needed because ISO 639 does not support country specification. You should consider IETF language tags but that's already been discussed somewhere else.
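
For anyone hitting the same problem, the workaround can be sketched roughly like this in Python. It assumes filenames of the form Name.code.srt, which is only my own convention, and the helper name language_from_filename and the ALIASES table are made up for illustration. It folds the 2B codes and the deprecated "scc" into their 2T equivalents and leaves everything else, including site-specific codes such as "pob", untouched.

```python
from pathlib import Path
from typing import Optional

# Codes seen in this thread, mapped to their ISO 639-2T equivalents.
# Illustrative subset only, not a complete table.
ALIASES = {
    "dut": "nld",
    "fre": "fra",
    "ger": "deu",
    "gre": "ell",
    "scc": "srp",  # deprecated code for Serbian
}


def language_from_filename(filename: str) -> Optional[str]:
    """Extract a three-letter code from 'Name.code.ext' (assumed layout)."""
    parts = Path(filename).name.split(".")
    if len(parts) < 3:
        return None  # no language segment before the extension
    tag = parts[-2].lower()
    if len(tag) != 3 or not tag.isalpha():
        return None  # e.g. a year or a release tag, not a language code
    # Unknown three-letter tags (such as "pob") are returned unchanged.
    return ALIASES.get(tag, tag)
```

So "Movie.2016.ger.srt" and "Movie.2016.deu.srt" both end up as "deu", while "Movie.2016.srt" yields None because there is no language segment.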

oss

Re: inconsistent language codes

Tue Jan 31, 2017 11:27 am

Yes, that's why we have the list here: https://www.opensubtitles.org/addons/ex ... guages.php

Changing to "gre" would cause problems in applications that are working but are not actively developed. As you found out, we are using custom codes as well.

We are not going to change Greek to gre, sorry; it would cause a mess in all existing applications.

hector

Re: inconsistent language codes

Tue Jan 31, 2017 2:22 pm

Well, the mess is already there, and it is caused by mixing two different standards ad lib.
I hope you can follow the standard more closely in the future.
Thanks for the list, anyway.

oss

Re: inconsistent language codes

Wed Feb 01, 2017 7:38 am

Basically I could fix it, but it would produce more mess (accepting both codes), and I think it is more systematic not to mix them.

I know what you mean about being more standards-friendly, but once I added POB and some Chinese stuff we were already outside the standard, so basically I could use any codes, because to implement it one has to download "our codes" anyway.

hector

Re: inconsistent language codes

Tue Feb 07, 2017 5:25 pm

"some Chinese stuff is already out of standard"
Speaking of consistency and language codes... the problem with "zhe" is that you are allowing bilingual subtitles. I think it's okay. But then (being consistent) you should do the same for other languages. Why not Spanish/English or POR/POB or whatever? Perhaps it is because nobody requested it. And perhaps it is because most video players support showing two subtitles at the same time.

So again, I think they are a waste of space and resources. And you could get rid of "zhe", "spe", "dee", "gee"... Oh, gee! :)
