suadnovic
Posts: 281
Joined: Tue Aug 19, 2014 7:41 pm

UTF-8 to ANSI

Mon Feb 22, 2016 10:12 am

I can't convert this Russian UTF-8 subtitle to ANSI. Notepad, Notepad++, EmEditor, online converters...
all failed. The text just became ?????
It's for Law & Order: UK (2009) S03E02
https://www.sendspace.com/file/wmvzs5
What can I do?

hector
Posts: 370
Joined: Wed Jan 01, 2014 12:27 pm
Location: Spain

Re: UTF-8 to ANSI

Mon Feb 22, 2016 1:52 pm

I don't understand why anybody still needs pre-Unicode encodings. For Russian you have KOI8-R and Windows cp1251, and the less frequently used ISO 8859-5 (Latin/Cyrillic).
I don't know what you mean by "ANSI". ANSI is an American organisation that has nothing to do with Russia or Russian. Perhaps you mean Windows code page 1251. In that case:

https://www.sendspace.com/file/x9nhqi
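
For anyone who prefers to do the conversion themselves, here is a minimal Python sketch. It assumes the source really is valid UTF-8 and that "ANSI" means Windows cp1251; the function name `utf8_to_codepage` and the sample text are made up for illustration. It also shows one likely reason for the "?????" result: a tool that targets a code page without Cyrillic letters (cp1252 here) has nothing to map them to.

```python
def utf8_to_codepage(data: bytes, codepage: str = "cp1251") -> bytes:
    """Re-encode UTF-8 bytes into a legacy code page (hypothetical helper)."""
    # "utf-8-sig" also drops a leading BOM if the file has one.
    text = data.decode("utf-8-sig")
    # Strict mode: raises UnicodeEncodeError instead of silently writing "?".
    return text.encode(codepage)


sample = "Закон и порядок".encode("utf-8")

# Cyrillic fits in cp1251, so this round-trips cleanly.
converted = utf8_to_codepage(sample)

# cp1252 (Western European) has no Cyrillic letters; with errors="replace"
# every one of them turns into a question mark.
question_marks = sample.decode("utf-8").encode("cp1252", errors="replace")
print(question_marks)  # b'????? ? ???????'
```

For a real subtitle you would read the file in binary mode, pass the bytes through the function, and write the result back in binary mode.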

vankasteelj
Posts: 175
Joined: Sun Nov 15, 2015 1:09 am

Re: UTF-8 to ANSI

Mon Feb 22, 2016 3:25 pm

Most encodings are slowly going away as everything moves towards UTF-8 or similar "broad" standards. I would suggest adapting whatever software needs the non-Unicode encoding so that it supports Unicode, rather than trying to force a last-generation encoding. Just saying.

oss
Site Admin
Posts: 5916
Joined: Sat Feb 25, 2006 11:26 pm
Contact: Website

Re: UTF-8 to ANSI

Mon Feb 22, 2016 4:02 pm

UTF-8 is definitely the way to go. I would be so happy to have just UTF-8 files on the server...

hector
Posts: 370
Joined: Wed Jan 01, 2014 12:27 pm
Location: Spain

Re: UTF-8 to ANSI

Tue Feb 23, 2016 1:38 pm

So would I.

Why don't you do the conversion? Yes, I know it is easier said than done. But you could try it gradually, perhaps starting with some languages that are known to work.

But then you have the problem of multilanguage subtitles. Each subtitle is assigned just one language, but many subtitles have some words in other languages. For example, many English subs have some French words like "fiancé" or Spanish ones like "adiós".
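
A rough way to spot such cases before converting is to list the characters that a given code page cannot represent. This is only a sketch in Python; the helper name `unencodable_chars` is hypothetical, and it relies on Python's built-in codecs.

```python
def unencodable_chars(text: str, codepage: str) -> list:
    """Return the distinct characters of text that codepage cannot encode,
    in order of first appearance (hypothetical helper)."""
    missing = []
    for ch in text:
        try:
            ch.encode(codepage)
        except UnicodeEncodeError:
            if ch not in missing:
                missing.append(ch)
    return missing


line = "fiancé, adiós"

# Both accented letters exist in cp1252 (Western European)...
western = unencodable_chars(line, "cp1252")

# ...but not in cp1251 (Cyrillic), so a Russian subtitle quoting these
# words could not be stored in that code page without loss.
cyrillic = unencodable_chars(line, "cp1251")
print(western, cyrillic)  # [] ['é', 'ó']
```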

Anyway, it must be done carefully, and perhaps with human intervention.

But it would be great.

hector
Posts: 370
Joined: Wed Jan 01, 2014 12:27 pm
Location: Spain

Re: UTF-8 to ANSI

Tue Feb 23, 2016 2:31 pm

And then you have VERY WEIRD files like this:
http://www.opensubtitles.org/en/subtitl ... ympathy-fr

It is UTF-8 except for the French character "œ" (U+0153), which is encoded as in Windows cp1252 (byte 0x9C), i.e. it gives U+009C, which is a control character and not what was intended, I think. For example:
"Les cœurs d'enfants"

As we Spaniards say, "vivir para ver" (live and learn) :-D
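
If the only damage is of this kind, a repair could look roughly like the Python sketch below. It assumes that every C1 control character (U+0080 to U+009F) in the decoded text is really a cp1252 byte that was carried over unchanged, which seems safe for subtitles but is still an assumption; the function name `repair_c1_as_cp1252` is made up.

```python
def repair_c1_as_cp1252(text: str) -> str:
    """Reinterpret stray C1 control characters as cp1252 bytes (hypothetical helper)."""
    out = []
    for ch in text:
        if 0x80 <= ord(ch) <= 0x9F:
            try:
                # e.g. U+009C -> byte 0x9C -> "œ" in cp1252
                ch = bytes([ord(ch)]).decode("cp1252")
            except UnicodeDecodeError:
                # A few bytes in this range (such as 0x81) are undefined
                # in cp1252; leave those characters untouched.
                pass
        out.append(ch)
    return "".join(out)


broken = "Les c\u009curs d'enfants"
fixed = repair_c1_as_cp1252(broken)
print(fixed)  # Les cœurs d'enfants
```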

Can I correct this or should I upload a new file?
