scooby007
Site Admin
Posts: 396
Joined: Thu Mar 05, 2009 10:49 pm
Location: Scandalous

MACHINE TRANSLATION NOTICE!

Thu Oct 22, 2009 4:09 pm

(EDITED 04 May 2014)

Dear OpenSubtitles users,

After some serious debate and discussion, Opensubtitles has decided to get rid of all “MACHINE TRANSLATED” subtitles.

Some admins may still choose not to delete them, at their discretion.
The reason for this is:
Surprisingly, some people still prefer a machine translation to nothing at all, but the majority of the admins will delete them on sight.

Opensubtitles (OS) understands that this may offend some members of the community, but I think most can agree this is best for the future of OS as we can avoid mass trash being accumulated and saved here over the coming years.

Here at OS, we think that when you view a movie with a machine (Google) translation, you miss a lot of the real dialogue. Even though you can make basic sense of such subtitles, the wording does not convey the true concepts, beliefs and emotions depicted in a particular scene of a movie or TV series, depriving you of true enjoyment and understanding. For example:

All together or altogether?

"Altogether" is an adverb meaning completely, entirely, wholly, or "considering everything." It often modifies an adjective.
"All together" means as a group.
The meal was altogether pleasing, but I would not have served those dishes all together.

OS is one of the best sites out there and it wants to offer good quality subtitles to the community at large. For this to happen, we implore all uploaders of “Machine Translations” to clearly mark their subtitles as “MACHINE TRANSLATION”, followed by the release name, in the "RELEASE NAME" field when uploading. This way you can avoid harsh comments, and it makes it easier for the admins to tell the bad from the good.

PLEASE NOTE:

Let the Admins know of any machine translations to be deleted. You can use the following guide, which will show you how to make reports: viewtopic.php?f=1&t=2595

Thank you.

eduo
Moderator
Posts: 715
Joined: Sat Feb 10, 2007 1:40 am
Location: Information Technology

Thu Oct 22, 2009 4:55 pm

Great News!

Will the API provide a way to mark these in the same way we can mark BAD subtitles? Or can a specific phrase or word in the comment be used?

A problem I've found with machine-translated subtitles is that they usually appear earlier than proper ones. This means that for popular releases, by the time a good subtitle appears, the bad one has gotten so many downloads that users will pick that one always as default. They download, see it's a piece of crap and blame OpenSubtitles for the low quality of its subtitles.

This is a great step in the right direction. I've always said we should only have at the most two subtitles per language per release. Closed Captions and Speech Subtitles. If I get a special dispensation in the API I could even try to find a way to get a good source of subtitles to provide versioning, like addic7ed.

scooby007
Site Admin
Posts: 396
Joined: Thu Mar 05, 2009 10:49 pm
Location: Scandalous

Thu Oct 22, 2009 5:57 pm

OS thought of marking these subs, but also said he needs help with the code to set it up, help that I sadly won't be able to provide. If this is something you could help him with, that would be great. I think it would be better if they were marked somehow, but until that comes into effect we thought that, for the meantime, they could at least be clearly marked in the notes. (Edit: A symbol is now available to mark machine translations)


“A problem I've found with machine-translated subtitles is that they usually appear earlier than proper ones. This means that for popular releases, by the time a good subtitle appears, the bad one has gotten so many downloads that users will pick that one always as default.”


I agree. I can't see why people think like that (quantity equals quality), but unfortunately they do. As said in the previous post, when quality subs are available, if someone could make the admins aware of it, the bad ones can be removed promptly, forcing the user to download the best version.

eduo
Moderator
Posts: 715
Joined: Sat Feb 10, 2007 1:40 am
Location: Information Technology

Thu Oct 22, 2009 6:21 pm

Probably cloning the current "BAD" behaviour is enough for the API and the web, and a search can be set up for every movie+language where the "Machine Translated" sub is not the only sub (for removal).
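
A minimal sketch of the search described above, assuming each subtitle record carries a movie ID, a language code and a machine-translation flag. The field names (`movie_id`, `lang`, `is_machine`) are hypothetical and are not taken from the OpenSubtitles database or API.

```python
from collections import defaultdict


def removable_machine_subs(subs):
    """Return IDs of machine-translated subs that have a non-machine alternative.

    `subs` is an iterable of dicts with hypothetical keys:
    id, movie_id, lang, is_machine.
    """
    # Group subtitles by movie + language.
    groups = defaultdict(list)
    for sub in subs:
        groups[(sub["movie_id"], sub["lang"])].append(sub)

    removable = []
    for group in groups.values():
        # Flag machine translations only when another kind of sub exists.
        if any(not s["is_machine"] for s in group):
            removable.extend(s["id"] for s in group if s["is_machine"])
    return sorted(removable)


subs = [
    {"id": 1, "movie_id": 10, "lang": "en", "is_machine": True},
    {"id": 2, "movie_id": 10, "lang": "en", "is_machine": False},
    {"id": 3, "movie_id": 10, "lang": "es", "is_machine": True},
]
print(removable_machine_subs(subs))
```

Here sub 3 is kept because it is the only Spanish subtitle for that movie, which matches the "not the only sub" condition.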

scooby007
Site Admin
Posts: 396
Joined: Thu Mar 05, 2009 10:49 pm
Location: Scandalous

Thu Oct 22, 2009 6:33 pm

Sounds like a sound plan, but as far as coding and restructuring of the site goes, I think it would be wise to get OS' input on this. He should be here later on in the evening. As for your suggestion:

“If I get a special dispensation in the API I could even try to find a way to get a good source of subtitles to provide versioning, like addic7ed.”


I don't think there are enough editors here for that to work. But who knows what the future holds?

oss
Site Admin
Posts: 4619
Joined: Sat Feb 25, 2006 11:26 pm

Sat Oct 24, 2009 5:21 am

OK, eduo, I had a chat with scooby. I asked him whether I should add two more columns to the database: a mark for automatic translation and a mark for HD content (this needs to be specified; I think everything above 720p?). When I add these, is there something more to do? I mean any other marks for subtitles, so I can do it all together.

Later I can add it to the API. I can also run a query against the DB and list movies with more than 2 subs per language/release (but the release part is quite hard, I have to check it against moviefilesize, ufff).

Also, I think it would be nice to have a statement from OpenSubtitles which includes, in one section, the handling of automatic translations... but I don't have time to write it :( Any help?

rogard
Posts: 21
Joined: Thu Mar 20, 2008 2:37 pm

Sun Nov 01, 2009 12:27 am

Hi guys,

it's nice to see that everybody is working on improving quality. Definitely a step in the right direction.

You might know that we banned machine translations from subscene altogether 8) (hehe) because in our opinion they are just an abomination. I can see the other point though...a little.

I am just wondering how you will go about the handling. People would need to set a flag or something. From my experience I know they are not so good at that...

Even if they were, who's gonna go through all the subtitles and check whether a newer subtitle is not a machine translation in disguise? Isn't all that a mission impossible?

--------------------------------------------
Warning! From here on it's getting philosophical.

The next question would be this: If you plan to delete inferior machine translations as soon as a better subtitle arrives, how about deleting all the other inferior subtitles as well? Is that feasible?

Sometimes there are 10 or more different subtitles for a film release, some are google crap, some are transcripts in different stages of editing, finally there are retail rips, again duplicates and corrected versions etc etc.

As someone already said, people tend to favor the subtitle with the most downloads, so how do we get them to download the best subtitles? I think by deleting the bad ones or by pointing towards the good ones. Checking all the different versions and deciding which one's the best is a Sisyphos task.

Hmmm...I am thinking a trusted uploader badge or something...since the current sign is solely based on # of uploads. That is definitely the wrong incentive for improving quality.

Then there are ethical and moral questions: If someone made a transcript which is pretty good but then someone uploaded an edited version....can and should the original be deleted just because it's inferior? How about two entirely different translations, one brilliant, one crap? Two slightly different edits of a series subtitle? Subtitles with ocr errors and without? Subs with missing or wrong credits? ...

Is there a right for anybody to upload their subtitles, no matter how bad they are? Or do you focus on quality, which means you will delete everything as soon as something better comes up?

Note: these are *not* rhetorical questions since I don't know the answers either.

Anyway, good to see some movement over here, and I am seriously jealous of you guys. On subscene, we have a list of possible improvements that is almost 100 items long and is nearly 2 YEARS old. Our sysop has other stuff to do I guess. Frustrating.

Cheers

scooby007
Site Admin
Posts: 396
Joined: Thu Mar 05, 2009 10:49 pm
Location: Scandalous

Sun Nov 01, 2009 7:07 pm

Did you mean, Sisyphus? Pretty good points, I don’t know where to begin. Incidentally I like that analogy.

You mentioned people would need to set a flag. Well, as of today the subs will (should) be clearly marked: the uploader can tick the ‘Subtitles are machine translated’ option on the upload page, which should help. Unfortunately not everybody will use this feature, but at least it’s a start.

We agree that machine translations are an abomination, but it’s impossible to satisfy everyone, and we thought that taking the said steps would be the more impartial approach.

“Who's gonna go through all the subtitles and check whether a newer subtitle is not a machine translation in disguise? Isn't all that a mission impossible?”


Couldn’t agree more. That’s where everyone comes in. It’s impossible to wade through every subtitle upload; this is where we’ll have to rely on the users’ help. After all, this is all being done for them, too.

I have already seen reports being made in regards to deleting machine translations, so hopefully we can build from there.

NOTE: This is not a foolproof plan. In my opinion nothing will be perfect, but this is as close as anyone can get (I think). There is another site which, I believe, doesn’t even let you upload a machine translation, but that may be going too far, as no one will be able to get their hands on a raw transcript to edit.

“The next question would be this: If you plan to delete inferior machine translations as soon as a better subtitle arrives, how about deleting all the other inferior subtitles as well? Is that feasible?

Sometimes there are 10 or more different subtitles for a film release, some are Google crap, some are transcripts in different stages of editing, finally there are retail rips, again duplicates and corrected versions etc etc.”

This is all ground zero at the moment. We are always open to suggestions and our opinions are subject to change. After all, it’s all of us that make this community what it is, so it’s up to us where we go from here.

Everything said by me here isn’t concrete and is subject to change and I think it will remain that way for some time to come. Simply because this is all really difficult in regards to what approach to take.

But for now I think it’s a safe bet that ‘Google crap’ will definitely be deleted once edited versions appear. As for subtitles in “different stages of editing”? Well, for now I personally think they may be left untouched, for the reasons you raised, and we should wait and see where the future takes us (users’ opinions on them) before going further down this course of action.

“As someone already said, people tend to favour the subtitle with the most downloads, so how do we get them to download the best subtitles? I think by deleting the bad ones or by pointing towards the good ones. Checking all the different versions and decide which one's the best is a Sisyphus task.”


Hmmm… We can’t really make anyone do anything, can we? I think all we can do is make sure machine translations are deleted, and then it’s up to the individual what he/she downloads.

What I like about subscene is that everybody votes for the subtitles. Granted, often the uploader himself takes part, which I think is folly, but at least that exposes the bad subtitles and lets the good ones rise above the rest. Over here you (mostly) only get votes if your subtitle has been downloaded thousands of times.

“Then there are ethical and moral questions: If someone made a transcript which is pretty good but then someone uploaded an edited version....can and should the original be deleted just because it's inferior?”


That’s difficult, and I wouldn’t want to voice my opinion on that one just yet, as I can’t make my mind up. As I said, we’ll have to see what happens in the future. For now we should at least concentrate on removing machine translations; once that field is mastered, if ever, then…?

For now there is a new way of marking the said translation subs, let’s hope it helps, AND ALL MAKE USE OF IT. :D

eduo
Moderator
Posts: 715
Joined: Sat Feb 10, 2007 1:40 am
Location: Information Technology

Sun Nov 01, 2009 10:03 pm

Hi,

The post below deals with two specific items:

-Making a strategic relationship with a reputable wiki-based, versioned subtitle-editing site (like addic7ed) to ensure only the best subtitles are provided for each release and movie.

-Improving the current ranking and download counters so as to make them actually useful.

I realize now, after having written it, that it's almost impenetrable. I still include it but I don't expect any response and I'd appreciate no response quoting its horrible, horrible text. :)

---

There is a way to ensure proper quality in subtitles and prune out all inferior ones, but it'd imply becoming more stringent about who can upload what, and an alliance with a wiki-based subtitle-editing community like Addic7ed.

I have tried doing this in the past, but sadly OpenSubtitles has a reputation (I won't go into whether it's deserved or not; I believe it's easy to misread decisions at OpenSubtitles as having bad intentions behind them) that has meant I can't really get any results. I have been given the green light to help create an API in Addic7ed, but not to have it interact directly with OS. Even though the benefit would be for both (to be 100% fair), they're reluctant.

It's a shame, because that would close the loop. Ensure only good subtitles survive and eliminate all duplicates.

Another option, which sounds more complicated than it actually is, would be to improve the current voting system by changing it to a thumbs-up/thumbs-down system and ranking by a Wilson score confidence interval. The current system is simplistic in a way that doesn't help users.

http://blog.reddit.com/2009/10/reddits- ... ystem.html
http://blog.stackoverflow.com/2009/10/a ... ng-orders/

Our current system allows one bad subtitle with a single score of 10.0 to *seem* better than a subtitle that has a 7 from 100 votes. It's clear to everyone that the second one has more *weight*.
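
As a rough illustration of the Wilson score idea, the sketch below ranks by the lower bound of the confidence interval on the share of upvotes. Treating a lone 10.0 as one upvote, and a 7 average from 100 votes as 70 up and 30 down, is an assumption made only for this example; the function name and the 95% level (z = 1.96) are likewise illustrative choices.

```python
import math


def wilson_lower_bound(upvotes, downvotes, z=1.96):
    """Lower bound of the Wilson score interval for the upvote fraction."""
    n = upvotes + downvotes
    if n == 0:
        return 0.0  # no votes, no evidence
    p = upvotes / n
    centre = p + z * z / (2 * n)
    margin = z * math.sqrt((p * (1 - p) + z * z / (4 * n)) / n)
    return (centre - margin) / (1 + z * z / n)


single_vote = wilson_lower_bound(1, 0)     # one enthusiastic voter
many_votes = wilson_lower_bound(70, 30)    # 100 voters, 70% positive
print(round(single_vote, 3), round(many_votes, 3))
```

With few votes the bound stays low however positive they are, so the heavily voted subtitle ranks first.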

In the same way, subtitles have download counts embedded. By using the API and registered users, we can consider every subtitle download an upvote, and every alternate subtitle downloaded afterwards a downvote of the original.
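
One possible reading of that last idea, sketched under assumptions: the download log is a chronological list of (user, movie, language, subtitle) tuples, a first download counts as an upvote, and switching to a different subtitle for the same movie and language counts as a downvote of the one previously downloaded. The log format and function name are hypothetical.

```python
from collections import defaultdict


def implicit_votes(downloads):
    """Derive up/down votes from a chronological download log.

    `downloads` is a list of (user, movie_id, lang, sub_id) tuples.
    """
    votes = defaultdict(lambda: {"up": 0, "down": 0})
    last_choice = {}  # (user, movie_id, lang) -> last sub_id downloaded
    for user, movie_id, lang, sub_id in downloads:
        key = (user, movie_id, lang)
        previous = last_choice.get(key)
        if previous == sub_id:
            continue  # repeat download of the same sub: no new vote
        votes[sub_id]["up"] += 1
        if previous is not None:
            votes[previous]["down"] += 1  # the user replaced it
        last_choice[key] = sub_id
    return dict(votes)


log = [
    ("ann", 10, "en", 1),
    ("ann", 10, "en", 2),  # ann switches from sub 1 to sub 2
    ("bob", 10, "en", 2),
]
print(implicit_votes(log))
```

The resulting up/down counts could then feed a thumbs-up/thumbs-down ranking.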

oss
Site Admin
Posts: 4619
Joined: Sat Feb 25, 2006 11:26 pm

Mon Nov 02, 2009 5:29 am

Hi,

we've got such a nice debate here :) I completely understand the problem, and the idea of sorting is purely subjective. Anyway, we are using voting from +1 to +10, so using that magical formula from that site is not possible. I just don't like it when people can only vote plus or minus. When subtitles are good but contain 2 grammatical mistakes, what should the user vote? Some would vote plus, some would vote minus. I would vote 8 or 9.

So, what about implementing Bayesian rating?

http://www.thebroth.com/blog/118/bayesian-rating

The idea is as follows:
The Bayesian rating will be implemented not across all the subtitles in the database, but only within a set of subtitles, namely all subtitles for a certain language for a certain movie.

I think this is what we are looking for.
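
A small sketch of how this could look, using one common form of the Bayesian average: each subtitle's mean is pulled towards the mean of its movie+language set, weighted by the average number of votes per subtitle in that set. This is an illustration under those assumptions, not the site's actual implementation; the function name and input shape are made up.

```python
def bayesian_ratings(subs):
    """Bayesian average for every subtitle of one movie + language.

    `subs` maps sub_id -> (vote_count, mean_rating).
    """
    total_votes = sum(n for n, _ in subs.values())
    if total_votes == 0:
        return {sub_id: 0.0 for sub_id in subs}
    # Prior: the mean rating over all votes in this set...
    prior_mean = sum(n * r for n, r in subs.values()) / total_votes
    # ...weighted as if it had the average number of votes per subtitle.
    prior_weight = total_votes / len(subs)
    return {
        sub_id: (prior_weight * prior_mean + n * r) / (prior_weight + n)
        for sub_id, (n, r) in subs.items()
    }


scores = bayesian_ratings({
    "a": (1, 10.0),    # a single 10
    "b": (100, 7.0),   # 100 votes averaging 7
    "c": (50, 9.0),    # 50 votes averaging 9
})
print({k: round(v, 2) for k, v in scores.items()})
```

The single 10 is pulled down to roughly 7.7 and no longer outranks the subtitle with 50 votes averaging 9. Note that it can still sit slightly above a heavily voted subtitle whose average is below the set mean.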

Of course, we need more user feedback, I mean voting. As I wrote before, I imagine players should implement this: when a logged-in user has watched at least 90% of a movie and wants to quit the player or open another file, the player asks him for a vote (this should be the default setting, but the user should be able to change it). I think only in this way can we get enough data for rating.

It is hard to imagine a user downloading subtitles through a player or program and then being asked to go back to the website to vote. He doesn't need to know anything about the subtitles he just downloaded, I mean the IDSubtitle...

I can implement the Bayesian rating as I wrote here, but first I need your comments on this. The next thing is that sorting/ordering when returning results through the API is really resource-inefficient.

rogard
Posts: 21
Joined: Thu Mar 20, 2008 2:37 pm

Mon Nov 02, 2009 1:00 pm

How about a rating system similar to eBay? Each user has a user page where they can see their last downloaded subtitles and rate them in an easy way.

To make this more obvious, this should be displayed as soon as users visit OS and, in addition, somewhere where it's always visible, like a reminder in the title bar or such.

You could even invent a "top voter" badge, people just love these awards and shiny medals...
Top voters of the week/month/all time...?

On subscene we have much more rating going on because the community is stronger and extremely close-knit. Especially the Arabic translators are kind of heroes with fans who vote diligently for them. Therefore, lots of ratings are not so much a sign of great quality in the first place, but a sign of appreciation and reverence. :-)

You have set up OS so that it's possible to get subtitles without even visiting the site. That's convenient, but it doesn't do anything for the community "vibes".

Where's the incentive to vote if subtitles are gathered almost automatically? A rating feature that works just as easy would be helpful. No clue how that's gonna work though...

Voting is an extra effort, so how do you get people to vote? Give them a reason, an incentive, a reward.
Obviously the quality aspect of subtitles is not enough.
People will vote if they have a strong connection and identification with your site. They have to care not only about the subtitles, but about OS as a whole.

Again, I suggest a "Trusted Uploader" thingy. You award it to people who do a really good job and follow the rules. That is a reward and a responsibility at the same time, because now they have to be even better.

You could also invent statistics like the average votes for all the uploads of someone. If their ratio is better than X, they automatically become a trusted uploader, I don't know...

Subtitles uploaded by these guys should always be displayed in a more prominent way, so that downloaders know: this is the cool stuff.

The users need some kind of guidance, but right now they are just looking at the color of the metal. Duh. By the way, my theory is that "Gold uploaders" have more downloads - it just looks nicer than the muddled grey platinum. :-)

You could even go a step further. Give those trusted guys more relevance: as soon as they upload something for a film, it's time to review anything that's been there before. They must have a reason to upload something although there was already a subtitle, so it must be an improved version. If not, and it happens often, they'll risk their status.

Everybody should vote, but in the end those ratings are most useful that are made by people who really know how and why to rate. So instead of having 1 billion people vote, I'd rather have 1000 whose judgment is reliable. These trusted guys could be your eyes and ears on the site even more than they probably already are.

eduo
Moderator
Posts: 715
Joined: Sat Feb 10, 2007 1:40 am
Location: Information Technology

Mon Nov 02, 2009 5:18 pm

I guess this is where I step down and become just a silent developer behind the scenes.

I have a problem, in that I'm an idealist in several aspects, some of which clash directly with the direction OpenSubtitles needs to take to be self-sustainable.

I myself would go full-API with the service, with the web as a last-resort system.

I would make the service more focused in quality instead of quantity, including a way to give incentive to editors and problem solvers, instead of raw numbers for uploaders, downloaders and badges.

I would try and provide an API for subtitle editors and include code in wikititles-type sites to enable this communication, providing a source of new subtitles that is reliable and trustable.

I know, human nature goes against what I want. That's my cross to bear. I don't blame anyone for taking a different direction, but I really can't support it if it goes against what I believe.

I'll keep doing my own API-based software and linking to the site as desired by "os" for as long as he allows it. And I'll regret not being able to do anything different due to lack of time.

It was foolish of me to think we could find common ground for all of the subtitle "scene" (there is one, yes) to agree and work together, like more chaotic scenes have managed to do in the past.

oss
Site Admin
Posts: 4619
Joined: Sat Feb 25, 2006 11:26 pm

Mon Nov 02, 2009 5:36 pm

OK, a trusted uploader status is possible. I can then give admins, platinum uploaders and active members the option to award it to other people (but those who receive it cannot pass it on to someone else). I think this would be a step forward.

Eduo, I know what you are talking about, and I am an idealist too; I can imagine a lot of things. Deep down I agree with you. This will be possible, I hope, in the next version of OpenSubtitles. Right now coding XML-RPC is a pain in the ass, because of the way OS is coded; it could be much better.

One thing I am testing is putting some advertisement into the subtitles themselves, for example "Best viewed by OpenSubtitles player". I know many will disagree with putting ads in subtitles, but I simply don't have enough budget to pay for the servers, and realistically it is the only way to open the website completely to the world. Then our idealistic worlds can be the same. Of course the advertisement will not be visible to admins, platinums, VIPs and so on... I will also try to code it so that it is not visible to people who visit the web page.

Simply:

using API and not a "better" member => 1 line of ad in subtitles at the beginning

using website => nothing changes, it will stay as it is.

I will try to do something with ratings, for example trusted members, admins and so on will get higher priority (I hope they also check others' subtitles :)

eduo
Moderator
Posts: 715
Joined: Sat Feb 10, 2007 1:40 am
Location: Information Technology

Mon Nov 02, 2009 6:13 pm

Yeah. I'm sorry. I wasn't clear. That's what I meant with not being able to agree with the direction being taken. It's not only the focus on the web, or the focus on penis size (high numbers, big rankings, shiny colors), or the multitude of banners. It's that adding ads to the subtitles is something I can't commune with.

I've thought about it long and hard, since I first saw the proposal. And I assure you I can understand the reasoning and that it's not an easy solution. But I still can't accept it. It's a line I can't cross.

As stated, I'll keep using OpenSubtitles as long as I'm allowed, and will include everything OS requires in my program regarding links and mentions. But I'll keep wishing for a different way to do things better suited to my principles (and, to be absolutely clear, that way doesn't currently exist anywhere, the only alternative to OS is run by one of the most morally questionable characters I have met on the Internet, so that's the worst possible alternative to consider).

On ratings: Scaled ratings are false friends. Without an objective relationship for each value they become even worse than any alternative. This is the reason ratings in IMDB and Amazon and the like are inherently broken.

To see how broken they are open a movie from a known source in Mininova (anything by aXXo or FXG would be perfect). You'll see plenty of people voting 10/10 without even having downloaded it. In places like Amazon or Apple's App Store or Macupdate you get people voting 1 or 0 to "balance things out for those that vote inaccurately".

If you like a subtitle you upvote it. If you don't like a subtitle you downvote it. No subjectivity. If you like it enough to vote it then that's an upvote, regardless of whether you liked it a lot or just a little.

This is not my own thing, this is the currently accepted best way to deal with the utter subjectivity of voting in a place like the internet, where, as has been rightly said, ratings are dictated by several things and very little by whether they allow 5, 10 or 20 steps in between.

This is hard to believe because rating systems are false friends. They are so "obvious" they can't be wrong. In the end it's the same as understanding that if 23 or more people are at a party, the probability of two of them sharing a birthday is over 50%: it's not common sense, it's against common sense, but it's provable math, and easily demonstrable.
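
The birthday claim is easy to check numerically. The sketch below assumes 365 equally likely birthdays and ignores leap years; under those assumptions the probability of a shared birthday first passes 50% at 23 people.

```python
def shared_birthday_probability(people, days=365):
    """Probability that at least two of `people` share a birthday."""
    p_all_distinct = 1.0
    for i in range(people):
        p_all_distinct *= (days - i) / days
    return 1.0 - p_all_distinct


# Smallest group size where a shared birthday is more likely than not.
threshold = next(
    n for n in range(1, 366) if shared_birthday_probability(n) > 0.5
)
print(threshold, round(shared_birthday_probability(threshold), 3))
```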

Bayesian rating depends on consistent results for unknown behaviours. But here you're dealing with inconsistent data throughout.
Last edited by eduo on Mon Nov 02, 2009 9:45 pm, edited 1 time in total.

xgpalex28
Posts: 6
Joined: Thu Feb 05, 2009 7:27 pm
Location: Bucharest

Mon Nov 02, 2009 8:44 pm

Hello,

First of all, congrats on trying to make things better (machine translation is one of the biggest problems this scene has, imo).
Well, I read the whole thread in the past few minutes, so some stuff is still not clear (I should read it again later; maybe some of the things I remember have been discussed already, approved or denied).

If it doesn't offend anyone, I'll compare OpenSubtitles to TPB. They are both big and they both have the same problem (well, one has fake torrents, the other has lots of low quality files). A way that worked (and still does) on TPB is that users are "Trusted", as you may know. If you look at the top torrents, they are uploaded by "Trusted" users.
I myself always download files by trusted users. That way I'm sure it's almost impossible to get a fake / infected file.
Long story short, this imo would be the best way for OS. It can also be done automatically, I think. For example, if an uploader's subtitles are voted at an average of 7+, then once they have 100 subtitles with a 7+ average vote they are promoted to "Test Trusted Uploader".
After a while a user with a higher rank can promote that uploader to Trusted Uploader.
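
The automatic part of that rule could look roughly like this. The thresholds come from the example above; the function name, the status strings and the input (one average vote per uploaded subtitle) are assumptions for illustration only.

```python
def promotion_status(sub_averages, min_average=7.0, min_qualifying=100):
    """Return the automatic status for an uploader.

    `sub_averages` holds the average vote of each subtitle the uploader
    has published. Promotion to full "Trusted Uploader" stays manual.
    """
    qualifying = sum(1 for avg in sub_averages if avg >= min_average)
    if qualifying >= min_qualifying:
        return "test_trusted"
    return "regular"


print(promotion_status([8.0] * 100))          # enough well-rated subs
print(promotion_status([8.0] * 99 + [5.0]))   # one short of the threshold
```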

This is a way that i think might work. The proof is already there.

(If this has already been suggested, sorry... I read too much o0)

OK, about the ads in subtitles... I'm totally against it. I can only imagine the costs that OS has (since we have problems with costs, and we're much smaller than you are)... but there's got to be another way.
Users already have AdBlock (and there are many!). OS already has ads running (including pop-ups... which are blocked by most browsers anyway) and adding ads in subtitles might annoy users more. For example, I'm annoyed by tvsubtitles. They add a line at the end of the subtitle that sometimes messes up the whole file, so you have to manually edit the file and remove the line. I assure you there aren't many people who know a .SRT file can be opened with Notepad (true story).
A file should only be edited if you improve it, not lower its quality. Well, just my 2 cents :-)
Hello, my name is Alex and i`m an Addic7
