amaarten
Posts: 8
Joined: Fri Dec 11, 2015 6:43 am

Hashes change due to advertisements

Thu Mar 30, 2017 5:43 pm

Hey,

For some time now, opensubtitles.org has been adding advertisements to subtitles.
This changes the hash of the subtitles.
My tool relies on these hashes to match downloaded subtitles with subtitles on the server.

My questions are:
  • Is there another way to match subtitles?
  • I would propose adding a parsable comment to the header of the downloaded subtitles, containing the original hash.
    Something like:

    ;;ORIGINALHASH=d41d8cd98f00b204e9800998ecf8427e
Any ideas?

Many thanks!
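
A minimal sketch of how a client could read such a header comment, assuming the `;;ORIGINALHASH=` format proposed above. That format is only a suggestion in this thread, not something opensubtitles.org emits, and `read_original_hash` and `max_chars` are hypothetical names chosen for illustration.

```python
import re

# Hypothetical marker format from the proposal above; not an existing site feature.
HEADER_RE = re.compile(r"^;;ORIGINALHASH=([0-9a-f]{32})\s*$", re.MULTILINE)


def read_original_hash(text, max_chars=1024):
    """Return the hash from a ;;ORIGINALHASH= line near the start of text, or None."""
    # Only the first max_chars characters are scanned, since the comment
    # is expected in the header of the file.
    match = HEADER_RE.search(text[:max_chars])
    return match.group(1) if match else None


sample = (
    ";;ORIGINALHASH=d41d8cd98f00b204e9800998ecf8427e\n"
    "1\n"
    "00:00:01,000 --> 00:00:02,000\n"
    "Hello\n"
)
original_hash = read_original_hash(sample)
```

A file without the comment simply yields `None`, so the tool can fall back to whatever matching it does today.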

oss
Site Admin
Posts: 5879
Joined: Sat Feb 25, 2006 11:26 pm
Contact: Website

Re: Hashes change due to advertisements

Tue Apr 04, 2017 4:26 pm

Hi

We can add the original subtitle hash in an HTTP header. Would that help you?

amaarten
Posts: 8
Joined: Fri Dec 11, 2015 6:43 am

Re: Hashes change due to advertisements

Tue Apr 04, 2017 11:16 pm

When a subtitle is downloaded, the original hash (i.e. without ads) is already known through your XML-RPC interface, so your suggestion wouldn't help.
My question is about the hash of an already downloaded file, with ads.
The hash of that file will not match any file on the server, even more so if the advert changes.

That's why I suggest adding the hash of the subtitle (i.e. the index in your database) as text in the subtitle itself.
One option is to prepend something computer-parsable like ;;ORIGINALHASH=d41d8cd98f00b204e9800998ecf8427e=ORIGINALHASH
A problem with this approach is that it might confuse some video players.
VLC 3.0 on Linux does play SRT files with the above line prepended.

Another possibility is to put this computer parsable hash at the end of the file at a time offset at or beyond the end of the movie. This way, the subtitle still has a valid syntax.

oss
Site Admin
Posts: 5879
Joined: Sat Feb 25, 2006 11:26 pm
Contact: Website

Re: Hashes change due to advertisements

Wed Apr 05, 2017 1:09 am

We keep track of all subtitle hashes, so each one is linked to the original file.

When you download a subtitle, your app already knows the original subtitle hash from the search results, so you can store it alongside the subtitle file and compare against it later. That will work.
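
A rough sketch of what storing the hash with the file could look like on the client side. It assumes the subtitle hash is an MD5 of the file contents (the 32-character example hash in this thread suggests so, but treat it as an assumption), and the JSON index file and the names `remember_download` and `lookup_original_hash` are hypothetical.

```python
import hashlib
import json
from pathlib import Path


def md5_of_file(path):
    """MD5 hex digest of the file as it sits on disk (ads included)."""
    return hashlib.md5(Path(path).read_bytes()).hexdigest()


def remember_download(index_path, subtitle_path, original_hash):
    """Record: hash of the downloaded file -> original hash from the search results."""
    index_path = Path(index_path)
    index = json.loads(index_path.read_text()) if index_path.exists() else {}
    index[md5_of_file(subtitle_path)] = original_hash
    index_path.write_text(json.dumps(index, indent=2))


def lookup_original_hash(index_path, subtitle_path):
    """Return the stored original hash for a local subtitle file, or None if unknown."""
    index_path = Path(index_path)
    if not index_path.exists():
        return None
    return json.loads(index_path.read_text()).get(md5_of_file(subtitle_path))
```

Keying the index on the hash of the downloaded file rather than on its path means the mapping survives renames and moves, as long as the index file is kept. It does not help for files the app never downloaded itself.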

We will not add anything more to the subtitles themselves, because it could break compatibility and lead to bigger problems.

Adding some metadata at 0:00:00 -> 0:00:00 might also not be valid, I think.

amaarten
Posts: 8
Joined: Fri Dec 11, 2015 6:43 am

Re: Hashes change due to advertisements

Wed Apr 05, 2017 8:02 am

For a recently downloaded file you can keep track of the relation. Sure.

But when we scan a new directory with videos and subtitles (including ads), the mapping will not work.

Yeah, adding `0:00:00 -> 0:00:00` might not be valid.
That's why I suggest adding something like

09:59:58,000 -> 09:59:59,000 ORIGINALHASH=d41d8cd98f00b204e9800998ecf8427e
at the end of the file.

Video players will never see that line since it is beyond the length of the movie.
That way programs can simply apply a regex to the last chunk of a file.
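
To illustrate the regex idea, here is a small sketch that reads only the last few hundred bytes of a file and looks for the marker. The `ORIGINALHASH=` trailer is the format proposed in this post, not something the site provides, and `read_trailing_hash` and `tail_bytes` are hypothetical names.

```python
import os
import re

# Hypothetical trailer format from the proposal above.
TRAILER_RE = re.compile(rb"ORIGINALHASH=([0-9a-f]{32})")


def read_trailing_hash(path, tail_bytes=512):
    """Search the last tail_bytes of a file for ORIGINALHASH=<32 hex chars>."""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        size = f.tell()
        # Jump back at most tail_bytes from the end; small files are read whole.
        f.seek(max(0, size - tail_bytes))
        tail = f.read()
    matches = TRAILER_RE.findall(tail)
    # Use the last occurrence, since the marker is expected in the final cue.
    return matches[-1].decode("ascii") if matches else None
```

Working on bytes avoids having to guess the text encoding of the subtitle, since the marker itself is plain ASCII.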

oss
Site Admin
Posts: 5879
Joined: Sat Feb 25, 2006 11:26 pm
Contact: Website

Re: Hashes change due to advertisements

Thu Apr 06, 2017 4:32 am

Hi

Did you try http://trac.opensubtitles.org/projects/ ... eckSubHash ?

Maybe this would help you.

Adding the original hash to the subtitle contents creates more subtitle hashes, which I want to avoid (and there are more formats than just SRT out there...)

Another thing: in your app you could keep track of the subtitles you have already downloaded (for those you got the original hash from the search results), and then ignore those...
