oss
Site Admin
Posts: 5879
Joined: Sat Feb 25, 2006 11:26 pm
Contact: Website

Subtitle Time Stamp Groups

Sun May 29, 2016 8:14 am

Hi,

after the release of https://github.com/Ivshti/node-subtitles-grouping I knew it was a good feature, but I never really got into it. Its developer contacted me a couple of days ago, so I looked into it again, and it seems like a nice idea.

The easy way to explain what is going on: imagine you have 4 subtitles for 1 movie (imdbid); they can be in one language or in several, the language is not important here. You download the 1st subtitle and it is out of sync. It would be good to know which subtitles you don't have to download, because they will be out of sync too.

And here comes Subtitle Time Stamp grouping. This algo makes a heatmap of each subtitle's timestamps and compares it to the others; depending on a given threshold, it either adds the subtitle to an existing group or creates a new group.
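
To make the idea concrete, here is a minimal sketch in Python. It is not the code of node-subtitles-grouping: the bucket size, the similarity measure (Jaccard overlap) and the threshold value are all assumptions picked for illustration, and every function name is hypothetical.

```python
from typing import Dict, List, Set

BUCKET_MS = 2000   # assumed heatmap resolution: one bucket per 2 seconds
THRESHOLD = 0.7    # assumed minimum similarity needed to join a group


def heatmap(start_times_ms: List[int]) -> Set[int]:
    """Reduce cue start times to the set of time buckets they fall into."""
    return {t // BUCKET_MS for t in start_times_ms}


def similarity(a: Set[int], b: Set[int]) -> float:
    """Jaccard overlap of two heatmaps (1.0 means identical bucket sets)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def group_subtitles(subs: Dict[str, List[int]]) -> Dict[str, int]:
    """Put each subtitle into the first group whose first member is similar enough."""
    representatives: List[Set[int]] = []   # heatmap of the first member of each group
    assignment: Dict[str, int] = {}
    for sub_id, times in subs.items():
        hm = heatmap(times)
        for group_no, rep in enumerate(representatives, start=1):
            if similarity(hm, rep) >= THRESHOLD:
                assignment[sub_id] = group_no
                break
        else:
            # no existing group is close enough, so start a new one
            representatives.append(hm)
            assignment[sub_id] = len(representatives)
    return assignment


# Hypothetical cue start times in milliseconds for three subtitles of one movie.
subs = {
    "a": [1000, 5000, 9000, 13000],
    "b": [1100, 5200, 9100, 13300],   # nearly the same sync as "a"
    "c": [3000, 7000, 11000, 15000],  # shifted by 2 seconds
}
groups = group_subtitles(subs)
```

With this sample data "a" and "b" end up in group 1 and "c" in group 2. A subtitle that sits right at the threshold can land on either side, so the threshold would need tuning on real data.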

I made some tests, and it is quite interesting.

The database is being populated with data now, so it will take some time, but you can already see 2 new fields in the xml-rpc results, for example:

[SubTSGroup] => 2
[SubTSGroupHash] => 041ec2cdae88105ccce1e1f9f4eebc85
where:
SubTSGroup is the number of the TIME STAMP group that the IDSubtitleFile belongs to for the given movie
SubTSGroupHash is something like md5($SubTSGroup . "|" . $IDMovieIMDB), so it identifies the group system-wide.
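
A rough Python equivalent of that expression, for anyone who wants to experiment on the client side. The post only says the hash is "something like" this, so the exact input format (for example whether the IMDB id has leading zeros) is an assumption, and the function name is hypothetical. Do not expect it to reproduce the server's value.

```python
import hashlib


def ts_group_hash(sub_ts_group: int, imdb_id: str) -> str:
    """md5 of "<group>|<imdb id>", mirroring md5($SubTSGroup . "|" . $IDMovieIMDB)."""
    raw = f"{sub_ts_group}|{imdb_id}"
    return hashlib.md5(raw.encode("utf-8")).hexdigest()


# Same group number, different movies: the hashes differ,
# which is what makes the value usable system-wide.
hash_a = ts_group_hash(2, "49778")
hash_b = ts_group_hash(2, "49779")
```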

This could be used in many different areas (if it proves to work).

For example, you can try it right now with xml-rpc SearchSubtitles() and this request:

'imdbid' => '49778'

results:
http://pastebin.com/kmGz5Zrb

and there you can see SubTSGroup with some number. So why not download a couple of subtitles and actually check whether the SUBTITLE TIMESTAMPS are similar within each group (and not similar across groups)?
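
A small sketch of how a client could use the field once it has the SearchSubtitles() results. It works on already-parsed result rows and makes no network call. The sample rows are made up, the helper names are hypothetical, and treating an empty SubTSGroup as "not grouped yet" is an assumption.

```python
from collections import defaultdict
from typing import Dict, List


def by_ts_group(results: List[dict]) -> Dict[str, List[str]]:
    """Map each SubTSGroup value to the IDSubtitleFile values in that group."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for row in results:
        group = row.get("SubTSGroup")
        if group:  # skip rows that have no group assigned (assumed: empty or missing)
            groups[str(group)].append(row["IDSubtitleFile"])
    return dict(groups)


def worth_trying(results: List[dict], out_of_sync_id: str) -> List[str]:
    """List the subtitles that are not in the same group as one known to be out of sync."""
    bad_group = next(
        (r.get("SubTSGroup") for r in results if r["IDSubtitleFile"] == out_of_sync_id),
        None,
    )
    return [
        r["IDSubtitleFile"]
        for r in results
        if r["IDSubtitleFile"] != out_of_sync_id
        and (not bad_group or r.get("SubTSGroup") != bad_group)
    ]


# Made-up rows; only the two field names come from the xml-rpc output.
results = [
    {"IDSubtitleFile": "101", "SubTSGroup": "1"},
    {"IDSubtitleFile": "102", "SubTSGroup": "2"},
    {"IDSubtitleFile": "103", "SubTSGroup": "1"},
    {"IDSubtitleFile": "104", "SubTSGroup": ""},
]
grouped = by_ts_group(results)
candidates = worth_trying(results, "101")
```

If subtitle 101 turned out to be out of sync, 103 is skipped because it shares group 1, while 102 and the ungrouped 104 stay on the list.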

For now this works only for srt files; support for other formats will be added later.

vankasteelj
Posts: 175
Joined: Sun Nov 15, 2015 1:09 am

Re: Subtitle Time Stamp Groups

Mon May 30, 2016 5:24 pm

Oh yes I've been told that this grouping thing was awesome but never took the time to look into it. Great news if it can improve automatic matching!

IvoGeorgiev
Posts: 1
Joined: Tue May 31, 2016 2:07 pm

Re: Subtitle Time Stamp Groups

Tue May 31, 2016 2:42 pm

hello, developer here

vtt would be simple to add; it's almost the same as srt, and if I'm not mistaken the system may already support vtt at the parsing level

I think there may be a lot of room for improvement in the algo itself; as @oss discovered, "hearing impaired" subtitles are currently being put into separate groups, which is a problem

-----

A few words of explanation: grouping subtitles by how they're synced is beneficial, because we can cross-reference that with moviehash matches and find the "right" group

oss
Site Admin
Posts: 5879
Joined: Sat Feb 25, 2006 11:26 pm
Contact: Website

Re: Subtitle Time Stamp Groups

Tue May 31, 2016 4:37 pm

yes, vtt is simple, but we don't have any vtt subtitles uploaded yet.

as for hearing impaired: they can, but don't have to, end up in a different group; they will be right on the border of the threshold.

vankasteelj
Posts: 175
Joined: Sun Nov 15, 2015 1:09 am

Re: Subtitle Time Stamp Groups

Tue May 31, 2016 6:11 pm

you could easily convert your srt files to vtt with a script, but you probably know that, so I might be misunderstanding
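
For reference, such a script can be very short. This is a minimal sketch that assumes a well-formed srt file and only does the two required changes: add the WEBVTT header and switch the millisecond separator from a comma to a dot on timing lines. The function name is hypothetical, and styling tags or position hints are not handled.

```python
import re

# Matches an srt timestamp such as 00:01:02,345
_TIMESTAMP = re.compile(r"(\d{2}:\d{2}:\d{2}),(\d{3})")


def srt_to_vtt(srt_text: str) -> str:
    """Convert srt text to WebVTT text."""
    body = srt_text.lstrip("\ufeff").replace("\r\n", "\n")  # drop BOM, normalize newlines
    lines = []
    for line in body.split("\n"):
        if "-->" in line:
            # only touch timing lines, so commas in the dialogue stay as they are
            line = _TIMESTAMP.sub(r"\1.\2", line)
        lines.append(line)
    return "WEBVTT\n\n" + "\n".join(lines).strip() + "\n"


sample_srt = "1\n00:00:01,000 --> 00:00:02,500\nHello, world\n"
sample_vtt = srt_to_vtt(sample_srt)
```

The numeric counters from the srt file are kept, since WebVTT accepts them as cue identifiers.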

oss
Site Admin
Posts: 5879
Joined: Sat Feb 25, 2006 11:26 pm
Contact: Website

Re: Subtitle Time Stamp Groups

Wed Jun 15, 2016 7:01 am

sure, but that is more about "output support" (download as vtt).
