Writing a film identifying application, thoughts appreciated

 
stavros_sk



Joined: 25 Sep 2008
Posts: 9

Posted: Sun Nov 02, 2008 8:26 pm    Post subject: Writing a film identifying application, thoughts appreciated
Hello all. First, I would like to wholeheartedly thank the OSDb developers for your great work on this open database, and especially the video hash feature. It has been more useful to me (and thousands of other users) than one could imagine.

I'm currently coding a film identifying application in the form of a plugin for Meedio (a well-respected HTPC, i.e. Home Theater PC, application).
Please see the development forum thread here:
http://www.meedios.com/forum/viewtopic.php?t=3912

Currently (and thanks to your API), my application can identify the user's films (mostly .avi), download details for them from IMDb, and also download posters, backdrop art for the HTPC frontend, and of course synchronized subtitles in the language of his/her choosing.

But something is still missing from the whole picture, and I think yours is the only open database in the world which, with a small modification, can provide the full set of benefits of video hash identification.

So let me explain. Currently, the benefit a user gets from linking a unique hash to his/her video files is that he/she can download perfectly synced subtitles for the film, and that's great.

But there's another benefit the video hash feature can provide: identifying a movie from its video hash and downloading details for it without requiring the user to give the program any info about the film.
You may of course wonder why one would want to download details for a film without providing its name to the application.
The answer is this: an important feature of Home Theater applications is being able to show a set of details about a film in their graphical interface. But downloading those details isn't always straightforward, and users often have to resort to complicated methods like tag masks, regular expressions, or HTML scraping for the application to be able to filter their video files effectively and download relevant information.
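
For context, the video hash discussed here can be computed from the file alone. Below is a minimal Python sketch, assuming the commonly documented OSDb scheme (file size plus the 64-bit sums of the first and last 64 KiB, printed as 16 hex digits). The thread itself does not spell the algorithm out, and the function name `compute_movie_hash` and the small-file handling are illustrative choices only.

```python
import os
import struct

CHUNK_SIZE = 64 * 1024  # bytes read from each end of the file


def compute_movie_hash(path):
    """Return file size + 64-bit sums of the first and last 64 KiB, as 16 hex digits."""
    size = os.path.getsize(path)
    if size < 2 * CHUNK_SIZE:
        # Assumption: very small files are rejected rather than hashed.
        raise ValueError("file too small to hash this way")

    total = size
    with open(path, "rb") as f:
        for offset in (0, size - CHUNK_SIZE):
            f.seek(offset)
            chunk = f.read(CHUNK_SIZE)
            # Interpret the chunk as little-endian unsigned 64-bit integers.
            words = struct.unpack("<%dQ" % (CHUNK_SIZE // 8), chunk)
            # Keep the running sum within 64 bits (wrap on overflow).
            total = (total + sum(words)) & 0xFFFFFFFFFFFFFFFF
    return "%016x" % total
```

Since only the file size and two small chunks are read, hashing is fast even for large files, but a re-encoded or re-muxed copy of the same film will normally produce a different hash.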

Currently, thanks to you, my application is able to identify most films in .avi format, but a lot of these users' films are in the .vob, .iso and .mkv container formats, and most of those films are not recognized by OSDb (because they are not contained in the database).

So the idea is this: if you would kindly allow users or applications (through your API) to upload their video hashes along with the relevant film details, without requiring the user to supply a subtitle, mine and other applications could provide a semi-automatic upload feature. That would greatly increase the number of films included in the database and make it possible to identify most films in any video format.

For my part, I am implementing a semi-automatic subtitle uploading feature in my application which, combined with the film details uploading mechanism, could increase the number of subtitles included in the database by a fair amount.

Please let me know your thoughts about this idea, and thanks for reading.

Regards,
Stavros.
os
Site Admin


Joined: 25 Feb 2006
Posts: 1229

Posted: Mon Nov 03, 2008 6:31 pm
Hello to Greece, enthalpient :)

I read that topic carefully and am just downloading everything needed, so I can see it in action. Of course I understand your problem very well, and to be honest, I was expecting it :)

A hash fingerprinting service for movies is a really nice service. I wish there were something like audio fingerprinting for it: http://wiki.musicbrainz.org/AudioFingerprint - in short, that means that if you re-encode the audio, it is detected anyway. That's the future, but I don't know enough about this area :)

OK, I understand what you need. Right now os doesn't support what you are asking for, so I have to build it.

Write down the XMLRPC methods and parameters you need and I will start coding them. For now I can imagine:

Code:

SaveHashMovie($moviehash, $imdbid)


What else do you need? :) As time will show, something like "ReportWrongHash" will be needed, and so on :)

But we have to start somewhere first. Also, thanks for the nice article in your forum. I will upgrade the servers soon and add some other features.
stavros_sk



Joined: 25 Sep 2008
Posts: 9

Posted: Wed Nov 05, 2008 12:48 am
Hello again from Greece :)

Yes, a simple uploading function like UploadMovieHash( $moviehash, $imdbid ) should be enough for a start. Maybe it could return a bool to indicate whether the call succeeded or not. Don't worry about duplicate upload requests; my application will first check whether the data is already there.
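
The check-then-upload behaviour described above could look roughly like this in a client. This is only a sketch: `lookup` and `upload` are hypothetical callables standing in for whatever API methods end up existing, and the bool return mirrors the suggestion above.

```python
def upload_if_missing(moviehash, imdbid, lookup, upload):
    """Upload a (moviehash, imdbid) pair only if it is not already known.

    lookup(moviehash) -> iterable of IMDb ids already linked to the hash (hypothetical)
    upload(moviehash, imdbid) -> truthy on success (hypothetical)
    Returns True only when an upload was attempted and succeeded.
    """
    known_ids = {str(i) for i in lookup(moviehash)}
    if str(imdbid) in known_ids:
        return False  # pair already in the database; skip the duplicate request
    return bool(upload(moviehash, imdbid))
```

Checking on the client side keeps duplicate requests off the server, at the cost of one extra lookup per file.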

Then, mine and other applications will be able to identify the film by its video file. :)

Thanks for considering my request.

Feel free to ask if you want to know anything else about this feature.

Thanks, Stavros.
os
Site Admin


Joined: 25 Feb 2006
Posts: 1229

Posted: Wed Nov 19, 2008 8:03 pm
That is enough for me; I am working on it these days. A small problem is that the database was not designed this way, but I already have a couple of ideas :)
os
Site Admin


Joined: 25 Feb 2006
Posts: 1229

Posted: Fri Dec 05, 2008 6:11 am    Post subject: DEV: InsertMovieHash, CheckMovieHash2
stavros: I finished (not really, it is still a work in progress) programming what you asked for. Please try these XMLRPC functions and comment on them:

Code:

InsertMovieHash($token, array(array('moviehash' => $moviehash, 'moviebytesize' => $moviebytesize, 'imdbid' => $imdbid, 'movietimems' => $movietimems, 'moviefps' => $moviefps), array(...)))

This will insert movie hashes into the database. movietimems and moviefps are optional parameters. Maybe I can also add moviefilename to those params (then we can know the real name of the file according to statistics). That is for insert; a user cannot insert the same (duplicate) hashes (e.g. the unique key is hash, size, imdb).
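
A minimal Python sketch of how a client might assemble the InsertMovieHash argument described above. Assumptions: `server` is any XML-RPC proxy object exposing `InsertMovieHash` (for example an `xmlrpc.client.ServerProxy`), values are sent as strings (a guess; plain XML-RPC integers are limited to 32 bits, which a large moviebytesize could exceed), and the helper names `build_entry` and `insert_movie_hashes` are made up for illustration.

```python
def build_entry(moviehash, moviebytesize, imdbid, movietimems=None, moviefps=None):
    """Build one struct for InsertMovieHash; optional fields are left out when unknown."""
    entry = {
        "moviehash": moviehash,
        "moviebytesize": str(moviebytesize),  # sent as a string: an assumption
        "imdbid": str(imdbid),
    }
    if movietimems is not None:
        entry["movietimems"] = str(movietimems)
    if moviefps is not None:
        entry["moviefps"] = str(moviefps)
    return entry


def insert_movie_hashes(server, token, entries):
    """Drop local duplicates (same hash, size, imdb), then send one batched call."""
    unique = {}
    for e in entries:
        key = (e["moviehash"], e["moviebytesize"], e["imdbid"])
        unique.setdefault(key, e)  # keep the first entry seen for each key
    return server.InsertMovieHash(token, list(unique.values()))
```

Deduplicating on the same (hash, size, imdb) key before sending simply avoids submitting pairs the server would refuse anyway.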

Code:

CheckMovieHash2($token, array($moviehash, $moviehash,...))

This returns the moviehashes that were found. The output looks like this:
Code:

Array
(
    [status] => 200 OK
    [data] => Array
        (
            [53fc6fe84ad5ee31] => Array
                (
                    [0] => Array
                        (
                            [MovieHash] => 53fc6fe84ad5ee31
                            [MovieImdbID] => 1
                            [MovieName] => Carmencita
                            [MovieYear] => 1894
                            [SeenCount] => 1
                        )

                )

            [a3fc6fe84ad5ee31] => Array
                (
                    [0] => Array
                        (
                            [MovieHash] => a3fc6fe84ad5ee31
                            [MovieImdbID] => 2
                            [MovieName] => Clown et ses chiens, Le
                            [MovieYear] => 1892
                            [SeenCount] => 1
                        )

                )

        )

    [seconds] => 0.002
)


It is still at a very early beta stage, so try to insert some movie hashes and then request them, to see if it works for you. Also, don't forget that when it is officially launched, the tables will be truncated and all hashes from the old tables will be transferred to the new ones.

Maybe you will ask why the output is so complicated, and why more than one movie can exist for one hash. It is because users sometimes send wrong hashes, or simply make a mistake (always possible). When there are several options, the user should decide which movie is the right one, and then call InsertMovieHash() (maybe with some parameter?) to correct it.
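
One way a client could handle several candidates per hash is to rank them by SeenCount and ask the user only when more than one remains. A minimal Python sketch, assuming the response has already been decoded into dicts and lists shaped like the sample output above; `rank_candidates` and `best_match` are hypothetical helper names.

```python
def rank_candidates(response, moviehash):
    """Return the candidates for one hash, most frequently seen first."""
    data = response.get("data") or {}  # tolerate a missing or empty data field
    candidates = data.get(moviehash, [])
    return sorted(candidates, key=lambda c: int(c["SeenCount"]), reverse=True)


def best_match(response, moviehash):
    """Return (candidate, ambiguous); candidate is None when the hash is unknown."""
    ranked = rank_candidates(response, moviehash)
    if not ranked:
        return None, False
    # More than one candidate means the user should confirm the choice.
    return ranked[0], len(ranked) > 1
```

When `ambiguous` is true, the application can show the ranked list and let the user pick, as suggested above.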

OK, that's all for now. I will be away for 5 days, so please test it as much as you can. When I return I will finish this.
stavros_sk



Joined: 25 Sep 2008
Posts: 9

Posted: Thu Dec 11, 2008 1:50 am
Hi os, thanks for getting back to me. I'll do some tests and let you know how it goes! :)
os
Site Admin


Joined: 25 Feb 2006
Posts: 1229

Posted: Thu Dec 11, 2008 5:01 am
Today I am just finishing those methods; they are already documented in trac. Let me know how it goes ASAP.
os
Site Admin


Joined: 25 Feb 2006
Posts: 1229

Posted: Thu Dec 11, 2008 9:25 am
Read more about this topic here:

http://trac.opensubtitles.org/projects/opensubtitles/wiki/MovieIdentification
stavros_sk



Joined: 25 Sep 2008
Posts: 9

Posted: Wed Dec 17, 2008 8:02 am
OK os, I implemented the hash uploading methods in my program. InsertMovieHash() works fine; however, the new movie hash cannot be retrieved later with CheckMovieHash(). Are new hashes available instantly when they get inserted, or do they need to be verified by the admins before they become available?

If no verification is needed, I must be doing something wrong, which I will figure out eventually, but knowing this might save me some headaches :)

Edit: It works great now; it was an error in my program, as I suspected.

The 'SeenCount' addition in CheckMovieHash2() is very helpful too, to minimize false matches.

Thanks for everything os. I'm trying to spread the word about your site, so hopefully some more people will support it.
os
Site Admin


Joined: 25 Feb 2006
Posts: 1229

Posted: Wed Dec 17, 2008 1:36 pm
Thanks a lot. The most important part is implementing all the guidelines I wrote in trac. For example, "my" movies always have an nfo attached (but I have them all in rars). In this way, it is possible to insert so many (new) moviehashes...

Let me know about your progress.
All times are GMT + 2 Hours