DrKnowLittle
Posts: 8
Joined: Wed Nov 12, 2008 11:05 am

computing subtitlehash when uploading does not match

Wed Jan 14, 2009 4:11 pm

Hi!

I made a small tool for my own amusement to download (swe/eng) subtitles from different sources, and opensubtitles is one of the best ones. Now I want to contribute by adding upload functionality to my code, but I don't understand how the subtitlehash is calculated. I use the same code (see below) that I use to calculate the moviehash, but the result does not match the subtitlehash of the downloaded subtitle.

For example: Bones.S04E01E02.720p.HDTV.X264-DIMENSION
Moviehash = d51ed8ab9b5c436a
When I download the english srt it has subtitlehash = 56df05ef3ba83fd82252ae09ef2e6051
If I calculate a hash for the downloaded srt file using the same code as I use for movies, I get hash = 5faa63a7aa5429e4

Why is the subtitle hash longer than the movie hash? And how should I calculate it?

Code:

using System;
using System.IO;
using System.Text;

class MovieHash
{
    public static byte[] ComputeMovieHash(string filename)
    {
        byte[] result;
        using (Stream input = File.OpenRead(filename))
        {
            result = ComputeMovieHash(input);
        }
        return result;
    }

    public static byte[] ComputeMovieHash(Stream input)
    {
        long lhash, streamsize;
        streamsize = input.Length;
        lhash = streamsize; // the sum starts from the file size

        // Add the first 64 KiB of the file, read as 64-bit integers.
        long i = 0;
        byte[] buffer = new byte[sizeof(long)];
        while (i < 65536 / sizeof(long) && (input.Read(buffer, 0, sizeof(long)) > 0))
        {
            i++;
            lhash += BitConverter.ToInt64(buffer, 0);
        }

        // Add the last 64 KiB of the file the same way.
        input.Position = Math.Max(0, streamsize - 65536);
        i = 0;
        while (i < 65536 / sizeof(long) && (input.Read(buffer, 0, sizeof(long)) > 0))
        {
            i++;
            lhash += BitConverter.ToInt64(buffer, 0);
        }
        input.Close();

        // Reverse so the most significant byte comes first (on a little-endian platform).
        byte[] result = BitConverter.GetBytes(lhash);
        Array.Reverse(result);
        return result;
    }

    public static string ToHexadecimal(byte[] bytes)
    {
        StringBuilder hexBuilder = new StringBuilder();
        for (int i = 0; i < bytes.Length; i++)
        {
            hexBuilder.Append(bytes[i].ToString("x2"));
        }
        return hexBuilder.ToString();
    }
}

DrKnowLittle

Re: computing subtitlehash when uploading does not match

Thu Jan 15, 2009 11:06 am

Okay, 26 views and no response.

I have made some progress: I found out that the subhash is an MD5 checksum and changed my code to the one below, but I still don't get the expected result.

For example: Bones.S04E01E02.720p.HDTV.X264-DIMENSION
Moviehash = d51ed8ab9b5c436a
When I download the english srt it has subtitlehash = 56df05ef3ba83fd82252ae09ef2e6051

Using the code below, I tried every encoding type available and got the following results, but none of them match. Please help me and others work out how to calculate the subtitlehash correctly, as it's not possible to upload without it.

56df05ef3ba83fd82252ae09ef2e6051 opensubtitles md5 (is this correct??)

My results:
de9c55b37ce2ccdeaa68eb9709239d01 ascii
de9c55b37ce2ccdeaa68eb9709239d01 utf7
c76443ee3c2879ea25a163c833e95197 utf8
f8bfae9161174f752818be4a543c5631 utf32
b138df6ff5f25890183e4677f6a49fc8 Unicode
244c6ba09574c52ba613726cc621bbac bigEndianUnicode


Code:

public static string ComputeSubtitleHash(string filename)
{
    string input = System.IO.File.ReadAllText(filename);
    return HashString(input);
}

private static string HashString(string value)
{
    byte[] data = new MD5CryptoServiceProvider().ComputeHash(Encoding.UTF32.GetBytes(value));
    StringBuilder hashedString = new StringBuilder();
    for (int i = 0; i < data.Length; i++)
        hashedString.Append(data[i].ToString("x2"));
    return hashedString.ToString();
}

DrKnowLittle

Solution :!:

Thu Jan 15, 2009 1:58 pm

Finally figured out how to do it. The trick was to NOT make one big string out of the whole file first and then calculate the hash.

The following C# code should be added to the wiki.

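Code:

// Needs: using System.IO; using System.Security.Cryptography; using System.Text;
public static string ComputeSubtitleHash(string filename)
{
    StringBuilder sb = new StringBuilder();
    // Hash the raw bytes of the file; no text decoding is involved.
    // OpenRead asks for read access only, so read-only files work too.
    using (MD5 md5 = MD5.Create())
    using (FileStream fs = File.OpenRead(filename))
    {
        foreach (byte b in md5.ComputeHash(fs))
            sb.Append(b.ToString("x2")); // "x2" already gives lowercase hex
    }
    return sb.ToString();
}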
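
To see why the string-based attempt and the stream-based version disagree, here is a minimal sketch. It assumes a subtitle file that starts with a UTF-8 byte order mark, which is only one of the ways decoding can change the bytes. The class and method names are made up for illustration. File.ReadAllText drops the mark while decoding and Encoding.GetBytes does not put it back, so the re-encoded bytes are no longer the bytes on disk.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using System.Text;

// Illustrative names only; not part of any OpenSubtitles code.
public static class SubtitleHashDemo
{
    static string ToHex(byte[] bytes)
    {
        StringBuilder sb = new StringBuilder(bytes.Length * 2);
        foreach (byte b in bytes) sb.Append(b.ToString("x2"));
        return sb.ToString();
    }

    // MD5 over the bytes exactly as stored on disk.
    public static string HashRawBytes(string path)
    {
        using (MD5 md5 = MD5.Create())
        using (FileStream fs = File.OpenRead(path))
            return ToHex(md5.ComputeHash(fs));
    }

    // MD5 over the text after decoding and re-encoding it.
    public static string HashViaString(string path, Encoding encoding)
    {
        string text = File.ReadAllText(path);
        using (MD5 md5 = MD5.Create())
            return ToHex(md5.ComputeHash(encoding.GetBytes(text)));
    }

    // Writes a tiny srt-like file with a UTF-8 BOM and compares both hashes.
    public static bool HashesDifferForBomFile()
    {
        string path = Path.GetTempFileName();
        try
        {
            byte[] bom = { 0xEF, 0xBB, 0xBF };
            byte[] body = Encoding.ASCII.GetBytes(
                "1\r\n00:00:01,000 --> 00:00:02,000\r\nHello\r\n");
            using (FileStream fs = File.Create(path))
            {
                fs.Write(bom, 0, bom.Length);
                fs.Write(body, 0, body.Length);
            }
            return HashRawBytes(path) != HashViaString(path, Encoding.UTF8);
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

On such a file the two methods return different hashes, and only the raw-byte version hashes what is actually stored on disk.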

oss
Site Admin
Posts: 5891
Joined: Sat Feb 25, 2006 11:26 pm

Thu Jan 15, 2009 2:01 pm

As you know, moviehash and subtitlehash are different hashes. The moviehash is a special hash; you can find it in the wiki (source codes: just download the example files and make sure you get the same hashes). The subtitle hash is an ordinary MD5 hash. It is used in TryUploadSubtitles() to check for duplicates (duplicates are checked again later in UploadSubtitles(), but that's another story...).
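
A small sketch of why the two hashes differ in length, assuming the moviehash is the 64-bit value produced by the ComputeMovieHash code earlier in this thread: 8 bytes print as 16 hex characters, while an MD5 digest is 16 bytes and prints as 32. HashWidths and its methods are illustrative names only.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Illustrative helper; it only compares the printed widths of the two hashes.
public static class HashWidths
{
    static string ToHex(byte[] bytes)
    {
        StringBuilder sb = new StringBuilder(bytes.Length * 2);
        foreach (byte b in bytes) sb.Append(b.ToString("x2"));
        return sb.ToString();
    }

    // A 64-bit value (like the moviehash) is 8 bytes long.
    public static int MovieHashHexLength()
    {
        return ToHex(BitConverter.GetBytes(0L)).Length;
    }

    // An MD5 digest (like the subtitlehash) is 16 bytes long.
    public static int Md5HexLength()
    {
        using (MD5 md5 = MD5.Create())
            return ToHex(md5.ComputeHash(new byte[0])).Length;
    }
}
```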

So what is the problem now?

- you upload subtitles with some md5, and the subtitles are uploaded
- you download the SAME subtitles (are you sure they are the same?) and get a different md5?

Just compare (fc) those subtitles and you will see what is wrong.
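
If fc is not at hand, the same check can be sketched in C#. This is only an illustration (FileCompare and FirstDifference are made-up names). It reads both files fully into memory, which should be fine for subtitle-sized files.

```csharp
using System;
using System.IO;

// Illustrative helper for comparing two subtitle files byte by byte.
public static class FileCompare
{
    // Returns the offset of the first differing byte, or -1 if the files
    // are identical. If one file is a prefix of the other, the offset is
    // the length of the shorter file.
    public static long FirstDifference(string pathA, string pathB)
    {
        byte[] a = File.ReadAllBytes(pathA);
        byte[] b = File.ReadAllBytes(pathB);
        int common = Math.Min(a.Length, b.Length);
        for (int i = 0; i < common; i++)
        {
            if (a[i] != b[i]) return i;
        }
        return a.Length == b.Length ? -1 : common;
    }
}
```

A difference at offset 0 often points to a byte order mark, and a difference right at a line end to changed line endings; either one is enough to change the MD5.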

DrKnowLittle

Thu Jan 15, 2009 2:15 pm

I don't know if you are a C# programmer, but both the first code I used and the one that finally gave the same result as OpenSubtitles calculate an MD5 hash, just by different methods. The first one reads the whole file into a string and hashes the bytes of that string (when searching on Google, this seems to be the most common one), whereas the one that works hashes the raw bytes of the file straight from the stream.

Anyway, problem solved for me, but I won't be the last C# developer to find "the wrong" solution when searching on Google, so please add the code to the wiki.

I'm still confused :? about the answer I get when using the TryUploadSubtitles method.

I call it with the correct Moviehash, MovieByteSize, MovieFileName, SubFileName and SubHash. MovieFps, MovieFrames and MovieItems are left blank.

The answer I get is:
alreadyindb = 0
data = false (shouldn't this always be an array?)
seconds = 0.002
status = "200 OK"

But since I downloaded the subtitle using SearchSubtitles with my moviehash, and DownloadSubtitles to download the actual srt, I know that the srt is already in the database. So why do I get alreadyindb = 0??

oss

Fri Jan 16, 2009 5:59 am

I am not a C# developer; I played with it a long time ago. Anyway, for calculating MD5 I think there is only one right method.

For TryUploadSubtitles(): if those subtitles (by MD5 subtitlehash) already exist in the database, alreadyindb should of course return true. You can try UploadSubtitles(): after uploading them, try to upload them again, and it should work. If not, there is something wrong. You can send me what you are actually sending to the server and then I can debug that, but only after 26.1.; until then I am travelling and "not working" :)

thanks

DrKnowLittle

Sat Jan 17, 2009 1:38 pm

oss wrote:
I am not a C# developer; I played with it a long time ago. Anyway, for calculating MD5 I think there is only one right method.

Never mind the MD5 hash, that is solved now.

oss wrote:
For TryUploadSubtitles(): if those subtitles (by MD5 subtitlehash) already exist in the database, alreadyindb should of course return true. You can try UploadSubtitles(): after uploading them, try to upload them again, and it should work. If not, there is something wrong. You can send me what you are actually sending to the server and then I can debug that, but only after 26.1.; until then I am travelling and "not working" :) thanks

For my tests I have been using the following:
Bones.S04E01E02.720p.HDTV.X264-DIMENSION
Moviehash = d51ed8ab9b5c436a
Moviebytesize = 2346568376

I search for subs with "swe" or "eng" and get a hit on "eng" and download that sub using the DownloadSubtitles method without problems.

I then try to use TryUploadSubtitles with the following array[0] of parameters:

MovieFileName= Bones.S04E01E02.720p.HDTV.X264-DIMENSION
MovieByteSize = 2346568376
MovieHash= d51ed8ab9b5c436a
MovieFps = null
MovieFames = null
MoveItems = null
SubFileName = Bones.S04E01E02.720p.HDTV.X264-DIMENSION.srt
SubHash = 56df05ef3ba83fd82252ae09ef2e6051


The response I get is:
alreadyindb = 0
data = false
seconds = 0.002
status = "200 OK"

I don't know if you keep a log (I imagine that it would be a large one), but my login token on the last test was rpegakkohgbu2q8cgrja0ofoc6 and my test application logs in as "AllSubFinder v0.1".
Please help me, as I would love to contribute to the database.

DrKnowLittle

:cry:

Mon Feb 09, 2009 6:52 pm

2 weeks and no response. As I can't find any useful help on how to get upload working using XML-RPC in C#, I can't do much to support this site, as I really wish to do.

Isn't there anyone who has made a program that uses the XML-RPC API to upload subs to opensubtitles.org and cares to share some pieces of code?

Cougar_
Posts: 19
Joined: Fri May 23, 2008 9:18 pm

Re: :cry:

Wed Feb 11, 2009 6:16 pm

You must be doing something wrong; my piece of code works very well. I don't use C#, but it doesn't matter which development environment we use.

Check that you are really sending correct data:

MoveItems = null ???? - wrong copy/paste ?? or mistake in data ??. As I looked in the doc, I only see 'movietimems' => $movietimems. Check it.

There may be a problem with the archiver (implementation) you use. It may compress with a different algorithm; there are many implementations, older and newer ones with new features... Generally, use those under the GNU license - gzip etc.

But as I see your trouble with the hash, I think you simply don't pass correct data to the procedures, so the algorithm calculates a wrong result or signals an error.
First, you should understand what you are doing.
Check that the XML data you send is correct - not only the data, but the XML structure too!!! Very important.

The login hash doesn't matter; simply use "00000" for testing purposes.
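
One way to check the XML structure is to build the request by hand and print it before sending. The sketch below is only an illustration under several assumptions: that the call takes the token followed by a struct with a single "cd1" entry, that all field values are sent as strings, and that the field names are lowercase keys in the style of the 'movietimems' key quoted above (for example subhash, subfilename, moviehash, moviebytesize, moviefilename). All of this should be checked against the API documentation. TryUploadPayload and Build are made-up names.

```csharp
using System.Collections.Generic;
using System.Xml.Linq;

// Illustrative builder for an XML-RPC request body; not an official client.
public static class TryUploadPayload
{
    static XElement StringValue(string s)
    {
        return new XElement("value", new XElement("string", s));
    }

    static XElement Member(string name, XElement value)
    {
        return new XElement("member", new XElement("name", name), value);
    }

    static XElement Struct(IEnumerable<XElement> members)
    {
        return new XElement("value", new XElement("struct", members));
    }

    // token: the login token; cd1: field name -> value for the first CD.
    public static XDocument Build(string token, IDictionary<string, string> cd1)
    {
        List<XElement> fields = new List<XElement>();
        foreach (KeyValuePair<string, string> kv in cd1)
        {
            fields.Add(Member(kv.Key, StringValue(kv.Value)));
        }

        // Assumed layout: params = [ token, { "cd1": { ...fields... } } ]
        XElement outer = Struct(new[] { Member("cd1", Struct(fields)) });

        return new XDocument(
            new XElement("methodCall",
                new XElement("methodName", "TryUploadSubtitles"),
                new XElement("params",
                    new XElement("param", StringValue(token)),
                    new XElement("param", outer))));
    }
}
```

Calling Build(...).ToString() gives the XML to compare against what the program really sends. One thing worth checking in the structure: the MovieByteSize from this thread, 2346568376, is larger than 2147483647, so it cannot go into a 32-bit XML-RPC int. The sketch sends everything as a string for that reason, which is again an assumption about what the server accepts.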

oss

Thu Feb 12, 2009 1:40 pm

Yes, you must be doing something wrong. When you are developing, just upload some subtitle by yourself, and then delete it...

No one else has a problem and all programs work, so there must be some kind of problem on your side :)
