C# What's wrong with my OSDB hash algorithm?

Posted: Wed Oct 11, 2017 2:13 pm
by jessicamec
I'm trying to write a C# algorithm that computes the OpenSubtitles hash of an online video file, so I can use it to search for subtitles (the algorithm is described at https://trac.opensubtitles.org/projects/opensubtitles/wiki/HashSourceCodes).

My idea is that the algorithm is given the URL of a video file and returns the hash. Simple. Problem is, I'm not getting the right value back. According to the page I've linked to, this file should return 8e245d9679d31e12, but I'm getting 00c4fcb4aa6f763e. Here is my C#:

    public static async Task<byte[]> ComputeMovieHash(string filename)
    {
        long filesize = 0;

        //Get File Size
        HttpWebRequest req = (HttpWebRequest)WebRequest.Create(filename);
        req.Method = "HEAD";
        var resp = await req.GetResponseAsync();
        filesize = resp.ContentLength;
        long lhash = filesize;

        //Get first 64K bytes
        byte[] firstbytes = new byte[0];
        using (HttpClient client = new HttpClient())
        {
            client.DefaultRequestHeaders.Add("Range", "bytes=0-65536");
            using (HttpResponseMessage response = await client.GetAsync(filename))
            {
                Debug.WriteLine("getting first bytes (bytes=0-65536)");
                firstbytes = await response.Content.ReadAsByteArrayAsync();
            }
        }
        lhash += BitConverter.ToInt64(firstbytes, 0);

        //Get last 64K bytes
        byte[] lastbytes = new byte[0];
        using (HttpClient client = new HttpClient())
        {
            client.DefaultRequestHeaders.Add("Range", "bytes=" + (filesize - 65536) + "-" + filesize);
            using (HttpResponseMessage response = await client.GetAsync(filename))
            {
                Debug.WriteLine("getting last bytes (" + "bytes=" + (filesize - 65536) + "-" + filesize + ")");
                lastbytes = await response.Content.ReadAsByteArrayAsync();
            }
        }
        lhash += BitConverter.ToInt64(lastbytes, 0);

        //Return result
        byte[] result = BitConverter.GetBytes(lhash);
        Array.Reverse(result);
        Debug.WriteLine("RESULT=" + ToHexadecimal(result));
        return result;
    }

What am I doing wrong? I've compared it to the code given by opensubtitles.org, and it seems like it should produce the same result.

Re: C# What's wrong with my OSDB hash algorithm?

Posted: Fri Nov 03, 2017 8:32 pm
by sarathkcm
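You are supposed to read all the bytes of the first and last 64K of the file, interpret each chunk as a sequence of Int64 values, and add every one of those values to the file size. Instead, you are adding only the first Int64 from the beginning and from the end of the file.
Also, you are reading the first 64K+1 bytes instead of 64K bytes: HTTP byte ranges are inclusive, so "bytes=0-65536" requests 65537 bytes. Use "bytes=0-65535" instead.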
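
To illustrate the summing step described above, here is a minimal sketch, not the official OpenSubtitles code. It assumes the two chunks have already been downloaded into byte arrays; the class name OsdbHashSketch and the helpers SumChunk and ComputeHash are made up for this example. It also assumes a little-endian machine, since BitConverter reads integers in the platform's byte order.

```csharp
using System;

public static class OsdbHashSketch
{
    // Hypothetical helper: adds up every complete 8-byte word in the chunk
    // as an unsigned 64-bit integer. Overflow wraps around on purpose.
    private static ulong SumChunk(byte[] chunk)
    {
        ulong sum = 0;
        for (int i = 0; i + 8 <= chunk.Length; i += 8)
        {
            // Assumes BitConverter.IsLittleEndian is true on this machine.
            unchecked { sum += BitConverter.ToUInt64(chunk, i); }
        }
        return sum;
    }

    // Hypothetical helper: file size plus the word sums of both chunks,
    // formatted as 16 lowercase hex digits.
    public static string ComputeHash(long fileSize, byte[] firstChunk, byte[] lastChunk)
    {
        ulong hash;
        unchecked
        {
            hash = (ulong)fileSize + SumChunk(firstChunk) + SumChunk(lastChunk);
        }
        return hash.ToString("x16");
    }
}
```

With this shape, firstChunk would come from a request with "bytes=0-65535" and lastChunk from one covering the final 65536 bytes, i.e. from (filesize - 65536) to (filesize - 1). Because the hex string is produced directly from the number, no Array.Reverse is needed.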