Yougli
Posts: 9
Joined: Sun Feb 17, 2008 10:07 am

ComputeMovieHash in C#

Tue Nov 25, 2008 4:47 pm

Hi,

I'm using the code provided in the wiki to compute the hash of a movie. But I'm getting an overflow exception when executing the code. Any idea please?

Code: Select all

        private static byte[] ComputeMovieHash(Stream input)
        {
            long lhash, streamsize;
            streamsize = input.Length;
            lhash = streamsize;
 
            long i = 0;
            byte[] buffer = new byte[sizeof(long)];
            while (i < 65536 / sizeof(long) && (input.Read(buffer, 0, sizeof(long)) > 0))
            {
                i++;
                lhash += BitConverter.ToInt64(buffer, 0); // The exception occurs here, at the 3rd iteration with the test avi: breakdance.avi
            }
 
            input.Position = Math.Max(0, streamsize - 65536);
            i = 0;
            while (i < 65536 / sizeof(long) && (input.Read(buffer, 0, sizeof(long)) > 0))
            {
                i++;
                lhash += BitConverter.ToInt64(buffer, 0);
            }
            input.Close();
            byte[] result = BitConverter.GetBytes(lhash);
            Array.Reverse(result);
            return result;
        }


Thanks for your help...

stavros_sk
Posts: 9
Joined: Thu Sep 25, 2008 6:11 am

Wed Nov 26, 2008 8:23 am

Enclose all the function's operations (code) into a try-catch e.g.:


try
{
    //function code
    //...
    //...
}
catch (Exception e)
{
    Console.WriteLine(e.ToString());
}


...to get a clue where exactly, in which line the error occurs.

Yougli
Posts: 9
Joined: Sun Feb 17, 2008 10:07 am

Wed Nov 26, 2008 2:28 pm

I know exactly where the exception occurs, I added a comment in the code in my first post.
Sorry if it wasn't clear...

The error occurs at:

Code: Select all

lhash += BitConverter.ToInt64(buffer, 0);

in the first while loop, at the 3rd iteration for the test avi 'breakdance.avi'

Cougar_
Posts: 19
Joined: Fri May 23, 2008 9:18 pm

Wed Nov 26, 2008 6:22 pm

I'm not a C# programmer, so I may be mistaken, but overflow is normal in this piece of code. Yes, the code contains errors, but even without them it is highly likely that an overflow occurs.

First, signed and unsigned arithmetic look similar, but confusing them is often the cause of errors that are very hard to detect.

In this code, I think every variable that takes part in calculating the hash value should be a 64-bit unsigned integer.
According to MSDN, "long" in C# is 64-bit but signed, not unsigned. So I think this is the first mistake.

Second, BitConverter.ToInt64 converts the data stream to a SIGNED integer, when it should convert to an unsigned one. This is a big mistake. Why?
Look:
4 + 0xff = 3, because 0xff is -1 in signed arithmetic (assuming 8-bit variables, of course).

In unsigned arithmetic 4 + 0xff = 0x103, but the variable has only 8 bits and the result requires 9 bits, so an overflow occurs and only the low 8 bits are kept. The result is 0x03 and the most significant bit is lost. That lost bit is what raises the overflow condition: the processor sets its CF (carry) flag.

So overflow is normal when calculating the hash. Simply add
4 + 0xff ff ff ff ff ff ff ff and you get an overflow, but in this case you should ignore it, and the value 0x3 is correct as the hash.


You may ask why there is an error when the variable is declared as long, since in both cases the result is the same: 0x03. Yes, when you add two variables of the same width the result is the same, but this is a special case. I'm 100 percent sure that the author wrote this piece of code and left it in this form because it works, not because he knew about this special case in signed/unsigned arithmetic. The only difference between the two situations is the moment at which the error occurs, but that pseudo-error doesn't interest us here. The final result is the same in both cases, so in practice this code works fine.

When the widths aren't the same, the problems begin:

int64 = int64 + signed byte => 4 + 0xff = 3;

int64 = int64 + unsigned byte => 4 + 0xff = 0x103;


The author of this code probably compiled it with the overflow checking option turned off.
I don't know whether C# is that strict about checking integer overflow by default. Personally I doubt it, so I guess such strict checking must have been enabled in your project options.

As I mentioned, I'm not a C# programmer, but I think that an explicit conversion to a 64-bit integer may help.

lhash = (long)(lhash + BitConverter.ToInt64(buffer, 0)); // The exception

In this case the compiler should understand that you know what you are doing, because you explicitly convert the result to long and truncate it, so there shouldn't be any error.

For me this piece of code works fine. As I said, it has errors only in theory.
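
A minimal C# sketch of the 8-bit arithmetic described in this post, for anyone who wants to try it. The names WrapDemo, AddUnsigned8 and AddSigned8 are made up for this illustration, and unchecked is written out so that the wrap-around happens regardless of the project's overflow setting.

```csharp
using System;

static class WrapDemo
{
    // Adds two unsigned 8-bit values and keeps only the low 8 bits of the sum.
    public static byte AddUnsigned8(byte a, byte b)
    {
        return unchecked((byte)(a + b));
    }

    // Adds the same bit patterns interpreted as signed 8-bit values.
    public static sbyte AddSigned8(sbyte a, sbyte b)
    {
        return unchecked((sbyte)(a + b));
    }

    static void Main()
    {
        // Same width: 4 + 0xff leaves the same low byte either way.
        byte u = AddUnsigned8(4, 0xff);
        sbyte s = AddSigned8(4, unchecked((sbyte)0xff));
        Console.WriteLine("unsigned: 0x{0:x2}, signed: 0x{1:x2}", u, s);

        // Different widths: the signed byte is sign-extended, the unsigned one is not.
        long wideSigned = 4L + unchecked((sbyte)0xff);
        long wideUnsigned = 4L + (byte)0xff;
        Console.WriteLine("wide signed: 0x{0:x}, wide unsigned: 0x{1:x}", wideSigned, wideUnsigned);
    }
}
```

With equal widths both additions leave the same bit pattern; only the widened additions differ, which is the case the post warns about.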

Yougli
Posts: 9
Joined: Sun Feb 17, 2008 10:07 am

Thu Nov 27, 2008 3:44 pm

Ok, I tried declaring my variables using ulong instead of long, and the application doesn't throw an exception anymore. But the hash I get is incorrect according to the wiki.
I get 55f61777884dc435 instead of 8e245d9679d31e12.

The code:

Code: Select all

      private static byte[] ComputeMovieHash(Stream input)
      {
            ulong lhash, streamsize;
            streamsize = (ulong) input.Length;
            lhash = streamsize;
 
            long i = 0;
            byte[] buffer = new byte[sizeof(long)];
            while (i < 65536 / sizeof(long) && (input.Read(buffer, 0, sizeof(long)) > 0))
            {
                i++;
                lhash = BitConverter.ToUInt64(buffer, 0);
            }
 
            input.Position = (long) Math.Max(0, streamsize - 65536);
            i = 0;
            while (i < 65536 / sizeof(long) && (input.Read(buffer, 0, sizeof(long)) > 0))
            {
                i++;
                lhash += BitConverter.ToUInt64(buffer, 0);
            }
            input.Close();
            byte[] result = BitConverter.GetBytes(lhash);
            Array.Reverse(result);
            return result;
      }


Thanks for your help

Cougar_
Posts: 19
Joined: Fri May 23, 2008 9:18 pm

Thu Nov 27, 2008 5:35 pm

You forgot the add operator in the first loop:

lhash = BitConverter.ToUInt64(buffer, 0);

When I added the missing operator and compared the result with the C++ procedure that I use, the returned value was the same.

For the overflow error (if it happens), try:
lhash = (ulong)(lhash + BitConverter.ToUInt64(buffer, 0));


I suggest adding this line before the first loop:
input.Position = 0;
to make sure that the stream is at the beginning.

Yougli
Posts: 9
Joined: Sun Feb 17, 2008 10:07 am

Fri Nov 28, 2008 5:15 am

Thank you for pointing out that I forgot the add operator.
I tried what you told me, but I'm still getting an overflow exception at the same line, at the 13th iteration this time.

Code:

Code: Select all

      private static byte[] ComputeMovieHash(Stream input)
      {
            ulong lhash, streamsize;
            streamsize = (ulong) input.Length;
            lhash = streamsize;
 
            long i = 0;
            byte[] buffer = new byte[sizeof(long)];
            input.Position = 0;
            while (i < 65536 / sizeof(long) && (input.Read(buffer, 0, sizeof(long)) > 0))
            {
                i++;
                //lhash += BitConverter.ToUInt64(buffer, 0);
                lhash = (ulong)(lhash + BitConverter.ToUInt64(buffer, 0));
            }
 
            input.Position = (long) Math.Max(0, streamsize - 65536);
            i = 0;
            while (i < 65536 / sizeof(long) && (input.Read(buffer, 0, sizeof(long)) > 0))
            {
                i++;
                //lhash += BitConverter.ToUInt64(buffer, 0);
                lhash = (ulong)(lhash + BitConverter.ToUInt64(buffer, 0));
            }
            input.Close();
            byte[] result = BitConverter.GetBytes(lhash);
            Array.Reverse(result);
            return result;
      }
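
A hedged sketch of why the cast in the code above probably does not help while overflow checking is enabled: the cast applies to the result, but the addition inside the parentheses is still evaluated as a checked operation. The checked block below stands in for the project-wide compiler option, and OverflowDemo, TryCastAdd and WrappingAdd are names invented for this example.

```csharp
using System;

static class OverflowDemo
{
    // Mirrors the cast attempt from the thread with overflow checking enabled.
    // Returns false if the addition threw an OverflowException.
    public static bool TryCastAdd(ulong a, ulong b, out ulong sum)
    {
        try
        {
            checked { sum = (ulong)(a + b); } // the addition itself is still checked
            return true;
        }
        catch (OverflowException)
        {
            sum = 0;
            return false;
        }
    }

    // Wrap-around addition: the carry out of bit 63 is silently dropped.
    public static ulong WrappingAdd(ulong a, ulong b)
    {
        return unchecked(a + b);
    }

    static void Main()
    {
        ulong sum;
        bool ok = TryCastAdd(ulong.MaxValue, 4, out sum);
        Console.WriteLine("cast version succeeded: {0}", ok);
        Console.WriteLine("unchecked sum: {0}", WrappingAdd(ulong.MaxValue, 4));
    }
}
```

For the hash, the wrapped value is the one that is wanted, so the unchecked form matches the intent of the algorithm.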

Yougli
Posts: 9
Joined: Sun Feb 17, 2008 10:07 am

Mon Dec 01, 2008 5:07 pm

Update:
I removed the overflow check in the project compiler settings, and it works fine now with the code submitted in the wiki :p

oss
Site Admin
Posts: 4314
Joined: Sat Feb 25, 2006 11:26 pm

Tue Dec 02, 2008 7:23 am

OK, I will add this thread to the wiki. Also, don't forget to test BOTH files that are in the wiki, to check that they give you the same hashes.

Yougli
Posts: 9
Joined: Sun Feb 17, 2008 10:07 am

Tue Dec 02, 2008 12:18 pm

The code works fine for both files :)

Cougar_
Posts: 19
Joined: Fri May 23, 2008 9:18 pm

Wed Dec 03, 2008 2:33 am

Yougli wrote:Update:
I removed the overflow check in the project compiler settings, and it works fine now with the code submitted in the wiki :p


As I said, overflow is normal in this code, and all you need to do is disable overflow checking.

I think you should first read a book about C# and get to know every aspect of programming in this language.

I took a quick look at Sams Teach Yourself C# in 21 Days by Bradley Jones on http://books.google.com, and guess what I found (I was looking for how to ignore overflow errors)? :PPP
C# has special keywords, checked and unchecked, that force the compiler to check or ignore overflow in a given expression, without the need to turn the overflow checking option on or off globally :PP

So all you need to do is change both lines from:
lhash += BitConverter.ToInt64(buffer, 0);

to:

unchecked { lhash += BitConverter.ToInt64(buffer, 0); }


So, first read a book about C# syntax ;PP

And yes, both versions work, but the one that uses ulong is better because it implements the arithmetic operations correctly.

Code: Select all

        private static byte[] ComputeMovieHash(Stream input)
        {
            ulong lhash;
            long streamsize;
            streamsize = input.Length;
            lhash = (ulong)streamsize;
 
            long i = 0;
            byte[] buffer = new byte[sizeof(long)];
            input.Position = 0;
            while (i < 65536 / sizeof(long) && (input.Read(buffer, 0, sizeof(long)) > 0))
            {
                i++;
               unchecked { lhash += BitConverter.ToUInt64(buffer, 0); }
            }
 
            input.Position = Math.Max(0, streamsize - 65536);
            i = 0;
            while (i < 65536 / sizeof(long) && (input.Read(buffer, 0, sizeof(long)) > 0))
            {
                i++;
               unchecked { lhash += BitConverter.ToUInt64(buffer, 0); }
            }           
            byte[] result = BitConverter.GetBytes(lhash);
            Array.Reverse(result);
            return result;
        }


I removed the line that closes the stream from the code. If the stream is opened outside the procedure, it should be closed outside too.
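
A sketch of what the calling side could look like under that convention: the caller opens the stream, so the caller disposes it. ComputeMovieHash is a copy of the ulong version from this post. CallerDemo, HashToHex and the MemoryStream input are illustrative stand-ins, and a real caller would presumably pass File.OpenRead(path) instead.

```csharp
using System;
using System.IO;

static class CallerDemo
{
    // Same logic as the ulong version in this post; the stream is left open.
    public static byte[] ComputeMovieHash(Stream input)
    {
        long streamsize = input.Length;
        ulong lhash = (ulong)streamsize;

        long i = 0;
        byte[] buffer = new byte[sizeof(long)];
        input.Position = 0;
        while (i < 65536 / sizeof(long) && (input.Read(buffer, 0, sizeof(long)) > 0))
        {
            i++;
            unchecked { lhash += BitConverter.ToUInt64(buffer, 0); }
        }

        input.Position = Math.Max(0, streamsize - 65536);
        i = 0;
        while (i < 65536 / sizeof(long) && (input.Read(buffer, 0, sizeof(long)) > 0))
        {
            i++;
            unchecked { lhash += BitConverter.ToUInt64(buffer, 0); }
        }
        byte[] result = BitConverter.GetBytes(lhash);
        Array.Reverse(result);
        return result;
    }

    // Hypothetical helper: formats the hash bytes as a lowercase hex string.
    public static string HashToHex(byte[] hash)
    {
        return BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant();
    }

    static void Main()
    {
        // The caller opens the stream...
        using (Stream input = new MemoryStream(new byte[] { 1, 0, 0, 0, 0, 0, 0, 0 }))
        {
            Console.WriteLine(HashToHex(ComputeMovieHash(input)));
        } // ...and the caller disposes it here.
    }
}
```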

kokoko3k
Posts: 1
Joined: Wed Mar 07, 2012 10:58 am

Re: ComputeMovieHash in C#

Wed Mar 07, 2012 11:07 am

Post edited.
(nevermind, i was wrong)

Koko
Posts: 1
Joined: Mon Jun 09, 2014 5:35 pm

Re: ComputeMovieHash in C#

Mon Jun 09, 2014 5:54 pm

I'd recommend someone with access to the trac wiki change the code to the following, to address the above issue, and to simplify the code:

Code: Select all

using System;
using System.Text;
using System.IO;

namespace MovieHasher
{
    class Program
    {
        private static ulong GetHash(string filepath)
        {
            using (FileStream input = File.OpenRead(filepath))
            {
                ulong lhash = (ulong)input.Length;
                byte[] buf = new byte[65536 * 2];

                input.Read(buf, 0, 65536);
                input.Position = Math.Max(0, input.Length - 65536);
                input.Read(buf, 65536, 65536);

                for (int i = 0; i < 2 * 65536; i += 8) unchecked
                {
                    lhash += BitConverter.ToUInt64(buf, i);
                }

                return lhash;
            }
        }

        static void Main(string[] args)
        {
            ulong moviehash = GetHash(@"C:\test.avi");
            Console.WriteLine("The hash of the movie-file is: {0}", moviehash.ToString("x16"));
        }
    }
}



This will only work on a little endian machine, like the C and C++ code, for two reasons: BitConverter and the hex conversion.
Hardly anyone will ever want to compile this on a big endian machine, but in case:

Change the loop to

Code: Select all

                for (int i = 0; i < 2 * 65536; ) unchecked
                {
                    //source data is always considered little endian, BitConverter won't correctly convert that on big endian platforms -> convert it manually
                    lhash += (ulong)buf[i++] << 0 | (ulong)buf[i++] << 8 | (ulong)buf[i++] << 16 | (ulong)buf[i++] << 24 | (ulong)buf[i++] << 32 | (ulong)buf[i++] << 40 | (ulong)buf[i++] << 48 | (ulong)buf[i++] << 56;
                }

and use

Code: Select all

        private static string ToLittleEndianHexadecimal(ulong l)
        {
            StringBuilder hexBuilder = new StringBuilder();
            for (int shift = 56; shift >= 0; shift -= 8)
            {
                hexBuilder.Append((l >> shift & 0xFF).ToString("x2"));
            }
            return hexBuilder.ToString();
        }

to convert the hash to a hex string.
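
A small self-check of the two endian-independent pieces above, written as a loop instead of the chained i++ expression. EndianDemo, LittleEndianToUInt64 and ToHex are names chosen for this sketch, and the sample bytes are simply the wiki hash mentioned earlier in the thread, stored least significant byte first.

```csharp
using System;
using System.Text;

static class EndianDemo
{
    // Assembles 8 bytes starting at offset into a ulong, treating them as
    // little endian regardless of the platform's byte order.
    public static ulong LittleEndianToUInt64(byte[] buf, int offset)
    {
        ulong value = 0;
        for (int k = 0; k < 8; k++)
        {
            value |= (ulong)buf[offset + k] << (8 * k);
        }
        return value;
    }

    // Formats the value as 16 hex digits, most significant byte first.
    public static string ToHex(ulong l)
    {
        StringBuilder hexBuilder = new StringBuilder();
        for (int shift = 56; shift >= 0; shift -= 8)
        {
            hexBuilder.Append((l >> shift & 0xFF).ToString("x2"));
        }
        return hexBuilder.ToString();
    }

    static void Main()
    {
        byte[] sample = { 0x12, 0x1e, 0xd3, 0x79, 0x96, 0x5d, 0x24, 0x8e };
        ulong value = LittleEndianToUInt64(sample, 0);
        Console.WriteLine(ToHex(value));
    }
}
```

On newer .NET versions, System.Buffers.Binary.BinaryPrimitives.ReadUInt64LittleEndian should do the same job as the manual loop, if it is available in your target framework.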
