Official eMule-Board: How Aich (and Ich) Work - Official eMule-Board


How Aich (and Ich) Work: short explanation

#1 User is offline   Some Support 

  • Last eMule
  • Group: Yes
  • Posts: 3667
  • Joined: 27-June 03

Posted 09 September 2004 - 01:06 AM

Since AICH was introduced in 0.44a and the helpfiles (to which the changelog refers) are not yet up to date, I will post a short explanation here of what it actually does, as a reference for the future and to head off false rumors about it.

Because it is somewhat related to ICH (Intelligent Corruption Handling), I will first quote our helpfile on it:

Quote

Data transfer in the donkey network is organised in chunks. A chunk totals 9MB. Each complete chunk downloaded is checked for corruption; if not corrupted, the chunk is made available for uploading.
Normally, corrupted chunks must be completely redownloaded. ICH tries to reduce the amount of data that needs to be redownloaded by rechecking the part every time new data for it is received, and thus saves time if a corruption is detected.


Statistically, if one byte in a part is corrupted, ICH saves 50% of the redownloading on average. In the best case it saves 99% (if the first byte we redownload was the corrupted one) and in the worst case it saves 0% (if the last byte we redownload for this part was the corrupted one). However, if more than one position is corrupted, ICH becomes more likely to be ineffective for this part. It also doesn't help if malicious clients spread wrong data, because it is very likely that the part gets corrupted again and again.
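
To put rough numbers on those percentages, here is a small Python sketch. It assumes a single corrupted byte at a uniformly random offset, that the part is refetched from its start in fixed-size steps, and that the part is rechecked after every step. The step size and the function names are made up for illustration; this is not eMule's actual code.

```python
PART_SIZE = 9_728_000   # bytes in one ed2k part (about 9.28 MB)
STEP = 10_240           # hypothetical amount of data received between two rechecks


def ich_saving(corrupt_offset, part_size=PART_SIZE, step=STEP):
    """Fraction of the part that need NOT be redownloaded when one corrupted
    byte sits at corrupt_offset and data is refetched starting at offset 0."""
    # The recheck succeeds as soon as the step containing the bad byte is replaced.
    steps_needed = corrupt_offset // step + 1
    redownloaded = min(steps_needed * step, part_size)
    return 1 - redownloaded / part_size


def average_saving(part_size=PART_SIZE, step=STEP):
    """Average saving over all equally likely positions of the corrupted byte."""
    total = 0.0
    for start in range(0, part_size, step):
        bytes_in_step = min(step, part_size - start)
        # Every byte inside one step leads to the same saving.
        total += bytes_in_step * ich_saving(start, part_size, step)
    return total / part_size
```

Under these assumptions `ich_saving(0)` is just above 0.99, `ich_saving(PART_SIZE - 1)` is 0, and `average_saving()` lands very close to 0.5.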

Now what is AICH (Advanced Intelligent Corruption Handling)?
This system uses a completely different approach. It consists of a new hashset which is built from 180KB blocks and put together in a hash tree. The hash algorithm used is SHA1 (160 bits).
eMule creates this new hashset for all your shared files and stores it in known2.met. Because such a hashset can be quite big (about 24 000 hashes for a 4GB file, and 48 000 hashes for the complete hash tree, which can be calculated from those 24K hashes), it is not kept in memory but only in this file, and is read on demand. Once eMule has stored the full hashset, it propagates the root hash (master hash) to other downloading clients.
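
The general idea of a hash tree over 180KB blocks can be sketched in a few lines of Python: hash every block with SHA1, then hash neighbouring pairs until a single root is left. The pairing rule used here (an odd node is carried up unchanged) and the function names are assumptions for illustration only. eMule's real tree layout may differ, so roots computed this way should not be expected to match real AICH hashes.

```python
import hashlib

BLOCK_SIZE = 184_320  # 180 KB


def block_hashes(data, block_size=BLOCK_SIZE):
    """SHA1 digest of every block_size slice of data (the last slice may be shorter)."""
    return [hashlib.sha1(data[i:i + block_size]).digest()
            for i in range(0, len(data), block_size)]


def merkle_root(hashes):
    """Combine digests pairwise, level by level, until one root digest is left."""
    if not hashes:
        raise ValueError("need at least one hash")
    level = list(hashes)
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level) - 1, 2):
            next_level.append(hashlib.sha1(level[i] + level[i + 1]).digest())
        if len(level) % 2:
            # Odd node out: carried up unchanged (illustrative choice).
            next_level.append(level[-1])
        level = next_level
    return level[0]
```

Changing a single byte anywhere in the data changes one block hash and therefore the root, which is why one trusted root hash is enough to check everything below it.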
Now, if your client is downloading a file and detects a corrupted part, it will request a recovery packet from a random client which has a full AICH hashset. This recovery packet consists of up to 69 hashes (53 for the part data and 1-16 which make it possible to verify those 53 hashes against the master hash, which we trust). Once your client has received this packet and verified that the hashes fit the root hash, it checks all 180KB blocks of your corrupted part against the hashes it received and restores those 180KB blocks which have no corruption. This means that if we assume one byte of your 9.28MB part was corrupted, AICH would restore all blocks except the one where the corrupted byte is, and you would have to redownload only this 180KB block.
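
The block check at the end of that procedure could look roughly like the following Python sketch. It assumes the block hashes from the recovery packet have already been verified against the trusted root hash; `find_bad_blocks` is a hypothetical helper name, not a function from the eMule source.

```python
import hashlib

BLOCK_SIZE = 184_320  # 180 KB


def find_bad_blocks(part_data, trusted_hashes, block_size=BLOCK_SIZE):
    """Return (offset, length) for every block whose SHA1 does not match.

    trusted_hashes must hold one already-verified digest per block of the part.
    """
    expected = -(-len(part_data) // block_size)  # ceiling division
    if len(trusted_hashes) != expected:
        raise ValueError("hash count does not match the number of blocks")
    bad = []
    for index, start in enumerate(range(0, len(part_data), block_size)):
        block = part_data[start:start + block_size]
        if hashlib.sha1(block).digest() != trusted_hashes[index]:
            bad.append((start, len(block)))
    return bad
```

Everything outside the returned ranges can be kept as it is; only the listed ranges have to be requested again.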

In your log this would look like:

09.09.2004 02:43:43: Downloaded part 6 is corrupt :( ([file])
09.09.2004 02:43:46: AICH successfully recovered 8.22 MB of 9.28 MB from part 6 for [file]
In this example there were at least 6 corruptions at different positions within one part.
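
The "at least 6" can be read off the log figures with a bit of arithmetic. A minimal Python sketch, assuming the log's MB means 1024 * 1024 bytes and that every lost block is a full 180KB block (the helper name is made up):

```python
BLOCK_SIZE = 184_320   # 180 KB
MB = 1024 * 1024       # assumption: the log reports binary megabytes


def blocks_lost(recovered_mb, part_mb, block_size=BLOCK_SIZE):
    """Estimate how many 180 KB blocks could not be restored."""
    lost_bytes = (part_mb - recovered_mb) * MB
    # The log values are rounded to two decimals, so round to the nearest block.
    return round(lost_bytes / block_size)
```

With the figures from the log above, `blocks_lost(8.22, 9.28)` gives 6. Several corruptions inside the same block would still cost only that one block, hence "at least".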

Why should you use links with AICH hashes?
One important thing is that we have to trust the AICH master hash. If this hash is wrong (i.e. it does not match the MD4 hash), it can cause serious problems while downloading and at the very least makes AICH unusable for this file (even though under normal conditions you would still be able to finish a file with a wrong AICH hash).
eMule has two trust levels when downloading a file.
If you didn't use an AICH link, eMule will use the hash it receives from other clients, if certain conditions are met: at least 10 unique IPs have to have sent us this hash, and 92% of all clients which sent a hash have to agree on the same one. This hash gets the lowest trust level, is not saved when restarting eMule, and you can't create AICH links with such a hash.
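
As a rough illustration of that rule, here is a Python sketch. Several details are assumptions and not taken from the eMule source: each IP gets exactly one vote, the 10-IP minimum is applied to the winning hash, and "92%" is read as "at least 92%". The names are hypothetical.

```python
from collections import Counter

MIN_UNIQUE_IPS = 10   # at least this many unique IPs must have sent the hash
MIN_PERCENTAGE = 92   # share of all voting IPs that must agree on it


def trusted_hash(reports):
    """reports: iterable of (ip, hash_value) pairs, one per client message.

    Returns the hash that may be trusted, or None if no hash qualifies.
    """
    votes = {}
    for ip, hash_value in reports:
        votes.setdefault(ip, hash_value)  # one vote per IP, first one counts
    if not votes:
        return None
    best, best_count = Counter(votes.values()).most_common(1)[0]
    enough_ips = best_count >= MIN_UNIQUE_IPS
    enough_share = best_count * 100 >= MIN_PERCENTAGE * len(votes)
    return best if enough_ips and enough_share else None
```

With 12 IPs agreeing and 1 dissenting the hash is accepted (about 92.3%); with 2 dissenting it is rejected (about 85.7%).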
For rare files or a newly released file with very few complete sources, it can happen that eMule is not able to trust any hash. Another case would be when some malicious client spreads wrong AICH hashes, so that eMule can't trust any hash or, even worse, trusts a bad hash. In all those cases AICH will be useless for this file.
Therefore the better solution is to use a link with an attached AICH hash. This hash is trusted from the beginning, it will be saved, and you can also create file links with this hash.

AICH hash links are also backward compatible with earlier eMule versions (which will just ignore the additional hash).

That's all :)

#2 User is offline   Archmage 

  • Golden eMule
  • Group: Members
  • Posts: 1859
  • Joined: 14-September 02

Posted 09 September 2004 - 01:35 AM

Thank you. Although I had to read it twice (I think you have over ten different hash types/names), I did understand it and don't have any questions at all.

So it is good info. :+1:

This post has been edited by Archmage: 09 September 2004 - 01:36 AM


#3 User is offline   Avi 

  • Golden eMule
  • Group: Members
  • Posts: 1460
  • Joined: 11-September 02

Posted 09 September 2004 - 02:40 AM

Thanks for the explanation.

Quote

At least 10 unique IPs have to have sent us this hash and 92% of all clients which sent a hash have to agree on the same one.

Now I'm calm knowing this (I was actually wondering about this!). :flowers:

#4 User is offline   Ingolf 

  • Golden eMule
  • Group: Members
  • Posts: 460
  • Joined: 29-September 02

Posted 09 September 2004 - 03:51 AM

I got this:
09-09-2004 04:45:35: Downloaded part 14 is corrupt :(  
09-09-2004 04:45:55: AICH successfully recovered 8.93 MB of 9.28 MB from part 14
09-09-2004 05:02:03: I.C.H.: Recovered corrupted part 14, Saved: 21.17 KB


ok... so it found the 180kb block that had the error, and started to download it.. when it received 21.17kb the error was corrected. Right?

So i go to the transfer window and look in file details:
Posted Image

It says there is 9.28mb corrupted, when there was only 21.17kb, or at least 180kb, corrupted. So is 9.28mb a bug or what? And the Recovered (sh)could indicate the total bytes that had to be redownloaded, instead of 'parts'?

I love this feature! :punk:

#5 User is offline   Synetech 

  • Magnificent Member
  • Group: Members
  • Posts: 369
  • Joined: 27-December 02

Posted 09 September 2004 - 07:06 AM

Ingolf, on Sep 8 2004, 11:51 PM, said:

I got this:
09-09-2004 04:45:35: Downloaded part 14 is corrupt :(  
09-09-2004 04:45:55: AICH successfully recovered 8.93 MB of 9.28 MB from part 14
09-09-2004 05:02:03: I.C.H.: Recovered corrupted part 14, Saved: 21.17 KB


ok... so it found the 180kb block that had the error, and started to download it.. when it received 21.17kb the error was corrected. Right?

So i go to the transfer window and look in file details:
Posted Image

It says there is 9.28mb corrupted, when there was only 21.17kb, or at least 180kb, corrupted. So is 9.28mb a bug or what? And the Recovered (sh)could indicate the total bytes that had to be redownloaded, instead of 'parts'?

I love this feature!  :punk:


Hmmm... It looks like the file details dialog is still using the old 9.28MB chunk method for its statistics instead of the AICH method.

#6 User is offline   Superlexx 

  • noble steed
  • Group: Members
  • Posts: 539
  • Joined: 19-October 03

Posted 09 September 2004 - 12:35 PM

Some Support, on Sep 9 2004, 03:06 AM, said:

At least 10 unique IPs have to have sent us this hash and 92% of all clients which sent a hash have to agree on the same one.

If I remember correctly, it's 95% :flowers:

#7 User is offline   Some Support 

  • Last eMule
  • Group: Yes
  • Posts: 3667
  • Joined: 27-June 03

Posted 09 September 2004 - 01:10 PM

you do not remember correctly :)

#8 User is offline   buzz 

  • Golden eMule
  • Group: Members
  • Posts: 860
  • Joined: 25-December 02

Posted 09 September 2004 - 01:36 PM

hm, just curious. Why this odd number?

#9 User is offline   Superlexx 

  • noble steed
  • Group: Members
  • Posts: 539
  • Joined: 19-October 03

Posted 09 September 2004 - 02:39 PM

Some Support, on Sep 9 2004, 03:10 PM, said:

you do not remember correctly :)

now I know why:

Quote

// for this version the limits are set very high, they might be lowered later
// to make a hash trustworthy, at least 10 unique Ips (255.255.128.0) must have send it
// and if we have received more than one hash  for the file, one hash has to be send by more than 95% of all unique IPs
#define MINUNIQUEIPS_TOTRUST  10 // how many unique IPs most have send us a hash to make it trustworthy
#define MINPERCENTAGE_TOTRUST  92  // how many percentage of clients most have sent the same hash to make it trustworthy

comments don't match the code :furious:

#10 User is offline   Some Support 

  • Last eMule
  • Group: Yes
  • Posts: 3667
  • Joined: 27-June 03

Posted 09 September 2004 - 02:43 PM

buzz, on Sep 9 2004, 01:36 PM, said:

hm, just curious. Why this odd number?

92% seemed perfect ;)

@superlexx
well, always trust the code; comments are ignored by the compiler (it wasn't really hard to see here: the value in the comment was just an example, which didn't get edited when the real one was adjusted) :)

#11 User is offline   Soothsayer 

  • Advanced Member
  • Group: Members
  • Posts: 83
  • Joined: 22-July 03

Posted 09 September 2004 - 03:00 PM

Does the new hash also have the same 4GB file size limit?

If not, is it likely that some future version of eMule will be able to accept links containing only the new hash and enable sharing of files >4GB?

#12 User is offline   Some Support 

  • Last eMule
  • Group: Yes
  • Posts: 3667
  • Joined: 27-June 03

Posted 09 September 2004 - 03:03 PM

well, hashes themselves never have a size limit; this limit is caused by the eDonkey protocol (however, the implementation of the hash currently also expects only files smaller than 4GB).
And no, I don't expect eMule to be able to handle files bigger than 4GB in the near future.

#13 User is offline   IceStorm 

  • Member
  • Group: Members
  • Posts: 18
  • Joined: 13-September 02

Posted 09 September 2004 - 05:39 PM

Ok, thanks for explaining, I think I understand (or at least most of it ;)).

Is this also a way to make the handling of files more secure (by replacing MD4 hashes by SHA1) and to reduce the chunk-size (from 9.28 mb to 180 kb) in the (near) future or not?
Because both would be great in my opinion :P.

#14 User is offline   Archmage 

  • Golden eMule
  • Group: Members
  • Posts: 1859
  • Joined: 14-September 02

Posted 09 September 2004 - 07:46 PM

IceStorm, on Sep 9 2004, 07:39 PM, said:

Is this also a way to make the handling of files more secure (by replacing MD4 hashes by SHA1) and to reduce the chunk-size (from 9.28 mb to 180 kb) in the (near) future or not?
Because both would be great in my opinion :P.


Since the new hash uses SHA1 and theoretically reduces the chunk size down to 180kb, you already get what you wish for, without the negative effects you might get if your wishes were implemented completely (incompatibility and more traffic for the small chunks). :respect:

#15 User is offline   basketor64 

  • Golden eMule
  • Group: Members
  • Posts: 2036
  • Joined: 07-October 02

Posted 09 September 2004 - 09:31 PM

There is more traffic for little chunks, because you will get hashsets for the 180 KB chunks.
With a different hashing system there would not have been more incompatibilities.
A different block size could have been a problem if it weren't a divisor of 180 KB.

The ideal would have been a power of 2.
Using leaf blocks of 1 KB could have worked, but there would have been much more overhead; still, it would have been possible to use 128 KB block hashes (or any other multiple) for the hashsets at the network level.

Why the hell did Swamp choose 9500 KB for chunks and 180 KB for blocks???
9500 is not even a multiple of 180!!!! ^_^

I have one question: why is it important to verify 180 KB blocks?
Can't eMule download data from anywhere inside these blocks, or is eMule obliged to download a block from scratch, unable to resume partial blocks?

If it can, why not directly use hashsets of a size that would make more sense than 180 KB, like 128 KB or 256 KB? :unsure:
This way it would be possible to have a really customisable chunk size.

This post has been edited by basketor64: 09 September 2004 - 09:32 PM


#16 User is offline   Superlexx 

  • noble steed
  • Group: Members
  • Posts: 539
  • Joined: 19-October 03

Posted 10 September 2004 - 12:20 AM

I don't see the point of chunks of custom length. 180kB is the "BLOCKSIZE" in emule and probably other ed2k apps: data is requested 180kB-wise (formerly it was also written 180kB-wise, now on every received data packet it gets written into the buffer and then to the disc). Why that's not 200 or 500kB - I don't know (and actually don't care).

180kB is definitely small enough, you have a 1MB hash tree for a 4GB file. An upload session shouldn't be shorter than 500kB, 1MB is a good value IMO.
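
The 1MB figure can be checked with some quick arithmetic. A Python sketch, assuming the leaves are counted per 9.28MB part (so the last block of each part is short), a plain binary tree with 2n - 1 nodes for n leaves, and 20 bytes per SHA1 hash. The function names are made up for illustration.

```python
PART_SIZE = 9_728_000   # bytes per part
BLOCK_SIZE = 184_320    # bytes per 180 KB block
SHA1_BYTES = 20         # 160 bits


def leaf_count(file_size):
    """Number of block hashes when blocks never span a part boundary."""
    full_parts, rest = divmod(file_size, PART_SIZE)
    blocks_per_part = -(-PART_SIZE // BLOCK_SIZE)   # ceiling division, 53 here
    tail_blocks = -(-rest // BLOCK_SIZE)            # blocks in the last, partial part
    return full_parts * blocks_per_part + tail_blocks


def tree_bytes(file_size):
    """Bytes needed for every node of a binary hash tree over the leaves."""
    leaves = leaf_count(file_size)
    nodes = 2 * leaves - 1 if leaves else 0
    return nodes * SHA1_BYTES
```

For a 4 GiB file this gives 23 400 leaf hashes and a bit under 1 MB for the whole tree, which fits the rounded figures quoted in this thread.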

But I don't think the new hash will be used as the data spreadability factor any time soon (the next 6 months).

#17 User is offline   basketor64 

  • Golden eMule
  • Group: Members
  • Posts: 2036
  • Joined: 07-October 02

Posted 10 September 2004 - 09:45 AM

I will just summarize what I had in mind.

Well, eMule can download from anywhere in a file (I had this confirmed), which means that eMule is not tied to 180 KB blocks.
9728000/184320 = 52.777, which means that each chunk contains one incomplete block.
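
The same division with the remainder spelled out, as a tiny Python sketch (the function name is just for illustration):

```python
PART_SIZE = 9_728_000   # 9500 * 1024 bytes
BLOCK_SIZE = 184_320    # 180 * 1024 bytes


def part_layout(part_size=PART_SIZE, block_size=BLOCK_SIZE):
    """Return (full_blocks, tail_bytes, total_blocks) for one part."""
    full_blocks, tail_bytes = divmod(part_size, block_size)
    total_blocks = full_blocks + (1 if tail_bytes else 0)
    return full_blocks, tail_bytes, total_blocks
```

`part_layout()` returns `(52, 143360, 53)`: 52 full blocks plus one 140 KB tail block, which is where the 53 hashes per part come from.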
It's wobbly.

If a new hash must be created, I just think that 180 makes no real sense, unless there is a #DEFINE emuleblocksize 184320 in the code.
I agree it's a reasonable size, but 128 or 256 are too, and they make more sense as computer numbers.

If a new hash must come into the network, then it's better to use a hash tree that other p2p apps, or even better non-p2p apps, would use if they had to use a hash tree.

That's why I was thinking that a hash tree with 1 KB leaf hashes, or larger ones (if that really consumes too much CPU) but still a power of 2, would probably be better in the long term.

Having hash leaves of 1 KB doesn't mean huge hashsets; it means having the choice of any hashset level size between 1 KB and filesize/2, so here, in base 2, you have the choice between 128 and 256 if you want to stay close to 180 KB granularity.
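
To make the "choice of level" idea concrete, here is a small Python sketch. It assumes a binary tree in which every level covers twice as much data per hash as the level below; the function names and the 4 GiB default are illustrative only.

```python
def level_sizes(leaf_size, file_size):
    """Data covered by one hash at each tree level, from the leaves upward."""
    sizes = []
    size = leaf_size
    while size < file_size:
        sizes.append(size)
        size *= 2
    return sizes


def bracketing_levels(target, leaf_size=1024, file_size=4 * 1024**3):
    """The two power-of-two level sizes closest to target from below and above."""
    sizes = level_sizes(leaf_size, file_size)
    below = max(s for s in sizes if s <= target)
    above = min(s for s in sizes if s >= target)
    return below, above
```

With 1 KB leaves, `bracketing_levels(180 * 1024)` returns the 128 KB and 256 KB levels, and the top entry of `level_sizes(1024, 4 * 1024**3)` is 2 GiB, i.e. filesize/2.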

I just brought up some points for reflection; I will not try to convince anyone or write 100 posts about it. :)

EDIT :
Maybe the goal was to keep the ability to verify chunks as soon as possible by downloading as little as possible, i.e. by downloading just the size of a chunk.
Verifying a particular chunk would take, in the worst case, less than 2 extra blocks. :+1:

This post has been edited by basketor64: 10 September 2004 - 11:00 AM


#18 User is offline   BigRedBrent 

  • You will be safe now, good citizen..... For I am..... BATMAN!
  • Group: Members
  • Posts: 928
  • Joined: 25-July 03

Posted 10 September 2004 - 11:53 AM

I hope there are plans to add the full hash info for the targeted file in some form of link in the future. This way a releasing group can make sure that all the hash info is good, and eMule could check the downloaded data against it in real time as it is downloaded.

This information would obviously need to be placed inside some sort of container, like a file, so that it can be stored on websites in much the same way torrent files are stored.

Maybe at some point it might be sent from peer to peer as well, just like the original hashset is. In some instances this could allow sharing the complete blocks inside incomplete chunks.

I don't see an additional 250k for every gigabyte as that much overhead. I think I could easily live with that, given the potential benefits it could bring.

I see no reason why partial chunk uploading cannot be offered to anyone who has been given an upload slot, so that they can better choose what would be best to download.


I would definitely like to applaud the devs for their devotion and determination. Good work, and for everyone's enjoyment please keep it up. :clap:

This post has been edited by BigRedBrent: 26 October 2004 - 10:59 AM


#19 User is offline   basketor64 

  • Golden eMule
  • Group: Members
  • Posts: 2036
  • Joined: 07-October 02

Posted 10 September 2004 - 12:37 PM

It works like eDonkey crumbs if you don't have the root hash from a link.

There is a diagram in the sticky release thread.

#20 User is offline   Some Support 

  • Last eMule
  • Group: Yes
  • Posts: 3667
  • Joined: 27-June 03

Posted 10 September 2004 - 12:41 PM

Actually there is quite a difference, since eMule doesn't share smaller blocks (and it is not planned to implement this soon). So it's still possible to download the file with a wrong AICH hash without causing any corruption for other clients.
Also, I'm not sure if eDonkey actually uses a real hash tree for their crumbs.
