It seems that the publishing industry (music, books, movies, whatever) has decided that the only way to stop eMule is to screw up the files, and persuade ISPs to block eMule traffic. I have no idea how to solve the second problem, but the first one is worrying me greatly.
I started downloading a large zipped file recently, and it went fine for the first few days. At about 80%, a bunch of new users at a German ISP suddenly appeared on the scene, all supposedly with a complete copy of the file and all happy to give out small pieces of it. I became suspicious because these users would come and go quite rapidly, and would only allow a small download, much less than the normal 8MB chunk I am used to. So by the time I had finished downloading 8MB from around 20 different users, eMule told me the chunk was corrupted. Overnight this process caused me to download 380MB of completely useless data.
The problem here is that we are based on a model of trust, rather than one of distrust. From a security point of view we need to be based on distrust, i.e. the user is fake and the data is fake, until proven otherwise. I understand and appreciate that OSS is based on trust, but when it comes to data we should be less trusting.
This would result in the following changes in eMule behaviour:
- Before downloading the file we would have to download a complete hash table. From what I can see this happens at the end, not the beginning.
- We would need to be able to download and check the hashes of much smaller pieces. 8MB is all very well when you have oodles of bandwidth and can assume the file is OK, but it isn't so great when you know that it could all be corrupt. Maybe 8KB would be better. BitTorrent seems to use even less.
- The entire chunk should only be downloaded from one user, so "blame" can be assigned if the chunk is corrupt, and the user can be distrusted. At present it is too easy to download an 8MB chunk from a dozen users. Which one do you blame if the chunk is corrupt?
- Once a user has sent a corrupt piece, we should assume the user is having technical difficulties and distrust his content for a few hours. Alternatively we could try downloading it again and if it is identical then we ban the user, at least for a while.
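To make the list above a bit more concrete, here is a rough Python sketch of what I have in mind. It is not how eMule works today and not a proposal for the actual protocol: the names (PIECE_SIZE, PeerTracker, accept_piece), the 8KB piece size, the three-hour distrust window and the use of SHA-1 are all my own assumptions for illustration.

```python
import hashlib
import time

PIECE_SIZE = 8 * 1024            # assumed small piece size (8KB), as suggested above
DISTRUST_SECONDS = 3 * 60 * 60   # "a few hours"; the exact value is an assumption


def build_hash_table(data, piece_size=PIECE_SIZE):
    """Hash every piece of a known-good file.

    A downloader would fetch this table before fetching any file data.
    SHA-1 is only a stand-in here, not a claim about eMule's real hashing.
    """
    return [
        hashlib.sha1(data[i:i + piece_size]).hexdigest()
        for i in range(0, len(data), piece_size)
    ]


class PeerTracker:
    """Remembers which peers sent corrupt pieces and until when they are distrusted."""

    def __init__(self, distrust_seconds=DISTRUST_SECONDS):
        self.distrust_seconds = distrust_seconds
        self.distrusted_until = {}  # peer id -> timestamp when distrust expires

    def is_trusted(self, peer_id, now=None):
        now = time.time() if now is None else now
        return self.distrusted_until.get(peer_id, 0.0) <= now

    def accept_piece(self, peer_id, index, piece, hash_table, now=None):
        """Return True if the piece matches its hash.

        Each piece comes from exactly one peer, so a mismatch can be
        blamed on that peer, who is then distrusted for a while.
        """
        now = time.time() if now is None else now
        if not self.is_trusted(peer_id, now):
            return False  # do not accept data from a currently distrusted peer
        if hashlib.sha1(piece).hexdigest() == hash_table[index]:
            return True
        self.distrusted_until[peer_id] = now + self.distrust_seconds
        return False
```

Because every piece is checked on arrival and comes from a single user, a poisoner costs you at most one small piece before being ignored, instead of a whole 8MB chunk.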
This process would allow users to identify and ignore posters who are actively poisoning legitimate files. It doesn't address the issue of bogus files though. That's a problem that is already catered for by means of comments and marking stuff as spam.
I realise I am asking a lot from the developers, and I have no idea whether this would mean a complete revision of the protocols. I am happy to work with the people working on the problem.
Once again, thanks for an excellent program that has been a valuable resource, and has become part of my hobby of listening to audio books.
