Official eMule-Board: Improve Kademlia Publishing Speed - Official eMule-Board

Improve Kademlia Publishing Speed

#1 tHeWiZaRdOfDoS

  • Man, what a bunch of jokers...
  • Group: Members
  • Posts: 5630
  • Joined: 28-December 02

Posted 16 April 2016 - 04:09 PM

The current publishing speed of Kademlia is highly unsatisfying compared to server indexing. This has been known for a long time and was designed that way on purpose, but I'd like to improve it.
However, I don't want to violate any official rules. Thus, this thread.

Technical background:
From my understanding, files have to be republished every KADEMLIAREPUBLISHTIMES (i.e. 5h). Notes and keywords have different times (24h).
eMule runs up to KADEMLIATOTALSTORESRC (3) indexing actions at a time, each for up to SEARCHSTOREFILE_LIFETIME (140s, or 120s on early abort).
I.e. if all three slots are always busy, we have 3 files indexed in 120-140s, or 1 per 40-46s. In 5h that's 390-450 files - best case scenario.
Please correct me if there are any errors so far :worthy:
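
As a rough check of the estimate above, here is a minimal sketch of the arithmetic. The constant values are the ones quoted in this post, not re-checked against the eMule source; EARLY_ABORT_LIFETIME and max_files_per_cycle are names made up for the example. With exact division the lower bound comes out slightly below the rounded 390.

```python
# Values as quoted in the post (assumed, not verified against eMule's source).
KADEMLIAREPUBLISHTIMES = 5 * 3600   # seconds between republishing a file (5h)
KADEMLIATOTALSTORESRC = 3           # max simultaneous indexing actions
SEARCHSTOREFILE_LIFETIME = 140      # seconds one indexing action may live
EARLY_ABORT_LIFETIME = 120          # hypothetical name for the early-abort case


def max_files_per_cycle(lifetime_s,
                        parallel=KADEMLIATOTALSTORESRC,
                        cycle_s=KADEMLIAREPUBLISHTIMES):
    """Best case: every slot is always busy and each action takes lifetime_s."""
    return parallel * cycle_s // lifetime_s


print(max_files_per_cycle(SEARCHSTOREFILE_LIFETIME))  # 385
print(max_files_per_cycle(EARLY_ABORT_LIFETIME))      # 450
```

Passing parallel=12 gives the figures for a 4x multiplier on the slot count.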

Problems:
If the share of a Kad-only user surpasses 390/450 files, there will always be some files still waiting to be indexed.
Partial files are not published at all except passively, i.e. by searching for sources and then getting added by them. That means that 2 downloaders can only find each other if they both find a complete source for a file and get the other client's information via source exchange (XS).

Possible solutions:
One thing I tried out in my latest private kMule beta was to dynamically adjust the publishing speed. Checking the % of unshared files, I raised the max indexing actions up to KADEMLIATOTALSTORESRC*4 though that's still "just" 1560-1800 files in 5h. Still, an improvement... but at the cost of overhead. Ideally, you should calculate and apply the best multiplier automatically... but users sharing thousands of files would put incredible overhead on the network.
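
The kMule code itself is not shown in this thread, so the following is only a sketch of how such a dynamic adjustment could look. It assumes "unshared" means files that have not been published yet, and it scales the slot count linearly with that backlog up to the 4x cap mentioned above. publish_slots and MAX_MULTIPLIER are hypothetical names.

```python
import math

KADEMLIATOTALSTORESRC = 3   # base number of indexing actions, as quoted in the post
MAX_MULTIPLIER = 4          # hypothetical cap, matching the "*4" mentioned above


def publish_slots(unpublished_files, shared_files):
    """Scale the number of parallel indexing actions with the publishing backlog."""
    if shared_files <= 0:
        return KADEMLIATOTALSTORESRC
    # Fraction of the share that is still waiting to be published, clamped to [0, 1].
    backlog = min(max(unpublished_files / shared_files, 0.0), 1.0)
    multiplier = 1 + (MAX_MULTIPLIER - 1) * backlog
    return math.ceil(KADEMLIATOTALSTORESRC * multiplier)


print(publish_slots(0, 1000))     # 3: nothing waiting, stay at the base value
print(publish_slots(500, 1000))   # 8: half the share is waiting
print(publish_slots(1000, 1000))  # 12: everything is waiting, use the full cap
```

A linear ramp is just the simplest choice here; any curve that stays at the base value when the backlog is empty would fit the same idea.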

That's why I also tweaked how the next file to publish is chosen. Instead of just walking the shared file list, I wrote a function to pick the next file, i.e. files that have never or only rarely been requested or uploaded get an advantage. That's far from perfect but may help rare files spread better.
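
The actual selection function is not posted here, so this is only an illustration of the idea, under the assumption that "advantage" simply means "lowest request + upload count first". SharedFile, its fields and pick_next_file are hypothetical names for the example.

```python
from dataclasses import dataclass

REPUBLISH_INTERVAL = 5 * 3600  # seconds, the 5h republish time quoted in the first post


@dataclass
class SharedFile:
    name: str
    requests: int          # how often the file has been requested
    uploads: int           # how often the file has been uploaded
    last_published: float  # unix time of the last publish, 0.0 if never


def pick_next_file(files, now):
    """Pick the file that is due for publishing and has seen the least demand."""
    due = [f for f in files if now - f.last_published >= REPUBLISH_INTERVAL]
    if not due:
        return None
    # Least requested/uploaded first; ties go to the file published longest ago.
    return min(due, key=lambda f: (f.requests + f.uploads, f.last_published))
```

A strict "least popular first" order could starve popular files on a very large share, so a real implementation would probably mix this score with the time a file has been waiting.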

Another idea would be to combine certain files (e.g. into collections) and spread the collection rather than the individual files. Those special "collections" would mean that the included files are shared by the publisher. That's TBD.



What do you think? What can and may we do to improve spreading?

#2 xSTHNSx

  • Splendid Member
  • Group: Members
  • Posts: 147
  • Joined: 01-December 15

Posted 16 April 2016 - 07:57 PM

Well, as you stated, we all know about the current problem with Kad and its notorious inability to propagate a large number of files in a timely manner. That keeps it from being effective, or even from being considered half as good as ED2K or taken seriously, but that's another debate (2G vs 3G p2p networks). I was reading your kMule thread a few months back, when you were planning to modify it to test faster publishing. As I have stated, the ratio should always be dynamic and at the same time should be based on the user's ability to handle the bandwidth. Hopefully all modders can agree with you to some extent and add it to their own projects. As for vanilla eMule, I have a feeling we will get into a debate over how excessive overhead will harm clients with lower bandwidth, which in itself is foolish in this day and age, and thus nothing will change. Who cares about people running VeryCD clients in China and trying to download/upload files below 1kB/s?

#3 pier4r

  • Ex falso quodlibet ; Kad is the major concept behind emule.
  • Group: Members
  • Posts: 572
  • Joined: 31-March 09

Posted 17 April 2016 - 08:11 AM

tHeWiZaRdOfDoS, on 16 April 2016 - 06:09 PM, said:

Another idea would be to combine certain files (e.g. into collections) and spread the collection rather than the individual files. Those special "collections" would mean that the included files are shared by the publisher. That's TBD.

What do you think? What can and may we do to improve spreading?

I guess this can be a way to go. I wrote a similar feature request in the past and maybe others did too, something like "Kad for starting, then source exchange for large discoveries". It is still overhead, but I think that this is normal: small scale, small overhead; large scale, large overhead. We are in 2016, we can do that. Sure, not everyone has more than 256Kb of upload, but imo eMule should follow the idea "decentralized + large variance of files", so Kademlia should somehow be able to publish more. It is also true that this is a complicated topic because it affects the whole network.

#4 tHeWiZaRdOfDoS

  • Man, what a bunch of jokers...
  • Group: Members
  • Posts: 5630
  • Joined: 28-December 02

Posted 18 April 2016 - 08:24 AM

The problem with Kad is that the overhead (OH) is distributed over the network. You don't know if you publish to a 5kB/s modem node or a GBit/s super-node. BUT even if we knew and used that information... using super-nodes and thus creating some "centralized" parts in a supposedly decentralized network would create additional issues and vulnerabilities. :ph34r:
I still think we should allow more publish requests or maybe "bundle" publish requests in a new Kad version (i.e. publishing a list of files instead of single files).
Also, partial files should be published too, ASAP.

#5 pier4r

  • Ex falso quodlibet ; Kad is the major concept behind emule.
  • Group: Members
  • Posts: 572
  • Joined: 31-March 09

Posted 18 April 2016 - 06:30 PM

tHeWiZaRdOfDoS, on 18 April 2016 - 10:24 AM, said:

The problem with Kad is that OH is distributed on the network. You don't know if you publish to a 5kB/s modem node or a GBit/s super-node. BUT even if we knew and used that information... using super-nodes and thus creating some "centralized" parts in a supposedly decentralized network would create additional issues and vulnerabilities. :ph34r:


Yes and no. If it is reasonable to think that tens of thousands of nodes have a certain upload speed (like 100kb or more), then you can scale the limit from a 5kb/s node to a 100kb/s node. That way you won't give gigabit nodes too much weight, but at least you allow a bit of variance according to the network. Otherwise you set the minimum for everything, and that's limiting.

And I know DHTs, I did my thesis on one of them (though it was only simulated).

#6 tHeWiZaRdOfDoS

  • Man, what a bunch of jokers...
  • Group: Members
  • Posts: 5630
  • Joined: 28-December 02

Posted 19 April 2016 - 05:21 PM

Well, in that case we are just talking about the new limit? If the current code is designed for 5kbps and the new limit is 20x higher, we could publish 20x as many files... in theory. Not bad for a start... but I still highly suggest that we add a Kad extension to publish a list of files instead of single files. A packet with 20x the size is better than 20x the packets. :angelnot:
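
To put a rough number on the "one big packet vs 20 packets" point, here is a minimal sketch that only counts IPv4 and UDP header bytes. The 50-byte size per publish entry is a made-up placeholder, not a real Kad packet size, and bytes_on_wire is a hypothetical helper.

```python
IPV4_HEADER = 20  # bytes, minimum IPv4 header without options
UDP_HEADER = 8    # bytes


def bytes_on_wire(num_files, entry_bytes, files_per_packet):
    """Return (total bytes, packet count) when publishing num_files entries."""
    packets = -(-num_files // files_per_packet)  # ceiling division
    total = packets * (IPV4_HEADER + UDP_HEADER) + num_files * entry_bytes
    return total, packets


ENTRY_BYTES = 50  # hypothetical payload per published file

print(bytes_on_wire(20, ENTRY_BYTES, 1))   # (1560, 20): one file per packet
print(bytes_on_wire(20, ENTRY_BYTES, 20))  # (1028, 1): one bundled packet
```

This ignores responses and retries, and it ignores that in a DHT each file hash normally maps to different target nodes, so a real extension would still have to decide where a bundle gets stored.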

It's a pity that interest in development seems to have faded - especially for such a long-known problem where workarounds require violating the "do-not-touch" code parts.