Official eMule-Board: Feature: Zz Chunk Chooser - Official eMule-Board

Jump to content


  • (2 Pages)
  • +
  • 1
  • 2

Feature: Zz Chunk Chooser Faster completion of chunks during DL

#21 User is offline   zz 

  • -
  • PipPipPipPipPipPipPip
  • Group: Debugger
  • Posts: 2014
  • Joined: 30-November 02

Posted 09 January 2006 - 04:55 PM

I see what you are getting at, but I'm not sure it's the best approach from the "network perspective". It's a question of balance if you should get both those chunks, or if someone else should get a chance to download. If the second chunk isn't rare, you'll complete it on your next transfer anyway.

/zz B)
ZZUL - get control of your uploads: ZZUL Forum
0

#22 User is offline   jicxicmic 

  • Premium Member
  • PipPipPipPipPip
  • Group: Members
  • Posts: 279
  • Joined: 08-November 02

Posted 09 January 2006 - 08:13 PM

well. from 'network' point of view, have a partial chunck is complete useless. that means I waited in two queues for a long time to get a complete chunck from each source and I get 1 and a half, i think I loose and the network too, since I can't share the partial chunck.
0

#23 User is offline   zz 

  • -
  • PipPipPipPipPipPipPip
  • Group: Debugger
  • Posts: 2014
  • Joined: 30-November 02

Posted 09 January 2006 - 08:21 PM

To me it seems most transfers doesn't manage a complete chunk (my stats show between 3-4 megs), so splitting the transfers on two chunks will most probably cause two even less complete chunks.

But I can see the advantage in your strategy as well, if we can be reasonably sure that the first chunk completes without the extra help.

/zz B)
ZZUL - get control of your uploads: ZZUL Forum
0

#24 User is offline   clyde 

  • Newbie
  • Pip
  • Group: Members
  • Posts: 11
  • Joined: 30-January 04

Posted 18 January 2006 - 04:22 AM

zz - per your request, I am posting this question in the "correct" forum. Sorry for heading of tangent in the other. (I deleted my question from the other thread, so as to not encourage any other wild tangents....)

Based on what I have read in this forum, one difference between "official" emule and zz is that zz attempts to complete chunks by aggregating several users if that chunk is the rarest. Also, several people have pointed out the obvious that the rarest chunk is indeed THE chunk with the least sources. Official emule has claimed to pick the rarest chunk for several releases. However, based on information that I have gathered and behaviour I have observed, it does not do a very good job of this (particularly with truly rare chunks -- ie Less than 10). My understanding is that official emule splits the available chunks into rarity "bands", 0-10, 10-200, 200-800, 800+ and somewhat randomly picks the chunks from those bands. Therefore, back to my original question, previously posted in the incorrect forum....

What is the difference between zzul rare chunk selection, and official rare chunk selection.

This post has been edited by clyde: 18 January 2006 - 04:24 AM

0

#25 User is offline   coluche 

  • hm ?
  • PipPipPipPipPipPipPip
  • Group: Members
  • Posts: 2274
  • Joined: 02-May 05

Posted 18 January 2006 - 05:28 AM

Hej, clyde

You "shocked" me over there with revealing those bands.

But how I understood zz's posting over at the other topic, those bands are used only to determine what is considered most rare,

for files with 0-10 sources : 1 is rarer than 2, is rarer than 3, ...
for files with 10-200 sources : 1 is as rare as 2, is rarer than 3,...
and so on.

If i id not (once again :( ) misunderstand, then those bands by far do not have the impact as you think.
It's Screamin' Jay Hawkins and he's a Wild Man, so bug off!
0

#26 User is offline   zz 

  • -
  • PipPipPipPipPipPipPip
  • Group: Debugger
  • Posts: 2014
  • Joined: 30-November 02

Posted 18 January 2006 - 06:36 PM

I have a weak memory, and I may remember the below wrong, but I think it's basically like this:

How eMule picks the chunk to download

There are three bands: very rare, rare, and normal.

Basically eMule will try to pick the rarest chunk in the "very rare" band if there are chunks that are very rare, otherwise it will try to pick the rarest chunk in the "rare" band. If there are no "very rare" or "rare" chunks, it will pick a random chunk from the "normal" band (preferring to continue a chunk it has already started, unless there's already a transfer going on to that chunk).

The limits for these bands are is decided by the total number of sources found for the file. If there's less than 200 sources in total for the file, the limits are that if a chunk exists at less than 10% of the sources it's considered "very rare", 10%-20% it's "rare", and otherwise it's considered normal.

The completeness of a chunk is included in the valuation of the rareness of a chunk; in the "very rare" band, if a chunk is > 75% complete, 1 source is subtracted from the number of sources available for the chunk, that is, it's evaluated as if it's slightly rarer than it really is.

For example:

If the file has 200 sources, and you are comparing the two rarest chunks, one having just one source being 0% complete, and the other having 2 sources but being 90% complete, then it may pick the chunk with two sources for download. If it is comparing two chunks that each have two sources, but one of them is 10% complete, it will prefer to continue on the started chunk.

In the "rare band" every 25% of completeness subtracts one source, so if a chunk is > 75% complete, it may be picked instead of a chunk that has three less sources.

In the normal band (>20% of the sources have the chunk) it doesn't care about the rareness of the chunk, it will just continue the most complete chunk.

So eMule will try to pick a "very rare" chunk, otherwise a "rare" chunk, and if there's no such chunks, it will pick a random chunk.

Problems with this strategy

Basically, I see two problems with the standard eMule implementations, and they are easily fixed without changing much code:

1. The rareness bands are too low. At 10 sources in total, the "very rare" band will be 1 source, and the "rare" band is 2 sources. When you select a chunk from a remote client, if it has no chunks with only 1-2 sources, eMule will pick a random chunk from the rest of the chunks (they will all have 3-10 sources). I'd prefer it pick the rarest chunk when there's so few sources. This is a change I've done in ZZUL.

2. Standard eMule prefers to transfer from only one client to each chunk. If you have two transfers going on at the same time, they will be for two separate chunks. I think it's better to try to complete a chunk as soon as possible, which would be done by letting those two transfers go to the same chunk. That chunk could then be completed and reshared twice as fast. This is also done in ZZUL.

The code

Here's the code for the standard eMule. I have marked the important places with red, for instance where the limits for the "very rare" and the "rare" band is selected.

Quote

bool CPartFile::GetNextRequestedBlock(CUpDownClient* sender,
                                      Requested_Block_Struct** newblocks,
                                      uint16* count) /*const*/
{
    // The purpose of this function is to return a list of blocks (~180KB) to
    // download. To avoid a prematurely stop of the downloading, all blocks that
    // are requested from the same source must be located within the same
    // chunk (=> part ~9MB).
    // 
    // The selection of the chunk to download is one of the CRITICAL parts of the
    // edonkey network. The selection algorithm must insure the best spreading
    // of files.
    // 
    // The selection is based on 4 criteria:
    //  1.  Frequency of the chunk (availability), very rare chunks must be downloaded
    //      as quickly as possible to become a new available source.
    //  2.  Parts used for preview (first + last chunk), preview or check a
    //      file (e.g. movie, mp3)
    //  3.  Request state (downloading in process), try to ask each source for another
    //      chunk. Spread the requests between all sources.
    //  4.  Completion (shortest-to-complete), partially retrieved chunks should be
    //      completed before starting to download other one.
    // 
    // The frequency criterion defines three zones: very rare (<10%), rare (<50%)
    // and common (>30%). Inside each zone, the criteria have a specific ‘weight’, used
    // to calculate the priority of chunks. The chunk(s) with the highest
    // priority (highest=0, lowest=0xffff) is/are selected first.
    // 
    //          very rare  (preview)      rare                      common
    //    0% <---- +0 pt ----> 10% <----- +10000 pt -----> 50% <---- +20000 pt ----> 100%
    // 1.  <------- frequency: +25*frequency pt ----------->
    // 2.  <- preview: +1 pt --><-------------- preview: set to 10000 pt ------------->
    // 3.                      <------ request: download in progress +20000 pt ------>
    // 4a. <- completion: 0% +100, 25% +75 .. 100% +0 pt --><-- !req => completion --->
    // 4b.                                                  <--- req => !completion -->
    // 
    // Unrolled, the priority scale is:
    // 
    // 0..xxxx      unrequested and requested very rare chunks
    // 10000..1xxxx  unrequested rare chunks + unrequested preview chunks
    // 20000..2xxxx  unrequested common chunks (priority to the most complete)
    // 30000..3xxxx  requested rare chunks + requested preview chunks
    // 40000..4xxxx  requested common chunks (priority to the least complete)
    //
    // This algorithm usually selects first the rarest chunk(s). However, partially
    // complete chunk(s) that is/are close to completion may overtake the priority
    // (priority inversion).
    // For the common chuncks, the algorithm tries to spread the dowload between
    // the sources
    //

    // Check input parameters
    if(count == 0)
        return false;
    if(sender->GetPartStatus() == NULL)
        return false;

    // Define and create the list of the chunks to download
    const uint16 partCount = GetPartCount();
    CList<Chunk> chunksList(partCount);

    // Main loop
    uint16 newBlockCount = 0;
    while(newBlockCount != *count){
        // Create a request block stucture if a chunk has been previously selected
        if(sender->m_lastPartAsked != (uint16)-1){
            Requested_Block_Struct* pBlock = new Requested_Block_Struct;
            if(GetNextEmptyBlockInPart(sender->m_lastPartAsked, pBlock) == true){
                // Keep a track of all pending requested blocks
                requestedblocks_list.AddTail(pBlock);
                // Update list of blocks to return
                newblocks[newBlockCount++] = pBlock;
                // Skip end of loop (=> CPU load)
                continue;
            }
            else {
                // All blocks for this chunk have been already requested
                delete pBlock;
                // => Try to select another chunk
                sender->m_lastPartAsked = (uint16)-1;
            }
        }

        // Check if a new chunk must be selected (e.g. download starting, previous chunk complete)
        if(sender->m_lastPartAsked == (uint16)-1){

            // Quantify all chunks (create list of chunks to download)
            // This is done only one time and only if it is necessary (=> CPU load)
            if(chunksList.IsEmpty() == TRUE){
                // Indentify the locally missing part(s) that this source has
                for(uint16 i = 0; i < partCount; i++){
                    if(sender->IsPartAvailable(i) == true && GetNextEmptyBlockInPart(i, NULL) == true){
                        // Create a new entry for this chunk and add it to the list
                        Chunk newEntry;
                        newEntry.part = i;
                        newEntry.frequency = m_SrcpartFrequency[i];
                        chunksList.AddTail(newEntry);
                    }
                }

                // Check if any block(s) could be downloaded
                if(chunksList.IsEmpty() == TRUE){
                    break; // Exit main loop while()
                }

                // Define the bounds of the three zones (very rare, rare)
                // more depending on available sources
                uint8 modif=10;
                if (GetSourceCount()>800) modif=2; else if (GetSourceCount()>200) modif=5;
                uint16 limit = (uint16)(modif*GetSourceCount()/ 100);
                if (limit==0) limit=1;

                const uint16 veryRareBound = limit;
                const uint16 rareBound = 2*limit;


                // Cache Preview state (Criterion 2)
                const bool isPreviewEnable = (thePrefs.GetPreviewPrio() || thePrefs.IsExtControlsEnabled() && GetPreviewPrio()) && IsPreviewableFileType();

                // Collect and calculate criteria for all chunks
                for(POSITION pos = chunksList.GetHeadPosition(); pos != NULL; ){
                    Chunk& cur_chunk = chunksList.GetNext(pos);

                    // Offsets of chunk
                    const uint64 uStart = (uint64)cur_chunk.part * PARTSIZE;
                    const uint64 uEnd  = ((GetFileSize() - (uint64)1) < (uStart + PARTSIZE - 1)) ?
                        (GetFileSize() - (uint64)1) : (uStart + PARTSIZE - 1);
                    ASSERT( uStart <= uEnd );

                    // Criterion 2. Parts used for preview
                    // Remark: - We need to download the first part and the last part(s).
                    //        - When the last part is very small, it's necessary to
                    //          download the two last parts.
                    bool critPreview = false;
                    if(isPreviewEnable == true){
                        if(cur_chunk.part == 0){
                            critPreview = true; // First chunk
                        }
                        else if(cur_chunk.part == partCount-1){
                            critPreview = true; // Last chunk
                        }
                        else if(cur_chunk.part == partCount-2){
                            // Last chunk - 1 (only if last chunk is too small)
                            if( (GetFileSize() - uEnd) < (uint64)PARTSIZE/3){
                                critPreview = true; // Last chunk - 1
                            }
                        }
                    }

                    // Criterion 3. Request state (downloading in process from other source(s))
                    const bool critRequested = cur_chunk.frequency > veryRareBound &&  // => CPU load
                        IsAlreadyRequested(uStart, uEnd);

                    // Criterion 4. Completion
                    uint64 partSize = PARTSIZE;
                    for(POSITION pos = gaplist.GetHeadPosition(); pos != NULL; ) {
                        const Gap_Struct* cur_gap = gaplist.GetNext(pos);
                        // Check if Gap is into the limit
                        if(cur_gap->start < uStart) {
                            if(cur_gap->end > uStart && cur_gap->end < uEnd) {
                                partSize -= cur_gap->end - uStart + 1;
                            }
                            else if(cur_gap->end >= uEnd) {
                                partSize = 0;
                                break; // exit loop for()
                            }
                        }
                        else if(cur_gap->start <= uEnd) {
                            if(cur_gap->end < uEnd) {
                                partSize -= cur_gap->end - cur_gap->start + 1;
                            }
                            else {
                                partSize -= uEnd - cur_gap->start + 1;
                            }
                        }
                    }
                    const uint16 critCompletion = (uint16)(partSize/(PARTSIZE/100)); // in [%]

                    // Calculate priority with all criteria
                    if(cur_chunk.frequency <= veryRareBound){
                        // 0..xxxx unrequested + requested very rare chunks
                        cur_chunk.rank = (25 * cur_chunk.frequency) +      // Criterion 1
                            ((critPreview == true) ? 0 : 1) + // Criterion 2
                            (100 - critCompletion);          // Criterion 4
                    }
                    else if(critPreview == true){
                        // 10000..10100  unrequested preview chunks
                        // 30000..30100  requested preview chunks
                        cur_chunk.rank = ((critRequested == false) ? 10000 : 30000) + // Criterion 3
                            (100 - critCompletion);                      // Criterion 4
                    }
                    else if(cur_chunk.frequency <= rareBound){
                        // 10101..1xxxx  unrequested rare chunks
                        // 30101..3xxxx  requested rare chunks
                        cur_chunk.rank = (25 * cur_chunk.frequency) +                // Criterion 1
                            ((critRequested == false) ? 10101 : 30101) + // Criterion 3
                            (100 - critCompletion);                      // Criterion 4
                    }
                    else { // common chunk
                        if(critRequested == false){ // Criterion 3
                            // 20000..2xxxx  unrequested common chunks
                            cur_chunk.rank = 20000 +                // Criterion 3
                                (100 - critCompletion); // Criterion 4
                        }
                        else{
                            // 40000..4xxxx  requested common chunks
                            // Remark: The weight of the completion criterion is inversed
                            //        to spead the requests over the completing chunks.
                            //        Without this, the chunk closest to completion will 
                            //        received every new sources.
                            cur_chunk.rank = 40000 +                // Criterion 3
                                (critCompletion);      // Criterion 4   
                        }
                    }

                }
            }

            // Select the next chunk to download
            if(chunksList.IsEmpty() == FALSE){
                // Find and count the chunck(s) with the highest priority
                uint16 count = 0; // Number of found chunks with same priority
                uint16 rank = 0xffff; // Highest priority found
                for(POSITION pos = chunksList.GetHeadPosition(); pos != NULL; ){
                    const Chunk& cur_chunk = chunksList.GetNext(pos);
                    if(cur_chunk.rank < rank){
                        count = 1;
                        rank = cur_chunk.rank;
                    }
                    else if(cur_chunk.rank == rank){
                        count++;
                    }
                }

                // Use a random access to avoid that everybody tries to download the
                // same chunks at the same time (=> spread the selected chunk among clients)
                uint16 randomness = 1 + (uint16)((((uint32)rand()*(count-1))+(RAND_MAX/2))/RAND_MAX);
                for(POSITION pos = chunksList.GetHeadPosition(); ; ){
                    POSITION cur_pos = pos;
                    const Chunk& cur_chunk = chunksList.GetNext(pos);
                    if(cur_chunk.rank == rank){
                        randomness--;
                        if(randomness == 0){
                            // Selection process is over
                            sender->m_lastPartAsked = cur_chunk.part;
                            // Remark: this list might be reused up to ‘*count’ times
                            chunksList.RemoveAt(cur_pos);
                            break; // exit loop for()
                        } 
                    }
                }
            }
            else {
                // There is no remaining chunk to download
                break; // Exit main loop while()
            }
        }
    }
    // Return the number of the blocks
    *count = newBlockCount;

    // Return
    return (newBlockCount > 0);
}


/zz B)
ZZUL - get control of your uploads: ZZUL Forum
0

#27 User is offline   Xman1 

  • Xtreme Modder
  • PipPipPipPipPipPipPip
  • Group: Members
  • Posts: 1955
  • Joined: 21-June 03

Posted 18 January 2006 - 06:58 PM

Quote

1. The rareness bands are too low. At 10 sources in total, the "very rare" band will be 1 source, and the "rare" band is 2 sources. When you select a chunk from a remote client, if it has no chunks with only 1-2 sources, eMule will pick a random chunk from the rest of the chunks (they will all have 3-10 sources). I'd prefer it pick the rarest chunk when there's so few sources. This is a change I've done in ZZUL


for this issue there is a well working patch avaiable since a long time... just a few values to change :)
have a look at the codesnipsets:
http://forum.emule-p...showtopic=67087

This post has been edited by Xman1: 18 January 2006 - 07:00 PM

0

#28 User is offline   CiccioBastardo 

  • Doomsday Executor
  • PipPipPipPipPipPipPip
  • Group: Italian Moderators
  • Posts: 5541
  • Joined: 22-November 03

Posted 18 January 2006 - 07:55 PM

I can remember this has already been discussed in another thread some time ago. Can't find it by search, though. I'll try again later to find it.

Problem discussed was that this particular code:
 uint8 modif=10;
if (GetSourceCount()>800) modif=2; else if (GetSourceCount()>200) modif=5;
uint16 limit = (uint16)(modif*GetSourceCount()/ 100);
if (limit==0) limit=1;

Is not particular good, as if develop it for some cases you'll see those bands overlap. So a chunk with 100 sources has the same limit of one with 201. A chunk with 200 has a higher limit to be vey rare than one with 300. And one with 800 has a much higher limit than one with 801.

In the mentioned thread there was a table showing that more in detail [edit: no, no table, I probably had it on some paper, but it is easy to do it]
I just need to remeber where it was.

/edit: here it is.

To tell you the truth I have never changed it myself hoping someone would have come out with a more clever solution :ph34r:

This post has been edited by CiccioBastardo: 18 January 2006 - 08:29 PM

The problem is not the client, it's the user
0

#29 User is offline   zz 

  • -
  • PipPipPipPipPipPipPip
  • Group: Debugger
  • Posts: 2014
  • Joined: 30-November 02

Posted 18 January 2006 - 08:36 PM

Yes that part is the next thing I will focus on.

/zz B)
ZZUL - get control of your uploads: ZZUL Forum
0

#30 User is offline   zz 

  • -
  • PipPipPipPipPipPipPip
  • Group: Debugger
  • Posts: 2014
  • Joined: 30-November 02

Posted 12 September 2006 - 10:34 AM

These improvements are included in standard eMule now (0.47b), so I'm unpinning this thread.

/zz B)

This post has been edited by zz: 12 September 2006 - 10:35 AM

ZZUL - get control of your uploads: ZZUL Forum
0

#31 User is offline   Syst3m Crash3r 480 

  • Magnificent Member
  • PipPipPipPipPipPip
  • Group: Members
  • Posts: 380
  • Joined: 30-April 06

Posted 12 September 2006 - 11:32 AM

View Postzz, on Sep 12 2006, 12:34 PM, said:

These improvements are included in standard eMule now (0.47b), so I'm unpinning this thread.

/zz B)


This mean that you did a good work, zz! :respect:


0

  • Member Options

  • (2 Pages)
  • +
  • 1
  • 2

1 User(s) are reading this topic
0 members, 1 guests, 0 anonymous users