Official eMule-Board: 0.51D - Extremely Slow Start And Freezing In Shared Files (160K Shared)



Page 1 of 1

0.51D - Extremely Slow Start And Freezing In Shared Files (160K Shared, Windows 10)

Poll: 0.51D - Extremely Slow Start And Freezing In Shared Files (160K Shared) (2 member(s) have cast votes)

Sharing 150K+ files and something like 3TB+? Have you seen similar freezing on Windows?

  1. yes (2 votes [100.00%])

    Percentage of vote: 100.00%

  2. no (0 votes [0.00%])

    Percentage of vote: 0.00%

  3. I am sharing fewer than 150,000 files, or a different OS (0 votes [0.00%])

    Percentage of vote: 0.00%


#1 User is offline   zeronetDOTio 

  • Newbie
  • Pip
  • Group: Members
  • Posts: 7
  • Joined: 02-April 20

Posted 03 May 2020 - 07:33 AM

After I shared my collection of files (maybe 160K files, maybe 4TB), I started eMule 0.51d again on Windows 10, but no window appeared for an hour or more. I could only see the process in Task Manager maxing out one CPU thread (20-25% of total CPU on a 4-thread CPU) for about an hour; then the tray icon appeared. The program was unresponsive for a while after that, and then finally started responding.
I went to the Shared Files section, clicked a sub-section of the shared files, and then clicked "Shared Directories" (which possibly attempts to list all the shared files on one page - that may be the problem), and indeed eMule stopped responding for maybe 30 minutes. When I then clicked a file in the list and pressed Ctrl+A to select all, it stayed frozen for about 6 hours, at which point I killed it; one CPU thread was maxed out, but the disk was not. The Shared Files section showed "Shared Files (15xxxx, Hashing xxxx)".

On the next restart it was again unresponsive for about 2 hours.

I will gladly share more details, such as a process dump taken during startup, or anything else you request, but only if there is someone who can interpret it and, more importantly, is a programmer who may be able to fix this software.

PS: I thought 0.51d had a GitHub page, but I was unable to find it to report this in the proper place...

This post has been edited by zeronetDOTio: 03 May 2020 - 07:35 AM

0

#2 User is offline   Campo 

  • Member
  • PipPip
  • Group: Members
  • Posts: 43
  • Joined: 18-June 19

Posted 03 May 2020 - 08:41 AM

I am at around 30K files and 4TB. eMule Morph needs up to 5 minutes before I can click Connect. If there are unhashed files, the hashing process is active even if the program seems to be frozen. If I change an option and click OK, it can need another 2 minutes.

Could one part of the long preparation be the check where eMule has to verify that the shared files are still the same as before, i.e. a filename check instead of a rehash?
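A check like that is cheap to sketch. The following is an illustration of the idea only, not eMule's actual code; the `KnownFile` record and its fields are assumptions, loosely modelled on what a known.met entry is said to store:

```python
import os
from dataclasses import dataclass

@dataclass
class KnownFile:
    # Hypothetical record: path plus the metadata saved at last hash time.
    path: str
    size: int
    mtime: int
    file_hash: str

def needs_rehash(record: KnownFile) -> bool:
    """Rehash only if the file vanished or its size/mtime changed;
    an unchanged file can keep its stored hash."""
    try:
        st = os.stat(record.path)
    except OSError:
        return True
    return st.st_size != record.size or int(st.st_mtime) != record.mtime
```

If most files are unchanged between sessions, this reduces startup work to one stat per file rather than a full rehash.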

I assume that this problem nearly never occurred in the past^^
0

#3 User is offline   zeronetDOTio 

  • Newbie
  • Pip
  • Group: Members
  • Posts: 7
  • Joined: 02-April 20

Posted 03 May 2020 - 11:16 AM

View PostCampo, on 03 May 2020 - 09:41 AM, said:

Could one part of the long preparation be the check where eMule has to verify that the shared files are still the same as before, i.e. a filename check instead of a rehash?


If such a function exists and is well made, then I think there should have been significant HDD activity, but there was none. There was CPU activity during startup, though. Maybe it was hashing (I doubt it, because I think hashing had already finished, and the next start should involve minimal or no hashing, or it should be done in a way that does not block app startup).

I noticed that eMule lists ALL shared files on one page by default, and it displays them instantly when I click the Shared Files tab - so I assume they were preloaded during the erroneously long app start. I think this should be fixed: the default would be to show no files on the Shared Files tab, and clicking to show all files would display only, say, 1000 or some other low number, with perhaps a button to cancel the file-loading process from within the app instead of the app freezing for 30 minutes or more.
It might be better if the file list were kept as a text file instead of a slow database (my assumption), in a form that still allows sorting and such.
The process updating such a file would run in a way that does not freeze the app.
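Incremental display along those lines can be sketched generically (this is not eMule's list control, just the batching idea): yield the shared-files list in fixed-size chunks, so a UI can insert rows between event-loop iterations instead of blocking on one giant insert.

```python
from itertools import islice
from typing import Iterable, Iterator, List

def chunked(items: Iterable, chunk_size: int = 1000) -> Iterator[List]:
    """Yield fixed-size batches of a (possibly huge) file list.
    A UI thread can process one batch, repaint, then come back for more."""
    it = iter(items)
    while True:
        batch = list(islice(it, chunk_size))
        if not batch:
            return
        yield batch
```

With 160K entries and a batch size of 1000, the window could show the first screenful almost immediately while the rest trickles in.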
Are there actually developers who can fix this, or another site where I can notify the devs about it?

This post has been edited by zeronetDOTio: 03 May 2020 - 11:18 AM

0

#4 User is offline   fox88 

  • Golden eMule
  • PipPipPipPipPipPipPip
  • Group: Members
  • Posts: 4689
  • Joined: 13-May 07

Posted 04 May 2020 - 12:07 PM

Let's say a Mini car has 4 seats.
The current "record" is 23 people inside.
How well will it be able to ride with a load of 400 people?
0

#5 User is offline   stoatwblr 

  • Member
  • PipPip
  • Group: Members
  • Posts: 15
  • Joined: 15-February 13

Posted 14 June 2020 - 12:05 AM

View Postfox88, on 04 May 2020 - 01:07 PM, said:

Let's say a Mini car has 4 seats.
The current "record" is 23 people inside.
How well will it be able to ride with a load of 400 people?


There are a lot of us who share "lots of files" - I'm holding about 22TB and rotate files through, because having 100K files shared breaks things all over the place.

More prosaically, on the work side I've been trying to find a distributed way of sharing literally hundreds of millions of astronomy files totalling several PB (anonymous FTP is going away for various reasons), and nothing scales.


Do you need to individually lstat each file every time at startup, rather than just reading the ctime/mtime/size from the directory and only checking the ones that have actually changed?
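For what it's worth, on Windows the directory enumeration (FindFirstFile/FindNextFile) already returns size and timestamps, and Python's `os.scandir` exposes that cached metadata, so a listing can be built without a separate per-file stat call; on POSIX, `DirEntry.stat()` still costs one syscall per file, cached after the first call. A sketch of a startup listing built that way:

```python
import os

def quick_listing(root: str):
    """Collect (path, size, mtime) for regular files in one directory,
    using scandir so Windows can serve the metadata straight from the
    directory enumeration rather than opening each file."""
    results = []
    for entry in os.scandir(root):
        if entry.is_file(follow_symlinks=False):
            st = entry.stat(follow_symlinks=False)  # cached on Windows
            results.append((entry.path, st.st_size, st.st_mtime))
    return results
```

Comparing that listing against the previously saved one identifies the (usually few) files that actually need attention.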

This post has been edited by stoatwblr: 15 June 2020 - 03:44 PM

0

#6 User is offline   stoatwblr 

  • Member
  • PipPip
  • Group: Members
  • Posts: 15
  • Joined: 15-February 13

Posted 15 June 2020 - 03:43 PM

Some musings on speeding up startup and reducing memory consumption (my eMule is currently at ~1.5GB, which appears to be mostly due to known2_64.met; I've seen the process balloon out to $VERY_LARGE_SIZES in the past).

Emule doesn't _really_ need to load every single file name and hash into memory at startup before doing anything else and then announce them all.

That could be a lower-priority background process, and it could space out the announcements.

Equally, it could keep the met table in memory and, if it sees a request for something matching what it _thinks_ it has, push that file to the top of the "check and announce" queue.

i.e. verify the file is there, then proceed, keeping disk activity and memory load down a bit. Even with SSDs, disk IO is the slowest part of the show, and bulk storage doesn't tend to be on SSDs.

If you're sharing 100K+ files, do you need to keep them all in memory? Or just the hash table, while walking a "window" through announcements and checks?

At the risk of starting an old flamewar: perhaps this is a met-file scalability issue, and transitioning to SQL at larger sizes would be sensible.
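As a sketch of that direction (the schema here is an assumption for illustration, not eMule's met format): an on-disk SQLite index lets a single hash be looked up in O(log n) via the B-tree, without holding every record in RAM.

```python
import sqlite3

def open_index(db_path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a hypothetical on-disk index of known files."""
    con = sqlite3.connect(db_path)
    con.execute(
        "CREATE TABLE IF NOT EXISTS known ("
        " hash TEXT PRIMARY KEY, path TEXT, size INTEGER, mtime INTEGER)"
    )
    return con

def add_file(con, file_hash, path, size, mtime):
    con.execute("INSERT OR REPLACE INTO known VALUES (?, ?, ?, ?)",
                (file_hash, path, size, mtime))

def lookup(con, file_hash):
    """B-tree lookup of one hash instead of scanning an in-memory list."""
    return con.execute("SELECT path, size, mtime FROM known WHERE hash = ?",
                       (file_hash,)).fetchone()
```

A request for an unknown hash returns `None` immediately; a hit returns just the one record needed for the "check and announce" step.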
0

#7 User is offline   fox88 

  • Golden eMule
  • PipPipPipPipPipPipPip
  • Group: Members
  • Posts: 4689
  • Joined: 13-May 07

Posted 15 June 2020 - 07:47 PM

View Poststoatwblr, on 14 June 2020 - 03:05 AM, said:

There are a lot of us who share "lots of files"

Those lots are just ignoring the natural limits.
Reasonable numbers are lower by about two orders of magnitude.

View Poststoatwblr, on 14 June 2020 - 03:05 AM, said:

More prosaically, on the work side I've been trying to find a distributed way of sharing literally hundreds of millions of astronomy files totalling several PB (anonymous FTP is going away for various reasons), and nothing scales.

Continuing the analogy, it is like trying to load the whole cargo train into a Mini car.
What is wrong with FTP?

View Poststoatwblr, on 15 June 2020 - 06:43 PM, said:

Emule doesn't _really_ need to load every single file name and hash into memory at startup before doing anything else and then announce them all.

That could be a lower-priority background process, and it could space out the announcements.


What does that spaced-out eMule do when it receives several requests for 25+ GB files that require hashing?
Play nice music and say, "Please wait for an hour, the files have already been queued for hashing"!?
A hint: there is no such message in the data protocol.
0

#8 User is offline   stoatwblr 

  • Member
  • PipPip
  • Group: Members
  • Posts: 15
  • Joined: 15-February 13

Posted 16 June 2020 - 07:13 PM

View Postfox88, on 15 June 2020 - 08:47 PM, said:

View Poststoatwblr, on 14 June 2020 - 03:05 AM, said:

There are a lot of us who share "lots of files"

Those lots are just ignoring the natural limits.
Reasonable numbers are lower by about two orders of magnitude.

Define "reasonable numbers". It isn't 1998 anymore. Back then, 100GB was larger than any existing hard drive (the first 100GB drives showed up in 2000).

Quote

View Poststoatwblr, on 14 June 2020 - 03:05 AM, said:

More prosaically, on the work side I've been trying to find a distributed way of sharing literally hundreds of millions of astronomy files totalling several PB (anonymous FTP is going away for various reasons), and nothing scales.

Continuing the analogy, it is like trying to load the whole cargo train into a Mini car.
What is wrong with FTP?

You mean apart from the massive security holes (no admin will let you run sftp to a public site), the fact that it's increasingly filtered to the point that even passive mode barely works, and the fact that it means a single source for things? (The same hassle applies to HTTP archives.)

NASA and ESA archives are purposefully limited to 100Mb/s connections to the outside world to limit their bandwidth.

Various collaborating groups are proposing new P2P protocols to handle this, and I don't see much point in reinventing the wheel.


Quote

View Poststoatwblr, on 15 June 2020 - 06:43 PM, said:

Emule doesn't _really_ need to load every single file name and hash into memory at startup before doing anything else and then announce them all.

That could be a lower-priority background process, and it could space out the announcements.


What does that spaced-out eMule do when it receives several requests for 25+ GB files that require hashing?
Play nice music and say, "Please wait for an hour, the files have already been queued for hashing"!?
A hint: there is no such message in the data protocol.


No hint needed or required. known2/known.met loads up pretty quickly if nothing's shared at startup. The problems start when the directory tree is walked.

The _real_ problem here is the IO overhead of lstat/opening/closing files (plus seek time on mechanical drives) not the actual hashing time. You can lose 8-20ms _per file_ (even more over NFS or SMB) and this doesn't scale with size - it's the same with small files as it is with large ones. Once a file's opened, reading is as fast as the disk will feed it.

The problem is that during this scanning sequence the entire computer ends up hanging on the IO subsystem's open() or lstat() calls, and _this_ is what makes the program, and the overall system, unresponsive - so there are very real benefits in adopting some form of slow-start/slow-scan philosophy.

(I spent a lot of time benchmarking IO latencies due to issues with networked /home on clustered filesystems servicing several hundred clients. It was eye-opening: the delays are not where you may think they are, and there are several "2^n" scaling problems with latencies as directories grow in size - a directory with 32,500 files might be 5 times faster to scan than one with 33,000 files, etc., depending on the filesystem. NTFS is particularly susceptible to this issue, but Linux filesystems are similar, and FAT32 runs a very real risk of data loss at directory sizes exceeding 4,096 files, as well as incurring a 20x slowdown past 512 entries/directory.)
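That per-file overhead is straightforward to measure for yourself; a minimal timing sketch (numbers will vary wildly between cold mechanical drives, SSDs, and network filesystems):

```python
import os
import time

def measure_stat_latency(paths):
    """Average wall-clock cost of one lstat per file.  On a cold
    mechanical drive each call can cost milliseconds, independent of
    file size; on a warm SSD it is microseconds."""
    start = time.perf_counter()
    for p in paths:
        try:
            os.lstat(p)
        except OSError:
            pass  # missing files still cost a syscall; count them anyway
    elapsed = time.perf_counter() - start
    return elapsed / max(len(paths), 1)
```

Multiplying the measured per-file cost by 160K files gives a rough lower bound on what a naive full-tree scan must take.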

Given that you already have the hashes, you can use those to determine which files to access first - and, just like the startup scan, you can then lstat them (etc.) to ensure they haven't changed before announcing them (slow scan, or listening for KAD request matches) or accepting a TCP connection. In the latter case, if the file has changed, the answer is "no", a rehash is triggered, and the file is announced like any new hash.

The point is that KAD and ed2k announcements and requests are periodic, so refusing/ignoring a request isn't a problem. You just deal with it in your own time and announce when ready.

It's far better than crippling the host with naive scanning algorithms which simply _do not scale_. Your "natural limits" are based on assumptions which are inherently flawed, because there is no need to load everything at once. This isn't DOS; multithreading is normal these days, and processes which monopolise a system (either through computation or by tying up the IO channels) are frowned upon.
0

#9 User is offline   fox88 

  • Golden eMule
  • PipPipPipPipPipPipPip
  • Group: Members
  • Posts: 4689
  • Joined: 13-May 07

Posted 17 June 2020 - 03:22 PM

View Poststoatwblr, on 16 June 2020 - 10:13 PM, said:

Define reasonable numbers.

I did. You quoted the numbers, and you might know what "order of magnitude" means when applied to them.

View Poststoatwblr, on 16 June 2020 - 10:13 PM, said:

NASA and ESA archives are purposefully limited to 100Mb/s connections to the outside world to limit their bandwidth.

Do they know 'It isn't 1998 anymore'?

View Poststoatwblr, on 16 June 2020 - 10:13 PM, said:

The _real_ problem here is

To understand where the real problem is, you might need to grasp the design of the ED2K/KAD networks, and even read the source code.
0

#10 User is offline   megaT 

  • Member
  • PipPip
  • Group: Members
  • Posts: 24
  • Joined: 09-May 20

Posted 18 June 2020 - 12:40 PM

Quote

The _real_ problem here is the IO overhead of lstat/opening/closing files (plus seek time on mechanical drives) not the actual hashing time. You can lose 8-20ms _per file_ (even more over NFS or SMB) and this doesn't scale with size - it's the same with small files as it is with large ones. Once a file's opened, reading is as fast as the disk will feed it.

I'd also say that putting the hash calculation in a background task is a good idea,
especially for huge files.
They wouldn't need to be announced prior to being hashed, either.
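A background hashing task along those lines could look like this sketch: paths go into a queue, a worker thread hashes them off the UI thread, and a file would only be announced once its digest exists. SHA-1 stands in here for eMule's MD4-based ed2k hash, which hashlib does not reliably provide:

```python
import hashlib
import queue
import threading

def hash_worker(jobs: "queue.Queue", results: dict, done: threading.Event):
    """Consume file paths from a queue and hash them off the UI thread.
    A real client would announce each file only after its entry appears
    in `results`."""
    while not done.is_set() or not jobs.empty():
        try:
            path = jobs.get(timeout=0.1)
        except queue.Empty:
            continue
        h = hashlib.sha1()  # stand-in for the ed2k (MD4-based) hash
        with open(path, "rb") as f:
            for block in iter(lambda: f.read(1 << 20), b""):
                h.update(block)
        results[path] = h.hexdigest()
        jobs.task_done()
```

The UI stays responsive because the expensive reads happen on the worker; the main thread only enqueues paths and polls `results`.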
0

#11 User is offline   stoatwblr 

  • Member
  • PipPip
  • Group: Members
  • Posts: 15
  • Joined: 15-February 13

Posted 20 June 2020 - 03:10 PM

View Postfox88, on 17 June 2020 - 04:22 PM, said:

View Poststoatwblr, on 16 June 2020 - 10:13 PM, said:

The _real_ problem here is

To understand where the real problem is, you might need to grasp the design of the ED2K/KAD networks, and even read the source code.


I have. They scale - quite elegantly.

It's the _local_ activity of the program on the machine itself which does not.
Trying to drink from a firehose is always a bad idea; sooner or later you find a firehose you can't cope with.

Your error here is conflating what happens at the network level with what happens in the storage backend.

This post has been edited by stoatwblr: 20 June 2020 - 03:12 PM

0


