chunking takes a very long time #157

Open
wndywllms opened this issue Nov 11, 2016 · 3 comments

Comments

@wndywllms
Collaborator

For the full band (4ch 4s) the initial chunking took me over 12 hrs. After turning on compression and switching to a ramdisk (/dev/shm) for dir_local, instead of the local disk on the node, I got it down to roughly 7 hrs. That still seems a bit extreme. (I did have to limit the number of chunking tasks running simultaneously per frequency band.) I've tweaked the chunk size to produce 8 chunks (an integer multiple of the thread limit on IO-heavy tasks, thread_io), so it is now producing chunks of ~1.5G (pre-compression was 1.5G). The work dir is on a large shared disk.
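For anyone tuning this: the "integer multiple of thread_io" arithmetic above can be sketched as below. This is just an illustration of the sizing logic, not Factor's actual code; the function and parameter names are made up for the example.

```python
import math

def pick_n_chunks(total_size_gb, target_chunk_gb, thread_io):
    """Pick a chunk count close to total/target, rounded up to an
    integer multiple of thread_io so every IO "wave" runs a full
    set of simultaneous chunking tasks (illustrative sketch only)."""
    n = math.ceil(total_size_gb / target_chunk_gb)
    # round up to the next multiple of thread_io
    return ((n + thread_io - 1) // thread_io) * thread_io

# e.g. ~12 GB of data, ~1.5 GB target chunks, thread_io = 8 -> 8 chunks
print(pick_n_chunks(12, 1.5, 8))
```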

@darafferty
Collaborator

That does seem slow. The chunking script could likely be improved quite a bit, as it does a lot of copying of columns. I'll take a look at it.

Another issue is that the chunking is limited to a single node, so it can't take advantage of multiple nodes of a cluster. We could get around this by making a "chunking pipeline" or perhaps by moving the whole chunking operation into the initial-subtract pipeline.
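One way to cut down the column-by-column copying would be to copy contiguous row ranges instead, which also makes it easy to farm the chunks out to multiple nodes. A minimal, generic sketch of computing the per-chunk row ranges (illustrative only; this is not the existing chunking script):

```python
def chunk_row_ranges(n_rows, n_chunks):
    """Split n_rows into n_chunks contiguous (start, count) ranges,
    spreading any remainder over the first few chunks.
    Each range could then be copied out as one chunk (e.g. with a
    single row-selection copy per chunk, possibly one per node)."""
    base, rem = divmod(n_rows, n_chunks)
    ranges = []
    start = 0
    for i in range(n_chunks):
        count = base + (1 if i < rem else 0)
        ranges.append((start, count))
        start += count
    return ranges

# 10 rows into 3 chunks -> sizes 4, 3, 3
print(chunk_row_ranges(10, 3))
```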

@wndywllms
Collaborator Author

Using multiple nodes would be quite useful here. I've been using only one node for the init subtract (deep) but 3-4 for Factor, so for me it would go faster if you pipelined the chunking in Factor.

@AHorneffer
Contributor

Doing the chunking in the initial-subtract pipeline would be possible. I'm not too fond of this, because it would make the initial-subtract pipeline even more of a "Factor pipeline", but it probably already is one, so there is no real harm done.

Another question is whether the chunking part of Factor would speed up if the input data were already compressed with dysco.

Wendy: Is the chunking limited by CPU speed or by IO speed?
