From 02b804b13648f2fa415006fb6537032be801acba Mon Sep 17 00:00:00 2001
From: Adam Caprez <acaprez2@unl.edu>
Date: Fri, 2 Sep 2022 23:16:00 -0500
Subject: [PATCH] Drop header and reorder.

---
 content/good_hcc_practices/_index.md | 17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/content/good_hcc_practices/_index.md b/content/good_hcc_practices/_index.md
index 9ec3097a..715b9d51 100644
--- a/content/good_hcc_practices/_index.md
+++ b/content/good_hcc_practices/_index.md
@@ -22,6 +22,14 @@ operations, such as testing and running applications, one should use an
 lots of threads for compiling applications, or checking the job status multiple times a minute.
 ## File Systems
+* **No POSIX file system performs well with an excessive number of files**, as each file operation
+requires opening and closing, which is relatively expensive.
+* Moreover, network data transfer operations that involve frequent scanning (walking) of every
+file in a set for syncing operations (backups, automated copying) can become excessively taxing for
+network file systems, especially at scale.
+* Large numbers of files can take an inordinate amount of time to transfer in or out of network
+file systems during data migration operations.
+* **Computing workflows can be negatively impacted by unnecessarily large numbers of file operations**, including file transfers.
 * Some I/O intensive jobs may benefit from **copying the data to the fast, temporary /scratch file system local to each worker nodes**. The */scratch* directories are unique per job, and are deleted when the job finishes. Thus, the last step of the batch script should copy the
@@ -36,15 +44,6 @@ all the necessary files need to be either moved to a permanent storage, or delet
 disk, in your program.** This approach stresses the file system and may cause general issues.
 Instead, consider reading and writing large blocks of data in memory over time, or utilizing more advanced parallel I/O libraries, such as *parallel hdf5* and *parallel netcdf*.
-#### Large numbers of files considerations
- * **No POSIX file system performs well with an excessive number of files**, as each file operation
-requires opening and closing, which is relatively expensive.
- * Moreover, network data transfer operations that involve frequent scanning (walking) of every
-file in a set for syncing operations (backups, automated copying) can become excessively taxing for
-network file systems, especially at scale.
- * Large numbers of files can take an inordinate amount of time to transfer in or out of network
-file systems during data migration operations.
- * **Computing workflows can be negatively impacted by unnecessarily large numbers of file operations**, including file transfers.
 ## Internal and External Networks
-- 
GitLab
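
Note on the file-count bullets this patch moves under `## File Systems`: they warn that large numbers of files are slow to scan and transfer on network file systems. One possible way to act on that guidance, sketched here in Python, is to pack a directory of small files into a single archive before moving it. The function name `bundle_directory` and its arguments are hypothetical and are not part of the HCC documentation. This is a minimal illustration using only the standard library, not a recommended HCC workflow.

```python
import tarfile
from pathlib import Path


def bundle_directory(src_dir: str, archive_path: str) -> int:
    """Pack every regular file under src_dir into one gzip-compressed tar archive.

    Returns the number of files added. (Hypothetical helper for illustration.)
    """
    src = Path(src_dir)
    archive = Path(archive_path).resolve()
    count = 0
    with tarfile.open(archive, "w:gz") as tar:
        for path in sorted(src.rglob("*")):
            # Skip directories, and skip the archive itself if it lives under src_dir.
            if not path.is_file() or path.resolve() == archive:
                continue
            # Store paths relative to src_dir so the archive unpacks cleanly elsewhere.
            tar.add(path, arcname=path.relative_to(src).as_posix())
            count += 1
    return count
```

Transferring the single resulting archive means the file system handles one large file rather than opening and closing each small file individually. The trade-off is that individual files are no longer directly accessible until the archive is unpacked.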