Holland Computing Center / HCC docs / Commits

Commit 889f19ee
authored 2 years ago by Adam Caprez

Merge branch 'tom' into 'master'

Drop header and reorder. See merge request !331

Parents: b4ce891f, 02b804b1
Changes: 1 changed file

content/good_hcc_practices/_index.md (+8 additions, −9 deletions)
@@ -22,6 +22,14 @@ operations, such as testing and running applications, one should use an
 lots of threads for compiling applications, or checking the job status multiple times a minute.
 ## File Systems
+* **No POSIX file system performs well with an excessive number of files**, as each file operation
+requires opening and closing, which is relatively expensive.
+* Moreover, network data transfer operations that involve frequent scanning (walking) of every
+file in a set for syncing operations (backups, automated copying) can become excessively taxing for
+network file systems, especially at scale.
+* Large numbers of files can take an inordinate amount of time to transfer in or out of network
+file systems during data migration operations.
+* **Computing workflows can be negatively impacted by unnecessarily large numbers of file operations**, including file transfers.
 * Some I/O intensive jobs may benefit from **copying the data to the fast, temporary /scratch
 file system local to each worker node**. The */scratch* directories are unique per job, and
 are deleted when the job finishes. Thus, the last step of the batch script should copy the
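The staging pattern in that last bullet could be sketched as a batch script along the following lines. This is a hypothetical illustration only, assuming a SLURM cluster with a job-local `/scratch` and a `$WORK` directory on the shared file system; `input.dat` and `my_analysis` are placeholder names, not part of the actual documentation:

```shell
#!/bin/bash
#SBATCH --time=01:00:00
#SBATCH --mem=4gb

# Stage input from the shared file system onto the fast, job-local /scratch.
cp "$WORK/input.dat" /scratch/
cd /scratch

# Run the I/O-intensive work against local disk instead of the network file system.
./my_analysis input.dat > results.out   # my_analysis is a placeholder program

# Last step: copy results back -- /scratch is deleted when the job finishes.
cp /scratch/results.out "$WORK/"
```

The key point is the final `cp`: anything left in `/scratch` is lost when the job ends.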
@@ -36,15 +44,6 @@ all the necessary files need to be either moved to a permanent storage, or deleted
 disk, in your program.** This approach stresses the file system and may cause general issues.
 Instead, consider reading and writing large blocks of data in memory over time, or
 utilizing more advanced parallel I/O libraries, such as *parallel hdf5* and *parallel netcdf*.
-#### Large numbers of files considerations
-* **No POSIX file system performs well with an excessive number of files**, as each file operation
-requires opening and closing, which is relatively expensive.
-* Moreover, network data transfer operations that involve frequent scanning (walking) of every
-file in a set for syncing operations (backups, automated copying) can become excessively taxing for
-network file systems, especially at scale.
-* Large numbers of files can take an inordinate amount of time to transfer in or out of network
-file systems during data migration operations.
-* **Computing workflows can be negatively impacted by unnecessarily large numbers of file operations**, including file transfers.
 ## Internal and External Networks
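For the "large numbers of files take an inordinate amount of time to transfer" concern in the bullets above, one common mitigation is to bundle small files into a single archive before moving them across the network. A minimal sketch (directory and file names are made up for illustration):

```shell
# Create a directory of many small files to stand in for a real dataset.
mkdir -p dataset
for i in $(seq 1 100); do
    echo "sample $i" > "dataset/part_$i.txt"
done

# Bundle everything into one archive: the network file system then handles
# a single large transfer instead of 100 per-file open/close round trips.
tar -czf dataset.tar.gz dataset/

# On the destination, unpack after the transfer completes:
# tar -xzf dataset.tar.gz
```

This also helps the backup/sync scanning problem, since tools that walk every file only see one object.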