Add data sharing page
content/handling_data/swan_data_sharing.md
0 → 100644
+ 309
− 0
The user permissions map to the *UID* (user identifier number) of the account that created the file. Similarly, the group permissions map to the *GID* (group identifier number) of the account that created the file. In most cases, the *GID* is the primary group of the user account. The `other (o)` permissions map to all other users not matching the prior two groupings.
To put it another way, your HCC account username maps to the *UID*, your HCC primary group maps to the *GID* (although this may depend on where the file is located or created with regard to supplementary group access), and the other grouping covers all users that are not your HCC user account or members of your group(s).
The `(x)` permission differs depending on the file type. For a directory, `(x)` allows search operations under that directory path for the grouping involved; lacking the `(x)` bit results in permission denied errors when that grouping tries to access the path. Files with `(x)` are executable files that the system will run (load a program image into RAM and execute it), while files without `(x)` tend to be data files of some sort used for input or output.
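The file case described above can be sketched as follows. This is a minimal illustration, assuming a Linux shell with GNU coreutils and a temporary directory that allows execution; the `hello.sh` script and the `demo_dir` variable are hypothetical names used only for this example.

```shell
# Hypothetical demo in a throwaway directory; nothing outside it is touched.
demo_dir=$(mktemp -d)
script="$demo_dir/hello.sh"
printf '#!/bin/sh\necho hello\n' > "$script"

chmod 644 "$script"     # rw-r--r--: a plain data file, no (x) bit anywhere
if [ -x "$script" ]; then before=yes; else before=no; fi

chmod u+x "$script"     # rwxr--r--: the owner may now execute it
if [ -x "$script" ]; then after=yes; else after=no; fi

output=$("$script")     # run the file directly as a program
echo "executable before: $before, after: $after, output: $output"

rm -rf "$demo_dir"
```

The contents of the file never change; only the `(x)` bit decides whether the system treats it as something to run.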
Directories start with a `d` in the permission listing, while regular files start with a hyphen `-`. Next, the `user (u)`, `group (g)` and `other (o)` permission modes follow, i.e., `tuuugggooo`, where `t` is the file type, and `uuu`, `ggg` and `ooo` are permission placeholders for the previously mentioned `(u)`, `(g)` and `(o)` permission groupings.
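To make the `tuuugggooo` layout concrete, here is a small sketch that sets known modes on a directory and a file and prints their mode strings. It assumes GNU `stat` (the `-c '%A'` format prints the same ten-character string that `ls -l` shows); the `shared` and `file.txt` names are placeholders inside a temporary directory.

```shell
# Work in a throwaway directory so nothing real is modified.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/shared"
touch "$demo_dir/file.txt"

chmod 750 "$demo_dir/shared"     # user rwx, group r-x, other ---
chmod 640 "$demo_dir/file.txt"   # user rw-, group r--, other ---

# %A is the type character followed by the user, group and other triplets.
dir_mode=$(stat -c '%A' "$demo_dir/shared")
file_mode=$(stat -c '%A' "$demo_dir/file.txt")
echo "$dir_mode  shared"
echo "$file_mode  file.txt"

rm -rf "$demo_dir"
```

Reading the directory's string left to right: `d` is the type, `rwx` is the user triplet, `r-x` the group triplet, and `---` the other triplet.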
**Please note that when sharing a file, all the directories in the path to the file need to have execute (x) bits set (in order for its contents to be accessible) and read (r) bits set (to show up in listing queries such as the `ls -l` command).** For example, if you want to share the directory `/work/group/username/shared/`, `read (r)` and `execute (x)` permissions should be given to `/work`, `/work/group`, `/work/group/username` and `/work/group/username/shared` to ensure both access to the files and the ability to list directory entries for the various path components.
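The path requirement above can be sketched with a stand-in directory tree. This is only an illustration under stated assumptions: the tree is built under a temporary root rather than the real `/work`, it grants access to the group only, and it relies on GNU `stat`. On a real cluster the upper path components are usually owned by someone else and may already be set up appropriately.

```shell
# Hypothetical stand-in for /work/group/username/shared under a temp root.
root=$(mktemp -d)
shared="$root/work/group/username/shared"
mkdir -p "$shared"
chmod -R go-rwx "$root/work"     # start fully private: drwx------ everywhere

# Walk upward from the shared directory, granting the group r-x on each component.
dir="$shared"
while [ "$dir" != "$root" ]; do
    chmod g+rx "$dir"
    dir=$(dirname "$dir")
done

# Print the mode of every path component to confirm the result.
modes=$(cd "$root" && stat -c '%A %n' \
    work work/group work/group/username work/group/username/shared)
echo "$modes"

rm -rf "$root"
```

If any single component in the chain lacks `(x)` for the grouping in question, everything beneath it becomes unreachable for that grouping, regardless of the permissions on the shared directory itself.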
However, this is possible with the POSIX Access Control Lists model ([POSIX extended ACLs](https://man7.org/linux/man-pages/man5/acl.5.html)), which extends the standard POSIX model. **This is a more involved setup that is only recommended for advanced users who have the need and are already well experienced with the standard model.** We refer such users to the tool documentation of [getfacl](https://man7.org/linux/man-pages/man1/getfacl.1.html) and [setfacl](https://man7.org/linux/man-pages/man1/setfacl.1.html).
Similar to Unix/POSIX permissions, ACLs provide `read (r)`, `write (w)` and `execute (x)` permissions for the `user (u)`, `group (g)` and `other (o)`. The user is your HCC account, the group is your HCC group (or supplementary group for where the file is located), and the other is all the other users that are not part of your HCC group. An ACL can "extend" this mapping by allowing a per-user and/or per-group list of additional entries that reside within the standard model's "group permission" grouping. To put it another way, the `group rwx` permissions mapping expands to multiple entries that only the previously mentioned tools can work with.
Changing the group permission on the file updated the `mask::rwx` extended ACL entry to "allow" the `execute (x)` permission that was previously missing. Note that even though the `ls` listing shows `rwx` for the group, actual *GID* group members would only have `r-x` access, because the listing shows the "allow" mask rather than the owning group's own entry.
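The interaction between `chmod` and the ACL mask can be reproduced with a sketch like the one below. It is not the original example from this page; it assumes Linux, the `acl` tools (`setfacl`, `getfacl`), GNU `stat`, and a file system with ACL support. The numeric UID `12345` is a hypothetical account used only for illustration, and the block skips itself when ACLs are unavailable.

```shell
# Hypothetical demo file in a throwaway directory.
demo_dir=$(mktemp -d)
f="$demo_dir/file.txt"
touch "$f"
chmod 650 "$f"                 # owning group entry starts as r-x

mask_before=""
mask_after=""
group_entry=""
listing_mode=""

# Add a named-user entry and set the mask to rw- (no execute allowed yet).
if command -v setfacl >/dev/null 2>&1 &&
   setfacl -m 'u:12345:rw-,mask::rw-' "$f" 2>/dev/null; then
    mask_before=$(getfacl -c "$f" 2>/dev/null | grep '^mask')

    chmod g+x "$f"             # on a file with an ACL, this edits the mask

    mask_after=$(getfacl -c "$f" 2>/dev/null | grep '^mask')
    group_entry=$(getfacl -c "$f" 2>/dev/null | grep '^group::')
    listing_mode=$(stat -c '%A' "$f")   # group column reflects the mask
    echo "mask before chmod: $mask_before"
    echo "mask after chmod:  $mask_after"
    echo "owning group:      $group_entry"
    echo "mode string:       $listing_mode"
else
    echo "setfacl is unavailable or ACLs are not supported here; skipping"
fi

rm -rf "$demo_dir"
```

Where ACLs are supported, the mode string should show `rwx` in the group position while the owning group's own entry stays at `r-x`, which is the mismatch described above.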
With the `setfacl` commands above, the listed `demo` accounts are given `read (r)`, `write (w)` or `execute (x)` access to the file `file.txt` by the ACL. Standard permission modes still apply, and it is assumed that these `demo` accounts have sufficient directory search `(x)` permissions to reach the `${WORK}/shared` path. If the user accounts are not members of the group that owns the path `${WORK}` expands to, these details may need to be provided to HCC staff when they set up the shared path.
More examples on ACLs can be found [here](https://www.geeksforgeeks.org/access-control-listsacl-linux/) and the author of the Linux POSIX ACL implementation has an excellent document on the topic [here](https://www.usenix.org/legacy/publications/library/proceedings/usenix03/tech/freenix03/full_papers/gruenbacher/gruenbacher_html/main.html).
**Please note that when sharing a file, all the directories in the path to the file need to have execute (x) bits set (in order for its contents to be accessible) and read (r) bits set (to show up in listing queries such as the `ls -l` command).** For example, if you want to share the directory `/work/group/username/shared/`, `read (r)` and `execute (x)` permissions should be given to `/work`, `/work/group`, `/work/group/username` and `/work/group/username/shared` to ensure both access to the files and the ability to list directory entries for the various path components.
- if you set the permissions of the subdirectory to Read/Write, all the directories within this subdirectory will have Read/Write permissions and you cannot override that (e.g., if `/work/group/username/shared/shared1` is Read/Write then `/work/group/username/shared/shared1/test` will be Read/Write too);
- Data shared with Globus cannot be accessed directly on the cluster and will need to be transferred to the cluster if it is used as part of a SLURM job. The exception is data shared from a cluster file system, in which case the previously mentioned [Unix permissions](#standard-unix-permissions) and/or [ACLs](#using-posix-access-control-lists-acl) may be needed to grant the HCC accounts the required permissions, which complicates the share.