81

I have a program that needs to generate temporary files. It is written for cluster machines.

If I saved those files to a system-wide temporary directory (e.g. /tmp), some users complained the program failed because they didn't have proper access to /tmp. But if I saved those files to the working directory, those users also complained they didn't want to see those mysterious files.

Which one is a better practice? Should I insist that saving to /tmp is the right approach and defend any failure as "working as intended" (i.e., ask your admin for proper permission/access)?

psmears
  • 188
  • 4
ABCD
  • 1,166
  • 1
  • 8
  • 13
  • 3
    check if the program has access and if not find another temp dir – ratchet freak Apr 05 '16 at 11:44
  • 1
    @ratchetfreak This might not work as the program doesn't know which temp directory is good. – ABCD Apr 05 '16 at 11:45
  • on this clustered file system, is there a per-user temp directory? If so, you should be using that. – gbjbaanb Apr 05 '16 at 12:08
  • 24
    If your admin screwed up the access rights, he should definitely fix it. What would you do if your admin forgot to add execute rights to your program? – Doc Brown Apr 05 '16 at 13:50
  • 7
    You will not find /tmp on most windows systems, but there is a OS call that will tell you where to put temp files. – Ian Apr 05 '16 at 16:36
  • @Ian I don't care about Windows; the program is designed for Unix, and only Unix. – ABCD Apr 05 '16 at 16:37
  • 28
    If some people didn't have access to `/tmp` on a Unix-like system, it's misconfigured. The superuser should do something like `chmod 1777 /tmp`. – musiphil Apr 05 '16 at 18:14
  • 3
    It definitely sounds like the problem here is that the admins have misconfigured the server(s). Many, many programs (including cluster-aware programs) need to write temporary files, and `/tmp` is the canonical place to do so. By default all users can write there, but can only delete their own files. But you can also provide a configuration option for a temporary directory, for people who need to do their work on systems with incompetent admins. – Michael Hampton Apr 06 '16 at 01:33
  • just for comment: I see many Linux programs create a hidden folder inside the user's working directory (the folder name starts with '.'). It will not have any permission issues, and those users won't complain since they don't see it – Hoàng Long Apr 06 '16 at 02:01
  • 12
    Beware that $TMPDIR might point to a different path than `/tmp/`; it's `$TMPDIR` you should use. See some of the answers ;) – marcelm Apr 06 '16 at 08:17
  • @HoàngLong why "it will not have any permission issue"? Can't the working directory of a program be write-protected? – Federico Poloni Apr 06 '16 at 21:52
  • 1
    @FedericoPoloni: sorry that I was unclear. The OP's post says that he only has problem when trying to save outside of his working directory, so I supposed it works ok. Without any other mention, I guess the working directory might be the home directory as well. – Hoàng Long Apr 07 '16 at 01:05
  • @HoàngLong Yes. That's right. Using a protected directory as the working directory is insane and can be ignored immediately. – ABCD Apr 07 '16 at 01:06

7 Answers

145

Temporary files should be stored in the operating system's temporary directory, for several reasons:

  • The operating system makes it very easy to create those files while ensuring that their names are unique.

  • Most backup software knows which directories contain temporary files and skips them. If you use the current directory, it could have a significant effect on the size of incremental backups if backups are done frequently.

  • The temporary directory may be on a different disk, or in RAM, making the read-write access much, much faster.

  • Temporary files are often deleted on reboot (if they are in a ramdisk, they are simply lost). This reduces the risk of infinite growth if your app doesn't always remove its temp files correctly (for instance after a crash).

    Cleaning temp files from the working directory could easily become messy if the files are stored together with application and user files. You can mitigate this problem by creating a separate directory within the current directory, but this could lead to another problem:

  • The path length could be too long on some platforms. For instance, on Windows, path limits for some APIs, frameworks and applications are terrible, which means that you can easily hit such limit if the current directory is already deep in the tree hierarchy and the names of your temporary files are too long.

  • On servers, the growth of the temporary directory is usually already being monitored. If you use a different directory, it may not be monitored, and monitoring the whole disk won't easily reveal that it's the temp files that are taking up more and more space.

As for the access denied errors, make sure you let the operating system create a temporary file for you. The operating system may, for instance, know that for a given user, a directory other than /tmp or C:\Windows\temp should be used; thus, by accessing those directories directly, you may indeed encounter an access denied error.

If you get an access-denied error even when using the operating system call, it simply means that the machine was badly configured; this was already explained by Blrfl. It's up to the system administrator to configure the machine; you don't have to change your application.

Creating temporary files is straightforward in many languages. A few examples:

  • Bash:

    # The next line will create a temporary file and return its path.
    path="$(mktemp)"
    echo "Hello, World!" > "$path"
    
  • Python:

    import os
    import tempfile
    
    # Creates a file and returns a tuple of a file descriptor and the path.
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "w") as f:
        f.write("Hello, World!")
    
  • C:

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    ...
    char temp_file[] = "/tmp/tmp.XXXXXX";
    int fd = mkstemp(temp_file);
    if (fd == -1) {
        perror("mkstemp");
        exit(EXIT_FAILURE);
    }
    dprintf(fd, "Hello, World!\n");
    close(fd);
    
  • C#:

    // Requires: using System.IO;
    // Creates a file and returns the path.
    var path = Path.GetTempFileName();
    File.WriteAllText(path, "Hello, World!");
    
  • PHP:

    # Creates a file and returns the handle.
    $temp = tmpfile();
    fwrite($temp, "Hello, World!");
    fclose($temp);
    
  • Ruby:

    require "tempfile"
    
    # Creates a file and returns the file object.
    file = Tempfile.new ""
    file << "Hello, World!"
    file.close
    

Note that in some cases the file is removed automatically: in PHP, when the handle is closed; in Ruby, when the Tempfile object is unlinked or garbage-collected. That's an additional benefit of using the libraries bundled with the language/framework.

Arseni Mourzenko
  • 134,780
  • 31
  • 343
  • 513
  • 2
    What do you mean by "make sure you let the operating system create a temporary file for you". So instead of e.g. `fopen("/tmp/mytmpfile", "w");` I should make some system call to handle temporary files? – simon Apr 05 '16 at 17:01
  • 30
    @gurka: You should be calling `tmpfile(3)` to generate your temporary files, or at least calling `mktemp(3)` to create the file names. – TMN Apr 05 '16 at 17:29
  • 3
    @TMN: They are just library functions that run in the user space, and they don't have any magic to bypass the permission error given by the operating system. – musiphil Apr 05 '16 at 18:18
  • 25
    @musiphil Both tmpfile and mktemp use external variables to determine the path for temporary files. These may have been set up to point to another directory than /tmp/, perhaps a per-user directory. Trying to create a filename manually in /tmp/ may fail, while tmpfile and mktemp would return valid paths. – pipe Apr 05 '16 at 18:28
  • 2
    @musiphil: I never said they'd fix the permission problem, I was responding to his question about using system calls to create the files. – TMN Apr 05 '16 at 22:43
  • @TMN: Sorry for the confusion. Please understand I never said that you said what you never said, either, and I just wanted to point out that readers could be misled into thinking that `tmpfile` or `mktemp` was the system call @gurka was seeking for. – musiphil Apr 06 '16 at 16:51
  • This is a good answer but it could be even better if you could edit the suggested use of library functions in and mention that *they already provide* the ability for an end-user to override the selection of the directory to create temporary files in via environment variables, should they really need it. – 5gon12eder Apr 06 '16 at 17:55
  • The operating system makes it easy ... but caveat emptor: The [Windows native API call](https://msdn.microsoft.com/de-de/library/windows/desktop/aa364991%28v=vs.85%29.aspx?f=255&MSPPError=-2147217396) creates temp files with a prefix and a 4-digit hex number. If the software does not properly cleanup their tempfiles (e.g. due to crashes or to not having DELETE rights), after some time Windows will spend an extraordinary time trying to find a unused file name. I never tried what happened when all 0x10000 combinations are used, but things become veeery slow when the folder starts to fill up. – JensG Apr 06 '16 at 19:55
  • 1
    I would include additional quotes in the bash example: `path="$(mktemp)"`. Not that it is strictly necessary here. But it is a good habit to always put `"` around `$` expansion unless you have a very specific need not to. – kasperd Apr 06 '16 at 22:37
  • In the case of C#, you probably want to `FileAttributes.Temporary` and/or `FileOptions.DeleteOnClose`. Assuming sufficient RAM, Windows will avoid storing such files on disk (see `FILE_ATTRIBUTE_TEMPORARY ` paragraph under [CreateFile function](https://msdn.microsoft.com/en-us/library/aa363858.aspx#caching_behavior) – Brian Apr 13 '16 at 17:26
  • 1
    Does this advice apply when the temporary files may contain sensitive data? – Noumenon Apr 09 '17 at 14:29
  • While downloading a file `test.odt`, Firefox creates another (temporary) file `test.odt.part` in the current directory. And opening the file using LibreOffice Writer creates another (temporary) file `.~lock.test.odt#` in the current directory. Why they chose the current directory? It seems like a contradiction to what you're saying. – Jeyekomon Feb 03 '23 at 13:26
33

Should I insist that saving to /tmp is the right approach and defend any failure as "working as intended" (i.e., ask your admin for proper permission/access)?

There are standards for this, and the best thing you can do is conform to them.

POSIX, which is followed by pretty much every non-mainframe OS of any significance that you're likely to run into, has provisions for creating uniquely-named temporary files in a directory using default values that can be reconfigured by the environment:

  • The C stdio.h header may optionally include a P_tmpdir macro that names the system's temporary directory.
  • TMPDIR is the canonical environment variable for changing the location of temporary files. Prior to POSIX, other variables were used, so I tend to go with the first of TMPDIR, TMP, TEMPDIR, or TEMP that has a value, punting and using the system default if none of them exists.
  • The mkstemp() and tmpfile() functions will generate unique temporary files.

If your users are being denied the ability to create temporary files, the system is either misconfigured or the administrators aren't making clear what their policy is on such things. In those cases, you'd be on very firm ground in saying that your program conforms to a well-established portability standard and that its behavior can be changed using the environment variables the standard specifies.
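In Python, for example, that lookup order can be sketched as follows (the function name is illustrative; `tempfile.gettempdir` already performs a very similar search on its own):

```python
import os
import tempfile

def temp_dir():
    """Return the first configured temp directory, following the
    fallback order described above: TMPDIR first, then the older
    pre-POSIX variable names, then the system default."""
    for var in ("TMPDIR", "TMP", "TEMPDIR", "TEMP"):
        path = os.environ.get(var)
        if path:
            return path
    # Punt: let the library pick the platform default (e.g. /tmp).
    return tempfile.gettempdir()
```

With this approach, an administrator or user can redirect temporary files for a single run simply by exporting `TMPDIR`, with no change to the program.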

Blrfl
  • 20,235
  • 2
  • 49
  • 75
  • `P_tmpdir` is not a part of `stdio.h` as defined by the C language specification. It might be defined by POSIX or SVID. – musiphil Apr 05 '16 at 18:13
  • 1
    @musiphil: As implied by the (now-clarified) answer, it's part of POSIX. (Technically, it's an X/Open System Extension that POSIX incorporated. See http://pubs.opengroup.org/onlinepubs/009695399/basedefs/stdio.h.html.) – Blrfl Apr 05 '16 at 18:33
  • Fully agree with all the above. A good example is Linux systems with `pam_tmpdir` - this sets `TMPDIR` and `TMP` to be different for each user, for robustness and privacy. It's also useful to be able to set `TMPDIR` for a single command - if you have your usual temporary directory in a RAM filesystem for speed, you may need to do this for commands that generate huge temporary files (such as a giant `sort`, for example). Don't ignore the standards/conventions that your users expect! – Toby Speight Apr 06 '16 at 08:34
  • Definitely check the environment for the location of temporary files and never hard-code /tmp. Because a shared tmp has security issues, one mitigation I have often seen is to create per-user /tmp directories with no read-write permission for anyone else. It removes possible race conditions and symlink attacks. – Zan Lynx Apr 06 '16 at 16:46
10

The previous answers, although correct, aren't valid for most large-scale computer clusters.

Computer clusters do not always follow the standard conventions for machines, usually for good reasons, and there is no point in arguing about it with the sysadmins.

Your current directory refers to the central file system, which is accessed through the network. This is not only slow, but also puts load on the system for the rest of the users, so you shouldn't use it unless you aren't writing much and you can recover if the job crashes.

The computing nodes have their own hard drives; that is the fastest file system available, and what you should be using. The cluster documentation should tell you what it is: typically /scratch, /tmp/[jobid], or some non-standard environment variable ($SNIC_TMP in one of the ones I use).

So, what I recommend is making it user-configurable. The defaults can be the first one you have write access to:

  • $TMPDIR
  • tmpfile
  • /tmp
  • .

But expect a low success rate with this approach, and make sure to emit a big fat warning.
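A minimal sketch of that fallback chain in Python (the function name and the warning wording are illustrative, not a real interface):

```python
import os
import sys
import tempfile

def pick_scratch_dir(user_choice=None):
    """Return the first writable candidate directory.

    An explicit user setting always wins; otherwise fall back through
    $TMPDIR, the library default, /tmp, and the current directory."""
    candidates = [user_choice, os.environ.get("TMPDIR"),
                  tempfile.gettempdir(), "/tmp", "."]
    for d in candidates:
        if d and os.path.isdir(d) and os.access(d, os.W_OK):
            if user_choice is None:
                # Big fat warning: we guessed, the user should configure it.
                print("warning: no scratch dir configured, using %r" % d,
                      file=sys.stderr)
            return d
    raise RuntimeError("no writable temporary directory found")
```

The `os.access` check is only a heuristic (quota limits or node-local cleanup policies can still bite later), which is another reason to keep the location user-configurable.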

Edit: I'll add another reason to force it to be user-set. One of my clusters has $TMPDIR set to /scratch, which is user-writable and on the local hard drive. But the documentation says that anything you write outside of /scratch/[jobid] may be deleted at any point, even in the middle of the run. So if you follow the standards and trust $TMPDIR, you will encounter random crashes that are very hard to debug. So you may accept $TMPDIR, but not trust it.

Some other clusters do have this variable properly configured, so you may add an option to explicitly trust $TMPDIR; otherwise, emit a big, fat warning.

Davidmh
  • 240
  • 1
  • 7
  • 2
    Which ones exactly are the previous answers? – Tulains Córdova Apr 06 '16 at 18:35
  • 2
    So what you're saying here is that because some clusters that don't take the trivial step of adhering to a well-established standard for telling programs where to write their temporary files, that's one additional cluster-specific customization required per program. Pretty weak tea if you ask me. – Blrfl Apr 06 '16 at 20:50
  • @Blrfl you can wave the standards as much as you want, and write code that adheres perfectly to them and always crashes; you can try to fight with the sysadmins of each cluster you use; or you can accept your fate and make it configurable. Plus, in HPC one usually needs to adapt the code to the specifics of the cluster anyway (available RAM, relative speed of the filesystems, MPI implementation, general availability of resources...), there is no "one size fits all". – Davidmh Apr 07 '16 at 08:40
  • @Davidmh: Understood, but not the point. The standard _makes it configurable_ in a non-astonishing way. If I take known-conforming code to a cluster where the standard isn't followed, I have to set it in _exactly one place,_ such as at the entry point. That's one less thing in the rest of the code to audit, modify and risk getting wrong. – Blrfl Apr 07 '16 at 10:12
9

The temp file directory is highly operating system/environment dependent. For example, a web server's temp dir is separate from the OS temp dir for security reasons.

Under MS Windows, every user has their own temp dir.

You should use a function like createTempFile() for this if one is available.
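Java's File.createTempFile is one example; in Python, tempfile.NamedTemporaryFile plays the same role. A short sketch (the prefix is illustrative):

```python
import os
import tempfile

# NamedTemporaryFile picks a suitable directory (honouring TMPDIR) and a
# unique name, and by default deletes the file when the handle is closed.
with tempfile.NamedTemporaryFile(prefix="myapp-", suffix=".dat") as f:
    f.write(b"Hello, World!")
    f.flush()
    name = f.name        # full path chosen by the library
# once the with-block exits, the file is gone again
```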

k3b
  • 7,488
  • 1
  • 18
  • 31
  • 1
    Just be mindful of hidden OS limitations in Windows. We discovered the hard way that the maximum number of files in a folder was limited to 65,565. Sure, that's a lot of files, and sure, you should never *conceivably* have that many laying around. But are you *sure* that every app cleans up after itself in a timely and well-behaved manner? – Mike Hofer Apr 06 '16 at 19:31
  • Ah, I've seen your comment too late. I just wrote the same above. BTW the limit is primarily due to the mechanics of the GetTempFileName() function, not NTFS. [That folder limit you mentioned applies only to FAT32](http://superuser.com/questions/446282/max-files-per-directory-on-ntfs-vol-vs-fat32). – JensG Apr 06 '16 at 19:56
1

For many applications, you should consider putting temporary files in $XDG_RUNTIME_DIR or $XDG_CACHE_HOME (the other XDG dirs are for nontemporary files). For instructions on calculating them if they are not explicitly passed in the environment, see the XDG basedir spec or find a library that already implements that part.

Note, however, that $XDG_RUNTIME_DIR is a new addition and there is no standard fallback for older systems due to security concerns.

If neither of those is suitable, then /tmp is the correct place. You should never assume the current directory is writable.
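A sketch of those fallback rules in Python (per the basedir spec, XDG_CACHE_HOME falls back to ~/.cache, while XDG_RUNTIME_DIR deliberately has no fallback):

```python
import os

def xdg_cache_home():
    # The basedir spec's documented fallback is ~/.cache.
    return os.environ.get("XDG_CACHE_HOME") or os.path.expanduser("~/.cache")

def xdg_runtime_dir():
    # No standard fallback exists, for security reasons; callers
    # must be prepared to handle None.
    return os.environ.get("XDG_RUNTIME_DIR")
```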

o11c
  • 588
  • 2
  • 6
0

they didn't have proper access to /tmp

It's not obvious what you mean by "proper access". Is it something like /tmp/xxx: Permission denied?

As others have pointed out, anything under /tmp must have a unique name, so if the program is using a file called /tmp/xxx for each user, only the first user will be able to use it. The program will fail for everyone else because they are trying to write to a file owned by that first user.

But, something not mentioned in any of the other answers so far, there is a good technical reason for using /tmp rather than /var/tmp, one's home directory, or the current directory.

Some systems have /tmp set up as a RAM-disk, which has three main advantages:

  • A RAM-disk will be much faster than anything on a permanent storage device.
  • A RAM-disk won't put wear and tear on SSD memory, which has a limit to the number of writes that it can handle in its lifetime.
  • The temporary files will automatically be deleted even if the program that created them terminates without cleaning up after itself. (Eventually true for /var/tmp as well, which cannot be a RAM-disk.)
Ray Butterworth
  • 202
  • 1
  • 2
  • 8
-2

This is more of an alternative, but you might unlink() the file immediately after fopen(). It depends on the usage pattern, of course.

Unlinking the files, when it can be done, helps in several ways:

  • The file is not visible, so the user doesn't see it.
  • The file is not visible to other processes, so there is no chance another process will modify it by mistake.
  • Cleanup is easy if the program crashes.
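The same unlink-after-open pattern can be sketched in Python (using mkstemp rather than fopen, but the idea is identical):

```python
import os
import tempfile

# Create a temp file, then unlink it immediately: the open descriptor
# keeps the data alive, but no name remains in the directory for users
# or other processes to stumble over.
fd, path = tempfile.mkstemp()
os.unlink(path)                  # the name disappears right away
with os.fdopen(fd, "w+") as f:
    f.write("scratch data")
    f.seek(0)
    data = f.read()
# when f is closed (or the program crashes), the kernel frees the space
```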

Files should be created in /tmp. If a user has no rights to create files there, the system is misconfigured.

Files cannot be created in the user's home directory. Many users, such as "nobody" and "www-data", do not have rights to write to their home directories, or they are even chroot()-ed. Note that even in a chroot environment, /tmp still exists.

Nick
  • 295
  • 2
  • 9
  • While this might be a good idea in general, it doesn't help the users who are lacking write permissions on the directory the file is to be created in. – 5gon12eder Apr 06 '16 at 17:58
  • 4
    It also doesn't answer the question, which is where to put temporary files. – Blrfl Apr 06 '16 at 20:53
  • I believe my answer is somewhat important. I did edit it; it is probably clearer this way. – Nick Apr 07 '16 at 07:00