When to malloc and free?

Question

Valgrind does not report a memory leak during my actual usage, only during my scripted test that I scripted with a shell script to test my own shell. I found that I didn't have to use malloc every time I did. For example, strdup could do it for me. Now I wonder if it is possible to formally prove that I really need malloc where I use it or can I make an analysis and either prove or disprove that malloc is actually needed?

If I can rewrite the program so that it doesn't use malloc then I'd be glad. I'd be even happier if I can prove for certain cases that I don't need malloc, since there were cases were I only had to free() and malloc size was done automatically. One error message that appears only during test is the following.

. 
==29846== Memcheck, a memory error detector
==29846== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==29846== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==29846== Command: ./shell
==29846== 
'PATH' is set to /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin.
stdin is a file or a pipe
==29846== 
==29846== HEAP SUMMARY:
==29846==     in use at exit: 83,052 bytes in 171 blocks
==29846==   total heap usage: 240 allocs, 69 frees, 101,754 bytes allocated
==29846== 
==29846== 12 bytes in 1 blocks are definitely lost in loss record 5 of 93
==29846==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==29846==    by 0x50FCA59: strdup (strdup.c:42)
==29846==    by 0x4E57624: readline (in /usr/lib/x86_64-linux-gnu/libedit.so.2.0.53)
==29846==    by 0x40193B: main (main.c:599)
==29846== 
==29846== LEAK SUMMARY:
==29846==    definitely lost: 12 bytes in 1 blocks
==29846==    indirectly lost: 0 bytes in 0 blocks
==29846==      possibly lost: 0 bytes in 0 blocks
==29846==    still reachable: 83,040 bytes in 170 blocks
==29846==         suppressed: 0 bytes in 0 blocks
==29846== Reachable blocks (those to which a pointer was found) are not shown.
==29846== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==29846== 
==29846== For counts of detected and suppressed errors, rerun with: -v
==29846== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

The test I scripted is:

#!/bin/sh
echo "-- Testing our implementation of OpenShell --"
echo ""
echo "- If you have any problem in passing a test read the corresponding"
echo "- source file to understand what the test is checking"
echo ""
printf "********************* PRESS ENTER TO RUN TESTS  ... "
read _

# Key pressed, do something
printf "********************* TEST WILDCARDS \n***** Press any key to listing all files in current directory... "
read _
valgrind --leak-check=full --leak-kinds=all./shell << EOF
ls -al *.*
EOF
printf "********************* TEST ALGORITHMS ... "
read _
echo "top -b -n1|head -8|tail -1" | ./shell

printf "********************* TEST ALGORITHMS Part II.  ... "
read _
valgrind --leak-check=full --leak-kinds=all ./shell << EOF
who|awk '{print \$4 ; print \$3}'|sort -n|wc -l
EOF

printf "********************* TEST CHECKENV.  ... "
read _
valgrind --leak-check=full --leak-kinds=all ./shell << EOF
checkenv
EOF
printf "********************* TEST DONE. YOU SHOULD SEE OUTPUT FROM TEST ABOVE ... "
read _

My offending code is here:

 while (1) {
        buf = "> ";
        if (!isatty(fileno(stdin))) {
            printf("stdin is a file or a pipe\n");
            if (buf)
                command(readline(buf));
            exit(0);
        }
        cwd = malloc(sizeof(char *) * 100);
        if (cwd != NULL && getcwd(cwd, 99) == cwd) {
            printf("%s: ", cwd);
            char *input = readline(buf);
            add_history(input);
            command(input);
            free(input);
            free(cwd);

        } else {
            printf("%s: ", getenv("USER"));
            char *input = readline(buf);
            add_history(input);
            if (input) {
                command(input);
                free(input);
            }

        }
    }

Have you ever managed to reach the end of available memory with your program, i.e. the error return from `malloc` is `ENOMEM`? — ott--, May 02 '16 at 04:32
Possible duplicate of [Array or Malloc?](http://programmers.stackexchange.com/questions/143858/array-or-malloc) — Doc Brown, May 02 '16 at 05:58
From the manual page on `strdup`: "The strdup() function returns a pointer to a new string which is a duplicate of the string s. Memory for the new string is obtained with malloc(3), and can be freed with free(3).". So you still need malloc. — Sjoerd Job Postmus, May 02 '16 at 06:59
You shouldn't be using malloc unless you need to dynamically allocate memory at runtime. If you need to change the size of an array during run-time, that would be a good time to use malloc. — Snoop, May 02 '16 at 11:43
@StevieV I'm reading input that should be large. I'm reading input that is a shell program that could be very large. No matter how I do valgrind reports a memory leak but only during my tests, never during actual usage does valgrind report a memory leak, and the program doesn't crash. The line it is complaining about is `readline`. From `readline.h` — Niklas Rosencrantz, May 02 '16 at 11:50
@Programmer400 I just looked up what Valgrind is, and I understand more or less what you're saying now. I guess I don't know enough to answer the question about memory leaks. But I see using malloc as just a means of creating structures during run-time. — Snoop, May 02 '16 at 12:07
@StevieV Valgrind is cool! It will give you the line number of the offending code. I get a memory leak if I use `readline` but not if I use `editline`. But it is only Valgrind that is reporting the memory leak and during my automated integration test I scripted. Valgrind does not report a memory leak during actual usage, only during the test. — Niklas Rosencrantz, May 02 '16 at 12:11
That's really strange, you'd think there would only be a memory-leak if you didn't `free` right? I mean it's not like you `malloc` new memory space when you do `readline` right? — Snoop, May 02 '16 at 12:12
Maybe just have a fixed size FIFO, and dump the input from `readline` to the FIFO instead of mallocing more and more space. — Snoop, May 02 '16 at 12:15
I posted the offending code. There is a memory leak _BUT_ only during the test. I'm using Valgrind with more and more options to understand what is wrong. I'm freeing from malloc in the loop but maybe I misunderstood. I don't have much experience with this detailed level of C and RAM, but I'm sure we will find the memory leak because there is so much information and reproducable. — Niklas Rosencrantz, May 02 '16 at 12:27

5gon12eder · Accepted Answer · 2016-05-02T23:01:25.027

I'm not going to debug your code, there's not enough context to do this anyway, but I'm going to show you an idiom that you will probably find easier to use correctly. As a bonus, it will also be faster.

Have a look at your loop body. You are allocating memory during each iteration and free it under certain circumstances depending on the overall control flow. This puts very high mental load on anybody (including yourself) reading the code and trying to verify that each malloc is paired with a corresponding free on each possible path of execution.

What if you instead took the memory allocation and deallocation outside the loop?

const size_t maxcwd = PATH_MAX;
char * cwd = malloc(maxcwd);
if (cwd == NULL)
  {
    fprintf(stderr, "error: cannot start shell: out of memory\n");
    return EXIT_FAILURE;
  }
while (true)
  {
    if (getcwd(cwd, maxcwd))
      {
        // Good, use it…
      }
    else
      {
        // Bad, do something else…
      }
  }
free(cwd);

Note how there is only a single malloc and a single free and it is fairly easy to see that they pair up correctly. The code will also be faster because there are much fewer allocations and deallocations performed.

However, the code is not ideal. Why allocate a huge array upfront that might never be needed and why fail if it is still too small? It would be better to dynamically resize the array as needed.

size_t cwdsize = 0;
char * cwd = NULL;
while (true)
  {
    while ((cwd == NULL) || (getcwd(cwd, cwdsize) == NULL))
      {
        size_t newsize = (cwdsize > 0) ? (cwdsize << 1) : 128;
        char * temp = realloc(cwd, newsize);
        if (temp == NULL)
          {
            fprintf(stderr, "error: out of memory\n");
            break;
          }
        cwdsize = newsize;
        cwd = temp;
      }
    // cwd is now always good to use…
  }
free(cwd);

Of course, the code is a little verbose and you might decide to factor it out into a function. This is also the pattern used by the POSIX getline function which I find very elegant.

It is too bad that GNU readline does not support the same idiom. It always allocates new memory and you have to remember to free it after every call. Even worse, there is no way for it to report failure to allocate memory.

bool keep_going = true;
while (keep_going)
  {
    char * line = NULL;
    // Other stuff that needs to be done…
    line = readline("$ ");
    if (line == NULL)
      {
        // Probably EOF
        keep_going = false;
        goto finally;
      }
  finally:
    free(line);
  }

If you also have other error paths, you can safely say goto finally; multiple times in your loop. This is because free(NULL) is a safe no-op. If you're dealing with resources where the deallocation function must only be called with valid resources, your cleanup code would have to check itself whether it actually has a resource to free. This disciplined use of goto basically mimics deterministic destruction in C++ and is a very useful idiom in C.

The readline code you posted actually helped because when I use it, there is no memory leak reported by Valgrind and that was my goal. — Niklas Rosencrantz, May 02 '16 at 20:41
@Programmer400 - Your real goal >>should be<< to understand what the problem is and how to fix / avoid it in the future ... as well as in the case at hand. — Stephen C, May 02 '16 at 22:42
@StephenC Actually I found the leak again. Valgrind reports a memory leak and now I see that I don't do `free` after I do `string[j] = strdup(argv[j]);` because I didn't know that I have to do it. I learnt it here yesterday that `strdup` requires that I `free()` . Let's try again after I set it free. — Niklas Rosencrantz, May 02 '16 at 23:00

When to malloc and free?

1 Answers1