Testing a function that uses a random number generator

Question

I have a method which is a part of the interface I am implementing. This method calls another private method which uses a random number generator to produce output and returns it to the calling method. I want to test the calling method. How can I do that? This is the method under test:

@Override
public String generate(int wordCount) {
    StringBuilder sentence = new StringBuilder();

    List<String> selectedStrings = selectRandomStringsFromInternalVocabulary(wordCount, new Random());
    selectedStrings.sort(Comparator.<String>naturalOrder());

    swapOddIndexedStringsWithEvenIndexedStrings(selectedStrings);

    for (String word : selectedStrings) {
        sentence.append(word).append(" ");
    }

    return sentence.toString().trim();
}

This is the method that uses the random number generator:

private List<String> selectRandomStringsFromInternalVocabulary(int wordCount, Random random) {
    List<String> selectedStrings = new ArrayList<>();
    int wordCountInVocabulary = internalVocabulary.size();

    while (wordCount-- != 0) {
        int stringIndex = random.nextInt(wordCountInVocabulary);
        selectedStrings.add(internalVocabulary.get(stringIndex));
    }

    return selectedStrings;
}

There are a few things I've thought of doing:
1. Make the second method package-private and test it. But I don't want to test a private method if I can avoid it.
2. Add Random as a parameter to the calling function and pass a mock during the test. However, it's part of the interface, and the other classes implementing it do not use an RNG. Furthermore, I don't want clients to know about the implementation details.

I have gone through these questions:
1. Unit testing methods with indeterminate output
2. Unit Testing a function with random behavior

But the suggestions are similar to what I mentioned above.

Possible duplicate of [How should I test randomness?](https://softwareengineering.stackexchange.com/questions/147134/how-should-i-test-randomness) — gnat, Aug 29 '17 at 09:45
Remarks about question duplication belong in the comments, not in your question. — Robert Harvey, Aug 29 '17 at 16:07
I don't think there's a need to create a new Random every time there. Have a Random as an instance member. — COME FROM, Aug 30 '17 at 06:58
You should be looking at mathematical proofs, not tests. The tests are simply a backup as they prove nothing. — Frank Hileman, Aug 30 '17 at 22:26
I'm not trying to test whether the randomness of the selected strings conforms strictly to some statistical distribution. I'm trying to test that 1. the selected strings are not the same every time and 2. odd-indexed strings are swapped with even-indexed ones. — sayeed910, Aug 31 '17 at 09:13

Berin Loritsch · Accepted Answer · 2017-08-31T00:24:04.887

15

The question you should be asking yourself is "What am I trying to prove?" (NOTE: "prove" is not used in the mathematical sense, but simply to test the validity of something). Unit tests are there to test that your application is functioning within its designed parameters. Different tests for correctness require different approaches:

You can verify that all words in the returned string exist in the global vocabulary
You can verify that there are, or are not, repeated words in the returned string
You can verify that there is no extraneous white space at the beginning or end
You can verify that the method enforces a positive word count
You can verify that a request for 0 words produces an empty string

That might be sufficient for your needs. Those types of tests are also fairly robust. They won't break if the implementation of the random number generator changes. You are still working through the external interface to get at the internal features.
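As a rough sketch, checks like these can be written as plain Java assertions that hold for every random outcome. The names `SentenceGenerator`, `RandomSentenceGenerator`, and `SentenceContractChecks` are hypothetical, since the question does not show the real interface or class names, and the stand-in class skips the sorting and swapping steps to stay short.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical stand-in for the interface in the question; its real name is not shown there.
interface SentenceGenerator {
    String generate(int wordCount);
}

// Simplified stand-in for the class under test: picks random words and joins them with spaces.
// (The sorting and swapping steps from the question are left out to keep the sketch short.)
class RandomSentenceGenerator implements SentenceGenerator {
    private final List<String> internalVocabulary;

    RandomSentenceGenerator(List<String> internalVocabulary) {
        this.internalVocabulary = internalVocabulary;
    }

    @Override
    public String generate(int wordCount) {
        Random random = new Random();
        List<String> words = new ArrayList<>();
        for (int i = 0; i < wordCount; i++) {
            words.add(internalVocabulary.get(random.nextInt(internalVocabulary.size())));
        }
        return String.join(" ", words);
    }
}

// Checks that do not depend on which words the generator happened to pick.
// Assumes the vocabulary words contain no spaces.
class SentenceContractChecks {
    static void check(SentenceGenerator generator, List<String> vocabulary) {
        // Asking for zero words should give an empty string.
        require(generator.generate(0).isEmpty(), "0 words should give an empty string");

        // Repeat a few times, since each call may pick different words.
        for (int run = 0; run < 100; run++) {
            String sentence = generator.generate(5);
            require(sentence.equals(sentence.trim()), "no leading or trailing whitespace");

            String[] words = sentence.split(" ");
            require(words.length == 5, "exactly the requested number of words");
            for (String word : words) {
                require(vocabulary.contains(word), "every word comes from the vocabulary");
            }
        }
    }

    private static void require(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }
}
```

Calling `SentenceContractChecks.check(generator, vocabulary)` goes only through the public `generate` method, so the checks should keep passing if the way words are picked internally changes.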

I think in this case it would prove both brittle and only marginally useful to test specific sequences of strings. There aren't really that many branch paths to worry about in your example. Think about the contracts and what it is you really need to ensure, and embrace the randomness.

The important thing to note is that you are not proving randomness. You are ensuring that your method behaves as expected.

If you were testing randomness, you would need to do some statistical analysis to prove the spread of random numbers was as expected, etc. Those kinds of tests would require "proof" in the mathematical sense of the word. Unit tests aren't the right tool for that job.

edited Aug 31 '17 at 00:24

answered Aug 29 '17 at 13:49

Berin Loritsch


Unfortunately unit tests can rarely "prove" anything. This answer would be good if it encouraged actual proofs, informal or formal. The proofs are more important than the tests. – Frank Hileman Aug 30 '17 at 22:24
If this is your opinion, perhaps you aren't writing the right tests. Of course, this could also be a case of getting hung up on the semantics of the word "proof". In either case, the tests can easily be written to prove every one of those bullet points. Perhaps you are thinking in the mathematical concept of the word. If that's the case I'll concede that I'm no mathematical genius. I'm just a guy who learned by experience it's best to embrace the randomness and test what's conceptually important. – Berin Loritsch Aug 30 '17 at 23:42
Yes, I was using the word proof as commonly defined (mathematical proof). Informal proofs can work as well. But tests don't prove anything unless you can cover the whole state space, which is usually unrealistic. I think your answer would be fine if you did not use the word "proof" or "prove" in that way. – Frank Hileman Aug 30 '17 at 23:52
Just saying "you can test" instead of "you can prove", "testing" instead of "proving", would probably fix it. – Frank Hileman Aug 30 '17 at 23:54
I'll change the text, but my use of the word "prove" (https://www.merriam-webster.com/dictionary/prove) is well within the English dictionary definition, which is why I used it to get the OP to think about actually testing correctness. – Berin Loritsch Aug 31 '17 at 00:18
@FrankHileman, I reworded things to hopefully address both of our concerns. Too many times, the concept of testing is applied too simplistically because the examples for how to do unit testing were too simplistic. – Berin Loritsch Aug 31 '17 at 00:26
I find @FrankHileman's concern is warranted, since one usual misunderstanding about tests is that they "prove" some piece of code has no bugs. But tests, at best, can only show the presence of bugs, not prove their absence. This is understood by experienced programmers, but it's also a reason to not think of tests in terms of "proving" things. – Andres F. Aug 31 '17 at 02:01
Well, most things that you test are things that you *want* to prove, but proving is so much effort that you simply test them instead. How about using the word "ensure" instead of "prove"? – user253751 Aug 31 '17 at 05:12
@immibis, Did I not already make that change? – Berin Loritsch Aug 31 '17 at 14:54
@AndresF., Your logic is as flawed as the insinuation that I'm inexperienced. I already changed the wording to use "verify" and "ensure". What more do you want? The only use of the word "Prove" is in the opening statement with a link to the definition of the word (did you read it?). After that I talked about how to ensure that aspects of the method under test are within expected parameters. – Berin Loritsch Aug 31 '17 at 15:00
@BerinLoritsch You got me wrong (I possibly explained myself unclearly). I don't think you're inexperienced and I upvoted your answer. I was just explaining why I think FrankHileman's concern about the wording was warranted -- it's all too easy for inexperienced programmers (not you!) to fall into the wrong mindset when using misleading wording such as "proving". I agree with your edits and, again, I upvoted your answer! – Andres F. Aug 31 '17 at 15:31
Then my apologies. As to wording, I think there will always be a bit of a divide. For all the developers who have a formal Computer Science degree or math degree, they have the formal math definition in mind. I don't have a formal degree, so I think of the word in the slightly more relaxed understanding that you get from the English dictionary. There's a lot of people like me, so I'm always surprised when people get hung up on words that I use, inferring meaning that I didn't have when I wrote it. – Berin Loritsch Aug 31 '17 at 15:38

score 9 · Answer 2 · answered Aug 29 '17 at 14:22

Use dependency injection. Create an IRandomNumberGenerator interface, and inject it into your class (as constructor argument) or function (as parameter) that needs it. It can be as simple as the following:

interface IRandomNumberGenerator
{
     int GetRandomNumber();
}

Now create two classes that implement that interface: your real number generator class and a mock. The mock implementation returns a predefined number; the real implementation returns an actual random number. Test with the mock implementation.

class MockRandomNumberGenerator : IRandomNumberGenerator
{
     private int _fakeRandomNumber;

     public MockRandomNumberGenerator(int fakeRandomNumber)
     {
          _fakeRandomNumber = fakeRandomNumber;
     }

     public int GetRandomNumber()
     {
          return _fakeRandomNumber;
     }
}

Under no circumstances should you test with actual random numbers being generated; your test results will no longer be reproducible in that case. I'm using C# syntax, as I'm not familiar with Java.
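Since the question is in Java, the same idea might look roughly like the sketch below. `RandomIndexSource`, `JdkRandomIndexSource`, `ScriptedIndexSource`, and `WordPicker` are hypothetical names, not anything from the question. The source is passed through the constructor, so the method signature never mentions randomness and clients of the interface need not know about it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical Java counterpart of IRandomNumberGenerator.
interface RandomIndexSource {
    int nextIndex(int bound); // returns a value in [0, bound)
}

// Production implementation: delegates to java.util.Random.
class JdkRandomIndexSource implements RandomIndexSource {
    private final Random random = new Random();

    @Override
    public int nextIndex(int bound) {
        return random.nextInt(bound);
    }
}

// Test implementation: replays a fixed script of indices, cycling when it runs out.
// Assumes a non-empty script of non-negative values.
class ScriptedIndexSource implements RandomIndexSource {
    private final int[] script;
    private int position = 0;

    ScriptedIndexSource(int... script) {
        this.script = script;
    }

    @Override
    public int nextIndex(int bound) {
        int value = script[position % script.length];
        position++;
        return value % bound; // keep the scripted value inside the requested range
    }
}

// Hypothetical class under test: the index source is a constructor argument,
// so the pick(int) signature does not expose the random number generator.
class WordPicker {
    private final List<String> vocabulary;
    private final RandomIndexSource indexSource;

    WordPicker(List<String> vocabulary, RandomIndexSource indexSource) {
        this.vocabulary = vocabulary;
        this.indexSource = indexSource;
    }

    List<String> pick(int wordCount) {
        List<String> picked = new ArrayList<>();
        for (int i = 0; i < wordCount; i++) {
            picked.add(vocabulary.get(indexSource.nextIndex(vocabulary.size())));
        }
        return picked;
    }
}
```

With a vocabulary of `"a", "b", "c"` and `new ScriptedIndexSource(2, 0, 1)`, `pick(4)` selects indices 2, 0, 1, 2 and returns `c, a, b, c`, which a test can compare against exactly.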

answered Aug 29 '17 at 14:22

Eternal21


Unrelated to the question. – Frank Hileman Aug 30 '17 at 22:24
Seems like a lot of hassle when using a real RNG with a specific seed works fine and disproves your last paragraph. – whatsisname Aug 31 '17 at 02:39
@whatsisname It works fine, until you want to ensure the test passes with a specific random number being generated. Mocking it takes 1 second (if you're not using Mocking libraries, you should), and gives you infinite flexibility. – Eternal21 Aug 31 '17 at 11:37
@FrankHileman Which part? – Eternal21 Aug 31 '17 at 11:38
I don't see how it addresses the question in any way. Why is DI often chosen for no reason? – Frank Hileman Aug 31 '17 at 16:18
@FrankHileman Because DI is the best way to decouple your unit under test from other responsibilities. As it is, the code was responsible both for generating random numbers, and selecting vocabulary words. That's why writing a test for it wasn't straightforward. Applying SRP (via DI) fixed that deficiency. – Eternal21 Aug 31 '17 at 16:29

score 5 · Answer 3 · edited Aug 30 '17 at 06:19

When testing:

Use your random number generator with a particular seed that you specify. Then you will always get the same sequence. This makes it testable.
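A minimal sketch of what "same seed, same sequence" means with `java.util.Random`. The class and method names are hypothetical, and the seed value a test uses is arbitrary.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

class SeededRandomDemo {
    // Draws `count` indices in [0, bound) from a Random created with the given seed.
    static List<Integer> drawIndices(long seed, int count, int bound) {
        Random random = new Random(seed);
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            indices.add(random.nextInt(bound));
        }
        return indices;
    }
}
```

Two calls such as `SeededRandomDemo.drawIndices(42L, 5, 10)` return equal lists, because each call starts a fresh `Random` from the same seed. For this to help, the code under test needs some way to receive the seeded `Random`, for example through its constructor.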

edited Aug 30 '17 at 06:19

whatsisname


answered Aug 29 '17 at 09:35

Pieter B


Do I add the seed to production code? I don't have access to RNG in test code. – sayeed910 Aug 29 '17 at 09:40
@sayeed That would force results to always be the same. I'm guessing that's not what you're after, so no. Is there a configuration file? You could load a seed value from the configuration and set it in the program only if it is provided. Your production configuration would not have a seed, and you can test your code in a test environment with a predefined seed, with no need to worry about program functionality changing between test and production. – Neil Aug 29 '17 at 12:19
Honestly, I think this leads to brittle tests. Why not prove aspects of the methods that can be proven without worrying about specific results. – Berin Loritsch Aug 29 '17 at 13:53

Nathan Cooper · Answer 4 · 2017-08-29T14:20:48.243

Make the Random a member of the class under test. Inject a mocked or fixed-seed instance for testing, and a real one for production code. Use dependency injection.

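class SomeClassName {
    private final Random rng;

    // Production code passes a real Random; tests pass a mock or a fixed-seed Random.
    SomeClassName(Random rng) { this.rng = rng; }

    public String generate(int wordCount) { /* as before, but without new Random() */ }

    // "words" would be a better name here, by the way
    private List<String> selectRandomStringsFromInternalVocabulary(int wordCount) { /* uses rng */ }
}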

You may want to make the vocabulary a dependency as well.
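Filled in, that outline could look something like the sketch below. `SentenceBuilder` is a hypothetical name, and `swapAdjacentPairs` is only an assumed reading of the question's `swapOddIndexedStringsWithEvenIndexedStrings`, whose body is not shown there.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Random;

// Hypothetical name; the question does not show the real class name.
class SentenceBuilder {
    private final List<String> internalVocabulary;
    private final Random random;

    // Production code can pass new Random(); a test can pass new Random(someFixedSeed).
    SentenceBuilder(List<String> internalVocabulary, Random random) {
        this.internalVocabulary = internalVocabulary;
        this.random = random;
    }

    public String generate(int wordCount) {
        List<String> selected = selectRandomWords(wordCount);
        selected.sort(Comparator.naturalOrder());
        swapAdjacentPairs(selected);
        return String.join(" ", selected);
    }

    private List<String> selectRandomWords(int wordCount) {
        List<String> selected = new ArrayList<>();
        for (int i = 0; i < wordCount; i++) {
            selected.add(internalVocabulary.get(random.nextInt(internalVocabulary.size())));
        }
        return selected;
    }

    // Assumed meaning of the question's swap step: exchange elements 0 and 1, 2 and 3, and so on.
    private static void swapAdjacentPairs(List<String> words) {
        for (int i = 0; i + 1 < words.size(); i += 2) {
            Collections.swap(words, i, i + 1);
        }
    }
}
```

Two builders created with the same vocabulary and `new Random(7L)` should produce the same sentence. Under the assumed swap rule, a test can also check the swap without knowing which words were picked: in the output, each word at an even position compares greater than or equal to the word right after it.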

Also, you might have a bug that this will fix anyway. I don't know about Java, but calling new Random() in a method that is called very frequently (say, in a fast loop) can go badly when the default constructor uses a time-based seed.
