C#/VB struct – how to avoid case with zero default values, which is considered invalid for given structure?

Question

How to implement some constrained .NET struct/structure (let's say LimitedString), where its state with default values (set by CLR to technical defaults, i.e. nulls, zeros, etc.) should be prohibited due to some design constraint?

For example, in the case of a trivial struct LimitedString, the properties are String Value and int MaxLength = 10, where the value of the MaxLength property must be at least 1. Value 0 is not allowed by design. But when I initialize the structure, I have 0 there. How can I force the value 10 into the defaults?

I see two options:

1. Throw an exception in a static (VB.NET Shared) parameterless constructor → force using only constructor(s) with parameters. Impractical, parameterless use is expected sometimes.
2. Add a helper private field IsInitialized and, while it is false, assume default values, i.e. MaxLength = 10. Slightly higher complexity inside the struct.

Is option #2 a legitimate way or does this violate some design principles? Is there some better way than option #2?

EDIT: option #1 won't work anyway, mentioned constructor is called every time, even if other constructors are called.

Let 0 express 10 :) Then make property MaxLength { get { return 10 - maxLength} }; Also add similar transform in ctor. [whoops, I see there is already answer same like my comment] — apocalypse, May 13 '16 at 15:53

score 7 · Accepted Answer · answered May 13 '16 at 11:39

Unless you have a really, really good reason that this has to remain a Struct, I would recommend that you convert it to a Class.
That way, initialisation is completely under your control, through the Constructor(s) of that Class.

If you really can't convert it, then I'd suggest creating a Factory Class to "construct" instances of this Struct; that class then takes on the responsibility for properly initialising instances of the Struct.

Phill W.

My actual case is very close to `LimitedString` I have shown. In fact, currently I have a class for it. Having a class for relatively simple value type often included in expressions does not look to me as a good idea, so I attempted to move it to struct. A factory class looks like a good idea (thanks) but programmers would need to be aware of it – maybe not very practical... also when considering its use in expressions. – miroxlav May 13 '16 at 11:53
Anyway, good answer. I'll re-evaluate supposed structure usage to decide whether to keep it as `class` or move to `struct` with that defaults trick shown in question as option #2 (code changes were minimal and it works nicely). Thank you for the insight. – miroxlav May 13 '16 at 12:21
Good answer. There have been many objects I have made over the years that fit most of the qualifications for a structure (small, primitive types, value semantics made sense in context), but I chose to change them to immutable classes and then overrode Equals simply because the 0's and null defaults were not a valid state that I wanted to exist. Overriding equals and keeping them immutable gets you close to the same usage as value semantics. – Mike May 13 '16 at 15:33
How does using factory help? It doesn't prevent the user from creating the default value. – svick May 13 '16 at 20:44
I think the biggest problem with accidentally generating invalid structs is things like `FirstOrDefault()` – Dirk Boer Nov 04 '22 at 11:07

31eee384 · Answer 2 · 2016-05-13T21:09:41.607

5

First note: I agree this should probably be a class. For a struct, though:

Is it possible to change the design? In LimitedString's case, it sounds like 0 is perfectly fine: a string with no characters. You can't enforce anything with a default constructor (like you suggested in #1) because structs (in C#) cannot contain explicit parameterless constructors.

For #2, maybe it's simpler to change the meaning of the struct's state to establish a good by-design default without adding an extra flag or a special-case 0 condition:

private int _maxLengthMinusTen; // By default 0, making MaxLength 10.
public int MaxLength => _maxLengthMinusTen + 10;

A constructor would perform the conversion from an input maxLength, or you could provide a private setter to keep the 10 localized in the code.
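To make the offset idea concrete, here is a minimal sketch. It assumes a struct shaped like the question's LimitedString; the constant name DefaultMaxLength, the null-to-empty fallback in Value, and the validation rules are illustrative assumptions, not something taken from the question.

```csharp
using System;

public readonly struct LimitedString
{
    // Assumed default taken from the question's "MaxLength = 10".
    private const int DefaultMaxLength = 10;

    // Stored as an offset from DefaultMaxLength, so the zeroed
    // default(LimitedString) reports MaxLength == 10 without an extra flag.
    private readonly int _maxLengthMinusTen;

    // Stays null in the default state.
    private readonly string _value;

    public LimitedString(string value, int maxLength)
    {
        if (value is null) throw new ArgumentNullException(nameof(value));
        if (maxLength < 1) throw new ArgumentOutOfRangeException(nameof(maxLength), maxLength, "Must be at least 1.");
        if (value.Length > maxLength) throw new ArgumentException("Value is longer than maxLength.", nameof(value));

        _value = value;
        _maxLengthMinusTen = maxLength - DefaultMaxLength;
    }

    // Converts the stored offset back into the real limit.
    public int MaxLength => _maxLengthMinusTen + DefaultMaxLength;

    // Assumption: the default state's null is presented as an empty string.
    public string Value => _value ?? string.Empty;
}
```

With this layout, default(LimitedString).MaxLength evaluates to 10 and new LimitedString("abc", 3).MaxLength evaluates to 3, so the zeroed state is a valid one by construction.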

edited May 13 '16 at 21:09

answered May 13 '16 at 15:47

31eee384

`private int _maxLengthMinusTen;` That's evil, I like it! – svick May 13 '16 at 20:43
good... actually `MaxLength` limit disallowing `0` is in databases... `char`/`varchar`/`text`... of course, `0` can stay for `max` as in `varchar(max)` – miroxlav May 13 '16 at 21:05
@miroxlav Ah, didn't think about it being dependent on a more set in stone limit. I loosened the language a bit there. – 31eee384 May 13 '16 at 21:14

Dai · Answer 3 · 2022-03-07T17:01:48.533

How to implement some constrained .NET struct, where its state with default values (set by CLR to technical defaults, i.e. nulls, zeros, etc.) should be prohibited due to some design constraint?

You can't. As of 2022 and C# 10.0, there still is no way to prevent consuming code from having default struct values:

LimitedString[] values = new LimitedString[ 5 ];
MethodThatRequiresNonNullString( values[0].Value ); // <-- This will always fail at runtime, without _any_ compile-time warnings or errors.

Q.E.D.

(Whereas if LimitedString were a class type, and if C# 8.0 nullable-reference-types were enabled, you'd get a compile-time warning that LimitedString[] values should be typed as LimitedString?[] and that the values[0].Value dereference is unsafe.)
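The array case is only one of several routes to a default value. The sketch below collects a few others; the stripped-down LimitedString and the DefaultStructDemo class are hypothetical, written only to show that none of these paths runs the validating constructor.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Deliberately naive version of the struct, used only for this demonstration.
public readonly struct LimitedString
{
    public LimitedString(string value, int maxLength)
    {
        Value = value ?? throw new ArgumentNullException(nameof(value));
        MaxLength = maxLength;
    }

    public string Value { get; }
    public int MaxLength { get; }
}

public static class DefaultStructDemo
{
    // Every element returned here is a zeroed instance:
    // Value is null and MaxLength is 0, and the constructor above never ran.
    public static LimitedString[] AllDefaults()
    {
        LimitedString[] array = new LimitedString[1];
        var emptyList = new List<LimitedString>();

        return new[]
        {
            array[0],                    // array elements start out zeroed
            default(LimitedString),      // explicit default expression
            emptyList.FirstOrDefault(),  // LINQ falls back to default(T)
            new LimitedString(),         // implicit parameterless construction
        };
    }
}
```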

However this does not necessarily mean that you should be using a class type for this: it just means you need to understand how you can implement struct types correctly and appropriately.

For example in case of trivial struct LimitedString, properties are String Value and int MaxLength, where value of the MaxLength property must be at least 1. Value 0 is not allowed by design.

But when I initialize the structure, I have 0 there. How to force value 10 into defaults?

You appear to be thinking that it's okay to define classes and structs that can be instantiated into an invalid state and then set their properties afterwards until they're somehow "initialized". This is not how classes nor structs should be designed.

(I blame WinForms and WPF/XAML for so many .NET developers getting into this plainly wrong mindset, because WinForms and WPF/XAML basically require all component classes to have parameterless constructors and be post-hoc initialized.)

Constructors exist to ensure that their newly created objects are in a specific valid state; this means having to assert preconditions about their parameter values (using ArgumentException). And in .NET, struct types should always be immutable (which necessarily means that their properties are strictly get-only: no set nor init properties!), so always write readonly struct types, not plain struct types.

With that in mind, let's review your first question again:

But when I initialize the structure, I have 0 there. How to force value 10 into defaults?

You should make maxLength a constructor parameter, not a property you set after construction, which means the constructor can also validate the String value against it.

However, because maxLength is not an invariant of your program, your LimitedString becomes less useful: e.g. a method that accepts a LimitedString limStr parameter has no useful compile-time guarantee about the actual limStr.Value.Length, so it would have to check limStr.Value.Length itself at runtime, which is hardly better than just passing a String value. Instead, the MaxLength value should be expressed as a type-parameter of LimitedString, e.g. LimitedString<MaxLength: 10>. Unfortunately, C# does not support int type-parameters like C++ does, but you can hack it in other ways (but that's another discussion...). I'll continue with my answer anyway, disregarding the invariance aspects of your MaxLength design.
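One possible hack of that kind, sketched under assumptions: a marker struct carries the limit, and the generic struct reads it via default(TMax). The names IMaxLength and Max10 are hypothetical, and this is only an illustration of the general idea, not necessarily the approach the answer has in mind.

```csharp
using System;

// Hypothetical marker interface: each implementing struct names one fixed limit.
public interface IMaxLength
{
    int Value { get; }
}

// Hypothetical marker type standing in for "MaxLength: 10".
public readonly struct Max10 : IMaxLength
{
    public int Value => 10;
}

public readonly struct LimitedString<TMax> where TMax : struct, IMaxLength
{
    private readonly string value_doNotReadDirectly;

    public LimitedString(string value)
    {
        if (value is null) throw new ArgumentNullException(nameof(value));
        if (value.Length > MaxLength) throw new ArgumentException("Value is longer than " + MaxLength + " characters.", nameof(value));

        this.value_doNotReadDirectly = value;
    }

    // The limit belongs to the type, not to the instance, so it cannot be zeroed.
    public static int MaxLength => default(TMax).Value;

    // A default instance still has a null string, so this getter still self-validates.
    public string Value => this.value_doNotReadDirectly ?? throw new InvalidOperationException("This LimitedString is a default value and was not constructed correctly.");
}
```

A method taking LimitedString<Max10> then knows the limit from its signature alone, although default(LimitedString<Max10>) remains possible and still has to be rejected at runtime.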

Throw exception in static parameterless constructor → force using only constructor(s) with parameters. Impractical, parameterless use is expected sometimes.

Again, you misunderstand the purpose of constructors (and are also seemingly getting confused by the type-level static LimitedString() "static constructor", which is actually completely irrelevant to your question, as your struct LimitedString won't have any static members).

You should have a parameterized constructor (accepting String value, Int32 maxLength), and your constructor must throw new ArgumentException to assert preconditions on those parameters' argument values. That's the whole point of a constructor, regardless of whether it's a struct's or a class's constructor.

...However, C#/.NET struct types always have a default ("zero") value that cannot be removed and that bypasses every constructor you define (an unavoidable consequence of the low-level details of how struct types work). So in C#, whenever you're writing a method with a struct-type parameter or a struct-typed property-setter, you always need to be cognizant of the possibility that the input is default, and then act accordingly depending on your business/domain rules (i.e. "can a default or "uninitialized" value of this type ever be considered valid in my program?"). If not, then your program needs to reject it appropriately: either by throwing an ArgumentException, returning false from a Try...-pattern method, etc.
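As a rough illustration of both rejection styles, here is a sketch of a consumer. The IsDefault property and the Consumer class are hypothetical names introduced for this example; the struct is a cut-down stand-in for the real one.

```csharp
using System;

public readonly struct LimitedString
{
    private readonly string value;

    public LimitedString(string value, int maxLength)
    {
        if (value is null) throw new ArgumentNullException(nameof(value));
        if (maxLength < 1) throw new ArgumentOutOfRangeException(nameof(maxLength), maxLength, "Must be at least 1.");
        if (value.Length > maxLength) throw new ArgumentException("Value is longer than maxLength.", nameof(value));

        this.value = value;
        this.MaxLength = maxLength;
    }

    public int MaxLength { get; }

    // Hypothetical helper: true only for instances that skipped the constructor,
    // because the constructor never stores null.
    public bool IsDefault => this.value is null;

    public string Value => this.value ?? throw new InvalidOperationException("This LimitedString is a default value.");
}

public static class Consumer
{
    // Style 1: treat a default argument as a caller error.
    public static int CountCharacters(LimitedString input)
    {
        if (input.IsDefault) throw new ArgumentException("A default LimitedString is not valid here.", nameof(input));
        return input.Value.Length;
    }

    // Style 2: the Try... pattern reports failure through the return value.
    public static bool TryCountCharacters(LimitedString input, out int count)
    {
        if (input.IsDefault)
        {
            count = 0;
            return false;
        }

        count = input.Value.Length;
        return true;
    }
}
```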

Add helper private field IsInitialized and while it is false, assume default values, i.e. MaxLength = 10. Slightly higher complexity inside the struct.

You actually wouldn't need to add a whole new field to detect default struct state: assuming your constructor requires the String value argument to be non-null (in addition to checking the length) before storing it in readonly String myStringValue; then you know that if the struct is default then the myStringValue field will also be null (as null == default(String)), so just checking if this.myStringValue is null is enough to tell you the struct is invalid. But you don't even have to do that: as you said, MaxLength must be > 0, and because default(Int32) == 0 you could always just check if this.maxLengthValue == default(Int32) to see if your struct is invalid.

Slightly higher complexity inside the struct.

Unfortunately, in the case of struct types, that "slightly higher complexity" is absolutely necessary, because a struct's member methods and properties can be invoked on default instances (whereas a class will never have its instance methods invoked when this == null), so all of your struct's externally visible members (i.e. public and internal members) must self-validate this as a precondition.

Is option #2 a legitimate way or does this violate some design principles?

On the contrary: Option #2 is the only way (and Option #1 is either nonsensical or demonstrates a lack of understanding of OOP fundamentals and the purpose of constructors).

With that in mind, let's review the hard-and-fast rules for struct type design (especially since C# 7 made significant improvements to struct types with the addition of readonly struct, for example):

TL;DR: aka Hard and fast rules:

If your instance's data is mutable, or exceeds ~32 bytes in aggregate, then you should use a class instead of a struct.
Use readonly struct LimitedString, not struct LimitedString.
Define your struct LimitedString's state using only fields, not auto-properties...
- ...and never read those fields directly!
- Instead all read access to those fields should be indirectly done via wrapper getter-only properties, which all ensure this != default(LimitedString) before returning.
Ensure all members and consumers of your struct LimitedString use only those self-validating wrapper properties.

But why?

Because any and all struct types can always possibly be default, it means that if a struct type's default state is invalid (and so should never be encountered during program operation), all public members of that struct must be self-validating in some way or another.

I find the best approach is to always use private readonly fields and require all access to their data to be via an expression-bodied property that performs the state validation. This does mean that you cannot use auto-properties to avoid having to define both the field and property for the same logical member *grumble*.

Rather than using your struct LimitedString for my first example, let's review this contrived example struct Foo instead, which features more problem-areas:

public struct Foo
{
    public Foo( Bar bar, Qux qux, Int32 neverZero, Int32 canBeZero )
    {
        if( neverZero == 0 ) throw new ArgumentOutOfRangeException( paramName: nameof(neverZero), actualValue: neverZero, message: "Cannot be zero." );

        this.Bar = bar ?? throw new ArgumentNullException(nameof(bar));
        this.qux = qux ?? throw new ArgumentNullException(nameof(qux));
        this.NeverZero = neverZero;
        this.CanBeZero = canBeZero;
    }

    public Bar Bar { get; }
    private readonly Qux qux;
    public Int32 NeverZero { get; }
    public Int32 CanBeZero { get; }

    public SomethingElse Baz()
    {
        return this.Bar.Hmmm( this.qux ).LoremIpsum( 123 );
    }

    public CompletelyDifferent MyHovercraftIsFullOfEels()
    {
        return this.qux.IWillNotBuyThistTobaccanistItIsScratched();
    }
}

...you'll need this instead:

public readonly struct Foo
{
    public Foo( Bar bar, Qux qux, Int32 neverZero, Int32 canBeZero )
    {
        if( neverZero == 0 ) throw new ArgumentOutOfRangeException( paramName: nameof(neverZero), actualValue: neverZero, message: "Cannot be zero." );
        
        this.bar_DoNotReadDirectlyExceptViaProperty = bar ?? throw new ArgumentNullException(nameof(bar));
        this.qux_DoNotReadDirectlyExceptViaProperty = qux ?? throw new ArgumentNullException(nameof(qux));
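        // Assign the backing fields: the NeverZero and CanBeZero properties below are get-only wrappers.
        this.neverZero_DoNotReadDirectlyExceptViaProperty = neverZero;
        this.canBeZero_DoNotReadDirectlyExceptViaProperty = canBeZero;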
    }

    private readonly Bar    bar_DoNotReadDirectlyExceptViaProperty;
    private readonly Qux    qux_DoNotReadDirectlyExceptViaProperty;
    private readonly Int32  neverZero_DoNotReadDirectlyExceptViaProperty;
    private readonly Int32? canBeZero_DoNotReadDirectlyExceptViaProperty;

    public  Bar Bar         => this.bar_DoNotReadDirectlyExceptViaProperty ?? throw new InvalidOperationException();
    private Qux Qux         => this.qux_DoNotReadDirectlyExceptViaProperty ?? throw new InvalidOperationException();
    public  Int32 NeverZero => this.neverZero_DoNotReadDirectlyExceptViaProperty != 0 ? this.neverZero_DoNotReadDirectlyExceptViaProperty : throw new InvalidOperationException();
    public  Int32 CanBeZero => this.canBeZero_DoNotReadDirectlyExceptViaProperty ?? throw new InvalidOperationException();

    public SomethingElse Baz()
    {
        return this.Bar.Hmmm( this.Qux ).LoremIpsum( 123 );
    }

    public CompletelyDifferent MyHovercraftIsFullOfEels()
    {
        return this.Qux.IWillNotBuyThistTobaccanistItIsScratched();
    }
}

The main thing to observe here is that now all accesses of struct Foo's state (i.e. its fields) from the Baz and MyHovercraftIsFullOfEels methods now go via the Bar and Qux properties instead of accessing the fields directly (which would cause a NullReferenceException, even when using C# 8.0 nullable-reference-types).
The field name style bar_DoNotReadDirectlyExceptViaProperty is intentionally ugly: it mitigates the risk of human factors (e.g. if the struct Foo code is ever modified in future by someone who is unfamiliar with the patterns in use here) and ensures all accesses go through the self-validating wrapper properties: Bar and Qux respectively.
- Warning sign naming is a common technique, e.g. dangerouslySetInnerHTML in ReactJS. I don't know if there's a formal name for the technique though.
- Ideally C# would let us have property-scoped fields which would make this technique completely moot but it doesn't. Oh well.
  - I suppose you could write a Roslyn code-analysis rule that enforces the requirement that all readonly struct fields are always read only via their matched self-validating properties.
Observe how the self-validating property-getter throws InvalidOperationException because that is the specific exception type you should use for this situation (emphasis mine):

InvalidOperationException: The exception that is thrown when a method call is invalid for the object's current state.
- I'm sure we're all familiar with the maxim that "property-getters should never throw exceptions", but I'll remind everyone that it should really be rephrased as "property-getters on correctly constructed objects should never throw exceptions".
  - This is because it's the constructor's responsibility to ensure the object is in a valid state - that's the entire point of constructors in the first place: to make guarantees about the state of an object.
  - Therefore, if a struct value is not constructed correctly (e.g. SomeStruct val = default(SomeStruct);) then its property-getters should throw an exception, because it's an exceptional circumstance and you really do want the program to fail.
The Int32 canBeZero value is stored in a Nullable<Int32> field (aka Int32? aka int?) because otherwise it is impossible to differentiate between default(Int32) and zero, while using Nullable<Int32> makes it possible to detect never-initialized fields that store Int32 values - or any other value-type T where default(T) is meaningful in your application (e.g. TimeSpan).
- However this isn't necessary for the Int32 NeverZero property because the domain-rules (which are implemented in the constructor) which specifically disallow zero means that default(Int32) is an invalid value, so there's no need to use Nullable<Int32> as the underlying field type.

A better `struct LimitedString`

So having considered that, this is how I would design your struct LimitedString:

public readonly struct LimitedString
{
    public LimitedString( String value, Int32 maxLength )
    {
        if( value is null ) throw new ArgumentNullException(nameof(value));
        if( maxLength < 1 ) throw new ArgumentOutOfRangeException( paramName: nameof(maxLength), actualValue: maxLength, message: "Value must be positive and non-zero." );
        if( value.Length > maxLength ) throw new ArgumentException( message: String.Format( "Length {0} exceeds maxLength {1}.", value.Length, maxLength ), paramName: nameof(value) );

        this.value_doNotReadDirectly     = value;
        this.maxLength_doNotReadDirectly = maxLength;
    }
    
    private readonly String value_doNotReadDirectly;
    private readonly Int32  maxLength_doNotReadDirectly;

    public String Value => this.value_doNotReadDirectly ?? throw new InvalidOperationException( "This LimitedString is a default value and was not constructed correctly." );

    // The constructor accepts any maxLength >= 1, so only values below 1 indicate a default instance.
    public Int32 MaxLength => this.maxLength_doNotReadDirectly >= 1 ? this.maxLength_doNotReadDirectly : throw new InvalidOperationException( "This LimitedString is a default value and was not constructed correctly." );
}

While struct LimitedString now correctly handles both default and constructed states, it still needs other improvements made to it in order to make it suitable for production use, such as these listed below:

With the above public readonly struct LimitedString, as-is, Visual Studio will instantly prompt you to override Equals and GetHashCode.
- Implementing GetHashCode correctly is very important for struct types and is an absolute requirement if you want to use it as a TKey type in a Dictionary<TKey,TValue> or as T in HashSet<T>.
You'll also want to override ToString() too, so you can pass LimitedString directly into String.Format args or Console.Write, as well as to get a better experience in the VS debugger.
- This is trivial: just do public override String ToString() => this.Value.ToString();.
- Also consider adding [DebuggerDisplay] too.
Add an implicit conversion operator to String: because every valid LimitedString represents a valid String.
- But don't add an implicit conversion from String: because not every non-null String is a valid LimitedString.
  - And also because LimitedString requires a maxLength parameter, which is not present in a normal String value, of course.
Implement IEquatable<LimitedString> and maybe IEquatable<String> too.
Add public methods and forwarded get-only properties to LimitedString that match the public API surface of String so that your LimitedString type can be a drop-in replacement for a String parameter or property in your existing code.
- so add members like public Int32 Length => this.Value.Length; and public String Trim() => this.Value.Trim();.
- This can be tedious though, and it is unfortunate that C# still cannot auto-generate those members for us.
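A rough sketch of how several of those additions might fit together follows. One choice here is my own assumption and deviates from the rule above: Equals and GetHashCode read the raw fields, so that default instances can be compared and hashed without throwing. Whether that is desirable depends on your domain rules. HashCode.Combine requires .NET Core 2.1 or later (or .NET Standard 2.1).

```csharp
using System;

public readonly struct LimitedString : IEquatable<LimitedString>
{
    private readonly string value_doNotReadDirectly;
    private readonly int maxLength_doNotReadDirectly;

    public LimitedString(string value, int maxLength)
    {
        if (value is null) throw new ArgumentNullException(nameof(value));
        if (maxLength < 1) throw new ArgumentOutOfRangeException(nameof(maxLength), maxLength, "Value must be positive and non-zero.");
        if (value.Length > maxLength) throw new ArgumentException("Length exceeds maxLength.", nameof(value));

        this.value_doNotReadDirectly = value;
        this.maxLength_doNotReadDirectly = maxLength;
    }

    public string Value => this.value_doNotReadDirectly ?? throw new InvalidOperationException("This LimitedString is a default value and was not constructed correctly.");

    public int MaxLength => this.maxLength_doNotReadDirectly >= 1 ? this.maxLength_doNotReadDirectly : throw new InvalidOperationException("This LimitedString is a default value and was not constructed correctly.");

    // Forwarded member, matching part of String's public surface.
    public int Length => this.Value.Length;

    // Assumption: equality reads the raw fields so that comparing
    // against a default instance never throws.
    public bool Equals(LimitedString other) =>
        this.maxLength_doNotReadDirectly == other.maxLength_doNotReadDirectly &&
        string.Equals(this.value_doNotReadDirectly, other.value_doNotReadDirectly, StringComparison.Ordinal);

    public override bool Equals(object obj) => obj is LimitedString other && this.Equals(other);

    // HashCode.Combine treats a null string as hash 0, so default instances hash safely.
    public override int GetHashCode() => HashCode.Combine(this.value_doNotReadDirectly, this.maxLength_doNotReadDirectly);

    public static bool operator ==(LimitedString left, LimitedString right) => left.Equals(right);
    public static bool operator !=(LimitedString left, LimitedString right) => !left.Equals(right);

    // Throws for default instances, via the self-validating Value property.
    public override string ToString() => this.Value;

    // Every valid LimitedString is a valid String, so this direction is safe to make implicit.
    public static implicit operator string(LimitedString self) => self.Value;
}
```

There is intentionally no conversion operator from String, for the reasons listed above.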

A real-world example: Refinement types.

I use readonly struct-types as refinement types in my projects, which is handy for when I want to express a static constraint over some business object.

For example, suppose I have class User with an optional String? EmailAddress { get; } property and I want to pass it around in some code that requires the User to have a valid e-mail address (not just a non-null String e-mail address value). I'd solve this problem by defining readonly struct UserWithValidEmailAddress : IReadOnlyUser and then changing all member parameters and return-types in the code that requires e-mail addresses from User to UserWithValidEmailAddress, which means the compiler will give me a compile-time error if any code tries to pass a User (because that User object could have an invalid or absent e-mail address) without first validating it with a validating-factory method.

I use readonly struct because that means it's essentially a zero-cost abstraction, which we don't get with class-types (as class-types always involve a GC allocation and subsequent collection) whereas struct types are "free" until/unless boxed.
The use of implicit conversion and implementing the IReadOnly... interface means that struct types' lack of inheritance isn't a problem either (and even if these refinement types were class types instead of struct types, inheritance would be the worst way to implement them).
- In all of my projects I always define IReadOnly... interfaces for all domain-types, so class User will have IReadOnlyUser which has only read-only versions of User's instance properties in addition to all methods that don't cause any mutation.
  - Having IReadOnly... interfaces is also useful because C# still doesn't have C++-style const-correctness. *grumble*.

Unfortunately because struct-types can always exist in their default form, it's necessary for them to self-validate inside every public instance member, not just inside their parameterized constructor. And unfortunately, C# auto-properties (used for brevity) cannot self-validate, which makes them unsafe - so those have to be implemented as private fields with public get-only properties which throw InvalidOperationException if they detect they're in an invalid/default state (which is trivial: just ensure that reference-type fields are non-null, or if all fields are value-types, then use System.Nullable<T> for the first field and throw if that's null at runtime - or add a bool field that will always be true if any of the defined ctors are used).

So this is what my UserWithValidEmailAddress looks like:

public readonly struct UserWithValidEmailAddress : IReadOnlyUser
{
    public static Boolean TryCreate( User user, [NotNullWhen(true)] out UserWithValidEmailAddress? valid )
    {
        if( ValidateEmailAddress( user.EmailAddress ) )
        {
            valid = new UserWithValidEmailAddress( user, new MailAddress( user.EmailAddress ) );
            return true;
        } 
        valid = default;
        return false;
    }

    public static implicit operator User( UserWithValidEmailAddress self )
    {
        return self.User;
    }

    private UserWithValidEmailAddress( User user, MailAddress addr )
    {
        this.user_DoNotReadDirectlyExceptViaProperty = user ?? throw new ArgumentNullException(nameof(user));
        this.validatedMailAddress = addr ?? throw new ArgumentNullException(nameof(addr));
    }

    private readonly User user_DoNotReadDirectlyExceptViaProperty;
    private readonly MailAddress validatedMailAddress;

    public User User => this.user_DoNotReadDirectlyExceptViaProperty ?? throw new InvalidOperationException("This is a default(UserWithValidEmailAddress).");

    public MailAddress ValidEmailAddress => this.validatedMailAddress ?? throw new InvalidOperationException("This is a default(UserWithValidEmailAddress).");

    public override String ToString() => this.User.ToString();

#region IReadOnlyUser
    // This code-block is generated by my own tooling, which helps with the tedium.
    // Because all of these members go through `UserWithValidEmailAddress.User` (which self-validates) instead of directly accessing the `user_DoNotReadDirectlyExceptViaProperty` field it means `default`-safety is guaranteed.

    public String   UserName    => this.User.UserName;
    public String   DisplayName => this.User.DisplayName;
    public DateTime Created     => this.User.Created;
    
    // etc

    // The IReadOnlyUser.EmailAddress property is implemented explicitly, instead of as a public member, because it's redundant: if a method is using `UserWithValidEmailAddress` then it should use the `MailAddress ValidEmailAddress` property instead.
    String IReadOnlyUser.EmailAddress => this.User.EmailAddress;

#endregion
}

While this looks tedious to write every time I want a refinement type, I have a VS code-snippet that inserts the outline for me, as well as having other tooling to automatically generate the #region IReadOnlyUser part too.
