Pick your priorities
Participants express their track preferences arranging them in order and submitting them at a certain recorded time (which is useful for prioritizing earlier submissions)
You've got a conflict (or at the very least an undefined order) of priorities here. Are you primarily assigning tracks based on maximizing the attendees' cumulative preference, or the order in which they committed their preferences?
I'd argue that the preference takes precedence over the submission date, and to only use the submission date in order to resolve otherwise equivalent situations.
Your solution
But let's discuss your suggested solution first.
My first approach at a solution involves filling each track quota with participants who selected it as first choice, in the order of their submission. Once I complete that quota, I remove the track from the choice set (so some participants who could not get into their first choice now have a second chance at getting into their second) and go over those choices too, in order of submission.
Your suggestion, however, argues that submission date is more important than preference, which is IMO counterintuitive to having people fill in their preferences.
Let's use a simple example here. 3 people submit their preference for 3 tracks, let's assume a maximum capacity of 1 each. Submitted in chronological order:
PERSON | PREFERENCES
========================
1 | A, B, C
2 | A, C, B
3 | C, A, B
Following your described logic, you would:
- Assign P1 to A (A quota filled)
- Assign P2 to C (C quota filled)
- Assign P3 to B (B quota filled)
One person goes to their preferred track, another person gets their "compromise" (middle) choice, the other has to go to their least preferred track. At first sight, that seems perfectly balanced, as all things should be, right?
P3 is getting a raw deal here, they get assigned to their least favorite track. This doesn't optimize attendee preference at all. The first to submit always gets their way, and the last person to submit doesn't get a say at all and has to rely on pure luck that no one else chose their preferred track.
I understand that my example is quite black and white, but the point still stands.
There is a much fairer way to sort this:
Now, no one has to attend their least preferred track. Even better, two people get to attend their #1 preferred track.
Using your assignment logic, only one person was able to go to their #1 preferred track and another was being sent to their least preferred track, so the solution was definitely not as good.
Interesting to notice in the alternate solution: P2 was assigned to A, even though P1 submitted earlier, and both had A as a top preference. This directly contradicts your system's suggestion that submission date decides who gets their preference.
Weighting your preferences
I hope you agree that the alternate solution is better, so any algorithm that would render this solution instead of yours is objectively better.
When making users fill in those lists, you essentially force them to provide a personalized weight to each option. The top choice gets 3 points, the second choice gets 2 points, the last choice gets 1 point1 2.
The aim of the assignment logic then becomes maximizing total "points" for all attendants combined. The points represent a satisfaction rating, and you want to maximize it.
The dead simplest way to do that is to cycle through every possible assignment permutation, calculate the score, and track the best scoring result. However, that's obviously not efficient at all, especially since this problem exponentially grows with the number of attendants and tracks.
To do it more efficiently, you have to take a few steps. I'll discuss them in order.
- Assume perfection. Assign everyone to their top choice. Ignore track capacity for now.
- Now evaluate track capacity.
- If there are no capacity violations, awesome! Everyone gets their top choice!
That's of course wishful thinking. Odds are that you're going to have some tracks that have more attendees than their capacity allows.
However, if you remember our total satisfaction score, we currently have the highest possible score, i.e. attendants * topChoicePoints
.
We're going to be moving people around in order to meet the capacity criteria, and since this invariably means having to downgrade people, thus lowering the total satisfaction score, it's important to realize that we want to minimize both the amount of moves and the point loss accrued by every move.
1 The specific number values don't matter. You could, for example, give the top priority much more points. This means that the algorithm is going to assign more people to their first choice, but at the cost of downgrading others from their second to their third choice. How you assign the weighting points is up to you. A simple 3-2-1 is the easiest for my example.
2 If the preference list is shorter than the total list of tracks, all tracks not chosen should get an equal lower value (e.g. 0).
Question 1: What is the minimal amount of moves we have to do?
That's easy. It's the exact amount of people who are over the track capacity. If a track currently has 50 people assigned and a capacity of 40, that's a minimum required amount of moves of 10. Let's call this number the "excess" for simplicity's sake.
Note that we do not know which 10 of the 50 attendants will be moved, but we know that there are 10 people to move.
If you take the sum of all track excesses, then you get the minimal amount of moves we have to perform.
The logic won't be very hard to implement. Essentially, our algorithm runs until the remaining total excess is zero. Something along the lines of
while(tracks.Any(track => track.HasExcess))
{
RemoveOneAttendantFrom(
tracks.First(track => track.HasExcess), // source
tracks.Where(track => track.OpenSlots > 0) // possible destinations
);
}
The bread and butter is going to be the logic in MoveOneAttendant
; which is the next question.
Question 2: How do we decide who to move, and where to?
It is possible to now revert to your "first come first serve" approach, essentially working from the back (last to submit), removing those excess attendants and pushing them towards any open track that they somewhat prefer.
However, I still disagree that submission time is the most important factor here. We're still trying to minimize the point loss from moving someone, so we should definitely take their preference into account.
For example, if everyone at the back (last to submit) have chosen already full tracks as their top preferences, you're going to end up assigning them to an open track that is very low on their preference list, which is going to cause a massive point loss.
If at the same time there is a person whose second choice track is still open, it's much better to move them instead, since this minimizes the point loss, regardless of when they submitted their preference.
Secondly, assuming you find different people who each have a (different) second choice track still available, who do you then favor and which track do you send them to?
You could again argue "first come, first serve" and move the last one who submitted their preferences. But I would argue that it's in the convention's best interest, if the total satisfaction remains the same in either case, to try and balance the attendants over the tracks, instead of packing a few tracks and making others into ghost towns.
Our priorities are thus, in descending order of priority:
- The track capacity shall be respected
- Attendee satisfactions (= preference assignment) shall be maximized
- Attendees shall be evenly distributed among the tracks (when it doesn't violate priority 1 and 2!)
- As a last resort tie-breaker, attendees to submit their preferences earlier get the benefit of the doubt.
So I would suggest that you look for a destination track (to move someone to) based on the following criteria:
- Order the open tracks by open slots descending (or ratio of open slots over capacity, if your tracks have wildly different capacities)
- In order of the open tracks:
- From the source track (excess), is there a person whose second choice is this track?
- If not, check the next open track.
- If you find a match, move that person, and exit the
RemoveOneAttendantFrom
function as it is complete.
- If multiple people satisfy the same target track as their X-th choice, then you move the person who submitted the latest (you get to finally use the submission date!)
- If you did not find a match, iterate across the tracks looking for a third-choice match, then a fourth-choice, then a fifth-choice, ... untill you've exhausted the length of your preference lists.
If you have exhausted all the Xth-choices (for the entire preference list) of all the open tracks and found no matches, there are two possibilities:
- Your preference lists are shorter than the total track list. Since all unchosen tracks are equivalent and all chosen tracks are no-go for literally everyone who could be moved, you can essentially pick any attendant from the excess track and assign them to any unchosen track. You can (finally!) use your submission date here to favor those who submitted earlier.
- If your preference lists contain all possible tracks, then your combined track capacity is less than the number of attendants you have, which renders the algorithm moot as you've overbooked your convention.
Something along the lines of:
public void RemoveOneAttendantFrom(Track excessTrack, IEnumerable<Track> openTracks)
{
var orderedTracks = openTracks.OrderByDescending(track => track.OpenSlots);
for(int i = 2; i < preferenceListLength; i++) // start from second choice already
{
foreach(var openTrack in orderedTracks)
{
if(excessTrack.Attendants.Any(attendant =>
attendant.Preferences[0].TrackId == openTrack.Id))
{
var latestAttendant = excessTrack
.Attendants
.Where(attendant => attendant.Preferences[0].TrackId == openTrack.Id)
.OrderByDescending(attendant => attendant.CreatedOn)
.First();
latestAttendant.MoveTo(openTrack);
return;
}
}
}
// Either assign any attendant to any non-chosen open track,
// or throw a "too many attendants" exception
}
The implementation of latestAttendant.MoveTo(openTrack)
is left as an exercise for the reader.
Caveat
There is one caveat here: combined movement. For example, suppose you have to move a person, and you have to assign them to their fourth choice because their second and third choice tracks are already full. That's a -3 point loss.
There is a better solution possible. Maybe you could move that person to a second-choice full track (incurring -1), and then move another person out of that track to their second choice (assuming it's on the list of open tracks), which also incurs a -1 point loss. This is a better outcome, as the total point loss is -2, instead of -3.
However, now you've removed two people from their favorite track, instead of one person. But, on the other hand, you've minimized each person's dissatisfaction as they both only get -1 satisfaction point.
In the beginning I said we we want to minimize both the amount of moves and the point loss accrued by every move. That is correct, but now we get to another question of priority: which one do we want to minimize the most?
This is where it becomes arguable which is better. I would argue that it's better to impact each person (individually) less, even at the cost of impacting more people, as it will generally increase the amount of people who will have mostly liked your convention. If you target one person's dissatisfaction at the cost of keeping another person very happy, that's one negative review and one positive one. I'd argue that it's better to have two above average reviews than it is to have a great and a bad one, but your mileage may vary here.
If you choose for the "divided dissatisfaction approach", the problem becomes a bit more complex, as you effectively have to search for:
- The best (= least points) move by moving the target person to the destination directly
- Find the best combinated move that moves the target person to a full track, and then moves another person to the actual destination track
You compare the two values (or stop looking when you know you've exhausted all better possibilities from the second bullet point) and make your decision based on that.
This is essentially the same logic, except that it's recursive and keeps track of the total cost of this "combo" move across its recursions.