This is basically a logging/counting application that counts the number of packets, the type of each packet, etc. on a p2p chat network. That works out to about 4-6 million packets in a 5-minute period. Because I only take a "snapshot" of this information, I only remove packets older than 5 minutes, and I do that once every five minutes. So the maximum number of items in this collection at any time is 10 to 12 million.
Because I need to make 300 connections to different superpeers, it is possible that an insert is attempted for each packet at least 300 times (which is probably why holding this data in memory is the only reasonable option).
Currently, I am using a Dictionary to store this information. But because of the large number of items I'm trying to store, I run into issues with the large object heap, and memory usage grows continuously over time.
Dictionary<ulong, Packet>
public class Packet
{
    public ushort RequesterPort;
    public bool IsSearch;
    public string SearchText;
    public bool Flagged;
    public byte PacketType;
    public DateTime TimeStamp;
}
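To illustrate the duplicate check on insert, here is a minimal sketch rather than the exact code. The `PacketStore` class and `TryInsert` method are hypothetical names made up for this example, and the way the `ulong` key is built is left out.

```csharp
using System;
using System.Collections.Generic;

public class Packet
{
    public ushort RequesterPort;
    public bool IsSearch;
    public string SearchText;
    public bool Flagged;
    public byte PacketType;
    public DateTime TimeStamp;
}

// Hypothetical helper, for illustration only.
public static class PacketStore
{
    // Returns true if the packet was new, false if the same key had
    // already been stored (for example via another superpeer connection).
    public static bool TryInsert(Dictionary<ulong, Packet> packets, ulong key, Packet packet)
    {
        if (packets.ContainsKey(key))
            return false;

        packets.Add(key, packet);
        return true;
    }
}
```

With 300 connections, most calls for a given key would take the early-return path, so the dictionary only ever holds one entry per distinct packet.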
I have tried using MySQL, but it was not able to keep up with the amount of data I need to insert (while checking that each packet is not a duplicate), and that was with transactions.
I tried MongoDB, but its CPU usage was insane and it did not keep up either.
My main issue arises every 5 minutes, when I remove all packets older than 5 minutes and take a "snapshot" of the data. I use LINQ queries to count the number of packets of a given packet type. I also call Distinct() on the data: I strip 4 bytes (the IP address) out of each KeyValuePair's key, combine them with the RequesterPort field of its Value, and use the result to count the distinct peers across all packets.
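Roughly, the snapshot step described above can be sketched as follows. This is a simplified illustration, not the exact code. The `Snapshot` and `SnapshotResult` names are hypothetical, and it assumes the IP address occupies the upper 32 bits of the key, which may not match the real key layout.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Packet
{
    public ushort RequesterPort;
    public bool IsSearch;
    public string SearchText;
    public bool Flagged;
    public byte PacketType;
    public DateTime TimeStamp;
}

// Hypothetical result holder, for illustration only.
public class SnapshotResult
{
    public int CountOfType;
    public int DistinctPeers;
}

public static class Snapshot
{
    public static SnapshotResult Take(Dictionary<ulong, Packet> packets, DateTime now, byte packetType)
    {
        DateTime cutoff = now.AddMinutes(-5);

        // Collect the expired keys first, then remove them, so the
        // dictionary is not modified while it is being enumerated.
        List<ulong> expired = packets
            .Where(kv => kv.Value.TimeStamp < cutoff)
            .Select(kv => kv.Key)
            .ToList();
        foreach (ulong key in expired)
            packets.Remove(key);

        // Count the packets of one type.
        int countOfType = packets.Values.Count(p => p.PacketType == packetType);

        // ASSUMPTION: the IP is the upper 32 bits of the key. Combine it
        // with the requester port to identify a peer, then count distinct peers.
        int distinctPeers = packets
            .Select(kv => ((kv.Key >> 32) << 16) | kv.Value.RequesterPort)
            .Distinct()
            .Count();

        return new SnapshotResult { CountOfType = countOfType, DistinctPeers = distinctPeers };
    }
}
```

Both `ToList()` and `Distinct()` allocate temporary collections proportional to the number of packets they process, which is consistent with the memory spike seen during a snapshot.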
The application currently hovers around 1.1GB of memory usage, and when a snapshot is taken the usage can as much as double.
Now this wouldn't be an issue if I had an insane amount of RAM, but the VM I have this running on is limited to 2GB of RAM at the moment.
Is there any easy solution?