I know this question is old, but I happen to have done this with a dsPIC33E last year, so I'll summarize for other people's benefit.
First, I wouldn't use the W5100 but its sibling, the W5500, which is basically a revision and makes much better use of SPI. I would also consider switching to a part that has DMA, especially if you want to make it UDP only.
In both cases you will probably use the Microchip MLA TCP/IP stack; Wiznet provides patches for it.
Unfortunately, all Microchip TCP/IP stack variants seem to do blocking communication over SPI (no DMA, no enhanced buffer mode). I tried to cut it down to UDP only and clipped out the entire Microchip part (using the underlying Wiznet driver directly and rewriting it in the process).
I also agree with MJH that the DMA-enabled PIC18F97J60 is a better choice than a cheaper PIC with an ENC (unless your numbers are really high), but I was somewhat disappointed that the TCP/IP stack does not really exploit the benefits of the J60 and sticks to the lowest common denominator.
The advantage of using an IP part instead of an Ethernet part is that you can limit a socket to a certain port, so you don't have to transfer any unrelated traffic over your SPI link. The W5500 has 4 KB per socket, and I use separate sockets for receiving and sending to maximize buffer utilization.
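To make the per-socket addressing concrete, here is a minimal sketch of how a W5500 SPI frame header is built in variable-length data mode, as I read the W5500 datasheet: a 16-bit offset address, then a control byte holding the block select bits, the read/write bit and the operation mode. The names `w5500_bsb` and `w5500_header` are hypothetical and not taken from any Wiznet or Microchip driver.

```c
#include <assert.h>
#include <stdint.h>

/* Block kinds inside one socket, following the datasheet's BSB encoding. */
enum w5500_block { W5500_SOCK_REG = 1, W5500_SOCK_TX = 2, W5500_SOCK_RX = 3 };

/* Block-select value for the common register block. */
#define W5500_BSB_COMMON 0x00u

/* Block-select bits for a socket block: socket n uses n*4 + {1, 2, 3}. */
static uint8_t w5500_bsb(uint8_t socket, enum w5500_block kind)
{
    assert(socket < 8u);                 /* the W5500 has 8 sockets */
    return (uint8_t)(socket * 4u + (uint8_t)kind);
}

/* Fill the 3-byte header that precedes the data phase.
 * Operation mode bits are left at 00 (variable-length data mode). */
static void w5500_header(uint8_t hdr[3], uint16_t addr, uint8_t bsb, int is_write)
{
    hdr[0] = (uint8_t)(addr >> 8);       /* offset address, high byte */
    hdr[1] = (uint8_t)(addr & 0xFFu);    /* offset address, low byte  */
    hdr[2] = (uint8_t)((bsb << 3) | (is_write ? 0x04u : 0x00u));
}
```

For example, reading a register at offset 0x0024 of socket 1 gives the header bytes 0x00, 0x24, 0x28. Because every frame names its socket and block explicitly, the receive and send sockets never touch each other's buffers.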
My current UDP stack reacts only to the Wiznet interrupt and does not download payload data it doesn't need. I use UDP in a purely packet-based way (no streams) and send by broadcasting on ports (to avoid having to cache MAC data for ARP purposes, though in retrospect that may not be the best optimization).
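Skipping unneeded payload can be sketched as follows, assuming the W5500's UDP receive layout as I understand it from the datasheet: each datagram in the RX buffer is preceded by 8 bytes (4 bytes source IP, 2 bytes source port, 2 bytes payload length, all big-endian). The struct and function names are hypothetical; the SPI transfers themselves are left out.

```c
#include <assert.h>
#include <stdint.h>

#define W5500_UDP_INFO_LEN 8u   /* per-datagram info header in the RX buffer */

struct udp_info {
    uint8_t  src_ip[4];
    uint16_t src_port;
    uint16_t payload_len;
};

/* Decode the 8-byte info header read from the socket's RX buffer. */
static void udp_info_parse(const uint8_t raw[8], struct udp_info *out)
{
    assert(raw && out);
    for (int i = 0; i < 4; i++)
        out->src_ip[i] = raw[i];
    out->src_port    = (uint16_t)((raw[4] << 8) | raw[5]);
    out->payload_len = (uint16_t)((raw[6] << 8) | raw[7]);
}

/* Read pointer value that skips the whole datagram without fetching
 * its payload. The pointer is 16 bits wide and wraps naturally. */
static uint16_t udp_skip_datagram(uint16_t rx_rd, const struct udp_info *info)
{
    assert(info);
    return (uint16_t)(rx_rd + W5500_UDP_INFO_LEN + info->payload_len);
}
```

The idea is to read only the 8 header bytes, decide whether the packet is interesting, and if not, write the value from `udp_skip_datagram` back to the socket's RX read pointer and issue the RECV command, so the payload never crosses the SPI link.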
On the 60 MIPS dsPIC33E, a round trip (receive a small packet, answer with a small packet) takes about 100-120 us, of which about 10-12 us is CPU time in three separate chunks: pre-receive (3-5 us), post-receive and pre-send (5-7 us, it varies), and post-send (2 us). Once every 2 KB I have to do some maintenance that takes about 40 us wall time and 5 us CPU time.
Short commands are done using the enhanced buffer. Longer ones are done using DMA (on the dsPIC33E, DMA needs 2 bit times between bytes, or between words in 16-bit mode; the enhanced buffer does not).
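The cost of that DMA gap is easy to put into numbers. The sketch below counts SPI bit times only, assuming one 2-bit gap between consecutive transfer units and no gap with the enhanced buffer; it ignores setup overhead, which is what makes the enhanced buffer attractive for short commands. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SPI_DMA_GAP_BITS 2u   /* idle bit times between DMA-fed units */

/* Bit times for n_bytes sent back to back from the enhanced buffer. */
static uint32_t spi_bits_fifo(uint32_t n_bytes)
{
    assert(n_bytes <= 0x0FFFFFFFu);      /* keep the product within 32 bits */
    return n_bytes * 8u;
}

/* Bit times for n_bytes sent by DMA, in 8-bit or 16-bit mode.
 * An odd byte count in 16-bit mode is rounded up to a full word. */
static uint32_t spi_bits_dma(uint32_t n_bytes, int word16)
{
    assert(n_bytes <= 0x0FFFFFFFu);
    uint32_t unit_bits = word16 ? 16u : 8u;
    uint32_t units     = word16 ? (n_bytes + 1u) / 2u : n_bytes;
    if (units == 0u)
        return 0u;
    return units * unit_bits + (units - 1u) * SPI_DMA_GAP_BITS;
}
```

For a 64-byte transfer this gives 512 bit times from the enhanced buffer, 638 with DMA in 8-bit mode and 574 with DMA in 16-bit mode, so 16-bit mode roughly halves the gap penalty.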
The suite is not (yet) open, but if somebody needs pointers, please respond in the comments. I plan to port the stack to PIC32(MK) in the coming year.