I've read through the answers, and some of them are good and on point, but I feel they don't give the full picture of the major differences between the two, what their practical uses are, and when and where to use them in hardware design.
-Note- I'm 100% self-taught and do not have an engineering degree, nor have I physically built an electronic device or even soldered a circuit. However, I have independently studied the topics associated with electrical engineering, as I have a strong interest in the subject.
The main difference between the two is that a Decoder sets one particular Output to either High Voltage (a Logical One) or Low Voltage (a Logical Zero), depending on the Driving Logic that is used combined with the State of its Select Input Bits, whereas a Demux contains a Decoder but also directs the flow of a Bus line of any width through it. A Decoder does not accept any data input other than its select and enable control lines, whereas a Demux does have data input lines.
Typically, but not always, a Demux is used with a data bus. The data bus does not pass through the Decoder portion of the Demux itself; instead, the decoded select lines determine which output data bus is currently active. The routing of data takes place after the select signal has propagated through the Demux's internal Decoder, in a set of N-bit, 2-Input And Gates, one per output bus. Each Output of the internal Decoder is usually fed through a bit-extender so that all N lines on one Input of its And Gate go High (True) when that output is selected, while the N-bit data bus is connected to the And Gate's other Input. All other output Data Buses will be OFF, since their And Gates have all Zeros on that Input: only one bus is active at a time, and only while the internal Decoder's enable line is asserted.
Here is a screenshot of the basic design of a 3-8 Decoder and an 8-bit Way Demux, both with an Enable, built from Basic Logic Gates. The screenshot was taken from Logisim after building these circuits.
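As a rough behavioral sketch of the structure described above (not a gate-accurate or timing-accurate model), the Decoder and the Demux built around it could be modeled in Python like this. The function names `decoder`, `bit_extend`, and `demux` are my own and purely illustrative, and buses are represented as plain integers:

```python
def decoder(select, n_select, enable=True):
    """n-to-2^n Decoder: returns a list of output bits.

    Exactly one output is 1 (the one addressed by `select`) while
    `enable` is asserted; all outputs are 0 when it is not.
    """
    n_outputs = 1 << n_select
    if not 0 <= select < n_outputs:
        raise ValueError("select value does not fit in the select bits")
    return [1 if (enable and i == select) else 0 for i in range(n_outputs)]


def bit_extend(bit, width):
    """Replicate one Decoder output bit across `width` lines."""
    return (1 << width) - 1 if bit else 0


def demux(data, select, n_select, width, enable=True):
    """Demux = Decoder + one N-bit 2-input AND stage per output bus.

    The data bus never goes through the Decoder; it is ANDed with the
    bit-extended Decoder output for each output bus.
    """
    data &= (1 << width) - 1  # keep only the N data lines
    lines = decoder(select, n_select, enable)
    return [data & bit_extend(line, width) for line in lines]


# A 3-to-8 Decoder and an 8-way Demux carrying an 8-bit bus.
one_hot = decoder(5, 3)
buses = demux(0xA7, 5, 3, 8)
```

Here `one_hot` has only output 5 set, and `buses` carries `0xA7` on output bus 5 with every other output bus at zero. With `enable=False` every output is zero in both cases.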

When would you want to use one in favor of the other? This depends on your particular needs for your intended use along with any constraints you may have.
- Constraints:
- Board or Circuit Size
- Timing issues or concerns with propagation delay
- Cost of parts
- Desired use
- Other unlisted factors
If you are constrained by a small area, such as in a microcontroller, then it would more than likely be preferable to use a Decoder, as it is smaller and simpler and requires fewer transistors, fewer wires, and less area to make. It also has a smaller propagation delay, which matters for the timing constraints of some circuits, since a Demux has an extra layer of Logic for the signal to pass through.
So when would you want to use a Demux instead of a Decoder? As others have stated, if you need to perform additional logic, or to route Data Lines to different Buses then it might be more appropriate.
Consider a CPU or a Microcontroller that has a very small area containing many subsections, such as its Registers, Cache, ALU, and Control Units. When a specific task or instruction has been performed by the CPU's ALU, it normally sets different bits within a Flags register. The output of the Flags Register will normally be connected to the Control Unit.
The Control Unit will also have specific bits coming in from the Instruction Register, as well as lines from the Instruction Counter and the Stage Counter, where the Stage Counter holds the cycle stage that the current instruction is on, such as Fetch, Decode, Execute, Write, etc. This is where you would want to use a Decoder and not a Demux within your Control Unit. In other parts of the CPU and its Data Path where you want to designate address lines to be active, such as the lines that select a register in the Registers or Register File, or a Cache-Line or Cache-Block, you would probably also want to use a Decoder instead of a Demux.
Now, within the ALU itself, for controlling which function is to be performed based on the signals sent from the Control Unit, you may use either, depending on the design of your ALU. Here, you may want to use a Demux to direct the output of the ALU's operation to a specific region within the CPU. For example, you might want to send it to the Data-Bus that goes to the Register-File to be written, or to the Output-Bus to be written back to the Cache or Main-Memory, etc.
Typically, when you are working with Instructions to be decoded, flags to be processed, or address and control lines to become active, this is when you would prefer to use a Decoder.
When you are working with Data-Lines or Data-Buses and you want to direct the flow of Data from one place to another, such as between the various buses on a motherboard, then you would probably prefer to use a Demux over a Decoder, as it would also allow you to perform additional Logic directly on the Data that is being passed through.
It is not always practical to use a Demux, however. For example, suppose you have data coming across a bus that is connected to some device, and this device has a very specific and narrow timing range for read and write operations specified in its documentation and data sheets. Adding a Demux before this device might cause the device to not work as you intend. Here you would probably have to use a Decoder with Tri-State Buffers, such as a Bus-Transceiver, to make sure that the Data signal reaches the Device within the appropriate time limits.
Demuxes are still used in some remote situations, but they are seldom used compared to their inverse device, the Multiplexer or Mux, as there are better alternatives in practice, such as Tri-State Buffers.
Now, the type of memory system in use can also be a determining factor: synchronous memory, where reads or writes are clocked on the rising or falling edge, as opposed to asynchronous memory, where asserting control bits triggers a read or a write regardless of any clock. This can decide whether to use a Demux, a Decoder with other logic wrapped around it, a Bus-Transceiver (Tri-State Buffers with direction and enable logic), or some other custom-built device that performs a similar task.
Think of a Bus Controller that is built into a given computer's Mother or Main Board... Some are built in hardware, others have programmable ROMs, while others may have flash memory that allows a given CPU to set its logic, as opposed to the Control Unit of a CPU or even a GPU...
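A minimal sketch of the Decoder-plus-Tri-State-Buffer arrangement, under the assumption that a disconnected (high-impedance) output can be represented by Python's `None`. The names `HIGH_Z`, `tri_state_buffer`, and `decoded_bus_drivers` are hypothetical, and this models logic levels only, not the timing behavior discussed above:

```python
HIGH_Z = None  # stand-in for a high-impedance (disconnected) output


def tri_state_buffer(data, enable):
    """Pass the data through when enabled; otherwise disconnect."""
    return data if enable else HIGH_Z


def decoded_bus_drivers(data, select, n_select, enable=True):
    """One tri-state buffer per destination, enabled by a Decoder.

    The Decoder only produces the enable lines. The data goes straight
    through the selected buffer, and the unselected outputs float
    instead of being driven to zero.
    """
    n_outputs = 1 << n_select
    if not 0 <= select < n_outputs:
        raise ValueError("select value does not fit in the select bits")
    enables = [enable and i == select for i in range(n_outputs)]
    return [tri_state_buffer(data, en) for en in enables]


outputs = decoded_bus_drivers(0x3C, 2, 2)
```

Only destination 2 sees `0x3C` here; the other three outputs are `HIGH_Z` rather than zero, which is the behavioral difference from the And Gate based Demux.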
I hope that this clears things up, not so much for the person who asked this question as they have already accepted an answer, but to illustrate the main differences between these devices for future readers and novices who may be interested in going down this path pursuing this field...