I'm pretty sure Olin has it nailed in his comment there: you can either increase the speed by accessing more bits in parallel, or decrease the pin count and access bits serially, but either way this happens inside the card. A class 10 device gets its rating because the internal controller takes in the write command and accesses enough flash cells in parallel that erasing and rewriting them can be sustained at 10 MB/s. The problem is that, in general, this makes stacking flash cells more expensive because you need more lines between each layer, which is why micro-SD cards are so much more expensive in higher classes.
The other way you can increase speed is by pre-erasing. The problem is that programming flash can only change individual bits in one direction (from 1 to 0); going the other way requires erasing a whole block, which sets every bit in it back to 1. So in general, when you try to write 512 bytes, the SD card will erase the block you're writing to and then write the new data. This slows down the transaction. If you instead mark that block for erasing later and write to a different block that has already been erased, the write happens much faster. The controlling IC can then go through and pre-erase the marked blocks when it's idle.
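To make the pre-erase idea concrete, here is a rough sketch in Python against a simulated flash array. Everything in it (`SimFlash`, `PreEraseWriter`, `idle_erase`, the tiny block size) is made up for illustration and is not how any particular controller or flash chip API works. It also ignores wear leveling and power-loss safety.

```python
ERASED = 0xFF  # an erased flash byte reads back as all 1 bits


class SimFlash:
    """Toy flash model (hypothetical): programming can only clear bits."""

    def __init__(self, num_blocks, block_size=512):
        self.blocks = [bytearray([ERASED] * block_size) for _ in range(num_blocks)]
        self.erase_count = 0

    def erase(self, idx):
        # Erase is the slow operation: it resets every bit in the block to 1.
        self.blocks[idx][:] = bytes([ERASED]) * len(self.blocks[idx])
        self.erase_count += 1

    def program(self, idx, data):
        # Programming can only move bits from 1 to 0, hence the AND.
        block = self.blocks[idx]
        for i, byte in enumerate(data):
            block[i] &= byte


class PreEraseWriter:
    """Writes go to already-erased blocks; stale blocks are erased when idle."""

    def __init__(self, flash):
        self.flash = flash
        self.free = list(range(len(flash.blocks)))  # pre-erased physical blocks
        self.dirty = []    # stale blocks marked for a later erase
        self.mapping = {}  # logical block number -> physical block number

    def write(self, logical, data):
        if not self.free:
            self.idle_erase()  # slow path: no pre-erased block was available
        if not self.free:
            raise RuntimeError("flash is full")
        physical = self.free.pop(0)
        self.flash.program(physical, data)  # fast: no erase on this path
        old = self.mapping.get(logical)
        if old is not None:
            self.dirty.append(old)  # mark the stale copy instead of erasing now
        self.mapping[logical] = physical

    def idle_erase(self):
        # Call this whenever the bus is idle to refill the pre-erased pool.
        while self.dirty:
            idx = self.dirty.pop()
            self.flash.erase(idx)
            self.free.append(idx)

    def read(self, logical):
        return bytes(self.flash.blocks[self.mapping[logical]])
```

The point of the design is that `write` never waits on an erase as long as `idle_erase` gets called often enough to keep `free` from running empty. The cost is the logical-to-physical mapping table you now have to maintain.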
Aaaaand I wrote this whole blob like you were talking to SD cards, but you said you're writing to flash chips. Whoops! The advice for pre-erasing cells should still be valid if you have that level of control over the flash chips. Anyone may feel free to correct me if I'm wrong, and I hope that helps!
Edit:
Looking at the tags, it looks like you might actually be asking about SD cards, in which case the only thing you could really do is external parallelization. Essentially you'd be implementing a RAID 0 where the first block goes to the first SD card, the second block to the second SD card, and so on. You could theoretically increase your throughput N times, where N is the number of cards, so long as the data comes at a rate where you can expect the first card to have finished writing by the time you finish sending the write command to card N.
The downside to this is that you would need N functioning SD card interfaces, and it would be kind of a pain to get data on and off the array.
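As a rough illustration of the striping, here is a small Python sketch that deals data out round-robin and puts it back together. The names `stripe` and `unstripe` are made up for this example, and I'm assuming you stripe in 512-byte blocks, since that is the unit SD cards are normally written in. It only shows the data layout, not the actual SD bus traffic.

```python
def stripe(data, n_cards, block_size=512):
    """Split data into blocks and deal them round-robin across n_cards lanes."""
    lanes = [[] for _ in range(n_cards)]
    for offset in range(0, len(data), block_size):
        card = (offset // block_size) % n_cards
        lanes[card].append(data[offset:offset + block_size])
    return lanes


def unstripe(lanes):
    """Reassemble the original byte stream from the per-card block lists."""
    out = bytearray()
    longest = max((len(lane) for lane in lanes), default=0)
    for row in range(longest):
        for lane in lanes:
            if row < len(lane):  # later cards may hold one block fewer
                out += lane[row]
    return bytes(out)
```

Reading the data back means walking all N cards in the same order, which is the "pain to get data on and off" part: no single card holds a usable file on its own.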