There are no legal repercussions that I'm aware of. If a webmaster notices you crawling pages that they told you not to crawl, they might contact you and tell you to stop, or even block your IP address, but that's a rare occurrence. It's possible that new laws will one day add legal sanctions, but I don't think this will become a very big factor. So far, internet culture has preferred to solve things the technical way, with "rough consensus and running code", rather than asking lawmakers to step in. It is also questionable whether any such law could work very well, given the international nature of IP connections.
(In fact, my own country is in the process of creating new legislation specifically targeted at Google for re-publishing snippets of online news! The newspapers could easily bar Google from spidering them via robots.txt, but that's not what they want. They want to be crawled, because that brings page hits and ad money; they just want Google to pay them royalties on top! So you see, sometimes even serious, money-grubbing businesses are more upset about not being crawled than about being crawled.)
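
For what it's worth, here is roughly what "barring Google via robots.txt" would look like, and how a polite crawler might check it before fetching. This is only a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt contents, the `news.example` host, the `OtherBot` name and the `is_allowed` helper are all made up for illustration, not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that a newspaper could publish to keep
# Google's crawler out while leaving every other crawler unrestricted.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /

User-agent: *
Disallow:
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url.

    Illustrative helper (not part of any library): it parses the rules
    from a string instead of downloading them from the site.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://news.example/politics/article.html"  # made-up URL
    for agent in ("Googlebot", "OtherBot"):
        verdict = "may crawl" if is_allowed(ROBOTS_TXT, agent, url) else "is asked not to crawl"
        print(f"{agent} {verdict} {url}")
```

Note that nothing here enforces anything: the file is a request, and it is entirely up to the crawler to run a check like this and honor the result.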