Should we contact any site we want to scrape to build our market intelligence app?
No company would say yes if we contacted them.
Yet Google obviously doesn't ask for permission before scraping (indexing) the entire web.
Any suggestions?
I think it's a stretch to say that because Google does it, it's okay for you to do so.
Check what the target site(s) say in their copyright notices / data reuse notices. Perhaps they have an API. If you are not going to compete against them, why not offer an arrangement where you promote their site as a data provider / provide links back to their source? Simply "do it and apologize later" is a strategy, but it comes with risks.
Scraping is almost always illegal when the collected info is republished verbatim (restyling it doesn't save you here). Even so, it is rarely prosecuted, though the frequency is increasing.
There have recently been a number of lawsuits by companies that were the 'victims' of scraping. They have been able to win large sums based on the "temporary or partial thievery of datalines, servers, administrative hours and other resources", as opposed to claims based on IP rights.
Instead of scraping, look for sites that offer RSS, JSON or other feeds, or public APIs. You can update your local copy of the feeds based on date (fetching only what has changed since your last update) to avoid taxing their servers; be sure to check their terms of use for proper attribution.
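To make the "update based on date" idea concrete, here is a minimal sketch in Python using only the standard library. It assumes the feed's server supports HTTP conditional requests (the Last-Modified / ETag headers), which many do but not all. The function names, the User-Agent string and the feed URL are made up for illustration.

```python
import urllib.error
import urllib.request
from typing import NamedTuple, Optional


class FeedResult(NamedTuple):
    changed: bool                 # False when the server answered 304 Not Modified
    body: Optional[bytes]         # new feed content, or None if unchanged
    last_modified: Optional[str]  # store this and send it with the next request
    etag: Optional[str]           # store this and send it with the next request


def conditional_headers(last_modified: Optional[str], etag: Optional[str]) -> dict:
    """Build headers asking the server to skip the body if nothing changed."""
    # Hypothetical User-Agent: identify yourself so the site owner can reach you.
    headers = {"User-Agent": "example-feed-reader/0.1 (you@example.com)"}
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    if etag:
        headers["If-None-Match"] = etag
    return headers


def fetch_if_changed(url, last_modified=None, etag=None,
                     opener=urllib.request.urlopen) -> FeedResult:
    """Download the feed only if it changed since the stored validators."""
    request = urllib.request.Request(
        url, headers=conditional_headers(last_modified, etag))
    try:
        with opener(request, timeout=10) as response:
            return FeedResult(True, response.read(),
                              response.headers.get("Last-Modified"),
                              response.headers.get("ETag"))
    except urllib.error.HTTPError as err:
        if err.code == 304:
            # Not modified: keep the local copy and the old validators.
            return FeedResult(False, None, last_modified, etag)
        raise
```

You would call something like `fetch_if_changed("https://example.com/feed.xml", saved_last_modified, saved_etag)` on a schedule, and overwrite your local copy only when `changed` is true. If the server ignores these headers you simply get the full feed every time, so polling less often is still the polite fallback.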
http://blog.ericgoldman.org/archives/2010/09/antiscraping_la.htm
http://petewarden.com/2010/04/05/how-i-got-sued-by-facebook/