Google Word Set
It appears Google will be releasing a very large data set through the Linguistic Data Consortium (LDC). There is no word on pricing yet, and a check of the LDC site showed a wide range of prices.
We processed 1,011,582,453,213 words of running text and are publishing the counts for all 1,146,580,664 five-word sequences that appear at least 40 times. There are 13,653,070 unique words, after discarding words that appear less than 200 times.
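To make the two thresholds in that description concrete, here is a minimal Python sketch of counting five-word sequences and dropping the rare ones. It is only an illustration, not Google's actual pipeline: the function names (`count_ngrams`, `keep_frequent`), the whitespace tokenization, and the toy thresholds of 2 (standing in for 40 and 200) are all assumptions made for the example.

```python
from collections import Counter


def count_ngrams(tokens, n=5):
    """Count every run of n consecutive tokens in the token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def keep_frequent(counts, min_count):
    """Keep only the entries seen at least min_count times."""
    return {key: c for key, c in counts.items() if c >= min_count}


# Toy corpus; naive whitespace tokenization is an assumption of this sketch.
text = "the cat sat on the mat and the cat sat on the mat again"
tokens = text.split()

# Vocabulary: words that appear at least 2 times (the announcement uses 200).
vocab = set(keep_frequent(Counter(tokens), min_count=2))

# Five-word sequences that appear at least 2 times (the announcement uses 40).
five_grams = keep_frequent(count_ngrams(tokens, n=5), min_count=2)

for gram, count in sorted(five_grams.items()):
    print(" ".join(gram), count)
```

At the scale quoted above (over a trillion words) the counts would not fit in one in-memory `Counter`, so a real run would need a distributed or on-disk approach; the sketch only shows the counting and thresholding logic.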