Hashimi wrote: Actually, I'm surprised by the contrary. For example, common words like "headache", "boyfriend", "airport", "bathroom", "motorbike", "underground", "midnight", "classroom", "bedroom", "girlfriend", "birthday", "timetable", "weekend", "upstairs", "suitcase", "motorcycle", "homework", "businessman", "website", and even "forever" are not in the list of the most frequent 25,000 words in the British National Corpus and the Corpus of Contemporary American English!
Now I understand why these common words are not on the list of the most frequent 25K words in the BNC-COCA. They are all treated as two-word compounds, so they were removed from the list!
I wonder if this is just an issue with the interface you're using to search these corpora? The quickest way I know to look up multiple words in COCA is the "analyze texts" feature on Wordandphrase.info (a kind of front end for COCA, maintained by the same linguist, Mark Davies). Copy-pasting the word list above, it appears that all of the words in the list are in the top 25,000. Motorbike is the least frequent, with a rank of 23938. The rest are in the top 10,000, with six (weekend, classroom, airport, bedroom, forever, bathroom) in the top 3,000.
You can also search the BNC here and COCA here, but I find that interface less useful, because as far as I know you can only look up one word at a time, and (again AFAIK) you only get the word's frequency in the corpus, not its rank relative to other words. It looks like most of the words are in the BNC, though, with the exception of website. Motorbike is slightly less frequent in the BNC than in COCA, so it's possible that it falls outside the top 25,000 in that corpus, but I think the rest are probably inside it.
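If you only have raw frequencies and want ranks, one workaround is to rank the words yourself from a frequency list you already have on disk. Here is a minimal sketch of that idea in Python. The function name `rank_words`, the `cutoff` parameter, and the toy counts are all made up for illustration; they are not real BNC or COCA figures, and this is not how either corpus interface works internally.

```python
def rank_words(freq, queries, cutoff=25000):
    """Map each query word to (rank, within_cutoff).

    freq    -- dict of word -> raw frequency count (hypothetical input)
    queries -- words to look up
    cutoff  -- rank threshold, e.g. 25000 for a "top 25K" list
    Rank is None when the word is absent from the frequency list.
    """
    # Sort by descending frequency; break ties alphabetically so the
    # ranking is deterministic.
    ordered = sorted(freq.items(), key=lambda kv: (-kv[1], kv[0]))
    ranks = {word: i for i, (word, _count) in enumerate(ordered, start=1)}

    result = {}
    for word in queries:
        rank = ranks.get(word.lower())
        result[word] = (rank, rank is not None and rank <= cutoff)
    return result


# Toy counts invented for this example, NOT real corpus data.
toy_freq = {"the": 1000, "weekend": 50, "airport": 40, "motorbike": 3}

for word, (rank, inside) in rank_words(
    toy_freq, ["weekend", "motorbike", "website"], cutoff=3
).items():
    print(word, rank, inside)
```

With a real word list you would load the counts from a file instead of the toy dictionary. Keep in mind that the rank you get depends on how the list counts word forms (lemmas versus individual forms, compounds as one word or two), which may be exactly why different lists disagree about these words.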