Does anyone else see potential in this? http://commondatastorage.googleapis.com/books/syntactic-ngrams/index.html
They computed 2-, 3-, 4-grams and so on. Within those strings they counted nouns, verbs, adjectives and adverbs but skipped pronouns, predeterminers, possessives etc. As a result, collocations (but not colligations) stand out, because optional or incidental words are discarded.
Example of a 4-gram (not an 8-gram!), with the skipped words in brackets:
[He] exerted [a] strong influence [on] [the] legislators.
But those datasets are enormous! Does anyone have any idea how to utilize them?
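One possible approach, as a rough sketch only: stream a single file at a time and keep just the counts for the headword you care about, so nothing has to fit in memory at once. The sketch below assumes the line layout I understand the dataset to use (head word, TAB, the n-gram as space-separated `word/POS/dep-label/head-index` tokens, TAB, total count, TAB, per-year counts); please check that against the README on the page above before relying on it. The function names and the sample lines are made up for illustration.

```python
from collections import Counter
from typing import Iterable, List, NamedTuple, Tuple


class Token(NamedTuple):
    word: str
    pos: str
    dep: str
    head: int  # 1-based index of the head token; 0 marks the root


def parse_line(line: str) -> Tuple[List[Token], int]:
    """Split one line into its tokens and its total count (assumed layout)."""
    fields = line.rstrip("\n").split("\t")
    tokens = []
    for raw in fields[1].split(" "):
        # rsplit keeps any "/" inside the word itself intact.
        word, pos, dep, head = raw.rsplit("/", 3)
        tokens.append(Token(word.lower(), pos, dep, int(head)))
    return tokens, int(fields[2])


def collocates(lines: Iterable[str], head_word: str, dep: str = "amod") -> Counter:
    """Sum counts of words attached to head_word by the given dependency label."""
    counts: Counter = Counter()
    for line in lines:
        try:
            tokens, total = parse_line(line)
        except (ValueError, IndexError):
            continue  # skip lines that do not match the assumed layout
        for tok in tokens:
            if tok.dep != dep or not 0 < tok.head <= len(tokens):
                continue
            if tokens[tok.head - 1].word == head_word:
                counts[tok.word] += total
    return counts


# Made-up sample lines in the assumed layout, NOT real counts from the dataset.
SAMPLE = [
    "influence\tstrong/JJ/amod/2 influence/NN/dobj/0\t120\t1990,50\t2000,70",
    "influence\tgreat/JJ/amod/2 influence/NN/dobj/0\t80\t1995,80",
    "exerted\texerted/VBD/ROOT/0 influence/NN/dobj/1\t40\t2000,40",
]

print(collocates(SAMPLE, "influence").most_common(2))
```

On a real file you would pass an open handle instead of the sample list, e.g. `gzip.open(path, "rt", encoding="utf-8")`, which reads the compressed file line by line. Sticking to one n-gram type per run (for instance only the two-word files) avoids counting the same pair again inside larger fragments.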
Google syntactic n-grams as a collocations dictionary?