Statistics question
The majority of today’s grammar checkers are still rule-based. However, as data science and statistical methods gain relevance in NLP, grammar checking can also be performed on the basis of big data rather than grammar rules.
The task of this semester project is to develop a statistical grammar checker.
Your prototype (for English, German or Arabic) should include (at least) the following features:
- a GUI via which input can be typed,
- a match of the input (or n-grams thereof) to "big data", i.e. large corpora which are available online,
- a calculation on the basis of statistical methods (e.g. n-gram counts, Markov probabilities, …),
- the detection and handling of errors, and
- a suggestion of a correction (i.e. a more likely string).
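
As a rough illustration of how the statistical features listed above could fit together, here is a minimal sketch in Python. It is not a prescribed solution. The tiny CORPUS string is a stand-in for a large online corpus, and the names train, bigram_prob and check are hypothetical helpers chosen for this example. The sketch counts bigrams, estimates first-order Markov probabilities with add-one smoothing, flags word pairs never seen in the corpus, and proposes a more likely next word.

```python
from collections import Counter

# Assumption: a toy stand-in for a large online corpus.
CORPUS = "the cat sits on the mat . the dog sits on the rug . a cat is on the mat ."


def train(text):
    """Count unigrams and bigrams in a whitespace-tokenised text."""
    tokens = text.lower().split()
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def bigram_prob(prev, word, unigrams, bigrams):
    """Add-one smoothed estimate of P(word | prev), a first-order Markov probability."""
    vocab_size = len(unigrams)
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)


def check(sentence, unigrams, bigrams):
    """Flag bigrams unseen in the corpus and suggest a more likely second word.

    Returns a list of (previous_word, flagged_word, suggestion) tuples;
    suggestion is None when the previous word never occurs in the corpus.
    """
    tokens = sentence.lower().split()
    findings = []
    for prev, word in zip(tokens, tokens[1:]):
        if bigrams[(prev, word)] == 0:
            # Candidate replacements: every word observed after `prev`.
            candidates = [(count, w) for (p, w), count in bigrams.items() if p == prev]
            # Highest count wins; ties are broken by alphabetical order.
            suggestion = max(candidates)[1] if candidates else None
            findings.append((prev, word, suggestion))
    return findings


if __name__ == "__main__":
    unigrams, bigrams = train(CORPUS)
    for prev, word, suggestion in check("the cat sit on the mat", unigrams, bigrams):
        print(f"unlikely: '{prev} {word}', suggestion: {suggestion}")
```

A real prototype would replace the toy corpus with counts drawn from large online corpora, probably use longer n-grams, and rank candidate corrections by the smoothed probability of the whole string rather than by a single bigram count.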