Language Identification Analysis
A comparative study of classifiers that distinguish Hindi from Marathi in Devanagari-script text (CL2 course project).
- Extracts multiple feature types: character frequency, word length, morphological analysis, n-grams, POS tags, and TF-IDF
- Naive Bayes classifiers that compare linguistic features against baseline n-gram models
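
As a rough illustration of the n-gram baseline idea, here is a minimal sketch of a multinomial Naive Bayes classifier over character bigrams, written with the Python standard library only. It is not the project's actual code: the class name `NgramNaiveBayes`, the helper `char_ngrams`, the choice of character bigrams, the add-one smoothing, and the six toy training sentences are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n=2):
    """Return overlapping character n-grams, padding with spaces at both ends."""
    padded = f" {text.strip()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class NgramNaiveBayes:
    """Hypothetical multinomial Naive Bayes over character n-grams."""

    def __init__(self, n=2, alpha=1.0):
        self.n = n          # n-gram length (assumed: bigrams)
        self.alpha = alpha  # additive (Laplace) smoothing constant

    def fit(self, texts, labels):
        self.ngram_counts = defaultdict(Counter)  # label -> n-gram counts
        self.doc_counts = Counter(labels)         # label -> number of documents
        for text, label in zip(texts, labels):
            self.ngram_counts[label].update(char_ngrams(text, self.n))
        # Vocabulary = every n-gram seen in any class during training.
        self.vocab = set().union(*self.ngram_counts.values())
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, counts in self.ngram_counts.items():
            total = sum(counts.values())
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.doc_counts[label] / total_docs)
            for gram in char_ngrams(text, self.n):
                score += math.log(
                    (counts[gram] + self.alpha) / (total + self.alpha * vocab_size)
                )
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy data invented for this sketch; a real study would use a proper corpus.
train_texts = [
    "मैं घर जा रहा हूँ", "यह किताब अच्छी है", "तुम क्या कर रहे हो",  # Hindi
    "मी घरी जात आहे", "हे पुस्तक चांगले आहे", "तू काय करत आहेस",      # Marathi
]
train_labels = ["hindi"] * 3 + ["marathi"] * 3

model = NgramNaiveBayes(n=2).fit(train_texts, train_labels)
print(model.predict("मी शाळेत जात आहे"))
```

Character n-grams work directly on Unicode code points, so the baseline needs no tokenizer or tagger. That makes it a reasonable reference point when checking whether features such as morphology or POS tags add anything beyond surface character statistics.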