Vijay Kumar Konda
Independent Researcher
India
Abstract
This manuscript examines text mining techniques for sentiment analysis of social media streams, restricted to methodologies and tools available up to and including 2016. It begins with an overview of the core concepts and motivations for text-based opinion mining in engineering practice. It then surveys representative case studies of Twitter and Facebook analysis, identifies persistent research gaps in feature extraction, domain adaptation, and performance evaluation, and outlines a reproducible methodology implemented in Python using open-source libraries such as NLTK, scikit-learn, and Gensim. Experimental results on benchmark datasets are reported, with classification accuracies in line with state-of-the-art practice circa 2016. The conclusion synthesizes the technical findings, reflects on their limitations, and suggests directions for future work on questions that remain open.
Keywords
sentiment analysis, text mining, social media, Python, NLTK, scikit-learn