Public Perceptions of Telegram’s Association with Illicit Activities: A Sentiment Analysis Using VADER and Machine Learning

Authors

  • Rabel Catayoc, Mindanao State University - Iligan Institute of Technology, Philippines

DOI:

https://doi.org/10.54536/ajsts.v4i1.4667

Keywords:

Sentiment Reasoner, Logistic Regression, Machine Learning, Sentiment Analysis, VADER, TF-IDF

Abstract

Digital communication platforms have significantly transformed social interaction and information dissemination, yet simultaneously present challenges related to illicit activities, security threats, and regulatory oversight. Telegram, a widely used encrypted messaging service, has recently drawn global scrutiny due to allegations linking it to criminal enterprises, including identity theft, illicit drug markets, and distribution of child exploitation materials. This study systematically evaluates public sentiment surrounding Telegram’s reported facilitation of illegal activities, employing comparative sentiment analysis methodologies: VADER (Valence Aware Dictionary and Sentiment Reasoner) and a supervised machine learning approach (TF-IDF vectorization coupled with Logistic Regression). A corpus of 632 reader comments from a Wall Street Journal article discussing Telegram’s controversial associations was analysed. VADER-based labelling identified an unexpectedly predominant positive sentiment (59.5%), indicating potential public scepticism toward negative media narratives or ideological support for encrypted platforms. The logistic regression classifier demonstrated robust predictive performance, with overall accuracy of 89.56%, precision of 91%, recall of 90%, and an F1-score of 89%, yet displayed a notable positivity bias, misclassifying nuanced negative commentary. Qualitative word cloud visualizations further highlighted distinctive lexical patterns, underscoring explicit concerns around security and criminality in negative comments and humour or reflective discourse in positive remarks. Methodologically, results expose critical limitations of traditional lexical approaches in capturing subtle, implicit, or context-dependent negativity, suggesting the integration of advanced context-aware modelling techniques, such as transformer-based neural embeddings, for enhanced precision.
Practically, this analysis provides critical insights for platform governance, risk management strategies, regulatory frameworks, journalistic practices, and computational linguistics research, emphasizing the necessity for balanced methodological approaches to accurately gauge and respond to nuanced public sentiment within contentious digital discourse contexts.
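
The interplay between high overall accuracy and a positivity bias can be illustrated with a minimal Python sketch. It is not the study's code: the ±0.05 compound-score cut-offs follow the convention commonly suggested for VADER and are assumed here rather than taken from the article, the function names `label_from_compound` and `per_class_metrics` are hypothetical, and the ten toy labels are invented for illustration only.

```python
from collections import Counter


def label_from_compound(compound, threshold=0.05):
    """Map a VADER-style compound score in [-1, 1] to a sentiment label.

    The 0.05 threshold is an assumed convention, not a value from the article.
    """
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def per_class_metrics(y_true, y_pred):
    """Return overall accuracy plus precision, recall and F1 for each label."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = Counter(zip(y_true, y_pred))  # (true, predicted) -> count
    report = {}
    for label in labels:
        tp = pairs[(label, label)]
        fp = sum(pairs[(t, label)] for t in labels if t != label)
        fn = sum(pairs[(label, p)] for p in labels if p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        report[label] = {"precision": precision, "recall": recall, "f1": f1}
    accuracy = sum(pairs[(label, label)] for label in labels) / len(y_true)
    return accuracy, report


# Hypothetical toy labels (NOT the study's data): 8 positive, 2 negative.
y_true = ["positive"] * 8 + ["negative"] * 2
# A classifier leaning toward "positive" mislabels one negative comment.
y_pred = ["positive"] * 8 + ["positive", "negative"]

accuracy, report = per_class_metrics(y_true, y_pred)
print(f"accuracy={accuracy:.2f}")
for label, scores in report.items():
    print(label, {name: round(value, 2) for name, value in scores.items()})
```

In this toy case a single misclassified negative comment leaves accuracy at 0.90 while recall for the negative class drops to 0.50, which is why per-class figures are worth reading alongside a headline accuracy when the classes are imbalanced.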

Published

2025-05-07

How to Cite

Catayoc, R. (2025). Public Perceptions of Telegram’s Association with Illicit Activities: A Sentiment Analysis Using VADER and Machine Learning. American Journal of Smart Technology and Solutions, 4(1), 68–88. https://doi.org/10.54536/ajsts.v4i1.4667