How We Tackled Twitter Platform Manipulation.

Nishān Wickramarathna
Mar 7, 2021

This is the story of how our research group built a framework to tackle platform manipulation on Twitter, one that can also be used in similar contexts. According to Twitter, platform manipulation refers to the unauthorized use of Twitter to mislead others and/or disrupt their experience by engaging in bulk, aggressive, or deceptive activity. This prohibited activity includes, but is not limited to, spam, malicious automation, and fake accounts.


Many corporate decisions are based largely on news from digital media. Industries and individuals alike rely on accurate news for their day-to-day work. How do you know that what you are reading is trustworthy and not fake? What might give you a clue that an online story is bogus, fake, or unreliable? We’d appreciate examples of news sources that appear reliable and ones that don’t.
With the advent of the Internet, many of our everyday tasks moved online. The Internet gives quick access to news, and many of us rely on it, but how can you be sure that what you read on the Internet is hard fact? This is where we enter the realm of satire, click-bait, hoaxes and scams. There are sites that lie for fun, for personal gain, or just to mess with the public. While satire sites are humorous and offer a lighter take on the news, people often take their word as truth.

Even credible news sources can’t always be trusted these days. A well-known news source claimed that Facebook’s “unlike” feature would delete posts that had more than 10 unlikes. It listed its source as National Report, which is a satire page. Fake news sources are often satire and will say so in their disclaimers. If you aren’t sure about a particular news source, search for it with the word “fake” or “satire” next to it. To make things simpler, exists for the sole purpose of listing satire and potentially satire sites.
The proposed solution looks at the problem from current perspectives and combines them with new research areas. New ways of developing features were investigated and tested. The entire system is divided into 4 modules to cover the research problem thoroughly. The solution consists of a module that gathers similar incidents and filters out less relevant tweets; the remaining tweets are then analyzed with machine learning models to predict their sentiment, category, automaticity, deceptiveness and network lifetime. All the modules were tested and evaluated against standard criteria, and each includes multiple methodologies and proofs of concept showing what the final solution would look like.
The 4 modules, namely Keyword Extraction and Collecting Relevant News, Deception and Automatic Account Detection, Impact and Credibility Analysis, and Popularity Forecast, work together to answer the questions “Is this information credible? Is this information being used to manipulate me?” The solution can tell you how various social media propaganda attacks and platform manipulation attempts are trying to disrupt your way of life.

Manipulating public opinion through social media has become a pressing issue in this decade. Evidence of organized social media manipulation campaigns was found in 48 countries in 2018 alone, and in each country at least one government agency or political party was involved [Source]. This is a massive threat to democracy. There is nothing new in governments carrying out propaganda, but using toxic messaging on a global scale, amplified with new tools and psychological techniques, is new. Facebook is used heavily for such activities, and other platforms such as Twitter have been identified as good options for spreading misinformation and spam. The researchers say that some of these scenarios took place in countries that are new to social media and have been experimenting with computational propaganda and information control. It is not possible to rely on users to report such content and flag it as misleading, since they may instead help spread it. Our paper proposes a novel methodology for detecting such scenarios using machine learning and natural language processing techniques, by predicting the credibility of the user profile and the credibility of the content. It uses existing research as a foundation and builds new perspectives on the problem.


The completed framework has 4 major modules as depicted here.

In the initial design, it was agreed that the system should take both content/language-driven and network-driven approaches. The first analyzes the problem using natural language processing techniques, and the second derives a popularity forecast and makes predictions based on the social network. Two supporting modules were implemented to help in various ways, including feature generation and other analysis.

Keyword Extraction and Collecting Relevant News

The dataset was collected using the Twitter Streaming API and contains tweets from multiple domains such as health, sports and education. The preprocessing steps include extracting text from the dataset, removing URLs, punctuation and stop words, and correcting spelling, followed by lemmatization using the WordNet corpus. When the claimed tweet is presented to the algorithm, it needs to output a set of keywords. Once the algorithm is established, text is extracted from the tweet and goes through all these steps, and the output is a set of keywords instead of a corpus. Stanford CoreNLP was used for POS tagging and Named Entity Recognition (NER) tagging because it has the highest token accuracy. The flow of keyword extraction is shown in the following figure.
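The cleaning part of these steps can be sketched in a few lines of standard-library Python. This is only an illustration, not the code we used: the stop-word list is a tiny made-up sample, the function name `preprocess_tweet` is hypothetical, and spelling correction and lemmatization are left out.

```python
import re
import string
from typing import List

# Tiny illustrative stop-word list (an assumption); a real pipeline would use a full one.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "in", "on", "to", "and", "for", "at"}

# Matches http(s) links and bare www. links.
URL_PATTERN = re.compile(r"https?://\S+|www\.\S+")

# Maps every ASCII punctuation character to a space.
PUNCTUATION_TABLE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def preprocess_tweet(text: str) -> List[str]:
    """Return lowercase tokens with URLs, punctuation and stop words removed."""
    text = URL_PATTERN.sub(" ", text)          # 1. drop URLs first, before punctuation is touched
    text = text.lower()                        # 2. normalise case
    text = text.translate(PUNCTUATION_TABLE)   # 3. strip punctuation (this also removes '#' and '@')
    return [tok for tok in text.split() if tok not in STOP_WORDS]  # 4. drop stop words


tokens = preprocess_tweet("The vaccine is available at https://example.com/news today!")
print(tokens)
```

URLs are removed before punctuation on purpose: stripping punctuation first would break a link into several word-like fragments that would then survive as tokens.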

After extracting the keywords, synonyms have to be extracted as well to widen the search. The WordNet lexical database was used to retrieve similar words, and their similarity was computed using Gensim word2vec. The similarity threshold can be determined by a statistical experiment. The following table compares thresholds against the accuracy of the output.

The similarity threshold was set to 0.69, and a Turing test was carried out in which human participants evaluated the extracted keywords (discussed in detail here); it yielded 67.7% accuracy. This result depends heavily on the knowledge and intelligence of the Turing test participants. These keywords are used to query the Twitter API and collect tweets related to the initial claimed tweet, and that data is the input for the next 3 modules.
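To make the thresholding step concrete, here is a minimal sketch of filtering candidate synonyms by similarity. Word2vec similarity is cosine similarity between word vectors, so the sketch computes that directly. The three-dimensional vectors are made up for illustration, and `expand_keyword` is a hypothetical helper; only the 0.69 threshold comes from our experiment.

```python
import math

SIMILARITY_THRESHOLD = 0.69  # the threshold reported above


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def expand_keyword(keyword, candidates, vectors, threshold=SIMILARITY_THRESHOLD):
    """Keep the candidate synonyms whose similarity to the keyword meets the threshold."""
    base = vectors[keyword]
    return [
        cand for cand in candidates
        if cand in vectors and cosine_similarity(base, vectors[cand]) >= threshold
    ]


# Made-up toy vectors; real ones would come from a trained word2vec model.
toy_vectors = {
    "vaccine": [1.0, 0.0, 0.0],
    "jab": [0.9, 0.1, 0.0],
    "football": [0.0, 1.0, 0.0],
}
print(expand_keyword("vaccine", ["jab", "football"], toy_vectors))
```

Raising the threshold keeps fewer but closer synonyms, while lowering it widens the search at the cost of pulling in less relevant tweets.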

Deception and Automated Accounts (Bots) Detection

Inauthentic accounts and malicious automation affect the user experience negatively, and fake or deceptive news is a form of information that is intentionally altered to manipulate users. As discussed earlier, such content may cause public unrest and push political propaganda; even news agencies have used it to change public opinion since journalism was invented. Furthermore, it has been observed that multiple types of bot accounts carry out coordinated attacks, usually backed by an Internet troll (sock puppet bots), a human being who writes social media posts to push an agenda, motivated by passion or by a third party. Those types include sock puppet bots, amplifier bots and approval bots. Amplifier bots retweet sock puppet bots to reach a wider audience, and then approval bots bolster these tweets by liking, retweeting, or replying. This stage is designed with two submodules for easier modularization and more effective feature engineering.

  • Deception detection
  • Automated account detection

The entire methodology is based on feature engineering. One crucial feature of the study is content credibility (deception detection). As this study focuses on news, a dataset of 30,000 credible and non-credible news items was collected first. For that, two subreddits well known for their content credibility were selected, namely TheOnion (contains non-credible news) and NotTheOnion (contains credible news). Four experiments were carried out to find the best classifier for credibility classification, and the Multinomial Naïve Bayes classifier was observed to perform best. The average accuracy, obtained by 3-fold cross validation, was 0.887 for the selected model. A precision of 90.87% and a recall of 90.02% were observed with the combination of Count Vectorizer and Multinomial Naïve Bayes. The fact that Multinomial Naïve Bayes is suitable for discrete features, such as the word counts provided by the count vectorizer, was another reason for selecting it. A separate study was done for this module alone.
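To show why word counts and Multinomial Naïve Bayes fit together, here is a from-scratch sketch of the idea in plain Python. It is not our pipeline: the class `TinyMultinomialNB` is hypothetical, the four headlines are invented toy data, and cross validation is left out. It only illustrates how per-class word counts with add-one (Laplace) smoothing turn into a prediction.

```python
import math
from collections import Counter, defaultdict


class TinyMultinomialNB:
    """Multinomial Naive Bayes over raw word counts with add-one smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)       # number of documents per class
        self.word_counts = defaultdict(Counter)   # word frequencies per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = text.lower().split()         # whitespace tokenisation (a simplification)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        # Words never seen in training are ignored.
        tokens = [t for t in text.lower().split() if t in self.vocab]
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            total_words = sum(self.word_counts[label].values())
            score = math.log(doc_count / total_docs)  # log prior
            for tok in tokens:
                # Smoothed log likelihood of the word given the class.
                score += math.log(
                    (self.word_counts[label][tok] + 1) / (total_words + len(self.vocab))
                )
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy headlines, for illustration only.
train_texts = [
    "senate passes budget bill",
    "city council approves new park",
    "area man declares himself king of couch",
    "local dog elected mayor of nap town",
]
train_labels = ["credible", "credible", "non-credible", "non-credible"]

model = TinyMultinomialNB().fit(train_texts, train_labels)
print(model.predict("council passes park bill"))
print(model.predict("dog declares himself king"))
```

In practice a library implementation such as scikit-learn's `CountVectorizer` together with `MultinomialNB` does the same job with proper tokenisation and is the more sensible choice for a dataset of this size.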

The next step is to determine whether the tweet was posted by an automated account. The data for this was obtained from an online machine learning competition. In addition to the features suggested by existing research, a few new features were introduced. If a Twitter account is missing an attribute, that missingness cannot be classified as Missing Completely at Random (MCAR), because the information is supplied by the human behind the account; it is Missing Not at Random (MNAR). Hence multiple features were created to capture the missingness. Another feature detects certain words found in non-credible news and flags Twitter accounts that contain those words, as described in the following figure.

The most used terms differ from scenario to scenario. Integrating this feature is a way of transferring the model from one context to another (domain linking), and the outcome becomes more domain specific. 24 features were created from the user account data provided by the Twitter API:

· Capture null values: a binary feature stating whether the attribute contains a value (0) or not (1) (applied to location, description, url, status, has_extended_profile),

· Most used terms: a binary feature stating whether the attribute (a string) contains any of the most used terms collected from non-credible news (applied to status, description, name, screen_name),

· Numbers count in the name: the count of numbers in the attribute (applied to name, screen_name),

· String length: length of the string, i.e. number of characters in the attribute (applied to description, status)

Other features include:

· Number of links: number of hyperlinks in the attribute (applied to description),

· Number of hashtags: number of #hashtags in the attribute (applied to description, status),

· Number of mentions: number of @mentions in the attribute (applied to description, status),

· Periods (.) count: number of periods in the attribute (applied to description, status),

· Quotation mark (“”) count: number of quotation marks in the attribute (applied to status, description).

The dataset is provided by the keyword extraction module described above. A random forest classifier was used for this classification because it provided the highest accuracy, 89.04%, whereas decision tree and logistic regression gave 83.21% and 78.57% respectively; this is also confirmed by previous research. An average accuracy of 0.87 was obtained using the traditional feature set, and after introducing the new features the accuracy went up by 2.69%, resulting in 0.89. This suggests that these models can be improved further to get higher results. Detecting and categorizing tweets into click-bait, humor, hate speech, etc. could improve the results. This module provides two outcomes: the credibility of the tweet itself, and whether the account is likely to be a bot.
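A few of the account features described above can be sketched as plain functions over a user record. Treat this as an illustration under assumptions: the dictionary keys mirror the attribute names mentioned above, `SUSPICIOUS_TERMS` is a made-up stand-in for the terms mined from non-credible news, "numbers count" is interpreted as a count of digit characters, and the feature names are hypothetical.

```python
import re

# Hypothetical terms; the real list is collected from non-credible news.
SUSPICIOUS_TERMS = {"shocking", "giveaway"}


def null_flag(value):
    """Return 1 when the attribute is missing or empty, 0 when it has a value."""
    return 1 if value is None or value == "" else 0


def text_features(text, prefix):
    """Count-based features for one free-text attribute such as the description."""
    text = text or ""
    lowered = text.lower()
    return {
        f"{prefix}_length": len(text),
        f"{prefix}_links": len(re.findall(r"https?://\S+", text)),
        f"{prefix}_hashtags": len(re.findall(r"#\w+", text)),
        f"{prefix}_mentions": len(re.findall(r"@\w+", text)),
        f"{prefix}_periods": text.count("."),
        f"{prefix}_has_suspicious_term": int(any(term in lowered for term in SUSPICIOUS_TERMS)),
    }


def account_features(user):
    """Build a flat feature dictionary from a user record (a plain dict here)."""
    features = {}
    for field in ("location", "description", "url"):
        features[f"{field}_is_null"] = null_flag(user.get(field))
    for field in ("name", "screen_name"):
        # Interpreted as the number of digit characters in the attribute.
        features[f"{field}_digits"] = sum(ch.isdigit() for ch in user.get(field) or "")
    features.update(text_features(user.get("description"), "description"))
    return features


example_user = {
    "name": "Deal Bot 2021",
    "screen_name": "deals4u",
    "location": None,
    "description": "Win a #giveaway now. Follow @promo",
    "url": "",
}
print(account_features(example_user))
```

Keeping the null flags as separate features, rather than imputing the missing values, is what lets a classifier learn from the missingness itself, which matters when the data is missing not at random.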

Impact and Credibility Analysis

To get a sense of credibility, reasonable grounds have to be established. This stage of the framework consists of three subtasks, namely news category analysis, sentiment analysis of tweets, and source credibility analysis. This module requires the Twitter user id or screen name from the set of tweets collected by module 1. The flow of the methodology is depicted in the following figure.

The news category analysis module (further discussed here) is responsible for identifying the most relevant category the tweet belongs to. A machine learning approach was taken, and the dataset consists of tweets and their relevant categories. After a comparative study of Support Vector Machines (SVM), the Naïve Bayes classifier and a Decision Tree (Hoeffding Tree), SVM was selected as the best fit. A sentiment analysis step was carried out to evaluate the followers’ responses and measure how far they agree or disagree with the author of the tweet. An SVM model is used for this purpose as well, since it yielded the highest accuracy and lexicon-based sentiment analysis performed poorly compared to machine learning models. One drawback of using SVM is that smaller datasets tend to reduce the accuracy. The output of this module is the sentiment, either positive or negative, analogous to agreement or disagreement. This process is referred to as agreement modeling, and the result, the agreement score, is passed to the Popularity Forecast module. The overall sentiment of the followers towards the user profile is calculated and used together with features such as verified status, number of followers and lifetime of the profile to determine the credibility score.

The data does not come with credibility labels, so an unsupervised approach, K-means clustering, is used, which allows the number of clusters to be pre-defined.
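As a rough illustration of that clustering step, here is a minimal K-means in plain Python. The profile vectors are invented and assumed to be already scaled to the 0 to 1 range, in the order verified, followers, profile lifetime and follower sentiment. The deterministic initialisation from the first k points is a simplification chosen to keep the example reproducible, not what a library implementation does.

```python
def kmeans(points, k, iterations=20):
    """Plain k-means with deterministic initialisation from the first k points."""
    centroids = [list(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        for i, p in enumerate(points):
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            assignments[i] = distances.index(min(distances))
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:
                centroids[j] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignments, centroids


# Invented, pre-scaled profiles: [verified, followers, lifetime, follower sentiment].
profiles = [
    [1.0, 0.9, 0.8, 0.7],
    [0.0, 0.1, 0.1, 0.2],
    [1.0, 0.8, 0.9, 0.6],
    [0.0, 0.2, 0.05, 0.1],
]
assignments, centroids = kmeans(profiles, k=2)
print(assignments)
```

The cluster indices themselves carry no meaning. Which cluster corresponds to higher credibility has to be decided afterwards, for example by inspecting the centroids.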

Popularity Forecast

From here onwards, an occurrence of information sharing that involves many users will be referred to as an event. News and marketing campaigns are examples of events. We took human behavior as a major aspect of the model. When we measure popularity, existing methods can’t directly give us the number of users involved in or affected by the event. We can identify involvement as in (1) through users who retweet (U1), users who reply to tweets (U2) and users who tweet with hashtags (U3), with the total number of users involved in an event denoted by T.

The proposed model uses time series analysis techniques to predict popularity. This approach tries to model the overall propagation behavior from the first few minutes of propagation. An LSTM model is used as the time series analysis technique within this framework, but there may be special cases that give better results with this model. As shown here, the information from the first few minutes is fed to the network at each time step, in the form of aggregated and summarized features.
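The per-time-step aggregation can be pictured with a small sketch. It assumes each interaction is recorded as a pair of seconds since the first tweet and interaction kind, with the three kinds loosely mirroring the user groups U1, U2 and U3 above. The function name, the one-minute step and the choice of simple counts as the summarized features are all assumptions made for illustration; the LSTM itself is not shown.

```python
from collections import Counter

# Interaction kinds, in the order they appear in each feature vector.
KINDS = ("retweet", "reply", "hashtag")


def aggregate_time_steps(events, step_seconds=60, num_steps=5):
    """Turn (seconds_since_first_tweet, kind) events into one count vector per time step."""
    buckets = [Counter() for _ in range(num_steps)]
    for offset, kind in events:
        step = int(offset // step_seconds)
        # Ignore events outside the observed window or of an unknown kind.
        if 0 <= step < num_steps and kind in KINDS:
            buckets[step][kind] += 1
    return [[bucket[kind] for kind in KINDS] for bucket in buckets]


# Invented events for illustration.
events = [(5, "retweet"), (42, "reply"), (61, "retweet"), (75, "retweet"), (130, "hashtag")]
sequence = aggregate_time_steps(events, step_seconds=60, num_steps=3)
print(sequence)
```

The result is a list with one fixed-length vector per time step, which is the shape a sequence model such as an LSTM expects as input.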


There were multiple challenges to overcome before designing this framework. For example, what would be the ground truth for deceptive news, and how should the final outcome be evaluated? It was clear from the beginning that even the mainstream media can be biased (also confirmed by the public survey), so there is no way of setting a ground truth for a given scenario. Instead, this framework used supervised learning to classify deceptive news, disregarding language cues like satire.

The implementation was carried out using a microservices architecture for maintainability and cross functionality. The four modules and their respective databases are interdependent but redundant. It was kept this way for improved productivity and parallel task execution. The free version of the Twitter API is limited to 180 calls every 15 minutes and can only search tweets from the last 30 days. So, third-party libraries were used in some situations to query at higher speeds and gather historical data. Using all 4 modules and looking at the collective outcome, users get a clear insight into the situation and can make an educated decision on whether to trust the news or not.
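For anyone staying within the free limit, a client-side limiter is one way to avoid being cut off. The sketch below is a generic sliding-window limiter, not part of our implementation. The class name is hypothetical, the defaults simply encode the 180 calls per 15 minutes figure mentioned above, and the clock is injectable so the behaviour can be checked without waiting.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_calls within any window of window_seconds."""

    def __init__(self, max_calls=180, window_seconds=15 * 60, clock=time.monotonic):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self.clock = clock
        self.calls = deque()  # timestamps of recent calls, oldest first

    def seconds_until_allowed(self):
        """Return 0.0 if a call may be made now, otherwise the time to wait."""
        now = self.clock()
        # Drop timestamps that have left the window.
        while self.calls and now - self.calls[0] >= self.window_seconds:
            self.calls.popleft()
        if len(self.calls) < self.max_calls:
            return 0.0
        return self.window_seconds - (now - self.calls[0])

    def record_call(self):
        """Register that a call was just made."""
        self.calls.append(self.clock())


limiter = SlidingWindowLimiter()
wait = limiter.seconds_until_allowed()
if wait > 0:
    time.sleep(wait)
limiter.record_call()
# The actual API request would be issued here.
```

A sliding window is stricter than strictly necessary if the service counts calls in fixed windows, but erring on the conservative side keeps a long-running collector from hitting the limit.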

The aim of this framework is to detect social media platform manipulation by assessing the credibility of the origin (user profile) and the credibility of the content using sentiment analysis and machine learning. The keyword extraction method is generic and not tied to a specific domain. Its accuracy and performance have to be improved in the future.

Our research team published 4 research papers on this work. We proposed a new methodology to detect Twitter platform manipulation, that is, “using Twitter services in a manner intended to artificially amplify or suppress information or engage in behavior that manipulates or disrupts people’s experience”. We used machine learning and natural language processing techniques to predict the credibility of the user profile and the credibility of the content in order to address the research problem.

We, the authors, would like to express our utmost gratitude to the supervisor of our level 4 research, Dr. G. Upeksha Ganegoda of the University of Moratuwa, and to all others who helped us in numerous ways.

Keyword extraction from Tweets using NLP tools for collecting relevant by Thiruni Jayasiriwardena (Thiruni_Jay)

Source credibility analysis on Twitter by Malith Wijesekara (Malith__95)

Detecting Automatically Generated Tweets Using Lexical Analysis and Profile Credibility by Nishan Chathuranga Wickramarathna (NishanWrites)

A Framework to Detect Twitter Platform Manipulation and Computational by the entire team.