Who did SAM follow?

During the federal election, SAM tracked the Twitter mentions of 300 candidates – the major political parties’ leaders and the incumbent Members of Parliament who ran for re-election in 2021. You can find the accounts that SAM monitored in this spreadsheet.

When did SAM track?

SAM tracked candidates’ mentions 24/7 during the election campaign, from August 15 to September 20, 2021, and provided us with findings on a weekly basis.

Did SAM track data in both official languages?

Yes. SAM monitored tweets in both English and French.

How did SAM detect toxicity?

SAM is a machine learning bot, a software application that runs automated tasks over the Internet. SAM tracked all English and French tweets sent to political party leaders and incumbent candidates. Each message that SAM tracked, whether a reply, quote tweet or mention, was analyzed and scored on seven toxicity attributes.

Did you share SAM’s findings?

Yes. See our reports page to review the reports published during the campaign.

The Samara Centre for Democracy and Areto Labs are now reviewing the large volume of data that SAM collected during the election period, with the aim of developing informed conclusions about online toxicity, civic engagement and democracy in Canada.

Can you explain your methodology?

SAM used a natural language processing machine learning model to make predictions about whether someone would consider a text toxic or not. Each tweet that SAM analyzed was assessed for how likely it was that a person receiving the text would view it as toxic.

If a tweet scored at or above 51%, the text was likely to be considered uncivil, insulting, hostile, and possibly even threatening or profane. A higher score, such as 80%, meant the model predicted a greater likelihood that the text would be considered toxic. A separate filter was used to make predictions about “severely toxic” text. This was done to help cull false positives (e.g. when people use swear words in a positive way).
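The thresholding described above can be sketched in a few lines of code. This is a minimal illustration, not SAM’s actual implementation: the 51% cutoff comes from the text, but the attribute names and the 90% severe-toxicity cutoff are assumptions made for the example.

```python
# Illustrative sketch of score thresholding; not SAM's actual code.
TOXIC_THRESHOLD = 0.51    # stated in the methodology above
SEVERE_THRESHOLD = 0.90   # hypothetical value, for illustration only

def classify_tweet(scores: dict) -> str:
    """Label a tweet from its model scores (values in [0, 1])."""
    toxicity = scores.get("toxicity", 0.0)
    severe = scores.get("severe_toxicity", 0.0)
    # A separate "severely toxic" filter helps cull false positives,
    # e.g. swear words used in a positive way may score high on
    # general toxicity but low on severe toxicity.
    if severe >= SEVERE_THRESHOLD:
        return "severely toxic"
    if toxicity >= TOXIC_THRESHOLD:
        return "likely toxic"
    return "not toxic"

print(classify_tweet({"toxicity": 0.80, "severe_toxicity": 0.10}))
# → likely toxic
```

In this sketch, a tweet scoring 80% on toxicity but low on severe toxicity is labelled “likely toxic”, matching the distinction drawn in the paragraph above.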

SAM was trained and tested on millions of language data points to understand colloquialisms, natural language syntax, and more. The algorithm looked at things like the specific words used in a tweet and the order they were used in to make a prediction.
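As a toy illustration of how a classifier can use both the specific words in a tweet and the order they appear in (this is not SAM’s actual model), unigram features capture word choice while bigram features capture local word order:

```python
# Toy feature extraction: unigrams (word choice) plus bigrams (word order).
def extract_features(text: str) -> set:
    """Return unigram and bigram features for a tweet."""
    tokens = text.lower().split()
    unigrams = set(tokens)
    bigrams = {f"{a} {b}" for a, b in zip(tokens, tokens[1:])}
    return unigrams | bigrams

# Reordering the same words changes the bigram features,
# so a model using them can distinguish the two texts:
a = extract_features("you are great")
b = extract_features("are you great")
print(a - b)  # bigrams unique to the first ordering
```

Real models use far richer representations, but the principle is the same: the prediction depends on which words appear and how they are arranged.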

Language is highly nuanced and SAM can never replace a human, which is why the word “likely” is used in reports. What the technology can do is make these predictions at scale, and that is why it looked at trends and data anomalies rather than details about specific senders on Twitter. Read more about natural language processing and sentiment analysis.

SAM does not offer absolute or definitive conclusions; rather, it provides important insights about the state of online political conversations.

I have feedback for SAM!

We’d love to hear it. Share your thoughts with the Samara Centre.