Preliminary Findings from the Challenges of Capturing Engagement on Facebook for Altmetrics Study

With more than 2.2 billion active users—six times as many as Twitter—Facebook is by far the largest social media platform on the web today. Yet despite its popularity, studies investigating Facebook sharing have reported surprisingly low levels of user engagement with scholarly research on the platform. Are Facebook users really sharing fewer academic articles than Twitter users? Or is there something else going on?

In a recent study, ScholCommLabers Asura Enkhbayar and Juan Pablo Alperin set out to investigate this discrepancy by using the Facebook Graph API to search for articles that might have been overlooked by current Altmetrics engagement measures. However, before they could do so, they encountered several fundamental challenges of collecting engagement data from Facebook, some of which have never been reported before. Their findings will be presented at STI 2018, the 23rd International Conference on Science and Technology Indicators. But in the meantime, we’re sharing a sneak peek of some of the highlights, right here on the ScholCommLab blog:

1. Working with DOIs, URLs, and Facebook’s Graph API is messy.

In an ideal world, collecting engagement data about scholarly articles would follow a simple pathway: 1) A document would be identified by a Digital Object Identifier (DOI); 2) Crossref would provide the most recent URL associated with that DOI; 3) the Graph API would be queried with the URL; 4a) Facebook would map this URL to its internal identifier system; and 4b) it would simultaneously return the engagement counts for that URL.
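
As a rough illustration of that ideal pathway, here is a minimal Python sketch that only builds the two requests involved and reads the landing URL out of a Crossref-style response. It is not the code used in the study. The helper names, the Graph API version string, the requested fields, and the JSON path used for the landing URL are all assumptions made for illustration.

```python
from urllib.parse import quote, urlencode

# Assumption: the Graph API version is chosen by the caller; "v3.0" is a placeholder.
GRAPH_VERSION = "v3.0"


def crossref_works_url(doi: str) -> str:
    """Step 2: build the Crossref REST API request for a DOI's metadata."""
    return "https://api.crossref.org/works/" + quote(doi, safe="/")


def landing_url_from_crossref(payload: dict):
    """Read the landing URL from a Crossref-style JSON response.

    Assumption: the URL sits under message -> resource -> primary -> URL.
    Returns None if that path is missing.
    """
    try:
        return payload["message"]["resource"]["primary"]["URL"]
    except (KeyError, TypeError):
        return None


def graph_query_url(article_url: str, access_token: str,
                    version: str = GRAPH_VERSION) -> str:
    """Steps 3 and 4: build a Graph API request for one URL.

    Assumption: engagement counts and the Open Graph object are requested
    together through the `fields` parameter.
    """
    params = urlencode({
        "id": article_url,
        "fields": "engagement,og_object",
        "access_token": access_token,
    })
    return f"https://graph.facebook.com/{version}/?{params}"
```

Sending these requests (for example with `urllib.request.urlopen`) is left out on purpose: the sketch only shows how the four steps chain together, and a real collection run would also need rate limiting and error handling.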

But the reality is much more complicated. To begin with, most scholarly articles have multiple URLs associated with them, each of which needs to be identified in order to collect the full engagement data. There are also challenges in aggregating the metrics collected for each URL. In some cases, different URLs point to the same final location, so it is important not to double-count those metrics. In others, they point to the same article at different locations, meaning the engagement data should be added together to gain a complete picture of engagement for the article. In the case of Facebook, this aggregation is done by mapping each URL to a Facebook Open Graph Object with a unique ID, a process that is not straightforward even when care is taken to follow best practices. Together, these challenges mean that any attempt to aggregate metrics using URL-based APIs will include some errors and limitations.
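
To make the aggregation rule concrete, here is a small Python sketch of one way it could work, assuming each URL has already been looked up and paired with an Open Graph object ID and a share count. The record layout, the function name, and the choice to keep the largest count when several URLs map to the same object are illustrative assumptions, not the procedure used in the study.

```python
def aggregate_engagement(records):
    """Combine per-URL share counts into one total for an article.

    `records` is a list of (url, object_id, shares) tuples (hypothetical layout).
    - URLs sharing an object ID are counted once (assumption: keep the max).
    - URLs with different object IDs are summed.
    - URLs with no object ID are returned separately for manual review.
    """
    per_object = {}
    unmapped = []
    for url, object_id, shares in records:
        if object_id is None:
            unmapped.append(url)
            continue
        # Same object seen again: do not double-count, keep the larger value.
        per_object[object_id] = max(per_object.get(object_id, 0), shares)
    return sum(per_object.values()), unmapped


# Hypothetical example: two URLs resolve to object "A", one to "B", one is unmapped.
example = [
    ("https://doi.org/10.1234/x", "A", 10),
    ("https://publisher.example/x", "A", 10),
    ("https://repository.example/x", "B", 3),
    ("https://old.example/x", None, 0),
]
total, unmapped = aggregate_engagement(example)
```

In this made-up example the two URLs behind object "A" contribute 10 shares once, object "B" adds 3, and the unmapped URL is set aside rather than silently treated as zero engagement.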

A simplified visualization of the challenges of collecting Facebook engagement data

2. The difficulties of working with Facebook’s Graph API have never been documented before.

The first challenge identified in the study—that is, the messiness of working with DOIs and URLs—has been reported by several other researchers working in this area (for example, by Chamberlain, 2013; Wass, 2016; Liu & Adie, 2013). But the problems Enkhbayar and Alperin encountered when using Facebook’s Graph API have not yet been quantified in other studies. In their paper, they present an in-depth overview of these previously undocumented challenges, along with a first approximation of how pervasive the problems they generate are.

3. In more than 12% of cases, reliable engagement numbers could not be identified because of these challenges.

Taken together, the challenges associated with DOIs, URLs, and the Graph API make it difficult to gain a complete picture of engagement with scholarly research on Facebook. Of more than 100,000 attempts to map engagement data to a given article, Enkhbayar and Alperin identified issues in more than 12% of cases. Given that they only tested a small number of problem cases and URL variants, these results point to large challenges facing those wishing to collect Facebook metrics through the available API.

What’s next?

The present study identified some exciting opportunities to better understand the limitations of data sources and how to overcome them. But clearly, more research is needed if we ever hope to fully understand the discrepancies between actual Facebook user engagement and the numbers that are captured by the current Altmetrics system. Stay tuned: Enkhbayar and Alperin plan to investigate this fascinating question further in their next study, coming soon.