Catalogue Search | MBRL

Improving Online Community Governance at Web Scale

by Weld, Galen Cassebeer in Computer Engineering , Computer science , Information Technology

2025

Nearly two out of every three people on the planet are members of an online community, and this number is forecast to keep growing. These communities have an incredible diversity of topic, size, and structure, and they offer unique ways to connect their users and bring people together. Unfortunately, online communities have also been associated with significant offline harms, including the mental health crisis, abuse and harassment, interference with free and democratic elections, and radicalization and political polarization. Almost all online communities rely on some form of governance to set and enforce rules, role model good behavior, and generally lead the community. The forms that this governance takes vary widely from community to community. On some platforms, moderators' work is conducted in the background, while in many others, community leaders are volunteers who take a more visible role. Many communities' governance also relies on a range of complex technical tools. Some communities operate on a pseudodemocratic basis, with nominations and regular elections, while others operate on a consensus model, and still others are effectively autocracies. It is very difficult to know how best to govern an online community, given different community needs, the enormous range of available governance strategies, and the challenge of empirically measuring governance and outcomes. In this dissertation, I conduct research that makes online communities better through data-driven analyses of community values, moderation practices, and experiments with new tools. My work focuses on three important research activities: (1) I characterize communities' values in community members' own words to build a foundational understanding of communities' needs and what 'better' actually means. (2) I assess existing moderation practices and community affordances such as voting at a massive scale across hundreds of thousands of communities in order to identify which practices are most promising. (3) I deploy interventions and best practices in partnership with community leaders to maximize real world impact. Much of my research is conducted on Reddit, one of the largest platforms for online communities, and a platform where I am a longtime moderator of several subreddits, and a member of the Reddit Moderator Council. My dissertation makes several key contributions: My theoretical contributions include the first ever taxonomy of community values, based on the largest-to-date surveys of community members. My methodological contributions include a new method for scalably measuring community outcomes by quantifying how community members talk about their moderators, and a new method for classifying the rules enforced by communities. Finally, I make artifact contributions by publishing classifiers for discussions of moderators and rules, and datasets of anonymized survey results, community rules, and news sharing behavior.

Dissertation

Making Online Communities 'Better': A Taxonomy of Community Values on Reddit

by Zhang, Amy X , Althoff, Tim , Weld, Galen in Community , Taxonomy , Virtual communities

2023

Many researchers studying online communities seek to make them better. However, beyond a small set of widely-held values, such as combating misinformation and abuse, determining what 'better' means can be challenging, as community members may disagree, values may be in conflict, and different communities may have differing preferences as a whole. In this work, we present the first study that elicits values directly from members across a diverse set of communities. We survey 212 members of 627 unique subreddits and ask them to describe their values for their communities in their own words. Through iterative categorization of 1,481 responses, we develop and validate a comprehensive taxonomy of community values, consisting of 29 subcategories within nine top-level categories, enabling principled, quantitative study of community values by researchers. Using our taxonomy, we reframe existing research problems, such as managing influxes of new members, as tensions between different values, and we identify understudied values, such as those regarding content quality and community size. We call for greater attention to vulnerable community members' values, and we make our codebook public for use in future research.

Paper

Reddit Rules and Rulers: Quantifying the Link Between Rules and Perceptions of Governance across Thousands of Communities

by Zhang, Amy X , Leibmann, Leon , Althoff, Tim in Critical components , Datasets , Taxonomy

2025

Rules are a critical component of the functioning of nearly every online community, yet it is challenging for community moderators to make data-driven decisions about what rules to set for their communities. The connection between a community's rules and how its membership feels about its governance is not well understood. In this work, we conduct the largest-to-date analysis of rules on Reddit, collecting a set of 67,545 unique rules across 5,225 communities which collectively account for more than 67% of all content on Reddit. More than just a point-in-time study, our work measures how communities change their rules over a 5+ year period. We develop a method to classify these rules using a taxonomy of 17 key attributes extended from previous work. We assess what types of rules are most prevalent, how rules are phrased, and how they vary across communities of different types. Using a dataset of communities' discussions about their governance, we are the first to identify the rules most strongly associated with positive community perceptions of governance: rules addressing who participates, how content is formatted and tagged, and rules about commercial activities. We conduct a longitudinal study to quantify the impact of adding new rules to communities, finding that after a rule is added, community perceptions of governance immediately improve, yet this effect diminishes after six months. Our results have important implications for platforms, moderators, and researchers. We make our classification model and rules datasets public to support future research on this topic.

Paper

Perceptions of Moderators as a Large-Scale Measure of Online Community Governance

by Zhang, Amy X , Leibmann, Leon , Althoff, Tim in Community , Virtual communities

2025

Millions of online communities are governed by volunteer moderators, who shape their communities by setting and enforcing rules, recruiting additional moderators, and participating in the community themselves. These moderators must regularly make decisions about how to govern, yet measuring the 'success' of governance is complex and nuanced, making it challenging to determine what governance strategies are most successful. Furthermore, prior work has shown that communities have differing values, suggesting that 'one-size-fits-all' approaches to governance are unlikely to serve all communities well. In this work, we assess governance practices on Reddit by classifying the sentiment of community members' public discussion of their own moderators. We label 1.89 million posts and comments made on Reddit over an 18-month period. We relate these perceptions to characteristics of community governance, and to different actions that community moderators take. We identify types of communities where moderators are perceived particularly positively and negatively, and highlight promising strategies for moderator teams. Amongst other findings, we show that strict rule enforcement is linked to more favorable perceptions of moderators of communities dedicated to certain topics, such as news communities, than others. We investigate what kinds of moderators are associated with improved community perceptions upon their addition to a mod team, and find that moderators who are active community members before and during their mod tenures are seen more favorably. We make our models, anonymized datasets, and code public.

Paper

Perceptions of Moderators as a Large-Scale Measure of Online Community Governance

by Zhang, Amy X , Leibmann, Leon , Althoff, Tim in Community , Community participation , Virtual communities

2024

Millions of online communities are governed by volunteer moderators, who shape their communities by setting and enforcing rules, recruiting additional moderators, and participating in the community themselves. These moderators must regularly make decisions about how to govern, yet measuring the 'success' of governance is complex and nuanced, making it challenging to determine what governance strategies are most successful. Furthermore, prior work has shown that communities have differing values, suggesting that 'one-size-fits-all' approaches to governance are unlikely to serve all communities well. In this work, we assess governance practices on Reddit by classifying the sentiment of community members' public discussion of their own moderators. We label 1.89 million posts and comments made on Reddit over an 18-month period. We relate these perceptions to characteristics of community governance, and to different actions that community moderators take. We identify types of communities where moderators are perceived particularly positively and negatively, and highlight promising strategies for moderator teams. Amongst other findings, we show that strict rule enforcement is linked to more favorable perceptions of moderators of communities dedicated to certain topics, such as news communities, than others. We investigate what kinds of moderators are associated with improved community perceptions upon their addition to a mod team, and find that moderators who are active community members before and during their mod tenures are seen more favorably. We make all our models, datasets, and code public.

Paper

What Makes Online Communities 'Better'? Measuring Values, Consensus, and Conflict across Thousands of Subreddits

by Zhang, Amy X , Althoff, Tim , Weld, Galen in Community , Modelling , Virtual communities

2022

Making online social communities 'better' is a challenging undertaking, as online communities are extraordinarily varied in their size, topical focus, and governance. As such, what is valued by one community may not be valued by another. However, community values are challenging to measure as they are rarely explicitly stated. In this work, we measure community values through the first large-scale survey of community values, including 2,769 Reddit users in 2,151 unique subreddits. Through a combination of survey responses and a quantitative analysis of public Reddit data, we characterize how these values vary within and across communities. Amongst other findings, we show that community members disagree about how safe their communities are, that longstanding communities place 30.1% more importance on trustworthiness than newer communities, and that community moderators want their communities to be 56.7% less democratic than non-moderator community members. These findings have important implications, including suggesting that care must be taken to protect vulnerable community members, and that participatory governance strategies may be difficult to implement. Accurate and scalable modeling of community values enables research and governance which is tuned to each community's different values. To this end, we demonstrate that a small number of automatically quantifiable features capture a significant yet limited amount of the variation in values between communities with a ROC AUC of 0.667 on a binary classification task. However, substantial variation remains, and modeling community values remains an important topic for future work. We make our models and data public to inform community design and governance.

Paper

Political Bias and Factualness in News Sharing across more than 100,000 Online Communities

by Althoff, Tim , Weld, Galen , Glenski, Maria in Bias , False information , Links

2022

As civil discourse increasingly takes place online, misinformation and the polarization of news shared in online communities have become ever more relevant concerns with real world harms across our society. Studying online news sharing at scale is challenging due to the massive volume of content which is shared by millions of users across thousands of communities. Therefore, existing research has largely focused on specific communities or specific interventions, such as bans. However, understanding the prevalence and spread of misinformation and polarization more broadly, across thousands of online communities, is critical for the development of governance strategies, interventions, and community design. Here, we conduct the largest study of news sharing on Reddit to date, analyzing more than 550 million links spanning 4 years. We use non-partisan news source ratings from Media Bias/Fact Check to annotate links to news sources with their political bias and factualness. We find that, compared to left-leaning communities, right-leaning communities have 105% more variance in the political bias of their news sources, and more links to relatively-more biased sources, on average. We observe that Reddit users' voting and re-sharing behaviors generally decrease the visibility of extremely biased and low factual content, which receives 20% fewer upvotes and 30% fewer exposures from crossposts than more neutral or more factual content. This suggests that Reddit is more resilient to low factual content than Twitter. We show that extremely biased and low factual content is very concentrated, with 99% of such content being shared in only 0.5% of communities, giving credence to the recent strategy of community-wide bans and quarantines.

Paper

How Conversational Structure and Style Shape Online Community Experiences

by Zhang, Amy X , Pearson, Carl , Kairam, Sanjay in Virtual communities

2025

Sense of Community (SOC) is vital to individual and collective well-being. Although social interactions have moved increasingly online, still little is known about the specific relationships between the nature of these interactions and Sense of Virtual Community (SOVC). This study addresses this gap by exploring how conversational structure and linguistic style predict SOVC in online communities, using a large-scale survey of 2,826 Reddit users across 281 varied subreddits. We develop a hierarchical model to predict self-reported SOVC based on automatically quantifiable and highly generalizable features that are agnostic to community topic and that describe both individual users and entire communities. We identify specific interaction patterns (e.g., reciprocal reply chains, use of prosocial language) associated with stronger communities and identify three primary dimensions of SOVC within Reddit -- Membership & Belonging, Cooperation & Shared Values, and Connection & Influence. This study provides the first quantitative evidence linking patterns of social interaction to SOVC and highlights actionable strategies for fostering stronger community attachment, using an approach that can generalize readily across community topics, languages, and platforms. These insights offer theoretical implications for the study of online communities and practical suggestions for the design of features to help more individuals experience the positive benefits of online community participation.

Paper

Leveraging Community and Author Context to Explain the Performance and Bias of Text-Based Deception Detection Models

by Althoff, Tim , Weld, Galen , Ayton, Ellyn in Context , Deception , Model accuracy

2021

Deceptive news posts shared in online communities can be detected with NLP models, and much recent research has focused on the development of such models. In this work, we use characteristics of online communities and authors -- the context of how and where content is posted -- to explain the performance of a neural network deception detection model and identify sub-populations who are disproportionately affected by model accuracy or failure. We examine who is posting the content, and where the content is posted to. We find that while author characteristics are better predictors of deceptive content than community characteristics, both characteristics are strongly correlated with model performance. Traditional performance metrics such as F1 score may fail to capture poor model performance on isolated sub-populations such as specific authors, and as such, more nuanced evaluation of deception detection models is critical.

Paper

Adjusting for Confounders with Text: Challenges and an Empirical Evaluation Framework for Causal Inference

by West, Peter , Althoff, Tim , Weld, Galen in Digital media , Ground truth , Inference

2022

Causal inference studies using textual social media data can provide actionable insights on human behavior. Making accurate causal inferences with text requires controlling for confounding which could otherwise impart bias. Recently, many different methods for adjusting for confounders have been proposed, and we show that these existing methods disagree with one another on two datasets inspired by previous social media studies. Evaluating causal methods is challenging, as ground truth counterfactuals are almost never available. Presently, no empirical evaluation framework for causal methods using text exists, and as such, practitioners must select their methods without guidance. We contribute the first such framework, which consists of five tasks drawn from real world studies. Our framework enables the evaluation of any causal inference method using text. Across 648 experiments and two datasets, we evaluate every commonly used causal inference method and identify their strengths and weaknesses to inform social media researchers seeking to use such methods, and guide future improvements. We make all tasks, data, and models public to inform applications and encourage additional research.

Paper
