By the D/ARC Leadership Team:
Rachel Berryman, Daniel G. Heslep, Celeste Oon, Kira Bohunicky, and PB Berge
⭐ Introduction
Discord is a platform quite unlike those that have recently dominated internet research, and its emergence has raised crucial questions about the ethics and practice of researching the platform. Studying—and, particularly, scraping—data on Discord is a lot blurrier than on other platforms, in no small part because the line between “private” and “public” is much less certain. How does one determine what information is “private” or “public” on a popular Discord server? By the state of its invitation link? By whether it’s listed elsewhere? Is scraping data from Discord allowed?
These challenges are multiplied by the fact that Discord research is an increasingly multidisciplinary venture. In the D/ARC, we have members from a vast range of disciplines: not only media and technology scholars, but also librarians, web developers, and cybersecurity specialists. The platform also has significant value to fields such as health, agriculture, and biology, which do not conventionally fit under the umbrella of “internet research.” Moreover, Discord is increasingly used to conduct research through online interviews and participant surveys. These developments make conversations about the ethics of Discord research urgent. More than ever, researchers need a framework that enables them to approach their projects reflexively and with transparency, doing their scholarly due diligence to ensure their research is conducted safely, respectfully, and ethically.
We recognize that every project is different, which is why we’ve put together this guide. While this guide seeks to address some of the big questions around the ethics of Discord research, we follow Annette Markham’s provocation to view ethics as an ongoing process that must be constantly negotiated and applied throughout research, not just at the beginning of study design or to receive clearance from an ethics review board (IRB). Thus, these resources are neither exhaustive nor definitive, but we hope the links below will encourage you to pause and reflect (and reflect again) on how your project can best avoid harm, while still illuminating Discord’s rich cultures, vernaculars, structures, affordances, and networks.
How to Use This Guide:
While Discord remains understudied and presents great opportunities for researchers to gather data, the unique dynamics of the platform pose pressing conundrums for internet researchers. Rather than put forward specific solutions, this guide is inquiry-driven: what are the questions that should guide your research? Answering these questions will allow you to attend to the considerations most specific to your project. The standard fare of many review boards is “are these communities private?”, but Discord research is necessarily more complicated and therefore requires more direct questions.
To these ends, this guide is divided into three parts:
- We list crucial questions that researchers should ask as they develop their ethics frameworks.
- We list foundational resources for internet research for scholars coming from other disciplines.
- We list resources for drafting, iterating, and revising ethical frameworks for internet studies.
Finally, this is a living document, and we encourage those with additional resources or experience to message us so that we can include them here. And, if you are working through a particularly difficult, complex, or sensitive topic, we encourage you to consider asking other members of the D/ARC: there are many of us thinking through these same ideas!
💬 Scraping Discord Servers
One of the questions we’re most frequently asked in the D/ARC is whether, and how, researchers are allowed or able to scrape information from Discord. Notably, scraping channels or servers on Discord does technically violate Discord’s Terms of Service (TOS). It’s not illegal, per se, nor is it necessarily unethical within a research purview, but some research governing bodies will want the violation addressed, and researchers should be wary of scraping information from personal accounts. With this said, there are multiple kinds of information one can scrape from Discord: message text from a channel or DM, profile information (such as profile description, roles, and profile pictures), and server information (such as server name, description, and active/total users).
When scraping information in this way, it’s insufficient to consider how “public” the server is (after all, every Discord server technically requires an invitation link to join). Instead, researchers should attend to the following considerations:
- Does this server advertise itself on third-party boards, websites, or social media?
- Does the server require specialized invitations to join (i.e., one-to-one invites) or does it have a permanent invitation link (such as a vanity URL)?
- Is the server a designated Community Server or Verified Server? If so, is the server “discoverable” (appearing in Discord’s “Discover” browser) or is it unlisted, only available by direct link?
- Is this server set up specifically to suit the interests of a marginalized group or young people?
- What is the stated purpose of the server? Does the server make any declarations of its audience?
- Does this server have an active community management or moderation team that manages messages?
- What is the size and activity of the server (and the scraped channels)? Are there five people talking here? Tens of thousands? How often are the same users contributing to the conversation?
- What is the level of intimacy expected between users on this channel? How familiar are they with each other? Are they persistently online with one another (moving between text, voice, and video)? Do they expect that what they say here will only be seen/heard by a select few users?
- What are the posting restrictions on this server and in this channel? How are server and channel-specific roles shaping who is and isn’t able to post?
- What is bot activity like on this server? How much are users interacting with or relying on bots?
- Is this server performing a specific function on behalf of a content creator, company, or other entity? (E.g., a game server hosted by the publisher’s official community management team is very different from a playgroup started by friends.)
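The questions above about size and activity (how many people are talking, and how often the same users contribute) can be checked against an export before any analysis begins. The following is a minimal sketch, not a prescribed method: the function name `contribution_summary`, the `top_n` parameter, and the sample data are all hypothetical, and it assumes you already hold a list with one author identifier per message.

```python
from collections import Counter


def contribution_summary(authors, top_n=3):
    """Summarize how concentrated posting is among a channel's authors.

    `authors` is assumed to be a list with one author identifier per message.
    """
    counts = Counter(authors)
    total = sum(counts.values())
    # Messages written by the `top_n` most active authors.
    top_messages = sum(n for _, n in counts.most_common(top_n))
    top_share = top_messages / total if total else 0.0
    return {
        "messages": total,
        "distinct_authors": len(counts),
        "top_share": round(top_share, 2),
    }


# Hypothetical sample: eight messages written by four authors.
sample = ["a", "a", "a", "b", "b", "c", "d", "a"]
summary = contribution_summary(sample, top_n=2)
print(summary)
```

A high `top_share` in a small channel suggests a handful of people talking to one another, which may call for more caution than the same figure in a channel with tens of thousands of participants.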
Conducting ethnographic research on Discord servers is especially complicated, as there are even more considerations required regarding one’s approach and the limitations of the study. In addition to the questions listed above, ethnographic researchers should consider:
- Have users provided their informed consent to the study? Likewise, do users have the option to opt out or protect certain messages from collection? (Message-exporting bots and scrapers, for example, can be configured to avoid exporting messages with a certain emoji or phrase.)
- Are you conducting your study server-wide, or is it limited to particular channels, roles, or users?
- What type(s) of data do you intend to use in your study? (E.g. user messages, images, video, voice data, profile data, server data, etc.)
- Are users aware their data may be utilized in research?
- What level of confidentiality will be employed in the study? Will users be identified by name or username, or pseudonymized/anonymized? Will users have input on how they appear in the study?
- What is your relationship to the users in the server? What level of trust do you have with them? Consider how your positionality affects the dynamic between researcher and subject.
- Will you provide the results of the study to the users whose data you collected? Will users benefit from your study?
- Are you providing compensation to the users in your study?
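Two of the questions above, opting out of collection and pseudonymizing users, can be partly handled when preparing exported data. The sketch below is illustrative only and rests on assumptions: the opt-out markers, the message fields (`author_id`, `content`), and every function name are hypothetical, and it assumes the marker appears in the message text itself rather than as a reaction.

```python
import hashlib
import hmac

# Hypothetical opt-out markers; agree on the real ones with the community.
OPT_OUT_MARKERS = ("🚫", "#noresearch")


def is_opted_out(content):
    """Return True if the message text contains any opt-out marker."""
    return any(marker in content for marker in OPT_OUT_MARKERS)


def pseudonym(author_id, secret_key):
    """Derive a stable pseudonym from an author ID using a keyed hash (HMAC-SHA256)."""
    digest = hmac.new(secret_key, author_id.encode("utf-8"), hashlib.sha256)
    return "user_" + digest.hexdigest()[:10]


def prepare_for_analysis(messages, secret_key):
    """Drop opted-out messages and replace author IDs with pseudonyms."""
    kept = []
    for msg in messages:
        if is_opted_out(msg["content"]):
            continue  # respect the opt-out: never store this message
        kept.append({
            "author": pseudonym(msg["author_id"], secret_key),
            "content": msg["content"],
        })
    return kept


# Hypothetical exported messages.
messages = [
    {"author_id": "1001", "content": "Anyone up for a raid tonight?"},
    {"author_id": "1002", "content": "Rough week, venting here 🚫"},
    {"author_id": "1001", "content": "Starting at 8."},
]
cleaned = prepare_for_analysis(messages, secret_key=b"keep-this-key-offline")
print(len(cleaned))
```

Keyed hashing replaces identifiers only. It does not remove identifying details from the message text, so quoted messages may still need paraphrasing or redaction.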
While there are many variables to consider when understanding publicity and privacy on Discord, keeping the above questions in mind can help you develop an ethical framework specific to your project. One might think scraping a “Verified Server” is the same as scraping something public, yet many Verified Servers (especially those related to gaming) are among the most popular for teenagers. Likewise, servers might have much of their activity limited to role-restricted channels. As you develop an ethics framework for your project, make sure you know and understand the dynamics of the community you’re investigating.
📏 The Rules as Written
Discord Terms & Conditions:
The legal terms and conditions that all Discord users agree to can be found here.
Discord Community Guidelines:
Discord’s official policies for user behavior, content rules, and many of the official guidelines are available here.
⚙️ Bots, Tools, and Technical Resources
- Discord Update Archival Repository: https://github.com/Discord-Datamining/Discord-Datamining
Documents and archives all build changes to the Discord client.
- Discord Chat Exporter: https://github.com/Tyrrrz/DiscordChatExporter
A tool for exporting the chat message history from a Discord channel.
- The Disboard Scraper & Analysis Notebook: https://darcmode.org/scraper
A toolkit developed by a D/ARC moderator for scraping Discord server information from third-party site Disboard. (Note: Scraping high volumes of data can result in the process getting blocked by the website servers).
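As the note above says, high request volumes can get a scraper blocked. One common mitigation is to pause between requests. The helper below is a generic sketch and not part of the D/ARC toolkit: `fetch_politely`, its parameters, and the two-second default are assumptions, and `fetch` stands in for whatever function you use to retrieve a page.

```python
import time


def fetch_politely(urls, fetch, delay_seconds=2.0, sleep=time.sleep):
    """Call fetch(url) for each URL, pausing between consecutive requests.

    `fetch` is any callable that retrieves one page; `sleep` can be swapped
    out in tests so that nothing actually waits.
    """
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay_seconds)  # pause before every request after the first
        results.append(fetch(url))
    return results


# Demonstration with a stand-in fetch function and a recorded (not real) sleep.
pauses = []
pages = fetch_politely(
    ["page-1", "page-2", "page-3"],
    fetch=str.upper,
    delay_seconds=1.5,
    sleep=pauses.append,
)
print(pages, pauses)
```

The right delay depends on the site. Check its terms and any published rate limits rather than relying on the default shown here.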
🌐 Internet Research 101
- franzke, aline shakti, Bechmann, A., Zimmer, M., Ess, C. M., & the Association of Internet Researchers. (2020). Internet Research: Ethical Guidelines 3.0. The Association of Internet Researchers (AoIR). https://aoir.org/reports/ethics3.pdf
The Association of Internet Researchers actively maintains an ethics guidelines document that, while not specific to Discord, raises important questions around public and private data, informed consent, anonymity, etc.
- D’Ignazio, C., & Klein, L. F. (2020). Data Feminism. MIT Press.
D’Ignazio and Klein utilize an intersectional feminist lens to examine data science and collection. They encourage researchers to consider how data contains biases and reflects the power structures of the societal contexts in which it is produced. Hence, they provide frameworks to engage with and collect data critically, and empower marginalized communities in the process. This book is useful for anyone conducting research, especially those who frequently work with quantitative methods.
- Costanza-Chock, S. (2020). Design Justice: Community-Led Practices to Build the Worlds We Need. MIT Press.
This book is an excellent resource for research design that prompts researchers to center the voices of those they work with, to create a project that speaks with communities rather than for or to them. To this end, Costanza-Chock proposes a framework for “design justice” that creates an inclusive environment and minimizes harm towards marginalized and intersectional groups. This is a must-read for anyone designing a human-centered project, but is especially pertinent to those working directly with marginalized individuals through interviews and ethnography.
🧰 Resources for Developing Ethical Frameworks
🛠️ Resources for building ethics frameworks in social media studies:
- boyd, danah, & Crawford, K. (2012). Critical Questions for Big Data. Information, Communication & Society, 15(5), 662–679. https://doi.org/10.1080/1369118X.2012.678878
- Markham, A. N., Tiidenberg, K., & Herman, A. (2018). Ethics as Methods: Doing Ethics in the Era of Big Data Research — Introduction. Social Media + Society, 4(3). https://doi.org/10.1177/2056305118784502
- Hargittai, E. (2020). Potential Biases in Big Data: Omitted Voices on Social Media. Social Science Computer Review, 38(1), 10–24.
- Fiesler, C., & Proferes, N. (2018). “Participant” Perceptions of Twitter Research Ethics. Social Media + Society, 4(1). https://doi.org/10.1177/2056305118763366
🗨️ Research on Adjacent Chat Applications (Telegram, WhatsApp, etc.):
- Barbosa, S., & Milan, S. (2019). Do Not Harm in Private Chat Apps: Ethical Issues for Research on and with WhatsApp. Westminster Papers in Communication and Culture, 14(1), 49–65. https://doi.org/10.16997/wpcc.313
- Hoseini, M., Melo, P., Júnior, M., Benevenuto, F., Chandrasekaran, B., Feldmann, A., & Zannettou, S. (2020). Demystifying the Messaging Platforms’ Ecosystem Through the Lens of Twitter. IMC ’20: Proceedings of the ACM Internet Measurement Conference, 345–359. https://doi.org/10.1145/3419394.3423651
🎤 Research on Interviewing Discord Users:
- Jiang, J. “Aaron,” Kiene, C., Middler, S., Brubaker, J. R., & Fiesler, C. (2019). Moderation Challenges in Voice-based Online Communities on Discord. Proceedings of the ACM on Human-Computer Interaction, 3(CSCW), 1–23. https://doi.org/10.1145/3359157
- Tuck, H., Guhl, J., Smirnova, J., Gerster, L., & Marsh, O. (2023). Researching the evolving online ecosystem: Telegram, Discord and Odysee.
📚 Further Examples of Discord Research
Note that the D/ARC Zotero Library actively collects and curates recently-published work on / relevant to Discord. You can find plenty of existing studies, tagged and grouped by genre, at https://darcmode.org/zotero.