
Scoping Project Home Page

Page history last edited by Alan Liu 6 years, 9 months ago

This page serves as the home page for the WE1S scoping project, which researches strategies and resources for expanding the WE1S corpus of materials. (Last revised 7/12/17)

I. The Scoping Problem

A. Statement of Corpus Expansion Plans

[From grant proposal]: WE1S plans to devote research at the beginning of its timeline to determine which specific sources to target in these areas that will be most representative and useful for the project's goals. While the criteria for representativeness and usefulness will evolve iteratively as the project team begins its research on potential sources.... [Go to full statement on corpus expansion plans]

B. Statement of Plan for "Scoping Statement" at End of Project

[From grant proposal]: Finish collection work for the WE1S main corpus and sub-corpora, and create a "scoping statement" for the collection. Activities related to collecting and ingesting materials as datasets will be completed near the beginning of year 3 so that WE1S can concentrate on analysis and dissemination work. PI Liu and co-PI Thomas, with the assistance of RAs at their campuses and also in consultation with other co-PIs and postdocs, will take the lead in writing a scoping statement describing the nature, selection criteria, and organization of the project's gathered materials (with their associated manifests providing metadata on provenance and workflow) so that WE1S's public, humanities scholar and administrator, and digital humanities audiences will be able to understand what was gathered for study.

II. Seed Resources for Beginning to Think About the Scoping Problem

Katherine Bode, "The Equivalence of “Close” and “Distant” Reading; or, Toward a New Object for Data-Rich Literary History," Modern Language Quarterly 78:1 (2017): 77-106. (paywalled article) (open-access preprint). The beginning of Bode's article is a severe critique of "distant reading" in the digital humanities for naive or non-transparent understandings of the corpora on which its studies are based. The part to focus on for our purposes runs from p. 95 to the end. Here, Bode articulates the idea of a corpus gathered for DH analysis that would be a "scholarly edition of a literary system" (i.e., a scholarly edition not just of a work but of a whole corpus of works).
Essentially, WE1S wants to scope its selection of resources as a "scholarly edition of a media [not Bode's literary] system." WE1S will want to explain the rationale for its collected materials in some manner like that articulated by Bode, though with attention to contemporary media "impact."
Anya Schiffrin and Ethan Zuckerman, "Can We Measure Media Impact? Surveying the Field," Stanford Social Innovation Review, Fall 2015. This overview of current approaches to assessing the "impact" of media sets out categories of assessment. Consensus on those categories, and the tools and data needed to implement them, have clearly not arrived. Nevertheless, just knowing what categories to consider might be useful for WE1S.

III. Paradigms to Investigate

"The Edition" (textual editing theory) - Jamal

Jerome McGann on “philology in a new key” (in his A New Republic of Letters)

See Katherine Bode's review of McGann's book

Katherine Bode's notion of a "scholarly edition of a literary system"

"The Archive" (archival theory) -- Alanna

Kate Theimer, "Archives in Context and as Context" (2012)
Jefferson Bailey, "Disrespect des Fonds: Rethinking Arrangement and Description in Born-Digital Archives," Archive Journal 3 (Summer 2013).
Mark Algee-Hewitt, Sarah Allison, Marissa Gemma, Ryan Heuser, Franco Moretti, and Hannah Walser. "Canon/Archive. Large-scale Dynamics in the Literary Field." Stanford Literary Lab Pamphlet 11. January 2016. https://litlab.stanford.edu/pamphlets/.

"The Canon" - Giorgina (update of July 5, 2017)

John Guillory, "Canonical and Non-canonical: A Critique of the Current Debate," ELH 54 (1987): 483-527
John Guillory, Cultural Capital: The Problem of Literary Canon Formation (Chicago: University of Chicago Press, 1993)
Mark Algee-Hewitt and Mark McGurl. "Between Canon and Corpus: Six Perspectives on 20th-Century Novels." Stanford Literary Lab Pamphlet 8. January 2015. Accessed June 14, 2017. https://litlab.stanford.edu/pamphlets/.

See also Mark Algee-Hewitt, et al. "Canon/Archive. Large-scale Dynamics in the Literary Field." Stanford Literary Lab Pamphlet 11. January 2016. https://litlab.stanford.edu/pamphlets/.

Alan Liu and Laura Mandell, "Description" of 1996 MLA session on "The Canon and the Web"

Corpus Linguistics - Teddy (update of July 6, 2017)

Nadja Nesselhauf, Corpus Linguistics - A Practical Introduction (2005/2001)
Mark Davies's Brigham Young University corpora collection: corpus.byu.edu

Media Impact - Ryan, Sage (Ryan's draft report)

MediaCloud ("An open-source platform for studying media ecosystems")

"Humanities" topic created by Alan

New vs. Old Media

And impact of New Media on reading of old media

Newspaper Studies - Ryan, Sage (Sage's update of July 7, 2017)

Newspaper chains and companies (how do we factor current ownership patterns and corporate media conglomerates into "representativeness"?)
Wikipedia

"Newspaper Circulation" (includes sources for data)
"List of newspapers in the United States"

Pew Research Center

"How People Learn About Their Local Community" (2011 report), includes "Part 3: The Role of Newspapers (Perceptions of the importance of local newspapers)"
"Newspapers: Fact Sheet" (2016)

Newspaper Research Journal
Forbes Magazine, "The Most Influential News Orgs, According to Google" (2011)
Related topic: Newspapers' Use of Analytics

Federica Cherubini and Rasmus Kleis Nielsen, "Editorial Analytics: How News Media are Developing and Using Audience Data and Metrics" (Reuters Institute for the Study of Journalism, 2016)

IV. Format for Reports on Paradigms (reports to be assigned to RAs or teams of RAs)

(Cf. the "Research Reports" for the Transliteracies Project)

Template for Scoping Research Reports

V. Criteria and Principles to Consider in Scoping (will evolve as scoping research continues)

"Representativeness"

Regional, national, local geographical representativeness
Political representativeness
High/middle/low-brow representativeness
Social representativeness

Circulation

"Impact"

Most cited
Most referred to from social media
Most influential

From Lindsay's email to Alan, 6/14/17:
"The Schiffrin and Zuckerman piece also seems helpful in terms of how it breaks down the conceptual category of “impact”: this includes not only reach, or circulation/traffic, but also the more-difficult-to-measure concepts of “influence” and “impact.” Impact seems very difficult to qualify or quantify, and I wonder how important it is for our research goals. If, for example, there was a highly influential article about the humanities — one cited or referred to many times by other articles — published in an outlet that isn’t eventually included in our corpus, is this important to us? Would not including this particular article in our corpus affect our corpus’s “representativeness” in a substantial way? My initial instinct is that it wouldn’t, not necessarily, since the decisions we are making about “representativeness” are being made at the level of the publication or outlet, not the individual article (i.e., "does this particular publication represent 'US public discourse' in some substantial way?" not “is this individual article representative of ‘US public discourse about the humanities’ in some substantial way?”). But perhaps this is the wrong scale to be thinking at? I’m not sure."

Syndication & Reprinting

See Ryan Cordell, "Reprinting, Circulation, and the Network Author in Antebellum Newspapers," American Literary History 27.3 (2015): 417-445.
Katherine Bode, "Fictional Systems: Network Analysis and Syndication Networks." Chapter 5 in her A World of Fiction: Mass-digitization, Nineteenth-century Australian Newspapers, and the Future of Literary History. Manuscript, 2017.

Chronology

Diachronic criteria for "representativeness"? (i.e., what is representative now, as opposed to 20 years ago?)
How to prioritize materials chronologically? (e.g., first collect the last five years, and then stage work so that we collect materials in batches going back in time?)
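
The staged, backward-in-time collection floated above could be planned with something like the following minimal sketch. The function name, the five-year batch size, and the example year range (2017 back to 2000) are illustrative assumptions, not decisions the project has made.

```python
def backward_batches(latest_year, earliest_year, batch_years=5):
    """Return (start_year, end_year) ranges, most recent batch first.

    All parameters are hypothetical planning inputs, not WE1S decisions.
    """
    batches = []
    end = latest_year
    while end >= earliest_year:
        # Each batch covers at most `batch_years` years; the oldest may be shorter.
        start = max(earliest_year, end - batch_years + 1)
        batches.append((start, end))
        end = start - 1
    return batches


# Example: collect 2013-2017 first, then step back in five-year stages.
for start, end in backward_batches(2017, 2000):
    print(f"Collect {start}-{end}")
```

Collecting the most recent batch first would let analysis begin on current materials while older batches are still being gathered.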

IV.a Scoping Research Tracking Sheet

Scoping Research Google Spreadsheet

IV.b Scoping Research Graph?

Lindsay's idea (in email to Alan of 6/14/17):
"One thing that we might want to think more about are figures 3.1-3.3 from the “Canon/Archive” Lit Lab pamphlet you link to under Part IV of the Scoping Project Home Page (pg 4 from the pamphlet, https://litlab.stanford.edu/LiteraryLabPamphlet11.pdf). These figures describe a map of the “literary field,” which is based on Bourdieu’s famous diagram of the 19th-century French literary field. The x-axis represents popularity, and the y-axis represents prestige. I wonder if we could come up with a similar model for newspapers. For instance, the popularity axis could represent some combination of circulation figures and web traffic figures, and the prestige axis might include some metric of how “influential” certain publications are. For example, we could adopt/adapt Nate Silver’s methodology for calculating the “influence” of news outlets based on a representative (? this is arguable, I suppose) sample of their citation metrics (this is from the Forbes article you link to in Part IV of the Scoping Project Home Page: https://www.forbes.com/sites/jeffbercovici/2011/03/25/the-most-influential-news-orgs-according-to-google/#5d05b62541ae; Silver’s explanation of the method is here: https://fivethirtyeight.blogs.nytimes.com/2011/03/24/a-note-to-our-readers-on-the-times-pay-model-and-the-economics-of-reporting/?scp=2&sq=nate%20silver&st=cse&_r=0). We could then choose publications from this field that maximize both popularity and prestige, and that are technically feasible to scrape."

"One major problem with this approach, of course, is that it’s likely to miss some sources that we’ve already identified as high value for our particular research questions, like publications found in the Ethnic Newswatch database. We would need to come up with additional criteria describing the inclusion of these sources, and I’m not sure what such criteria would be. One idea would be to repeat a similar experiment to the one detailed above, but only including sources listed in databases like Ethnic Newswatch."

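Lindsay's popularity/prestige idea above could be prototyped roughly as in the sketch below. Everything in it is an assumption made for illustration: the outlet names and numbers are invented placeholders, the equal weighting of circulation and web traffic is arbitrary, and reading "maximize both popularity and prestige" as "keep the scrapable outlets that no other scrapable outlet beats on both axes" (a Pareto front) is only one possible interpretation. The `influence` field stands in for whatever citation-based metric is eventually adopted.

```python
from dataclasses import dataclass


@dataclass
class Outlet:
    name: str
    circulation: float   # hypothetical: average print circulation
    web_traffic: float   # hypothetical: monthly unique visitors
    influence: float     # hypothetical: citation-based influence score
    scrapable: bool      # is collection technically feasible?


def min_max(values):
    """Rescale values to the range 0-1 so different units can be combined."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def score(outlets, circulation_weight=0.5):
    """Return (outlet, popularity, prestige) triples on 0-1 scales."""
    circ = min_max([o.circulation for o in outlets])
    web = min_max([o.web_traffic for o in outlets])
    prestige = min_max([o.influence for o in outlets])
    scored = []
    for o, c, w, p in zip(outlets, circ, web, prestige):
        popularity = circulation_weight * c + (1 - circulation_weight) * w
        scored.append((o, popularity, p))
    return scored


def pareto_front(scored):
    """Names of scrapable outlets not beaten on both axes by another scrapable outlet."""
    feasible = [s for s in scored if s[0].scrapable]
    front = []
    for o, pop, pres in feasible:
        dominated = any(
            pop2 >= pop and pres2 >= pres and (pop2 > pop or pres2 > pres)
            for _, pop2, pres2 in feasible
        )
        if not dominated:
            front.append(o.name)
    return front


# Invented placeholder data, for illustration only.
outlets = [
    Outlet("Outlet A", circulation=1000, web_traffic=500, influence=90, scrapable=True),
    Outlet("Outlet B", circulation=800, web_traffic=900, influence=40, scrapable=True),
    Outlet("Outlet C", circulation=200, web_traffic=100, influence=10, scrapable=True),
    Outlet("Outlet D", circulation=1200, web_traffic=1000, influence=95, scrapable=False),
]

print(pareto_front(score(outlets)))
```

A sketch like this would not address the second concern in the email: sources such as those in Ethnic Newswatch would still need their own inclusion criteria, or a separate run restricted to those databases.
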
Graphs from Mark Algee-Hewitt, Sarah Allison, Marissa Gemma, Ryan Heuser, Franco Moretti, and Hannah Walser. "Canon/Archive. Large-scale Dynamics in the Literary Field." Stanford Literary Lab Pamphlet 11. January 2016. https://litlab.stanford.edu/pamphlets/:

Stanford Lit Lab Pamphlet 11, Figure 3.1

Stanford Lit Lab Pamphlet 11, Figure 3.2

Stanford Lit Lab Pamphlet 11, Figure 3.3
