Meeting 15 (2015-07-29)

Page history last edited by Alan Liu 9 years, 10 months ago

Progress to Date (and Future Scheduling)

  • Status Reports (Developer Task Assignments page)
  • Next week (Project Development Calendar)
    • Workshop meeting (Wednesday?)
      • Discussion of other publications we will have begun to explore
      • Discussion of scrubbing issues
      • Discussion of our perception of the data gathered so far after some human reading
    • Planning meeting with Scott Kleinman participating (Friday?) 
      • Discussion of scrubbing and other preprocessing steps
      • Discussion of topic modeling strategy
      • Discussion of ideas for eventual public-facing interface

 


Planning for Collection From Other Publications

  • What are our priorities? (See the Document Sources page and the original WE1S spreadsheet of hand-picked articles on the humanities.)
    • Other U.S. cities (e.g., Washington Post)?
    • Other nations: Canada, Australia, New Zealand, India?
    • Online news media/popular media
    • Social media
    • Middlebrow (e.g., USA Today) vs. highbrow (e.g., New Republic, LA Review of Books)?
    • Economic press (e.g., Forbes, Business Insider, The Economist)?
    • Higher-education press (e.g., Chronicle of Higher Education, Inside Higher Ed)?
    • Campus papers (e.g., Harvard Crimson, Yale Daily News, UCLA Bruin)?
    • Commencement speeches
    • Articles on "sciences"?
    • In future:
      • Political, legislative, and policy publications (Austin Yack to work on this in the fall for an independent study class with Alan)
      • Foundation studies and white papers (e.g., "The Heart of the Matter")
      • Scholarly publications
  • Researching availability and collection workflows for other publications:
    • Sign up on Document Sources page to be "lead researcher" of a publication
    • Check availability (years in full digital form without a paywall)
    • Check method of systematic searching (library database? API? search interface for archives?)
    • Develop a scraper
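
As a starting point for the "develop a scraper" step, here is a minimal sketch in Python of one piece of such a workflow: pulling article URLs out of a saved search-results page. It is not the project's actual scraper. The class and function names, the sample HTML, the example.org address, and the idea of filtering links by a path fragment are all assumptions made for illustration; each publication's search pages will need their own rules.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ArticleLinkParser(HTMLParser):
    """Collect links whose href contains a given path fragment (hypothetical helper)."""

    def __init__(self, base_url, path_fragment):
        super().__init__()
        self.base_url = base_url
        self.path_fragment = path_fragment
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and self.path_fragment in href:
            # Resolve relative links against the page the HTML came from.
            url = urljoin(self.base_url, href)
            if url not in self.links:  # skip duplicates, keep first-seen order
                self.links.append(url)


def extract_article_links(html, base_url, path_fragment):
    """Return the unique article URLs found in one search-results page."""
    parser = ArticleLinkParser(base_url, path_fragment)
    parser.feed(html)
    return parser.links


# Made-up search-results markup standing in for a real publication's page.
SAMPLE_HTML = """
<ul>
  <li><a href="/2015/07/29/humanities-funding.html">Humanities funding</a></li>
  <li><a href="/about">About</a></li>
  <li><a href="https://example.org/2015/07/30/liberal-arts.html">Liberal arts</a></li>
  <li><a href="/2015/07/29/humanities-funding.html">Duplicate link</a></li>
</ul>
"""

if __name__ == "__main__":
    for link in extract_article_links(
        SAMPLE_HTML, "https://example.org/search?q=humanities", "/2015/"
    ):
        print(link)
```

The sketch works on HTML that has already been saved, so the link-extraction rules for a publication can be checked by hand before any fetching is automated. Where a publication offers an API or a library database export, that route is likely preferable to parsing HTML.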

 


Planning for Processing of Scraped Results (& Eventual Topic Modeling)

  • Human "reading" of NYT, WSJ, and Guardian results to date to get a sense of what we have

 

 

 

 
