Meeting (2016-08-26)


Meeting Outcomes: (jump to the notebook added after the meeting, at the bottom of the page)

 

 

Next full-team meeting date(s)? -- some options:

Purpose of the next meeting:

 


 

1. MongoDB/Manifest Development & Total Workflow Development on File Server / MongoDB / Virtual Machine

 

 

2. Corpus Finalization: Metadata -- current projected completion date: Sept. 12

 

 

3. Corpus Finalization: Creating the Corpus of Articles -- can we do so by end of Sept.?

 

4. Generate Initial Topic Models -- by early October?

 

5. Begin Interpreting Topic Models -- by mid-October?

Meeting Outcomes (to-do's in red)

  • Next full-team meeting: Thursday, Sept. 15, 1pm (Pacific). We will brainstorm the topic model interpretation process in that meeting.
  • Corpus Metadata Finalization:
    • RAs to finish current corpus by c. Sept. 12th
      • Nathalie Popa to collect Globe & Mail "liberal arts" articles. (However, exporting the article bodies manually from the sheets is no longer necessary.)
    • Variations among individual RAs' CSV files to be fixed/standardized
    • Jamal to provide Tyler with sample CSVs 
  • Producing the Corpus:
    • Tyler to make a script that does the following: [P.S. Scott indicated at the meeting he had quickly started on this]
      1. Export the article bodies as plain-text files named by the values of the ID column (e.g., "nyt-2012-h-14")
      2. Store them in the appropriate folders in the directory tree on the filestation (or in the MongoDB/Manifest storage system, if it is ready)
    • Lindsay to work on producing a quick-and-dirty "random" corpus on a relatively small scale (pending future decision on whether we need to improve the random corpus).
    • We will use the script to export plain-text files for the whole corpus.
  • Back-end Development:
    • Tyler to continue working on MongoDB/Manifest (and file uploading) in collaboration with Scott. Tyler to document his work/code as appropriate.
    • Tyler to consult with Jeremy (in a meeting?) on ideas for possibly simplifying the WE1S workflow.
    • Jeremy to work on the total WE1S workflow system on the filestation, MongoDB, and virtual machine (as indicated in his diagram).
    • Jeremy to work on the de-duping part of the workflow (as in his diagram)
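
The export script described under "Producing the Corpus" could be sketched roughly as follows. This is only an illustration, not the script Tyler or Scott actually wrote. The body column name ("body"), the function name export_bodies, and the folder layout (source/year, taken from the first two fields of the ID) are all assumptions. The notes specify only that files are named by the ID column (e.g., "nyt-2012-h-14") and stored in appropriate folders.

```python
import csv
from pathlib import Path


def export_bodies(csv_path, out_root, id_col="ID", body_col="body"):
    """Write each row's article body to a plain-text file named by its ID.

    Column names and the folder layout are assumptions for illustration.
    Returns the list of paths written.
    """
    written = []
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            article_id = (row.get(id_col) or "").strip()
            body = row.get(body_col) or ""
            if not article_id or not body.strip():
                continue  # skip rows with no ID or no article text
            parts = article_id.split("-")
            # Assumed layout: the first two ID fields (e.g. "nyt", "2012")
            # become nested folders under out_root.
            if len(parts) >= 2:
                folder = Path(out_root, *parts[:2])
            else:
                folder = Path(out_root)
            folder.mkdir(parents=True, exist_ok=True)
            out_path = folder / f"{article_id}.txt"
            out_path.write_text(body, encoding="utf-8")
            written.append(out_path)
    return written
```

For example, export_bodies("sample.csv", "corpus") on a sheet containing the ID "nyt-2012-h-14" would write corpus/nyt/2012/nyt-2012-h-14.txt. Rows with an empty ID or an empty body are skipped rather than exported, which may or may not be the behavior the team wants once the RAs' CSV files are standardized.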