Corpus Design and Research Group Meeting (2018-02-09)

Page history last edited by Alan Liu 6 years, 7 months ago

 Meeting Outcomes:

(jump to notebook added after meeting at bottom of page)


Meeting time: Tuesday, February 9, 2018, 12-1pm Pacific
Meeting Zoom URL: https://ucsb.zoom.us/j/483937448
  • PIs: Alan Liu; co-PIs: Jeremy Douglass, Scott Kleinman, Lindsay Thomas
  • Project Manager: Samina Ali




Corpus Design & Research Group

  •  UCSB Graduate Student RAs (currently focusing on corpus design & collection strategy)
    • Rebecca Baker (English)
    • Nazanin Keynejad (Comp. Lit.)
    • Giorgina Paiella (English), WE1S Analyses & Reports Editor
    • Aili Peeker (English)
    • Jamal Russell (English)
    • Tyler Shoemaker (English)


  •  U. Miami RAs:
    • Samina Ali (English, U. Miami), WE1S Project Manager
    • Tarika Sankar (English, U. Miami)
    • Annie Schmalstig (English, U. Miami)
  •  CSUN RA: Sandra Fernandez 

Next meeting (?):  



Text-Analysis Hacker Group

  • UCSB Faculty
    • Fermín Moscoso del Prado Martín, Assistant Professor of Linguistics, UC Santa Barbara
  • UCSB Graduate Student RAs
    • Sandra Auderset (Linguistics, UCSB)
    • Devin Cornell (Sociology, UCSB)
    • Nicholas Lester (Linguistics, UCSB)
    • Fabian Offert (Media Arts & Technology, UCSB)
    • Teddy Roland (English, UCSB)
    • Chloe Willis (Linguistics, UCSB) 
  • Other Participants
    • Ryan Heuser (WE1S Advisory Board member; Ph.D. student at Stanford U.)

Next meeting (?):  



Preliminary Business

  • Name change for our RA group
  • UCSB Kronos due dates (KRONOS Timecard Due dates for APPROVAL for both Student Employee.pdf) -- Next deadlines for RA approval of their hours (prior to Faculty Manager approval):
    • Biweekly RAs: Feb. 10 
    • Monthly RAs: Feb. 15
    • Set up Kronos due dates on a shared Google calendar? 
  • Applying what we learned in the Markdown/GitHub workshop 
  • Work of the Text-Analysis Hacker Group
    • Word embedding group
    • Linguistics group
    • Machine group


1. Template for Area Focus Reports


  • Discussion of report template created by Giorgina
    • template_for_collection_area_of_focus_report (Word and Markdown versions on WE1S Github repo under "Resources")
  • For future: identification of specific areas of focus reports that merit working up into full-scale research reports (like those created last summer)



2. Search/Download Workflows




3. Corpus Design


  • Updates on RA individual tasks (see tasks defined in "outcomes" from our last meeting) 
  • New developments and Issues:
    • New "regions" column in the spreadsheet
    • Press Freedom Ranking
    • political bias assessment 
  • Need for a meeting to evaluate "representativeness"?


 WE1S Corpus Collection Form

WE1S Corpus Collection List Form  

WE1S Corpus Collection List (current)
  Deprecated version of corpus collection list  
Trello Board for current tasks  
Areas of focus




4. Publishing the "Reports" from Summer 2017 RA work in Markdown > GitHub > WordPress


  • Giorgina to be our pathfinder for this process? 




5. Next Steps

  • Individual tasks (review and roundup of tasks that surfaced during this meeting)
  • Next Meetings?
    • "Representativeness" critique meeting 
    • Search/download workflow practicum 
    • Intro to the WE1S Virtual Workflow Manager/Manifest system 




Meeting Outcomes


* All RAs

--continue individual collection research

--search/download practice and keep notes

--begin using the Areas of Focus report form, and make suggestions to Giorgina via Ryver forum

--upload individual Areas of Focus report forms to a Github repo for that (repo to be set up by Samina)


* Samina:

--set up Github repo for research > reports > 

--set up Pbworks page for subcorpora wishlist

--document workflow for Lexis Nexis (with Tyler and Rebecca)

--document workflow for Newsbank

--input Spanish-language Caribbean NPs into Google Form


* Giorgina:

--revise Areas of Focus report template form: start a Ryver forum to discuss edits and tweaks with all RAs

--follow up on Italian newspapers

--"editor" status: work on reports from the summer



--compile a list of Youtube tutorials on databases we use for the project (ProQuest, Nexis Uni, etc.)



--will now add "Asia" to her collection list, and will discuss regional distinctions with Tyler and Giorgina re: Middle East



--will add bilingual sources into the Google form twice; once for a Spanish publication, and again for the English-language equivalent



--will start including French news sources in her collection; for now they will be added to Trello (until we can determine the best way to collect and narrow down these sources)



--will include "news portals" in her collection



--will look into obtaining transcripts from right-wing radio shows


* Alan

--contact Francis Steen


* Lindsay and UM team:

--design representativeness critique meeting

--contact Ryan Cordell

--experiment with using Pandoc to convert NewsBank PDFs


* All PIs

--think ahead to scheduling next meetings for:

--representativeness critique

--search/download workshop

--duplication issues in collection

--thinking about subcorpora

--intro to WE1S Virtual Workspace Manager/Manifest platform



