Planning Document for Topic Modeling

Last edited by Alan Liu 9 years, 5 months ago
Last updated: Oct. 22, 2014

 

Research Method

Topic Modeling

 

    Introductions to the Idea of Topic Modeling

 

    Topic Modeling Tools (see DH Toychest for fuller list)
  • MALLET (command-line version; download and install in a folder called "mallet" directly in the root directory of your computer)
  • Java GUI version of Mallet (aka "GUI Topic Modeling Tool"; download the .jar file and run from your computer)
  • LDAvis ("R package for interactive topic model visualization") (example of use)
  • MALLET-to-Gephi Data Stacker (online tool that takes "the '--output-doc-topics' output from MALLET and reorganize it into a format that Gephi understands")
  • [for other tools related to text-preparation and workflow for topic modeling, see below]
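As a rough illustration of the kind of reshaping the MALLET-to-Gephi Data Stacker description refers to, the following Python sketch turns doc-topics rows into a document-to-topic edge list. It is not the Data Stacker's own code. It assumes the doc-topics layout in which each row holds a document index, a document name, and then alternating topic/proportion pairs (newer MALLET versions may write one proportion per topic instead), and that document names contain no whitespace. The function names, the `topic_` label prefix, and the 0.1 cutoff are hypothetical choices made for the example.

```python
import csv
import io


def doc_topics_to_edges(lines, min_weight=0.1):
    """Turn doc-topics rows into (source, target, weight) edge tuples.

    Assumed row layout (whitespace-separated):
        doc_index  doc_name  topic  proportion  topic  proportion ...
    """
    edges = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):  # skip blank lines and the header row
            continue
        fields = line.split()
        doc_name = fields[1]
        pairs = fields[2:]
        # Walk the alternating topic/proportion pairs.
        for i in range(0, len(pairs) - 1, 2):
            topic, weight = pairs[i], float(pairs[i + 1])
            if weight >= min_weight:  # drop weak document-topic links
                edges.append((doc_name, f"topic_{topic}", weight))
    return edges


def edges_to_csv(edges):
    """Serialize edges as CSV text with Source/Target/Weight column headers."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    writer.writerows(edges)
    return buffer.getvalue()
```

Writing the returned text to a `.csv` file should give an edge table that Gephi's spreadsheet import can read, with each document linked to the topics that pass the cutoff.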

 

   Tutorials for MALLET Topic Modeling

 

    Initial Proof of Concept for small sample of documents from WhatEvery1Says corpus

 

Topic Model Runs

 

Participants in the project should all perform experimental topic modeling runs on samples from the corpus as they work on particular components of the text-preparation processes outlined above. Ideally, there will be an iterative relationship between tweaks to those processes and tweaks to the topic modeling.

 

  • Log of Topic Model Runs (and MALLET commands): collective log of our topic model runs with a record by date of each run and the MALLET commands used, plus commentary on or links to the results (in progress)
    • Alan's topic model run of 1 May 2014 (using MALLET): results
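One possible shape for an entry in such a log, sketched in Python: the code below builds the two MALLET command lines for a run (`import-dir`, then `train-topics`) and formats them as a dated log entry. The subcommands and flags are standard MALLET options, but the file names, the `bin/mallet` path, and the function names are hypothetical placeholders, not the project's actual settings.

```python
import datetime
import shlex


def mallet_commands(corpus_dir, num_topics, mallet_bin="bin/mallet"):
    """Build the import and training commands for one run (placeholder file names)."""
    import_cmd = [
        mallet_bin, "import-dir",
        "--input", corpus_dir,
        "--output", "corpus.mallet",
        "--keep-sequence",
        "--remove-stopwords",
    ]
    train_cmd = [
        mallet_bin, "train-topics",
        "--input", "corpus.mallet",
        "--num-topics", str(num_topics),
        "--output-topic-keys", "topic_keys.txt",
        "--output-doc-topics", "doc_topics.txt",
    ]
    return [import_cmd, train_cmd]


def log_entry(run_date, commands, note=""):
    """Format a dated log entry listing the exact commands used in a run."""
    lines = [f"Run of {run_date.isoformat()}"]
    lines += ["  $ " + shlex.join(cmd) for cmd in commands]
    if note:
        lines.append("  Note: " + note)
    return "\n".join(lines)
```

To actually perform the run, each command list could be passed to `subprocess.run(cmd, check=True)` from the MALLET installation folder. Recording the commands as they were executed keeps the log and the results in step when parameters change between runs.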

Facilitate Interpretation of Topic Model Results

 

  1. Experiment with methods for facilitating the interpretation of topic models. Examples:
    1. Andrew Goldstone's interface
      1. Shawn Graham, "Using Goldstone’s Topic Modeling R package with JSTOR’s Data for Research"
    2. Jeffrey M. Binder and Collin Jennings's interface
    3. Lexos word clouds for topic models (in progress: Scott Kleinman)
    4. LDAvis ("R package for interactive topic model visualization") (example of use)
    5. MALLET-to-Gephi Data Stacker (online tool that takes "the '--output-doc-topics' output from MALLET and reorganize it into a format that Gephi understands")
  2. Create a web site with a good front-end interface for exploring the WhatEvery1Says topic model (to do)

 

Other Tools and Research on the state of the art in interpreting topic model results: