5 Dec 2018, 9-12:30 | Oriental Room S204, University of Sydney
Topic models allow you to find topics and relationships among documents in large collections of texts. There is a growing number of approaches to exploring and making statistical inferences about texts. This session will provide a hands-on introduction to text analysis and topic modelling using stm: An R package for Structural Topic Models. stm is a comprehensive and highly regarded package to prepare, model and visualise textual data. The session will be mainly practical, but I will also provide a short theoretical introduction to topic models and present some applications of stm that we can currently find in the literature.
In this session, we will walk the length of a standard pipeline for textual analysis. We will explore different techniques and methods to import texts and associated metadata into R from a variety of sources such as PDFs, webpages, spreadsheets and APIs. We will prepare the data and clean the text using Hadley Wickham’s “tidy” approach. We will compute document-term matrices and discuss the benefits of different weighting techniques. Finally, we will estimate, evaluate and visualise topic models to facilitate their interpretation and to communicate the results of the analysis.
Download ws-201812-master.zip and open ws-201812.Rproj with RStudio to load the project.
Oriental Room S204, The Quadrangle, University of Sydney
francesco.bailo@sydney.edu.au | +61 2 8627 6895