05. Text Mining (W6)
October 9, 2023
What is text mining?
Text mining is the process of extracting interesting, non-trivial information and knowledge from unstructured text.
Text mining applications
- Spam filtering
- Creating suggestions and recommendations (as Amazon does)
- Monitoring public opinion (e.g. the Obama election campaign)
Topic modelling
Aims of topic models:
- Start from a large, unstructured collection of documents
- Discover the set of topics that generated the documents
- Annotate each document with its topics
- Each document is a distribution over topics (shown as different colors)
- Each topic is a distribution over words
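The two distributions above can be made concrete with a small generative sketch. It assumes a hypothetical two-topic toy model: the topic names, words, probabilities, and the helper names `sample` and `generate_document` are all invented for illustration and do not come from the slides. Each word of a document is produced by first drawing a topic from the document's topic mix, then drawing a word from that topic.

```python
import random

# Hypothetical toy topics: each topic is a probability distribution over words.
topics = {
    "sports": {"game": 0.5, "team": 0.3, "score": 0.2},
    "politics": {"vote": 0.5, "election": 0.3, "campaign": 0.2},
}

# Hypothetical document: a probability distribution over topics.
doc_topic_mix = {"sports": 0.7, "politics": 0.3}


def sample(distribution, rng):
    """Draw one key from a {item: probability} dictionary."""
    items = list(distribution)
    weights = [distribution[item] for item in items]
    return rng.choices(items, weights=weights, k=1)[0]


def generate_document(topic_mix, topics, length, rng):
    """Generate words by picking a topic per word, then a word from that topic."""
    words = []
    for _ in range(length):
        topic = sample(topic_mix, rng)
        words.append(sample(topics[topic], rng))
    return words


rng = random.Random(0)  # fixed seed so repeated runs give the same document
document = generate_document(doc_topic_mix, topics, length=10, rng=rng)
print(document)
```

The printed list is all an observer would see: the topic chosen for each word is discarded, which is why the topic structure has to be inferred.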
In reality, we only observe the documents.
Our goal is to infer the underlying topic structure.
Latent Dirichlet Allocation (LDA)
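These notes name LDA without saying how the topics are inferred. The sketch below assumes collapsed Gibbs sampling, which is one common inference approach and not necessarily the one covered in the course. The function name `lda_gibbs`, the hyperparameter values `alpha` and `beta`, and the toy corpus are all hypothetical choices made for illustration.

```python
import random


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Estimate document-topic and topic-word distributions for tokenized docs."""
    rng = random.Random(seed)
    vocab = sorted({word for doc in docs for word in doc})
    word_id = {word: i for i, word in enumerate(vocab)}
    vocab_size = len(vocab)

    # Count tables maintained by the sampler.
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics

    # Start by assigning every token to a random topic.
    assignments = []
    for d, doc in enumerate(docs):
        doc_assignments = []
        for word in doc:
            k = rng.randrange(num_topics)
            doc_assignments.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[word]] += 1
            topic_total[k] += 1
        assignments.append(doc_assignments)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, word in enumerate(doc):
                v = word_id[word]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Unnormalized probability of each topic for this token.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights, k=1)[0]
                # Record the newly sampled assignment.
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    # Smoothed estimates: each row of theta and phi sums to 1.
    theta = [
        [(count + alpha) / (len(doc) + num_topics * alpha) for count in counts]
        for counts, doc in zip(doc_topic, docs)
    ]
    phi = [
        [(count + beta) / (topic_total[k] + vocab_size * beta) for count in topic_word[k]]
        for k in range(num_topics)
    ]
    return vocab, theta, phi


# Hypothetical toy corpus, already tokenized.
docs = [
    ["game", "team", "score", "team"],
    ["vote", "election", "campaign", "vote"],
    ["game", "score", "vote", "team"],
]
vocab, theta, phi = lda_gibbs(docs, num_topics=2)
for d, mix in enumerate(theta):
    print(d, [round(p, 2) for p in mix])
```

Here `theta` holds one distribution over topics per document and `phi` holds one distribution over words per topic, matching the two kinds of distribution listed under topic modelling. On a corpus this small the recovered topics are not meaningful; the point is only the shape of the computation.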
References
- Week 6 Slides from Thuc (SP52023)
- Topic Models