Reading Tea Leaves: How Humans Interpret Topic Models
Jonathan Chang, Jordan Boyd-Graber, Sean Gerrish, Chong Wang, David M. Blei
Princeton University
NIPS 2009, December 9, 2009

Topic Models in a Nutshell
From an input corpus → words to topics
[Figure: a corpus of New York Times articles, with headlines such as "Forget the Bootleg, Just Download the Movie Legally", "Multiplex Heralded As Linchpin To Growth", "The Shape of Cinema, Transformed At the Click of a Mouse", "A Peaceful Crew Puts Muppets Where Its Mouth Is", "Stock Trades: A Better Deal For Investors Isn't Simple", "Internet portals begin to distinguish among themselves as shopping malls", and "Red Light, Green Light: A 2-Tone L.E.D. to Simplify Screens"]

Topic Models in a Nutshell
From an input corpus → words to topics
TOPIC 1: computer, technology, system, service, site, phone, internet, machine
TOPIC 2: sell, sale, store, product, business, advertising, market, consumer
TOPIC 3: play, film, movie, theater, production, star, director, stage
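To make the corpus → topics step concrete, the following minimal sketch (not the code behind these slides) fits LDA to a toy three-document corpus with gensim and prints each topic as a ranked word list; the documents and parameter settings are invented for illustration.

```python
# A toy corpus -> topics pipeline with gensim; not the authors' code.
from gensim import corpora, models

docs = [
    "download movie legally bootleg film cinema".split(),
    "stock trades deal investors market business".split(),
    "internet portals shopping site service phone".split(),
]

dictionary = corpora.Dictionary(docs)            # word <-> id mapping
bow = [dictionary.doc2bow(d) for d in docs]      # bag-of-words counts

lda = models.LdaModel(bow, id2word=dictionary,
                      num_topics=3, passes=20, random_state=0)

for k in range(lda.num_topics):
    # Each topic is a distribution over the vocabulary; print its top words.
    words = ", ".join(w for w, _ in lda.show_topic(k, topn=5))
    print(f"TOPIC {k + 1}: {words}")
```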
Evaluation
[Figure: models fit to the corpus are scored on held-out documents such as "Sony Ericsson's Infinite Hope for a Turnaround", "For Search, Murdoch Looks to a Deal With Microsoft", and "Price War Brews Between Amazon and Wal-Mart", yielding held-out log likelihoods: Model A -4.8, Model B -15.16, Model C -23.42]
Held-out log likelihood measures predictive power, not latent structure.
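For concreteness, here is a sketch of this style of evaluation under the gensim setup above: it scores two toy held-out documents with gensim's variational lower bound, normalized per token, one common stand-in for held-out log likelihood. `lda` and `dictionary` are assumed to come from the previous sketch.

```python
# Scoring held-out documents: the evaluation this slide questions.
# Continues the sketch above; `lda` and `dictionary` come from it.
heldout = [
    "sony ericsson hope turnaround".split(),
    "search murdoch deal microsoft".split(),
]
heldout_bow = [dictionary.doc2bow(d) for d in heldout]  # unseen words are dropped

n_tokens = sum(count for doc in heldout_bow for _, count in doc)
# gensim's variational lower bound on log p(held-out docs), per token.
per_word_ll = lda.bound(heldout_bow) / max(n_tokens, 1)
print(f"held-out per-word log likelihood ~= {per_word_ll:.2f}")
```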
Qualitative Evaluation of the Latent Space
[Figure: latent space visualization from Hofmann, 1999]

Qualitative Evaluation of the Latent Space
[Blei et al., 2003] An example document with topic-bearing words highlighted: "The William Randolph Hearst Foundation will give $1.25 million to Lincoln Center, Metropolitan Opera Co., New York Philharmonic and Juilliard School. 'Our board felt that we had a real opportunity to make a mark on the future of the performing arts with these grants, an act every bit as important as our traditional areas of support in health, medical research, education and the social services,' Hearst Foundation President Randolph A. Hearst said Monday in announcing the grants. Lincoln Center's share will be $200,000 for its new building, which will house young artists and provide new public facilities. ..."

Qualitative Evaluation of the Latent Space
[Mimno et al., 2009] [Table: matched polylingual topics across eleven languages (DA, DE, EL, EN, ES, FI, FR, IT, NL, PT, SV): one topic about the European Central Bank (EN: bank, central, ecb, banks, european, monetary) and one about children and families (EN: children, family, child, sexual, families, exploitation), each with its counterparts in the other ten languages]

Qualitative Evaluation of the Latent Space
[Maskeri et al., 2008] Sample topics extracted from Apache source code (their Table 2):

Topic labeled "SSL"        Topic labeled "Logging"
Keyword   Probability      Keyword     Probability
ssl       0.373722         log         0.141733
expr      0.042501         request     0.036017
init      0.033207         mod         0.031100
engine    0.026447         config      0.029871
var       0.022222         name        0.023725
ctx       0.023067         headers     0.021266
ptemp     0.017153         autoindex   0.020037
mctx      0.013773         format      0.017578
lookup    0.012083         cmd         0.015120
modssl    0.011238         header      0.013891
ca        0.009548         add         0.012661

Qualitative Evaluation of the Latent Space
[Hall et al., 2008] [Table: top 10 words for 43 topics fit to the ACL Anthology, with labels such as "Named Entities", "Statistical MT", "Parsing", "Sentiment", and "WordNet"; starred topics are hand-seeded. For instance, "Statistical MT": word, alignment, language, source, target, sentence, machine, bilingual, mt]

Topics are shown to users during web search.

Users can refine queries through topics.

Key Points
1. "Reading Tea Leaves" alternative: measuring interpretability
2. Direct, quantitative human evaluation of the latent space
3. Testing interpretability on different models and corpora
4. Disconnect with likelihood
[Figure: preview on the New York Times of what we care about (model precision) vs. what we're measuring (held-out likelihood)]
Evaluating Topic Interpretability
Interpretability is a human judgement, so we ask people directly.
Experiment goals: quick, fun, consistent.
We turn to Amazon Mechanical Turk, with two tasks: Word Intrusion and Topic Intrusion.

Task One: Word Intrusion
TOPIC 1: computer, technology, system, service, site, phone, internet, machine
TOPIC 2: sell, sale, store, product, business, advertising, market, consumer
TOPIC 3: play, film, movie, theater, production, star, director, stage

Task One: Word Intrusion
1. Take the highest-probability words from a topic.
   Original topic: dog, cat, horse, pig, cow
2. Take a high-probability word from another topic and add it.
   Topic with intruder: dog, cat, apple, horse, pig, cow
3. Ask Turkers to find the word that doesn't belong.
Hypothesis: if the topics are interpretable, users will consistently choose the true intruder.

Task One: Word Intrusion
[Figure: screenshot of the word intrusion task as shown to workers]
The order of words was shuffled, and which intruder was selected varied.
Model precision: the percentage of users who clicked on the intruder.
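A minimal sketch of this construction and scoring, using only the standard library; the topics, the intruder, and the worker selections are hypothetical.

```python
# Building a word-intrusion item and scoring model precision.
import random

def make_intrusion_item(topic_words, intruder, rng):
    """Top-5 topic words plus one high-probability word from another topic."""
    item = topic_words[:5] + [intruder]
    rng.shuffle(item)            # word order was shuffled for each task
    return item

def model_precision(selections, intruder):
    """Fraction of workers who clicked on the true intruder."""
    return sum(choice == intruder for choice in selections) / len(selections)

rng = random.Random(0)
item = make_intrusion_item(["dog", "cat", "horse", "pig", "cow"], "apple", rng)
print(item)                                                   # the 6 words a worker sees
print(model_precision(["apple", "apple", "cow", "apple"], "apple"))  # 0.75
```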
Task Two: Topic Intrusion
[Figure: the document "Red Light, Green Light: A 2-Tone L.E.D. to Simplify Screens" shown with TOPIC 1 "TECHNOLOGY" (e.g., "Internet portals begin to distinguish among themselves as shopping malls"), TOPIC 2 "BUSINESS" (e.g., "Stock Trades: A Better Deal For Investors Isn't Simple"), and TOPIC 3 "ENTERTAINMENT" (e.g., "Forget the Bootleg, Just Download the Movie Legally", "The Shape of Cinema, Transformed At the Click of a Mouse", "Multiplex Heralded As Linchpin To Growth", "A Peaceful Crew Puts Muppets Where Its Mouth Is")]

Task Two: Topic Intrusion
1. Display the document title and first 500 characters to Turkers.
2. Show the three topics with highest probability and one topic chosen randomly.
3. Have the user click on the set of words that is out of place.
Hypothesis: if the association of topics to a document is interpretable, users will consistently choose the true intruding topic.

Task Two: Topic Intrusion
[Figure: screenshot of the topic intrusion task as shown to workers]

Task Two: Topic Intrusion
[Figure: per-document topic probabilities, sorted, with the intruder topic marked. A worker's click is scored by topic log odds: log(0.05 / 0.05) = 0.0 when the intruder (probability 0.05) is chosen, log(0.05 / 0.15) = -1.1 when a moderately probable topic is chosen, and log(0.05 / 0.5) = -2.3 when the most probable topic is chosen]
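A small sketch of this score; the topic distribution `theta` and the worker clicks are hypothetical, chosen to reproduce the three numbers above.

```python
# Topic log odds: log(theta_intruder / theta_clicked), averaged over workers.
# Zero when the worker finds the intruder; increasingly negative when a
# high-probability topic is picked instead.
import math

def topic_log_odds(theta, intruder, clicks):
    """theta maps each displayed topic to its probability in the document."""
    return sum(math.log(theta[intruder] / theta[c]) for c in clicks) / len(clicks)

theta = {"technology": 0.50, "business": 0.30,
         "entertainment": 0.15, "intruder": 0.05}
print(topic_log_odds(theta, "intruder", ["intruder"]))       # log(0.05/0.05) =  0.0
print(topic_log_odds(theta, "intruder", ["entertainment"]))  # log(0.05/0.15) ~ -1.1
print(topic_log_odds(theta, "intruder", ["technology"]))     # log(0.05/0.50) ~ -2.3
```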
Three Topic Models
Different assumptions lead to different topic models:
- Free parameter fit with smoothed EM (pLSI variant) [Hofmann, 1999]
- Dirichlet: latent Dirichlet allocation (LDA) [Blei et al., 2003]
- Normal with covariance: correlated topic model (CTM) [Blei and Lafferty, 2005]

Corpora
New York Times: 8477 articles, 8269 types, 1M tokens.
Wikipedia: sample of 10000 articles, 15273 types, 3M tokens.
Corpora properties: well structured (articles should begin with a summary paragraph), real-world, many different themes.

Experiments
1. Fit pLSI, LDA, and CTM to both corpora.
2. Each model had 50, 100, or 150 topics.
3. 50 topics from each condition were presented to 8 workers.
4. 100 documents from each condition were presented to 8 workers.
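The grid of fits could be sketched as below, under loose assumptions: the corpora are tiny stand-ins, gensim's LDA stands in for all three model families (off-the-shelf pLSI and CTM fitters are not assumed here), and the topic count is shrunk to fit the toy data.

```python
# Sketch of the 2-corpora x 3-families x 3-sizes experimental grid.
from gensim import corpora, models

def preprocess(texts):
    docs = [t.split() for t in texts]
    d = corpora.Dictionary(docs)
    return [d.doc2bow(doc) for doc in docs], d

corpora_by_name = {                      # toy stand-ins for the real corpora
    "nyt": preprocess(["stock trades investors market", "movie film theater star"]),
    "wikipedia": preprocess(["lindy hop dance swing", "john quincy adams president"]),
}

fitted = {}
for name, (bow, d) in corpora_by_name.items():
    for family in ("pLSI", "LDA", "CTM"):
        for k in (2,):                   # the talk used 50, 100, and 150 topics
            # Placeholder: a replication would dispatch to real pLSI/CTM fitters.
            fitted[(name, family, k)] = models.LdaModel(
                bow, id2word=d, num_topics=k, random_state=0)
# From each fit, 50 topics and 100 documents become intrusion tasks,
# each shown to 8 workers.
```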
Word Intrusion: Which Topics are Interpretable?
[Figure: histogram of model precision over the 50 LDA topics fit to the New York Times. Topics such as "fireplace garage house kitchen list" and "artist exhibition gallery museum painting" score near 1.0, while "committee legislation proposal republican taxis" and "americans japanese jewish states terrorist" fool most workers]
Model precision: percentage of correct intruders found.

Word Intrusion: Models with Interpretable Topics
[Figure: boxplots of model precision for CTM, LDA, and pLSI at 50, 100, and 150 topics on the New York Times and Wikipedia]

Which documents have clear topic associations?
[Figure: histogram of topic log odds over documents for the 50-topic LDA fit to Wikipedia. Documents such as "Microsoft Word", "Lindy Hop", and "John Quincy Adams" score near 0.0, while "Book" scores poorly]

Which Models Produce Interpretable Topics?
[Figure: boxplots of topic log odds for CTM, LDA, and pLSI at 50, 100, and 150 topics on the New York Times and Wikipedia]

Held-out Likelihood
Corpus           Topics   pLSI      LDA       CTM
New York Times   50       -7.3384   -7.3214   -7.3335
New York Times   100      -7.2834   -7.2761   -7.2647
New York Times   150      -7.2382   -7.2477   -7.2467
Wikipedia        50       -7.5378   -7.5257   -7.5332
Wikipedia        100      -7.4748   -7.4629   -7.4385
Wikipedia        150      -7.4355   -7.4266   -7.3872

Interpretability and Likelihood
[Figure: model precision on the New York Times and Wikipedia plotted against predictive log likelihood, by model (CTM, LDA, pLSI) and number of topics (50, 100, 150)]
Within a model, higher likelihood ≠ higher interpretability.

Interpretability and Likelihood
[Figure: topic log odds on the New York Times and Wikipedia plotted against predictive log likelihood, by model and number of topics]
Across models, higher likelihood ≠ higher interpretability.

Conclusion
Disconnect between evaluation and use
A means of evaluating an unsupervised method
For topic models, direct measurement of interpretability
Surprising relationship between interpretability and likelihood
Measure what you care about

Future Work
Influence of inference techniques and hyperparameters
Investigate the shape of the likelihood / interpretability curve
Model human intuition

Workshop
Applications for Topic Models: Text and Beyond
7:30am - 6:30pm Friday, Westin: Callaghan

References
Blei, D., Ng, A., and Jordan, M. (2003). Latent Dirichlet allocation. JMLR, 3:993–1022.
Blei, D. M. and Lafferty, J. D. (2005). Correlated topic models. In NIPS.
Hall, D., Jurafsky, D., and Manning, C. D. (2008). Studying the history of ideas using topic models. In EMNLP.
Hofmann, T. (1999). Probabilistic latent semantic analysis. In UAI.
Maskeri, G., Sarkar, S., and Heafield, K. (2008). Mining business topics in source code using latent Dirichlet allocation. In ISEC '08: Proceedings of the 1st India Software Engineering Conference, pages 113–120, New York, NY, USA. ACM.
Mimno, D., Wallach, H., Yao, L., Naradowsky, J., and McCallum, A. (2009). Polylingual topic models. In Snowbird Learning Workshop, Clearwater, FL.