Because social media is heavily used by teenagers, the online conversation is sometimes foreign to the likes of our staff of adult Ph.D. computational linguists. A particularly thorny problem is analyzing emoticons for sentiment. Some are obvious: a smiley face :-) signals positive sentiment and a sad face :-( signals negative sentiment. But there are thousands of other emoticons that express varying degrees of sentiment. How can a computer analyze the sentiment expressed by emoticons so that companies can make business decisions based on accurate, reliable data?
Tanya Lee solved this key problem for NetBase. Working with our engineering team, Tanya developed a sentiment classification system based on emoticons. Mainstream sentiment analysis relies on statistical models trained with machine learning; the system Tanya and the team built is rule-based instead, because our users demand the highest precision and depth of analysis possible. Her work helps NetBase go even deeper into the online conversation to surface meaning from text.
By the way, Tanya is 17 years old.
The published study she co-authored, “Role of Emoticons in Sentence-Level Sentiment Classification,” shows that the rule-based system she helped develop is consistently accurate: for the sample studied, it assigns an emoticon the same sentiment that a human would 75.2 percent of the time.
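For readers curious what "rule-based" and "agrees with a human" can mean in practice, here is a minimal Python sketch. It is not the system described in the study: the tiny lexicon, the majority-vote rule, and the names `EMOTICON_SENTIMENT`, `classify`, and `agreement_rate` are all assumptions made up for illustration. It only shows the general shape of the idea: look up each emoticon in a hand-written table, label the sentence, then count how often that label matches a human's.

```python
import re

# Hypothetical hand-written lexicon (an assumption for illustration only).
# A production system would need far more entries and finer-grained rules.
EMOTICON_SENTIMENT = {
    ":-)": "positive",
    ":)": "positive",
    ":-D": "positive",
    ":-(": "negative",
    ":(": "negative",
    ":'(": "negative",
}

# Try longer emoticons first so a longer entry is never shadowed by a shorter one.
_PATTERN = re.compile(
    "|".join(
        re.escape(e)
        for e in sorted(EMOTICON_SENTIMENT, key=len, reverse=True)
    )
)


def classify(sentence):
    """Label a sentence by a simple majority vote over its emoticons."""
    score = 0
    for emoticon in _PATTERN.findall(sentence):
        score += 1 if EMOTICON_SENTIMENT[emoticon] == "positive" else -1
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"  # no emoticons, or positives and negatives cancel out


def agreement_rate(predicted, human):
    """Return the fraction of items where the system and human labels match."""
    if len(predicted) != len(human):
        raise ValueError("label lists must have the same length")
    if not predicted:
        return 0.0
    matches = sum(p == h for p, h in zip(predicted, human))
    return matches / len(predicted)
```

Calling `classify("Missed my flight :-(")` returns `"negative"` under these toy rules. Passing a list of system labels and a matching list of human labels to `agreement_rate` gives the kind of percentage quoted above, although the study's own evaluation procedure may differ from this simple count.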
Tanya’s work on emoticon parsing also made her a semifinalist in the Intel Science Talent Search 2014, the nation’s most prestigious pre-college science competition.
P.S. To read more on the subject of using social analytics to accurately analyze consumer sentiment, download our paper, Can Social Media Measure Customer Satisfaction?