Social media is a platform that lets ordinary people create and publish content. Twitter and Facebook, the two most popular social media websites, illustrate its explosive growth and huge influence. Both are among the top 10 most-visited websites in the world according to the Alexa ranking. Social media is sometimes also referred to as consumer-generated media (CGM). It differs from traditional media, such as newspapers, books, and television, in that almost anyone can publish and access information inexpensively. Blogs, social networking sites, virtual social worlds, collaborative projects, content communities, and virtual game worlds are all forms of social media. A social media platform has some or all of these seven functional blocks: identity, conversations, sharing, presence, relationships, reputation, and groups.

Mining the features and content of social media gives us an opportunity to discover characteristics of social structure, to analyze patterns of behavior qualitatively and quantitatively, and sometimes to predict future human-related events. Even though most predictions based on social media can currently be made better by human agents, specifically experts, there are still good reasons to try to predict automatically.

Firstly, automatic prediction with software tools costs much less than human labor. Secondly, people tend to overvalue small probabilities and undervalue high probabilities, so they predict events at both extremes poorly. Thirdly, a person's decision may be influenced, intentionally or unintentionally, by their own desires, interests, and benefit, rather than being based purely on objective probability. Lastly, automatic prediction methods can process larger amounts of data and respond quickly.

Some of the popular areas for prediction using social media data are marketing, sales, movie box-office revenue, elections, the stock market, and e-commerce.

Take box-office prediction as an example. It is unsurprising that predicting whether a movie will be a great commercial success generates considerable interest in both the commercial and research communities. The data comes from Twitter, in the form of tweets about upcoming or unreleased movies, and from a source domain made up of reviews of movies that have already been released. Tweets related to the movie are collected and analyzed by applying data mining and machine learning algorithms. The process begins with pre-processing the collected tweets, which includes removing irrelevant content, handling missing data, and eliminating stop words (commonly used words such as "the", "at", and "of"). Feature selection is then carried out to identify the factors most relevant to the prediction. Next, an appropriate data mining algorithm is applied using a software tool such as R or Matlab, and finally the results are interpreted. In this way, data collected from social media can be used to predict whether a movie is going to be a hit or a flop at the box office.
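The steps described above can be sketched in a few lines of Python. This is only a minimal illustration, not the method of the cited survey: the stop-word list, the two sentiment word lists, the function names, and the thresholds in predict_hit are all assumptions made up for this example. A real study would replace the threshold rule with a model trained on data from movies that have already been released.

```python
import re

# Tiny illustrative stop-word list; a real pipeline would use a fuller one.
STOP_WORDS = {"the", "at", "of", "a", "an", "is", "to", "and",
              "in", "this", "it", "was", "for"}

# Hypothetical sentiment word lists, invented for this sketch.
POSITIVE = {"great", "love", "amazing", "excited"}
NEGATIVE = {"boring", "bad", "awful", "skip"}


def preprocess(tweet):
    """Lowercase the tweet, drop URLs and @mentions, remove stop words."""
    text = tweet.lower()
    text = re.sub(r"https?://\S+|@\w+", " ", text)  # irrelevant content
    tokens = re.findall(r"[a-z']+", text)
    return [t for t in tokens if t not in STOP_WORDS]


def extract_features(tweets):
    """Reduce a collection of tweets to two simple features."""
    # Handle missing data: skip empty or whitespace-only tweets.
    cleaned = [preprocess(t) for t in tweets if t and t.strip()]
    pos = sum(1 for toks in cleaned for t in toks if t in POSITIVE)
    neg = sum(1 for toks in cleaned for t in toks if t in NEGATIVE)
    total = pos + neg
    return {
        "tweet_count": len(cleaned),
        # Neutral value of 0.5 when no sentiment words are found.
        "positive_ratio": pos / total if total else 0.5,
    }


def predict_hit(features, min_tweets=3, min_positive_ratio=0.6):
    """Assumed threshold rule standing in for a trained model."""
    return (features["tweet_count"] >= min_tweets
            and features["positive_ratio"] >= min_positive_ratio)


if __name__ == "__main__":
    sample = [
        "Excited for the premiere!",
        "This trailer is amazing",
        "",
        "Looks boring to me",
        "Love it",
    ]
    features = extract_features(sample)
    print(features, predict_hit(features))
```

The two features used here, tweet volume and the share of positive words, are only stand-ins for the feature selection step. In practice the features would be chosen by checking which ones best explain the box-office results of past movies.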

Social media can express collective wisdom which, when properly tapped, can yield an extremely powerful and accurate indicator of future outcomes. It has created a new way for us to collect, extract, and use the wisdom of crowds objectively, at low cost and with high efficiency.

“Social media should improve your life, not become your life” – Rita Ghatourey.

Extracted from: Yu, Sheng, and Subhash Kak. “A survey of prediction using social media.” arXiv preprint arXiv:1203.1647 (2012).


Blog by:

Ms. Sarita Byagar

Assistant Professor, Indira College of Commerce and Science
