EECS 498/598 Winter 2021 Project Showcase

Welcome to the project showcase for EECS 498/598, Advanced Machine Learning for Affective Computing.

Below are videos for each project.

Topic modeling in podcasts with Neural Networks

Issac Moothart and Charles Reinertson

Podcast episodes hold a wealth of information; however, little metadata accompanies these audio files, making it difficult for listeners to browse within an episode. The growing popularity of the podcast industry and the expansion of the content within it necessitate a reevaluation of the metadata that powers the listening and browsing experience of podcasts. Current research on tagging content in video and text corpora provides methods that could readily be applied to the podcasting space to produce enhanced metadata, which would improve the listening experience. To address this opportunity, we explore various methods to classify the topic of short episode segments. We present a framework that transcribes the segment, encodes the transcription, and then infers the topic category of the podcast. The most effective transcript encoding we explored concatenated the Bag of Words representation with the BERT encoding of the classifier token for the utterance. Trained on a dataset of 27,000 episodes with 11 topic categories, this encoding achieved an accuracy of 47%. We also call out the limitations of our system and the ethical issues that these limitations raise.
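To make the encoding step concrete, here is a minimal sketch of concatenating a Bag of Words vector with a dense sentence encoding. It is not the authors' implementation: the toy vocabulary, the helper names, and the four-number placeholder standing in for the BERT classifier-token output (768-dimensional for BERT-base) are all assumptions made for illustration.

```python
from collections import Counter


def bag_of_words(text, vocabulary):
    """Count how often each vocabulary word occurs in the text."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in vocabulary]


def combine_features(bow_vector, cls_vector):
    """Concatenate the count features with the dense sentence encoding."""
    return list(bow_vector) + list(cls_vector)


# Hypothetical toy vocabulary; a real one would be built from the training transcripts.
vocabulary = ["election", "goal", "recipe"]
transcript = "The goal came late but the goal counted"

# Placeholder for the BERT classifier-token encoding of the segment (assumed values).
cls_vector = [0.12, -0.40, 0.88, 0.05]

features = combine_features(bag_of_words(transcript, vocabulary), cls_vector)
print(len(features))  # length of vocabulary plus length of the dense encoding
```

The combined vector would then be passed to a topic classifier; the count features keep explicit keyword evidence alongside the contextual encoding.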

Tweet Topic and Sentiment Analysis

Xianglong Li, Mufei Chen

The COVID-19 pandemic has greatly impacted people’s lives since its outbreak in December 2019. Meanwhile, people’s sentiments have changed constantly as the pandemic worsened or eased. At the community level, public sentiments towards topics like wearing masks and vaccination have changed over time; at the individual level, people have undergone mood swings due to sudden lifestyle changes. We proposed that these trends can be captured in posts on social media platforms such as Twitter. In this project, we built a latent Dirichlet allocation (LDA) topic model based on COVID-related tweets and performed transfer learning on pre-trained RoBERTa models using general tweet data. We analyzed public sentiment on popular topics and developed an emotion visualizer that can monitor a user’s sentiments across tweets.

Understanding Communication Differences on Mental Health and Daily Life Related Subreddits

Dinakar Talluri and Ross Kempner

Online communities for peer-to-peer mental health support have become popular and offer an unprecedented level of accessibility and anonymity. Using the tools of machine learning and statistical modeling, the present work investigates whether machine learning models can distinguish the different forms of communication in posts across three categories of Reddit communities: communities for anxiety-related support, communities for depression-related support, and communities for talking about daily life, such as hobbies and interests. In experiment 1, we found that multi-class lasso logistic regression, in conjunction with the NRCLex and LIWC affect dictionaries, can learn the different communication patterns in these three classes and use that knowledge to accurately classify which of the three community types a post comes from. In experiment 2, we manipulated the training and testing data split so that the testing data came strictly after the training data in time, and the strong classification accuracy remained. In experiment 3, the testing data again came strictly after the training data in time, but the testing posts were created during the first year of the COVID pandemic, and we found that the model did not generalize as well. These results suggest that the multi-class logistic regression model is generally able to capture the communication patterns across the three Reddit categories; however, the patterns of communication changed in the COVID era. Our work builds upon previous work by demonstrating a novel result: communication patterns in posts from anxiety- and depression-related subreddits can be discerned in a multi-class logistic regression setting. We also demonstrate that COVID-era data poses problems for predictive models. We outline the next steps for subsequent studies to use our systems to better understand the communication patterns in these online communities.
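The time-ordered split described for experiments 2 and 3 can be sketched as follows. This is an illustrative sketch only, not the authors' code: the post records, the `created` field name, and the cutoff date are hypothetical.

```python
from datetime import date


def temporal_split(posts, cutoff):
    """Train on posts created before the cutoff; test on posts created on or after it."""
    train = [post for post in posts if post["created"] < cutoff]
    test = [post for post in posts if post["created"] >= cutoff]
    return train, test


# Hypothetical posts; real data would carry the post text and its community label.
posts = [
    {"id": 1, "created": date(2019, 3, 1), "label": "anxiety"},
    {"id": 2, "created": date(2019, 9, 15), "label": "daily_life"},
    {"id": 3, "created": date(2020, 4, 2), "label": "depression"},
    {"id": 4, "created": date(2020, 11, 20), "label": "anxiety"},
]

# Assumed cutoff: everything from 2020 onward is held out for testing.
train_posts, test_posts = temporal_split(posts, cutoff=date(2020, 1, 1))

# Every test post is strictly later than every training post.
assert max(p["created"] for p in train_posts) < min(p["created"] for p in test_posts)
```

Unlike a random split, this keeps any information from the test period out of training, which is what lets a shift in communication patterns over time show up as a drop in test accuracy.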

LIWC, I am your VADER: An Affective Analysis of Mental Health Trends on Reddit

Eric Chen and Joseph Berman

In this paper, we conduct an in-depth analysis of mental health as evidenced by activity on Reddit, one of the world’s largest internet communities. We collect data from different time points in the pandemic and across different subgroups, called “subreddits”, to investigate how the pandemic may have contributed to increasing mental stress. We apply machine learning methods commonly used in emotion processing and affective computing to see how trends in activity may be indicative of mental health patterns. We also compare trends between different subreddits to disentangle some of the potential causes behind them. We hope that our findings and the pipeline we have established can help inform a data-driven approach to mental health analysis for use in the future.

Reddit Bot Development Using Python and Neural Nets

Kyle Schulz and Kevin Rodriguez Siu

This project documents the development of two Reddit text bots. The first bot mines data from daily discussion threads in the popular financial subreddit /r/wallstreetbets, tracking users’ stock mentions. When called, it produces a report of the user’s virtual portfolio, including performance, index comparisons, and new stock suggestions. The second bot monitors /r/politics and uses a neural net to estimate the likelihood of high comment activity in a thread. The bot reports the activity risk back to the user so that moderation can be increased in the identified threads.
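As a rough illustration of the mention-tracking step in the first bot, the sketch below counts ticker symbols per user across a list of comments. It is an assumption-laden sketch, not the project's code: the watchlist, the comment tuples, and the `count_mentions` helper are hypothetical, and fetching comments from Reddit is omitted.

```python
import re
from collections import Counter, defaultdict

# Hypothetical watchlist; filtering against it avoids counting ordinary capitalized words.
KNOWN_TICKERS = {"GME", "AMC", "TSLA"}


def count_mentions(comments):
    """Return a per-author count of watchlist tickers mentioned in comment bodies."""
    mentions = defaultdict(Counter)
    for author, body in comments:
        # Match standalone runs of 1 to 5 capital letters, e.g. "GME" or the "TSLA" in "$TSLA".
        for token in re.findall(r"\b[A-Z]{1,5}\b", body):
            if token in KNOWN_TICKERS:
                mentions[author][token] += 1
    return mentions


# Hypothetical (author, comment body) pairs from a daily discussion thread.
comments = [
    ("alice", "GME to the moon, also $TSLA"),
    ("alice", "Still holding GME"),
    ("bob", "I like AMC"),
]

mentions = count_mentions(comments)
print(dict(mentions["alice"]))
```

A per-user tally like this could serve as the basis for a virtual portfolio, with prices and index comparisons layered on from a market data source.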

Machine Learning for Ad Detection in Instagram

Austin Ye, Wyler Zahm, and Alex Erf

Misinformation in the modern era is of great interest to the field of data science. Improvements in communication, surveillance, and prediction technologies reduce individuals’ understanding of these technologies and shift power into the hands of large government agencies and, possibly more dangerously, large corporations via the control of information (e.g., Clearview A.I., Google). We are concerned with a potential future in which people are subtly and unknowingly influenced through marketing by the social media firms that generate user feeds and by the organizations that purchase such advertisements. We constructed a labeled (sponsored / not sponsored) Instagram post dataset and an associated classifier for such posts, to further research and possibly one day better inform individuals of the sponsored nature of the content with which they interact. Our promising results give hope that a reliable and robust solution to this problem of deceptive content may one day exist.

Using a Convolutional Neural Network to Identify Littered Waste with Environmental Context

Makarand Parigi, William Chown, and Dhanuj Gandikota

Inadequately disposed waste, often littered, makes up a large percentage of the waste production of the United States, with over 300,000 tons of plastic estimated to be littered each year. In fact, litter alone accounts for 60% of water pollution (8 million tons of plastic waste annually). Given the abundance of litter across different landscapes, identifying littered waste poses challenges in terms of both the object and its environmental context. The TACO dataset used in this project provides a baseline R-CNN model for instance segmentation that assumes an image contains trash: it detects the boundaries of trash, but only when it knows trash already exists [1]. We build on this work by demonstrating a CNN-based architecture that identifies the presence of litter in an image and predicts the environmental context of the litter. Our implementation could see use in a range of applications, from litter measurement software to litter detection and cleanup in autonomous robotics. We achieve approximately 59% and 55% accuracy on trash classification and environmental prediction, respectively. Our model is novel in that it detects the presence of trash in the image as well as the environmental context, while TACO’s baseline performs segmentation given the existence of trash. We first present work related to this project, followed by an explanation of the dataset and our methods of processing the image data. We then describe the model architecture and training setup. Finally, we present and discuss the results and their ethical implications.

Speech Emotion Recognition with Spectrograms and Convolutional Neural Networks

Nathan Wong, Michael Alvin, and Han Wang

In this paper, we show that convolutional neural networks can be applied to low-level acoustic features to identify emotions in speech samples. We show how convolutional neural networks can be applied to Mel-frequency cepstral coefficients (MFCCs) to obtain competitive results on the RAVDESS and IEMOCAP datasets and on a Kaggle dataset. Furthermore, we performed hyperparameter tuning and model tuning to achieve the best results. We also tried other methods to boost model performance, including data augmentation and transfer learning. Our results suggest that convolutional neural networks with MFCCs can be a robust model, and that data augmentation can be used to improve performance in a system.

English Accent Classification with CNN-BiLSTM

Siwei Wang

In this paper, I test the hypothesis that a CNN can improve the ability of a BiLSTM to automatically classify accents in spoken English. The accent classes I used were Arabic, Dutch, English, French, Korean, Mandarin, Portuguese, Russian, and Spanish. The data was taken from the Speech Accent Archive. I discuss the preprocessing steps, and then describe the architecture of both the CNN and BiLSTM components of my model. Extra attention is given to how the two components are connected to one another. Next, I go over the results, in which I find that the CNN failed to improve BiLSTM performance. Finally, I bring up some ethical considerations and conclude with key takeaways and possible directions for future investigation.

The secrets of engaging audiences: A case study with TED Talks

Yu-Chian Tsai and Yuan-Ping Ju

Communication is the backbone of our society; we want our ideas to be heard and spread. However, what separates a mediocre speaker from a talented one? To understand the factors behind this, we study TED, a video hosting website known for informative videos across topics. We perform descriptive analysis and train neural network models on every aspect of a talk that a typical viewer sees when browsing the TED website. We find that only the titles are related to the number of views of a video, and only weakly. We conclude by reasoning about why there shouldn’t be a magic formula for a popular speech.

Subject Adaptation for Stress Prediction on WESAD Dataset

Yifan Li, Naihao Deng, and Shijie Qu

Negative emotions and excessive stress tend to have a negative impact on practical task performance for individuals and teams. This is magnified in highly dynamic settings such as surgical operating rooms, where emotions may propagate and lead to poor patient outcomes (Liberman et al., 2020; Chrouser et al., 2018). Prior qualitative work has demonstrated that negative emotions, frustration in particular, are common during surgery and can decrease psychological safety and degrade team dynamics (Chrouser and Partin, 2019). Our goal is to use the WESAD dataset from UCI to create a model that identifies emotional states from kinematic and physiologic data. Understanding emotions can be very useful in surgical training settings, where accidents and disruptions don’t have fatal outcomes. Trainees and mentors can map out emotions in relation to disruptions, and mentors can better prepare mentees for emotion management during real tasks. We implemented a random classifier and a linear classifier as our baseline models, along with a neural network. We applied those models to the processed data from a single person. We then experimented with two domain adaptation models and applied them to other individuals, using domain generalization to train and test our model on different groups of people. By using domain adaptation, we are able to create a generalizable model for quantifying emotional states for individuals using non-invasive instrumentation.

GeneBERT: BERT for predicting differential gene expression from histone modifications

Aravind Mantravadi, Alex Ruan, and Christian Georg

Finding a Relationship Between the Stock Market and Social Media

Matthew Schneider

This document reviews my semester-long project for EECS 498: Applied Machine Learning for Affective Computing. Over the course of the semester, I analyzed Reddit comments to find their sentiment and its correlation with specific stocks on the stock market. In this document, I discuss my entire process, final results, and some ethical implications that I came upon.

BEsT WordS: A Novel Bert Based LSTM for Sarcasm Detection on Reddit Comments

Zubin Aysola and Suraj Harjani

We propose a model named “I hAvE tHe BEsT WordS” (Trump 2016), or “BEsT WordS” for short, for the task of sarcasm detection in online text communications, achieving near-state-of-the-art performance. BEsT WordS consists of a headless BERT-based transformer, novel tokenization methods, and a unidirectional LSTM, which together classify comments as sarcastic or neutral based on parent-comment context. Our methods combine architectures and pretrained models from several sources into a novel method, including a special tokenization step, that results in significant performance improvements over alternatives.