REU student: Juan F Arizpe-Vega
Mentors: Indrajeet Ghosh, Dr. Kasthuri Jayarajah, Dr. Nirmalya Roy
Abstract—Non-invasive sensor technologies offer an unprecedented opportunity for longitudinal and continuous monitoring of physiological and cognitive functions, enabling researchers to examine temporal changes and patterns in real-world settings. However, despite the non-invasive nature of commercially available wearable sensors, the presence of artifacts, including noise, motion artifacts, ambient interference, and physiological artifacts, can significantly impact the accuracy and interpretability of the collected data, especially in real-world settings. Therefore, effective artifact detection and correction methods are crucial for enhancing the robustness and validity of the sensed data. We focus on the challenges and advancements in multimodal artifact detection and on improving performance on downstream tasks (e.g., emotion recognition, cognitive assessment). In this work, we propose an end-to-end unsupervised variational autoencoder (VAE) framework to detect artifacts in electroencephalogram (EEG) data by learning the underlying distribution of the data.
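A minimal sketch of how such a VAE-based detector could work: train on windows of (mostly clean) EEG, then flag windows whose reconstruction error is large as artifacts, since they lie off the learned data manifold. The layer sizes, window length, and threshold below are illustrative assumptions, not the authors' exact configuration.

```python
# Sketch of VAE-based artifact detection on EEG windows (PyTorch).
# Architecture sizes, window length, and threshold are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class EEGVAE(nn.Module):
    def __init__(self, window_len=256, latent_dim=16):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(window_len, 128), nn.ReLU())
        self.mu = nn.Linear(128, latent_dim)
        self.logvar = nn.Linear(128, latent_dim)
        self.dec = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(),
                                 nn.Linear(128, window_len))

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        # Reparameterization trick: sample z while keeping gradients.
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        return self.dec(z), mu, logvar

def vae_loss(x, recon, mu, logvar):
    # Reconstruction error plus KL divergence to the unit Gaussian prior.
    rec = F.mse_loss(recon, x, reduction="mean")
    kld = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return rec + kld

def flag_artifacts(model, windows, threshold):
    # Windows that the model cannot reconstruct well are treated as
    # artifacts, since they deviate from the learned clean-EEG distribution.
    with torch.no_grad():
        recon, _, _ = model(windows)
        err = ((windows - recon) ** 2).mean(dim=1)
    return err > threshold
```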
REU student: Gloria Atolagbe
Mentors: Rachael Kang, Dr. Roberto Yus, Dr. Tera Reynolds
Abstract—A dense text generator learns the semantic knowledge and visual features of each frame of a video and maps them to describe the video’s most relevant subjects and events. Although dense text generation has been widely explored for untrimmed videos across various domains, generating dense captions in the sports domain to supplement journalistic work without relying on commentators and experts still needs much investigation. This paper proposes SpecTextor, an end-to-end automated text generator that learns semantic features from untrimmed videos of sports games and generates associated descriptive texts. The proposed approach treats the video as a sequence of frames and generates words sequentially to develop detailed textual descriptions. After splitting videos into frames, we use a pre-trained VGG-16 model to extract features from and encode the video frames. With these encoded frames, we posit an LSTM-based attention-decoder architecture that leverages a soft-attention mechanism to map the semantic features to relevant textual descriptions and generate an explanation of the game. Because developing a comprehensive description of the game warrants training on a set of dense time-stamped captions, we leverage the ActivityNet Captions dataset. In addition, we evaluate the proposed framework on both the ActivityNet Captions dataset and the Microsoft Video Description (MSVD) dataset, a dataset of shorter generalized video-caption pairs, to showcase its generalizability and scalability; we also use beam search and greedy search to evaluate SpecTextor. Empirical results indicate that SpecTextor achieves a BLEU score of 0.64330 and a METEOR score of 0.29768.
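For concreteness, a sketch of one soft-attention step over encoded frame features, in the additive (Bahdanau-style) form commonly paired with LSTM decoders; the dimensions and the assumption that VGG-16 features are 4096-dimensional fc-layer outputs are illustrative, and the authors' exact formulation may differ.

```python
# Illustrative soft-attention step over VGG-16 frame features (PyTorch).
# Feature/hidden dimensions and the additive scoring form are assumptions.
import torch
import torch.nn as nn

class SoftAttention(nn.Module):
    def __init__(self, feat_dim=4096, hidden_dim=512):
        super().__init__()
        self.w_feat = nn.Linear(feat_dim, hidden_dim)
        self.w_hid = nn.Linear(hidden_dim, hidden_dim)
        self.v = nn.Linear(hidden_dim, 1)

    def forward(self, frame_feats, dec_hidden):
        # frame_feats: (batch, n_frames, feat_dim)
        # dec_hidden:  (batch, hidden_dim), the LSTM decoder state
        scores = self.v(torch.tanh(self.w_feat(frame_feats)
                                   + self.w_hid(dec_hidden).unsqueeze(1)))
        alpha = torch.softmax(scores, dim=1)        # one weight per frame
        context = (alpha * frame_feats).sum(dim=1)  # weighted frame summary
        return context, alpha.squeeze(-1)
```

At each decoding step, the context vector would be concatenated with the previous word embedding and fed to the LSTM, so the decoder attends to different frames as it emits each word.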
REU student: Serena Lin
Mentors: Emon Dey, Dr. Anuradha Ravi, Dr. Nirmalya Roy
Abstract—The proliferation of smart computing devices paves the way for applying artificial intelligence in almost all sectors of digital civilization but raises issues such as personal data breaches. Preserving the privacy of data has become a crucial concern, especially in the medical domain, where sensitive information about patients and even institutions must be kept confidential. Federated learning is being investigated as a promising solution to this challenge but has drawbacks of its own. One of them is the higher communication and computation cost when the model becomes large and complex, because the model must be sent back and forth between server and clients during training. Combining model compression techniques with federated learning algorithms can be a viable approach to subdue this problem. Considering this research scope, in this work we present a resource-efficient federated learning scheme designed for medical-domain applications. We chose the Brain Tumor Segmentation (BraTS) dataset published in 2020 for our experiments because of its comprehensiveness and its suitability for our motivation. We utilize U-Net as our base deep model and develop a ternary quantized version of it to reduce computation complexity. We also present a benchmark study of our proposed model running at the client side, measuring required power, memory, and inference time as indicators of computation efficiency.
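One common way to ternarize weights is the Ternary Weight Networks scheme (Li & Liu, 2016), sketched below; whether the authors use this exact threshold-and-scale rule for their quantized U-Net is an assumption on our part.

```python
# Sketch of ternary weight quantization in the style of Ternary Weight
# Networks; the 0.7 threshold factor is that paper's heuristic, and its
# use here is an assumption, not the authors' confirmed scheme.
import torch

def ternarize(w):
    # Per-tensor threshold: small weights snap to zero.
    delta = 0.7 * w.abs().mean()
    mask = (w.abs() > delta).float()
    # Scaling factor: mean magnitude of the surviving weights.
    alpha = (w.abs() * mask).sum() / mask.sum().clamp(min=1.0)
    return alpha * torch.sign(w) * mask  # values in {-alpha, 0, +alpha}

# In a federated round, each client could ternarize its update before
# uploading, shrinking every weight to roughly two bits of information
# and cutting the client-server communication cost accordingly.
```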
REU student: Matthew Makila
Mentors: Zahid Hasan, Dr. Nirmalya Roy
Abstract—Respiratory rate (RR), measured in breaths per minute, is one of the four human vital signs. Checking the RR pattern regularly is recommended, as it provides early signs of various common cardiovascular diseases across different age groups. To facilitate a ubiquitous contactless RR monitoring system, we propose to use off-the-shelf video cameras to monitor RR instead of special pressure-based wearables. We aim to capture the breathing-induced movement of body parts (shoulder, abdomen, chest) via a regular video camera and design a robust video processing mechanism that tracks the spatiotemporal movements to infer breathing rate from video. In developing such a system, we plan to collect a large RR dataset to train and validate contactless RR methods and to develop a lightweight, robust approach to extract RR from the input video for ubiquitous contactless RR applications. Firstly, we aim to collect quality RR datasets from diverse subjects by considering realistic situations such as low RR, high RR, exercise, forced RR, natural RR, and variance in clothing, lighting conditions, backgrounds, RR-induced body components, and body posture. We will make our data open source to help validate RR methods and attract wider research. Secondly, we will create a spatiotemporal model to localize the RR-induced body parts and track their subtle temporal movement due to RR to infer the underlying breathing rate. Currently, we are exploring edge detection and edge-movement tracking by calculating volumetric changes and edge-energy shifts, which are more computationally efficient than their data-hungry deep-learning-based video processing counterparts. In the future, we also plan to develop data-efficient deep learning approaches to learn breathing rates from video automatically.
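A rough sketch of the edge-energy idea: sum the edge energy inside a chest/shoulder region frame by frame to obtain a 1-D breathing signal, then read RR off its dominant frequency. The region of interest, Canny thresholds, and frequency band below are assumptions for illustration, not the project's finalized parameters.

```python
# Sketch: edge-energy-based RR estimation from video (OpenCV + NumPy).
# ROI, Canny thresholds, and the 6-60 breaths/min band are assumptions.
import cv2
import numpy as np

def estimate_rr(video_path, fps=30, roi=(100, 300, 100, 300)):
    cap = cv2.VideoCapture(video_path)
    signal = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        y0, y1, x0, x1 = roi
        gray = cv2.cvtColor(frame[y0:y1, x0:x1], cv2.COLOR_BGR2GRAY)
        edges = cv2.Canny(gray, 50, 150)
        signal.append(edges.sum())  # edge energy shifts with chest motion
    cap.release()
    sig = np.asarray(signal, dtype=float)
    sig -= sig.mean()
    freqs = np.fft.rfftfreq(len(sig), d=1.0 / fps)
    spectrum = np.abs(np.fft.rfft(sig))
    # Keep only plausible breathing frequencies (~6-60 breaths/min).
    band = (freqs >= 0.1) & (freqs <= 1.0)
    peak = freqs[band][spectrum[band].argmax()]
    return peak * 60.0  # breaths per minute
```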
REU student: Hersch Nathan
Mentors: Md. Saeid Anwar, Dr. Anuradha Ravi, Dr. Nirmalya Roy
Abstract—During disaster recovery, it is imperative to take the assistance of robots to navigate hostile terrains. Robots can autonomously make application-oriented decisions and send data (such as images) to human personnel for decision-making. Communication in a disaster-struck environment can be challenging given the destruction, or outright absence, of communication infrastructure, and establishing satellite-based communication can be costly. The requirement for wireless networks in far-reach areas led to the inception of LoRa (Long Range) networks, which leverage Chirp Spread Spectrum (CSS) technology for long-range communication over low bandwidth. Thus, devices equipped with LoRa can communicate small chirps of data over a long range, making them power-efficient enough to sustain their battery life for a long duration. Per U.S. regulations, LoRa operates in the 902-928 MHz band with power restrictions. LoRaWAN is a WAN protocol built on top of LoRa that has typically been used to transmit small amounts of data from low-power sensor networks. In this project, we first set up a LoRaWAN network to interface and interact with UAVs and UGVs. We then analyze the performance of the LoRaWAN network under varying workloads and monitor the computation and communication power consumption of a bot while employing the LoRa network. We further explore the possibility of transmitting image data over the LoRaWAN network. We leverage the low bandwidth of LoRaWAN to send feature representations of the images (rather than raw image data) that can be processed at an edge node for object classification applications. To lay down a path for decision-making (selecting the best possible network) in a heterogeneous network environment, we compare sending images and feature representations of the raw images over WiFi via MQTT (as proposed by previous works) and over LoRaWAN. We analyze the performance (delay and power consumption) of WiFi and LoRaWAN under varying workloads.
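One way to make a feature vector fit LoRaWAN's tiny frames is to quantize it to one byte per feature and split it into payload-sized fragments, as sketched below. The 200-byte payload budget is an assumption (LoRaWAN limits vary from roughly 11 to 242 bytes with region and data rate), and the fragment header format is hypothetical.

```python
# Sketch: quantize an image's feature vector and fragment it for
# LoRaWAN uplinks. Payload budget and header layout are assumptions.
import numpy as np

def pack_features(features, max_payload=200):
    f = np.asarray(features, dtype=np.float32)
    # Min-max quantize to uint8 so each feature costs a single byte.
    lo, hi = float(f.min()), float(f.max())
    q = np.round(255 * (f - lo) / max(hi - lo, 1e-8)).astype(np.uint8)
    raw = q.tobytes()
    # Reserve 2 bytes per fragment: sequence number + fragment length.
    body = max_payload - 2
    frags = [bytes([i, len(raw[o:o + body])]) + raw[o:o + body]
             for i, o in enumerate(range(0, len(raw), body))]
    return frags, (lo, hi)  # (lo, hi) lets the edge node dequantize
```

A 4096-dimensional feature vector would need about 21 such fragments, versus thousands for a raw image, which is the bandwidth argument the project makes.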
REU student: Sarah Okome
Mentors: Indrajeet Ghosh, Dr. Nirmalya Roy
Abstract—Wearable-based technology for human activity recognition is prevalent in today’s society. Recently, deep learning algorithms have enabled us to develop scalable data-driven algorithms. Such algorithms can help transfer the knowledge of complex human motions to downstream tasks such as activity recognition and skill assessment. However, these models require labeled data, the scarcity of which is one of the key challenges for downstream tasks. This work tackles the challenge of scarce labeled samples in the activity recognition task. We propose SSAR, a self-trained semi-supervised learning (SSL) framework that can effectively discern and classify human activities by leveraging pseudo-labels. The motivation for utilizing a proxy-label method such as pseudo-labeling is to propagate labels from labeled data to predict labels for unlabeled data with minimal expert supervision. We utilize a CNN-based learning framework to learn robust representations from labeled data and transfer the learned knowledge to the unlabeled data for detecting micro-complex activities. Lastly, we evaluate the proposed framework on two publicly available datasets, Badminton Activity Recognition (BAR) and WISDM, using four evaluation metrics: F1-score, recall, precision, and accuracy.
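The self-training loop at the heart of pseudo-labeling is compact enough to sketch in full. The 0.95 confidence threshold, round count, and scikit-learn-style model interface below are assumptions; SSAR's CNN and exact schedule may differ.

```python
# Minimal self-training loop with pseudo-labels. The confidence
# threshold and fit/predict_proba interface are assumptions.
import numpy as np

def self_train(model, x_lab, y_lab, x_unlab, threshold=0.95, rounds=3):
    for _ in range(rounds):
        model.fit(x_lab, y_lab)
        if len(x_unlab) == 0:
            break
        proba = model.predict_proba(x_unlab)
        conf = proba.max(axis=1)
        keep = conf >= threshold  # only propagate confident predictions
        if not keep.any():
            break
        # Move confidently pseudo-labeled samples into the labeled pool.
        x_lab = np.concatenate([x_lab, x_unlab[keep]])
        y_lab = np.concatenate([y_lab, proba[keep].argmax(axis=1)])
        x_unlab = x_unlab[~keep]
    return model
```

The threshold is the key knob: set too low, wrong pseudo-labels compound across rounds; set too high, little unlabeled data is ever used.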
REU student: Sophia Woodson
Mentors: Maloy Kumar Devnath, Dr. Anuradha Ravi, Dr. Nirmalya Roy
Abstract—Millimeter-wave (mmWave) radar operates by transmitting short-wavelength electromagnetic waves and estimating the range, velocity, and angle of the objects that reflect them. This project investigates the use of mmWave radar to identify objects that can obstruct UGVs during navigation. The hypothesis is that a system using mmWave radar can identify and classify objects (both visible on the surface and hidden beneath the ground) that can potentially obstruct the navigation of UGVs. As mmWave radar is less sensitive to changes in environmental lighting conditions and offers a degree of permeability, it can assist UGV navigation in situations where a camera would be incapacitated, for example in low-visibility weather or environments such as foggy days or tall grass (where objects can be obscured). Additionally, mmWave radar does not collect personally identifying information, which is a concern when using RGB cameras. In this work, we perform a literature study of existing systems that use mmWave radar for object identification and classification. We further explore the various parameters and features of mmWave radar that can effectively help identify and classify objects on both indoor surfaces, such as tiles, carpets, and tables, and outdoor surfaces, such as mulch and grass. In a preliminary study, we collect range-Doppler and 3D point cloud data of objects made of different materials (plastic, metal, cardboard, and paper). The idea is to find suitable mmWave radar features for identifying static objects that are visible on or hidden beneath surfaces. At the end of the study, we present a taxonomy of mmWave radar features suitable for different applications.
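To make the feature question concrete, here is one plausible hand-crafted feature set computed from a radar point cloud; the specific features (and the per-point x, y, z, intensity layout) are assumptions chosen to illustrate how material-dependent reflectivity and spatial spread could be captured, not the study's final taxonomy.

```python
# Illustrative hand-crafted features from a mmWave 3D point cloud.
# The (x, y, z, intensity) layout and feature choices are assumptions.
import numpy as np

def point_cloud_features(points):
    # points: (n, 4) array with columns x, y, z, intensity
    xyz, intensity = points[:, :3], points[:, 3]
    return np.array([
        intensity.mean(),        # metals tend to return stronger echoes
        intensity.std(),         # surface texture affects return variance
        float(len(points)),      # number of detections on the object
        xyz.std(axis=0).mean(),  # spatial spread of the returns
    ])
```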
REU student: Vicki Young
Mentors: Jumman Hossain, Dr. Nirmalya Roy
Abstract—The COVID-19 pandemic has had a devastating effect on human health across the globe. People still observe face masking in public places to contain the spread of COVID-19, as coughing is one of the primary transmission mediums. Early cough detection that identifies fine-grained contextual information plays a significant role in preventing COVID-19 spread among close-by cohorts. Many approaches have been proposed in the literature for developing cough detection systems, but earable devices remain inadequately studied for respiratory symptom detection. In this work, we leverage eSense, an acoustic research prototype that embeds acoustic and IMU sensors into user-convenient earbuds, to answer the following two questions: (1) How feasible is it for earables to detect respiratory symptoms? and (2) How scalable are the developed models when exposed to unseen dataset samples? In our study, we experiment with traditional machine learning models and a deep learning model. We find that the deep learning model outperforms the conventional algorithms by a large margin, achieving 97.51% accuracy. We further investigate model scalability on unseen datasets and find that performance drops sharply when a model is trained on one dataset and tested on an unseen dataset. To mitigate this issue, we explore an adversarial domain adaptation technique.
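A widely used building block for adversarial domain adaptation is the gradient reversal layer from DANN (Ganin et al., 2016), sketched below; assuming this specific formulation is ours, as the abstract does not name the exact technique used.

```python
# Sketch of a gradient reversal layer (DANN-style adversarial domain
# adaptation). Using this particular formulation is an assumption.
import torch

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)  # identity in the forward pass

    @staticmethod
    def backward(ctx, grad_output):
        # Flip the gradient sign so the feature extractor learns
        # representations that fool a domain classifier, pushing the
        # source (seen) and target (unseen) feature distributions together.
        return -ctx.lam * grad_output, None

def grad_reverse(x, lam=1.0):
    return GradReverse.apply(x, lam)
```

In training, features pass through grad_reverse into a small domain classifier, while the same features feed the cough classifier; the opposing gradients encourage domain-invariant features that transfer to unseen datasets.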
REU student: Adams Ubini
Mentors: Dr. Zhiyuan Chen
Abstract—Intelligent Transportation Systems (ITS) apply various technologies such as sensing, analysis, control, and communications to ground transportation in order to improve safety, mobility, and efficiency. Examples of ITS include connected autonomous vehicles, traffic management systems, and smart parking systems. These systems are vulnerable to cyber-attacks and privacy violations (e.g., data about an individual’s location, trajectory, or other sensitive information may be leaked, especially when such data contains identity information). Sensitive data, such as the license plate number of a car, is frequently exchanged in ITS. Thus, it is necessary to restrict who has access to such sensitive information in ITS. This research proposes a situation-aware access control framework for ITS. The most commonly used access control solution in ITS is role-based access control. However, situation-aware access control is more appropriate for ITS because access control decisions often depend on dynamically changing situations. For instance, a traffic enforcement officer should only be given access to a vehicle's license plate number if the vehicle is traveling faster than the speed limit. We implemented a semantic-web-based solution to support situation-aware access control in a distributed ITS environment. We first created an ontology for ITS, including major classes such as users, vehicles, sensors, data, and data brokers. Our ontology also models some popular ITS protocols, such as MQTT, which uses a publisher/subscriber model. We also proposed a query rewriting method that can modify a query over ITS data to enforce access control rules. Finally, we conducted experiments on a few ITS use cases to show that the overhead of enforcing access control rules is acceptable.
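The essence of query rewriting for access control is conjoining a situation-derived predicate onto the user's query before it executes. The toy sketch below illustrates this with a hypothetical rule table and SQL-style syntax; the project's actual method operates over its ITS ontology, so everything here is an illustrative assumption.

```python
# Toy illustration of situation-aware query rewriting. The rule table,
# role/situation names, and SQL-style syntax are hypothetical.
def rewrite_query(query, role, situation):
    # Rules map (role, situation) to a predicate the rewritten query
    # must satisfy, e.g. officers see plates only for speeding vehicles.
    rules = {
        ("traffic_officer", "speeding"): "vehicle.speed > road.speed_limit",
    }
    predicate = rules.get((role, situation))
    if predicate is None:
        return None  # no rule grants access in the current situation
    sep = " AND " if " WHERE " in query.upper() else " WHERE "
    return query + sep + predicate

# Example: the officer's query is silently constrained by the situation.
print(rewrite_query("SELECT plate FROM vehicle", "traffic_officer", "speeding"))
```

Rewriting at the query layer means access rules are enforced uniformly without modifying the underlying data sources, which is what makes the overhead question worth measuring.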
REU student: Temitope Peters
Mentors: Avijoy Chakma, Dr. Abu Zaher Md Faridee
Abstract—The growth of networked systems and applications brings many advantages to the world but also carries risk. One of the biggest risks is the presence of vulnerabilities in software and hardware components due to an application weakness, which could be a design defect or an implementation error. This opens a window for third parties to take advantage of the weakness, attack a system’s core functionalities, and often cause irreversible damage. Because of the sheer number of reported vulnerabilities, it is often hard to process them manually or to identify potential relationships between them. To address this problem, we propose an approach that can process thousands of vulnerabilities from different reporting platforms through an artificial intelligence pipeline to associate and connect them.
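One plausible instantiation of such a pipeline is to embed each vulnerability description as a text vector and link pairs whose similarity crosses a threshold. The TF-IDF embedding and 0.6 threshold below are assumptions for illustration, not the proposed system's final design.

```python
# Sketch: associate vulnerability reports by text similarity.
# Embedding choice (TF-IDF) and threshold are assumptions.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def link_vulnerabilities(descriptions, threshold=0.6):
    # Embed each report's free-text description as a sparse vector.
    vecs = TfidfVectorizer(stop_words="english").fit_transform(descriptions)
    sim = cosine_similarity(vecs)
    # Keep every pair whose similarity clears the threshold as a
    # candidate relationship between the two vulnerabilities.
    edges = [(i, j, float(sim[i, j]))
             for i in range(len(descriptions))
             for j in range(i + 1, len(descriptions))
             if sim[i, j] >= threshold]
    return edges
```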