Overview of Research Areas and Projects

Guohong Cao, Major Projects

Deep Learning on Mobile Devices

The rapid progress of deep-learning techniques has enabled many emerging artificial intelligence applications, and there is a tremendous demand for running these applications on mobile devices. However, deep-learning models are by nature computationally intensive, making them challenging to deploy on battery-powered mobile devices. This research investigates the fundamental and challenging issues for running deep-learning applications on mobile devices by leveraging Neural Processing Units (NPUs). An NPU is a microprocessor that specializes in the acceleration of deep-learning algorithms; however, it incurs accuracy loss, and it is a challenge to address this problem. The goal of this research is to support deep learning on mobile devices with NPUs by addressing the following issues: i) investigating model-partitioning techniques to decompose the deep-learning model into different layers running on heterogeneous processors to minimize processing time or maximize accuracy based on the application requirements; ii) designing energy- and thermal-aware techniques to address the performance limitations of current mobile systems; iii) exploring the collaborative intelligence between the edge server and the NPU-based mobile device to optimize performance by investigating how and where to run the computation.
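As an illustration of the model-partitioning idea in issue (i), the sketch below searches for a single split point that minimizes latency under an accuracy-loss budget. It is a toy model, not the project's method: the function name best_split, the per-layer latency and loss figures, and the assumption that losses add up across layers are all hypothetical.

```python
from typing import List, Optional, Tuple


def best_split(npu_ms: List[float], cpu_ms: List[float],
               npu_loss: List[float], switch_ms: float,
               max_loss: float) -> Optional[Tuple[int, float, float]]:
    """Return (k, latency_ms, accuracy_loss) for the fastest feasible split.

    Layers [0, k) run on the NPU and layers [k, n) run on the CPU.
    Assumption: the accuracy loss of NPU layers is additive.
    """
    n = len(npu_ms)
    best = None
    for k in range(n + 1):
        loss = sum(npu_loss[:k])
        if loss > max_loss:
            continue  # this split violates the accuracy budget
        latency = sum(npu_ms[:k]) + sum(cpu_ms[k:])
        if 0 < k < n:
            latency += switch_ms  # one hand-off between processors
        if best is None or latency < best[1]:
            best = (k, latency, loss)
    return best


# Hypothetical 4-layer model: the NPU is faster but loses accuracy.
npu = [1.0, 1.0, 1.0, 1.0]
cpu = [4.0, 4.0, 4.0, 4.0]
loss = [0.5, 0.5, 1.0, 2.0]
print(best_split(npu, cpu, loss, switch_ms=0.5, max_loss=2.0))  # (3, 7.5, 2.0)
```

With a budget of 2.0, the last layer stays on the CPU because moving it to the NPU would exceed the allowed accuracy loss. A real system would measure these quantities rather than assume them, and may allow more than one switch between processors.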

Energy-Aware and QoE-Aware Video Streaming on Mobile Devices

Video streaming on mobile devices has seen tremendous growth recently, and is a major marketing point of many wireless carriers. Since videos have very large data sizes, mobile devices must spend a great deal of their battery power on downloading and processing the video over the wireless network. Existing research on video streaming over the Internet does not consider the new research challenges related to energy efficiency and the special characteristics of mobile devices and wireless networks. This project addresses these challenges by developing novel techniques to save energy while maintaining users’ Quality of Experience (QoE) during video streaming. To support energy-aware and QoE-aware video streaming on mobile devices, this research addresses the challenges of how, when, and how much data to download under different contexts to save energy and improve users’ QoE through the following thrusts: (i) understanding how the processor frequency affects network throughput and energy consumption, and how to adjust the frequency to minimize energy or maximize QoE considering user context; (ii) developing protocols for buffer management, considering that users skip scenes and abort watching under some circumstances; and (iii) reducing the energy consumption of video streaming through network-quality-aware downloading and context-aware downloading.

Mahanth Gowda, Major Projects

mmSpy: Spying Phone Calls using mmWave Radars

This project presents mmSpy, a system that shows the feasibility of eavesdropping on phone calls remotely. Toward this end, mmSpy senses earpiece vibrations using an off-the-shelf radar device that operates in the mmWave spectrum (77 GHz and 60 GHz). Given that mmWave radars are becoming popular in autonomous driving, remote sensing, and other IoT applications, we believe this is a critical privacy concern. In contrast to prior works that show the feasibility of detecting loudspeaker vibrations with larger amplitudes, mmSpy exploits the smaller wavelengths of mmWave radar signals to detect subtle vibrations in the earpiece devices used in phone calls. In designing this attack, mmSpy solves a number of challenges related to the non-availability of large-scale radar datasets, systematic correction of various sources of noise, and domain adaptation problems in harvesting training data. Extensive measurement-based validation achieves an end-to-end accuracy of 83-44% in classifying digits and keywords over a range of 1-6 ft, thereby compromising privacy in applications such as the exchange of credit card information. In addition, mmSpy shows the feasibility of reconstructing the audio signals from the radar data, through which more sensitive information can potentially be leaked.

I Spy You: Eavesdropping Continuous Speech on Smartphones via Motion Sensors

This project presents iSpyU, a system that shows the feasibility of recognizing natural speech content played on a phone during conference calls (Skype, Zoom, etc.) using a fusion of motion sensors such as the accelerometer and gyroscope. While microphones require permission from the user to be accessible by an app developer, the motion sensors are zero-permission sensors and are thus accessible by a developer without alerting the user. This allows a malicious app to potentially eavesdrop on sensitive speech content played by the user’s phone. In designing the attack, iSpyU tackles a number of technical challenges, including: (i) the low sampling rate of motion sensors (500 Hz in comparison to 44 kHz for a microphone); and (ii) the lack of large-scale training datasets to train models for Automatic Speech Recognition (ASR) with motion sensors. iSpyU systematically addresses these challenges through a combination of techniques in synthetic training data generation, ASR modeling, and domain adaptation. Extensive measurement studies on modern smartphones show a word-level accuracy of 53.3-59.9% over a dictionary of 2000-10000 words, and a character-level accuracy of 70.0-74.8%. We believe such levels of accuracy pose a significant threat when viewed from a privacy perspective.

Thomas La Porta, Major Projects

Collaborative Technology Alliance

As part of the Network Sciences CTA program, we are performing research on Quality of Information (QoI) Aware networking. We have defined several contextual and intrinsic attributes of information quality, such as accuracy, precision, timeliness and freshness. The desired attributes of these values may be specified to information sources by either a vector or a function. These information attributes are then mapped to data quality attributes so that an information source may be selected and proper controls instantiated in the network. Our results show that the network resources required vary non-linearly with the requested QoI. That is, by slightly reducing QoI requirements, far more pieces of information may be retrieved over a network. We have developed models that allow us to accurately characterize network scalability vs. QoI requirements. We have also developed distributed processing algorithms to perform QoI-sensitive video analytics. We are also currently working on mapping queries made by humans into quality-sensitive information requests given the imprecision of the requests.

Defense Threat Reduction Agency

As part of our DTRA project we are modeling the spread of failures across multiple interconnected networks, and developing recovery algorithms for massive failures with partial information. For the spread of failures, we developed a generic model of phenomena spreading for several different network structures and interconnect architectures. We have characterized which architectures and nodes speed or slow the spread of phenomena. For the recovery work, we have developed algorithms to provide minimum cost repairs to provide a baseline level of service to mission critical tasks. This work is being extended to the case where the full extent of the failure scenario is not known.

Bin Li, Major Projects

Wireless Collaborative Mixed Reality Networking

Wireless collaborative mixed reality (WCMR) provides an interactive and immersive experience for a group of people who can move freely in an open space and will potentially revolutionize existing collaborative mission-critical training, such as firefighter drills and disaster response training. In order to provide the best immersive experience, WCMR differs drastically from traditional wireless applications in that it demands not only coordinated information to be transferred to collaborating agents in real time but also fast computations in mobile mixed reality devices. Hence, WCMR requires fundamentally different designs than existing approaches that mainly focus on communication demands and predominantly assume that they are independently generated at different network agents. This project aims to develop joint communication, computation, and learning algorithms that explicitly exploit unique characteristics of WCMR and support emerging WCMR applications.

Mobile Edge/Cloud Computing

Today’s mobile devices are not merely smart; they are becoming intelligent as artificial intelligence applications are being pushed into mobile devices and as mobile devices are being integrated into the cloud-fog-mobile architecture. This calls for efficient and adaptive computing/communication co-design of wireless networks to optimize application-level latency (including both communication latency and computing times) and to achieve energy efficiency (considering energy consumed by both communications and computing). This project develops fundamental theories and novel architectures of low-latency, energy-efficient, and computing-centric wireless networks to support emerging mobile intelligence applications.

Ting He, Major Projects

Inference and Control in Overlay Networks

An overlay network is a layered network, where the overlay layer makes use of the underlay layer in order to deliver data to its destinations. Overlay networks arise in many application scenarios such as content distribution networks that are used to distribute content over the Internet (e.g., movies, music, etc.), and are often used to deploy new technology that is incompatible with legacy devices and protocols. A key challenge in such systems is that the overlay cannot observe the inner workings of the underlay, making it difficult to use the underlay efficiently. This project develops mechanisms to learn the network topology and congestion of the underlay, and network algorithms for routing messages efficiently across overlay networks. Existing overlay systems mostly rely on simple models of the underlay network, and may fail to achieve good performance when the underlay nodes cannot be fully observed and controlled. This project aims at developing fundamental limits and practical algorithms for monitoring and controlling partially observable/controllable overlay-underlay networks. This is accomplished through two interdependent thrusts: Thrust 1 develops techniques that utilize measurements and side information observable to the overlay in order to infer the underlay network structure and state, and Thrust 2 develops algorithms that utilize the inferred information to control the operation of overlay nodes so as to optimize the performance for overlay services.

Adversarial Network Reconnaissance in Software Defined Networking

As a new networking paradigm, software-defined networking (SDN) introduces both the opportunities of easier network management and more flexible policy deployment, and the challenges of new attack surfaces. This project investigates such new attack surfaces from the perspective of adversarial reconnaissance, which is a family of techniques that allow insider and outsider attackers to use the network behavior and control-plane messaging to infer the structure, configuration, and vulnerabilities of the target SDN. To secure future networks against such attackers, this project proposes to develop a systematic understanding of the techniques, capabilities, fundamental limits, and countermeasures of adversarial reconnaissance in SDNs. The project investigates two correlated questions: (1) What information can be learned by an adversary? (2) What attacks can be launched based on this information? Two parallel thrusts are carried out to address these questions, one focusing on an internal adversary (compromised switch), and the other focusing on an external adversary (compromised host). Examples studied include flow table reconnaissance by a host-based adversary and load balancer reconnaissance by a switch-based adversary, with many more open questions to be explored.

Hong Hu, Major Projects

Automatic Identification of Privilege-guard Variables for Data-only Attacks and Defenses

As cyber attackers are always exploring novel, low-cost hacking vectors to bypass current defenses, security researchers should examine the remaining threats comprehensively in order to develop effective defenses in advance. Within program memory, attackers are shifting their attention from control hijacking to stealthier, pure data manipulation: they aim to modify security-critical variables to bypass security checks, such as authentication and authorization. Researchers must understand which variables determine application security before developing efficient defenses to prevent so-called data-only attacks.

This project proposes three thrusts to comprehensively understand the practicality of automatically constructing data-only attacks. First, Thrust 1 includes a set of novel techniques aiming to automatically identify security-critical, non-control data from general-purpose programs. Thrust 1 will focus on conditional branches that prevent untrusted users from accessing high-privilege resources. The result will help defenders understand whether security-critical variables can be identified automatically. Second, Thrust 2 will develop solutions to measure the challenges of constructing concrete data-only attacks. The goal is to estimate the upper-bound cost of building attacks. The results of this thrust will help understand the practicality of this new threat. Third, Thrust 3 will build a benchmark of data-only attacks to offer a unified platform for testing future data-only attacks and defenses. This project will produce a set o

Enhancing Practical Defense Mechanisms against Memory Errors and Attacks

Given the constant threat of software vulnerabilities and malicious attacks, the computer security community is always working on improving defense mechanisms for real-world use. At the same time, they also need to make sure these defenses can stand up to determined attackers who are ready to exploit any weaknesses. This project aims to evaluate these practical defense mechanisms, spotting and fixing potential issues before attackers can cause major problems. The results will push for more automated defense improvement, affecting many research areas. The project addresses high-profile memory errors. The educational part of the project focuses on teaching students how to design, evaluate, and improve memory protection techniques, starting from K-12 and beyond, to spark interest in computer science and security.

To achieve these goals, the project will investigate a set of systematic approaches to evaluate practical defense mechanisms to understand their strengths and help enhance their robustness. First, the project will identify configuration variables that determine the strength of defenses, and measure the feasibility for attackers to manipulate these bytes to undermine protections. Second, defense-debloating techniques will identify radical optimizations that remove essential security checks and bring old threats back into hardened programs. Third, the project will focus on detecting and preventing resource-exhaustion attacks introduced by out-of-band scrutiny that requires extra software or hardware resources. Finally, a new technique, squeezing analysis, will be used to achieve strong and fast protection. These analyses and enhancements will be applied to diverse defenses such as control-flow integrity (CFI) and reassembly

Arslan Khan, Major Projects

Pieces

Pieces is a highly programmable, language-agnostic framework for automatic program compartmentalization. Pieces can be programmed to partition programs based on various criteria, including methods of isolation between compartments.

Kiwan Maeng, Major Projects

Compiler for MPC-based secure training/inference

Multi-party computing (MPC) for machine learning (ML) allows offloading ML workloads (training and inference) to untrusted parties without sharing one’s secrets. However, existing MPC-based ML systems often introduce MPC-specific accuracy bugs that are hard to debug and significantly degrade the latency/throughput. This project takes a compiler-centric approach to tackle both problems. We build a multi-stage compiler for MPC-based ML and co-optimize performance and accuracy at different stages through MPC-specific compiler optimizations and approximations.
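To make the idea of computing without sharing secrets concrete, the sketch below shows additive secret sharing over a ring with fixed-point encoding, a building block commonly used in MPC. It is a generic illustration, not the project's compiler or protocol: MODULUS, FRAC_BITS, and all function names are assumptions chosen for this example. It also shows one way numerical error can enter, since real values are rounded to a fixed number of fractional bits.

```python
import random

MODULUS = 2 ** 64   # ring size for shares (assumed for this example)
FRAC_BITS = 16      # fixed-point fractional bits (assumed)


def encode(x: float) -> int:
    """Map a real number to a fixed-point ring element."""
    return round(x * (1 << FRAC_BITS)) % MODULUS


def decode(v: int) -> float:
    """Map a ring element back to a real number (upper half is negative)."""
    if v >= MODULUS // 2:
        v -= MODULUS
    return v / (1 << FRAC_BITS)


def share(secret: int, parties: int, rng: random.Random) -> list:
    """Split a ring element into random shares that sum to it."""
    shares = [rng.randrange(MODULUS) for _ in range(parties - 1)]
    shares.append((secret - sum(shares)) % MODULUS)
    return shares


def reconstruct(shares: list) -> int:
    return sum(shares) % MODULUS


def add_shared(a: list, b: list) -> list:
    """Each party adds its own two shares locally; no secret is revealed."""
    return [(x + y) % MODULUS for x, y in zip(a, b)]


rng = random.Random(0)  # illustration only; real MPC needs a secure RNG
x_shares = share(encode(1.5), 3, rng)
y_shares = share(encode(-0.25), 3, rng)
print(decode(reconstruct(add_shared(x_shares, y_shares))))  # 1.25
print(decode(encode(0.1)) == 0.1)  # False: fixed-point rounding error
```

Addition is exact here, but values such as 0.1 are already perturbed by encoding. Multiplication and nonlinear functions need extra protocol steps that this sketch leaves out.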

System support for differentially-private training

Machine learning models memorize potentially sensitive training data. Differentially-private training (e.g., DP-SGD or DP-FTRL) is becoming the new standard in industrial use cases to prevent such memorization. However, differentially-private training algorithms have system characteristics different from non-private training and usually incur more compute/memory overheads. We co-design differentially-private training algorithms and the system hardware/software to make differentially-private training more efficient and scalable.
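The extra cost of differentially-private training is easier to see in code. The sketch below is a minimal DP-SGD update in plain Python: each example's gradient is clipped to a maximum norm, the clipped gradients are summed, Gaussian noise is added, and the result is averaged. The names dp_sgd_step, max_norm, and noise_multiplier are illustrative assumptions, and privacy accounting is omitted. The per-example loop is what distinguishes this from ordinary SGD, which only needs the batch-averaged gradient.

```python
import math
import random
from typing import List


def clip(grad: List[float], max_norm: float) -> List[float]:
    """Scale a gradient down so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def dp_sgd_step(weights: List[float],
                per_example_grads: List[List[float]],
                lr: float, max_norm: float,
                noise_multiplier: float,
                rng: random.Random) -> List[float]:
    """One DP-SGD update: clip per example, sum, add noise, average."""
    summed = [0.0] * len(weights)
    for grad in per_example_grads:  # one gradient per training example
        for i, g in enumerate(clip(grad, max_norm)):
            summed[i] += g
    sigma = noise_multiplier * max_norm  # noise scales with the clip norm
    batch = len(per_example_grads)
    noisy_mean = [(s + rng.gauss(0.0, sigma)) / batch for s in summed]
    return [w - lr * g for w, g in zip(weights, noisy_mean)]


# Hypothetical two-example batch with a two-parameter model.
rng = random.Random(0)
grads = [[3.0, 4.0], [0.6, 0.8]]
print(dp_sgd_step([0.0, 0.0], grads, lr=0.1, max_norm=1.0,
                  noise_multiplier=1.0, rng=rng))
```

Storing and clipping one gradient per example is a main reason memory and compute grow relative to non-private training in this simple formulation.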

Rômulo Meira-Góes, Major Projects

Cyber Security in Controlled Cyber-Physical Systems

Cyber-Physical Systems (CPS) provide the foundation for our critical infrastructure systems, such as energy, transportation, and manufacturing, to name a few. Although CPS are already ubiquitous in our society, their security aspects were only recently incorporated into their design process, mainly in response to catastrophic incidents caused by cyber-attacks. One common class of attacks on CPS is called deception attacks, which involve an attacker hijacking the CPS sensors and actuators. We focus on (1) How can we model and analyze CPS under deception attacks? (2) Can we automatically find and address vulnerabilities of a given CPS to deception attacks? This project aims to provide a general control framework to help engineers detect and address deception attack vulnerabilities in CPS.

Robustness of Dynamical Systems

A safety verification task involves verifying a system against a desired safety property under certain assumptions about the environment. However, these environmental assumptions may occasionally be violated due to modeling errors or faults. Ideally, the system guarantees its critical properties even under some of these violations, i.e., the system is robust against environmental deviations. In this project, we aim to define a notion of robustness as an explicit, first-class property of a transition system that captures how robust it is against possible deviations in the environment. Being able to explicitly reason about robustness enables new types of system analysis and design tasks beyond the common verification problem stated above. We aim to apply our framework to case studies involving cyber-physical systems, e.g., medical devices and mobile robots, and network protocol systems, e.g., electronic voting machines and fare collection protocols.

Syed Rafiul Hussain, Major Projects

Formal Analysis of System Design

Securing a complex system/network is a non-trivial task and requires analyzing the design specifications/standards because vulnerabilities in the design are likely to trickle down to implementations/deployments. One key observation is that a systematic analysis of a system’s specification, albeit the basic building block, is currently missing and not streamlined from the ground up. In this project, we design and implement frameworks for systematically investigating the design specifications of different systems/protocols in the context of security (e.g., secrecy and authenticity) and user privacy (e.g., observational equivalence and side-channel analysis). For this project, we develop new formal security analysis techniques based on model checking, cryptographic verifiers, and natural language processing.

Security Analysis of System/Software Implementations

Implementations of complex systems/software often deviate from the design because of specification ambiguities, missing security and privacy requirements, unsafe practices, and oversights stemming from inadequate input sanitization and simplification/optimization of complex protocol interactions. It is, therefore, pivotal to verify and monitor if protocol/system implementations faithfully adhere to the design specifications and the security and privacy requirements. As part of this project, we develop novel techniques and tools based on program analysis, automata learning, reverse engineering, and fuzzing.

Shagufta Mehnaz, Major Projects

Privacy Auditing of Machine Learning Models

While a variety of companies are streamlining their business processes by adopting machine learning technologies and leveraging commercial ML-as-a-service APIs, it is equally important to understand if these technologies are introducing attack vectors against the privacy of the data on which the models were trained. This is because, in many cases, the datasets used for training such models are proprietary and confidential. This project addresses and evaluates the vulnerabilities of model inversion attacks that turn the one-way journey from training data to model into a two-way one. This project also aims to develop defenses against such attacks by identifying and mitigating the root causes.

Security and Privacy Analysis of Federated Learning

Federated learning (FL) is revolutionizing how we learn from data. With its growing popularity, it is now being used in many safety-critical domains such as autonomous vehicles and healthcare. Since thousands of participants can contribute in this collaborative setting, it is, however, challenging to ensure the security and reliability of such systems. This highlights the need to design FL systems that are secure and robust against malicious participants’ actions while also ensuring high utility, privacy of local data, and efficiency. This project aims to investigate these security and privacy vulnerabilities and eventually develop a secure, privacy-preserving, and robust FL framework.

G. Gary Tan, Major Projects

Binary-Level Reverse Engineering

In this project, we focus on binary-level reverse engineering. Before we are able to perform analysis and transformation on a piece of binary, we must reverse engineer it to get its basic information, including its instructions, its control-flow graph, and basic dataflow information. Previous reverse-engineering techniques are often ad hoc and do not have a formal basis. There is also no evaluation of which reverse-engineering algorithms are best in terms of precision and performance. We plan to construct a reverse-engineering tool that makes it easy to explore the design space of reverse-engineering algorithms in a principled way.

Privilege separation via program analysis

This project aims to provide program-analysis tools that help developers partition an application, flexibly configure an application’s security architecture, and reason about its system security.

Parser assurance

The security of many software systems critically depends on the correctness of their parsers that parse potentially malicious input files. However, the software community lacks methodologies for constructing high-assurance parsers. This project studies general methodologies for increasing parser assurance by adopting formal verification and fuzzing.

Jing Yang, Major Projects

Reinforcement Learning (RL) for the Optimization of Wireless Networks

Optimally configuring a wireless network to fit its specific deployment environment has long been a difficult task. On one hand, wireless technology advances have provided more flexibility to the operators, as more “knobs” are enabled in the operation and maintenance (OAM) interface so that the operators have deeper control of their networks. On the other hand, the increased flexibility has not translated to improved network optimization, as the complexity of cellular networks has grown significantly over the years thanks to new bands, radios, deployment, and use cases. Optimal configurations are generally impossible to determine by conventional methodologies, due to the large parameter space, coupling behaviors, and non-convexity of the problems.

In recent years, there has been growing interest in applying RL methods to adaptively configure the wireless network to match the deployment environment on the fly. Despite the promising early results and the philosophical match, the majority of existing approaches do not develop and tailor the RL methods to fit the unique characteristics of wireless networks. In particular, one prominent feature of wireless research is that past success has proved the value of physical-law-based modeling. This cannot be easily captured by the current model-free (which has no modeling) or model-based (which is in the representation space) RL design. The goal of this project is to develop a novel domain-knowledge-enriched RL framework for wireless network optimization that seamlessly integrates physical-law-based wireless network modeling into the RL process.

Minghui Zhu, Major Projects

Privacy-preserving cyber-physical systems

Advanced information and communications technologies (ICT) are increasingly permeating our world. The technological advances are stimulating the rapid emergence of new-generation large-scale cyber-physical systems (CPS), including the smart grid, smart buildings, intelligent transportation systems, medical device networks and mobile robotic networks. A CPS consists of a large number of geographically dispersed entities, and thus distributed data sharing is necessary to achieve network-wide goals. However, distributed data sharing also raises the significant concern that the private or confidential information of legitimate entities could be leaked to unauthorized entities. Privacy has become an issue of high priority to address before certain CPS can be widely deployed. Existing techniques to protect the data privacy of ICT systems are not sufficient to ensure CPS privacy. This project aims to develop new control-theoretic schemes to assure the successful completion of control tasks for large-scale CPS and simultaneously preserve the privacy of legitimate entities. The outcomes of this project will provide engineering guidelines to build trustworthy CPS in adversarial operating environments.

Distributed machine learning

Mobile robotic networks (e.g., fleets of unmanned aerial vehicles) offer expanded capabilities for recognized military uses as well as a wide variety of civilian uses. There are several factors that contribute to their increasing potential and importance. In particular, technological advances have enabled smaller platforms with increased sensing, communication, and processing capabilities. In addition, autonomous operations offer several competitive advantages such as persistent surveillance that exceeds human fatigue limitations or remote operation capabilities without the logistical transport costs for assets and personnel. Distributed control becomes key to fully realizing the potential of mobile robotic networks. Current distributed control paradigms are mainly model-based and inadequate to handle significant uncertainties, including (1) environmental uncertainties, i.e., unforeseeable elements in unstructured environments where mobile robots operate; and (2) dynamic uncertainties, i.e., inaccuracies of the physical dynamics of mobile robots. To bridge the gaps, this project aims to leverage reinforcement learning, an area of machine learning, and game theory, initially developed in economics, to develop a new data-driven (more specifically, model-free) distributed control framework. The developed framework is fully distributed and autonomous, and its performance is rigorously provable. The framework will significantly improve the autonomy of mobile robots when they face significant environmental and dynamic uncertainties, especially in long-term missions.

Sencun Zhu, Major Projects

Federated Learning Security and Privacy

Federated Learning (FL) is designed to protect the data privacy of each client during the training process by transmitting only models instead of the original data. However, the trained model may memorize certain information about the training data. With the recent legislation on the right to be forgotten, it is essential for the FL model to be able to forget what it has learned from each client. On the other hand, the privacy-preserving nature of FL opens the door to backdoor attacks by malicious clients who secretly train local neural network models with certain patterns in the training data that will later cause the global model to misclassify any data carrying such patterns. In our research, we aim at pruning the backdoored models or removing the injected backdoors from trained models, as well as designing privacy-preserving machine unlearning algorithms to support the right to be forgotten.
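The statement that FL transmits only models can be illustrated with federated averaging, a widely used aggregation rule. The sketch below is generic and not this project's algorithm: the function name fed_avg and the example numbers are assumptions. The server sees only each client's weight vector and sample count, never the raw data. The same property means it cannot inspect how a client produced its update, which is what a backdoor attacker relies on.

```python
from typing import List


def fed_avg(client_weights: List[List[float]],
            client_sizes: List[int]) -> List[float]:
    """Average client models, weighted by each client's sample count."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


# Two hypothetical clients; the second holds three times as much data.
print(fed_avg([[1.0, 2.0], [3.0, 4.0]], [1, 3]))  # [2.5, 3.5]
```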

Software plagiarism detection

For both binary code and Android apps, leveraging program logic and user interface.

Fully automated mechanisms

Designed to analyze information flow-level privacy leakage based on app functionality, and to generate and enforce privacy policies.

Analyze app review data

Used for detecting fake reviews and hired reviewers.