Electrical Engineering and Computer Science
Browsing Electrical Engineering and Computer Science by Title
Now showing 1 - 20 of 33
Item Open Access: A Cloud-Based Extensible Avatar For Human Robot Interaction (2019-07-02)
AlTarawneh, Enas Khaled Ahm; Jenkin, Michael

Adding an interactive avatar to a human-robot interface requires the development of tools that animate the avatar so as to simulate an intelligent conversation partner. Here we describe a toolkit that supports interactive avatar modeling for human-computer interaction. The toolkit utilizes cloud-based speech-to-text software that provides active listening, a cloud-based AI to generate appropriate textual responses to user queries, and a cloud-based text-to-speech engine to generate utterances for this text. This output is combined with a cloud-based 3D avatar animation synchronized to the spoken response. Generated text responses are embedded within an XML structure that allows the nature of the avatar animation to be tuned to simulate different emotional states. An expression package controls the avatar's facial expressions. The rendering latency that this introduces is obscured through parallel processing and an idle-loop process that animates the avatar between utterances. The efficiency of the approach is validated through a formal user study.

Item Open Access: A Multi-Mode Stacked-Switch Inverter/Rectifier Leg for Bidirectional Power Converters (2022-08-08)
Emamalipour Shalkouhi, Reza; Lam, John Chi Wo

The development of renewable energy systems (e.g., wind and solar) is important for coping with the energy crisis, yet, at the same time, their volatile characteristics make their MW-scale integration challenging for the grid. Battery energy storage systems are essential to providing sustainable power and effectively improving overall system reliability as renewable energy conversion systems are deployed at large scale. Bidirectional power converters are responsible for transferring power between the battery energy storage system and the grid.
Selecting an efficient and cost-effective power topology along with a reliable control system is critical to ensure that the energy storage system operates safely with a prolonged service life and minimized maintenance cost. In this dissertation, a multi-mode stacked-switch leg with soft-switching capability for use in bidirectional DC/DC converters is proposed for battery energy storage applications. The dissertation consists of three parts. The first part focuses on the development of a bidirectional soft-switched converter utilizing a CLLC resonant circuit and the proposed multi-mode switching legs. The presented leg facilitates multiple operating modes to enable high voltage gain under different operating conditions and allows the converter to operate with a much lower output voltage ripple (50% lower) compared with the conventional stacked-switch-based converter topology. In the second part of this thesis, a fault-tolerant control scheme is proposed that enables seamless post-fault operation of the presented multi-mode DC/DC converter if any switch in the presented leg experiences an open-circuit fault. In the third part of this thesis, a comprehensive hybrid control system is proposed so that the overall voltage gain range of the converter is widely extended with a narrow switching frequency range (less than 10% of the base frequency), while at the same time the efficiency of the converter is improved over the whole gain range (by more than 1%). The operating principles and characteristics of the proposed converter and the proposed control schemes are explained in detail in this thesis.
The performance of each of the presented circuit and control concepts is verified through simulation as well as experimental results on silicon-carbide (SiC)-based proof-of-concept hardware prototypes.

Item Open Access: A Unified Multiscale Encoder-Decoder Transformer for Video Segmentation (2024-07-18)
Karim, Rezaul; Wildes, Richard P.

This dissertation presents an end-to-end trainable, unified multiscale encoder-decoder transformer for dense video estimation, with a focus on segmentation. We investigate this direction by exploring unified multiscale processing throughout the pipeline of feature encoding, context encoding, and object decoding in an encoder-decoder model. Correspondingly, we present a Multiscale Encoder-Decoder Video Transformer (MED-VT) that uses multiscale representation throughout and employs an optional input beyond video (e.g., audio), when available, for multimodal processing (MED-VT++). Multiscale representation at both encoder and decoder yields three key benefits: (i) implicit extraction of spatiotemporal features at different levels of abstraction for capturing dynamics without reliance on additional preprocessing, such as computing object proposals or optical flow; (ii) temporal consistency at encoding; and (iii) coarse-to-fine detection for high-level (e.g., object) semantics to guide precise localization at decoding. Moreover, we explore temporal consistency through a transductive learning scheme that exploits many-to-many label propagation across time. To demonstrate the applicability of the approach, we provide empirical evaluation of MED-VT/MED-VT++ on three unimodal video segmentation tasks (Automatic Video Object Segmentation (AVOS), actor-action segmentation, and Video Semantic Segmentation (VSS)) and a multimodal task (Audio-Visual Segmentation (AVS)).
Results show that the proposed architecture outperforms alternative state-of-the-art approaches on multiple benchmarks using only video (and optional audio) as input, without reliance on additional preprocessing such as object proposals or optical flow. We also document details of the model's internal learned representations by presenting a detailed interpretability study, encompassing both quantitative and qualitative analyses.

Item Open Access: Biologically-inspired Neural Networks for Shape and Color Representation (2022-03-03)
Mehrani, Paria; Tsotsos, John K.

The goal of human-level performance in artificial vision systems is yet to be achieved. Toward this goal, a reasonable choice is to simulate the biological system with computational models that mimic its visual processing. A complication with this approach is that the human brain, and similarly its visual system, is not fully understood. On the bright side, remarkable findings in the field of visual neuroscience have answered many questions about visual processing in the primate brain over the past few decades. Nonetheless, a lag in incorporating these new discoveries into biologically-inspired systems is evident. The present work introduces novel biologically-inspired models that incorporate new findings on shape and color processing into analytically-defined neural networks. In contrast to most current methods, which attempt to learn all aspects of behavior from data, here we propose to bootstrap such learning by building upon existing knowledge rather than learning from scratch. Put simply, the processing networks are defined analytically where current neural understanding is available and learned where such knowledge is not. This is thus a hybrid strategy that aims to combine the best of both worlds.
Experiments on the artificial neurons in the proposed networks demonstrate that these neurons mimic the studied behavior of biological cells, suggesting a path forward for incorporating analytically-defined artificial neural networks into computer vision systems.

Item Open Access: Bridging Data Management and Machine Learning: Case Studies on Index, Query Optimization, and Data Acquisition (2022-12-14)
Li, Yifan; Yu, Xiaohui

Data management tasks and techniques can be observed in a variety of real-world scenarios, including web search, business analysis, traffic scheduling, and advertising, to name a few. While data management as a research area has been studied for decades, recent breakthroughs in Machine Learning (ML) provide new perspectives for defining and tackling problems in the area; at the same time, the wisdom embedded in data management techniques also greatly helps to accelerate the advancement of Machine Learning. In this work, we focus on the intersection of data management and Machine Learning and study several important, interesting, and challenging problems. More specifically, our work concentrates on the following three topics: (1) leveraging the ability of ML models to capture data distributions in order to design lightweight, data-adaptive indexes and search algorithms that accelerate similarity search over large-scale data; (2) designing robust and trustworthy approaches to improve the reliability of both conventional and learned query optimizers and boost the performance of the DBMS; (3) developing data management techniques with statistical guarantees to acquire the most useful training data for ML models under a budget limitation, striving to maximize the accuracy of the model.
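As an illustrative aside, topic (1) above, using a model of the data distribution as an index, can be sketched in miniature: a hypothetical learned index fits a simple linear model of key-to-position over sorted keys (a crude CDF approximation) and then searches only a small, error-bounded window. This is a generic sketch of the learned-index idea, not code from the thesis.

```python
import bisect

def build_learned_index(sorted_keys):
    """Fit position ~ a*key + b over sorted keys and record the maximum
    prediction error, which bounds the search window at lookup time."""
    n = len(sorted_keys)
    mean_x = sum(sorted_keys) / n
    mean_y = (n - 1) / 2.0
    cov = sum((x - mean_x) * (y - mean_y) for y, x in enumerate(sorted_keys))
    var = sum((x - mean_x) ** 2 for x in sorted_keys) or 1.0
    a = cov / var
    b = mean_y - a * mean_x
    err = max(abs((a * k + b) - i) for i, k in enumerate(sorted_keys))
    return a, b, int(err) + 1

def learned_lookup(sorted_keys, model, key):
    """Predict a position, then binary-search only the error-bounded window."""
    a, b, err = model
    pos = int(a * key + b)
    lo = max(0, pos - err)
    hi = min(len(sorted_keys), pos + err + 1)
    i = lo + bisect.bisect_left(sorted_keys[lo:hi], key)
    return i if i < len(sorted_keys) and sorted_keys[i] == key else -1

keys = sorted(x * x for x in range(1, 1000))  # deliberately non-uniform keys
model = build_learned_index(keys)
```

The point of the sketch is that the model replaces most of the binary search: the better the model captures the key distribution, the smaller the fallback window.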
We conduct detailed theoretical and empirical studies for each topic, formalizing these fundamental problems as well as developing efficient and effective approaches for the tasks.

Item Open Access: Closed-Loop Highly-Scalable Retinal Implant with Fully-Analog ED-Based Adaptive-Threshold Spike Detection and Poisson-Coded Temporally-Distributed Optogenetic Stimulation (2024-07-18)
Yousefi, Tayebeh; Kassiri, Hossein

Intraocular stimulators show promise for treating retinal degeneration by restoring visual input to the damaged retina. This is achieved by capturing images with a wearable camera and stimulating the remaining retinal cells accordingly, effectively bypassing dysfunctional photoreceptors. State-of-the-art retinal stimulators face a major challenge: electrical stimulation lacks cell-type specificity (activating both ON and OFF pathways in the retina), leading to limited visual perception because contradictory messages are sent to the brain. This fundamental limit motivated us to investigate an optogenetic retinal prosthesis that uses promoter opsins for selective activation of ON bipolar cells, offering more natural vision restoration. In developing such a device, the first challenge we faced was optimizing the stimulation strategy for optimal therapeutic efficacy. Responding to this challenge, we first present a retina-inspired computational framework to evaluate and optimize an optogenetic epi-retinal neurostimulator. This framework reveals that optical stimulation, compared to electrical stimulation, provides superior visual perception, which improves with increased μLED array resolution. The framework also explores optical stimulation factors and μLED specifications such as light intensity, wavelength, spatial resolution, and light divergence. A critical issue in optogenetics is controlling opsin distribution, as uneven distribution affects light sensitivity across the retina.
Variations in tissue properties and fluid dynamics introduce unpredictability in stimulation effectiveness. Our solution to this issue is a scalable optogenetic stimulator IC featuring channel-specific closed-loop calibration that defines the optimal stimulation intensity using a temporally adaptive-threshold spike detection circuit. The second challenge we addressed was scalability and, by association, the energy efficiency of the device. Scaling implantable stimulators is limited by instantaneous power demands during multi-channel stimulation. We address this by exploiting opsins' sensitivity to integrated optical energy, using Poisson-coded, temporally-distributed stimulation to evenly distribute the stimulation power consumption, enabled by our raster scanning technique for efficient μLED addressing. This reduces wireless data communication requirements and significantly reduces IC-to-optrode interconnections, making large-scale implementation feasible. Our wireless and battery-less stimulator implant comprises blocks for optical stimulation, fully-adaptive spike detection, and closed-loop calibration. It calibrates light intensity for each μLED row based on recorded spiking activity.

Item Open Access: CMOS Capacitive Sensor for Cellular and Molecular Monitoring (2024-07-22)
Tabrizi, Hamed Osouli; Magierowski, Sebastian; Ghafar-Zadeh, Ebrahim

This thesis focuses on the design and implementation of complementary metal-oxide-semiconductor (CMOS) based capacitive sensors for life science applications. CMOS capacitive sensors have been shown to be effective in a variety of applications, including chemical solvent monitoring, cellular monitoring, and DNA analysis. Despite significant advances, major challenges remain for current CMOS capacitive biosensing technologies, including extending the dynamic range of detection, reducing sensitivity to remnants, and enabling rapid high-throughput monitoring.
In the second chapter, a fully integrated capacitive sensor with a wide input dynamic range (IDR) and a digital output is proposed. The design concepts and constraints, functionality, characterization, and experimental results with chemical solvents are also presented in this chapter. With this novel topology, a significant increase in the IDR has been achieved, as discussed in the same chapter. Furthermore, in this chapter we propose a novel calibration-free capacitive sensing technique, which allows for uncovering sudden changes due to remnants as well as gradual changes due to target molecules and cells. The input dynamic range of the system is 400 fF based on post-layout simulation results. The measured resolution of the sensor is 416 aF, with up to 1.27 pF of input offset adjustment range using a programmable bank of capacitors with a resolution of 10 fF. In the third chapter, we present a fully integrated capacitive sensor array for life science applications. This sensing device consists of an array of 16 × 16 interdigitated electrodes (IDEs) integrated with charge-based readout and multiplexing circuitry on the same chip. It offers a wide IDR of about 100 fF, a resolution of 150 aF, and the capability of temporal, spatial, and dielectric sensing, making it possible to develop a low-cost, calibration-free sensing platform for life science applications. In this chapter, the functionality and applicability of the proposed sensing device are demonstrated and discussed using various chemical solvents, including ethanol, methanol, and pure water. The simulation and experimental results achieved in this work take us one step closer to a fully automated calibration-free capacitive sensing platform for high-throughput monitoring in life science applications.
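As a back-of-the-envelope check (not a calculation from the thesis), the quoted range and resolution figures can be turned into an effective bit depth: the number of distinguishable capacitance levels is simply the ratio of the dynamic range to the resolution.

```python
import math

def effective_bits(idr_farads, resolution_farads):
    """Number of distinguishable capacitance levels and the equivalent
    bit depth for a sensor with the given range and resolution."""
    levels = idr_farads / resolution_farads
    return levels, math.log2(levels)

# Chapter-two front-end: 400 fF range at 416 aF resolution
lv2, bits2 = effective_bits(400e-15, 416e-18)   # roughly 960 levels, ~9.9 bits
# Chapter-three array: 100 fF range at 150 aF resolution
lv3, bits3 = effective_bits(100e-15, 150e-18)   # roughly 670 levels, ~9.4 bits
```

Both front-ends therefore behave like roughly 9- to 10-bit capacitance digitizers over their respective ranges.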
In the fourth chapter, the applicability of the proposed CMOS capacitive sensor to monitoring dried DNA mass is demonstrated with experimental results. These experiments enabled us to measure the linear effect of five different concentrations of DNA mass in ultra-pure water with a resolution of 45 ng/μl. With this novel application of the CMOS capacitive sensor, dried DNA can be monitored for DNA storage purposes. Based on the results, a sub-picomole detection range has been achieved, which is compatible with the DNA concentrations used in DNA memory technologies. In the fifth chapter, rapid and accurate assessment of oral cells using our CMOS capacitive sensor chips is demonstrated. Such diagnostics allow for the early detection and control of periodontal and gum diseases. The experimental and simulation results demonstrate the functionality and applicability of the proposed sensor for monitoring oral cells in a small volume (1 µl) of saliva. These results reveal that the hydrophilic adhesion of oral cells on the chip alters the capacitance of the IDEs. The results presented in this chapter set a new stage for the emergence of sensing platforms for testing oral samples.

Item Open Access: Data-driven Methods for Optimal Power Flow in Smart Grids (2024-11-07)
Wang, Junfei; Srikantha, Pirathayini

The modern power grid is undergoing significant changes, driven by increasing complexity, the integration of renewable energy sources, and the urgent need to reduce greenhouse gas emissions. These challenges necessitate more advanced methods for power grid operations. System operators continuously solve the Optimal Power Flow (OPF) problem at regular intervals to determine the most economical dispatch of power while balancing electricity supply and demand. However, traditional OPF and convex relaxation methods often face issues related to feasibility and computational speed.
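For readers unfamiliar with OPF, its economic core is the dispatch problem: serve demand at minimum generation cost subject to unit limits. A toy merit-order dispatch (which deliberately omits the network power-flow constraints that make full OPF nonconvex and hard) can be sketched as follows; the unit names, capacities, and costs are hypothetical.

```python
def economic_dispatch(units, demand_mw):
    """Merit-order dispatch: fill demand from the cheapest units first.
    units: list of (name, capacity_mw, cost_per_mwh) tuples.
    Returns a per-unit dispatch dict and the total cost."""
    remaining = demand_mw
    dispatch, total_cost = {}, 0.0
    for name, capacity, cost in sorted(units, key=lambda u: u[2]):
        p = min(capacity, remaining)   # run the cheap unit as hard as allowed
        dispatch[name] = p
        total_cost += p * cost
        remaining -= p
    if remaining > 1e-9:
        raise ValueError("demand exceeds total capacity")
    return dispatch, total_cost

# Hypothetical two-unit fleet serving a 150 MW load
units = [("coal", 120.0, 35.0), ("hydro", 100.0, 20.0)]
plan, cost = economic_dispatch(units, 150.0)
# Cheap hydro is fully dispatched (100 MW); coal covers the remaining 50 MW.
```

Full OPF layers voltage, thermal, and power-flow constraints on top of this objective, which is precisely where the feasibility and speed issues mentioned above arise.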
Recently, machine learning methods have gained considerable attention as potential solutions to these challenges. As discussed in the literature, these methods include supervised learning, hybrid approaches that combine physical solvers or equations with machine learning, and unsupervised learning. Despite these advancements, research gaps remain to be addressed. In this thesis, three mechanisms for addressing the OPF problem from different perspectives are proposed. First, a supervised learning algorithm with a subsequent feasibility calibration method is introduced. Second, a generative adversarial network (GAN) with a representation learning module is studied and employed as an optimizer for the OPF problem. Lastly, the nearly convex nature of power flow data is investigated, motivating the development of a data-driven convex relaxation approach to the OPF problem. This thesis contributes to the literature by ensuring the feasibility of OPF solvers through post-processing algorithms with theoretical support, relaxing the assumption that optimal solutions are available for training, and demonstrating the high performance of data-driven OPF methods on large systems, such as the PGLIB 2000-bus system.

Item Open Access: Distributed Communication and Control Frameworks for Smart Grids using the Internet of Things and Blockchain Technology (2021-11-15)
Saxena, Shivam Kumar; Farag, Hany E. Z.

Smart distribution grids (SDGs) are power systems that harness distributed energy resources (DERs) to increase their operational efficiency and sustainability. However, the uncontrolled operation of DERs leads to operational challenges, resulting in transformer overload and voltage violations.
Distribution system operators (DSOs) are responsible for preventing such issues; however, DERs are typically owned by agents such as homeowners and private enterprises, whose motivations revolve around financial incentives and maximizing operational convenience, and these do not always align with the DSO's objectives. Thus, new communication and control frameworks are required to coordinate the actions of agents and DSOs to deliver mutually beneficial results. The architectures of these frameworks should be distributed, to avoid unilateral authority, and auditable, to alleviate trust issues between participants. Accordingly, this thesis develops distributed communication and control frameworks for SDGs built upon modern communication technologies, namely the Internet of Things (IoT) and blockchains, both of which provide distributed architectures. The proposed control strategies are inspired by principles from transactive energy systems (TES), where distributed control techniques are combined with economically oriented decision making to improve overall energy efficiency. This thesis proposes three new frameworks and validates their efficacy using both simulated and real-world experiments at a microgrid in Vaughan, Ontario. First, a fully distributed communication framework (DCF) is proposed for agent messaging, built upon the IoT-based framework known as the Data Distribution Service (DDS). The DCF provides 1000 messages/second at 36 ms latency and enhances the efficacy of agents in resolving voltage violations in real time at the microgrid. Second, a blockchain-based TES is proposed to enable agents to bid for voltage regulation services, where smart contracts enable multiple violations to be resolved in parallel, leading to fewer bidding cycles.
Third, a blockchain-based residential energy trading system (RETS) is proposed, which enables residential communities and DSOs to participate in peer-to-peer energy trading and demand response. The RETS reduces the peak demand of the community by 48 kW (62%), which leads to average savings of $1.02M for the DSO by avoiding transformer upgrades.

Item Open Access: Exploiting Novel Deep Learning Architecture in Character Animation Pipelines (2022-12-14)
Ghorbani, Saeed; Troje, Nikolaus

This doctoral dissertation presents a body of work aimed at improving different blocks of the character animation pipeline, resulting in less manual work and more realistic character animation. To that purpose, we describe a variety of cutting-edge deep learning approaches applied to the field of human motion modelling and character animation. Recent advances in motion capture systems and processing hardware have shifted the field from physics-based approaches to data-driven approaches, which are heavily used in current game production frameworks. However, despite these significant successes, there are still shortcomings to address. For example, existing production pipelines contain processing steps, such as marker labelling in the motion capture pipeline or annotating motion primitives, that must be done manually. In addition, most current approaches for character animation used in game production are limited by the amount of stored animation data, resulting in many duplicates and repeated patterns. We present our work in four main chapters. We first present a large dataset of human motion called MoVi. Secondly, we show how machine learning approaches can be used to automate data preprocessing blocks of optical motion capture pipelines. Thirdly, we show how generative models can be used to generate batches of synthetic motion sequences given only weak control signals.
Finally, we show how novel generative models can be applied to real-time character control in game production.

Item Open Access: Federated Learning for Heterogeneous Networks: Algorithmic and System Design (2024-03-16)
Wu, Hongda; Wang, Ping

Building reliable machine learning models depends on access to data samples. With the increasingly advanced sensing and computing capabilities of edge devices, ever-stricter data privacy legislation, and growing user privacy concerns, it is crucial to build learning models from separate, heterogeneous data sources without violating user privacy. Federated Learning (FL) facilitates collaborative machine learning without accessing user-sensitive data and has emerged as an attractive paradigm for mobile edge networks. However, federated optimization builds on a heterogeneous environment, which brings challenges beyond traditional distributed learning. Though FL is viewed as a promising technique for enabling intelligent applications, current FL systems suffer from high communication costs, restricting their application in mobile edge networks. To fully release this potential, FL designs must be communication-efficient, adaptive, and robust to the heterogeneous training environment. In this thesis, we aim to address the practical challenges of FL in a conscientious manner. In particular, we try to understand and address some of these challenges in federated networks and build FL systems that fulfill accuracy, efficiency, and robustness requirements. Starting with the primary challenge, data heterogeneity, we study how it impacts model accuracy and communication cost in the collaborative training system. To address this concern, we develop new, scalable algorithms that quantify the contribution of participating devices, thus alleviating the negative impact of data heterogeneity and reducing the overall communication burden.
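As context for the contribution-weighted aggregation idea just described, the baseline it refines can be sketched in a few lines: standard FedAvg-style aggregation combines client updates as a weighted average, conventionally weighted by per-client sample counts. This is a generic sketch, not the thesis's specific contribution-quantification algorithm.

```python
def federated_average(client_models, client_weights):
    """FedAvg-style aggregation: weighted average of client parameter vectors.
    client_models: list of equal-length lists of floats (one per client).
    client_weights: non-negative weights, e.g. per-client sample counts."""
    total = sum(client_weights)
    global_model = [0.0] * len(client_models[0])
    for model, w in zip(client_models, client_weights):
        for j, p in enumerate(model):
            global_model[j] += (w / total) * p
    return global_model

# Three clients with unequal data: the data-rich client dominates the update.
clients = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights = [600, 200, 200]   # hypothetical per-client sample counts
global_model = federated_average(clients, weights)
```

Under data heterogeneity, raw sample counts can be a poor proxy for how useful a client's update is, which is exactly why more careful contribution measures are worth designing.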
To handle another major challenge, the heterogeneity of computation capabilities among different types of edge devices, we devise a new sub-model training method that enables devices with heterogeneous computation capabilities to participate in and contribute to the FL system, making it robust to the straggler effect. The proposed solutions are rigorously compared with widely adopted benchmarks from theoretical and empirical perspectives. Finally, we provide a preliminary discussion of personalized FL and point out potentially interesting research directions in related fields. Although the proposed methods and designs originate from the practical application of FL, the theoretical insights gained from this thesis can be extended to the broader context of trustworthy machine learning.

Item Open Access: Improving the Logging Practices in DevOps (2020-11-13)
Chen, Boyuan; Jiang, ZhenMing

DevOps refers to a set of practices dedicated to accelerating the modern software engineering process. It breaks the barriers between software development and IT operations and aims to produce and maintain high-quality software systems. Software logging is widely used in DevOps. However, there are few guidelines and little tool support for composing high-quality logging code, and the current application context of log analysis is very limited with respect to developer feedback and correlation with other telemetry data. In this thesis, we first conduct a systematic survey of the instrumentation techniques used in software logging. Then we propose automated approaches to improving software logging practices in DevOps by leveraging various types of software repositories (e.g., historical, communication, bug, and runtime repositories). We aim to support the software development side by providing guidelines and tools for developing and maintaining high-quality logging code.
In particular, we study historical issues in logging code and their fixes from six popular Java-based open source projects. We found that existing state-of-the-art techniques for detecting logging code issues cannot detect a majority of the issues in logging code. We also study the use of Java logging utilities in the wild and find that the complexity of logging utility use increases with project size. We aim to support the IT operations side by enriching the log analysis context. In particular, we propose a technique, LogCoCo, to systematically estimate code coverage from execution logs. The results of LogCoCo are highly accurate under a variety of testing activities. Case studies show that our techniques and findings can provide useful software logging suggestions to both developers and operators in open source and commercial systems.

Item Open Access: Integrated Circuits and Systems for Adaptive Optimization of Energy Storage Efficiency in Resonant Inductive Wireless Power Receivers (2023-12-08)
Taghadosi, Mansour; Kassiri, Hossein

Recent developments in highly-miniaturized implantable neuro-stimulators have led to a rapid rise in their required power and data transmission throughput, increasing the instantaneous-to-average ratio of their power consumption. Motivated by the crucial role of efficient energy storage in such systems, we introduce energy management strategies for wireless powering links in which the key performance measure is the energy stored during a limited time interval rather than the average energy delivered to the load. First, the development of an algorithmic scheme for maximizing energy storage in current-mode (CM) resonant inductive power receivers is presented. The efficacy and precision of the presented analytical model are confirmed with CAD-based simulation results and validated using experimental measurements.
Furthermore, a 0.45 mm² integrated circuit (IC) fabricated in 0.18 µm CMOS is presented that performs the above-mentioned optimization. By continuously monitoring incident waveform dynamics, the IC automatically adapts its optimal solution on the fly to any change in the inductive link's physical or electrical parameters. The computations are implemented using analog circuits, which minimizes the IC's power consumption and eliminates the need for high-speed ADCs/DACs. Our measurement results show that, using the IC, the energy storage efficiency is improved by 53% and 67% for the two tested links compared with conservative schemes, while the IC consumes two orders of magnitude less energy than the optimization saves. To the best of our knowledge, this is the first reported link-adaptive, calibration-free IC for optimizing energy storage efficiency in CM receivers. Second, a 2×2 mm² IC fabricated in 0.18 µm CMOS maximizes the energy storage efficiency of resonant inductive links with voltage-mode receivers. The IC automatically stores the maximum possible energy while simultaneously providing the load's required continuous power. In the proposed scheme, optimal operation is maintained by detecting and operating the receiver at a specific optimal voltage, eliminating the need for direct power measurements and adaptive matching circuits. The power reception and delivery phases are isolated, which ensures maximum power reception independent of the actual loading at the receiver. The measurement results demonstrate improvements of up to 48.44% and 93.97% in charging time and stored power, respectively.

Item Open Access: Investigating and Modeling the Effects of Task and Context on Drivers' Attention (2024-07-18)
Kotseruba, Iuliia; Tsotsos, John K.

Driving, despite its widespread nature, is a demanding and inherently risky activity.
Any lapse in focus, such as failing to look at traffic signals or not noticing the actions of other road users, can lead to severe consequences. Technology for driver monitoring and assistance aims to mitigate these issues, but requires a deeper understanding of how drivers observe their surroundings to make decisions. In this dissertation, we investigate the link between where drivers look, the tasks they perform, and the surrounding context. To do so, we first conduct a meta-study of the behavioral literature that documents the overwhelming importance of top-down (task-driven) effects on gaze. Next, we survey applied research to show that most models do not make this connection and instead establish correlations between where drivers looked and images of the scene, without explicitly considering drivers' actions and environment. Next, we annotate and analyze the four largest publicly available datasets that contain driving footage and eye-tracking data. The new annotations for task and context show that the data is dominated by trivial scenarios (e.g., driving straight or standing still) and help uncover problems with typical data recording and processing pipelines that result in noisy, missing, or inaccurate data, particularly during safety-critical scenarios (e.g., at intersections). For the only dataset with raw data available, we create a new ground truth that alleviates some of the discovered issues, and we provide recommendations for future data collection. Using the new annotations and ground truth, we benchmark a representative set of bottom-up models for gaze prediction (i.e., those that do not represent the task explicitly). We conclude that while the corrected ground truth boosts performance, the implicit representation is not sufficient to capture the effects of task and context on where drivers look.
Lastly, motivated by these findings, we propose a task- and context-aware model for drivers' gaze prediction with explicit representation of the drivers' actions and context. The first version of the model, SCOUT, improves state-of-the-art performance by over 80% overall and 30% on the most challenging scenarios. We then propose SCOUT+, which relies on more readily available route and map information, similar to what the driver might see on an in-car navigation screen. SCOUT+ achieves results comparable to those of the version that uses more precise numeric and text labels.

Item Open Access: Learned Exposure Selection for High Dynamic Range Image Synthesis (2021-03-08)
Segal, Shane Maxwell; Brown, Michael; Brubaker, Marcus

High dynamic range (HDR) imaging is a photographic technique that captures a greater range of luminance than standard imaging techniques. Traditionally accomplished with specialized sensors, HDR images are now regularly created through the fusion of multiple low dynamic range (LDR) images that can be captured by smartphones or other consumer-grade hardware. Three or more images are traditionally required to generate a well-exposed HDR image. This thesis presents a novel system for the fast synthesis of HDR images by means of exposure fusion with only two images required. Experiments show that a sufficiently trained neural network can predict a suitable exposure value for the next image to be captured, when given an initial image as input. With these images fed into the exposure fusion algorithm, a high-quality HDR image can be quickly generated.

Item Open Access: Leveraging Dual-Pixel Sensors for Camera Depth of Field Manipulation (2022-03-03)
Abuolaim, Abdullah Ahmad Taleb; Brown, Michael S.

Capturing a photo with clear scene details is important in photography and for computer vision applications. The range of distances in the real world over which a scene's objects appear with clear detail is known as the camera's depth of field (DoF).
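The fusion step used by the "Learned Exposure Selection" item above can be illustrated with a minimal two-image sketch: each pixel is blended according to how well-exposed it is in each input. The Gaussian well-exposedness weight below is a generic Mertens-style choice, not the thesis's specific method, and a practical implementation blends in a multi-resolution pyramid rather than per pixel.

```python
import math

def well_exposedness(v, sigma=0.2):
    """Weight that favours pixel values near mid-grey (0.5)."""
    return math.exp(-((v - 0.5) ** 2) / (2 * sigma ** 2))

def fuse_two_exposures(img_a, img_b):
    """Per-pixel weighted blend of two exposures with values in [0, 1].
    Well-exposed pixels dominate; crushed or clipped pixels are down-weighted."""
    fused = []
    for a, b in zip(img_a, img_b):
        wa, wb = well_exposedness(a), well_exposedness(b)
        fused.append((wa * a + wb * b) / (wa + wb))
    return fused

# Toy 3-pixel "images": a short exposure (dark) and a long exposure (bright)
under = [0.05, 0.20, 0.45]   # shadows crushed in the short exposure
over = [0.55, 0.80, 0.95]    # highlights clipped in the long exposure
hdr_like = fuse_two_exposures(under, over)
```

With only two inputs, picking the second exposure well matters a great deal, which is exactly the gap the learned exposure-selection network addresses.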
The DoF is controlled by adjusting the lens-to-sensor distance (i.e., focus distance), the aperture size, and/or the focal length of the camera. At capture time, especially for video recording, DoF adjustment is often restricted to lens movements, as adjusting the other parameters introduces artifacts that can be visible in the recorded video. Nevertheless, the desired DoF is not always achievable at capture time for many reasons, such as the physical constraints of the camera optics. This motivates the alternative direction of adjusting the DoF after capture as a post-processing step. Although pre- and post-capture DoF manipulation are both essential, there are few datasets and simulation platforms that enable investigating DoF at capture time. Another limitation is the lack of real datasets for DoF extension (i.e., defocus deblurring), where prior work relies on synthesizing defocus blur and ignores the physical formation of defocus blur in real cameras (e.g., lens aberration and radial distortion). To address this research gap, this thesis revisits DoF manipulation from two points of view: (1) adjusting DoF at capture time, a.k.a. camera autofocus (AF), within the context of dynamic scenes (i.e., video AF); (2) computationally manipulating the DoF as a post-capture process. To this aim, we leverage a new imaging sensor technology known as the dual-pixel (DP) sensor. DP sensors are used to optimize camera AF and can provide good cues to estimate the amount of defocus blur present at each pixel location. In particular, this thesis provides the first 4D temporal focal stack dataset, along with an AF platform, to examine video AF. It also presents insights about user preference that motivate two novel video AF algorithms.
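Contrast-detection autofocus, the family of techniques that lens-movement AF typically builds on, scores candidate lens positions with a focus measure and drives the lens toward the position that maximizes it. A common focus measure is the variance of the image Laplacian; the sketch below is an illustrative stand-in, not the thesis's AF algorithms:

```python
import numpy as np

def laplacian_variance(img):
    """Focus measure: variance of the 4-neighbour Laplacian over the
    image interior. Sharper (better-focused) images score higher
    because defocus blur suppresses high spatial frequencies."""
    lap = (img[:-2, 1:-1] + img[2:, 1:-1] + img[1:-1, :-2]
           + img[1:-1, 2:] - 4.0 * img[1:-1, 1:-1])
    return float(lap.var())
```

A high-frequency pattern (e.g. a checkerboard) scores far above a smooth gradient, which is the ordering an AF loop relies on when sweeping the lens.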
As for post-capture DoF manipulation, we examine the problem of reducing defocus blur (i.e., extending the DoF) by introducing a new camera aperture adjustment procedure to collect the first dataset that pairs images with real defocus blur and their corresponding all-in-focus ground truth. We also propose the first end-to-end learning-based defocus deblurring method. We extend image defocus deblurring to a new application domain (i.e., video defocus deblurring) by designing a data synthesis framework that generates realistic DP video data through modeling physical camera constraints, such as lens aberration and radial distortion. Finally, we build on this data synthesis framework to synthesize shallow DoF with other aesthetic effects, such as multi-view synthesis and image motion.

Item Open Access Machine Learning and Digital Histopathology Analysis for Tissue Characterization and Treatment Response Prediction in Breast Cancer (2023-12-08) Saednia, Khadijeh Shirin; Sadeghi-Naini, Ali

Breast cancer is the most commonly diagnosed cancer and the leading cause of cancer-related death in women. Early diagnosis and prognosis in breast cancer patients can permit more therapeutic options and possibly improve their survival and quality of life. The gold-standard approach for breast cancer diagnosis and characterization is histopathology assessment of biopsy specimens, which is time- and resource-demanding. In this dissertation project, state-of-the-art machine learning (ML) methods have been developed and investigated for breast tissue characterization, nuclei segmentation, and chemotherapy response prediction in breast cancer patients using pre-treatment digitized histopathology images. First, a novel multi-scale attention-guided deep learning model is introduced to characterize breast tissue on digital pathology images according to four histological types.
Evaluation results on the test set show the effectiveness of the proposed approach in accurate histopathology image classification, with an accuracy of 97.5%. In the next step, a cascaded deep-learning-based model is proposed to accurately delineate tumor nuclei in digital pathology images, an essential step for extracting hand-crafted quantitative features for analysis with conventional ML models. The proposed model achieves an F1 score of 0.83 on an independent test set. Finally, two novel ML frameworks are introduced and investigated for chemotherapy response prediction. In the first approach, a digital histopathology image analysis framework has been developed to extract various subsets of quantitative features from the segmented digitized slides for conventional ML model development. Several ML experiments have been conducted with different feature sets to develop prediction models of therapy response using a gradient boosting machine with decision trees. The proposed model with the optimal feature set achieves an accuracy of 84%, sensitivity of 85%, and specificity of 82% on an independent test set. The second approach introduces a hierarchical self-attention-guided deep learning framework to predict breast cancer response to chemotherapy using digital histopathology images of pre-treatment tumor biopsies. The whole slide images (WSIs) are processed automatically through the proposed hierarchical framework, consisting of patch-level and tumor-level processing modules followed by a patient-level response prediction component. A combination of convolutional and transformer modules is utilized at each processing level. The proposed framework outperforms the conventional ML models with a test accuracy, sensitivity, and specificity of 86%, 87%, and 83%, respectively.
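The accuracy, sensitivity, and specificity figures above follow the usual confusion-matrix definitions; for reference, a minimal sketch (illustrative only, not the project's evaluation code):

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity (recall on positives), and specificity
    (recall on negatives) from confusion-matrix counts."""
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }
```

Reporting sensitivity and specificity alongside accuracy matters here because response/non-response classes are typically imbalanced, and accuracy alone can hide poor recall on the minority class.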
The proposed methods and the reported results in this dissertation are steps toward streamlining the histopathology workflow and implementing response-guided precision oncology for breast cancer patients.

Item Open Access Machine Learning and Quantitative Imaging for the Management of Brain Metastasis (2023-03-28) Jalalifar, Seyed Ali; Sadeghi-Naini, Ali

A growing number of cancer cases are diagnosed with brain metastasis annually, significantly affecting patients' clinical course and quality of life. Although a considerable percentage of cancer patients survive for several years if the disease is discovered at an early stage while it is still localized, the median survival decreases considerably once the tumour has metastasized to the brain. Early detection followed by precise and effective treatment of brain metastasis may lead to improved patient survival and quality of life. A main challenge in prescribing an effective treatment regimen is the variability of tumour response to treatment, e.g., radiotherapy as a main treatment option for brain metastasis, across patients receiving similar cancer therapy, due to many patient-related factors. Stratifying patients based on their predicted response and consequently assessing their response to therapy are challenging yet crucial tasks. While risk assessment models with standard clinical attributes have been proposed for patient stratification, the imaging data acquired for these patients as a part of the standard of care are not computationally analyzed or directly incorporated in these models. Further, therapy response monitoring and assessment is a cumbersome task for patients with brain metastasis, requiring longitudinal tumour delineation on MRI volumes before and at multiple follow-up sessions after treatment. This is aggravated by the time-sensitive nature of the disease.
In an effort to address these challenges, a number of machine learning frameworks and computational techniques in the areas of automatic tumour segmentation, radiotherapy outcome assessment, and therapy outcome prediction have been introduced and investigated in this dissertation. Powered by advanced machine learning algorithms, a complex attention-guided segmentation framework is introduced and investigated for segmenting brain tumours on serial MRI. The experimental results demonstrate that the proposed framework can achieve Dice scores of 91.5% on the baseline scans and 84.1% to 87.4% on the follow-up scans. This framework is then applied in a proposed system that follows standard clinical criteria based on post-treatment changes in tumour size to assess tumour response to radiotherapy automatically. The system demonstrates very good agreement with expert clinicians in detecting local response, with an accuracy of over 90%. Next, innovative machine-learning-based solutions are proposed and investigated for radiotherapy outcome prediction before or early after therapy, using MRI radiomic models and novel deep learning architectures that analyze treatment-planning MRI with and without standard clinical attributes. The developed models demonstrate an accuracy of up to 82.5% in predicting radiotherapy outcome before treatment initiation.
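The two quantities underpinning this pipeline, segmentation overlap and size-change response criteria, have simple standard forms. The Dice score measures mask overlap, and response labelling from diameter change follows RECIST-style thresholds; the sketch below uses textbook definitions and illustrative thresholds, not the dissertation's exact implementation:

```python
import numpy as np

def dice_score(pred, gt):
    """Dice overlap between two binary masks: 2|A n B| / (|A| + |B|)."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    return 2.0 * inter / (pred.sum() + gt.sum())

def local_response(baseline_diam, followup_diam):
    """RECIST-like local response label from the relative change in
    the tumour's longest diameter (thresholds follow the standard
    criteria in spirit; they are illustrative here)."""
    change = (followup_diam - baseline_diam) / baseline_diam
    if change <= -0.30:
        return "partial response"
    if change >= 0.20:
        return "progression"
    return "stable disease"
```

Chaining the two, as the proposed system does, turns a pair of automatic segmentations (baseline and follow-up) into a response label without manual delineation.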
The ground-breaking machine learning platforms presented in this dissertation, along with the promising results obtained in the conducted experiments, are steps forward towards realizing important decision support tools for oncologists and radiologists and can eventually pave the way towards a personalized therapeutics paradigm for cancer patients.

Item Open Access Machine Learning-Based Defences Against Advanced 'Session-Replay' Web Bots (2024-03-16) Sadeghpour, Shadi; Vlajic, Natalija

The widespread adoption of the Internet has brought about significant benefits for modern society, but has also led to an increase in malicious activities, particularly through the use of web bots. While some bots serve useful purposes, the proliferation of malicious web bots poses a significant threat to Internet security, impacting individuals, businesses, governments, and society as a whole. The emergence of AI-powered web bots capable of mimicking human behavior and evading detection has further exacerbated this problem. This dissertation aims to deepen our understanding of advanced web bots and the web bot attacks that often signal fraudulent online activities. In particular, we focus on session-replay web bots, the latest and most advanced type of web bot, which present an especially difficult challenge in online domains where multiple genuine human users frequently exhibit similar behavioral patterns, such as news, banking, or gaming sites. To achieve our research objectives, we have meticulously curated an extensive dataset encompassing both human- and bot-generated data. Additionally, we have developed our own prototype of an advanced session-replay bot (called ReBot), which has enabled us to accurately simulate the attacks conducted by this particular category of web bots. Moreover, by infusing randomness into the design of ReBot, we have been able to achieve varying degrees of bot and attack evasiveness.
From the defenders' perspective, and by leveraging state-of-the-art deep learning algorithms, we have proposed several effective strategies for the detection of advanced session-replay bot attacks. One of our proposed techniques deploys the concept of moving-target defence in the form of webpage randomization, which is particularly challenging for the attacker to overcome. This thesis also explores the utilization of generative machine learning models for the purpose of generating synthetic bot sessions. The ability to synthesize advanced session-replay bots - as opposed to looking for real-world instances of these bots or evidence of their activity in real-world logs - is of critical importance if we are to make timely and effective advances in the field of web bot detection and defence.

Item Open Access Machine Stereo Vision for Medical Image Registration (2021-03-08) Speers, Andrew David; Wildes, Richard

Image-guided liver surgery aims to enhance the precision of resection and ablation by providing fast localization of tumours and adjacent complex vasculature to improve oncologic outcome. This dissertation presents a novel end-to-end system for fast and accurate 3D surface reconstruction and motion estimation of the liver for alignment of intraoperative imagery with a preoperative volumetric scan. The system is designed and evaluated for application to liver surgery in an open setting, the dominant setting for such procedures. The system comprises three key components: initialization, 3D surface recovery, and 3D motion estimation. Initialization is performed semi-automatically using a Branch-and-Bound (BnB) strategy to generate a set of globally optimal shape-based registration candidates from which the user can select a suitable initialization. 3D surface recovery is performed using an adaptive Coarse-to-Fine (CTF) stereo algorithm that provides dense, data-driven reconstructions in a computationally efficient manner.
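The coarse-to-fine principle can be illustrated with a toy version that estimates a single global disparity between a rectified stereo pair: match cheaply on downsampled images, then refine the estimate within a small search window at each finer pyramid level. This is a deliberately simplified sketch of the CTF idea, not the dissertation's dense reconstruction algorithm:

```python
import numpy as np

def downsample(img):
    # 2x2 average pooling: one pyramid level
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def shift_sad(left, right, d):
    # mean absolute difference with the right image shifted by d columns
    if d == 0:
        return float(np.abs(left - right).mean())
    return float(np.abs(left[:, d:] - right[:, :-d]).mean())

def coarse_to_fine_disparity(left, right, levels=3, search=4):
    """Global horizontal disparity estimated coarse-to-fine: a full
    (small) search at the coarsest level, then only +/- search refinement
    at each finer level, which is what makes CTF matching cheap."""
    pyr_l, pyr_r = [left], [right]
    for _ in range(levels - 1):
        pyr_l.append(downsample(pyr_l[-1]))
        pyr_r.append(downsample(pyr_r[-1]))
    d = 0
    for lvl in range(levels - 1, -1, -1):
        d *= 2  # propagate the coarse estimate to the finer level
        d = min(range(max(0, d - search), d + search + 1),
                key=lambda c: shift_sad(pyr_l[lvl], pyr_r[lvl], c))
    return d
```

A dense reconstruction applies the same coarsen-match-refine loop per pixel (with regularization) rather than globally, but the cost saving comes from the same narrowing of the search range at fine scales.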
A robust 3D motion estimation technique based on interframe feature matching is then used to register a time series of reconstructions back to the initial frame of the sequence. The system has been evaluated empirically on novel laboratory and intraoperative datasets, with results showing that performance is within the tolerances expected for integration into Surgical Navigation (SN) systems.
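Rigid alignment of matched 3D feature points between frames is classically solved in closed form with the Kabsch (orthogonal Procrustes) method, a standard technique offered here as illustration rather than the dissertation's exact estimator:

```python
import numpy as np

def rigid_align(P, Q):
    """Least-squares rotation R and translation t with R @ p + t ~= q
    for matched 3D points stored as rows of P and Q (Kabsch algorithm)."""
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cp).T @ (Q - cq)          # 3x3 cross-covariance of centered sets
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    D = np.diag([1.0, 1.0, d])         # guard against reflection solutions
    R = Vt.T @ D @ U.T
    t = cq - R @ cp
    return R, t
```

Wrapping this estimator in a robust loop (e.g. RANSAC over the feature matches) gives the kind of interframe motion estimate needed to chain a time series of reconstructions back to the initial frame.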