Electrical Engineering and Computer Science
Browsing Electrical Engineering and Computer Science by Title
Now showing 1 - 20 of 33
Item Open Access: A Cloud-Based Extensible Avatar For Human Robot Interaction (2019-07-02)
AlTarawneh, Enas Khaled Ahm; Jenkin, Michael

Adding an interactive avatar to a human-robot interface requires the development of tools that animate the avatar so as to simulate an intelligent conversation partner. Here we describe a toolkit that supports interactive avatar modeling for human-computer interaction. The toolkit utilizes cloud-based speech-to-text software that provides active listening, a cloud-based AI to generate appropriate textual responses to user queries, and a cloud-based text-to-speech engine to generate utterances for this text. This output is combined with a cloud-based 3D avatar animation synchronized to the spoken response. Generated text responses are embedded within an XML structure that allows the nature of the avatar animation to be tuned to simulate different emotional states. An expression package controls the avatar's facial expressions. The rendering latency that this introduces is obscured through parallel processing and an idle-loop process that animates the avatar between utterances. The efficiency of the approach is validated through a formal user study.

Item Open Access: A Multi-Mode Stacked-Switch Inverter/Rectifier Leg for Bidirectional Power Converters (2022-08-08)
Emamalipour Shalkouhi, Reza; Lam, John Chi Wo

The development of renewable energy systems (e.g., wind and solar) is important for coping with the energy crisis, yet, at the same time, their volatile characteristics make their MW-scale integration challenging for the grid. Battery energy storage systems are essential to providing sustainable power and effectively improving overall system reliability as renewable energy conversion systems are deployed at large scale. Bidirectional power converters are responsible for transferring power between the battery energy storage system and the grid.
Selecting an efficient and cost-effective power topology along with a reliable control system is critical to ensure that the energy storage system operates safely with a prolonged service life and minimized maintenance cost. In this dissertation, a multi-mode stacked-switch leg with soft-switching capability for use in bidirectional DC/DC converters is proposed for battery energy storage applications. The dissertation consists of three parts. The first part focuses on the development of a bidirectional soft-switched converter utilizing a CLLC resonant circuit and the proposed multi-mode switching legs. The presented leg facilitates multiple operating modes to enable high voltage gain under different operating conditions and allows the converter to operate with a much lower output voltage ripple (50% lower) compared with the conventional stacked-switch-based converter topology. In the second part of this thesis, a fault-tolerant control scheme is proposed that enables seamless post-fault operation of the presented multi-mode DC/DC converter if any switch in the presented leg experiences an open-circuit fault. In the third part of this thesis, a comprehensive hybrid control system is proposed so that the overall voltage gain range of the converter is widely extended with a narrow switching frequency range (less than 10% of the base frequency), while at the same time the efficiency of the converter is improved over the whole gain range (by more than 1%). The operating principles and characteristics of the proposed converter and the proposed control schemes are explained in detail in this thesis.
The performance of each of the presented circuit and control concepts is verified through simulation as well as experimental results on silicon-carbide (SiC)-based proof-of-concept hardware prototypes.

Item Open Access: A Unified Multiscale Encoder-Decoder Transformer for Video Segmentation (2024-07-18)
Karim, Rezaul; Wildes, Richard P.

This dissertation presents an end-to-end trainable, unified multiscale encoder-decoder transformer for dense video estimation, with a focus on segmentation. We investigate this direction by exploring unified multiscale processing throughout the pipeline of feature encoding, context encoding, and object decoding in an encoder-decoder model. Correspondingly, we present a Multiscale Encoder-Decoder Video Transformer (MED-VT) that uses multiscale representation throughout and employs an optional input beyond video (e.g., audio), when available, for multimodal processing (MED-VT++). Multiscale representation at both encoder and decoder yields three key benefits: (i) implicit extraction of spatiotemporal features at different levels of abstraction for capturing dynamics without reliance on additional preprocessing, such as computing object proposals or optical flow; (ii) temporal consistency at encoding; and (iii) coarse-to-fine detection for high-level (e.g., object) semantics to guide precise localization at decoding. Moreover, we explore temporal consistency through a transductive learning scheme that exploits many-to-many label propagation across time. To demonstrate the applicability of the approach, we provide empirical evaluation of MED-VT/MED-VT++ on three unimodal video segmentation tasks (Automatic Video Object Segmentation (AVOS), actor-action segmentation, and Video Semantic Segmentation (VSS)) and a multimodal task (Audio-Visual Segmentation (AVS)).
Results show that the proposed architecture outperforms alternative state-of-the-art approaches on multiple benchmarks using only video (and optional audio) as input, without reliance on additional preprocessing such as object proposals or optical flow. We also document details of the model's internal learned representations by presenting a detailed interpretability study, encompassing both quantitative and qualitative analyses.

Item Open Access: Biologically-inspired Neural Networks for Shape and Color Representation (2022-03-03)
Mehrani, Paria; Tsotsos, John K.

The goal of human-level performance in artificial vision systems is yet to be achieved. Toward this goal, a reasonable choice is to simulate the biological system with computational models that mimic its visual processing. A complication with this approach is that the human brain, and similarly its visual system, is not fully understood. On the bright side, remarkable findings in the field of visual neuroscience have answered many questions about visual processing in the primate brain over the past few decades. Nonetheless, a lag in incorporating these new discoveries into biologically-inspired systems is evident. The present work introduces novel biologically-inspired models that incorporate new findings on shape and color processing into analytically-defined neural networks. In contrast to most current methods, which attempt to learn all aspects of behavior from data, here we propose to bootstrap such learning by building upon existing knowledge rather than learning from scratch. Put simply, the processing networks are defined analytically where current neural understanding is available and learned where such knowledge is not. This is thus a hybrid strategy that aims to combine the best of both worlds.
Experiments on the artificial neurons in the proposed networks demonstrate that these neurons mimic the studied behavior of biological cells, suggesting a path forward for incorporating analytically-defined artificial neural networks into computer vision systems.

Item Open Access: Bridging Data Management and Machine Learning: Case Studies on Index, Query Optimization, and Data Acquisition (2022-12-14)
Li, Yifan; Yu, Xiaohui

Data management tasks and techniques can be observed in a variety of real-world scenarios, including web search, business analysis, traffic scheduling, and advertising, to name a few. While data management as a research area has been studied for decades, recent breakthroughs in Machine Learning (ML) provide new perspectives for defining and tackling problems in the area; at the same time, the wisdom embedded in data management techniques also greatly helps to accelerate the advancement of Machine Learning. In this work, we focus on the intersection of data management and Machine Learning and study several important, interesting, and challenging problems. More specifically, our work concentrates on the following three topics: (1) leveraging the ability of ML models to capture data distributions in order to design lightweight, data-adaptive indexes and search algorithms that accelerate similarity search over large-scale data; (2) designing robust and trustworthy approaches to improve the reliability of both conventional and learned query optimizers and boost the performance of the DBMS; (3) developing data management techniques with statistical guarantees to acquire the most useful training data for ML models under a budget limitation, striving to maximize the accuracy of the model.
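As an illustrative aside, topic (1) above, using a model of the data distribution as an index, can be sketched in miniature: a hypothetical learned index fits a simple linear model of key-to-position over sorted keys (a crude CDF approximation) and then searches only a small, error-bounded window. This is a generic sketch of the learned-index idea, not code from the thesis.

```python
import bisect

def build_learned_index(sorted_keys):
    """Fit position ~ a*key + b over sorted keys and record the maximum
    prediction error, which bounds the search window at lookup time."""
    n = len(sorted_keys)
    mean_x = sum(sorted_keys) / n
    mean_y = (n - 1) / 2.0
    cov = sum((x - mean_x) * (y - mean_y) for y, x in enumerate(sorted_keys))
    var = sum((x - mean_x) ** 2 for x in sorted_keys) or 1.0
    a = cov / var
    b = mean_y - a * mean_x
    err = max(abs((a * k + b) - i) for i, k in enumerate(sorted_keys))
    return a, b, int(err) + 1

def learned_lookup(sorted_keys, model, key):
    """Predict a position, then binary-search only the error-bounded window."""
    a, b, err = model
    pos = int(a * key + b)
    lo = max(0, pos - err)
    hi = min(len(sorted_keys), pos + err + 1)
    i = lo + bisect.bisect_left(sorted_keys[lo:hi], key)
    return i if i < len(sorted_keys) and sorted_keys[i] == key else -1

keys = sorted(x * x for x in range(1, 1000))  # deliberately non-uniform keys
model = build_learned_index(keys)
```

The point of the sketch is that the model replaces most of the binary search: the better the model captures the key distribution, the smaller the fallback window.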
We conduct detailed theoretical and empirical studies for each topic, formalizing these fundamental problems as well as developing efficient and effective approaches for the tasks.

Item Open Access: Closed-Loop Highly-Scalable Retinal Implant with Fully-Analog ED-Based Adaptive-Threshold Spike Detection and Poisson-Coded Temporally-Distributed Optogenetic Stimulation (2024-07-18)
Yousefi, Tayebeh; Kassiri, Hossein

Intraocular stimulators show promise for treating retinal degeneration by restoring visual input to the damaged retina. This is achieved by capturing images with a wearable camera and stimulating the remaining retinal cells accordingly, effectively bypassing dysfunctional photoreceptors. State-of-the-art retinal stimulators face a major challenge: electrical stimulation lacks cell-type specificity (activating both ON and OFF pathways in the retina), leading to limited visual perception because contradictory messages are sent to the brain. This fundamental limit motivated us to investigate an optogenetic retinal prosthesis that uses promoter opsins for selective activation of ON bipolar cells, offering more natural vision restoration. In developing such a device, the first challenge we faced was optimizing the stimulation strategy for optimal therapeutic efficacy. Responding to this challenge, we first present a retina-inspired computational framework to evaluate and optimize an optogenetic epi-retinal neurostimulator. This framework reveals that optical stimulation, compared to electrical stimulation, provides superior visual perception, which improves with increased μLED array resolution. The framework also explores optical stimulation factors and μLED specifications such as light intensity, wavelength, spatial resolution, and light divergence. A critical issue in optogenetics is controlling opsin distribution, as uneven distribution affects light sensitivity across the retina.
Variations in tissue properties and fluid dynamics introduce unpredictability in stimulation effectiveness. Our solution to this issue is a scalable optogenetic stimulator IC featuring channel-specific closed-loop calibration that defines the optimal stimulation intensity using a temporally adaptive-threshold spike detection circuit. The second challenge we addressed was scalability and, by association, the energy efficiency of the device. Scaling implantable stimulators is limited by instantaneous power demands during multi-channel stimulation. We address this by exploiting opsins' sensitivity to integrated optical energy, using Poisson-coded, temporally-distributed stimulation to evenly distribute the stimulation power consumption, enabled by our raster scanning technique for efficient μLED addressing. This reduces wireless data communication requirements and significantly reduces IC-to-optrode interconnections, making large-scale implementation feasible. Our wireless and battery-less stimulator implant comprises blocks for optical stimulation, fully-adaptive spike detection, and closed-loop calibration. It calibrates light intensity for each μLED row based on recorded spiking activity.

Item Open Access: CMOS Capacitive Sensor for Cellular and Molecular Monitoring (2024-07-22)
Tabrizi, Hamed Osouli; Magierowski, Sebastian; Ghafar-Zadeh, Ebrahim

This thesis focuses on the design and implementation of complementary metal-oxide-semiconductor (CMOS) based capacitive sensors for life science applications. CMOS capacitive sensors have been shown to be effective in a variety of applications, including chemical solvent monitoring, cellular monitoring, and DNA analysis. Despite significant advances, major challenges remain for current CMOS capacitive biosensing technologies, including extending the dynamic range of detection, reducing sensitivity to remnants, and enabling rapid high-throughput monitoring.
In the second chapter, a fully integrated capacitive sensor with a wide input dynamic range (IDR) and a digital output is proposed. The design concepts and constraints, functionality, characterization, and experimental results with chemical solvents are also presented in this chapter. With this novel topology, a significant increase in the IDR has been achieved, as discussed in the same chapter. Furthermore, in this chapter we propose a novel calibration-free capacitive sensing technique, which allows for uncovering sudden changes due to remnants as well as gradual changes due to target molecules and cells. The input dynamic range of the system is 400 fF based on post-layout simulation results. The measured resolution of the sensor is 416 aF, with up to 1.27 pF of input offset adjustment range using a programmable bank of capacitors with a resolution of 10 fF. In the third chapter, we present a fully integrated capacitive sensor array for life science applications. This sensing device consists of an array of 16 × 16 interdigitated electrodes (IDEs) integrated with charge-based readout and multiplexing circuitry on the same chip. It offers a wide IDR of about 100 fF, a resolution of 150 aF, and the capability of temporal, spatial, and dielectric sensing, making it possible to develop a low-cost, calibration-free sensing platform for life science applications. In this chapter, the functionality and applicability of the proposed sensing device are demonstrated and discussed using various chemical solvents, including ethanol, methanol, and pure water. The simulation and experimental results achieved in this work take us one step closer to a fully automated calibration-free capacitive sensing platform for high-throughput monitoring in life science applications.
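As a back-of-the-envelope check (not a calculation from the thesis), the quoted range and resolution figures can be turned into an effective bit depth: the number of distinguishable capacitance levels is simply the ratio of the dynamic range to the resolution.

```python
import math

def effective_bits(idr_farads, resolution_farads):
    """Number of distinguishable capacitance levels and the equivalent
    bit depth for a sensor with the given range and resolution."""
    levels = idr_farads / resolution_farads
    return levels, math.log2(levels)

# Chapter-two front-end: 400 fF range at 416 aF resolution
lv2, bits2 = effective_bits(400e-15, 416e-18)   # roughly 960 levels, ~9.9 bits
# Chapter-three array: 100 fF range at 150 aF resolution
lv3, bits3 = effective_bits(100e-15, 150e-18)   # roughly 670 levels, ~9.4 bits
```

Both front-ends therefore behave like roughly 9- to 10-bit capacitance digitizers over their respective ranges.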
In the fourth chapter, the applicability of the proposed CMOS capacitive sensor to monitoring dried DNA mass is demonstrated with experimental results. These experiments enabled us to measure the linear effect of five different concentrations of DNA mass in ultra-pure water with a resolution of 45 ng/μl. With this novel application of the CMOS capacitive sensor, dried DNA can be monitored for DNA storage purposes. Based on the results, a sub-picomole detection range has been achieved, which is compatible with the DNA concentrations used in DNA memory technologies. In the fifth chapter, rapid and accurate assessment of oral cells using our CMOS capacitive sensor chips is demonstrated. Such diagnostics allow for the early detection and control of periodontal and gum diseases. The experimental and simulation results demonstrate the functionality and applicability of the proposed sensor for monitoring oral cells in a small volume (1 µl) of saliva. These results reveal that the hydrophilic adhesion of oral cells on the chip alters the capacitance of the IDEs. The results presented in this chapter set a new stage for the emergence of sensing platforms for testing oral samples.

Item Open Access: Data-driven Methods for Optimal Power Flow in Smart Grids (2024-11-07)
Wang, Junfei; Srikantha, Pirathayini

The modern power grid is undergoing significant changes, driven by increasing complexity, the integration of renewable energy sources, and the urgent need to reduce greenhouse gas emissions. These challenges necessitate more advanced methods for power grid operations. System operators continuously solve the Optimal Power Flow (OPF) problem at regular intervals to determine the most economical dispatch of power while balancing electricity supply and demand. However, traditional OPF and convex relaxation methods often face issues related to feasibility and computational speed.
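For readers unfamiliar with OPF, its economic core is the dispatch problem: serve demand at minimum generation cost subject to unit limits. A toy merit-order dispatch (which deliberately omits the network power-flow constraints that make full OPF nonconvex and hard) can be sketched as follows; the unit names, capacities, and costs are hypothetical.

```python
def economic_dispatch(units, demand_mw):
    """Merit-order dispatch: fill demand from the cheapest units first.
    units: list of (name, capacity_mw, cost_per_mwh) tuples.
    Returns a per-unit dispatch dict and the total cost."""
    remaining = demand_mw
    dispatch, total_cost = {}, 0.0
    for name, capacity, cost in sorted(units, key=lambda u: u[2]):
        p = min(capacity, remaining)   # run the cheap unit as hard as allowed
        dispatch[name] = p
        total_cost += p * cost
        remaining -= p
    if remaining > 1e-9:
        raise ValueError("demand exceeds total capacity")
    return dispatch, total_cost

# Hypothetical two-unit fleet serving a 150 MW load
units = [("coal", 120.0, 35.0), ("hydro", 100.0, 20.0)]
plan, cost = economic_dispatch(units, 150.0)
# Cheap hydro is fully dispatched (100 MW); coal covers the remaining 50 MW.
```

Full OPF layers voltage, thermal, and power-flow constraints on top of this objective, which is precisely where the feasibility and speed issues mentioned above arise.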
Recently, machine learning methods have gained considerable attention as potential solutions to these challenges. As discussed in the literature, these methods include supervised learning, hybrid approaches that combine physical solvers or equations with machine learning, and unsupervised learning. Despite these advancements, research gaps remain to be addressed. In this thesis, three mechanisms for addressing the OPF problem from different perspectives are proposed. First, a supervised learning algorithm with a subsequent feasibility calibration method is introduced. Second, a generative adversarial network (GAN) with a representation learning module is studied and employed as an optimizer for the OPF problem. Lastly, the nearly convex nature of power flow data is investigated, motivating the development of a data-driven convex relaxation approach to the OPF problem. This thesis contributes to the literature by ensuring the feasibility of OPF solvers through post-processing algorithms with theoretical support, relaxing the assumption that optimal solutions are available for training, and demonstrating the high performance of data-driven OPF methods on large systems, such as the PGLIB 2000-bus system.

Item Open Access: Distributed Communication and Control Frameworks for Smart Grids using the Internet of Things and Blockchain Technology (2021-11-15)
Saxena, Shivam Kumar; Farag, Hany E. Z.

Smart distribution grids (SDGs) are power systems that harness distributed energy resources (DERs) to increase their operational efficiency and sustainability. However, the uncontrolled operation of DERs leads to operational challenges, resulting in transformer overload and voltage violations.
Distribution system operators (DSOs) are responsible for preventing such issues; however, DERs are typically owned by agents such as homeowners and private enterprises, whose motivations revolve around financial incentives and maximizing operational convenience, and these do not always align with the DSO's objectives. Thus, new communication and control frameworks are required to coordinate the actions of agents and DSOs to deliver mutually beneficial results. The architectures of these frameworks should be distributed, to avoid unilateral authority, and auditable, to alleviate trust issues between participants. Accordingly, this thesis develops distributed communication and control frameworks for SDGs built upon modern communication technologies, namely the Internet of Things (IoT) and blockchains, both of which provide distributed architectures. The proposed control strategies are inspired by principles from transactive energy systems (TES), where distributed control techniques are combined with economically oriented decision making to improve overall energy efficiency. This thesis proposes three new frameworks and validates their efficacy using both simulated and real-world experiments at a microgrid in Vaughan, Ontario. First, a fully distributed communication framework (DCF) is proposed for agent messaging, built upon the IoT-based framework known as the Data Distribution Service (DDS). The DCF provides 1000 messages/second at 36 ms latency and enhances the efficacy of agents in resolving voltage violations in real time at the microgrid. Second, a blockchain-based TES is proposed to enable agents to bid for voltage regulation services, where smart contracts enable multiple violations to be resolved in parallel, leading to fewer bidding cycles.
Third, a blockchain-based residential energy trading system (RETS) is proposed, which enables residential communities and DSOs to participate in peer-to-peer energy trading and demand response. The RETS reduces the peak demand of the community by 48 kW (62%), which leads to average savings of $1.02M for the DSO by avoiding transformer upgrades.

Item Open Access: Exploiting Novel Deep Learning Architecture in Character Animation Pipelines (2022-12-14)
Ghorbani, Saeed; Troje, Nikolaus

This doctoral dissertation presents a body of work aimed at improving different blocks of the character animation pipeline, resulting in less manual work and more realistic character animation. To that purpose, we describe a variety of cutting-edge deep learning approaches applied to the field of human motion modelling and character animation. Recent advances in motion capture systems and processing hardware have shifted the field from physics-based approaches to data-driven approaches, which are heavily used in current game production frameworks. However, despite these significant successes, there are still shortcomings to address. For example, existing production pipelines contain processing steps, such as marker labelling in the motion capture pipeline or annotating motion primitives, that must be done manually. In addition, most current approaches for character animation used in game production are limited by the amount of stored animation data, resulting in many duplicates and repeated patterns. We present our work in four main chapters. We first present a large dataset of human motion called MoVi. Secondly, we show how machine learning approaches can be used to automate data preprocessing blocks of optical motion capture pipelines. Thirdly, we show how generative models can be used to generate batches of synthetic motion sequences given only weak control signals.
Finally, we show how novel generative models can be applied to real-time character control in game production.

Item Open Access: Federated Learning for Heterogeneous Networks: Algorithmic and System Design (2024-03-16)
Wu, Hongda; Wang, Ping

Building reliable machine learning models depends on access to data samples. With the increasingly advanced sensing and computing capabilities of edge devices, ever-stricter data privacy legislation, and growing user privacy concerns, it is crucial to build learning models from separate, heterogeneous data sources without violating user privacy. Federated Learning (FL) facilitates collaborative machine learning without accessing user-sensitive data and has emerged as an attractive paradigm for mobile edge networks. However, federated optimization builds on a heterogeneous environment, which brings challenges beyond traditional distributed learning. Though FL is viewed as a promising technique for enabling intelligent applications, current FL systems suffer from high communication costs, restricting their application in mobile edge networks. To fully release this potential, FL designs must be communication-efficient, adaptive, and robust to the heterogeneous training environment. In this thesis, we aim to address the practical challenges of FL in a conscientious manner. In particular, we try to understand and address some of these challenges in federated networks and build FL systems that fulfill accuracy, efficiency, and robustness requirements. Starting with the primary challenge, data heterogeneity, we study how it impacts model accuracy and communication cost in the collaborative training system. To address this concern, we develop new, scalable algorithms that quantify the contribution of participating devices, thus alleviating the negative impact of data heterogeneity and reducing the overall communication burden.
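As context for the contribution-weighted aggregation idea just described, the baseline it refines can be sketched in a few lines: standard FedAvg-style aggregation combines client updates as a weighted average, conventionally weighted by per-client sample counts. This is a generic sketch, not the thesis's specific contribution-quantification algorithm.

```python
def federated_average(client_models, client_weights):
    """FedAvg-style aggregation: weighted average of client parameter vectors.
    client_models: list of equal-length lists of floats (one per client).
    client_weights: non-negative weights, e.g. per-client sample counts."""
    total = sum(client_weights)
    global_model = [0.0] * len(client_models[0])
    for model, w in zip(client_models, client_weights):
        for j, p in enumerate(model):
            global_model[j] += (w / total) * p
    return global_model

# Three clients with unequal data: the data-rich client dominates the update.
clients = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights = [600, 200, 200]   # hypothetical per-client sample counts
global_model = federated_average(clients, weights)
```

Under data heterogeneity, raw sample counts can be a poor proxy for how useful a client's update is, which is exactly why more careful contribution measures are worth designing.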
To handle another major challenge, the heterogeneity of computation capabilities among different types of edge devices, we devise a new sub-model training method that enables devices with heterogeneous computation capabilities to participate in and contribute to the FL system, making it robust to the straggler effect. The proposed solutions are rigorously compared with widely adopted benchmarks from theoretical and empirical perspectives. Finally, we provide a preliminary discussion of personalized FL and point out potentially interesting research directions in related fields. Although the proposed methods and designs originate from the practical application of FL, the theoretical insights gained from this thesis can be extended to the broader context of trustworthy machine learning.

Item Open Access: Improving the Logging Practices in DevOps (2020-11-13)
Chen, Boyuan; Jiang, ZhenMing

DevOps refers to a set of practices dedicated to accelerating the modern software engineering process. It breaks the barriers between software development and IT operations and aims to produce and maintain high-quality software systems. Software logging is widely used in DevOps. However, there are few guidelines and little tool support for composing high-quality logging code, and the current application context of log analysis is very limited with respect to developer feedback and correlation with other telemetry data. In this thesis, we first conduct a systematic survey of the instrumentation techniques used in software logging. Then we propose automated approaches to improving software logging practices in DevOps by leveraging various types of software repositories (e.g., historical, communication, bug, and runtime repositories). We aim to support the software development side by providing guidelines and tools for developing and maintaining high-quality logging code.
In particular, we study historical issues in logging code and their fixes from six popular Java-based open source projects. We found that existing state-of-the-art techniques for detecting logging code issues cannot detect a majority of the issues in logging code. We also study the use of Java logging utilities in the wild and find that the complexity of logging utility use increases with project size. We aim to support the IT operations side by enriching the log analysis context. In particular, we propose a technique, LogCoCo, to systematically estimate code coverage from execution logs. The results of LogCoCo are highly accurate under a variety of testing activities. Case studies show that our techniques and findings can provide useful software logging suggestions to both developers and operators in open source and commercial systems.

Item Open Access: Integrated Circuits and Systems for Adaptive Optimization of Energy Storage Efficiency in Resonant Inductive Wireless Power Receivers (2023-12-08)
Taghadosi, Mansour; Kassiri, Hossein

Recent developments in highly-miniaturized implantable neuro-stimulators have led to a rapid rise in their required power and data transmission throughput, increasing the instantaneous-to-average ratio of their power consumption. Motivated by the crucial role of efficient energy storage in such systems, we introduce energy management strategies for wireless powering links in which the key performance measure is the energy stored during a limited time interval rather than the average energy delivered to the load. First, the development of an algorithmic scheme for maximizing energy storage in current-mode (CM) resonant inductive power receivers is presented. The efficacy and precision of the presented analytical model are confirmed with CAD-based simulation results and validated using experimental measurements.
Furthermore, a 0.45 mm² integrated circuit (IC) fabricated in 0.18 µm CMOS is presented that performs the above-mentioned optimization. By continuously monitoring incident waveform dynamics, the IC automatically adapts its optimal solution on the fly to any change in the inductive link's physical or electrical parameters. The computations are implemented using analog circuits, which minimizes the IC's power consumption and eliminates the need for high-speed ADCs/DACs. Our measurement results show that, using the IC, the energy storage efficiency is improved by 53% and 67% for the two tested links compared with conservative schemes, while the IC consumes two orders of magnitude less energy than the optimization saves. To the best of our knowledge, this is the first reported link-adaptive, calibration-free IC for optimizing energy storage efficiency in CM receivers. Second, a 2×2 mm² IC fabricated in 0.18 µm CMOS maximizes the energy storage efficiency of resonant inductive links with voltage-mode receivers. The IC automatically stores the maximum possible energy while simultaneously providing the load's required continuous power. In the proposed scheme, optimal operation is maintained by detecting and operating the receiver at a specific optimal voltage, eliminating the need for direct power measurements and adaptive matching circuits. The power reception and delivery phases are isolated, which ensures maximum power reception independent of the actual loading at the receiver. The measurement results demonstrate improvements of up to 48.44% and 93.97% in charging time and stored power, respectively.

Item Open Access: Investigating and Modeling the Effects of Task and Context on Drivers' Attention (2024-07-18)
Kotseruba, Iuliia; Tsotsos, John K.

Driving, despite its widespread nature, is a demanding and inherently risky activity.
Any lapse in focus, such as failing to look at traffic signals or not noticing the actions of other road users, can lead to severe consequences. Technology for driver monitoring and assistance aims to mitigate these issues, but requires a deeper understanding of how drivers observe their surroundings to make decisions. In this dissertation, we investigate the link between where drivers look, the tasks they perform, and the surrounding context. To do so, we first conduct a meta-study of the behavioral literature that documents the overwhelming importance of top-down (task-driven) effects on gaze. Next, we survey applied research to show that most models do not make this connection and instead establish correlations between where drivers looked and images of the scene, without explicitly considering drivers' actions and environment. Next, we annotate and analyze the four largest publicly available datasets that contain driving footage and eye-tracking data. The new annotations for task and context show that the data is dominated by trivial scenarios (e.g., driving straight or standing still) and help uncover problems with typical data recording and processing pipelines that result in noisy, missing, or inaccurate data, particularly during safety-critical scenarios (e.g., at intersections). For the only dataset with raw data available, we create a new ground truth that alleviates some of the discovered issues, and we provide recommendations for future data collection. Using the new annotations and ground truth, we benchmark a representative set of bottom-up models for gaze prediction (i.e., those that do not represent the task explicitly). We conclude that while the corrected ground truth boosts performance, the implicit representation is not sufficient to capture the effects of task and context on where drivers look.
Lastly, motivated by these findings, we propose a task- and context-aware model for drivers' gaze prediction with explicit representation of the drivers' actions and context. The first version of the model, SCOUT, improves state-of-the-art performance by over 80% overall and 30% on the most challenging scenarios. We then propose SCOUT+, which relies on more readily available route and map information, similar to what the driver might see on an in-car navigation screen. SCOUT+ achieves results comparable to those of the version that uses more precise numeric and text labels.

Item Open Access: Learned Exposure Selection for High Dynamic Range Image Synthesis (2021-03-08)
Segal, Shane Maxwell; Brown, Michael; Brubaker, Marcus

High dynamic range (HDR) imaging is a photographic technique that captures a greater range of luminance than standard imaging techniques. Traditionally accomplished with specialized sensors, HDR images are now regularly created through the fusion of multiple low dynamic range (LDR) images that can be captured by smartphones or other consumer-grade hardware. Three or more images are traditionally required to generate a well-exposed HDR image. This thesis presents a novel system for the fast synthesis of HDR images by means of exposure fusion with only two images required. Experiments show that a sufficiently trained neural network can predict a suitable exposure value for the next image to be captured, when given an initial image as input. With these images fed into the exposure fusion algorithm, a high-quality HDR image can be quickly generated.

Item Open Access: Leveraging Dual-Pixel Sensors for Camera Depth of Field Manipulation (2022-03-03)
Abuolaim, Abdullah Ahmad Taleb; Brown, Michael S.

Capturing a photo with clear scene details is important in photography and for computer vision applications. The range of distances in the real world over which a scene's objects appear with clear detail is known as the camera's depth of field (DoF).
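The fusion step used by the "Learned Exposure Selection" item above can be illustrated with a minimal two-image sketch: each pixel is blended according to how well-exposed it is in each input. The Gaussian well-exposedness weight below is a generic Mertens-style choice, not the thesis's specific method, and a practical implementation blends in a multi-resolution pyramid rather than per pixel.

```python
import math

def well_exposedness(v, sigma=0.2):
    """Weight that favours pixel values near mid-grey (0.5)."""
    return math.exp(-((v - 0.5) ** 2) / (2 * sigma ** 2))

def fuse_two_exposures(img_a, img_b):
    """Per-pixel weighted blend of two exposures with values in [0, 1].
    Well-exposed pixels dominate; crushed or clipped pixels are down-weighted."""
    fused = []
    for a, b in zip(img_a, img_b):
        wa, wb = well_exposedness(a), well_exposedness(b)
        fused.append((wa * a + wb * b) / (wa + wb))
    return fused

# Toy 3-pixel "images": a short exposure (dark) and a long exposure (bright)
under = [0.05, 0.20, 0.45]   # shadows crushed in the short exposure
over = [0.55, 0.80, 0.95]    # highlights clipped in the long exposure
hdr_like = fuse_two_exposures(under, over)
```

With only two inputs, picking the second exposure well matters a great deal, which is exactly the gap the learned exposure-selection network addresses.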
The DoF is controlled by adjusting the lens-to-sensor distance (i.e., focus distance), the aperture size, and/or the focal length of the camera. At capture time, especially for video recording, DoF adjustment is often restricted to lens movements, as adjusting the other parameters introduces artifacts that can be visible in the recorded video. Nevertheless, the desired DoF is not always achievable at capture time for many reasons, such as the physical constraints of the camera optics. This motivates the alternative direction of adjusting the DoF after capture as a post-processing step. Although pre- and post-capture DoF manipulation are both essential, there are few datasets and simulation platforms that enable investigating DoF at capture time. Another limitation is the lack of real datasets for DoF extension (i.e., defocus deblurring), where prior work relies on synthesizing defocus blur and ignores the physical formation of defocus blur in real cameras (e.g., lens aberration and radial distortion). To address this research gap, this thesis revisits DoF manipulation from two points of view: (1) adjusting DoF at capture time, a.k.a. camera autofocus (AF), within the context of dynamic scenes (i.e., video AF); (2) computationally manipulating the DoF as a post-capture process. To this aim, we leverage a new imaging sensor technology known as the dual-pixel (DP) sensor. DP sensors are used to optimize camera AF and can provide good cues to estimate the amount of defocus blur present at each pixel location. In particular, this thesis provides the first 4D temporal focal stack dataset, along with an AF platform, to examine video AF. It also presents insights about user preference that motivate two novel video AF algorithms.
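Contrast-detection autofocus, the family of techniques that lens-movement AF typically builds on, scores candidate lens positions with a focus measure and drives the lens toward the position that maximizes it. A common focus measure is the variance of the image Laplacian; the sketch below is an illustrative stand-in, not the thesis's AF algorithms:

```python
import numpy as np

def laplacian_variance(img):
    """Focus measure: variance of the 4-neighbour Laplacian over the
    image interior. Sharper (better-focused) images score higher
    because defocus blur suppresses high spatial frequencies."""
    lap = (img[:-2, 1:-1] + img[2:, 1:-1] + img[1:-1, :-2]
           + img[1:-1, 2:] - 4.0 * img[1:-1, 1:-1])
    return float(lap.var())
```

A high-frequency pattern (e.g. a checkerboard) scores far above a smooth gradient, which is the ordering an AF loop relies on when sweeping the lens.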
As for post-capture DoF manipulation, we examine the problem of reducing defocus blur (i.e., extending the DoF) by introducing a new camera aperture adjustment procedure to collect the first dataset that pairs images with real defocus blur and their corresponding all-in-focus ground truth. We also propose the first end-to-end learning-based defocus deblurring method. We extend image defocus deblurring to a new application domain (i.e., video defocus deblurring) by designing a data synthesis framework that generates realistic DP video data through modeling physical camera constraints, such as lens aberration and radial distortion. Finally, we build on this data synthesis framework to synthesize shallow DoF with other aesthetic effects, such as multi-view synthesis and image motion.

Item Open Access Machine Learning and Digital Histopathology Analysis for Tissue Characterization and Treatment Response Prediction in Breast Cancer (2023-12-08) Saednia, Khadijeh Shirin; Sadeghi-Naini, Ali

Breast cancer is the most commonly diagnosed cancer and the leading cause of cancer-related death in women. Early diagnosis and prognosis in breast cancer patients can permit more therapeutic options and possibly improve their survival and quality of life. The gold-standard approach for breast cancer diagnosis and characterization is histopathology assessment of biopsy specimens, which is time- and resource-demanding. In this dissertation project, state-of-the-art machine learning (ML) methods have been developed and investigated for breast tissue characterization, nuclei segmentation, and chemotherapy response prediction in breast cancer patients using pre-treatment digitized histopathology images. First, a novel multi-scale attention-guided deep learning model is introduced to characterize breast tissue on digital pathology images according to four histological types.
Evaluation results on the test set show the effectiveness of the proposed approach in accurate histopathology image classification, with an accuracy of 97.5%. In the next step, a cascaded deep-learning-based model is proposed to accurately delineate tumor nuclei in digital pathology images, an essential step for extracting hand-crafted quantitative features for analysis with conventional ML models. The proposed model achieves an F1 score of 0.83 on an independent test set. Finally, two novel ML frameworks are introduced and investigated for chemotherapy response prediction. In the first approach, a digital histopathology image analysis framework has been developed to extract various subsets of quantitative features from the segmented digitized slides for conventional ML model development. Several ML experiments have been conducted with different feature sets to develop prediction models of therapy response using a gradient boosting machine with decision trees. The proposed model with the optimal feature set achieves an accuracy of 84%, sensitivity of 85%, and specificity of 82% on an independent test set. The second approach introduces a hierarchical self-attention-guided deep learning framework to predict breast cancer response to chemotherapy using digital histopathology images of pre-treatment tumor biopsies. The whole slide images (WSIs) are processed automatically through the proposed hierarchical framework, consisting of patch-level and tumor-level processing modules followed by a patient-level response prediction component. A combination of convolutional and transformer modules is utilized at each processing level. The proposed framework outperforms the conventional ML models with a test accuracy, sensitivity, and specificity of 86%, 87%, and 83%, respectively.
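The accuracy, sensitivity, and specificity figures above follow the usual confusion-matrix definitions; for reference, a minimal sketch (illustrative only, not the project's evaluation code):

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity (recall on positives), and specificity
    (recall on negatives) from confusion-matrix counts."""
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }
```

Reporting sensitivity and specificity alongside accuracy matters here because response/non-response classes are typically imbalanced, and accuracy alone can hide poor recall on the minority class.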
The proposed methods and the reported results in this dissertation are steps toward streamlining the histopathology workflow and implementing response-guided precision oncology for breast cancer patients.

Item Open Access Machine Learning and Quantitative Imaging for the Management of Brain Metastasis (2023-03-28) Jalalifar, Seyed Ali; Sadeghi-Naini, Ali

A growing number of cancer cases are diagnosed with brain metastasis annually, significantly affecting patients' clinical course and quality of life. Although a considerable percentage of cancer patients survive for several years if the disease is discovered at an early stage while it is still localized, the median survival decreases considerably once the tumour has metastasized to the brain. Early detection followed by precise and effective treatment of brain metastasis may lead to improved patient survival and quality of life. A main challenge in prescribing an effective treatment regimen is the variability of tumour response to treatment, e.g., radiotherapy as a main treatment option for brain metastasis, across patients receiving similar cancer therapy, due to many patient-related factors. Stratifying patients based on their predicted response and consequently assessing their response to therapy are challenging yet crucial tasks. While risk assessment models with standard clinical attributes have been proposed for patient stratification, the imaging data acquired for these patients as a part of the standard of care are not computationally analyzed or directly incorporated in these models. Further, therapy response monitoring and assessment is a cumbersome task for patients with brain metastasis, requiring longitudinal tumour delineation on MRI volumes before and at multiple follow-up sessions after treatment. This is aggravated by the time-sensitive nature of the disease.
In an effort to address these challenges, a number of machine learning frameworks and computational techniques in the areas of automatic tumour segmentation, radiotherapy outcome assessment, and therapy outcome prediction have been introduced and investigated in this dissertation. Powered by advanced machine learning algorithms, a complex attention-guided segmentation framework is introduced and investigated for segmenting brain tumours on serial MRI. The experimental results demonstrate that the proposed framework can achieve Dice scores of 91.5% on the baseline scans and 84.1% to 87.4% on the follow-up scans. This framework is then applied in a proposed system that follows standard clinical criteria based on post-treatment changes in tumour size to assess tumour response to radiotherapy automatically. The system demonstrates very good agreement with expert clinicians in detecting local response, with an accuracy of over 90%. Next, innovative machine-learning-based solutions are proposed and investigated for radiotherapy outcome prediction before or early after therapy, using MRI radiomic models and novel deep learning architectures that analyze treatment-planning MRI with and without standard clinical attributes. The developed models demonstrate an accuracy of up to 82.5% in predicting radiotherapy outcome before treatment initiation.
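The two quantities underpinning this pipeline, segmentation overlap and size-change response criteria, have simple standard forms. The Dice score measures mask overlap, and response labelling from diameter change follows RECIST-style thresholds; the sketch below uses textbook definitions and illustrative thresholds, not the dissertation's exact implementation:

```python
import numpy as np

def dice_score(pred, gt):
    """Dice overlap between two binary masks: 2|A n B| / (|A| + |B|)."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    return 2.0 * inter / (pred.sum() + gt.sum())

def local_response(baseline_diam, followup_diam):
    """RECIST-like local response label from the relative change in
    the tumour's longest diameter (thresholds follow the standard
    criteria in spirit; they are illustrative here)."""
    change = (followup_diam - baseline_diam) / baseline_diam
    if change <= -0.30:
        return "partial response"
    if change >= 0.20:
        return "progression"
    return "stable disease"
```

Chaining the two, as the proposed system does, turns a pair of automatic segmentations (baseline and follow-up) into a response label without manual delineation.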
The ground-breaking machine learning platforms presented in this dissertation, along with the promising results obtained in the conducted experiments, are steps forward towards realizing important decision support tools for oncologists and radiologists and can eventually pave the way towards a personalized therapeutics paradigm for cancer patients.

Item Open Access Machine Learning-Based Defences Against Advanced 'Session-Replay' Web Bots (2024-03-16) Sadeghpour, Shadi; Vlajic, Natalija

The widespread adoption of the Internet has brought about significant benefits for modern society, but has also led to an increase in malicious activities, particularly through the use of web bots. While some bots serve useful purposes, the proliferation of malicious web bots poses a significant threat to Internet security, impacting individuals, businesses, governments, and society as a whole. The emergence of AI-powered web bots capable of mimicking human behavior and evading detection has further exacerbated this problem. This dissertation aims to deepen our understanding of advanced web bots and the web bot attacks that often signal fraudulent online activities. In particular, we focus on session-replay web bots, the latest and most advanced type of web bot, which present an especially difficult challenge in online domains where multiple genuine human users frequently exhibit similar behavioral patterns, such as news, banking, or gaming sites. To achieve our research objectives, we have meticulously curated an extensive dataset encompassing both human- and bot-generated data. Additionally, we have developed our own prototype of an advanced session-replay bot (called ReBot), which has enabled us to accurately simulate the attacks conducted by this particular category of web bots. Moreover, by infusing randomness into the design of ReBot, we have been able to achieve varying degrees of bot and attack evasiveness.
From the defenders' perspective, and by leveraging state-of-the-art deep learning algorithms, we have proposed several effective strategies for the detection of advanced session-replay bot attacks. One of our proposed techniques deploys the concept of moving-target defence in the form of webpage randomization, which is particularly challenging for the attacker to overcome. This thesis also explores the utilization of generative machine learning models for the purpose of generating synthetic bot sessions. The ability to synthesize advanced session-replay bots - as opposed to looking for real-world instances of these bots or evidence of their activity in real-world logs - is of critical importance if we are to make timely and effective advances in the field of web bot detection and defence.

Item Open Access Machine Stereo Vision for Medical Image Registration (2021-03-08) Speers, Andrew David; Wildes, Richard

Image-guided liver surgery aims to enhance the precision of resection and ablation by providing fast localization of tumours and adjacent complex vasculature to improve oncologic outcome. This dissertation presents a novel end-to-end system for fast and accurate 3D surface reconstruction and motion estimation of the liver for alignment of intraoperative imagery with a preoperative volumetric scan. The system is designed and evaluated for application to liver surgery in an open setting, the dominant setting for such procedures. The system comprises three key components: initialization, 3D surface recovery, and 3D motion estimation. Initialization is performed semi-automatically using a Branch-and-Bound (BnB) strategy to generate a set of globally optimal shape-based registration candidates from which the user can select a suitable initialization. 3D surface recovery is performed using an adaptive Coarse-to-Fine (CTF) stereo algorithm that provides dense, data-driven reconstructions in a computationally efficient manner.
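The coarse-to-fine principle can be illustrated with a toy version that estimates a single global disparity between a rectified stereo pair: match cheaply on downsampled images, then refine the estimate within a small search window at each finer pyramid level. This is a deliberately simplified sketch of the CTF idea, not the dissertation's dense reconstruction algorithm:

```python
import numpy as np

def downsample(img):
    # 2x2 average pooling: one pyramid level
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def shift_sad(left, right, d):
    # mean absolute difference with the right image shifted by d columns
    if d == 0:
        return float(np.abs(left - right).mean())
    return float(np.abs(left[:, d:] - right[:, :-d]).mean())

def coarse_to_fine_disparity(left, right, levels=3, search=4):
    """Global horizontal disparity estimated coarse-to-fine: a full
    (small) search at the coarsest level, then only +/- search refinement
    at each finer level, which is what makes CTF matching cheap."""
    pyr_l, pyr_r = [left], [right]
    for _ in range(levels - 1):
        pyr_l.append(downsample(pyr_l[-1]))
        pyr_r.append(downsample(pyr_r[-1]))
    d = 0
    for lvl in range(levels - 1, -1, -1):
        d *= 2  # propagate the coarse estimate to the finer level
        d = min(range(max(0, d - search), d + search + 1),
                key=lambda c: shift_sad(pyr_l[lvl], pyr_r[lvl], c))
    return d
```

A dense reconstruction applies the same coarsen-match-refine loop per pixel (with regularization) rather than globally, but the cost saving comes from the same narrowing of the search range at fine scales.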
A robust 3D motion estimation technique based on interframe feature matching is then used to register a time series of reconstructions back to the initial frame of the sequence. The system has been evaluated empirically on novel laboratory and intraoperative datasets, with results showing that performance is within the tolerances expected for integration into Surgical Navigation (SN) systems.
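Rigid alignment of matched 3D feature points between frames is classically solved in closed form with the Kabsch (orthogonal Procrustes) method, a standard technique offered here as illustration rather than the dissertation's exact estimator:

```python
import numpy as np

def rigid_align(P, Q):
    """Least-squares rotation R and translation t with R @ p + t ~= q
    for matched 3D points stored as rows of P and Q (Kabsch algorithm)."""
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cp).T @ (Q - cq)          # 3x3 cross-covariance of centered sets
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    D = np.diag([1.0, 1.0, d])         # guard against reflection solutions
    R = Vt.T @ D @ U.T
    t = cq - R @ cp
    return R, t
```

Wrapping this estimator in a robust loop (e.g. RANSAC over the feature matches) gives the kind of interframe motion estimate needed to chain a time series of reconstructions back to the initial frame.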