Computer Science
Browsing Computer Science by Title
Now showing 1 - 20 of 85
Item Open Access
A 360-degree Omnidirectional Photometer Using a Ricoh Theta Z1 (2023-12-08)
MacPherson, Ian Michael; Brown, Michael S.

Spot photometers measure the luminance emitted or reflected from a small surface area in a physical environment. Because the measurement is limited to a "spot," capturing dense luminance readings for an entire environment is impractical. This thesis demonstrates the potential of using an off-the-shelf commercial camera as a 360-degree luminance meter. The method uses the Ricoh Theta Z1 camera, which provides a full 360-degree omnidirectional field of view and an API to access the camera's minimally processed RAW images. Working from the RAW images, this thesis describes a calibration method that maps RAW images captured under different exposures and ISO settings to luminance values. By combining the calibrated sensor with multi-exposure high-dynamic-range imaging, the method provides a cost-effective mechanism for capturing dense luminance maps of environments. The results show that the Ricoh Theta, calibrated as a luminance meter, performs well when validated against a significantly more expensive spot photometer.

Item Open Access
A Graph-Based Deep Learning Model for Anti-Money Laundering (2023-08-04)
Bakhshinejad, Nazanin; Nguyen, Uyen T.

Anti-money laundering (AML) refers to a set of laws, regulations, and procedures that financial institutions and other regulated entities are required to implement to identify and prevent the use of their services for illicit financial activities. Current AML solutions rely on rule-based algorithms, which are not scalable and are ineffective against new, evolving, or complex money laundering patterns. Meanwhile, the rapid advancement of technology and new, sophisticated financial instruments have increased the complexity of money laundering methods. Machine learning has the capability to learn and identify new or complex money laundering patterns.
Within this context, the thesis offers two major contributions. First, we conducted a survey that provides a comprehensive review of existing machine learning-based AML solutions from a data-oriented perspective. We studied existing machine learning models proposed for AML in terms of the datasets used, input and output data, approaches to the class imbalance problem, and classification metrics. To the best of our knowledge, this survey is the first to focus on these different aspects of data, classification metrics, and related issues (e.g., the class imbalance problem). Second, we propose an AML detection system and a graph-based machine learning model to identify suspicious transactions. The detection system first transforms a dataset of accounts and transactions into a graph structure and applies the node2vec (N2V) algorithm to convert the graphs into feature vectors. The feature vectors are then input into a graph convolutional network (GCN), which classifies the transactions as normal or suspicious. (Each suspicious transaction, known as an alarm, is investigated manually by a financial analyst to confirm whether it is a normal transaction or a money laundering transaction.) To overcome the inherent class imbalance of AML data (i.e., the number of money laundering transactions in a dataset is much smaller than the number of normal transactions), we use a combination of techniques, including over-sampling and classifier threshold moving. Our experimental results show that the proposed N2V-GCN system can achieve very low false negative rates (money laundering transactions misclassified as normal transactions), reaching zero in one experiment.
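Classifier threshold moving, one of the class-imbalance techniques mentioned above, is simple to illustrate: instead of labelling a transaction suspicious only when its predicted probability exceeds 0.5, the cut-off is lowered so that fewer laundering cases are missed. The scores and threshold values below are invented for illustration and are not taken from the thesis.

```python
def classify_with_threshold(scores, threshold):
    """Label a transaction suspicious when its model score meets the threshold."""
    return ["suspicious" if s >= threshold else "normal" for s in scores]

# Hypothetical per-transaction probabilities of money laundering.
scores = [0.05, 0.20, 0.35, 0.80]

default = classify_with_threshold(scores, 0.5)  # standard 0.5 cut-off
lowered = classify_with_threshold(scores, 0.3)  # moved threshold favours recall

# Lowering the threshold flags more transactions, trading extra false
# alarms for fewer missed laundering cases (false negatives).
print(default)  # ['normal', 'normal', 'normal', 'suspicious']
print(lowered)  # ['normal', 'normal', 'suspicious', 'suspicious']
```

In the imbalanced AML setting this trade is attractive because a false alarm costs an analyst's review, while a false negative lets a laundering transaction through.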
At the same time, the proposed system lowers the false alarm rate (normal transactions classified as suspicious) to under 50%, much lower than the current industry standard of 90% or more.

Item Open Access
A Novel Vulnerable Smart Contracts Profiling Method Based on Advanced Genetic Algorithm Using Penalty Fitness Function (2024-11-07)
HajiHosseinKhani, Sepideh; Habibi Lashkari, Arash

With the advent of blockchain networks, there has been a transition from traditional contracts to Smart Contracts (SCs), which are crucial for maintaining trust within these networks. Previous methods for analyzing SC vulnerabilities typically lack accuracy and effectiveness, struggling to detect complex vulnerabilities due to limited data availability. This study introduces a novel approach to detecting and profiling SC vulnerabilities, featuring two components: a new analyzer named BCCC-SCsVulLyzer and an advanced Genetic Algorithm (GA) profiling method. The BCCC-SCsVulLyzer extracts 240 features, while the enhanced GA employs techniques such as a penalty fitness function and an adaptive mutation rate to profile vulnerabilities. Additionally, this work introduces a new dataset, BCCC-SCsVul-2024, with 111,897 Solidity source code samples for practical validation. Three taxonomies are established to enhance the efficiency of the profiling techniques. Our approach demonstrated superior precision and accuracy, proving efficient in time and space complexity. The profiling technique also makes the model highly transparent and explainable, highlighting the potential of GA-based profiling to improve SC vulnerability detection and enhance blockchain security.

Item Open Access
A System for Plan Recognition in Discrete and Continuous Domains (2022-08-08)
Scheuhammer, Alistair Sterling Benger; Lesperance, Yves

In this thesis, I implement a programming framework that can be used to model and solve plan recognition problems.
My primary goal for this system is that it handle continuous probability spaces as easily as discrete ones. The framework is based primarily on the probabilistic situation calculus developed by Belle and Levesque, and is an extension of Ergo, a programming language developed by Levesque. The system I have built allows one to specify complex domains and dynamic models at a high level, and is written in a language that is user-friendly and easy to understand. It has strong formal foundations, can be used to compare multiple plan recognition methods, and makes it easier to perform plan recognition in tandem with other forms of reasoning, such as threat assessment, reasoning about action, and planning to respond to the actions performed by the observed agent.

Item Open Access
A Wait-free Queue with Poly-logarithmic Worst-case Step Complexity (2023-03-28)
Naderibeni, Hossein; Ruppert, Eric

In this work, we introduce a novel linearizable wait-free queue implementation. Linearizability and lock-freedom are standard requirements for designing shared data structures. To the best of our knowledge, all existing linearizable lock-free queues in the literature share a common worst-case problem, called the CAS Retry Problem. We show that our algorithm avoids this problem through the helping mechanism it uses, and has a worst-case running time better than prior lock-free queues. The amortized number of steps for an Enqueue or Dequeue in our algorithm is O(log^2 p + log q), where p is the number of processes and q is the size of the queue when the operation is linearized.

Item Open Access
Active Observers in a 3D World: Human Visual Behaviours for Active Vision (2022-12-14)
Solbach, Markus Dieter; Tsotsos, John K.

Human-like performance in computational vision systems is yet to be achieved.
In fact, human-like visuospatial behaviours are not well understood, yet they are a crucial capability for any robotic system whose role is to be a real assistant. This dissertation examines the human visual behaviours involved in solving a well-known visual task, the Same-Different Task, which is used as a probe to explore the space of active human observation during visual problem-solving. It asks a simple question: "are two objects the same?" To study this question, we created a set of novel objects with known complexity to push the boundaries of the human visual system. We wanted to examine these behaviours in contrast to the static, 2D, display-driven experiments done to date, and thus needed to develop a complete infrastructure for an experimental investigation using 3D objects and active, free-moving human observers. We built a novel psychophysical experimental setup that allows precise, synchronized gaze and head-pose tracking of subjects performing the task. To the best of our knowledge, no other system provides the same characteristics. We have collected detailed, first-of-its-kind data of humans performing a visuospatial task in hundreds of experiments. We present an in-depth analysis across several metrics: subjects demonstrated up to 100% accuracy for specific settings, and no trial used fewer than six fixations. We provide a complexity analysis revealing that human performance in solving this task is about O(n), where n is the size of the object. Furthermore, we discovered that our subjects used many different visuospatial strategies and showed that these are deployed dynamically. Strikingly, no learning effect was observed that affected accuracy. With this extensive and unique data set, we addressed its computational counterpart. We used reinforcement learning to learn the three-dimensional Same-Different Task and discovered crucial limitations that were overcome only when the task was simplified to the point of trivialization.
Lastly, we formalized a set of suggestions to inform the enhancement of existing machine learning methods, based on our findings from the human experiments and on multiple tests we performed with modern machine learning methods.

Item Open Access
Active Visual Search: Investigating human strategies and how they compare to computational models (2024-03-16)
Wu, Tiffany; Tsotsos, John K.

Real-world visual search by fully active observers has not been sufficiently investigated. While the visual search paradigm has been widely used, most studies use a 2D, passive observation task in which immobile subjects search through stimuli on a screen. Computational models have likewise been compared to human performance only on 2D image search. I conduct an active search experiment in a 3D environment, measuring the eye and head movements of untethered subjects during search. The results reveal patterns that form search strategies, such as search paths repeated within and across subjects. Learning trends were found, but only in target-present trials. Foraging models capture subjects' location-leaving actions, while robotics models capture viewpoint selection behaviours; eye movement models were less applicable to 3D search. The richness of the data collected in this experiment opens many avenues of exploration and the possibility of modelling active visual search in a more human-informed manner.

Item Open Access
Advancing Blind Face Restoration: Robustness and Identity Preservation with Integrated GAN and Codebook Prior Architectures (2024-03-16)
Tayarani Bathaie, Seyed Nima; An, Aijun

Blind Face Restoration (BFR) is a challenging task in computer vision that aims to reconstruct High-Quality (HQ) facial images from Low-Quality (LQ) inputs. BFR is an ill-posed problem, necessitating auxiliary information to constrain the solution space.
While geometric and generative facial priors provide some support in BFR, their effectiveness wanes under intense degradation. Discrete codebook priors, though promising, grapple with the difficulty of associating intensely degraded images with their corresponding codes. To address these limitations, this research introduces a two-stage restoration approach, termed Identity-embedded GAN and Codebook Priors (IGCP), which combines the strengths of both generative and codebook priors. In the first stage, our approach employs a Generative Prior Restorer (GPR) network for initial image restoration. Distinct from existing methods that apply identity-based losses to the final restored image, our work embeds identity information directly into the style vectors of the StyleGAN2 network during the generation process. This is achieved through a new "identity-in-style" loss, ensuring superior fidelity and identity preservation even for severely degraded images. In the second stage, the approach uses a two-component framework called the Codebook Prior Restorer (CPR) network. This framework comprises a Vector Quantized AutoEncoder (VQAE) that mitigates artifacts and adds a final touch of quality, complemented by a Feature Transfer Module (FTM) that we show is necessary to ensure fidelity and identity preservation. Extensive experimental evaluations were conducted across five datasets, including our newly introduced CelebA-IntenseTest dataset. The results of these experiments demonstrate the efficacy of the IGCP approach.
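The abstract does not give the exact form of the identity-in-style loss, but identity-preservation terms in face restoration are commonly built from the cosine similarity between embeddings produced by a pretrained face-recognition network. The sketch below is purely illustrative of that general pattern: `cosine_identity_loss` and its plain-vector inputs stand in for real face embeddings, and nothing here is the thesis's actual formulation.

```python
import numpy as np

def cosine_identity_loss(restored_embedding, reference_embedding):
    """Illustrative identity loss: 1 - cosine similarity of face embeddings.

    In a real system the embeddings would come from a pretrained
    face-recognition encoder; here they are plain vectors so the
    arithmetic is easy to check by hand.
    """
    a = np.asarray(restored_embedding, dtype=float)
    b = np.asarray(reference_embedding, dtype=float)
    cos = (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return 1.0 - cos

# Matching identities give zero loss; orthogonal embeddings give loss 1.
print(cosine_identity_loss([1.0, 0.0], [1.0, 0.0]))  # 0.0
print(cosine_identity_loss([1.0, 0.0], [0.0, 1.0]))  # 1.0
```

A term of this shape pushes the generator's output toward the identity of the reference image; the thesis's contribution is applying such identity information inside the style vectors during generation rather than only on the final output.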
Notably, IGCP has shown exceptional performance in handling various degradation levels, setting new benchmarks in the domain of BFR.

Item Open Access
An Axiomatic Perspective on Anomaly Detection (2024-11-07)
Wyke, Chester Samuel; Urner, Ruth

A major challenge for both the theoretical treatment and the practical application of unsupervised learning tasks, such as clustering, anomaly detection, or generative modeling, is the inherent lack of quantifiable objectives. Choosing methods and evaluating outcomes is then often a matter of ad-hoc heuristics or personal taste. Anomaly detection is often employed as a preprocessing step for other learning tasks, and unsound decisions at this stage may thus have far-reaching consequences. In this work, we propose an axiomatic framework for analyzing the behaviours of anomaly detection methods. We propose a basic set of desirable properties (or axioms) for distance-based anomaly detection methods and identify dependencies and (in)consistencies between subsets of these. In addition, we include empirical results that demonstrate the benefits of this axiomatic perspective on the behaviours of anomaly detection methods. Our experiments illustrate how some commonly employed algorithms violate, perhaps unexpectedly, a basic desirable property. In particular, we highlight a material problem with the commonly used Isolation Forest method: bands of space likely to be labelled as inliers extend infinitely far away from the training data. We also experimentally demonstrate that another common method, Local Outlier Factor, is vulnerable to adversarial data poisoning.
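One natural desirable property of the kind the axiomatic framework above considers is that a point far from all training data should score at least as anomalous as a point near the data. As a concrete illustration (not a method or axiom taken from the thesis), a simple nearest-neighbour-distance scorer satisfies this by construction, whereas the abstract notes that Isolation Forest's inlier bands extend arbitrarily far from the data:

```python
import math

def nn_distance_score(point, training_data):
    """Anomaly score = distance to the nearest training point.

    A minimal distance-based detector: larger scores mean more anomalous.
    """
    return min(math.dist(point, x) for x in training_data)

train = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

near = nn_distance_score((0.5, 0.5), train)   # inside the cluster
far = nn_distance_score((10.0, 10.0), train)  # far from all training data

# The farther point receives the strictly higher anomaly score, so this
# scorer cannot label an infinitely distant region as inliers.
assert far > near
```

Detectors whose inlier regions are unbounded, by contrast, can assign low anomaly scores to points arbitrarily far from any training example, which is exactly the failure mode the thesis highlights for Isolation Forest.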
To conduct these experimental evaluations, we built a tool for dataset generation, experimentation, and visualization, which is an additional contribution of this work.

Item Open Access
Analyzing Color Imaging Failure on Consumer Cameras (2022-12-14)
Tedla, SaiKiran Kumar; Brown, Michael S.

There are currently many efforts to use consumer-grade cameras for home-based health and wellness monitoring. Such applications rely on users capturing images with their personal cameras for analysis in a home environment. When color is a primary feature for diagnostic algorithms, the camera requires color calibration to ensure accurate color measurements. Given the importance of such diagnostic tests for users' health and well-being, it is important to understand the conditions under which color calibration may fail. To this end, we analyzed a wide range of camera sensors and environmental lighting to determine (1) how often color calibration failure is likely to occur, and (2) the underlying reasons for failure. Our analysis shows that in well-lit environments, it is rare to encounter a camera sensor and lighting condition combination that results in color imaging failure. Moreover, when color imaging does fail, the cause is almost always attributable to spectrally poor environmental lighting rather than the camera sensor. We believe this finding is useful for scientists and engineers developing color-based applications with consumer-grade cameras.

Item Open Access
Anonymity in Developer Communities: Insights from Developer Perceptions and Stack Overflow Profiles (2024-11-07)
Lemango, Elim Yoseph; Nayebi, Maleknaz

This thesis consists of two studies: an interview study with 34 early-career developers and a mining study analyzing 130,000 developer profiles. The interview study examines developers' definitions of anonymity, their preferences for anonymity, and their engagement with privacy policies.
It also explores whether presenting privacy policies using contextual integrity principles improves understanding of those policies. The developers in the interview study defined anonymity as withholding identifiable information such as name, location, and professional background. The mining study investigates how much information developers share across platforms and how easily their professional profiles can be retrieved. Our findings show that using a Stack Overflow location and screen name in LinkedIn searches narrows down candidate profiles, but cross-linking Stack Overflow profile data with GitHub or Twitter adds noise. This research provides valuable insights into how developers define anonymity and how that definition affects their behaviour when using social coding platforms.

Item Open Access
Augmented Reality Water-Level Task (2023-08-04)
Abadi, Romina; Allison, Robert

The "Water-Level Task" asks participants to draw the water level in a tilted container. Studies have shown that many adults have difficulty with the task. Our study aimed to determine whether the misconception about water orientation persists in a more natural environment. We implemented an AR water-in-container effect to create an augmented reality (AR) version of the Water-Level Task (AR-WLT). In the AR-WLT, participants interacted with two containers half full of water in a HoloLens 2 AR display and were asked to determine which looked more natural. In at least one of the two simulations, the water surface did not remain horizontal. A traditional online WLT was created to recruit low- and high-scoring participants. Our results showed that low-scoring individuals were more likely to make errors in the AR version.
However, participants did not choose simulations close to their 2D drawings, suggesting that different cognitive and perceptual factors were involved in the different environments.

Item Open Access
Automation in Open Source Software: A GitHub Marketplace Analysis (2023-12-08)
Saroar, Sk Golam; Nayebi, Maleknaz

This thesis comprises two papers that examine automation tools in the Open Source Software (OSS) ecosystem on GitHub, focusing on GitHub Actions as well as the GitHub Marketplace, a platform for sharing these Actions for collaboration and reuse. Our research aims to understand and explore the state of automation in OSS, as existing studies have mainly focused on statistical analysis of samples of GitHub repositories, neither considering developers' perspectives nor leveraging the GitHub Marketplace. The first paper presents a survey analysis investigating the motivations, decision criteria, and challenges associated with creating, publishing, and using Actions. The second paper explores the GitHub Marketplace and presents a mapping study of 7,878 Actions and 515 research papers mapped into 32 categories. We found a substantial industry-academia gap, with researchers focusing on experimentation and practitioners relying more on exploration tools. The limited number of OSS automation tools published in academia contrasts with the convenient access practitioners have to the marketplace offerings. This thesis contributes to the understanding of automation in the OSS ecosystem, highlights the industry-academia gap, offers insights for researchers to build on existing work, and aids practitioners in navigating the technology and finding synergies.

Item Open Access
Batch Query Memory Prediction Using Deep Query Template Representations (2023-08-04)
Jaramillo, Nicolas Andres; Papagelis, Manos; Litoiu, Marin

This thesis introduces a novel approach, called LearnedWMP, for predicting the memory demand of a batch of queries in a database workload.
Existing techniques focus on estimating the resource demand of individual queries, failing to capture the net resource demand of a workload. LearnedWMP leverages the query plan and groups queries with similar characteristics into pre-built templates. A histogram representation of these templates is generated for the workload, and a regressor predicts the resource demand, specifically memory cost, from this histogram. Experimental results on three database benchmarks demonstrate a 47.6% improvement in memory estimation over the state of the art. Additionally, the approach outperforms various machine and deep learning techniques for individual query prediction, with models that are 3x to 10x faster and at least 50% smaller.

Item Open Access
Chart Question Answering with a Universal Vision-Language Pretraining Approach (2023-12-08)
Parsa Kavehzadeh; Enamul Hoque Prince

Charts are widely used for data analysis, providing visual representations of and insights into complex data. To facilitate chart-based data analysis using natural language, several downstream tasks have been introduced recently, including chart question answering. However, existing methods for these tasks often rely on pretraining on language or vision-language tasks, neglecting the explicit modeling of chart structures. To address this, we first build a large corpus of charts covering diverse topics and visual styles. We then present UniChart, a pretrained model for chart comprehension and reasoning. We propose several chart-specific pretraining tasks: (i) low-level tasks to extract the visual elements (e.g., bars, lines) and data from charts, and (ii) high-level tasks to acquire chart understanding and reasoning skills. Our experiments demonstrate that pretraining UniChart on a large corpus with chart-specific objectives, followed by fine-tuning, yields state-of-the-art performance on four downstream tasks.
Moreover, our model exhibits superior generalizability to unseen chart corpora, surpassing previous approaches that lack chart-specific objectives and utilize limited chart resources.

Item Open Access
Chart Question Answering with Visual and Logical Reasoning (2022-12-14)
Masry, Ahmed; Prince, Enamul Hoque

Charts are very popular for analyzing data. When exploring charts, people often ask complex reasoning questions that involve several logical and arithmetic operations. They also commonly refer to visual features of a chart in their questions. However, most existing datasets do not focus on such complex reasoning questions, as their questions are template-based and their answers come from a fixed vocabulary. In this thesis work, we present a large-scale benchmark covering 9.6K human-written questions and 23.1K questions generated from human-written chart summaries. To address the unique challenges in our benchmark involving visual and logical reasoning, we present transformer-based models that combine visual features and the data table of the chart. Moreover, we propose chart-specific pretraining tasks that improve the visual and logical reasoning skills of our models. While our models achieve state-of-the-art results on previous datasets and on our benchmark, the evaluation also reveals several remaining challenges in answering complex reasoning questions.

Item Open Access
Comparative Studies of Gesture-Based and Sensor-Based Input Methods for Mobile User Interfaces (2021-11-15)
Garg, Saurabh; MacKenzie, I. Scott

Three user studies were conducted to compare gesture-based and sensor-based interaction methods. The first study compared the efficiency and speed of three scroll navigation methods for touch-screen mobile devices: Tap Scroll (touch-based), Kinetic Scroll (gesture-based), and Fingerprint Scroll (our newly introduced sensor-based method). The second study compared the accuracy and speed of three zoom methods.
The first was GyroZoom, which uses the mobile phone's gyroscope sensor; the second was the gesture-based Pinch-to-Zoom method; the third, VolumeZoom, uses volume buttons reprogrammed to perform zoom operations. The third study, on text entry, compared a QWERTY-based onscreen keyboard with a novel 3D gesture-based Write-in-Air method that uses a webcam sensor. Our key finding from the three experiments is that sensor-based interaction methods are intuitive and provide a better user experience than gesture-based interaction methods, while matching the gesture-based methods in speed and accuracy.

Item Open Access
Computer Vision for Hockey Video Curation (2022-12-14)
Pidaparthy, Hemanth; Elder, James

Computer vision-based models are being actively investigated for tasks such as ball and player tracking, and the resulting insights are useful to both coaches and players seeking to improve performance. Applying computer vision to hockey video analysis is challenging because of the small size of the puck, the fast and non-smooth movement of the players, and frequent occlusions. In this thesis, I present my research on computer vision for hockey video curation, addressing three problems: 1) automatic sports videography, 2) play segmentation of hockey videos, and 3) automatic homography estimation. When recording broadcast hockey videos, professional camera operators move a PTZ camera to follow the play. Professional videography is expensive for amateur games, which motivates the development of a low-cost solution for automatic hockey videography. We used a novel method to obtain accurate ground truth for the puck location from wide-field video, and trained a novel deep network regressor to estimate the puck location in each frame. Centered on the predicted puck location, we dynamically cropped the wide-field video to generate a zoomed-in video following the play.
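The dynamic cropping step just described can be pictured as selecting a fixed-size window centred on the predicted puck location and shifted so it never leaves the frame. This is a generic sketch of that operation, not the thesis's implementation; the frame and window dimensions below are invented for the example.

```python
def crop_window(cx, cy, frame_w, frame_h, win_w, win_h):
    """Return the (left, top) corner of a win_w x win_h crop centred on
    (cx, cy), clamped so the window stays entirely inside the frame."""
    left = min(max(cx - win_w // 2, 0), frame_w - win_w)
    top = min(max(cy - win_h // 2, 0), frame_h - win_h)
    return left, top

# Puck near the middle of a 3840x1080 wide-field frame: window is centred.
print(crop_window(960, 540, 3840, 1080, 1280, 720))  # (320, 180)
# Puck near the left edge: the window is clamped at the frame boundary.
print(crop_window(100, 540, 3840, 1080, 1280, 720))  # (0, 180)
```

Applying this per frame (typically with smoothing of the window trajectory so the virtual camera does not jitter) yields the zoomed-in video that follows the play.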
The automatic videography system delivers continuous video over an entire game, but a typical hockey game features only 40-60 minutes of active play spread over 60-110 minutes, with breaks in play due to warm-ups and fouls. We therefore propose a novel solution for automatically identifying periods of play and no-play and generating a temporally compressed video that is easier to watch: we combine visual cues from the output of a deep network classifier with auditory cues from the referee's whistle using a hidden Markov model (HMM). Since the PTZ parameters of the camera vary constantly when recording broadcast hockey videos, the homography changes every frame, and knowing this homography allows graphics to be projected onto the ice surface. We estimate the homography by exploiting the consistency of the colours used for markings in ice hockey: we model the colours as a multivariate Gaussian and then use a two-step search for the homography that aligns the colours of the template image with those of a test image.

Item Open Access
Cyber-Physical Attacks Detection and Resilience Methods in Smart Grids (2023-12-08)
Sawas, Abdullah; Farag, Hany E. Z.

Backed by the deployment of increasingly reliable Information and Communication Technologies (ICT) infrastructure, modern power systems depend heavily on computerized circuits to function within an interconnected environment. In particular, the core domain of Smart Grids (SGs) relies on ICT networks and components to communicate control signals and data measurements, improving the efficiency of power generation and distribution while maintaining safe and reliable operations. ICTs have also extended the SG's domain of interaction to include other utilities, such as the natural gas grid, to efficiently utilize multiple energy forms and resources.
In the consumer domain, a growing number of appliances and autonomous smart loads equipped with Internet of Things (IoT) technology are being deployed into SGs, resulting in large portions of electric demand being remotely controlled. Despite their advantages, ICTs are vulnerable to cyber-attacks that can deteriorate SGs' operational safety and integrity. Thus, new approaches are needed to enhance the resiliency of SGs against cyber-physical attacks. To that end, this thesis develops new resiliency investigation approaches in the three aforementioned domains. First, in the SG domain, an efficient False Data Injection Attack (FDIA) approach is developed that imitates the behavior of an intelligent adversary searching for an optimal attack vector against State Estimation (SE) modules. Simulation results show that, using this approach, an adversary can identify attack vectors of minimal size and superior flexibility to manipulate, in real time, the power flow measurements of the system lines as perceived by the SE, without needing to acquire additional measurements. Attacks constructed with this approach therefore require less computational time and fewer resources than existing methods, making it useful for analyzing cyber-security vulnerabilities and designing resilient SE modules. Second, in the Integrated Energy System (IES) domain, an operational framework model is developed to serve as a testbed for performing and analyzing the impact of cyber-attacks. The framework models steady-state power and gas flow operations and presents a new financial interdependency operation scheduling model. It is validated on standard power distribution and transmission systems with variable generation and demand scenarios and high renewable penetration levels. Using this framework, an attack resiliency method is developed based on signal processing and machine learning tools.
The method detects 98.6% of external signals and 94.5% of internal control commands. Third, the vulnerability of Power Distribution Systems (PDSs) to compromised collections of IoT-enabled appliances is investigated, and a stealthy attack strategy is presented. Accordingly, a new index is developed, referred to as the Feeder Loading Abnormal Power Spectrum (FLAPS), and used in a novel real-time detection and prediction approach to counter stealthy attacks and estimate the attack onset time. Results demonstrate that the method can detect and raise alerts for stealthy attacks in a timely manner, thereby enabling the system to operate reliably and securely. By identifying new attacks and proposing detection methods and countermeasures, this thesis contributes to the collective effort to address the risks of cyber-attacks against SG components. Specifically, the quantitative results show that deploying the proposed methods will enhance the resiliency of SE and IESs and protect PDSs against the threats of large-scale deployments of IoT-enabled appliances.

Item Open Access
Data Acquisition for Domain Adaptation of Closed-Box Models (2024-03-16)
Liu, Yiwei; Yu, Xiaohui

The machine learning (ML) marketplace provides customers with various ML solutions to accelerate their business. Models in the ML market are often available only as closed boxes, and they may suffer from distribution shifts in new domains. Prior techniques cannot address this problem because they are either impractical to use or incompatible with the closed-box property of the models. Instead, we propose to acquire extra data to construct a "padding" model that helps the original closed box with its classification weaknesses in the target domain.
Our solution consists of a "weakness detector" that discovers the deficiencies of the original closed-box model, and an Augmented Ensemble approach that combines the source model and the padding model for better performance in the target domain, further diversifying the ML marketplace. Extensive experiments on several popular benchmark datasets confirm the superiority and robustness of our proposed framework over baseline approaches.
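The division of labour between the fixed closed-box model, the weakness detector, and the padding model can be sketched as a simple routing rule. The rule below (defer to the padding model exactly on inputs the detector flags) is an illustrative assumption for exposition, not the thesis's actual combination mechanism, and all the toy models are invented.

```python
def augmented_ensemble(x, source_predict, padding_predict, is_weak_region):
    """Route a query to the padding model when the (hypothetical) weakness
    detector flags the input as lying where the closed-box source model
    performs poorly; otherwise keep the source model's prediction."""
    if is_weak_region(x):
        return padding_predict(x)
    return source_predict(x)

# Toy stand-ins: the closed-box source model fails on negative inputs,
# and the weakness detector happens to flag exactly those.
source = lambda x: "positive"                          # closed box, unmodifiable
padding = lambda x: "negative" if x < 0 else "positive"  # trained on acquired data
weak = lambda x: x < 0                                   # weakness detector

print(augmented_ensemble(3, source, padding, weak))   # positive
print(augmented_ensemble(-2, source, padding, weak))  # negative
```

The key constraint the sketch respects is that the source model is never retrained or inspected internally; all adaptation lives in the acquired data, the padding model, and the routing decision.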