Information Systems and Technology

Permanent URI for this collection

https://hdl.handle.net/10315/27588

Browse

Now showing 1 - 3 of 3

Open Access
A Hierarchical Rule-Based Security Management System for Date-Intensive Applications
(2018-11-21) Rouf, Yar Akhter; Litoiu, Marin
Applications in today's software development environment evolve at a rapid rate, constantly providing their users with new functionalities. As a result, it becomes increasingly complex to understand the entire application. The security team and the developers may not completely understand each others approaches, resulting in a less secure system with vulnerabilities. In addition, there is large amount of security data to be analyzed. To mitigate these issues, we propose a platform to support the SecDevOps framework, a hierarchical distributed architecture for security control that uses a Business Rules Engine (BRE). The BRE simplifies security rules by allowing the teams to write them at an operational level rather than at the network level, which requires specialized knowledge. Business rules are universally understood by the different teams, resulting in effective inter-team communication. Additionally, the platform can expand and scale with new security rules and data sources at runtime in a systematic manner.
Open Access
An Approach to Designing Clusters for Large Data Processing
(2015-08-28) Sandel, Roni; Litoiu, Marin
Cloud computing is increasingly being adopted due to its cost savings and abilities to scale. As data continues to grow rapidly, an increasing amount of institutions are adopting non standard SQL clusters to address the storage and processing demands of large data. However, evaluating and modelling non SQL clusters presents many challenges. In order to address some of these challenges, this thesis proposes a methodology for designing and modelling large scale processing configurations that respond to the end user requirements. Firstly, goals are established for the big data cluster. In this thesis, we use performance and cost as our goals. Secondly, the data is transformed from relational data schema to an appropriate HBase schema. In the third step, we iteratively deploy different clusters. We then model the clusters and evaluate different topologies (size of instances, number of instances, number of clusters, etc.). We use HBase as the large data processing cluster and we evaluate our methodology on traffic data from a large city and on a distributed community cloud infrastructure.
Open Access
Exploring and Evaluating the Scalability and Efficiency of Apache Spark using Educational Datasets
(2018-08-27) Zhang, Jian; Yang, Zijiang Cynthia
Research into the combination of data mining and machine learning technology with web-based education systems (known as education data mining, or EDM) is becoming imperative in order to enhance the quality of education by moving beyond traditional methods. With the worldwide growth of the Information Communication Technology (ICT), data are becoming available at a significantly large volume, with high velocity and extensive variety. In this thesis, four popular data mining methods are applied to Apache Spark, using large volumes of datasets from Online Cognitive Learning Systems to explore the scalability and efficiency of Spark. Various volumes of datasets are tested on Spark MLlib with different running configurations and parameter tunings. The thesis convincingly presents useful strategies for allocating computing resources and tuning to take full advantage of the in-memory system of Apache Spark to conduct the tasks of data mining and machine learning. Moreover, it offers insights that education experts and data scientists can use to manage and improve the quality of education, as well as to analyze and discover hidden knowledge in the era of big data.

Browse

Browsing Information Systems and Technology by Subject "Big data"

Results Per Page

Sort Options