Developing Advanced Representation Learning Techniques for mRNA Sequence and Structure Modeling
| dc.contributor.advisor | Huang, Jimmy |
| dc.contributor.author | Nahali, Sepideh |
| dc.date.accessioned | 2025-11-11T19:58:31Z |
| dc.date.available | 2025-11-11T19:58:31Z |
| dc.date.copyright | 2025-06-26 |
| dc.date.issued | 2025-11-11 |
| dc.date.updated | 2025-11-11T19:58:30Z |
| dc.degree.discipline | Information Systems and Technology |
| dc.degree.level | Master's |
| dc.degree.name | MA - Master of Arts |
| dc.description.abstract | Recent studies in bioinformatics and genomics focus on analyzing RNA sequences, which are complex due to diverse nucleotide compositions, varying lengths, and multiple isoforms. Accurately modeling these sequences is essential for predicting mRNA degradation, a key factor in designing effective RNA-based therapies. However, many existing models struggle to capture the intricate relationships between sequence and structure, limiting their predictive power. We introduce StructmRNA, a BERT-based model using dual-level and conditional masking to embed RNA sequences and structures. This enables accurate prediction of mRNA sequences and structures without explicit structural data, effectively capturing sequence-structure dependencies. Evaluations show StructmRNA outperforms existing models in predicting mRNA degradation and secondary structure. Experiments with GAN-generated RNA sequences showed no performance improvement. Nonetheless, StructmRNA’s consistent convergence over 30 epochs highlights its robustness and accuracy. This work advances RNA representation learning and demonstrates deep learning’s potential in RNA-based therapeutic design and bioinformatics. |
| dc.identifier.uri | https://hdl.handle.net/10315/43254 |
| dc.language | en | |
| dc.rights | Author owns copyright, except where explicitly noted. Please contact the author directly with licensing requests. | |
| dc.subject | Bioinformatics | |
| dc.subject | Artificial intelligence | |
| dc.subject | Genetics | |
| dc.subject.keywords | Bioinformatics | |
| dc.subject.keywords | mRNA degradation prediction | |
| dc.subject.keywords | mRNA sequences | |
| dc.subject.keywords | Secondary structures | |
| dc.subject.keywords | StructmRNA model | |
| dc.subject.keywords | Machine learning | |
| dc.subject.keywords | Two-level masking | |
| dc.subject.keywords | Conditional masking | |
| dc.subject.keywords | Synthetic RNA data | |
| dc.subject.keywords | BERT model | |
| dc.subject.keywords | Sequence-structure relationship | |
| dc.title | Developing Advanced Representation Learning Techniques for mRNA Sequence and Structure Modeling | |
| dc.type | Electronic Thesis or Dissertation |