ADS Capstone Chronicles Revised

3

3.1 Identification and Processing of PII Data, Applying Deep Learning Models with Improved Accuracy and Efficiency

One study explored the use of deep learning models in maintaining data privacy for large enterprises. A natural language processing (NLP) based large language model was developed to automatically detect and mask PII data. Additionally, support vector machine, random forest (RF), logistic regression (LR), long short-term memory, and multilayer perceptron models were trained to detect and anonymize data. These models detected PII data in text that had been converted into vectors. The model performance results suggested that neural network-based models identified PII data most precisely when building NLP-based large language models (Mitra & Roy, 2018).

3.2 Anonymization of Sensitive Information in Medical Health Records

Protected health information (PHI) is any sensitive medical information that identifies an individual, such as records of health care services, diagnoses, treatments, and billing. This information cannot be shared directly outside of the hospital, so exhaustive deidentification of all PII and PHI is required. One study explored using NLP to remove PHI from Spanish clinical records. On the dataset used in that study, the neural network model performed best when given token-level features and static dictionaries of Spanish names and locations (Saluja et al., 2019).
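The token-level features and static dictionaries described in Section 3.2 can be illustrated with a minimal sketch. It is not the model from the cited study: the dictionaries, the function names (tokenize, token_features, mask_phi), and the placeholder tags are all hypothetical, and a simple lookup rule stands in for the trained neural network that would normally consume these features.

```python
import re

# Hypothetical, tiny dictionaries for illustration only; a real system
# would load far larger lists of Spanish names and locations.
SPANISH_NAMES = {"carmen", "pablo", "elena", "torres"}
SPANISH_LOCATIONS = {"madrid", "sevilla", "valencia"}

# Word tokens, or single punctuation characters.
TOKEN_RE = re.compile(r"\w+|[^\w\s]")


def tokenize(text):
    """Split text into word and punctuation tokens."""
    return TOKEN_RE.findall(text)


def token_features(token):
    """Build simple token-level features, including dictionary lookups."""
    lower = token.lower()
    return {
        "is_capitalized": token[:1].isupper(),
        "has_digit": any(ch.isdigit() for ch in token),
        "in_name_dict": lower in SPANISH_NAMES,
        "in_location_dict": lower in SPANISH_LOCATIONS,
    }


def mask_phi(text):
    """Replace dictionary-matched, capitalized tokens with placeholder tags.

    Assumption: this rule is a stand-in for a trained classifier that
    would take the same features as input.
    """
    masked = []
    for token in tokenize(text):
        feats = token_features(token)
        if feats["in_name_dict"] and feats["is_capitalized"]:
            masked.append("[NAME]")
        elif feats["in_location_dict"] and feats["is_capitalized"]:
            masked.append("[LOCATION]")
        else:
            masked.append(token)
    # Tokens are re-joined with single spaces, so punctuation is spaced out.
    return " ".join(masked)
```

For example, under these assumptions mask_phi("La paciente Carmen Torres vive en Madrid.") masks both name tokens and the city while leaving the remaining tokens unchanged. A pure dictionary rule misses names outside the lists and cannot resolve ambiguous words, which is one reason a learned model over such features is generally preferred.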

3.3 Supporting Privacy, Trust, and Personalization in Online Learning

Online learning environments require robust privacy measures to protect against data breaches. Experts like Jim Greer emphasize the importance of privacy, trust, and personalization. Three significant privacy theories (limitation theory, control theory, and contextual integrity theory) are essential to addressing these concerns. Greer's team has developed privacy preferences and identity management features to maintain privacy and trust. However, as data technologies advance, privacy becomes increasingly difficult to maintain, and service providers must uphold strict privacy standards to ensure the integrity of online learning environments (Anwar, 2021).

3.4 Personally Identifiable Information (PII) Detection in the Unstructured Large Text Corpus Using Natural Language Processing and Unsupervised Learning Technique

Recognizing the importance of safeguarding PII, numerous studies have proposed approaches for implementing robust privacy measures for PII data. Although the literature reports divergent findings on the effectiveness of rule-based approaches versus machine learning models, validation with models such as a clustering-based PII detection model underscores the potential of hybrid deep learning techniques to improve accuracy (Kulkarni & Cauvery, 2021).

3.5 A Systematic Review of Cybersecurity Risks in Higher Education

Higher education faces unique cybersecurity challenges due to academic freedom and collaborative research environments. There is a lack of empirical research on security practices within academia. Higher educational institutions

