AAI_2025_Capstone_Chronicles_Combined

MENTAL HEALTH RISK DETECTION USING ML​


Each TabNet block begins with a feature transformer, which consists of a Dense layer followed by Batch Normalization to stabilize and accelerate training. The result then passes through a Gated Linear Unit (GLU), an activation mechanism that controls the flow of information. GLUs allow the model to selectively amplify or suppress feature signals, which helps the network focus on the most relevant features while reducing noise, improving both generalization and interpretability.
Next comes the attentive transformer, which uses another Dense layer to generate an attention mask that determines which features to emphasize in the current step. The mask is multiplied by a running prior, a memory vector that tracks which features have already been used in earlier steps. Multiply and Lambda layers simulate this attention flow and feature routing. Multiply layers apply the attention mask to the features, dynamically selecting which ones to forward. Lambda layers compute the updated prior by subtracting the mask from 1, which encourages the model to focus on new features and gives it a different perspective at each decision step. This prevents over-reliance on any specific feature.
Finally, once all three decision steps in the Tabular Neural Network are completed, their outputs are combined using an Add layer. This aggregation fuses the learned insights from each stage into a unified feature representation. The combined result is then passed through a final Dense layer, which produces the probability scores for each mental health risk class.
By allowing each step to contribute independently to the final decision, the model benefits from multiple perspectives, enhancing overall robustness and generalization.
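The data flow described above can be illustrated with a minimal NumPy sketch. It is not the project's actual implementation. The layer sizes, the random weights, and the helper names (`init_params`, `forward`) are hypothetical. The sketch also assumes a softmax activation on the attention mask (the original TabNet design uses sparsemax), a cumulative prior update of the form prior × (1 − mask), and a wiring in which each step's masked features feed the next step.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def batch_norm(h, eps=1e-5):
    # Normalize each column with batch statistics (no learned scale/shift here).
    return (h - h.mean(axis=0)) / np.sqrt(h.var(axis=0) + eps)

def glu(h):
    # Gated Linear Unit: one half of the units gates the other half.
    values, gates = np.split(h, 2, axis=-1)
    return values * sigmoid(gates)

def init_params(n_features, hidden, n_classes, n_steps, rng):
    # Hypothetical random weights; a trained model would learn these.
    steps = [
        {
            "w_ft": rng.normal(0, 0.1, (n_features, 2 * hidden)),
            "b_ft": np.zeros(2 * hidden),
            "w_at": rng.normal(0, 0.1, (hidden, n_features)),
            "b_at": np.zeros(n_features),
        }
        for _ in range(n_steps)
    ]
    head = {"w": rng.normal(0, 0.1, (hidden, n_classes)), "b": np.zeros(n_classes)}
    return steps, head

def forward(x, steps, head):
    prior = np.ones_like(x)  # no feature has been used yet
    step_input = x
    step_outputs, masks = [], []
    for p in steps:
        # Feature transformer: Dense -> Batch Normalization -> GLU
        h = glu(batch_norm(step_input @ p["w_ft"] + p["b_ft"]))
        # Attentive transformer: Dense -> softmax (assumed), scaled by the prior
        mask = softmax(h @ p["w_at"] + p["b_at"]) * prior
        # Lambda-style update: down-weight features this step already used
        prior = prior * (1.0 - mask)
        # Multiply: route the masked features to the next step
        step_input = x * mask
        step_outputs.append(h)
        masks.append(mask)
    combined = np.sum(step_outputs, axis=0)            # Add layer
    probs = softmax(combined @ head["w"] + head["b"])  # final Dense + softmax
    return probs, masks

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 6))  # 8 hypothetical samples, 6 features
steps, head = init_params(n_features=6, hidden=4, n_classes=3, n_steps=3, rng=rng)
probs, masks = forward(x, steps, head)
print(probs.shape)
```

Each row of `probs` sums to 1 and holds one score per risk class, and the returned `masks` can be inspected to see which features each decision step emphasized.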

