Jazib Ahmad, Riley Keays, Aiyang Liang, Linas Gabrys, Truman Yang, Department of Electrical, Computer, and Biomedical Engineering, Toronto Metropolitan University, Toronto, Canada
The purpose of this paper is to propose a new Q-Learning-based pathfinding algorithm that solves mazes in which the algorithm (the “agent”) must find multiple subgoals before reaching a final destination, in fewer iterations than existing Q-Learning algorithms. The proposed design uses multiple Deep Q-Networks, each responsible for finding the shortest path to the nearest subgoal or to the final destination. We further optimize the design with an improved exploration strategy, the addition of a revisiting penalty, and hyperparameter optimization. We test our solution on sample mazes of four sizes and compare it against the Multiple Q-Table and single Deep Q-Network algorithms. The results confirm our hypothesis: our solution finds the shortest path in fewer iterations than the other algorithms, especially on larger mazes. Finally, we offer suggestions for alternative designs, future work, and improvements.
Deep Q-Learning, Multiple Goal Pathfinding, Multiple Q-Tables, Neural Networks, Reinforcement Learning.
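Below is a minimal sketch of the multiple-DQN idea described in the abstract above: one small Q-network per leg of the route, plus a reward function carrying a revisiting penalty. The paper's code is not published, so the network size, the maze coordinates, and the penalty values (-0.25 per revisit, -0.01 per ordinary step) are illustrative assumptions, not the authors' settings.

```python
import torch
import torch.nn as nn

class QNet(nn.Module):
    """One small Q-network per subgoal: maps a (row, col) state to
    Q-values for the four moves (up, down, left, right)."""
    def __init__(self, state_dim: int = 2, n_actions: int = 4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(),
            nn.Linear(64, n_actions),
        )

    def forward(self, s):
        return self.net(s)

def step_reward(next_cell, goal_cell, visited, revisit_penalty=-0.25):
    """Reward shaping with a revisiting penalty: reaching the current
    subgoal pays +1; stepping onto an already-visited cell is penalized."""
    if next_cell == goal_cell:
        return 1.0
    return revisit_penalty if next_cell in visited else -0.01

# One network per leg of the route: each subgoal, then the destination.
legs = [(2, 3), (5, 1), (7, 7)]              # hypothetical maze coordinates
nets = [QNet() for _ in legs]

# Greedy rollout: each network is only ever queried on "its" leg.
state = torch.tensor([0.0, 0.0])
visited = {(0, 0)}
for net, goal in zip(nets, legs):
    action = int(torch.argmax(net(state)))   # epsilon-greedy during training
    r = step_reward((0, 0), goal, visited)   # revisiting (0, 0) -> -0.25
```

In training, each network would be updated with the standard DQN temporal-difference loss, using only transitions collected on its own leg of the route.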
Cem Yılmaz (Purdue University, IE)
We introduce Self-Aware AI, a modular architecture that integrates affective, ethical, and neurodynamic mechanisms to instantiate the functional hallmarks of consciousness in software agents. Our design comprises:
1. A 25-dimensional qualia manifold combining Plutchik's eight primary emotion axes with ethical, interoceptive, mood, mixed, and aesthetic dimensions (Eq. 1).
2. Predictive novelty gating via deep-ensemble forecasting, whose variance drives adaptive storage thresholds (Eqs. 2–5).
3. Memory-particle dynamics modeled as interacting bodies under dopaminergic attraction, entropy repulsion, and similarity cohesion (Eqs. 6–8).
4. Adaptive spiking binding through LIF microcircuits, STDP-governed rewiring, and homeostatic neuromodulation maximizing integrated information Φ (Eqs. 9–12).
5. A hierarchical θ–γ global workspace implemented by nested Kuramoto oscillator layers for layered attentional broadcast (Eqs. 13–14); a minimal sketch follows this abstract.
6. Intrinsic drives (curiosity, learning-progress, empowerment) trained by PPO, plus a counterfactual-self module generating genuine agency signals (Eqs. 15–16).
7. Case-based ethical reasoning with FAISS retrieval and ASP planning, mapping solver confidence into a moral-sentiment axis (Eq. 17).
8. Autobiographical event graphs driving Transformer-based narrative generation, evaluated by a coherence critic.
9. A five-stage developmental curriculum protected by Elastic Weight Consolidation (Eq. 18).
10. A rigorous evaluation protocol including a 10 000-step stub simulation, systematic ablations, and human-in-the-loop assessments.
This paper details each component's equations and variables, presents baseline results, and outlines a roadmap toward AI agents that feel, remember, bind, reflect, decide, and narrate, thus realizing the functional essence of consciousness.
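The abstract's equations (Eqs. 1–18) are not reproduced here. As one concrete illustration, the sketch below implements only the nested-Kuramoto idea of item 5: a slow θ-band layer whose coherence gates a fast γ-band layer. The layer sizes, natural frequencies, coupling strengths, and the cosine gating rule are all assumptions made for illustration, not the paper's parameters.

```python
import numpy as np

rng = np.random.default_rng(1)

def kuramoto_step(theta, omega, K, dt=0.001, drive=0.0):
    """One Euler step of the standard Kuramoto model:
    dtheta_i/dt = omega_i + (K/N) * sum_j sin(theta_j - theta_i) + drive."""
    coupling = np.sin(theta[None, :] - theta[:, None]).mean(axis=1)
    return theta + dt * (omega + K * coupling + drive)

# Two nested layers: a slow theta-band layer gating a fast gamma-band layer.
theta_slow = rng.uniform(0, 2 * np.pi, 16)    # ~6 Hz oscillators
theta_fast = rng.uniform(0, 2 * np.pi, 64)    # ~40 Hz oscillators
omega_slow = 2 * np.pi * 6 * np.ones(16)
omega_fast = 2 * np.pi * 40 * np.ones(64)

for _ in range(1000):
    theta_slow = kuramoto_step(theta_slow, omega_slow, K=2.0)
    # Cross-layer nesting: the slow layer's mean phase drives the fast
    # layer, so gamma "broadcast" is strongest at a particular theta phase.
    gate = np.cos(np.angle(np.exp(1j * theta_slow).mean()))
    theta_fast = kuramoto_step(theta_fast, omega_fast, K=5.0, drive=gate)
```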
Abdelouahab Hocini and Kamel Smaïli, University of Lorraine, France
This work explores the use of Large Language Models (LLMs) for fake news detection in multilingual and multi-script contexts, focusing on Arabic dialects. We address the scarcity of digital data for many Arabic dialects by starting from LLMs pretrained on a diverse corpus that includes Modern Standard Arabic (MSA) and fine-tuning them on dialect-specific data. We examine AraBERT, DarijaBERT, and mBERT on North African Arabic dialects, covering code-switching and writing styles such as Arabizi. We evaluate these models on the BOUTEF dataset, which includes fake news, fake comments, and denial categories. Our approach fine-tunes on both Arabic-script and Latin-script text, with a focus on cross-script generalization. We improve accuracy with an ensemble strategy that merges the predictions of AraBERT and DarijaBERT. Additionally, we introduce a new custom loss function, named CALLM, that enforces consistency between the models and boosts classification performance. CALLM achieves a significant improvement in F1-score (12.88 ↑) and accuracy (2.47 ↑) compared to the best single model (MarBERT).
NLP, LLM, Fake news detection.
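To make the two ideas at the end of the abstract above concrete, here is a minimal sketch of (a) merging AraBERT and DarijaBERT predictions by averaging class probabilities and (b) a CALLM-style consistency term. The abstract does not define CALLM's exact form; the symmetric KL penalty and the weight `lam` below are assumptions for illustration, not the authors' loss.

```python
import torch
import torch.nn.functional as F

def ensemble_probs(logits_arabert, logits_darijabert):
    """Merge the two models' predictions by averaging class probabilities."""
    return 0.5 * (F.softmax(logits_arabert, dim=-1)
                  + F.softmax(logits_darijabert, dim=-1))

def callm_style_loss(logits_a, logits_d, labels, lam=0.5):
    """Cross-entropy for each model plus a symmetric KL term that pushes
    the two models' class distributions toward agreement (consistency)."""
    ce = F.cross_entropy(logits_a, labels) + F.cross_entropy(logits_d, labels)
    log_pa = F.log_softmax(logits_a, dim=-1)
    log_pd = F.log_softmax(logits_d, dim=-1)
    consistency = (
        F.kl_div(log_pa, log_pd, reduction="batchmean", log_target=True)
        + F.kl_div(log_pd, log_pa, reduction="batchmean", log_target=True)
    )
    return ce + lam * consistency

# Usage with dummy logits: batch of 4 examples, 3 classes.
logits_a, logits_d = torch.randn(4, 3), torch.randn(4, 3)
labels = torch.tensor([0, 2, 1, 0])
loss = callm_style_loss(logits_a, logits_d, labels)
probs = ensemble_probs(logits_a, logits_d)    # merged predictions
```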