Optimization Strategies and Architectures for Neural Networks: Dataset Pruning, Architecture Search, and Diffusion Models
Abstract
The increasing complexity of neural network applications, particularly in fields such as image super-resolution (SR) and optical character recognition (OCR), has spurred the need for more efficient optimization strategies and innovative neural architectures. This paper explores recent advances in dataset pruning, neural architecture search (NAS), and latent dataset distillation using diffusion models. We discuss how these techniques enhance the training efficiency of deep learning models while maintaining or improving performance across tasks. Dataset pruning, which reduces the size of training datasets without sacrificing accuracy, is shown to be an effective way to lower computational costs. NAS, often accelerated with proxy datasets, automates the discovery of well-performing architectures and reduces the resources needed to explore the vast space of candidate models. Additionally, the paper examines latent dataset distillation, in which diffusion models are employed to create condensed representations of datasets, significantly speeding up training. The implications of these techniques for the performance of architectures such as the convolutional U-Net and the recurrent U-ReNet are evaluated, showcasing their impact on both OCR and SR tasks. This paper synthesizes research in these areas and outlines future directions for advancing neural network optimization and architecture development.
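To make the dataset pruning idea summarized above concrete, the following minimal sketch ranks training examples by a per-example score and keeps only a fixed fraction of the highest-scoring ones. The scoring scheme (a stand-in for measures such as EL2N or forgetting counts), the `prune_dataset` helper, and the `keep_fraction` parameter are illustrative assumptions, not the specific method evaluated in this paper.

```python
# Illustrative sketch of score-based dataset pruning (assumptions noted in
# the text above; this is not the paper's exact procedure).
import numpy as np

def prune_dataset(features, labels, scores, keep_fraction=0.5):
    """Keep the top `keep_fraction` of examples ranked by `scores`."""
    n_keep = int(len(scores) * keep_fraction)
    keep_idx = np.argsort(scores)[::-1][:n_keep]  # highest scores first
    return features[keep_idx], labels[keep_idx]

# Hypothetical usage with random data standing in for a real training set.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 32))          # placeholder features
y = rng.integers(0, 10, size=1000)       # placeholder labels
scores = rng.random(1000)                # stand-in for EL2N/forgetting scores
X_small, y_small = prune_dataset(X, y, scores, keep_fraction=0.3)
print(X_small.shape)                     # (300, 32)
```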