Kishore Ande¹ & Dr. Shakeb Khan²
¹CVS Health, 1 CVS Drive, Woonsocket, RI, 02895, United States.
²Maharaja Agrasen Himalayan Garhwal University, Uttarakhand, India.
Abstract
Applying generative Artificial Intelligence (AI) to Extract, Transform, and Load (ETL) operations and Business Intelligence (BI) activities has the potential to transform data warehousing by improving automation, scalability, and data consistency. Traditional ETL and BI systems depend largely on manual effort, which often leads to inefficiency, errors, and latency, especially as data volumes and complexity grow. Although earlier studies have addressed individual aspects of AI in data warehousing, little work has examined in depth how generative AI can automate ETL processes end to end and deliver more advanced, personalized BI insights. This research addresses that gap by examining how generative AI techniques, including machine learning algorithms, deep learning, and natural language processing, can be used to build ETL pipelines and automate business intelligence. Specifically, it examines generative AI for automating data extraction, transformation, cleaning, and anomaly detection while improving the quality and uniformity of the data loaded into the data warehouse. It also examines generative AI for producing dynamic, personalized BI reports based on real-time data and user experience. The research aims to demonstrate how generative AI can streamline data warehousing operations, reduce operational costs, and support decision-making through timely, actionable insights. By addressing pressing concerns such as data privacy, governance, and scalability, it presents a new model for incorporating AI-based methods into modern data warehousing frameworks. The research is expected to contribute to more intelligent, automated, and responsive data environments in sectors that process big data.
Keywords
Generative AI, ETL processes, Business Intelligence, data warehousing, automation, data quality, machine learning, anomaly detection, personalized insights, data governance, scalability, data extraction, data transformation, natural language processing, real-time analytics.
References
- Dinesh, L., & Devi, K. G. (2024). An efficient hybrid optimization of ETL process in data warehouse of cloud architecture. Journal of Cloud Computing, 13, Article 12
- Guessoum, M. A., Djiroun, R., Boukhalfa, K., & Benkhelifa, E. (2022). Natural language why-question in Business Intelligence applications: model and recommendation approach. Cluster Computing, 25(6), 3875–3898
- Kumar, G. S. S., & Kumar, M. R. (2024). AutoETL: A nonlinear deep learning framework for ETL automation. Communications on Applied Nonlinear Analysis, 32(3S), 1–13
- Mondal, K. C., Biswas, N., & Saha, S. (2020). Role of machine learning in ETL automation. In Proceedings of the 21st International Conference on Distributed Computing and Networking (ICDCN 2020) (pp. 1–6). ACM
- Brath, R., & Hagerman, C. (2021). Automated insights on visualizations with natural language generation. In 2021 25th International Conference on Information Visualization (IV) (pp. 278–284). IEEE
- Uddin, M. K. S., & Hossan, K. M. R. (2024). A review of implementing AI-powered data warehouse solutions to optimize big data management and utilization. Academic Journal on Business Administration, Innovation & Sustainability, 4(3), 1–13