Aashish Panta

Ph.D. Candidate | AI/ML & Large Data Specialist | Petascale Data Visualization
Salt Lake City, UT, US

About

Ph.D. Candidate in Computer Science with extensive experience developing AI/ML frameworks and high-performance data visualization solutions for petascale scientific datasets. Specializes in applying cloud platforms, Retrieval-Augmented Generation (RAG), and Large Language Models to research in climate science and geoscience, in collaboration with national labs and federal agencies.

Work

VISOAR LLC

Software Engineer

Salt Lake City, Utah, US

Summary

Improved database performance and scalability by migrating backend systems, and integrated visualization tools for large-scale data analysis.

Highlights

Migrated backend database from SQLite to MongoDB, significantly improving database performance and scalability for critical operations.

Implemented and integrated MongoDB Atlas cloud service for robust database hosting and management, refactoring backend API and ORM logic for seamless system compatibility.

Integrated Jupyter notebooks and visualization dashboards using Panel and Bokeh into the web application, enabling large-scale data analysis and insights.

Scientific Computing and Imaging Institute

Graduate Research Assistant

The University of Utah, UT, US

Summary

Conducts research with Professor Valerio Pascucci on developing cyberinfrastructure for efficient visualization and analysis of petabyte-scale datasets.

Highlights

Developed cyberinfrastructure for efficient visualization of large-scale datasets (terabytes to petabytes) in collaboration with Professor Valerio Pascucci's team at SCI.

Designed a novel data model incorporating efficient progressive streaming, compression, and automated data reduction frameworks, in collaboration with national labs and federal agencies.

NSF NCAR

Visiting Scholar

Boulder, CO, US

Summary

Spearheaded the development of advanced web-based dashboards and containerized workflows for petabyte-scale climate data analysis, enhancing accessibility and reproducibility for domain scientists.

Highlights

Developed interactive web-based dashboards for multi-terabyte to petabyte-scale climate datasets, eliminating lossy resampling and enabling real-time analysis in NSF NCAR's Research Data Archive.

Engineered scalable containerized dashboards and Python notebook workflows on the CISL CIRRUS system, facilitating exploration of CESM2-LENS and ERA5 data in NetCDF and Zarr formats.

Optimized ingestion pipelines for large geoscience datasets in collaboration with NCAR researchers, ensuring efficient deployment, reproducibility, and usability for domain scientists.

NASA Jet Propulsion Lab

Machine Learning Intern

Pasadena, CA, US

Summary

Designed and deployed a unified AI framework for Earth Science Intelligence, integrating RAG and LLMs to enable interactive, high-accuracy analysis of massive climate datasets.

Highlights

Developed a unified AI framework for Earth Science Intelligence, integrating Retrieval-Augmented Generation (RAG) with Large Language Models for interactive, high-accuracy analysis of massive climate datasets.

Built modular tools, including a climate data assistant, a multi-model AI comparison interface, and a multilingual voice-enabled dashboard, to support diverse research workflows.

Deployed the AI framework on Microsoft Azure, utilizing Blob Storage, AI Search, and Foundry services, with OpenAI's Whisper for multilingual speech-to-text, ensuring scalability and reproducibility.

NASA Jet Propulsion Lab

Software Engineer Intern

Pasadena, CA, US

Summary

Optimized Mars rover data processing workflows and enhanced system functionalities for in-depth statistical analysis of surface parameters.

Highlights

Collaborated with the M2020 Perseverance Robotic Arm team to identify gaps in data trending and resolve previously unnoticed bugs, and added functionality for in-depth statistics on Mars rover surface parameters.

Analyzed and plotted the Mars 2020 rover's trending data over time to detect proximity to fault limits, supporting preventative measures against system failures.

Optimized the Mars rover backprocessing workflow, achieving a substantial reduction in processing time and improving operational efficiency.

Education

The University of Utah
Salt Lake City, UT, United States of America

Ph.D.

Computer Science

The University of Mississippi
University, MS, United States of America

BS

Computer Science

GPA: 3.84

Awards

CCGRID International Scalable Computing Challenge (SCALE) Finalist

Awarded By

CCGRID

Recognized as a finalist in the international challenge for scalable computing.

Best Paper Award, IEEE Large Scale Data Analysis and Visualization (LDAV) symposium

Awarded By

IEEE LDAV

Received a best paper award at a leading symposium for contributions to large-scale data analysis and visualization.

Publications

Expanding Access to Science Participation: A FAIR Framework for Petascale Data Visualization and Analytics

Published by

IEEE Transactions on Visualization and Computer Graphics

Summary

A. Panta, A. Sahistan, X. Huang, A. A. Gooch, G. Scorzelli, H. Torres, P. Klein, G. A. Ovando-Montejo, P. Lindstrom, and V. Pascucci, 'Expanding access to science participation: A FAIR framework for petascale data visualization and analytics,' IEEE Transactions on Visualization and Computer Graphics, pp. 1–16, 2025.

A Voice-Enabled AI Agent for Interactive Visualization and Analysis of NASA's Downscaled Dataset

Published by

American Geophysical Union (AGU)

Summary

Presented at the American Geophysical Union (AGU) conference.

Large Data Acquisition and Analytics at Synchrotron Radiation Facilities

Published by

IEEE International Conference on Big Data (IEEE Big Data)

Summary

A. Panta, G. Scorzelli, A. Gooch, W. Sun, K. Shanks, S. Sarker, D. Bougie, K. Soloway, R. Verberg, T. Berman, G. Tarcea, J. Allison, M. Taufer, and V. Pascucci, 'Large data acquisition and analytics at synchrotron radiation facilities,' in Proceedings of the IEEE International Conference on Big Data (IEEE Big Data), Macau, China, 2025.

From Validation to Societal Value: A User-Centric Framework for Evaluating Regional Climate Models

Published by

American Geophysical Union (AGU)

Summary

Presented at the American Geophysical Union (AGU) conference.

Climate Data for Power Systems Applications: Lessons in Reusing Wildfire Smoke Data for Solar PV Studies

Published by

Annual Hawaii International Conference on System Sciences

Summary

A. Salinas, I. Sohail, V. Pascucci, P. Stefanakis, S. Amjad, A. Panta, R. Schigas, T. C. Y. Chui, N. Duboc, M. Farrokhabadi, and R. Stull, 'Climate Data for Power Systems Applications: Lessons in Reusing Wildfire Smoke Data for Solar PV Studies,' in 2026 Proceedings of the Annual Hawaii International Conference on System Sciences, 2026.

Scalable Web-Based Exploration and RAG-enhanced Insights for NASA's Downscaled Climate Data

Published by

Cloud-Native Geospatial (CNG) Conference

Summary

Presented at the Cloud-Native Geospatial (CNG) Conference.

Scalable Climate Data Analysis: Balancing Petascale Fidelity and Computational Cost

Published by

IEEE 25th International Symposium on Cluster, Cloud and Internet Computing Workshops (CCGridW)

Summary

A. Panta, A. Gooch, G. Scorzelli, M. Taufer, and V. Pascucci, 'Scalable climate data analysis: Balancing petascale fidelity and computational cost,' in 2025 IEEE 25th International Symposium on Cluster, Cloud and Internet Computing Workshops (CCGridW), 2025, pp. 245–248.

Leveraging National Science Data Fabric Services to Train Data Scientists

Published by

SC24-W: Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis

Summary

M. Taufer, H. Martinez, A. Panta, P. Olaya, J. Marquez, A. Gooch, G. Scorzelli, and V. Pascucci, 'Leveraging national science data fabric services to train data scientists,' in SC24-W: Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis, 2024, pp. 355–362.

Web-based Visualization and Analytics of Petascale Data: Equity as a Tide that Lifts All Boats

Published by

IEEE 14th Symposium on Large Data Analysis and Visualization (LDAV)

Summary

A. Panta, X. Huang, N. McCurdy, D. Ellsworth, A. A. Gooch, G. Scorzelli, H. Torres, P. Klein, G. A. Ovando-Montejo, and V. Pascucci, 'Web-based visualization and analytics of petascale data: Equity as a tide that lifts all boats,' in 2024 IEEE 14th Symposium on Large Data Analysis and Visualization (LDAV), 2024, pp. 1–11.

Enhancing Scientific Research with FAIR Digital Objects in the National Science Data Fabric

Published by

Computing in Science & Engineering

Summary

M. Taufer, H. Martinez, J. Luettgau, L. Whitnah, G. Scorzelli, P. Newell, A. Panta, P.-T. Bremer, D. Fils, C. R. Kirkpatrick, and V. Pascucci, 'Enhancing scientific research with FAIR digital objects in the national science data fabric,' Computing in Science & Engineering, vol. 25, no. 5, pp. 39–47, 2023.

Skills

Web Development

Express, Node.js, MongoDB, MySQL, React, HTML5, CSS, JavaScript.

Data Visualization

ParaView, 3D Slicer, Panel, Bokeh, Jupyter Notebooks.

Programming Languages

Python, PHP, SQL.

Scientific Computing

Petascale Data, Geoscience Datasets, Climate Modeling, Cyberinfrastructure, Data Reduction Frameworks, High-Performance Computing.

Machine Learning & AI

Retrieval-Augmented Generation (RAG), Large Language Models (LLMs), Deep Neural Networks, AI Agent Development, Climate Data Assistant, Multi-model AI Comparison.

Cloud Platforms & Services

Azure Services, MongoDB Atlas, Blob Storage, AI Search, Foundry services, OpenAI Whisper.

Methodologies & Tools

Agile, Git, Containerization, ORMs.

Projects

OpenVisus: Large-scale Scientific Visualization Tool

Summary

Developed and optimized Python APIs and data access plugins for OpenVisus to enable high-performance streaming, multiresolution visualization, and analysis of large-scale datasets across research domains including synchrotron facilities, materials science, and data commons. Collaborated with scientists at NASA Jet Propulsion Laboratory (JPL) to develop scalable cyberinfrastructure supporting data streaming, subsetting, and real-time visualization of climate datasets within the AIST OCW framework.

Dynamic Super-Resolution for Large Multi-variate Climate Dataset

Summary

Designed and implemented a deep neural network architecture integrating convolutional layers, upsampling, and multi-quality inputs to achieve dynamic super-resolution and efficient analysis of large multivariate datasets, as part of research at the University of Utah.

NASA ARSET Program: Assessing Extreme Weather Statistics using NEX-GDDP-CMIP6

Summary

Co-led a NASA ARSET program tutorial on assessing extreme weather statistics using NASA Earth eXchange Global Daily Downscaled Projections (NEX-GDDP-CMIP6), engaging 700 participants from 93 countries.

A Unified and Interactive Framework for Data Intelligence

Summary

Developed an interactive data intelligence framework integrating Azure AI Search, Azure OpenAI, and cloud-hosted datasets, enabling natural language querying, dynamic subsetting, and on-demand analytics directly from the deployed interface, in collaboration with NASA JPL and Microsoft.

Tutorial: Strategies for Large-Scale Data Analysis with the National Science Data Fabric (NSDF)

Summary

Co-led a tutorial on strategies for large-scale data analysis with the National Science Data Fabric (NSDF) at IEEE IPDPS 2025.

Invited Presentation: OpenVisus for Petascale Scientific Visualization (NCAR Earth System Data Science)

Summary

Presented 'OpenVisus for Petascale Scientific Visualization' remotely at the NCAR Earth System Data Science (ESDS) Initiative.

Invited Presentation: OpenVisus for Petascale Scientific Data (NASA Earth Exchange)

Summary

Presented 'OpenVisus for Visualization of Petascale Scientific Data' remotely at the NASA Earth Exchange (NEX) Biweekly Meeting.

Tutorial: Enabling Scientific Discovery with National Science Data Fabric

Summary

Co-led a tutorial at IEEE VIS 2024 on using the National Science Data Fabric for large-scale data analysis to enable scientific discovery.

Tutorial: Using NSDF Services for End-to-End Analysis and Visualization of Large Scientific Data

Summary

Co-led a tutorial in the 2024 NSDF Webinar Series on using NSDF services for end-to-end analysis and visualization of large scientific data.