Generative AI and data analytics: From questions to queries
Oct 31, 2025

A global online technology leader turned to ALTEN’s CIeNET to create a reliable framework for testing and enhancing LLM capabilities and for ensuring precise, efficient natural language-to-SQL query generation for complex datasets. The result: significant improvements in the accuracy of the queries generated from natural language inputs.

Part of the ALTEN Group, CIeNET is a premier software services provider. One of the world’s leading online technology companies approached CIeNET to help large language models (LLMs) translate natural language queries into structured query language (SQL) more precisely. To overcome the limitations of existing LLMs, CIeNET benchmarked their performance, identified errors, and refined the output using custom datasets.

01 Challenge

To improve the ability of LLMs to generate SQL queries that correctly answer natural language queries for a specified dataset

02 Solution

LLMs that accurately translate natural language queries into SQL (natural language to SQL or NL2SQL)

03 Benefits

Custom datasets for training and refining LLMs

Accurate SQL to answer natural language queries

Enhanced business reputation

Increased efficiency

The problems with data

Generating inaccurate SQL queries from natural language inputs can lead to serious issues, including delivering the wrong information to clients or stakeholders. Erroneous data can also harm critical decision-making processes and can even lead to financial losses. Inconsistent or incorrect data can compromise the reliability of a database or lead to the unintentional disclosure of sensitive or confidential information. Furthermore, malformed SQL can cause crashes in database systems or result in non-compliance with regulatory requirements and, depending on the nature of the data involved, lead to legal complications.

Natural language to SQL

CIeNET set out to analyze the performance and accuracy of LLMs and third-party services in generating SQL queries from natural language questions for a given dataset. The process began with designing and implementing an automated benchmarking system to assess the accuracy, performance, and quality of the SQL that the LLMs and third-party services generated in response to the questions. Test cases were created to evaluate the efficacy and quality of the generated SQL, with results compared over time and against other LLMs and authored datasets. Model training and refinement were then carried out, which involved creating and reviewing natural language-to-SQL pairs in order to identify and correct errors in the datasets. Finally, custom database schemas were created and populated with data for use in the creation of new natural language-to-SQL pairs.
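To make the benchmarking step more concrete, the following is a minimal sketch of one common way to score generated SQL: run both the generated query and a reference query against the same database and compare the rows returned. It is not CIeNET's benchmarking system. The function names, the toy `orders` schema, the test cases, and the use of an in-memory SQLite database (in place of the data warehouses used in the project) are all assumptions made for illustration.

```python
import sqlite3
from collections import Counter


def run_query(conn, sql):
    """Return the result rows as a multiset, or None if the SQL fails to execute."""
    try:
        return Counter(conn.execute(sql).fetchall())
    except sqlite3.Error:
        return None


def execution_accuracy(conn, cases):
    """Fraction of cases whose generated SQL returns the same rows as the reference SQL."""
    correct = 0
    for case in cases:
        expected = run_query(conn, case["reference_sql"])
        actual = run_query(conn, case["generated_sql"])
        if expected is not None and actual == expected:
            correct += 1
    return correct / len(cases) if cases else 0.0


# Hypothetical custom schema populated with a few rows of sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL);
    INSERT INTO orders VALUES (1, 'acme', 120.0), (2, 'acme', 80.0), (3, 'globex', 50.0);
""")

# Hypothetical natural language-to-SQL pairs with a model's generated SQL attached.
cases = [
    {   # Different text, same rows: counted as correct.
        "question": "What is the total order amount per customer?",
        "reference_sql": "SELECT customer, SUM(amount) FROM orders GROUP BY customer",
        "generated_sql": "SELECT customer, SUM(amount) AS total FROM orders "
                         "GROUP BY customer ORDER BY customer",
    },
    {   # Valid SQL, wrong aggregate: counted as incorrect.
        "question": "How many orders did acme place?",
        "reference_sql": "SELECT COUNT(*) FROM orders WHERE customer = 'acme'",
        "generated_sql": "SELECT SUM(amount) FROM orders WHERE customer = 'acme'",
    },
    {   # Malformed SQL: the execution error is caught and counted as incorrect.
        "question": "List all customers.",
        "reference_sql": "SELECT DISTINCT customer FROM orders",
        "generated_sql": "SELEC customer FROM orders",
    },
]

score = execution_accuracy(conn, cases)
print(f"Execution accuracy: {score:.2f}")
```

Comparing result rows rather than SQL text lets two differently written but equivalent queries both count as correct. The multiset comparison in this sketch ignores row order, so questions that require a specific ordering would need an ordered comparison instead.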

The tools

The LLMs reviewed included Google Gemini, OpenAI ChatGPT, and Anthropic Claude 3. The database and data warehouse tools were Google BigQuery, Amazon Redshift, Databricks, Snowflake, MySQL, and PostgreSQL. CIeNET developed a custom system for benchmark execution, which it named Generative AI beNchmark System (GAINS). The CIeNET team directed the prompt engineering, with a special focus on improved performance, and analyzed the benchmark results to identify issues, referencing various public datasets. Finally, the team authored, reviewed, and corrected the NL-to-SQL datasets used to train and/or refine the LLMs, ensuring that the training was accurate and effective.