Day-to-day analytics: These are questions of low question complexity and low schema complexity. Answering them requires querying three columns from one table of data.
Example: "Return all the claims we have by claim number, open date, and close date."
Operational analytics: These are questions of high question complexity and low schema complexity. Answering them requires aggregation and querying four tables of data.
Example: "What is the average time to settle a claim policy?"
Metrics & KPIs: These are questions of low question complexity and high schema complexity. Answering them requires querying three columns across six tables of data.
Example: "What are the loss payment, loss reserve, expense reserve amount by claim number?"
Strategic planning: These are questions of high question complexity and high schema complexity. Answering them requires aggregation, math, and querying nine tables of data.
Example: "What is the total loss of each policy where loss is the sum of loss payment, loss reserve, expense payment, and expense reserve amount?"
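To make the two axes concrete, here is a minimal sketch using Python's built-in sqlite3 module. The tables, columns, and rows (claim, loss_payment, expense_payment) are hypothetical stand-ins, not the benchmark's actual schema. The sketch only contrasts a plain projection on one table with queries that add aggregation and joins.

```python
import sqlite3

# Hypothetical miniature schema; the benchmark's real tables are not shown here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE claim (claim_number TEXT PRIMARY KEY, policy_id TEXT,
                    open_date TEXT, close_date TEXT);
CREATE TABLE loss_payment (claim_number TEXT, amount REAL);
CREATE TABLE expense_payment (claim_number TEXT, amount REAL);
INSERT INTO claim VALUES ('C1', 'P1', '2023-01-01', '2023-01-11'),
                         ('C2', 'P1', '2023-02-01', '2023-02-21'),
                         ('C3', 'P2', '2023-03-01', '2023-03-06');
INSERT INTO loss_payment VALUES ('C1', 100.0), ('C2', 250.0), ('C3', 40.0);
INSERT INTO expense_payment VALUES ('C1', 10.0), ('C3', 5.0);
""")

# Low question / low schema complexity: three columns from one table.
day_to_day = conn.execute(
    "SELECT claim_number, open_date, close_date FROM claim ORDER BY claim_number"
).fetchall()

# Higher question complexity: aggregation plus date arithmetic.
avg_days_to_close = conn.execute(
    "SELECT AVG(julianday(close_date) - julianday(open_date)) FROM claim"
).fetchone()[0]

# Higher schema complexity: joins across tables, then a sum per policy.
# Assumes at most one row per claim in each payment table; with more rows,
# the joins would multiply amounts and each table should be pre-aggregated.
total_by_policy = conn.execute("""
    SELECT c.policy_id,
           SUM(COALESCE(lp.amount, 0) + COALESCE(ep.amount, 0)) AS total_loss
    FROM claim c
    LEFT JOIN loss_payment lp ON lp.claim_number = c.claim_number
    LEFT JOIN expense_payment ep ON ep.claim_number = c.claim_number
    GROUP BY c.policy_id
    ORDER BY c.policy_id
""").fetchall()

print(day_to_day)
print(avg_days_to_close)
print(total_by_policy)
```

Even in this toy version, the last query depends on knowing which tables join on which keys. That knowledge is what grows with schema complexity.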
LLMs struggle to answer business questions
When scored on 43 different business questions across the four categories of complexity, the LLM struggled to produce accurate answers. The average accuracy across all questions came in at 16.7%.
Overall accuracy was poor, but the LLM struggled most with questions of high schema complexity. In both high-schema categories (Metrics & KPIs and Strategic planning), the LLM failed to return an accurate answer, scoring 0%.
Questions with high-schema complexity represent the upper end of questions that analysts and executives might ask of their data. Typically, these questions require an expert analyst to answer – someone proficient in SQL with a deep understanding of the organization’s data.
Day-to-day analytics: 25.5% accuracy
Operational analytics: 37.4% accuracy
Metrics & KPIs: 0% accuracy
Strategic planning: 0% accuracy
The Knowledge Graph difference
Knowledge Graphs map data to meaning, capturing both semantics and context. Rigid relational data moves into a flexible graph structure, enabling a richer understanding of the connections between data, people, processes, and decisions. The flexible format provides the context that LLMs need to more accurately answer complex questions across both aforementioned vectors of question and schema complexity.
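As a rough illustration of "mapping data to meaning", the sketch below stores a few subject-predicate-object triples in plain Python and uses them to find which tables a business concept touches. Every identifier here (the `ins:` prefix, `ins:against`, `ins:mappedToTable`, the table names) is a hypothetical example, and this is not how the benchmark or any particular product implements its graph.

```python
# Hypothetical triples: (subject, predicate, object). All names are illustrative.
TRIPLES = [
    ("ins:Claim", "rdf:type", "owl:Class"),
    ("ins:Policy", "rdf:type", "owl:Class"),
    ("ins:Claim", "ins:against", "ins:Policy"),       # business relationship
    ("ins:Claim", "ins:mappedToTable", "claim"),      # link to physical schema
    ("ins:Policy", "ins:mappedToTable", "policy"),
    ("ins:claimNumber", "ins:mappedToColumn", "claim.claim_number"),
]


def objects(subject, predicate):
    """All objects o such that (subject, predicate, o) is in the graph."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def related_tables(concept):
    """Tables mapped to a concept and to the concepts it links to directly."""
    tables = set(objects(concept, "ins:mappedToTable"))
    for s, p, o in TRIPLES:
        if s == concept and p not in ("rdf:type", "ins:mappedToTable"):
            tables.update(objects(o, "ins:mappedToTable"))
    return sorted(tables)


print(related_tables("ins:Claim"))
```

A lookup like this is the kind of context that could be handed to an LLM alongside a question, so that it does not have to guess how business terms relate to tables.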
The benchmark showed an average 3x improvement in response accuracy and marked improvement in each category – even the high-schema complexity questions that stumped the LLM alone.
The improvement added by the Knowledge Graph can be seen here:
Day-to-day analytics: From 25.5% accuracy → 71% accuracy with Knowledge Graph
Operational analytics: From 37.4% accuracy → 66.9% accuracy with Knowledge Graph, an improvement of nearly 2x
Metrics & KPIs: From 0% accuracy → 35.7% accuracy with Knowledge Graph
Strategic planning: From 0% accuracy → 38.7% accuracy with Knowledge Graph
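The per-category figures above can be turned into percentage-point gains and improvement ratios with a few lines of Python. This is only a worked-arithmetic sketch over the numbers reported in this article. The helper name `improvement` is my own, and a ratio is left undefined where the baseline is 0%.

```python
# Accuracy (%) as reported above: (LLM alone, LLM with Knowledge Graph).
scores = {
    "Day-to-day analytics": (25.5, 71.0),
    "Operational analytics": (37.4, 66.9),
    "Metrics & KPIs": (0.0, 35.7),
    "Strategic planning": (0.0, 38.7),
}


def improvement(before, after):
    """Return (percentage-point gain, ratio); ratio is None for a 0% baseline."""
    gain = round(after - before, 1)
    ratio = round(after / before, 2) if before > 0 else None
    return gain, ratio


for name, (before, after) in scores.items():
    gain, ratio = improvement(before, after)
    ratio_text = f"{ratio}x" if ratio is not None else "n/a (baseline 0%)"
    print(f"{name}: +{gain} points, ratio {ratio_text}")
```

Day-to-day analytics improves by about 2.8x and operational analytics by about 1.8x. For the two high-schema categories a ratio is meaningless, so the percentage-point gain is the more honest figure.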
The future of AI-ready data
Knowledge Graphs remove a critical barrier standing in the way of enterprises unlocking new capabilities with AI. This benchmark underscores the significant impact a Knowledge Graph can have on LLM accuracy in enterprise settings.
The implications for businesses are enormous: LLMs become a viable way to make data-driven decision-making accessible to more people (regardless of technical know-how), to shorten time-to-value with data and analytics, and to surface new ways of using data to drive ROI, to name a few.
This is the first benchmark investigating how Knowledge Graph-based approaches can strengthen LLM accuracy and impact in the enterprise. But these are still early days. LLMs will continue to become more accurate, and Knowledge Graph techniques will continue to refine LLM response accuracy. The authors plan to publish additional benchmarks documenting the effects of these improvements in the future.
Even as LLMs and Knowledge Graph techniques improve, no system is correct 100% of the time. The ability to audit responses and trace the path of LLM response generation will be critical to accountability and trust. By leveraging a data catalog built on a Knowledge Graph, like the data.world Data Catalog Platform, enterprises can bring explainability to LLMs, effectively opening up the AI “black box” and enabling the LLMs to “show their work.”
Additionally, a data catalog built on a Knowledge Graph will enable enterprises to govern the data, metadata, queries, and responses being used and generated by the LLM. Proper governance is critical for ensuring that sensitive or proprietary data is protected and that the responses are valid in the context of the organization and its internal rules.
With a well-managed data catalog built on a Knowledge Graph, enterprises can create AI-ready data and build the foundation for success with generative AI – improving accuracy, enabling explainability, and applying governance.
The average score for each category of complexity is listed in the per-category breakdowns above.