Wisdom in Data - Multi-Agent LLM Data Factory

Platform Demonstration

Comprehensive demonstrations showcasing our multi-agent data factory capabilities

Data Import Process

Methodology Overview

This demonstration shows how the Database Information Processing Agent imports CSV and Excel files into the database. The agent determines how many title and header rows a file contains and how they are structured, detects and processes merged headers, and extracts each column's data for cleaning. For each column it then reports a recommended field name, a data type, primary key status, and the reasoning behind each suggestion. The user intervenes only at key steps; the rest of the import runs automatically.

Intelligent Header Recognition: Automatic detection of title/header rows and handling of complex merged headers
Field Analysis: Extraction and cleaning of column data, with smart suggestions for field names, types, and primary keys
Reasoning Transparency: Detailed explanation for each field analysis and decision
Minimal User Intervention: Users only participate in critical steps, with the rest handled automatically
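
The merged-header handling described above can be illustrated with a minimal sketch. It assumes a two-level header in which a horizontally merged cell exports as one value followed by blanks; the function name and the underscore naming scheme are hypothetical and are not taken from the platform.

```python
def flatten_merged_header(rows):
    """Collapse multi-row headers into one name per column.

    Upper header rows are forward-filled left to right, because a merged
    cell usually exports as a value followed by blanks. The last header
    row is left as-is so vertically merged columns keep a single name.
    """
    filled = []
    for i, row in enumerate(rows):
        is_last = i == len(rows) - 1
        last_seen = ""
        out = []
        for cell in row:
            cell = cell.strip()
            if cell:
                last_seen = cell
            out.append(cell if is_last else last_seen)
        filled.append(out)

    names = []
    for column in zip(*filled):
        parts = []
        for part in column:
            # Skip blanks and consecutive repeats of the same label.
            if part and (not parts or parts[-1] != part):
                parts.append(part)
        names.append("_".join(parts).lower().replace(" ", "_"))
    return names


header_rows = [
    ["Region", "Sales", "", "Notes"],
    ["", "Q1", "Q2", ""],
]
flat_names = flatten_merged_header(header_rows)
print(flat_names)  # ['region', 'sales_q1', 'sales_q2', 'notes']
```

Headers with three or more levels would need a stricter fill rule, since forward-filling a middle row can leak a label into a vertically merged neighbor.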

DDL Generation Process

Automated Schema Design

This video demonstrates how the Database Information Processing Agent generates DDL by analyzing table headers and sample values. For each column, the agent automatically infers its content, provides explanations, and incorporates user-specified sample values into the DDL. This enriched DDL information helps the Database Information Retrieval Agent better understand the table structure, enabling more accurate SQL generation for user queries.

Field Content Analysis: Automatic inference and explanation of each column's meaning based on headers and sample values
Sample Value Integration: User-specified examples are included in the DDL for richer context
Enhanced Retrieval: DDL is optimized to support downstream agents in query generation
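
As a rough illustration of DDL enriched with explanations and sample values, the sketch below infers a column type from sample strings and writes the description and examples as SQL comments. The helper names, the table, and the choice of SQLite types are assumptions made for this example; in the platform the inference is performed by the agent, not by fixed rules.

```python
import sqlite3


def infer_sql_type(samples):
    """Return the narrowest of INTEGER, REAL, TEXT that fits every sample."""
    for sql_type, cast in (("INTEGER", int), ("REAL", float)):
        try:
            for value in samples:
                cast(value)
        except ValueError:
            continue
        if samples:
            return sql_type
    return "TEXT"


def build_ddl(table, columns):
    """columns: list of (name, description, sample_values) tuples."""
    lines = []
    for index, (name, description, samples) in enumerate(columns):
        comma = "," if index < len(columns) - 1 else ""
        examples = ", ".join(samples)
        lines.append(
            f"    {name} {infer_sql_type(samples)}{comma}"
            f"  -- {description}; e.g. {examples}"
        )
    return f"CREATE TABLE {table} (\n" + "\n".join(lines) + "\n);"


ddl = build_ddl(
    "sales_reps",  # hypothetical table
    [
        ("rep_id", "Sales representative identifier", ["101", "102"]),
        ("rep_name", "Full name of the representative", ["Ana Li", "Ben Okafor"]),
        ("satisfaction_score", "Customer satisfaction rating", ["4.5", "3.8"]),
    ],
)
print(ddl)

# Sanity check: the commented DDL is still valid SQL for SQLite.
sqlite3.connect(":memory:").execute(ddl)
```

Keeping the explanations in comments leaves the statement executable while still giving a downstream retrieval step the extra context.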

Data Question Answering

Context-Enhanced SQL Generation

"Which sales representative has the highest customer satisfaction scores?"

In this demonstration, the user asks, "Which sales representative has the highest customer satisfaction scores?" The Database Retrieval Agent leverages historical QA data, DDL information, and domain-specific knowledge as retrieval-augmented generation (RAG) context to generate the appropriate SQL query and retrieve the data. The Database Analysis Agent then combines the query results and relevant domain knowledge to provide a detailed analysis and answer to the user's question.

RAG-based SQL Generation: Utilizes historical QA, DDL, and domain knowledge for context
Automated Data Retrieval: Generates and executes SQL to answer user queries
Expert Analysis: Analysis agent synthesizes results and knowledge for comprehensive answers
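
The retrieval flow above can be sketched in two parts: assembling the RAG context into a prompt, and executing the SQL that comes back. No model is called here; the query in `GENERATED_SQL` is hand-written to stand in for the agent's output, and the table `satisfaction_surveys`, its columns, and the prompt layout are all hypothetical.

```python
import sqlite3


def build_sql_prompt(question, ddl, history, domain_notes):
    """Combine DDL, historical QA pairs, and domain notes into one prompt."""
    examples = "\n".join(f"Q: {q}\nSQL: {sql}" for q, sql in history)
    return (
        f"Schema:\n{ddl}\n\n"
        f"Domain notes:\n{domain_notes}\n\n"
        f"Past examples:\n{examples}\n\n"
        f"Question: {question}\nSQL:"
    )


question = "Which sales representative has the highest customer satisfaction scores?"
prompt = build_sql_prompt(
    question,
    ddl="CREATE TABLE satisfaction_surveys (rep_name TEXT, score REAL);",
    history=[("How many surveys were collected?",
              "SELECT COUNT(*) FROM satisfaction_surveys")],
    domain_notes="Scores range from 1 (worst) to 5 (best).",
)

# Stand-in for the model's answer to the prompt above.
GENERATED_SQL = """
SELECT rep_name, AVG(score) AS avg_score
FROM satisfaction_surveys
GROUP BY rep_name
ORDER BY avg_score DESC
LIMIT 1
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE satisfaction_surveys (rep_name TEXT, score REAL)")
conn.executemany(
    "INSERT INTO satisfaction_surveys VALUES (?, ?)",
    [("Ana", 4.5), ("Ana", 5.0), ("Ben", 4.0), ("Ben", 4.8)],
)
top_rep = conn.execute(GENERATED_SQL).fetchone()
print(top_rep)  # ('Ana', 4.75)
```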

Data Visualization Generation

Automated Chart Generation

"Create a chart showing monthly revenue trends across all regions"

This video demonstrates how the Database Visualization Agent handles the request "Create a chart showing monthly revenue trends across all regions." The agent first determines whether a chart is needed and, if the user has not specified a chart type, recommends the most suitable one. It then generates the chart from data provided by the Retrieval Agent, while the Analysis Agent offers an interpretation of the visualization. The process is fully automated, so users receive both the chart and its interpretation with minimal effort.

Intent Recognition: Detects user needs for visualization and recommends chart types automatically
Data-Driven Plotting: Uses retrieved data to generate appropriate charts
Insightful Analysis: Analysis agent provides explanations and insights for each visualization
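
The chart-type recommendation step can be approximated with a small rule table. This is a deliberately simple rule-based stand-in, not the agent's actual decision procedure, which the description above does not specify; the column kinds and chart names are assumptions for the example.

```python
def recommend_chart(x_kind, y_kind, requested=None):
    """Pick a chart type from the kinds of the x and y columns.

    x_kind / y_kind are one of "temporal", "categorical", "numeric".
    A chart type requested by the user always wins.
    """
    if requested:
        return requested
    if x_kind == "temporal" and y_kind == "numeric":
        return "line"      # trends over time
    if x_kind == "categorical" and y_kind == "numeric":
        return "bar"       # comparison across groups
    if x_kind == "numeric" and y_kind == "numeric":
        return "scatter"   # relationship between two measures
    return "table"         # fall back to showing the raw data


# "Monthly revenue trends across all regions": month on x, revenue on y,
# with one series per region.
print(recommend_chart("temporal", "numeric"))                    # line
print(recommend_chart("temporal", "numeric", requested="area"))  # area
```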

Knowledge Graph Construction

Data-to-Knowledge Graph Transformation

This demonstration presents how the Knowledge Graph Information Processing Agent transforms structured database tables into relational knowledge graphs. The agent reads table DDL, header information, and sample data, then applies algorithm-defined transformation rules to determine which columns become entities (including splitting multi-valued cells), and how entities are related. Logical and semantic relationship patterns are used to associate entities, and the resulting graph is stored in a knowledge graph database. This process enables downstream relational analysis that goes beyond traditional structured queries.

Entity Construction: Identifies entity columns, handles multi-valued cells, and generates unique entity identifiers
Relationship Discovery: Defines and applies logical/semantic rules to associate entities
Graph Storage: Persists the resulting knowledge graph for advanced relational retrieval and analysis
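
The entity and relationship steps above can be sketched as follows. The example assumes multi-valued cells are separated by semicolons and derives entity identifiers from a hash of the column name and value; both choices, along with the function names and the `USES` relation, are illustrative assumptions rather than the platform's actual rules.

```python
import hashlib


def entity_id(label, value):
    """Derive a stable identifier from the entity label and its value."""
    digest = hashlib.sha1(f"{label}:{value}".encode("utf-8")).hexdigest()[:10]
    return f"{label}_{digest}"


def table_to_graph(rows, entity_columns, relations, sep=";"):
    """Turn table rows into nodes and edges.

    relations: list of (source_column, relation_type, target_column).
    """
    nodes = {}
    edges = set()
    for row in rows:
        values = {}
        for col in entity_columns:
            # Split multi-valued cells into one entity per value.
            cells = [v.strip() for v in row[col].split(sep) if v.strip()]
            values[col] = cells
            for value in cells:
                nodes[entity_id(col, value)] = {"label": col, "name": value}
        for src, rel, dst in relations:
            # Entities that share a row are linked by the declared relation.
            for s in values[src]:
                for d in values[dst]:
                    edges.add((entity_id(src, s), rel, entity_id(dst, d)))
    return nodes, edges


rows = [
    {"project": "Atlas", "technology": "Python; Neo4j"},
    {"project": "Borealis", "technology": "Python"},
]
nodes, edges = table_to_graph(
    rows,
    entity_columns=["project", "technology"],
    relations=[("project", "USES", "technology")],
)
print(len(nodes), len(edges))  # 4 3
```

Because identifiers depend only on label and value, "Python" appearing in two rows maps to a single node, which is what lets the two projects connect through it.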

Knowledge Graph Visualization

Interactive Graph Exploration

This video demonstrates the knowledge graph visualization interface, which allows users to intuitively explore the internal relationships within the transformed knowledge graph. The interface displays all node types, relationship types, node properties, and their associations. Users can drag, browse, and filter the graph to gain a clear understanding of the data's relational structure.

Comprehensive Visualization: Shows all node types, relationships, and properties
Interactive Exploration: Supports drag-and-drop navigation and filtering
Relationship Clarity: Helps users intuitively understand data associations

Knowledge Graph Question Answering

Complex Relational Reasoning

"Map the technology stack relationships in our active projects and identify skill dependencies."

In this demonstration, the user asks, "Map the technology stack relationships in our active projects and identify skill dependencies." The Knowledge Graph Retrieval Agent uses the graph schema, historical QA, and domain-specific knowledge as RAG context to generate Cypher queries and retrieve relevant subgraphs. The Knowledge Graph Analysis Agent then analyzes the subgraph and, together with domain knowledge, provides insights into the relationships of interest. The Knowledge Graph Visualization Agent displays the subgraph below the answer, supporting drag-and-drop interaction and showing node properties for enhanced interpretability and user experience.

RAG-based Cypher Generation: Utilizes schema, historical QA, and domain knowledge for context
Subgraph Retrieval: Generates and executes Cypher to answer relational queries
Visual Explanation: Presents subgraphs interactively with node property details
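
To make the subgraph retrieval step concrete, the sketch below pairs a Cypher query of the kind the retrieval agent might generate with an equivalent traversal over plain Python dictionaries. The node labels, relationship types, and sample data are hypothetical, and no graph database is contacted.

```python
# A query the agent might produce for the question above (hypothetical schema).
CANDIDATE_CYPHER = """
MATCH (p:Project {status: 'active'})-[:USES]->(t:Technology)-[:REQUIRES]->(s:Skill)
RETURN p.name, t.name, s.name
"""


def active_stack(projects, uses, requires):
    """In-memory equivalent of CANDIDATE_CYPHER.

    projects: {project_name: status}
    uses:     {project_name: [technology, ...]}
    requires: {technology: [skill, ...]}
    """
    paths = []
    for project, status in projects.items():
        if status != "active":
            continue
        for tech in uses.get(project, []):
            for skill in requires.get(tech, []):
                paths.append((project, tech, skill))
    return sorted(paths)


projects = {"Atlas": "active", "Borealis": "archived"}
uses = {"Atlas": ["Neo4j", "Python"], "Borealis": ["Python"]}
requires = {"Neo4j": ["Cypher"], "Python": ["pandas"]}

paths = active_stack(projects, uses, requires)
print(paths)  # [('Atlas', 'Neo4j', 'Cypher'), ('Atlas', 'Python', 'pandas')]
```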

Multi-Agent Collaboration

Coordinated Team Orchestration

"Analyze our project delivery capacity by correlating team skills, project complexity, and historical performance - what are our optimization opportunities?"

This final demonstration shows the Data Leader Agent orchestrating end-to-end multi-agent collaboration using the ReAct paradigm. The user asks, "Analyze our project delivery capacity by correlating team skills, project complexity, and historical performance - what are our optimization opportunities?" The Data Leader Agent dynamically coordinates the Database and Knowledge Graph teams, integrating both structural and relational information through a three-stage iterative process. The interface displays the entire interaction between the leader and both teams, so the full workflow is transparent and traceable.

Three-Stage Principle: Implementation of "explore-verify-analyze" methodology for complex query decomposition
Dynamic Team Dispatch: Intelligent delegation between Database Team and Knowledge Graph Team based on task requirements
Iterative Reasoning: ReAct cycle implementation with thought-action-observation phases
Comprehensive Synthesis: Integration of multi-source information for final answer generation
Full Process Transparency: All agent interactions are visible and traceable for the user
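
The orchestration pattern listed above can be sketched as a loop of thought, action, and observation that dispatches each step to one of two teams and records a trace. Everything here is a stub: the planner is scripted rather than LLM-driven, the team functions return placeholder strings, and the assignment of stages to teams is arbitrary, since the description does not say what each stage does.

```python
def database_team(task):
    return f"rows for: {task}"


def knowledge_graph_team(task):
    return f"subgraph for: {task}"


TEAMS = {"database": database_team, "knowledge_graph": knowledge_graph_team}


def scripted_planner(question, trace):
    """Stand-in for the leader's LLM: one step per stage, then stop."""
    plan = [
        ("explore", "database", "list tables about projects and teams"),
        ("verify", "knowledge_graph", "check skill-to-project links"),
        ("analyze", "database", "aggregate historical delivery performance"),
    ]
    return plan[len(trace)] if len(trace) < len(plan) else None


def run_leader(question, planner, max_steps=6):
    """Run the thought-action-observation loop and keep a full trace."""
    trace = []
    for _ in range(max_steps):
        step = planner(question, trace)
        if step is None:  # planner decides it has enough information
            break
        stage, team, task = step
        observation = TEAMS[team](task)  # action: dispatch to a team
        trace.append({
            "stage": stage,
            "thought": f"{stage}: ask the {team} team",
            "action": (team, task),
            "observation": observation,
        })
    answer = " | ".join(item["observation"] for item in trace)
    return answer, trace


answer, trace = run_leader("What are our optimization opportunities?",
                           scripted_planner)
for item in trace:
    print(item["stage"], "->", item["action"][0])
```

Returning the trace alongside the answer is what makes each interaction inspectable after the fact.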

Research Results

Comprehensive evaluation across multiple benchmarks and model providers

Published in Information Processing & Management, 2026

DataFactory: Collaborative multi-agent framework for advanced table question answering

Tong Wang, Chi Jin, Yongkang Chen, Huan Deng, Xiaohui Kuang, Gang Zhao

Journal: Information Processing & Management, Volume 63, Issue 6, Article 104723

DOI: 10.1016/j.ipm.2026.104723

Article: https://www.sciencedirect.com/science/article/pii/S0306457326001147

Project Repository

Explore the official open-source implementation of DataFactory.

DataFactory (Main Project)

Includes framework logic, multi-agent orchestration, and reproducible workflows.

Performance Comparison

TabFact Accuracy Improvement: +15.9%

WikiTQ Performance Gain: +28.6%

FeTaQA Rouge-2 F Score: 0.3885
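
For readers unfamiliar with the FeTaQA metric, ROUGE-2 F is the harmonic mean of bigram precision and recall between a generated answer and a reference answer. The sketch below is a simplified version that assumes lowercase whitespace tokenization and no stemming; it is not the scoring script behind the figure above, and the example sentences are made up.

```python
from collections import Counter


def bigrams(text):
    """Count adjacent token pairs after lowercasing and whitespace splitting."""
    tokens = text.lower().split()
    return Counter(zip(tokens, tokens[1:]))


def rouge2_f(candidate, reference):
    """F1 over clipped bigram overlap between candidate and reference."""
    cand, ref = bigrams(candidate), bigrams(reference)
    overlap = sum((cand & ref).values())  # clipped matches
    total = sum(cand.values()) + sum(ref.values())
    # 2 * overlap / (|cand| + |ref|) equals 2PR / (P + R).
    return 2 * overlap / total if total else 0.0


score = rouge2_f("the cat sat on a mat", "the cat sat on the mat")
print(round(score, 4))  # 0.6
```

In this example three of the five bigrams on each side match, so precision and recall are both 0.6.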

Model Compatibility

Tested across 8 LLMs from 5 providers including:

Claude 4.0 Sonnet
Gemini 2.5 Flash
GPT-4o Mini
Qwen3 Series
DeepSeek-V3

BibTeX

Citation-ready entry

@article{WANG2026104723,
title = {DataFactory: Collaborative multi-agent framework for advanced table question answering},
journal = {Information Processing \& Management},
volume = {63},
number = {6},
pages = {104723},
year = {2026},
issn = {0306-4573},
doi = {10.1016/j.ipm.2026.104723},
url = {https://www.sciencedirect.com/science/article/pii/S0306457326001147},
author = {Tong Wang and Chi Jin and Yongkang Chen and Huan Deng and Xiaohui Kuang and Gang Zhao},
keywords = {Table question answering, Multi-agent systems, Large language models, Knowledge graph, Data factory, ReAct paradigm}
}

DataFactory: Multi-Agent Framework for TableQA


Key Features

Automated Data Ingestion

Knowledge Graph Construction

Multi-Agent Collaboration

Hallucination Reduction

Advanced Reasoning

Autonomous Pipeline

Architecture

Multi-Agent Data Factory

Data Leader

Database Team

Knowledge Graph Team

Information Storage

Knowledge Extraction

Insight Generation
