Which format works best when asking AI to process a table: CSV, Markdown, JSON, or natural language?



Chat AI and agent AI can handle input in various data formats, including text and images. Improving Agents, which researches effective ways to use AI models, tested which format is best suited to feeding a large 1,000-row table into GPT-4.1 nano and published the results.

Which Table Format Do LLMs Understand Best? (Results for 11 Formats)
https://www.improvingagents.com/blog/best-input-data-format-for-llms

Improving Agents took a table listing the ID, name, age, work location (city), department, salary, years of experience, and number of projects for 1,000 employees, rendered it in 11 different formats, and fed each version to GPT-4.1 nano to measure the accuracy rate over 1,000 questions. The 11 formats used in the experiment were 'JSON,' 'CSV,' 'XML,' 'YAML,' 'HTML,' 'Markdown Table,' 'Markdown KV,' 'INI,' 'pipe-separated string,' 'JSONL,' and 'natural language.'
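The measurement described above can be pictured with a minimal sketch. This is not Improving Agents' actual harness: the function name `accuracy`, the prompt wording, the exact-match scoring rule, and the stand-in `fake_model` are all assumptions made for illustration.

```python
from typing import Callable, Sequence, Tuple


def accuracy(
    table_text: str,
    qa_pairs: Sequence[Tuple[str, str]],
    ask_model: Callable[[str], str],
) -> float:
    """Return the fraction of questions answered correctly for one table format.

    Exact string match is an assumed scoring rule, not the one from the study.
    """
    correct = 0
    for question, expected in qa_pairs:
        # The table in the format under test goes in front of every question.
        prompt = f"{table_text}\n\nQuestion: {question}\nAnswer:"
        if ask_model(prompt).strip() == expected:
            correct += 1
    return correct / len(qa_pairs)


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for a real model call: always answers '56'."""
    return "56"


qa = [("How old is Charlie A0?", "56"), ("How old is Grace B1?", "59")]
print(accuracy("id: 1\nname: Charlie A0\nage: 56", qa, fake_model))  # 0.5
```

Running the same question set against each of the 11 renderings of the table would give one accuracy figure per format, which is the kind of comparison the article reports.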

Of these formats, 'Markdown KV' writes each record as a block of key-value pairs under its own Markdown heading, as shown below.
[code]# Employee Database

## Record 1

```
id: 1
name: Charlie A0
age: 56
city: New York
department: Operations
salary: 67896
years_experience: 7
project_count: 1
```

## Record 2

```
id: 2
name: Grace B1
age: 59
city: Mumbai
department: Marketing
salary: 47248
years_experience: 0
project_count: 43
```[/code]
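For readers who want to produce this layout from their own data, here is a minimal sketch in Python. The function name `to_markdown_kv` and the use of plain dictionaries are assumptions for illustration, not code from Improving Agents.

```python
from typing import Iterable, Mapping

# Built programmatically so this example does not close its own code fence.
FENCE = "`" * 3


def to_markdown_kv(
    records: Iterable[Mapping[str, object]],
    title: str = "Employee Database",
) -> str:
    """Render records as Markdown KV: a heading plus a fenced 'key: value' block per record."""
    lines = [f"# {title}", ""]
    for index, record in enumerate(records, start=1):
        lines += [f"## Record {index}", "", FENCE]
        lines += [f"{key}: {value}" for key, value in record.items()]
        lines += [FENCE, ""]
    return "\n".join(lines).rstrip() + "\n"


employees = [
    {"id": 1, "name": "Charlie A0", "age": 56, "city": "New York",
     "department": "Operations", "salary": 67896,
     "years_experience": 7, "project_count": 1},
    {"id": 2, "name": "Grace B1", "age": 59, "city": "Mumbai",
     "department": "Marketing", "salary": 47248,
     "years_experience": 0, "project_count": 43},
]

print(to_markdown_kv(employees))
```

Because every value sits next to its key, this layout repeats the column names once per record, so it is considerably longer than a CSV of the same table.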



For the natural language input, Improving Agents wrote out each employee's information as a sentence, as follows:
[code]Employee Records Summary:

Diana A0 (ID: 1) is a 46-year-old employee working in the Engineering department in London. They earn $141,015 with 7 years of experience and have completed 17 projects.
Grace B1 (ID: 2) is a 59-year-old employee working in the Engineering department in Berlin. They earn $100,066 with 11 years of experience and have completed 32 projects.
Grace C2 (ID: 3) is a 64-year-old employee working in the Engineering department in Dubai. They earn $91,727 with 9 years of experience and have completed 49 projects.[/code]
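Sentences of this shape can be generated from structured records with a simple template. The sketch below is an assumption about how such text could be produced; the function name `to_sentence` and the dictionary keys are illustrative rather than taken from the study.

```python
from typing import Mapping


def to_sentence(employee: Mapping[str, object]) -> str:
    """Describe one employee record as a natural-language sentence pair."""
    return (
        f"{employee['name']} (ID: {employee['id']}) is a {employee['age']}-year-old "
        f"employee working in the {employee['department']} department in "
        f"{employee['city']}. They earn ${employee['salary']:,} with "
        f"{employee['years_experience']} years of experience and have completed "
        f"{employee['project_count']} projects."
    )


diana = {
    "id": 1, "name": "Diana A0", "age": 46, "city": "London",
    "department": "Engineering", "salary": 141015,
    "years_experience": 7, "project_count": 17,
}

print("Employee Records Summary:\n")
print(to_sentence(diana))
```

The `:,` format specifier inserts the thousands separator, so a salary of 141015 is printed as $141,015.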



The results of the experiment are as follows. CSV, a widely used format for tabular data, reached an accuracy rate of only 44.3%, lower than natural language (49.6%). Markdown KV had the highest accuracy rate at 60.7%, followed by XML (56.0%), INI (55.7%), and YAML (54.7%).
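To put the figures quoted in this article side by side, the short sketch below ranks them and computes the gap between the best format and CSV. It includes only the six accuracy rates stated above; the variable names are illustrative.

```python
# Accuracy rates (%) as quoted in this article; other formats are omitted.
reported_accuracy = {
    "Markdown KV": 60.7,
    "XML": 56.0,
    "INI": 55.7,
    "YAML": 54.7,
    "natural language": 49.6,
    "CSV": 44.3,
}

# Sort from most to least accurate.
ranked = sorted(reported_accuracy.items(), key=lambda item: item[1], reverse=True)
for name, pct in ranked:
    print(f"{name:<16} {pct:.1f}%")

# Difference in percentage points, rounded to avoid floating-point noise.
gap = round(reported_accuracy["Markdown KV"] - reported_accuracy["CSV"], 1)
print(f"Markdown KV vs CSV: {gap} percentage points")
```

On these numbers, Markdown KV leads CSV by 16.4 percentage points.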



Based on these results, Improving Agents recommends Markdown KV where accuracy matters most and the Markdown table format where human readability is important. However, the test covered only one condition, data for 1,000 employees fed into GPT-4.1 nano, so the most suitable format may differ depending on the type of table or the AI model.

in AI, Software, Posted by log1o_hf