This paper proposes a structured data prediction method based on Large Language Models with In-Context Learning (LLM-ICL). The method designs a sample selection strategy to choose demonstration samples closely related to the prediction task, converts the structured data into text sequences, and provides them as input to a large language model for prediction through in-context learning. To validate the effectiveness of the method, experiments were conducted on the IPUMS dataset. In the few-shot setting with only 10 demonstration samples, the best-performing model, Qwen-plus, achieves a prediction accuracy of 79.4%, significantly outperforming traditional supervised machine learning algorithms trained on the same sample size (XGBoost at 73.5% and KNN at 71.1%). Further analysis reveals that KNN and XGBoost require approximately 500 and 16,000 samples, respectively, to reach the accuracy that LLM-ICL achieves with just 10. In addition, the sample selection strategy significantly affects performance: nearest neighbor sampling further improves accuracy over random selection. These results demonstrate the substantial potential and application value of LLM-ICL in few-shot structured data prediction tasks.
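A minimal sketch of the pipeline the abstract describes (nearest-neighbor demonstration selection, serialization of structured rows into text, and a few-shot prompt for an LLM), assuming scikit-learn for the nearest-neighbor search; the serialization template, column handling, and the `call_llm` placeholder are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch of the LLM-ICL pipeline:
# (1) select the k training rows nearest to the query row,
# (2) serialize each structured record into a text sequence,
# (3) assemble a few-shot prompt for a large language model.
# The prompt wording and call_llm() are hypothetical placeholders.

import numpy as np
from sklearn.neighbors import NearestNeighbors

def serialize_row(row: dict) -> str:
    """Convert one structured record into a plain-text sequence."""
    return ", ".join(f"{k} is {v}" for k, v in row.items())

def build_icl_prompt(train_vecs, train_rows, train_labels, query_row, query_vec, k=10):
    """Pick the k nearest training rows and format them as in-context examples."""
    nn = NearestNeighbors(n_neighbors=k).fit(np.asarray(train_vecs))
    _, idx = nn.kneighbors(np.asarray(query_vec).reshape(1, -1))
    parts = [
        f"Record: {serialize_row(train_rows[i])}\nLabel: {train_labels[i]}"
        for i in idx[0]
    ]
    parts.append(f"Record: {serialize_row(query_row)}\nLabel:")
    return "\n\n".join(parts)

# Usage (illustrative):
#   prompt = build_icl_prompt(X_train, rows_train, y_train, new_row, new_vec, k=10)
#   prediction = call_llm(prompt)   # call_llm stands in for any LLM API call
```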
Cite this paper
Dou, X. (2025) In-Document Table Data Inference Based on LLM-ICL Model. Open Access Library Journal, 12, e14603. https://doi.org/10.4236/oalib.1114603