Recent Advances in Food, Nutrition & Agriculture

Author(s): Himanshu Shekhar and Abhilasha Sharma*

DOI: 10.2174/2772574X14666230126095121

Global Food Production and Distribution Analysis using Data Mining and Unsupervised Learning

Page: [57 - 70] Pages: 14


Abstract

Background: Today’s food industry is extensive and complex, ranging from subsistence agriculture to multinational food corporations. The movement of food and food elements through food systems has a major impact on biodiversity preservation and on the overall sustainability of our fragile global ecosystem. Identifying human and livestock consumption patterns across regions and territories can help improve the dietary standards of the chronically undernourished and of the expanding population without substantially increasing the amount of land under cultivation. Food preservation is the basis for economic advancement and social sustainability, so the food industry, both local and global, is fundamental to everyone. As a primary mechanism for ensuring global food preservation, there is currently a strong emphasis on accelerating food supply and decreasing waste. Thus, analyzing the production and distribution of the food supply can support economic sustainability.

Methods: In this paper, we present a quantitative analysis of global and regional food supply to reveal the flow of food and feed products in various parts of the world. Using data mining and machine learning approaches, we seek to quantify the production and distribution of food elements. The study aims to employ artificial intelligence-based methods to understand shifts in supply and consumption patterns, together with timely distribution, in order to address global food instability. The method uses statistical approaches to identify hidden factors and variables. Feature engineering is used to uncover informative features in the dataset, and clustering algorithms such as K-Means are used to group similar features and identify the most notable ones.
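
As an illustration of the clustering step named above, the following is a minimal sketch of K-Means (Lloyd's algorithm) applied to standardized features, written with the Python standard library only. It is not the authors' implementation: the function names `standardize` and `kmeans`, the choice of z-score scaling, the value k = 2, and the six sample rows are all assumptions made for the example, and the numbers are invented rather than taken from FAOSTAT.

```python
import math
import random


def standardize(rows):
    """Rescale each column to zero mean and unit variance (z-scores)."""
    columns = list(zip(*rows))
    means = [sum(col) / len(col) for col in columns]
    # Population standard deviation; fall back to 1.0 for constant columns.
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in col) / len(col)) or 1.0
        for col, m in zip(columns, means)
    ]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def kmeans(points, k, max_iter=100, seed=0):
    """Lloyd's algorithm: alternate nearest-centroid assignment and mean update."""
    rng = random.Random(seed)
    # Start from k distinct data points chosen at random (copied, not aliased).
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids


# Hypothetical (food, feed) supply figures for six unnamed regions.
# These numbers are invented for illustration and are not FAOSTAT data.
samples = [
    [10.0, 2.0], [12.0, 3.0], [11.0, 2.5],
    [90.0, 40.0], [95.0, 42.0], [88.0, 38.0],
]
labels, centroids = kmeans(standardize(samples), k=2)
print(labels)
```

Standardizing first keeps a feature with a large numeric range from dominating the Euclidean distance. The cluster ids themselves are arbitrary and depend on the random seed; only the grouping is meaningful.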

Results: Data mining and machine learning algorithms helped us identify the global food production and distribution subsystem. The identified elements and their relationships can help stakeholders regulate various external and internal factors, including urbanization, urban food needs, the economic, political, and social framework, food demand, and supply flows. The exploratory analysis helps establish the efficiency and dynamism of food supply and distribution systems.

Conclusion: The outcome demonstrates a pattern indicating the flow of currently grown crops into various endpoints. A few countries with very large populations have shown tremendous growth in their production capacity. Although a few countries produce a large portion of food and feed crops, this output is still insufficient to feed the estimated global population. Significant changes in many people's socioeconomic conditions, as well as radical dietary changes, will also be required to boost agricultural credit and economic foundations.

