Article type: Research article (English)
Authors
1 Department of Information Technology Management, Faculty of Management and Economics, Tarbiat Modares University, Tehran, Iran
2 Department of Biosystems Mechanical Engineering, Faculty of Agriculture, Shahrekord University, Shahrekord, Iran
Abstract
An electronic nose is an electronic device for detecting odors. The data obtained from this device are stored as numerical values in separate columns and correspond to two types of cheese: gluten-free cheese and cheese containing gluten. On their own, these data are not sufficient for decision-making and judgment; the relationships and patterns among them must be discovered in order to determine whether new data recorded by the device belong to cheese with gluten or to gluten-free cheese. For this purpose, data mining and machine learning methods were used in this research. Data mining includes a variety of algorithms, such as classification, clustering, and association rule mining. To achieve better results, the data mining process was carried out on 105 different combinations of models, and the 13 models that showed the highest accuracy in capturing the relationships among the data are reported in the study. Using data mining methods, the data for cheese with gluten and gluten-free cheese were classified into separate categories, and a model was built to predict the type of new data in terms of the nature of the cheese (with gluten or gluten-free). After the 105 combinations were analyzed, the model that used the Random Forest algorithm for classification and MinMaxScaler for scaling the data was selected as the best model, with a prediction accuracy of 99.8% on both the training and test datasets.
Article title [English]
Implementation of Several Data Mining Strategies on Electronic Nose Data for Identifying Gluten in Cheese
Authors [English]
- Mohammad Nasiri-Galeh 1
- Mahdi Ghasemi-Varnamkhasti 2
1 Department of Information Technology Management, Faculty of Management and Economics, Tarbiat Modares University, Tehran, Iran
2 Department of Biosystems Mechanical Engineering, Faculty of Agriculture, Shahrekord University, Shahrekord, Iran
Abstract [English]
An electronic nose is an electronic device for odor detection. The data obtained from this device are stored as numerical values in separate columns and correspond to two types of cheese: gluten-free cheese and cheese containing gluten. On their own, these data are not sufficient for making decisions; the relationships and patterns among them must be discovered in order to determine which type of cheese new data recorded by the device belong to. For this purpose, data mining and machine learning methods were used in this research. Data mining includes various families of algorithms, such as classification, clustering, and association rule mining. To obtain better results, the data mining process was performed on 105 different combinations of models, and the 13 models with the highest accuracy in capturing the relationships among the data were selected. Using data mining methods, the data for cheese with gluten and gluten-free cheese were classified into separate categories, and a model was created to predict the type of new input data in terms of the nature of the cheese (gluten-free or with gluten). After the 105 combinations were analyzed, the model that used the Random Forest algorithm for classification and MinMaxScaler for scaling was selected as the most suitable, with a prediction accuracy of 99.8% on both the training and test datasets.
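The workflow summarized in the abstract (scale the sensor columns, fit a classifier, and compare several scaler and classifier combinations by training and test accuracy) can be illustrated with a minimal, dependency-free Python sketch. Everything in it is an assumption made for illustration: the synthetic data, the helper names such as `fit_minmax`, the three scalers, and the two simple classifiers (nearest centroid and 1-nearest-neighbour) are stand-ins. It does not implement Random Forest and does not reproduce the 105 combinations or the results of the study.

```python
import random
import statistics
from itertools import product

# Hypothetical synthetic data: three "sensor" columns on very different
# scales, two classes (0 = gluten-free, 1 = with gluten). Illustrative only.
def make_data(n_per_class, seed):
    rng = random.Random(seed)
    X, y = [], []
    for label, shift in ((0, 0.0), (1, 3.0)):
        for _ in range(n_per_class):
            X.append([rng.gauss(shift, 1.0) * 10 ** k for k in range(3)])
            y.append(label)
    return X, y

# Scalers: each "fit" learns column statistics and returns a transform.
def fit_identity(X):
    return lambda rows: [list(r) for r in rows]

def fit_minmax(X):
    # Min-max scaling: x' = (x - min) / (max - min), computed per column.
    cols = list(zip(*X))
    lo = [min(c) for c in cols]
    span = [(max(c) - min(c)) or 1.0 for c in cols]
    return lambda rows: [[(v - l) / s for v, l, s in zip(r, lo, span)]
                         for r in rows]

def fit_standard(X):
    cols = list(zip(*X))
    mu = [statistics.fmean(c) for c in cols]
    sd = [statistics.pstdev(c) or 1.0 for c in cols]
    return lambda rows: [[(v - m) / s for v, m, s in zip(r, mu, sd)]
                         for r in rows]

def sq_dist(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))

# Classifiers: each "fit" returns a predict function (stand-ins, not Random Forest).
def fit_nearest_centroid(X, y):
    centroids = {}
    for label in set(y):
        rows = [r for r, t in zip(X, y) if t == label]
        centroids[label] = [statistics.fmean(c) for c in zip(*rows)]
    return lambda rows: [min(centroids, key=lambda c: sq_dist(r, centroids[c]))
                         for r in rows]

def fit_one_nn(X, y):
    return lambda rows: [y[min(range(len(X)), key=lambda i: sq_dist(r, X[i]))]
                         for r in rows]

def accuracy(pred, true):
    return sum(p == t for p, t in zip(pred, true)) / len(true)

X_train, y_train = make_data(40, seed=0)
X_test, y_test = make_data(20, seed=1)

scalers = {"none": fit_identity, "minmax": fit_minmax, "standard": fit_standard}
classifiers = {"nearest_centroid": fit_nearest_centroid, "1nn": fit_one_nn}

results = []
for (s_name, fit_scaler), (c_name, fit_clf) in product(scalers.items(),
                                                       classifiers.items()):
    transform = fit_scaler(X_train)  # scaling is fitted on training data only
    predict = fit_clf(transform(X_train), y_train)
    train_acc = accuracy(predict(transform(X_train)), y_train)
    test_acc = accuracy(predict(transform(X_test)), y_test)
    results.append((test_acc, train_acc, s_name, c_name))

best = max(results)  # highest test accuracy; ties broken by the later fields
for test_acc, train_acc, s_name, c_name in sorted(results, reverse=True):
    print(f"{s_name:>8} + {c_name:<16} train={train_acc:.3f} test={test_acc:.3f}")
```

Fitting the scaler on the training split only, then applying the same transform to the test split, keeps the test accuracy from being influenced by test-set statistics. In practice the same loop could be written with a machine learning library's scaler and classifier objects in place of these hand-written stand-ins.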
Keywords [English]
- Data classification
- Data mining
- Decision tree
- Electronic nose
- Machine learning
©2025 The author(s). This is an open access article distributed under Creative Commons Attribution 4.0 International License (CC BY 4.0)