Research Article

A Comparative Research on Data Analysis with Factorial ANOVA, Logistic Regression and CHAID Classification Tree Methods

Year 2022, Volume: 5 Issue: 3, 314 - 322, 01.07.2022
https://doi.org/10.47115/bsagriculture.1087820

Abstract

Data mining is the extraction of information hidden in large and complex data. Numerous methods have been developed for statistical data analysis in this context. When these methods are grouped into conventional (classical) and current methods, factorial ANOVA (FANOVA) and Logistic Regression (LR) are regarded as conventional methods, while the decision trees known as Classification Tree (CT) and Regression Tree (RT) are regarded as current methods. The method to be used in statistical data analysis depends directly on the researcher's hypothesis (i.e. purpose) and on the variable type, so the choice of method is important. In this regard, comparative studies of methods provide useful guidance. In this study, a dataset on which inferences could be made by the FANOVA, LR, and CT methods was analyzed. With this dataset, the relationship between birth type (single or twin) as the dependent variable and yield year and maternal age as the independent variables was examined in an Awassi sheep flock. The findings of each method were interpreted in the way specific to that method. The methods were compared in terms of the similarities and differences in the information they provided and in how they explained the relationship between the dependent and independent variables. It was concluded that each method offered different inferences depending on purpose and perspective. Researchers are therefore advised to choose the data analysis method that suits their goals while taking the data structure into account.
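As a rough illustration of how the LR and CT views of the same data can differ, the sketch below works on a single 2 x 2 table of maternal age group by birth type. The counts are hypothetical and are not taken from the study; the function names are likewise illustrative. The chi-square statistic is the kind of criterion a CHAID-type tree uses to judge a candidate split, while the odds ratio equals the exponentiated coefficient of a logistic regression with one binary predictor. The study's own analyses were run in SPSS and are not reproduced here.

```python
import math


def chi_square_statistic(table):
    """Pearson chi-square statistic for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def p_value_df1(stat):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


def odds_ratio(table):
    """Odds ratio of a 2 x 2 table [[a, b], [c, d]], computed as (a*d)/(b*c)."""
    (a, b), (c, d) = table
    return (a * d) / (b * c)


# Hypothetical counts, NOT from the study.
# Rows: maternal age group (younger, older); columns: birth type (single, twin).
counts = [[80, 20], [60, 40]]

stat = chi_square_statistic(counts)
print("chi-square:", round(stat, 3))
print("p-value (df = 1):", round(p_value_df1(stat), 4))
print("odds ratio of twin birth, older vs younger:", round(odds_ratio(counts), 3))
```

The tree-style reading asks whether the age split separates birth types strongly enough to be worth making, whereas the regression-style reading quantifies how much the odds of a twin birth change between the two groups. Both are computed from the same table, which is why the two methods can support different but compatible inferences.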

Project Number

None

References

  • Alev Çetin F, Mikail N. 2016. Data mining applications in livestock. Turk J Agric Res, 3: 79-88.
  • Alpar R. 2011. Applied multivariate statistical methods. Detay Publishing, Ankara, Türkiye, 6th ed., pp: 858.
  • Bek Y, Efe E. 1989. Research and application methods I. 1st ed., Çukurova University, Agriculture Faculty, Textbook. Publication No 71. Adana, Türkiye, pp: 395.
  • Bircan H. 2004. Logistic regression analysis: An application on medical data. Kocaeli Univ J Social Sci Institute, 2: 185-208.
  • Breiman L, Friedman JH, Olshen RA, Stone CJ. 1984. Classification and regression trees. Wadsworth International Group, Belmont, California, US, pp: 3-7.
  • Cottle DJ, Gilmour AR, Pabiou T, Amer PR, Fahey AG. 2016. Genetic selection for increased mean and reduced variance of twinning rate in Belclare ewes. J Anim Breed Genetics, 133: 126-137.
  • Çokluk Ö. 2010. Logistic regression analysis: Concept and application. Educ Sci Theor Pract, 10: 1357-1407.
  • Dangeti P. 2017. Statistics for machine learning. 1st ed., Packt Publishing Ltd, Birmingham, UK, pp: 442.
  • Gacar BK, Kocakoç ID. 2020. Regression analyses or decision trees? Manisa Celal Bayar Univ J Social Sci, 18: 251-260.
  • Güner ZB. 2014. Cart and logistic regression analysis in data mining: An application on pharmacy provision system data. Soc Secur Profes Assoc J Soc Secur, 6: 59-61.
  • Koç Y, Eyduran E, Akbulut Ö. 2016. Application of regression tree method for different data from animal science. Pakistan J Zool, 49: 599-607.
  • Koç Y. 2016. Application of Regression Tree Method for Different Data from Animal Science. MSc thesis, Iğdır University, the Institute of Science and Technology, Iğdır, Türkiye, pp: 75.
  • Kurt İ, Türe M, Kurum AT. 2008. Comparing performances of logistic regression, classification and regression tree, and neural networks for predicting coronary artery disease. Expert Syst Appl, 34: 366-374.
  • Kuyucu YE. 2012. Comparison of logistic regression analysis (LRA), artificial neural networks (ANN) and classification and regression trees (C&RT) methods and an application in medicine. MSc thesis, Gaziosmanpasa University, Institute of Health Sciences. Tokat, Türkiye, pp: 112.
  • Notter DR. 2008. Genetic aspects of reproduction in sheep. Reprod Domestic Anim, 43: 122-128.
  • Özdamar K. 2004. Statistical data analysis with package programs II. Multivariate Analysis. 5th ed., Kaan Publishing House, Eskisehir, Türkiye, pp: 649.
  • Özgür EG, Doğanay Erdoğan B. 2020. Regression tree approach in computer adaptive testing (BUT) applications: Evaluation of standard CAT algorithm using a psychometric model with regression decision trees on artificial data. J Ankara Health Sci, 9(1): 161-167.
  • Özkan K. 2012. Modelling ecological data using classification and regression tree technique (CART). Süleyman Demirel Üniv Fac Forest J, 13: 1-4.
  • Şahin O. 2017. Determining the important risk factors in preferring Ayvalık for touristic purpose using the method of logistic. Electronic J Soc Sci, 16(61): 647-660.
  • Şata M, Çakan M. 2018. Comparison of results of CHAID analysis and logistic regression analysis. Dicle Univ J Ziya Gökalp Fac of Educ, 33: 48-56.
  • Şenel S, Alatlı B. 2014. A review of articles used logistic regression analysis. J Measur Eval Educ Psychol, 5: 35-52.
  • SPSS 2011. SPSS for Windows, Version 20, SPSS Inc., Chicago, US.
  • Tatlıyer A. 2020. The effects of raising type on performances of some data mining algorithms in lambs. KSU J Agric Nat, 23: 772-780.
  • Vatankhah M, Talebi MA. 2008. Heritability estimates and correlations between production and reproductive traits in Lori-Bakhtiari sheep in Iran. South African J Anim Sci, 38: 110-118.
  • Vupa Çilengiroğlu Ö, Yavuz A. 2020. Comparison of predictive performance of logistic regression and CART methods for life satisfaction data. European J Sci Tec, 18: 719-727.
  • Yıldız N, Akbulut Ö, Bircan H. 2020. Introduction to statistics, 14th ed., Culture and Education Foundation Publishing House. Erzurum, Türkiye, pp: 326.
  • Yıldız N, Bircan H. 1994. Research and application methods in statistics. 2nd ed., Agriculture Faculty Publication No: 697. Erzurum, Türkiye, pp: 266.

Supporting Institution

None

Thanks

Managers and employees of the Ceylanpınar Agricultural Enterprise (Ceylanpınar Tarım İşletmesi), as stated in the article text.

There are 27 citations in total.

Details

Primary Language English
Subjects Zootechny (Other)
Journal Section Research Articles
Authors

Ömer Akbulut 0000-0002-8860-3513

Ali Kaygısız 0000-0002-5302-2735

İsa Yılmaz 0000-0001-6796-577X

Project Number None
Publication Date July 1, 2022
Submission Date March 14, 2022
Acceptance Date June 17, 2022
Published in Issue Year 2022 Volume: 5 Issue: 3

Cite

APA Akbulut, Ö., Kaygısız, A., & Yılmaz, İ. (2022). A Comparative Research on Data Analysis with Factorial ANOVA, Logistic Regression and CHAID Classification Tree Methods. Black Sea Journal of Agriculture, 5(3), 314-322. https://doi.org/10.47115/bsagriculture.1087820
