{"id":3321,"date":"2017-05-03T08:08:15","date_gmt":"2017-05-03T15:08:15","guid":{"rendered":"http:\/\/blogs.pugetsound.edu\/econ\/?p=3321"},"modified":"2017-05-02T14:55:47","modified_gmt":"2017-05-02T21:55:47","slug":"thesis-corner-predicting-college-attendance","status":"publish","type":"post","link":"https:\/\/blogs.pugetsound.edu\/econ\/2017\/05\/03\/thesis-corner-predicting-college-attendance\/","title":{"rendered":"Thesis Corner: Predicting College Attendance"},"content":{"rendered":"<p>So three weeks ago, I spoke with Ian Hughes about his thesis, titled:<\/p>\n<p><span style=\"text-decoration: underline\">Identifying Socioeconomic Indicators of College Attendance with Classification Trees<br \/>\n<\/span>Much of Ian&#8217;s research centered around intergenerational income mobility and barriers to it. Some of the research showed that &#8220;<span style=\"font-weight: 400\">low-income students are subject to less college preparation and lower test scores because of their financial situations&#8221;, which makes it harder to use education as a route out of poverty.\u00a0<\/span><\/p>\n<p>The literature on wealth, race, and parental education in connection with children&#8217;s college attendance is well established, so Ian included some other characteristics that have not received as much attention. For his data set, Ian examined the\u00a0Panel Study of Income Dynamics (<a href=\"https:\/\/psidonline.isr.umich.edu\/\" target=\"_blank\">PSID<\/a>) from the University of Michigan. It covers household employment, income, wealth, expenditures, health, child development, education, and numerous other topics. Some of the more atypical variables Ian\u00a0investigated included the overall positivity\u00a0of the child, their emotional well-being, and their impression of their school&#8217;s safety. 
<\/p>\n<p>To tackle this problem, Ian developed both an OLS binary response regression and a classification tree machine learning approach. He focused on the classification tree, a technique borrowed from data science, which essentially evaluates the predictive capacity of explanatory variables and eliminates those that are poor predictors. Below you can see the results.<\/p>\n<p><a href=\"http:\/\/blogs.pugetsound.edu\/econ\/files\/2017\/04\/ian-hughes-class-tree.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignleft wp-image-3322 size-full\" src=\"http:\/\/blogs.pugetsound.edu\/econ\/files\/2017\/04\/ian-hughes-class-tree.png\" alt=\"ian hughes class tree\" width=\"698\" height=\"484\" srcset=\"https:\/\/blogs.pugetsound.edu\/econ\/files\/2017\/04\/ian-hughes-class-tree.png 698w, https:\/\/blogs.pugetsound.edu\/econ\/files\/2017\/04\/ian-hughes-class-tree-300x208.png 300w\" sizes=\"auto, (max-width: 698px) 100vw, 698px\" \/><\/a><\/p>\n<p><a href=\"https:\/\/docs.google.com\/document\/d\/1Vc78wp-I8SSnVzuXFADvYWGM2BwCdgp9z_Nl_J7nRZs\/edit?usp=sharing\" target=\"_blank\">Here<\/a> you can see the descriptions of all the variables used, but the main ones to know are the binary response variable, college, and the Broad Reading percentage, which is used as a proxy for children&#8217;s intelligence. The classification tree takes an imaginary child and evaluates their likelihood of going to college. So it operates like a decision tree in game theory, but instead of strategically choosing a path, it\u00a0shows what a child would do based on their characteristics. 
In my interview, I think Ian put it best, saying,<\/p>\n<blockquote><p>&#8220;[The classification tree] tells a different story&#8230;it&#8217;s more narrative-based compared to a regression where you have to build all these scenarios in order to tell the story&#8230;If you were given a sample child, you could predict whether or not they go to college a lot more easily than using a regression where you&#8217;re still given a percentage chance.&#8221;<\/p><\/blockquote>\n<p>Classification trees also perform another important function: by building nodes based on combinations of explanatory variables, they illuminate the relationships between the variables in a way that an OLS regression would not. For example, if neither of a child&#8217;s guardians completed 14 years of schooling, then a parent&#8217;s preference for a child receiving more education becomes critical to their odds of attending\u00a0college.<\/p>\n<p>As far as Ian&#8217;s conclusions go, it seems the regression found that parental income is still one of the most important factors in determining college attendance, but that other variables have significant impacts when paired together in relationships by a classification tree. Or as Ian put it,<\/p>\n<blockquote><p>&#8220;<span style=\"font-weight: 400\">Combining the two distinct models helps answer the question, \u201cwhich childhood environment factors explain college attendance?\u201d, with more clarity.&#8221;<\/span><\/p><\/blockquote>\n<p>No matter how you evaluate it, effective studies are necessary if we are to tackle a problem as far-reaching as intergenerational mobility.<\/p>\n<p>-Max Coleman<\/p>\n","protected":false},"excerpt":{"rendered":"<p>So three weeks ago, I spoke with Ian Hughes about his thesis, titled: Identifying Socioeconomic Indicators of College Attendance with Classification Trees Much of Ian&#8217;s research centered around intergenerational income mobility and barriers to it. 
Some of the research showed that &#8220;low-income students are subject to less college preparation and lower test scores because of their financial situations&#8221;, which contributes to more difficulty in using education as a route out of poverty.\u00a0 Much of the literature surrounding wealth, race, and parental education in connection with children&#8217;s college attendance is well established, so Ian included some other characteristics that have not <a class=\"more-link\" href=\"https:\/\/blogs.pugetsound.edu\/econ\/2017\/05\/03\/thesis-corner-predicting-college-attendance\/\">Continue reading <span class=\"screen-reader-text\">  Thesis Corner: Predicting College Attendance<\/span><span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":532,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[6],"tags":[693,690,692,691,689],"class_list":["post-3321","post","type-post","status-publish","format-standard","hentry","category-economics","tag-classification-tree","tag-college-attendance","tag-income-mobility","tag-machine-learning","tag-psid"],"_links":{"self":[{"href":"https:\/\/blogs.pugetsound.edu\/econ\/wp-json\/wp\/v2\/posts\/3321","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blogs.pugetsound.edu\/econ\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.pugetsound.edu\/econ\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.pugetsound.edu\/econ\/wp-json\/wp\/v2\/users\/532"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.pugetsound.edu\/econ\/wp-json\/wp\/v2\/comments?post=3321"}],"version-history":[{"count":14,"href":"https:\/\/blogs.pugetsound.edu\/econ\/wp-json\/wp\/v2\/posts\/3321\/revisions"}],"predecessor-version":[{"id":3370,"href":"https:\/\/blogs.pugetsound.edu\/econ\/wp-json\/wp\/v2\/posts\/3321\/revisions\/3370"}],"wp:attachment":[{"href":"https:\/\/blogs.
pugetsound.edu\/econ\/wp-json\/wp\/v2\/media?parent=3321"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.pugetsound.edu\/econ\/wp-json\/wp\/v2\/categories?post=3321"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.pugetsound.edu\/econ\/wp-json\/wp\/v2\/tags?post=3321"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}