{"id":251,"date":"2008-10-05T14:23:07","date_gmt":"2008-10-05T18:23:07","guid":{"rendered":"http:\/\/itp.nyu.edu\/~bmy1\/itpblog\/?p=251"},"modified":"2008-10-05T14:23:07","modified_gmt":"2008-10-05T18:23:07","slug":"reading-exploratory-data-analysis-eda","status":"publish","type":"post","link":"https:\/\/yeeality.com\/blog\/2008\/10\/05\/reading-exploratory-data-analysis-eda\/","title":{"rendered":"Reading: Exploratory Data Analysis (EDA)"},"content":{"rendered":"<p>John Tukey and other mathematicians worked to differentiate between modes of analysis in the 60&#8217;s and 70&#8217;s. EDA treats data as more than the support for existing knowledge; instead, data is viewed as a source of new ideas and hypotheses. Some key points when thinking of EDA are below:<\/p>\n<ul>\n<li>Skepticism joined by openness<\/li>\n<li>Flexible, adaptable, risk-taking<\/li>\n<li>Includes randomness<\/li>\n<li>Smooth enough? Rough enough?<\/li>\n<li>Analysis should begin with data not summaries<\/li>\n<li>Stem-and-leaf (easy to construct by hand, shows numbers and shape)<\/li>\n<li>Box-and-whisker (good for providing visual detail of outliers, tails)<\/li>\n<li>Note in data: skewness, outliers, gaps, and multiple peaks<!--more--><\/li>\n<\/ul>\n<p>Distribution<\/p>\n<ul>\n<li>Location (mean, median, mode)<\/li>\n<li>Spread (width, standard deviations: ~68% within one, ~95% within two)<\/li>\n<li>Shape (standard bell-curve)<\/li>\n<\/ul>\n<p>Tenets<\/p>\n<ol>\n<li>Shape of distribution at least as important as location and spread<\/li>\n<li>Visual representations are superior to purely numeric for discovering shape<\/li>\n<li>Choice of summary stats to describe data for single variable should depend on appropriateness of stat for the shape of distribution<\/li>\n<\/ol>\n","protected":false},"excerpt":{"rendered":"<p>John Tukey and other mathematicians worked to differentiate between modes of analysis in the 60&#8217;s and 70&#8217;s. 
EDA treats data as more than the support for existing knowledge; instead, data is viewed as a source of new ideas and hypotheses. Some key points when thinking of EDA are below: Skepticism joined by openness Flexible, adaptable, [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[13],"tags":[],"class_list":["post-251","post","type-post","status-publish","format-standard","hentry","category-craftingdata"],"_links":{"self":[{"href":"https:\/\/yeeality.com\/blog\/wp-json\/wp\/v2\/posts\/251","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/yeeality.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/yeeality.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/yeeality.com\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/yeeality.com\/blog\/wp-json\/wp\/v2\/comments?post=251"}],"version-history":[{"count":7,"href":"https:\/\/yeeality.com\/blog\/wp-json\/wp\/v2\/posts\/251\/revisions"}],"predecessor-version":[{"id":259,"href":"https:\/\/yeeality.com\/blog\/wp-json\/wp\/v2\/posts\/251\/revisions\/259"}],"wp:attachment":[{"href":"https:\/\/yeeality.com\/blog\/wp-json\/wp\/v2\/media?parent=251"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/yeeality.com\/blog\/wp-json\/wp\/v2\/categories?post=251"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/yeeality.com\/blog\/wp-json\/wp\/v2\/tags?post=251"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}