Applied Predictive Analytics : Principles and Techniques for the Professional Data Analyst.


Dean Abbott
Book · English · 2014 · Electronic books.
Extent
1 online resource (453 pages)
Edition
1st ed.
Notes
Cover -- Title Page -- Copyright -- Contents
Chapter 1 Overview of Predictive Analytics -- What Is Analytics? -- What Is Predictive Analytics? -- Supervised vs. Unsupervised Learning -- Parametric vs. Non-Parametric Models -- Business Intelligence -- Predictive Analytics vs. Business Intelligence -- Do Predictive Models Just State the Obvious? -- Similarities between Business Intelligence and Predictive Analytics -- Predictive Analytics vs. Statistics -- Statistics and Analytics -- Predictive Analytics and Statistics Contrasted -- Predictive Analytics vs. Data Mining -- Who Uses Predictive Analytics? -- Challenges in Using Predictive Analytics -- Obstacles in Management -- Obstacles with Data -- Obstacles with Modeling -- Obstacles in Deployment -- What Educational Background Is Needed to Become a Predictive Modeler?
Chapter 2 Setting Up the Problem -- Predictive Analytics Processing Steps: CRISP-DM -- Business Understanding -- The Three-Legged Stool -- Business Objectives -- Defining Data for Predictive Modeling -- Defining the Columns as Measures -- Defining the Unit of Analysis -- Which Unit of Analysis? -- Defining the Target Variable -- Temporal Considerations for Target Variable -- Defining Measures of Success for Predictive Models -- Success Criteria for Classification -- Success Criteria for Estimation -- Other Customized Success Criteria -- Doing Predictive Modeling Out of Order -- Building Models First -- Early Model Deployment -- Case Study: Recovering Lapsed Donors -- Overview -- Business Objectives -- Data for the Competition -- The Target Variables -- Modeling Objectives -- Model Selection and Evaluation Criteria -- Model Deployment -- Case Study: Fraud Detection -- Overview -- Business Objectives -- Data for the Project -- The Target Variables -- Modeling Objectives -- Model Selection and Evaluation Criteria -- Model Deployment -- Summary
Chapter 3 Data Understanding -- What the Data Looks Like -- Single Variable Summaries -- Mean -- Standard Deviation -- The Normal Distribution -- Uniform Distribution -- Applying Simple Statistics in Data Understanding -- Skewness -- Kurtosis -- Rank-Ordered Statistics -- Categorical Variable Assessment -- Data Visualization in One Dimension -- Histograms -- Multiple Variable Summaries -- Hidden Value in Variable Interactions: Simpson's Paradox -- The Combinatorial Explosion of Interactions -- Correlations -- Spurious Correlations -- Back to Correlations -- Crosstabs -- Data Visualization, Two or Higher Dimensions -- Scatterplots -- Anscombe's Quartet -- Scatterplot Matrices -- Overlaying the Target Variable in Summary -- Scatterplots in More Than Two Dimensions -- The Value of Statistical Significance -- Pulling It All Together into a Data Audit -- Summary
Chapter 4 Data Preparation -- Variable Cleaning -- Incorrect Values -- Consistency in Data Formats -- Outliers -- Multidimensional Outliers -- Missing Values -- Fixing Missing Data -- Feature Creation -- Simple Variable Transformations -- Fixing Skew -- Binning Continuous Variables -- Numeric Variable Scaling -- Nominal Variable Transformation -- Ordinal Variable Transformations -- Date and Time Variable Features -- ZIP Code Features -- Which Version of a Variable Is Best? -- Multidimensional Features -- Variable Selection Prior to Modeling -- Sampling -- Example: Why Normalization Matters for K-Means Clustering -- Summary
Chapter 5 Itemsets and Association Rules -- Terminology -- Condition -- Left-Hand-Side, Antecedent(s) -- Right-Hand-Side, Consequent, Output, Conclusion -- Rule (Item Set) -- Support -- Antecedent Support -- Confidence, Accuracy -- Lift -- Parameter Settings -- How the Data Is Organized -- Standard Predictive Modeling Data Format -- Transactional Format -- Measures of Interesting Rules -- Deploying Association Rules -- Variable Selection -- Interaction Variable Creation -- Problems with Association Rules -- Redundant Rules -- Too Many Rules -- Too Few Rules -- Building Classification Rules from Association Rules -- Summary
Chapter 6 Descriptive Modeling -- Data Preparation Issues with Descriptive Modeling -- Principal Component Analysis -- The PCA Algorithm -- Applying PCA to New Data -- PCA for Data Interpretation -- Additional Considerations before Using PCA -- The Effect of Variable Magnitude on PCA Models -- Clustering Algorithms -- The K-Means Algorithm -- Data Preparation for K-Means -- Selecting the Number of Clusters -- The Kohonen SOM Algorithm -- Visualizing Kohonen Maps -- Similarities with K-Means -- Summary
Chapter 7 Interpreting Descriptive Models -- Standard Cluster Model Interpretation -- Problems with Interpretation Methods -- Identifying Key Variables in Forming Cluster Models -- Cluster Prototypes -- Cluster Outliers -- Summary
Chapter 8 Predictive Modeling -- Decision Trees -- The Decision Tree Landscape -- Building Decision Trees -- Decision Tree Splitting Metrics -- Decision Tree Knobs and Options -- Reweighting Records: Priors -- Reweighting Records: Misclassification Costs -- Other Practical Considerations for Decision Trees -- Logistic Regression -- Interpreting Logistic Regression Models -- Other Practical Considerations for Logistic Regression -- Neural Networks -- Building Blocks: The Neuron -- Neural Network Training -- The Flexibility of Neural Networks -- Neural Network Settings -- Neural Network Pruning -- Interpreting Neural Networks -- Neural Network Decision Boundaries -- Other Practical Considerations for Neural Networks -- K-Nearest Neighbor -- The k-NN Learning Algorithm -- Distance Metrics for k-NN -- Other Practical Considerations for k-NN -- Naïve Bayes -- Bayes' Theorem -- The Naïve Bayes Classifier -- Interpreting Naïve Bayes Classifiers -- Other Practical Considerations for Naïve Bayes -- Regression Models -- Linear Regression -- Linear Regression Assumptions -- Variable Selection in Linear Regression -- Interpreting Linear Regression Models -- Using Linear Regression for Classification -- Other Regression Algorithms -- Summary
Chapter 9 Assessing Predictive Models -- Batch Approach to Model Assessment -- Percent Correct Classification -- Rank-Ordered Approach to Model Assessment -- Assessing Regression Models -- Summary
Chapter 10 Model Ensembles -- Motivation for Ensembles -- The Wisdom of Crowds -- Bias Variance Tradeoff -- Bagging -- Boosting -- Improvements to Bagging and Boosting -- Random Forests -- Stochastic Gradient Boosting -- Heterogeneous Ensembles -- Model Ensembles and Occam's Razor -- Interpreting Model Ensembles -- Summary
Chapter 11 Text Mining -- Motivation for Text Mining -- A Predictive Modeling Approach to Text Mining -- Structured vs. Unstructured Data -- Why Text Mining Is Hard -- Text Mining Applications -- Data Sources for Text Mining -- Data Preparation Steps -- POS Tagging -- Tokens -- Stop Word and Punctuation Filters -- Character Length and Number Filters -- Stemming -- Dictionaries -- The Sentiment Polarity Movie Data Set -- Text Mining Features -- Term Frequency -- Inverse Document Frequency -- TF-IDF -- Cosine Similarity -- Multi-Word Features: N-Grams -- Reducing Keyword Features -- Grouping Terms -- Modeling with Text Mining Features -- Regular Expressions -- Uses of Regular Expressions in Text Mining -- Summary
Chapter 12 Model Deployment -- General Deployment Considerations -- Deployment Steps -- Summary
Chapter 13 Case Studies -- Survey Analysis Case Study: Overview -- Business Understanding: Defining the Problem -- Data Understanding -- Data Preparation -- Modeling -- Deployment: "What-If" Analysis -- Revisit Models -- Deployment -- Summary and Conclusions -- Help Desk Case Study -- Data Understanding: Defining the Data -- Data Preparation -- Modeling -- Revisit Business Understanding -- Deployment -- Summary and Conclusions
Index

Learn the art and science of predictive analytics: techniques that get results. Predictive analytics is what translates big data into meaningful, usable business information. Written by a leading expert in the field, this guide examines the science of the underlying algorithms as well as the principles and best practices that govern the art of predictive analytics. It clearly explains the theory behind predictive analytics, teaches the methods, principles, and techniques for conducting predictive analytics projects, and offers tips and tricks that are essential for successful predictive modeling. Hands-on examples and case studies are included.
The ability to successfully apply predictive analytics enables businesses to effectively interpret big data, which is essential for competition today. This guide teaches not only the principles of predictive analytics but also how to apply them to achieve real, pragmatic solutions. It explains methods, principles, and techniques for conducting predictive analytics projects from start to finish, illustrates each technique with hands-on examples, and includes a series of in-depth case studies that apply predictive analytics to common business scenarios. A companion website provides all the data sets used to generate the examples as well as a free trial version of software. Applied Predictive Analytics arms data and business analysts and business managers with the tools they need to interpret and capitalize on big data.
Subjects
Genre
Dewey
ISBN
9781118727935