WOE transformation helps you build a strictly linear relationship with the log odds. Such a relationship is otherwise hard to achieve with other transformation methods such as log or square root. In short, without the WOE transformation you may have to try several transformation methods to achieve this.

Information Value

Information value (IV) is one of the most useful techniques for selecting important variables in a predictive model. It helps to rank variables on the basis of their importance. The IV is calculated using the following formula, summed over all groups (bins) of the variable:

IV = ∑ (% of non-events - % of events) * WOE

Rules related to Information Value

According to Siddiqi (2006), by convention the values of the IV statistic in credit scoring can be interpreted as follows. If the IV statistic is:

- Less than 0.02, then the predictor is not useful for modeling (separating the Goods from the Bads).
- 0.02 to 0.1, then the predictor has only a weak relationship to the Goods/Bads odds ratio.
- 0.1 to 0.3, then the predictor has a medium strength relationship to the Goods/Bads odds ratio.
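As a rough illustration of the IV formula above, here is a minimal Python sketch. The function names `information_value` and `iv_band`, the input layout (a list of per-bin counts), and the sample counts are assumptions made for this example, not part of any library. It also assumes every bin has at least one event and one non-event, and it treats each band as including its lower bound, which the rules above do not specify.

```python
import math

def information_value(bin_counts):
    """Sum (% of non-events - % of events) * WOE over all bins.

    bin_counts -- list of (non_events, events) tuples, one per bin.
    Assumes no count is zero, otherwise WOE is undefined.
    """
    total_non_events = sum(n for n, _ in bin_counts)
    total_events = sum(e for _, e in bin_counts)
    iv = 0.0
    for non_events, events in bin_counts:
        pct_non_events = non_events / total_non_events
        pct_events = events / total_events
        woe = math.log(pct_non_events / pct_events)
        iv += (pct_non_events - pct_events) * woe
    return iv

def iv_band(iv):
    """Map an IV value to the bands listed above (lower bound inclusive)."""
    if iv < 0.02:
        return "not useful"
    if iv < 0.1:
        return "weak"
    if iv < 0.3:
        return "medium"
    return "0.3 or higher"  # outside the bands listed in this section

# Hypothetical counts for a variable with two bins: (non-events, events)
sample_bins = [(50, 40), (50, 60)]
sample_iv = information_value(sample_bins)
print(round(sample_iv, 4), iv_band(sample_iv))  # about 0.0405, "weak"
```

Each term of the sum is non-negative, because the difference in percentages and the WOE always share the same sign, so IV can only grow as bins separate events from non-events more sharply.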
Weight of Evidence and Information Value Calculation

Download : Excel Template for WOE and IV

Terminologies related to WOE

1. Fine Classing
Create 10 or 20 bins/groups for a continuous independent variable, then calculate the WOE and IV of the variable.

2. Coarse Classing
Combine adjacent categories with similar WOE scores.

Usage of WOE

Weight of Evidence (WOE) helps to transform a continuous independent variable into a set of groups or bins based on the similarity of the dependent variable distribution, i.e. the number of events and non-events.

For continuous independent variables : First, create bins (categories / groups) for a continuous independent variable, then combine categories with similar WOE values and replace the categories with WOE values. Use WOE values rather than input values in your model.

For categorical independent variables : Combine categories with similar WOE, then create new categories of the independent variable with continuous WOE values. In other words, use WOE values rather than raw categories in your model. The transformed variable will be a continuous variable with WOE values.

Categories with similar WOE are combined because they have almost the same proportion of events and non-events. In other words, the behavior of both categories is the same.

How to check correct binning with WOE

1. If a particular bin contains no events or no non-events, its WOE is undefined (the formula divides by zero or takes the log of zero). You can use the formula below to avoid a missing WOE. It adds 0.5 to the number of events and non-events in the group.

AdjustedWOE = ln(((Number of non-events in a group + 0.5) / Number of non-events) / ((Number of events in a group + 0.5) / Number of events))
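The AdjustedWOE formula above can be sketched in a few lines of Python. The function name `adjusted_woe`, its parameter names, and the sample counts are assumptions chosen for this illustration.

```python
import math

def adjusted_woe(non_events_in_group, events_in_group,
                 total_non_events, total_events):
    """WOE with 0.5 added to the group's event and non-event counts.

    The totals are the counts over all groups and are left unadjusted,
    as in the formula above.
    """
    pct_non_events = (non_events_in_group + 0.5) / total_non_events
    pct_events = (events_in_group + 0.5) / total_events
    return math.log(pct_non_events / pct_events)

# Hypothetical bin: 40 non-events and 0 events,
# out of 400 non-events and 100 events overall.
zero_event_woe = adjusted_woe(40, 0, 400, 100)
print(round(zero_event_woe, 3))  # about 3.008; the unadjusted WOE would be undefined
```

Because the adjustment is applied only to the group's own counts, it has little effect on well-populated bins and mainly matters for sparse ones.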
Many people do not understand the terms goods/bads because they come from a background other than credit risk. It helps to understand the concept of WOE in terms of events and non-events instead. WOE is calculated by taking the natural logarithm (log to base e) of the ratio of % of non-events to % of events:

WOE = ln(% of non-events / % of events)

The calculation steps are:

1. For a continuous variable, split the data into 10 parts (or fewer, depending on the distribution).
2. Calculate the number of events and non-events in each group (bin).
3. Calculate the % of events and % of non-events in each group, i.e. the group's share of all events and of all non-events.
4. Calculate WOE by taking the natural log of the ratio of % of non-events to % of events.

Note : For a categorical variable, you do not need to split the data (ignore Step 1 and follow the remaining steps).
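The calculation steps above can be sketched in plain Python as follows. The helper names `equal_frequency_bins` and `woe_by_bin` and the sample data are assumptions made for this illustration. The binning helper splits by rank, so tied values may land in different bins, and the WOE helper assumes every bin has at least one event and one non-event.

```python
import math
from collections import Counter

def equal_frequency_bins(values, n_bins=10):
    """Step 1: assign each value to one of n_bins roughly equal-sized groups by rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, i in enumerate(order):
        labels[i] = rank * n_bins // len(values)
    return labels

def woe_by_bin(bins, events):
    """Steps 2 to 4: return {bin: ln(% of non-events / % of events)}.

    bins   -- bin label for each observation
    events -- 1 for an event, 0 for a non-event, aligned with bins
    """
    # Step 2: count events and non-events in each bin
    event_counts, non_event_counts = Counter(), Counter()
    for b, e in zip(bins, events):
        if e == 1:
            event_counts[b] += 1
        else:
            non_event_counts[b] += 1
    total_events = sum(event_counts.values())
    total_non_events = sum(non_event_counts.values())
    woe = {}
    for b in sorted(set(bins)):
        # Step 3: the bin's share of all events and of all non-events
        pct_events = event_counts[b] / total_events
        pct_non_events = non_event_counts[b] / total_non_events
        # Step 4: natural log of the ratio
        woe[b] = math.log(pct_non_events / pct_events)
    return woe

# Hypothetical data: a continuous variable and an event flag
ages = [21, 25, 30, 35, 40, 45, 50, 55]
flags = [1, 1, 1, 0, 1, 0, 0, 0]
age_bins = equal_frequency_bins(ages, n_bins=2)
age_woe = woe_by_bin(age_bins, flags)
print(age_woe)  # bin 0 is ln(1/3), about -1.0986; bin 1 is ln(3), about 1.0986
```

For a categorical variable, the raw category labels can be passed to `woe_by_bin` directly, skipping the binning helper.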
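Positive WOE means Distribution of Goods > Distribution of Bads.
Negative WOE means Distribution of Goods < Distribution of Bads.
This is because the log of a number greater than 1 is a positive value, while the log of a number less than 1 is negative.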