WOE and IV (Python)

Python023

https://blog.csdn.net/kevin7658/article/details/50780391

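IV and WOE:

IV (Information Value) measures a variable's predictive power:

<= 0.02: no predictive power, unusable

0.02~0.1: weak predictive power

0.1~0.2: moderate predictive power

0.2+: strong predictive power

IV can also be used to select variables: the larger a variable's IV, the stronger the case for including it in the list of model input variables.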
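The thresholds above say how to read an IV value but not how to compute one. The following is a minimal sketch, assuming the common definition WOE_i = ln(p_i / n_i), where p_i is the share of all positive samples that fall in bin i and n_i is the share of all negative samples in that bin, and IV = sum over bins of (p_i - n_i) * WOE_i. The function names `woe_iv` and `iv_label`, the 1e-4 floor for empty cells, and the treatment of values exactly on a threshold are illustrative assumptions, not taken from the original post.

```python
import numpy as np


def woe_iv(feature_bins, target, floor=1e-4):
    """Return ({bin: WOE}, IV) for an already-binned feature and a 0/1 target."""
    feature_bins = np.asarray(feature_bins)
    target = np.asarray(target)
    total_pos = (target == 1).sum()
    total_neg = (target == 0).sum()

    woe = {}
    iv = 0.0
    for b in np.unique(feature_bins):
        mask = feature_bins == b
        # Share of all positives / all negatives that land in this bin;
        # `floor` (an assumption) avoids log(0) for empty cells.
        pos_share = max((target[mask] == 1).sum() / total_pos, floor)
        neg_share = max((target[mask] == 0).sum() / total_neg, floor)
        w = float(np.log(pos_share / neg_share))
        woe[b.item()] = w
        iv += (pos_share - neg_share) * w
    return woe, float(iv)


def iv_label(iv):
    """Map an IV value to the bands listed above (boundary handling is assumed)."""
    if iv <= 0.02:
        return "no predictive power"
    if iv <= 0.1:
        return "weak"
    if iv <= 0.2:
        return "moderate"
    return "strong"


# Toy data: bin "A" holds 3 of 4 positives, bin "B" holds 3 of 4 negatives
bins = ["A"] * 4 + ["B"] * 4
y = [1, 1, 1, 0, 1, 0, 0, 0]
woe, iv = woe_iv(bins, y)
print(woe, iv, iv_label(iv))
```

With this convention a positive WOE means the bin is richer in positive samples than the population as a whole. Swapping the roles of positives and negatives flips the sign of every WOE but leaves IV unchanged.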

PSI (Population Stability Index)

import numpy as np


def calculate_psi(expected, actual, buckets=10):  # test, base
    """Return the PSI between two samples (one value per column for 2-D input)."""

    def psi(expected_array, actual_array, buckets):

        def scale_range(values, new_min, new_max):
            # Linearly rescale `values` to span [new_min, new_max] without mutating the input
            values = values - np.min(values)
            values = values / (np.max(values) / (new_max - new_min))
            return values + new_min

        # Split the value range of `expected_array` into `buckets` equal-width segments (10 by default)
        breakpoints = np.arange(0, buckets + 1) / buckets * 100
        breakpoints = scale_range(breakpoints, np.min(expected_array), np.max(expected_array))

        # Fraction of each sample that falls into each segment
        expected_percents = np.histogram(expected_array, breakpoints)[0] / len(expected_array)
        actual_percents = np.histogram(actual_array, breakpoints)[0] / len(actual_array)

        def sub_psi(test, base):  # test, base
            # Replace empty segments with a small value to avoid log(0) and division by zero
            if base == 0:
                base = 0.0001
            if test == 0:
                test = 0.0001
            return (test - base) * np.log(test / base)

        # Built-in sum: passing a generator to np.sum is deprecated
        return sum(sub_psi(e, a) for e, a in zip(expected_percents, actual_percents))

    if len(expected.shape) == 1:
        # 1-D input: a single variable, so return a scalar
        return psi(expected, actual, buckets)

    # 2-D input: one PSI value per column (shape[1], not shape[0], since columns are indexed below)
    psi_values = np.empty(expected.shape[1])
    for i in range(len(psi_values)):
        psi_values[i] = psi(expected[:, i], actual[:, i], buckets)
    return psi_values
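For a quick sanity check of the calculation above, here is a compact vectorized sketch of the same idea for 1-D input: equal-width segments over the range of the first sample, a small floor for empty segments, then the sum of (e - a) * ln(e / a). The name `psi_vectorized`, the `floor` parameter, and the synthetic normal samples are assumptions made for illustration only.

```python
import numpy as np


def psi_vectorized(expected, actual, buckets=10, floor=1e-4):
    """PSI of two 1-D samples using equal-width segments over the range of `expected`."""
    expected = np.asarray(expected, dtype=float)
    actual = np.asarray(actual, dtype=float)

    # buckets + 1 equally spaced edges from min(expected) to max(expected)
    edges = np.linspace(expected.min(), expected.max(), buckets + 1)

    e_perc = np.histogram(expected, edges)[0] / len(expected)
    a_perc = np.histogram(actual, edges)[0] / len(actual)

    # Replace empty segments with a small floor to avoid log(0)
    e_perc = np.where(e_perc == 0, floor, e_perc)
    a_perc = np.where(a_perc == 0, floor, a_perc)

    return float(np.sum((e_perc - a_perc) * np.log(e_perc / a_perc)))


rng = np.random.default_rng(0)
base = rng.normal(0.0, 1.0, 5000)

psi_same = psi_vectorized(base, base)           # identical samples
psi_shifted = psi_vectorized(base, base + 0.5)  # mean shifted by 0.5
print(psi_same, psi_shifted)
```

Identical samples give a PSI of exactly zero, and any difference in the segment proportions makes it positive, because every term (e - a) * ln(e / a) is non-negative. Note that values of the second sample that fall outside the range of the first are not counted in any segment, which is also how the histogram-based function above behaves.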

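Taking a square root requires importing the math module:

import math
math.sqrt(4)
-------
2.0

^ is the bitwise XOR operator. It applies a logical exclusive OR to each bit position of two equal-length bit patterns or binary numbers: a result bit is 1 if the two input bits differ, and 0 otherwise.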
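A short sketch contrasting the two operations mentioned above: a square root via `math.sqrt` (or the `**` power operator) and `^`, which is bitwise XOR in Python. The variable names are illustrative only.

```python
import math

root_via_math = math.sqrt(4)   # 2.0
root_via_power = 4 ** 0.5      # 2.0, ** is the power operator

# ^ compares the operands bit by bit: 0b0101 ^ 0b0011 = 0b0110
xor_result = 0b0101 ^ 0b0011   # 6

print(root_via_math, root_via_power, xor_result, bin(xor_result))
```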