Python Programming

Calculate the covariance between columns of a DataFrame in pandas

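Calculating Covariance:

DataFrame.cov() returns the pairwise covariance matrix of all numeric columns, while Series.cov() returns the covariance between one column and another. Both use the sample covariance (normalized by N - 1) by default.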

import pandas as pd

df = pd.DataFrame([[10, 20, 30, 40], [7, 14, 21, 28], [55, 15, 8, 12],
                   [15, 14, 1, 8], [7, 1, 1, 8], [5, 4, 9, 2]],
                  columns=['Apple', 'Orange', 'Banana', 'Pear'],
                  index=['Basket1', 'Basket2', 'Basket3', 'Basket4',
                         'Basket5', 'Basket6'])

print("\n----------- Calculating Covariance -----------\n")
print(df.cov())

print("\n----------- Between 2 columns -----------\n")
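# Covariance between the Apple and Orange columns
# (bracket access also works for column names that are not valid identifiers)
print(df['Apple'].cov(df['Orange']))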

Output:


C:\pandas>python example.py
 
----------- Calculating Covariance -----------
 
        Apple     Orange      Banana        Pear
Apple   367.9  47.600000  -40.200000  -35.000000
Orange   47.6  52.666667   54.333333   77.866667
Banana  -40.2  54.333333  134.266667  154.933333
Pear    -35.0  77.866667  154.933333  211.866667
 
----------- Between 2 columns -----------
 
47.60000000000001
 
C:\pandas>
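
Checking the result by hand:

The Apple vs Orange value above can be reproduced from the definition of sample covariance: sum the products of each column's deviations from its mean, then divide by n - 1. The sketch below is only an illustration of that formula, not how pandas computes it internally; the helper name sample_covariance and the variables apple and orange are made up for this example.

```python
import pandas as pd

# Same Apple and Orange values as in the DataFrame above
apple = pd.Series([10, 7, 55, 15, 7, 5], name="Apple")
orange = pd.Series([20, 14, 15, 14, 1, 4], name="Orange")


def sample_covariance(x, y):
    """Hypothetical helper: sample covariance, normalized by n - 1."""
    n = len(x)
    dx = x - x.mean()  # deviations of x from its mean
    dy = y - y.mean()  # deviations of y from its mean
    return (dx * dy).sum() / (n - 1)


manual = sample_covariance(apple, orange)
builtin = apple.cov(orange)

print(round(manual, 6))       # 47.6
print(round(builtin, 6))      # 47.6

# The diagonal of df.cov() is each column's variance
print(round(apple.var(), 6))  # 367.9
```

The manual and built-in results agree up to floating-point rounding, which is also why the original output shows 47.60000000000001 rather than exactly 47.6.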