xlwings – dataframes and statistics

Another data conversion option offered by the latest xlwings release is Pandas dataframes and dataseries.  This post will look at how to pass an Excel range as a dataframe or dataseries in a User Defined Function (UDF), and some simple statistical applications, but this is barely scratching the surface of the potential of the Pandas Python Data Analysis Library.

To use the Pandas data structures, import pandas and xlwings; the data type can then be defined with an xw.arg decorator:

import pandas as pd
import xlwings as xw 

@xw.arg('datrange', pd.DataFrame, index = 0, header = 0)    
def rtnpdframe(datrange):
    return datrange
    
@xw.arg('datrange', pd.Series, header = 0)    
def rtnpdseries(datrange):
    return datrange

The Python functions can then be called from VBA:

Function GetDFrame(DRange As Variant)
On Error GoTo rtnerr:
    GetDFrame = Py.CallUDF(ModName, "rtnpdframe", Array(DRange), ThisWorkbook)
    Exit Function
rtnerr:
    GetDFrame = Err.Description
End Function

Function GetDSeries(DRange As Variant)
On Error GoTo rtnerr:
    GetDSeries = Py.CallUDF(ModName, "rtnpdseries", Array(DRange), ThisWorkbook)
    Exit Function
rtnerr:
    GetDSeries = Err.Description
End Function

Use of these functions is shown in the screenshots below:

DFrame1-1

The xl_Corr UDF shown above is taken from the xlwings documentation:

@xw.func
@xw.arg('x', pd.DataFrame, index=False, header=False)
@xw.ret(index=False, header=False)    
def xl_Correl(x):
    return x.corr()

I have added a simple extension to allow an Excel UDF to call any of the dataframe methods:

@xw.func
@xw.arg('x', pd.DataFrame, index=False, header=False)
@xw.ret(index=False, header=False)    
def xl_Stats(x, stat):
    method = getattr(x, stat)
    return method()

Use of this UDF with four different statistics functions is shown below.  In these examples the statistic is a single value, and the results for each of the three columns of the input range are returned as a single column with three rows.

DFrame1-2

The “describe” method returns 8 values for each column in the input range:

DFrame1-3

The Pandas documentation lists 20 common statistics functions available as dataframe methods.  The screenshot below shows the use of the xl_Stats UDF, in conjunction with the Index function to call any of these 20 functions.  In the example shown the cumsum function returns one row for each row of the input data (only the first three are shown):

DFrame1-4

The functions shown above have been added to:
xlwDict.zip
available for free download, with full open source code.

 

This entry was posted in Excel, Link to Python, Maths, Newton, Python Pandas, UDFs, VBA and tagged , , , , , , , . Bookmark the permalink.

One Response to xlwings – dataframes and statistics

  1. Pingback: xlwSciPy 1.7 | Newton Excel Bach, not (just) an Excel Blog

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s