Numpy and Scipy for Excel

The Python Numpy and Scipy libraries contain a huge number of maths and science related functions.  Most of these are not directly available from Excel, but using the free ExcelPython package it is fairly easy to make the link, vastly increasing the functionality of Excel for maths, science and engineering applications.

I have collected interface functions to many of the (to me) most useful Scipy and Numpy functions in a Python module called xlScipy.  These may be accessed from a free and open source spreadsheet: xlSciPy.zip.

The spreadsheet requires a copy of Python to be installed (including Numpy and Scipy), all other files, including ExcelPython, are included in the download package.  Just unzip to a convenient folder, enable macros when you start up, and it should work.  For those who have downloaded earlier versions of ExcelPython, this spreadsheet uses version 2.08, which is not compatible with earlier versions, so install in a different folder to the earlier versions.

The functions included in xlSciPy are listed on the first sheet (index):

xlScipy1
There are ten functions for interpolation and fitting:
xlScipy2
17 linear algebra functions:
xlScipy3
12 functions for solving, creating and evaluating polynomials of any order:
xlScipy4
All of the Scipy special functions (219 in all):
xlScipy5
and 26 statistical functions:
xlScipy6

Additional functions will be added as time permits.

The spreadsheet includes examples and brief documentation of each function; more detailed documentation can be found at Numpy Reference and Scipy Reference.

Posted in Excel, Link to Python, Maths, NumPy and SciPy, UDFs, VBA | Tagged , , , , , , | Leave a comment

Working with array functions

Regular visitors will know that I make frequent use of User Defined Functions that return an array of data, and need to be entered as an array function.

This has the advantages that the functions are very flexible in the layout of input and output data, and brief documentation can be added easily, but they do have a couple of disadvantages:

  • There is no simple built-in way to reduce the size of an array functions output range.
  • For applications where simplicity and speed of operation is more important than flexibility it is often better to use sub-routines with pre-defined ranges and user-input forms and buttons.

To get the best of both worlds I have modified the RegExpres spreadsheet with the following features:

  • Added two sub-routines that will adjust an array function range to the selected range (press Ctrl-Shift-R), or adjust the size to display the full output array (press Ctrl-Shift-S).
  • Added a simple demonstration of how a function can be wrapped in a sub-routine, that will read the input data from defined ranges, and write the output array as values to a range of adjustable size.

The spreadsheet can be downloaded from RegExpres.xlsb, including full open-source code.

The array re-sizing functions were taken from the Technicana site, where they are made freely available, along with other VBA utilities.

The screen-shots below show an array output being reduced down to the selected area with Ctrl-Shift-R, then re-sized to display the full array with Ctrl-Shift-S:

Array-resize1

Array-resize2

The code for the subroutine to return the array function output to a defined range is shown below:


Sub Reg_Expres_Sub()
Dim SubIn As Variant, SubArg As Variant, FuncResA As Variant
Dim Pattern As String, Operation As String, RepString As String, Glob As Boolean, IgnoreCs As Boolean
Dim NumRows As Long, NumCols As Long

' Read data
SubIn = Range("subin").Value2
SubArg = Range("subarg").Value2

' Create function arguments
Pattern = (SubArg(1, 1))
Operation = SubArg(2, 1)
RepString = SubArg(3, 1)
Glob = SubArg(4, 1)
IgnoreCs = SubArg(5, 1)

'Call Function
FuncResA = Reg_Exp(SubIn, Pattern, Operation, RepString, Glob, IgnoreCs)

'Check size
NumRows = UBound(FuncResA)
NumCols = UBound(FuncResA, 2)

' Clear output range and rezize
With Range("subout")
.ClearContents
.Resize(NumRows, NumCols).Name = "subout"
End With
' Write results to spreadsheet
Range("subout").Value2 = FuncResA

End Sub

This code is linked to a button allowing the displayed results to be updated as required:

Array-resize3

Posted in Arrays, Excel, UDFs, VBA | Tagged , , , , | Leave a comment

More on Regular Expressions

The previous post presented a User Defined Function (UDF) using regular expressions to extract all numbers, or a selected number, from any string (or a list of strings).  In this post I will look more closely at the code used to achieve this, and present a general purpose UDF allowing regular expressions to be used on strings.  A spreadsheet with examples of both functions, and full open-source code, can be downloaded from RegExpres.xlsb.

First the Regular Expression object must be created and given the appropriate criteria to return all numbers (0-9), the decimal character (“.”), the exponent symbol (“E”), and the plus and minus symbols (“+-“).  It is also set to return all matching strings (Global = True) and to ignore the case of alphabetic characters (IgnoreCase = True):


Const ExpString As String = "E", pmString As String = "+-"
'Initialize variables
Set RE = CreateObject("VBScript.RegExp")
'Criteria for Regular Expression
REPatt = "[0-9" & DecString & ExpString & pmString & "]"
With RE
.Pattern = REPatt
.Global = True
.IgnoreCase = True
End With

There follows some code to convert the input range to a variant array and count the number of rows; the numbers in each string are then extracted with the line:

Set Matches = RE.Execute(NumString)

As discussed in the previous post, for a string containing a single value, that is all that is required, but if there is more than one number all the number strings will be concatenated, and must be separated. This can be done by stepping through the original string and comparing the characters with those in the Matches variable, as shown below:

'Loop Matches collection and compare with each character of original string to build strings of all numbers in the sample string
        strNumber = ""
        NumMatches = Matches.Count
        NumChar = Len(NumString)
        j = 0
        k = 1
        PrevMatch = False
        For i = 1 To NumChar
            MatchString = Mid(NumString, i, 1)
            If PrevMatch = False And UCase(MatchString) = ExpString Then
                j = j + 1
            ElseIf PrevMatch = False And (MatchString = "+" Or MatchString = "-") And _
                   IsNumeric(Mid(NumString, i + 1, 1)) = False Then
                j = j + 1
            ElseIf MatchString <> IgnoreString Then
                Digit = Matches(j)
                If Digit = MatchString Then
                    strNumber = strNumber + Matches(j)
                    j = j + 1
                    PrevMatch = True
                ElseIf PrevMatch = True Then
                    If Position = 0 Then
                        OutA(Row, k) = CDbl(strNumber)
                    ElseIf Position = k Then
                        OutA(Row, 1) = CDbl(strNumber)
                        Exit For
                    End If
                    strNumber = ""
                    k = k + 1
                    PrevMatch = False
                End If
            End If
        Next i
        If PrevMatch = True Then
            If Position = 0 Then
                OutA(Row, k) = CDbl(strNumber)
            ElseIf k = Position Then
                OutA(Row, 1) = CDbl(strNumber)
            End If
        End If
        If k < Position Then OutA(Row, 1) = CVErr(xlErrNA)
    Next Row

Regular Expressions also have many other uses, and the Reg_Exp UDF allows them to be applied quickly and easily. The Regular Expression object has the methods “List”, “Test”, and “Replace”, and the screen-shots below show the use of these methods on a list of strings:

Extract all numbers

RegExp1

Extract all occurrences of the string “Node”

RegExp2

Find strings containing “Node”

RegExp3

Replace “Node” with “Number”

RegExp4

Find all four letter strings starting with “n” and ending with “e”

RegExp5

For more details of setting up regular expression patterns see:
VBA RegEx: How to Evaluate and Use “Regular Expressions”

Finally note that if you want to apply these UDFs to a range they must be entered as an Array Function to return all the results.

Posted in Excel, UDFs, VBA | Tagged , , , , | 2 Comments

Extracting numbers from text and regular expressions

Daily Dose of Excel recently had a post on extracting a number from a specified position in a text string, using mega-in-sheet-formulas like:

ExtractNum2-0

This reminded me of a post from earlier this year, which provided a User Defined Function (UDF) that used regular expressions to extract numbers from a string.  This worked well if the string contained a single number, but returned incorrect results if there were two or more numbers.

I have now updated this function to return correct results from a string (or array of strings) with two or more separate numbers.  The new function (ExtractNums) is compared below with the previous version (ExtractNum), and another function that extracts numbers from the right hand end of a string.  As well as finding all the numbers, the new function will also deal with negative numbers, numbers with a comma separator for thousands, and numbers in E notation.  The new function may be downloaded from RegExpres.xlsb

In the screenshot below the function returns all the numbers from each string, in an array:ExtractNum2-1

It is also possible to specify a particular number from the sequence to return.  The screenshot below shows results for a range of different number formats, compared with two functions posted in the comments to the DDoE article:

ExtractNum2-2

In the screenshot below the third number in the sequence has been requested. When there are less than 3 numbers the function returns #N/A. Note that the Extractnums function recognises 123,456 as a single number, whereas the DDoE functions treat it as two separate numbers.
ExtractNum2-5

For speed of operation, The original ExtractNum function (which does no checking of the string returned by the regular expression function) is about 5 times faster than ExtractNums, so is useful for strings that contain a single numerical string.  The F_snb function is about 30% faster than ExtractNums, but does not recognise comma separators, E Notation, or numbers starting with a minus sign.  The F_snb2 function (which is written as a single line of VBA code), is about 10 times slower.

The next post will have a closer look at the code, and present another function for working with regular expressions.

Posted in Arrays, Excel, UDFs, VBA | Tagged , , , , | 1 Comment

Range tips

A couple of tips on Excel ranges, coming from questions on the Eng-Tips Forum:

1. How to rotate a range:

The Transpose function (or option in the Paste-Special dialog box) will swap the rows and columns of a range, but there is no simple built-in way to rotate a range of data through 90 degrees.  The VBA code below will create a User Defined Function (UDF) that will do the job:

Function RotateRange(RRange As Variant) As Variant
Dim NumRows As Long, NumCols As Long, i As Long, j As Long, RotnA() As Variant

RRange = RRange.Value2
NumRows = UBound(RRange)
NumCols = UBound(RRange, 2)
ReDim RotnA(1 To NumCols, 1 To NumRows)
For i = 1 To NumRows
For j = 1 To NumCols
RotnA(j, i) = RRange(NumRows + 1 - i, j)
Next j
Next i
RotateRange = RotnA
End Function

Copy and paste this code into a VBA module.

To use the function, enter =RotateRange(address of range to be rotated) in the top-left cell of the destination range, then select the entire output range, and enter the function as an array function to display the whole rotated range.

RotateRange

Update 29 Nov 2014:  Further to the comments from Ian Huitson and Lori below, I have uploaded a spreadsheet with examples of the UDF and their suggestions:

RotateRange2

 

2. How to show the address of a named range (or not):

The writer of this question was having a problem in Excel 2007 with named ranges in formulas being converted to cell addresses whenever a cell was edited, either by pressing the F2 key, or by double clicking on the cell.  This was happening in some files, but not others.

The solution to the problem was to go into Options-Advanced, scroll right to the bottom, and de-select the two boxes under “Lotus 123 Compatibility Settings for:”  Note that each sheet of a workbook can have different settings, so it may be necessary to check every sheet.

Further examination found that this option works differently in the more recent Excel versions (2010 and 2013).

For the later versions if you double click in the cell, or press F2 the formula box at the cell displays the cell address, but the formula bar still displays the range name.  If you click in the formula bar both the formula box and the formula bar display the cell address, rather than the range name, but the range name returns when you press enter.  This behaviour is actually quite useful, but it seems to be little known.

Strangely, Lotus 123 itself (at least in Rel. 9.5) does not replicate either of these behaviours.  When you click on a cell, or click on a formula bar, the range names remain as names, just like in Excel, when “Lotus 123 Compatibility” is not selected!

Posted in Excel, UDFs, VBA | Tagged , , , , | 6 Comments