Hacker News new | past | comments | ask | show | jobs | submit login
Use Python in Excel without add-ins (xlwings.org)
251 points by gauriage on Sept 26, 2014 | hide | past | favorite | 70 comments



This is fantastic - I am going to start testing it now. I write a fair amount of VB macros, and would much prefer to be able to use Python.

And, if they could create a plugin that allowed Libre Office to do the same, they could solve one of the last remaining major interoperability issues with LO and Excel: the inability to run VB macros in LO. That would be huge.



The Python API looks horrible, basically Java code with Python syntax.


That was my impression as well, the last time I looked. Surely it can't be all that difficult to resolve?

It's a shame, having a simple python scripting system built into Calc would be a major advantage.


Might be irrevelant but anyone know best source to learn VB macros?


Speaking specifically about Excel, the most important thing to learn is the Range object, no matter what book you're using:

http://msdn.microsoft.com/en-us/library/office/ff838238%28v=...

Close behind are Collections and UDTs. The best thing to level up after you're comfortable is with 'Implements' for interfaces.

I found that search engines are as good as any book for this stuff. It's smeared all over the internet.

The best place I know of to get a good understanding is http://www.cpearson.com/excel/MainPage.aspx, but it's not a tutorial.

edit: that's what I get for keeping this window open too long:) That's 2 votes for cpearson.com.


If you are trying to automate some actions in the UI, use the record macro functionality and then edit the code. You can usually remove or combine a lot of the code it recorded.

If you want to use dictionaries/hashtable there is a reference you can add called scripting.dictionary.

When you search google for how to do stuff in VBA, most of the best answers are in Ozgrid and MrExcel sites, and of course StackOverflow.


When I had to spend a large amount of intimate time with VBA I used to go to

http://www.cpearson.com/Excel/Topic.aspx

http://www.ozgrid.com/VBA/

Old websites but in my case I dealt with Office <= 2003


Great recommendations, these two sites are still the best today. Ozgrid is my go to forum, the community there is top notch.


I'm still not sure there's another skywalker. I stopped [ab]using VBA in 2012 and unfortunately I can't recall the name yet.


My favorite reference is the VBA Developer's Handbook by Getz & Gilbert from 2001.

You can probably learn everything you need from Googling for solutions to problems, but that book was great when I was committed to working with Excel and Access for a while. It covers a lot of things you're not going to think to search about.


If you have any specific questions about VBA feel free to contact me. I spent a couple of years on macros in various contexts and still have loads of links. (I second the links already posted)


Years ago, I used Microsoft MSDN anytime I had questions relating to VBA.


excel 2007 vba programmers reference


In LO you cannot define format strings that automagically scale the number, e.g. "d = "0"mm"%%% to use meters for further calculations yet display in mm.


Not the same thing, but this made me think about it and readers might be interested to know that the xlrd, xlwt and xlutils python modules (http://www.python-excel.org/) allow you to manipulate Excel files in Python, without even having Excel installed (it's pure Python, so it also works on Linux, for example).


Looks like xlrd may have been updated to read newer formats, but xlwt is still limited and cannot write .xlsx format spreadsheets, for example. openpyxl _can_ handle the newer formats, but only the newer formats.

It's a bit of a mess.


Working with Excel spreadsheets in general is a mess. There really isn't a single library that does everything well and if you want of be able to get all the information like formatting, formulas, etc... the only really complete solution is to use pywin32 and com which requires Excel and isn't particularly fast or easy either.


It is a bit of a mess; documentation often amounting to a few example calls. However, I've written a fairly well received data analysis package for a government department wherein the input and output had to be in excel spreadsheets (-sigh-), using only xlrd and xlwt, alongside numpy and scipy. It's a bit Rubegoldbergian (because.. excel), but I owe a lot to the authors of xlrd and xlwt (they've even supported named regions). So i'll be curious to see how xlwings implements the numerous methods available from xlrd/xlwt.



Kind of a serious difference... This is using Excel in Python. Python is not running inside Excel with this.


True, but you can trigger Python code from Excel using this so it seems reasonable enough.


I am guessing DataNitro isn't happy about this


Not at all, it's great to see more products helping people get away from VBA. xlwings is well-done and has some features we don't (e.g. Mac interop), and likewise we have some features they don't (e.g. UDF's).

(I'm one of the DataNitro founders.)


Have you considered Mac interop?


Yes, it's DataNitro but free and with Mac compatibility ...


Mac Excel compatibility is a big deal for us. Is this the only thing if its kind not rooted in Windows Excel?


Ha.


Awesome to see a FOSS solution for this.

Ben at DataNitro[1], a paid solution that's been around for a while, has been awesome in my occasional correspondence with him as a user. Anyone looking for xlwing's functionality with commercial support etc should give them a look.

[1] https://datanitro.com/


I know this doesn't support Linux and LibreOffice, but could it? Is there a technical way that would allow Python to sit alongside LibreOffice?


The documentation doesn't seem very mature, but Python is supposedly supported as a scripting/macro language, and it is also possible to interact with LO at the API level.


Very cool, just this week I was looking for something exactly like this to help automate reporting that absolutely has to be Excel. I wonder how well it plays with PivotTables and how it handles lots of data. I tried all the other python Excel libraries, and they either had memory errors with our data (It's not even that big, but it's much larger than we should be using Excel for), or were fine on memory, but would destroy PivotTables in the sheets they wrote.

For anyone running 64-bit Office and getting a compile error, to get it to run, just add the PtrSafe keyword to the Declare line like this,

    Private Declare PtrSafe Function [etc...]



To run a python function you do need to write one line of VBA code and you also need a vba file that comes with the package so although its not very much to add on the VBA side the title here seems a bit misleading.


From what I gather watching the video, you only need the xlwing.bas module and the supporting VBA code if you plan to invoke python from within the Excel file. If you're only manipulating an Excel file with Python externally, which was covered in the beginning of the video, it doesn't appear you need the module and VBA code.


I use it since today, and I'm very impressed. I work for a financial startups, with old fashioned non-tech guys who works only on excel. Xlwings sove me from VBA programmation for my algorithms ...


This may not be an issue where you work, but how do you get away from those who "want to see the formulas" and "want to be able to modify the spreadsheets"?


Python is very readable; those users can usually understand it if they try.


So this talks via IPC to Excel? How does it perform when working with huge data sets? Is the copying back and forth between the two processes a bigger bottleneck than the Python interpreter itself?


You can see this limitation in the video, when generating 100^2 random numbers takes a while.


Very nice indeed. My team do statistical work, usually delivered in Excel, and we're going to be focused on automation / productivity quite heavily in the next few months. I need to put this through its paces a bit more than putting 'Hello World' in A1 - but it raises the tantalising prospect of a bunch of R and VBA work turning into a bunch of R and Python work.


Well I did a quick look for F# and C# tools like this and found

http://blogs.msdn.com/b/fsharpteam/archive/2013/07/16/combin...

Interesting usage. Personally I try to use CSV and code in R. R has libraries where I can pull information in BUT I am always afraid of issues with a non-open file format.

In the Universe of everything is done in Word and I am hard pressed to get people to even use Excel these approaches might make my job easier.


This has been available for over a decade with the win32com client using COM. You can easily register python classes so that you can call them from Excel and easily call Excel from Python. There is nothing new in this video. I put a simple tutorial on how to do some of this last year. http://detroitcoder.me/2013/11/26/video-create-python-object...


I was looking for the 'secret' sauce used to get python being able to interact with Excel: http://showmedo.com/videotutorials/video?name=2190010&fromSe...

However, I still haven't looked how the reverse is done (by xlwings.bas in Excel). Is it that XLWings also implments a COM server from which VBA can run Python?


The source is on GitHub...

https://github.com/ZoomerAnalytics/xlwings

The magic you're looking for is here:

https://github.com/ZoomerAnalytics/xlwings/blob/master/xlwin...

Surprisingly simple actually.


I was expecting something a bit more complicated, but simplicity usually is the best way to go...


I recently compared various ways to automate Excel with python: http://leontius.net/2014/excel-data-analysis-python/

Xlwings is something I did not know before writing the post, but it does look great and I would love to try it as soon as possible. It basically has all the advantages of Datanitro but it is free.


Can anyone compare this with PyXLL[1]? It does require add-ins, but xlwings requires a VBA module. From the instructions, importing the two seem about equally complicated.

I am more curious about technical differences than the difference in license (I know that PyXLL is closed source, while XLWings is F/OSS, BSD-licensed).

[1] https://www.pyxll.com/


PyXLL embeds python, and so runs python in-process (that is, in the same process as Excel, without spawning a child process).

PyXLL fully supports UDFs (user defined functions) in a fast, efficient way. Calling a pyxll function (UDF written in python) calls into the in-process python interpreter and so there is no cost of starting up a new python process or calling into an out of process COM server.

Excel and Python types are converted to and from each other in optimized C code, depending on the argument and return types specified for the function (UDF). The type conversion is user extensible and so can handle automatically converting Excel ranges to and from pandas DataFrames (just one example).

The full Excel API is accessible from pyxll using the well known and established win32com package.

PyXLL works with Python 2 and 3, 32 bit and 64 bit (all versions from python 2.3 and 3.4).

PyXLL supports newer features of recent Excel releases like multi-threading and asynchronous functions, while maintaining compatibility with older versions.

PyXLL can expose python functions as Excel macros and menus as well as UDFs.

If you want to script Excel from a process outside of excel (eg IPython notebook) you can use win32com very easily without the need for any other package. But, if you want to have a python interpreter running inside Excel to be able to write performant UDFs as an alternative to writing VBA or C addins as far as I'm aware PyXLL is still the only package capable of doing that[1].

PyXLL has commercial support and is partnered with a major python training and consultancy company, Enthought[2].

[1]DataNitro claims to be able to do with, but according to their docs you are limited to only being able to run one Excel instance at a time, which for most real world users is too limiting in my experience.

[2]https://www.enthought.com/products/pyxll/


It doesn't really make much sense to compare xlwings with PyXLL, as they perform a slightly different function and actually could be used together.

PyXLL is more comparable to ExcelPython[1], which is open source does a similar thing (enables you to write UDFs in Python). ExcelPython hosts the Python process outside the Excel process but it

a) does not restart the Python process with every UDF therefore has good performance

b) does not require any registration of COM servers - therefore doesn't require admin rights

c) does not require an add-in (there is an add-in, but it just facilitates development, it's not actually required to use the resulting workbook).

[1] https://github.com/ericremoreynolds/excelpython


So do you need excel to be installed to use this? Currently I use xlwt and read the files with google docs.


Yes...you need Excel installed (on Windows or Mac) as this is a package for interaction/programming with open files, not just reading/writing files.


Very nice although the real pain of programming against Office APIs isn't the language but the Office object models themselves, which seem to have been invented in hell by multiple demons that weren't communicating with each other.


How does the automation of Excel work on the Mac since there is no COM on mac? Is there some Applescript alternative ?


Office apps on the Mac are automated using Scripting Bridge [1]. Applescript is one way to drive a scriptable app, but Python and other languages can also use the bridge. I built a substantial Word addin on the Mac in C# on Mono, using the Scripting Bridge.

[1] https://developer.apple.com/library/mac/documentation/Cocoa/...


Not sure if I see an actual use case for this since we live in a world that has Pandas, but I'm impressed.


Would you rather edit a messy dataset in pandas or in Excel? What about automatically generating reports? Going to give managers your ipython printout?


with http://www.pyxll.com you can have functions that take and return pandas dataframes and series, see https://github.com/pyxll/pyxll-examples/blob/master/customty...


We live in a world where Excel has several tens of millions more users than Pandas.


I wonder how this compares to datanitro


But couldn't you do this with win32com.client already?


Yes.

In the past, I've used Python to automate Excel, Outlook, and some non-MS products like Catia.

Honestly, I don't see the advantage here. FOSS is nice, and OSX support is nice, but if you're interacting with MS products, FOSS is not a concern anyway, and it's probably running on Windows...

What would be really nice is if it supported multiple backends, so I could run the same Python code with Excel, Apple's Numbers, Google spreadsheet or Gnumeric, and have it work the same on all of them.


http://pyxll.com/ already does all of this, better

In particular it's in-process and does UDFs.

Admittedly it isn't open-source but it is free for non-commercial use.


Except for the Windows only factor. Being able to code for all versions of Excel is useful.


So could you do UDFs with this?


No but have a look at https://github.com/ericremoreynolds/excelpython for UDFs.


Cool. The problem with all these big "VBA replacements" (and don't get me wrong, I've done both VBA and VSTO and the latter is more pleasant to work with) is that they never seem to provide replacements for UDFs, which is probably the #1 reason you'd want to use VBA in the first place.


Excel-DNA is a nice open source project that lets you write XLLs in .NET http://exceldna.codeplex.com


also http://www.pyxll.com if you want a solution that embeds python and doesn't use a COM server


Very interesting.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: