Chaos Reigns Within: 2010

Friday, 31 December 2010

Alteryx 2011 - New Features - Connection Progress

I don't know how many Alteryx users are familiar with Microsoft's SQL Server Integration Services (SSIS)? It is Mircosoft's attempt at an Alteryx Engine type tool which I could write a whole article about. But I'm not going to risk boring you with what I see as SSIS's shortcomings. All I will say is I was very disappointed with SSIS when it was released and when I started using Alteryx I never looked back.

However this time last year there were two features of SSIS which I thought Alteryx lacked.

The first was the ability to hide away sections of the module in collapsible sections. At Extend 2010 Ned demonstrated this same functionality in Alteryx courtesy of Tool Containers. After that demo I mentioned to Ned that now there was only one thing which I thought was better about SSIS than Alteryx, namely the displaying of record counts on data connections.

Well now in Alteryx 2011 we have connection progress in real time as data flows through them. These are a great new feature and allow users to:

See progress of the module more easily
See mistakes in their module before it completes processing
Easily check joins and filters are behaving as expected without needing to cross reference to the log
Identify inefficiencies in their modules

The below video shows them in action:

Tuesday, 28 December 2010

Alteryx 2011 - New Features - Keyboard Shortcuts

This was going to be a post on the new keyboard shortcuts available in Alteryx 2011, but as I was writing it I found that there were already some keyboard shortcuts which I didn't know about in Alteryx 5.0. So I thought I would make this a complete list of all the keyboard shortcuts available in Alteryx (for those of you who like that kind of thing...)

Ctrl+N - New Module
Ctrl+O - Open Module
Crtl+S - Save Module
Crtl+R - Run Module/Cancel Module

F1 - Help on what is selected
F5 - Refresh Config

Crtl+Z - Undo
Ctrl+Y - Redo

Ctrl+X - Cut
Ctrl+C - Copy
Ctrl+V - Paste

Del - Delete what is selected

Ctrl+F - Find Tool

Ctrl+Tab - Switch between open modules

Ctrl+Mouse Scroll Wheel - Zoom in/out

New in Alteryx 2011

Ctrl+Alt+B - Show Toolbar
Ctrl+Alt+T - Show Toolbox
Ctrl+Alt+V - Show Overview
Ctrl+Alt+O - Show Output Window
Ctrl+Alt+P - Show Properties Window

Please let me know if there are any more that I have missed off the list.

Wednesday, 22 December 2010

Alteryx 2011 Released

A slightly belated post due to my busy life at the moment, but no less exciting for the delay. Alteryx 2011 is now released! Check out the offical blog here.

It is fully rebranded with the new Alteryx brand colors and logo which look great:

It comes with a whole host of exciting new features which are going to enable users to do more with Alteryx and do things more easily!

Some of the new features include (follow the links for more details):

Iterative macros
Tool containers
Record counts on connections
Enhanced macro and wizard editing
New charting tool with new chart types including a box and whisker chart.
Treeview control for macros and wizards
KML read support
Solocast tools
Document Constants (I can see these transforming how macros and wizards are built)
Module packager

Looks like it's going to be an exciting release! Look out for blog posts on some of the new features here when I have a bit more time.

Thursday, 11 November 2010

Inspired Ingenuity or Overly Engineered? (Part 2)

So I think my solution was probably inspired somewhat by my gantt chart module. I was trying to visualise the problem and realised I could actually turn the problem into a spatial one and then use Alteryx's spatial tools to solve it.

So here it is:

Imagine a rectangle for each task sitting on a time line (like a gantt chart but with all the tasks on one row).

If we make the length of the rectangles proportional to the length of the tasks, then we can say the area of the rectangles represents the time taken for that task.

Now imagine another rectangle on the same time line representing the entire reporting period. The area of this rectangle (in green below) represents the total time available.

Now if we cut the smaller task rectangles from the large reporting period rectangle, then the area remaining of the larger rectangle is the unused time.

With the spatial tools of Alteryx, cutting one set of polygons from another is easy. So that's what I did:

You can download the module here:

I think you'll agree a rather unique way of using the spatial capabilities of Alteryx in a way they were never designed for.

So where does that leave my Occam's razor? Well Ned has already sent me his solution and I have to say it doesn't involve any spatial tools and if I'd manage to think about the generate rows in the right way it would have been a quicker solution to implement than mine. My module does run more quickly (0.9 seconds against 2.0 seconds on my machine) but the report only gets run once a month in the middle of the night so speed isn't really a necessity...

You can download Ned's solution here:

I think this one is a simpler solution than mine and for somebody opening the module to make changes in the future somewhat less daunting. So, without the necessity of a fast runtime, by applying Occam's razor I think Ned's is the better solution.

However having said that I hope that my solution has been interesting and might inspire some future spatial solutions to non-spatial problems.

Wednesday, 10 November 2010

Inspired Ingenuity or Overly Engineered?

Inspiring Ingenuity is one Alteryx's taglines and I think the module I will cover in this post is a great example of a way to solve a problem using the tools in a way they were never designed for. However (and hence the title of the post) I wonder whether my solution is clever for the sake of being clever and in reality there is a simpler solution which I have just overlooked.

I believe strongly in an "Occam's Razor" approach to problem solving and Alteryx modules are no exception. Occam's Razor is stated in a number of forms and really relates to scientific hypotheses. Wikipedia says

The principle is popularly summarized as "the simplest explanation is usually the correct one."

I've also heard it stated as "The simplest solution is the best solution". Now this isn't quite the version I subscribe to. Consider the example in Alteryx of needing to match against a large lookup file of several million rows on a regular basis. Well the simplest solution would be two input tools, a join and browse. But is that the best solution? Well if the time it takes for the module to run is important to you, then no. A better solution would be to load your master file into a Calgary dataset, index the columns you need to match on and then rebuild your module with an input tool, Calgary append and browse. More work to set up, but in the end a better solution.

So where does that leave our Occam's razor. Well continue down the Wikipedia article and we have this version

"pluralitas non est ponenda sine necessitate" ("plurality should not be posited without necessity")

in other words complexity in a solution should only be there if it is necessary. So in our above example we introduce the extra complexity of loading our data into Calgary to make our solution run faster.

Which brings me to a question about my module. Is the complexity of my solution necessary or can somebody think of a simpler solution?

First the problem:

We have an Alteryx server which can run up to four Alteryx modules simultaneously. Every time a module is run the start and end time of that module is logged in a SQL server database, which is then used for looking at how the server is being utilized (using Alteryx to generate the reports of course).

So one of the metrics we are interested in is server utilization: how much of the processing time of the server did we use last month? How much more capacity have we got until we need to get a second server?

And the difficult question:

How much time is the server completely idle?

If the tasks were non overlapping this would be as simple as calculating the time for each of the jobs and subtracting the total of these from the total time for the reporting period. But the tasks do overlap...

I'm going to leave you to think about possible solutions and post my proposed solution in the next post. If you want to have a go then I've provided my example data below to see if you can match my answer.

Monday, 1 November 2010

SAS Transpose

Last week I visited a client who had been trialling Alteryx. The team who had been conducting the trial were SAS analysts and were particularly impressed by Alteryx's spatial capabilities. At the end of the meeting I asked if there was anything which they thought they could do in SAS and not in Alteryx. The reply was the only thing was that the transpose tool didn't really to seem to behave as they expected given their experience of SAS's Proc Transpose.

So this weekend I thought I would take a look at what the differences might be. Now it has been a long time since I have written anything in SAS, so first step was a quick google to find out how the Proc Transpose tool worked. I came across this excellent tutorial:

http://support.sas.com/resources/papers/proceedings09/060-2009.pdf

I decided I would recreate the lessons from this document using Alteryx. Upon starting I realised why they had been having trouble, the Proc Transpose function of SAS is closer to an Alteryx cross tab than transpose, but in actual fact to reproduce the results exactly Alteryx needs a transpose and cross tab tool.

Just to be clear it is not that Alteryx tools aren't as efficient as SAS they are just different and I have added an extra example to each lesson which I have titled "The Alteryx Way" which shows what I think an Alteryx analyst would do to transform the example data.

I could have tried to turn my modules into a SAS Proc Transpose macro for Alteryx however I don't believe that would be the most efficient way of doing things in Alteryx. I am not presenting these examples as the best way of doing things in Alteryx; more of a way for SAS analysts to get a handle on the differences between the two tools. I think if you are going to program efficiently in any language you need to think about your solution in terms of how that language works.

My example modules can be downloaded below:

As always feel free to post any comments/questions here or e-mail me directly.

Tuesday, 19 October 2010

Alteryx Team Blog

The Alteryx team blog has a new home and name. Check out the "Alteryx Engine Works" here:

http://blogs.alteryx.com/engineworks/default.aspx

Comments are also working at last, so hopefully we'll see some questions from the community on the new features of Alteryx 6.0 as the team blog about them. First post on Document Constants is already there and looks like a powerful new feature.

For anyone who is subscribed to my google RSS Alteryx bundle then I have updated this to include the new "Alteryx All Blogs RSS Feed". For anyone who isn't subscribed then the below link gives you access to all of the Alteryx RSS feeds in one place:

https://www.google.com/reader/bundle/user/08686630194161749278/bundle/Alteryx

On a related note the UK Alteryx Blog now has e-mail subscriptions available courtesy of feedburner. Blog posts delivered directly to your inbox. Sign up at the below link:

Subscribe to The UK Alteryx Blog by Email

Thursday, 7 October 2010

Palette Importer and Exporter

Seeing this post http://srcgroup.extendthereach.com/showthread.php?t=1414 by Margarita on the Alteryx forum reminded me of an Alteryx module I had started a while ago to try and solve a problem with theming on a map. The solution didn't quite work out as I had hoped, but I believe the module may still be of use.

So here is my palette importer module for Alteryx:

It allows you to add palettes from Expression Design Swatch xml file format (or entered manually in a text input) to your Alteryx reportsettings.xml file. The reason I chose this format is that I found this site http://www.colourlovers.com/palettes which has over a million palettes to download in this format (though frustratingly they all only seem to have a maximum of five colors, if anyone knows of any better sites please let me know). You need to sign up to download them, but then just hit the design button under download options:

The module will pick up any number of palette xml files from the same folder that it is in. I have included two as samples. Once the module has run then you will need to restart Alteryx in order to be able to see the new palettes.

As Margarita suggests in her post please backup ReportSettings.xml before you start using this. It is tested, but just in case there are any problems you will want to be able to restore your backup. The other thing to mention is that if you import a palette with the same name as an existing one then the existing one will be overwritten.

It allows you to relatively quickly find a palette you like; import it into Alteryx and use it:

I have also put together this module:

which exports your palettes from reportsettings.xml to Expression Design Swatch xml file format files. Ideal if you want to move a palette from one machine to another or share it on the web.

As always any comments/suggestions please let me know.

Friday, 10 September 2010

Gantt Charts

The below recently appeared on the Alteryx forum:

Hi,
I have a an Alteryx module that reports project data and milestones for various projects. What is missing is a visual representation of the milestones and I was wondering if anyone had used the Alteryx reporting tools to create a Gantt Chart type view of milestones/tasks.
Thanks,
Gavin

Hello Gavin,

We do not currently offer a Gantt Chart option in the Chart tool. Thank you for expressing an interest in this feature, we will look into it as a possibility for the future.

-Brett

While Brett's reply is quite correct; there is nothing to stop you creating your own Gantt Chart using the tools currently available in Alteryx 5.0. If anyone remembers the techniques which were used to create a polar graph or a chess board which used the mapping tool as a custom renderer, then a similar technique could be used to create a Gantt chart.

So I've put together a sample module which shows a possible method for doing this. Which produces the below:

From the following data table:

The module uses a couple of interesting techinques including using the Generate Rows tool to create a data row for every day between two date ranges and generating the spatial objects needed to build up the Gantt charts by dynamically creating json objects (Thanks Ned).

The module looks like this:

and can be downloaded from here:

The things that need doing next with this are to take a look at the scaling aspect (it currently doesn't look so good for timeframes longer than a month) and turn the whole thing into a macro which outputs a report .

As always any questions/comments please post below or feel free to contact me directly.

Friday, 27 August 2010

Add Total Row and Column

Ever needed to add Row and Column totals to your data tables in Alteryx? This macro makes it easy!

From this:

To this:

With this:

Download the macro here:

Monday, 12 July 2010

Tips and Tricks - The Browse Tool

The humble browse tool is probably the most used tool in the Alteryx toolbox, it is a great way to take a look at your data at any point in your module; but how many of you use more than its default settings? Score yourself out of ten for how many of the following tips and tricks you already knew about the browse tool.

1. You can select which columns you want to look at and the module will remember these selections:

Great for when you need to check two fields against each other.

2. The browse tool will tell you if a data cell has leading spaces; trailing spaces or embedded newlines by displaying a little red triangle in the corner of the cell. Hovering over the red triangle displays the message:

3. The "Cell Viewer" button lets you examine the contents of an individual cell and with "Show Whitespace"

turned on shows you all of those leading/trailing spaces and embedded newlines:

4. The cell viewer pane also gives information on fields and rows by clicking on the header/row label cells:

5. It is possible to split the browse tool vertically or horizontally. Great for comparing records in different rows of the file:

or viewing spatial data in map and tabular form:

6. In the split map/data table view. Selecting a data row in the table will highlight that spatial object on the map and selecting a spatial object on the map will highlight the corresponding data row.

7. Need to compare records from two different browses in the same module? No problem! The new window button will create a new floating browse window of the current data; then switch the original properties window to the other browse and compare away.

8. You can save records (all or a selection) directly from the browse tool. Useful when you have spotted an issue and want to send the records to someone or do some further analysis elsewhere.

9. You can copy records from the browse tool and paste them directly into a text input tool. Really useful if you need to take a few records into a second module to some extra analysis:

10. The browse tool will display up to 1 billion rows. After that it only displays alternate rows, but maintains the correct row numbers in the first column.

So how many did you score out of ten? Post your scores in the comments below and let me know if you think of any others I have missed.

Friday, 9 July 2010

Alteryx Customer Advisory e-Forum (Part 2)

It was great to see such a good attendance for this; I thought it was a really interesting event. Thanks to all the other presenters for sharing what they are doing with Alteryx.

As promised here are my slides:

Looking forward to the next one!

Tuesday, 29 June 2010

Alteryx Customer Advisory e-Forum

I'm looking forward to the first Alteryx e-Forum on Thursday. I'm going to be talking about a recent project I have completed at Experian involving a server based setup of Alteryx; a custom queuing application using the API and a terabyte of Calgary databases.

For anyone who hasn't already seen the details, they are here. Hope to "see" you all there!

Friday, 28 May 2010

FormulaAddIn (or 3Rd Floor Flat part 2)

Well the next question from my 3Rd Floor Flat user was "So do I need to use that big long formula every time I want to title case something?" and of course for a common task for a user that's not ideal. First thought was to wrap it in a macro, but that feels like overkill for a simple task. Then I remembered something I had seen in the Software Developer's Kit - FormulaAddIns!

Did you know it is possible to create your own functions to appear in the formula tool? And what's more for ones which can be built using existing functions it's really quite easy.

For the lack of a better name I've called my new function TITLECASE2 and if you create the folder

C:\Program Files\SRC\Alteryx5.0\RuntimeData\FormulaAddIn\

and save the below xml file into it,

then restart Alteryx you will find TITLECASE2 appears right there in the function list under TITLECASE; giving me the elegant solution I was looking for. I think the xml is fairly self explanatory, but let me know if you have any questions on it.

Thursday, 27 May 2010

3Rd Floor Flat

Should TitleCase("3rd floor flat") resolve to 3Rd Floor Flat? Alteryx thinks it does, I'm not so sure... But the question I had from a user on Friday was how do I turn "3rd floor flat" into "3rd Floor Flat"? Which of course is what you would want on an address envelope.

My first solution involved using a regex tool to pull the first letters out then the dynamic formula macro to build the string back together captilasing the first letters. I wasn't overly happy with the solution though (far too many tools for a simple task) so after some more thinking have come up with the following formula:

REGEX_REPLACE(TitleCase(REGEX_REPLACE([Field1], "(\d)", "$1X")), "(\d)X", "$1")

(which I think should always work...)

Anyone think they can come up with a more elegant solution?

Friday, 14 May 2010

Dynamic Formula

A recent post on the Alteryx forum from Stephs (here) got me thinking: wouldn't it be useful if there was a Dynamic formula tool? I think this would solve Stephs problem in that they could build formulas (with the formula tool) to compare the fields and use an if statement to choose between the file and database version and then push the data back to the database (that's if I've understood the problem correctly, if not I've already thought of some other great uses for this tool.).

Well the motto of any advanced user of Alteryx should be "If you want a new tool in Alteryx, build it!" So here it is:

Just drop all the files in the zip into your macros folder. Usually something like C:\Program Files\SRC\Alteryx5.0\RuntimeData\Macros\.

The two interesting features of this macro are :

It's the first time I have made use of the macro action "Update Raw XML", which worked very nicely.
I have used the batch macro tool in a way it wasn't particularly designed for. I used it to run a single iteration which you might think defeats the point of the 'batch' feature, but it allows you to take data from the data stream and insert into the properties of the formula tool. Using this technique, any Alteryx tool could be made to have dynamic properties! I'm sure there must be some more uses out there for this technique.

Now for one of those other uses I've thought of. Remember this query from the Alteryx Team blog here?

Hi Tara,
I have a table with 60+ fields. I need to create an extra field that is the total of all the 60+ fields. So far, I've been doing this with the formula tool and writing an expression as follows "[01]+[02]+[03]+[04]+[05]+[06]+[07]+[08]+[09]+[10]+[11]+[12]+[13]+[14]+[15]….. etc". Just wondering if there is a quicker way to do this? If not, can you add as a suggestion for a future release?
Thanks,

Well with the dynamic formula tool you can write this formula dynamically! Plus it becomes very easy to also create subtotals based of the field names. Check out the below module showing how this could work:

If anyone else thinks of some other uses for this tool then add a comment here or drop me a note.

Wednesday, 12 May 2010

Street View Macro

Following on from Ron's great post about using HTMLPassThrough here. I would like to share with you all the Street View macro which came out of the same brainstorm in the solution centre at Extend 2010.

What it does is allow you to insert a Google Street View iframe into an html report from a given set of x, y co-ordinates without needing to know about HTMLPassThrough or indeed worrying about it. It uses a couple of interesting techniques including HTMLPassThrough and batch macros. I'm not going to go into huge amounts of details of how it works here, the modules are commented so just let me know if you have any specific questions.

So here it is:

Just drop all the files in the zip into your macros folder. Usually something like C:\Program Files\SRC\Alteryx5.0\RuntimeData\Macros\.

I've also created an amended version of the presidents report sample here:

Extract this one to here to see it work: C:\Program Files\SRC\Alteryx5.0\RuntimeData\Samples\Reporting Samples\

Only caveats I'm going to throw in for this one is that it is html reports only (it's an iframe after all) and if you are going to use it make sure you do so within Google's licence agreement.

Thursday, 15 April 2010

A Challenge

For those of you who thought the Grand Prix sounded like fun or enjoy a puzzle, then I have a challenge for you. Each week New Scientist pose an enigma; last week's can be found here:

http://www.newscientist.com/article/mg20627557.200-enigma-number-1590.html

My challenge is see how fast you can solve the enigma using Alteryx! In the same way as in the Grand Prix allow yourself thinking time beforehand and start the clock when you drag on the first tool, stop it when you have a solution. This one took me 15 minutes and 45 seconds. Anyone think they can solve it faster?

My solution can be found here:

Extend 2010

I would like to start by thanking everyone who was involved with the organisation and running of Extend 2010. I thought it was a great event and very well organised. I met lots of interesting people; got to see what others were doing with Alteryx and picked up a few new tips and tricks along the way. This article is my summary of the event.

Day 1 - So after registration and lunch the first sessions I attended were the Alteryx Product Training. These were the first part of a series of sessions totalling eight hours covering a range of Alteryx topics. The first two sessions covered Allocate and Solocast before moving on to the basics of Alteryx. Not a lot new for me in these sessions, but giving demo's and training in Alteryx is something I do in my role at Experian, so I found it useful to see someone else's approach to training.

The final session on day 1 was the Alteryx User Group Forum. I thought this was a great session to meet other Alteryx users and see what other people are doing with Alteryx. Ron House kicked off the session talking about his blog and had kindly given me five minutes to talk about mine. It was a great opportunity to talk about what I've been doing and why I think interaction among the Alteryx community is important for all of us. We then had a number of speakers from different companies talking about how they are using Alteryx; an excellent opportunity to get new ideas to take back to our own work.

Day 2 - There were a few different streams running on day 2 including the continuation of the Alteryx Product Training and the DemographicsNow User group, but the one which I attended was the Developer Summit. This was a series of presentation's by Alteryx employees on a range of interesting topics aimed at the developer user of Alteryx. All of the material from the presentations can be found on the Alteryx team blog here.

Session 1 - This was an overview by Ned of the tools in the Developer Tools section of the Alteryx toolbox, essential kit for anyone wanting to build macros, wizards or modules which begin to push the limits of what Alteryx can do. I'd seen and used all of these tools before but was great to get some clarification of exactly what they do behind the scenes.

Session 2 - Another session by Ned this time on how to build Alteryx macros. A good introduction on how to get started.

Session 3 - This session was run by Rob and covered the new feature of Alteryx 5.0: batch macros. This is a feature I was crying out for while I was using Alteryx 4.0 and so have used a great deal prior to the conference. My summary of the feature is that it starts to let you do loops in Alteryx. Rob presented an interesting demonstration of how they could be used and a real world application. What I found interesting was that relatively few people in the audience seemed to be using this powerful feature (look out for a future blog post on this one!).

Session 4 - Now we were starting to get into real developer realms with Linda presenting how to run Alteryx wizards from C#. There's not many presentations you will see with Visual Studio and C# code up on the screen!

Session 5 - This was a taster by Iain on the new Alteryx Web Services. This looks like it will be very exciting and I'm looking forward to it being released and hopefully being able to have a play with it.

After lunch there was a 'bonus' session with Ned in the solution centre on using the Alteryx SDK to build your own Alteryx tools. Another very exciting topic which opens up all kinds of possibilities for using Alteryx (another topic to look out for a blog post on).

Then after lunch we had the Alteryx Grand Prix in which I was a 'driver' and was the reason I had managed to get a ticket to the conference. It's effectively "build Alteryx modules against the clock" and try and be faster than the other nine competitors. It was a lot of fun and maybe got my heart racing more than getting up to talk at the user group forum the day before! Look out for a second article covering this in more detail later.

Day 3 - The day kicked off with the President's Opening from Dean Stoecker where the company name change was announced (see my previous post); followed by an amazingly inspiration talk by Erik Weihenmayer the only blind man to have climbed the highest peak on every continent, including of course Mount Everest. The rest of the day had a number of speakers from different companies talking about how they were using Alteryx in their businesses. Though I must admit as a developer I didn't attend all of these, instead spending a fair bit of time in the solutions center talking with the Alteryx product developers. Something which I found very useful, so thanks everyone for your time. The other interesting session in the day was a talk by Amy Holland on the future of Alteryx; I'm looking forward to getting my hands on Alteryx 6 already!

The day concluded with the awards dinner and then drinks into the evening, a great end to a great conference. I found the whole event really useful and motivational, especially being able to meet other developers and exchange ideas and techniques. I hope I can get there again next year!

Sunday, 28 March 2010

The King is dead. Long live the King.

At the Extend 2010 conference on Wednesday Dean Stoecker, then president of SRC, announced that SRC going forwards will just be called Alteryx. So no more SRC, no more www.ExtendTheReach.com, just Alteryx and www.Alteryx.com. The full announcement can be read here, but from a personal viewpoint I think this is a great move by the company and I think the new branding looks amazing. I'm excited about what I'm sure is going to be a great year for the company and product.

Extend 2010 before and after the re-branding:

Look out for a more detailed report from the conference later in the week and some blog posts on a couple of new macros written collaboratively at the conference.

Monday, 22 March 2010

RSS Readers (or how to keep up with all the Alteryx news in one place)

It might just be me who is behind the times, but until the end of last year I had never used a RSS reader. For those of you who don't know what a RSS reader is (or like me it is something you have vaguely heard about but never quite known what the point of one is), it is basically a piece of software (web or windows based) which pulls a number of news feeds together into one place. Still confused? Then click on the below link:

https://www.google.com/reader/bundle/user/08686630194161749278/bundle/Alteryx

This is a bundle of Alteryx feeds created using Google's RSS reader (cleverly named Google Reader). What you see on that page is all the recent posts from my blog; Ron's blog; the official SRC Alteryx blog and questions posted on the SRC forums. So all you need to do is subscribe to that bundle and you can keep up with all the Alteryx hints, tips and news in one place (http://www.google.com/reader). And what's more if new Alteryx news feeds become available in the future (or if anyone knows of any now that I am missing) then I will add them to that bundle and as a subscriber you will automatically start receiving those feeds as well.

So what are you waiting for? Subscribe now!

Reading KML Files

On the SRC Alteryx forum this week ashokbhatt asked if Alteryx can read and write KML files (http://srcgroup.extendthereach.com/showthread.php?t=1287). Co-incidentally I also received an e-mail from a colleague on Friday asking about Alteryx reading GML files. While the answer to both of these questions is that there is no in-built support for reading or writing these file formats, they are both of course xml documents which can be digested by Alteryx using techniques similar to what I used in part 3 of my posts on RegEx here.

The below module shows how a simple placemark file could be read using Alteryx:

and this one how a simple placemark file could be written:

Obviously these only pick out a very specific section of the kml file, but hopefully will give some ideas of what could be done.

The difficultly in writing a generic kml reader is that kml is such a rich data format (http://code.google.com/apis/kml/documentation/kmlreference.html) that some of the data stored in it would have little meaning in Alteryx.

Saturday, 20 March 2010

Alteryx Plays Chess

"OK the first thing I have to admit before I start and anyone gets too excited: is my wizard doesn't actually play chess (I haven't got the time to write a chess engine at the moment ;-) ) What it does do is let two users play chess against each other. But with that caveat said, on with the post..."

I'm not quite sure how this particular wizard came into being. I had just read Bill Flanagan's post (here) on using the mapping tool to create a polar graph, which I thought was great; and wanted to have a go myself at creating something custom using the mapping tool. Turned out I couldn't really think of any pressing business problems that I could put this application of the mapping tool to use on; so just for something to experiment with the concept, I thought I would render a chess board and pieces. So one lunchtime using the Alteryx spatial tools and some custom chess piece icons I got from the net, I built a simple module which renders a chess board and pieces.

Once I'd got that far I started wondering how difficult would it be to make a wizard which allowed a user to move those pieces around the board? Turns out it didn't need many more tools to allow a user to enter a start and end co-ordinate in a wizard front end and get the piece to move between those locations. Which gives us a wizard which allows two people to play chess against each other!

What I haven't done is program Alteryx with the rules of the game: although I think it would be possible to program that logic within Alteryx, that one is going to have to wait until I have more time. Then the next step of getting the wizard to actually make its own moves in response I think is going to be beyond the abilities of Alteryx (although you could always run an external engine via the command tool...)

But having said that the wizard as it stands is a fun demonstration of what can be done using the mapping tool. It is available for download here:

Monday, 15 March 2010

\0 delimited

Ever wondered what the point of \0 delimited files would be? Ever wished you could output delimited files with no quote qualifiers? I had done both and having found the answers thought I would share them here.

Last week I ran into a problem I have had on a few occasions over the years I've worked with Alteryx. This time round some of the tricks I had used before weren't going to work so I decided to post the problem on the SRC forum. A big thanks to Margarita at SRC, as the same day I got a reply which let me build a working solution!

My problem is a seemingly simple one to do with the delimited output file which Alteryx produces. What I wanted to output was a delimited file with a pipe delimiter and no qualifiers. Alteryx does pipe delimited no problem, but when you come to options on qualifiers all you have is the tick box "Always qualify fields" which is either true or false. But there is no option to "Never qualify fields" which is what I want.

So as you might have guessed from the title of the post the solution comes from using \0 as the delimiter. What \0 does is produce an output file with no delimiters and no qualifiers, which at first glance might not seem so useful. But what it allows you to do is build up your output file exactly as you want it using the functionality of Alteryx.

For example if you have a dataset with three fields some of which contain quotes and you want to output a pipe delimited, non-qualified file; then you can just use the formula tool to build a new field like so:

[Field1] + "|" + [Field2] + "|" + [Field3]

and then a select to output only that new field to a \0 delimited file.

The below module shows a few techniques of how you could use this:

Wednesday, 3 March 2010

RegEx and Alteryx (Part 3)

For those of you who haven't read the comments on part 2, Ned Harding has posted some extra information on how the Tokenise Method works here. This shows how you can use the tool to split a field on a RegEx matched delimiter as the help file describes.

On with part 3...

Replace Output Method

The last and I think possibly the most powerful of the four methods in the RegEx tool. For this method I am going to leave the e-mail example and take a look at something I seem to be needing to do more and more recently: xml processing. So many applications today are outputting xml files or requiring xml input, that being able to process it efficiently needs to be a staple tool of any data developer.

Before I began to learn about RegEx I struggled to process xml using Alteryx as from what I have seen, there are no specific tools for reading or writing it.

(this the stage that somebody says to me: "why didn't you just use the xxx tool?" and I say because I didn't know it existed. Oh the time I could have saved if I'd only known about that multi-field formula tool a year earlier...)

I was using complex formulas to search for parts of xml strings and then more formulas to extract the piece of data I wanted. With RegEx finding and extracting parts of xml suddenly becomes rather easy.

For this example I will use a piece of xml which I hope that anyone who has used Alteryx much will have at least seen, if not manipulated themselves: an Alteryx module. Yes, for those of you that weren't aware, when you save an Alteryx module it is saved as an xml document which, with due care, can be modified in any text editor. Great for those times when you need to rename about 100 fields using a file layout you have been supplied in Excel.

This example can be followed in the module RegEx_Example_Part3.yxmd downloadable from here:

The example module uses its own xml as an input and outputs a list of all of the tool ids in the module; along with their types and their x, y display positions. Run the module and take a look at the output browse, then add a new tool, save the module and notice how the new tool you added is now included in the output.

So how does it work? Well there are 3 stages to the module:

Get the xml into a single field and single row.
Use our RegEx replace to extract the required information from the xml.
Use Text to Columns tools and a select to format the output.

Now as this article is focusing on the RegEx tools I will leave you to work out how stages 1 and 3 work for yourselves and only explain stage 2. The module actually uses the RegEx tool with the replace output method 3 times:

The first use is to remove all of the carriage return line feeds from the data:

So \r\n is the Regular Expression for a carriage return line feed combination, we have specified that we will replace it with nothing and we have also specified that we will pass all the other unmatched text through to the output as well. I think this could be done using the standard formula REPLACE function, but I think you would need to use the CHARFROMINT function and know the unicodes for a carriage return and line feed, which isn't really any easier.

The second use is to remove all white space from between the xml tags:

This time we are searching for one or more white space between two > < characters and replacing the whole lot with ><. This is another example of something which I believe would be very difficult without RegEx; you could remove all spaces from the data easily enough, but that would remove spaces inside the tags which we wish to keep. You could also remove a fixed number of spaces between the ends of tags, but to remove an unknown number of spaces I think you would struggle.

The final use, having prepared the xml, is to extract the data elements from the xml corresponding to the Alteryx tools.

From looking at the data I identified that the block of data I was interested in always looked like this:

which I can quite easily write a regular expression to match like so:

<Node ToolID="(\d+)"><GuiSettings Plugin="([^"]+)"><Position x="(\d+)" y="(\d+)"

notice that I have four marked groups (the parts of the regular expression in parenthesis) representing the parts of the xml data I wish to extract. The power of a RegEx replace is that I can reference these named groups in the replacement text.

For example if I made my replacement text $1 then the whole expression would be replaced by 52 (the first marked group) in the above example.

What I choose for my replacement in this example is

$1¬$2¬$$3,$4$|

Which for the above example results in

52 ¬AlteryxBasePluginsGui.BrowseV2.BrowseV2¬$99, 779$|

This pulls out the 4 pieces of data I want in a nicely delimited format which I can then split into cells by using the text to columns tool, which is what happens in stage 3 of the module.

I think that the RegEx replace tool is possibly the most useful of the RegEx tools available in Alteryx and definitely worth spending the time getting to understand.

Other RegEx Functions

The last thing to complete our overview of the RegEx capabilities of Alteryx is to briefly mention the remaining two functions available in the formula tool.

REGEX_COUNTMATCHES takes a string and a regular expression and returns the number of times the regular expression matches the string.

REGEX_REPLACE works in a similar manor to the replace method of the RegEx tool, but within the formula tool. One exciting use of this is in the actions of macros and wizards, an example of which I plan to cover in a future post.

Conclusion

That concludes my overview of the RegEx capabilities of Alteryx, though my examples are just some starting ideas of what is possible. I hope I have inspired you to learn some basic Regex and next time you are struggling with a complex string manipulation think about using the RegEx toolset.

If you have any specific questions or comments on anything I have posted about then feel free to e-mail me or post a comment here.

Pages