Planet Fp-Syd

April 10, 2014

Manuel Chakravarty

As a spin-off from teaching programming to my 10-year-old son...



As a spin-off from teaching programming to my 10-year-old son and his friends, we have published a sprite and pixel art editor for the iPad, called BigPixel, which you can get from the App Store. (It has a feature set similar to that of the earlier Haskell version, but is much prettier!)

April 10, 2014 11:47 AM

March 06, 2014

Simon Horman

kexec-tools 2.0.6 Released

I have released version 2.0.6 of kexec-tools, the user-space portion of kexec, a soft-reboot and crash-dump facility of Linux.

This is a bugfix release to address a number of problems discovered in 2.0.5.

The code is available as a tarball here and in git here.

More information is available in the announcement email.

March 06, 2014 08:44 AM

February 06, 2014

Simon Horman

Perdition 2.1 Released

[perdition]

I have released version 2.1 of Perdition.

This is a bugfix release.

Key changes since 2.0:

  • Apply configured ciphersuite preferences to outbound connections. This is a fix for CVE-2013-4584.
  • Use 1.0 as the managesieve version.

The code and related libraries are available as tarballs here. More information is available in the announcement email. More information about perdition here.

February 06, 2014 07:51 AM

February 05, 2014

Simon Horman

kexec-tools 2.0.5 Released

I have released version 2.0.5 of kexec-tools, the user-space portion of kexec, a soft-reboot and crash-dump facility of Linux.

The code is available as a tarball here and in git here.

More information is available in the announcement email.

February 05, 2014 02:03 AM

January 07, 2014

Erik de Castro Lopo

When QuickCheck Fails Me

This is an old trick I picked up from a colleague over a decade ago and have re-invented or re-remembered a number of times since.

When implementing complicated, performance-critical algorithms and things don't work immediately, the best idea is to drop back to the old formula of:

  • Make it compile.
  • Make it correct.
  • Make it fast.

Often that means implementing slow, naive versions of parts of the algorithm first and then one-by-one replacing the slow versions with fast versions. For a given function of two inputs, this might give me two functions with identical type signatures:


   functionSlow :: A -> B -> C
   functionFast :: A -> B -> C

that can be used interchangeably.

When it comes to implementing the fast versions, the slow versions can be used to check the correctness of the fast versions using a simple QuickCheck property like:


   \ a b -> functionFast a b == functionSlow a b

This property basically just asks QuickCheck to generate (a, b) pairs, pass them to both functions, and compare the outputs.
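
A complete, runnable version of this pattern might look like the following sketch, where the summing functions are hypothetical stand-ins for a real slow/fast pair:


   import Data.List (foldl')
   import Test.QuickCheck (quickCheck)

   -- Hypothetical stand-ins for a real slow/fast pair of functions.
   functionSlow :: [Int] -> Int -> Int
   functionSlow xs n = sum (map (* n) xs)     -- naive reference version

   functionFast :: [Int] -> Int -> Int
   functionFast xs n = n * foldl' (+) 0 xs    -- optimised version under test

   main :: IO ()
   main = quickCheck (\ a b -> functionFast a b == functionSlow a b)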

With something like this, QuickCheck usually finds the corner cases really quickly. Except for when it doesn't. QuickCheck uses a random number generator to generate inputs to the function under test. If, for instance, you have a function that takes a floating point number and only behaves incorrectly when that input is, say, exactly 10.3, the chances of QuickCheck generating exactly 10.3 and hitting the bug are very small.

For exactly this reason, using the technique above sometimes doesn't work. Sometimes the fast version has a bug that QuickCheck wasn't able to find.

When this happens the trick is to write a third function:


  functionChecked :: A -> B -> C
  functionChecked a b =
      let fast = functionFast a b
          slow = functionSlow a b
      in if fast == slow
           then fast
           else error $ "functionFast " ++ show a ++ " " ++ show b
                ++ "\nreturns    " ++ show fast
                ++ "\n should be " ++ show slow

which calculates the function output using both the slow and the fast versions, compares the outputs and fails with an error if the two function outputs are not identical.

Using this in my algorithm I can then collect failing test cases that QuickCheck couldn't find. With a failing test case, it's usually pretty easy to fix the broken fast version of the function.

January 07, 2014 07:24 PM

December 28, 2013

Erik de Castro Lopo

Haskell : The Problem with Integer.

Haskellers may or may not be aware that there are two libraries in the GHC sources for implementing the Integer data type.

The first, integer-gmp, links to the GNU Multiple Precision Arithmetic Library, which is licensed under the GNU LGPL. On most systems, libgmp is dynamically linked and all is fine. However, if you want to create statically linked binaries from Haskell source code, you end up with your executable statically linking libgmp, which means your binary needs to be under an LGPL-compatible license if you want to release it. This is especially a problem on iOS, which doesn't allow dynamic linking anyway.

The second Integer implementation is integer-simple, which is implemented purely in Haskell (using a number of GHC extensions) and is BSD licensed.

So why doesn't everyone just use the BSD licensed one? Well, integer-simple has a reputation for being slow. Even more intriguingly, I seem to remember Duncan Coutts telling me a couple of years ago that integer-simple was a little faster than integer-gmp when the Integer was small enough to fit in a single machine Word, but much slower when that was not the case. At the time I heard this, I decided to look into it at some point. That time has come.

A couple of weeks ago I put together some scripts and code to allow me to compile the two Integer implementations into a single program and benchmark them against each other. My initial results looked like this:

[Integer performance (GMP vs Simple)]

That confirmed the slowness for multiplication and division if nothing else.

Taking a look at the code of integer-simple, I found that it was storing Word#s (unboxed machine-sized words) in a Haskell list. As convenient as lists are, they are not an optimal data structure for something like the Integer library.
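
To illustrate the difference, here is a hedged sketch of the two representations (the type names are illustrative, not the actual library definitions):


   import Data.Primitive (ByteArray)
   import Data.Word (Word)

   -- integer-simple style: limbs stored in (essentially) a linked list,
   -- costing one heap object and one pointer per machine word.
   data Digits = None | Some !Word Digits

   -- A contiguous unboxed buffer holds the same limbs with no per-limb
   -- pointers, which is much kinder to the cache and the allocator.
   data BigNat = BigNat
       !Int        -- number of limbs in use
       !ByteArray  -- packed little-endian limbs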

I have already started work on a replacement for both versions of the current Integer library, with the following properties:

  • BSD licensed.
  • Implemented in Haskell (with GHC extensions) so there are no issues with linking to external C libraries.
  • Fast. I'm aiming to outperform both integer-simple and integer-gmp on as many benchmarks as possible.
  • Few dependencies so it can easily replace the existing versions. Currently my code only depends on ghc-prim and primitive.

So far the results are looking encouraging. For Integer values smaller than a machine word, addition with my prototype code is faster than both existing libraries, and for adding large integers it's currently half the speed of integer-gmp, but I have an idea which will likely make the new library match the speed of integer-gmp.

December 28, 2013 11:08 PM

December 23, 2013

Manuel Chakravarty

The future of array-oriented computing in Haskell — The Result!

I recently posted a survey concerning The future of array-oriented computing in Haskell. Here is a summary of the responses.

It is not surprising that basically everybody (of the respondents — who surely suffer from grave selection bias) is interested in multicore CPUs, but I’m somewhat surprised that about 2/3 are interested in GPGPU. The most popular application areas are data analytics, machine learning, and scientific computing, with optimisation problems and physical simulations following close behind.

The most important algorithmic patterns are iterative numeric algorithms, matrix operations, and —the most popular— standard aggregate operations, such as maps, folds, and scans. (This result most surely suffers from selection bias!)

I am very happy to see that most people who tried Repa or Accelerate got at least some mileage out of them. The most requested backend feature for Repa is SIMD instructions (aka vector instructions), and the most requested feature for Accelerate is support for high-performance CPU execution. I did suspect that, and we would really like to provide that functionality, but it is quite a bit of work (so it will take a little while). The other major request for Accelerate is OpenCL support — we really need some outside help to realise that, as it is a major undertaking.

As far as extending the expressiveness of Accelerate goes, there is strong demand for nested data parallelism and sparse data structures. This also requires quite a bit of work (and is conceptually very hard!), but the good news is that PLS has got a PhD student working on just that!

NB: In the multiple choice questions permitting multiple answers, the percentages given by the Google Docs summary are somewhat misleading.

December 23, 2013 09:19 AM

Let's program!

Last year, I started to teach my then 9-year-old programming. Yesterday, we took it a step further by including 5 of his friends. We began writing a simple 2D game in Haskell using the Gloss library (which provides a simple, purely functional, event-driven API on top of OpenGL). My goal is to provide the children with a basic understanding of fundamental computational concepts.
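
For a flavour of the Gloss API, here is a minimal hedged sketch (not the code from our session):


   import Graphics.Gloss

   -- Open a window and draw a static picture; the play function adds
   -- event and time-step handlers on top of the same idea.
   main :: IO ()
   main = display (InWindow "Hello" (400, 300) (100, 100)) white (circle 80)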

I put the code from our first session into a public Git repo along with a brief summary of how the session was structured.

December 23, 2013 06:20 AM

December 09, 2013

Manuel Chakravarty

The future of array-oriented computing in Haskell — a survey

In the Programming Languages & Systems (PLS) group, we have spent a lot of energy on developing methods for high-performance array programming in a purely functional style. We are curious how our work is being used and what else the community would like to be able to achieve with libraries, such as Repa and Accelerate. Please help us by completing this survey. Thanks!

December 09, 2013 01:51 AM

December 04, 2013

Manuel Chakravarty

A new version of the GPU language Accelerate

We released version 0.14 of Accelerate, the embedded high-level language for general-purpose GPU programming. In addition to new constructs for iterative algorithms and improved code generation, this version adds support for the latest CUDA release (5.5) and for OS X Mavericks.
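
For readers new to Accelerate, the canonical dot-product example gives a flavour of the embedded language (a sketch for illustration, not code from this release):


   import Data.Array.Accelerate as A

   -- Collective array operations are expressed in the embedded language
   -- and compiled down to GPU kernels by the CUDA backend.
   dotp :: Acc (Vector Float) -> Acc (Vector Float) -> Acc (Scalar Float)
   dotp xs ys = A.fold (+) 0 (A.zipWith (*) xs ys)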

To learn more about Accelerate, watch Trevor’s YOW! Lambda Jam 2013 talk (slides) or read Chapter 6 of Simon Marlow’s Book Parallel and Concurrent Programming in Haskell.

You can find more information on Accelerate’s GitHub page.

Call for help: Accelerate currently works (out of the box) on OS X and Linux. It should also work on Windows, but we need some community help to fix the build process on Windows — for details, please see the recent issue on GitHub.

December 04, 2013 02:30 AM

October 24, 2013

Manuel Chakravarty

The Glasgow Haskell Compiler (GHC) on OS X 10.9 (Mavericks)

Apple finally dropped the GNU C Compiler (GCC) from its developer tools and only supports the LLVM-based clang compiler. This causes the Glasgow Haskell Compiler (GHC) some grief, mainly due to its use of the C pre-processor (cpp) as a cheap macro system for Haskell[1].

Here is how to fix this for the latest version of the Haskell Platform for Mac — until the HP maintainers release an updated version. I am assuming you have installed Mavericks and that you have either (a) Xcode 5 (from the Mac App Store) with the command line tools installed or (b) have directly gotten the Command Line Tools for Xcode. Using the latest Haskell Platform for Mac, follow these two steps:

  1. Get and compile Luke Iannini’s clang-xcode5-wrapper[2] and put the binary into /usr/local/bin — or grab this already compiled binary and put it in /usr/local/bin/.
  2. Edit GHC’s settings file as described next.

Edit

/Library/Frameworks/GHC.framework/Versions/Current/usr/lib/ghc-7.6.3/settings

by changing the second line of the file, such that it reads

("C compiler command", "/usr/local/bin/clang-xcode5-wrapper")

That’s it! Happy Haskell hacking on the most advanced operating system ;)

And kudos to the kind Apple engineers who accepted last minute clang patches from the Haskell community, and to Austin Seipp and Carter Schonwald for developing the patches and working with Apple.

[1] I have long maintained the view that (ab)using cpp for Haskell is a Bad Idea.

[2] This is a Haskell program; so, either compile it before updating to Mavericks or grab my binary.

October 24, 2013 11:19 AM

October 16, 2013

Simon Horman

Status of MPLS support for Open vSwitch

Over the past year I have been working on adding support for MPLS, as defined in OpenFlow from version 1.2 onwards, to Open vSwitch. From time to time I receive email asking about the status of this work, and in particular where the best place to obtain the MPLS code is. The purpose of this post is to answer those questions publicly.

There are two main threads of work in this area, and they can be found in branches of my own git repository: https://github.com/horms/openvswitch

  1. The devel/mpls-v2.X series of patches

    These add basic MPLS support to the kernel datapath and are useful in their own right. If you want to use this I recommend the latest version, currently devel/mpls-v2.43.

    I am actively working on updating these patches to have them included in mainline. But as the merge is not under my control I can't indicate when that will happen.

    Basic MPLS support has already been added to the user-space datapath and is present in mainline.

    Basic support involves matching on MPLS LSE fields, push and pop MPLS actions, and actions to manipulate MPLS LSEs.

    A key restriction of this series is that it only allows one level of push or pop and moreover does not allow actions that act on the inner packet.

    For instance, it cannot handle mpls_pop:0x0800,dec_ttl.

  2. The devel/mpls-recirculation-v1.X series of patches

    These add recirculation to Open vSwitch, a mechanism to re-start processing of a packet after some actions have been applied. This mechanism is used to remove the restrictions described above in the mpls-v2.X series. With this series applied, the only restriction that I am aware of is that recirculation may only occur a maximum of four times, and that limit may be altered at compile-time.

    This series is currently being re-written to take into account various changes in mainline. I hope that it will move forwards in the next few weeks.

    If you wish to try this patch-set I recommend the latest version, currently devel/mpls-recirculate-v17. This is based on devel/mpls-v2.36 and includes datapath MPLS support described above in 1. Both the datapath MPLS support and the upstream base are a little old but I believe the MPLS code works.

October 16, 2013 01:00 AM

October 08, 2013

Manuel Chakravarty

Here is the video of my YOW! Lambda Jam keynote asking,...



Here is the video of my YOW! Lambda Jam keynote asking, “Do Extraterrestrials Use Functional Programming?” You can also get the slides separately.

October 08, 2013 10:34 AM

October 04, 2013

Simon Horman

Perdition 2.0 Released

[perdition]

I have released version 2.0 of Perdition.

This is the culmination of the 1.19-rc series of releases. A decision has been made to name the release 2.0 instead of 1.19, as there are significant changes since the release of 1.18, including support for a new protocol, managesieve.

Key changes since v1.19-rc5:

  • Correct base64 calculation errors that resulted in managesieve authentication failing in some circumstances.
  • Use "imap" instead of "imap2" as default port for IMAP protocol

The code and related libraries are available as tarballs here. More information is available in the announcement email. More information about perdition here.

October 04, 2013 03:17 AM


September 22, 2013

Manuel Chakravarty

PLS @ ICFP

My research group will present four talks at ICFP and affiliated events. Trevor will present our ICFP paper Optimising Purely Functional GPU Programs, Ben will give the talk about our Haskell Symposium paper Data Flow Fusion with Series Expressions in Haskell, Amos will talk at the Haskell Implementors Workshop about GHC’s SpecConstr optimisation, and I will present an invited talk about Data Parallelism in Haskell at FHPC.

September 22, 2013 02:31 AM

September 17, 2013

Manuel Chakravarty

Embedding Foreign Code

In the new draft paper Embedding Foreign Code, we describe our use of template metaprogramming for skeleton-based code generation in Accelerate, as well as the Accelerate foreign function interface for interoperating with native CUDA code.

September 17, 2013 07:42 AM

August 19, 2013

Manuel Chakravarty

A new home for C->Haskell

The home of the C->Haskell binding generator c2hs is now on GitHub: https://github.com/haskell/c2hs.

August 19, 2013 11:16 AM

August 15, 2013

Manuel Chakravarty

Embedded Languages for High-Performance Computing in Haskell

At the Facebook Faculty Summit, I talked about Embedded Languages for High-Performance Computing in Haskell. Using Accelerate as an example, this talk discusses the use of embedded languages with template-based code generation for targeting parallel architectures.

August 15, 2013 02:51 AM

Do Extraterrestrials Use Functional Programming?

The slides of my keynote from YOW! Lambda Jam 2013 in May in Brisbane consider a key question: Do Extraterrestrials Use Functional Programming? The talk discusses fundamental aspects of programming as well as means to leverage an understanding of these fundamental aspects to improve our programming practice.

August 15, 2013 12:01 AM

June 17, 2013

Manuel Chakravarty

Data Flow Fusion with Series Expressions in Haskell

We are currently exploring flow fusion, a new fusion method for purely functional array code that overcomes the main limitation of stream fusion, namely stream fusion’s inability to fuse branching streams. Our current flow-fusion prototype in the Glasgow Haskell Compiler manages to achieve a twofold speedup over stream fusion when computing the convex hull of 2D points using the QuickHull algorithm. In fact, the code generated by flow fusion is only a few percentage points away from hand-written C code. We have summarised all the details in a draft paper, Data Flow Fusion with Series Expressions in Haskell.

June 17, 2013 05:39 AM

May 02, 2013

Manuel Chakravarty

Seminar by Dave Thomas on "VMs Demystified – A Tour of the Engine Room"

Dave Thomas, who is widely known for his work on virtual machines and the YOW! Australia conference series, will be in Sydney this month. He kindly agreed to talk about VMs at CSE. Anybody with an interest in VM engineering and programming language implementations should mark the date in their calendar!

Time: 29 May 2013 (Wed), 11AM

Location: CSE Seminar room (K17_113), Level 1, CSE Building (K17)

Title: VMs Demystified – A Tour of the Engine Room

Speaker: David Thomas (Bedarra Research Labs and YOW! Developer Conference)

Also showing at: The talk is also presented at ScalaSyd the previous evening, 28 May, 18:30 — details here

Abstract

Language virtual machines are an essential part of current and next generation platforms. Yet many developers have no real idea of what is actually happening when their program is run on a VM or the hardware. This leads to many false assumptions about speed and space performance. In this talk you will see under the hood of language virtual machines and gain an understanding of what makes VMs tick as well as differences between the languages they support.

First we explain the essence of VM engineering, including object representations; stack versus register VMs; RISC versus CISC byte codes; static dispatch to polymorphic inline caches; context management; interpretation versus dynamic translation/tracing versus compilation; garbage collection; and native types and code interfaces. We discuss benchmark speed and space performance versus real application performance.

Armed with the above knowledge we then engage in some of the entertaining educational VM debates. How can a JVM or PHP VM be faster than C++? When is the JVM or CLR better? How does the language, or the language library, impact the VM? Are strongly typed languages always faster than dynamic languages? How does hosting with CRuby compare to JRuby or Java? Let's put the VM in hardware? How do functional language VMs differ from object VMs? How can thousands of processes in Erlang be efficient compared to using native OS threads?

Bio

Dave Thomas is an expert in dynamic languages and has decades of experience building and deploying language VMs for mobile, instrumentation, embedded command and control, and business applications on platforms from mainframes to micro devices. He is widely known and respected in the programming language community and this year will be presenting the keynote at the Commercial Users of Functional Programming (CUFP) conference. While CEO of OTI, now IBM OTI Labs, he oversaw IBM's Smalltalk and J9 family of Java enterprise and embedded JVMs, and OSGi, as well as the initial releases of Eclipse. He led an IBM OTI research effort into universal virtual machines. After leaving IBM he worked on JVM support for dynamic languages and the use of V8 for embedded applications. For the past 6 years Dave has been working with high-performance vector functional virtual machines and DSLs, and most recently has been exploring special-purpose HW VMs.

May 02, 2013 02:43 AM

April 29, 2013

Manuel Chakravarty

Back to Tumblr

With the demise of Posterous (after it was swallowed by Twitter), my blog is back to Tumblr. I have got plans for a more customised set up in the medium term, but currently I don’t have the time to set up blogging software.

As the URL schemes of Posterous and Tumblr are different, this breaks all existing direct links to posts. Moreover, many of the old posts include a footer that points back to the defunct Posterous, and comments are also gone. I’m sorry for the mess.

April 29, 2013 05:44 AM

April 04, 2013

Manuel Chakravarty

Optimising Purely Functional GPU Programs

We just completed a draft of a new paper, Optimising Purely Functional GPU Programs, that explains two crucial optimisations of Accelerate, our embedded array language for GPU programming in Haskell. These two optimisations are a novel typed form of sharing recovery for embedded languages and a new array fusion method for massively parallel SIMD programs. The paper includes details on eight benchmark programs that support the effectiveness of our optimisations and pit Accelerate against competing frameworks, including CUDA C code.

April 04, 2013 03:40 AM

February 01, 2013

Tom Davies

Using the ZeptoProg II on OS X

I’ve decided that I should be using xmega AVR microcontrollers rather than the older series — the timer architecture is much nicer for one thing. The only downside is that they are 3.3V only — and have a different programming interface.

I bought the ZeptoProg II and a simple header board from Justin Mattair at mattairtech.com.

The programmer isn’t explicitly supported on OS X, but in fact it works fine once you set it up correctly.

I use avrdude for programming AVRs, and when I first ran it with the ZeptoProg II I got:

Blonder:avrtest tomd$ avrdude -c avrisp2 -p x128a4 -P usb:000200012345 -U flash:r:flash.hex:h -v -v -v -v

avrdude: Version 5.11.1, compiled on Feb 16 2012 at 22:11:20
    Copyright (c) 2000-2005 Brian Dean, http://www.bdmicro.com/
    Copyright (c) 2007-2009 Joerg Wunsch

    System wide configuration file is "/usr/local/CrossPack-AVR-20120217/etc/avrdude.conf"
    User configuration file is "/Users/tomd/.avrduderc"

    Using Port                    : usb:000200012345
    Using Programmer              : avrisp2
avrdude: usbdev_open(): Found AVRISP mkII, serno: 000200012345
avrdude: usbdev_open(): using read endpoint 0x82
avrdude: Sent: . [01] 
avrdude: usbdev_recv_frame(): usb_bulk_read(): usb_bulk_read: An error occured during read (see messages above)
avrdude: stk500v2_recv_mk2: error in USB receive
avrdude: usbdev_send(): wrote -104 out of 1 bytes, err = usb_bulk_write: An error occured during write (see messages above)
avrdude: stk500_send_mk2(): failed to send command to serial port

I contacted Justin, who explained that this meant that the ZeptoProg was in AVR Studio mode, not avrdude mode. There’s a Java application which allows you to change this (it also allows you to use the ZeptoProg as a logic analyzer and signal generator).

There was one gotcha — the jar containing this application bundles the librxtxSerial.jnilib library which it needs to talk to the USB port, but only a 32 bit version. Java on my OS X version is 64 bit only, so I got messages like:

Blonder:xmega tomd$ java -jar ~/Downloads/ZeptoProg_II.jar 
java.lang.UnsatisfiedLinkError: /private/var/folders/tv/58qc8nq53x50q8f7r31z_6280000gn/T/natives3881310/librxtxSerial.jnilib: dlopen(/private/var/folders/tv/58qc8nq53x50q8f7r31z_6280000gn/T/natives3881310/librxtxSerial.jnilib, 1): no suitable image found.  Did find:
/private/var/folders/tv/58qc8nq53x50q8f7r31z_6280000gn/T/natives3881310/librxtxSerial.jnilib: no matching architecture in universal wrapper thrown while loading gnu.io.RXTXCommDriver
Exception in thread "main" java.lang.UnsatisfiedLinkError: /private/var/folders/tv/58qc8nq53x50q8f7r31z_6280000gn/T/natives3881310/librxtxSerial.jnilib: dlopen(/private/var/folders/tv/58qc8nq53x50q8f7r31z_6280000gn/T/natives3881310/librxtxSerial.jnilib, 1): no suitable image found.  Did find:
/private/var/folders/tv/58qc8nq53x50q8f7r31z_6280000gn/T/natives3881310/librxtxSerial.jnilib: no matching architecture in universal wrapper
at java.lang.ClassLoader$NativeLibrary.load(Native Method)
at java.lang.ClassLoader.loadLibrary1(ClassLoader.java:1939)
at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1868)
at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1854)
at java.lang.Runtime.loadLibrary0(Runtime.java:845)
at java.lang.System.loadLibrary(System.java:1084)
at gnu.io.CommPortIdentifier.<clinit>(CommPortIdentifier.java:83)
at GUITerminal.getAvailableSerialPortNames(GUITerminal.java:234)
at GUITerminal.<init>(GUITerminal.java:134)
at ZeptoProgGUI.<init>(ZeptoProgGUI.java:56)
at ZeptoProg_II.main(ZeptoProg_II.java:25)

The strange /private/var/folders/tv/58qc8nq53x50q8f7r31z_6280000gn/T... path in the message above is where the jnilib from the jar is unpacked.

To fix this I needed to:

  • download this file http://code.google.com/p/create-lab-commons/source/browse/trunk/java/lib/rxtx/librxtxSerial64.jnilib (this contains 64 bit and 32 bit architectures)
  • rename it to librxtxSerial.jnilib (possibly not necessary)
  • replace the file in the jar: jar uf ZeptoProg_II.jar librxtxSerial.jnilib
  • then it runs fine with java -jar ZeptoProg_II.jar
  • Put the board into multitool mode, then a file /dev/tty.usbmodem513281 will appear (not sure if the number is always the same) and you can refresh the COM port list, select it and connect. When the board is not in tool mode that device is not present in /dev.

Once set up like that the ZeptoProg works with avrdude.

by Tom Davies at February 01, 2013 10:34 PM

January 22, 2013

Erik de Castro Lopo

parMap to the Rescue.

I had a long-running, CPU-intensive Haskell program that I wanted to speed up. The program was basically a loop consisting of a small sequential part followed by a map of a CPU-intensive pure function over a list of 1500 elements.

Obviously I needed some sort of parallel map function and I had a faint recollection of a function called parMap. The wonderful Hoogle search engine pointed me to the parMap documentation.

Changing the existing sequential map operation into a parallel map required a 3-line change (one of them to import the required module). I then added "-threaded" to the compile command line to enable the threaded runtime system in the generated executable and "+RTS -N6" to the executable's command line. The resulting program went from using 100% of 1 CPU to using 560% of 1 CPU on an 8 CPU box. Win!
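
The change amounts to something like the following hedged sketch, where the worker function is a hypothetical stand-in for my CPU-intensive pure function:


   import Control.Parallel.Strategies (parMap, rdeepseq)

   -- A deliberately CPU-heavy pure function standing in for the real one.
   expensive :: Int -> Int
   expensive n = length [ k | k <- [1 .. n], n `mod` k == 0 ]

   -- Build with: ghc -O2 -threaded parmap.hs
   -- Run with:   ./parmap +RTS -N6
   main :: IO ()
   main = print . sum $ parMap rdeepseq expensive [100000 .. 101500]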

I wish all code was this easy to parallelize.

January 22, 2013 11:08 AM

December 22, 2012

Erik de Castro Lopo

My Space is Leaking.

Over the last couple of days I wrote a small Haskell program to read a large CSV file (75Meg, approx. 1000 columns and 50000 rows) and calculate some statistics. Since I would have to do this for much larger files as well, I decided to use the csv-conduit library to read the data and use a function passed to Data.Conduit's sinkState to calculate the statistics.

The code was pretty easy to put together, and only came to about 100 lines of code. Unfortunately, when I ran the program, it tried to consume all 8Gig of memory on my laptop and when I actually let it run to completion, it took over an hour to produce useful output.

A bit of quick profiling showed that the problem was with the state used to hold the statistics. The state itself wasn't huge, but Haskell's lazy evaluation meant there were a huge number of thunks (pending calculations) piling up.

Aside : Haskell does lazy (more correctly called non-strict) evaluation by default. This means that values are calculated when they are needed rather than when the program hits that point in the code. For instance, if a value is generated by calling a pure function, the GHC runtime will forgo actually calling the function and replace the value with a thunk containing the function and its input parameters. Later, when the value is actually needed, the runtime will call the function stored in the thunk.

My first attempt to fix this problem was to add some strictness annotations to my data types, but that didn't seem to help. I then looked at the deepseq package and tried adding the $!! operator in a few places. This resulted in a compile error complaining about my data structures not having an NFData instance. A bit of googling for "custom NFData instance" showed up the deepseq-th package which uses Template Haskell to generate NFData instances.

Aside : For a value to be an instance of the NFData typeclass means that it can be fully evaluated, ie a thunk to calculate a value of this type can be forced by deepseq to replace the thunk with the value.

About 10 minutes later I had my code working again, but now it processed the same file in a little over 2 minutes and used less than 0.1% of the 8Gig it was using previously.

I was happy with this. So happy that I decided to thank the author of deepseq-th, Herbert Valerio Riedel (hvr) on the #haskell IRC channel. Herbert was pleased to hear of my success, but suggested that instead of deepseq-th I try using deepseq-generics. Someone else on the channel suggested that this might be slower, but Herbert said that he had not found that to be the case. Switching from one to the other was trivially easy and pleasingly enough the generics version ran just as fast.
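
The switch looks roughly like this hedged sketch (the Stats type and its fields are hypothetical):


   {-# LANGUAGE DeriveGeneric #-}

   import Control.DeepSeq (NFData (rnf))
   import Control.DeepSeq.Generics (genericRnf)
   import GHC.Generics (Generic)

   -- A hypothetical accumulator for the statistics state.
   data Stats = Stats { rowCount :: !Int, colSums :: ![Double] }
       deriving (Generic)

   -- One line instead of a hand-written traversal; deepseq-th would
   -- instead generate this instance using Template Haskell.
   instance NFData Stats where
       rnf = genericRnf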

That's when José Pedro Magalhães (dreixel in #haskell) said that he had a draft paper "Optimisation of Generic Programs through Inlining" explaining how and why this generic implementation is just as fast as the Template Haskell version. Basically it boils down to the compiler having all the information it needs at compile time to inline and specialize the code to be just as fast as hand written code.

Reflections:

  1. Streaming I/O libraries like Data.Conduit (there's more than one) do give guarantees about space usage so that when you get a space leak the I/O is probably not the first place to look.
  2. For small programs it's relatively easy to reason about where the space leak is happening.
  3. For a relatively experienced Haskeller, following the bread crumbs to a working solution is relatively easy.
  4. Code that uses a struct to accumulate state is a common contributor to space leaks.
  5. Interacting with the Haskell community can often get a better result than the first thing you find (eg deepseq-generics instead of deepseq-th).
  6. Generics can be just as fast as Template Haskell generated code.

December 22, 2012 05:42 AM

August 22, 2012

Simon Horman

Chiz. Horman Textile

The first of my wife's textiles are available and our online shop is now open. chiz-horman.com

August 22, 2012 07:12 AM

August 19, 2012

Manuel Chakravarty

Learn how to program GPUs with Haskell

Simon Marlow recently taught a course on parallel programming in Haskell at the EDF/CEA/INRIA Summer School 2012, “Functional Programming for Parallel and Concurrent Applications”. One of his lectures teaches GPU programming in Haskell with our Accelerate framework. If you would like to learn to use Accelerate to program GPUs in Haskell, check out Simon’s slides on the topic as well as the accompanying programming exercises (see page 14 onwards).

August 19, 2012 12:08 PM

June 06, 2012

Simon Horman

Embedded Kernel Back-Porting

Today I gave a presentation at LinuxCon Japan in Yokohama discussing the motivations and methods I use when making back-ports for embedded Linux kernels. more...

June 06, 2012 06:59 AM

June 04, 2012

Manuel Chakravarty

Being more clever about vectorising nested data parallelism

Our new draft paper on Vectorisation Avoidance introduces a novel program analysis for nested data parallelism that lets us avoid vectorising purely scalar subcomputations. It includes a set of benchmark kernels that suggest that vectorisation avoidance improves runtimes over merely using array stream fusion.

June 04, 2012 12:18 PM

Repa 3: more control over array representations with indexed types

We have got a new draft paper on Guiding Parallel Array Fusion with Indexed Types. It describes the design and use of the 3rd-generation Repa API, which uses type indices to give the programmer control over the various parallel array representations. The result is clearer programs that the compiler can more easily optimise. The implementation of Repa 3 is ready for use on Hackage in the repa package.
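
As a hedged illustration (not code from the paper), the representation indices appear directly in the types:


   import Data.Array.Repa as R

   -- U marks a manifest unboxed array, D a delayed (fusible) one.
   step :: Array U DIM1 Double -> Array D DIM1 Double
   step = R.map (* 2)

   -- computeP forces a delayed array in parallel, back to unboxed form.
   force :: Monad m => Array D DIM1 Double -> m (Array U DIM1 Double)
   force = R.computeP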

June 04, 2012 12:02 PM

May 14, 2012

Manuel Chakravarty

GPU-accelerated array computations in Haskell

I just published stable release 0.12 of the embedded array language Accelerate for high-performance GPU computing on Hackage. The release includes example applications, such as a real-time Canny edge detector, a fluid flow simulator, and a quasicrystal animation, as well as example algorithms, such as radixsort, matrix computations, and a Black-Scholes option pricing model.

In the new modularised architecture, the Accelerate release comprises four packages:

  • accelerate (the main package providing the Accelerate language),
  • accelerate-io (conversion operations with other Haskell array libraries, including the data-parallel companion library Repa),
  • accelerate-cuda (backend targeting NVIDIA GPUs via the CUDA SDK), and
  • accelerate-examples (example applications and regression tests).

May 14, 2012 06:48 AM

April 21, 2012

Manuel Chakravarty

Using the Glasgow Haskell Compiler (GHC) on OS X Lion with Xcode 4.3

Here is what you need to do if you want to use the Glasgow Haskell Compiler (GHC) on OS X Lion with a clean installation of Xcode 4.3.

Command line tools

You need to install the command line tools from Apple. You may do that in two ways (the second is faster as it is a much smaller download):

  1. Install all of Xcode:
    • Install Xcode from the Mac App Store.
    • Launch Xcode.
    • In the Preferences dialog of Xcode, select the “Downloads” pane and install “Command line tools”.
  2. Install the command line tools only:
    • Download the “Command Line Tools for Xcode” package directly from Apple’s developer downloads site and install it.

In both cases, you need to register as an Apple developer. (This is a free registration.)

Install GHC 7.4.1 (or later)

Older versions of GHC —including GHC 7.0.4, which is part of the Haskell Platform 2011.4.0.0 (December 2011)— won’t work. They complain about not being able to execute ‘/Developer/usr/bin/gcc’.

Download and install GHC 7.4.1 (or a later version) — or install the Haskell Platform once a 2012 release has been made.

Using GHC’s LLVM backend (optional)

If you would like to use GHC’s LLVM backend —which is more efficient for array-based and other loop-oriented code— you need to install LLVM separately. (This is despite the Apple tools being based on LLVM as well.)

  • Install Homebrew (as per the instructions on that webpage).
  • Then, execute the following: ‘brew install llvm’

Compiling GHC from the sources in the Git repositories (GHC developers only)

If you plan to work on GHC itself and you need to compile the development version of GHC straight from the Git repositories, you need to install the GNU autotools as well (which are no longer distributed by Apple).

  • Install Homebrew (as per the instructions on that webpage) — if you didn’t do that already to install LLVM above.
  • Then, execute the following: ‘brew install autoconf automake’

The GHC Trac has more information on Building and Porting GHC.

April 21, 2012 12:25 PM

April 20, 2012

Goanna

Goanna 2.7 Released

We are delighted to announce the new Goanna 2.7 release across all product lines!

There are some major improvements and a bunch of new features in the new release. Most noteworthy, we completely redesigned the way we handle different compilers and environments in Goanna Studio for Eclipse. It is now straightforward to support different embedded compilers and cross-compilation settings.

In this release we also provide some early support for Visual Studio 11. As one of the leaders in Visual Studio support, we provide a beta for public download and exploration.

As usual, there have been a lot of improvements to the underlying analysis engine, often significantly reducing the overall runtime. There are improved checks and reduced false positives, as well as a re-worked interprocedural analysis that finds more issues in a single run.

Goanna Studio Eclipse & Visual Studio

  • The IDE versions now come with preconfigured settings for different needs: “Quick Check” and “Deep Check”. The “Quick Check” performs most checks that can be run quickly, to allow fast results. The “Deep Check” option does a thorough analysis and includes all possible checks. These are simple alternatives to the default/customised configuration.
  • There is a free beta version of a new “Suppression Manager” included, to view, sort, filter, and manage the warnings from a project.
  • Goanna has early support for Visual Studio 11 (beta).

All Versions

  • Goanna has changed from a custom configuration file format to use JSON. By moving to the new format it is much easier to support different compilers and environments.
  • There is a new user guide as well as a separate reference guide with more introductions, check examples and explanations.
  • The Goanna interprocedural analysis has been improved to give stable results between runs.
  • Reduced false-positive reporting for many cases, including the RED-unused-var-all check.
  • Better handling of random numbers (from the “rand()” function).
  • Support for the “-isystem” argument by default.
  • Improved overall Goanna performance, including avoiding some pathological examples.
  • Greater support of the C++11 standard and future options to support other languages.

by Ralf at April 20, 2012 03:29 AM

April 12, 2012

Manuel Chakravarty

GPU computing in Haskell: version 0.10 of Data.Array.Accelerate

Accelerate is an embedded language for GPU-accelerated array computations in Haskell that targets NVIDIA’s CUDA framework and also has an experimental OpenCL backend (that currently does not support the whole feature set of the language). I just released version 0.10.0.0 of Accelerate. A considerable amount of example code is in the companion package accelerate-examples.

For more information, have a look at Accelerate on GitHub.

April 12, 2012 07:10 AM

March 24, 2012

Manuel Chakravarty

Work Efficient Higher-Order Vectorisation

Our new draft paper on Work Efficient Higher-Order Vectorisation introduces a novel representation for nested, irregular parallel arrays that enables the work-efficient SIMD-ification of nested data parallelism — i.e., nested parallelism is transformed into flat parallelism, while maintaining the work complexity of a naive pointer-based representation of nested arrays. This solves a long-standing problem that dates back to the original implementation of the language NESL.

March 24, 2012 07:36 AM

March 20, 2012

Simon Horman

Perdition 1.19-rc5 Released

[perdition]

I have released version 1.19-rc5 of Perdition. The key changes are:

  • ldap: fix segmentation fault in dbserver_get2()
  • Manage-sieve: Fix handling of plain login, which would cause a segmentation fault in some cases
  • Manage-sieve: Fix handling of long authentication hashes
  • Enhance --bind_address option parsing to handle IPv6 addresses
  • Fix 8/4-byte integer type mismatches which may lead to undefined behaviour

The code and related libraries are available as tarballs here. More information is available in the announcement email. More information about perdition here.

March 20, 2012 12:25 PM

Goanna

Goanna is Gearing Up for Visual Studio 11

Red Lizards Software is preparing for the future! As one of the pioneers in supporting Visual Studio 2010, we are once again leading innovation, and we are happy to announce our preparation for a sim-ship release with the upcoming Visual Studio 11.

As recently announced, Visual Studio 11 beta is now available for download. The new IDE comes with a simplified design and a much streamlined user experience. At this point in time MSDN subscribers can download and trial the beta release: Visual Studio 11 Download


Goanna Studio, already available for Visual Studio 2005, Visual Studio 2008, and Visual Studio 2010, will be available in time for the upcoming Visual Studio 11 to seamlessly support our innovative and trend-defining users. A big thanks also goes to our development team for working with the guys at Microsoft to make Goanna Studio even better in the upcoming release. As a sneak peek we give you a first screenshot below, but if you want to try Goanna Studio for Visual Studio yourself, send us an email at vs11@redlizards.com.

[Goanna Studio in Visual Studio 11 beta]

About

Red Lizards Software is the leading provider of code confidence tools for C/C++ developers, bringing advanced static analysis closer to the developer by providing solutions for flexible build integration on the server and deep IDE embedding on the desktop across platforms.

Goanna is a fast, scalable, and precise software solution that detects software bugs and security vulnerabilities automatically during the development process, helping to keep product launch timetables on track and to reduce life cycle costs.

by Ralf at March 20, 2012 04:35 AM

March 12, 2012

Manuel Chakravarty

Seminar by Peter Thiemann on "A Calculus for Gradual Access Control"

We currently have the pleasure of hosting Peter Thiemann from the University of Freiburg at UNSW. His many contributions span the areas of functional programming, type systems, static analysis, program specialisation, and web programming.

Coming Monday (19 March), Peter will give a presentation at the School of Computer Science and Engineering (CSE) of the University of New South Wales. The details are as follows.


Time: 19 March 2012 (Monday), 11AM

Location: CSE Seminar room (K17_113), Level 1, CSE Building (K17)

Title: A Calculus for Gradual Access Control

Speaker: Peter Thiemann (University of Freiburg, Germany)

Abstract


Many client-side Web applications are composed of script fragments that are independently downloaded from the Web. A recurring problem in that scenario is the need to impose an access control policy on the execution of a downloaded script. No script should be able to compromise the state of the base application, that is, it should neither access nor change sensitive data, which can be private data of the application, sensitive state of the browser, or a part of the DOM that is not explicitly assigned to it.

To address this situation, we want to construct a browser-embedded facility that regulates access control according to directives in a trusted program fragment. The calculus for gradual access control is the foundation of this facility. It incorporates a component that performs dynamic access control according to the directives. It also incorporates a static component that enables the system to omit the dynamic checks if it can prove them unnecessary. The two components are tightly connected by a gradual typing approach.

Bio


Peter Thiemann obtained his diploma in computer science in 1987 at the Technical University of Aachen, Germany. He graduated in 1991 at the University of Tübingen, Germany, where he worked as a research and teaching assistant until 1997. In 1998, he was a lecturer in Computer Science at the University of Nottingham, England. Since 1999 he has been a professor at the computer science department of the University of Freiburg, Germany, where he leads the programming languages group.

His research interests comprise theory and practice of modern programming languages, in particular program analysis, compiler construction, and program specialization for functional programming languages. The focus of his current research is static and dynamic program analysis for scripting languages.

March 12, 2012 05:43 AM

February 02, 2012

Manuel Chakravarty

The n-body problem and vectorisation of nested data parallelism in the face of shared data structures

Last year, we spent a lot of energy on reducing the memory consumption of vectorised Data Parallel Haskell (DPH) programs that use large shared data structures. This work was driven by an implementation of the Barnes-Hut algorithm in DPH. Ben produced an illustrative video animating an n-body simulation with his Gloss library. The video is part of his blog article Vectorisation without Replication in Data Parallel Haskell, where he explains the performance of our new DPH array library in comparison to our old library and a purely sequential implementation of Barnes-Hut based on Data.Vector.
If you are interested in how vectorisation works, what the problem with shared data structures is, and how we are solving that problem, you may like to have a look at the slides of a talk that I gave in Copenhagen at the end of last year. They are available in two formats: HTML5 slideshow & PDF.

February 02, 2012 11:16 AM

February 01, 2012

Manuel Chakravarty

Released Data.Array.Accelerate 0.9.0.0 — the Haskell array library for GPUs

I just released accelerate 0.9.0.0 on Hackage. This is the version that has been available from the GitHub repository for a while (supporting shape polymorphism, stencil computations, block I/O, and much more), but adapted such that it works with the forthcoming GHC 7.4.1 release. (I tested it with 7.4.1 RC2).
It doesn’t yet include Trevor’s recent work that improved the CUDA backend in many significant ways — you can get that code from Trevor’s fork on GitHub.
For more details, see the main GitHub repository and the GitHub wiki pages.

February 01, 2012 05:55 AM

January 26, 2012

Manuel Chakravarty

Free, nicely presented textbooks with good distribution

Free, nicely presented textbooks with good distribution have got quite an appeal.

January 26, 2012 01:42 PM

January 24, 2012

Erik de Castro Lopo

Benchmarking and QuickChecking readInt.

I'm currently working on converting my http-proxy library from using the Data.Enumerator package to Data.Conduit (explanation of why in my last blog post).

During this conversion, I have been studying the sources of the Warp web server because my http-proxy was originally derived from the Enumerator version of Warp. While digging through the Warp code I found the following code (and comment), which is used to parse the number provided in the Content-Length field of an HTTP header:


  -- Note: This function produces garbage on invalid input. But serving an
  -- invalid content-length is a bad idea, mkay?
  readInt :: S.ByteString -> Integer
  readInt = S.foldl' (\x w -> x * 10 + fromIntegral w - 48) 0

The comment clearly states that this function can produce garbage, specifically if the string contains anything other than ASCII digits. The comment is also correct that an invalid Content-Length is a bad idea. However, on seeing the above code, and remembering something I had seen recently in the standard library, I naively sent the Yesod project a patch replacing the above code with a version that uses the readDec function from the Numeric module:


  import Data.ByteString (ByteString)
  import qualified Data.ByteString.Char8 as B
  import qualified Numeric as N

  readInt :: ByteString -> Integer
  readInt s =
      case N.readDec (B.unpack s) of
          [] -> 0
          (x, _):_ -> x

About 3-4 hours after I submitted the patch I got an email from Michael Snoyman saying that parsing the Content-Length field is a hot spot for the performance of Warp and that I should benchmark it against the code I'm replacing to make sure there is no unacceptable performance penalty.

That's when I decided it was time to check out Bryan O'Sullivan's Criterion benchmarking library. A quick read of the docs and a bit of messing around and I was able to prove to myself that using readDec was indeed much slower than the code I wanted to replace.
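
A minimal hedged sketch of such a Criterion harness looks like this (the sample input is made up; the full program benchmarked all of the variants discussed here):


  import Criterion.Main (bench, defaultMain, whnf)
  import qualified Data.ByteString as S
  import qualified Data.ByteString.Char8 as B

  -- The original Warp parser, quoted above.
  readIntOrig :: S.ByteString -> Integer
  readIntOrig = S.foldl' (\x w -> x * 10 + fromIntegral w - 48) 0

  main :: IO ()
  main = defaultMain
      [ bench "readIntOrig" $ whnf readIntOrig (B.pack "1234567") ]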

The initial disappointment of finding that a more correct implementation was significantly slower than the less correct version quickly turned to joy as I experimented with a couple of other implementations and eventually settled on this:


  import Data.ByteString (ByteString)
  import qualified Data.ByteString.Char8 as B
  import qualified Data.Char as C

  readIntTC :: Integral a => ByteString -> a
  readIntTC bs = fromIntegral
          $ B.foldl' (\i c -> i * 10 + C.digitToInt c) 0
          $ B.takeWhile C.isDigit bs

By using the Integral type class, this function converts the given ByteString to any integer type (ie any type belonging to the Integral type class). When used, this function will be specialized by the Haskell compiler at the call site to produce code to read string values into Ints, Int64s or anything else that is a member of the Integral type class.

For a final sanity check I decided to use QuickCheck to make sure that the various versions of the generic function were correct for values of the type they returned. To do that I wrote a very simple QuickCheck property as follows:


  prop_read_show_idempotent :: Integral a => (ByteString -> a) -> a -> Bool
  prop_read_show_idempotent freader x =
      let posx = abs x
      in posx == freader (B.pack $ show posx)

This QuickCheck property takes the function under test freader and QuickCheck will then provide it values of the correct type. Since the function under test is designed to read Content-Length values which are always positive, we only test using the absolute value of the value randomly generated by QuickCheck.
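
The property is then instantiated once per result type (a hedged sketch, assuming readIntTC and the property above are in scope), which accounts for the three QuickCheck result lines in the output below:


  import Data.ByteString (ByteString)
  import Data.Int (Int64)
  import Test.QuickCheck (quickCheck)

  main :: IO ()
  main = do
      quickCheck (prop_read_show_idempotent (readIntTC :: ByteString -> Int))
      quickCheck (prop_read_show_idempotent (readIntTC :: ByteString -> Int64))
      quickCheck (prop_read_show_idempotent (readIntTC :: ByteString -> Integer))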

The complete test program can be found on Github in this Gist and can be compiled and run as:


  ghc -Wall -O3 --make readInt.hs -o readInt && ./readInt

When run, the output of the program looks like this:


  Quickcheck tests.
  +++ OK, passed 100 tests.
  +++ OK, passed 100 tests.
  +++ OK, passed 100 tests.
  Criterion tests.
  warming up
  estimating clock resolution...
  mean is 3.109095 us (320001 iterations)
  found 27331 outliers among 319999 samples (8.5%)
    4477 (1.4%) low severe
    22854 (7.1%) high severe
  estimating cost of a clock call...
  mean is 719.4627 ns (22 iterations)

  benchmarking readIntOrig
  mean: 4.653041 us, lb 4.645949 us, ub 4.663823 us, ci 0.950
  std dev: 43.94805 ns, lb 31.52653 ns, ub 73.82125 ns, ci 0.950

  benchmarking readDec
  mean: 13.12692 us, lb 13.10881 us, ub 13.14411 us, ci 0.950
  std dev: 90.63362 ns, lb 77.52619 ns, ub 112.4304 ns, ci 0.950

  benchmarking readRaw
  mean: 591.8697 ns, lb 590.9466 ns, ub 594.1634 ns, ci 0.950
  std dev: 6.995869 ns, lb 3.557109 ns, ub 14.54708 ns, ci 0.950

  benchmarking readInt
  mean: 388.3835 ns, lb 387.9500 ns, ub 388.8342 ns, ci 0.950
  std dev: 2.261711 ns, lb 2.003214 ns, ub 2.585137 ns, ci 0.950

  benchmarking readInt64
  mean: 389.4380 ns, lb 388.9864 ns, ub 389.9312 ns, ci 0.950
  std dev: 2.399116 ns, lb 2.090363 ns, ub 2.865227 ns, ci 0.950

  benchmarking readInteger
  mean: 389.3450 ns, lb 388.8463 ns, ub 389.8626 ns, ci 0.950
  std dev: 2.599062 ns, lb 2.302428 ns, ub 2.963600 ns, ci 0.950

At the top of the output is proof that all three specializations of the generic function readIntTC satisfy the QuickCheck property. From the Criterion output it's pretty obvious that the Numeric.readDec version is about 3 times slower than the original function. More importantly, all three versions of this generic function are an order of magnitude faster than the original.

That's a win! I will be submitting my new function for inclusion in Warp.

Update : 14:13

At around the same time as I submitted my latest version of readInt, Vincent Hanquez posted a comment on the Github issue suggesting I look at the GHC MagicHash extension and pointed me to an example.

Sure enough, using the MagicHash technique resulted in something significantly faster again.

Update #2 : 2012-01-29 19:46

In version 0.3.0 and later of the bytestring-lexing package there is a function readDecimal that is even faster than the MagicHash version.

January 24, 2012 12:52 AM

January 14, 2012

Erik de Castro Lopo

A Simple Telnet Client Using Data.Conduit.

Below is a simple telnet client written using Haskell's new Conduit library. This library was written by Michael Snoyman, the man behind the Yesod Web Framework for Haskell.

The Conduit library is a second-generation approach to the problem of guaranteeing bounded memory usage in the presence of lazy evaluation. The first generation of these ideas were libraries like Iteratee, Enumerator, and IterIO. All of these first-generation libraries use the term enumerator for data producers and iteratee for data consumers. The new Conduit library calls data producers "sources" and data consumers "sinks" to make them a little more approachable.

The other big difference between Conduit and the early libraries in this space is to do with guaranteeing early clean up of potentially scarce resources like sockets. Although I have not looked in any detail at the IterIO library, both Iteratee and Enumerator simply rely on Haskell's garbage collector to clean up resources when they are no longer required. The Conduit library on the other hand uses Resource transformers to guarantee release of these resources as soon as possible.

The client looks like this (latest available here):


  import Control.Concurrent (forkIO, killThread)
  import Control.Monad.IO.Class (MonadIO, liftIO)
  import Control.Monad.Trans.Resource
  import Data.Conduit
  import Data.Conduit.Binary
  import Network (connectTo, PortID (..))
  import System.Environment (getArgs, getProgName)
  import System.IO


  main :: IO ()
  main = do
      args <- getArgs
      case args of
          [host, port] -> telnet host (read port :: Int)
          _ -> usageExit
    where
      usageExit = do
          name <- getProgName
          putStrLn $ "Usage : " ++ name ++ " host port"


  telnet :: String -> Int -> IO ()
  telnet host port = runResourceT $ do
      (releaseSock, hsock) <- with (connectTo host $ PortNumber $ fromIntegral port) hClose
      liftIO $ mapM_ (`hSetBuffering` LineBuffering) [ stdin, stdout, hsock ]
      (releaseThread, _) <- with (
                            forkIO $ runResourceT $ sourceHandle stdin $$ sinkHandle hsock
                            ) killThread
      sourceHandle hsock $$ sinkHandle stdout
      release releaseThread
      release releaseSock

There are basically three blocks, a bunch of imports at the top, the program's entry point main and the telnet function.

The telnet function is pretty simple. Most of the function runs inside a runResourceT resource transformer. The purpose of these resource transformers is to keep track of resources such as sockets, file handles, thread ids etc and make sure they get released in a timely manner. For example, in the telnet function, the connectTo function call opens a connection to the specified host and port number and returns a socket. By wrapping the connectTo in a call to with, the socket is registered with the resource transformer. The with function has the following prototype:


  with :: Resource m
       => Base m a             -- Base monad for the current monad stack
       -> (a -> Base m ())     -- Resource de-allocation function
       -> ResourceT m (ReleaseKey, a)

When the resource is registered, the user must also supply a function that will destroy and release the resource. The with function returns a ReleaseKey for the resource and the resource itself. Formulating the with function this way makes it hard to misuse.

The other thing of interest is that because a telnet client needs to send data in both directions, the server-to-client communication path and the client-to-server communication run in separate GHC runtime threads. The thread is spawned using forkIO and even though the thread identifier is thrown away, the resource transformer still records it and will later call killThread to clean up the thread.

The core of the program is the two lines containing calls to sourceHandle and sinkHandle. The first of these pulls data from stdin and pushes it to the socket hsock, while the second pulls from the socket and pushes it to stdout.
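
The same combinators compose for files as well as handles; for instance, a minimal sketch of a file copy with Conduit (file names are placeholders) is a one-liner:


  import Control.Monad.Trans.Resource (runResourceT)
  import Data.Conduit
  import Data.Conduit.Binary (sourceFile, sinkFile)


  copy :: IO ()
  copy = runResourceT $ sourceFile "input.txt" $$ sinkFile "output.txt"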

It should be noted that the final two calls to release are not strictly necessary since the resource transformer will clean up these resources automatically.

The experience of writing this telnet client suggests that the Conduit library is certainly easier to use than the Enumerator or Iteratee libraries.

January 14, 2012 02:22 AM

January 01, 2012

Simon Horman

An Introduction to Open vSwitch

Today I gave a presentation covering the basics of Open vSwitch at Linux.Conf.Au in Ballarat, Australia. more...

January 01, 2012 06:51 AM

November 23, 2011

André Pang

Two new mixes

I've been pretty dormant in my music for the past few years, but I have been working on two mixes in my sparse spare time: "Tes Lyric" (/music/tes_lyric), a weird blend of electronica, classical and rock, and "Stage Superior" (/music/ss), a progressive house mix. They're up on my music page (/music) now; enjoy!

by ozone@algorithm.com.au at November 23, 2011 04:10 PM

November 21, 2011

Manuel Chakravarty

Seminar by John Hughes & Simon Peyton Jones @ UNSW

As part of YOW! 2011 (the Australian Software Developer Conference), Simon Peyton Jones and John Hughes will be in Sydney on the 7th and 8th of December 2011. On the evening of the 7th, they will appear at the YOW! Night Sydney.

On Thursday (8 December), John and Simon will give two presentations at the School of Computer Science and Engineering (CSE) of the University of New South Wales. The details are as follows.

Time: 8 December 2011, 10AM

Location: CSE Seminar room (K17_113), Level 1, CSE Building (K17)

 

Talk #1

Title: Accelerating race condition detection through procrastination

Speaker: John Hughes (Chalmers University & Quviq AB)

Abstract

Race conditions are notoriously frustrating to find, and good tools can help. The main difficulty is reliably provoking the race condition. In previous work we presented a randomising scheduler for Erlang that helps with this task.

In a language without pervasive shared mutable state, such as Erlang, performing scheduling decisions at random uncovers race conditions surprisingly well. However, it is not always enough. We describe a technique, procrastination, that aims to provoke race conditions more often than by random scheduling alone. It works by running the program and looking for pairs of events that might interfere, such as two message sends to the same process. Having found such a pair of events, we re-run the program but try to provoke a race condition by reversing the order of the two events.

We apply our technique to a piece of industrial Erlang code. Compared to random scheduling alone, procrastination allows us to find minimal failing test cases more reliably and more quickly.

John Hughes

John Hughes has worked with functional programming since 1980, was one of the designers of Haskell, and has been heavily involved with Erlang in recent years. In 2000 he and Koen Claessen published the first version of QuickCheck, a software testing tool which recently won the ACM SIGPLAN award for Most Influential Paper from ICFP in that year. He has focussed more and more on testing since then, co-founding Quviq AB in 2006 to market a commercial version of QuickCheck. He is currently both a Professor at Chalmers University, Sweden, and CEO of Quviq AB.

Talk #2

Title: Termination Combinators Forever

Speaker: Simon Peyton Jones (Microsoft Research)

Abstract

Nobody wants their compiler, or theorem prover, to go into an infinite loop, but making sure that never happens usually means applying some over-conservative heuristic like “never unfold a recursive function”. Approaches like that don’t work at all when you are doing partial evaluation or supercompilation, which fundamentally depend on inlining recursive functions. 

What we need is an online termination test. That is easier said than done; it is easy to make a mistake. In this talk I’ll describe a new combinator library that lets you build complex termination tests by combining simpler ones, while guaranteeing that the result really is a termination test. The library elegantly encapsulates some clever mathematical ideas on well-quasi orders; the cunning details are hidden from the customer.

I’ll use Haskell as the language of exposition, but you don’t need to be a Haskell guru to understand it, and the library would work equally well in other languages.
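
To give a flavour of the idea, here is a toy sketch of an online termination test based on a well-quasi order (this is only an illustration, not the actual API of the library the talk describes):


  -- A termination test keeps a history of the values observed so far and
  -- answers Stop when a new value is related, under the well-quasi order,
  -- to some earlier value.
  type WQO a = a -> a -> Bool

  data TestResult a = Continue [a] | Stop

  observe :: WQO a -> [a] -> a -> TestResult a
  observe wqo history x
      | any (`wqo` x) history = Stop          -- an earlier value embeds into x
      | otherwise             = Continue (x : history)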

Simon Peyton Jones

Simon Peyton Jones, MA, MBCS, CEng, graduated from Trinity College Cambridge in 1980. After two years in industry, he spent seven years as a lecturer at University College London, and nine years as a professor at Glasgow University, before moving to Microsoft Research (Cambridge) in 1998.

His main research interest is in functional programming languages, their implementation, and their application. He has led a succession of research projects focused around the design and implementation of production-quality functional-language systems for both uniprocessors and parallel machines. He was a key contributor to the design of the now-standard functional language Haskell, and is the lead designer of the widely-used Glasgow Haskell Compiler (GHC). He has written two textbooks about the implementation of functional languages.

More generally, he is interested in language design, rich type systems, software component architectures, compiler technology, code generation, runtime systems, virtual machines, and garbage collection. He is particularly motivated by the direct application of principled theory to practical language design and implementation — that’s one reason he loves functional programming so much.

Posted via email from PLS @ UNSW | Comment »

November 21, 2011 03:51 AM

November 02, 2011

Goanna

Goanna 2.6 Released

Goanna 2.6 is now available from the download page. In this release we have focused on the usability of our Goanna Studio for Eclipse offering, while at the same time increasing stability and flexibility in the face of the many different configurations that are possible within the Eclipse CDT environment. Here is a summary of what has changed in this release:

- All versions

  • Bounds checking for arrays of arbitrary dimension
  • Bounds checking for arrays within classes, structs and unions
  • Arrays of unspecified size are no longer considered to have size 0
  • Constant global variables are now modelled with a value that does not change
  • The constant “-1U” and others will now be modelled with an appropriately large value instead of -1
  • Check FPT-misuse no longer warns about function pointers that are the result of the ternary operator (?:)
  • Check ITR-uninit now works correctly for iterators that are initialized using operator=
  • Check RED-unused-param no longer warns for parameters that have the GNU attribute (unused)
  • Checks RED-cond-const-assign and EXP-cond-assign no longer consider “+=” and similar operators to be constant assignments
  • Non-system #include files are now included in the analysis of a file that includes them.

- Goanna Studio for Visual Studio

  • preprocessor macros within parentheses are expanded
  • macros in comments are not expanded

- Goanna Central

  • Cygwin support for Windows; use --compiler-sort=cygwin to create a Cygwin configuration
  • Other compiler sort added; use --compiler-sort=other for an empty configuration
  • Remove the dependencies on the hard to manage predefined_macro.txt files
  • Predefined macros are now stored in the Goanna resource files, which are generated during configuration

- Goanna Studio for Eclipse

  • Completely re-organised configuration
    • Per project configuration
    • File based (under a goanna directory in the project file system)
    • User editable, or use the Goanna Project Properties dialogs. (two way synchronisation)
  • Menu item “Run Goanna on Selected File(s)” now appears when right-clicking on folders, and will analyse all files found in the selected folder.
  • Cygwin toolchain support on Windows

by Mark Bradley at November 02, 2011 12:46 AM

October 09, 2011

Erik de Castro Lopo

Michael Man Ho Mak RIP.

On the same day that the computing world lost Steve Jobs, the company I work for lost its own star, founder and CEO of bCODE, Michael Mak.

I remember meeting Michael in late 2005 when I first came to interview for a job with bCODE. Michael impressed me immediately with his keen intelligence and his easy going personality. As I worked with him over the years, my respect grew. I came to trust him and the rest of the management team far more than I had ever trusted any other employer. I always felt that Michael saw the employees as an important part of the company and he wouldn't do anything to further the company at the expense of the employees. This was even true when he had to retrench a third of the workforce after the global financial crisis of 2008. I saw first hand how much distress this caused him and our COO.

When Michael moved the business side of the enterprise to the US, he would still make regular trips back to visit the Sydney office. During these visits three or four of us would go out to lunch and he would regale us with tales of people he met and deals that he was working on. These were exciting times and Michael was a great motivator.

The things I will remember about Michael are his enthusiasm, his integrity, his leadership and just being a great all-round guy.

My condolences to his family and his girlfriend Emily. No one will miss Michael as much as them.

Michael was 37 years old.

October 09, 2011 01:07 AM

September 03, 2011

Manuel Chakravarty

Video and slides of "Data Parallelism in Haskell" @ Brisbane FP Group

Video @ vimeo

The Brisbane FP Group (BFPG) kindly invited me to give a talk about our work on data parallel programming in Haskell. The talk motivates the use of functional programming for parallel, and in particular, data parallel programming and explains the difference between regular and irregular (or nested) data parallelism. It also discusses the Repa library (regular data parallelism for multicore CPUs), the embedded language Accelerate (regular data parallelism for GPUs), and Data Parallel Haskell (nested data parallelism for multicore CPUs).  The slides of the presentation are available in two formats: HTML5 slideshow and PDF. Thank you to everybody who attended. Special thanks go to OJ Reeves and Tony Morris for organising everything, to Rob Manthey for producing the video, to Mincom for the venue, and to Functional IO for the sponsorship.

Posted via email from Just Testing | Comment »

September 03, 2011 12:49 PM

August 17, 2011

Goanna

Goanna 2.5 released

Goanna 2.5 is now available from the download page. Here’s a summary of what’s changed in this release.

- All versions

  • stored filenames are now specified as relative paths, allowing the Goanna database to be used in multiple locations
  • better support for Unicode source code
  • the ARR-inv-index check now allows for unbounded subscripts

- Goanna Central

  • license server options can be specified on the command line
  • the language choice, C or C++, can be given on the command line

- Visual Studio

  • clicking on a warning in a project summary navigates to the relevant code
  • better support for Win64 targets
  • partial compatibility with Intel Parallel Studio (on VS2005 and VS2008, Goanna does not work with projects converted to use the Intel C++ compiler)

As usual, feedback is welcome.

by Paul at August 17, 2011 02:00 AM

August 15, 2011

Simon Horman

Horms Solutions is Hiring

I am looking for an intern to work on a project with me as a Linux Kernel Developer focused on networking. For details please see horms.net/employment/

August 15, 2011 07:41 AM

August 11, 2011

Manuel Chakravarty

Data Parallel Haskell and Repa for GHC 7.2.1

As an add-on to the just released GHC 7.2.1, Ben just uploaded the Data Parallel Haskell packages (version 0.5) to Hackage. This version is still largely a technology preview and not a production-ready system. Nevertheless, it is significantly more robust than previous versions, especially for programs with a statically fixed depth of nesting of parallelism. For further information on how to install and use Data Parallel Haskell (DPH), see the DPH documentation.
Ben simultaneously released the companion library Repa (version 2.1.1.6) for use with GHC 7.2.1. In contrast to the DPH libraries, which enable nested data parallelism, Repa implements parallel, shape-polymorphic, regular multi-dimensional arrays.  See this previous post for some more details.
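
For a flavour of the programming model, here is a minimal sketch (assuming the Repa 2 API that shipped with this release; the array contents are arbitrary):


  import qualified Data.Array.Repa as R
  import Data.Array.Repa (Z (..), (:.) (..), DIM1)


  doubled :: R.Array DIM1 Double
  doubled = R.force (R.map (* 2) xs)   -- force evaluates the array, in parallel
    where
      xs = R.fromList (Z :. 10) [0 .. 9]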

Posted via email from PLS @ UNSW | Comment »

August 11, 2011 02:17 PM

July 26, 2011

Goanna

Meet Goanna at the Symposium of Cyber Terrorism

Do you want to know more about software security? Learn about cyber defense? Meet some of the team behind the Goanna technology from NICTA at DSR 2011 in Singapore, August 3-5. We will be giving a talk at the Symposium on Cyber Terrorism titled “Cyber Security at Software Development Time” and highlighting some of our experiences from the NIST/SAMATE program.

by Ralf at July 26, 2011 05:28 AM

June 29, 2011

Goanna

Ranking and Scoring Vulnerabilities

Recently, MITRE introduced the Common Weakness Scoring System (CWSS) for classifying and ranking common vulnerabilities. This system has around 18 dimensions, comprising technical severity, business impact, authentication/security barriers and overall probability, to mention just a few. While this is certainly a detailed scoring system, it raises the common question: “Which of the detected issues should I fix first?”

This is not an easy question for any (automated) tool to answer. While certain classes of bugs, such as buffer overflows, are likely more severe than, let’s say, unused function parameters, it is not guaranteed that they have a larger impact. Sometimes a buffer overflow might only happen in a very unlikely scenario in some abandoned part of the code base, while the unused function parameter stems from a copy&paste mistake within the function, leading to an always wrong and potentially dangerous result. Similarly, it is difficult for any tool to tell which parts of the code base are more important than others.

Having said that, we developed Goanna Studio and Goanna Central with openness in mind. This means all our detected issues can be easily exported and post-processed by the end user (you can even query our internal SQLite database if you really want to), then filtered and ranked according to your own needs. Moreover, we provide a mapping of all issues to the common CWE criteria and we give you the following classification:

Goanna's Impact Guidance

This 2-dimensional classification is used for all issues detected by Goanna (see user manuals):

Severity: How serious is this issue typically?
Certainty: How confident are we that this will likely happen?

Both dimensions are based on our experience from having analyzed literally hundreds of millions of lines of code. Severity is ranked similarly to the above example, where a buffer overflow is deemed to be serious. Certainty, on the other hand, addresses a number of sub-dimensions: How likely is this, in our experience, to happen in real life? How certain is Goanna, i.e., does the analysis conclude the issue will appear on every program path or just on a few? How sensitive is the issue to input data, etc.? The combined dimensions should give you a good idea of where to spend your time first. And while we are looking into integrating a CWSS ranking, we believe that less is sometimes more.

by Ralf at June 29, 2011 11:58 PM

June 15, 2011

Goanna

Goanna Studio and Goanna Central 2.4

In the latest round of development we aimed for the most rock solid and sane implementation of C/C++ static analysis that we could. We did this by working on three fronts: consistency, speed and features.

As usual, all current users get the upgrade for free. If you were a trial user in the past and need a trial extension visit: http://redlizards.com/trial-extension.

What's new in 2.4?

General improvements:

  • We revamped the integer arithmetic interval analysis engine with a new algorithm which affords us increased speed and precision. So much speed that we have also added the ability to track pointer arithmetic as mentioned in the previous blog post.
  • Our interval analysis engine can handle new operations: Casts, Logical operators and shift operators.
  • We have enabled floating licenses in our Studio and Central products; please contact us if you are interested.

Goanna Central improvements:

  • The argument --parse-error=0 is now the default; this causes Goanna to exit with an exit code of 0 when a parse error is encountered. We have also revamped the way parse errors are handled.
  • The interprocedural analysis in the Goanna Central command line is now even simpler to use and manage: conveniently specify a user-defined folder with the argument --ipa=<project name>. All new database files are then stored in ~/.goanna_project_store/<project name>.

Goanna Studio improvements:

  • In Eclipse there is now an option to not run the tool chain compiler during analysis.
  • In Eclipse there is a new menu item to jump to the currently selected project's Goanna Studio Properties page.

Renamed checks for consistency:

  • ATH-div-0-aft-assign -> ATH-div-0-assign
  • ATH-div-0-aft-cmp -> ATH-div-0-cmp-aft
  • ATH-div-0-bef-cmp -> ATH-div-0-cmp-bef
  • ATH-div-0-param-unchk -> ATH-div-0-unchk-param
  • PTR-param-unchk -> PTR-unchk-param
  • PTR-param-unchk-some -> PTR-unchk-param-some
  • RED-const-assign-cond -> RED-cond-const-assign
  • RED-const-expr-cond -> RED-cond-const-expr
  • SPC-ret-stack -> MEM-stack
  • builtin_ctor_dtor_leak -> COP-ctor-dtor-leak

New checks:

  • ARR-inv-index-ptr – A pointer is assigned to an array, static or dynamic, and it is accessed with an index that is out of the array’s bounds.
  • ARR-inv-index-ptr-pos – A pointer is assigned to an array, static or dynamic, and it is accessed with an index that may be out of the array’s bounds.
  • ATH-overflow-cast – An expression is cast to a different type, resulting in an overflow or underflow of its value.
  • ATH-shift-neg – The left-hand side of a right shift operation may be a negative value.
  • COP-dtor-throw – An exception is thrown, or may be thrown, in a class’ destructor.
  • CPU-delete-throw – An exception is thrown, or may be thrown, in an overloaded delete or delete[] operator.
  • FPT-arith-address – Performing pointer arithmetic on the address of a function.
  • FPT-literal – Dereferencing a function pointer that refers to a literal address.
  • FPT-misuse – A function pointer is used in an invalid context.
  • ITR-end-cmp-aft – An iterator is used, then compared with end().
  • ITR-invalidated – An iterator is assigned to point into a container, but subsequent modifications to that container have possibly invalidated the iterator. The iterator is then used or dereferenced, which may be undefined behavior.
  • ITR-mismatch-alg – A pair of iterators passed to an STL algorithm function point to different containers.
  • ITR-store – A container’s begin() or end() iterator is stored and subsequently used.
  • MEM-malloc-diff-type – A call to malloc tries to allocate memory based on a sizeof operator, but the target type of the call is of a different type.
  • MEM-stack-ref – A stack object is returned from a function as a reference.
  • PTR-arith-field – Direct access to a field of a struct using an offset from the address of the struct.
  • PTR-arith-var – Invalid pointer arithmetic with an automatic variable that is neither an array nor a pointer.
  • RED-cond-var-always – The value of the variable used as a condition will always evaluate to non-zero or true. This means the condition will always be met.
  • RED-cond-var-never – The value of the variable used as a condition will always evaluate to zero or false. This means the condition will never be met.

by Mark Bradley at June 15, 2011 06:54 AM

June 03, 2011

Simon Horman

An Introduction to Open vSwitch

Yesterday afternoon I made a presentation introducing Open vSwitch at Linux Con Japan. This is an update on the presentation of the same title that I made at the Netfilter Workshop 2011.

slides

June 03, 2011 02:17 AM

June 01, 2011

Goanna

Bounds checking for aliased arrays

Over the last few months we’ve been hard at work expanding our interval analysis and writing new checks for array bounds checking. I am happy to announce that we can now detect out-of-bounds array accesses for pointers to arrays, both automatic and dynamic. We also now fully handle pointers that are offset into arrays, allowing us to track their values and detect some extremely hard-to-find bugs.

For example:

int arr[15];

int example(int random){
  int *p = arr;
  p += 3;

  int offset = (random > 10 ? random : 10);
  return p[offset + 4];    //'p' points to 4th element, and index will be at least 14.
}

In this code sample, the pointer ‘p’ is used to alias the global array of ints, ‘arr’. This pointer is then advanced by 3, making it point to the 4th element of the array. An int, ‘offset’, is set to some unknown value, but we know it is at least 10. Finally, the pointer is accessed with index (offset + 4). Since the pointer already refers to the 4th element, and we are trying to access an element at least 14 positions further along, Goanna will give the following warning:

8: warning: Goanna[ARR-inv-index-ptr] Array pointer `arr' is accessed with index [17,INF]
which is out of array bounds [0,14]

Similarly, we can perform the same analysis on dynamic arrays allocated with new or malloc. We also handle different syntactic forms of array access. The following example illustrates this, as well as our ability to warn for possible, as opposed to definite, index violations arising from a range of possible index values that may be out of bounds:

#include <stdlib.h>
#include <assert.h>

int example2(int random){
  int *p = malloc(5 * sizeof(int));  //'p' is a dynamic array of 5 ints
  int offset = (random ? 7 : 3);
  return *(p + offset);    //'offset' will be either 7 or 3.
}

Here, ‘p’ points to a dynamic array of ints, with 5 elements. The ‘offset’ is set to be either 7 or 3. When the pointer is explicitly dereferenced at index ‘offset’, we know that the index may be within the bounds of the array, but it may also be too large. Goanna will issue the following warning:

7: warning: Goanna[ARR-inv-index-ptr-pos] Array pointer `p' is accessed with index [3,7]
which may be out of array bounds [0,4]

Both these checks, along with some others we’ve been working on, will appear in the upcoming 2.4 release.

by Dominic Gurto at June 01, 2011 02:44 AM

May 20, 2011

Simon Horman

SH-Mobile ARM zboot presentation

This afternoon I made a short presentation at Japan Technical Jamboree 37 on the work that I have been doing to allow booting Linux directly on the SH-Mobile ARM platform. This is an update on the presentation on the same topic that I made in January at the MobileFOSS Miniconf, part of linux.conf.au Brisbane 2011.

slides, etc...

May 20, 2011 05:58 AM