I wrote and tested my own implementation of n-DES (an encryption algorithm) for my thesis. It supports NVIDIA CUDA, single-core CPU, and multi-core CPU. I don't think I can post the code here, but I can send a binary version. If you want to help with performance diagnostics on different cards, leave a comment.
Picture from Wikipedia.com