useR!2017 has ended
Thursday, July 6 • 6:20pm - 6:25pm
Ultra-Fast Data Mining With The R-KDB+ Interface

Keywords: kdb+, big data mining, R-KDB+ Interface, business/industry, high-performance computing
Webpages: http://code.kx.com/wiki/Cookbook/IntegratingWithR
Commercial applications of ultra-low-latency techniques for data mining and machine learning have been ubiquitous in financial trading and related disciplines for many years. As early as 2005, algorithmic trading desks at hedge funds and large investment banks were relying on in-memory, columnar databases and map-reduce techniques to analyse millions of data points in milliseconds, long before such tools were used in other verticals. In particular, kdb+ and Q, a vector-based programming platform developed as a successor to APL (A Programming Language, developed in the 1950s/60s at Harvard by Ken Iverson), offer an unmatched ability to perform both simple and complex data manipulations at scale, at speeds orders of magnitude faster than contemporary Big Data platforms. A lesser-used but formidable capability, known mainly to R enthusiasts who are also kdb+ experts, is the R-KDB+ interface, which uses interprocess communication to share data between R and kdb+ processes, all from within the user's R console or Q console. In my nearly 12 years of using R, I, like many of my colleagues who have worked in financial trading environments, have found such capabilities indispensable, especially when working with large, often terabyte-scale datasets. The proposed talk covers the basics of using the R-KDB+ interface as a faster and more efficient way to extract aggregated data from TB-scale data warehouses prior to statistical analysis in R.

Thursday July 6, 2017 6:20pm - 6:25pm CEST
4.02 Wild Gallery