FP & Scala: The Killer Combo for Machine Learning • Marek Kolodziej • YOW! 2015

Published: 31 August 2024
Channel: GOTO Conferences

This presentation was recorded at YOW! 2015. #GOTOcon #YOW
https://yowcon.com

Marek Kolodziej - Principal Research Engineer at Nitro, Inc. @mkolod

RESOURCES
https://x.com/marekinfo
https://github.com/mkolod
  / marek-kolodziej  

ABSTRACT
While FP and Scala have already become the mainstays of middleware, web development and big data stacks (#Akka, Play, Kafka, Spark), they tend not to have a big presence in the machine learning and NLP communities. For instance, the emerging deep learning toolkits are mostly Python‐based (#Pylearn2, Theano, etc.). The same goes for general-purpose machine learning (Python's scikit-learn, countless R libraries). Performance seekers dissatisfied with slow scripting languages write typed Cython code, contorted C++ libraries bound to scripting language wrappers, or resort to random exotic solutions such as Lua. Some even dispense with all abstraction and write incomprehensible CUDA kernels. There has to be a better way.

As a machine learning engineer, I want to write strongly typed functional code. Math has no place for side effects, and I don't want to waste time running a simulation for hours, only to find that I made a typo in my "stringly-typed" script. Unbeknownst to most, Scala's machine learning and NLP ecosystem is growing rapidly, from numeric processing (Spire, Breeze) to big data machine learning (MLlib, Mahout), GPU‐based text parsing (Puck), and general‐purpose probabilistic programming (FACTORIE).

In this talk, I'll give a quick overview of Scala's machine learning ecosystem and show how easy it is to re-use existing components to build a new, scalable algorithm implementation. If you'd like to see how to write vectorized linear regression that runs native BLAS code, is based on an SGD/Adagrad implementation written from scratch, and can run at scale on petabytes of data using Spark, this talk is for you.
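
For readers who want a feel for the SGD/Adagrad idea mentioned above, here is a minimal sketch in plain Scala. It is not the code from the talk: it uses immutable standard-library Vectors instead of Breeze/BLAS, runs on a single machine rather than Spark, and all names (AdagradLinearRegression, State, step, fit) and default hyperparameters are assumptions chosen for illustration.

```scala
object AdagradLinearRegression {

  // Weights plus the running per-coordinate sum of squared gradients.
  final case class State(w: Vector[Double], gradSqSum: Vector[Double])

  // Dot product of two equally sized vectors.
  def dot(a: Vector[Double], b: Vector[Double]): Double =
    a.zip(b).map { case (p, q) => p * q }.sum

  // One pure Adagrad update for a single example (x, y) under squared loss.
  def step(s: State, x: Vector[Double], y: Double, lr: Double, eps: Double): State = {
    val err = dot(s.w, x) - y
    val grad = x.map(_ * err) // gradient of 0.5 * err^2 with respect to w
    val gradSqSum = s.gradSqSum.zip(grad).map { case (acc, g) => acc + g * g }
    // Each coordinate gets its own step size: lr / sqrt(accumulated squared gradient).
    val w = s.w.indices.map { j =>
      s.w(j) - lr * grad(j) / (math.sqrt(gradSqSum(j)) + eps)
    }.toVector
    State(w, gradSqSum)
  }

  // Folds `step` over the data for a fixed number of epochs; no mutable state.
  def fit(
      data: Seq[(Vector[Double], Double)],
      lr: Double = 0.5,
      epochs: Int = 500,
      eps: Double = 1e-8
  ): Vector[Double] = {
    val dim = data.head._1.length
    val init = State(Vector.fill(dim)(0.0), Vector.fill(dim)(0.0))
    (0 until epochs).foldLeft(init) { (s, _) =>
      data.foldLeft(s) { case (acc, (x, y)) => step(acc, x, y, lr, eps) }
    }.w
  }
}
```

As a usage example, fitting noise-free points generated from y = 1 + 2x (with a constant 1.0 as the first feature to act as the intercept) should drive the weights toward (1.0, 2.0):

```scala
val data = Seq(-1.0, -0.5, 0.0, 0.5, 1.0).map(x => (Vector(1.0, x), 1.0 + 2.0 * x))
val w = AdagradLinearRegression.fit(data)
```

Because step is a pure function from one State to the next, the same update could in principle be swapped onto a vectorized backend or applied per partition in a distributed setting, which is the kind of re-use the abstract describes.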

RECOMMENDED BOOKS
Martin Odersky • Programming in Scala 5th Edition • https://amzn.to/44rXiaM
Joshua D. Suereth • Scala in Depth • https://amzn.to/3QVADk6

#FunctionalProgramming #ApacheSpark #ApacheFlink #MachineLearning #ScalaLang #SoftwareEngineering #Programming #MarekKolodziej #YOWcon

CHANNEL MEMBERSHIP BONUS
Join this channel to get early access to videos & other perks:
   / @goto-  

Looking for a unique learning experience?
Attend the next GOTO conference near you! Get your ticket at https://gotopia.tech
Sign up for updates and specials at https://gotopia.tech/newsletter

SUBSCRIBE TO OUR CHANNEL - new videos posted almost daily.
https://www.youtube.com/user/GotoConf...