Another London Salesforce Developer meetup rolls around, this time with Paul Battisson speaking on Machine Learning with Apex. It was a topic I knew nothing about, hence the link above to the Wikipedia entry on it.
The meetup was again at capacity, with many folk on the waiting list; these meetups really are going from strength to strength. This outing of the group was hosted by BrightGen at their office in the stunning
Heron Tower / Salesforce Tower / 110 Bishopsgate venue, so many thanks go to them for hosting and the drinks and pizza.
Machine Learning with Apex – Paul Battisson
After enquiring about the audience’s previous exposure to machine learning (ML), Paul gave us a good overview of its background and a couple of the different strategies that can be adopted. He covered the basics of supervised and unsupervised learning. Some examples of real-world use, such as those at Google and Netflix (recommendation engines, etc.), were also mentioned.
Field of study that gives computers the ability to learn without being explicitly programmed
– Arthur Samuel on Machine Learning
And what was Paul’s motivation behind trying to implement machine learning with Apex? Well, he was told it couldn’t be done, of course. And that was it. I forgot to ask him if he’d actually used ML for a client, outside of demonstrations; it would be interesting to know if it’d be worth doing. Paul had worked on this with Jen Whyer, another Maven, and was sure to credit her.
Paul talked about how he and Jen chose the K-means clustering method to vector quantise a set of data; he ran through some example data and what this would give him. One of his examples was football penalties: analysing where the strikers hit the ball, and whether or not a goal was scored. The BBC had run this analysis on the 2014 World Cup penalties and came up with graphics like this:
You can see how this sort of analysis can really bring data to life, and be directly beneficial to those using it.
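To make the vector quantisation idea a little more concrete, here is a minimal sketch of the assignment step at the heart of K-means: each point is mapped to the index of its nearest centroid. It is written in Python rather than Apex, and it is not Paul and Jen's code; the shot positions, the starting centroids and the function names are all made up for illustration.

```python
import math

def nearest_centroid(point, centroids):
    """Return the index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))

def quantise(points, centroids):
    """Map every point to the index of its nearest centroid."""
    return [nearest_centroid(p, centroids) for p in points]

# Hypothetical penalty placements as (x, y) positions in the goal mouth, in metres.
shots = [(0.5, 0.3), (0.8, 0.4), (6.9, 0.2), (6.5, 0.5), (3.6, 2.1)]
# Hypothetical starting centroids: low left, low right, high centre.
centroids = [(1.0, 0.5), (6.5, 0.5), (3.66, 2.0)]

labels = quantise(shots, centroids)
print(labels)  # [0, 0, 1, 1, 2]
```

Each shot ends up labelled with a cluster number, which is the sort of grouping that a graphic of penalty placements could be built on.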
Paul then spoke on their implementation: how they’d used batch Apex to run through the (complex, or so I thought) maths in a loop that ended when the output was the same as the previous iteration. He talked through some techniques used to speed up the batch jobs:
- Chaining batches
- Speedier Loops – Paul has vlogged on how to make optimisations
- JSON (de)serialisation
- Keeping running totals to reduce duplication of calculations
- Using JS remoting to load the data sets as needed. They had kept each dataset in an attachment.
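As a rough illustration of two of the ideas above, the loop that stops when the output matches the previous iteration and the running totals, here is a small K-means sketch. It is in Python rather than batch Apex, it is my own guess at the general shape rather than the implementation Paul described, and the `k_means` name, its parameters and the `max_iterations` safety cap are assumptions.

```python
import math

def k_means(points, centroids, max_iterations=100):
    """Cluster 2D points, stopping once assignments match the previous pass."""
    centroids = list(centroids)
    labels = []
    previous = None
    for _ in range(max_iterations):
        # Running totals per cluster: [sum_x, sum_y, count]. They are filled in
        # during the assignment pass, so no second pass over the points is needed.
        totals = [[0.0, 0.0, 0] for _ in centroids]
        labels = []
        for x, y in points:
            i = min(range(len(centroids)),
                    key=lambda c: math.dist((x, y), centroids[c]))
            labels.append(i)
            totals[i][0] += x
            totals[i][1] += y
            totals[i][2] += 1
        if labels == previous:
            break  # same output as the previous iteration: converged
        previous = labels
        # Move each centroid to the mean of its points; keep it put if it has none.
        centroids = [
            (sx / n, sy / n) if n else centroids[i]
            for i, (sx, sy, n) in enumerate(totals)
        ]
    return labels, centroids

labels, final_centroids = k_means(
    [(0, 0), (0, 2), (10, 0), (10, 2)],  # made-up data
    [(1, 1), (9, 1)],                    # made-up starting centroids
)
```

With this made-up data the loop settles on its second pass, with the two left-hand points in one cluster and the two right-hand points in the other.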
He ended by sharing some other, nicely honest, thoughts: if you wanted to do this sort of thing, it might be better to do it with other tools, such as those from IBM (Watson), Google or Amazon. He hopes that the Salesforce tools will become better integrated, meaning that ML could be achieved using Wave, the Salesforce Analytics Cloud.
The slide-deck has been posted to the Meetup.com event page and is available at this link.
In all, I really enjoyed the meetup, and it was especially interesting hearing from someone with a lot of passion, about something I knew very little about. Of course, thanks must go to Anup and the organising team for once again enabling such an event to happen.
Here’s to the next one.