Previously in this series, we have looked at decision trees and random forests, two types of supervised learning algorithms. Supervised algorithms are trained with data that provides input vectors as well as their corresponding target vectors, or the output that is expected after the data is processed. Unsupervised models, on the other hand, are trained using data... Continue Reading →
Machine Learning Algorithms Explained – Random Forests
Random Forests are supervised ensemble-learning models used for classification and regression. Ensemble learning models aggregate multiple machine learning models, allowing for overall better performance. The logic behind this is that each of the models used is weak when employed on its own, but strong when put together in an ensemble. In the case of Random... Continue Reading →
Fraud Detection by Stacking Cost-Sensitive Decision Trees
Recently, we published a research paper showing how it is possible to detect fraudulent credit card transactions with a high level of accuracy and a low number of false positives. By using ensembles of cost-sensitive decision trees, we can save up to 73 percent of losses stemming from fraud. Here’s how. Classification, in the context... Continue Reading →
Machine Learning Algorithms Explained – Decision Trees
A Decision Tree is a supervised predictive model that can learn to predict discrete or continuous outputs by answering a set of simple questions based on the values of the input features it receives. To show how a decision tree works, we will use a real-world dataset to illustrate the concept. This... Continue Reading →
From Real-Time Learning to Reinforcement Learning with Asynchronous Feedback
Online, or real-time, transactional fraud detection systems have recently created quite the buzz in the info security industry. They are an appealing concept: Because we know that fraud patterns change over time, the ability to use machine-learning algorithms to automatically learn new patterns instantly allows us to have a stronger defense system. We often find... Continue Reading →
Building AI Applications Using Deep Learning
Recently, we have seen a huge boom around the field of deep learning; it is currently being implemented in a wide variety of fields, from driverless cars to product recommendation. In their most primitive form, deep learning algorithms originated in the 1960s. If the concept has been around for decades, why is it that widespread... Continue Reading →
Classifying Phishing URLs Using Recurrent Neural Networks
In a recent research paper, we showed how we are able to detect with a high level of accuracy if a website is a phish just by looking at the URL. This post lays out in greater detail how, by using a deep recurrent neural network, we’re able to accurately classify more than 98 percent... Continue Reading →
Machine Learning Explained
Machine learning models are often dismissed for their lack of interpretability. There is a popular story about modern algorithms that goes as follows: Simple linear statistical models such as logistic regression yield interpretable models. On the other hand, advanced models such as random forests or deep neural networks are black boxes, meaning... Continue Reading →
TDWI: 5 Minutes with a Data Scientist: Alejandro Correa Bahnsen of Easy Solutions
Lead data scientist Alejandro Correa Bahnsen develops machine learning algorithms for fraud detection. He described for Upside the basic skills and personality traits he believes are necessary to succeed in data science. [Read More]
Benefits of Anomaly Detection Using Isolation Forests
One of the newest techniques to detect anomalies is called Isolation Forests. The algorithm is based on the fact that anomalies are data points that are few and different. As a result of these properties, anomalies are susceptible to a mechanism called isolation. This method is highly useful and is fundamentally different from all existing... Continue Reading →
The Technical Side of Phishing and How to Prevent It
Phishing is the act of defrauding an online user by tricking them into clicking on a malicious link in order to obtain their personal information, all while posing as a trustworthy institution or entity. That is why users have a hard time differentiating between a legitimate and a malicious site. Although one might think the... Continue Reading →
Applying Data Science to Fraud Prevention
Eighty thousand Kindle users. Sixty-five million Tumblr users. What do they have in common? Both groups had their login credentials breached, courtesy of hackers. While these attacks didn’t directly target financial accounts, the information contained in these breaches is likely being sold on the Dark Web and being used to build a larger profile that will... Continue Reading →