Who are we?
We leverage massive-scale machine learning and our global network of data to predict fraudulent behavior with unparalleled accuracy. Having seen strong traction from customers like Airbnb, Yelp, OpenTable, Twitter, and hundreds more, we are rapidly expanding in all departments! Launched and run by ex-Googlers and top executive staff with diverse backgrounds, we are looking to add very strong talent to the team. Our investors include First Round, Union Square Ventures, Spark Capital, and Insight Venture Partners.
What are we working on?
The 20+ members of our engineering team are building:
- Highly Scalable, Distributed Microservices that can handle hundreds of millions of events per day
- Machine Learning Pipeline leveraging Kafka & Spark
- 100+ Node HBase & ElasticSearch Cluster that stores Petabytes of global fraud data
- Distributed Workflow Systems that automate our key customers’ business processes
- Highly interactive UI, built with React.js, ES6, and D3.js, used by thousands of fraud analysts around the globe
Who are you?
You are an experienced engineer who has built scalable machine learning data pipelines & systems. You are as comfortable running small experiments on your laptop using R as you are running Spark or MapReduce jobs on remote clusters. You understand that while an RNN might be a newer algorithm than Naive Bayes, more & better data usually trumps using the latest & best algorithm. You are familiar not only with how ML algorithms work but also with how to help build the infrastructure where they run. Finally, you have been looking for an opportunity where you can leverage your passion and skills for good (in Sift Science’s case, helping to eliminate fraud on the internet).
What would make you a good fit?
- 3+ years working with large datasets using Spark / MapReduce / etc.
- 3+ years working with data on HDFS / HBase / Cassandra / etc.
- 3+ years building backend systems using Java / Scala / Python
- Deep understanding of statistical modeling / machine learning / data mining concepts, and a track record of solving problems with these methods
- Strong familiarity with setting up, managing, and deploying to the cloud (e.g. AWS)
- Solid understanding of relational database modeling and design, including experience building data-intensive applications
- Strong communication & collaboration skills, and a belief that team output is more important than individual output
- Familiarity with one or more machine learning or statistical tools such as R, MATLAB, or similar libraries for other programming languages
- Experience participating in an on-call rotation
- Advanced degree in Statistics, Machine Learning, Computer Science, Electrical Engineering, Applied Mathematics, Operations Research, or a related field