Posted May 15, 2026

Senior Data Scientist – International eKYC, Identity Graph

Job Description:
• Lead the design, development, and deployment of ML and graph‑based algorithms for international entity resolution, identity trust scoring, and anomaly detection across heterogeneous, country‑specific datasets.
• Architect reusable matching and linking frameworks that work across multiple ID schemes (e.g., national ID numbers, passports, voter IDs, mobile accounts, bank accounts) and local name/address conventions.
• Develop probabilistic and rule‑augmented models that handle noisy, sparse, or partially labeled international data while maintaining explainability and regulatory defensibility.
• Define and evolve the international extension of Socure’s identity graph: schema design, linkage strategies, quality tiers, and confidence scoring that can be leveraged by multiple products (Verify, KYC, watchlists, fraud).
• Design and implement robust data quality and monitoring frameworks for international identity data (coverage, stability, drift, regional bias, label quality) and integrate them into modeling and production monitoring workflows.
• Own experimentation strategy for major international eKYC initiatives:
  • Design offline evaluations and online A/B tests that reflect local ground truth constraints and data sparsity.
  • Define success metrics that balance approval rates, fraud capture, and regulatory/operational constraints per market.
  • Analyze lift, stability, and fairness trade‑offs and drive go/no‑go decisions with Product and Engineering.
• Contribute to model governance documentation and support responses to regulators and large enterprise customers regarding model logic, data provenance, fairness, and monitoring for international markets.

Requirements:
• Master’s or Ph.D. in Computer Science, Data Science, Machine Learning, Statistics, Mathematics, or a related field, or equivalent practical experience.
• 6+ years of hands‑on applied ML / data science experience (4+ with Ph.D.), including owning production models and pipelines in high‑stakes domains (fraud, risk, identity, payments, credit, or similar).
• Significant prior work on international or multi‑region products is strongly preferred (e.g., cross‑country KYC, credit risk, payments, or compliance systems).
• Expert‑level proficiency in Python and SQL, with extensive experience in distributed data processing (Spark/PySpark, Databricks or similar) on very large datasets.
• Deep experience designing, training, and deploying models for classification, ranking, anomaly detection, and/or graph learning, including:
  • Feature engineering for noisy/heterogeneous identity data.
  • Robust evaluation under label sparsity and feedback delays.
  • Calibration and thresholding tailored to regional risk and regulatory constraints.
• Proven expertise with graph technologies (e.g., Neo4j, AWS Neptune, GraphFrames, DGL, PyTorch Geometric) and graph algorithms (entity resolution, link prediction, community detection, label propagation) at scale.

Benefits:
• Offers Equity
• Offers Bonus