Patrick Kamnang Wanko, PhD

Senior Data Scientist/Engineer

AI - Data Science
Machine Learning
Data Engineering
Cloud - AWS - Palantir
Computer Science
Experiences
  • goal and achievement: delivery of a pipeline automating the ingestion, processing and visualization of reinsurance data
    * Data ingestion from a relational database to a cloud environment
    * Setting up a data processing pipeline in a Databricks environment (Bronze and Silver layers)
    * Building data transformation pipelines in Palantir Foundry (Gold layer)
    * Implementation and integration of business KPIs
    * Building of activity monitoring / dashboard tools with Foundry
  • tools: Foundry (Workshop, Slate, Contour, ...), Databricks, dbt, PySpark, SQL, ...
  • environment: a squad of 4 developers; working based on Scrum
  • goal and achievement: develop, maintain and upgrade data science/LLM functionalities in two client-facing tools:
    * Development and integration of NLP/LLM/AI features in an AWS environment
    * Translation of customer needs into Data Science issues
    * Abstraction and modeling of business problems
    * Production of prototypes of functionalities/models
  • Some data science / AI issues addressed:
    * Automatic answering of questions by querying a large documentary database
    * Timeline of publication summaries related to user defined parameters (topic and location)
    * Document deduplication
    * Named entity recognition
    * Document classification (machine learning model)
  • tools: Python, LLMs (GPT, Llama, Mistral), ElasticSearch, PostgreSQL/OpenSearch, Spark, MapR, AWS, Pandas, Linux, Shell, Git, SageMaker, Hugging Face
  • environment: two squads of 6 developers each; working based on Scrum
  • Airbus (one year)
    * goal and achievement: delivery of a pipeline automating the ingestion, processing and visualization of aviation data: setting up a data processing pipeline in Palantir's Foundry environment; building data transformation pipelines in Palantir Foundry; implementing and integrating business KPIs
    * tools: Palantir Foundry, PySpark, Hive, Scala, Spark
    * environment: many squads of 3 developers; working based on Scrum
  • BNP Paribas (6 months)
    * goal and achievement: Development of a disaster clustering model using Natural Language Processing (NLP)
    * Data science issues addressed: Topic extraction, Text classification, Text translation
    * tools: Python, Scikit-Learn, NLP
    * environment: a squad of 2 data scientists
  • Sodexo (6 months)
    * goal and achievement: Construction of a data processing pipeline in a distributed environment. Construction of a restaurant attendance forecasting model in order to reduce waste
    * tools: Dataiku, PySpark, Azure, time series, Machine learning (LSTM, Prophet, SARIMA)
    * environment: 3 squads of 2 data scientists each; working based on Scrum
  • optimizing the time and space cost of computing Skyline queries in relational databases
  • estimation of the size of the query result
  • approximate calculation
  • identification of relationships (in particular functional dependencies) between columns
  • pre-computation, data structure
  • Multidimensional data analysis and correlation detection
  • tools: Java, C++, BigData
  • Student Mentoring
  • Student Project Evaluation
  • Data Scientist, Machine Learning Engineer, and Data Analyst Programs
  • Introduction to statistics with SPSS
  • Student assessment
Education

Engineer Statistician

Ecole Nationale de la Statistique et de l'Analyse de l'Information (ENSAI - Rennes)

September 2011 to November 2013
Data processing and analysis
Statistical Information System
Skills

Data Science

  • Data processing
  • Data analysis
  • Decision support models
  • Classification, Clustering
  • Machine Learning
  • Data Mining
  • AI, LLMs

Tools

  • SAS (certification)
  • R, SPSS, Matlab, Spad
  • Scikit-learn, TensorFlow, PyTorch, Keras, MLOps
  • Tableau, PowerBI
  • AWS (EC2, EBS, SageMaker, OpenSearch, ...)
  • GCP, Microsoft Azure, Dataiku, Palantir
  • Jupyter Notebook, JupyterLab, PyCharm
  • ElasticSearch

Computer Science

  • Python, Java, C++, C, VBA
  • HTML5, Javascript, PHP, CSS
  • Databases, SQL, NoSQL, PostgreSQL, MySQL, Oracle
  • Spark, PySpark, Hadoop, Scala
  • Docker, VMware, CI/CD
  • Bitbucket, GitLab, GitHub, Subversion
  • Code review

Languages

  • French
  • English

Management

  • Project Management
  • Agile, Scrum, Kanban
  • Jira, Confluence, Notion
Certifications

Palantir Technologies: Foundry & AIP Builder Foundations

2024

Palantir Technologies: Speedrun: Creating Your First Data Connection

2024

Palantir Technologies: Deep Dive: Creating Your First Ontology

2024

Palantir Technologies: Deep Dive: Building Your First Pipeline

2024

Convolutional Neural Networks

2019

Sequence Models

2019

Improving Deep Neural Networks: Hyperparameter tuning, Regularization and Optimization

2018

Structuring Machine Learning Projects

2018

Neural Networks and Deep Learning

2017

SAS Certified Base Programmer for SAS 9

2013
BP031276v9