About Me

Education: I recently graduated from Ecole Centrale, where I followed the Data Science and Digitalization option (S2D) on the Management Track, and I have two years of experience in finance. My background in mathematics and my analytical mindset have prepared me well for a career in data science.

Research Interests: I am currently studying data science, including machine learning and deep learning, and I am looking to collaborate on data science projects.

Publications: All of my projects are available on my GitHub account.

Projects

2023-04-10

Project I : Building a data pipeline based on methane emissions data from the World Bank API

In this project, I build a data pipeline in Python. The project consists of creating a REST API (FastAPI) with an endpoint that returns the methane emissions of a given country for a given year.

The project includes the following steps:
  1. Data retrieval: Using a Python script to retrieve methane emission data from the World Bank API with the WBGAPI module.
  2. Data estimation: Using several methods, such as rolling statistics, KNNImputer, and linear interpolation, to estimate missing methane emission data.
  3. Uncertainty computation: Using the bootstrap method to quantify the uncertainty of the estimated values.
  4. Scoring methodology: Evaluating countries on their methane emissions using several measures, such as total emissions, emissions per area, emissions per capita, and emission intensity.
  5. Data visualization: Creating interactive visualizations in Power BI to present the results.
  6. REST API: Using FastAPI to return the methane emissions of a country for a specific year.
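
Steps 2 and 3 can be illustrated with a minimal sketch in plain Python. It is not the project's code: the values are made up rather than taken from the World Bank API, the function names linear_interpolate and bootstrap_interval are hypothetical, and the bootstrap shown is a generic percentile interval for the mean of a series, which may differ from the procedure used in the project.

```python
import random
import statistics


def linear_interpolate(values):
    """Fill None gaps that lie between two known points by linear interpolation."""
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        span = right - left
        for i in range(left + 1, right):
            weight = (i - left) / span
            filled[i] = filled[left] + weight * (filled[right] - filled[left])
    return filled


def bootstrap_interval(sample, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `sample`."""
    rng = random.Random(seed)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(sample, k=len(sample)))
        for _ in range(n_resamples)
    )
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Made-up yearly values (assumption, not World Bank data); None marks a missing year.
series = [100.0, None, None, 130.0, 128.0, None, 140.0]
filled = linear_interpolate(series)
low, high = bootstrap_interval(filled)
print(filled)
print(f"95% bootstrap interval for the mean: [{low:.1f}, {high:.1f}]")
```

Gaps at the start or end of a series have no neighbor on one side, so this sketch leaves them as None; a different estimation method would be needed there.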

Here is the link to the code

2023-03-20

Project II : Premier League predictions

In this project, we build a machine learning model to predict the result of Premier League matches, using data from the last three seasons.

The project includes the following steps:
  1. Data retrieval: Using a Python script to scrape Premier League match data from this website here.
  2. Preprocessing: Using appropriate data types to reduce the memory footprint of the data, and cleaning the data.
  3. Modeling: Using a machine learning model, specifically RandomForestClassifier. I first trained the model, then predicted the results of future matches.
  4. First evaluation: Computing the precision score of the model's predictions.
  5. Improving the precision: Creating new features by computing rolling averages.
  6. Final evaluation: Computing the precision score again for the improved model.
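
The rolling-average features and the precision score from the steps above can be sketched in plain Python. This is an illustration under assumptions, not the project's code: the helper names rolling_mean_before and precision are hypothetical, the numbers are made up, and the three-match window is an arbitrary choice. The rolling mean uses only the matches before the current one, so a feature never contains the result it is meant to help predict.

```python
def rolling_mean_before(values, window=3):
    """Mean of the `window` values preceding each position.

    Positions without enough history get None.
    """
    out = []
    for i in range(len(values)):
        if i < window:
            out.append(None)
        else:
            past = values[i - window:i]  # excludes the current match
            out.append(sum(past) / window)
    return out


def precision(y_true, y_pred, positive=1):
    """Share of predicted positives that are truly positive."""
    truths_for_predicted = [t for t, p in zip(y_true, y_pred) if p == positive]
    if not truths_for_predicted:
        return 0.0  # no positive prediction was made
    hits = sum(1 for t in truths_for_predicted if t == positive)
    return hits / len(truths_for_predicted)


# Made-up goals scored by one team over six matches (assumption).
goals = [2, 0, 1, 3, 1, 2]
print(rolling_mean_before(goals))

# Made-up labels: 1 = win, 0 = draw or loss (assumption).
actual = [1, 0, 1, 1, 0]
predicted = [1, 1, 1, 0, 0]
print(precision(actual, predicted))
```

Precision answers the question "when the model predicts a win, how often is it right?", which is why it is computed only over the matches predicted as positive.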

Here is the link to the code

2022-09-05

Project III : Clustering football players using FIFA ratings data

The goal of this project is to find similarities between players based on their ratings. To do so, we implemented k-means clustering, an unsupervised machine learning algorithm that finds clusters in data.

The project includes the following steps:
  1. Data cleaning: Handling missing values by removing them.
  2. Feature selection: Selecting the 5 most important features to build the clusters.
  3. Normalization: Scaling the data to the range [0, 4].
  4. Modeling: Building the k-means algorithm from scratch.
  5. Comparison: Comparing the results with the KMeans class of the sklearn library.
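
The scaling and clustering steps above can be sketched in plain Python as follows. This is a minimal illustration, not the project's implementation: the function names scale and kmeans, the optional init parameter, the two made-up features, and the sample ratings are all assumptions.

```python
import random


def scale(column, low=0.0, high=4.0):
    """Min-max scale a list of numbers to the range [low, high]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [low for _ in column]  # constant column: nothing to spread
    return [low + (x - lo) * (high - low) / (hi - lo) for x in column]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, init=None, n_iter=100, seed=0):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    rng = random.Random(seed)
    centroids = list(init) if init is not None else rng.sample(points, k)
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(d) / len(members) for d in zip(*members))
    return labels, centroids


# Made-up ratings for six players on two features (assumption).
pace = [90, 88, 91, 45, 50, 48]
shooting = [85, 82, 80, 40, 42, 38]
points = list(zip(scale(pace), scale(shooting)))
labels, centroids = kmeans(points, k=2)
print(labels)
```

The result of k-means depends on the initial centroids, so a from-scratch version and a library version can number the clusters differently or, on harder data, find different partitions. Comparing the groupings rather than the raw label values avoids the first problem.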

Here is the link to the code

2020-09-01

Project IV : Building a convolutional neural network for image classification

The goal of this project is to build and train a convolutional neural network (CNN) that classifies photos of three football players (Iniesta, Neymar, and Messi).

The project includes the following steps:
  1. Preprocessing data: Preparing the images of the three football players as input for the model.
  2. Training: I fine-tuned the VGG16 model to classify images of the three football players.
  3. Building a web application: I built a simple Flask app to visualize the model's results.
  4. Result: In this video here, you can see that the model classifies the input images with fairly good accuracy.
  5. Future plans: The next step is to build more deep learning apps using NLP, RNN models, and so on.

Here is the link to the code

PS: Sorry to Real Madrid supporters 😁