María Aguilera García


Resume | LinkedIn | GitHub

Passionate about econometrics, big data, machine learning, coding, and data science.

MS degree in Business Analytics and Big Data (specialization in Applied Artificial Intelligence) at IE School of Human Sciences and Technology.



Portfolio

Machine Learning

🚵‍♀️ Kaggle Competition: Forest Cover Type Prediction🚵‍♀️

Open Notebook View on GitHub Run in Google Colab

* Completed a Machine Learning II course project and achieved the highest accuracy score among all competing teams.
* Analyzed, cleaned, pre-processed, and engineered features for the machine learning models.
* Received a grade of 10/10.

Skills: Python | Matplotlib | Seaborn | Plotly | Scikit-learn Pipelines | Grid Search | Hyperparameter Configuration | Data Visualization
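The pipeline and grid search workflow listed above could look roughly like the sketch below. It is illustrative only: the data is synthetic (generated with `make_classification` as a stand-in for the Kaggle dataset), and the random forest model, the step names, and the parameter grid are assumptions rather than the configuration used in the project.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the competition data (assumption: the real
# features and labels would come from the Kaggle dataset).
X, y = make_classification(
    n_samples=600, n_features=12, n_informative=6,
    n_classes=3, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Preprocessing and model chained into one estimator, so the scaler is
# refit on the training part of every cross-validation fold.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("model", RandomForestClassifier(random_state=0)),
])

# Hypothetical grid; the "model__" prefix routes each parameter to the
# pipeline step with that name.
param_grid = {
    "model__n_estimators": [50, 100],
    "model__max_depth": [5, None],
}

search = GridSearchCV(pipeline, param_grid, cv=3, scoring="accuracy")
search.fit(X_train, y_train)

best_params = search.best_params_
test_accuracy = search.score(X_test, y_test)
print(best_params, round(test_accuracy, 3))
```

Keeping the scaler inside the pipeline means the held-out fold never influences the preprocessing statistics during the search.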


🚀Kaggle Competition: Determining the Fate of Passengers in an Alternate Dimension 🚀

Open Notebook View on GitHub Run in Google Colab

* Explored feature relationships, cleaned the data, handled missing values, engineered features, and built modeling pipelines with informative visualizations.
* The data can be downloaded from [Kaggle](https://www.kaggle.com/competitions/spaceship-titanic).

***Skills***: Data Cleaning | Pipeline Development | GridSearch | Handling Missing Values | Data Exploration | Cross-validation
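As a rough illustration of handling missing values inside a cross-validated pipeline, the sketch below imputes with the column median before scaling and fitting. The data is randomly generated, and the median strategy, the logistic regression model, and the five folds are assumptions, not the choices made in the notebook.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Hypothetical numeric features and a binary target derived from them.
X = rng.normal(size=(200, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Blank out roughly 10% of the entries to simulate missing values.
missing_mask = rng.random(X.shape) < 0.1
X_missing = X.copy()
X_missing[missing_mask] = np.nan

# The imputer is a pipeline step, so medians are learned from the
# training part of each fold only.
pipeline = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
    ("model", LogisticRegression()),
])

scores = cross_val_score(pipeline, X_missing, y, cv=5, scoring="accuracy")
print(scores.round(3), round(scores.mean(), 3))
```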



🚲 Predicting the Number of Bicycle Users on an Hourly Basis 🚲

Open Notebook View on GitHub Run in Google Colab

* Completed a Python II group final project to predict the total number of Washington, D.C. bicycle users on an hourly basis.
* Conducted exploratory data analysis, data cleaning, and time-based cross-validation.

***Skills***: Data Visualization | Python
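Time-based cross-validation keeps every training fold strictly earlier than its test fold, so the model is never evaluated on hours that precede its training data. A minimal pure-Python sketch of an expanding-window splitter follows; the function name `expanding_window_splits` and the fold sizes are hypothetical, and the project may well have used a library splitter instead.

```python
def expanding_window_splits(n_samples, n_splits):
    """Yield (train_indices, test_indices) pairs in which the training
    window always ends right before the test fold begins."""
    fold_size = n_samples // (n_splits + 1)
    if fold_size == 0:
        raise ValueError("not enough samples for the requested splits")
    first_test_start = n_samples - n_splits * fold_size
    for k in range(n_splits):
        test_start = first_test_start + k * fold_size
        test_end = test_start + fold_size
        # Training data grows with each fold; test data slides forward.
        yield list(range(test_start)), list(range(test_start, test_end))


# Hypothetical example: 8 hourly observations in time order, 3 folds.
splits = list(expanding_window_splits(n_samples=8, n_splits=3))
for train_idx, test_idx in splits:
    print("train:", train_idx, "test:", test_idx)
```

The rows must already be sorted by timestamp for the index order to reflect time order.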




Social Network Analysis

☎ Instagram Graph Analysis and Community Detection Algorithms ☎

Open Notebook View on GitHub Run in Google Colab

* Used graph algorithms and GraphX to explore patterns and communities in the Instagram dataset.
* Identified the most influential members of the network as targets for advertising to increase sales.
* Because the dataset was too large to process in full, we ran exploratory data analysis to decide how to reduce it without turning it into a random network.

***Skills***: GraphX | Community Detection Algorithms
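One common way to rank influential members of a follower graph is PageRank. The source does not say which influence measure the project used, and GraphX itself is a Spark (Scala) API, so the following is only a small pure-Python sketch of the idea by power iteration. The account names and edges are invented for illustration.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over a directed edge list.

    Each edge (src, dst) is read as "src follows dst".
    """
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread evenly.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        new_rank = {
            node: (1 - damping) / n + damping * dangling / n for node in nodes
        }
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank


# Hypothetical follower graph.
edges = [
    ("ana", "carla"), ("ben", "carla"), ("dan", "carla"),
    ("carla", "ana"), ("dan", "ben"),
]
ranks = pagerank(edges)
most_influential = max(ranks, key=ranks.get)
print(most_influential, {k: round(v, 3) for k, v in ranks.items()})
```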




Reinforcement Learning

🌔 Lunar Landing Assignment 🌔

HuggingFace Run in Google Colab View on GitHub

* Our goal is to teach the Lunar Lander (our agent) to land its spaceship correctly between two flags (the landing pad).
* The more accurately the agent lands, the larger the final reward it can attain.
* At any moment the agent may choose one of four actions to achieve this objective: fire the left engine, fire the right engine, fire the main (downward) engine, or do nothing.

***Skills***: Reinforcement Learning | Game Theory | Hyperparameter Tuning
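The four-action space described above can be written down as a small enumeration. In the sketch below, the index order is assumed to follow the Gymnasium LunarLander convention, and the epsilon-greedy selection rule and the action values are purely illustrative; the source does not state which algorithm trained the agent.

```python
import random
from enum import IntEnum


class Action(IntEnum):
    # Index order assumed to match Gymnasium's LunarLander environment.
    DO_NOTHING = 0
    FIRE_LEFT_ENGINE = 1
    FIRE_MAIN_ENGINE = 2
    FIRE_RIGHT_ENGINE = 3


def epsilon_greedy(action_values, epsilon, rng):
    """Explore with probability epsilon, otherwise pick the action with
    the highest estimated value."""
    if rng.random() < epsilon:
        return Action(rng.randrange(len(Action)))
    best_index = max(range(len(action_values)), key=action_values.__getitem__)
    return Action(best_index)


rng = random.Random(0)
hypothetical_values = [0.1, -0.3, 0.8, 0.0]  # made-up value estimates
greedy_choice = epsilon_greedy(hypothetical_values, epsilon=0.0, rng=rng)
explored_choice = epsilon_greedy(hypothetical_values, epsilon=1.0, rng=rng)
print(greedy_choice.name, explored_choice.name)
```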



🚗Training AWS Car 🚗

Skills: Reinforcement Learning | Game Theory | Hyperparameter Tuning


import math

def reward_function(params):

  # Read input parameters
  track_width = params['track_width']
  distance_from_center = params['distance_from_center']

  # Width of the curve and shifted distance used by the Gaussian-shaped reward
  sigma = track_width * 2 / 15
  offset = distance_from_center + track_width / 10

  # Reward function as a Gauss curve over the variable distance_from_center
  reward = (1 / math.sqrt(2 * math.pi * sigma ** 2)) * math.exp(-(offset ** 2 / (4 * sigma) ** 2)) * (track_width * 2 / 3)

  return float(reward)

# - - - - -
# Fragment from the original listing; speed_reward, heading_reward and
# steering_reward are not defined in this excerpt:
#   return speed_reward + heading_reward + steering_reward
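To get a feel for how the Gaussian-shaped reward above falls off, the sketch below evaluates the same formula at a few distances from the center line. The track width of 1.0 and the sample distances are hypothetical values chosen only for illustration.

```python
import math


def reward_function(params):
    """Same formula as the listing above, with intermediate terms named."""
    track_width = params["track_width"]
    distance_from_center = params["distance_from_center"]

    sigma = track_width * 2 / 15
    offset = distance_from_center + track_width / 10
    reward = (
        1 / math.sqrt(2 * math.pi * sigma ** 2)
        * math.exp(-(offset ** 2) / (4 * sigma) ** 2)
        * (track_width * 2 / 3)
    )
    return float(reward)


track_width = 1.0  # hypothetical track width
fractions = (0.0, 0.25, 0.5)  # distance as a fraction of the track width
rewards = [
    reward_function(
        {"track_width": track_width, "distance_from_center": f * track_width}
    )
    for f in fractions
]
for fraction, reward in zip(fractions, rewards):
    print(fraction, round(reward, 3))
```

The offset term places the peak of the curve at `-track_width / 10`, so for non-negative distances the reward is highest on the center line and decreases toward the track edge.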


Major Projects

Corporate Data Breaches and Narrative Disclosures

Link

Skills: R (tidyverse, ggplot2, lubridate, PostgreSQL), Fuzzy Matching (Fuzzy-Lookup Add-in), Econometrics & Statistics (DID, Logit, Fixed Effects).



Master Thesis Project: Raw Material Forecasting of Industrias Duero

Skills: Python, Facebook Prophet, Time Series Analysis & Forecasting, XGBoost, CatBoost, Microsoft Power BI.



© 2020 Khanh Tran. Powered by Jekyll and the Minimal Theme.