I have a strong background in data engineering, data analysis, and data assimilation. Throughout my career, I have become proficient in tools and technologies including dbt (data build tool), Python (including the Pandas library), the Jinja templating language, Google BigQuery, and Looker Studio.
In the field of data engineering, I have demonstrated my ability to create robust and scalable data pipelines using dbt, Python, and Jinja templating. I take pride in ensuring consistent and clean data for analysis, leveraging my skills in transforming and querying data effectively.
My data analysis work relies on Python for data manipulation, aggregation, calculations, and filtering. Whether conducting daily or weekly analyses, I derive meaningful insights by examining specific metrics and uncovering patterns in datasets.
An essential part of my work is integrating visualization and reporting tools to present analysis results. Using tools like Looker Studio, I have created interactive dashboards and visualizations that give stakeholders a user-friendly and informative experience. This reflects my strong communication skills and my ability to convey complex data analysis to a diverse audience.
My experience showcases my competence in data engineering, data analysis, and visualization. I am skilled at handling complex datasets, extracting valuable insights, and presenting analysis results in a clear and compelling manner.

Research Experience 🌍
Universidad del Norte, BAQ 080001, Colombia.
Lawrence Livermore National Laboratory, CA 94550, USA.
Argonne National Laboratory, IL 60439, USA.
Virginia Polytechnic Institute and State University, VA 24060, USA.

Faculty Experience 🛸
Universidad del Norte, BAQ 080001, Colombia.
Universidad del Norte, BAQ 080001, Colombia.
Universidad EAFIT, MDE 050001, Colombia.
Correlation-One (DS4A - Colombia), USA.
Universidad del Norte, BAQ 080001, Colombia.
Virginia Polytechnic Institute and State University, VA 24060, USA.

Industry Experience 🚀
07/2024 - Present. Staff Data Scientist
Snowflake.
OpenAI, to enrich data, refine survey questions, and apply relevant tagging.
Python, AWS, and cloud-based analytics platforms (GCP Looker Studio) for fast data integration and visualization.
Prophet, DeepAR, DARTs, RNNs), predicting sales based on insights from survey data.
Bash and Python scripts for faster execution across various scenarios.

11/2023 - Present. Senior Data Engineer
AWS S3 and Google Cloud Storage.
AWS Redshift and GCP BigQuery to support analytical insights from both structured and unstructured data.
AWS Glue, GCP Dataflow, and custom Python scripts, automating data ingestion, transformation, and integration.
scikit-learn and Keras.
Google Vision API to extract image features, enriching product catalogs and improving the performance of models like DeepFM, Apriori, and Graph Neural Networks.
FastAPI, with domain customization via GCP to create tailored endpoints for various applications.
dbt (Data Build Tool) to transform raw data from data warehouses into structured, production-ready databases.
GCP Compute Engine instances on demand to support custom database services and provide scalable resources for training machine learning and deep learning models.

07/2022 - 08/2024. Senior Data Engineer/Senior Data Scientist
Python, Keras, scikit-learn, Pandas, NumPy, and BigQuery ML.
Looker Studio and Jira, ensuring effective data-driven decision-making.
Google Maps API, to analyze routes and paths, enhancing data integration and analysis in transportation scenarios.
Folium, Dash, and Python for real-time data monitoring.
Docker, GitHub, Google Colab, and cloud-based services like Google Functions, Google Compute Engine, and BigQuery.

01/2024 - 06/2024. Senior Data Scientist
Snowflake to ingest and process data from AWS S3 buckets.
Snowflake using Streamlit.
Python and JavaScript for continuous training and deployment of machine learning models.
AWS S3.

05/2024 - 07/2024. AI Prompt Engineer
08/2023 - 11/2023 (Part-time). Senior Data Engineer
Azure Synapse.
Python, enabling efficient data-driven decision-making.

06/2022 - 08/2022. Python Developer
Python and Microsoft Azure Functions, streamlining reporting processes with SQL Server.
Azure DevOps.
Google Colab and GitHub.

10/2020 - 07/2022. Data Engineer & Data Science Trainer
Python-based solutions across diverse markets, managing complex datasets and cloud infrastructure using Docker, REST API, and AWS services (RDS, EC2, S3).
GitHub for code versioning and ensured robust and scalable deployment.

08/2005 - 12/2005. Intern
Oracle.
Microsoft Access.
SQL, and data processing tools.