  • Porto Alegre, RS, Brazil
pehls/README.md

Hi there 👋

I am a Senior Data Scientist with a track record of driving data science initiatives, leading the renewal of Machine Learning structures, and implementing MLOps practices at various organizations, with achievements in drug replacement algorithms, Machine Learning models for authorization of medical claims, and NLP. Previously, I played a key role in establishing and leading a Data Science/BI area, overseeing the creation of a Data Warehouse using Python + Amazon Redshift + Power BI, and implementing BI processes. I have worked with ETL, data visualization, and ML model applications across various sectors. Explore more about my journey at my Profile!

Platforms & Tools

Experienced with Python, R and Java;

Performed data transformations with PySpark, Pandas, Dask, Oracle Data Integration/Data Flow, and AWS Glue;

Built data engineering pipelines in Databricks, Alteryx, and AWS Glue + Athena;

Started a Data Science area inside a software development company, beginning with just me and leaving with a team of 1 BI Developer, 2 Data Engineers, 1 Product Owner and 1 UI/UX Developer; designed a Data Warehouse for Power BI work and, once the data had matured, a Data Lakehouse structure; deployed multiple models to production, with strong performance in a streaming process built with PySpark;

Designed AI systems using Machine and Deep Learning, such as recommendation systems, healthcare audit, forecasts at different granularities, price elasticity development and analysis, key driver analysis using Machine Learning models and model interpretability, key driver analysis using Structural Equation Modelling, similarity detection between groups of data using clustering, and LLMs with vector databases (RAG) to deliver faster results from internal processes and documents using LangChain with different LLM models, plus Multi-Stage Reasoning for automated processes inside a pipeline, among other applications of Data Science, Machine Learning and Deep Learning (including LLMs!);

Delivered models to a model store in Amazon SageMaker, Databricks, Azure Machine Learning Studio and Oracle Data Science / Model Catalog;

Visualized data with Power BI, Tableau, Plotly/Seaborn in Python, and ggplot2 in R;

Followed DevOps and MLOps practices throughout, helping other developers as a Tech Lead and in senior positions, leading different projects, delivering Data Science and Machine Learning / AI systems aligned with business objectives, and leading discovery work with companies in different areas.

Studies

Currently studying the intersection of MLOps and LLMOps and how large language models are built: what is inside the black box of "binarized models" and APIs, and how transformers, attention and other Deep Learning and Feature Engineering structures turn text into numbers inside matrices and back into text, images, sounds, videos, code, etc.

Contact

Pinned

  1. descritor-de-ativos Public

    Project for the AI in Financial Market course from I2A2 - Data H, providing a report-style description of the current snapshot of an asset, whether via yfinance (traditional stock exchange) or crypto…

    Jupyter Notebook 4

  2. i2a2NaiveBayes Public

    Challenge of applying naive Bayes to a model that buys and sells stocks using PETR4 as a base, plus a risk management model based on the performance of indicators on the asset.

    Jupyter Notebook 1

  3. gp27_techchallenge_3 Public

    Tech Challenge of the Postgraduate in Data Analytics, from FIAP, developing a Data Warehouse with data from PNAD-COVID-19, from IBGE, using Pyspark and Google BigQuery for ETL, as well as an analys…

    Jupyter Notebook

  4. gp27_techchallenge_4 Public

    Tech Challenge of the Postgraduate in Data Analytics, from FIAP, analyzing Brent Oil price data, in comparison with historical, economic and societal data, integrating correlation and causality ana…

    Jupyter Notebook

  5. llm-deploy-locally Public

    First steps into the LangChain universe, using Llama 2 as the model and ChromaDB as the vector DB, working toward a local deployment of an LLM as a backend API

    Makefile

  6. mlops_structure Public

    Full MLOps structure for LLM/ML, from an MLOps study, built with Python.

    Python