Fincredo/yandex-projects-eng

Projects completed as part of the Data Science course at Yandex.Practicum:

| Project | Project goal | Module, tools |
| --- | --- | --- |
| 1. 🎵 Analysis of musical preferences | Use data from an online music service to compare the behavior of users in Moscow and St. Petersburg: how user activity depends on the day of the week, and which music genres users prefer. | Basic Python (pandas) |
| 2. 🏦 Research of borrowers’ creditworthiness | Determine how a borrower’s marital status, number of children, income level, and loan purpose affect the timely repayment of bank loans. | Data preparation (pandas) |
| 3. 🏠 Researching apartment listings | Use data from an online real estate service to identify which property parameters affect the market value of real estate in St. Petersburg and neighboring communities. | Exploratory data analysis (pandas, matplotlib) |
| 4. 📱 Determining a promising tariff for a telecom company | Analyze two tariffs of a mobile operator on a small sample of customers and conclude which tariff is more promising. | Statistical data analysis (pandas, numpy, matplotlib, seaborn, scipy, sympy) |
| 5. 🎮 Identifying patterns of computer game success | Use data on game sales, genres, and platforms, as well as user and expert ratings, to identify the patterns that determine a game’s success. | Integrated project 1 (pandas, numpy, matplotlib, seaborn, scipy, difflib, re) |
| 6. 📱 Recommendation of telecom company tariffs | Build a classification model that selects which of two new tariffs to offer to the mobile operator’s customers, based on the behavior of subscribers who have already switched to them. | Introduction to machine learning (pandas, seaborn, sklearn) |
| 7. 🏦 Forecasting of bank customer outflow | Use historical data on customer behavior and contract terminations to build a model that predicts whether a customer will leave the bank in the near future. | Supervised learning (pandas, numpy, seaborn, matplotlib, sklearn, re, random) |
| 8. ⛲ Choosing a location for the well | Use data on oil samples from fields in three regions to build a model that predicts the volume of oil reserves in new wells, and determine the region where oil production will bring the most profit. | Machine learning in business (pandas, seaborn, numpy, scipy, sklearn; bootstrap) |
| 9. 🧈 Recovery of gold from ore | Prepare a prototype machine learning model that predicts the rate of gold recovery from gold-bearing ore, based on data on gold mining and refining parameters. | Integrated project 2 (pandas, numpy, seaborn, matplotlib, sklearn, math) |
| 10. 🔠 Protection of customers’ personal data | Develop a data transformation method that protects the insurance company’s customer information by making personal information difficult to restore. | Linear algebra (pandas, matplotlib, seaborn, sklearn) |
| 11. 🚗 Determining the value of cars | Use historical data on used car sales to build a model that determines a car’s market value. | Numerical methods (pandas, numpy, seaborn, matplotlib, re, sklearn, catboost, lightgbm) |
| 12. 🚖 Car order forecasting | Use data on car orders at airports over a certain period to build a model that predicts the number of orders for the following hour. | Time series (pandas, matplotlib, numpy, sklearn, lightgbm, catboost) |
| 13. 💬 Moderation of toxic comments | Use a dataset from an online store containing user edits of product descriptions to train a model that determines the toxicity of comments and classifies them as positive or negative. | Machine learning for texts (pandas, numpy, matplotlib, re, nltk, sklearn, catboost, lightgbm) |
| 14. 👩 Determining customer age | Build and train a convolutional neural network on a set of photos of people labeled with their age, so that it can estimate the approximate age of a supermarket chain customer from a photo. | Computer vision (pandas, seaborn, matplotlib, keras) |
| 15. 📱 Forecasting of telecom company customer outflow | Use personal data on some of the telecom operator’s customers, along with information about their tariffs and contracts, to build a model that predicts customer outflow so that it can be prevented in time. | Final project (pandas, numpy, seaborn, matplotlib, sklearn, catboost, lightgbm; OHE) |