
In this repository, we present the first travel dataset generator on GitHub.

The generated dataset is a good base for training data mining models, including, but not limited to, supervised learning (e.g., classification, regression) and unsupervised learning (e.g., clustering).

This generator produces flight and hotel data. Everything is randomly generated, including business users, hotels, and trips. See python/ to understand the parameters.


Run the dataset generator: python/


Customizing the probabilities:

#- Companies and Users
#-- Types of gender
defGenders = [str, str, ...]
#-- Ages of users 
defAgesInterval = {'min': int, 'max': int}
#-- Number of flights per user
defFlightsInterval = {'min': int, 'max': int}
#-- Companies
#--- Number of users per company
defCompanies = {
    'ABC': {'usersCount': int},
    'DEC': {'usersCount': int}, ...
}
#-- Number of places per company
defCompaniesPlacesInterval = {'min': int, 'max': int}

#- Flight Agencies
#-- Types of flight
#--- Weight of price by type
defFlightTypes = {
    'economic': {'price': float},
    'premium': {'price': float}, ...
}
#-- Names of agency
defAgenciesName = [str, str, ...]

#- Places
#-- Names of place
defPlacesName = [str, str, ...]
#-- Distances between cities
defDistancesInterval = {'min': float, 'max': float}
#-- Plane speed (km/hour)
defPlaceTravelKmPerHour = float

#-- Lodges (Accommodation)
#--- Number of lodges per place
defLodgesInterval = {'min': int, 'max': int}
#--- Prices of lodges
defLodgesPrices = {'min': float, 'max': float}

#- Travels
#-- Number of days per trip
defTravelsDays = {'min': int, 'max': int}
#-- Flights prices
defTravelsFlightPrices = {'init': float, 'interval': float}
#-- Probability of a flight with hotel
defTravelWithLodge = float  # in the range [0, 1]
#-- Dates of the trips
defTravelDate = {'init': datetime, 'interval':{'min': int, 'max': int}}
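
To make the placeholder types above concrete, here is a minimal sketch of how a few of these parameters might be filled in and sampled. All values are hypothetical, and the helpers `sample_int` and `sample_float` are illustrative names that do not come from the repository. Uniform sampling is an assumption; the generator in python/ may draw values differently.

```python
import random
from datetime import datetime, timedelta

# Hypothetical example values; the real defaults live in python/.
defAgesInterval = {'min': 21, 'max': 65}
defFlightsInterval = {'min': 1, 'max': 10}
defLodgesPrices = {'min': 60.0, 'max': 300.0}
defTravelDate = {'init': datetime(2019, 9, 26),
                 'interval': {'min': 1, 'max': 30}}


def sample_int(interval, rng):
    """Draw an integer uniformly from an inclusive {'min', 'max'} interval."""
    return rng.randint(interval['min'], interval['max'])


def sample_float(interval, rng):
    """Draw a float uniformly from a {'min', 'max'} interval."""
    return rng.uniform(interval['min'], interval['max'])


# A seeded generator keeps the sketch reproducible.
rng = random.Random(42)

age = sample_int(defAgesInterval, rng)
flights = sample_int(defFlightsInterval, rng)
lodge_price = sample_float(defLodgesPrices, rng)

# Assumption: 'interval' is a number of days to add to the 'init' date.
days_offset = sample_int(defTravelDate['interval'], rng)
travel_date = defTravelDate['init'] + timedelta(days=days_offset)

print(age, flights, round(lodge_price, 2), travel_date.date())
```

Every `{'min': ..., 'max': ...}` parameter in the list above could be sampled the same way.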

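The distance, speed, and lodging parameters can be combined as in the rough sketch below. It assumes that flight duration is distance divided by `defPlaceTravelKmPerHour`, and that `defTravelWithLodge` is the chance that a trip includes a hotel stay. The values and the helper names `flight_hours` and `with_lodge` are hypothetical; the actual logic in python/ may differ.

```python
import random

# Hypothetical example values; the real defaults live in python/.
defDistancesInterval = {'min': 100.0, 'max': 1000.0}  # km
defPlaceTravelKmPerHour = 800.0                        # km/hour
defTravelWithLodge = 0.5                               # probability in [0, 1]


def flight_hours(distance_km, km_per_hour):
    """Flight duration in hours, assuming constant speed."""
    return distance_km / km_per_hour


def with_lodge(probability, rng):
    """Return True with the given probability (a Bernoulli draw)."""
    return rng.random() < probability


rng = random.Random(0)

distance = rng.uniform(defDistancesInterval['min'], defDistancesInterval['max'])
hours = flight_hours(distance, defPlaceTravelKmPerHour)
books_hotel = with_lodge(defTravelWithLodge, rng)

print(f"{distance:.1f} km, {hours:.2f} h, hotel: {books_hotel}")
```

Setting `defTravelWithLodge` to 0 would produce flights only, and setting it to 1 would attach a hotel stay to every trip.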

Step-by-step overview of the generator:


This generator is available to researchers and data scientists under the Creative Commons BY license. If you publish or publicly use the generator, or any dataset derived from it, please acknowledge its creators by citing us.

Also look ~

