Density-Based Spatial Clustering in Rain and Temperature

Problem Statement:

DBSCAN (Density-Based Spatial Clustering of Applications with Noise) is a clustering algorithm commonly used in machine learning and data mining. It is built on the idea of density-reachability: it identifies clusters as groups of closely packed points and separates them from points lying in sparser regions, which it treats as noise. In this project we apply DBSCAN to weather data containing temperature and rain measurements, grouping data points with similar temperature and rain values to expose patterns in the relationship between the two. This can help in understanding how temperature and rain vary in different parts of our country at different times of the year. DBSCAN is only one of many tools for analyzing weather data, and other methods may suit a specific research question better. The goal here is to find density-based clusters, that is, connected regions of high density.

System Requirements:

Processor: Intel Core i5 or equivalent (preferable)
RAM: 2 GB (4 GB preferable)
Operating System: Windows 7 to 10
IDE Used: Jupyter Notebook and Google Colab

System Design:

The machine learning algorithm used in this project is an unsupervised learning method called density-based clustering. The algorithm takes two inputs. The first is epsilon (ε), the radius of the density circle, that is, the maximum distance at which two points still count as neighbours. The second is the minimum number of points (min points). The region within radius ε of a given point is known as the ε-neighbourhood of that point. If the ε-neighbourhood of a point contains at least the minimum number of points, the point is called a core point. The algorithm thereby separates the data into high-density points, which form clusters, and low-density points, which are treated as noise. In this project we used a dataset of Bangladesh's weather with four features (year, month, temperature, and rain) and fitted the model to it.
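The core-point rule described above can be illustrated with a minimal sketch. The helper names `eps_neighborhood` and `is_core_point` and the toy points are hypothetical and not taken from the project; the sketch also assumes that a point counts as a member of its own ε-neighbourhood.

```python
import math

def eps_neighborhood(points, i, eps):
    """Indices of all points within distance eps of points[i] (i itself included)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def is_core_point(points, i, eps, min_pts):
    """A point is a core point when its eps-neighbourhood holds at least min_pts points."""
    return len(eps_neighborhood(points, i, eps)) >= min_pts

# Toy 2-D points, made up for illustration only (not real weather data).
points = [(0.0, 0.0), (0.0, 0.1), (0.1, 0.0), (0.1, 0.1), (5.0, 5.0)]
core_flags = [is_core_point(points, i, eps=0.2, min_pts=4) for i in range(len(points))]
print(core_flags)
```

The four tightly packed points each see four neighbours within ε and qualify as core points, while the isolated point sees only itself and does not.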

Implementation:

clusters = fit_predict(X,0.1,4)

Here the program passes the dataset X, eps = 0.1, and minPts = 4 to the fit_predict function.
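An eps value is only meaningful relative to the scale of the features, since temperature and rainfall are measured in different units. The README does not say how X was prepared, so the following is purely an assumption: one common option is to rescale every column to the range [0, 1] before clustering. The helper `min_max_scale` and the sample rows are hypothetical.

```python
def min_max_scale(rows):
    """Rescale every column of rows to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    scaled = []
    for row in rows:
        scaled.append([
            # A constant column has no spread, so map it to 0.0.
            (v - lo) / (hi - lo) if hi > lo else 0.0
            for v, lo, hi in zip(row, lows, highs)
        ])
    return scaled

# Made-up (temperature, rain) rows for illustration only.
raw = [[16.0, 0.0], [26.0, 250.0], [21.0, 500.0]]
scaled = min_max_scale(raw)
print(scaled)
```

After scaling, both features contribute comparably to the Euclidean distance, so a single eps applies to both.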

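from scipy.spatial import distance  # assumed source of distance.euclidean

def fit_predict(X,eps,minPts):
  # one label per row of X; 0 means "not yet assigned"
  clusters = [0]*X.shape[0]

  # dbscan fills in the labels in place
  dbscan(X,clusters,eps,minPts,metric=distance.euclidean)

  return clusters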

Here the dataset, the cluster labels, eps, the minimum number of points, and the distance metric are passed to the dbscan function, which updates the cluster labels in place.
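In this implementation the labels follow a simple convention that can be read off the code: 0 means not yet assigned, -1 means noise, and positive integers are cluster ids. A small sketch of how such a label list might be summarised is given here; the helper `summarize_labels` and the example labels are hypothetical and are not results from the project.

```python
from collections import Counter

def summarize_labels(clusters):
    """Count clusters, noise points and unassigned points in a label list."""
    counts = Counter(clusters)
    n_noise = counts.pop(-1, 0)      # -1 marks noise
    n_unassigned = counts.pop(0, 0)  # 0 marks points never labelled
    return {
        "n_clusters": len(counts),
        "n_noise": n_noise,
        "n_unassigned": n_unassigned,
        "sizes": dict(sorted(counts.items())),
    }

# Made-up labels for illustration only.
labels = [1, 1, 2, -1, 2, 1, -1]
summary = summarize_labels(labels)
print(summary)
```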

def dbscan(X,clusters,eps,minPts,metric=distance.euclidean):
  currentPoint=0    # id of the most recently created cluster

  for i in range(0,X.shape[0]):
    if clusters[i] != 0:    # already labelled as noise or as a cluster member
      continue

    neighbors = neighborsGen(X,i,eps,metric)  # points within eps of point i

    if len(neighbors) < minPts:       # not a core point: mark as noise for now
      clusters[i] = -1
    else:
      currentPoint += 1               # core point: start a new cluster
      expand(X,clusters,i,neighbors,currentPoint,eps,minPts,metric)    # grow the cluster through density-reachable core points

  return clusters
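The helpers neighborsGen and expand are called above but not shown in this README. The following is a minimal sketch of how they might look, not the project's actual code: it assumes a point is its own neighbour, uses plain Python lists with `math.dist` instead of NumPy and SciPy so that it runs on its own, and wraps the main loop in a hypothetical `dbscan_sketch` driver with made-up points.

```python
import math

def neighborsGen(X, i, eps, metric=math.dist):
    """Indices of all points within eps of X[i]; includes i itself (assumption)."""
    return [j for j in range(len(X)) if metric(X[i], X[j]) <= eps]

def expand(X, clusters, i, neighbors, clusterId, eps, minPts, metric=math.dist):
    """Grow cluster clusterId outward from core point i, updating clusters in place."""
    clusters[i] = clusterId
    queue = list(neighbors)
    k = 0
    while k < len(queue):
        j = queue[k]
        k += 1
        if clusters[j] == -1:
            # Previously marked as noise: it becomes a border point of this cluster.
            clusters[j] = clusterId
        elif clusters[j] == 0:
            clusters[j] = clusterId
            jNeighbors = neighborsGen(X, j, eps, metric)
            if len(jNeighbors) >= minPts:
                # j is itself a core point, so its neighbours are reachable too.
                queue.extend(jNeighbors)

def dbscan_sketch(X, eps, minPts):
    """Same control flow as the dbscan loop above, using the sketched helpers."""
    clusters = [0] * len(X)
    clusterId = 0
    for i in range(len(X)):
        if clusters[i] != 0:
            continue
        neighbors = neighborsGen(X, i, eps)
        if len(neighbors) < minPts:
            clusters[i] = -1
        else:
            clusterId += 1
            expand(X, clusters, i, neighbors, clusterId, eps, minPts)
    return clusters

# Two tight groups and one isolated point, made up for illustration only.
X = [(0.0, 0.0), (0.0, 0.1), (0.1, 0.0), (0.1, 0.1),
     (1.0, 1.0), (1.0, 1.1), (1.1, 1.0), (1.1, 1.1),
     (5.0, 5.0)]
labels = dbscan_sketch(X, eps=0.2, minPts=4)
print(labels)
```

In this sketch the queue may hold duplicate indices, but each point is expanded at most once because only points still labelled 0 trigger a new neighbour search.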

Testing Results:

#anik


ashiqurrahmananik/DBSCAN
