Genetic Algorithm Approach Based Clustering

Niharika Srivastava
December 28, 2022
4 min read

Humans have always endeavored to stretch the limits given by nature whether they are physical limits or cognitive limits. Seeing the birds flying we invented airplanes. Taking inspiration from brain neural networks are designed which are revolutionizing many industries now. Among the many algorithms inspired from evolution like  Genetic Algorithms, Evolution Strategies, Estimation of Distribution Algorithms etc. Genetic algorithms have shown concrete results for many problems in optimization and search..; Clustering has many practical applications like  Retail Marketing,Streaming Services,and Sports Science etc. There are many clustering algorithms present. In this article we are going to see what exactly Genetic Algorithm is and how it works.

Why is yet another algorithm required for clustering?

Though there are many algorithms in clustering space it is a basic question whether we need one more? Although many of the traditional algorithms perform reasonably well, they do not seek global optimal solutions with respect to given clustering criteria. Genetic algorithms also cannot guarantee the global optimal solution but such techniques do possess the important advantage of at least trying to seek this objective at a solution-wide level of analysis.As a practical note GA based algorithms perform consistently better than other algorithms like k-means on unconstrained dataset.Also traditional algorithm needs to specify the value of k beforehand which requires domain knowledge. Genetic algorithms can find sub-optimal solutions for the number of clusters as well.

Clustering:-

Clustering is the task of dividing the population or data points into a number of groups such that data points in the same groups are more similar to other data points in the same group and dissimilar to the data points in other groups. It is basically a collection of objects on the basis of similarity and dissimilarity between them.

There are many clustering algorithms present. Some of them are classified as hard clustering, some of them are fuzzy clustering but they generally have some limitations or challenges such as they require some user input, for example: some clustering technique like k-means could require the number of clusters as an input from a user which can be difficult for a user to assume in advance etc. and also sometimes these techniques can get stuck at a local optima. To avoid these problems, genetic algorithms are generally useful.

Genetic Algorithm:-

Genetic algorithms are based upon the idea of genetics and evolution .It implements the idea of survival of the fittest using mathematics and taking help of tools like MATLAB or python.

Problems while implementing traditional Genetic Algorithm on clustering:-

The use of traditional encoding scheme which has constant length chromosomes usually cause problems like:-

  • Redundant Codification
  • Context insensitivity

To solve the problem Genetic Algorithm is tweaked in a specific way for clustering problems called Clustering Genetic Algorithm.

Clustering genetic algorithm:-

The flowchart of steps involved in Clustering genetic algorithms

Application and Results:-

  1. Ruspini data :-The simulation deals with the well known Ruspini data .There are 75 objects described by means of two attributes say x and y. This is a very simple problem but it shows that the CGA quickly found the right solution from a randomly generated initial data.

  2. Wisconsin's breast cancer data :- This dataset is available at UCI Machine Learning Repository. Each object has 9 attributes and a label of whether the tumor is benign or malignant. The CGA found two clusters in all simulations but clusterings were not always the same one. The average classification rate varied between 95.02% to 95.75%.

  3. The implicit parallelism is there in implementation of genetic algorithm. To leverage the parallelism GPU are great accelerators to increase compute speed as well as compute efficiency.

Are you looking for optimal Cloud GPU Solutions?

Check us out: https://www.e2enetworks.com/products

Connect with us to clear any query you may have: sales@e2enetworks.com