Parallel Support Vector Machines on a Hadoop Framework
Keywords: SVM Parameters, MapReduce, Hadoop, Parallel SVM

Abstract
The term "big data" refers to datasets too large to be processed with standard computing techniques. Hadoop allows such data to be stored and processed on clusters of commodity hardware, and MapReduce, its distributed programming model, splits large datasets into smaller chunks that can be processed in parallel. The Support Vector Machine (SVM) is a well-known and powerful classifier in machine learning; however, its high computational cost makes it unsuitable for large datasets. This paper presents a MapReduce-based SVM for large datasets and analyses the parallel SVM's performance with respect to the penalty and kernel parameters.











