dc.contributor.author |
Mokoatle, Mpho |
|
dc.contributor.author |
Coleman, Toshka |
|
dc.contributor.author |
Mokilane, Paul M |
|
dc.date.accessioned |
2024-01-04T06:42:46Z |
|
dc.date.available |
2024-01-04T06:42:46Z |
|
dc.date.issued |
2023-12 |
|
dc.identifier.citation |
Mokoatle, M., Coleman, T. & Mokilane, P.M. 2023. A comparative study of over-sampling techniques as applied to seismic events. http://hdl.handle.net/10204/13445 . |
en_ZA |
dc.identifier.isbn |
978-3-031-49001-9 |
|
dc.identifier.issn |
1865-0937 |
|
dc.identifier.uri |
https://doi.org/10.1007/978-3-031-49002-6 |
|
dc.identifier.uri |
http://hdl.handle.net/10204/13445 |
|
dc.description.abstract |
The likelihood that an earthquake will occur in a specific location, within a specific time frame, and with ground motion intensity greater than a specific threshold is known as a seismic hazard. Predicting these hazards is crucial because it can enable early warnings, which can lessen the negative effects. Research is currently being conducted in the field of machine learning to predict seismic events based on previously recorded incidents. However, because these events happen so infrequently, they present a class imbalance problem for machine learning and deep learning models. This study therefore compares the performance of popular over-sampling techniques that seek to even out class imbalance in seismic events data. Specifically, this work applied SMOTE, SMOTENC, SMOTEN, BorderlineSMOTE, SVMSMOTE, and ADASYN to an open-source Seismic Bumps dataset, then trained several machine learning classifiers with stratified K-fold cross-validation for seismic hazard detection. The SVMSMOTE algorithm was the best over-sampling method, as it produced classifiers with the highest overall accuracy, F1 score, recall, and precision, each reaching 100%, whereas the ADASYN over-sampling method showed the lowest performance on every reported metric across all models. To our knowledge, no prior research has compared the effectiveness of these over-sampling techniques for tasks involving seismic events. |
en_US |
dc.format |
Fulltext |
en_US |
dc.language.iso |
en |
en_US |
dc.relation.uri |
https://2023.sacair.org.za/programme_overview/ |
en_US |
dc.relation.uri |
https://link.springer.com/chapter/10.1007/978-3-031-49002-6_22 |
en_US |
dc.source |
The Southern African Conference on AI Research (SACAIR 2023), Muldersdrift, Gauteng, 4-8 December 2023 |
en_US |
dc.subject |
Seismic events |
en_US |
dc.subject |
Machine learning |
en_US |
dc.subject |
Oversampling |
en_US |
dc.subject |
SVMSMOTE |
en_US |
dc.subject |
SMOTE |
en_US |
dc.subject |
ADASYN |
en_US |
dc.title |
A comparative study of over-sampling techniques as applied to seismic events |
en_US |
dc.type |
Conference Presentation |
en_US |
dc.description.pages |
15 |
en_US |
dc.description.note |
This is the preprint version of the paper. The published version can be obtained via https://link.springer.com/chapter/10.1007/978-3-031-49002-6_22 |
en_US |
dc.description.cluster |
Next Generation Enterprises & Institutions |
en_US |
dc.description.impactarea |
Data Science |
en_US |
dc.identifier.apacitation |
Mokoatle, M., Coleman, T., & Mokilane, P. M. (2023). A comparative study of over-sampling techniques as applied to seismic events. http://hdl.handle.net/10204/13445 |
en_ZA |
dc.identifier.chicagocitation |
Mokoatle, Mpho, Toshka Coleman, and Paul M Mokilane. "A comparative study of over-sampling techniques as applied to seismic events." The Southern African Conference on AI Research (SACAIR 2023), Muldersdrift, Gauteng, 4-8 December 2023 (2023): http://hdl.handle.net/10204/13445 |
en_ZA |
dc.identifier.vancouvercitation |
Mokoatle M, Coleman T, Mokilane PM. A comparative study of over-sampling techniques as applied to seismic events; 2023. http://hdl.handle.net/10204/13445 . |
en_ZA |
dc.identifier.ris |
TY - Conference Presentation
AU - Mokoatle, Mpho
AU - Coleman, Toshka
AU - Mokilane, Paul M
AB - The likelihood that an earthquake will occur in a specific location, within a specific time frame, and with ground motion intensity greater than a specific threshold is known as a seismic hazard. Predicting these hazards is crucial because it can enable early warnings, which can lessen the negative effects. Research is currently being conducted in the field of machine learning to predict seismic events based on previously recorded incidents. However, because these events happen so infrequently, they present a class imbalance problem for machine learning and deep learning models. This study therefore compares the performance of popular over-sampling techniques that seek to even out class imbalance in seismic events data. Specifically, this work applied SMOTE, SMOTENC, SMOTEN, BorderlineSMOTE, SVMSMOTE, and ADASYN to an open-source Seismic Bumps dataset, then trained several machine learning classifiers with stratified K-fold cross-validation for seismic hazard detection. The SVMSMOTE algorithm was the best over-sampling method, as it produced classifiers with the highest overall accuracy, F1 score, recall, and precision, each reaching 100%, whereas the ADASYN over-sampling method showed the lowest performance on every reported metric across all models. To our knowledge, no prior research has compared the effectiveness of these over-sampling techniques for tasks involving seismic events.
DA - 2023-12
DB - ResearchSpace
DP - CSIR
J1 - The Southern African Conference on AI Research (SACAIR 2023), Muldersdrift, Gauteng, 4-8 December 2023
KW - Seismic events
KW - Machine learning
KW - Oversampling
KW - SVMSMOTE
KW - SMOTE
KW - ADASYN
LK - https://researchspace.csir.co.za
PY - 2023
SM - 978-3-031-49001-9
SM - 1865-0937
T1 - A comparative study of over-sampling techniques as applied to seismic events
TI - A comparative study of over-sampling techniques as applied to seismic events
UR - http://hdl.handle.net/10204/13445
ER -
|
en_ZA |
dc.identifier.worklist |
27370 |
en_US |