Diabetes is one of the most prevalent illnesses. Although it does not directly result in patient mortality, it increases the risk of death. Predicting a disease in its early stages can lessen its fatal effects and improve the quality of the healthcare system. Early-stage prediction of diabetes and similar non-communicable diseases requires a proper set of influential features. This research develops a machine learning-based disease prediction model that identifies the influential features for diabetes prediction and achieves near-perfect classification accuracy. The model uses Min-Max normalization for data normalization, Isolation Forest (iForest) for outlier removal, and the Synthetic Minority Oversampling Technique (SMOTE) for oversampling. Three feature selection methods identify the influential features: Random Forest based Recursive Feature Elimination (RFE-RF), the Chi-Square test, and the Minimum Redundancy Maximum Relevance (mRMR) test. Support Vector Machine (SVM), K Nearest Neighbor (KNN), and Naive Bayes (NB) perform the classification. The results show that the proposed model outperforms previous models and studies. The SVM attained a classification accuracy of 99.58% using the five features selected by the Chi-Square test. Lastly, SHAP, an explainable AI method, has been used to assess the classifier model’s performance. The selected features and the classifier model can be used for early-stage diabetes prediction.
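
As a rough illustration of two stages named above, the following minimal sketch applies Min-Max normalization and then ranks features by a chi-square score. It is not the study's implementation: the toy rows, the labels, and the function names (`min_max_normalize`, `chi_square_scores`, `top_k_features`) are hypothetical, and the chi-square score uses one common formulation that treats non-negative feature values as counts per class. Outlier removal, SMOTE, RFE-RF, mRMR, the classifiers, and SHAP are omitted.

```python
from typing import List, Sequence


def min_max_normalize(rows: Sequence[Sequence[float]]) -> List[List[float]]:
    """Scale each column to [0, 1]; a constant column maps to 0.0."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def chi_square_scores(rows: Sequence[Sequence[float]],
                      labels: Sequence[int]) -> List[float]:
    """Chi-square statistic per feature (assumed formulation).

    Observed = sum of a feature's values within each class.
    Expected = feature total * fraction of samples in that class.
    Feature values must be non-negative (true after Min-Max scaling).
    """
    n = len(rows)
    classes = sorted(set(labels))
    scores = []
    for j in range(len(rows[0])):
        total = sum(row[j] for row in rows)
        score = 0.0
        for c in classes:
            observed = sum(row[j] for row, y in zip(rows, labels) if y == c)
            expected = total * list(labels).count(c) / n
            if expected > 0:
                score += (observed - expected) ** 2 / expected
        scores.append(score)
    return scores


def top_k_features(scores: Sequence[float], k: int) -> List[int]:
    """Indices of the k highest-scoring features, best first."""
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return order[:k]


# Hypothetical toy data: three features per sample, binary label.
# Feature 0 separates the classes; features 1 and 2 alternate regardless of label.
rows = [
    [90, 30, 1],
    [95, 45, 0],
    [100, 30, 1],
    [160, 45, 0],
    [170, 30, 1],
    [180, 45, 0],
]
labels = [0, 0, 0, 1, 1, 1]

normalized = min_max_normalize(rows)
scores = chi_square_scores(normalized, labels)
selected = top_k_features(scores, k=1)
print("chi-square scores:", scores)
print("selected feature indices:", selected)
```

In this toy setup the feature that tracks the label receives the highest score, which is the behavior a chi-square filter relies on when choosing a small feature subset before classification.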