Pulmonary sarcoidosis is a multisystem granulomatous interstitial lung disease (ILD) with a variable presentation and prognosis. Early, accurate detection of pulmonary sarcoidosis may prevent progression to pulmonary fibrosis, a serious and potentially life-threatening form of the disease. However, the lack of a gold-standard diagnostic test and of specific radiographic findings makes pulmonary sarcoidosis challenging to diagnose. Chest computed tomography (CT) imaging is commonly used but requires expert, chest-trained radiologists to differentiate pulmonary sarcoidosis from lung malignancies, infections, and other ILDs. In this work, we developed a multichannel, CT- and radiomics-guided ensemble network (RadCT-CNNViT) with visual explainability for classifying pulmonary sarcoidosis vs. lung cancer (LCa) from chest CT images. The network takes CT and hand-crafted radiomics features as input channels and uses an ensemble of a 3D convolutional neural network (CNN) and a vision transformer (ViT) for feature extraction and fusion before a classification head. The 3D CNN sub-network captures localized spatial information about lesions, while the ViT sub-network captures long-range, global dependencies between features. Through multichannel input and feature fusion, our model achieved the highest performance, with accuracy, sensitivity, specificity, and combined AUC of 0.93±0.04, 0.94±0.04, 0.93±0.08, and 0.97, respectively, in a 5-fold cross-validation study of pulmonary sarcoidosis (n=126) and LCa (n=93) cases. The model shows promise for improving the diagnosis of pulmonary sarcoidosis. We also present a detailed ablation study comparing the CNN+ViT ensemble against CNN or ViT alone, and CT+radiomics input against CT or radiomics alone.
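
To make the reported "mean±standard deviation" figures concrete, the following minimal sketch shows one common way to compute per-fold accuracy, sensitivity, and specificity and then summarize them across cross-validation folds. It is an illustration under stated assumptions, not the evaluation code used in this work: the function names `fold_metrics` and `summarize_folds` are hypothetical, the label convention (1 = pulmonary sarcoidosis, 0 = LCa) is assumed, the use of the sample standard deviation is assumed, and the toy labels are invented purely for demonstration. The AUC is not covered here.

```python
from statistics import mean, stdev


def fold_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity for one validation fold.

    Assumed label convention: 1 = pulmonary sarcoidosis, 0 = LCa.
    Assumes the fold contains at least one case of each class.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


def summarize_folds(folds):
    """Mean and sample standard deviation of each metric across folds.

    `folds` is a list of (y_true, y_pred) pairs, one per fold.
    Using the sample (n-1) standard deviation is an assumption.
    """
    per_fold = [fold_metrics(t, p) for t, p in folds]
    summary = {}
    for name in ("accuracy", "sensitivity", "specificity"):
        values = [m[name] for m in per_fold]
        summary[name] = (mean(values), stdev(values))
    return summary


if __name__ == "__main__":
    # Invented toy predictions for two folds; not data from the study.
    toy_folds = [
        ([1, 1, 0, 0], [1, 0, 0, 0]),
        ([1, 1, 0, 0], [1, 1, 0, 1]),
    ]
    for name, (mu, sd) in summarize_folds(toy_folds).items():
        print(f"{name}: {mu:.2f}±{sd:.2f}")
```

In a 5-fold setting, `summarize_folds` would receive five `(y_true, y_pred)` pairs, one for each held-out fold.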