MATSUO Hidetoshi
University Hospital / Radiology
Assistant Professor
Research activity information
■ Paper
- OBJECTIVE: This study aims to evaluate whether large language models (LLMs) can accurately predict the urgency and severity of radiology reports. MATERIALS AND METHODS: Based on the recommendations of the Academy of Royal Colleges, we defined radiology reports that include unexpected findings of high urgency or severity as "high-priority (HP) radiology reports." Overall, 1906 radiology reports were used as the training set, and 176 radiology reports were used as the test set, with a balanced ratio of HP to non-HP radiology reports (1:1) in both sets. Four types of LLMs (Llama2 7B, Llama3 8B, Llama3 Elyza 8B, and Llama 3.1 8B) were fine-tuned using four different input settings: (1) findings only, (2) findings + referring department, (3) findings + referring department + clinical diagnosis before examination, and (4) findings + referring department + clinical diagnosis before examination + details of examination request. The fine-tuned LLMs predicted whether each radiology report was HP or not. RESULTS: Among the four LLMs, Llama3 Elyza 8B, with inputs comprising findings and the referring department, demonstrated the best performance, achieving PRAUC = 0.962, ROCAUC = 0.968, accuracy = 0.915, sensitivity/recall = 0.932, specificity = 0.898, and F1 = 0.916. Adding a clinical diagnosis before the examination and details of examination requests did not necessarily lead to performance improvement. CONCLUSION: The fine-tuned LLMs accurately predicted HP radiology reports, suggesting their potential utility in supporting communication regarding radiology reports with high urgency or severity. KEY POINTS: Question: This study aims to evaluate whether large language models (LLMs) can accurately predict high-priority (HP) radiology reports. Findings: The best fine-tuned LLM accurately predicted HP radiology reports, achieving a PRAUC of 0.962 and a ROCAUC of 0.968. Clinical relevance: This study demonstrates that fine-tuned LLMs can accurately identify HP radiology reports, potentially improving timely clinical decision-making and enhancing patient safety through faster communication of critical findings. Dec. 2025, European radiology, English, International magazine, Scientific journal
- Sep. 2025, Applied Sciences, 15(19) (19), EnglishScientific journal
- (公社)日本診療放射線技師会, Sep. 2025, JART: 日本診療放射線技師会誌, 72(9) (9), 1014 - 1014, Japanese, Evaluation of the usefulness of an AI for classifying intracerebral hemorrhage using a deep learning model
- Feb. 2025, CoRR, abs/2502.03333Scientific journal
- BACKGROUND/OBJECTIVES: This study aimed to investigate the accuracy of Tumor, Node, Metastasis (TNM) classification based on radiology reports using GPT3.5-turbo (GPT3.5) and the utility of multilingual large language models (LLMs) in both Japanese and English. METHODS: Utilizing GPT3.5, we developed a system to automatically generate TNM classifications from chest computed tomography reports for lung cancer and evaluate its performance. We statistically analyzed the impact of providing full or partial TNM definitions in both languages using a generalized linear mixed model. RESULTS: The highest accuracy was attained with full TNM definitions and radiology reports in English (M = 94%, N = 80%, T = 47%, and TNM combined = 36%). Providing definitions for each of the T, N, and M factors statistically improved their respective accuracies (T: odds ratio [OR] = 2.35, p < 0.001; N: OR = 1.94, p < 0.01; M: OR = 2.50, p < 0.001). Japanese reports exhibited decreased N and M accuracies (N accuracy: OR = 0.74 and M accuracy: OR = 0.21). CONCLUSIONS: This study underscores the potential of multilingual LLMs for automatic TNM classification in radiology reports. Even without additional model training, performance improvements were evident with the provided TNM definitions, indicating LLMs' relevance in radiology contexts.Oct. 2024, Cancers, 16(21) (21), English, International magazineScientific journal
- Jan. 2024, Scientific reports, 14(1) (1), 2233 - 2233, English, International magazine
- IEEE, 2024, 48th IEEE Annual Computers, Software, and Applications Conference(COMPSAC), 1952 - 1954International conference proceedings
- (一社)日本核医学会, 2024, 核医学, 61(Suppl.) (Suppl.), S173 - S173, Japanese, Reproducibility of a deep learning-assisted attenuation correction method for chest PET/MRI
- 2024, Nihon Hoshasen Gijutsu Gakkai zasshi, 80(6) (6), 673 - 678, Japanese, Domestic magazineScientific journal
- BACKGROUND AND PURPOSE: Mean pulmonary artery pressure (mPAP) is a key index for chronic thromboembolic pulmonary hypertension (CTEPH). Using machine learning, we attempted to construct an accurate prediction model for mPAP in patients with CTEPH. METHODS: A total of 136 patients diagnosed with CTEPH were included, for whom mPAP was measured. The following patient data were used as explanatory variables in the model: basic patient information (age and sex), blood tests (brain natriuretic peptide (BNP)), echocardiography (tricuspid valve pressure gradient (TRPG)), and chest radiography (cardiothoracic ratio (CTR), right second arc ratio, and presence of avascular area). Seven machine learning methods including linear regression were used for the multivariable prediction models. Additionally, prediction models were constructed using the AutoML software. Among the 136 patients, 2/3 and 1/3 were used as training and validation sets, respectively. The average of R squared was obtained from 10 different data splittings of the training and validation sets. RESULTS: The optimal machine learning model was linear regression (averaged R squared, 0.360). The optimal combination of explanatory variables with linear regression was age, BNP level, TRPG level, and CTR (averaged R squared, 0.388). The R squared of the optimal multivariable linear regression model was higher than that of the univariable linear regression model with only TRPG. CONCLUSION: We constructed a more accurate prediction model for mPAP in patients with CTEPH than a model of TRPG only. The prediction performance of our model was improved by selecting the optimal machine learning method and combination of explanatory variables.2024, PloS one, 19(4) (4), e0300716, English, International magazineScientific journal
- Elsevier BV, 2024, Informatics in Medicine Unlocked, 46, 101465 - 101465Scientific journal
- Dec. 2023, Proceedings of the 17th NTCIR Conference on Evaluation of Information Access Technologies, NTCIR, 155 - 162[Refereed]International conference proceedings
- BACKGROUND: Deep learning (DL) has been widely used for diagnosis and prognosis prediction of numerous frequently occurring diseases. Generally, DL models require large datasets to perform accurate and reliable prognosis prediction and avoid overlearning. However, prognosis prediction of rare diseases is still limited owing to the small number of cases, resulting in small datasets. PURPOSE: This paper proposes a multimodal DL method to predict the prognosis of patients with malignant pleural mesothelioma (MPM) with a small number of 3D positron emission tomography-computed tomography (PET/CT) images and clinical data. METHODS: A 3D convolutional conditional variational autoencoder (3D-CCVAE), which adds a 3D-convolutional layer and conditional VAE to process 3D images, was used for dimensionality reduction of PET images. We developed a two-step model that performs dimensionality reduction using the 3D-CCVAE, which is resistant to overlearning. In the first step, clinical data were input to condition the model and perform dimensionality reduction of PET images, resulting in more efficient dimension reduction. In the second step, a subset of the dimensionally reduced features and clinical data were combined to predict 1-year survival of patients using the random forest classifier. To demonstrate the usefulness of the 3D-CCVAE, we created a model without the conditional mechanism (3D-CVAE), one without the variational mechanism (3D-CCAE), and one without an autoencoder (without AE), and compared their prediction results. We used PET images and clinical data of 520 patients with histologically proven MPM. The data were randomly split in a 2:1 ratio (train : test) and three-fold cross-validation was performed. The models were trained on the training set and evaluated based on the test set results. The area under the receiver operating characteristic curve (AUC) for all models was calculated using their 1-year survival predictions, and the results were compared. 
RESULTS: We obtained AUC values of 0.76 (95% confidence interval [CI], 0.72-0.80) for the 3D-CCVAE model, 0.72 (95% CI, 0.68-0.77) for the 3D-CVAE model, 0.70 (95% CI, 0.66-0.75) for the 3D-CCAE model, and 0.69 (95% CI 0.65-0.74) for the without AE model. The 3D-CCVAE model performed better than the other models (3D-CVAE, p = 0.039; 3D-CCAE, p = 0.0032; and without AE, p = 0.0011). CONCLUSIONS: This study demonstrates the usefulness of the 3D-CCVAE in multimodal DL models learned using a small number of datasets. Additionally, it shows that dimensionality reduction via AE can be used to learn a DL model without increasing the overlearning risk. Moreover, the VAE mechanism can overcome the uncertainty of the model parameters that commonly occurs for small datasets, thereby eliminating the risk of overlearning. Additionally, more efficient dimensionality reduction of PET images can be performed by providing clinical data as conditions and ignoring clinical data-related features.Dec. 2023, Medical physics, 50(12) (12), 7548 - 7557, English, International magazineScientific journal
- "Preprocessing" is the first step required in brain image analysis that improves the overall quality and reliability of the results. However, it is computationally demanding and time-consuming, particularly to handle and parcellate complicatedly folded cortical ribbons of the human brain. In this study, we aimed to shorten the analysis time for data preprocessing of 1410 brain images simultaneously on one of the world's highest-performing supercomputers, "Fugaku." The FreeSurfer was used as a benchmark preprocessing software for cortical surface reconstruction. All the brain images were processed simultaneously and successfully analyzed in a calculation time of 17.33 h. This result indicates that using a supercomputer for brain image preprocessing allows big data analysis to be completed shortly and flexibly, thus suggesting the possibility of supercomputers being used for expanding large data analysis and parameter optimization of preprocessing in the future.Nov. 2023, Scientific reports, 13(1) (1), 19901 - 19901, English, International magazineScientific journal
- RATIONALE AND OBJECTIVES: Pericardial fat (PF), the thoracic visceral fat surrounding the heart, promotes the development of coronary artery disease by inducing inflammation of the coronary arteries. To evaluate PF, we generated pericardial fat count images (PFCIs) from chest radiographs (CXRs) using a dedicated deep-learning model. MATERIALS AND METHODS: We reviewed data of 269 consecutive patients who underwent coronary computed tomography (CT). We excluded patients with metal implants, pleural effusion, history of thoracic surgery, or malignancy. Thus, the data of 191 patients were used. We generated PFCIs from the projection of three-dimensional CT images, wherein fat accumulation was represented by a high pixel value. Three different deep-learning models, including CycleGAN, were combined in the proposed method to generate PFCIs from CXRs. A single CycleGAN-based model was used to generate PFCIs from CXRs for comparison with the proposed method. To evaluate the image quality of the generated PFCIs, structural similarity index measure (SSIM), mean squared error (MSE), and mean absolute error (MAE) of (i) the PFCI generated using the proposed method and (ii) the PFCI generated using the single model were compared. RESULTS: The mean SSIM, MSE, and MAE were 8.56 × 10⁻¹, 1.28 × 10⁻², and 3.57 × 10⁻², respectively, for the proposed model, and 7.62 × 10⁻¹, 1.98 × 10⁻², and 5.04 × 10⁻², respectively, for the single CycleGAN-based model. CONCLUSION: PFCIs generated from CXRs with the proposed model showed better performance than those generated with the single model. The evaluation of PF without CT may be possible using the proposed method. Oct. 2023, Academic radiology, 31(3) (3), 822 - 829, English, International magazine, Scientific journal
- This study aimed to evaluate the diagnostic performance of our deep learning (DL) model of COVID-19 and to investigate whether the diagnostic performance of radiologists was improved by referring to our model. Our datasets contained chest X-rays (CXRs) for the following three categories: normal (NORMAL), non-COVID-19 pneumonia (PNEUMONIA), and COVID-19 pneumonia (COVID). We used two public datasets and a private dataset collected from eight hospitals for the development and external validation of our DL model (26,393 CXRs). Eight radiologists performed two reading sessions: one session was performed with reference to CXRs only, and the other was performed with reference to both CXRs and the results of the DL model. The evaluation metrics for the reading session were accuracy, sensitivity, specificity, and area under the curve (AUC). The accuracy of our DL model was 0.733, and that of the eight radiologists without DL was 0.696 ± 0.031. There was a significant difference in AUC between the radiologists with and without DL for COVID versus NORMAL or PNEUMONIA (p = 0.0038). Our DL model alone showed better diagnostic performance than that of most radiologists. In addition, our model significantly improved the diagnostic performance of radiologists for COVID versus NORMAL or PNEUMONIA. Oct. 2023, Scientific reports, 13(1) (1), 17533 - 17533, English, International magazine, Scientific journal
- We aimed to develop and evaluate an automatic prediction system for grading histopathological images of prostate cancer. A total of 10,616 whole slide images (WSIs) of prostate tissue were used in this study. The WSIs from one institution (5160 WSIs) were used as the development set, while those from the other institution (5456 WSIs) were used as the unseen test set. Label distribution learning (LDL) was used to address a difference in label characteristics between the development and test sets. A combination of EfficientNet (a deep learning model) and LDL was utilized to develop an automatic prediction system. Quadratic weighted kappa (QWK) and accuracy in the test set were used as the evaluation metrics. The QWK and accuracy were compared between systems with and without LDL to evaluate the usefulness of LDL in system development. The QWK and accuracy were 0.364 and 0.407 in the systems with LDL and 0.240 and 0.247 in those without LDL, respectively. Thus, LDL improved the diagnostic performance of the automatic prediction system for the grading of histopathological images for cancer. By handling the difference in label characteristics using LDL, the diagnostic performance of the automatic prediction system could be improved for prostate cancer grading.MDPI AG, Feb. 2023, Cancers, 15(5) (5), 1535 - 1535, English, International magazineScientific journal
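- (一社)日本核医学会, 2023, 核医学, 60(Suppl.) (Suppl.), S184 - S184, Japanese, Attenuation correction for chest PET/MRI: deep learning-based denoising and pseudo-CT generation using fast zero-TE MRI
- (一社)日本核医学会, 2023, 核医学, 60(Suppl.) (Suppl.), S206 - S206, Japanese, Effect on SUV in the chest region of attenuation correction that includes bone components generated from ZTE MRI by 2.5-dimensional deep learning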
- PURPOSE: The purpose of this study is to compare two libraries dedicated to the Markov chain Monte Carlo method: pystan and numpyro. In the comparison, we mainly focused on the agreement of estimated latent parameters and the performance of sampling using the Markov chain Monte Carlo method in Bayesian item response theory (IRT). MATERIALS AND METHODS: Bayesian 1PL-IRT and 2PL-IRT were implemented with pystan and numpyro. Then, the Bayesian 1PL-IRT and 2PL-IRT were applied to two types of medical data obtained from a published article. The same prior distributions of latent parameters were used in both pystan and numpyro. Estimation results of latent parameters of 1PL-IRT and 2PL-IRT were compared between pystan and numpyro. Additionally, the computational cost of the Markov chain Monte Carlo method was compared between the two libraries. To evaluate the computational cost of IRT models, simulation data were generated from the medical data and numpyro. RESULTS: For all the combinations of IRT types (1PL-IRT or 2PL-IRT) and medical data types, the mean and standard deviation of the estimated latent parameters were in good agreement between pystan and numpyro. In most cases, the sampling time using the Markov chain Monte Carlo method was shorter in numpyro than that in pystan. When the large-sized simulation data were used, numpyro with a graphics processing unit was useful for reducing the sampling time. CONCLUSION: Numpyro and pystan were useful for applying the Bayesian 1PL-IRT and 2PL-IRT. Our results show that the two libraries yielded similar estimation results and that, with regard to sampling time, the faster library differed depending on the dataset size. 2023, PeerJ. Computer science, 9, e1620, English, International magazine, Scientific journal
- PURPOSE: This study proposes a Bayesian multidimensional nominal response model (MD-NRM) to statistically analyze the nominal response of multiclass classifications. MATERIALS AND METHODS: First, for MD-NRM, we extended the conventional nominal response model to achieve stable convergence of the Bayesian nominal response model and utilized multidimensional ability parameters. We then applied MD-NRM to a 3-class classification problem, where radiologists visually evaluated chest X-ray images and selected their diagnosis from one of the three classes. The classification problem consisted of 150 cases, and each of the six radiologists selected their diagnosis based on a visual evaluation of the images. Consequently, 900 (= 150 × 6) nominal responses were obtained. In MD-NRM, we assumed that the responses were determined by the softmax function, the ability of radiologists, and the difficulty of images. In addition, we assumed that the multidimensional ability of one radiologist was represented by a 3 × 3 matrix. The latent parameters of the MD-NRM (ability parameters of radiologists and difficulty parameters of images) were estimated from the 900 responses. To implement Bayesian MD-NRM and estimate the latent parameters, a probabilistic programming language (Stan, version 2.21.0) was used. RESULTS: For all parameters, the Rhat values were less than 1.10. This indicates that the latent parameters of the MD-NRM converged successfully. CONCLUSION: The results show that it is possible to estimate the latent parameters (ability and difficulty parameters) of the MD-NRM using Stan. Our code for the implementation of the MD-NRM is available as open source. Dec. 2022, Japanese journal of radiology, 41(4) (4), 449 - 455, English, Domestic magazine, Scientific journal
- (株)Gakken, Dec. 2022, 画像診断, 43(1) (1), 12 - 13, Japanese
- (株)Gakken, Dec. 2022, 画像診断, 43(1) (1), 62 - 63, Japanese
- (株)Gakken, Sep. 2022, 画像診断, 42(11) (11), A12 - A13, Japanese
- This retrospective study aimed to develop and validate a deep learning model for the classification of coronavirus disease-2019 (COVID-19) pneumonia, non-COVID-19 pneumonia, and the healthy using chest X-ray (CXR) images. One private and two public datasets of CXR images were included. The private dataset included CXR from six hospitals. A total of 14,258 and 11,253 CXR images were included in the 2 public datasets and 455 in the private dataset. A deep learning model based on EfficientNet with noisy student was constructed using the three datasets. The test set of 150 CXR images in the private dataset were evaluated by the deep learning model and six radiologists. Three-category classification accuracy and class-wise area under the curve (AUC) for each of the COVID-19 pneumonia, non-COVID-19 pneumonia, and healthy were calculated. Consensus of the six radiologists was used for calculating class-wise AUC. The three-category classification accuracy of our model was 0.8667, and those of the six radiologists ranged from 0.5667 to 0.7733. For our model and the consensus of the six radiologists, the class-wise AUC of the healthy, non-COVID-19 pneumonia, and COVID-19 pneumonia were 0.9912, 0.9492, and 0.9752 and 0.9656, 0.8654, and 0.8740, respectively. Difference of the class-wise AUC between our model and the consensus of the six radiologists was statistically significant for COVID-19 pneumonia (p value = 0.001334). Thus, an accurate model of deep learning for the three-category classification could be constructed; the diagnostic performance of our model was significantly better than that of the consensus interpretation by the six radiologists for COVID-19 pneumonia.May 2022, Scientific reports, 12(1) (1), 8214 - 8214, English, International magazineScientific journal
- (公社)日本医学放射線学会, Mar. 2022, 日本医学放射線学会学術集会抄録集, 81st, S232 - S232, English, Generation of Three-Dimensional CT Images of Lung Nodules using Deep Learning
- (一社)日本核医学会, 2022, 核医学, 59(1) (1), 35 - 35, Japanese
- (一社)日本核医学会, 2022, 核医学, 59(1) (1), 35 - 35, Japanese, Effect on SUV in the chest region of attenuation correction using bone generated from ZTE MRI with deep learning
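- (一社)日本医療情報学会, Nov. 2021, 医療情報学連合大会論文集, 41st, 1111 - 1114, Japanese, Development of a minimally invasive, highly accurate prediction model for mean pulmonary artery pressure in patients with chronic thromboembolic pulmonary hypertension
- (一社)日本医療情報学会, Nov. 2021, 医療情報学連合大会論文集, 41st, 1122 - 1124, Japanese, Development of an on-call duty scheduling system using a genetic algorithm at a university hospital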
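- The patient was a woman in her 50s. After a family member was diagnosed with telangiectasia, she underwent CT, which showed multiple arteriovenous malformations in the lungs and liver, and she was diagnosed with hereditary hemorrhagic telangiectasia (HHT). From the morning of the day of presentation she showed incoherent speech and behavior, and she was referred to our hospital for further examination and treatment. Non-contrast head CT showed high-attenuation areas in the straight sinus, the great vein of Galen, and part of the internal cerebral veins. On T2-weighted images of non-contrast head MRI, low-signal areas matched the high-attenuation areas seen on non-contrast CT, and acute thrombosis of the straight sinus was suspected. The superior sagittal sinus showed high-signal areas on T1-weighted and FLAIR images, which were also considered to represent thrombus. FLAIR and diffusion-weighted images showed bilateral high-signal areas in the thalamus, the head of the caudate nucleus, and the putamen, which were considered to reflect edematous changes or infarction of the brain parenchyma associated with straight sinus occlusion. In addition, changes following a previous abscess were seen in the right frontal lobe; a linear low-signal area suggestive of a venous malformation was seen on susceptibility-weighted imaging on the right side of the cerebellar vermis; and high-signal areas on T1-weighted images, presumably caused by a portosystemic shunt, were seen in the bilateral globus pallidus. Catheter angiography was performed on the day of presentation, and because thrombi were found in the straight sinus, the bilateral transverse sinuses, and the superior sagittal sinus, venous thrombectomy was performed and anticoagulation therapy was started. Her level of consciousness tended to improve with treatment, but decreased spontaneity and mild paresis of the bilateral limbs persisted, so she was transferred to another hospital for rehabilitation. During subsequent follow-up, epistaxis associated with HHT became frequent and anticoagulant therapy was discontinued, but she has had no recurrence of cerebral venous thrombosis. 金原出版(株), Sep. 2021, 臨床放射線, 66(9) (9), 937 - 942, Japanese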
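- (公社)日本医学放射線学会, Aug. 2021, 日本医学放射線学会秋季臨床大会抄録集, 57th, S400 - S400, Japanese, Estimation of mean pulmonary artery pressure using multiple regression analysis in patients with chronic thromboembolic pulmonary hypertension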
- 医用画像情報学会, Jul. 2021, 医用画像情報学会雑誌, 38(2) (2), 53 - 56, Japanese
- The integrated positron emission tomography/magnetic resonance imaging (PET/MRI) scanner facilitates the simultaneous acquisition of metabolic information via PET and morphological information with high soft-tissue contrast using MRI. Although PET/MRI facilitates the capture of high-accuracy fusion images, its major drawback is the difficulty of performing attenuation correction, which is necessary for quantitative PET evaluation. Combined PET/MRI scanning requires the generation of attenuation-correction maps from MRI, because there is no direct relationship between the gamma-ray attenuation information and MR images. While MRI-based bone-tissue segmentation can be readily performed for the head and pelvis regions, the realization of accurate bone segmentation via chest CT generation remains a challenging task. This can be attributed to the respiratory and cardiac motions occurring in the chest as well as its anatomically complicated structure and relatively thin bone cortex. This paper presents a means to minimise the anatomical structural changes without human annotation by adding structural constraints using a modality-independent neighbourhood descriptor (MIND) to a generative adversarial network (GAN) that can transform unpaired images. The results showed that the proposed U-GAT-IT + MIND approach outperformed all other competing approaches. The findings of this study suggest the possibility of synthesising clinically acceptable CT images from chest MRI without human annotation, thereby minimising the changes in the anatomical structure. Jun. 2021, Scientific reports, 12(1) (1), 11090 - 11090, English, International magazine, Scientific journal
- OBJECTIVES: This study analyzed an artificial intelligence (AI) deep learning method with a three-dimensional deep convolutional neural network (3D DCNN) in regard to diagnostic accuracy to differentiate malignant pleural mesothelioma (MPM) from benign pleural disease using FDG-PET/CT results. RESULTS: For protocol A, the area under the ROC curve (AUC)/sensitivity/specificity/accuracy values were 0.825/77.9% (81/104)/76.4% (55/72)/77.3% (136/176), while those for protocol B were 0.854/80.8% (84/104)/77.8% (56/72)/79.5% (140/176), for protocol C were 0.881/85.6% (89/104)/75.0% (54/72)/81.3% (143/176), and for protocol D were 0.896/88.5% (92/104)/73.6% (53/72)/82.4% (145/176). Protocol D showed significantly better diagnostic performance as compared to A, B, and C in ROC analysis (p = 0.031, p = 0.0020, p = 0.041, respectively). MATERIALS AND METHODS: Eight hundred seventy-five consecutive patients with histologically proven or suspected MPM, shown by history, physical examination findings, and chest CT results, who underwent FDG-PET/CT examinations between 2007 and 2017 were investigated in a retrospective manner. There were 525 patients (314 MPM, 211 benign pleural disease) in the deep learning training set, 174 (102 MPM, 72 benign pleural disease) in the validation set, and 176 (104 MPM, 72 benign pleural disease) in the test set. Using AI with PET/CT alone (protocol A), human visual reading (protocol B), a quantitative method that incorporated maximum standardized uptake value (SUVmax) (protocol C), and a combination of PET/CT, SUVmax, gender, and age (protocol D), obtained data were subjected to ROC curve analyses. CONCLUSIONS: Deep learning with 3D DCNN in combination with FDG-PET/CT imaging results as well as clinical features is a novel potential tool that shows flexibility for the differential diagnosis of MPM. Impact Journals, LLC, Jun. 2021, Oncotarget, 12(12) (12), 1187 - 1196, English, International magazine, Scientific journal
- OBJECTIVES: To evaluate a deep learning model for predicting gestational age from fetal brain MRI acquired after the first trimester in comparison to biparietal diameter (BPD). MATERIALS AND METHODS: Our Institutional Review Board approved this retrospective study, and a total of 184 T2-weighted MRI acquisitions from 184 fetuses (mean gestational age: 29.4 weeks) who underwent MRI between January 2014 and June 2019 were included. The reference standard gestational age was based on the last menstruation and ultrasonography measurements in the first trimester. The deep learning model was trained with T2-weighted images from 126 training cases and 29 validation cases. The remaining 29 cases were used as test data, with fetal age estimated by both the model and BPD measurement. The relationship between the estimated gestational age and the reference standard was evaluated with Lin's concordance correlation coefficient (ρc) and a Bland-Altman plot. The ρc was assessed with McBride's definition. RESULTS: The ρc of the model prediction was substantial (ρc = 0.964), but the ρc of the BPD prediction was moderate (ρc = 0.920). Both the model and BPD predictions had greater differences from the reference standard at increasing gestational age. However, the upper limit of the model's prediction (2.45 weeks) was significantly shorter than that of BPD (5.62 weeks). CONCLUSIONS: Deep learning can accurately predict gestational age from fetal brain MR acquired after the first trimester. KEY POINTS: • The prediction of gestational age using ultrasound is accurate in the first trimester but becomes inaccurate as gestational age increases. • Deep learning can accurately predict gestational age from fetal brain MRI acquired in the second and third trimester. • Prediction of gestational age by deep learning may have benefits for prenatal care in pregnancies that are underserved during the first trimester. Jun. 2021, European radiology, 31(6) (6), 3775 - 3782, English, International magazine, Scientific journal
- (公社)日本医学放射線学会, Mar. 2021, 日本医学放射線学会学術集会抄録集, 80th, S203 - S203, English, Deep-learning-based Image Super Resolution for Super High-resolution Computed Tomography
- (一社)日本核医学会, 2021, 核医学, 58(Suppl.) (Suppl.), S196 - S196, English, Quantitative validation of attenuation correction for chest PET/MRI with ZTE MRI using deep learning
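- The patient was a man in his 60s whose chief complaint was a skin rash on the anterior chest, limbs, and head. Chest CT showed a soft-tissue-attenuation mass with marked calcification in the anterior mediastinum, and chest MRI confirmed fat within the mass on out-of-phase images, raising suspicion of a tumor arising in an involuted thymus. Neck MRI suggested Sjögren's syndrome, and FDG-PET showed uptake in the enlarged right cervical lymph nodes and the thymic mass. Biopsy of the enlarged right cervical lymph node suggested marginal zone B-cell lymphoma of mucosa-associated lymphoid tissue, and the thymic tumor was suspected to be a thymoma or teratoma, so it was resected for diagnosis. Pathological examination of the right cervical lymph node biopsy showed proliferation of atypical lymphocytes and plasmacytoid cells, with positive immunostaining for CD20 and CD138. The thymic tumor showed amyloid deposition, and part of the tumor contained conspicuous clusters of atypical lymphocytes and plasmacytoid cells. The diagnosis was low-grade B-cell lymphoma with plasmacytic differentiation accompanied by AL amyloid deposition in the thymus. 金原出版(株), Jan. 2021, 臨床放射線, 66(1) (1), 59 - 64, Japanese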
- Purpose: The purpose of this study was to develop and evaluate lung cancer segmentation with a pretrained model and transfer learning. The pretrained model was constructed from an artificial dataset generated using a generative adversarial network (GAN). Materials and Methods: Three public datasets containing images of lung nodules/lung cancers were used: LUNA16 dataset, Decathlon lung dataset, and NSCLC radiogenomics. The LUNA16 dataset was used to generate an artificial dataset for lung cancer segmentation with the help of the GAN and 3D graph cut. Pretrained models were then constructed from the artificial dataset. Subsequently, the main segmentation model was constructed from the pretrained models and the Decathlon lung dataset. Finally, the NSCLC radiogenomics dataset was used to evaluate the main segmentation model. The Dice similarity coefficient (DSC) was used as a metric to evaluate the segmentation performance. Results: The mean DSC for the NSCLC radiogenomics dataset improved overall when using the pretrained models. At maximum, the mean DSC was 0.09 higher with the pretrained model than that without it. Conclusion: The proposed method comprising an artificial dataset and a pretrained model can improve lung cancer segmentation as confirmed in terms of the DSC metric. Moreover, the construction of the artificial dataset for the segmentation using the GAN and 3D graph cut was found to be feasible.2021, Frontiers in artificial intelligence, 4, 694815 - 694815, English, International magazineScientific journal
- We hypothesized that, in discrimination between benign and malignant parotid gland tumors, high diagnostic accuracy could be obtained with a small amount of imbalanced data when anomaly detection (AD) was combined with a deep learning (DL) model and the L2-constrained softmax loss. The purpose of this study was to evaluate whether the proposed method was more accurate than other commonly used DL or AD methods. Magnetic resonance (MR) images of 245 parotid tumors (22.5% malignant) were retrospectively collected. We evaluated the diagnostic accuracy of the proposed method (VGG16-based DL and AD) and that of classification models using conventional DL and AD methods. A radiologist also evaluated the MR images. ROC and precision-recall (PR) analyses were performed, and the area under the curve (AUC) was calculated. In terms of diagnostic performance, the VGG16-based model with the L2-constrained softmax loss and AD (local outlier factor) outperformed conventional DL and AD methods and a radiologist (ROC-AUC = 0.86 and PR-AUC = 0.77). The proposed method could discriminate between benign and malignant parotid tumors in MR images even when only a small amount of data with imbalanced distribution is available. Nov. 2020, Scientific reports, 10(1) (1), 19388 - 19388, English, International magazine, [Refereed], Scientific journal
- This study aimed to develop and validate a computer-aided diagnosis (CADx) system for classification between COVID-19 pneumonia, non-COVID-19 pneumonia, and the healthy on chest X-ray (CXR) images. From two public datasets, 1248 CXR images were obtained, which included 215, 533, and 500 CXR images of COVID-19 pneumonia patients, non-COVID-19 pneumonia patients, and the healthy samples, respectively. The proposed CADx system utilized VGG16 as a pre-trained model and a combination of conventional method and mixup as data augmentation methods. Other types of pre-trained models were compared with the VGG16-based model. Single type or no data augmentation methods were also evaluated. Splitting of training/validation/test sets was used when building and evaluating the CADx system. Three-category accuracy was evaluated for the test set with 125 CXR images. The three-category accuracy of the CADx system was 83.6% between COVID-19 pneumonia, non-COVID-19 pneumonia, and the healthy. Sensitivity for COVID-19 pneumonia was more than 90%. The combination of conventional method and mixup was more useful than single type or no data augmentation method. In conclusion, this study was able to create an accurate CADx system for the 3-category classification. Source code of our CADx system is available as open source for COVID-19 research. Oct. 2020, Scientific reports, 10(1), 17532 - 17532, English, International magazine, [Refereed], Scientific journal
- (一社)日本核医学会, Oct. 2020, 核医学, 57(Suppl.), S157 - S157, English. Attenuation correction for chest PET/MRI: generating pseudo-CT from ZTE with deep learning using CT from different cases
- BACKGROUND: Although fractures of the sternum are rare in young children, owing to the compliance of the chest wall, these fractures are still possible and require thorough examination. We present a case that emphasizes the usefulness of point-of-care ultrasound in the diagnosis of a pediatric sternal fracture complicated by a subcutaneous abscess. CASE REPORT: A 5-year-old boy presented with tenderness of the sternum, with diffuse swelling extending bilaterally to the anterior chest wall. Ultrasound imaging identified irregular alignment of the sternum with a subcutaneous abscess and swirling of purulent material within the abscess in the fracture area. These findings were confirmed on enhanced chest computed tomography and had not been visible at the time of the first evaluation 6 days prior. WHY SHOULD AN EMERGENCY PHYSICIAN BE AWARE OF THIS?: Our case demonstrates the usefulness of point-of-care ultrasound for the diagnosis and appropriate management of a sternal fracture complicated by a subcutaneous abscess in a young child. As ultrasound imaging is easy to perform at the bedside, it is useful for examining pediatric patients with swelling of the anterior chest and local tenderness of the sternum to rule out a sternal fracture, even if these fractures are deemed to be uncommon in children. May 2019, The Journal of emergency medicine, 56(5), 536 - 539, English, International magazine, [Refereed], Scientific journal
- (一社)日本インターベンショナルラジオロジー学会, Nov. 2018, IVR: Interventional Radiology, 33(3), 319 - 319, Japanese. A case of giant pulmonary pseudoaneurysm treated by embolization with coils and a vascular plug
- Kyoto: 日本放射線技術学会, Jun. 2024, 日本放射線技術学会雑誌 = Japanese journal of radiological technology, 80(6), 673 - 678, Japanese. Practical uses of Python in radiological technology research, applied edition (11): automatic generation of diagnostic reports for chest radiographs
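- 2024, 核医学(Web), 61(Supplement). Reproducibility of a deep-learning-assisted attenuation correction method for chest PET/MRI
- 2024, 日本医学放射線学会秋季臨床大会抄録集, 60th. Artificial intelligence (AI) for autopsy imaging (Ai)
- 2023, 核医学(Web), 60(Supplement). Attenuation correction for chest PET/MRI: deep-learning-based denoising and pseudo-CT generation using fast zero-TE MRI
- 2023, 核医学(Web), 60(Supplement). Effect on chest-region SUV of attenuation correction that includes bone components generated from ZTE MRI by 2.5-dimensional deep learning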
- 2021, 医療情報学連合大会論文集(CD-ROM), 41st. Accurate prediction model of pulmonary artery mean pressure using minimally invasive examinations in chronic thromboembolic pulmonary hypertension patients
- 2021, 核医学(Web), 58(Supplement). Quantitative validation of attenuation correction for chest PET/MRI based on ZTE MRI using deep learning
- 2021, 医療情報学連合大会論文集(CD-ROM), 41st. Development of a Duty Schedule Generation System Using Genetic Algorithm in a University Hospital
- 2021, 医用画像情報学会雑誌(Web), 38(2). Development of Deep Learning Model for COVID-19 Pneumonia in Chest X-ray Images
- (一社)神緑会, Dec. 2020, 神緑会学術誌, 36, 60 - 61, Japanese. An attempt to discriminate benign from malignant parotid gland tumors using MR images and deep learning
- (一社)日本核医学会, Oct. 2020, 核医学, 57(Suppl.), S157 - S157, English. Attenuation correction for chest PET/MRI: generating pseudo-CT from ZTE with deep learning using CT from different cases
- (一社)日本インターベンショナルラジオロジー学会, Aug. 2020, 日本インターベンショナルラジオロジー学会雑誌, 35(Suppl.), 126 - 126, Japanese. Latest technology and IVR: basics of artificial intelligence using deep learning and the possibility of clinical application in IVR
- (株)学研メディカル秀潤社, Aug. 2020, 画像診断, 40(10), 1056 - 1059, Japanese
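- 2020, 日本神経放射線学会プログラム・抄録集, 49th. Deep-learning prediction of fetal gestational age in weeks using head images from fetal MRI
- 2020, 核医学(Web), 57(Supplement). Attenuation correction for chest PET/MRI: generating pseudo-CT from ZTE with deep learning using CT from different cases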
- (公社)日本医学放射線学会, Sep. 2019, 日本医学放射線学会秋季臨床大会抄録集, 55th, S550 - S550, Japanese. A case of anterior mediastinal lymphoma showing marked calcification due to amyloid deposition
- (一社)日本インターベンショナルラジオロジー学会, May 2019, 日本インターベンショナルラジオロジー学会雑誌, 34(Suppl.), 257 - 257, Japanese. A study of EVAR using carbon dioxide angiography in patients with chronic kidney disease (CKD)
- CARS 2025, Jun. 2025. Comparison of LLaVA 1.5 and 1.6 for Radiology Report Generation: The Impact of LoRA Fine-Tuning on Medical Image Analysis.
- CARS 2024, Jun. 2024. Examining the Efficacy of Fine-Tuning Multilingual Large Language Models for Report Structuring in English and Japanese Radiology Reports
- ECR 2024 - European Congress of Radiology, Feb. 2024, English. A Multilingual Approach to Structured Data Extraction from Radiological Reports using Large Language Models: Focus on TNM Staging Accuracy. Poster presentation
- ECR 2023 - European Congress of Radiology, Mar. 2023, English. Diagnostic accuracy of AI detection of intracranial hemorrhage using vision transformers and large datasets: Application to post-mortem CT. Poster presentation
- 最先端医療画像研究会, Nov. 2022. Learning AI technology from the basics: what is deep learning?
- ECR 2022 - European Congress of Radiology, Jul. 2022, English. Prognostic prediction of patients with malignant mesothelioma using 3D PET images and clinical data with self-supervised learning
- 播淡画像診断研究会, Feb. 2022. Learning AI technology from the basics: what is deep learning? [Invited]
- ECR 2021 - European Congress of Radiology. Identification of Malignant Pleural Mesothelioma by Artificial Intelligent combining 3D structure of Positron Emission Tomography Images and Clinical Data
- RSNA 2020 - Annual meeting of Radiological Society of North America. Anomaly Detection for a Small Amount and Highly Biased Dataset: Discrimination of Magnetic Resonance Images between Benign and Malignant Parotid Tumors
- 第49回日本 IVR 学会総会, Aug. 2020, Japanese. Basics of Artificial Intelligence Using Deep Learning and Possibility of Clinical Application in IVR [Invited]
- ECR 2020 - European Congress of Radiology. Classification of MR Images between Benign and Malignant Parotid Tumors using Deep Learning.
- Japan Society for the Promotion of Science, Grants-in-Aid for Scientific Research, Grant-in-Aid for Scientific Research (C), Kobe University, 01 Apr. 2024 - 31 Mar. 2029. Development of a Novel Attenuation Correction Method for Integrated PET/MRI Systems
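- Japan Society for the Promotion of Science, Grants-in-Aid for Scientific Research, Grant-in-Aid for Early-Career Scientists, Kobe University, 01 Apr. 2023 - 31 Mar. 2028. Multitask Image-Natural Language Correspondence Model Development using Large-Scale Medical Image Dataset. This fiscal year, we downloaded image and report data from JMID, a database managed and operated by the Japan Radiological Society (日本医学放射線学会). More than 20,000 records have been downloaded so far, and their contents are currently under review. Using GPT, a type of large language model (LLM), we built a model that performs TNM classification of lung cancer from reports and presented it as a conference paper at NTCIR17. We also presented at ECR2024 on which kinds of information and instructions contribute to TNM classification performance and on how performance differs between Japanese and English. Beyond existing LLMs, we set up a training environment for open LLMs such as llama2 and began fine-tuning llama2. We are fine-tuning open LLMs on the NTCIR17 data mentioned above, and we examined how the contribution of information and instructions to performance differs from that observed with GPT. These results are scheduled to be presented at the upcoming CARS2024. We are also continuing our earlier work on image-based deep learning models: a study of how adding 3D images and clinical information changes the accuracy of prognosis prediction for malignant mesothelioma was published in Medical Physics. In addition, we are conducting joint research, mainly on deep learning models and methods for medical images, and the results are being published in papers and elsewhere.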
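- Japan Society for the Promotion of Science, Grants-in-Aid for Scientific Research, Fund for the Promotion of Joint International Research (International Collaborative Research), Kobe University, 08 Sep. 2023 - 31 Mar. 2026. Application of large language models to medical natural language processing. This fiscal year, we gained access to JMID, a database managed and operated by the Japan Radiological Society (日本医学放射線学会). We collected more than one million diagnostic radiology reports from the JMID database and used them to build a transformer-based deep learning model for text. The model can automatically generate the diagnosis section of a report from its findings section. We evaluated the model, and the results were published in a peer-reviewed English-language journal. We also used ChatGPT, a transformer-based large language model, to build a model that estimates the TNM classification of lung cancer from reports, and presented it as a conference paper at NTCIR17.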
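We built a system that automatically generates report text directly from chest radiographs. The model is a transformer-based vision-and-language model, and fine-tuning a pretrained transformer made report generation possible. This fiscal year we targeted English reports only, and the system generated reports that read as acceptable English. In addition, we built deep learning models for report text and medical images, and several papers were published in peer-reviewed English-language journals.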
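Some of the above achievements resulted from international joint research with the University of Zurich in Switzerland. From April 2024, one of the co-investigators will study at the University of Zurich and plans to conduct joint research there for several months.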
