Constructing Regression Models with High Prediction Accuracy and Interpretability Based on Decision Tree and Random Forests [Advance online publication, by J-STAGE]

[Advance online publication, Journal of Computer Chemistry, Japan, by J-STAGE]
<Title:> Constructing Regression Models with High Prediction Accuracy and Interpretability Based on Decision Tree and Random Forests
<Author(s):> Naoto SHIMIZU, Hiromasa KANEKO
<Corresponding author E-mail:> hkaneko(at)meiji.ac.jp
<Abstract:> Machine learning models that predict the properties/activities of materials can lead to the discovery of new mechanisms underlying those properties/activities. However, methods for constructing models that exhibit both high prediction accuracy and interpretability are still being developed, because prediction accuracy and interpretability are in a trade-off relationship. In this study, we propose a new model-construction method that combines a decision tree (DT) with random forests (RF), which we call DT-RF. In DT-RF, the dataset to be analyzed is divided by a DT model, and an RF model is constructed for each subdataset. This enables global interpretation of the data based on the DT model, while the RF models improve the prediction accuracy and enable local interpretations. Case studies were performed using three datasets: the boiling points of compounds, their water solubility, and the transition temperatures of inorganic superconductors. We examined the proposed method in terms of its validity, prediction accuracy, and interpretability.
<Keywords:> Model interpretability, Predictive ability, Decision tree, Random forests, Regression model
<URL:> https://www.jstage.jst.go.jp/article/jccj/advpub/0/advpub_2020-0021/_article/-char/ja/
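
The DT-RF idea in the abstract above (split the data with a decision tree, then fit one random forest per subdataset) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names, tree depths, number of trees, feature-subsampling rule, and toy data are all assumptions made for illustration.

```python
import random
from statistics import mean

def sse(ys):
    """Sum of squared deviations from the mean."""
    m = mean(ys)
    return sum((v - m) ** 2 for v in ys)

def best_split(X, y, features):
    """Return the (feature, threshold) pair with the lowest total SSE, or None."""
    best, best_score = None, float("inf")
    for j in features:
        values = sorted({row[j] for row in X})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # midpoint, so both sides are non-empty
            left = [v for row, v in zip(X, y) if row[j] <= t]
            right = [v for row, v in zip(X, y) if row[j] > t]
            score = sse(left) + sse(right)
            if score < best_score:
                best, best_score = (j, t), score
    return best

def fit_tree(X, y, depth, rng=None):
    """Regression tree as nested dicts. With rng, each node only sees a
    random subset of the features (an assumed random-forest-style rule)."""
    n_features = len(X[0])
    features = list(range(n_features))
    if rng is not None:
        features = rng.sample(features, max(1, n_features // 2))
    split = best_split(X, y, features) if depth > 0 else None
    if split is None:
        return {"value": mean(y)}
    j, t = split
    left = [i for i, row in enumerate(X) if row[j] <= t]
    right = [i for i, row in enumerate(X) if row[j] > t]
    return {
        "feature": j, "threshold": t,
        "left": fit_tree([X[i] for i in left], [y[i] for i in left], depth - 1, rng),
        "right": fit_tree([X[i] for i in right], [y[i] for i in right], depth - 1, rng),
    }

def descend(node, row):
    """Follow row down the tree; return (leaf path such as 'LR', leaf value)."""
    path = ""
    while "value" not in node:
        go_left = row[node["feature"]] <= node["threshold"]
        path += "L" if go_left else "R"
        node = node["left"] if go_left else node["right"]
    return path, node["value"]

def fit_forest(X, y, n_trees, depth, rng):
    """Bagged trees: each is trained on a bootstrap sample of the subdataset."""
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(y)) for _ in y]
        trees.append(fit_tree([X[i] for i in idx], [y[i] for i in idx], depth, rng))
    return trees

def fit_dt_rf(X, y, dt_depth=1, n_trees=10, rf_depth=3, seed=0):
    """Split the data with a shallow DT, then fit one forest per DT leaf."""
    rng = random.Random(seed)
    dt = fit_tree(X, y, dt_depth)
    groups = {}
    for row, v in zip(X, y):
        leaf, _ = descend(dt, row)
        Xs, ys = groups.setdefault(leaf, ([], []))
        Xs.append(row)
        ys.append(v)
    forests = {leaf: fit_forest(Xs, ys, n_trees, rf_depth, rng)
               for leaf, (Xs, ys) in groups.items()}
    return dt, forests

def predict_dt_rf(model, row):
    """Route row through the DT, then average the trees of that leaf's forest."""
    dt, forests = model
    leaf, _ = descend(dt, row)
    return mean([descend(tree, row)[1] for tree in forests[leaf]])

# Toy data (hypothetical): the target follows a different rule on each side of x0 = 4.5.
X = [[float(a), float(b)] for a in range(10) for b in range(5)]
y = [b if a < 5 else 100 + 2 * b for a, b in X]
model = fit_dt_rf(X, y)
```

In this sketch the shallow tree `model[0]` gives the global, human-readable split, while each entry of `model[1]` is a local forest that only ever sees the samples in its own leaf.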

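Construction of Machine Learning Models for Predicting the Physical Properties of Thermosetting Resin Composites [Published online J. Comput. Chem. Jpn., 20, 14-21, by J-STAGE]

[Published online Journal of Computer Chemistry, Japan Vol.20, 14-21, by J-STAGE]
<Title:> Construction of Machine Learning Models for Predicting the Physical Properties of Thermosetting Resin Composites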
<Author(s):> 高原 渉, 小林 優希, 森田 将司, 奥山 浩二郎, 川村 信行
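<Corresponding author E-mail:> takahara.wataru(at)jp.panasonic.com
<Abstract:> In this study, using in-house experimental data, we constructed machine learning models for predicting the relative permittivity (ε) and dielectric loss tangent (tanδ), which are important for industrial applications of thermosetting resin composites. A wide range of methods was used to build the models, including gradient boosting decision tree (GBDT) algorithms, which have attracted attention in recent years. From the models built with these methods, we extracted those whose coefficient of determination in cross-validation on the training data set satisfied R²CV > 0.8. We then selected the models with small RMSE (root mean square error) and MAE (mean absolute error) values on the training data set, which were considered capable of more quantitative property prediction, and evaluated them on the test data set. As a result, we obtained machine learning models that predict the properties with RMSE and MAE on the order of 10^-1 to 10^-2 of the respective mean values of ε and tanδ. These results demonstrate for the first time that the MI (materials informatics) approach is also effective for thermosetting resin composites and that quantitative property prediction is possible. We expect that using this approach in future development will shorten material development time and accelerate material development.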
<Keywords:>
<URL:> https://www.jstage.jst.go.jp/article/jccj/20/1/20_2021-0026/_article/-char/ja/
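
The screening criteria named in the abstract above (R², RMSE, MAE) follow standard definitions, sketched below in plain Python. The function name and the numbers are hypothetical and are not data from the paper; for R²CV the same formula would be applied to the out-of-fold predictions collected during cross-validation.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE, and R² for paired observed and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n
    ss_res = sum(e ** 2 for e in errors)                 # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    return {
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
        "r2": 1 - ss_res / ss_tot,
    }

# Hypothetical observed and predicted values, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 4.5]
metrics = regression_metrics(observed, predicted)

# Screening rule from the abstract: keep models whose R² exceeds 0.8.
passes_screen = metrics["r2"] > 0.8
```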

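Capsule Formation of Thermally Polymerized Amino Acids via an Autocatalytic Reaction Mechanism [Published online J. Comput. Chem. Jpn., 20, 10-13, by J-STAGE]

[Published online Journal of Computer Chemistry, Japan Vol.20, 10-13, by J-STAGE]
<Title:> Capsule Formation of Thermally Polymerized Amino Acids via an Autocatalytic Reaction Mechanism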
<Author(s):> 伊藤 俊介, 櫻沢 繁
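<Corresponding author E-mail:> sakura(at)fun.ac.jp
<Abstract:> Thermally polymerized amino acids are primitive polymers obtained by thermal polymerization of mixtures of several amino acids. Microspheres of thermally polymerized amino acids form a capsule on their outer side in response to changes in the surrounding environment, but the mechanism is not understood. In this study, we hypothesized that the autocatalytic nature of thermally polymerized amino acids is the cause of capsule formation, and we tested this hypothesis by incorporating an autocatalytic cluster formation mechanism into Brownian dynamics. When the cluster formation mechanism was incorporated, high-density regions were found to form. This result suggests that clusters in the high-density regions grow further to form capsule-like structures. These findings show that thermally polymerized amino acids, which are primitive polymers, can form physical compartments thought to have contributed to the origin of life, and that this ability derives from their autocatalytic association process.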
<Keywords:> Colloid, Self-Organizing System, Origin of life, Compartment Formation, Nonlinearity
<URL:> https://www.jstage.jst.go.jp/article/jccj/20/1/20_2020-0027/_article/-char/ja/

Attempts at Analysis Using Machine Learning for Various Scientific Data [Published online J. Comput. Chem. Jpn., 19, A21-A24, by J-STAGE]

[Published online Journal of Computer Chemistry, Japan Vol.19, A21-A24, by J-STAGE]
<Title:> Attempts at Analysis Using Machine Learning for Various Scientific Data
<Author(s):> 奥脇 弘次, 増田 淳希, 柿沼 紗也果, 谷川 貴一, 水野 寛哉, 満野 仁美, 伊藤 雅仁, 藤方 玲衣, 望月 祐志
<Corresponding author E-mail:> okuwaki(at)rikkyo.ac.jp
<Abstract:> In recent years, machine learning and deep learning technologies have advanced in various fields, and a number of ready-to-use software packages have been released. Our research group has attempted to establish analysis methods using machine learning for various scientific data. In this paper, we report on further developments such as prediction of the lipophilicity of molecules, analysis of psalms data using natural language processing, and a similarity calculation system for spectrum data.
<Keywords:>
<URL:> https://www.jstage.jst.go.jp/article/jccj/19/4/19_2021-0018/_article/-char/ja/

MO Simulation Analysis of Asymmetric Diels-Alder Synthesis Reactions with an Unusual Base Catalyst [Published online J. Comput. Chem. Jpn., 19, 175-177, by J-STAGE]

[Published online Journal of Computer Chemistry, Japan Vol.19, 175-177, by J-STAGE]
<Title:> MO Simulation Analysis of Asymmetric Diels-Alder Synthesis Reactions with an Unusual Base Catalyst
<Author(s):> 染川 賢一, 上田 岳彦, 吉留 俊史, 石川 岳志, 錦織 寿
<Corresponding author E-mail:> somekw(at)voice.ocn.ne.jp
<Abstract:> The reaction processes and steric situations of the novel base-catalyzed chiral Diels-Alder reactions reported by Kagan et al. were examined by IRC calculations in PM7 simulations of the three-molecule reactions. The addition reactions of enolic dienes (1) with dienophiles (2) in the presence of amines (3) such as (S)-(+)-prolinol / (R)-(-)-prolinol proceeded in two steps via lower-energy reaction complexes (RC) and transition states (TS). The steric shapes obtained by IRC (Figures 2-6) clearly showed the interactions between the reaction points, and between the OH and amine moieties, in the 1-3, 1-3-2, and TS complexes, which give highly stereoselective adducts. IRC calculations for some reactions also correctly predict the Michael reaction selectivity. The convenient PM7 simulation is recommended for routine chemical research.
<Keywords:> MOPAC2016-PM7, Diels-Alder reaction, Chiral and basic catalyst, Transition state, MO simulation, Reaction complex, IRC
<URL:> https://www.jstage.jst.go.jp/article/jccj/19/4/19_2021-0011/_article/-char/ja/

Evaluation of the Characteristics of Graph Convolutional Networks in Ames Prediction of Compounds [Published online J. Comput. Chem. Jpn., 20, 1-9, by J-STAGE]

[Published online Journal of Computer Chemistry, Japan Vol.20, 1-9, by J-STAGE]
<Title:> Evaluation of the Characteristics of Graph Convolutional Networks in Ames Prediction of Compounds
<Author(s):> 半田 千彰, 小沢 知永, 福澤 薫, 米持 悦生
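<Corresponding author E-mail:> chiaki_handa(at)pharm.kissei.co.jp
<Abstract:> In silico prediction of the Ames test, an early warning system for the potential carcinogenicity of drug candidates, is one of the important prediction targets in drug discovery research. For prediction by machine learning, which is one in silico approach, there is research on defining the Applicability Domain (AD), the data region in which a machine learning model can deliver its intended performance. In drug discovery research, predictions are sometimes made for drug candidate compounds with low structural similarity to the training data; such compounds are likely to fall outside the AD, and prediction accuracy tends to decrease. In this study, we built machine learning prediction models for the Ames test, prepared as test data two groups of compounds with a high probability of lying inside and outside the AD, respectively, and evaluated the prediction performance of several machine learning methods. We compared the Graph Convolutional Network (GCN), which has attracted attention in drug discovery with the development of artificial intelligence technology, with existing machine learning methods, and found that GCN outperformed the existing methods on the group of compounds likely to fall outside the AD.
<Keywords:> Graph Convolutional Network, Machine learning, Ames test, Applicability Domain, Structural similarity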
<URL:> https://www.jstage.jst.go.jp/article/jccj/20/1/20_2020-0015/_article/-char/ja/
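
The abstract above does not state how the AD was defined. One common similarity-based approach, shown here only as an assumed illustration, treats a compound as inside the AD when its nearest training compound has a Tanimoto similarity above a cutoff. The fingerprints, the 0.5 threshold, and the function names are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of fingerprint bits."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def in_domain(query, training, threshold=0.5):
    """True if the nearest training compound is at least `threshold` similar."""
    return max(tanimoto(query, t) for t in training) >= threshold

# Hypothetical fingerprints: each set holds the indices of the bits that are on.
training = [{1, 2, 3, 4}, {2, 3, 5, 8}]
similar_query = {1, 2, 3, 9}       # shares 3 of 5 bits with the first compound
dissimilar_query = {10, 11, 12}    # shares no bits with any training compound
```

Under this assumed rule, `in_domain(similar_query, training)` is true and `in_domain(dissimilar_query, training)` is false, which is the kind of inside/outside grouping the study uses to build its two test sets.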

Differences between Gaussian and GAMESS Basis Sets (II) ―6-31G and 6-31G*― [Published online J. Comput. Chem. Jpn. Int. Ed., 7, -, by J-STAGE]

[Published online Journal of Computer Chemistry, Japan -International Edition Vol.7, -, by J-STAGE]
<Title:> Differences between Gaussian and GAMESS Basis Sets (II) ―6-31G and 6-31G*―
<Author(s):> Munetaka TAKEUCHI, Masafumi YOSHIDA, Umpei NAGASHIMA
<Corresponding author E-mail:> myoshida(at)tcu.ac.jp
<Abstract:> Gaussian and GAMESS, which are calculation codes for the ab initio molecular orbital method, can be used by simply specifying a basis set name such as 6-31G. However, if a basis set with a common name does not have the same parameter set in both codes, the two codes will produce different results. Previously, we used Gaussian and GAMESS for STO-3G calculations of hydrides containing third-period elements and compared the results [J. Comput. Chem. Jpn., 18, 194 (2019)]. In this study, we used 6-31G and 6-31G* for 36 molecules containing a first- to fourth-period element (H, Be, N, Ne, Na-Kr) and compared the results calculated using the two codes. For molecules containing a first- to third-period element (H, Be, N, Ne, Na-Ar) except Si, the optimized structure and total energy obtained with Gaussian and GAMESS were almost the same, whereas the two codes gave different results for K, Ca, and Ga-Kr because the basis parameters used in the two codes are different. On the other hand, the results for Sc-Zn were in agreement. When results calculated using the Gaussian and GAMESS codes are compared or combined, it is necessary to check carefully whether the input data produce a sufficiently accurate calculation result.
<Keywords:> Basis set, Gaussian, GAMESS, 6-31G, 6-31G*, Total energy
<URL:> https://www.jstage.jst.go.jp/article/jccjie/7/0/7_2020-0010/_html

Density Functional Study of σ Bond Cleavage in P-P Multiple Bond of Phosphinophosphinidene [Published online J. Comput. Chem. Jpn. Int. Ed., 7, -, by J-STAGE]

[Published online Journal of Computer Chemistry, Japan -International Edition Vol.7, -, by J-STAGE]
<Title:> Density Functional Study of σ Bond Cleavage in P-P Multiple Bond of Phosphinophosphinidene
<Author(s):> Toshiaki MATSUBARA, Keisuke SHIRASAKA
<Corresponding author E-mail:> matsubara(at)kanagawa-u.ac.jp
<Abstract:> Recently, the synthesis of phosphinophosphinidene, which is a phosphorus analog of carbene, has been reported. Subsequent experimental reports have shown that phosphinophosphinidene acts as an electron acceptor. Because the terminal phosphorus atom inherently acts as an electron donor, chemical reactions may lead to σ bond cleavage at the phosphorus atom through charge-transfer interaction. In this study, we explore the possibility of σ bond cleavage in H-H, C-H, O-H, N-H, and B-H bonds by means of the density functional method using the model molecules H2, CH4, H2O, NH3, and BH3. For H2 and CH4, the H-H and C-H bonds were found to be broken at the single site of the terminal phosphorus atom by charge-transfer interactions. The potential energy barrier of about 22-24 kcal/mol is similar to that for carbene, suggesting the possibility of σ bond cleavage in phosphinophosphinidene. In contrast, for H2O and NH3, the O-H and N-H bonds are broken at the two sites of both phosphorus atoms by the abstraction of hydrogen as a proton. In the case of BH3, cleavage of the B-H bond occurs easily at both the single and dual sites of the phosphorus atoms.
<Keywords:> Density functional method, Phosphinophosphinidene, σ Bond cleavage, Reaction mechanism
<URL:> https://www.jstage.jst.go.jp/article/jccjie/7/0/7_2020-0003/_html

Materials Informatics Approach to Predictive Models for Elastic Modulus of Polypropylene Composites Reinforced by Fillers and Additives [Published online J. Comput. Chem. Jpn. Int. Ed., 7, -, by J-STAGE]

[Published online Journal of Computer Chemistry, Japan -International Edition Vol.7, -, by J-STAGE]
<Title:> Materials Informatics Approach to Predictive Models for Elastic Modulus of Polypropylene Composites Reinforced by Fillers and Additives
<Author(s):> Yuko IKEDA, Michihiro OKUYAMA, Yukihito NAKAZAWA, Tomohiro OSHIYAMA, Kimito FUNATSU
<Corresponding author E-mail:> tomohiro.oshiyama(at)konicaminolta.com
<Abstract:> Advanced processes are useful when developing polymer composites because there are an enormous number of possible combinations of fillers and additives to realize polymers with desired properties. Materials informatics is a data-driven approach to find novel materials or a suitable combination of materials from material data sheets. Here, we used materials informatics to construct a predictive model for the elastic modulus of polypropylene composites. To apply materials informatics to existing experimental data, we described explanatory variables by a combination of 0 and 1 representing polypropylene, or by the content ratio of filler and additive, without using materials property data. We constructed a predictive model for the elastic modulus of polypropylene composites using a partial least squares regression model with dummy variables. To validate the predictive model, comparisons were made between measured and predicted elastic moduli for eight new polypropylene composites. The residual was less than 300 MPa for the range 1,000-3,000 MPa. We improved the accuracy of the prediction for composites with high filler content ratio by applying a nonlinear support vector regression model. The predictive model is therefore useful for identifying suitable combinations of polypropylene, filler and additive to achieve a desired elastic modulus.
<Keywords:> Materials informatics, Elastic modulus, Polypropylene composite, PLS, SVR, Dummy variables
<URL:> https://www.jstage.jst.go.jp/article/jccjie/7/0/7_2020-0007/_html
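
The descriptor scheme in the abstract above (0/1 dummy variables for the polypropylene, content ratios for fillers and additives) can be sketched as follows. The grade names, component names, ratios, and the `build_descriptors` helper are hypothetical; the resulting rows are the kind of input a PLS or SVR model would then be fitted on.

```python
def build_descriptors(samples, pp_grades, components):
    """Encode each composite as [PP dummy variables..., content ratios...]."""
    rows = []
    for sample in samples:
        # One 0/1 column per polypropylene grade (dummy variables).
        dummies = [1.0 if sample["pp"] == grade else 0.0 for grade in pp_grades]
        # One content-ratio column per filler/additive; absent components are 0.
        ratios = [sample["content"].get(name, 0.0) for name in components]
        rows.append(dummies + ratios)
    return rows

# Hypothetical formulations; names and ratios are illustrative only.
pp_grades = ["PP-A", "PP-B"]
components = ["talc", "glass_fiber", "additive_X"]
samples = [
    {"pp": "PP-A", "content": {"talc": 0.20}},
    {"pp": "PP-B", "content": {"glass_fiber": 0.30, "additive_X": 0.01}},
]
X = build_descriptors(samples, pp_grades, components)
```

No measured material property enters the descriptors, which matches the abstract's point that the encoding works from formulation records alone.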

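Study of Carrier Generation and the Power Generation Mechanism in Conductive Polymer Solar Cells [Published online J. Comput. Chem. Jpn., 19, 172-174, by J-STAGE]

[Published online Journal of Computer Chemistry, Japan Vol.19, 172-174, by J-STAGE]
<Title:> Study of Carrier Generation and the Power Generation Mechanism in Conductive Polymer Solar Cells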
<Author(s):> 中村 潤之介, 原岡 壮馬, 成島 和男
<Corresponding author E-mail:> narushim(at)ube-k.ac.jp
<Abstract:> Organic solar cells are flexible elements that offer benefits such as low cost, light weight, and applicability. For this study, quantum chemistry calculations of cells using conductive polymers were performed. Poly(3,4-ethylenedioxythiophene)-poly(styrenesulfonate) (PEDOT:PSS), phthalocyanine, and fullerene C60 were used as semiconductors. Generation of carriers (conduction electrons and holes) was expected in both the ground state and the excited state. When MgAg and Al were selected for the electrodes, the energy diagram was found to have an ideal step structure. The organic solar cell was designed from the energy diagram of the whole layer that constituted the solar cell.
<Keywords:> Organic solar cells, Conductive polymer, Quantum chemistry calculations, Generation of carriers, Energy diagram
<URL:> https://www.jstage.jst.go.jp/article/jccj/19/4/19_2021-0007/_article/-char/ja/