
Files in this item:
File: jps_562_22.pdf
Size: 1.1 MB
Format: Adobe PDF
Title: 知覚と認知の計算理論
Other title: Computational Theory of Visual Perception and Cognition
Author: 乾, 敏郎
Alternative form of author name: Inui, Toshio
Issue date: 10-Oct-1996
Publisher: 京都哲学会 (within the Faculty of Letters, Kyoto University)
Journal title: 哲學研究
Volume: 562
Start page: 22
End page: 44
Abstract: Marr's philosophy has played a significant role in studies of the brain, notably in vision research during the 1980s (Marr, 1982). He proposed that the major function of vision is to estimate the 3-dimensional structure of the world from a 2-dimensional image projected onto the retina. Mathematically, this is an ill-posed problem, and a general solution cannot be given in most cases. He suggested that a unique solution to this ill-posed problem could be obtained if physical laws were taken into consideration as constraints. It was later pointed out that Marr's idea was conceptually equivalent to Tikhonov's method of standard regularization, which is a common method for solving inverse problems in mathematics (Poggio et al., 1985). With this realization, various visual computations have been formulated in precise mathematical terms. In addition, Marr suggested that there are many modules in early vision, and that the computation for estimating surfaces from the retinal image is carried out independently in each module. The computational theory of these modules is now generally called "Shape-from-X", and it has generated a large body of research. Here, X represents binocular disparity, shading, texture, motion, or some other source of information relevant to shape determination. In this paper, we first discuss how the outputs of early vision modules are integrated into one unique representation: a 2 1/2D sketch. For this problem, a Bayesian estimation framework is useful for explaining much of the psychophysical data obtained with a cue-conflict paradigm. We proposed a new theory for the integration between vision modules that is based on Bayesian estimation and a simple neural network.
On the other hand, a representational space should be a topological space that preserves the relative degree of similarity between patterns. In order to investigate the representation of a visual pattern, it is very important to develop a general framework that explains psychological similarity between patterns. In the latter part of this paper, the mechanism of visual cognition is discussed based on our recent research concerning psychological similarity. We then introduce a neural network model of the inferotemporal cortex, which is the center of visual cognition. Finally, we discuss the computation of information integration in middle vision and pattern representation within the general framework of visual computation proposed by Kawato and Inui (1990).
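Illustrative note (not part of the original record or paper): the abstract refers to Bayesian integration of cues in a cue-conflict setting. As a rough sketch of that general idea, and not of the author's model, the snippet below fuses independent cue estimates under the assumption of Gaussian noise and a flat prior, so that each cue is weighted by its reliability (inverse variance). The function name `combine_cues` and the example numbers are hypothetical.

```python
def combine_cues(estimates, variances):
    """Fuse independent Gaussian cue estimates (flat prior assumed).

    Each cue is weighted by its precision (1 / variance). This is the
    textbook reliability-weighted average, used here only as an
    illustration; it is not taken from the paper itself.
    Returns (fused_mean, fused_variance).
    """
    if not estimates or len(estimates) != len(variances):
        raise ValueError("need one positive variance per estimate")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")

    precisions = [1.0 / v for v in variances]
    total_precision = sum(precisions)
    fused_mean = sum(p * e for p, e in zip(precisions, estimates)) / total_precision
    fused_variance = 1.0 / total_precision
    return fused_mean, fused_variance


# Hypothetical cue conflict: disparity suggests a slant of 10 degrees
# (variance 1.0), texture suggests 20 degrees (variance 4.0).
mean, var = combine_cues([10.0, 20.0], [1.0, 4.0])
print(mean, var)
```

Under these assumptions the fused estimate lies closer to the more reliable cue, and its variance is smaller than that of either cue alone.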
DOI: 10.14989/JPS_562_22
URI: http://hdl.handle.net/2433/273726
Appears in collections: 第562號 (No. 562)

All items held in this repository are protected by copyright.