Human actions capture a wide variety of interactions between people and objects. As a result, the set of possible actions is extremely large, and it is difficult to obtain sufficient training examples for every action. However, we can compensate for this sparsity in supervision by leveraging the rich semantic relationships between different actions: a single action is often composed of smaller actions and is mutually exclusive with certain others. We need a method that can reason about such relationships and extrapolate unobserved actions from known ones. Hence, we propose a novel neural network framework that jointly extracts the relationships between actions and uses them to train better action retrieval models. Our model incorporates linguistic, visual, and logical-consistency cues to effectively identify these relationships. We train and test
our model on a large-scale image dataset of human actions. We show a significant improvement in mean AP compared to various baseline methods, including the HEX-graph approach of Deng et al.
Tag: Jia Deng
CVPR2015: Large-Scale Visual Recognition Challenge Tutorial
The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) is a competition for object classification over hundreds of object categories and millions of images. It has been held annually since 2010, with participation from more than 50 organizations.
A tutorial intended to train prospective participants was held on June 7, 2015. The presentations given at the tutorial are listed below.
Paper: ImageNet Large Scale Visual Recognition Challenge
Everything you wanted to know about ILSVRC: data collection, results, trends over the years, the current state of computer vision accuracy, and even a stab at comparing computer vision with human vision — it's all here!