
Paper 1 : Multi-Task Convolutional Neural Network for Face Recognition

Authors: Yin Xi, Liu Xiaoming

Subject: Computer Vision and Pattern Recognition

Submitted Date: 20170215

Abstract: This paper explores multi-task learning (MTL) for face recognition. We answer the questions of how and why MTL can improve the face recognition performance. First, we propose a multi-task Convolutional Neural Network (CNN) for face recognition where identity recognition is the main task and pose, illumination, and expression estimations are the side tasks. Second, we develop a dynamic-weighting scheme to automatically assign the loss weight to each side task. Third, we propose a pose-directed multi-task CNN by grouping different poses to learn pose-specific identity features, simultaneously across all poses. We observe that the side tasks serve as regularizations to disentangle the variations from the learnt identity features. Extensive experiments on the entire Multi-PIE dataset demonstrate the effectiveness of the proposed approach. To the best of our knowledge, this is the first work using all data in Multi-PIE for face recognition. Our approach is also applicable to in-the-wild datasets for pose-invariant face recognition and we achieve comparable or better performance than state of the art on LFW, CFP, and IJB-A.
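
For illustration, here is a minimal sketch of combining a main identity loss with dynamically weighted side-task losses. The weighting rule (a softmax over learnable logits) and all sizes are assumptions for the example, not the paper's actual scheme.

```python
import torch
import torch.nn as nn

# Minimal sketch of a multi-task head with dynamically weighted side tasks.
# The weighting rule and the layer/label sizes are illustrative assumptions.
class MultiTaskHead(nn.Module):
    def __init__(self, feat_dim=256, n_ids=337, n_poses=15, n_illums=20, n_exprs=6):
        super().__init__()
        self.identity = nn.Linear(feat_dim, n_ids)   # main task
        self.pose = nn.Linear(feat_dim, n_poses)     # side task
        self.illum = nn.Linear(feat_dim, n_illums)   # side task
        self.expr = nn.Linear(feat_dim, n_exprs)     # side task
        self.side_logits = nn.Parameter(torch.zeros(3))  # -> dynamic weights

    def loss(self, feat, labels):
        ce = nn.functional.cross_entropy
        main = ce(self.identity(feat), labels["id"])
        side = torch.stack([
            ce(self.pose(feat), labels["pose"]),
            ce(self.illum(feat), labels["illum"]),
            ce(self.expr(feat), labels["expr"]),
        ])
        weights = torch.softmax(self.side_logits, dim=0)  # positive, sum to 1
        return main + (weights * side).sum()
```

Because the weights stay positive and sum to one, the side tasks act as a bounded regularizer on the shared feature rather than competing freely with the identity loss.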


Paper 2 : Visual Discovery at Pinterest

Authors: Zhai Andrew, Kislyuk Dmitry, Jing Yushi, Feng Michael, Tzeng Eric, Donahue Jeff, Du Yue Li, Darrell Trevor

Subject: Computer Vision and Pattern Recognition

Submitted Date: 20170215

Abstract: Over the past three years Pinterest has experimented with several visual search and recommendation services, including Related Pins (2014), Similar Looks (2015), Flashlight (2016) and Lens (2017). This paper presents an overview of our visual discovery engine powering these services, and shares the rationales behind our technical and product decisions such as the use of object detection and interactive user interfaces. We conclude that this visual discovery engine significantly improves engagement in both search and recommendation tasks.


Paper 3 : Handwritten Arabic Numeral Recognition using Deep Learning Neural Networks


Authors: Ashiquzzaman Akm, Tushar Abdul Kawsar

Subject: Computer Vision and Pattern Recognition

Submitted Date: 20170215

Abstract: Handwritten character recognition is an active area of research with applications in numerous fields. Past and recent works in this field have concentrated on various languages. Arabic is one language where the scope of research is still widespread, with it being one of the most popular languages in the world and being syntactically different from other major languages. Das et al. \cite{DBLP:journals/corr/abs-1003-1891} pioneered the research for handwritten digit recognition in Arabic. In this paper, we propose a novel algorithm based on deep learning neural networks using an appropriate activation function and regularization layer, which shows significantly improved accuracy compared to the existing Arabic numeral recognition methods. The proposed model gives 97.4 percent accuracy, which is the highest recorded accuracy on the dataset used in the experiment. We also propose a modification of the method described in \cite{DBLP:journals/corr/abs-1003-1891}, where our method scores accuracy identical to that of \cite{DBLP:journals/corr/abs-1003-1891}, with the value of 93.8 percent.
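
As a rough illustration of the kind of model the abstract describes, the following is a small CNN for the 10 numeral classes with an explicit activation choice (ReLU) and a regularization layer (dropout). The layer sizes and the 28x28 single-channel input are assumptions for the example.

```python
import torch.nn as nn

# Minimal sketch: small CNN for handwritten numeral classification with an
# explicit activation (ReLU) and a regularization layer (dropout).
model = nn.Sequential(
    nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),                      # 28x28 -> 14x14
    nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),                      # 14x14 -> 7x7
    nn.Flatten(),
    nn.Dropout(0.5),                      # regularization layer
    nn.Linear(64 * 7 * 7, 128), nn.ReLU(),
    nn.Linear(128, 10),                   # 10 Arabic numeral classes
)
```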


Paper 4 : Computational Model for Predicting Visual Fixations from Childhood to Adulthood

Authors: Meur Olivier Le, Coutrot Antoine, Liu Zhi, Roch Adrien Le, Helo Andrea, Rama Pia

Subject: Computer Vision and Pattern Recognition

Submitted Date: 20170215

Abstract: How people look at visual information reveals fundamental information about themselves, their interests and their state of mind. While previous visual attention models output static 2-dimensional saliency maps, saccadic models aim to predict not only where observers look but also how they move their eyes to explore the scene. Here we demonstrate that saccadic models are a flexible framework that can be tailored to emulate observers' viewing tendencies. More specifically, we use the eye data from 101 observers split into 5 age groups (adults, 8-10 y.o., 6-8 y.o., 4-6 y.o. and 2 y.o.) to train our saccadic model for different stages of the development of the human visual system. We show that the joint distribution of saccade amplitude and orientation is a visual signature specific to each age group, and can be used to generate age-dependent scanpaths. Our age-dependent saccadic model not only outputs human-like, age-specific visual scanpaths, but also significantly outperforms other state-of-the-art saliency models. In this paper, we demonstrate that the computational modelling of visual attention, through the use of a saccadic model, can be efficiently adapted to emulate the gaze behavior of a specific group of observers.
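
A minimal sketch of the generation step the abstract describes: sampling saccades from a joint (amplitude, orientation) histogram to produce a scanpath. The histogram, its bin edges, and the start point are hypothetical inputs; age-specific behavior would come from fitting one histogram per age group.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_scanpath(joint_hist, amp_edges, ori_edges, start, n_fixations=10):
    """joint_hist[i, j] ~ frequency of amplitude bin i and orientation bin j."""
    x, y = start
    path = [(x, y)]
    probs = joint_hist.ravel() / joint_hist.sum()
    for _ in range(n_fixations - 1):
        flat_idx = rng.choice(probs.size, p=probs)
        i, j = np.unravel_index(flat_idx, joint_hist.shape)
        amp = rng.uniform(amp_edges[i], amp_edges[i + 1])  # saccade amplitude
        ori = rng.uniform(ori_edges[j], ori_edges[j + 1])  # saccade orientation
        x, y = x + amp * np.cos(ori), y + amp * np.sin(ori)
        path.append((x, y))
    return path
```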


Paper 5 : Visualizing Deep Neural Network Decisions: Prediction Difference Analysis

Authors: Zintgraf Luisa M, Cohen Taco S, Adel Tameem, Welling Max

Subject: Computer Vision and Pattern Recognition, Artificial Intelligence

Submitted Date: 20170215

Abstract: This article presents the prediction difference analysis method for visualizing the response of a deep neural network to a specific input. When classifying images, the method highlights areas in a given input image that provide evidence for or against a certain class. It overcomes several shortcomings of previous methods and provides great additional insight into the decision-making process of classifiers. Making neural network decisions interpretable through visualization is important both to improve models and to accelerate the adoption of black-box classifiers in application areas such as medicine. We illustrate the method in experiments on natural images (ImageNet data), as well as medical images (MRI brain scans).
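
A minimal sketch of the prediction-difference idea: perturb each image patch, re-run the classifier, and record how much evidence the patch carried for the target class. The paper marginalizes features with a conditional sampling procedure; mean-imputation is used here as a simplifying stand-in.

```python
import numpy as np

def prediction_difference(predict, image, target_class, patch=8):
    """predict: callable returning a vector of class probabilities."""
    h, w = image.shape[:2]
    base = predict(image)[target_class]
    heatmap = np.zeros((h, w))
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            occluded = image.copy()
            occluded[i:i + patch, j:j + patch] = image.mean()
            drop = base - predict(occluded)[target_class]
            heatmap[i:i + patch, j:j + patch] = drop  # >0: evidence for class
    return heatmap
```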


Paper 6 : Deep Multi-camera People Detection

Authors: Chavdarova Tatjana, Fleuret François

Subject: Computer Vision and Pattern Recognition

Submitted Date: 20170215

Abstract: Deep architectures are currently the top performing methods for monocular pedestrian detection. Surprisingly, they have not been applied in the multi-camera set-up. This is probably in large part due to the lack of large-scale labeled multi-camera data-sets with overlapping fields of view. Our main contribution is a strategy in which we re-use a pre-trained object detection network, fine-tune it on a large-scale monocular pedestrian data-set, and train an architecture which combines multiple instances of it on a small multi-camera data-set.

We estimate performance on both a new HD multi-view data-set and the standard one, PETS 2009, on which we outperform state-of-the-art methods.


Paper 7 : Normalized Total Gradient: A New Measure for Multispectral Image Registration

Authors: Chen Shu-Jie, Shen Hui-Liang

Subject: Computer Vision and Pattern Recognition

Submitted Date: 20170215

Abstract: Image registration is a fundamental issue in multispectral image processing. In filter wheel based multispectral imaging systems, the non-coplanar placement of the filters always causes the misalignment of multiple channel images. The selective characteristic of spectral response in multispectral imaging raises two challenges to image registration. First, the intensity levels of a local region may be different in individual channel images. Second, the local intensity may vary rapidly in some channel images while remaining stationary in others. Conventional multimodal measures, such as mutual information, correlation coefficient, and correlation ratio, can register images with different regional intensity levels, but will fail in the circumstance of severe local intensity variation. In this paper, a new measure, namely normalized total gradient (NTG), is proposed for multispectral image registration. The NTG is applied on the difference between two channel images. This measure is based on the key assumption (observation) that the gradient of the difference image between two aligned channel images is sparser than that between two misaligned ones. A registration framework, which incorporates image pyramid and global/local optimization, is further introduced for rigid transform. Experimental results validate that the proposed method is effective for multispectral image registration and performs better than conventional methods.
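
A minimal sketch of a normalized-total-gradient style measure between two channel images, following the abstract's observation that the gradient of the difference image is sparser when the channels are aligned. The exact normalization used in the paper may differ.

```python
import numpy as np

def total_gradient(img):
    # L1 norm of the image gradient field
    gy, gx = np.gradient(img.astype(float))
    return np.abs(gx).sum() + np.abs(gy).sum()

def ntg(a, b):
    # Sparse gradient of (a - b) -> small value -> well aligned channels.
    return total_gradient(a - b) / (total_gradient(a) + total_gradient(b) + 1e-12)
```

Registration then amounts to minimizing ntg over the rigid transform parameters, for instance inside the coarse-to-fine image-pyramid search the abstract mentions.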


Paper 8 : A deep learning model integrating FCNNs and CRFs for brain tumor segmentation

Authors: Zhao Xiaomei, Wu Yihong, Song Guidong, Li Zhenye, Zhang Yazhuo, Fan Yong

Subject: Computer Vision and Pattern Recognition

Submitted Date: 20170215

Abstract: Accurate and reliable brain tumor segmentation is a critical component in cancer diagnosis, treatment planning, and treatment outcome evaluation. Deep learning techniques are appealing for their capability of learning high-level task-adaptive image features and have been adopted in brain tumor segmentation studies. However, most of the existing deep learning based brain tumor segmentation methods are not equipped to ensure appearance and spatial consistency of the segmentation results. To improve tumor segmentation performance, we propose a novel brain tumor segmentation method by integrating fully convolutional neural networks (FCNNs) and Conditional Random Fields (CRFs) in a unified framework, rather than adopting the CRFs as a post-processing step of the tumor segmentation. Our segmentation model was trained using image patches and image slices in the following steps: 1) training FCNNs using image patches; 2) training the CRF-RNN using image slices of axial view with the parameters of the FCNNs fixed; and 3) fine-tuning the whole network using image slices. Our method can segment brain images slice by slice, much faster than image patch based tumor segmentation methods. Our method can segment tumors based on only 3 imaging modalities (Flair, T1c, T2), rather than 4 (Flair, T1, T1c, T2). We have evaluated our method on imaging data provided by the Multimodal Brain Tumor Image Segmentation Challenge (BRATS) 2013 and BRATS 2016. Our method was ranked 1st place on the 2013 Leaderboard dataset, 2nd place on the 2013 Challenge dataset, and 1st place in the multi-temporal evaluation in BRATS 2016.
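
A minimal sketch of the three-stage training schedule described above, with hypothetical placeholders for the FCNN, the CRF-RNN, and the two data loaders; none of this is the authors' code.

```python
import torch

def run_stage(model, loader, params, lr=1e-4):
    opt = torch.optim.Adam(params, lr=lr)
    for x, y in loader:
        loss = torch.nn.functional.cross_entropy(model(x), y)
        opt.zero_grad()
        loss.backward()
        opt.step()

def train(fcnn, crf_rnn, patch_loader, slice_loader):
    full = torch.nn.Sequential(fcnn, crf_rnn)
    run_stage(fcnn, patch_loader, fcnn.parameters())    # 1) FCNN on patches
    for p in fcnn.parameters():                         # 2) CRF-RNN on axial
        p.requires_grad_(False)                         #    slices, FCNN fixed
    run_stage(full, slice_loader, crf_rnn.parameters())
    for p in fcnn.parameters():                         # 3) fine-tune the whole
        p.requires_grad_(True)                          #    network on slices
    run_stage(full, slice_loader, full.parameters())
```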


Paper 9 : Application of Multi-channel 3D-cube Successive Convolution Network for Convective Storm Nowcasting

Authors: Zhang Wei, Han Lei, Sun Juanzhen, Guo Hanyang, Dai Jie

Subject: Computer Vision and Pattern Recognition

Submitted Date: 20170215

Abstract: Convective storm nowcasting has attracted substantial attention in various fields. Existing methods under a deep learning framework rely primarily on radar data. Although they nowcast storm advection well, it is still challenging to nowcast storm initiation and growth, due to the limitations of the radar observations. This paper describes the first attempt to nowcast storm initiation, growth, and advection simultaneously under a deep learning framework using multi-source meteorological data. To this end, we present a multi-channel 3D-cube successive convolution network (3D-SCN). As real-time re-analysis meteorological data can now provide valuable atmospheric boundary layer thermal dynamic information, which is essential to predict storm initiation and growth, both raw 3D radar and re-analysis data are used directly without any handcrafted feature engineering. These data are formulated as multi-channel 3D cubes, to be fed into our network, which are convolved by cross-channel 3D convolutions. By stacking successive convolutional layers without pooling, we build an end-to-end trainable model for nowcasting. Experimental results show that deep learning methods achieve better performance than traditional extrapolation methods. The qualitative analyses of 3D-SCN show encouraging results of nowcasting of storm initiation, growth, and advection.
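
A minimal sketch of a successive 3D convolution stack of the kind the abstract describes: radar and re-analysis cubes enter as channels of one multi-channel 3D cube, are convolved by cross-channel 3D convolutions, and no pooling is used so the spatial grid is preserved. Channel counts and depth are assumptions.

```python
import torch.nn as nn

model = nn.Sequential(
    nn.Conv3d(2, 32, kernel_size=3, padding=1), nn.ReLU(),   # 2 source channels
    nn.Conv3d(32, 64, kernel_size=3, padding=1), nn.ReLU(),  # successive convs,
    nn.Conv3d(64, 64, kernel_size=3, padding=1), nn.ReLU(),  # no pooling
    nn.Conv3d(64, 1, kernel_size=1),                         # per-voxel nowcast
)
```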


Paper 10 : Recognizing Dynamic Scenes with Deep Dual Descriptor based on Key Frames and Key Segments

Authors: Hong Sungeun, Ryu Jongbin, Im Woobin, Yang Hyun S.

Subject: Computer Vision and Pattern Recognition

Submitted Date: 20170215

Abstract: Dynamic scene recognition is a challenging problem in characterizing a collection of static appearances and dynamic patterns in moving scenes. While existing methods focus on reliable capturing of static and dynamic information, few works have explored frame selection from a dynamic scene sequence. In this paper, we propose dynamic scene recognition using a deep dual descriptor based on ‘key frames’ and ‘key segments.’ Key frames, which reflect the feature distribution of the sequence with a small number of frames, are used for capturing salient static appearances. Key segments, which are captured from the area around each key frame, provide an additional discriminative power by dynamic patterns within short time intervals. To this end, two types of transferred convolutional neural network features are used in our approach. A fully connected layer is used to select the key frames and key segments, while the convolutional layer is used to describe them. We conducted experiments using public datasets. Owing to a lack of benchmark datasets, we constructed a dataset comprised of 23 dynamic scene classes with 10 videos per class. The evaluation results demonstrated the state-of-the-art performance of the proposed method.


Paper 11 : Deep Heterogeneous Feature Fusion for Template-Based Face Recognition

Authors: Bodla Navaneeth, Zheng Jingxiao, Xu Hongyu, Chen Jun-Cheng, Castillo Carlos, Chellappa Rama

Subject: Computer Vision and Pattern Recognition

Submitted Date: 20170215

Abstract: Although deep learning has yielded impressive performance for face recognition, many studies have shown that different networks learn different feature maps: while some networks are more receptive to pose and illumination, others appear to capture more local information. Thus, in this work, we propose a deep heterogeneous feature fusion network to exploit the complementary information present in features generated by different deep convolutional neural networks (DCNNs) for template-based face recognition, where a template refers to a set of still face images or video frames from different sources, which introduces more blur, pose, illumination and other variations than traditional face datasets. The proposed approach efficiently fuses the discriminative information of different deep features by 1) jointly learning the non-linear high-dimensional projection of the deep features and 2) generating a more discriminative template representation which preserves the inherent geometry of the deep features in the feature space. Experimental results on the IARPA Janus Challenge Set 3 (Janus CS3) dataset demonstrate that the proposed method can effectively improve the recognition performance. In addition, we also present a series of covariate experiments on the face verification task for in-depth qualitative evaluations of the proposed approach.
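
A minimal sketch of the fusion idea: concatenate features from two different DCNNs and learn a non-linear projection into a joint template representation. All dimensions and the mean-pooling over a template's images are assumptions for the example.

```python
import torch
import torch.nn as nn

class FusionNet(nn.Module):
    def __init__(self, dim_a=512, dim_b=256, out_dim=128):
        super().__init__()
        # learned non-linear projection of the concatenated deep features
        self.project = nn.Sequential(
            nn.Linear(dim_a + dim_b, 256), nn.ReLU(),
            nn.Linear(256, out_dim),
        )

    def forward(self, feat_a, feat_b):
        # feat_a: (n_images, dim_a), feat_b: (n_images, dim_b) for one template
        fused = self.project(torch.cat([feat_a, feat_b], dim=-1))
        return fused.mean(dim=0)  # one representation per template
```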


Paper 12 : Analyzing the Weighted Nuclear Norm Minimization and Nuclear Norm Minimization based on Group Sparse Representation

Authors: Zha Zhiyuan, Zhang Xinggan, Wang Qiong, Bai Yechao, Tang Lan

Subject: Computer Vision and Pattern Recognition

Submitted Date: 20170215

Abstract: The nuclear norm minimization (NNM) tends to over-shrink the rank components and treats the different rank components equally, which limits its capability and flexibility in practical applications. Recent advances have suggested that the weighted nuclear norm minimization (WNNM) is expected to be more appropriate than NNM. However, it still lacks a plausible mathematical explanation of why WNNM is more appropriate than NNM. In this paper, we analyze the WNNM and NNM from the point of view of group sparse representation (GSR). Firstly, an adaptive dictionary for each group is designed. Then we show mathematically that WNNM is more appropriate than NNM. We apply the proposed scheme to two typical low-level vision tasks, including image deblurring and image compressive sensing (CS) recovery. Experimental results have demonstrated that the proposed scheme outperforms many state-of-the-art methods.
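
A minimal sketch contrasting the two estimators as singular-value shrinkage on a group matrix X: NNM soft-thresholds every singular value by the same amount, while WNNM shrinks each singular value by its own weight. The inverse-magnitude weights below are a common choice, not necessarily the paper's.

```python
import numpy as np

def nnm(X, tau):
    # uniform soft-thresholding of all singular values
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def wnnm(X, c, eps=1e-8):
    # per-component weights: large (informative) components shrink less
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    w = c / (s + eps)
    return U @ np.diag(np.maximum(s - w, 0.0)) @ Vt
```

Keeping the large singular values, which carry the dominant structure of the group, nearly intact while aggressively shrinking the small ones is what gives WNNM its flexibility over NNM.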


Paper 13 : Learning from Ambiguously Labeled Face Images

Authors: Chen Ching-Hui, Patel Vishal M., Chellappa Rama

Subject: Computer Vision and Pattern Recognition

Submitted Date: 20170215

Abstract: Learning a classifier from ambiguously labeled face images is challenging since training images are not always explicitly-labeled. For instance, face images of two persons in a news photo are not explicitly labeled by their names in the caption. We propose a Matrix Completion for Ambiguity Resolution (MCar) method for predicting the actual labels from ambiguously labeled images. This step is followed by learning a standard supervised classifier from the disambiguated labels to classify new images. To prevent the majority labels from dominating the result of MCar, we generalize MCar to a weighted MCar (WMCar) that handles label imbalance. Since WMCar outputs a soft labeling vector of reduced ambiguity for each instance, we can iteratively refine it by feeding it as the input to WMCar. Nevertheless, such an iterative implementation can be affected by the noisy soft labeling vectors, and thus the performance may degrade. Our proposed Iterative Candidate Elimination (ICE) procedure makes the iterative ambiguity resolution possible by gradually eliminating a portion of the least likely candidates in ambiguously labeled faces. We further extend MCar to incorporate the labeling constraints between instances when such prior knowledge is available. Compared to existing methods, our approach demonstrates improvement on several ambiguously labeled datasets.
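
A minimal sketch of the iterative candidate elimination loop: resolve the ambiguity, drop a fraction of the least likely remaining candidates per instance, and repeat. The resolve function stands in for WMCar and is a hypothetical placeholder returning soft labeling vectors.

```python
import numpy as np

def ice(resolve, candidates, rounds=3, drop_frac=0.2):
    """candidates: (n_instances, n_labels) boolean mask of candidate labels."""
    mask = candidates.astype(bool).copy()
    for _ in range(rounds):
        soft = np.where(mask, resolve(mask), -np.inf)  # soft labeling vectors
        for i in range(mask.shape[0]):
            idx = np.flatnonzero(mask[i])
            if idx.size <= 1:
                continue  # already unambiguous
            k = max(1, int(drop_frac * idx.size))
            worst = idx[np.argsort(soft[i, idx])[:k]]
            mask[i, worst] = False  # eliminate the least likely candidates
    return mask
```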


Paper 14 : Filling missing data in point clouds by merging structured and unstructured point clouds

Authors: Lippoldt Franziska, Schwandt Hartmut

Subject: Computational Geometry, Computer Vision and Pattern Recognition, Discrete Mathematics

Submitted Date: 20170215

Abstract: Point clouds arising from structured data, mainly as a result of CT scans, provide special properties on the distribution of points and the distances between them. Yet often, the amount of data provided cannot compare to unstructured point clouds, i.e. data that arises from 3D light scans or laser scans. This article proposes an approach to extend structured data and enhance its quality by inserting selected points from an unstructured point cloud. The resulting point cloud still has a partial structure that is called "half-structure". In this way, missing data that can not be optimally recovered through other surface reconstruction methods can be completed.
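
A minimal sketch of the selection-and-merge idea: keep the unstructured points that lie in holes of the structured cloud (far from every structured point) and insert them. The distance threshold is an assumed parameter; the paper's selection criterion may be more elaborate.

```python
import numpy as np
from scipy.spatial import cKDTree

def fill_missing(structured, unstructured, hole_radius=2.0):
    """Both inputs: (n, 3) arrays of xyz coordinates."""
    tree = cKDTree(structured)
    dist, _ = tree.query(unstructured, k=1)      # nearest structured neighbor
    selected = unstructured[dist > hole_radius]  # points covering the gaps
    return np.vstack([structured, selected])     # the "half-structured" cloud
```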

