More than 1800 participants showed up to discuss their research at this year’s International Conference on Computer Vision and Pattern Recognition (CVPR’12), held in Providence, RI last month. The main conference consisted of three eventful -- and exhausting -- days of talks and poster sessions, supplemented by an additional three days of tutorials and workshops.
This year, I found the CVPR posters to be especially energizing: poster presenters were mobbed by huge crowds that prompted the authors to start early and give encore performances through breaks and into subsequent sessions. Live demos and videos on laptops and tablets were increasingly common and allowed the audience to get a closer look at the research.
Here is a small sampling of papers (both oral and poster) that I particularly enjoyed:
- “Accidental pinhole and pinspeck cameras: revealing the scene outside the picture”: A. Torralba and W. Freeman
- “Weakly Supervised Structured Output Learning for Semantic Segmentation”: A. Vezhnevets, V. Ferrari, and J. Buhmann
- “A Theory of Flat Refractive Geometry”: A. Agrawal, S. Ramalingam, Y. Taguchi, and V. Chari
- “Learning Object Class Detectors from Weakly Annotated Video”: A. Prest, C. Leistner, J. Civera, C. Schmid, and V. Ferrari
Research at Google was very active at CVPR '12:
- Sebastian Thrun presented a plenary talk on self-driving cars
- Dennis Strelow gave an oral presentation, “General and Nested Wiberg Minimization”
- Google researchers co-authored an additional seven papers in the main conference:
  - “Collection Flow”: I. Kemelmacher, S. Seitz (Google)
  - “Computer Vision Aided Target Linked Radiation Imaging”: D. Gao, Y. Yao, F. Pan, T. Yu, L. Guan, B. Yu, T.-P. Tian, D. Walter, B. Yanoff, & N. Krahnstoever (Google)
  - “D-Nets: Beyond Patch-Based Image Descriptors”: F. von Hundelshausen, R. Sukthankar (Google)
  - “Model Recommendation for Action Recognition”: P. Matikainen, R. Sukthankar (Google), & M. Hebert
  - “Refractive Height Fields from Single and Multiple Images”: Q. Shan, S. Agarwal (Google), & B. Curless
  - “Schematic Surface Reconstruction”: C. Wu (Google), S. Agarwal (Google), B. Curless, S. Seitz (Google)
  - “Visibility Based Preconditioning for Bundle Adjustment”: A. Kushal, S. Agarwal (Google)
- Three invited talks were given in CVPR workshops:
  - “Monoscopic to Stereoscopic Conversion of YouTube Videos”: D. Mukherjee, C. Wu (both Google)
  - “Machine Perception for Content Discovery at YouTube”: A. Natsev (Google)
  - “Algorithmic Frontiers in Computer Vision”: H. Neven (Google)
- Two tutorials:
  - “Deep Learning Methods for Vision”: R. Fergus, H. Lee, M.'A. Ranzato (Google), G. Taylor, R. Salakhutdinov, & K. Yu
  - “Python for MATLAB Users: Promoting Open Source Computer Vision Research”: M. Leotta, A. Perera, E. Swears, P. Reynolds, Y. Zhao (Google), & V. Ganapathi (Google)
For me, the best part of CVPR was talking with graduate students about their work: at the doctoral consortium, during poster sessions and at the Google booth (where interesting demos and swag drew large crowds).
Since becoming a part of Research at Google last year, I’ve been particularly excited about the idea of training spatiotemporally localized object and action detectors from lots of video, with minimal human supervision -- a goal that seemed both technically and computationally infeasible until recently. It’s great to see that many in the CVPR community share my belief that we’re now ready to learn from large-scale video, and we’ve decided to organize a AAAI Spring Symposium on this topic.
Next year’s CVPR will be held in Portland, OR. I look forward to seeing many of you there!
Photo: M. Grundmann and V. Kwatra present the YouTube video stabilization demo