NATEA-SV SIG-Soft Seminar


Date: Jan 20, 2005 (Thursday)

Time: 6:30 - 9:15 PM

6:30-7:30 Dinner

7:45-8:30 First talk

8:30-9:15 Second talk

Place: Formosa Restaurant, 1655 S. De Anza Blvd., Cupertino, CA 95014 (Tel: 408-257-1120)

Fee: Free admission for the talks; participants pay for their own dinner (~$10)


First Talk: Personal Image Organization and Search




At a recent technical forum hosted by PBS in the Bay Area, major VCs and search-engine CEOs recognized that next-generation search engines must be able to deal with multimedia data (images and videos) in a mobile environment. As the number of cell phones and digital cameras in circulation continues to soar, and the storage cost/capacity ratio continues to drop, the quantity of digital photos is growing rapidly. To organize large repositories of photos for efficient sharing, browsing, and searching, it is highly desirable to provide photos with meaningful metadata such as time (when), location (where), names of people (who), landmarks (what), and events (inferred from when, who, where, and what). Although digital cameras can provide the when and where, obtaining the who and what will depend largely on analyzing photo content and the relationships among events and photographs. The grand challenge, event recognition, will depend on intelligently detecting the synergistic relationship between context and content. In this talk, I will discuss some fundamental issues in building an image annotation system that combines context and content, and the recent efforts of our team (consisting of members from UCSB, Stanford, and VIMA) to develop a personalized image organizer and search engine.




Professor Edward Chang received his M.S. in Computer Science and Ph.D. in Electrical Engineering from Stanford University in 1994 and 1999, respectively. He has been an Associate Professor of Electrical and Computer Engineering at the University of California, Santa Barbara, since 2003. His recent research activities are in the areas of machine learning, data mining, and high-dimensional data indexing, and their applications to image databases and video surveillance. Recent research contributions of his group include methods for learning image/video query concepts via active learning, formulating distance functions via dynamic associations and kernel alignment, managing and fusing distributed video-sensor data, and categorizing and indexing high-dimensional image/video information. Professor Chang has served on several conference program committees, including ACM SIGMOD, ACM Multimedia, ACM CIKM, SIAM Data Mining, International Conference on Artificial Intelligence, and International Conference on Computer Vision. He co-chaired the first two annual ACM Video Sensor Network Workshops in 2003 and 2004, and will co-chair three major multimedia conferences in the next two years. He serves as an Associate Editor for IEEE Transactions on Knowledge and Data Engineering. Professor Chang is a recipient of the IBM Faculty Partnership Award and the NSF CAREER Award. He is a co-founder of VIMA Technologies, which provides image searching and filtering solutions.


Second Talk: Video Semantic Analysis and Retrieval




With the growing amount of multimedia content and the increasing popularity of search engines, people are increasingly interested in viewing personalized videos tailored to their needs. They want to access only the media content that matches their preferences and to display it on their own devices. Because data sources and user clients are heterogeneous, it is a real challenge to implement a universally compliant system that satisfies diverse usage environments.


We introduce a multimedia semantic framework that analyzes media and generates semantic models to support content understanding. In this talk, we present our efforts in developing semantic representations for video understanding through the IBM VideoAnnEx MPEG-7 annotation tool. We demonstrate a video personalization and summarization system that dynamically generates a personalized video summary based on user preference and usage environment. Experimental results for semantic modeling are shown in the context of the TREC Video Retrieval Benchmark.




Belle Tseng is a Senior Research Staff Member in the Intelligent Information Integration Department at NEC Laboratories America, where she leads a team on Enterprise Activity Intelligence, a system for real-time decision support with knowledge discovery through user behavior and social network analysis. She received her Ph.D. in Electrical Engineering from Columbia University in 1996 and her M.S. and B.S. in Electrical Engineering and Mathematics from MIT in 1992. Before joining NEC, Belle worked at the IBM T. J. Watson Research Center in the Pervasive Multimedia Management department on research projects in multimedia databases, semantic understanding, personalization, and summarization using MPEG-7 and MPEG-21. Her contributions include the MPEG-2 stereoscopic CODEC, an immersive whiteboard collaboration system, MPEG-7 rich media summarization, and MPEG-21 personalized content adaptation. Belle's team performed best in the NIST TREC video semantic retrieval benchmarking in 2001 and in the video concept detection benchmarking in 2002.



Please send your RSVP to Yen-Kuang Chen. Although an RSVP is not required, it helps the organizing committee better arrange the seminar and dinner.