<!doctype html> <html lang="en"> <head> <!-- Required meta tags --> <meta charset="utf-8"> <meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no"> <meta name="description" content="Computer Vision group from the University of Oxford"> <meta name="keywords" content="vgg,oxford,computer vision,machine learning,research,software,publications,data"> <!-- Plugins CSS --> <link rel="stylesheet" href="/~vgg/assets/css/bootstrap.min.css"> <link rel="stylesheet" href="/~vgg/assets/fonts/themify/themify-icons.css"> <link rel="stylesheet" href="/~vgg/assets/css/slick.css"> <link rel="stylesheet" href="/~vgg/assets/css/slick-theme.css"> <link rel="stylesheet" href="/~vgg/assets/css/all.css"> <!-- Theme CSS --> <link rel="stylesheet" href="/~vgg/assets/css/style.css"> <link rel="stylesheet" href="/~vgg/assets/css/responsive.css"> <!-- cookiealert styles, see /~vgg/assets/css/cookiealert/LICENSE --> <link rel="stylesheet" href="/~vgg/assets/css/cookiealert/cookiealert.css"> <title>Visual Geometry Group - University of Oxford</title> </head> <body class="top-header"> <!-- LOADER TEMPLATE <div id="page-loader"> <div class="loader-icon fa fa-spin colored-border"></div> </div> <!-- /LOADER TEMPLATE --> <!-- COOKIE MESSAGE ================================================== --> <div class="alert text-center cookiealert" role="alert"> This website uses Google Analytics to help us improve the website content. This requires the use of standard Google Analytics cookies, as well as a cookie to record your response to this confirmation request. If this is OK with you, please click 'Accept cookies', otherwise you will see this notice on every page. 
For more information, please <span class="current-year"><a href="http://www.admin.ox.ac.uk/dataprotection/cookies/">click here</a></span> <button type="button" class="btn btn-primary btn-sm acceptcookies" aria-label="Close"> Accept cookies </button> </div> <!-- /COOKIE MESSAGE --> <!-- HEADER AREA ================================================== --> <section class="banner-area py-2"> <!-- INCLUDE VGG NAVBAR --> <div id="vggnavbar"></div> <!-- Content --> <div class="container"> <div class="row align-items-center"> <div class="col text-left"> <!-- Heading --> <h1 class="font-weight-bold mb-0 banner-area-subpage-title"> Research </h1> </div> <!-- / .col --> <!-- INCLUDE VGG SOCIALS --> <div id="vggsocials"></div> </div> <!-- / .row --> </div> <!-- / .container --> </section> <!-- ACTUAL PAGE CONTENT ================================================== --> <section id="about" class="section"> <div class="container"> <p>Click on a dataset category to expand/collapse it. <strong><span id="expand_categories">Click here to expand</span></strong> ALL categories. <strong><span id="collapse_categories">Click here to collapse</span></strong> ALL categories.</p> <button class="collapsible active">Sign language recognition</button> <div class="collapsible-content" style="display: block;"> <!-- Content for Sign language recognition --> <div class="row"> <!-- litfic --> <div class="col-md-6"> <div class="service-block media"> <a href="litfic/"><img src="ico_litfic.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="litfic/">Lost in Translation, Found in Context: Sign Language Translation with Contextual Cues</a></h3> <p> Learn about our sign language translation method which incorporates context allowing it to generate more complete and meaningful translations. 
</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="transpeller/"><img src="ico_transpeller.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="transpeller/">Weakly-supervised Fingerspelling Recognition in British Sign Language Videos</a></h3> <p> We propose the Transpeller model to recognize fingerspelt words in BSL videos. </p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="bsldensify/"><img src="ico_bsldensify.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="bsldensify/">Automatic dense annotation of large-vocabulary sign language videos</a></h3> <p> We propose a simple, scalable framework to vastly increase the density of automatic annotations in sign language interpreted TV broadcasts. </p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="bslalign/"><img src="bslalign_thumbnail.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="bslalign/">Aligning Subtitles in Sign Language Videos</a></h3> <p>We propose a Transformer architecture to temporally align asynchronous subtitles in sign language videos.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="bslattend/"><img src="bslattend.jpg" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="bslattend/">Read and Attend: Temporal Localisation in Sign Language Videos</a></h3> <p>We show that the ability to localise signs emerges from the attention patterns of the Transformer sequence prediction model.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="signsegmentation/"><img src="ico_signsegmentation.png" 
class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="signsegmentation/">Sign Language Segmentation with Temporal Convolutional Networks</a></h3> <p>We determine the location of temporal boundaries between signs in continuous sign language videos.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="bsldict/"><img src="ico_bsldict.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="bsldict/">Learning to spot signs from multiple supervisors</a></h3> <p>For a given sign and its corresponding dictionary video, our task is to identify whether and where it has occurred in a continuous sign language video.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="bsl1k/"><img src="ico_bsl1k.jpg" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="bsl1k/">BSL-1K: Scaling up co-articulated sign language recognition using mouthing cues</a></h3> <p>We introduce a new scalable approach to data collection for sign recognition in continuous videos.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="sign_language_new/"><img src="ico_sign.gif" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="sign_language_new/">Learning Sign Language by watching TV</a></h3> <p>Learning sign language from TV broadcasts using a combination of strong and weak supervision.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="pose_track/index.html"><img src="ico_posetrack.jpg" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="pose_track/index.html">Upper Body Pose Estimation and Tracking</a></h3> <p>Fast and 
accurate upper body pose estimation over long video sequences using a random forest framework with pose structured output.</p> </div> </div> </div> <!-- / .col --> </div> <!-- / .row --> </div> <button class="collapsible active">Self-supervised learning</button> <div class="collapsible-content" style="display: block;"> <!-- Content for Self-supervised learning --> <div class="row"> <!-- ADD NEW COLUM HERE TO ADD A NEW ITEM --> <div class="col-md-6"> <div class="service-block media"> <a href="lrtl/"><img src="ico_lrtl.gif" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="lrtl/">Learning segmentation from point trajectories</a></h3> <p>We address video object segmentation using motion as the sole supervision, introducing a method that leverages long-term point trajectories alongside optical flow.</p> </div> </div> </div> <div class="col-md-6"> <div class="service-block media"> <a href="loco/"><img src="ico_loco.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="loco/">LoCo: Learning 3D Location-Consistent Image Features with a Memory-Efficient Ranking Loss</a></h3> <p>A memory-efficient ranking loss helps to scale up learning of image features that are consistent under large viewpoint changes.</p> </div> </div> </div> <div class="col-md-6"> <div class="service-block media"> <a href="derender3d/"><img src="ico_derender3d.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="derender3d/">De-rendering 3D Objects in the Wild</a></h3> <p>A method for de-rendering a 3D object from a single image into shape, material, and lighting, that is trained in a weakly-supervised fashion relying only on rough shape estimates.</p> </div> </div> </div> <div class="col-md-6"> <div class="service-block media"> <a href="unsup-parts/"><img src="unsup-parts/thumbnail.png" class="img-fluid service-img border"></a> <div 
class="service-inner-content media-body px-1"> <h3><a href="unsup-parts/">Unsupervised Part Discovery with Contrastive Reconstruction</a></h3> <p>We propose an unsupervised method to decompose images of objects into semantically meaningful parts by building a self-supervised task that encourages the model to learn a semantic decomposition.</p> </div> </div> </div> <!-- <div class="col-md-6"> <div class="service-block media"> <a href="sorderender/"><img src="ico_sorderender.jpg" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="sorderender/">De-rendering the World's Revolutionary Artefacts</a></h3> <p>Learning to de-render a single image of a vase into shape, material and environment illumination, by training on only a single image collection, without explicit 3D, multi-view or multi-light supervision.</p> </div> </div> </div> --> <div class="col-md-6"> <div class="service-block media"> <a href="CoCLR/"><img src="ico_CoCLR.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="CoCLR/">Self-supervised Co-training for Video Representation Learning</a></h3> <p>Self-supervised video representation learning goes beyond instance discrimination by co-training both RGB and optical flow models.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="avobjects/"><img src="ico_avobjects.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="avobjects/">Self-Supervised Learning of Audio-Visual Objects from Video</a></h3> <p>Transforming a video into a set of discrete audio-visual objects using self-supervised learning.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="selavi/"><img src="ico_selavi.gif" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> 
<h3><a href="selavi/">Labelling unlabelled videos from scratch with multi-modal self-supervision</a></h3> <p>A novel multi-modal clustering method that allows unsupervised pseudo-labelling of a video dataset without any human annotations.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="self-label/"><img src="ico_self_label.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="self-label/">Self-labelling via simultaneous clustering and representation learning</a></h3> <p>Simultaneously learning feature representations and useful dataset labels by optimizing the common cross-entropy loss for features and labels, while maximizing information.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="DPC/"><img src="ico_dpc.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="DPC/">Video Representation Learning by Dense Predictive Coding</a></h3> <p>Self-supervised video representation learning by predicting spatio-temporal features in the future. 
RGB-stream action classification accuracy higher than ImageNet pretrained weights.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="corrflow/"><img src="ico_corrflow.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="corrflow/">Self-supervised Learning for Video Correspondence Flow</a></h3> <p>The objective of this research is to learn correspondences from videos in a self-supervised manner; the learnt embedding has shown superior performance for dense pixel-wise tracking.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="DVE/"><img src="ico_dve.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="DVE/">Unsupervised learning of landmarks by Descriptor Vector Exchange</a></h3> <p>DVE is a technique for learning high dimensional unsupervised landmarks.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="unsupervised_pose/"><img src="ico_unsupervised_pose.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="unsupervised_pose/">Learning Human Pose from Unaligned Data through Image Translation</a></h3> <p>Learn landmark detectors from unlabelled videos and unaligned pose annotations. 
No need for paired data/labelled images.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="unsup_learn_watch_faces/"><img src="ico_unsup_learn_watch_faces.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="unsup_learn_watch_faces/">Self-supervised learning of a class-specific representation for faces</a></h3> <p>Self-supervised learning of representations that can later be used in downstream tasks such as emotion prediction or landmark regression.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="probabilistic_introspection/"><img src="ico_probabilistic_introspection.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="probabilistic_introspection/">Self-supervised Learning of Geometrically Stable Features Through Probabilistic Introspection</a></h3> <p>This research aims at using self-supervision for geometry-oriented tasks such as semantic matching and part detection.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="unsupervised_landmarks/"><img src="ico_unsupervised_landmarks.jpg" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="unsupervised_landmarks/">Unsupervised Learning of Object Landmarks through Conditional Image Generation</a></h3> <p>A method that learns to discover object landmarks without any manual annotations.</p> </div> </div> </div> <!-- / .col --> </div> <!-- / .row --> </div> <button class="collapsible active">Audio-visual learning</button> <div class="collapsible-content" style="display: block;"> <!-- Content for Audio-visual learning --> <div class="row"> <!-- ADD NEW COLUM HERE TO ADD A NEW ITEM --> <div class="col-md-6"> <div class="service-block media"> <a href="gestsync/"><img 
src="ico_gestsync.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="gestsync/">GestSync: Determining who is speaking without a talking head</a></h3> <p> A new synchronisation task - Identify if the person’s gestures and speech are in-sync or not. </p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="vtp-for-lip-reading/"><img src="ico_vtp-for-lip-reading.gif" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="vtp-for-lip-reading/">Sub-word Level Lipreading with Visual Attention</a></h3> <p> We propose Visual Transformer Pooling (VTP) to pay attention to the lip region for lip reading. </p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="transpotter/"><img src="transpotter/thumbnail.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="transpotter/">Visual Keyword Spotting with Attention</a></h3> <p> We propose the Transpotter, a cross-modal attention based architecture for visual keyword spotting. 
</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="lvs/"><img src="lvs/lvs_icon.jpg" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="lvs/">Localizing Visual Sounds the Hard Way</a></h3> <p>Localize sound sources that are visible in a video using hard samples.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="vlid/"><img src="ico_vlid.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="vlid/">Now you're speaking my language: Video language identification</a></h3> <p>In this work we identify a spoken language just by interpreting the speaker's lip movements.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="sighttosound/"><img src="ico_sighttosound.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="sighttosound/">Sight to Sound: An End-to-end Approach for Visual Piano Transcription</a></h3> <p>Transcribing piano music from visual data alone.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="cross-modal-disentanglement/"><img src="ico_cross_modal_disentanglement.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="cross-modal-disentanglement/">Disentangled Speech Embeddings using Cross-Modal Self-Supervision</a></h3> <p>Disentanglement of speech embeddings into content and identity with only accompanying facetrack as supervision.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="speech2action/"><img src="ico_speech2action.gif" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a 
href="speech2action/">Speech2Action: Cross-modal Supervision for Action Recognition</a></h3> <p>Learning a model that predicts actions from transcribed speech alone.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="theconversation/"><img src="ico_conv.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="theconversation/">The Conversation: Deep Audio-Visual Speech Enhancement</a></h3> <p>Isolating individual voices in multi-speaker videos by conditioning on lip movement.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="concealed/"><img src="ico_concealed.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="concealed/">My lips are concealed: Audio-visual speech enhancement through obstructions</a></h3> <p>An audio-visual model for sound source separation robust to visual occlusions.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="deep_lip_reading/"><img src="ico_deep_lip_reading.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="deep_lip_reading/">Deep Lip Reading: A comparison of models and an online application</a></h3> <p>The goal of this work is to develop state-of-the-art models for lip reading -- visual speech recognition.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="speakerID/"><img src="ico_speakerID.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="speakerID/">Utterance-level Aggregation for Speaker Recognition in the Wild</a></h3> <p>The objective of this research is speaker recognition 'in the wild', where utterances may be of variable length and also contain irrelevant signals.</p> 
</div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="cross-modal-emotions/"><img src="ico_cross_modal_emotions.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="cross-modal-emotions/">Emotion Recognition in Speech using Cross-Modal Transfer in the Wild</a></h3> <p>Transferring knowledge of emotion from faces to voices.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="LearnablePins/"><img src="ico_learnable_pins.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="LearnablePins/">Learnable PINS: Cross-Modal Embeddings for Person Identity</a></h3> <p>Joint representations can be learnt for voices and faces with no identity supervision.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="CMBiometrics/"><img src="ico_cmbiometrics.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="CMBiometrics/">Seeing Voices and Hearing Faces: Cross-modal biometric matching</a></h3> <p>A network is trained to recognise faces from voices alone and vice versa.</p> </div> </div> </div> <!-- / .col --> </div> <!-- / .row --> </div> <button class="collapsible active">Understanding and training convolutional neural networks</button> <div class="collapsible-content" style="display: block;"> <!-- Content for Understanding and training convolutional neural networks --> <div class="row"> <!-- ADD NEW COLUM HERE TO ADD A NEW ITEM --> <div class="col-md-6"> <div class="service-block media"> <a href="qtae/"><img src="qtae/ico-qtae.gif" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="qtae/">Quantised Transforming Auto-Encoders: Achieving Equivariance to Arbitrary Transformations in Deep 
Networks</a></h3> <p> We teach networks to predict what an image would look like under a transformation, such as rotation and scale, but also much more general changes such as 3D rotations, object deformations and lighting changes. </p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="curveball/"><img src="ico_curveball.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="curveball/">Small Steps and Giant Leaps: Minimal Newton Solvers for Deep Learning</a></h3> <p>Curveball is a fast second-order method that can be used as a drop-in replacement for current deep learning solvers.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="https://dmitryulyanov.github.io/deep_image_prior"><img src="ico_deepquiz.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="https://dmitryulyanov.github.io/deep_image_prior">Deep Image Prior</a></h3> <p>In this work we show that the structure of a generator network is sufficient to capture a great deal of low-level image statistics prior to any learning.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="invrep/"><img src="ico_invrep.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="invrep/">Understanding Deep Image Representations by Inverting Them</a></h3> <p>Visualize representations by inverting them back into images with the help of a natural image prior.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="deeptex/"><img src="ico_deeptex.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="deeptex/">Deep Filter Banks for Texture Recognition, Description and Segmentation</a></h3> 
<p>This research explores the relation between texture representation and deep learning.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="deep_eval/"><img src="ico_deep_eval.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="deep_eval/">Deep Features Evaluation</a></h3> <p>Evaluation of deep convolutional features for image classification.</p> </div> </div> </div> <!-- / .col --> </div> <!-- / .row --> </div> <button class="collapsible active">Search and retrieval of images and video</button> <div class="collapsible-content" style="display: block;"> <!-- Content for Search and retrieval of images and video --> <div class="row"> <!-- ADD NEW COLUM HERE TO ADD A NEW ITEM --> <div class="col-md-6"> <div class="service-block media"> <a href="smooth-ap/"><img src="ico_smooth_ap.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="smooth-ap/">Smooth-AP: Smoothing the Path Towards Large-Scale Image Retrieval</a></h3> <p>We define an explicit smooth version of the Average Precision Loss, and show it to improve results for image retrieval.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="collaborative-experts/"><img src="ico_collaborative_experts.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="collaborative-experts/">Video retrieval using representations from collaborative experts</a></h3> <p>Collaborative Experts is a framework for combining deep neural networks for text-video retrieval.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="faces-in-places/"><img src="ico_faces_in_places.jpg" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="faces-in-places/">Faces in 
Places: Compound Query Retrieval</a></h3> <p>Retrieving images containing both a target person and a target scene type (e.g. Barack Obama on the beach) from a large dataset of images.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="on-the-fly/"><img src="ico_visor.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="on-the-fly/">Visual Search of BBC News</a></h3> <p>On-the-fly retrieval of object categories, instances and faces using a textual keyword.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="oxbuildings/index.html"><img src="ico_oxbuildings.jpg" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="oxbuildings/index.html">Oxford Building Search Demo</a></h3> <p>Search for specific objects in extremely large datasets of images using efficient methods.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="vgoogle/index.html"><img src="ico_vgoogle.jpg" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="vgoogle/index.html">Video Google Demo</a></h3> <p>Retrieve objects or scenes in a movie with the ease, speed and accuracy with which Google retrieves web pages containing particular words.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="fgoogle/index.html"><img src="ico_fgoogle.jpg" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="fgoogle/index.html">Video Google Faces</a></h3> <p>Retrieve shots containing particular people/actors in video using an imaged face as the query.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="pose_retrieval/index.html"><img 
src="ico_pose_retrieval.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="pose_retrieval/index.html">Pose-based Video Retrieval</a></h3> <p>Retrieve humans striking a pose from a database of Hollywood movies in real time.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="pose_estimation/index.html"><img src="ico_pose_estimation.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="pose_estimation/index.html">2D Human Pose Estimation and Search in TV Shows and Movies</a></h3> <p>Estimate the 2D body pose of people in images and video. Search a video dataset for people in a particular pose.</p> </div> </div> </div> <!-- / .col --> </div> <!-- / .row --> </div> <button class="collapsible active">Video-based recognition and understanding</button> <div class="collapsible-content" style="display: block;"> <!-- Content for Video-based recognition and understanding --> <div class="row"> <!-- ADD NEW COLUM HERE TO ADD A NEW ITEM --> <div class="col-md-6"> <div class="service-block media"> <a href="ChimpanzeeFaces/"><img src="ico_chimp_faces.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="ChimpanzeeFaces/">Chimpanzee face recognition from videos in the wild using deep learning</a></h3> <p>Face detection, tracking, and recognition of wild chimpanzees from long-term video records using deep CNNs.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="laeonet/"><img src="ico_laeonet.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="laeonet/">LAEO-Net: revisiting people Looking At Each Other in videos</a></h3> <p>Given two input head tracks and their relative position, the LAEO-Net determines if two people are looking at each 
other in a video.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="laeo/"><img src="ico_laeo.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="laeo/">Detecting people looking at each other in videos</a></h3> <p>The goal is to localise both spatially and temporally pairs of people looking at each other in video sequences.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="/~vgg/data/Sherlock/"><img src="/~vgg/data/icos/ico_Sherlock.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="/~vgg/data/Sherlock/">Character Identification in TV series without a Script</a></h3> <p>The goal of this work is to recognise people under unconstrained conditions automatically, from TV show and feature film material.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="ubc/"><img src="ico_ubc.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="ubc/">Upper Body Configuration Detection</a></h3> <p>Detecting configurations of one or more people in edited TV material.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="nface/index.html"><img src="ico_nface.jpg" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="nface/index.html">Automatic Naming of Characters in TV Video</a></h3> <p>Automatically label television or movie footage with the names of the people present in each frame of the video.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="rface/index.html"><img src="ico_rface.jpg" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> 
<h3><a href="rface/index.html">Real-time Person Identification</a></h3> <p>Identify people in video, using a modern multi-core computing architecture, with a video stream from a standard webcam as the only input.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="actions_interactions/"><img src="ico_act_interact.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="actions_interactions/">Recognizing Interactions in TV Shows</a></h3> <p>Temporal and spatial localisation of human interactions in TV shows using global and local context information.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="arrow/"><img src="arrow/ico_arrow.jpg" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="arrow/">Seeing the Arrow of Time</a></h3> <p>This work explores whether it is possible to observe Time's Arrow in a temporal sequence.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="continuity/index.html"><img src="ico_continuity.jpg" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="continuity/index.html">Visual Continuity Errors in Movies</a></h3> <p>Detect unexplained visual discrepancies in movies by examining pairs of similar shots. 
For any DVD, produce a ranked list of possible errors automatically.</p> </div> </div> </div> <!-- / .col --> </div> <!-- / .row --> </div> <button class="collapsible active">Counting, detection, reading and tracking</button> <div class="collapsible-content" style="display: block;"> <!-- Content for Counting, detection, reading and tracking --> <div class="row"> <!-- ADD NEW COLUM HERE TO ADD A NEW ITEM --> <div class="col-md-6"> <div class="service-block media"> <a href="countgd/"><img src="ico-countgd.jpg" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="countgd/">CountGD: Multi-Modal Open-World Counting</a></h3> <p> A network is trained to count objects of any class specified by text, visual examples, or both. </p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="vgg-heads/"><img src="ico-vgg-heads.jpg" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="vgg-heads/">VGGHeads: A Large-Scale Synthetic Dataset for 3D Human Heads</a></h3> <p> We release a fully synthetic dataset for human head detection and 3D mesh estimation with over 1 million images generated with diffusion models. A model trained on it is capable of simultaneous head detection and head mesh reconstruction from a single image in a single step. </p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="instance-augmentation/"><img src="ico-instance-augmentation.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="instance-augmentation/">Dataset Enhancement with Instance-Level Augmentations</a></h3> <p> We augment images by redrawing individual objects in the scene while retaining their original shape. This allows training with the original labels unchanged (e.g., class, segmentation, detection). 
</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="countx/"><img src="ico-countx.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="countx/">CounTX: Open-world Text-specified Object Counting</a></h3> <p> A network is trained to count objects of any class specified by a natural language description. </p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="tpod/"><img src="ico-tpod.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="tpod/">A Tri-Layer Plugin to Improve Occluded Detection</a></h3> <p> Formulating 'occluded detection' by setting up 2 benchmarks, and trying to handle occlusion via an occluder-target-occludee plugin. </p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="cyws/"><img src="ico-cyws.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="cyws/">The Change You Want to See</a></h3> <p>Detecting "object level" changes in image pairs despite photometric and geometric differences.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="lvc/"><img src="ico_lvc.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="lvc/">Label, Verify, Correct: A Simple Few-Shot Object Detection Method</a></h3> <p> We propose a method to verify and correct pseudo-annotations for few-shot object detection. 
</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="time/"><img src="ico_time.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="time/">It’s About Time: Analog Clock Reading in the Wild</a></h3> <p>We show that neural networks can read analog clocks in unconstrained environments without manual supervision.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="hoi/"><img src="ico_hoi.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="hoi/">Amplifying Key Cues for Human-Object-Interaction Detection</a></h3> <p>This work introduces two methods to amplify key cues, and a method to combine cues when considering the interaction between a human and an object.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="FontAdaptor20/"><img src="ico_FontAdaptor20.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="FontAdaptor20/">Adaptive Text Recognition through Visual Matching</a></h3> <p>This work addresses the problems of generalization and flexibility for text recognition in documents.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="ccr/"><img src="ico_ccr.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="ccr/">Fine-grained recognition of animal individuals in video without explicit detection</a></h3> <p>The work introduces a 'Count, Crop and Recognise' (CCR) multistage recognition process for frame-level labelling of animal individuals.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="autocorrect/"><img src="ico_autocorrect.jpg" class="img-fluid service-img 
border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="autocorrect/">AutoCorrect: Deep Inductive Alignment of Noisy Geometric Annotations</a></h3> <p>The goal of this work is to train a model with noisy data, and to correct the registration noise in the annotations.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="detect-track/"><img src="ico_detect_track.jpg" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="detect-track/">Detect to Track and Track to Detect</a></h3> <p>Object detection and tracking in realistic video.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="class-agnostic-counting/"><img src="ico_class_agnostic_counting.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="class-agnostic-counting/">Class-Agnostic Counting</a></h3> <p>A network is trained to count objects of any class, using video data labeled for tracking.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="counting/"><img src="ico_count.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="counting/">Learning to Count Objects in Images</a></h3> <p>Learning to count objects in images, e.g. 
cells in a microscopic image or humans in surveillance video frames.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="text/"><img src="ico_text.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="text/">Reading Text in the Wild</a></h3> <p>Localising and recognising text in natural images, allowing large scale annotation and search for text in images.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="hands/index.html"><img src="ico_hands.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="hands/index.html">Hand Detection Using Multiple Proposals</a></h3> <p>Detection and localization of human hands in still images.</p> </div> </div> </div> <!-- / .col --> </div> <!-- / .row --> </div> <button class="collapsible active">Categorisation, Classification, and Clustering</button> <div class="collapsible-content" style="display: block;"> <!-- Content for Categorisation, Classification, and Clustering --> <div class="row"> <div class="col-md-6"> <div class="service-block media"> <a href="auto_novel/"><img src="ico_auto_novel.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="auto_novel/">Discovering and Learning New Visual Categories with Ranking Statistics</a></h3> <p>Tackling the problem of discovering novel classes in an image collection given labelled examples of other classes.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="DTC/"><img src="ico_dtc.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="DTC/">Learning to Discover Novel Visual Categories via Deep Transfer Clustering</a></h3> <p>This work aims to discover novel visual categories in an image 
collection.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="flowers/index.html"><img src="ico_flowers.jpg" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="flowers/index.html">Flower Classification from Images</a></h3> <p>Classification of different flower species from images using shape, colour and texture. These pages describe our database and some experimental results.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="vis_attrib/index.html"><img src="ico_vis_attrib.jpg" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="vis_attrib/index.html">Learning Visual Attributes</a></h3> <p>Learn models for visual qualities of objects, such as red, striped, or spotted, and determine their spatial extent in the image.</p> </div> </div> </div> <!-- / .col --> </div> <!-- / .row --> </div> <button class="collapsible active">Art recognition and search</button> <div class="collapsible-content" style="display: block;"> <!-- Content for Art recognition and search --> <div class="row"> <!-- ADD NEW COLUM HERE TO ADD A NEW ITEM --> <div class="col-md-6"> <div class="service-block media"> <a href="art_search/"><img src="ico_artsearch.jpg" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="art_search">Visual Search of Paintings</a></h3> <p>Search a large dataset of paintings on-the-fly for a given object category.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="face_paint/"><img src="ico_facepaint.jpg" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="face_paint/">Faces to Paintings</a></h3> <p>Match photographs of people to similar looking paintings in a large corpus.</p> </div> </div> 
</div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="BL/"><img src="ico_bl.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="BL/">Exploring the British Library</a></h3> <p>On-the-fly retrieval of object categories and repeating illustrations across 1 million images.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="sculptures/index.html"><img src="ico_sculptures.jpg" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="sculptures/index.html">Sculpture Retrieval and Identification</a></h3> <p>A retrieval based method for automatically determining the title and sculptor of an imaged sculpture.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="vase_annotation/"><img src="ico_vaseanno.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="vase_annotation/">Automatic Annotation of Greek Vases</a></h3> <p>A method allowing automatic detection of gods and animals in a large dataset of Greek vases.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="vases/index.html"><img src="ico_vases.jpg" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="vases/index.html">Shape Based Vase Retrieval Demo</a></h3> <p>The goal is to provide a web-based vase retrieval system, that allows the upload of new vase images and classifies the shape of the vase. 
Additionally, a list of close matches in terms of the vase shape are returned.</p> </div> </div> </div> <!-- / .col --> </div> <!-- / .row --> </div> <button class="collapsible active">Medical Imaging</button> <div class="collapsible-content" style="display: block;"> <!-- Content for Medical Imaging --> <div class="row"> <!-- ADD NEW COLUM HERE TO ADD A NEW ITEM --> <div class="col-md-6"> <div class="service-block media"> <a href="auto-report-labeller/"><img src="auto-report-labeller/icon.jpg" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="auto-report-labeller/">Automated Medical Report Labelling</a></h3> <p> A general pipeline to extract clinical labels from radiology reports using large language models. </p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="vertebrae-detection/"><img src="ico_vertebrae_detection.gif" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="vertebrae-detection/">Vertebra Detection and Labelling</a></h3> <p>A new method to detect and label vertebrae in clinical MR images, robust to pathology and different fields of view.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="mitosis/"><img src="ico_mitosis.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="mitosis/">Automated Labelling of Cell Cycle Phases</a></h3> <p>Automatically detect and track cells through a video, labelling cell cycle phase at every time point.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="cell_detection/"><img src="ico_cell_detect.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="cell_detection/">Learning to Detect Cells</a></h3> <p>Detect cells automatically with 
models learnt from simple annotations.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="spine/"><img src="ico_spine.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="spine/">Spine</a></h3> <p>This research aims to automate the analysis of spinal MRIs and investigate the correlation of MR scans and clinical scores related to back pain.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="med_search/"><img src="ico_med_search.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="med_search/">Medical Image Search Engine</a></h3> <p>Instantly search for arbitrary regions of interest in inter-patient medical image datasets.</p> </div> </div> </div> <!-- / .col --> </div> <!-- / .row --> </div> <button class="collapsible active">Miscellaneous</button> <div class="collapsible-content" style="display: block;"> <!-- Content for Miscellaneous --> <div class="row"> <!-- ADD NEW COLUM HERE TO ADD A NEW ITEM --> <div class="col-md-6"> <div class="service-block media"> <a href="clever/"><img src="clever/clever-thumbnail.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="clever/">The Curious Layperson: Fine-Grained Image Recognition without Expert Labels</a></h3> <p>We propose a new problem of fine-grained image classification without expert annotations, by utilizing class agnostic non-expert descriptions and off-the-shelf expert corpus.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="c1c/"><img src="ico_c1c.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="c1c/">Constrained Video Face Clustering using 1NN Relations</a></h3> <p>The proposed C1C method imposes 
self-supervised constraints on HAC methods and without any training achieves the new state of the art for video face clustering.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="SSL_scarce/"><img src="ico_SSL_scarse.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="SSL_scarce/">Semi-Supervised Learning with Scarce Annotations</a></h3> <p>We introduce a semi-supervised learning method that works with scarce annotations.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="researchdoom/"><img src="ico_researchdoom.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="researchdoom/">Research Doom</a></h3> <p>Doom game frames with automatically labeled ground truth for various tasks - object and category recognition, detection, segmentation, monocular depth estimation etc.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="encoding_eval/"><img src="ico_encoding_eval.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="encoding_eval/">Encoding Methods Evaluation</a></h3> <p>Evaluation of shallow feature encoding methods for image classification.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="learn_desc/"><img src="ico_learn_desc.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="learn_desc/">Descriptor Learning Using Convex Optimisation</a></h3> <p>Learn feature descriptors using convex formulations for keypoint matching and object instance retrieval.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="iseg/"><img src="ico_iseg.jpg" class="img-fluid 
service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="iseg/">Geodesic Star Convexity</a></h3> <p>Interactive image segmentation with geodesic star convexity constraints.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="objcut/index.html"><img src="ico_objcut.jpg" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="objcut/index.html">OBJ CUT</a></h3> <p>Given an image containing an instance of a known object category, OBJ CUT aims to obtain accurate, object-like segmentation automatically.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="moseg/index.html"><img src="ico_moseg.jpg" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="moseg/index.html">Learning Layered Motion Segmentations</a></h3> <p>Learning a generative layered representation of the scene for motion segmentation in an unsupervised manner.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="category/index.html"><img src="ico_category.jpg" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="category/index.html">Object Category Recognition</a></h3> <p>Learning what an object category (face, car <i>etc</i>) looks like, in order to identify new instances in a query image, taking into account factors such as object variation, background clutter, occlusion, scale and lighting changes.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="affine/index.html"><img src="ico_affine.jpg" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="affine/index.html">Affine Covariant Features</a></h3> <p>Extraction and description of affine covariant 
regions for matching and recognition of images under varying imaging conditions.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="texclass/index.html"><img src="ico_texclass.jpg" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="texclass/index.html">Texture Classification</a></h3> <p>Classification of materials on the basis of their appearance in single textured images obtained under unknown viewpoint and illumination conditions.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="mkdb/index.html"><img src="ico_mkdb.jpg" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="mkdb/index.html">Harvesting Image Databases from the Web</a></h3> <p>Automatically retrieve large numbers of images from the web for specified object classes with high accuracy and without using any user interaction.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="/~vgg/software/MKL/"><img src="ico_caltech.jpg" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="/~vgg/software/MKL/">Image Classification</a></h3> <p>Classify images by the object category they contain. 
Different feature descriptions of the image are combined to learn the class models.</p> </div> </div> </div> <!-- / .col --> <div class="col-md-6"> <div class="service-block media"> <a href="SR/index.html"><img src="ico_SR.png" class="img-fluid service-img border"></a> <div class="service-inner-content media-body px-1"> <h3><a href="SR/index.html">Image Super-Resolution</a></h3> <p>Improve the spatial resolution of sets of images, taking into account the uncertainty in factors like image registrations and lighting.</p> </div> </div> </div> <!-- / .col --> </div> <!-- / .row --> </div> </div> <!-- / .container --> </section> <!-- /ACTUAL PAGE CONTENT --> <!-- INCLUDE VGG FOOTER --> <div id="vggfooter"></div> <!-- Page Scroll to Top --> <a class="scroll-to-top js-scroll-trigger" href=".top-header"> <i class="fa fa-angle-up"></i> </a> <!-- JAVASCRIPT ================================================== --> <!-- Global JS --> <script src="/~vgg/assets/js/jquery.min.js"></script> <script src="/~vgg/assets/js/popper.min.js"></script> <!-- Plugins JS --> <script src="/~vgg/assets/js/bootstrap.min.js"></script> <!-- Slick JS --> <script src="/~vgg/assets/js/jquery.easing.1.3.js"></script> <script src="/~vgg/assets/js/slick.min.js"></script> <!-- Theme JS --> <script src="/~vgg/assets/js/theme.js"></script> <!-- cookiealert JS, see /~vgg/assets/css/cookiealert/LICENSE --> <script src="/~vgg/assets/js/cookiealert/cookiealert.js"></script> <!-- Load includes --> <script> $(function(){ $("#vggnavbar").load("/~vgg/vggnavbar.html"); $("#vggsocials").load("/~vgg/vggsocials.html"); $("#vggfooter").load("/~vgg/vggfooter.html"); }); </script> <!-- GoogleAnalytics code --> <script> (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){ (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o), m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m) 
})(window,document,'script','https://www.google-analytics.com/analytics.js','ga'); ga('create', 'UA-20555581-1', 'auto'); ga('send', 'pageview'); </script> <script type="text/javascript"> var coll = document.getElementsByClassName("collapsible"); var i; for (i = 0; i < coll.length; i++) { coll[i].addEventListener("click", function() { this.classList.toggle("active"); var content = this.nextElementSibling; if (content.style.display === "block") { content.style.display = "none"; } else { content.style.display = "block"; } }); } </script> <script type="text/javascript"> var expand = document.getElementById("collapse_categories"); expand.addEventListener("click", function() { var coll = document.getElementsByClassName("collapsible"); var i; for (i = 0; i < coll.length; i++) { coll[i].classList.remove("active"); var content = coll[i].nextElementSibling; content.style.display = "none"; } }); </script> <script type="text/javascript"> var expand = document.getElementById("expand_categories"); expand.addEventListener("click", function() { var coll = document.getElementsByClassName("collapsible"); var i; for (i = 0; i < coll.length; i++) { coll[i].classList.add("active"); var content = coll[i].nextElementSibling; content.style.display = "block"; } }); </script> </body> </html>
