<!DOCTYPE html> <html> <head> <meta charset="utf-8"> <meta http-equiv="X-UA-Compatible" content="IE=edge"> <title>VGG Practical</title> <link rel="stylesheet" href="base.css" /> <link rel="stylesheet" href="prism.css" /> </head> <body> <h1 id="vgg-computer-vision-practicals">VGG Computer Vision Practicals</h1> <p>The Oxford <a href="http://www.robots.ox.ac.uk/~vgg">Visual Geometry Group</a> Computer Vision Practicals are a collection of MATLAB-based hands-on experiences introducing fundamental concepts in image understanding (<a href="#installation">requirements and installation instructions</a>).</p> <div class="toc"> <ul> <li><a href="#vgg-computer-vision-practicals">VGG Computer Vision Practicals</a><ul> <li><a href="#the-practicals">The practicals</a><ul> <li><a href="#convolutional-neural-networks">Convolutional neural networks</a></li> <li><a href="#image-classification">Image classification</a></li> <li><a href="#image-retrieval">Image retrieval</a></li> <li><a href="#object-detection">Object detection</a></li> <li><a href="#ann-methods">Approximate Nearest Neighbour (ANN) Methods</a></li> </ul> </li> <li><a href="#general-instructions">General instructions</a><ul> <li><a href="#planning-your-laboratory-experience">Planning your laboratory experience</a></li> <li><a href="#installation">Installation</a></li> <li><a href="#help-and-troubleshooting">Help and troubleshooting</a></li> </ul> </li> </ul> </li> </ul> </div> <h2 id="the-practicals">The practicals</h2> <h3 id="convolutional-neural-networks">Convolutional neural networks</h3> <p>Learn to use convolutional neural networks (CNNs), an important class of learnable representations that are applicable to numerous computer vision problems and are the main method for feature extraction in image understanding. 
This practical explores the basic CNN building blocks (linear filters and ReLU), back-propagation, learning CNNs to detect particular image structures as well as to recognize typewritten characters (in a variety of fonts), and using stochastic gradient descent with momentum, mini-batches, and data augmentation.</p> <ul> <li><a href="http://www.robots.ox.ac.uk/~vgg/share/practical-cnn-pytorch-2018a.tar.gz">PyTorch 2018a version</a> (uses Jupyter)</li> <li><a href="https://www.robots.ox.ac.uk/~vgg/practicals/cnn/index.html">MatConvNet version</a> (uses MATLAB)</li> </ul> <h3 id="image-classification">Image classification</h3> <p>Learn how to tell if an image contains an object of a certain class (e.g. a dog, a mountain, or a person). The challenge is to be invariant to irrelevant factors such as viewpoint and illumination as well as to the differences between objects (no two mountains look exactly the same). The practical covers using various deep convolutional neural networks (CNNs) to extract image features, learning an SVM classifier for five different object classes (airplanes, motorbikes, people, horses and cars), assessing its performance using precision-recall curves, and training new classifiers from data collected using Internet images.</p> <ul> <li><a href="http://www.robots.ox.ac.uk/~vgg/share/practical-category-recognition-cnn-pytorch-2018a.tar.gz">PyTorch 2018a version</a> (uses Jupyter)</li> <li><a href="../category-recognition-cnn/index.html">MatConvNet version</a> (uses MATLAB)</li> <li><a href="../category-recognition/index.html">MATLAB pre-deep learning version</a> (uses MATLAB and no ConvNets!)</li> </ul> <h3 id="image-retrieval">Image retrieval</h3> <p>Learn to recognize specific objects in images, such as the Notre Dame cathedral or 'Starry Night' by Van Gogh, by quickly matching a query to a large database. The challenge is to be invariant to changes in scale, camera viewpoint, illumination conditions and partial occlusion. 
The practical covers matching images using sparse SIFT features, geometric verification, feature quantization and bag-of-visual-words, and evaluating a retrieval system using mean average precision.</p> <p>Start <a href="../instance-recognition/index.html">here</a>.</p> <h3 id="object-detection">Object detection</h3> <p>Learn to detect objects such as pedestrians, cars, and traffic signs in an image. The challenge is not only to recognize but also to localize objects in images, as well as to enumerate their occurrences, regardless of changes in location, scale, illumination, articulation, and many other factors. The practical covers using HOG features to describe image regions, building a sliding-window SVM object detector, operating at multiple scales, evaluating a detector using average precision, and improving it using hard negative mining.</p> <ul> <li><a href="http://www.robots.ox.ac.uk/~vgg/share/practical-category-detection-2019a-pytorch.tar.gz">PyTorch 2018a version</a> (uses Jupyter)</li> <li><a href="../category-detection/index.html">MatConvNet version</a> (uses MATLAB)</li> </ul> <p>Start <a href="../category-detection/index.html">here</a>.</p> <h3 id="ann-methods">Approximate Nearest Neighbour (ANN) Methods</h3> <p> This short practical focuses on different Approximate Nearest Neighbour (ANN) methods, which are used in search and retrieval systems that handle high-dimensional data such as images and sound. The practical covers Product Quantization (PQ) and Vector Quantization (VQ), and allows experimenting with different trade-offs between memory, speed, and search accuracy. 
</p> <ul> <li><a href="https://github.com/ox-vgg/practicals/blob/main/ann-faiss/practical.ipynb">Jupyter Notebook</a></li> </ul> <h2 id="general-instructions">General instructions</h2> <p>This section contains information for lab setters and instructors.</p> <h3 id="planning-your-laboratory-experience">Planning your laboratory experience</h3> <p>Practicals are organized in tracks of different duration.</p> <ul> <li><strong>Fast track:</strong> 1.5 hours.</li> <li><strong>Full track:</strong> 3 hours.</li> </ul> <p>Parts that should be skipped on the fast track are clearly marked <strong>in this style</strong>.</p> <p>The practical requires each student to have the following equipment and software:</p> <ul> <li>Windows, Mac OS X, or Linux computer.</li> <li>At least 2GB RAM.</li> <li>MATLAB R2016a or later (earlier versions may or may not work).</li> <li>MATLAB Image Processing toolbox.</li> </ul> <p><strong>Note:</strong> If you plan to use students' personal computers, we suggest that you leave plenty of time in advance of the practical to download and install the required data and software.</p> <h3 id="installation">Installation</h3> <p>Each practical contains data and MATLAB code (including a copy of the VLFeat library). To install a practical:</p> <ul> <li>Find the practical download link at the top of each practical page, e.g. <code>practical-instance-recognition-2013a.tar.gz</code>.</li> <li>Download the package. </li> <li>Unpack the package. The archive is in <code>.tar.gz</code> format and will unpack to a directory of the same name, e.g. 
<code>practical-instance-recognition-2013a/</code>.</li> </ul> <p>For convenience, additional packages containing only the code and the data are provided as well.</p> <p>The newest practicals (namely, the object detection practical) require a little more work, because the MatConvNet library does not ship with binary MEX files and must be compiled for each specific platform.</p> <h3 id="help-and-troubleshooting">Help and troubleshooting</h3> <ul> <li><strong>Getting help.</strong> As you progress in a practical, you can use the MATLAB <code>help</code> command to display the documentation of the MATLAB functions that you need to use. For example, try typing <code>help setup</code>.</li> <li><strong>Interfering VLFeat copies.</strong> If you have a copy of the VLFeat toolbox loaded automatically on starting MATLAB, the copy shipped with this practical may conflict with it (it will generate errors during the execution of the exercises). In this case, switch to the shipped version of the toolbox. First, try issuing <code>clear mex ; run vlfeat/toolbox/vl_setup</code>. If this does not solve the problem, exit MATLAB and restart it without loading your default VLFeat installation (this may require editing your MATLAB <code>startup.m</code> file).</li> <li><strong>Corrupted .MAT or other files.</strong> If MATLAB complains about a corrupted .mat file, try a different unpacking tool. 
At least one version of WinZip caused problems for some students.</li> </ul><script type="text/x-mathjax-config"> MathJax.Hub.Config({ extensions: ["tex2jax.js"], jax: ["input/TeX", "output/HTML-CSS"], tex2jax: { inlineMath: [ ['$','$'], ["\$","\$"] ], displayMath: [ ['$$','$$'], ["\\[","\\]"] ], processEscapes: true }, "HTML-CSS": { availableFonts: ["TeX"] }, TeX: { equationNumbers: { autoNumber: "AMS" } } }); if (typeof MathJaxListener !== 'undefined') { MathJax.Hub.Register.StartupHook('End', function () { MathJaxListener.invokeCallbackForKey_('End'); }); } </script> <script type="text/javascript" src="https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script> <script type="text/javascript" src="prism.js"></script> </body> </html>