<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"> <html> <head> <title>Margaret Mitchell</title> <LINK REL=StyleSheet HREF="../css/style.css" TYPE="text/css" MEDIA=screen> <meta property="og:description" content=" Multi-task learning provides an opportunity to mix related tasks together, which can benefit all of the tasks, some of the tasks, or none of the tasks. Be ready for 'none of the tasks', there are a lot of details to get right. This paper discusses some of the things we did to move Multi-task learning from being ineffective to effective, with a look towards the longer-term goals for this framework." /> <meta property="og:image" content="http://m-mitchell.com/images/MTL.png" /> <meta name="twitter:card" content="summary_large_image"> <meta name="twitter:creator" content="@mmitchell_ai"> <meta name="twitter:title" content="Multi-Task Learning for Mental Health"> <meta name="twitter:description" content="Multi-task learning provides an opportunity to mix related tasks together, which can benefit all of the tasks, some of the tasks, or none of the tasks. Be ready for 'none of the tasks', there are a lot of details to get right. 
This paper discusses some of the things we did to move Multi-task learning from being ineffective to effective, with a look towards the longer-term goals for this framework."> <meta name="twitter:image" content="http://m-mitchell.com/images/MTL.png"> <script src="https://use.fontawesome.com/574a28de77.js"></script> <script src="../js/jquery.min.js"></script> <script> $(function(){ $("#sidebar").load("../sidebar.html", function() {$('#publications').addClass("active");}); }); </script> </head> <body> <div id="page"> <div id="header"> <h1>Margaret Mitchell - margarmitchell {at} gmail.com</h1> <div class="description"> <BR /> <BR /> </div> </div> <div id="mainarea"> <div id="sidebar"> </div> </div> <div id="contentarea" style="width: 500px"> <div style="text-align:center"> <div style="float:left;padding-right:3em;padding-left:1em;"><h2><i class="fa fa-file-text-o" style="font-size: 2em;" aria-hidden="true"></i></h2></div> <div style="float:left"> <h2> Multi-Task Learning for Mental Health<br/> using Social Media Text</h2></div> </div> <br/> <br/> <br/> <br/> <div style="float:bottom"> <br/> Benton, A. and Mitchell, M. and Hovy, D. (2017) <a href="multitask-clinical.pdf" target="_blank">Multi-Task Learning for Mental Health using Social Media Text</a>. <i>Proceedings of EACL 2017</i>. <br/><br/> Multi-task learning provides an opportunity to mix related tasks together, which can benefit all of the tasks, some of the tasks, or none of the tasks. Be ready for 'none of the tasks', there are a lot of details to get right. This paper discusses some of the things we did to move multi-task learning from being ineffective to effective, with a look towards the longer-term goals for this framework.<br/><br/> I first became aware of multi-task learning reading <a href="https://arxiv.org/pdf/1504.08083v2.pdf" target="_blank">Ross Girshick's "Fast R-CNN" paper</a>. 
In that work, Girshick was thinking about <i>object detection</i>, and specifically about the dual problems of <i>object localization</i> and <i>object classification</i> in an image. The idea felt natural following the recent successes in deep learning: Predict both the object location and class at once, modeled as two separate tasks using one network. Locating objects would be one task, with one loss; classifying object regions would be another task, with another loss. For each region, the forward pass of the network predicts: <ol><li> object class </li> <li> region offsets that form a bounding box (bounding box regression)</li> </ol> <center> <IMG style="width:60%" src="../images/fast_rcnn.png" /><br/>From <a href="https://arxiv.org/pdf/1504.08083v2.pdf">Ross Girshick's Fast R-CNN paper</a>. <br/> </center> <br/> I didn't think of it as multi-task learning at the time, but rather joint learning. In retrospect it's clearly multi-task: The network uses more than one loss, where each loss corresponds to a task. They are still trained jointly, though (very very jointly -- the whole network is shared until the two task output layers -- the <a href="https://arxiv.org/abs/1506.01497" target="_blank">Faster R-CNN work</a> pulls things out further).<br/><br/> In 1993, Rich Caruana wrote <a href="http://www8.cs.umu.se/research/ifor/dl/LEARNING/multitask%20learning.pdf" target="_blank">a beautiful thesis on multi-task learning</a>, where he explains some initial motivations for this approach: Generalization in neural networks improves when the network is trained to represent underlying regularities. It's a computational model of how solutions learned for one problem may help in learning solutions for another problem (like how learning to <a href="https://www.youtube.com/watch?v=DsLk6hVBE6Y" target="_blank"><i>sand the floor</i> / <i>wax on, wax off</i></a> helped in the Karate Kid). 
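<br/><br/> To make the "one shared network, one loss per task" idea concrete, here is a minimal sketch in plain Python. It is not the code from Fast R-CNN or from our paper: the layer sizes, the parameter names (<code>trunk_w</code>, <code>cls_w</code>, <code>box_w</code>), the toy weights, and the <code>box_weight</code> knob are all made up for illustration. It assumes a single shared hidden layer feeding two output heads, with cross-entropy for the class task and a smooth L1 loss for the box task, and it leaves out details such as switching the box loss off for background regions.
<pre>
```python
import math


def dense(x, weights, bias):
    """Affine layer: one output value per row of `weights`."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(values):
    return [max(0.0, v) for v in values]


def softmax(logits):
    # Subtract the max for numerical stability.
    top = max(logits)
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, target_class):
    """Loss for task 1 (classification)."""
    return -math.log(probs[target_class])


def smooth_l1(pred, target):
    """Loss for task 2 (box regression): quadratic near zero, linear beyond."""
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        total += 0.5 * d * d if 1.0 > d else d - 0.5
    return total


def multitask_loss(features, params, true_class, true_box, box_weight=1.0):
    """Shared trunk, two task heads, one summed loss.

    `params` and `box_weight` are hypothetical names for this sketch.
    """
    # Everything up to here is shared between the two tasks.
    shared = relu(dense(features, params["trunk_w"], params["trunk_b"]))

    # Task-specific output layers.
    class_probs = softmax(dense(shared, params["cls_w"], params["cls_b"]))
    box_pred = dense(shared, params["box_w"], params["box_b"])

    cls_loss = cross_entropy(class_probs, true_class)
    box_loss = smooth_l1(box_pred, true_box)
    return cls_loss + box_weight * box_loss, cls_loss, box_loss


# Toy setup (assumed sizes): 3 input features, 2 shared hidden units,
# 3 object classes, 4 box offsets.
toy_params = {
    "trunk_w": [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]],
    "trunk_b": [0.0, 0.1],
    "cls_w": [[1.0, -1.0], [0.0, 0.5], [-0.5, 0.2]],
    "cls_b": [0.0, 0.0, 0.0],
    "box_w": [[0.1, 0.0], [0.0, 0.1], [0.2, 0.2], [-0.1, 0.3]],
    "box_b": [0.0, 0.0, 0.0, 0.0],
}

if __name__ == "__main__":
    total, cls_loss, box_loss = multitask_loss(
        [1.0, 2.0, 3.0], toy_params, true_class=0,
        true_box=[0.1, 0.0, 0.2, -0.1])
    print("class loss:", cls_loss)
    print("box loss:  ", box_loss)
    print("total loss:", total)
```
</pre>
Because the total is a sum of the two task losses, a gradient step on it updates the shared trunk using signal from both tasks, while each head only receives signal from its own loss. That is the sense in which the tasks are trained "jointly".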
<br/><br/> <center> <IMG style="width:60%" src="../images/STL.png"/> <br/> <IMG style="width:40%" src="../images/MTL.png"/> <BR/> From <a href="http://www8.cs.umu.se/research/ifor/dl/LEARNING/multitask%20learning.pdf" target="_blank">Rich Caruana's thesis, 1993.</a> </center> <br /> One of the applications he proposed was <a href="http://www.cs.cornell.edu/~caruana/mlj97.pdf">in the medical domain</a>, similar to our application in this paper. We were thinking about how to predict suicide risk (imminent risk) in a clinical care environment, using writing that a patient might be comfortable sharing with their clinician. This was part of the <a href="http://www.clsp.jhu.edu/workshops/">JSALT Workshop series</a> on <a href="http://www.clsp.jhu.edu/workshops/16-workshop/detecting-risk-and-protective-factors-of-mental-health-using-social-media-linked-with-electronic-health-records/">Detecting Risk and Protective Factors of Mental Health using Social Media Linked with Electronic Health Records</a>. (Neither Dirk nor I are in the photo, because we were not cool enough, but Adrian, who was amazing to work with, is there.) <br/><br/> In this domain, we were thinking about an application where someone with suicidal thoughts could use assistance from a clinical expert, on-call in a moment where suicide risk is high. A related application would be providing numbers that might help clinicians acquire increased care for a patient, e.g., to an insurance company that requires proof of increased need.<br/><br/> You can see how quickly <i>ethical considerations</i> start leaping up -- for example, <a href="https://en.wikipedia.org/wiki/Dual-use_technology" target="_blank"><i>dual use</i></a>, or providing a means for "armchair diagnoses" of colleagues or potential hires through what might be exposed within the paper. I know, and so does Dirk. 
So we organized an <a href="http://www.ethicsinnlp.org/" target="_blank">ACL workshop on Ethics in NLP</a>, with other people interested in the topic, <a href="http://eacl2017.org/" target="_blank">at the same conference where we're presenting the work</a> (that last part is a coincidence). We have also added an Ethical Considerations section to the work, which explains why we will not be including examples. <br/><br/> <IMG style="width:60%" src="../images/ethical_considerations.png"/> <br/><br/> We have tried to write this document through the lens of what might benefit patients, while describing an interesting general framework from an ML/NLP perspective. <br/><br/> To be presented in <a href="http://eacl2017.org/" target="_blank">Valencia, Spain, at EACL 2017</a>. <br/> </div> <br/><br/> <span style="font-size:.9em">The work reported was started at <a href="http://www.clsp.jhu.edu/workshops/16-workshop/">JSALT 2016</a>, and was supported by JHU via grants from DARPA (LORELEI), Microsoft, Amazon, Google and Facebook.</span> </div> <div id="footer"> </div> </div> </body> </html>