{"title":"The Relationship between Representational Conflicts, Generalization, and Encoding Requirements in an Instance Memory Network","authors":"Mathew Wakefield, Matthew Mitchell, Lisa Wise, Christopher McCarthy","volume":188,"journal":"International Journal of Cognitive and Language Sciences","pagesStart":348,"pagesEnd":357,"ISSN":"1307-6892","URL":"https:\/\/publications.waset.org\/pdf\/10012645","abstract":"<p>This paper aims to provide an interpretation of artificial neural networks (ANNs) and explore some of its implications. The interpretation views ANNs as a memory which encodes instances of experience. An experiment explores the behavior of encoding and retrieval of instances from memory. A localised representation ANN is created that allows control over encoding and retrieved memory sample size and is experimented with using the MNIST digits dataset. The relationship between input familiarity, conflict within retrieved samples, and error rates is described and demonstrated to be an effective driver for memory encoding. Results indicate that selective encoding and retrieval samples that allow detection of memory conflicts produce optimal performance, and that error rates are normally distributed with input familiarity and conflict. By using input familiarity and sample consistency to guide memory encoding, the number of encoding trials on the dataset were reduced to 18.33% of the training data while maintaining good recognition performance on the test data.<\/p>","references":"[1] Y. LeCun, Y. Bengio, and G. Hinton, \u201cDeep learning,\u201d Nature, vol. 521,\r\nno. 7553, pp. 436\u2013444, 2015.\r\n[2] G. E. Hinton, D. E. Rumelhart, and J. L. McClelland, Distributed\r\nRepresentations. MITP, 1986, pp. 77\u2013109.\r\n[3] I. J. Goodfellow, J. Shlens, and C. Szegedy, \u201cExplaining and harnessing\r\nadversarial examples,\u201d arXiv preprint arXiv:1412.6572, 2014.\r\n[4] M. M. Botvinick, T. S. Braver, D. M. Barch, C. S. Carter, and\r\nJ. D. Cohen, \u201cConflict monitoring and cognitive control,\u201d Psychological\r\nReview, vol. 108, no. 3, pp. 624\u2013652, 2001.\r\n[5] D. Kumaran, D. Hassabis, and J. L. McClelland, \u201cWhat learning systems\r\ndo intelligent agents need? complementary learning systems theory\r\nupdated,\u201d Trends in Cognitive Sciences, vol. 20, no. 7, pp. 512\u2013534,\r\n2016.\r\n[6] J. L. McClelland, B. L. McNaughton, and R. C. O\u2019Reilly, \u201cWhy there\r\nare complementary learning systems in the hippocampus and neocortex:\r\nInsights from the successes and failures of connectionist models of\r\nlearning and memory,\u201d Psychological Review, vol. 102, no. 3, pp.\r\n419\u2013457, 1995.\r\n[7] M. M. Botvinick, S. Ritter, J. X. Wang, Z. Kurth-Nelson, C. Blundell,\r\nand D. Hassabis, \u201cReinforcement learning, fast and slow,\u201d Trends in\r\nCognitive Sciences, vol. 23, no. 5, pp. 408\u2013422, 2019.\r\n[8] J. L. McClelland and D. E. Rumelhart, \u201cDistributed memory and\r\nthe representation of general and specific information,\u201d Journal of\r\nExperimental Psychology: General, vol. 114, no. 2, pp. 159\u2013188, 1985.\r\n[9] C. Zhang, S. Bengio, M. Hardt, B. Recht, and O. Vinyals,\r\n\u201cUnderstanding deep learning requires rethinking generalization,\u201d arXiv\r\npreprint arXiv:1611.03530, 2016.\r\n[10] \u2014\u2014, \u201cUnderstanding deep learning (still) requires rethinking\r\ngeneralization,\u201d Commun. ACM, vol. 64, no. 3, p. 107\u2013115, 2021.\r\n[11] D. Arpit, S. Jastrzebski, N. Ballas, D. Krueger, E. Bengio, M. S. 
Kanwal,\r\nT. Maharaj, A. Fischer, A. Courville, and Y. Bengio, \u201cA closer look\r\nat memorization in deep networks,\u201d in International Conference on\r\nMachine Learning. PMLR, Conference Proceedings, pp. 233\u2013242.\r\n[12] G. E. Hinton, \u201cWhat kind of graphical model is the brain?\u201d in Proc.\r\n19th International Joint Conference on Artificial intelligence, vol. 5,\r\n2005, Conference Proceedings, pp. 1765\u20131775.\r\n[13] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, \u201cLearning\r\nrepresentations by back-propagating errors,\u201d Nature, vol. 323, no. 6088,\r\npp. 533\u2013536, 1986.\r\n[14] T. Chen, S. Kornblith, M. Norouzi, and G. Hinton, \u201cA simple framework\r\nfor contrastive learning of visual representations,\u201d arXiv preprint\r\narXiv:2002.05709, 2020.\r\n[15] N. Papernot and P. McDaniel, \u201cDeep k-nearest neighbors: Towards\r\nconfident, interpretable and robust deep learning,\u201d arXiv preprint\r\narXiv:1803.04765, 2018.\r\n[16] S. Grossberg, \u201cHow does a brain build a cognitive code?\u201d Psychological\r\nReview, vol. 87, no. 1, pp. 1\u201351, 1980.\r\n[17] Z. Ghahramani, \u201cProbabilistic machine learning and artificial\r\nintelligence,\u201d Nature, vol. 521, no. 7553, pp. 452\u2013459, 2015.\r\n[18] J. Wang, P. Neskovic, and L. N. Cooper, \u201cNeighborhood size selection\r\nin the k-nearest-neighbor rule using statistical confidence,\u201d Pattern\r\nRecognition, vol. 39, no. 3, pp. 417\u2013423, 2006.\r\n[19] M. Page, \u201cConnectionist modelling in psychology: A localist manifesto,\u201d\r\nBehavioral and Brain Sciences, vol. 23, no. 4, pp. 443\u2013467, 2000.\r\n[20] G. Dong and H. Liu, Feature engineering for machine learning and data\r\nanalytics. CRC Press, 2018.\r\n[21] J. Yosinski, J. Clune, A. Nguyen, T. Fuchs, and H. Lipson,\r\n\u201cUnderstanding neural networks through deep visualization,\u201d arXiv\r\npreprint arXiv:1506.06579, 2015.\r\n[22] J. S. Bowers, \u201cParallel distributed processing theory in the age of deep\r\nnetworks,\u201d Trends in Cognitive Sciences, vol. 21, no. 12, pp. 950\u2013961,\r\n2017.\r\n[23] J. Grainger and A. M. Jacobs, On localist connectionism and\r\npsychological science. Mahwah, New Jersey: Lawrence Erlbaum, 1998,\r\npp. 1\u201338.\r\n[24] J. L. McClelland and D. E. Rumelhart, \u201cAn interactive activation model\r\nof context effects in letter perception: I. An account of basic findings,\u201d\r\nPsychological review, vol. 88, no. 5, pp. 375\u2013407, 1981.\r\n[25] D. A. Norman and T. Shallice, Attention to Action: Willed and Automatic\r\nControl of Behavior. Boston, MA: Springer US, 1986, pp. 1\u201318.\r\n[26] J. Yosinski, \u201cUnderstanding neural networks through deep\r\nvisualization,\u201d 2015, accessed: 27-01-2021. [Online]. Available:\r\nhttp:\/\/yosinski.com\/deepvis\r\n[27] J. Yosinski, J. Clune, Y. Bengio, and H. Lipson, \u201cHow transferable are\r\nfeatures in deep neural networks?\u201d in Advances in neural information\r\nprocessing systems 27 (NIPS 2014), Z. Ghahramani, M. Welling,\r\nC. Cortes, N. D. Lawrence, and K. Q. Weinberger, Eds. Curran\r\nAssociates, 2014, Conference Proceedings, pp. 3320\u20133328.\r\n[28] Y. LeCun, C. Cortes, and C. J. C. Burges, \u201cThe\r\nMNIST database,\u201d accessed: 03-06-2020. [Online]. Available:\r\nhttp:\/\/yann.lecun.com\/exdb\/mnist\/\r\n[29] A. Pritzel, B. Uria, S. Srinivasan, A. P. Badia, O. Vinyals, D. Hassabis,\r\nD. Wierstra, and C. 
Blundell, \u201cNeural episodic control,\u201d in Proceedings\r\nof the 34th International Conference on Machine Learning-Volume 70.\r\nJMLR. org, Conference Proceedings, pp. 2827\u20132836.\r\n[30] S. Grossberg, \u201cAdaptive resonance theory: How a brain learns to\r\nconsciously attend, learn, and recognize a changing world,\u201d Neural\r\nNetworks, vol. 37, pp. 1\u201347, 2013.\r\n[31] B. C. Love, D. L. Medin, and T. M. Gureckis, \u201cSustain: A network\r\nmodel of category learning,\u201d Psychological Review, vol. 111, no. 2, pp.\r\n309\u2013332, 2004.\r\n[32] G. Shafer and V. Vovk, \u201cA tutorial on conformal prediction,\u201d Journal of\r\nMachine Learning Research, vol. 9, no. 3, 2008.\r\n[33] T. L. Griffiths, N. Chater, C. Kemp, A. Perfors, and J. B. Tenenbaum,\r\n\u201cProbabilistic models of cognition: exploring representations and\r\ninductive biases,\u201d Trends in Cognitive Sciences, vol. 14, no. 8, pp.\r\n357\u2013364, 2010.\r\n[34] J. L. McClelland, M. M. Botvinick, D. C. Noelle, D. C. Plaut, T. T.\r\nRogers, M. S. Seidenberg, and L. B. Smith, \u201cLetting structure emerge:\r\nconnectionist and dynamical systems approaches to cognition,\u201d Trends\r\nin Cognitive Sciences, vol. 14, no. 8, pp. 348\u2013356, 2010.\r\n[35] V. Di Lollo, \u201cThe feature-binding problem is an ill-posed problem,\u201d\r\nTrends in Cognitive Sciences, vol. 16, no. 6, pp. 317\u2013321, 2012.\r\n[36] Z. Tu, X. Chen, A. L. Yuille, and S.-C. Zhu, \u201cImage parsing: Unifying\r\nsegmentation, detection, and recognition,\u201d International Journal of\r\ncomputer vision, vol. 63, no. 2, pp. 113\u2013140, 2005.\r\n[37] P. Khosla, P. Teterwak, C. Wang, A. Sarna, Y. Tian, P. Isola,\r\nA. Maschinot, C. Liu, and D. Krishnan, \u201cSupervised contrastive\r\nlearning,\u201d arXiv preprint arXiv:2004.11362, 2020.","publisher":"World Academy of Science, Engineering and Technology","index":"Open Science Index 188, 2022"}
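The gating idea in the abstract can be illustrated with a minimal Python sketch. This is a hypothetical reconstruction, not the authors' implementation: it assumes memory is a flat store of (vector, label) instances, retrieval is cosine k-nearest-neighbour, familiarity is the mean similarity of the retrieved sample, conflict is label disagreement within that sample, and an input is encoded only when it is unfamiliar or its retrieved sample conflicts. The names, thresholds, and similarity measure are all assumptions.

```python
import numpy as np

class InstanceMemory:
    """Sketch of an instance memory with familiarity- and conflict-gated
    encoding. Hypothetical reconstruction of the abstract's idea, not the
    authors' network."""

    def __init__(self, k=5, familiarity_threshold=0.8):
        self.k = k                                    # retrieved sample size
        self.familiarity_threshold = familiarity_threshold  # assumed value
        self.vectors = []                             # encoded instances
        self.labels = []

    def retrieve(self, x):
        """Return similarities and labels of the k most similar instances."""
        if not self.vectors:
            return np.array([]), []
        x = np.asarray(x, dtype=float)
        m = np.stack(self.vectors)
        # Cosine similarity of x against every stored instance.
        sims = m @ x / (np.linalg.norm(m, axis=1) * np.linalg.norm(x) + 1e-9)
        idx = np.argsort(sims)[-self.k:]
        return sims[idx], [self.labels[i] for i in idx]

    def predict(self, x):
        """Majority vote over the retrieved sample, plus the two gating
        signals: familiarity (mean similarity of the sample) and conflict
        (label disagreement within the sample)."""
        sims, labels = self.retrieve(x)
        if not labels:
            return None, 0.0, True
        familiarity = float(sims.mean())
        conflict = len(set(labels)) > 1
        return max(set(labels), key=labels.count), familiarity, conflict

    def maybe_encode(self, x, y):
        """Encode only unfamiliar or conflicting inputs, so most training
        items consume no encoding trial."""
        pred, familiarity, conflict = self.predict(x)
        if pred is None or familiarity < self.familiarity_threshold or conflict:
            self.vectors.append(np.asarray(x, dtype=float))
            self.labels.append(y)
            return True   # an encoding trial was consumed
        return False      # memory already handles this input; skip
```

Streaming flattened MNIST training images through maybe_encode would consume encoding trials only for unfamiliar or conflicting inputs; the abstract reports that an analogous familiarity-and-consistency gate reduced encoding trials to 18.33% of the training data while preserving test-set recognition.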