{"id":315,"date":"2021-03-21T12:08:21","date_gmt":"2021-03-21T10:08:21","guid":{"rendered":"https:\/\/emielcaron.nl\/?p=315"},"modified":"2021-03-30T17:53:57","modified_gmt":"2021-03-30T15:53:57","slug":"entity-resolution-in-large-patent-databases-an-optimization-approach-accepted-for-iceis-2021","status":"publish","type":"post","link":"https:\/\/emielcaron.nl\/?p=315","title":{"rendered":"Entity Resolution in Large Patent Databases: an optimization approach (Accepted for ICEIS 2021, April 26-28):"},"content":{"rendered":"\n<p><em>Candidate for ICEIS 2021 best paper award<\/em><\/p>\n\n\n\n<p>Authors: Emiel Caron and Ekaterini Ioannou<\/p>\n\n\n\n<p>Abstract: Entity resolution focuses on detecting and merging entities that refer to the same real-world object. Collective resolution is among the most prominent mechanisms suggested for addressing this challenge, since the resolution decisions are not made independently but are based on available relationships. In this paper we introduce a novel resolution approach that combines the essence of collective resolution with rules and transformations among entity attributes and values. We illustrate how the approach\u2019s parameters are optimized with a global optimization algorithm, namely simulated annealing, and explain how this optimization is performed using a small training set. The quality of the approach is verified through an extensive experimental evaluation with 40M real-world scientific entities from the Patstat database.<\/p>\n\n\n\n<p>Keywords: Entity Resolution, Data Disambiguation, Data Cleaning, Data Integration, Bibliographic Databases.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Candidate for ICEIS 2021 best paper award Authors: Emiel Caron and Ekaterini Ioannou Abstract: Entity resolution focuses on detecting and merging entities that refer to the same real-world object. 
Collective resolution is among the most prominent mechanisms suggested for addressing this challenge, since the resolution decisions are not made independently but are based on available &hellip; <\/p>\n<p class=\"link-more\"><a href=\"https:\/\/emielcaron.nl\/?p=315\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Entity Resolution in Large Patent Databases: an optimization approach (Accepted for ICEIS 2021, April 26-28):&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[3],"tags":[],"class_list":["post-315","post","type-post","status-publish","format-standard","hentry","category-research"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.4 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Entity Resolution in Large Patent Databases: an optimization approach (Accepted for ICEIS 2021, April 26-28): - Emiel Caron<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/emielcaron.nl\/?p=315\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Entity Resolution in Large Patent Databases: an optimization approach (Accepted for ICEIS 2021, April 26-28): - Emiel Caron\" \/>\n<meta property=\"og:description\" content=\"Candidate for ICEIS 2021 best paper award Authors: Emiel Caron and Ekaterini Ioannou Abstract: Entity resolution focuses on detecting and merging entities that refer to the same real-world object. 
Collective resolution is among the most prominent mechanisms suggested for addressing this challenge, since the resolution decisions are not made independently but are based on available &hellip; Continue reading &quot;Entity Resolution in Large Patent Databases: an optimization approach (Accepted for ICEIS 2021, April 26-28):&quot;\" \/>\n<meta property=\"og:url\" content=\"https:\/\/emielcaron.nl\/?p=315\" \/>\n<meta property=\"og:site_name\" content=\"Emiel Caron\" \/>\n<meta property=\"article:published_time\" content=\"2021-03-21T10:08:21+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2021-03-30T15:53:57+00:00\" \/>\n<meta name=\"author\" content=\"Emiel Caron\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Emiel Caron\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"1 minute\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/emielcaron.nl\\\/?p=315#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/emielcaron.nl\\\/?p=315\"},\"author\":{\"name\":\"Emiel Caron\",\"@id\":\"https:\\\/\\\/emielcaron.nl\\\/#\\\/schema\\\/person\\\/992b3c38031ce991eef0e83dd12e11cd\"},\"headline\":\"Entity Resolution in Large Patent Databases: an optimization approach (Accepted for ICEIS 2021, April 26-28):\",\"datePublished\":\"2021-03-21T10:08:21+00:00\",\"dateModified\":\"2021-03-30T15:53:57+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/emielcaron.nl\\\/?p=315\"},\"wordCount\":156,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\\\/\\\/emielcaron.nl\\\/#\\\/schema\\\/person\\\/992b3c38031ce991eef0e83dd12e11cd\"},\"articleSection\":[\"Research 
projects\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\\\/\\\/emielcaron.nl\\\/?p=315#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/emielcaron.nl\\\/?p=315\",\"url\":\"https:\\\/\\\/emielcaron.nl\\\/?p=315\",\"name\":\"Entity Resolution in Large Patent Databases: an optimization approach (Accepted for ICEIS 2021, April 26-28): - Emiel Caron\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/emielcaron.nl\\\/#website\"},\"datePublished\":\"2021-03-21T10:08:21+00:00\",\"dateModified\":\"2021-03-30T15:53:57+00:00\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/emielcaron.nl\\\/?p=315#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/emielcaron.nl\\\/?p=315\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/emielcaron.nl\\\/?p=315#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/emielcaron.nl\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Entity Resolution in Large Patent Databases: an optimization approach (Accepted for ICEIS 2021, April 26-28):\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/emielcaron.nl\\\/#website\",\"url\":\"https:\\\/\\\/emielcaron.nl\\\/\",\"name\":\"Emiel Caron\",\"description\":\"PhD, Lecturer &amp; Researcher in Business Intelligence &amp; Analytics, Data 
science\",\"publisher\":{\"@id\":\"https:\\\/\\\/emielcaron.nl\\\/#\\\/schema\\\/person\\\/992b3c38031ce991eef0e83dd12e11cd\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/emielcaron.nl\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":[\"Person\",\"Organization\"],\"@id\":\"https:\\\/\\\/emielcaron.nl\\\/#\\\/schema\\\/person\\\/992b3c38031ce991eef0e83dd12e11cd\",\"name\":\"Emiel Caron\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/16d7767d69c769cde896a0f5e53533595a081cfaeab0aca485f4736e51e08ae0?s=96&d=mm&r=g\",\"url\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/16d7767d69c769cde896a0f5e53533595a081cfaeab0aca485f4736e51e08ae0?s=96&d=mm&r=g\",\"contentUrl\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/16d7767d69c769cde896a0f5e53533595a081cfaeab0aca485f4736e51e08ae0?s=96&d=mm&r=g\",\"caption\":\"Emiel Caron\"},\"logo\":{\"@id\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/16d7767d69c769cde896a0f5e53533595a081cfaeab0aca485f4736e51e08ae0?s=96&d=mm&r=g\"}}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Entity Resolution in Large Patent Databases: an optimization approach (Accepted for ICEIS 2021, April 26-28): - Emiel Caron","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/emielcaron.nl\/?p=315","og_locale":"en_US","og_type":"article","og_title":"Entity Resolution in Large Patent Databases: an optimization approach (Accepted for ICEIS 2021, April 26-28): - Emiel Caron","og_description":"Candidate for ICEIS 2021 best paper award Authors: Emiel Caron and Ekaterini Ioannou Abstract: Entity resolution focuses on detecting and merging entities that refer to the same real-world object. Collective resolution is among the most prominent mechanisms suggested for addressing this challenge, since the resolution decisions are not made independently but are based on available &hellip; Continue reading \"Entity Resolution in Large Patent Databases: an optimization approach (Accepted for ICEIS 2021, April 26-28):\"","og_url":"https:\/\/emielcaron.nl\/?p=315","og_site_name":"Emiel Caron","article_published_time":"2021-03-21T10:08:21+00:00","article_modified_time":"2021-03-30T15:53:57+00:00","author":"Emiel Caron","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Emiel Caron","Est. 
reading time":"1 minute"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/emielcaron.nl\/?p=315#article","isPartOf":{"@id":"https:\/\/emielcaron.nl\/?p=315"},"author":{"name":"Emiel Caron","@id":"https:\/\/emielcaron.nl\/#\/schema\/person\/992b3c38031ce991eef0e83dd12e11cd"},"headline":"Entity Resolution in Large Patent Databases: an optimization approach (Accepted for ICEIS 2021, April 26-28):","datePublished":"2021-03-21T10:08:21+00:00","dateModified":"2021-03-30T15:53:57+00:00","mainEntityOfPage":{"@id":"https:\/\/emielcaron.nl\/?p=315"},"wordCount":156,"commentCount":0,"publisher":{"@id":"https:\/\/emielcaron.nl\/#\/schema\/person\/992b3c38031ce991eef0e83dd12e11cd"},"articleSection":["Research projects"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/emielcaron.nl\/?p=315#respond"]}]},{"@type":"WebPage","@id":"https:\/\/emielcaron.nl\/?p=315","url":"https:\/\/emielcaron.nl\/?p=315","name":"Entity Resolution in Large Patent Databases: an optimization approach (Accepted for ICEIS 2021, April 26-28): - Emiel Caron","isPartOf":{"@id":"https:\/\/emielcaron.nl\/#website"},"datePublished":"2021-03-21T10:08:21+00:00","dateModified":"2021-03-30T15:53:57+00:00","breadcrumb":{"@id":"https:\/\/emielcaron.nl\/?p=315#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/emielcaron.nl\/?p=315"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/emielcaron.nl\/?p=315#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/emielcaron.nl\/"},{"@type":"ListItem","position":2,"name":"Entity Resolution in Large Patent Databases: an optimization approach (Accepted for ICEIS 2021, April 26-28):"}]},{"@type":"WebSite","@id":"https:\/\/emielcaron.nl\/#website","url":"https:\/\/emielcaron.nl\/","name":"Emiel Caron","description":"PhD, Lecturer &amp; Researcher in Business Intelligence &amp; Analytics, Data 
science","publisher":{"@id":"https:\/\/emielcaron.nl\/#\/schema\/person\/992b3c38031ce991eef0e83dd12e11cd"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/emielcaron.nl\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":["Person","Organization"],"@id":"https:\/\/emielcaron.nl\/#\/schema\/person\/992b3c38031ce991eef0e83dd12e11cd","name":"Emiel Caron","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/secure.gravatar.com\/avatar\/16d7767d69c769cde896a0f5e53533595a081cfaeab0aca485f4736e51e08ae0?s=96&d=mm&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/16d7767d69c769cde896a0f5e53533595a081cfaeab0aca485f4736e51e08ae0?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/16d7767d69c769cde896a0f5e53533595a081cfaeab0aca485f4736e51e08ae0?s=96&d=mm&r=g","caption":"Emiel Caron"},"logo":{"@id":"https:\/\/secure.gravatar.com\/avatar\/16d7767d69c769cde896a0f5e53533595a081cfaeab0aca485f4736e51e08ae0?s=96&d=mm&r=g"}}]}},"_links":{"self":[{"href":"https:\/\/emielcaron.nl\/index.php?rest_route=\/wp\/v2\/posts\/315","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/emielcaron.nl\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/emielcaron.nl\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/emielcaron.nl\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/emielcaron.nl\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=315"}],"version-history":[{"count":4,"href":"https:\/\/emielcaron.nl\/index.php?rest_route=\/wp\/v2\/posts\/315\/revisions"}],"predecessor-version":[{"id":320,"href":"https:\/\/emielcaron.nl\/index.php?rest_route=\/wp\/v2\/posts\/315\/revisions\/320"}],"wp:attachment":[{"href":"https:\/\/emielcaron.nl\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=315"}]
,"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/emielcaron.nl\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=315"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/emielcaron.nl\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=315"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}