Class NameCentricSynonymIndexGenerator


  • public class NameCentricSynonymIndexGenerator
    extends java.lang.Object
    Synonym-centric (gene-name-centric) indexer, new as of March 11, 2019. Instead of indexing each synonym of each gene separately, it groups the gene IDs by synonym, which saves storage and yields more focused gene mention search results. Each synonym is stored only once and references the list of genes it may refer to, which immediately shows how ambiguous the synonym is.
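    The grouping idea described above can be illustrated with a minimal sketch. It is not the class's actual implementation: the class name NameCentricGroupingSketch, the method groupBySynonym, and the synonym and gene ID values are all hypothetical placeholders, and a plain in-memory map stands in for whatever index format the generator really writes.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of the documented API.
class NameCentricGroupingSketch {

    /**
     * Groups gene IDs by synonym so that each synonym is stored only once.
     * Each input entry is a two-element array: {synonym, geneId}.
     */
    static Map<String, List<String>> groupBySynonym(List<String[]> entries) {
        Map<String, List<String>> index = new LinkedHashMap<>();
        for (String[] entry : entries) {
            String synonym = entry[0];
            String geneId = entry[1];
            List<String> ids = index.computeIfAbsent(synonym, k -> new ArrayList<>());
            // Skip duplicates so every gene is listed at most once per synonym.
            if (!ids.contains(geneId)) {
                ids.add(geneId);
            }
        }
        return index;
    }

    public static void main(String[] args) {
        // Made-up placeholder data, not real gene names or identifiers.
        List<String[]> entries = List.of(
                new String[] {"synA", "gene-1"},
                new String[] {"synA", "gene-2"},
                new String[] {"synB", "gene-1"});
        // A synonym that maps to more than one gene ID is ambiguous.
        groupBySynonym(entries).forEach((synonym, ids) ->
                System.out.println(synonym + " -> " + ids + (ids.size() > 1 ? " (ambiguous)" : "")));
    }
}
```

    In this layout the number of stored synonyms equals the number of unique names rather than the number of (gene, synonym) pairs, and the length of each ID list directly reflects the ambiguity of the synonym.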
    • Method Summary

      Modifier and Type Method Description
      void createIndex()
      Creates the synonym index.
      static void main​(java.lang.String[] args)
      To execute the NameCentricSynonymIndexGenerator, start it with the following command-line arguments:
      arg0: path to resources directory
      arg1: path to synonym indices directory
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • NameCentricSynonymIndexGenerator

        public NameCentricSynonymIndexGenerator​(java.io.File dictFile,
                                                java.io.File indexFile)
                                         throws java.io.FileNotFoundException,
                                                java.io.IOException
        Parameters:
        dictFile - A file containing gene or protein names / synonyms and their respective NCBI Gene or UniProt IDs. No term normalization is expected for this dictionary.
        indexFile - The directory to which the name / synonym index will be written.
        Throws:
        java.io.FileNotFoundException
        java.io.IOException
    • Method Detail

      • main

        public static void main​(java.lang.String[] args)
        To execute the NameCentricSynonymIndexGenerator, start it with the following command-line arguments:
        arg0: path to resources directory
        arg1: path to synonym indices directory
        Parameters:
        args - the command-line arguments described above
      • createIndex

        public void createIndex()
                         throws java.io.IOException
        Creates the synonym index. Each unique synonym is indexed in a document of its own. For each gene that has the synonym, the document contains a group of fields listing the gene ID, its tax ID (if the tax ID mapping is given) and the "priority" that the synonym has for the gene. The priority describes the reliability of the source that provided the synonym: higher numbers mean a lower priority, and the official gene symbol has priority -1.
        Throws:
        java.io.IOException
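        The per-synonym document layout and the priority convention described for createIndex can be sketched as follows. This is an assumption-laden illustration, not the generator's real data model: the class SynonymDocumentSketch, the nested GeneEntry type, the helper methods, and all values are hypothetical, and plain Java objects stand in for the actual index documents and fields.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration only; not part of the documented API.
class SynonymDocumentSketch {

    /** Priority of the official gene symbol, as stated in the documentation. */
    static final int OFFICIAL_SYMBOL_PRIORITY = -1;

    /** One group of fields inside a synonym document: one gene that carries the synonym. */
    static final class GeneEntry {
        final String geneId;
        final String taxId;   // null when no tax ID mapping is given
        final int priority;   // higher number = lower priority

        GeneEntry(String geneId, String taxId, int priority) {
            this.geneId = geneId;
            this.taxId = taxId;
            this.priority = priority;
        }
    }

    /** Returns a copy of the entries ordered from most to least reliable source. */
    static List<GeneEntry> mostReliableFirst(List<GeneEntry> entries) {
        List<GeneEntry> sorted = new ArrayList<>(entries);
        // Ascending numeric order puts the official symbol (-1) first.
        sorted.sort(Comparator.comparingInt(e -> e.priority));
        return sorted;
    }

    /** True if the synonym is the official symbol of the given gene. */
    static boolean isOfficialSymbol(GeneEntry entry) {
        return entry.priority == OFFICIAL_SYMBOL_PRIORITY;
    }

    public static void main(String[] args) {
        // One "document" for a made-up synonym with made-up gene and tax IDs.
        List<GeneEntry> document = List.of(
                new GeneEntry("gene-2", "tax-9", 3),
                new GeneEntry("gene-1", "tax-9", OFFICIAL_SYMBOL_PRIORITY),
                new GeneEntry("gene-3", null, 1));
        for (GeneEntry e : mostReliableFirst(document)) {
            System.out.println(e.geneId + " taxId=" + e.taxId + " priority=" + e.priority
                    + (isOfficialSymbol(e) ? " (official symbol)" : ""));
        }
    }
}
```

        Sorting by ascending priority value is one way a consumer of such an index might prefer genes for which the synonym comes from a more reliable source; whether the generator or its clients actually do this is not stated in the documentation.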