Expanding Memomics – Mining the Datagems of a Bejeweled Babylon of Information

Memomics, understood as the study of the Meme by decoding it into an ontological mapping, is a valuable tool for improving semantic webs and search engines. Commercial and advertising applications facilitated by artificially intelligent agents can profit from the correlations found, as will be explained hereunder.

According to Wikipedia, a Meme is a term that identifies ideas or beliefs transmitted from one person or group of people to another. The name comes from an analogy: as genes transmit biological information, Memes can be said to transmit idea and belief information. The Memome can be seen as the entire collection of all Memes. If we dive a bit deeper into this concept, it can also be said to encompass all human knowledge.

Genomics and Proteomics are the study of the genome, the entirety of an organism’s hereditary information, and of its entire complement of proteins, respectively. Likewise, Memomics can be considered the study of the Memome, the entire collection of all Memes.

In Genomics and Proteomics the study entails different types of “mapping” of the functions and structures of genes and proteins. The mapping can, for instance, be pathological, i.e. the correlation between expression profiles of certain genes and proteins and diseases, or it can be topological: expression with regard to a certain type of tissue, cell type or organ.

Likewise, Memomics studies the ontological mapping of ideas and terms. A company, Alitora systems, has undertaken the first steps in the field of Memomics, and guess where they have started: with life-sciences data. They have developed convenient data and text mining tools which can accelerate a meaningful search and which provide links to the most ontologically correlated concepts.

A more ambitious project would be to make a complete ontological mapping of all human knowledge, that is, to find for every existing term or concept which concepts it is naturally linked with. What I mean by this is not only a semantic mapping, which provides the meaning of a term in features and other terms. I’d like to expand mappings as suggested in my previous article, “The OWLs of Minerva only fly at dusk – Patently Intelligent Ontologies”: that is, to map the proximity relation of each term defined in a semantic web to every other term likewise defined, recording the average distance between those terms across all documents on the entire World Wide Web and the weighted frequency of such occurrences. Such an ontology map could fish out terms whose correlation of occurrence lies well above the “noise”. Many trivial terms occur in high-frequency proximity to virtually any other term; this noise frequency forms a threshold which significant term correlations must exceed. Such trivial terms include all kinds of syntactic words such as conjunctions, adverbs, adjectives, modal verbs, etc.
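As a minimal sketch of what such proximity scoring could look like (assuming documents are already tokenised into word lists; the function names, the distance-weighting scheme and the fixed noise threshold are all illustrative choices, not a prescription):

```python
from collections import defaultdict

def proximity_scores(documents, window=20):
    """Score term pairs by how often they co-occur within a token window,
    weighting nearer occurrences more heavily."""
    scores = defaultdict(float)
    for tokens in documents:                        # each document is a list of tokens
        for i, term_a in enumerate(tokens):
            for j in range(i + 1, min(i + window, len(tokens))):
                term_b = tokens[j]
                if term_a == term_b:
                    continue
                pair = tuple(sorted((term_a, term_b)))
                scores[pair] += 1.0 / (j - i)       # closer together = higher weight
    return scores

def significant_pairs(scores, noise_threshold):
    """Keep only pairs whose weighted proximity exceeds the noise level set by
    trivial, ubiquitous words (conjunctions, adverbs, modal verbs, etc.)."""
    return {pair: s for pair, s in scores.items() if s > noise_threshold}

# Two toy "documents" standing in for the frozen snapshot of the web
docs = [
    "the gene codes for a protein linked to the disease".split(),
    "a protein made from this gene is linked to the same disease".split(),
]
scores = proximity_scores(docs)
print(significant_pairs(scores, noise_threshold=0.5))
```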

A disadvantage of setting the threshold too high is that terms which are normally trivial can, in combination with another term, have a very specific meaning.

When this ontological mapping is carried out only within specific segmented classes or fields of meaning, important correlations can suddenly emerge which were not visible in most other classes and fields.

Thus, such an ontological proximity mapping with weighted frequency of occurrence could be carried out in combination with a “website classification” (i-taxonomy).

Vice versa, the exercise of ontological proximity mapping with weighted frequency of occurrence could itself yield classes and subclasses. The process can therefore be implemented in an iterative manner: significant correlations can create classes, which can in turn be data-mined to find new mappings and suggest new subclasses.
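A rough sketch of that iteration, reusing the `proximity_scores` and `significant_pairs` helpers from the sketch above; the connected-components grouping here is a deliberately crude stand-in for a real clustering or classification step:

```python
def derive_classes(pairs):
    """Group terms into candidate classes by merging pairs that share a member
    (a crude stand-in for a proper clustering / classification step)."""
    classes = []
    for term_a, term_b in pairs:
        touching = [c for c in classes if term_a in c or term_b in c]
        merged = {term_a, term_b}.union(*touching) if touching else {term_a, term_b}
        classes = [c for c in classes if c not in touching] + [merged]
    return classes

def iterate_mapping(documents, noise_threshold, depth=2):
    """Mine correlations, derive classes, then re-mine within each class to
    surface correlations that were drowned out at the global level."""
    pairs = significant_pairs(proximity_scores(documents), noise_threshold)
    classes = derive_classes(pairs)
    if depth <= 1:
        return classes
    refined = []
    for cls in classes:
        # keep only the tokens belonging to this class and recurse
        sub_docs = [[t for t in doc if t in cls] for doc in documents]
        refined.append(iterate_mapping(sub_docs, noise_threshold, depth - 1))
    return refined
```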

Another ontological mapping would be to determine whether certain links on the web have a correlation with certain terms.

The implementation must start with all the information present on the web at a fixed date. This information must somehow be stored, frozen, to carry out the extensive data-mining exercise of proximity mapping. Once that given Memome is entirely decoded, the process can be repeated iteratively with top-ups and will eventually catch up with the “present” of that time.

Artificially intelligent agents will carry out the process of ontological mapping and will learn from the patterns they recognise, making it easier to map future events and create further classes. In addition, links thus spotted and/or generated which are used more often can be added to appropriate Hubs in the “Hubbit” system, which I discussed in my earlier article: “From Search Engines to Hub Generators and Centralised Personal Multiple Purpose Internet Interfaces”. Well-frequented links will be favoured and insignificant links won’t make it to a permanent stage, according to the evangelical adage “to he who hath it shall be given, from he who hath not, it shall be taken away”, which is also a good metaphor for the way neuronal links are established in our brains.
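As a rough illustration of that “use it or lose it” dynamic (not the actual Hubbit implementation; all parameter values are illustrative), link weights could be reinforced on every visit and allowed to decay otherwise, with links that fall below a survival threshold being pruned:

```python
class HubLinks:
    """Reinforce links on use and let unused ones decay; only links that stay
    above the survival threshold keep their place in the Hub."""

    def __init__(self, reinforce=1.0, decay=0.4, survival=0.5):
        self.weights = {}
        self.reinforce = reinforce
        self.decay = decay
        self.survival = survival

    def visit(self, link):
        # every click strengthens the link, like a firing synapse
        self.weights[link] = self.weights.get(link, 0.0) + self.reinforce

    def tick(self):
        # periodic decay; links that drop below the threshold are pruned
        self.weights = {
            link: w * self.decay
            for link, w in self.weights.items()
            if w * self.decay >= self.survival
        }

hub = HubLinks()
hub.visit("example.org/ontology")
hub.visit("example.org/ontology")
hub.visit("example.org/rarely-used")
hub.tick()
print(hub.weights)   # the well-frequented link survives, the other fades out
```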

To undertake such a huge project would require enormous amounts of computational power and memory, and may as yet be beyond what is technically possible. That is the disadvantage. But the computational power and memory of computers have been increasing exponentially over many decades, and there is no reason to believe that the required technology is not within close reach.

The applications and commercial advantages are numerous.

Chatbots and other linguistic systems can be improved by learning from these correlation maps. Search engines can be improved by ranking results according to proximity mapping with weighted frequency. At the bottom of a search you could have suggestions in the form of “people who looked for these terms also looked for…”.
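A small sketch of such a suggestion step, reusing the proximity `scores` computed in the earlier sketch; the ranking simply sorts candidate terms by their weighted proximity to the query term:

```python
def suggest_related(query_term, scores, top_n=5):
    """'People who looked for this term also looked for…': return the terms
    most strongly correlated with the query, ranked by weighted proximity."""
    related = []
    for (term_a, term_b), weight in scores.items():
        if query_term == term_a:
            related.append((weight, term_b))
        elif query_term == term_b:
            related.append((weight, term_a))
    related.sort(reverse=True)
    return [term for _, term in related[:top_n]]

print(suggest_related("gene", scores))
```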

Commercial ontological mappings can be created in which terms are linked to all companies involved in the trade of products relating to the term, just as Alitora systems has mapped how certain genes linked to diseases are connected to the companies that develop drugs against these diseases via the associated gene, protein or metabolic pathway.

Thus one could also create the Commerce Memome (Commercome) as a searchable database: the entire set of all commercial relations, i.e. the products linked to the sellers, buyers, manufacturers, etc. Commercomics would map these relations in an ontological manner. Once such a network of information has been created, it will have become a very useful and simple way of identifying your competitors and newcomers in the field (provided the system is kept up to date).
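A minimal, purely illustrative record structure for such a Commercome; the product and company names below are placeholders, not real data:

```python
# Each product term links to the parties trading in it (all names are made up).
commercome = {
    "statins": {
        "manufacturers": {"PharmaCo A", "PharmaCo B"},
        "sellers": {"Pharmacy chain X"},
        "buyers": {"Hospital group Y"},
    },
}

def parties(product, role="manufacturers"):
    """Look up who is active in a product field, e.g. to identify competitors."""
    return commercome.get(product, {}).get(role, set())

print(parties("statins"))               # {'PharmaCo A', 'PharmaCo B'}
```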

Advertising could greatly benefit from such correlation maps. In analogy to suggestions in the form of “people who looked for these terms also looked for…”, ontology-mapping-based technology could be employed in advertising, based on the same principle as on commercial sites such as Amazon.com (“people who bought A also bought B”), but going a bit beyond it with an evolutionary, learning algorithm. For example, advertising costs could be linked to the frequency of clicking on the ad in question (PPC advertising), while the frequency of display of the ad is simultaneously linked thereto, again obeying the principle of “to he who hath it shall be given, from he who hath not it shall be taken away”. Another commercial data and text mining exercise could map the frequency of ad clicking to certain search terms; this could likewise be coupled to a system that links advertising cost to click frequency and/or display frequency. Again the AIbot providing these functions would learn from context and tailor the display of information accordingly, and again it would generate classes and mine more specific correlations from the generated subclasses.
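One possible, hypothetical reading of that coupling in code: the smoothed click-through rate of an ad drives both how often it is displayed and what a click costs (class name, smoothing and pricing rule are all assumptions for the sake of the sketch):

```python
import random

class AdRotator:
    """Tie display frequency and price to click performance: ads that get
    clicked are shown (and billed) more, the others gradually fade out."""

    def __init__(self, base_cost=0.10):
        self.clicks = {}
        self.impressions = {}
        self.base_cost = base_cost

    def record(self, ad, clicked):
        self.impressions[ad] = self.impressions.get(ad, 0) + 1
        if clicked:
            self.clicks[ad] = self.clicks.get(ad, 0) + 1

    def weight(self, ad):
        # smoothed click-through rate (Laplace smoothing for unseen ads)
        return (self.clicks.get(ad, 0) + 1) / (self.impressions.get(ad, 0) + 2)

    def pick(self, ads):
        # display probability proportional to click performance
        return random.choices(ads, weights=[self.weight(a) for a in ads])[0]

    def cost_per_click(self, ad):
        # a simple PPC-style rule: the price rises with demonstrated performance
        return self.base_cost * (1 + self.weight(ad))

rotator = AdRotator()
rotator.record("ad-A", clicked=True)
rotator.record("ad-B", clicked=False)
print(rotator.pick(["ad-A", "ad-B"]), rotator.cost_per_click("ad-A"))
```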

Inquiries via FAQ sheets could also be helped by such AIbots, preferably capable of conversing in natural language like a chatbot. From replies, questions and user satisfaction results, such bots could be programmed to learn and evolve into more efficient information providers.
