Whatever is here, is found elsewhere. But what is not here, is nowhere else. – Thus goes a line in the first chapter of the Mahabharata.
In the 21st century, the same could be said about the Internet.
The Internet has become the sum total of the knowledge that humans have collected throughout recorded history.
Thousands and thousands of new web pages are added every day: new blog articles, new web sites, new videos, and much more. The internet is growing at a phenomenal rate.
And our doors to this massive amount of data are the search engines. Without search engines we would never know what is where on the internet.
Search engines take the keywords from us and search for the pages that their web crawlers have indexed for those keywords.
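To make the keyword idea concrete, here is a minimal sketch of a keyword index in Python. It is only a toy under stated assumptions: the page ids and texts are made up, and real search engines do far more than this (ranking, stemming, link analysis and so on).

```python
from collections import defaultdict

def build_index(pages):
    """Map each lowercase word to the set of page ids that contain it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for word in text.lower().split():
            index[word].add(page_id)
    return index

def search(index, query):
    """Return the ids of pages that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

# Hypothetical pages, standing in for what a crawler has fetched
pages = {
    "page1": "Gurudev writes about search engines",
    "page2": "spiritual teachers called gurudev",
    "page3": "search engines index the web",
}
index = build_index(pages)
matches = search(index, "gurudev")
```

Note that the index only knows that a word occurs on a page. It has no idea what the word means there, which is exactly the limitation discussed next.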
But is that enough? A typical search can return millions of pages, or at least tens of thousands of pages, as results.
We may expect the information that we are looking for in the first few pages. But that is not always the case. The more specific our keywords are, the better the results will be.
If you type Gurudev into Google, the very first result will usually take you to this website of mine. But I am sure most people typing the word Gurudev are looking for something else related to the word, and are definitely not searching for me. They are probably looking for information on some spiritual/religious personalities!
In other words, to get really meaningful results, we should be able to ask the internet, rather than search the internet.
Search engines today index the entire internet.
Instead, we need intelligent knowledge engines which, rather than indexing the pages on the internet, will actually read and understand the contents of the web pages. And rather than searching for the keywords that we type in, these knowledge engines will actually reply to the questions we ask!
These Knowledge Engines, instead of displaying millions of pages of results, can provide a couple of links which contain the most accurate answers to our questions.
So instead of showing every web page that contains the word Gurudev, it should ask us further: what information on Gurudev are you looking for? It could also provide a few options, like:
Are you looking for a person with this name?
Are you looking for the meaning/origin of this word?
Are you looking for spiritual personalities?
and so on
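As a rough sketch of this follow-up step, assuming a hand-written table of possible senses for a word (the SENSES table below is hypothetical; a real knowledge engine would have to learn these senses by itself):

```python
# Hypothetical table of known senses for ambiguous query words
SENSES = {
    "gurudev": [
        "a person with this name",
        "the meaning/origin of this word",
        "spiritual personalities",
    ],
}

def clarifying_questions(query):
    """Return the follow-up questions to ask for an ambiguous query."""
    senses = SENSES.get(query.strip().lower(), [])
    return [f"Are you looking for {sense}?" for sense in senses]

questions = clarifying_questions("Gurudev")
```

For a word that is not in the table, the function returns an empty list, and the engine would simply go ahead and answer.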
In other words, the knowledge engine should also be able to think along the lines of humans.
It should be able to perform many more tasks which today’s search engines cannot do.
For instance, it should be able to look at pictures/images and understand what they are about. Search engines today rely only on the keywords that mark the images. If the keywords are misleading, then the search engines will be misled too. For instance, if one takes an image of a dog and marks it with the keyword monkey, search engines today can never know that it is the image of a dog!
One good example of search engines being misled in this way is the Google bomb. A Google bomb is where a given page is linked from as many places on the internet as possible with a specific set of keywords, so that the Google search engine rates that page highest for those keywords. For instance, when you searched for the words miserable failure a few months back, the very first link used to take you to the White House page for George Bush! This was because these Google bombers had created a large number of web pages all over the internet, with the keywords miserable failure linking to the homepage of George Bush!
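A toy illustration of why this works, assuming a ranking that only counts how many links use the query as their link text. This is a deliberate simplification and not Google's actual algorithm, and the URLs are made up:

```python
from collections import Counter

def rank_by_anchor_text(links, query):
    """Rank target URLs by how many links use the query as their link text.

    links is an iterable of (anchor_text, target_url) pairs.
    """
    wanted = query.lower()
    counts = Counter(
        target for anchor, target in links if anchor.lower() == wanted
    )
    # most_common() sorts by count, highest first
    return [url for url, _ in counts.most_common()]

# Hypothetical links: three bombers point the same phrase at one page
links = [
    ("miserable failure", "http://example.org/politician"),
    ("Miserable Failure", "http://example.org/politician"),
    ("miserable failure", "http://example.org/politician"),
    ("miserable failure", "http://example.org/essay-on-failure"),
    ("homepage", "http://example.org/politician"),
]
ranking = rank_by_anchor_text(links, "miserable failure")
```

The page that the bombers link to comes out on top, even though nothing on that page itself needs to mention the phrase.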
So in order to avoid being misguided like this, our K-engine should be able to understand things. It should be able to see images, watch videos, and listen to audio/songs.
And then we should be able to query it even by trying to sing a particular tune and asking it for that song!! Of course, this requires speech recognition as well.
Or we should be able to describe to it a particular scene from a movie and ask it for the videos related to that scene/movie!
There is another difficulty here. Information, both right and wrong, is spread everywhere on the internet. If there is one web page which gives correct information on a subject, there are many more which give wrong information on the same subject. It is sometimes really difficult to find out which version is correct and which one is wrong!
So the K-engine (Knowledge engine) should also be able to differentiate between correct and incorrect information. And this expertise would come only from having its basics right. In other words, we probably need to first provide it with a lot of basic information which is accurate, and then allow it to crawl the internet and increase its knowledge, whereby it is able to differentiate between correct and incorrect information based on the basic knowledge it already has.
Even then, it might sometimes still be difficult for our K-engine to decide between correct and wrong information. In that case it can store both versions as probable answers, and then a real human expert in the company which runs the K-engine can tell it which version of the answer is the correct one!
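The idea can be sketched roughly as below. Everything here is an assumption made for illustration: the facts are stored as simple subject/answer pairs, a claim that contradicts the trusted basics is simply dropped, an undisputed new claim is accepted as it is, and conflicting new claims wait for the human expert. A real K-engine would need something far more subtle.

```python
from collections import defaultdict

def ingest(seed_facts, claims):
    """Merge crawled (subject, answer) claims into a trusted knowledge base.

    Returns (accepted, pending), where pending maps a subject to the set
    of conflicting answers that a human expert still has to judge.
    """
    accepted = dict(seed_facts)
    candidates = defaultdict(set)
    for subject, answer in claims:
        if subject in seed_facts:
            # Assumption: the trusted basics always win over crawled claims
            continue
        candidates[subject].add(answer)
    pending = {}
    for subject, answers in candidates.items():
        if len(answers) == 1:
            # Assumption: an undisputed new claim is accepted as it is
            accepted[subject] = next(iter(answers))
        else:
            pending[subject] = answers
    return accepted, pending

# Hypothetical seed knowledge and crawled claims
seed = {"capital of France": "Paris"}
claims = [
    ("capital of France", "Lyon"),
    ("tallest mountain", "Everest"),
    ("tallest mountain", "K2"),
    ("author of Hamlet", "Shakespeare"),
]
accepted, pending = ingest(seed, claims)
```

Here the wrong claim about France is ignored, the undisputed claim about Hamlet is learnt, and the two answers about the tallest mountain are kept aside for the human expert.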
Easier said than done: developing a K-engine of this high level of sophistication would require an advanced combination of natural language processing and artificial intelligence.
But the benefits of such a K-engine would be enormous and would open the doors to the next generation of technological advancement on the internet.
A speech recognition feature added to this K-engine would mean that we can quickly and easily move internet search even to handheld mobile devices, where we ask questions as if we are talking to a very knowledgeable friend.
This feature can also be used to ask customized questions and even make the K-engine perform some tasks for us, like:
Hi, can you please see if there are any tickets available for the movie Back to the Future at PVR cinemas in Bangalore for this Saturday evening? If yes, can you please book four tickets for me? My card number is .
This of course requires that the K-engine be integrated with online purchase systems and the databases of PVR cinemas. In other words, there is a lot of scope for the practical application of K-engines to an even greater extent.
We can give some interesting name like Brahmacle (Brahma + Miracle) to this K-engine.
Life, simpler than ever before.
Gurudev,
Brahmacle is disappointing; from you I would expect a much better name.
really interesting.. if you work on converting these ideas of yours into a reality then India will instantly become an IT superpower and you will be the next Bill Gates ;)
why that google image gurudev ?? was it necessary ?? :(
Gurudev,
I guess we agree that a gap exists in our efforts to create a K-Engine. A part of it is semantics. Here is a hierarchy. Any comments/suggestions on what lies beyond LOGIC?
- root (beyond LOGIC? = features needed to understand like humans)
  - LOGIC statements (higher order + temporal + modal)
    - FOPC predicates (e.g. VerbNet, SUO-KIF WordNet mappings)
      - RDF triples (simple predicates, standardized)
        - RDBMS/XML (advantage XML)
I am not sure the hierarchy will show correctly, so here’s a verbal version:
- The semantic web is an improvement on the RDBMS model, where the semantics is implicit in the definition of entities, tables etc.
- The semantic web model deals with RDF triples.
- The RDF triples model is a subset of FOPC.
- FOPC is a subset of LOGIC (formal logic).
- Is there something beyond LOGIC?
Comments and suggestions: how do you represent everything beyond logic? Can it be done at all? (Recall the limitations of Turing machines.)
Oh yes! it is feasible, for I use the word ‘impossible’ with great caution :)
I believe that even though intelligence can be simulated via software, consciousness/self-awareness is a proprietary feature of life! We cannot CREATE consciousness/awareness in software; we can only SIMULATE intelligence/thought processes in it.
In other words, the K-engine might be answering millions of questions.. but it by itself will not be AWARE of its own existence!
Had penned down a bit on this some time back:
http://hitxp.wordpress.com/2007/05/13/artificial-intelligence-and-awareness/
Hi GD,
Are you trying to create a GOD here? :-)
Even though such a search engine that answers your question sounds exciting, is it feasible?
One question to answer in parallel to this: how is intelligence/consciousness embedded in our brain alone?
Arpitha
The Semantic Web describes what the future internet should be like.
What Gurudev described is what the future search engine should be like. If what Gurudev wrote about becomes a reality then we won't need any future version of the internet like web 3, web 4 etc at all!
Regards
Maitreyi
No Arpitha
Semantic web/RDF/Web 3.0 is about moving the internet from a document oriented nature to a data oriented nature, where it becomes easy for software apps to differentiate between formatting code and actual data, and the relationship between the data.
In other words, the semantic web would make it easy for existing classical search engines like Google to index only meaningful data. Currently, many a time even data which is outside the actual page content gets indexed! For instance, ad information in the page, unrelated user comments etc. The semantic web would remove all such unrelated data from the indexing of a particular page.
But the search mentioned in the above article goes much beyond reading organized data and relies completely on AI.
Mastery of AI fields (which are still in their infancy) like image recognition, audio recognition, video recognition, etc. is required for this, along with natural language processing abilities.
In other words, suppose the engine reads a PDF document which is an e-book version of some book: the engine should NOT index it, but should read and understand it (make sense of the data), and then store it in a format that represents its understanding!
To summarize, semantic web requires the data representation to be changed from the existing style, to make sense of the data.
Whereas the above engine should be able to make sense of the data even if it is in the web 1.0 format. Because for the above K-engine, what matters is the actual data itself, and not how it is represented!
In other words, it's like looking at a piece of software (the K-engine) as a human, instead of looking at it as software :)
As I said, it's easier said than done, and needs a lot of AI and natural language processing. In fact, AI here becomes almost HI (Human Intelligence)!
The one which you are referring to is something called the ‘Semantic Web’. There has been extensive research going on in this field for years. The research papers from the IEEE or the W3 Consortium, especially, will provide an insight into the efforts going into this stream.