Thursday 7 July 2016

The First One on Chatbots

A chatbot is a computer program that talks to the user like a human, usually through a chat-room or messenger interface. Understanding what a user is saying and responding sensibly is a daunting task for a computer, since our languages are complex and we do not follow strict grammar in casual conversation. But there are some shortcuts through which a bot can fool the user into believing that the responder on the other side is a human and not a robot.

Sane and even some insane humans have enough intelligence infrastructure in their brains to understand what other humans are saying (assuming they speak the same language) and to respond appropriately. But how can a computer do that? AI scientists have been trying for a while to mimic or embody human or human-like intelligence in computers, with varying measures of success. How, then, can a chatbot accomplish its goal of talking like a human? Well, as mentioned above, there are tricks and shortcuts with which we can "program" chatbots to appear intelligent enough.

Various ways of embodying intelligence in chatbots have been devised and discovered through our experience in the field of artificial intelligence. A very old one (to my knowledge) is pattern matching with some level of context management, and that is what we are going to look into. For instance,

Kayalvizhi: What is your name?
Edinburgh: My name is Edinburgh.
Edinburgh: What is your name?
Kayalvizhi: My name is Kayalvizhi.

I admit that is a lame example, but it illustrates the point. Our usual conversations tend to follow patterns. Human languages usually have far more words than are used in day-to-day life. With this fact in hand, we can employ pattern matching to construct a meaningful message for the user. Let's see how we can build a bot for the above conversation in RiveScript.

RiveScript

[From Rivescript website] RiveScript is a simple scripting language for chatbots with a friendly, easy to learn syntax. RiveScript exposes a simple plain text scripting language that's easy to learn and begin writing in quickly. RiveScript has a handful of simple rules that can be combined in powerful ways to build an impressive chatbot personality. Write triggers in a simplified regular expression format to match complex sets of word patterns in one go. RiveScript takes a "Unix-like" approach to its development: the core library is small and self-contained and it does one thing very well—takes human input and gives an intelligent response. This flexibility enables RiveScript to be used how you need it to.

Basically, a RiveScript program consists of triggers and responses, plus something called a topic.

Triggers are patterns for the messages typed in by the user. The chatbot reads a message, compares it against the list of triggers (already programmed), finds the best match, and then replies with the response string of the matched trigger.

one.rive   -- our first script.
+ my name is edinburgh
- that is a nice name

Whenever you say "my name is edinburgh", the bot will reply "that is a nice name". This is a very dumb bot: it understands only one sentence, "my name is edinburgh".
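To make the idea concrete outside RiveScript, here is a minimal Python sketch of a bot that knows exactly one trigger. It is not how the RiveScript library is implemented. The REPLIES table, the normalize helper and the fallback reply are all my own assumptions for illustration; the only thing taken from the script above is the trigger and its response.

```python
import re

# Hypothetical stand-in for one.rive: one trigger mapped to one response.
REPLIES = {"my name is edinburgh": "that is a nice name"}


def normalize(message: str) -> str:
    """Lowercase the message and drop punctuation so it can be compared to a trigger."""
    cleaned = re.sub(r"[^a-z0-9\s]", "", message.lower())
    return " ".join(cleaned.split())


def reply(message: str) -> str:
    """Return the response of an exactly matching trigger, or an assumed fallback."""
    return REPLIES.get(normalize(message), "i do not understand")


print(reply("My name is Edinburgh."))
print(reply("my name is kuzhali"))
```

The second call falls through to the fallback, which is exactly the limitation described above: the bot only understands one sentence.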

Let's modify it a little bit, so that anyone can talk to our bot.
two.rive
+ my name is *
- hi there, how are you?

Conversation 2
Kuzhali: my name is kuzhali
Bot       : hi there, how are you?

Cheran:  my name is cheran
Bot      :  hi there, how are you?

Mark   :  my name is mark
Bot      :  hi there, how are you?

Toyota :  my name is toyota
Bot      :  hi there, how are you?
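The wildcard can be imitated in a few lines of Python. This is only a sketch under my own assumptions, not RiveScript's actual matching code: I turn "*" into a regular expression group, and I pick the "best match" by preferring the trigger with the most literal words. The TRIGGERS list (including the catch-all "*" entry and its reply) is hypothetical.

```python
import re

# Hypothetical triggers; "*" stands for one or more words, as in two.rive.
TRIGGERS = [
    ("my name is *", "hi there, how are you?"),
    ("my name is edinburgh", "that is a nice name"),
    ("*", "i do not understand"),
]


def to_regex(trigger: str) -> re.Pattern:
    """Turn a trigger into a regular expression, with "*" as a capture group."""
    parts = [r"(.+)" if word == "*" else re.escape(word) for word in trigger.split()]
    return re.compile(r"\s+".join(parts))


def literal_words(trigger: str) -> int:
    """Count the non-wildcard words; more of them means a more specific trigger."""
    return sum(word != "*" for word in trigger.split())


def reply(message: str) -> str:
    """Answer with the response of the most specific trigger that matches."""
    text = " ".join(re.sub(r"[^a-z0-9\s]", "", message.lower()).split())
    ordered = sorted(TRIGGERS, key=lambda t: literal_words(t[0]), reverse=True)
    for trigger, response in ordered:
        if to_regex(trigger).fullmatch(text):
            return response
    return ""


for name in ("kuzhali", "cheran", "mark", "toyota"):
    print(reply(f"my name is {name}"))
```

Sorting by the number of literal words is what lets "my name is edinburgh" win over "my name is *" when both match. A real engine may rank triggers differently.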

Wednesday 11 May 2016

Thoughts on DNN and AGI

DNNs/deep learning have impressed us with their capability to recognize images and audio with unprecedented accuracy.

Systems like AlphaGo (which beat one of the best human players at Go) are trained on huge clusters of powerful machines over very, very large datasets. Google spends millions, if not billions, on such projects.

But the knowledge consumed by AlphaGo cannot be understood ("do we need to?" is an ethical question, though). In other words, we cannot instruct the system procedurally to perform a task. Humans still have to be trained to develop a certain skill set; that is how we learn to work as a team. Read: integration with other systems.

Training an intelligent system involves one of the following methods.
  1. Provide the system with a large dataset and learning algorithms, and let the system figure out the best possible representation for the knowledge.
  2. Provide the system with a large dataset and learning algorithms, along with a suitable knowledge representation model.

There are two kinds of knowledge embedded in the data. Image and audio streams are raw in nature, i.e. they are concrete knowledge, whereas text and the content of speech and visuals comprise abstract knowledge. In fact, they contain knowledge at multiple levels of abstraction.

To put it in a different perspective, if we consider image/audio recognition to be the mining of a pattern from a clump of data, then we can think of recognizing abstract entities or ideas in the content of images/audio as mining patterns within patterns within patterns, and so on.

A DNN distributes its knowledge over the network in the form of weights. The weights are the knowledge. This works well for concrete data like images and audio. But for abstract ideas, it may not. Even if it does, it will be with great difficulty. I will explain why I think so.

Let's take how we feed inputs to a DNN. It takes a vector as its input. Images and audio naturally lend themselves to a vector representation. But for abstract content, we need to represent the abstract entities in the form of vectors.

For instance, in the case of word embeddings, each word of the language is assigned an integer. How we assign an integer to a particular word varies from you to me to another person. Suppose we train the system and make it do something useful. The knowledge gathered by the system is still not shareable. Since the knowledge is represented by the weight matrix, a subset of the matrix taken out of the whole is probably meaningless. For each element in the matrix, every other element sets the context.
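A small Python sketch of the integer assignment problem, with made-up sentences. I assume the simplest possible scheme, where a word gets the next free integer the first time it is seen; build_vocab and encode are names I invented for this illustration, not part of any library.

```python
def build_vocab(corpus: list[str]) -> dict[str, int]:
    """Assign each new word the next free integer, in order of first appearance."""
    vocab: dict[str, int] = {}
    for sentence in corpus:
        for word in sentence.lower().split():
            vocab.setdefault(word, len(vocab))
    return vocab


def encode(sentence: str, vocab: dict[str, int]) -> list[int]:
    """Replace every word with its integer id."""
    return [vocab[word] for word in sentence.lower().split()]


# Two people see the same sentences in a different order (made-up data).
mine = build_vocab(["the cat sat", "the dog ran"])
yours = build_vocab(["the dog ran", "the cat sat"])

sentence = "the dog sat"
print(encode(sentence, mine))   # [0, 3, 2]
print(encode(sentence, yours))  # [0, 1, 4]
```

The same sentence becomes two different inputs, so weights trained against my numbering mean nothing when fed your numbering unless the vocabulary travels along with them.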

In contrast to DNNs, other AI systems such as OpenCog represent knowledge in the form of atoms in a hypergraph. The entire knowledge base is contained in what is called the AtomSpace. The AtomSpace is used to store all kinds of knowledge, declarative and procedural. The AtomSpace can be instructed to perform something through rewriting the graph, i.e. the knowledge.
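To give a feel for the idea, here is a toy hypergraph in Python. It is emphatically not the OpenCog API: Node, Link, ToyAtomSpace and the "Known" link kind are hypothetical names of mine. The sketch only shows two properties mentioned above, that links can point at other links, and that knowledge is changed by rewriting the graph.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """A named atom with no outgoing connections."""
    kind: str
    name: str


@dataclass(frozen=True)
class Link:
    """An atom that connects any number of other atoms (nodes or links)."""
    kind: str
    targets: tuple


class ToyAtomSpace:
    """Hypothetical, minimal container of atoms; not the OpenCog API."""

    def __init__(self):
        self.atoms = set()

    def add(self, atom):
        self.atoms.add(atom)
        return atom

    def rewrite(self, kind, make_new):
        """Add make_new(link) for every existing link of the given kind."""
        matches = [a for a in self.atoms if isinstance(a, Link) and a.kind == kind]
        for link in matches:
            self.add(make_new(link))


space = ToyAtomSpace()
cat = space.add(Node("Concept", "cat"))
animal = space.add(Node("Concept", "animal"))
inherits = space.add(Link("Inheritance", (cat, animal)))

# Arbitrary example rule: wrap every Inheritance link in a link that points at it.
space.rewrite("Inheritance", lambda link: Link("Known", (link,)))

print(len(space.atoms))
```

Because every atom is an explicit, inspectable object, a piece of the graph can be read, copied or shared on its own, which is the contrast with a weight matrix that I am trying to draw.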

With such a representation, mining patterns within patterns can be done relatively easily, and scientists can test different learning algorithms and understand how they behave, i.e. we can be the psychologists of AGI machines.

Although it may be possible to build a human-like brain with just DNNs, it will not be accessible to everyone, due to the huge cost involved. The community at present cannot afford the cost, be it money, time, or education. So, I believe it is better to employ a hybrid approach: use DNNs for recognizing concrete knowledge and OpenCog-like systems for more abstract ideas, like metaphors.

How do we make AGI possible as a community? Set up a BOINC-like infrastructure for AGI training? Distribute the hypergraph over a P2P network like the meshnet? How do we avoid corporate lock-in?

Tuesday 15 March 2016

MeshNet in Puducherry

It has been almost three years. Welcome to the new era of mesh networks. This is the first post about our attempt to implement a mesh network in Puducherry. A series of posts will follow.

[Figure: mesh network topology (File:NetworkTopology-Mesh.svg)]


What Microsoft, Google and Apple are to computers and services, Cisco, Juniper, Verizon and, in India, Airtel, BSNL and Reliance are to network infrastructure. They pour billions, if not trillions, of dollars into building and maintaining the infrastructure, and even more money into suppressing the competition and alternatives.

What do they have?

1. Establishment - BSNL, Reliance and Tata are unquestionably the backbone of networking infrastructure in India.
2. Marketing - Airtel owes its strong presence in India to its marketing.
3. Power - Some of these organizations can influence government regulations, if they wish.
4. Money - In terms of billions a year.

Money and power create a positive feedback loop, helping these organizations attain even more money and power. So our sane initial step will be to create a network which complements and works in harmony (whenever necessary) with the already existing network.

What do we have? A clutch of interested volunteers. If we are to rival these organizations (which is possible, I agree), we need the support of the people, not just the few of us who want to deliver better alternatives to the people. Let's face it: in general, people are reluctant to try and learn new things. The reason is that they don't want a better system when the existing one gets the job done. We resist change because we don't trust each other. So our ideal goal should be to make people more liberal. Making people liberal can be a lifelong project on its own. So let's settle for the mesh-net.

Nothing works better than drama to gain people's attention. But attention is not enough. You know why male perfume advertisements throw a few hot girls in there. We need to attract people. We need to organize a grand event (what do we do in that event?) in a college (no other place comes to my mind; suggestions?).

As already mentioned, people won't trust us enough to install on their devices some great tool which will enhance public knowledge (but they are ready to install crappy games and flashlight apps, which will steal the shit out of their phones [1]).

As a branch of thoughts from our first MeshNet meetup at open-drop:
* We need to install mesh nodes in a few potential zones.
* We need devices to do so. How many devices do we have at hand? (We should have a discussion on which cheap and feature-rich router to use for our campaign, if you will.)
* We need funding.
* We need a dedicated machine to compile OpenWrt for multiple targets (I guess that can be arranged).

Let's say we do organize a grand event. Who is going to come to that event? What is going to be our influence over them, and what is going to be their influence over our activity from there on? How do we make sure that people stick to the plan? What activity will demonstrate the power of a mesh network? How do we illustrate the difference between centralized and decentralized networks to them, the laymen?
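One possible demo for the centralized versus decentralized question is to knock out a single node and see who can still talk to whom. Here is a rough Python sketch of that idea. The two four-home networks are made up, the decentralized one is just a small ring where each home links to two neighbours, and reachable is a helper name I invented.

```python
from collections import deque


def reachable(links: dict[str, set[str]], start: str, down: str) -> set[str]:
    """Breadth-first search from start, pretending the node `down` has failed."""
    seen, queue = {start}, deque([start])
    while queue:
        for peer in links[queue.popleft()]:
            if peer != down and peer not in seen:
                seen.add(peer)
                queue.append(peer)
    return seen


# Centralized: every home talks only to the hub (made-up example).
star = {"hub": {"a", "b", "c", "d"}, "a": {"hub"}, "b": {"hub"}, "c": {"hub"}, "d": {"hub"}}
# Decentralized: each home talks to its two neighbours.
mesh = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c", "a"}}

print(sorted(reachable(star, "a", down="hub")))  # home "a" is cut off from everyone
print(sorted(reachable(mesh, "a", down="b")))    # home "a" still reaches "c" and "d"
```

At an event the same thing could be acted out with people holding strings instead of dictionaries: cut the person in the middle of the star and everybody is stranded, cut one person in the mesh and messages still get around.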

Talking and presenting are the only key ingredients, I agree. There is a little problem with that: it takes too much time and active participation from us. How many of us have the gumption and conviction to go through with this?

People and society have their own issues. And we do too:
* Our own goals
* Financial issues
* Personal issues
* Gradual decline of taste/motivation (interest/attention?) in one thing

If talking and presenting are not our only devices, what else are we gonna employ?
Cookipedia and its kind are good ideas.
Another messaging app?
Localized P2P file sharing? (We already have a Wikipedia dump.)

If we are gonna host services, where do we host 'em (assuming IPFS is not our initial step)? And what do internet-using individuals mostly do? Browse the web? Share stuff? Watch porn?

We must admit that this is uncharted territory for most of us. We need to sit through countless frustrating hours to get it working, both technically (we are lucky here, since most of the things are in place) and politically (so damn unlucky).

I think and hope the responses to this post generate more questions than they answer.

Let's talk about some technical problems:
How do we identify other people in the mesh network?
Someone mentioned statistical algorithms for semi-human-readable names (seeding names instead of registering them by hand) in the comments section of this post [2].
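I do not know exactly which algorithm the commenter had in mind, so take the following Python sketch as one simple way to seed a name rather than register it: hash the node's public key and spell the first few bytes as consonant-vowel syllables. The function name, the alphabets and the choice of four syllables are all my assumptions.

```python
import hashlib

CONSONANTS = "bdfghjklmnprstvz"  # 16 choices
VOWELS = "aeiou"                 # 5 choices


def readable_name(public_key: bytes, syllables: int = 4) -> str:
    """Derive a pronounceable name from the SHA-256 hash of a node's key."""
    digest = hashlib.sha256(public_key).digest()
    parts = []
    for byte in digest[:syllables]:
        # One byte picks one consonant and one vowel.
        parts.append(CONSONANTS[byte % 16] + VOWELS[(byte // 16) % 5])
    return "".join(parts)


# The key below is a placeholder; a real node would use its actual public key.
print(readable_name(b"placeholder-public-key-of-node-1"))
```

The same key always gives the same name, so nobody has to keep a registry. With four syllables there are only about 41 million possible names, so two nodes can collide; a longer name or a check against neighbours would be needed in practice.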

And very importantly, we need to have bigger goals than just implementing the mesh-net as an alternative internet. We ourselves need a direction, a higher purpose: to empower people and let them grow, for which the mesh-net will act as a viable tool. The desire to implement the mesh-net will not by itself drive motivation. It is not the desire to gain money that drives us to gain money; it is the thought of spending it after gaining it.
    "Let all your thoughts be of lofty things; even if they are thwarted,
    the aspiration itself is as good as unthwarted."        - Valluvan.
[1] http://truthinmedia.com/exclusive-top-10-flashlight-apps-are-stealing-your-data-even-pics-off-your-phone/
[2] https://nohats.ca/wordpress/blog/2012/04/09/you-cant-p2p-the-dns-and-have-it-too/